Composable atomic-like operations - haskell

I want to compose operations that may fail, but there is a way to roll back.
For example - an external call to book a hotel room, and an external call to charge a credit card. Both of those calls may fail such as no rooms left, invalid credit card. Both have ways to roll back - cancel hotel room, cancel credit charge.
Is there a name for this type of (not real) atomic. Whenever i search for haskell transaction, I get STM.
Is there an abstraction, a way to compose them, or a library in haskell or any other language?
I feel you could write a monad Atomic T which will track these operations and roll them back if there is an exception.
Edit:
These operations may be IO operations. If the operations were only memory operations, as the two answers suggest, STM would suffice.
For example booking hotels would via HTTP requests. Database operations such as inserting records via socket communication.
In the real world, for irreversible operations there is a grace period before the operation will be done - e.g. credit cards payments and hotel bookings may be settled at the end of the day, and therefore it is fine to cancel before then.

This is exactly the purpose of STM. Actions are composed so that they succeed or fail together, automatically.
Very similar to your hotel room problem is the bank transaction example in Simon Peyton-Jones's chapter in "Beautiful Code": http://research.microsoft.com/en-us/um/people/simonpj/papers/stm/beautiful.pdf

If you need to resort to making your own monad, it will look something like this:
import Control.Exception (onException, throwIO)
newtype Rollbackable a = Rollbackable (IO (IO (), a))
runRollbackable :: Rollbackable a -> IO a
runRollbackable (Rollbackable m) = fmap snd m
-- you might want this to catch exceptions and return IO (Either SomeException a) instead
instance Monad Rollbackable where
return x = Rollbackable $ return (return (), x)
Rollbackable m >>= f
= do (rollback, x) <- m
Rollbackable (f x `onException` rollback)
(You will probably want Functor and Applicative instances also, but they're trivial.)
You would define your rollbackable primitive actions in this way:
rollbackableChargeCreditCard :: CardNumber -> CurrencyAmount -> Rollbackable CCTransactionRef
rollbackableChargeCreditCard ccno amount = Rollbackable
$ do ref <- ioChargeCreditCard ccno amount
return (ioUnchargeCreditCard ref, ref)
ioChargeCreditCard :: CardNumber -> CurrencyAmount -> IO CCTransactionRef
-- use throwIO on failure
ioUnchargeCreditCard :: CCTransactionRef -> IO ()
-- these both just do ordinary i/o
Then run them like so:
runRollbackable
$ do price <- rollbackableReserveRoom roomRequirements when
paymentRef <- rollbackableChargeCreditCard ccno price
-- etc

If your computations could be done only with TVar like things then STM is perfect.
If you need a side effect (like "charge Bob $100") and if there is a error later issue a retraction (like "refund Bob $100") then you need, drumroll please: Control.Exceptions.bracketOnError
bracketOnError
:: IO a -- ^ computation to run first (\"acquire resource\")
-> (a -> IO b) -- ^ computation to run last (\"release resource\")
-> (a -> IO c) -- ^ computation to run in-between
-> IO c -- returns the value from the in-between computation
Like Control.Exception.bracket, but only performs the final action if there was an
exception raised by the in-between computation.
Thus I could imagine using this like:
let safe'charge'Bob = bracketOnError (charge'Bob) (\a -> refund'Bob)
safe'charge'Bob $ \a -> do
rest'of'transaction
which'may'throw'error
Make sure you understand where to use the Control.Exception.mask operation if you are in a multi-threaded program and try things like this.
And I should emphasize that you can and should read the Source Code to Control.Exception and Control.Exception.Base to see how this is done in GHC.

You really can do this with the clever application of STM. The key is to separate out the IO parts. I assume the trouble is that a transaction might appear to succeed initially, and fail only later on. (If you can recognize failure right away, or soon after, things are simpler):
main = do
r <- reserveHotel
c <- chargeCreditCard
let room = newTVar r
card = newTVar c
transFailure = newEmptyTMVar
rollback <- forkIO $ do
a <- atomically $ takeTMVar transFailure --blocks until we put something here
case a of
Left "No Room" -> allFullRollback
Right "Card declined" -> badCardRollback
failure <- listenForFailure -- A hypothetical IO action that blocks, waiting for
-- a failure message or an "all clear"
case failures of
"No Room" -> atomically $ putTMVar (Left "No Room")
"Card Declined" -> atomically $ putTMVar (Right "Card declined")
_ -> return ()
Now, there's nothing here that MVars couldn't handle: all we're doing is forking a thread to wait and see if we need to fix things. But you'll presumably be doing some other stuff with your card charges and hotel reservations...

Related

Timeout on StateT with IO

I have a custom type
type GI a = StateT GenState IO a
where GenState is a state I keep for Generating Random Trees of some kind.
When generating my trees, termination is not guaranteed in a reasonable amount of time. That's why I thought I might terminate the calculation and restart it over and over again with a timeout until a result is given.
So my question is how to write a function of the form
tryGeneration :: GI a -> GI a
tryGeneraton action = ...
where action is the calculation to try in some microseconds and if it runs out of time begin the action from the start.
Please keep in mind that I'm quite new to Monad Transformers and I cannot say that i fully understand them yet.
I tried to use lift with System.Timeout.timeout and did not succeed
EDIT: thank you all for your suggestions. I followed them, and got it done in the IO monad.
tryGenerationTime :: Int -> GenState -> GI a -> IO (a, GenState)
tryGenerationTime time state action = do
(_, s') <- -- change the random state to not generate the same thing over and over
res <- timeout time (runStateT action s')
case res of
Nothing -> tryGenerationTime time s' action
Just r -> return r
timeItT :: Int -> GI a -> GI a
timeItT time action = do
state <- get
(x, s') <- lift $ tryGenerationTime time state action
put s'
return x
Any suggestion to improving this code is welcome. I just wanted to get it done fast, since that wasn't the solution to my generation problem and I needed to set a limit to the tree height to succeed.
I suspect what you actually want is more like
tryGeneration :: GI a -> IO a
tryGeneraton action = ...
in such a way that all of your "build a tree" actions have timeout-based retries.
The key thing to understand is that "attempt to do X; if you aren't done in n milliseconds, start over" is IO's job; IO is where you have access to things like time. (Of course there are wrappers you could and should use when you only need part of what IO has to offer.)
This is fine; you have access to IO in GI, you probably just have to lift it.
That said, there's not enough information here to say exactly how to do what you want, and I'm more familiar with free-monad effect systems than mtl transformers anyway...

IO in Haskell not visible

I have a little interactive game program. The interactive part looks like this,
main :: IO ()
main = playGame newGame
where
playGame :: Game -> IO ()
playGame game =
do putStr $ show game
putStr $ if gameOver game then "Another game? (y or n) > "
else show (whoseMove game) ++ " to play (row col) > "
moveWords <- fmap (words . fmap cleanChar ) getLine
if stopGame game moveWords
then return ()
else playGame $ if gameOver game then newGame else makeMove game moveWords
This works fine. It displays the game state, asks for the next move, applies that move to the state, displays the new state, etc.
Then I saw a video by Moss Collum in which he showed a game that uses the following strategy for interacting with the user.
...
userInput <- getContents
foldM_ updateScreen (12, 40) (parseInput userInput) where
...
I couldn't find a reference to foldM_ but assuming it was some sort of fold I tried this. (I actually tried a number of things, but this seems clearest.)
main' :: IO ()
main' = do
moveList <- fmap (map words . lines . map cleanChar) getContents
let states = scanl makeMove newGame moveList
foldl (\_ state -> putStr . show $ state) (return ()) states
When I run it, I never get the game state printed out until after hitting end-of-file, at which point the correct final game state is printed. Before that, I can enter moves, and they are processed properly (according to the final game state), but I never see the intermediate states. (The idea is that lazy evaluation should print the game states as they become available.)
I'd appreciate help understanding why I don't see the intermediate states and what, if anything, I can do to fix it.
Also, after entering end-of-file (^Z on Windows) the program refuses to play again, saying that the handle has been closed. To play again I have to restart the program. Is there a way to fix that?
First, let's start with foldM:
foldM :: (Foldable t, Monad m) => (b -> a -> m b) -> b -> t a -> m b
a is the type of things in our container, and b is our result type. It takes a b (the value we've accumulated so far in our fold, ana(then ext thing in the list, and returns an m b, meaning it returns a b and does some monadic actions in the monad m.
Then it takes an initial value, a container of a's, and it returns an
This is basically like a normal fold function, but each step of the fold is monadic, and the final result is monadic. So the actions will actually be performed if you use foldM.
Now let's look at foldM_:
foldM_ :: (Foldable t, Monad m) => (b -> a -> m b) -> b -> t a -> m ()
Source
This is the same, but the final result is m (). This is for the case where we don't care about the final result. We keep the intermediate results and use them to perform monadic actions at each step, but when we finish, we only care about what we did in the monad, so we throw away the result.
In your case, t would be List, and m would be IO. So foldM_ iterates through the list, doing the selected IO actions at each stage, then throws away the final result.
fold doesn't actually sequence the actions together, it just folds over your list normally, even if your final result is IO. So your foldl creates an IO action, namely putStr . show $ state, then passes it to the next step of the fold. But, you ignore the first argument of your fold, so it throws the IO away without ever doing anything with it!
This is a tricky thing in Haskell. A value of type IO Something doesn't actually do the action when you create it. It just creates an IO value, which is just an instruction to the runtime of what to do when it runs main. If you throw it away, and it never gets sequenced into an IO that your main performs, then the side-effect will never happen.

Building a monad on top of hedis, a haskell redis lib

I want to write a simple DSL on top of hedis, a redis lib. The goal is to write functions like:
iWantThis :: ByteString -> MyRedis ()
iWantThis bs = do
load bs -- :: MyRedis () It fetches a BS from Redis and puts it as
-- the state in a state monad
bs <- get -- :: MyRedis (Maybe ByteString) Gets the current state
put $ doSomethingPure bs -- :: MyMonad () Updates the state
commit -- :: MyRedis () Write to redis
The basic idea is to fetch data from redis, put it in a state monad, do some stuff with the state and then put the updated state back into redis.
Obviously, it should be atomic so load and put should happen in the same Redis transaction. Hedis permits that by wrapping calls to Redis in a RedisTx (Queued a). For example, we have get :: ByteString -> RedisTx (Queued a).
Queued is a monad and you then run multiExec on your Queued a to execute everything in the Queued a in the same transaction. So I tried to define my MyRedis as such:
import qualified Database.Redis as R
newtype MyRedis a = MyRedis { runMyRedis :: StateT MyState R.RedisTx a } -- deriving MonadState, MyState...
The run function calls multiExec so I'm sure that as long as I stay in MyRedis everything happens in the same transaction.
run :: MyRedis (R.Queued a) -> MyState -> IO (R.TxResult a)
run m s = R.runRedis (undefined :: R.Connection) (R.multiExec r)
where r = evalStateT (runMyRedis m) s
Furthermore, I can define commit as:
commit :: ByteString -> MyRedis (R.Queued R.Status)
commit bs = do
MyState new <- get
(MyRedis . lift) (R.set bs new)
And a computation would look like:
computation :: MyRedis (R.Queued R.Status)
computation = do
load gid
MyState bs <- get
put $ MyState (reverse bs)
commit gid
where gid = "123"
But I can't figure out how to write "load"
load :: ByteString -> MyRedis ()
load gid = undefined
Actually, I think that it is not possible to write load, because get is of type ByteString -> RedisTx (Queued (Maybe ByteString)) and I have no way to peek into the Queued monad without executing it.
Questions:
Is it correct that because of the type of Hedis's get, it doesn't make sense to define a load function with the semantics above?
Is it possible to change the MyRedis type definition to make it work?
Hedis doesn't define a RedisT monad transformer. If such a transformer existed, would it be of any help?
Hedis defines (but does not export to lib users) a MonadRedis typeclass; would making my monad an instance of that typeclass help?
Is it the right approach? I want to:
Abstract over Redis (I may switch someday to another DB)
Restrict the Redis functions available to my users (basically only lifting to MyRedis get and set)
Guarantee that when I run my monad everything happens in the same (redis) transaction
Put my redis abstraction at the same level as other functions in my monad
You can play with the code at http://pastebin.com/MRqMCr9Q. Sorry for the pastebin, lpaste.net is down at the moment.
What you want is not possible. In particular, you can't provide a monadic interface while a running a computation in one Redis transaction. Nothing to do with the library you're using - it's just not something Redis can do.
Redis transactions are rather different from the ACID transactions you may be used to from the world of relational databases. Redis transactions have batching semantics, which means that later commands cannot in any way depend on the result of earlier commands.
Look: here's something similar to your example, run at the Redis command line.
> set "foo" "bar"
OK
> multi
OK
> get "foo"
QUEUED -- I can't now set "baz" to the result of this command because there is no result!
> exec
1) "bar" -- I only get the result after running the whole tran
Anyway, that's the purpose of that library's slightly odd Queued type: the idea is to prevent you from accessing any of the results of a batched command until the end of the batch. (It seems that the author wanted to abstract over batched and non-batched commands but there are simpler ways to do that. See below for how I'd simplify the transactional interface.)
So there's no "choosing what to do next" when Redis transactions are involved, but the whole point of (>>=) :: m a -> (a -> m b) -> m b is that later effects can depend on earlier results. You have to choose between monads and transactions.
If you decide you want transactions, there's an alternative to Monad called Applicative which handlily supports purely-static effects. This is exactly what we need. Here's some (entirely untested) code illustrating how I'd cook an Applicative version of your idea.
newtype RedisBatch a = RedisBatch (R.RedisTx (R.Queued a))
-- being a transactional batch of commands to send to redis
instance Functor RedisBatch where
fmap = liftA
instance Applicative RedisBatch where
pure x = RedisBatch (pure (pure x))
(RedisBatch rf) <*> (RedisBatch rx) = RedisBatch $ (<*>) <$> rf <*> rx
-- no monad instance
get :: ByteString -> RedisBatch (Maybe ByteString)
get key = RedisBatch $ get key
set :: ByteString -> ByteString -> RedisBatch (R.Status)
set key val = RedisBatch $ set key val
runBatch :: R.Connection -> RedisBatch a -> IO (R.TxResult a)
runBatch conn (RedisBatch x) = R.runRedis conn (R.multiExec x)
If I wanted to abstract over transactional-or-not behaviour, as the library author has attempted to do, I'd write a second type RedisCmd exposing a monadic interface, and a class containing my primitive operations, with instances for my two RedisBatch and RedisCmd types.
class Redis f where
get :: ByteString -> f (Maybe ByteString)
set :: ByteString -> ByteString -> f (R.Status)
Now, computations with a type of (Applicative f, Redis f) => ... could work for either behaviour (transactional or not), but those which require a monad (Monad m, Redis m) => ... would only be able to run in non-transactional mode.
When all's said and done, I'm not convinced it's worth it. People seem to like building abstractions over libraries like this, invariably providing less functionality than the library did and writing more code for bugs to lurk in. Whenever someone says "I may want to switch databases" I sigh: the only sufficiently abstract abstraction for that purpose is one which provides no functionality. Worry about switching databases when the time comes that you need to (that is, never).
On the other hand, if your goal is not to abstract the database but just to clean up the interface, the best thing may be to fork the library.

Getting parallel IO while accounting for failure

I'm making several API calls that are encapsulated in a type alias:
type ConnectT a = EitherT String (RWST ConnectReader ConnectWriter ConnectState IO) a
Here's a simplified version of a function which connects to two separate APIs:
connectBoth :: ConnectT ()
connectBoth = do
a <- connectAPI SomeAPI someFunction
b <- connectAPI OtherAPI otherFunction
connectAPI OtherAPI (b `sendTo` a)
The final call in connectBoth is very time sensitive (and the transactions are of a financial nature). I figure a and b could be evaluated in parallel, and with lazy IO I should be able to do this:
b <- a `par` connectAPI OtherAPI otherFunction
The documentation for par says that it Indicates that it may be beneficial to evaluate the first argument in parallel with the second.
Does this work with IO?
Can I get any more guaranteed than "it may be beneficial?"
Or if I want greater guarantees will I need to use an MVar and liftIO . forkIO?
If I evaluate a first, I think I can use eitherT to check if a succeeded. But if I evaluate both at the same time I get confused. Here is the situation:
If only a failed, I will retry a, if that fails I will run a function that manually reverses b
If only b failed, I will retry b, write to the log in RWS and return left
if both fail write to the log in RWS and return left
if both succeed process c (which is not as time sensitive as a or b)
But if I evaluate both in parallel, then how can I identify which one failed? If I use eitherT immediately after a then a will evaluate first. If I use it after b then I won't be able to tell which one failed.
Is there a way I can evaluate the IO calls in parallel but respond differently depending on which one (if any) fails? Or am I left with a choice of parallelism vs failure mitigation?
The solution you are looking for will use forkIO and MVars.
par
par is for multiprocessor parallelism, it helps evaluate terms in parallel. It doesn't help with IO. If you do
do
a <- (someProcess :: IO a)
...
By the time you reach ... everything from the IO action has happened (if we ignore evil lazy IO) to a point that a can be determined entirely by ordinary evaluation. This means that by the time you do b <- someOtherProcess, all of someProcess is already done. It's too late to do anything in parallel.
EitherT
You can explicitly examine the Either e a result of an EitherT e m a. runEitherT :: EitherT e m a -> m (Either e a) makes the success or failure explicit in the underlying monad. We can lift that right back into EitherT to make a computation that always succeeds (sometimes with an error) from one that sometimes fails.
import Control.Monad.Trans.Class
examine :: (MonadTrans t, Monad m) => EitherT e m a -> t m (Either e a)
examine = lift . runEitherT
forkIO
The simplest solution for doing two things in IO is forkIO. It starts another lightweight thread that you can forget about.
If you run a value with your transformer stack, there will be four pieces of data when you are done. The state ConnectState, the written ConnectWriter log, whether the computation was successful, and, depending on whether or not it was successful, either the value or the error.
EitherT String (RWST ConnectReader ConnectWriter ConnectState IO) a
^ ^ ^ ^ ^
If we write out the structure of this, it looks like
(RWST ConnectReader ConnectWriter ConnectState IO) (Either String a)
^ ^ ^ ^ ^
ConnectReader -> ConnectState -> IO (Either String a, ConnectState, ConnectWriter)
^ ^ ^ ^ ^
All four of those pieces of information end up in the result of the IO action. If you fork your stack, you need to decide what to do with all of them when you join the results back together. You have already decided that you want to explicitly handle the Either String a. The ConnectWriters can probably be combined together with <>. You will need to decide what to do with ConnectState.
We'll make a fork that returns all four of these pieces of data by shoving them into an MVar.
import Control.Concurrent
import Control.Concurrent.MVar
import Control.Monad.IO.Class
forkConnectT :: ConnectT a -> ConnectT (MVar (Either String a, ConnectState, ConnectWriter))
forkConnectT cta = do
result <- liftIO newEmptyMVar
r <- lift ask
s <- lift get
liftIO $ forkIO $ do
state <- runRWST (runEitherT cta) r s
putMVar result state
return result
Later, when we want the result, we can try and see if it is ready. We'll handle the Either for success and failure explicitly, while handling the state and writer behind the scenes.
import Data.Traversable
tryJoinConnectT :: MVar (Either String a, ConnectState, ConnectWriter) -> ConnectT (Maybe (Either String a))
tryJoinConnectT result = liftIO (tryTakeMVar result) >>= traverse reintegrate
Behind the scenes we reintegrate the ConnectWriter by telling this ConnectT to write what was accumulated in the other thread. You will need to decide what to do to combine the two states.
reintegrate :: (a, ConnectState, ConnectWriter) -> ConnectT a
reintegrate (a, s, w) = do
-- Whatever needs to be done with the state.
-- stateHere <- lift get
lift $ tell w
return a
If we want to wait until the result is ready, we can block reading the MVar. This offers less opportunity for handling errors such as timeouts.
joinConnectT :: MVar (Either String a, ConnectState, ConnectWriter) -> ConnectT (Either String a)
joinConnectT result = liftIO (takeMVar result) >>= reintegrate
Example
Putting it all together, we can fork a task in parallel, do something in this thread explicitly examining the success or failure, join with the result from the other thread, and reason about what to do next with explicit Eithers representing success or failure from each process.
connectBoth :: ConnectT ()
connectBoth = do
bVar <- forkConnectT $ connectAPI OtherAPI otherFunction
a <- examine $ connectAPI SomeAPI someFunction
b <- joinConnectT bVar
...
Going farther
If you are paranoid, you will also want to handle exceptions (some of which can be handled by forkFinally) and asynchronous exceptions. You will need to decide whether to bundle these exceptions into your stack or treat IO like it can always throw exceptions.
Consider using async instead of forkIO and MVars.
monad-control (which you already have dependencies on via either) provides mechanisms for building up, one transformer at a time, the type that represents the state of a monad transformer stack. We wrote this by hand as (Either String a, ConnectState, ConnectWriter). If you are going to grow your transformer stack, you might want to get this from MonadTransControl instead. You can restore the state from the forked thread(see MonadBaseControl section) in the parent to inspect it. You will still need to decide how to deal with the data from the two states..

How do I create a thread pool?

Sometimes I want to run a maximum amount of IO actions in parallel at once for network-activity, etc. I whipped up a small concurrent thread function which works well with https://gist.github.com/810920, but this isn't really a pool as all IO actions must finish before others can start.
The type of what I'm looking for would be something like:
runPool :: Int -> [IO a] -> IO [a]
and should be able to operate on finite and infinite lists.
The pipes package looks like it would be able to achieve this quite well, but I feel there is probably a similar solution to the gist I have provided just using mvars, etc, from the haskell-platform.
Has anyone encountered an idiomatic solution without any heavy dependencies?
You need a thread pool, if you want something short, you could get inspiration from Control.ThreadPool (from the control-engine package which also provide more general functions), for instance threadPoolIO is just :
threadPoolIO :: Int -> (a -> IO b) -> IO (Chan a, Chan b)
threadPoolIO nr mutator = do
input <- newChan
output <- newChan
forM_ [1..nr] $
\_ -> forkIO (forever $ do
i <- readChan input
o <- mutator i
writeChan output o)
return (input, output)
It use two Chan for communication with the outside but that's usually what you want, it really help writing code that don't mess up.
If you absolutely want to wrap it up in a function of your type you can encapsulate the communication too :
runPool :: Int -> [IO a] -> IO [a]
runPool n as = do
(input, output) <- threadPoolIO n (id)
forM_ as $ writeChan input
sequence (repeat (length as) $ readChan output)
This won't keep the order of your actions, is that a problem (it's easy to correct by transmitting the index of the action or just using an array instead to store the responses) ?
Note : the n threads will stay alive forever with this simplistic version, adding a "killAll" returned action to threadPoolIO would resolve this problem handily if you intend to create and trash several of those pool in a long running application (if not, given the weight of threads in Haskell, it's probably not worth the bother).
Note that this function works on finite lists only, that's because IO is normally strict so you can't start to process elements of IO [a] before the whole list is produced, if you really want that you'll have either to use lazy IO with unsafeInterleaveIO (maybe not the best idea) or completely change your model and use something like conduits to stream your results.

Resources