I have a function that pattern matches on its arguments to produce a computation in StateT () Maybe (). This computation can fail when run, in which case I want the current pattern match branch to fail, so to speak.
I highly doubt it's possible to have something like
compute :: Int -> StateT () Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f (Just n1) (Just n2) = do
m <- compute (n1 + n2)
guard (m == 42)
f (Just n) _ = do
m <- compute n
guard (m == 42)
f _ (Just n) = do
m <- compute n
guard (m == 42)
behave in the way I want it to: When the first computation fails due to the guard or somewhere in compute, I want f to try the next pattern.
Obviously the above can't work, because StateT (as any other monad might) involves an additional parameter when expanded, so I probably can't formulate this as simple pattern guards.
The following does what I want, but it's ugly:
f' :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f' a b = asum (map (\f -> f a b) [f1, f2, f3])
where
f1 a b = do
Just n1 <- pure a
Just n2 <- pure b
m <- compute (n1 + n2)
guard (m == 42)
f2 a _ = do
Just n <- pure a
m <- compute n
guard (m == 42)
f3 _ b = do
Just n <- pure b
m <- compute n
guard (m == 42)
A call like execStateT (f (Just 42) (Just 1)) () would fail for f but return Just () for f', because it matches f2.
How do I get the behavior of f' while having elegant pattern matching with as little auxiliary definitions as possible like in f? Are there other, more elegant ways to formulate this?
Complete runnable example:
#! /usr/bin/env stack
-- stack --resolver=lts-11.1 script
import Control.Monad.Trans.State
import Control.Applicative
import Control.Monad
import Data.Foldable
compute :: Int -> StateT () Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f (Just n1) (Just n2) = do
m <- compute (n1 + n2)
guard (m == 42)
f (Just n) _ = do
m <- compute n
guard (m == 42)
f _ (Just n) = do
m <- compute n
guard (m == 42)
f' :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f' a b = asum (map (\f -> f a b) [f1, f2, f3])
where
f1 a b = do
Just n1 <- pure a
Just n2 <- pure b
m <- compute (n1 + n2)
guard (m == 42)
f2 a _ = do
Just n <- pure a
m <- compute n
guard (m == 42)
f3 _ b = do
Just n <- pure b
m <- compute n
guard (m == 42)
main = do
print $ execStateT (f (Just 42) (Just 1)) () -- Nothing
print $ execStateT (f' (Just 42) (Just 1)) () -- Just (), because `f2` succeeded
Edit: I elicited quite some clever answers with this question so far, thanks! Unfortunately, they mostly suffer from overfitting to the particular code example I've given. In reality, I need something like this for unifying two expressions (let-bindings, to be precise), where I want to try unifying the RHS of two simultaneous lets if possible and fall through to the cases where I handle let bindings one side at a time by floating them. So, actually there's no clever structure on Maybe arguments to exploit and I'm not computeing on Int actually.
The answers so far might benefit others beyond the enlightenment they brought me though, so thanks!
Edit 2: Here's some compiling example code with probably bogus semantics:
module Unify (unify) where
import Control.Applicative
import Control.Monad.Trans.State.Strict
data Expr
= Var String -- meta, free an bound vars
| Let String Expr Expr
-- ... more cases
-- no Eq instance, fwiw
-- | If the two terms unify, return the most general unifier, e.g.
-- a substitution (`Map`) of meta variables for terms as association
-- list.
unify :: [String] -> Expr -> Expr -> Maybe [(String, Expr)]
unify metaVars l r = execStateT (go [] [] l r) [] -- threads the current substitution as state
where
go locals floats (Var x) (Var y)
| x == y = return ()
go locals floats (Var x) (Var y)
| lookup x locals == Just y = return ()
go locals floats (Var x) e
| x `elem` metaVars = tryAddSubstitution locals floats x e
go locals floats e (Var y)
| y `elem` metaVars = tryAddSubstitution locals floats y e
-- case in point:
go locals floats (Let x lrhs lbody) (Let y rrhs rbody) = do
go locals floats lrhs rrhs -- try this one, fail current pattern branch if rhss don't unify
-- if we get past the last statement, commit to this branch, no matter
-- the next statement fails or not
go ((x,y):locals) floats lbody rbody
-- try to float the let binding. terms mentioning a floated var might still
-- unify with a meta var
go locals floats (Let x rhs body) e = do
go locals (Left (x,rhs):floats) body e
go locals floats e (Let y rhs body) = do
go locals (Right (y,rhs):floats) body e
go _ _ _ _ = empty
tryAddSubstitution = undefined -- magic
When I need something like this, I just use asum with the blocks inlined. Here I also condensed the multiple patterns Just n1 <- pure a; Just n2 <- pure b into one, (Just n1, Just n2) <- pure (a, b).
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f a b = asum
[ do
(Just n1, Just n2) <- pure (a, b)
m <- compute (n1 + n2)
guard (m == 42)
, do
Just n <- pure a
m <- compute n
guard (m == 42)
, do
Just n <- pure b
m <- compute n
guard (m == 42)
]
You can also use chains of <|>, if you prefer:
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f a b
= do
(Just n1, Just n2) <- pure (a, b)
m <- compute (n1 + n2)
guard (m == 42)
<|> do
Just n <- pure a
m <- compute n
guard (m == 42)
<|> do
Just n <- pure b
m <- compute n
guard (m == 42)
This is about as minimal as you can get for this kind of “fallthrough”.
If you were using Maybe alone, you would be able to do this with pattern guards:
import Control.Monad
import Control.Applicative
ensure :: Alternative f => (a -> Bool) -> a -> f a
ensure p a = a <$ guard (p a)
compute :: Int -> Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> Maybe Int
f (Just m) (Just n)
| Just x <- ensure (== 42) =<< compute (m + n)
= return x
f (Just m) _
| Just x <- ensure (== 42) =<< compute m
= return x
f _ (Just n)
| Just x <- ensure (== 42) =<< compute n
= return x
f _ _ = empty
(ensure is a general purpose combinator. Cf. Lift to Maybe using a predicate)
As you have StateT on the top, though, you would have to supply a state in order to pattern match on Maybe, which would foul up everything. That being so, you are probably better off with something in the vein of your "ugly" solution. Here is a whimsical attempt at improving its looks:
import Control.Monad
import Control.Applicative
import Control.Monad.State
import Control.Monad.Trans
import Data.Foldable
ensure :: Alternative f => (a -> Bool) -> a -> f a
ensure p a = a <$ guard (p a)
compute :: Int -> StateT () Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> StateT () Maybe Int
f a b = asum (map (\c -> f' (c a b)) [liftA2 (+), const, flip const])
where
f' = ensure (== 42) <=< compute <=< lift
While this is an answer specific to the snippet I've given, the refactorings only apply limited to the code I was facing.
Perhaps it's not that far-fetched of an idea to extract the skeleton of the asum expression above to a more general combinator:
-- A better name would be welcome.
selector :: Alternative f => (a -> a -> a) -> (a -> f b) -> a -> a -> f b
selector g k x y = asum (fmap (\sel -> k (sel x y)) [g, const, flip const])
f :: Maybe Int -> Maybe Int -> StateT () Maybe Int
f = selector (liftA2 (+)) (ensure (== 42) <=< compute <=< lift)
Though it is perhaps a bit awkward of a combinator, selector does show the approach is more general than it might appear at first: the only significant restriction is that k has to produce results in some Alternative context.
P.S.: While writing selector with (<|>) instead of asum is arguably more tasteful...
selector g k x y = k (g x y) <|> k x <|> k y
... the asum version straightforwardly generalises to an arbitrary number of pseudo-patterns:
selector :: Alternative f => [a -> a -> a] -> (a -> f b) -> a -> a -> f b
selector gs k x y = asum (fmap (\g -> k (g x y)) gs)
It looks like you could get rid of the whole pattern match by relying on the fact that Int forms a Monoid with addition and 0 as the identity element, and that Maybe a forms a Monoid if a does. Then your function becomes:
f :: Maybe Int -> Maybe Int -> StateT () Maybe Int
f a b = pure $ a <> b >>= compute >>= pure . mfilter (== 42)
You could generalise by passing the predicate as an argument:
f :: Monoid a => (a -> Bool) -> Maybe a -> Maybe a -> StateT () Maybe a
f p a b = pure $ a <> b >>= compute >>= pure . mfilter p
The only thing is that compute is now taking a Maybe Int as input, but that is just a matter of calling traverse inside that function with whatever computation you need to do.
Edit: Taking into account your last edit, I find that if you spread your pattern matches into separate computations that may fail, then you can just write
f a b = f1 a b <|> f2 a b <|> f3 a b
where f1 (Just a) (Just b) = compute (a + b) >>= check
f1 _ _ = empty
f2 (Just a) _ = compute a >>= check
f2 _ _ = empty
f3 _ (Just b) = compute b >>= check
f3 _ _ = empty
check x = guard (x == 42)
OK, so I'm trying to write a monad for building test data, but I can't quite get it to work how I want it. It looks something like this:
runBuildM :: [i] -> BuildM i o x -> [o]
-- Given a list of i, build a list of o.
source :: BuildM i o i
-- Fetch unique i.
yield :: o -> BuildM i o ()
-- Return a new o to the caller.
gather :: BuildM i o x -> BuildM i o o
-- Fetch every possible o from sub-computation.
local :: BuildM i o x -> BuildM i o x
-- Isolate any source invocations from the rest of the code.
In other words, it's a supply monad, writer monad and list monad. The idea is that I can write something like this:
build_tests depth = do
local $ do
v <- source
yield v
yield (map toLower v)
yield "[]"
yield "()"
when (depth > 2) $ do
t1 <- gather $ build_tests (depth-1)
yield $ "(" ++ t1 ++ ")"
yield $ "[" ++ t1 ++ "]"
t2 <- gather $ build_tests (depth-1)
yield $ "(" ++ t1 ++ "," ++ t2 ++ ")"
The idea is to generate all possible combinations of data. You can do that just using list comprehensions, but the result ends up syntactically awful. This is much more readable. Unfortunately, it doesn't actually work...
The problem seems to boil down to the local function not behaving correctly. The intention is for any source calls in the sub-computation to have no effect outside of it. (I.e., subsequent calls to source from outside the local block get the first token again.) However, what my implementation of local actually does is reset the next token for everything (i.e., including the contents of the sub-computation). This is clearly incorrect, but I cannot for the life of me bend my mind around how to make it work correctly.
The fact that I'm having this much trouble getting the code to work as required probably means the actual internal representation of my monad is just wrong. Can anybody take a stab at implementing this correctly?
EDIT: I should perhaps have realised this, but I didn't actually specify the expected result I'm trying to get. The above code is supposed to produce this:
["A", "a", "[]", "()", "(A)", "(a)", "[A]", "[a]", "(A, B)", "(A, b)", "(a, B)", "(a, b)"]
It's not super-critical that the results appear in exactly this order. I'd like the single cases to appear before the compound ones, but I'm not too fussed exactly what order the compounds appear. The rule is that the same variable never appears twice in any individual expression.
If we allow the depth to be a bit deeper, we additionally get terms such as
"((A))", "([A])", "[(A)]", "((A, B), C)", "(A, (B, C))"
and so on.
It's clearly broken, but here's what I have so far:
newtype BuildM i o x = BuildM ([i] -> SEQ.Seq ([i], SEQ.Seq o, x))
instance Functor (BuildM i o) where
fmap uf (BuildM sf) =
BuildM $ \ is0 -> do
(is1, os, x) <- sf is0
return (is1, os, uf x)
instance Applicative (BuildM i o) where
pure x = BuildM $ \ is0 -> return (is0, SEQ.empty, x)
BuildM sf1 <*> BuildM sf2 =
BuildM $ \ is1 -> do
(is2, os2, f) <- sf1 is1
(is3, os3, x) <- sf2 is2
return (is3, os2 >< os3, f x)
instance Monad (BuildM i o) where
return = pure
BuildM sf1 >>= uf =
BuildM $ \ is1 -> do
(is2, os2, x) <- sf1 is1
let BuildM sf2 = uf x
(is3, os3, y) <- sf2 is2
return (is3, os2 >< os3, y)
runBuildM :: [i] -> BuildM i o x -> [o]
runBuildM is0 (BuildM sf) =
toList $ do
(is, os, x) <- sf is0
os
source :: BuildM i o i
source =
BuildM $ \ is ->
if null is
then error "AHC.Tests.TestBuilder.source: end of input"
else return (tail is, SEQ.empty, head is)
yield :: o -> BuildM i o ()
yield o = BuildM $ \ is -> return (is, SEQ.singleton o, () )
gather :: BuildM i o x -> BuildM i o' o
gather (BuildM sf1) =
BuildM $ \ is1 -> do
(is2, os2, _) <- sf1 is1
o <- os2
return (is2, SEQ.empty, o)
local :: BuildM i o x -> BuildM i o ()
local (BuildM sf1) =
BuildM $ \ is1 ->
let os = do (is2, os2, x) <- sf1 is1; os2
in return (is1, os, () )
You are trying to reinvent pipes. Your source and yield are pipes await and yield. The other two concerns you are trying to handle are a ReaderT and a WriterT respectively. If you put the entire list of inputs in the environment of the ReaderT you can run local sub computations that start over at the beginning of the list. You can collect all of the results from a sub-computation by adding a WriterT layer to collect the output.
For the nice syntax with gather you are trying to recreate ListT.
Pipes, Readers, and Writers
We're going to use all of the following in very short order.
import Data.Functor.Identity
import Data.Foldable
import Control.Monad
import Control.Monad.Morph
import Control.Monad.IO.Class
import Control.Monad.Trans.Class
import Control.Monad.Trans.Reader hiding (local)
import Control.Monad.Trans.Writer.Strict
import Pipes.Core
import Pipes ((>->))
import qualified Pipes as P
import qualified Pipes.Prelude as P
import Pipes.Lift (runWriterP, runReaderP)
Your builder is a Pipe i o over a Reader [i] which allows you to reset at the beginning of the input. We will define two versions of it, BuildT which is a monad transformer, and BuildM which is a monad. BuildM is jsut the transformer applied to Identity.
type BuildT e i o m r = Pipe i o (ReaderT e m) r
type BuildM e i o r = BuildT e i o Identity r
local runs a builder feeding it the entire input read from the environment. We might want to give this a different name to avoid conflicting with local defined for ReaderT
local :: (Monad m, Foldable f) => BuildT (f i) i o m () -> Proxy a' a () o (ReaderT (f i) m) ()
local subDef = do
e <- lift ask
hoist lift $ runReaderP e $
P.each e >->
subDef
To collect the results of the sub computations, we take advantage of the fact that pipes are so pure that you can swap out the underlying monad provided you have a natural transformation forall x. m x -> n x. The proxies from pipes have an MFunctor instance that provides a function hoist :: (forall x. m x -> n x) -> Proxy a' a b' b m r -> Proxy a' a b' b n r; it lets us lift all of the underlying monad operations under a pipe to use the pipe over another transformer, in this case WriterT.
collect :: (Monad m) => Proxy a' a () b m r -> Proxy a' a c' c m ([b], r)
collect subDef = do
(r, w) <- runWriterP $
hoist lift subDef //> \x -> lift $ tell (++[x])
return (w [], r)
To run a builder we feed it all of the input from the environment, provide the initial environment, collect the results, and run the entire pipe.
runBuildT :: (Monad m) => [i] -> BuildT [i] i o m () -> m [o]
runBuildT e = runEffect . fmap fst . collect . runReaderP e . local
Running the monad instead of the transformer is just
runBuildM :: [i] -> BuildM [i] i o () -> [o]
runBuildM e = runIdentity . runBuildT e
ListT
This section lets us use do-notation when generating all combinations of things. It's equivalent to using pipes' for in place of each >>= and yield in place of each return.
The syntax that gathers all of the results from a sub computation is reinventing ListT. a ListT m a holds a Producer a m () that only returns data downstream. Pipes that get data from upstream and return data downstream don't fit into the Producer b m (). This will take a bit of conversion.
We can convert a Proxy that has both an upstream and a downstream interface into one with only a downstream interface wrapped around another proxy with the upstream interface. To do so, we hoist the underlying monad into our new inner upstream proxy, then replace all of the requests in the outer downstream proxy with requests lifted from the inner upstream proxy.
floatRespond :: (Monad m) => Proxy a' a b' b m r -> Proxy c' c b' b (Proxy a' a d' d m) r
floatRespond = (lift . request >\\) . hoist lift
These can be converted into a ListT. We'll discard any returned data to get a more polymorphic type.
gather :: (Monad m) => Proxy a' a () b m r -> P.ListT (Proxy a' a c' c m) b
gather = P.Select . floatRespond . (>>= return . const ())
Using ListT is a bit cumbersome to use; you need an mplus between returns to get both outputs. It's frequently convenient to shove a proxy into a ListT so you can lift . yield instead of returning. We are going to discard all our ListT results an rely on the output coming from lift . yield.enumeratejust runs aListT` wrapped around anything, discarding all the results
enumerate = P.runListT
Example
We're now equipped to write and run your example. I thing you mean for source to get one value from the source and for yield to return one value. If you don't need to be able to get values one at a time your question is over-specified and this answer is overkill.
source = P.await
yield = P.yield
In the example, where we use gather to build lists, we run that portion of the code with enumerate and yield results with lift . yield.
import Data.Char
build_tests :: Monad m => Int -> BuildT [String] String String m ()
build_tests depth = do
local $ do
v <- source
yield $ v
yield $ (map toLower v)
yield "[]"
yield "()"
when (depth > 2) $ enumerate $ do
t1 <- gather $ build_tests (depth-1)
lift . yield $ "(" ++ t1 ++ ")"
lift . yield $ "[" ++ t1 ++ "]"
t2 <- gather $ build_tests (depth-1)
lift . yield $ "(" ++ t1 ++ "," ++ t2 ++ ")"
If we run this example with the input ["A", "B"] the "B" input is never used because source is only ever used once inside each local.
main = do
putStrLn "Depth 2"
print =<< runBuildT ["A", "B"] (build_tests 2)
putStrLn "Depth 3"
print =<< runBuildT ["A", "B"] (build_tests 3)
The output for depths less than 4 is small enough to repeat here.
["A","a","[]","()"]
Depth 3
["A","a","[]","()","(A)","[A]","(A,A)","(A,a)","(A,[])","(A,())","(a)","[a]","(a,A)","(a,a)","(a,[])","(a,())","([])","[[]]","([],A)","([],a)","([],[])","([],())","(())","[()]","((),A)","((),a)","((),[])","((),())"]
This might be overkill
I suspect that you might have meant for source to get everything from the source.
source = gather P.cat
yield = P.yield
If we use this for the example instead of getting a single item from the source we'll enumerate the first local block and yield results by returning in ListT.
build_tests :: Monad m => Int -> BuildT [String] String String m ()
build_tests depth = do
local $ enumerate $ do
v <- source
lift . yield $ v
lift . yield $ (map toLower v)
yield "[]"
yield "()"
when (depth > 2) $ enumerate $ do
t1 <- gather $ build_tests (depth-1)
lift . yield $ "(" ++ t1 ++ ")"
lift . yield $ "[" ++ t1 ++ "]"
t2 <- gather $ build_tests (depth-1)
lift . yield $ "(" ++ t1 ++ "," ++ t2 ++ ")"
This uses both source values when we run the example with two sources.
Depth 2
["A","a","B","b","[]","()"]
Depth 3
["A","a","B","b","[]","()","(A)","[A]","(A,A)","(A,a)","(A,B)","(A,b)","(A,[])","(A,())","(a)","[a]","(a,A)","(a,a)","(a,B)","(a,b)","(a,[])","(a,())","(B)","[B]","(B,A)","(B,a)","(B,B)","(B,b)","(B,[])","(B,())","(b)","[b]","(b,A)","(b,a)","(b,B)","(b,b)","(b,[])","(b,())","([])","[[]]","([],A)","([],a)","([],B)","([],b)","([],[])","([],())","(())","[()]","((),A)","((),a)","((),B)","((),b)","((),[])","((),())"]
If you never get a single value from the source you could just use ListT (ReaderT [i] m) o instead. You might still want a proxy to avoid messing around with mplus.
You are trying to reinvent pipes and some nice syntax for building lists. The problem is much simpler than how you characterize it. The source of strings can be completely separated from building the structures.
You want to generate structures that draw symbols from some source. Without worrying about the source, let's build the structures. Each structure is a Pipe that will draw from some source and yield strings to concatenate together to build the expression.
import Data.Char
import Data.Functor.Identity
import Pipes.Core
import Pipes ((>->))
import qualified Pipes as P
import qualified Pipes.Prelude as P
build_structures :: Int -> [Pipe String String Identity ()]
build_structures depth = gather $ do
yield $ P.take 1
yield $ P.map (map toLower) >-> P.take 1
when (depth > 2) $ do
t1 <- lift $ build_structures (depth - 1)
yield $ P.yield "(" >> t1 >> P.yield ")"
yield $ P.yield "[" >> t1 >> P.yield "]"
t2 <- lift $ build_structures (depth - 1)
yield $ P.yield "(" >> t1 >> P.yield "," >> t2 >> P.yield ")"
This code uses the ContT yield trick from the continuation answer.
We run one of these structures by feeding it the symbols and concatenating the results.
run :: Pipe String String Identity () -> String
run p = concat . P.toList $ P.each symbols >-> p
-- an infinite source of unique symbols
symbols :: [String]
symbols = drop 1 symbols'
where
symbols' = [""] ++ do
tail <- symbols'
first <- ['A'..'Z']
return (first : tail)
The examples produce the desired strings. I'll leave producing the two special cases "[]" and "()", which do not appear in recursive terms, as an exercise.
import Data.Functor
main = do
putStrLn "Depth 2"
print $ run <$> build_structures 2
putStrLn "Depth 3"
print $ run <$> build_structures 3
putStrLn "Depth 4"
print $ run <$> build_structures 4
This results in
Depth 2
["A","a"]
Depth 3
["A","a","(A)","[A]","(A,B)","(A,b)","(a)","[a]","(a,B)","(a,b)"]
Depth 4
["A","a","(A)","[A]","(A,B)","(A,b)","(A,(B))","(A,[B])","(A,(B,C))","(A,(B,c))","(A,(b))","(A,[b])","(A,(b,C))","(A,(b,c))","(a)","[a]","(a,B)","(a,b)","(a,(B))","(a,[B])","(a,(B,C))","(a,(B,c))","(a,(b))","(a,[b])",...
If my other answer is overkill, the continuation monad transformer provides a convenient way to construct any MonadPlus values.
The continuation monad lets us easily capture the idea of doing something mplus the as yet unknown remainder.
import Control.Monad
import Control.Monad.Trans.Cont
once :: MonadPlus m => m a -> ContT a m ()
once m = ContT $ \k -> m `mplus` k ()
Yielding a result is just returning it once.
yield :: MonadPlus m => a -> ContT a m ()
yield = once . return
We can gather up all the results by sticking mzero at the end.
gather :: MonadPlus m => ContT a m r -> m a
gather m = runContT m (const mzero)
Your example is written in terms of yield, gather, once, and lift.
import Data.Char
import Control.Monad.Trans.Class
build_tests :: MonadPlus m => m String -> Int -> ContT String m ()
build_tests source = go
where
go depth = do
once . gather $ do
v <- lift source
yield v
yield (map toLower v)
yield "[]"
yield "()"
when (depth > 2) $ do
t1 <- lift . gather $ go (depth-1)
yield $ "(" ++ t1 ++ ")"
yield $ "[" ++ t1 ++ "]"
t2 <- lift . gather $ go (depth-1)
yield $ "(" ++ t1 ++ "," ++ t2 ++ ")"
main = print . gather $ build_tests ["A", "B"] 3
This outputs the following:
Depth 2
["A","a","B","b","[]","()"]
Depth 3
["A","a","B","b","[]","()","(A)","[A]","(A,A)","(A,a)","(A,B)","(A,b)","(A,[])","(A,())","(a)","[a]","(a,A)","(a,a)","(a,B)","(a,b)","(a,[])","(a,())","(B)","[B]","(B,A)","(B,a)","(B,B)","(B,b)","(B,[])","(B,())","(b)","[b]","(b,A)","(b,a)","(b,B)","(b,b)","(b,[])","(b,())","([])","[[]]","([],A)","([],a)","([],B)","([],b)","([],[])","([],())","(())","[()]","((),A)","((),a)","((),B)","((),b)","((),[])","((),())"]
I've taken the liberty of getting rid of the requirement to read the original source from the environment for simplicity. You can add a ReaderT to the transformer stack to get it back. I also haven't chosen a list transfomer for you, the example is running using the ordinary list monad. Since it's written in terms of MonadPlus it will work for any (MonadTrans t, MonadPlus (t m)) => t m as well.
Can someone show a simple example where state monad can be better than passing state directly?
bar1 (Foo x) = Foo (x + 1)
vs
bar2 :: State Foo Foo
bar2 = do
modify (\(Foo x) -> Foo (x + 1))
get
State passing is often tedious, error-prone, and hinders refactoring. For example, try labeling a binary tree or rose tree in postorder:
data RoseTree a = Node a [RoseTree a] deriving (Show)
postLabel :: RoseTree a -> RoseTree Int
postLabel = fst . go 0 where
go i (Node _ ts) = (Node i' ts', i' + 1) where
(ts', i') = gots i ts
gots i [] = ([], i)
gots i (t:ts) = (t':ts', i'') where
(t', i') = go i t
(ts', i'') = gots i' ts
Here I had to manually label states in the right order, pass the correct states along, and had to make sure that both the labels and child nodes are in the right order in the result (note that naive use of foldr or foldl for the child nodes could have easily led to incorrect behavior).
Also, if I try to change the code to preorder, I have to make changes that are easy to get wrong:
preLabel :: RoseTree a -> RoseTree Int
preLabel = fst . go 0 where
go i (Node _ ts) = (Node i ts', i') where -- first change
(ts', i') = gots (i + 1) ts -- second change
gots i [] = ([], i)
gots i (t:ts) = (t':ts', i'') where
(t', i') = go i t
(ts', i'') = gots i' ts
Examples:
branch = Node ()
nil = branch []
tree = branch [branch [nil, nil], nil]
preLabel tree == Node 0 [Node 1 [Node 2 [],Node 3 []],Node 4 []]
postLabel tree == Node 4 [Node 2 [Node 0 [],Node 1 []],Node 3 []]
Contrast the state monad solution:
import Control.Monad.State
import Control.Applicative
postLabel' :: RoseTree a -> RoseTree Int
postLabel' = (`evalState` 0) . go where
go (Node _ ts) = do
ts' <- traverse go ts
i <- get <* modify (+1)
pure (Node i ts')
preLabel' :: RoseTree a -> RoseTree Int
preLabel' = (`evalState` 0) . go where
go (Node _ ts) = do
i <- get <* modify (+1)
ts' <- traverse go ts
pure (Node i ts')
Not only is this code more succinct and easier to write correctly, the logic that results in pre- or postorder labeling is far more transparent.
PS: bonus applicative style:
postLabel' :: RoseTree a -> RoseTree Int
postLabel' = (`evalState` 0) . go where
go (Node _ ts) =
flip Node <$> traverse go ts <*> (get <* modify (+1))
preLabel' :: RoseTree a -> RoseTree Int
preLabel' = (`evalState` 0) . go where
go (Node _ ts) =
Node <$> (get <* modify (+1)) <*> traverse go ts
As an example to my comment above, you can write code using the State monad like
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TemplateHaskell #-}
import Data.Text (Text)
import qualified Data.Text as Text
import Control.Monad.State
data MyState = MyState
{ _count :: Int
, _messages :: [Text]
} deriving (Eq, Show)
makeLenses ''MyState
type App = State MyState
incrCnt :: App ()
incrCnt = modify (\my -> my & count +~ 1)
logMsg :: Text -> App ()
logMsg msg = modify (\my -> my & messages %~ (++ [msg]))
logAndIncr :: Text -> App ()
logAndIncr msg = do
incrCnt
logMsg msg
app :: App ()
app = do
logAndIncr "First step"
logAndIncr "Second step"
logAndIncr "Third step"
logAndIncr "Fourth step"
logAndIncr "Fifth step"
Note that using additional operators from Control.Lens also lets you write incrCnt and logMsg as
incrCnt = count += 1
logMsg msg = messages %= (++ [msg])
which is another benefit of using State in combination with the lens library, but for the sake of comparison I'm not using them in this example. To write the equivalent code above with just argument passing it would look more like
incrCnt :: MyState -> MyState
incrCnt my = my & count +~ 1
logMsg :: MyState -> Text -> MyState
logMsg my msg = my & messages %~ (++ [msg])
logAndIncr :: MyState -> Text -> MyState
logAndIncr my msg =
let incremented = incrCnt my
logged = logMsg incremented msg
in logged
At this point it isn't too bad, but once we get to the next step I think you'll see where the code duplication really comes in:
app :: MyState -> MyState
app initial =
let first_step = logAndIncr initial "First step"
second_step = logAndIncr first_step "Second step"
third_step = logAndIncr second_step "Third step"
fourth_step = logAndIncr third_step "Fourth step"
fifth_step = logAndIncr fourth_step "Fifth step"
in fifth_step
Another benefit of wrapping this up in a Monad instance is that you can use the full power of Control.Monad and Control.Applicative with it:
app = mapM_ logAndIncr [
"First step",
"Second step",
"Third step",
"Fourth step",
"Fifth step"
]
Which allows for much more flexibility when processing values calculated at runtime compared to static values.
The difference between manual state passing and using the State monad is simply that the State monad is an abstraction over the manual process. It also happens to fit several other widely used more general abstractions, like Monad, Applicative, Functor, and a few others. If you also use the StateT transformer then you can compose these operations with other monads, such as IO. Can you do all of this without State and StateT? Of course you can, and there's no one stopping you from doing so, but the point is that State abstracts this pattern out and gives you access to a huge toolbox of more general tools. Also, a small modification to the types above makes the same functions work in multiple contexts:
incrCnt :: MonadState MyState m => m ()
logMsg :: MonadState MyState m => Text -> m ()
logAndIncr :: MonadState MyState m => Text -> m ()
These will now work with App, or with StateT MyState IO, or any other monad stack with a MonadState implementation. It makes it significantly more reusable than simple argument passing, which is only possible through the abstraction that is StateT.
In my experience, the point of many Monads doesn't really click until you get into larger examples, so here is an example use of State (well, StateT ... IO) to parse an incoming request to a web service.
The pattern is that this web service can be called with a bunch of options of different types, though all except for one of the options have decent defaults. If I get a incoming JSON request with an unknown key value, I should abort with an appropriate message. I use the state to keep track of what the current config is, and what the remainder of the JSON request is, along with a bunch of accessor methods.
(Based on code currently in production, with the names of everything changed and the details of what this service actually does obscured)
{-# LANGUAGE OverloadedStrings #-}
module XmpConfig where
import Data.IORef
import Control.Arrow (first)
import Control.Monad
import qualified Data.Text as T
import Data.Aeson hiding ((.=))
import qualified Data.HashMap.Strict as MS
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.State (execStateT, StateT, gets, modify)
import qualified Data.Foldable as DF
import Data.Maybe (fromJust, isJust)
data Taggy = UseTags Bool | NoTags
newtype Locale = Locale String
data MyServiceConfig = MyServiceConfig {
_mscTagStatus :: Taggy
, _mscFlipResult :: Bool
, _mscWasteTime :: Bool
, _mscLocale :: Locale
, _mscFormatVersion :: Int
, _mscJobs :: [String]
}
baseWebConfig :: IO (IORef [String], IORef [String], MyServiceConfig)
baseWebConfig = do
infoRef <- newIORef []
warningRef <- newIORef []
let cfg = MyServiceConfig {
_mscTagStatus = NoTags
, _mscFlipResult = False
, _mscWasteTime = False
, _mscLocale = Locale "en-US"
, _mscFormatVersion = 1
, _mscJobs = []
}
return (infoRef, warningRef, cfg)
parseLocale :: T.Text -> Maybe Locale
parseLocale = Just . Locale . T.unpack -- The real thing does more
parseJSONReq :: MS.HashMap T.Text Value ->
IO (IORef [String], IORef [String], MyServiceConfig)
parseJSONReq m = liftM snd
(baseWebConfig >>= (\c -> execStateT parse' (m, c)))
where
parse' :: StateT (MS.HashMap T.Text Value,
(IORef [String], IORef [String], MyServiceConfig))
IO ()
parse' = do
let addWarning s = do let snd3 (_, b, _) = b
r <- gets (snd3 . snd)
liftIO $ modifyIORef r (++ [s])
-- These two functions suck a key/value off the input map and
-- pass the value on to the handler "h"
onKey k h = onKeyMaybe k $ DF.mapM_ h
onKeyMaybe k h = do myb <- gets fst
modify $ first $ MS.delete k
h (MS.lookup k myb)
-- Access the "lns" field of the configuration
config setter = modify (\(a, (b, c, d)) -> (a, (b, c, setter d)))
onKey "tags" $ \x -> case x of
Bool True -> config $ \c -> c {_mscTagStatus = UseTags False}
String "true" -> config $ \c -> c {_mscTagStatus = UseTags False}
Bool False -> config $ \c -> c {_mscTagStatus = NoTags}
String "false" -> config $ \c -> c {_mscTagStatus = NoTags}
String "inline" -> config $ \c -> c {_mscTagStatus = UseTags True}
q -> addWarning ("Bad value ignored for tags: " ++ show q)
onKey "reverse" $ \x -> case x of
Bool r -> config $ \c -> c {_mscFlipResult = r}
q -> addWarning ("Bad value ignored for reverse: " ++ show q)
onKey "spin" $ \x -> case x of
Bool r -> config $ \c -> c {_mscWasteTime = r}
q -> addWarning ("Bad value ignored for spin: " ++ show q)
onKey "language" $ \x -> case x of
String s | isJust (parseLocale s) ->
config $ \c -> c {_mscLocale = fromJust $ parseLocale s}
q -> addWarning ("Bad value ignored for language: " ++ show q)
onKey "format" $ \x -> case x of
Number 1 -> config $ \c -> c {_mscFormatVersion = 1}
Number 2 -> config $ \c -> c {_mscFormatVersion = 2}
q -> addWarning ("Bad value ignored for format: " ++ show q)
onKeyMaybe "jobs" $ \p -> case p of
Just (Array x) -> do q <- parseJobs x
config $ \c -> c {_mscJobs = q}
Just (String "test") ->
config $ \c -> c {_mscJobs = ["test1", "test2"]}
Just other -> fail $ "Bad value for jobs: " ++ show other
Nothing -> fail "Missing value for jobs"
m' <- gets fst
unless (MS.null m') (fail $ "Unrecognized key(s): " ++ show (MS.keys m'))
parseJobs :: (Monad m, DF.Foldable b) => b Value -> m [String]
parseJobs = DF.foldrM (\a b -> liftM (:b) (parseJob a)) []
parseJob :: (Monad m) => Value -> m String
parseJob (String s) = return (T.unpack s)
parseJob q = fail $ "Bad job value: " ++ show q