I am learning haskell and learning monads. I've watched and read various tutorials and coded some simple examples for state monad, however I am not able to understand the following piece of code (taken from Haskell Wiki):
import Control.Monad.State
fib n = flip evalState (0,1) $ do
forM [0..(n-1)] $ \_ -> do
(a,b) <- get
put (b,a+b)
(a,b) <- get
return a
My question boils down to the following:
What is going inside the first statement of the inner do, i.e what does (a,b)<-get result into. What will be the values of a and b for some concrete example.
Why would you want to use the state monad over here?
In this example, the state is a pair containing the previous two numbers generated in the sequence. This is initially (0, 1) provided to evalState.
The type of get is MonadState s m => m s so in the inner do block
(a, b) <- get
fetches the state pair and binds a and b to the first and second elements respectively. The state is then updated in the following put.
The state will therefore be:
(0, 1), (1, 1), (1, 2), (3, 2), (3, 5), ...
The outer
(a, b) <- get
return a
unpacks the final state value and returns the first element.
First lets make clear the Fibonacci algorithm being used. The idea is to start with the tuple (0, 1), then find the next as (1, 0 + 1), the next as (1, 1 + 1), (2, 2 + 1), (3, 3 + 2), and so on. Generally, the step is \(a, b) -> (b, a + b). You can see that in these tuples are the Fibonacci numbers.
What is going inside the first statement of the inner do, i.e what
does (a,b)<-get result into?
Haskell does not have statements, only expressions.
y <- x is not a complete expression. It is similar to x >>= \y ->.
y <- x
m
Is a complete expression and is equivalent to x >>= \y -> m. A line n not of the form y <- n is equivalent to _ <- n (excluding let lines and maybe some others I forget).
Using this we can desugar do-notation.
fib n =
flip evalState (0, 1)
( forM
[0..(n-1)]
(\_ -> get >>= (\(a, b) -> put (b, a + b)))
>>= (\_ -> get >>= (\(a, b) -> return a)))
)
Now it is just about understanding >>=, return, get, put, and so on.
State is actually just functions of the type s -> (s, a). They take an initial state and yield the next state plus some other value.
m >>= n a.k.a. "bind" has the type Monad m => m a -> (a -> m b) -> m b. Then, if our Monad is State s, this is the same as:
m >>= n ::
( s -> (s, a))
-> (a -> s -> (s, b))
-> ( s -> (s, b))
The a returned by m has to be passed to n. What else can we guess? We expect the state to pass along as well, so the state returned by m must be passed to n as well. The function m >>= n must return the state and value that n returns. We then know how to implement bind:
m >>= n = uncurry (flip n) . m
return :: Monad m => a -> m a which is then equivalent to return :: a -> s -> (s, a):
return = flip (,)
get :: State s s is equivalent to get :: s -> (s, s):
get = join (,)
put :: s -> State s () or put :: s -> s -> (s, ()):
put s _ = (s, ())
evalState :: s -> State s a -> a or evalState :: s -> (s -> (s, a)) -> a:
evalState s f = snd (f s)
You can expand all the definitions and see exactly what is happening in the example. Just the intuitions should suffice though.
forM
[0..(n-1)]
(\_ -> get >>= (\(a, b) -> put (b, a + b)))
We don't care about having the numbers 0 to n - 1 so the first argument is dropped. get retrieves the current state, then put writes the new state. We do this n times.
>>= (\_ -> get >>= (\(a, b) -> return a)))
We don't care about the accumulated value (which is unit) and so the first parameter is dropped. Then we get the current state and project just the first element of the pair. This is the final answer we're looking for.
flip evalState (0, 1) …
Finally we run starting from the initial state (0, 1).
There are some cleanups we can make to this implementation. First, we don't care about the range [0..(n-1)], we just care about repeating an action n times. A more direct way to do this is the following:
replicateM n (get >>= \(a, b) -> put (b, a + b))
The result is a list of unit which is unused, so a more efficient version is:
replicateM_ n (get >>= \(a, b) -> put (b, a + b))
There is already a function for the common pattern of get followed by put named modify, which is defined as \f -> get >>= put . f. Therefore:
replicateM_ n (modify (\(a, b) -> (b, a + b)))
Then there is the part:
>>= (\_ -> get >>= (\(a, b) -> return a)))
Any time we don't care about the previous result we can use >>.
>> get >>= (\(a, b) -> return a))
This is:
>> get >>= return . fst
m >>= return . f simplifies to fmap f m:
>> fmap fst get
Now we have, in total:
fib n =
evalState
( replicateM_ n (modify (\(a, b) -> (b, a + b)))
>> fmap fst get
)
(0, 1)
We might also use, for comparison:
fib n =
fst
( evalState
( replicateM_ n (modify (\(a, b) -> (b, a + b)))
>> get
)
(0, 1)
)
And then because I am silly:
fib =
fst
. flip evalState (0, 1)
. (>> get)
. flip replicateM_ (modify (snd &&& uncurry (+)))
Why would you want to use the state monad over here?
You wouldn't. This is clear because we only use the state value; the other value is always unit and discarded. In other words, we only need n (i.e. which Fibonacci number to find) at the beginning and afterwards we only need the accumulated tuple.
Sometimes you think to have a string of compositions like h . g . f but you want to send two arguments through instead of just one. That is when State may be applicable.
If some functions read and some write the state (the second argument), or do both, then State fits the bill. If there are only readers then use Reader and if there are only writers then use Writer.
We can alter the example to make better use of the State Monad. I will make the tuple disappear!
fib =
flip evalState 0
. foldr (=<<) (return 1)
. flip replicate (\x -> get >>= \y -> put x $> x + y)
So the docs state: get :: m s -- Return the state from the internals of the monad (see here).
But I remember very well that when I tried to wrap my head around the State Monad this didn't help me a lot.
I can only recommend playing around with :i and :t in ghci and test out different sub-expressions. Just to get a feel for it. A bit like this:
import Control.Monad.State.Lazy
runState (get) 0
runState (get >>= \x -> put (x+1)) 0
:t return 1 :: State Int Int
runState (return 1) 0
runState (return 1 >>= \x -> (get >>= \y -> return (x+y))) 0
-- Keeping a pair of (predecessor/current) in the state:
let f = (get >>= (\(a,b) -> put (b,a+b))) :: State (Int, Int) ()
runState (f >> f >> f >> f >> f >> f) (0,1)
-- only keeping the predecessor in the state:
let f x = (get >>= (\y -> put x >> return (x+y))) :: State Int Int
runState (return 1 >>= f >>= f >>= f >>= f >>= f >>= f) 0
Also play around with modify, runState, evalState, execState.
Related
From what I understand now the WAY that a monad tries to simulate a global variable is by "storing" the variable (or input) in a function (and the function also is a variable), and storing the function into another function by "passing it"; in the end, all the functions in the chain become a single function that folds multiple smaller functions.
Example:
(+2) >>= \a-> (+3) >>= \b -> return(a+b)
Based on the tutorial, >>= works like this f >>= g = \w -> g (f w) w
so what I understand is we are getting a function that takes two inputs, a w (variable) and a function (that also uses that variable)
and output ((+3) w). Here, w was the input.
So this line:
(+2) >>= \a-> (+3) >>= \b -> return(a+b)
is equivalent to
\w -> (\(2+w) -> (+3) w) >>=\b -> return (a+b)
a
(and this 2+w function been "stored" in (\a->(+3))'s first input \a).
Similarly:
\w2 -> (\b ->return (...)) ( \w -> (\(2+w) -> (+3) w) ) w2
and \b is stored in the \w -> (\(2+w) -> (+3) w):
\w2 -> ( \(( \w -> (\(2+w) -> (+3) w) ) -> return(...)) w2
Then, the return (a+b) is not actually just simply doing return x = \_ -> x, but is also able to take functions (a and b) out of that big function \w2 -> ( \b(a inside) -> return(...)) w2 to get another big function:
( (\w -> 2+w) +(\w -> (\(2+w) -> (+3) w) )
Do I understand it correctly? Or am I entirely wrong. I didn't find any tutorial that talked about return (taking input from function). Monads are still very confusing to me.
Let’s go through your example step by step in two ways. First, let’s see what the do notation equivalent is:
do
a <- (+ 2)
b <- (+ 3)
return (a + b)
This is essentially an abstraction over the following repetitive pattern, where many different functions are all applied to the same shared “environment” value and then there’s some final result returned:
\ env -> let
a = (+ 2) env
b = (+ 3) env
in (a + b)
In fact, for consistency with the other actions, the return also receives the environment, and just ignores it:
\ env -> let
a = (+ 2) env
b = (+ 3) env
in const (a + b) env -- const x y = x
We can make this fit the Monad pattern by having >>= carry this environment along and pass it to each function ((+ 2), (+ 3), const (a + b)).
>>= takes two functions function and continue, where function is one of (+ 2) or (+ 3) above; >>= applies function to the environment, and then passes two things along to continue: the result of function, and the environment again, so that subsequent steps in continue can also read the environment.
-- Type:
(>>=) :: (Monad m) => m a -> (a -> m b) -> m b
-- or:
(>>=) :: (->) env a -> (a -> (->) env b) -> (->) env b
-- or:
(>>=) :: (env -> a) -> (a -> env -> b) -> env -> b
-- Implementation:
instance Monad ((->) env) where
function >>= continue = \ env -> continue (function env) env
-- or:
(>>=) = \ function continue env -> continue (function env) env
In fact, with the {-# LANGUAGE InstanceSigs #-} extension, you can write exactly the last type above as the signature for >>= in the Monad instance, and I recommend using that extension to help you implement typeclasses as you’re learning.
return takes a value x, and produces a function that fits in place of continue above, which, as I mentioned before, just ignores the environment and returns x.
return x = \ _env -> x
-- =
return = \ x _env -> x
-- =
return = const
Now secondly, if we go back to the desugared do notation, and add parentheses to clarify the scope of the lambdas:
(+ 2) >>= (\ a -> (+ 3) >>= (\ b -> return (a + b)))
Then we can inline those definitions of >>= and return to see how they work in this context. I’ve numbered the variables in each substitution to make the scopes a little more explicit.
(\ function1 continue1 env1 -> continue1 (function1 env1) env1)
(+ 2)
(\ a ->
(\ function2 continue2 env2 -> continue2 (function2 env2) env2)
(+ 3)
(\ b ->
(\ x env3 -> x)
(a + b)))
Then we can substitute all the variables throughout, by applying the functions to their arguments (beta-reduction):
let
function1 = (+ 2)
continue1 = \ a ->
let
function2 = (+ 3)
continue2 = \ b ->
let
x = (a + b)
in \ _env3 -> x
in \ env2 -> continue2 (function2 env2) env2
in \ env1 -> continue1 (function1 env1) env1)
-- =
\ env1 ->
(\ a env2 ->
(\ b _env3 -> a + b)
(env2 + 3)
env2)
(env1 + 2)
env1
-- =
\ env1 -> let
a = (env1 + 2)
env2 = env1
in let
b = (env2 + 3)
_env3 = env2
in (a + b)
Then we can omit all the redundant env variables, since they’re all equal to each other, and just used to pass the value along to all the functions:
\ env -> let
a = (env + 2)
b = (env + 3)
in (a + b)
And this is exactly the code we were trying to abstract away!
This is only one Monad instance, though; all other instances work differently based on the specific type involved. What all the basic instances have in common is that they abstract over some common repetitive pattern in pure code, e.g.:
-- State:
\ state1 -> let
(a, state2) = function1 state1
(b, state3) = function2 state2
(c, state4) = function3 state3
…
in (x, stateX)
-- Writer:
\ log1 -> let
(a, log2) = function1
(b, log3) = function2
(c, log4) = function3
…
in (x, log1 <> log2 <> log3 <> log4)
-- Maybe:
case function1 of
Just a -> case function2 of
Just b -> case function3 of
Just c -> …
Just x -> Just x
Nothing -> Nothing
Nothing -> Nothing
Nothing -> Nothing
Nothing -> Nothing
-- Either:
case function1 of
Right a -> case function2 of
Right b -> case function3 of
Right c -> …
Right x -> Right x
Left errorX -> Left errorX
Left error3 -> Left error3
Left error2 -> Left error2
Left error1 -> Left error1
-- List:
concatMap
(\ a -> concatMap
(\ b -> concatMap
(\ c -> concatMap
…
[x])
function3)
function2)
function1
Likewise, the basic monad transformers abstract over such patterns in monadic code.
-- MaybeT:
do
ma <- function1
case ma of
Just a -> do
mb <- function2
case mb of
Just b -> do
mc <- function3
case mc of
Just c -> …
Just x -> pure (Just x)
Nothing -> pure Nothing
Nothing -> pure Nothing
Nothing -> pure Nothing
Nothing -> pure Nothing
Nothing -> pure Nothing
The Monad/Applicative/Functor hierarchy provide a common way for data structures and control structures to abstract over some kind of sequential pattern like this so they can all be used with do notation and library functions like replicateM that work for any Monad.
It helps to look at lots of different examples, not just one instance, to get a feel for what they have in common and how they differ.
In your question, you are describing one specific monad, the "reader" monad. At the level of detail you're trying to understand it, this reader monad will behave differently from other monads.
That said, I think your understanding of the detailed operation of the reader monad is partially correct. However, you've missed a couple of things: you've messed up the precise order of operations in your example expression that's dictated by how lambda expressions are parsed, and you've sort of glossed over the "merge" step that combines the \w -> ... lambdas that are introduced by each >>= operator so they all refer to the same global variable.
To make it a little clearer, let me spell it out in detail, using simple transformations of expressions that match the way GHC would actually evaluate your code...
Let's take your example:
(+2) >>= \a -> (+3) >>= \b -> return (a+b)
and start by inserting some parenthesis. Even though the operator >>= is left associative, lambda expressions like \a -> ... suck up as much valid expression as they can, so the correct placement of parentheses in this expression is:
(+2) >>= ( \a -> ( (+3) >>= ( \b -> return (a+b) ) ) ) -- (1)
Now, as you note, the definition of >>= is:
f >>= g = \w -> g (f w) w
Also, the definition of return is:
return x = \w -> x
so let's work on the expression for the right-most occurrence of >>= in (1):
(+3) >>= ( \b -> return (a+b) )
= -- apply definition of `>>=` with f = (+3); g = \b -> return (a+b)
\w -> (\b -> return (a+b)) ((+3) w) w
= -- apply lambda expression (\b -> ...) to first argument ((+3) w)
\w -> (return (a + (+3) w)) w
= -- apply definition of `return` (with renamed variable w')
\w -> (\w' -> a + (+3) w) w
= -- apply lambda expression (\w' -> ...) to first argument w
\w -> a + (+3) w
So, we have:
(+3) >>= ( \b -> return (a+b) ) === \w -> a + (+3) w -- (2)
and, in a sense, the operator >>= works by passing the function (+3) as the argument b. However, it doesn't literally do this, because the argument b has numeric type, not function type. Instead, it "pulls out" a lambda for the "global variable" \w -> ... and then passes (+3) w as b, which works fine, since (+3) w has a numeric type.
Now, let's look at the original example (1) with the substitution (2):
-- from (1) and the subsitution (2):
(+2) >>= ( \a -> ( \w -> a + (+3) w ) )
= -- apply defn of >>= with f = (+2); g = \a -> (\w -> a + (+3) w)
\w' -> (\a -> (\w -> a + (+3) w)) ((+2) w') w'
= -- apply lambda (\a -> ...) to ((+2) w')
\w' -> (\w -> (+2) w' + (+3) w) w'
= -- apply lambda (\w -> ...) to w'
\w' -> (+2) w' + (+3) w'
and, again, in a sense, the operator >>= works by passing the function (+2) as argument a. Since it can't do it literatlly, it does it by pullling a lambda for the "global variable" \w' -> ... out and passing (+2) w' as a, and then it sort of "merges" the \w -> ... lambda pulled out by the rightmost >>= operator with the \w' -> ... it's added, so they point to the same global variable.
Obviously, if we added additional similar binds on the left:
(+8) >>= \e -> (+1) >>= \o -> (+2) >>= \a -> (+3) >>= \b -> return (e+o+a+b)
we'd get the same pattern of adding a new lambda \w' -> ... for the global variable and then "merging" it with the existing lambda \w -> ...:
\w -> (+8) w + (+1) w + (+2) w + (+3) w
How can I get a maximum element of an effectful container where computing attribute to compare against also triggers an effect?
There has to be more readable way of doing things like:
latest dir = Turtle.fold (z (ls dir)) Fold.maximum
z :: MonadIO m => m Turtle.FilePath -> m (UTCTime, Turtle.FilePath)
z mx = do
x <- mx
d <- datefile x
return (d, x)
I used overloaded version rather than non-overloaded maximumBy but the latter seems better suite for ad-hoc attribute selection.
How can I be more methodic in solving similar problems?
So I know nothing about Turtle; no idea whether this fits well with the rest of the Turtle ecosystem. But since you convinced me in the comments that maximumByM is worth writing by hand, here's how I would do it:
maximumOnM :: (Monad m, Ord b) => (a -> m b) -> [a] -> m a
maximumOnM cmp [x] = return x -- skip the effects if there's no need for comparison
maximumOnM cmp (x:xs) = cmp x >>= \b -> go x b xs where
go x b [] = return x
go x b (x':xs) = do
b' <- cmp x'
if b < b' then go x' b' xs else go x b xs
I generally prefer the *On versions of things -- which take a function that maps to an Orderable element -- to the *By versions -- which take a function that does the comparison directly. A maximumByM would be similar but have a type like Monad m => (a -> a -> m Ordering) -> [a] -> m a, but this would likely force you to redo effects for each a, and I'm guessing it's not what you want. I find *On more often matches with the thing I want to do and the performance characteristics I want.
Since you're already familiar with Fold, you might want to get to know FoldM, which is similar.
data FoldM m a b =
-- FoldM step initial extract
forall x . FoldM (x -> a -> m x) (m x) (x -> m b)
You can write:
maximumOnM ::
(Ord b, Monad m)
=> (a -> m b) -> FoldM m a (Maybe a)
maximumOnM f = FoldM combine (pure Nothing) (fmap snd)
where
combine Nothing a = do
f_a <- f a
pure (Just (f_a, a))
combine o#(Just (f_old, old)) new = do
f_new <- f new
if f_new > f_old
then pure $ Just (f_new, new)
else pure o
Now you can use Foldl.foldM to run the fold on a list (or other Foldable container). Like Fold, FoldM has an Applicative instance, so you can combine multiple effectful folds into one that interleaves the effects of each of them and combines their results.
It's possible to run effects on foldables using reducers package.
I'm not sure if it's correct, but it leverages existing combinators and instances (except for Bounded (Maybe a)).
import Data.Semigroup.Applicative (Ap(..))
import Data.Semigroup.Reducer (foldReduce)
import Data.Semigroup (Max(..))
import System.IO (withFile, hFileSize, IOMode(..))
-- | maxLength
--
-- >>> getMax $ maxLength ["abc","a","hello",""]
-- 5
maxLength :: [String] -> (Max Int)
maxLength = foldReduce . map (length)
-- | maxLengthIO
--
-- Note, this runs IO...
--
-- >>> (getAp $ maxLengthIO ["package.yaml", "src/Lib.hs"]) >>= return . getMax
-- Just 1212
--
-- >>> (getAp $ maxLengthIO []) >>= return . getMax
-- Nothing
maxLengthIO :: [String] -> Ap IO (Max (Maybe Integer))
maxLengthIO xs = foldReduce (map (fmap Just . f) xs) where
f :: String -> IO Integer
f s = withFile s ReadMode hFileSize
instance Ord a => Bounded (Maybe a) where
maxBound = Nothing
minBound = Nothing
I have a function that pattern matches on its arguments to produce a computation in StateT () Maybe (). This computation can fail when run, in which case I want the current pattern match branch to fail, so to speak.
I highly doubt it's possible to have something like
compute :: Int -> StateT () Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f (Just n1) (Just n2) = do
m <- compute (n1 + n2)
guard (m == 42)
f (Just n) _ = do
m <- compute n
guard (m == 42)
f _ (Just n) = do
m <- compute n
guard (m == 42)
behave in the way I want it to: When the first computation fails due to the guard or somewhere in compute, I want f to try the next pattern.
Obviously the above can't work, because StateT (as any other monad might) involves an additional parameter when expanded, so I probably can't formulate this as simple pattern guards.
The following does what I want, but it's ugly:
f' :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f' a b = asum (map (\f -> f a b) [f1, f2, f3])
where
f1 a b = do
Just n1 <- pure a
Just n2 <- pure b
m <- compute (n1 + n2)
guard (m == 42)
f2 a _ = do
Just n <- pure a
m <- compute n
guard (m == 42)
f3 _ b = do
Just n <- pure b
m <- compute n
guard (m == 42)
A call like execStateT (f (Just 42) (Just 1)) () would fail for f but return Just () for f', because it matches f2.
How do I get the behavior of f' while having elegant pattern matching with as little auxiliary definitions as possible like in f? Are there other, more elegant ways to formulate this?
Complete runnable example:
#! /usr/bin/env stack
-- stack --resolver=lts-11.1 script
import Control.Monad.Trans.State
import Control.Applicative
import Control.Monad
import Data.Foldable
compute :: Int -> StateT () Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f (Just n1) (Just n2) = do
m <- compute (n1 + n2)
guard (m == 42)
f (Just n) _ = do
m <- compute n
guard (m == 42)
f _ (Just n) = do
m <- compute n
guard (m == 42)
f' :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f' a b = asum (map (\f -> f a b) [f1, f2, f3])
where
f1 a b = do
Just n1 <- pure a
Just n2 <- pure b
m <- compute (n1 + n2)
guard (m == 42)
f2 a _ = do
Just n <- pure a
m <- compute n
guard (m == 42)
f3 _ b = do
Just n <- pure b
m <- compute n
guard (m == 42)
main = do
print $ execStateT (f (Just 42) (Just 1)) () -- Nothing
print $ execStateT (f' (Just 42) (Just 1)) () -- Just (), because `f2` succeeded
Edit: I elicited quite some clever answers with this question so far, thanks! Unfortunately, they mostly suffer from overfitting to the particular code example I've given. In reality, I need something like this for unifying two expressions (let-bindings, to be precise), where I want to try unifying the RHS of two simultaneous lets if possible and fall through to the cases where I handle let bindings one side at a time by floating them. So, actually there's no clever structure on Maybe arguments to exploit and I'm not computeing on Int actually.
The answers so far might benefit others beyond the enlightenment they brought me though, so thanks!
Edit 2: Here's some compiling example code with probably bogus semantics:
module Unify (unify) where
import Control.Applicative
import Control.Monad.Trans.State.Strict
data Expr
= Var String -- meta, free an bound vars
| Let String Expr Expr
-- ... more cases
-- no Eq instance, fwiw
-- | If the two terms unify, return the most general unifier, e.g.
-- a substitution (`Map`) of meta variables for terms as association
-- list.
unify :: [String] -> Expr -> Expr -> Maybe [(String, Expr)]
unify metaVars l r = execStateT (go [] [] l r) [] -- threads the current substitution as state
where
go locals floats (Var x) (Var y)
| x == y = return ()
go locals floats (Var x) (Var y)
| lookup x locals == Just y = return ()
go locals floats (Var x) e
| x `elem` metaVars = tryAddSubstitution locals floats x e
go locals floats e (Var y)
| y `elem` metaVars = tryAddSubstitution locals floats y e
-- case in point:
go locals floats (Let x lrhs lbody) (Let y rrhs rbody) = do
go locals floats lrhs rrhs -- try this one, fail current pattern branch if rhss don't unify
-- if we get past the last statement, commit to this branch, no matter
-- the next statement fails or not
go ((x,y):locals) floats lbody rbody
-- try to float the let binding. terms mentioning a floated var might still
-- unify with a meta var
go locals floats (Let x rhs body) e = do
go locals (Left (x,rhs):floats) body e
go locals floats e (Let y rhs body) = do
go locals (Right (y,rhs):floats) body e
go _ _ _ _ = empty
tryAddSubstitution = undefined -- magic
When I need something like this, I just use asum with the blocks inlined. Here I also condensed the multiple patterns Just n1 <- pure a; Just n2 <- pure b into one, (Just n1, Just n2) <- pure (a, b).
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f a b = asum
[ do
(Just n1, Just n2) <- pure (a, b)
m <- compute (n1 + n2)
guard (m == 42)
, do
Just n <- pure a
m <- compute n
guard (m == 42)
, do
Just n <- pure b
m <- compute n
guard (m == 42)
]
You can also use chains of <|>, if you prefer:
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f a b
= do
(Just n1, Just n2) <- pure (a, b)
m <- compute (n1 + n2)
guard (m == 42)
<|> do
Just n <- pure a
m <- compute n
guard (m == 42)
<|> do
Just n <- pure b
m <- compute n
guard (m == 42)
This is about as minimal as you can get for this kind of “fallthrough”.
If you were using Maybe alone, you would be able to do this with pattern guards:
import Control.Monad
import Control.Applicative
ensure :: Alternative f => (a -> Bool) -> a -> f a
ensure p a = a <$ guard (p a)
compute :: Int -> Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> Maybe Int
f (Just m) (Just n)
| Just x <- ensure (== 42) =<< compute (m + n)
= return x
f (Just m) _
| Just x <- ensure (== 42) =<< compute m
= return x
f _ (Just n)
| Just x <- ensure (== 42) =<< compute n
= return x
f _ _ = empty
(ensure is a general purpose combinator. Cf. Lift to Maybe using a predicate)
As you have StateT on the top, though, you would have to supply a state in order to pattern match on Maybe, which would foul up everything. That being so, you are probably better off with something in the vein of your "ugly" solution. Here is a whimsical attempt at improving its looks:
import Control.Monad
import Control.Applicative
import Control.Monad.State
import Control.Monad.Trans
import Data.Foldable
ensure :: Alternative f => (a -> Bool) -> a -> f a
ensure p a = a <$ guard (p a)
compute :: Int -> StateT () Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> StateT () Maybe Int
f a b = asum (map (\c -> f' (c a b)) [liftA2 (+), const, flip const])
where
f' = ensure (== 42) <=< compute <=< lift
While this is an answer specific to the snippet I've given, the refactorings only apply limited to the code I was facing.
Perhaps it's not that far-fetched of an idea to extract the skeleton of the asum expression above to a more general combinator:
-- A better name would be welcome.
selector :: Alternative f => (a -> a -> a) -> (a -> f b) -> a -> a -> f b
selector g k x y = asum (fmap (\sel -> k (sel x y)) [g, const, flip const])
f :: Maybe Int -> Maybe Int -> StateT () Maybe Int
f = selector (liftA2 (+)) (ensure (== 42) <=< compute <=< lift)
Though it is perhaps a bit awkward of a combinator, selector does show the approach is more general than it might appear at first: the only significant restriction is that k has to produce results in some Alternative context.
P.S.: While writing selector with (<|>) instead of asum is arguably more tasteful...
selector g k x y = k (g x y) <|> k x <|> k y
... the asum version straightforwardly generalises to an arbitrary number of pseudo-patterns:
selector :: Alternative f => [a -> a -> a] -> (a -> f b) -> a -> a -> f b
selector gs k x y = asum (fmap (\g -> k (g x y)) gs)
It looks like you could get rid of the whole pattern match by relying on the fact that Int forms a Monoid with addition and 0 as the identity element, and that Maybe a forms a Monoid if a does. Then your function becomes:
f :: Maybe Int -> Maybe Int -> StateT () Maybe Int
f a b = pure $ a <> b >>= compute >>= pure . mfilter (== 42)
You could generalise by passing the predicate as an argument:
f :: Monoid a => (a -> Bool) -> Maybe a -> Maybe a -> StateT () Maybe a
f p a b = pure $ a <> b >>= compute >>= pure . mfilter p
The only thing is that compute is now taking a Maybe Int as input, but that is just a matter of calling traverse inside that function with whatever computation you need to do.
Edit: Taking into account your last edit, I find that if you spread your pattern matches into separate computations that may fail, then you can just write
f a b = f1 a b <|> f2 a b <|> f3 a b
where f1 (Just a) (Just b) = compute (a + b) >>= check
f1 _ _ = empty
f2 (Just a) _ = compute a >>= check
f2 _ _ = empty
f3 _ (Just b) = compute b >>= check
f3 _ _ = empty
check x = guard (x == 42)
Learn You a Haskell presents the addStuff function:
import Control.Monad.Instances
addStuff :: Int -> Int
addStuff = do
a <- (*2) -- binds (*2) to a
b <- (+10) -- binds (+10) to b
return (a+b) -- return has type sig: 'Monad m => a -> m a'
Are the types of a, b, and return (a+b) all Int -> Int? I think so, but I'm not sure how bind-ing plays a role.
I tried to implement it using >>=, but I'm not sure how to complete it (hence ...).
addStuff' :: Int -> Int
addStuff' = (*2) >>= (+10) >>= ...
Please give me a hint to complete it, as well as edit my understanding of the do notation version.
As I understand, the ... needs to include a type of Int -> Int. In the do version, I could use a and b, but I'm not sure how to add them with the >>= version.
When working with the reader monad (a.k.a. the function monad), you have the type a -> b, which can be rewritten as (->) a b. The actual monad instance here is
instance Monad ((->) r) where
return x = const x
f >>= g = \r -> g (f r) r
Notice that during >>=, the type is
(>>=) :: ((->) r a) -> (a -> ((->) r b)) -> ((->) r b)
Which can be rewritten as
(>>=) :: (r -> a) -> (a -> (r -> b)) -> (r -> b)
Or even
(>>=) :: (r -> a) -> (a -> r -> b) -> (r -> b)
So as you can see, what >>= does is take a single input, apply that to f, and then apply that result to g to produce a new function r -> b. So for your example, you could use:
addStuff' :: Int -> Int
addStuff' = (*2) >>= (+)
And so addStuff' 10 == 30, since it performs the computation (10 * 2) + (10). Note how 10 is fed both to (*2) and (+), and the result of (10*2) is fed to (+) as well. It might make things a little more clear to see it as
test :: Int -> (Int, Int, Int)
test = do
x <- (*2)
y <- (*3)
z <- (*5)
return (x, y, z)
And it's result would be
> test 1
(2, 3, 5)
> test 10
(20, 30, 50)
What this essentially is doing is taking the argument to test "before" it's been applied, feeding it to each of the functions on the right hand side of the <-s, and then combining that result in the return.
So how can you write these without do notation? You could do something like
test :: Int -> (Int, Int, Int)
test =
(\r -> r * 2) >>= (\x ->
(\r -> r * 3) >>= (\y ->
(\r -> r * 5) >>= (\z ->
return (x, y, z))))
Which, admittedly, is not very readable, even with formatting, but the gist is basically that r gets fed to each intermediate function, which produces a result, and a couple nested lambda expressions later you return all three of those results in a tuple.
With a bit of simplification, you could also make each of those nested lambdas into two arguments lambdas:
test =
(\r -> r * 2) >>=
(\x r -> r * 3) >>=
(\y r -> r * 5) >>=
(\z r -> const (x, y, z) r)
I've also replaced the last \z -> return (x, y, z) with its equivalent \z -> const (x, y, z) => \z r -> const (x, y, z) r, just so they all have the same form.
As a rough rule if you want to manually desugar do-notation, first erase the do at the top and flip the bind arrow (<-) on the left-hand-side to a (>>=) on the right-hand-side with the variable on the left as a lambda variable on the right. So:
addStuff :: Int -> Int
addStuff = do
a <- (*2)
... rest ...
Becomes:
addStuff :: Int -> Int
addStuff =
(*2) >>= (\a ->
... rest ...
)
This is recursive, so the next term in the do-notation then becomes nested in the lambda of the desugared term above it, all the way down to the last expression which is just the body of the nested lambda expression.
The desugaring is quite mechanical, it's defined by the following rewrites, where ; denotes a newline.
do { a <- f ; m } ≡ f >>= \a -> do { m }
do { f ; m } ≡ f >> do { m }
do { m } ≡ m
Both a and b are of type Int while return (a+b) has type Int -> Int which is the last term in the do-notation so it has to be identical to the toplevel signature. Using -XScopedTypeVariables we can manually annotate the subterms:
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Monad.Instances
addStuff :: Int -> Int
addStuff = do
(a :: Int) <- (*2)
(b :: Int) <- (+10)
(return (a+b)) :: Int -> Int
Thanks to bheklilr.
I wrote my own code.
addStuff :: Int -> Int
addStuff = (\r -> r * 2) >>= (\x ->
(\r -> r + 10) >>= (\y ->
return (x + y)))
I wanted to make a generic function that folds over a wide range of inputs (see Making a single function work on lists, ByteStrings and Texts (and perhaps other similar representations)). As one answer suggested, the ListLike is just for that. Its FoldableLL class defines an abstraction for anything that is foldable. However, I need a monadic fold. So I need to define foldM in terms of foldl/foldr.
So far, my attempts failed. I tried to define
foldM'' :: (Monad m, LL.FoldableLL full a) => (b -> a -> m b) -> b -> full -> m b
foldM'' f z = LL.foldl (\acc x -> acc >>= (`f` x)) (return z)
but it runs out of memory on large inputs - it builds a large unevaluated tree of computations. For example, if I pass a large text file to
main :: IO ()
main = getContents >>= foldM'' idx 0 >> return ()
where
-- print the current index if 'a' is found
idx !i 'a' = print i >> return (i + 1)
idx !i _ = return (i + 1)
it eats up all memory and fails.
I have a feeling that the problem is that the monadic computations are composed in a wrong order - like ((... >>= ...) >>= ...) instead of (... >>= (... >>= ...)) but so far I didn't find out how to fix it.
Workaround: Since ListLike exposes mapM_, I constructed foldM on ListLikes by wrapping the accumulator into the state monad:
modifyT :: (Monad m) => (s -> m s) -> StateT s m ()
modifyT f = get >>= \x -> lift (f x) >>= put
foldLLM :: (LL.ListLike full a, Monad m) => (b -> a -> m b) -> b -> full -> m b
foldLLM f z c = execStateT (LL.mapM_ (\x -> modifyT (\b -> f b x)) c) z
While this works fine on large data sets, it's not very nice. And it doesn't answer the original question, if it's possible to define it on data that are just FoldableLL (without mapM_).
So the goal is to reimplement foldM using either foldr or foldl. Which one should it be? We want the input to be processed lazily and allow for infinte lists, this rules out foldl. So foldr is it going to be.
So here is the definition of foldM from the standard library.
foldM :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m a
foldM _ a [] = return a
foldM f a (x:xs) = f a x >>= \fax -> foldM f fax xs
The thing to remember about foldr is that its arguments simply replace [] and : in the list (ListLike abstracts over that, but it still serves as a guiding principle).
So what should [] be replaced with? Clearly with return a. But where does a come from? It won’t be the initial a that is passed to foldM – if the list is not empty, when foldr reaches the end of the list, the accumulator should have changed. So we replace [] by a function that takes an accumulator and returns it in the underlying monad: \a -> return a (or simply return). This also gives the type of the thing that foldr will calculate: a -> m a.
And what should we replace : with? It needs to be a function b -> (a -> m a) -> (a -> m a), taking the first element of the list, the processed tail (lazily, of course) and the current accumulator. We can figure it out by taking hints from the code above: It is going to be \x rest a -> f a x >>= rest. So our implementation of foldM will be (adjusting the type variables to match them in the code above):
foldM'' :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m a
foldM'' f z list = foldr (\x rest a -> f a x >>= rest) return list z
And indeed, now your program can consume arbitrary large input, spitting out the results as you go.
We can even prove, inductively, that the definitions are semantically equal (although we should probably do coinduction or take-induction to cater for infinite lists).
We want to show
foldM f a xs = foldM'' f a xs
for all xs :: [b]. For xs = [] we have
foldM f a []
≡ return a -- definition of foldM
≡ foldr (\x rest a -> f a x >>= rest) return [] a -- definition of foldr
≡ foldM'' f a [] -- definition of foldM''
and, assuming we have it for xs, we show it for x:xs:
foldM f a (x:xs)
≡ f a x >>= \fax -> foldM f fax xs --definition of foldM
≡ f a x >>= \fax -> foldM'' f fax xs -- induction hypothesis
≡ f a x >>= \fax -> foldr (\x rest a -> f a x >>= rest) return xs fax -- definition of foldM''
≡ f a x >>= foldr (\x rest a -> f a x >>= rest) return xs -- eta expansion
≡ foldr (\x rest a -> f a x >>= rest) return (x:xs) -- definition of foldr
≡ foldM'' f a (x:xs) -- definition of foldM''
Of course this equational reasoning does not tell you anything about the performance properties you were interested in.