I'm currently preparing for my final exam regarding Haskell, and I am going over the Monads and we were giving an example such as:
Given the following definition for the List Monad:
instance Monad [] where
m >>= f = concatMap f m
return x = [x]
where the types of (>>=) and concatMap are
(>>=) :: [a] -> (a -> [b]) -> [b]
concatMap :: (a -> [b]) -> [a] -> [b]
What is the result of the expression?
> [1,2,3] >>= \x -> [x..3] >>= \y -> return x
[1, 1, 1, 2, 2, 3] //Answer
Here the answer is different from what I thought it to be, now we briefly went over Monads, but from what I understand (>>=) is called bind and could be read in the expression above as "applyMaybe". In this case for the first part of bind we get [1,2,3,2,3,3] and we continue to the second part of the bind, but return x is defined to return the list of x. Which should have been [1,2,3,2,3,3]. However, I might have misunderstood the expression. Can anyone explain the wrong doing of my approach and how should I have tackled this. Thanks.
this case for the first part of bind we get [1,2,3,2,3,3]
Correct.
and we continue to the second part of the bind, but "return x" is defined to return the list of x. Which should have been [1,2,3,2,3,3].
Note that, in...
[1,2,3] >>= \x -> [x..3] >>= \y -> return x
... x is bound by (the lambda of) the first (>>=), and not by the second one. Some extra parentheses might make that clearer:
[1,2,3] >>= (\x -> [x..3] >>= (\y -> return x))
In \y -> return x, the values bound to y (that is, the elements of [1,2,3,2,3,3]) are ignored, and replaced by the corresponding values bound to x (that is, the elements of the original list from which each y was generated). Schematically, we have:
[1, 2, 3] -- [1,2,3]
[1,2,3, 2,3, 3] -- [1,2,3] >>= \x -> [x..3]
[1,1,1, 2,2, 3] -- [1,2,3] >>= \x -> [x..3] >>= \y -> return x
First, let's be clear how this expression is parsed: lambdas are syntactic heralds, i.e. they grab as much as they can to their right, using it as the function result. So what you have there is parsed as
[1,2,3] >>= (\x -> ([x..3] >>= \y -> return x))
The inner expression is actually written more complicated than it should be. y isn't used at all, and a >>= \_ -> p can just be written as a >> p. There's an even better replacement though: generally, the monadic bind a >>= \q -> return (f q) is equivalent to fmap f a, so your expression should really be written
[1,2,3] >>= (\x -> (fmap (const x) [x..3]))
or
[1,2,3] >>= \x -> map (const x) [x..3]
or
[1,2,3] >>= \x -> replicate (3-x+1) x
At this point it should be pretty clear what the result will be, since >>= in the list monad simply maps over each element and concatenates the results.
Related
I am from python world, but try to use functional as much as possible, and change my imperative thinking.
Now i study haskell and found that
list = [(x,y) | x<-[1,2,3], y<-[4,5,6]]
are translated to
list' =
[1,2,3] >>= \x ->
[4,5,6] >>= \y ->
return (x,y)
I try to understand step by step processing of list monad binds chain:
bind defind as:
xs >>= f = concat (map f xs)
and bind is left associative
So as I understend, at start first bind ([1,2,3] >>= \x -> [4,5,6]) executed with result [4,5,6,4,5,6,4,5,6]
And then next bind
[4,5,6,4,5,6,4,5,6] >>= \y -> return (x,y) executed
But how it can see x inside lambda if it's allready calculated ?? x is just argument that has no concrete value at all (lambda can be call with varios arguments at any time, how can we se it outside and fixed??). If it somehow can see it how can it know about that x call history was changed with 1,2,3 ? in my understanding onces first bind calculation done, only result [4,5,6,4,5,6,4,5,6] avaible and it goes next, to first argument of next bind.
So I can't understand how can i read this construction, and how it step by step logicaly produce correct result?
This is a common source of confusion. The scope of the lambdas extends as far as possible, so this:
[1,2,3] >>= \x ->
[4,5,6] >>= \y ->
return (x,y)
Is equivalent to this:
[1,2,3] >>= (\x ->
[4,5,6] >>= (\y ->
return (x,y)))
So the inner lambda \y -> … is inside the scope of the outer lambda, \x -> …, meaning x and y are both in scope. You can then inline the definitions of >>= and return for the [] monad instance and step through the evaluation:
concatMap (\x ->
concatMap (\y ->
[(x,y)]) [4,5,6]) [1,2,3]
concat
[ concatMap (\y -> [(1,y)]) [4,5,6]
, concatMap (\y -> [(2,y)]) [4,5,6]
, concatMap (\y -> [(3,y)]) [4,5,6]
]
concat
[ concat [[(1,4)], [(1,5)], [(1,6)]]
, concat [[(2,4)], [(2,5)], [(2,6)]]
, concat [[(3,4)], [(3,5)], [(3,6)]]
]
concat
[ [(1,4), (1,5), (1,6)]
, [(2,4), (2,5), (2,6)]
, [(3,4), (3,5), (3,6)]
]
[ (1,4), (1,5), (1,6)
, (2,4), (2,5), (2,6)
, (3,4), (3,5), (3,6)
]
A common and more succinct way of expressing this is with the Applicative instance instead of the Monad instance, using the <$> (fmap) and <*> (ap) operators or liftA2:
(,) <$> [1,2,3] <*> [4,5,6]
liftA2 (,) [1,2,3] [4,5,6]
Suppose I have following code
do {x <- (Just 3); y <- (Just 5); return (x:y:[])}
Which outputs Just [3,5]
How does haskell know that output value should be in Maybe monad? I mean return could output [[3, 5]].
do {x <- (Just 3); y <- (Just 5); return (x:y:[])}
desugars to
Just 3 >>= \x -> Just 5 >>= \y -> return $ x:y:[]
Since the type of >>= is Monad m => m a -> (a -> m b) -> m b and per argument Just 3 (alternatively Just 5) we have m ~ Maybe, the return type of the expression must be some Maybe type.
There is a possibility to make this return [[3, 5]] using something called natural transformations from category theory. Because there exists a natural transformation from Maybe a to [a], namely
alpha :: Maybe a -> [a]
alpha Nothing = []
alpha (Just a) = [a]
we have that your desired function is simply the natural transformation applied to the result:
alpha (Just 3 >>= \x -> Just 5 >>= \y -> return $ x:y:[])
-- returns [[3, 5]]
Since this is a natural transformation, you can also apply alpha first and your function second:
alpha (Just 3) >>= \x -> alpha (Just 5) >>= \y -> return $ x:y:[]
-- returns [[3, 5]]
As #duplode pointed out, you can find alpha in the package Data.Maybe as maybeToList.
I wanted to make a generic function that folds over a wide range of inputs (see Making a single function work on lists, ByteStrings and Texts (and perhaps other similar representations)). As one answer suggested, the ListLike is just for that. Its FoldableLL class defines an abstraction for anything that is foldable. However, I need a monadic fold. So I need to define foldM in terms of foldl/foldr.
So far, my attempts failed. I tried to define
foldM'' :: (Monad m, LL.FoldableLL full a) => (b -> a -> m b) -> b -> full -> m b
foldM'' f z = LL.foldl (\acc x -> acc >>= (`f` x)) (return z)
but it runs out of memory on large inputs - it builds a large unevaluated tree of computations. For example, if I pass a large text file to
main :: IO ()
main = getContents >>= foldM'' idx 0 >> return ()
where
-- print the current index if 'a' is found
idx !i 'a' = print i >> return (i + 1)
idx !i _ = return (i + 1)
it eats up all memory and fails.
I have a feeling that the problem is that the monadic computations are composed in a wrong order - like ((... >>= ...) >>= ...) instead of (... >>= (... >>= ...)) but so far I didn't find out how to fix it.
Workaround: Since ListLike exposes mapM_, I constructed foldM on ListLikes by wrapping the accumulator into the state monad:
modifyT :: (Monad m) => (s -> m s) -> StateT s m ()
modifyT f = get >>= \x -> lift (f x) >>= put
foldLLM :: (LL.ListLike full a, Monad m) => (b -> a -> m b) -> b -> full -> m b
foldLLM f z c = execStateT (LL.mapM_ (\x -> modifyT (\b -> f b x)) c) z
While this works fine on large data sets, it's not very nice. And it doesn't answer the original question, if it's possible to define it on data that are just FoldableLL (without mapM_).
So the goal is to reimplement foldM using either foldr or foldl. Which one should it be? We want the input to be processed lazily and allow for infinte lists, this rules out foldl. So foldr is it going to be.
So here is the definition of foldM from the standard library.
foldM :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m a
foldM _ a [] = return a
foldM f a (x:xs) = f a x >>= \fax -> foldM f fax xs
The thing to remember about foldr is that its arguments simply replace [] and : in the list (ListLike abstracts over that, but it still serves as a guiding principle).
So what should [] be replaced with? Clearly with return a. But where does a come from? It won’t be the initial a that is passed to foldM – if the list is not empty, when foldr reaches the end of the list, the accumulator should have changed. So we replace [] by a function that takes an accumulator and returns it in the underlying monad: \a -> return a (or simply return). This also gives the type of the thing that foldr will calculate: a -> m a.
And what should we replace : with? It needs to be a function b -> (a -> m a) -> (a -> m a), taking the first element of the list, the processed tail (lazily, of course) and the current accumulator. We can figure it out by taking hints from the code above: It is going to be \x rest a -> f a x >>= rest. So our implementation of foldM will be (adjusting the type variables to match them in the code above):
foldM'' :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m a
foldM'' f z list = foldr (\x rest a -> f a x >>= rest) return list z
And indeed, now your program can consume arbitrary large input, spitting out the results as you go.
We can even prove, inductively, that the definitions are semantically equal (although we should probably do coinduction or take-induction to cater for infinite lists).
We want to show
foldM f a xs = foldM'' f a xs
for all xs :: [b]. For xs = [] we have
foldM f a []
≡ return a -- definition of foldM
≡ foldr (\x rest a -> f a x >>= rest) return [] a -- definition of foldr
≡ foldM'' f a [] -- definition of foldM''
and, assuming we have it for xs, we show it for x:xs:
foldM f a (x:xs)
≡ f a x >>= \fax -> foldM f fax xs --definition of foldM
≡ f a x >>= \fax -> foldM'' f fax xs -- induction hypothesis
≡ f a x >>= \fax -> foldr (\x rest a -> f a x >>= rest) return xs fax -- definition of foldM''
≡ f a x >>= foldr (\x rest a -> f a x >>= rest) return xs -- eta expansion
≡ foldr (\x rest a -> f a x >>= rest) return (x:xs) -- definition of foldr
≡ foldM'' f a (x:xs) -- definition of foldM''
Of course this equational reasoning does not tell you anything about the performance properties you were interested in.
I found precedence and associativity is a big obstacle for me to understand what the grammar is trying to express at first glance to haskell code.
For example,
blockyPlain :: Monad m => m t -> m t1 -> m (t, t1)
blockyPlain xs ys = xs >>= \x -> ys >>= \y -> return (x, y)
By experiment, I finally got it means,
blockyPlain xs ys = xs >>= (\x -> (ys >>= (\y -> return (x, y))))
instead of
blockyPlain xs ys = xs >>= (\x -> ys) >>= (\y -> return (x, y))
Which works as:
*Main> blockyPlain [1,2,3] [4,5,6]
[(1,4),(1,5),(1,6),(2,4),(2,5),(2,6),(3,4),(3,5),(3,6)]
I can get info from ghci for (>>=) as an operator, (infixl 1 >>=).
But there's no information for -> since it's not an operator.
Could someone of you guys give some reference to make this grammar thing easier to grasp?
The rule for lambdas is pretty simple: the body of the lambda extends as far to the right as possible without hitting an unbalanced parenthesis.
f (\x -> foo (bar baz) *** quux >>= quuxbar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
body
A good rule of thumb seems to be that you can never make a custom operator that has precedence over built in syntactic constructs. For instance consider this example:
if b then f *** x else f *** y
Regardless of the associativity of ***, no one would expect it to binds as:
(if b then f *** x else f) *** y
There aren't a lot of syntactic constructs in Haskell (do and case are a little special because of layout syntax) but let can be used as another example:
(let x = y in y *** x) /= ((let x = y in y) *** x)
UPDATE: Okay this question becomes potentially very straightforward.
q <- mapM return [1..]
Why does this never return?
Does mapM not lazily deal with infinite lists?
The code below hangs. However, if I replace line A by line B, it doesn't hang anymore. Alternatively, if I preceed line A by a "splitRandom $", it also doesn't hang.
Q1 is: Is mapM not lazy? Otherwise, why does replacing line A with line B "fix this" code?
Q2 is: Why does preceeding line A with splitRandom "solve" the problem?
import Control.Monad.Random
import Control.Applicative
f :: (RandomGen g) => Rand g (Double, [Double])
f = do
b <- splitRandom $ sequence $ repeat $ getRandom
c <- mapM return b -- A
-- let c = map id b -- B
a <- getRandom
return (a, c)
splitRandom :: (RandomGen g) => Rand g a -> Rand g a
splitRandom code = evalRand code <$> getSplit
t0 = do
(a, b) <- evalRand f <$> newStdGen
print a
print (take 3 b)
The code generates an infinite list of random numbers lazily. Then it generates a single random number. By using splitRandom, I can evaluate this latter random number first before the infinite list. This can be demonstrated if I return b instead of c in the function.
However, if I apply the mapM to the list, the program now hangs. To prevent this hanging, I have to apply splitRandom again before the mapM. I was under the impression that mapM can lazily
Well, there's lazy, and then there's lazy. mapM is indeed lazy in that it doesn't do more work than it has to. However, look at the type signature:
mapM :: (Monad m) => (a -> m b) -> [a] -> m [b]
Think about what this means: You give it a function a -> m b and a bunch of as. A regular map can turn those into a bunch of m bs, but not an m [b]. The only way to combine the bs into a single [b] without the monad getting in the way is to use >>= to sequence the m bs together to construct the list.
In fact, mapM is precisely equivalent to sequence . map.
In general, for any monadic expression, if the value is used at all, the entire chain of >>=s leading to the expression must be forced, so applying sequence to an infinite list can't ever finish.
If you want to work with an unbounded monadic sequence, you'll either need explicit flow control--e.g., a loop termination condition baked into the chain of binds somehow, which simple recursive functions like mapM and sequence don't provide--or a step-by-step sequence, something like this:
data Stream m a = Nil | Stream a (m (Stream m a))
...so that you only force as many monad layers as necessary.
Edit:: Regarding splitRandom, what's going on there is that you're passing it a Rand computation, evaluating that with the seed splitRandom gets, then returning the result. Without the splitRandom, the seed used by the single getRandom has to come from the final result of sequencing the infinite list, hence it hangs. With the extra splitRandom, the seed used only needs to thread though the two splitRandom calls, so it works. The final list of random numbers works because you've left the Rand monad at that point and nothing depends on its final state.
Okay this question becomes potentially very straightforward.
q <- mapM return [1..]
Why does this never return?
It's not necessarily true. It depends on the monad you're in.
For example, with the identity monad, you can use the result lazily and it terminates fine:
newtype Identity a = Identity a
instance Monad Identity where
Identity x >>= k = k x
return = Identity
-- "foo" is the infinite list of all the positive integers
foo :: [Integer]
Identity foo = do
q <- mapM return [1..]
return q
main :: IO ()
main = print $ take 20 foo -- [1 .. 20]
Here's an attempt at a proof that mapM return [1..] doesn't terminate. Let's assume for the moment that we're in the Identity monad (the argument will apply to any other monad just as well):
mapM return [1..] -- initial expression
sequence (map return [1 ..]) -- unfold mapM
let k m m' = m >>= \x ->
m' >>= \xs ->
return (x : xs)
in foldr k (return []) (map return [1..]) -- unfold sequence
So far so good...
-- unfold foldr
let k m m' = m >>= \x ->
m' >>= \xs ->
return (x : xs)
go [] = return []
go (y:ys) = k y (go ys)
in go (map return [1..])
-- unfold map so we have enough of a list to pattern-match go:
go (return 1 : map return [2..])
-- unfold go:
k (return 1) (go (map return [2..])
-- unfold k:
(return 1) >>= \x -> go (map return [2..]) >>= \xs -> return (x:xs)
Recall that return a = Identity a in the Identity monad, and (Identity a) >>= f = f a in the Identity monad. Continuing:
-- unfold >>= :
(\x -> go (map return [2..]) >>= \xs -> return (x:xs)) 1
-- apply 1 to \x -> ... :
go (map return [2..]) >>= \xs -> return (1:xs)
-- unfold >>= :
(\xs -> return (1:xs)) (go (map return [2..]))
Note that at this point we'd love to apply to \xs, but we can't yet! We have to instead continue unfolding until we have a value to apply:
-- unfold map for go:
(\xs -> return (1:xs)) (go (return 2 : map return [3..]))
-- unfold go:
(\xs -> return (1:xs)) (k (return 2) (go (map return [3..])))
-- unfold k:
(\xs -> return (1:xs)) ((return 2) >>= \x2 ->
(go (map return [3..])) >>= \xs2 ->
return (x2:xs2))
-- unfold >>= :
(\xs -> return (1:xs)) ((\x2 -> (go (map return [3...])) >>= \xs2 ->
return (x2:xs2)) 2)
At this point, we still can't apply to \xs, but we can apply to \x2. Continuing:
-- apply 2 to \x2 :
(\xs -> return (1:xs)) ((go (map return [3...])) >>= \xs2 ->
return (2:xs2))
-- unfold >>= :
(\xs -> return (1:xs)) (\xs2 -> return (2:xs2)) (go (map return [3..]))
Now we've gotten to a point where neither \xs nor \xs2 can be reduced yet! Our only choice is:
-- unfold map for go, and so on...
(\xs -> return (1:xs))
(\xs2 -> return (2:xs2))
(go ((return 3) : (map return [4..])))
So you can see that, because of foldr, we're building up a series of functions to apply, starting from the end of the list and working our way back up. Because at each step the input list is infinite, this unfolding will never terminate and we will never get an answer.
This makes sense if you look at this example (borrowed from another StackOverflow thread, I can't find which one at the moment). In the following list of monads:
mebs = [Just 3, Just 4, Nothing]
we would expect sequence to catch the Nothing and return a failure for the whole thing:
sequence mebs = Nothing
However, for this list:
mebs2 = [Just 3, Just 4]
we would expect sequence to give us:
sequence mebs = Just [3, 4]
In other words, sequence has to see the whole list of monadic computations, string them together, and run them all in order to come up with the right answer. There's no way sequence can give an answer without seeing the whole list.
Note: The previous version of this answer asserted that foldr computes starting from the back of the list, and wouldn't work at all on infinite lists, but that's incorrect! If the operator you pass to foldr is lazy on its second argument and produces output with a lazy data constructor like a list, foldr will happily work with an infinite list. See foldr (\x xs -> (replicate x x) ++ xs) [] [1...] for an example. But that's not the case with our operator k.
This question is showing very well the difference between the IO Monad and other Monads. In the background the mapM builds an expression with a bind operation (>>=) between all the list elements to turn the list of monadic expressions into a monadic expression of a list. Now, what is different in the IO monad is that the execution model of Haskell is executing expressions during the bind in the IO Monad. This is exactly what finally forces (in a purely lazy world) something to be executed at all.
So IO Monad is special in a way, it is using the sequence paradigm of bind to actually enforce execution of each step and this is what our program makes to execute anything at all in the end. Others Monads are different. They have other meanings of the bind operator, depending on the Monad. IO is actually the one Monad which execute things in the bind and this is the reason why IO types are "actions".
The following example show that other Monads do not enforce execution, the Maybe monad for example. Finally this leds to the result that a mapM in the IO Monad returns an expression, which - when executed - executes each single element before returning the final value.
There are nice papers about this, start here or search for denotational semantics and Monads:
Tackling the awkward squad: http://research.microsoft.com/en-us/um/people/simonpj/papers/marktoberdorf/mark.pdf
Example with Maybe Monad:
module Main where
fstMaybe :: [Int] -> Maybe [Int]
fstMaybe = mapM (\x -> if x == 3 then Nothing else Just x)
main = do
let r = fstMaybe [1..]
return r
Let's talk about this in a more generic context.
As the other answers said, the mapM is just a combination of sequence and map. So the problem is why sequence is strict in certain Monads. However, this is not restricted to Monads but also Applicatives since we have sequenceA which share the same implementation of sequence in most cases.
Now look at the (specialized for lists) type signature of sequenceA :
sequenceA :: Applicative f => [f a] -> f [a]
How would you do this? You were given a list, so you would like to use foldr on this list.
sequenceA = foldr f b where ...
--f :: f a -> f [a] -> f [a]
--b :: f [a]
Since f is an Applicative, you know what b coule be - pure []. But what is f?
Obviously it is a lifted version of (:):
(:) :: a -> [a] -> [a]
So now we know how sequenceA works:
sequenceA = foldr f b where
f a b = (:) <$> a <*> b
b = pure []
or
sequenceA = foldr ((<*>) . fmap (:)) (pure [])
Assume you were given a lazy list (x:_|_). The above definition of sequenceA gives
sequenceA (x:_|_) === (:) <$> x <*> foldr ((<*>) . fmap (:)) (pure []) _|_
=== (:) <$> x <*> _|_
So now we see the problem was reduced to consider weather f <*> _|_ is _|_ or not. Obviously if f is strict this is _|_, but if f is not strict, to allow a stop of evaluation we require <*> itself to be non-strict.
So the criteria for an applicative functor to have a sequenceA that stops on will be
the <*> operator to be non-strict. A simple test would be
const a <$> _|_ === _|_ ====> strict sequenceA
-- remember f <$> a === pure f <*> a
If we are talking about Moands, the criteria is
_|_ >> const a === _|_ ===> strict sequence