Monadic list comprehension in Haskell

List comprehension is very easy to understand. Look at h in the following definition. It uses pure_xs of type [Int], and pure_f of type Int -> String, using both in the list comprehension.
pure_xs :: [Int]
pure_xs = [1,2,3]
pure_f :: Int -> String
pure_f a = show a
h :: [(Int,Char)]
h = [(a,b) | a <- pure_xs, b <- pure_f a]
-- h => [(1,'1'),(2,'2'),(3,'3')]
Great. Now take two slightly different expressions, monadic_f and monadic_xs. I would like to construct g using list comprehensions, to look as similar to h as possible. I have a feeling that a solution will involve generating a sequence of IO actions, and using sequence to generate a list of type [(Int,Char)] in the IO monad.
monadic_xs :: IO [Int]
monadic_xs = return [1,2,3]
monadic_f :: Int -> IO String
monadic_f a = return (show a)
g :: IO [(Int,Char)]
g = undefined -- how to make `g` function look
-- as similar to `h` function as possible, i.e. using list comprehension?
-- g => IO [(1,'1'),(2,'2'),(3,'3')]

The natural way to write this would be
do xs <- monadic_xs
   ys <- mapM monadic_f xs
   return (zip xs ys)
But we can't translate that naturally into a list comprehension because we need the (>>=) binds in there to extract the monadic values. Monad transformers would be an avenue to interweave these effects. Let's examine the transformers ListT monad transformer — even though it's not actually a monad transformer.
newtype ListT m a = ListT { runListT :: m [a] }
listT_xs :: ListT IO Int
listT_xs = ListT monadic_xs
listT_f :: Int -> ListT IO String
listT_f = ListT . fmap return . monadic_f
>>> runListT $ do { x <- listT_xs; str <- listT_f x; return (x, str) }
[(1,"1"),(2,"2"),(3,"3")]
So that appears to work and we can turn on MonadComprehensions to write it in a list comprehension format.
>>> runListT [ (x, str) | x <- listT_xs, str <- listT_f x ]
[(1,"1"),(2,"2"),(3,"3")]
That's about as close to the pure version as I can think of, but it has two dangerous flaws. First, we're using ListT, which may be unintuitive because it breaks the monad transformer laws. Second, we're using only a tiny fraction of the list monad's effect: normally list takes the Cartesian product, not the zip.
listT_g :: Int -> ListT IO String
listT_g = ListT . fmap (replicate 3) . monadic_f
>>> runListT [ (x, str) | x <- listT_xs, str <- listT_g x ]
[(1,"1"),(1,"1"),(1,"1"),(2,"2"),(2,"2"),(2,"2"),(3,"3"),(3,"3"),(3,"3")]
To solve these problems you might want to experiment with pipes. You'll get the "correct" solution there, though it won't look nearly as much like a list comprehension.
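For readers who want to run the comprehension end to end, here is a minimal, self-contained sketch. It assumes a locally defined ListT newtype with the same shape as the one shown above; the instances are hand-written for illustration and are not the transformers source. The names monadicXs, monadicF and g are stand-ins for the question's definitions. Like the original ListT, this sketch is not a lawful monad transformer for arbitrary m.

```haskell
{-# LANGUAGE MonadComprehensions #-}

import Control.Monad (ap)

-- Local stand-in for ListT (an assumption of this sketch, not a library type).
newtype ListT m a = ListT { runListT :: m [a] }

instance Functor m => Functor (ListT m) where
  fmap f (ListT m) = ListT (fmap (map f) m)

instance Monad m => Applicative (ListT m) where
  pure x = ListT (return [x])
  (<*>) = ap

instance Monad m => Monad (ListT m) where
  ListT m >>= k = ListT $ do
    xs <- m                        -- run the outer effect once
    yss <- mapM (runListT . k) xs  -- run the continuation for every element
    return (concat yss)

monadicXs :: IO [Int]
monadicXs = return [1, 2, 3]

monadicF :: Int -> IO String
monadicF a = return (show a)

-- The comprehension desugars to (>>=) and return on ListT IO.
g :: IO [(Int, String)]
g = runListT [ (x, s) | x <- ListT monadicXs
                      , s <- ListT (fmap (: []) (monadicF x)) ]
```

Running g in GHCi should give the same pairs as the runListT session above, because each call to monadicF is wrapped in a singleton list.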

Related

Is there a lazy mapM?

At first glance I thought these two functions would work the same:
firstM _ [] = return Nothing
firstM p (x:xs) = p x >>= \r -> if r then return (Just x) else firstM p xs
firstM' p xs = fmap listToMaybe (mapM p xs)
But they don't. In particular, firstM stops as soon as the first p x is true. But firstM', because of mapM, needs to evaluate the whole list.
Is there a "lazy mapM" that enables the second definition, or at least one that doesn't require explicit recursion?

There isn't (can't be) a safe, Monad-polymorphic lazy mapM. But the monad-loops package contains many lazy monadic variants of various pure functions, and includes firstM.
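As a rough illustration of what such a short-circuiting helper can look like without explicit recursion, here is a hand-written sketch based on foldr. It is not the monad-loops source, and the name firstLazyM is made up here to avoid clashing with the library function.

```haskell
-- Returns the first element whose monadic predicate holds.
-- foldr only runs `rest` when the current predicate fails,
-- so later effects are never performed after a match.
firstLazyM :: Monad m => (a -> m Bool) -> [a] -> m (Maybe a)
firstLazyM p = foldr step (return Nothing)
  where
    step x rest = do
      ok <- p x
      if ok then return (Just x) else rest
```

Because the fold stops at the first match, this sketch also works on an infinite list, for example firstLazyM (\x -> return (x > 3)) [1 ..] in IO.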

One solution is to use ListT, the list monad transformer. This type interleaves side effects and results, so you can peek at the initial element without running the whole computation first.
Here's an example using ListT:
import Control.Monad
import qualified ListT
firstM :: Monad m => (a -> Bool) -> [a] -> m (Maybe a)
firstM p = ListT.head . mfilter p . ListT.fromFoldable
(Note that the ListT defined in transformers and mtl is buggy and should not be used. The version I linked above should be okay, though.)

If there is, I doubt it's called mapM.
As I recall, mapM is defined in terms of sequence:
mapM :: Monad m => (a -> m b) -> [a] -> m [b]
mapM f = sequence . map f
and the whole point of sequence is to guarantee that all the side-effects are done before giving you anything.
As opposed to using some alternate mapM, you could get away with just using map and sequence, changing the container from [a] to Maybe a:
firstM p xs = sequence $ listToMaybe (map p xs)
or even:
firstM p xs = mapM p $ listToMaybe xs
This works now that mapM and sequence can operate on generic Traversables, not just lists.

Lazy list wrapped in IO

Suppose the code
f :: IO [Int]
f = f >>= return . (0 :)
g :: IO [Int]
g = f >>= return . take 3
When I run g in ghci, it causes a stack overflow. But I was thinking maybe it could be evaluated lazily and produce [0, 0, 0] wrapped in IO. I suspect IO is to blame here, but I really have no idea. Obviously the following works:
f' :: [Int]
f' = 0 : f'
g' :: [Int]
g' = take 3 f'
Edit: In fact I am not interested in having such a simple function f, original code looked more along the lines:
h :: a -> IO [Either b c]
h a = do
  (r, a') <- h' a
  case r of
    x@(Left _) -> h a' >>= return . (x :)
    y@(Right _) -> return [y]
h' :: a -> IO (Either b c, a)
-- something non trivial
main :: IO ()
main = mapM_ print . take 3 =<< h a
h does some IO computations and stores invalid (Left) responses in a list until a valid response (Right) is produced. The attempt is to construct the list lazily even though we are in the IO monad. So that someone reading the result of h can start consuming the list even before it is complete (because it may even be infinite). And if the one reading the results cares only for the first 3 entries no matter what, the rest of the list does not even have to be constructed. And I am getting the feeling that this will not be possible :/.

Yes, IO is to blame here. >>= for IO is strict in the "state of the world". If you write m >>= h, you'll get an action that first performs the action m, then applies h to the result, and finally performs the action h yields. It doesn't matter that your f action doesn't "do anything"; it has to be performed anyway. Thus you end up in an infinite loop starting the f action over and over.
Thankfully, there is a way around this, because IO is an instance of MonadFix. You can "magically" access the result of an IO action from within that action. Critically, that access must be sufficiently lazy, or you'll throw yourself into an infinite loop.
import Control.Monad.Fix
import Data.Functor ((<$>))
f :: IO [Int]
f = mfix (\xs -> return (0 : xs))
-- This `g` is just like yours, but prettier IMO
g :: IO [Int]
g = take 3 <$> f
There's even a bit of syntactic sugar in GHC for this letting you use do notation with the rec keyword or mdo notation.
{-# LANGUAGE RecursiveDo #-}
f' :: IO [Int]
f' = do
  rec res <- (0:) <$> (return res :: IO [Int])
  return res
f'' :: IO [Int]
f'' = mdo
  res <- f'
  return (0 : res)
For more interesting examples of ways to use MonadFix, see the Haskell Wiki.

It sounds like you want a monad that mixes the capabilities of lists and IO. Luckily, that's just what ListT is for. Here's your example in that form, with an h' that computes the Collatz sequence and asks the user how they feel about each element in the sequence (I couldn't really think of anything convincing that fit the shape of your outline).
import Control.Monad.IO.Class
import qualified ListT as L
h :: Int -> L.ListT IO (Either String ())
h a = do
  (r, a') <- liftIO (h' a)
  case r of
    x@(Left _) -> L.cons x (h a')
    y@(Right _) -> return y
h' :: Int -> IO (Either String (), Int)
h' 1 = return (Right (), 1)
h' n = do
  putStrLn $ "Say something about " ++ show n
  s <- getLine
  return (Left s, if even n then n `div` 2 else 3*n + 1)
main = readLn >>= L.traverse_ print . L.take 3 . h
Here's how it looks in ghci:
> main
2
Say something about 2
small
Left "small"
Right ()
> main
3
Say something about 3
prime
Left "prime"
Say something about 10
not prime
Left "not prime"
Say something about 5
fiver
Left "fiver"
I suppose modern approaches would use pipes or conduits or iteratees or something, but I don't know enough about them to talk about the tradeoffs compared to ListT.

I'm not sure if this is an appropriate usage, but unsafeInterleaveIO would get you the behavior you're asking for, by deferring the IO actions of f until the value inside of f is asked for:
module Tmp where
import System.IO.Unsafe (unsafeInterleaveIO)
f :: IO [Int]
f = unsafeInterleaveIO f >>= return . (0 :)
g :: IO [Int]
g = f >>= return . take 3
*Tmp> g
[0,0,0]
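The same trick can be applied to a list whose elements come from real IO steps, which is closer to the shape of h in the question. The following is only a sketch under assumptions: lazyIterateM and demo are hypothetical names, and the counter exists purely to show how many steps actually ran. The usual caveats about unsafeInterleaveIO apply, since effects now happen whenever the list is consumed.

```haskell
import Control.Exception (evaluate)
import Data.IORef (modifyIORef, newIORef, readIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical helper: like iterate, but each next element is produced by
-- an IO step that only runs when the tail of the list is demanded.
lazyIterateM :: (a -> IO a) -> a -> IO [a]
lazyIterateM step x = do
  rest <- unsafeInterleaveIO (step x >>= lazyIterateM step)
  return (x : rest)

-- Consumes three elements and reports how many steps were performed.
demo :: IO ([Int], Int)
demo = do
  counter <- newIORef (0 :: Int)
  xs <- lazyIterateM (\n -> modifyIORef counter (+ 1) >> return (n + 1)) 0
  let firstThree = take 3 xs
  _ <- evaluate (length firstThree)  -- force the spine of the three cells
  stepsRun <- readIORef counter
  return (firstThree, stepsRun)
```

In this sketch only the steps needed to reach the third element are performed; the rest of the conceptually infinite list is never built.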

Haskell way to join [IO String] into IO String

My goal is to write a Haskell function which reads N lines from input and joins them into one string. Below is my first attempt:
readNLines :: Int -> IO String
readNLines n = do
  let rows = replicate n getLine
  let rowsAsString = foldl ++ [] rows
  return rowsAsString
Here Haskell complains about foldl:
Couldn't match expected type `[a]'
against inferred type `(a1 -> b -> a1)
-> a1 -> [b] -> a1'
As I understand it, the type of rows is [IO String]. Is it possible to somehow join such a list into a single IO String?

You're looking for sequence :: (Monad m) => [m a] -> m [a].
(Plus liftM :: Monad m => (a1 -> r) -> m a1 -> m r and unlines :: [String] -> String, probably.)

Besides what ephemient points out, I think you have a syntax issue: The way you're using the ++ operator makes it look like you are trying to invoke the ++ operator with operands foldl and []. Put the ++ operator in parentheses to make your intent clear:
foldl (++) [] rows

The function you are looking for is sequence; however, it should be noted that
sequence (replicate n f)
is the same as
replicateM n f
And foldl (++) [] is equivalent to concat. So your function is:
readNLines n = liftM concat (replicateM n getLine)
Alternatively if you want to preserve line breaks:
readNLines n = liftM unlines (replicateM n getLine)

The shortest answer I can come up with is:
import Control.Applicative
import Control.Monad
readNLines :: Int -> IO String
readNLines n = concat <$> replicateM n getLine

replicate returns a list of IO String actions. In order to perform these actions, they need to be run in the IO monad. So you don't want to join an array of IO actions, but rather run them all in sequence and return the result.
Here's what I would do
readNLines :: Int -> IO String
readNLines n = do
  lines <- replicateM n getLine
  return $ concat lines
Or, in applicative style:
import Control.Applicative
readNLines :: Int -> IO String
readNLines n = concat <$> replicateM n getLine
Both of these use the monadic replicate (replicateM), which evaluates a list of monadic values in sequence, rather than simply returning a list of actions.

Is Haskell's mapM not lazy?

UPDATE: Okay this question becomes potentially very straightforward.
q <- mapM return [1..]
Why does this never return?
Does mapM not lazily deal with infinite lists?
The code below hangs. However, if I replace line A by line B, it doesn't hang anymore. Alternatively, if I precede line A with a "splitRandom $", it also doesn't hang.
Q1 is: Is mapM not lazy? Otherwise, why does replacing line A with line B "fix this" code?
Q2 is: Why does preceding line A with splitRandom "solve" the problem?
import Control.Monad.Random
import Control.Applicative
f :: (RandomGen g) => Rand g (Double, [Double])
f = do
  b <- splitRandom $ sequence $ repeat $ getRandom
  c <- mapM return b -- A
  -- let c = map id b -- B
  a <- getRandom
  return (a, c)
splitRandom :: (RandomGen g) => Rand g a -> Rand g a
splitRandom code = evalRand code <$> getSplit
t0 = do
  (a, b) <- evalRand f <$> newStdGen
  print a
  print (take 3 b)
The code generates an infinite list of random numbers lazily. Then it generates a single random number. By using splitRandom, I can evaluate this latter random number first before the infinite list. This can be demonstrated if I return b instead of c in the function.
However, if I apply the mapM to the list, the program now hangs. To prevent this hanging, I have to apply splitRandom again before the mapM. I was under the impression that mapM can lazily

Well, there's lazy, and then there's lazy. mapM is indeed lazy in that it doesn't do more work than it has to. However, look at the type signature:
mapM :: (Monad m) => (a -> m b) -> [a] -> m [b]
Think about what this means: You give it a function a -> m b and a bunch of as. A regular map can turn those into a bunch of m bs, but not an m [b]. The only way to combine the bs into a single [b] without the monad getting in the way is to use >>= to sequence the m bs together to construct the list.
In fact, mapM f is precisely equivalent to sequence . map f.
In general, for any monadic expression, if the value is used at all, the entire chain of >>=s leading to the expression must be forced, so applying sequence to an infinite list can't ever finish.
If you want to work with an unbounded monadic sequence, you'll either need explicit flow control--e.g., a loop termination condition baked into the chain of binds somehow, which simple recursive functions like mapM and sequence don't provide--or a step-by-step sequence, something like this:
data Stream m a = Nil | Stream a (m (Stream m a))
...so that you only force as many monad layers as necessary.
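To show how such a step-by-step type might be consumed, here is a small sketch. The Stream type is the one from the text; repeatS, takeS and demo are hypothetical helpers introduced only for this illustration.

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)

data Stream m a = Nil | Stream a (m (Stream m a))

-- Builds an unbounded stream by repeating one action; each further
-- element is produced only when the consumer runs the next layer.
repeatS :: Monad m => m a -> m (Stream m a)
repeatS act = do
  x <- act
  return (Stream x (repeatS act))

-- Takes at most n elements, running only as many layers as needed.
takeS :: Monad m => Int -> Stream m a -> m [a]
takeS _ Nil = return []
takeS n (Stream x next)
  | n <= 0 = return []
  | n == 1 = return [x]
  | otherwise = do
      rest <- next
      xs <- takeS (n - 1) rest
      return (x : xs)

-- Returns the first three ticks and the number of times the action ran.
demo :: IO ([Int], Int)
demo = do
  ref <- newIORef (0 :: Int)
  let tick = modifyIORef ref (+ 1) >> readIORef ref
  s <- repeatS tick
  xs <- takeS 3 s
  ticks <- readIORef ref
  return (xs, ticks)
```

The design choice here is that the tail lives inside the monad, so the consumer decides explicitly when to pay for the next effect, which is what mapM and sequence cannot offer.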
Edit: Regarding splitRandom, what's going on there is that you're passing it a Rand computation, evaluating that with the seed splitRandom gets, then returning the result. Without the splitRandom, the seed used by the single getRandom has to come from the final result of sequencing the infinite list, hence it hangs. With the extra splitRandom, the seed used only needs to thread through the two splitRandom calls, so it works. The final list of random numbers works because you've left the Rand monad at that point and nothing depends on its final state.

Okay this question becomes potentially very straightforward.
q <- mapM return [1..]
Why does this never return?
It's not necessarily true. It depends on the monad you're in.
For example, with the identity monad, you can use the result lazily and it terminates fine:
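newtype Identity a = Identity a
-- Functor and Applicative instances are required by GHC 7.10 and later
instance Functor Identity where
  fmap f (Identity x) = Identity (f x)
instance Applicative Identity where
  pure = Identity
  Identity f <*> Identity x = Identity (f x)
instance Monad Identity where
  Identity x >>= k = k x
  return = Identity
-- "foo" is the infinite list of all the positive integers
foo :: [Integer]
Identity foo = do
  q <- mapM return [1..]
  return q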
main :: IO ()
main = print $ take 20 foo -- [1 .. 20]

Here's an attempt at a proof that mapM return [1..] doesn't terminate. Let's assume for the moment that we're in the Identity monad (the argument will apply to any other monad just as well):
mapM return [1..] -- initial expression
sequence (map return [1 ..]) -- unfold mapM
let k m m' = m >>= \x ->
             m' >>= \xs ->
             return (x : xs)
in foldr k (return []) (map return [1..]) -- unfold sequence
So far so good...
-- unfold foldr
let k m m' = m >>= \x ->
             m' >>= \xs ->
             return (x : xs)
    go [] = return []
    go (y:ys) = k y (go ys)
in go (map return [1..])
-- unfold map so we have enough of a list to pattern-match go:
go (return 1 : map return [2..])
-- unfold go:
k (return 1) (go (map return [2..]))
-- unfold k:
(return 1) >>= \x -> go (map return [2..]) >>= \xs -> return (x:xs)
Recall that return a = Identity a in the Identity monad, and (Identity a) >>= f = f a in the Identity monad. Continuing:
-- unfold >>= :
(\x -> go (map return [2..]) >>= \xs -> return (x:xs)) 1
-- apply 1 to \x -> ... :
go (map return [2..]) >>= \xs -> return (1:xs)
-- unfold >>= :
(\xs -> return (1:xs)) (go (map return [2..]))
Note that at this point we'd love to apply to \xs, but we can't yet! We have to instead continue unfolding until we have a value to apply:
-- unfold map for go:
(\xs -> return (1:xs)) (go (return 2 : map return [3..]))
-- unfold go:
(\xs -> return (1:xs)) (k (return 2) (go (map return [3..])))
-- unfold k:
(\xs -> return (1:xs)) ((return 2) >>= \x2 ->
                        (go (map return [3..])) >>= \xs2 ->
                        return (x2:xs2))
-- unfold >>= :
(\xs -> return (1:xs)) ((\x2 -> (go (map return [3..])) >>= \xs2 ->
                                return (x2:xs2)) 2)
At this point, we still can't apply to \xs, but we can apply to \x2. Continuing:
-- apply 2 to \x2 :
(\xs -> return (1:xs)) ((go (map return [3..])) >>= \xs2 ->
                        return (2:xs2))
-- unfold >>= :
(\xs -> return (1:xs)) ((\xs2 -> return (2:xs2)) (go (map return [3..])))
Now we've gotten to a point where neither \xs nor \xs2 can be reduced yet! Our only choice is:
-- unfold map for go, and so on...
(\xs -> return (1:xs))
  ((\xs2 -> return (2:xs2))
    (go ((return 3) : (map return [4..]))))
So you can see that, because of foldr, we're building up a series of functions to apply, starting from the end of the list and working our way back up. Because at each step the input list is infinite, this unfolding will never terminate and we will never get an answer.
This makes sense if you look at this example (borrowed from another StackOverflow thread, I can't find which one at the moment). In the following list of monads:
mebs = [Just 3, Just 4, Nothing]
we would expect sequence to catch the Nothing and return a failure for the whole thing:
sequence mebs = Nothing
However, for this list:
mebs2 = [Just 3, Just 4]
we would expect sequence to give us:
sequence mebs2 = Just [3, 4]
In other words, sequence has to see the whole list of monadic computations, string them together, and run them all in order to come up with the right answer. There's no way sequence can give an answer without seeing the whole list.
Note: The previous version of this answer asserted that foldr computes starting from the back of the list, and wouldn't work at all on infinite lists, but that's incorrect! If the operator you pass to foldr is lazy in its second argument and produces output with a lazy data constructor like a list, foldr will happily work with an infinite list. See foldr (\x xs -> (replicate x x) ++ xs) [] [1..] for an example. But that's not the case with our operator k.
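The foldr expression from the note can be tried directly. This is a small sketch; expand and firstFive are names chosen here for illustration.

```haskell
-- The operator is lazy in its second argument, so foldr can produce
-- output from an infinite input one chunk at a time.
expand :: [Int] -> [Int]
expand = foldr (\x xs -> replicate x x ++ xs) []

firstFive :: [Int]
firstFive = take 5 (expand [1 ..])  -- [1,2,2,3,3]
```

Only as much of the infinite input is inspected as take demands, because (++) yields its first elements before looking at xs.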

This question shows very well the difference between the IO monad and other monads. Behind the scenes, mapM builds an expression with a bind operation (>>=) between all the list elements to turn the list of monadic expressions into a monadic expression of a list. What is different in the IO monad is that Haskell's execution model executes expressions during the bind in the IO monad. This is exactly what finally forces (in a purely lazy world) something to be executed at all.
So the IO monad is special in a way: it uses the sequencing paradigm of bind to actually enforce execution of each step, and this is what makes our program execute anything at all in the end. Other monads are different. They give the bind operator other meanings, depending on the monad. IO is actually the one monad which executes things in the bind, and this is the reason why IO types are "actions".
The following example shows that other monads, the Maybe monad for example, do not enforce execution. In the end, this means that a mapM in the IO monad returns an expression which, when executed, executes each single element before returning the final value.
There are nice papers about this, start here or search for denotational semantics and Monads:
Tackling the awkward squad: http://research.microsoft.com/en-us/um/people/simonpj/papers/marktoberdorf/mark.pdf
Example with Maybe Monad:
module Main where
fstMaybe :: [Int] -> Maybe [Int]
fstMaybe = mapM (\x -> if x == 3 then Nothing else Just x)
main = do
  let r = fstMaybe [1..]
  return r
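To make the behaviour of the Maybe example observable, here is a small sketch that reuses fstMaybe from above; stopsEarly and allJust are names assumed for illustration.

```haskell
fstMaybe :: [Int] -> Maybe [Int]
fstMaybe = mapM (\x -> if x == 3 then Nothing else Just x)

-- The Nothing produced at 3 short-circuits the chain of binds,
-- so the infinite tail is never visited.
stopsEarly :: Maybe [Int]
stopsEarly = fstMaybe [1 ..]   -- Nothing

-- With no failing element, every element is visited.
allJust :: Maybe [Int]
allJust = fstMaybe [1, 2]      -- Just [1,2]
```

Note that an infinite list without a 3 in it, such as [4 ..], would still never produce a result, since Maybe has to see every element before it can answer Just.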

Let's talk about this in a more generic context.
As the other answers said, mapM is just a combination of sequence and map. So the question is why sequence is strict in certain monads. However, this is not restricted to monads but also applies to Applicatives, since we have sequenceA, which shares the same implementation as sequence in most cases.
Now look at the (specialized for lists) type signature of sequenceA :
sequenceA :: Applicative f => [f a] -> f [a]
How would you do this? You were given a list, so you would like to use foldr on this list.
sequenceA = foldr f b where ...
--f :: f a -> f [a] -> f [a]
--b :: f [a]
Since f is an Applicative, you know what b could be: pure []. But what is f?
Obviously it is a lifted version of (:):
(:) :: a -> [a] -> [a]
So now we know how sequenceA works:
sequenceA = foldr f b where
  f a b = (:) <$> a <*> b
  b = pure []
or
sequenceA = foldr ((<*>) . fmap (:)) (pure [])
Assume you were given a lazy list (x:_|_). The above definition of sequenceA gives
sequenceA (x:_|_) === (:) <$> x <*> foldr ((<*>) . fmap (:)) (pure []) _|_
                  === (:) <$> x <*> _|_
So now we see that the problem reduces to considering whether f <*> _|_ is _|_ or not. Obviously if f is strict this is _|_, but if f is not strict, to allow evaluation to stop we require <*> itself to be non-strict.
So the criterion for an applicative functor to have a sequenceA that stops is for the <*> operator to be non-strict. A simple test would be
const a <$> _|_ === _|_ ====> strict sequenceA
-- remember f <$> a === pure f <*> a
If we are talking about Monads, the criterion is
_|_ >> const a === _|_ ===> strict sequence
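As a concrete check of the criterion, here is a sketch using two applicatives from base whose <*> is non-strict: functions and Data.Functor.Identity. The names fromFunctions and fromIdentity are chosen here for illustration.

```haskell
import Data.Functor.Identity (Identity (..))

-- Function applicative: sequenceA turns an infinite list of functions
-- into one function returning a lazily built infinite list.
fromFunctions :: [Int]
fromFunctions = take 3 (sequenceA (map (+) [1 ..]) 10)  -- [11,12,13]

-- Identity: the newtype adds no strictness, so the result list is lazy too.
fromIdentity :: [Int]
fromIdentity = take 3 (runIdentity (sequenceA (map Identity [1 ..])))  -- [1,2,3]
```

By contrast, the same call in Maybe or IO would not terminate on an infinite list, which matches the strictness test above.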

Haskell -- problem with pretty-printing a list

I'm new to Haskell, and I read through and digested Learn You A Haskell For Great Good, trying out a couple of things along the way. For my first project I wanted to try the classic: FizzBuzz. So I came up with the following code:
import System.IO
-- Show is needed for `show num`; Integral alone does not imply it in modern GHC
fizzBuzz :: (Integral a, Show a) => a -> String
fizzBuzz num
  | fizz && buzz = "FizzBuzz"
  | fizz = "Fizz"
  | buzz = "Buzz"
  | otherwise = show num
  where fizz = num `mod` 3 == 0
        buzz = num `mod` 5 == 0
main = print $ map fizzBuzz [1..100]
Worked great, except I got a rather dense-looking list that was hard to read. So I tried this main function instead:
main = map putStrLn $ map fizzBuzz [1..100]
And that gives me the error Couldn't match expected type 'IO t' against inferred type '[IO ()]'. I tried half a dozen things and none of it seemed to help. What's the proper way to do what I'm trying to do?

map :: (a -> b) -> [a] -> [b]
putStrLn :: String -> IO ()
map putStrLn :: [String] -> [IO ()]
You've got a list of IO () actions.
main :: IO ()
You need to join them into a single IO () action.
What you want to do is to perform each of those IO () actions in sequence/sequence_:
sequence :: Monad m => [m a] -> m [a]
sequence_ :: Monad m => [m a] -> m ()
For convenience, mapM/mapM_ will map a function over a list and sequence the resulting monadic results.
mapM :: Monad m => (a -> m b) -> [a] -> m [b]
mapM_ :: Monad m => (a -> m b) -> [a] -> m ()
So your fixed code would look like this:
main = mapM_ putStrLn $ map fizzBuzz [1..100]
Although I'd probably write it like this:
main = mapM_ (putStrLn . fizzBuzz) [1..100]
Or even this:
main = putStr $ unlines $ map fizzBuzz [1..100]
Let's write our own sequence. What do we want it to do?
sequence [] = return []
sequence (m:ms) = do
  x <- m
  xs <- sequence ms
  return $ x:xs
If there's nothing left in the list, return (inject into the monad) an empty list of results.
Otherwise, within the monad,
Bind (for the IO monad, this means execute) the first result.
sequence the rest of the list; bind that list of results.
Return a cons of the first result and the list of other results.
GHC's library uses something more like foldr (liftM2 (:)) (return []) but that's harder to explain to a newcomer; for now, just take my word that they're equivalent.
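For readers who would rather check the equivalence than take it on faith, here is a small sketch that places the two definitions side by side. The names sequenceRec and sequenceFold are made up here so that they do not clash with the Prelude's sequence.

```haskell
import Control.Monad (liftM2)

-- The explicit recursive version from above.
sequenceRec :: Monad m => [m a] -> m [a]
sequenceRec [] = return []
sequenceRec (m:ms) = do
  x <- m
  xs <- sequenceRec ms
  return (x : xs)

-- The fold-based version: liftM2 (:) binds the head action, then the
-- already-folded tail, and conses the two results.
sequenceFold :: Monad m => [m a] -> m [a]
sequenceFold = foldr (liftM2 (:)) (return [])
```

In the Maybe monad, for instance, both give Just [1,2] for [Just 1, Just 2] and Nothing for [Just 1, Nothing].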
sequence_ is easier, since it doesn't bother keeping track of the results. GHC's library implements it as sequence_ ms = foldr (>>) (return ()) ms. Let's just expand the definition of foldr:
sequence_ [a, b, c, d]
  = foldr (>>) (return ()) [a, b, c, d]
  = a >> (b >> (c >> (d >> return ())))
In other words, "do a, discard the result; do b, discard the result; … finally, return ()".
mapM f xs = sequence $ map f xs
mapM_ f xs = sequence_ $ map f xs
On the other hand, you don't even need to know monads at all with the alternate unlines solution.
What does unlines do? Well, lines "a\nb\nc\nd\n" = ["a", "b", "c", "d"], so of course unlines ["a", "b", "c", "d"] = "a\nb\nc\nd\n".
unlines $ map fizzBuzz [1..100] = unlines ["1", "2", "Fizz", ..] = "1\n2\nFizz\n..." and off it goes to putStr. Thanks to the magic of Haskell's laziness, the full string never needs to be constructed in memory, so this will happily go to [1..1000000] or higher :)
