How to set strictness in list comprehension? - haskell

I'm bit stuck how to rewrite following strict-evaluated list comprehension to use seq instead of bang pattern:
zipWith' f l1 l2 = [ f e1 e2 | (!e1, !e2) <- zip l1 l2 ]
Any idea ?
I've tried
zipWith' f l1 l2 = [ e1 `seq` e2 `seq` f e1 e2 | (e1, e2) <- zip l1 l2 ]
but this unfortunately does not force an evaluation to WHNF.

You can mechanically translate bang patterns into calls to seq
following the GHC
manual:
This:
zipWith' f l1 l2 = [ f e1 e2 | (!e1, !e2) <- zip l1 l2 ]
Becomes Too lazy:
zipWith' f l1 l2 =
[ f e1 e2
| e <- zip l1 l2
, let t = case e of (x,y) -> x `seq` y `seq` (x,y)
, let e1 = fst t
, let e2 = snd t
]
Which is more concisely written as (too lazy):
zipWith' f l1 l2 =
[ f e1 e2
| e <- zip l1 l2
, let (e1,e2) = case e of (x,y) -> x `seq` y `seq` (x,y)
]
Though I'd write it as (wrong, too lazy)
zipWith' f l1 l2 = zipWith (\x y -> uncurry f (k x y)) l1 l2
where
k x y = x `seq` y `seq` (x, y)
You could also move the strictness hint to a data structure:
data P = P !Integer !Integer
zipWith' f l1 l2 = [ f e1 e2 | P e1 e2 <- zipWith P l1 l2 ]
Or as:
zipWith' f l1 l2 = [ f e1 e2 | (e1, e2) <- zipWith k l1 l2 ]
where
k x y = x `seq` y `seq` (x,y)

The essential thing you want the strict zipWith to do is to evaluate the elements of the list at the time the cons cell is forced. Your function does not do that. Just try
main = print $ length $ zipWith' undefined [1..10] [1..100]
It's a fluke of the strictness analysis when you use (+) that makes it work.
The function you want is something like this:
zipW f (x:xs) (y:ys) = let z = f x y in seq z (z : zipW f xs ys)
zipW _ _ _ = []
You cannot decouple the production of the cons cell from the production of the value, because the former should force the latter.

To explain the original poster's example and some of Don's noted behavior. Using (>>=) as a concise concatMap, the original example translates into:
zip l1 l2 >>= \(!e1, !e2) -> [f e1 e2]
The second example is:
zip l1 l2 >>= \(e1, e2) -> [e1 `seq` e2 `seq` f e1 e2]
But what the desugared first version actually further desugars into is:
zip l1 l2 >>= \(e1, e2) -> e1 `seq` e2 `seq` [f e1 e2]
e1 and e2 get forced when the concat in concatMap flattens these singleton lists which provides the "force as I go along" behavior.
While this isn't what one would normally think of as a "strict" zipWith, it is not a fluke that it works for the fibs example.

Related

Two implementations for List Monad bind in the literature: why are they equivalent?

Reading the Monad chapter in "Programming in Haskell" 2nd ed. from Graham Hutton, I found this example on page 167 to illustrate the behaviour of the List Monad:
> pairs [1,2] [3,4]
[(1,3),(1,4),(2,3),(2,4)]
With pairs defined like this:
pairs :: [a] -> [b] -> [(a,b)]
pairs xs ys = do x <- xs
y <- ys
return (x,y)
And this implementation of bind:
instance Monad [] where
-- (>>=) :: [a] -> (a -> [b]) -> [b]
xs >>= f = [y | x <- xs, y <- f x]
I tried to understand with pencil and paper how the example worked out, but didn't get it.
Then I found, that in other books the bind operation is defined differently:
...
xs >>= f = concat (fmap f xs)
With this definition I understand why the example works.
But the first definition is the one I found on hackage in the prelude, so I trust its correct.
My question:
Can anybody explain why the first definition is equivalent to the second? (Where does the concat-stuff happen in the first one?)
List comprehensions are just syntactic sugar. Basically, [f x y | x<-l, y<-m] is sugar for
concatMap (\x -> concatMap (\y -> return $ f x y) m) l
or equivalently
concat $ fmap (\x -> concat $ fmap (\y -> return $ f x y) m) l
thus the two implementations are indeed equivalent by definition.
Anyway you can of course manually evaluate the example from the comprehension-based definition, using “intuitive set comprehension” evaluation:
pairs [1,2] [3,4]
≡ do { x <- [1,2]; y <- [3,4]; return (x,y) }
≡ [1,2] >>= \x -> [3,4] >>= \y -> return (x,y)
≡ [p | x<-[1,2], p <- (\ξ -> [3,4] >>= \y -> return (ξ,y)) x]
≡ [p | x<-[1,2], p <- ([3,4] >>= \y -> return (x,y))]
≡ [p | x<-[1,2], p <- [q | y<-[3,4], q <- (\υ -> return (x,υ)) y]]
≡ [p | x<-[1,2], p <- [q | y<-[3,4], q <- return (x,y)]]
≡ [p | x<-[1,2], p <- [q | y<-[3,4], q <- [(x,y)]]]
≡ [p | x<-[1,2], p <- [(x,3), (x,4)]]
≡ [(1,3), (1,4)] ++ [(2,3), (2,4)]
≡ [(1,3), (1,4), (2,3), (2,4)]
[y | x <- xs, y <- f x] takes all the xs in xs one-by-one and
apply f to them, which is a monadic action a -> [a], thus the result is a list of values ([a])
the comprehension proceeds to address each y in f x one-by-one
each y is sent to the output list
this is equivalent to first mapping f over each of the elements of the input list, resulting in a list of nested lists, that are then concatenated. Notice that fmap is map for lists, and you could use concatMap f xs as the definition of xs >>= f.

Strictness of pattern matching vs. deconstructing

I'm trying to define primitive recursion in term of foldr, as explained in A tutorial on the universality and expressiveness on fold chapter 4.1.
Here is first attempt at it
simpleRecursive f v xs = fst $ foldr g (v,[]) xs
where
g x (acc, xs) = (f x xs acc,x:xs)
However, above definition does not halt for head $ simpleRecursive (\x xs acc -> x:xs) [] [1..]
Below is definition that halt
simpleRecursive f v xs = fst $ foldr g (v,[]) xs
where
g x r = let (acc,xs) = r
in (f x xs acc,x:xs)
Given almost similar definition but different result, why does it differ? Does it have to do with how Haskell pattern match?
The crucial difference between the two functions is that in
g x r = let (acc, xs) = r
in (f x xs acc, x:xs)
The pattern match on the tuple constructor is irrefutable, whereas in
g x (acc, xs) = (f x xs acc, x:xs)
it is not. In other words, the first definition of g is equivalent to
g x ~(acc, xs) = (f x xs acc, x:xs)

Laziness of (>>=) in folding

Consider the following 2 expressions in Haskell:
foldl' (>>=) Nothing (repeat (\y -> Just (y+1)))
foldM (\x y -> if x==0 then Nothing else Just (x+y)) (-10) (repeat 1)
The first one takes forever, because it's trying to evaluate the infinite expression
...(((Nothing >>= f) >>= f) >>=f)...
and Haskell will just try to evaluate it inside out.
The second expression, however, gives Nothing right away. I've always thought foldM was just doing fold using (>>=), but then it would run into the same problem. So it's doing something more clever here - once it hits Nothing it knows to stop. How does foldM actually work?
foldM can't be implemented using foldl. It needs the power of foldr to be able to stop short. Before we get there, here's a version without anything fancy.
foldM f b [] = return b
foldM f b (x : xs) = f b x >>= \q -> foldM f q xs
We can transform this into a version that uses foldr. First we flip it around:
foldM f b0 xs = foldM' xs b0 where
foldM' [] b = return b
foldM' (x : xs) b = f b x >>= foldM' xs
Then move the last argument over:
foldM' [] = return
foldM' (x : xs) = \b -> f b x >>= foldM' xs
And then recognize the foldr pattern:
foldM' = foldr go return where
go x r = \b -> f b x >>= r
Finally, we can inline foldM' and move b back to the left:
foldM f b0 xs = foldr go return xs b0 where
go x r b = f b x >>= r
This same general approach works for all sorts of situations where you want to pass an accumulator from left to right within a right fold. You first shift the accumulator all the way over to the right so you can use foldr to build a function that takes an accumulator, instead of trying to build the final result directly. Joachim Breitner did a lot of work to create the Call Arity compiler analysis for GHC 7.10 that helps GHC optimize functions written this way. The main reason to want to do so is that it allows them to participate in the GHC list libraries' fusion framework.
One way to define foldl in terms of foldr is:
foldl f z xn = foldr (\ x g y -> g (f y x)) id xn z
It's probably worth working out why that is for yourself. It can be re-written using >>> from Control.Arrow as
foldl f z xn = foldr (>>>) id (map (flip f) xn) z
The monadic equivalent of >>> is
f >=> g = \ x -> f x >>= \ y -> g y
which allows us to guess that foldM might be
foldM f z xn = foldr (>=>) return (map (flip f) xn) z
which turns out to be the correct definition. It can be re-written using foldr/map as
foldM f z xn = foldr (\ x g y -> f y x >>= g) return xn z

Enumerating all pairs of possibly infinite lists [duplicate]

I have a function for finite lists
> kart :: [a] -> [b] -> [(a,b)]
> kart xs ys = [(x,y) | x <- xs, y <- ys]
but how to implement it for infinite lists? I have heard something about Cantor and set theory.
I also found a function like
> genFromPair (e1, e2) = [x*e1 + y*e2 | x <- [0..], y <- [0..]]
But I'm not sure if it helps, because Hugs only gives out pairs without ever stopping.
Thanks for help.
Your first definition, kart xs ys = [(x,y) | x <- xs, y <- ys], is equivalent to
kart xs ys = xs >>= (\x ->
ys >>= (\y -> [(x,y)]))
where
(x:xs) >>= g = g x ++ (xs >>= g)
(x:xs) ++ ys = x : (xs ++ ys)
are sequential operations. Redefine them as alternating operations,
(x:xs) >>/ g = g x +/ (xs >>/ g)
(x:xs) +/ ys = x : (ys +/ xs)
[] +/ ys = ys
and your definition should be good to go for infinite lists as well:
kart_i xs ys = xs >>/ (\x ->
ys >>/ (\y -> [(x,y)]))
testing,
Prelude> take 20 $ kart_i [1..] [101..]
[(1,101),(2,101),(1,102),(3,101),(1,103),(2,102),(1,104),(4,101),(1,105),(2,103)
,(1,106),(3,102),(1,107),(2,104),(1,108),(5,101),(1,109),(2,105),(1,110),(3,103)]
courtesy of "The Reasoned Schemer". (see also conda, condi, conde, condu).
another way, more explicit, is to create separate sub-streams and combine them:
kart_i2 xs ys = foldr g [] [map (x,) ys | x <- xs]
where
g a b = head a : head b : g (tail a) (tail b)
this actually produces exactly the same results. But now we have more control over how we combine the sub-streams. We can be more diagonal:
kart_i3 xs ys = g [] [map (x,) ys | x <- xs]
where -- works both for finite
g [] [] = [] -- and infinite lists
g a b = concatMap (take 1) a
++ g (filter (not . null) (take 1 b ++ map (drop 1) a))
(drop 1 b)
so that now we get
Prelude> take 20 $ kart_i3 [1..] [101..]
[(1,101),(2,101),(1,102),(3,101),(2,102),(1,103),(4,101),(3,102),(2,103),(1,104)
,(5,101),(4,102),(3,103),(2,104),(1,105),(6,101),(5,102),(4,103),(3,104),(2,105)]
With some searching on SO I've also found an answer by Norman Ramsey with seemingly yet another way to generate the sequence, splitting these sub-streams into four areas - top-left tip, top row, left column, and recursively the rest. His merge there is the same as our +/ here.
Your second definition,
genFromPair (e1, e2) = [x*e1 + y*e2 | x <- [0..], y <- [0..]]
is equivalent to just
genFromPair (e1, e2) = [0*e1 + y*e2 | y <- [0..]]
Because the list [0..] is infinite there's no chance for any other value of x to come into play. This is the problem that the above definitions all try to avoid.
Prelude> let kart = (\xs ys -> [(x,y) | ls <- map (\x -> map (\y -> (x,y)) ys) xs, (x,y) <- ls])
Prelude> :t kart
kart :: [t] -> [t1] -> [(t, t1)]
Prelude> take 10 $ kart [0..] [1..]
[(0,1),(0,2),(0,3),(0,4),(0,5),(0,6),(0,7),(0,8),(0,9),(0,10)]
Prelude> take 10 $ kart [0..] [5..10]
[(0,5),(0,6),(0,7),(0,8),(0,9),(0,10),(1,5),(1,6),(1,7),(1,8)]
you can think of the sequel as
0: (0, 0)
/ \
1: (1,0) (0,1)
/ \ / \
2: (2,0) (1, 1) (0,2)
...
Each level can be expressed by level n: [(n,0), (n-1, 1), (n-2, 2), ..., (0, n)]
Doing this to n <- [0..]
We have
cartesianProducts = [(n-m, m) | n<-[0..], m<-[0..n]]

Concatenation of lists in Haskell

I want a function that takes two lists of any type and returns one (i.e. f:: [[a]] -> [[a]] -> [[a]]). Basically, too produce the 'concatenation' of the two input lists.
e.g.
> f [[1,2,3], [123]] [[4,5,6], [3,7]]
[[1,2,3,4,5,6], [1,2,3,3,7], [123,4,5,6], [123,3,7]]
I currently have got this far with it:
f _ [] = []
f [] _ = []
f (xs:xss) (ys:yss) = ((xs ++ ys) : [m | m <- f [xs] yss])
But this doesn't take into account xss and is wrong. Any suggestions?
It's a Cartesian product, so you can simply use one list comprehension to do everything.
Prelude> let xs = [[1,2,3], [123]]
Prelude> let ys = [[4,5,6], [3,7]]
Prelude> [x ++ y | x <- xs, y <- ys]
[[1,2,3,4,5,6],[1,2,3,3,7],[123,4,5,6],[123,3,7]]
import Control.Applicative
(++) <$> [[1,2,3], [123]] <*> [[4,5,6], [3,7]]
[[1,2,3,4,5,6],[1,2,3,3,7],[123,4,5,6],[123,3,7]]
f l1 l2 = [x ++ y | x <- l1, y <- l2]
In Alternative:
import Control.Applicative
f :: (Applicative f, Alternative g) => f (g a) -> f (g a) -> f (g a)
f = liftA2 (<|>)
f a b = map concat . sequence $ [a,b]
Scales up for combining any number of lists.

Resources