How is Haskell's seq used? - haskell

So, Haskell's seq function forces the evaluation of its first argument and returns the second. Consequently it is an infix operator. If you want to force the evaluation of an expression, intuitively such a feature would be a unary operator. So, instead of
seq :: a -> b -> b
it would be
seq :: a -> a
Consequently, if the value you want is a, why return b, and how do you construct the b to return? Clearly, I am not thinking Haskell. :)

The way to think about a `seq` b is not that it "evaluates a" but that it creates a dependency between a and b, so that when you go to evaluate b you evaluate a as well.
This means, for example, that a `seq` a is completely redundant: you're telling Haskell to evaluate a when you evaluate a. By the same logic, seq a with just one argument would not be any different than simply writing a by itself.
Just having seq a that somehow evaluates a would not work. The problem is that seq a is itself an expression that might not be evaluated—it might be deep inside some nested thunks, for example. So it would only become relevant when you get to evaluating the whole seq a expression—at which point you would have been evaluating a by itself anyhow.
@Rhymoid's example of how it's used in a strict fold (foldl') is good. Our goal is to write a fold such that its intermediate accumulated value (acc) is completely evaluated at each step as soon as we evaluate the final result. This is done by adding a seq between the accumulated value and the recursive call:
foldl' f z []     = z
foldl' f z (x:xs) =
  let z' = f z x in z' `seq` foldl' f z' xs
You can visualize this as a long chain of seq between each application of f in the fold, connecting all of them to the final result. This way when you evaluate the final expression (i.e. the number you get by summing a list), it evaluates the intermediate values (i.e. partial sums as you fold through the list) strictly.
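As a minimal sketch of that chain, here is the same fold written out in full and used to sum a list. The names foldlStrict and sumTo are hypothetical, chosen only to avoid clashing with the library's foldl'.

```haskell
-- foldlStrict and sumTo are hypothetical names for illustration.
-- Each step ties the new accumulator z' to the recursive call with seq,
-- so demanding the final result forces every intermediate accumulator.
foldlStrict :: (b -> a -> b) -> b -> [a] -> b
foldlStrict _ z []       = z
foldlStrict f z (x : xs) =
  let z' = f z x
  in z' `seq` foldlStrict f z' xs

-- Partial sums are evaluated as the list is consumed,
-- instead of piling up as a chain of unevaluated (+) thunks.
sumTo :: Integer -> Integer
sumTo n = foldlStrict (+) 0 [1 .. n]
```

Note that seq only evaluates z' to weak head normal form, which is enough for a plain number but not for a nested structure such as a tuple.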

Related

How does GHC know how to cache one function but not the others?

I'm reading Learn You a Haskell (loving it so far) and it teaches how to implement elem in terms of foldl, using a lambda. The lambda solution seemed a bit ugly to me so I tried to think of alternative implementations (all using foldl):
import qualified Data.Set as Set
import qualified Data.List as List

-- LYAH implementation
elem1 :: (Eq a) => a -> [a] -> Bool
y `elem1` ys =
  foldl (\acc x -> if x == y then True else acc) False ys

-- When I thought about stripping duplicates from a list
-- the first thing that came to my mind was the mathematical set
elem2 :: (Eq a) => a -> [a] -> Bool
y `elem2` ys =
  head $ Set.toList $ Set.fromList $ filter (==True) $ map (==y) ys

-- Then I discovered `nub` which seems to be highly optimized:
elem3 :: (Eq a) => a -> [a] -> Bool
y `elem3` ys =
  head $ List.nub $ filter (==True) $ map (==y) ys
I loaded these functions in GHCi and did :set +s and then evaluated a small benchmark:
3 `elem1` [1..1000000] -- => (0.24 secs, 160,075,192 bytes)
3 `elem2` [1..1000000] -- => (0.51 secs, 168,078,424 bytes)
3 `elem3` [1..1000000] -- => (0.01 secs, 77,272 bytes)
I then tried to do the same on a (much) bigger list:
3 `elem3` [1..10000000000000000000000000000000000000000000000000000000000000000000000000]
elem1 and elem2 took a very long time, while elem3 was instantaneous (almost identical to the first benchmark).
I think this is because GHC knows that 3 is a member of [1..1000000], and the big number I used in the second benchmark is bigger than 1000000, hence 3 is also a member of [1..bigNumber] and GHC doesn't have to compute the expression at all.
But how is it able to automatically cache (or memoize, a term that Land of Lisp taught me) elem3 but not the two other ones?
Short answer: this has nothing to do with caching. In the first two implementations you force Haskell to iterate over all the elements.
No: foldl works left to right, but it keeps iterating over the list until the list is exhausted, even after a match has been found.
Therefore you are better off using foldr. With foldr, the search is cut off from the moment a 3 is found in the list.
This is because foldr is defined as:
foldr f z [x1, x2, x3] = f x1 (f x2 (f x3 z))
whereas foldl is implemented as:
foldl f z [x1, x2, x3] = f (f (f z x1) x2) x3
Note that the outermost f is applied to x3. So even if, due to laziness, you never evaluate the first operand, foldl still has to walk to the end of the list before that outermost application becomes available.
If we implement the foldl and foldr version, we get:
y `elem1l` ys = foldl (\acc x -> if x == y then True else acc) False ys
y `elem1r` ys = foldr (\x acc -> if x == y then True else acc) False ys
We then get:
Prelude> 3 `elem1l` [1..1000000]
True
(0.25 secs, 112,067,000 bytes)
Prelude> 3 `elem1r` [1..1000000]
True
(0.03 secs, 68,128 bytes)
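The difference is easiest to see on an infinite list, where only the foldr version can finish. The sketch below uses the hypothetical names elemL, elemR and foundInInfinite for the two variants measured above.

```haskell
-- elemL, elemR and foundInInfinite are hypothetical names for illustration.

-- Left fold: the outermost application involves the last element,
-- so the whole list must be walked before any answer is available.
elemL :: Eq a => a -> [a] -> Bool
elemL y = foldl (\acc x -> if x == y then True else acc) False

-- Right fold: the outermost application involves the first element,
-- and acc (the fold of the rest) is never demanded once x == y.
elemR :: Eq a => a -> [a] -> Bool
elemR y = foldr (\x acc -> if x == y then True else acc) False

-- Terminates even though the list is infinite.
foundInInfinite :: Bool
foundInInfinite = 3 `elemR` [1 :: Integer ..]
```

Evaluating 3 `elemL` [1 ..] instead would never return, because foldl has no outermost application to inspect until the list ends.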
Stripping the duplicates from the list will not improve the efficiency. What improves the efficiency here is that you use map, which works left to right and produces its output lazily. Note furthermore that nub is lazy as well, so nub is effectively a no-op here: since you are only interested in the head, Haskell does not need to perform membership checks against the already seen elements.
The performance is almost identical:
Prelude List> 3 `elem3` [1..1000000]
True
(0.03 secs, 68,296 bytes)
In case you work with a Set, however, uniqueness is not computed lazily: all the elements of the list are first inserted into the set, so again you will iterate over all the elements and not cut off the search after the first hit.
Explanation
foldl goes to the innermost element of the list, applies the computation, and does so again recursively to the result and the next innermost value of the list, and so on.
foldl f z [x1, x2, ..., xn] == (...((z `f` x1) `f` x2) `f`...) `f` xn
So in order to produce the result, it has to traverse all the list.
Conversely, in your function elem3 as everything is lazy, nothing gets computed at all, until you call head.
But in order to compute that value, you just need the first value of the (filtered) list, so you only need to go as far as the point where 3 is encountered in your big list. That is very soon, so the list is not traversed. If you asked for the 1000000th element, elem3 would probably perform as badly as the other ones.
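A small sketch of that pipeline, under the assumption that the searched element really occurs in the list. The name firstHit is hypothetical; like elem3, it fails on head when nothing matches.

```haskell
import Data.List (nub)

-- firstHit is a hypothetical name for illustration.
-- map, filter and nub all produce their output lazily, so head only
-- drives the pipeline as far as the first element equal to y.
-- Partial: head raises an error when y does not occur in a finite list.
firstHit :: Eq a => a -> [a] -> Bool
firstHit y ys = head (nub (filter id (map (== y) ys)))

-- Works on an infinite list because only three elements are inspected.
hitInInfinite :: Bool
hitInInfinite = firstHit 3 [1 :: Integer ..]
```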
Laziness
Laziness ensures that your language is always composable: breaking a function into subfunctions does not change what is done.
What you are seeing can lead to a space leak, which is really about how control flow works in a lazy language. In both strict and lazy languages your code decides what gets evaluated, but with a subtle difference:
In a strict language, the builder of the function chooses, as it forces evaluation of its arguments: whoever is called is in charge.
In a lazy language, the consumer of the function chooses: whoever calls is in charge. It may choose to evaluate only the first element (by calling head), or every other element, all provided that its own caller chooses to evaluate its computation as well. There is a whole chain of command deciding what to do.
In that reading, your foldl-based elem function uses that "inversion of control" in an essential way: elem gets asked to produce a value. foldl goes deep inside the list. If the first element is y, it returns the trivial computation True. If not, it forwards the request to the computation acc. In other words, what you read as the values acc, x or even True are really placeholders for computations, which you receive and yield back. And indeed, acc may be some unbelievably complex computation (or a divergent one like undefined); as long as you transfer control to the computation True, your caller will never see the existence of acc.
foldr vs foldl vs foldl' VS speed
As suggested in another answer, foldr might best express your intent on how to traverse the list, and will shield you from space leaks (whereas foldl' will prevent space leaks as well if you really want to traverse the other way, which can otherwise lead to a buildup of complex computations ... and can be very useful for circular computations, for instance).
But the speed issue is really an algorithmic one. There might be a better data structure for set membership if and only if you know beforehand that you have a certain pattern of usage.
For instance, it might be useful to pay some upfront cost to have a Set and then have fast membership queries, but that is only useful if you know that you will have a pattern with a few sets and lots of queries to those sets. Other data structures are optimal for other patterns, and it's interesting to note that from an API/specification/interface point of view, they usually look the same to the consumer. That's a general phenomenon in any language, and it is why many people love abstract data types/modules in programming.
Using foldr and expecting it to be faster really encodes the assumption that, given your static knowledge of your future access pattern, the values you are likely to test membership of will sit at the beginning. Using foldl would be fine if you expect your values to be at the end of the list.
Note that although with foldl you might construct the entire list, you do not construct the values themselves until you need them, for instance to test for equality, which happens only as long as you have not found the searched element.

Forcing Strict Evaluation - What am I doing wrong?

I want an intermediate result computed before generating the new one to get the benefit of memoization.
import qualified Data.Map.Strict as M
import Data.List
parts' m = newmap
  where
    n = M.size m + 1
    lists = nub $ map sort $
      [n] : (concat $ map (\i -> map (i:) (M.findWithDefault [] (n-i) m)) [1..n])
    newmap = seq lists (M.insert n lists m)
But, then if I do
take 2000 (iterate parts' (M.fromList [(1,[[1]])]))
It still completes instantaneously.
(Can using an Array instead of a Map help?)
Short answer:
If you need to calculate the entire list/array/map/... at once, you can use deepseq as @JoshuaRahm suggests, or the ($!!) operator.
The answer below shows how you can enforce strictness, but only one level deep (it evaluates until it reaches a data structure that may still contain remainders of expression trees).
Furthermore, the answer argues why laziness and memoization are not (necessarily) opposites of each other.
More advanced:
Haskell is a lazy language: it only calculates something if it is absolutely necessary. An expression like:
take 2000 (iterate parts' (M.fromList [(1,[[1]])]))
is not evaluated immediately: Haskell simply stores that this has to be calculated later. Later if you really need the first, second, i-th, or the length of the list, it will evaluate it, and even then in a lazy fashion: if you need the first element, from the moment it has found the way to calculate that element, it will represent it as:
element : take 1999 (<some-expression>)
You can however force Haskell to evaluate something strictly with the exclamation mark (!), as in the strict application operator ($!) or a bang pattern. For instance:
main = do
  return $! take 2000 (iterate parts' (M.fromList [(1,[[1]])]))
Or in case it is an argument, you can use a bang pattern (this needs the BangPatterns extension):
f x !y !z = x+y+z
Here you force Haskell to evaluate y and z before "increasing the expression tree" as:
expression-for-x+expression-for-y+expression-for-z.
EDIT: if you use it in a let pattern, you can use the bang as well:
let !foo = take 2000 (iterate parts' (M.fromList [(1,[[1]])])) in ...
Note that you only collapse the structure to the first level. Thus let !foo will more or less only evaluate up to (_:_).
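To illustrate what "only the first level" means, here is a small sketch; the names whnfOnly and addStrict are hypothetical. Forcing a list with a bang pattern or seq exposes the outermost (:) cell and nothing more, so an undefined element inside it is never touched.

```haskell
{-# LANGUAGE BangPatterns #-}

-- whnfOnly and addStrict are hypothetical names for illustration.

-- The bang forces xs to weak head normal form, i.e. to (_ : _).
-- The undefined head is never evaluated, so this returns normally.
whnfOnly :: String
whnfOnly =
  let !xs = (undefined :: Int) : []
  in xs `seq` "ok"

-- Bang patterns on arguments: y and z are evaluated before the body runs.
addStrict :: Int -> Int -> Int -> Int
addStrict x !y !z = x + y + z
```

Forcing the elements as well would need something deeper than seq, such as deepseq.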
Note: memoization and laziness are not necessarily opposites of each other. Consider the list:
numbers :: [Integer]
numbers = 0:[i+(sum (genericTake i numbers))|i<-[1..]]
As you can see, calculating a number requires a large amount of computational effort. Numbers is represented like:
numbers ---> (0:[i+(sum (genericTake i numbers))|i<-[1..]])
If, however, I evaluate numbers!!1, it has to calculate the second element, which is 1; but the internal structure of numbers is updated as well. Now it looks like:
numbers ---> (0:1:[i+(sum (genericTake i numbers))|i<-[2..]])
The computation numbers!!1 thus "helps" future computations, because you will never have to recalculate the second element of the list.
If you for instance calculate numbers!!4000, it will take a few seconds. Later if you calculate numbers!!4001, it will be calculated almost instantly. Simply because the work already done by numbers!!4000 is reused.
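The same sharing effect can be sketched with a more familiar list. This example is an assumption added for illustration (fibs and fibAt are hypothetical names, not part of the question): a top-level lazy list is evaluated at most once per cell, so later lookups reuse the cells that earlier lookups already forced.

```haskell
-- fibs and fibAt are hypothetical names for illustration.
-- fibs is a single shared list: once a cell has been evaluated,
-- every later access sees the computed value instead of a thunk.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

-- Indexing forces the list up to position n; a later call with a
-- nearby index only has to evaluate the cells that are still missing.
fibAt :: Int -> Integer
fibAt n = fibs !! n
```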
Arrays might be able to help, but you can also try taking advantage of the deepseq library (module Control.DeepSeq). So you can write code like this:
let x = take 2000 (iterate parts' (M.fromList [(1,[[1]])])) in do
  x `deepseq` print (x !! 5) -- takes a *really* long time
  print (x !! 1999) -- finishes instantly
You are memoizing the partition function, but there are some drawbacks to your approach:
you are only memoizing up to a specific value which you have to specify beforehand
you need to call nub and sort
Here is an approach using Data.MemoCombinators (from the data-memocombinators package):
import Data.MemoCombinators
parts = integral go
  where
    go k | k <= 0 = [] -- for safety
    go 1 = [[1]]
    go n = [[n]] ++ [ (a : p) | a <- [n-1,n-2..1], p <- parts (n-a), a >= head p ]
E.g.:
ghci> parts 4
[[4],[3,1],[2,2],[2,1,1],[1,1,1,1]]
This memoization is dynamic, so only the values you actually access will be memoized.
Note how it is constructed - parts = integral go, and go uses parts for any recursive calls. We use the integral combinator here because parts is a function of an Int.

Why doesn't product [0..] evaluate to 0 "instantly"?

I am trying to understand laziness. Because 0 multiplied with any number is 0, shouldn't product [0..] evaluate to 0? I tried also foldl (*) 1 [0..], and to define my own product as
myProduct 0 _ = 0
myProduct _ 0 = 0
myProduct a b = a*b
Why doesn't the fold stop as soon as a 0 is found?
Because the multiply operator doesn't know it's getting chained, and the fold function doesn't know the multiply operator's particular behaviour for any argument. With that combination, it needs to exhaust the list to finish the fold. In fact, for this reason foldl doesn't work at all on infinite lists. foldr does, because it can expand the function from the head of the list.
foldl (*) 1 [0..] -> (((..(((1*0)*1)*2)*3....)*inf
The outermost multiplication in the foldl case can never be found, because the list is infinite. It therefore cannot follow the chain to conclude the result is zero. It can, and does, calculate the product along the list, and that product happens to stay zero, but it will not terminate. If you use scanl instead you can see these intermediate products.
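For instance, the first few intermediate products can be observed like this (partialProducts is a hypothetical name for illustration):

```haskell
-- partialProducts is a hypothetical name for illustration.
-- scanl yields every intermediate accumulator of the left fold lazily,
-- so a finite prefix can be taken even though the input is infinite.
partialProducts :: [Integer]
partialProducts = take 5 (scanl (*) 1 [0 ..])
-- [1,0,0,0,0]: the seed 1, then 1*0, 0*1, 0*2, 0*3
```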
foldr (*) 1 [0..] -> 0*(1*(2*(3*((...((inf*1)))...)))
The outermost multiplication in the foldr case is found immediately, because the rest of the list is in fact left as a lazy thunk. It only runs one step:
foldr (*) 1 [0..] -> 0*(foldr (*) 1 [1..])
So because your custom multiplication operator myProduct is not strict in the second argument if the first argument is zero, foldr myProduct 1 [0..] can terminate.
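A minimal sketch of that combination, with the hypothetical names lazyMul and productLazy standing in for the question's myProduct and the fold over it:

```haskell
-- lazyMul and productLazy are hypothetical names for illustration.

-- The first clause matches on the first argument only, so when it is 0
-- the second argument (the fold of the rest of the list) is never forced.
lazyMul :: Integer -> Integer -> Integer
lazyMul 0 _ = 0
lazyMul _ 0 = 0
lazyMul a b = a * b

-- Right fold: the outermost lazyMul sees the first list element directly.
productLazy :: [Integer] -> Integer
productLazy = foldr lazyMul 1

-- Terminates: the first 0 cuts off the rest of the infinite list.
zeroFromInfinite :: Integer
zeroFromInfinite = productLazy (3 : 2 : [0 ..])
```

On a list without a zero the fold still has to reach the end, so productLazy [1 ..] does not terminate.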
As a side note, the Prelude product function is restricted to finite lists (and may be implemented with foldl). Even if it used foldr, it probably would not shortcut because the standard multiply operator is strict; doing otherwise would be computationally expensive in the common case where the products are neither zero nor chained.
-- sum and product compute the sum or product of a finite list of numbers.
sum, product :: (Num a) => [a] -> a
sum = foldl (+) 0
product = foldl (*) 1
In addition, there's a reason it does not use foldr; as we could see in the expansions and scanl function, the left folds can compute as they consume the list. The right fold, if the operator does not shortcut, needs to build an expression as large as the list itself to even begin computation. This difference is because it's the innermost expression that starts the computation in the strict case, but the outermost expression that produces the result, allowing the lazy case. Lazy vs. non-strict in the Haskell wiki might explain better than I can, and even mentions that pattern matching, which you used to describe the shortcut in myProduct, can be strict.
If you switch the first two lines:
myProduct _ 0 = 0
myProduct 0 _ = 0
myProduct a b = a*b
the second argument will always be evaluated before the first one and the infinite foldr won't work anymore.
Since it's impossible to define a myProduct that works lazily for both arguments (not evaluating the second if the first is 0 and not evaluating the first if the second is 0), maybe we are better off with having * always evaluate both its arguments.
You can have it thusly:
myproduct xs = foldr op id xs 1
  where
    op x r acc = if x==0 then 0 else acc `seq` r (acc*x)
This is a right fold that multiplies the numbers from the left, operating in constant space, and stops as soon as a 0 is encountered.

Why is this tail-recursive Haskell function slower ?

I was trying to implement a Haskell function that takes as input an array of integers A
and produces another array B = [A[0], A[0]+A[1], A[0]+A[1]+A[2] ,... ]. I know that scanl from Data.List can be used for this with the function (+). I wrote the second implementation
(which performs faster) after seeing the source code of scanl. I want to know why the first implementation is slower compared to the second one, despite being tail-recursive?
-- This function works slow.
ps s x [] = x
ps s x y = ps s' x' y'
  where
    s' = s + head y
    x' = x ++ [s']
    y' = tail y

-- This function works fast.
ps' s [] = []
ps' s y = [s'] ++ (ps' s' y')
  where
    s' = s + head y
    y' = tail y
Some details about the above code:
Implementation 1 : It should be called as
ps 0 [] a
where 'a' is your array.
Implementation 2: It should be called as
ps' 0 a
where 'a' is your array.
You are changing the way that ++ associates. In your first function you are computing ((([a0] ++ [a1]) ++ [a2]) ++ ...) whereas in the second function you are computing [a0] ++ ([a1] ++ ([a2] ++ ..)). Appending a few elements to the start of the list is O(1), whereas appending a few elements to the end of a list is O(n) in the length of the list. This leads to a linear versus quadratic algorithm overall.
You can fix the first example by building the list up in reverse order, and then reversing again at the end, or by using something like dlist. However the second will still be better for most purposes. While tail calls do exist and can be important in Haskell, if you are familiar with a strict functional language like Scheme or ML your intuition about how and when to use them is completely wrong.
The second example is better, in large part, because it's incremental; it immediately starts returning data that the consumer might be interested in. If you just fixed the first example using the double-reverse or dlist tricks, your function will traverse the entire list before it returns anything at all.
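The two options can be sketched side by side. The names prefixSumsAcc and prefixSumsLazy are hypothetical; the first is the "build in reverse, then reverse once" fix for the tail-recursive version, the second is the incremental style of the fast version.

```haskell
-- prefixSumsAcc and prefixSumsLazy are hypothetical names for illustration.

-- Tail-recursive fix: cons onto the front (O(1) per step) and reverse once
-- at the end. Linear overall, but nothing is returned until the whole
-- input has been consumed.
prefixSumsAcc :: [Integer] -> [Integer]
prefixSumsAcc = go 0 []
  where
    go _ acc []       = reverse acc
    go s acc (y : ys) = let s' = s + y in s' `seq` go s' (s' : acc) ys

-- Incremental version: each cons cell is available before the recursive
-- call is evaluated, so a consumer can stop early.
prefixSumsLazy :: [Integer] -> [Integer]
prefixSumsLazy = go 0
  where
    go _ []       = []
    go s (y : ys) = let s' = s + y in s' : go s' ys
```

Only the second one can be used as take 3 (prefixSumsLazy [1 ..]); the accumulator version would loop forever on an infinite input.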
I would like to mention that your function can be more easily expressed as
drop 1 . scanl (+) 0
Usually, it is a good idea to use predefined combinators like scanl in favour of writing your own recursion schemes; it improves readability and makes it less likely that you needlessly squander performance.
However, in this case, both my scanl version and your original ps and ps' can sometimes lead to stack overflows due to lazy evaluation: Haskell does not necessarily immediately evaluate the additions (depends on strictness analysis).
One case where you can see this is if you do last (ps' 0 [1..100000000]). That leads to a stack overflow. You can solve that problem by forcing Haskell to evaluate the additions immediately, for instance by defining your own, strict scanl:
myscanl :: (b -> a -> b) -> b -> [a] -> [b]
myscanl f q [] = []
myscanl f q (x:xs) = q `seq` let q' = f q x in q' : myscanl f q' xs
ps' = myscanl (+) 0
Then, calling last (ps' [1..100000000]) works.

Haskell: foldl' accumulator parameter

I've been asking a few questions about strictness, but I think I've missed the mark before. Hopefully this is more precise.
Let's say we have:
n = 1000000
f z = foldl' (\(x1, x2) y -> (x1 + y, y - x2)) z [1..n]
Without changing f, what should I set
z = ...
So that f z does not overflow the stack? (i.e. runs in constant space regardless of the size of n)
It's okay if the answer requires GHC extensions.
My first thought is to define:
g (a1, a2) = (!a1, !a2)
and then
z = g (0, 0)
But I don't think g is valid Haskell.
So your strict foldl' is only going to evaluate the result of your lambda at each step of the fold to Weak Head Normal Form, i.e. it is only strict in the outermost constructor. Thus the tuple will be evaluated, however those additions inside the tuple may build up as thunks. This in-depth answer actually seems to address your exact situation here.
W/R/T your g: You are thinking of BangPatterns extension, which would look like
g (!a1, !a2) = (a1, a2)
and which evaluates a1 and a2 to WHNF before returning them in the tuple.
What you want to be concerned about is not your initial accumulator, but rather your lambda expression. This would be a nice solution:
f z = foldl' (\(!x1, !x2) y -> (x1 + y, y - x2)) z [1..n]
EDIT: After noticing your other questions I see I didn't read this one very carefully. Your goal is to have "strict data" so to speak. Your other option, then, is to make a new tuple type that has strictness tags on its fields:
data Tuple a b = Tuple !a !b
Then when you pattern match on Tuple a b, a and b will be evaluated.
You'll need to change your function regardless.
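A sketch of the strict-data option, assuming you are free to rewrite the fold itself. StrictPair and sumAndAlt are hypothetical names; the step function mirrors the lambda from the question.

```haskell
import Data.List (foldl')

-- StrictPair and sumAndAlt are hypothetical names for illustration.
-- The strictness annotations mean both fields are evaluated whenever
-- the constructor itself is evaluated.
data StrictPair a b = StrictPair !a !b

-- foldl' forces the accumulator to WHNF at every step; with strict fields
-- that also forces both components, so no chain of thunks builds up.
sumAndAlt :: Integer -> (Integer, Integer)
sumAndAlt n = unpack (foldl' step (StrictPair 0 0) [1 .. n])
  where
    step (StrictPair x1 x2) y = StrictPair (x1 + y) (y - x2)
    unpack (StrictPair a b)   = (a, b)
```

With an ordinary lazy tuple in place of StrictPair, foldl' would only force the (,) constructor and leave x1 + y and y - x2 unevaluated.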
There is nothing you can do without changing f. If f were overloaded in the type of the pair you could use strict pairs, but as it stands you're locked in to what f does. There's some small hope that the compiler (strictness analysis and transformations) can avoid the stack growth, but nothing you can count on.
