Why does foldr work on an infinite list? - haskell

This function can work on infinite association lists, and it is easy to see why:
findKey :: (Eq k) => k -> [(k,v)] -> Maybe v
findKey key [] = Nothing
findKey key ((k,v):xs) = if key == k
                         then Just v
                         else findKey key xs
When it finds the key, it returns Just v, stopping the recursion.
Now look at this other implementation:
findKey' :: (Eq k) => k -> [(k,v)] -> Maybe v
findKey' key = foldr (\(k,v) acc -> if key == k then Just v else acc) Nothing
How does the compiler/interpreter know that when the key matches k, it can return it?
*Main> findKey' 1 $ zip [1..] [1..]
returns Just 1
When it finds that key == k, it returns Just v. Why does the recursion stop there, allowing us to do such things with infinite association lists?

Because the function passed to foldr does not always evaluate the acc parameter, i.e. it is lazy in that parameter.
For example,
(\(k,v) acc -> if 1 == k then Just v else acc) (1,"one") (error "here be dragons!")
will return "one" without even attempting to evaluate the error expression.
Moreover, foldr by definition satisfies:
foldr f a (x:xs) = f x (foldr f a xs)
If x:xs is infinite, but f does not use its second argument, then foldr can return immediately.
In your example, f evaluates its second argument if and only if the first argument is not the wanted association. This means that the association list will only be evaluated far enough to find the key's association.
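Hand-expanding the infinite-list example above makes this concrete (a sketch, not actual GHC output; f abbreviates the lambda passed to foldr):
f = \(k,v) acc -> if 1 == k then Just v else acc

findKey' 1 (zip [1..] [1..])
  = foldr f Nothing ((1,1) : rest)               -- rest is an unevaluated thunk
  = f (1,1) (foldr f Nothing rest)               -- by foldr's equation
  = if 1 == 1 then Just 1 else foldr f Nothing rest
  = Just 1                                        -- the else branch, and with it rest, is never demanded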
If you like to experiment, try this instead:
foldr (\(k,v) acc -> case acc of
         Nothing -> if key == k then Just v else acc
         Just y  -> if key == k then Just v else acc) Nothing
The case looks redundant, since the function returns the same thing in both branches. However, it demands the evaluation of acc, which breaks the code on infinite lists.
Another thing you might want to try:
foldr (:) [] [0..]
This basically rebuilds the infinite list as it is.
foldr (\x xs -> x*10 : xs) [] [0..]
This multiplies everything by 10, and is equivalent to map (*10) [0..].
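For instance, taking a finite prefix of either result terminates (the values below are what you should see):
*Main> take 5 (foldr (:) [] [0..])
[0,1,2,3,4]
*Main> take 5 (foldr (\x xs -> x*10 : xs) [] [0..])
[0,10,20,30,40]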

The non-empty case of foldr can be defined as foldr f init (x:xs) = f x (foldr f init xs). In your case f is (\(k,v) acc -> if key == k then Just v else acc), so (k,v) stands for the current element of the list and acc stands for (foldr f init xs). That is, acc stands for the recursive call. In the then-case you do not use acc, so the recursive call never happens: Haskell is lazy, meaning arguments aren't evaluated until (and unless) they are used.
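You can see the same thing without an infinite list by putting an error in the tail of the list (an illustrative check, not from the question; the lambda is the same as above with the key fixed to 2):
*Main> foldr (\(k,v) acc -> if 2 == k then Just v else acc) Nothing ((1,'a') : (2,'b') : error "tail never evaluated")
Just 'b'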

Related

Why is it sometimes possible to fold an infinite list from the right?

I have been going through the excellent CIS 194 course when I got stuck on Part 5 of Homework 6. It revolves around implementing the ruler function without any divisibility testing.
I found that it is possible to build the ruler function by continuously interspersing an accumulator with values from an infinite list.
nats = [0,1,2,3,..]
[3]
[2,3,2]
[1,2,1,3,1,2,1]
[0,1,0,2,0,1,0,3,0,1,0,2,0]
Then I tried implementing this algorithm for the Stream datatype, which is a list without nil:
data Stream a = Cons a (Stream a)
streamToList :: Stream a -> [a]
streamToList (Cons x xs) = x : streamToList xs
instance Show a => Show (Stream a) where
  show = show . take 20 . streamToList
streamFromSeed :: (a -> a) -> a -> Stream a
streamFromSeed f x = Cons x (streamFromSeed f (f x))
nats :: Stream Integer
nats = streamFromSeed succ 0
interleave x (Cons y ys) = Cons x (Cons y (interleave x ys))
foldStream f (Cons x xs) = f x (foldStream f xs)
ruler = foldStream interleave nats
As expected, I got a stack overflow error since I was trying to fold from the right. However, I was surprised to see the same algorithm work for normal infinite lists.
import Data.List
interleave x list = [x] ++ (intersperse x list) ++ [x]
ruler = take 20 (foldr interleave [] [0..])
What am I missing? Why does one implementation work while the other doesn't?
Your interleave is insufficiently lazy. The magic thing that right folds must do to work on infinite structures is to produce the first bit of output before inspecting the folded-up result too closely. So:
interleave x stream = Cons x $ case stream of
  Cons y ys -> Cons y (interleave x ys)
This produces Cons x _ before inspecting stream; in contrast, your version requires stream to be evaluated a bit before it can pass to the right hand side of the equation, which essentially forces the entire fold to happen before any constructor gets produced.
You can also see this in your list version of interleave:
interleave x list = [x] ++ intersperse x list ++ [x]
The first element of the returned list (x) is known before intersperse starts pattern matching on list.
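For completeness, a minimal sketch of the repaired Stream version (assuming the Stream, nats, streamToList and foldStream definitions from the question; interleaveLazy and ruler' are names chosen here):
interleaveLazy :: a -> Stream a -> Stream a
interleaveLazy x stream = Cons x $ case stream of
  Cons y ys -> Cons y (interleaveLazy x ys)

ruler' :: Stream Integer
ruler' = foldStream interleaveLazy nats
-- take 20 (streamToList ruler') should now begin 0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4,...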
We can inspect the source code of foldr. A less noisy version looks like:
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
Haskell does not evaluate eagerly. This means that, unless you need (foldr f z xs), it will not be evaluated. So if f does not need its second parameter, for example because the first item x already determines the result, the accumulator is never evaluated.
For example, if we implement takeWhileNeq:
takeWhileNeq a = foldr f []
  where f x xs = if x == a then [] else (x:xs)
If we run this as takeWhileNeq 2 [1,4,2,5], nothing is evaluated yet. If we then want to print the result, it is evaluated as:
f 1 (foldr f [] [4,2,5])
and f will check whether 1 == 2; since that is not the case, it returns (x:xs), so:
-> 1 : foldr f [] [4,2,5]
Now it evaluates 4 == 2, and because this is false, this evaluates to:
-> 1 : (4 : foldr f [] [2,5])
Now we evaluate 2 == 2, and since this is True, the function returns the empty list and ignores the accumulator, so it never looks at foldr f [] [5]:
-> 1 : (4 : [])
For an infinite list it will likewise return the empty list at the first match and ignore folding the rest of the list.
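With the takeWhileNeq above, the same short-circuiting lets the fold terminate even on an infinite list (the result below is what you should see):
*Main> takeWhileNeq 5 [1..]
[1,2,3,4]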

Is using fold less efficient than standard recursion?

I'm going through the Learn You a Haskell book right now and I'm curious about how this particular example works. The book first demonstrates an implementation of findKey using traditional recursion:
findKey :: (Eq k) => k -> [(k,v)] -> Maybe v
findKey key [] = Nothing
findKey key ((k,v):xs) = if key == k
                         then Just v
                         else findKey key xs
The book then follows up with a shorter implementation using foldr
findKey :: (Eq k) => k -> [(k,v)] -> Maybe v
findKey key = foldr (\(k,v) acc -> if key == k then Just v else acc) Nothing
With the standard recursion, the function should immediately return once it hits the first element with the provided key. If I understand the foldr implementation correctly, it will iterate over the entire list every time, even if it matched the first element it came across. That doesn't seem like a very efficient way to handle the problem.
Is there something I'm not getting about how the foldr implementation works? Or is there some kind of magic within Haskell that makes this implementation not quite as inefficient as I think it is?
foldr is written using standard recursion.
The recursive call to foldr is hidden inside of acc. If your code doesn't use acc, it will never be computed (because Haskell is lazy). So the foldr version is efficient and will also return early.
Here's an example demonstrating this:
Prelude> foldr (\x z -> "done") "acc" [0 ..]
"done"
This expression returns "done" immediately, even though the input list is infinitely long.
If foldr is defined as:
foldr f z (x : xs) = f x (foldr f z xs)
foldr _ z [] = z
, then evaluation goes via
f x (foldr f z xs)
where
f = \x z -> "done"
x = 0
z = "acc"
xs = ... -- unevaluated, but is [1 ..]
which is
(\x z -> "done") 0 (foldr (\x z -> "done") "acc" [1 ..])
which turns into "done" because the first function doesn't use z, so the recursive call is never needed.
If I understand the foldr implementation correctly, it will iterate over the entire list every time, even if it matched the first element it came across.
This is wrong. foldr will evaluate the list only as much as needed.
E.g.
foldr (&&) True [True, False, error "unreached code here"]
returns False since the error is never evaluated, precisely as in
(True && (False && (error "unreached code here" && True)))
Indeed, since the end of the list is never reached, we can also write
foldr (&&) (error "end") [True, False, error "unreached code here"]
and still obtain False.
Here is code which demonstrates that foldr does indeed "short-circuit" the evaluation of findKey:
import Debug.Trace
findKey :: (Eq k) => k -> [(k,v)] -> Maybe v
findKey key = foldr (\(k,v) acc -> if key == k then Just v else acc) Nothing
tr x = trace msg x
  where msg = "=== at: " ++ show x
thelist = [ tr (1,'a'), tr (2,'b'), tr (3, 'c'), tr (4, 'd') ]
An example of running findKey in ghci:
*Main> findKey 2 thelist
=== at: (1,'a')
=== at: (2,'b')
Just 'b'
*Main>
Think of foldr using the following definition (using standard recursion):
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f e [] = e
foldr f e (x:xs) = f x (foldr f e xs)
The third line shows that the second implementation for findKey will return upon finding the first match.
As a sidenote: assume you had the following definition (which does not have identical functionality) for findKey (as an exercise you might want to rewrite the definition using foldr):
findKey :: (Eq k) => k -> [(k,v)] -> [v]
findKey key [] = []
findKey key ((kHead, vHead):rest) = if (key == kHead) then vHead:(findKey key rest) else findKey key rest
Now you might think that this would iterate through the whole input list. Depending on how you invoke this function, it could be the case that it iterates through the whole list, but at the same time this can give you the first match efficiently too. Due to Haskell's lazy evaluation the following code:
head (findKey key li)
will give you the first match (assuming that there's one) with the same efficiency as your first example.
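You can convince yourself of this by putting an error where the rest of the list would be (an illustrative check, not from the book; this uses the list-returning findKey just above):
*Main> head (findKey 1 ((1,'a') : error "rest of the list is never needed"))
'a'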
foldr f z [a,b,c,...,n] ==
a `f` (b `f` (c `f` (... (n `f` z) ...))) ==
f a (foldr f z [b,c,...,n]) ==
f a acc where acc = foldr f z [b,c,...,n]
So if your f returns before forcing acc, acc remains not forced, i.e. no part of the list argument beyond its head element a is accessed, like e.g. when you have
f a acc = ...
If, on the other hand, your f does force its second argument, e.g. if it's defined as
f a (x:xs) = ...
then the acc is forced before f starts its work, and the list will be accessed in whole before the processing begins -- in whole, because acc = f b acc2 and that invocation of f must force its second argument, acc2, so its value, acc, can be forced (pattern-matched with (x:xs), that is); and so forth.

Find the K'th element of a list using foldr

I am trying to implement my own safe search for an element of a list by index.
I think that my function has to have this signature:
safe_search :: [a] -> Int -> Maybe a
safe_search xs n = foldr iteration init_val xs n
iteration = undefined
init_val = undefined
I have a problem with the implementation of iteration. I think it has to look like this:
safe_search :: [a] -> Int -> Maybe a
safe_search xs n = foldr iteration init_val xs n
  where
    iteration :: a -> (Int -> [a]) -> Int -> a
    iteration x g 0 = []
    iteration x g n = x (n - 1)
    init_val :: Int -> a
    init_val = const 0
But it has too many errors. My intuition about Haskell is wrong.
you have
safe_search :: [a] -> Int -> Maybe a
safe_search xs n = foldr iteration init_val xs n
if null xs holds, foldr iteration init_val [] => init_val, so
init_val n
must make sense. Nothing to return, so
= Nothing
is all we can do here, to fit the return type.
So init_val is a function, :: Int -> Maybe a. By the definition of foldr, this is also what the "recursive" argument to the combining function is, "coming from the right":
iteration x r
but then this call must also return just such a function itself (again, by the definition of foldr, foldr f z [a,b,c,...,n] == f a (f b (f c (...(f n z)...))), f :: a -> b -> b, i.e. it must return a value of the same type as it gets in its 2nd argument), so
n | n==0 = Just x
That was easy, 0-th element is the one at hand, x; what if n > 0?
| n>0 = ... (n-1)
Right? Just one more step left for you to do on your own... :) It's not x (the list's element) that goes on the dots there; it must be a function. We've already received such a function, as an argument...
To see what's going on here, it might help to check the case when the input is a one-element list, first,
safe_search [x] n = foldr iteration init_val [x] n
= iteration x init_val n
and with two elements,
[x1, x2] n = iteration x1 (iteration x2 init_val) n
-- iteration x r n
Hope it is clear now.
edit: So, this resembles the usual foldr-based implementation of zip, fused with the descending enumeration from n down, indeed encoding the higher-level definition of
foo xs n = ($ zip xs [n,n-1..]) $
           dropWhile ((>0) . snd) >>>
           map fst >>>
           take 1 >>> listToMaybe
         = drop n >>> take 1 >>> listToMaybe $ xs
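If you want something to check your own final answer against, here is one way the blanks above can be filled in (a sketch only; the inner type annotations are kept as comments because as written they would need ScopedTypeVariables):
safe_search :: [a] -> Int -> Maybe a
safe_search xs n = foldr iteration init_val xs n
  where
    -- iteration :: a -> (Int -> Maybe a) -> Int -> Maybe a
    iteration x r i
      | i == 0    = Just x
      | otherwise = r (i - 1)
    -- init_val :: Int -> Maybe a
    init_val _ = Nothing

-- safe_search [0..] 3 == Just 3   (works on infinite lists)
-- safe_search [1,2] 5 == Nothing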
Think about a few things.
What type should init_val have?
What do you need to do with g? g is the trickiest part of this code. If you've ever learned about continuation-passing style, you should probably think of both init_val and g as continuations.
What does x represent? What will you need to do with it?
I wrote up an explanation some time ago about how the definition of foldl in terms of foldr works. You may find it helpful.
I suggest using the standard foldr pattern, because the code is easier to read and understand when you use standard functions:
foldr has the type foldr :: (a -> b -> b) -> b -> [a] -> b,
where the second argument b is the accumulator acc for the elements of your list [a].
You need to stop adding elements of your list [a] to acc after you have added the desired element. Then you take the head of the resulting list and thus get the desired element of the list [a].
To get the n'th element of the list xs, you need to add length xs - n elements of xs to the accumulator acc, counting from the end of the list.
But where do we keep an iterator if we want to use the standard foldr function? We can keep it in our accumulator, representing it as a tuple (acc, iterator). We subtract 1 from the iterator each time we add an element from our initial list xs to acc, and we stop adding elements of xs to acc when our iterator reaches 0.
Then we apply head . fst to the result of our foldr to get the desired element of the initial list xs and wrap it with the Just constructor.
Of course, if length xs - 1 is less than the index n of the desired element, the result of the whole function safeSearch will be Nothing.
Here is the code of the function safeSearch:
safeSearch :: Int -> [a] -> Maybe a
safeSearch n xs
  | (length xs - 1) < n = Nothing
  | otherwise           = return $ findElem n' xs
  where
    findElem num =
      head .
      fst .
      foldr (\x (acc, iterator) ->
               if iterator /= 0
               then (x : acc, iterator - 1)
               else (acc, iterator))
            ([], num)
    n' = length xs - n
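For example (expected results; note that this version needs a finite list, since it uses length):
*Main> safeSearch 2 [10,20,30,40]
Just 30
*Main> safeSearch 9 [10,20,30,40]
Nothing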

Haskell - format issue

I am a beginner in Haskell programming and very often I get the error
xxx.hs:30:1: parse error on input `xxx'
Often a little playing around with the formatting is the solution. It's the same code and it looks the same, but after playing around, the error is gone.
At the moment I've got the error
LookupAll.hs:30:1: parse error on input `lookupAll'
for this code:
lookupOne :: Int -> [(Int,a)] -> [a]
lookupOne _ [] = []
lookupOne x list =
  if fst(head list) == x then snd(head list) : []
  lookupOne x (tail list)
-- | Given a list of keys and a list of pairs of key and value
-- 'lookupAll' looks up the list of associated values for each key
-- and concatenates the results.
lookupAll :: [Int] -> [(Int,a)] -> [a]
lookupAll [] _ = []
lookupAll _ [] = []
lookupAll xs list = lookupOne h list ++ lookupAll t list
  where
    h = head xs
    t = tail xs
But I have done everything right in my opinion. There are no tabs or anything like that, always 4 spaces. Is there a general solution for these problems? I am using Notepad++ at the moment.
Thanks!
The problem is not with lookupAll; it's actually with the previous two lines of code:
if fst (head list) == x then snd (head list) : []
lookupOne x (tail list)
You haven't included an else on this if statement. My guess is that you meant
if fst (head list) == x then snd (head list) : []
else lookupOne x (tail list)
Which I personally would prefer to format as
if fst (head list) == x
then snd (head list) : []
else lookupOne x (tail list)
but that's a matter of taste.
If you are wanting to accumulate a list of values that match a condition, there are a few ways. By far the easiest is to use filter, but you can also use explicit recursion. To use filter, you could write your function as
lookupOne x list
  = map snd                          -- Return only the values from the assoc list
  $ filter (\y -> fst y == x) list   -- Find each pair whose first element equals x
If you wanted to use recursion, you could instead write it as
lookupOne _ [] = [] -- The base case pattern
lookupOne x (y:ys) =            -- Pattern match with (:), don't have to use head and tail
  if fst y == x                 -- Check if the key and lookup value match
  then snd y : lookupOne x ys   -- If so, prepend it onto the result of looking up the rest of the list
  else lookupOne x ys           -- Otherwise, just return the result of looking up the rest of the list
Both of these are equivalent. In fact, you can implement filter as
filter cond [] = []
filter cond (x:xs) =
  if cond x
  then x : filter cond xs
  else filter cond xs
And map as
map f [] = []
map f (x:xs) = f x : map f xs
Hopefully you can spot the similarities between filter and lookupOne, and with map consider f == snd: the explicit recursive version of lookupOne is a merger of the two patterns of map and filter. You could generalize this combined pattern into a higher order function
mapFilter :: (a -> b) -> (a -> Bool) -> [a] -> [b]
mapFilter f cond [] = []
mapFilter f cond (x:xs) =
  if cond x
  then f x : mapFilter f cond xs
  else mapFilter f cond xs
Which you can use to implement lookupOne as
lookupOne x list = mapFilter snd (\y -> fst y == x) list
Or more simply
lookupOne x = mapFilter snd ((== x) . fst)
I think #bheklilr is right - you're missing an else.
You could fix this particular formatting problem, however, by forming lookupOne as a function composition, rather than writing your own new recursive function.
For example, you can get the right kind of behaviour by defining lookupOne like this:
lookupOne a = map snd . filter ((==) a . fst)
This way it's clearer that you're first filtering out the elements of the input list for which the first element of the tuple matches the key, and then extracting just the second element of each tuple.
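Either definition behaves the same way; for instance:
*Main> lookupOne 1 [(1,'a'),(2,'b'),(1,'c')]
"ac"
*Main> lookupOne 3 [(1,'a'),(2,'b'),(1,'c')]
""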

Under what circumstances are monadic computations tail-recursive?

In Haskell Wiki's Recursion in a monad there is an example that is claimed to be tail-recursive:
f 0 acc = return (reverse acc)
f n acc = do
  v <- getLine
  f (n-1) (v : acc)
While the imperative notation leads us to believe that it is tail-recursive, it's not so obvious at all (at least to me). If we de-sugar the do notation we get
f 0 acc = return (reverse acc)
f n acc = getLine >>= \v -> f (n-1) (v : acc)
and rewriting the second line leads to
f n acc = (>>=) getLine (\v -> f (n-1) (v : acc))
So we see that f occurs inside the second argument of >>=, not in a tail-recursive position. We'd need to examine IO's >>= to get an answer.
Clearly, having the recursive call as the last line in a do block isn't a sufficient condition for a function to be tail-recursive.
Let's say that a monad is tail-recursive iff every recursive function in this monad defined as
f = do
  ...
  f ...
or equivalently
f ... = (...) >>= \x -> f ...
is tail-recursive. My question is:
What monads are tail-recursive?
Is there some general rule that we can use to immediately distinguish tail-recursive monads?
Update: Let me make a specific counter-example: The [] monad is not tail-recursive according to the above definition. If it were, then
f 0 acc = acc
f n acc = do
  r <- acc
  f (n - 1) (map (r +) acc)
would have to be tail-recursive. However, desugaring the second line leads to
f n acc = acc >>= \r -> f (n - 1) (map (r +) acc)
        = (flip concatMap) acc (\r -> f (n - 1) (map (r +) acc))
Clearly, this isn't tail-recursive, and IMHO cannot be made. The reason is that the recursive call isn't the end of the computation. It is performed several times and the results are combined to make the final result.
A monadic computation that refers to itself is never tail-recursive. However, in Haskell you have laziness and corecursion, and that is what counts. Let's use this simple example:
forever :: (Monad m) => m a -> m b
forever c' = let c = c' >> c in c
Such a computation runs in constant space if and only if (>>) is nonstrict in its second argument. This is really very similar to lists and repeat:
repeat :: a -> [a]
repeat x = let xs = x : xs in xs
Since the (:) constructor is nonstrict in its second argument this works and the list can be traversed, because you have a finite weak-head normal form (WHNF). As long as the consumer (for example a list fold) only ever asks for the WHNF this works and runs in constant space.
The consumer in the case of forever is whatever interprets the monadic computation. If the monad is [], then (>>) is non-strict in its second argument, when its first argument is the empty list. So forever [] will result in [], while forever [1] will diverge. In the case of the IO monad the interpreter is the very run-time system itself, and there you can think of (>>) being always non-strict in its second argument.
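You can check the [] case described above directly (using the forever defined here; the comments state what you should observe):
forever []  :: [Int]   -- evaluates to []: ([] >> c) is [] without ever demanding c
forever [1] :: [Int]   -- diverges: ([1] >> c) is just c again, so no WHNF is ever produced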
What really matters is constant stack space. Your first example is tail recursive modulo cons, thanks to the laziness.
The (getLine >>=) will be executed and will evaporate, leaving us again with the call to f. What matters is, this happens in a constant number of steps - there's no thunk build-up.
Your second example,
f 0 acc = acc
f n acc = concat [ f (n - 1) $ map (r +) acc | r <- acc]
will build up thunks only linearly (in n), as the result list is accessed from the left (again due to laziness, as concat is non-strict). If it is consumed at the head it can run in O(1) space (not counting the linearly growing chain of thunks f(0), f(1), ..., f(n-1) at the left edge).
Much worse would be
f n acc = concat [ f (n-1) $ map (r +) $ f (n-1) acc | r <- acc]
or in do-notation,
f n acc = do
  r <- acc
  f (n-1) $ map (r+) $ f (n-1) acc
because there is extra forcing due to the information dependency. The same would happen if the bind for the given monad were a strict operation.
