I was working through the excellent CIS 194 course when I got stuck on Part 5 of Homework 6. It revolves around implementing the ruler function without any divisibility testing.
I found that it is possible to build the ruler function by continuously interspersing an accumulator with values from an infinite list.
nats = [0,1,2,3,..]
[3]
[2,3,2]
[1,2,1,3,1,2,1]
[0,1,0,2,0,1,0,3,0,1,0,2,0]
Then I tried implementing this algorithm for a Stream datatype, which is a list without nil:
data Stream a = Cons a (Stream a)
streamToList :: Stream a -> [a]
streamToList (Cons x xs) = x : streamToList xs
instance Show a => Show (Stream a) where
  show = show . take 20 . streamToList
streamFromSeed :: (a -> a) -> a -> Stream a
streamFromSeed f x = Cons x (streamFromSeed f (f x))
nats :: Stream Integer
nats = streamFromSeed succ 0
interleave x (Cons y ys) = Cons x (Cons y (interleave x ys))
foldStream f (Cons x xs) = f x (foldStream f xs)
ruler = foldStream interleave nats
As expected, I got a stack overflow error, since I was trying to fold from the right. However, I was surprised to see the same algorithm work for normal infinite lists.
import Data.List
interleave x list = [x] ++ (intersperse x list) ++ [x]
ruler = take 20 (foldr interleave [] [0..])
What am I missing? Why does one implementation work while the other doesn't?
Your interleave is insufficiently lazy. The magic thing that the function passed to a right fold must do to work on infinite structures is to not inspect the result of the rest of the fold too closely before it produces the first bit of output. So:
interleave x stream = Cons x $ case stream of
  Cons y ys -> Cons y (interleave x ys)
This produces Cons x _ before inspecting stream; in contrast, your version requires stream to be evaluated a bit before it can pass to the right hand side of the equation, which essentially forces the entire fold to happen before any constructor gets produced.
You can also see this in your list version of interleave:
interleave x list = [x] ++ intersperse x list ++ [x]
The first element of the returned list (x) is known before intersperse starts pattern matching on list.
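To see the fix end to end, here is a minimal sketch that repeats the Stream definitions from the question and swaps in the lazier interleave. The type signatures are additions for clarity and do not appear in the original posts.

```haskell
-- A list without nil, as defined in the question.
data Stream a = Cons a (Stream a)

streamToList :: Stream a -> [a]
streamToList (Cons x xs) = x : streamToList xs

streamFromSeed :: (a -> a) -> a -> Stream a
streamFromSeed f x = Cons x (streamFromSeed f (f x))

nats :: Stream Integer
nats = streamFromSeed succ 0

-- Lazy in its second argument: the outer Cons x is produced
-- before the stream argument is pattern matched.
interleave :: a -> Stream a -> Stream a
interleave x stream = Cons x $ case stream of
  Cons y ys -> Cons y (interleave x ys)

-- Right fold over a stream; there is no nil case to handle.
foldStream :: (a -> b -> b) -> Stream a -> b
foldStream f (Cons x xs) = f x (foldStream f xs)

ruler :: Stream Integer
ruler = foldStream interleave nats

-- take 16 (streamToList ruler)
-- [0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4]
```

An irrefutable pattern, interleave x ~(Cons y ys) = Cons x (Cons y (interleave x ys)), should have the same effect, since it also delays the match until y or ys is demanded.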
We can inspect the source code of foldr [src]. A less noisy version looks like:
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
Haskell does not evaluate eagerly. This means that, unless f needs (foldr f z xs), the accumulator will not be evaluated. So if f does not need its second parameter, for example because the first item x has a certain value, it will not evaluate the accumulator.
For example if we implement takeWhileNeq:
takeWhileNeq a = foldr f []
  where f x xs = if x == a then [] else (x:xs)
If we run this on a list, takeWhileNeq 2 [1,4,2,5], then nothing is evaluated at first. If we want to print the result, however, it will evaluate this as:
f 1 (foldr f [] [4,2,5])
and f will check whether 1 == 2. Since that is not the case, it returns (x:xs), so:
-> 1 : foldr f [] [4,2,5]
Now it will evaluate 4 == 2, and because this is False, it will evaluate this to:
-> 1 : (4 : foldr f [] [2,5])
Now we evaluate 2 == 2, and since this is True, the function returns the empty list and ignores the accumulator, so it will never look at foldr f [] [5]:
-> 1 : (4 : [])
For an infinite list, it will likewise stop at the first element equal to a, return the prefix collected so far, and never fold the rest of the list.
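The walkthrough above can be checked with a small runnable sketch. The type signature is an addition; the body follows the definition given in the answer.

```haskell
-- Keep elements until the first one equal to a.
-- f never touches its second argument when x == a,
-- so the rest of the fold is never demanded.
takeWhileNeq :: Eq a => a -> [a] -> [a]
takeWhileNeq a = foldr f []
  where f x xs = if x == a then [] else x : xs

-- takeWhileNeq 2 [1,4,2,5]   evaluates to [1,4]
-- takeWhileNeq 5 [1..]       evaluates to [1,2,3,4], even though the input is infinite
```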
In the same scenario as my previous question, I was trying to implement the cycle function only¹ using a fold, when I came up with the following wrong function, which tries to concatenate the accumulator with itself, building the infinite list exponentially (yes, I know this would mean that it generates 2048 copies if one wants to take 1025 of them)
myCycle :: [a] -> [a]
myCycle s = foldr (\_ a -> a ++ a) s [1..]
However, using it throws *** Exception: heap overflow.
This version, instead, works like a charm
myCycle :: [a] -> [a]
myCycle s = foldr (\_ a -> s ++ a) s [1..]
My question is, why does the former version overflow, as compared to the latter? I feel the reason is dumber than me...
[1] I mean, implementing cycle as a fold, having only the step function and the seed as degrees of freedom.
foldr c n takes a list and replaces every (:) with c and the final [] with n. But [1..] has no final [], so foldr (\_ a -> a ++ a) s has no place to put the s. Therefore no information ever "flows" from s to the result of myCycle s, which means it has no choice but to be bottom (or rather: it has too much choice—it's underspecified—so it gives up and falls back to bottom). The second version actually does use s, because it appears in the folding function, which is used when foldr acts on an infinite list.
In fact, we have the identity
foldr (\_ -> f) x xs = fix f = let x = f x in x
when xs is infinite. That is, the second argument of foldr is completely ignored when the list is infinite. Further, if that folding function doesn't actually look at the elements of the list, then all that's really happening is you're infinitely nesting f within itself: fix f = f (f (f (f ...))). fix is fundamental in the sense that every kind of recursion can be written in terms of it (certain more exotic kinds of recursion require adding some language extensions, but the definition fix f = let x = f x in x itself doesn't change). This makes writing any recursive function in terms of foldr and an infinite list trivial.
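As a small illustration of that identity, the working myCycle from the question can be written both ways. The names cycleViaFoldr and cycleViaFix are made up for this sketch, and the seed is replaced by an error call to show that it is never consulted when the list is infinite.

```haskell
import Data.Function (fix)

-- The fold from the question, with a seed that would crash if it were ever used.
cycleViaFoldr :: [a] -> [a]
cycleViaFoldr s = foldr (\_ a -> s ++ a) (error "seed is never reached") [1 :: Integer ..]

-- The same infinite nesting s ++ (s ++ (s ++ ...)), written with fix.
cycleViaFix :: [a] -> [a]
cycleViaFix s = fix (s ++)

-- take 7 (cycleViaFoldr "abc")   evaluates to "abcabca"
-- take 7 (cycleViaFix "abc")     evaluates to "abcabca"
```

Both versions diverge on an empty input, because s ++ a then has to inspect a before it can produce anything.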
Here's my take on an exponential cycle. It produces 1 copy of the input, concatenated onto 2 copies, concatenated onto 4, etc.
myCycle xs = xs ++ myCycle (xs ++ xs)
You translate an explicitly recursive definition to fix by abstracting the recursive call as a parameter and passing that to fix:
myCycle = fix $ \rec xs -> xs ++ rec (xs ++ xs)
And then you use the foldr identity and introduce a bogus [] case
myCycle = foldr (\_ rec xs -> xs ++ rec (xs ++ xs)) (error "impossible") [1..]
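The three steps of that translation can be put side by side in one module. This is only a sketch: the suffixed names (myCycleRec, myCycleFix, myCycleFoldr) are invented here so the definitions can coexist, and the type signatures are additions.

```haskell
import Data.Function (fix)

-- Step 1: explicit recursion. One copy, then two, then four, and so on.
myCycleRec :: [a] -> [a]
myCycleRec xs = xs ++ myCycleRec (xs ++ xs)

-- Step 2: the recursive call becomes the parameter rec, tied by fix.
myCycleFix :: [a] -> [a]
myCycleFix = fix (\rec xs -> xs ++ rec (xs ++ xs))

-- Step 3: fix is replaced by foldr over an infinite list.
-- The list elements are ignored, and the seed is never reached.
myCycleFoldr :: [a] -> [a]
myCycleFoldr =
  foldr (\_ rec xs -> xs ++ rec (xs ++ xs)) (error "impossible") [1 :: Integer ..]
```

Here the accumulator of the fold is itself a function of type [a] -> [a], which is how the growing argument xs ++ xs gets threaded through.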
It is known that we can find head of a list using foldr like this:
head'' :: [a] -> a
head'' = foldr (\x _ -> x) undefined
but, is there any way to get the same result using foldl?
Similarly, we can find the last element of list using foldl like this:
last'' :: [a] -> a
last'' = foldl (\_ x -> x) undefined
Is there any way to get the same result using foldr?
head cannot be written with foldl, because foldl goes into an infinite loop on infinite lists, while head doesn't. Otherwise, sure:
head' :: [a] -> a
head' = fromJust . foldl (\y x -> y <|> Just x) Nothing
Drop the fromJust for a safe version.
last can definitely be written as a foldr, in about the same way:
last' :: [a] -> a
last' = fromJust . foldr (\x y -> y <|> Just x) Nothing
For head, we start with Nothing. The first element (the wanted one) is wrapped into Just and used to "override" the Nothing with (<|>). The following elements are ignored. For last, it's about the same, but flipped.
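For reference, here is a self-contained sketch of the safe variants mentioned above, with fromJust dropped and the import for (<|>) spelled out. The names safeHead and safeLast are chosen for this example only.

```haskell
import Control.Applicative ((<|>))

-- Left fold: once the accumulator is a Just, (<|>) keeps it,
-- so only the first element survives.
safeHead :: [a] -> Maybe a
safeHead = foldl (\acc x -> acc <|> Just x) Nothing

-- Right fold: the result for the rest of the list wins if it is a Just,
-- so the last element survives.
safeLast :: [a] -> Maybe a
safeLast = foldr (\x acc -> acc <|> Just x) Nothing

-- safeHead [1,2,3]   evaluates to Just 1
-- safeLast [1,2,3]   evaluates to Just 3
-- safeHead []        evaluates to Nothing
```

Neither function terminates on an infinite list: safeHead because foldl must reach the end, and safeLast because an infinite list has no last element.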
The first thing that springs to mind is to use foldl1 instead of foldl, then:
head'' :: [a] -> a
head'' = foldl1 (\x _ -> x)
and since foldl1 is defined in terms of foldl if the list is non-empty (and crashes if the list is empty - but so does head):
foldl1 f (x:xs) = foldl f x xs
we can say
head'' (x:xs) = foldl (\x _ -> x) x xs
The same of course for last, using foldr1
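A quick sketch of both variants, using the names headViaFoldl1 and lastViaFoldr1 (invented here to avoid clashing with the definitions above):

```haskell
-- foldl1 seeds the fold with the first element and then keeps it.
headViaFoldl1 :: [a] -> a
headViaFoldl1 = foldl1 (\x _ -> x)

-- foldr1 seeds the fold with the last element; each step passes it through.
lastViaFoldr1 :: [a] -> a
lastViaFoldr1 = foldr1 (\_ y -> y)

-- headViaFoldl1 "abc"   evaluates to 'a'
-- lastViaFoldr1 "abc"   evaluates to 'c'
```

Like head and last, both raise an error on an empty list.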
I am trying to implement the unzip function. I wrote the following code, but I get an error.
myUnzip [] =()
myUnzip ((a,b):xs) = a:fst (myUnzip xs) b:snd (myUnzip xs)
I know that the problem is on the right-hand side of the second line, but I do not know how to fix it.
Any hints, please?
The error that I am getting is:
ex1.hs:190:22:
Couldn't match expected type `()' with actual type `[a0]'
In the expression: a : fst (myUnzip xs) b : snd (myUnzip xs)
In an equation for `myUnzip':
myUnzip ((a, b) : xs) = a : fst (myUnzip xs) b : snd (myUnzip xs)
ex1.hs:190:29:
Couldn't match expected type `(t0 -> a0, b0)' with actual type `()'
In the return type of a call of `myUnzip'
In the first argument of `fst', namely `(myUnzip xs)'
In the first argument of `(:)', namely `fst (myUnzip xs) b'
ex1.hs:190:49:
Couldn't match expected type `(a1, [a0])' with actual type `()'
In the return type of a call of `myUnzip'
In the first argument of `snd', namely `(myUnzip xs)'
In the second argument of `(:)', namely `snd (myUnzip xs)'
You could do it inefficiently by traversing the list twice
myUnzip [] = ([], []) -- Defaults to a pair of empty lists, not null
myUnzip xs = (map fst xs, map snd xs)
But this isn't ideal, since it traverses the list twice rather than once. To get around this, we have to do it recursively:
myUnzip [] = ([], [])
myUnzip ((a, b):xs) = (a : ???, b : ???)
  where ??? = myUnzip xs
I'll let you fill in the blanks, but it should be straightforward from here. Just look at the type signature of myUnzip and figure out what you can possibly put in place of the question marks, including the ones in where ??? = myUnzip xs.
I thought it might be interesting to display two alternative solutions. In practice you wouldn't use these, but they might open your mind to some of the possibilities of Haskell.
First, there's the direct solution using a fold -
unzip' xs = foldr f x xs
  where
    f (a,b) (as,bs) = (a:as, b:bs)
    x = ([], [])
This uses a combinator called foldr to iterate through the list. Instead of writing the recursion yourself, you just define the combining function f, which tells you how to combine a single pair (a,b) with a pair of lists (as, bs), and you define the initial value x.
Secondly, remember that there is the nice-looking solution
unzip'' xs = (map fst xs, map snd xs)
which looks neat, but performs two iterations of the input list. It would be nice to be able to write something as straightforward as this, but which only iterates through the input list once.
We can nearly achieve this using the Foldl library. For an explanation of why it doesn't quite work, see the note at the end - perhaps someone with more knowledge/time can explain a fix.
First, import the library and define the identity fold. You may have to run cabal install foldl first in order to install the library.
import Control.Applicative
import Control.Foldl
ident = Fold (\as a -> a:as) [] reverse
You can then define folds that extract the first and second components of a list of pairs,
fsts = map fst <$> ident
snds = map snd <$> ident
And finally you can combine these two folds into a single fold that unzips the list
unzip' = (,) <$> fsts <*> snds
The reason that this doesn't quite work is that although you only traverse the list once to extract the pairs, they will be extracted in reverse order. This is what necessitates the additional call to reverse in the definition of ident, which results in an extra traversal of the list, to put it in the right order. I'd be interested to learn of a way to fix that up (I expect it's not possible with the current Foldl library, but might be possible with an analogous Foldr library that gives up streaming in order to preserve the order of inputs).
Note that neither of these work with infinite lists. The solution using Foldl will never be able to handle infinite lists, because you can't observe the value of a left fold until the list has terminated.
However, the version using a right fold should work - but at the moment it isn't lazy enough. In the definition
unzip' xs = foldr f x xs
  where
    f (a,b) (as,bs) = (a:as, b:bs) -- problem is in this line!
    x = ([], [])
the pattern match requires that we open up the tuple in the second argument, which requires evaluating one more step of the fold, which requires opening up another tuple, which requires evaluating one more step of the fold, etc. However, if we use an irrefutable pattern match (which always succeeds, without having to examine the pattern) we get just the right amount of laziness -
unzip'' xs = foldr f x xs
  where
    f (a,b) ~(as,bs) = (a:as, b:bs)
    x = ([], [])
so we can now do
>> let xs = repeat (1,2)
>> take 10 . fst . unzip' $ xs
^CInterrupted
>> take 10 . fst . unzip'' $ xs
[1,1,1,1,1,1,1,1,1,1]
Here's Chris Taylor's answer written using the (somewhat new) "folds" package:
import Data.Fold (R(R), run)
import Control.Applicative ((<$>), (<*>))
ident :: R a [a]
ident = R id (:) []
fsts :: R (a, b) [a]
fsts = map fst <$> ident
snds :: R (a, b) [b]
snds = map snd <$> ident
unzip' :: R (a, b) ([a], [b])
unzip' = (,) <$> fsts <*> snds
test :: ([Int], [Int])
test = run [(1,2), (3,4), (5,6)] unzip'
*Main> test
([1,3,5],[2,4,6])
Here is what I got working after the guidance above:
myUnzip' [] = ([],[])
myUnzip' ((a,b):xs) = (a:(fst rest), b:(snd rest))
  where rest = myUnzip' xs
myunzip :: [(a,b)] -> ([a],[b])
myunzip xs = (firstValues xs , secondValues xs)
  where
    firstValues :: [(a,b)] -> [a]
    firstValues [] = []
    firstValues (x : xs) = fst x : firstValues xs
    secondValues :: [(a,b)] -> [b]
    secondValues [] = []
    secondValues (x : xs) = snd x : secondValues xs
The task was to implement foldl in terms of foldr. By comparing both function signatures and the foldl implementation, I came up with the following solution:
myFoldl :: (a -> b -> a) -> a -> [b] -> a
myFoldl _ acc [] = acc
myFoldl fn acc (x:xs) = foldr fn' (fn' x acc) xs
  where
    fn' = flip fn
I just flip the function arguments to satisfy the types foldr expects, and mimic the foldl definition by recursively applying the passed function.
To my surprise, my teacher gave this answer zero points.
I even checked that this definition stacks its intermediate results in the same way as the standard foldl:
> myFoldl (\a elm -> concat ["(",a,"+",elm,")"]) "" (map show [1..10])
> "((((((((((+1)+10)+9)+8)+7)+6)+5)+4)+3)+2)"
> foldl (\a elm -> concat ["(",a,"+",elm,")"]) "" (map show [1..10])
> "((((((((((+1)+10)+9)+8)+7)+6)+5)+4)+3)+2)"
The correct answer was the following definition:
myFoldl :: (a -> b -> a) -> a -> [b] -> a
myFoldl f z xs = foldr step id xs z
  where step x g a = g (f a x)
I'm just asking: why is my previous definition incorrect?
Essentially, your fold goes in the wrong order. I think you didn't copy your output from foldl correctly; I get the following:
*Main> myFoldl (\ a elem -> concat ["(", a, "+", elem, ")"]) "" (map show [1..10])
"((((((((((+1)+10)+9)+8)+7)+6)+5)+4)+3)+2)"
*Main> foldl (\ a elem -> concat ["(", a, "+", elem, ")"]) "" (map show [1..10])
"((((((((((+1)+2)+3)+4)+5)+6)+7)+8)+9)+10)"
so what happens is that your implementation gets the first element--the base case--correct but then uses foldr for the rest which results in everything else being processed backwards.
There are some nice pictures of the different orders the folds work in on the Haskell wiki:
This shows how foldr (:) [] should be the identity for lists and foldl (flip (:)) [] should reverse a list. In your case, all it does is put the first element at the end but leaves everything else in the same order. Here is exactly what I mean:
*Main> foldl (flip (:)) [] [1..10]
[10,9,8,7,6,5,4,3,2,1]
*Main> myFoldl (flip (:)) [] [1..10]
[2,3,4,5,6,7,8,9,10,1]
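The two definitions from the question can be compared directly in one module. In this sketch they are renamed brokenFoldl and foldlViaFoldr (names invented here) so that both can be loaded at once. The comment traces how the function-valued fold restores left-to-right order.

```haskell
-- The asker's attempt: correct for the first element, backwards for the rest.
brokenFoldl :: (a -> b -> a) -> a -> [b] -> a
brokenFoldl _ acc [] = acc
brokenFoldl fn acc (x:xs) = foldr fn' (fn' x acc) xs
  where fn' = flip fn

-- The teacher's answer: foldr builds a function, which is then applied to z.
--   foldr step id [1,2,3] z
--     = step 1 (step 2 (step 3 id)) z
--     = step 2 (step 3 id) (f z 1)
--     = step 3 id (f (f z 1) 2)
--     = id (f (f (f z 1) 2) 3)
foldlViaFoldr :: (a -> b -> a) -> a -> [b] -> a
foldlViaFoldr f z xs = foldr step id xs z
  where step x g a = g (f a x)

-- foldlViaFoldr (flip (:)) [] [1..10]   evaluates to [10,9,8,7,6,5,4,3,2,1]
-- brokenFoldl   (flip (:)) [] [1..10]   evaluates to [2,3,4,5,6,7,8,9,10,1]
```

Any non-commutative or non-associative function, such as subtraction or string concatenation, should expose the difference between the two.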
This brings us to a deeper and far more important point--even in Haskell, just because the types line up does not mean your code works. The Haskell type system is not omnipotent and there are often many--even an infinite number of--functions that satisfy any given type. As a degenerate example, even the following definition of myFoldl type-checks:
myFoldl :: (a -> b -> a) -> a -> [b] -> a
myFoldl _ acc _ = acc
So you have to think about exactly what your function is doing even if the types match. Thinking about things like folds might be confusing for a while, but you'll get used to it.