I have been going through the excellent CIS 194 course when I got stuck on Part 5 of Homework 6. It revolves around implementing the ruler function without any divisibility testing.
I found that it is possible to build the ruler function by continuously interspersing an accumulator with values from an infinite list.
nats = [0,1,2,3,..]
[3]
[2,3,2]
[1,2,1,3,1,2,1]
[0,1,0,2,0,1,0,3,0,1,0,2,0]
Then I tried implementing this algorithm for Stream datatype which is a list without nil
data Stream a = Cons a (Stream a)
streamToList :: Stream a -> [a]
streamToList (Cons x xs) = x : streamToList xs
instance Show a => Show (Stream a) where
show = show . take 20 . streamToList
streamFromSeed :: (a -> a) -> a -> Stream a
streamFromSeed f x = Cons x (streamFromSeed f (f x))
nats :: Stream Integer
nats = streamFromSeed succ 0
interleave x (Cons y ys) = Cons x (Cons y (interleave x ys))
foldStream f (Cons x xs) = f x (foldStream f xs)
ruler = foldStream interleave nats
As expected, I got stackoverflow error since I was trying to fold from the right. However, I was surprised to see the same algorithm work for normal infinite lists.
import Data.List
interleave x list = [x] ++ (intersperse x list) ++ [x]
ruler = take 20 (foldr interleave [] [0..])
What am I missing? Why one implementation works while the other doesn't?
Your interleave is insufficiently lazy. The magic thing that right folds must do to work on infinite structures is to not inspect the result of the folded value too closely before they do the first bit of computation. So:
interleave x stream = Cons x $ case stream of
Cons y ys -> Cons y (interleave x ys)
This produces Cons x _ before inspecting stream; in contrast, your version requires stream to be evaluated a bit before it can pass to the right hand side of the equation, which essentially forces the entire fold to happen before any constructor gets produced.
You can also see this in your list version of interleave:
interleave x list = [x] ++ intersperse x list ++ [x]
The first element of the returned list (x) is known before intersperse starts pattern matching on list.
We can inspect the source code of foldr [src]. A less noisy version looks like:
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
Haskell does not evaluate eagerly. This thus means that, unless you need (foldr f z xs), it will not evaluate the accumulator. This thus means that f does not need the second parameter, for example because the first item x has a certain value, it will not evaluate the accumulator.
For example if we implement takeWhileNeq:
takeWhileNeq a = foldr f []
where f x xs -> if x == a then [] else (x:xs)
if we thus run this on a list takeWhileNeq 2 [1,4,2,5], then it will not evaluate anything. If we however want to print the result it will evaluate this as:
f 1 (foldr f [4,2,5])
and f will inspect if 1 == 2, since that is not the case, it will return (x:xs), so:
-> 1 : foldr f [4,2,5]
so now it will evaluate 4 == 2, and because this is false, it will evaluate this to:
-> 1 : (4 : foldr f [2,5])
now we evaluate 2 == 2, and since this is True, the function returns the empty list, and ingores the accumulator, so it will never look at foldr f [5]:
-> 1 : (4 : [])
For an infinite list, it will thus also result an empty list and ignore folding the rest of the list.
I'm working on UPENN Haskell Homework 6 Exercise 5, trying to define a ruler function
0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4,...
where the nth element in the stream (assuming the first element corresponds to n = 1) is the largest power of 2 which evenly divides n.
I just came up with an idea to build it without any divisibility testing:
data Stream x = Cons x (Stream x) deriving (Eq)
streamRepeat x = Cons x (streamRepeat x)
interleaveStreams (Cons x xs) (Cons y ys) =
Cons x (Cons y (interleaveStreams xs ys))
ruler =
interleaveStreams (streamRepeat 0)
(interleaveStreams (streamRepeat 1)
(interleaveStreams (streamRepeat 2)
(interleaveStreams (streamRepeat 3) (...))
where first 20 element of
ruler =
interleaveStreams (streamRepeat 0)
(interleaveStreams (streamRepeat 1)
(interleaveStreams (streamRepeat 2)
(interleaveStreams (streamRepeat 3) (streamRepeat 4))))
is
[0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4,0,1,0,2]
Obviously I couldn't define it manually to infinite so I defined a infInterStream to help such infinite recursion definition:
infInterStream n = interleaveStreams (streamRepeat n) (infInterStream (n+1))
ruler = infInterStream 0
But now I get stuck when typing in ruler in ghci, it probably falls into infinite loop.
It shouldn't be if lazy evaluation works. I want to know why lazy evaluation fails here.
Helper function to observe Stream:
streamToList (Cons x xs) = x : streamToList xs
instance Show a => Show (Stream a) where
show = show . take 20 . streamToList
Your interleaving function is too strict. The following works:
interleaveStreams (Cons x xs) ys = Cons x (interleaveStreams ys xs)
This also works:
interleaveStreams (Cons x xs) ~(Cons y ys) =
Cons x (Cons y (interleaveStreams xs ys))
The original definition goes into infinite loop because interleaveStreams demands that both arguments must be of Cons forms. infInterStream n evaluates to the the interleaving of two streams, and the first one can be immediately evaluated to Cons, but the second one must also be reduced first to Cons, so we recursively call infInterStream (n + 1), which keeps calling itself ad infinitum.
If interleaveStreams can return a Cons a _ without first having to force the second argument, infInterStream can also incrementally build the result.
There's no need for a new type for streams, and we can just use Haskell's lazy lists instead. As others have noted, the definition of interleave must be sufficiently lazy that it can produce the first element of the output before testing whether the second argument is non-empty. This definition will do:
interleave (x:xs) ys = x : interleave ys xs
If you want interleave to work also for finite lists, you can add the equation
interleave [] ys = ys
Note also that, using functions from the standard prelude,
ruler = interleave (repeat 0) (map (+1) ruler)
Here repeat 0 is the list [0, 0, 0, ...].
In haskell, [1,2,3,4,5,6,7] \\ [4,5,6] will return [1,2,3,7]. Now i want to implement the same function using clisp. Up to now i find set-difference works :
(set-difference '(1 2 3 4 5 6 7) '(4 5 6))
Are there any other solution ?
Here are relevant bits of haskell library source. Maybe you can translate these definitions directly. I don't think it uses anything specific to Haskell.
(the source is from http://haskell.org/ghc/docs/latest/html/libraries/base/src/Data-List.html)
delete :: (Eq a) => a -> [a] -> [a]
delete = deleteBy (==)
-- | The 'deleteBy' function behaves like 'delete', but takes a
-- user-supplied equality predicate.
deleteBy :: (a -> a -> Bool) -> a -> [a] -> [a]
deleteBy _ _ [] = []
deleteBy eq x (y:ys) = if x `eq` y then ys else y : deleteBy eq x ys
(\\) :: (Eq a) => [a] -> [a] -> [a]
(\\) = foldl (flip delete)
I don't know Common Lisp that well, so here's a Scheme implementation of the code pasted by Ben:
(define (difference big small)
(fold delete big small))
(define (delete x lst)
(delete-by equal? x lst))
(define (delete-by equal? x lst)
(if (null? lst) '()
(receive (y ys) (car+cdr lst)
(if (equal? x y) ys
(cons y (delete-by equal? x ys))))))
where fold and car+cdr come from SRFI 1, and receive comes from SRFI 8.
If we will allow ourselves the use of SRFI 26's cut form, then we have a solution that looks even closer to the Haskell version (since the latter uses currying in at least two places):
(define difference (cut fold delete <...>))
(define delete (cut delete-by equal? <...>))
; Unchanged from the above version
(define (delete-by equal? x lst)
(if (null? lst) '()
(receive (y ys) (car+cdr lst)
(if (equal? x y) ys
(cons y (delete-by equal? x ys))))))
I'm currently on chapter 4 of Real World Haskell, and I'm trying to wrap my head around implementing foldl in terms of foldr.
(Here's their code:)
myFoldl :: (a -> b -> a) -> a -> [b] -> a
myFoldl f z xs = foldr step id xs z
where step x g a = g (f a x)
I thought I'd try to implement zip using the same technique, but I don't seem to be making any progress. Is it even possible?
zip2 xs ys = foldr step done xs ys
where done ys = []
step x zipsfn [] = []
step x zipsfn (y:ys) = (x, y) : (zipsfn ys)
How this works: (foldr step done xs) returns a function that consumes
ys; so we go down the xs list building up a nested composition of
functions that will each be applied to the corresponding part of ys.
How to come up with it: I started with the general idea (from similar
examples seen before), wrote
zip2 xs ys = foldr step done xs ys
then filled in each of the following lines in turn with what it had to
be to make the types and values come out right. It was easiest to
consider the simplest cases first before the harder ones.
The first line could be written more simply as
zip2 = foldr step done
as mattiast showed.
The answer had already been given here, but not an (illustrative) derivation. So even after all these years, perhaps it's worth adding it.
It is actually quite simple. First,
foldr f z xs
= foldr f z [x1,x2,x3,...,xn] = f x1 (foldr f z [x2,x3,...,xn])
= ... = f x1 (f x2 (f x3 (... (f xn z) ...)))
hence by eta-expansion,
foldr f z xs ys
= foldr f z [x1,x2,x3,...,xn] ys = f x1 (foldr f z [x2,x3,...,xn]) ys
= ... = f x1 (f x2 (f x3 (... (f xn z) ...))) ys
As is apparent here, if f is non-forcing in its 2nd argument, it gets to work first on x1 and ys, f x1r1ys where r1 =(f x2 (f x3 (... (f xn z) ...)))= foldr f z [x2,x3,...,xn].
So, using
f x1 r1 [] = []
f x1 r1 (y1:ys1) = (x1,y1) : r1 ys1
we arrange for passage of information left-to-right along the list, by calling r1 with the rest of the input list ys1, foldr f z [x2,x3,...,xn]ys1 = f x2r2ys1, as the next step. And that's that.
When ys is shorter than xs (or the same length), the [] case for f fires and the processing stops. But if ys is longer than xs then f's [] case won't fire and we'll get to the final f xnz(yn:ysn) application,
f xn z (yn:ysn) = (xn,yn) : z ysn
Since we've reached the end of xs, the zip processing must stop:
z _ = []
And this means the definition z = const [] should be used:
zip xs ys = foldr f (const []) xs ys
where
f x r [] = []
f x r (y:ys) = (x,y) : r ys
From the standpoint of f, r plays the role of a success continuation, which f calls when the processing is to continue, after having emitted the pair (x,y).
So r is "what is done with more ys when there are more xs", and z = const [], the nil-case in foldr, is "what is done with ys when there are no more xs". Or f can stop by itself, returning [] when ys is exhausted.
Notice how ys is used as a kind of accumulating value, which is passed from left to right along the list xs, from one invocation of f to the next ("accumulating" step being, here, stripping a head element from it).
Naturally this corresponds to the left fold, where an accumulating step is "applying the function", with z = id returning the final accumulated value when "there are no more xs":
foldl f a xs =~ foldr (\x r a-> r (f a x)) id xs a
Similarly, for finite lists,
foldr f a xs =~ foldl (\r x a-> r (f x a)) id xs a
And since the combining function gets to decide whether to continue or not, it is now possible to have left fold that can stop early:
foldlWhile t f a xs = foldr cons id xs a
where
cons x r a = if t x then r (f a x) else a
or a skipping left fold, foldlWhen t ..., with
cons x r a = if t x then r (f a x) else r a
etc.
I found a way using quite similar method to yours:
myzip = foldr step (const []) :: [a] -> [b] -> [(a,b)]
where step a f (b:bs) = (a,b):(f bs)
step a f [] = []
For the non-native Haskellers here, I've written a Scheme version of this algorithm to make it clearer what's actually happening:
> (define (zip lista listb)
((foldr (lambda (el func)
(lambda (a)
(if (empty? a)
empty
(cons (cons el (first a)) (func (rest a))))))
(lambda (a) empty)
lista) listb))
> (zip '(1 2 3 4) '(5 6 7 8))
(list (cons 1 5) (cons 2 6) (cons 3 7) (cons 4 8))
The foldr results in a function which, when applied to a list, will return the zip of the list folded over with the list given to the function. The Haskell hides the inner lambda because of lazy evaluation.
To break it down further:
Take zip on input: '(1 2 3)
The foldr func gets called with
el->3, func->(lambda (a) empty)
This expands to:
(lambda (a) (cons (cons el (first a)) (func (rest a))))
(lambda (a) (cons (cons 3 (first a)) ((lambda (a) empty) (rest a))))
If we were to return this now, we'd have a function which takes a list of one element
and returns the pair (3 element):
> (define f (lambda (a) (cons (cons 3 (first a)) ((lambda (a) empty) (rest a)))))
> (f (list 9))
(list (cons 3 9))
Continuing, foldr now calls func with
el->3, func->f ;using f for shorthand
(lambda (a) (cons (cons el (first a)) (func (rest a))))
(lambda (a) (cons (cons 2 (first a)) (f (rest a))))
This is a func which takes a list with two elements, now, and zips them with (list 2 3):
> (define g (lambda (a) (cons (cons 2 (first a)) (f (rest a)))))
> (g (list 9 1))
(list (cons 2 9) (cons 3 1))
What's happening?
(lambda (a) (cons (cons 2 (first a)) (f (rest a))))
a, in this case, is (list 9 1)
(cons (cons 2 (first (list 9 1))) (f (rest (list 9 1))))
(cons (cons 2 9) (f (list 1)))
And, as you recall, f zips its argument with 3.
And this continues etc...
The problem with all these solutions for zip is that they only fold over one list or the other, which can be a problem if both of them are "good producers", in the parlance of list fusion. What you actually need is a solution that folds over both lists. Fortunately, there is a paper about exactly that, called "Coroutining Folds with Hyperfunctions".
You need an auxiliary type, a hyperfunction, which is basically a function that takes another hyperfunction as its argument.
newtype H a b = H { invoke :: H b a -> b }
The hyperfunctions used here basically act like a "stack" of ordinary functions.
push :: (a -> b) -> H a b -> H a b
push f q = H $ \k -> f $ invoke k q
You also need a way to put two hyperfunctions together, end to end.
(.#.) :: H b c -> H a b -> H a c
f .#. g = H $ \k -> invoke f $ g .#. k
This is related to push by the law:
(push f x) .#. (push g y) = push (f . g) (x .#. y)
This turns out to be an associative operator, and this is the identity:
self :: H a a
self = H $ \k -> invoke k self
You also need something that disregards everything else on the "stack" and returns a specific value:
base :: b -> H a b
base b = H $ const b
And finally, you need a way to get a value out of a hyperfunction:
run :: H a a -> a
run q = invoke q self
run strings all of the pushed functions together, end to end, until it hits a base or loops infinitely.
So now you can fold both lists into hyperfunctions, using functions that pass information from one to the other, and assemble the final value.
zip xs ys = run $ foldr (\x h -> push (first x) h) (base []) xs .#. foldr (\y h -> push (second y) h) (base Nothing) ys where
first _ Nothing = []
first x (Just (y, xys)) = (x, y):xys
second y xys = Just (y, xys)
The reason why folding over both lists matters is because of something GHC does called list fusion, which is talked about in the GHC.Base module, but probably should be much more well-known. Being a good list producer and using build with foldr can prevent lots of useless production and immediate consumption of list elements, and can expose further optimizations.
I tried to understand this elegant solution myself, so I tried to derive the types and evaluation myself. So, we need to write a function:
zip xs ys = foldr step done xs ys
Here we need to derive step and done, whatever they are. Recall foldr's type, instantiated to lists:
foldr :: (a -> state -> state) -> state -> [a] -> state
However our foldr invocation must be instantiated to something like below, because we must accept not one, but two list arguments:
foldr :: (a -> ? -> ?) -> ? -> [a] -> [b] -> [(a,b)]
Because -> is right-associative, this is equivalent to:
foldr :: (a -> ? -> ?) -> ? -> [a] -> ([b] -> [(a,b)])
Our ([b] -> [(a,b)]) corresponds to state type variable in the original foldr type signature, therefore we must replace every occurrence of state with it:
foldr :: (a -> ([b] -> [(a,b)]) -> ([b] -> [(a,b)]))
-> ([b] -> [(a,b)])
-> [a]
-> ([b] -> [(a,b)])
This means that arguments that we pass to foldr must have the following types:
step :: a -> ([b] -> [(a,b)]) -> [b] -> [(a,b)]
done :: [b] -> [(a,b)]
xs :: [a]
ys :: [b]
Recall that foldr (+) 0 [1,2,3] expands to:
1 + (2 + (3 + 0))
Therefore if xs = [1,2,3] and ys = [4,5,6,7], our foldr invocation would expand to:
1 `step` (2 `step` (3 `step` done)) $ [4,5,6,7]
This means that our 1 `step` (2 `step` (3 `step` done)) construct must create a recursive function that would go through [4,5,6,7] and zip up the elements. (Keep in mind, that if one of the original lists is longer, the excess values are thrown away). IOW, our construct must have the type [b] -> [(a,b)].
3 `step` done is our base case, where done is an initial value, like 0 in foldr (+) 0 [1..3]. We don't want to zip anything after 3, because 3 is the final value of xs, so we must terminate the recursion. How do you terminate the recursion over list in the base case? You return empty list []. But recall done type signature:
done :: [b] -> [(a,b)]
Therefore we can't return just [], we must return a function that would ignore whatever it receives. Therefore use const:
done = const [] -- this is equivalent to done = \_ -> []
Now let's start figuring out what step should be. It combines a value of type a with a function of type [b] -> [(a,b)] and returns a function of type [b] -> [(a,b)].
In 3 `step` done, we know that the result value that would later go to our zipped list must be (3,6) (knowing from original xs and ys). Therefore 3 `step` done must evaluate into:
\(y:ys) -> (3,y) : done ys
Remember, we must return a function, inside which we somehow zip up the elements, the above code is what makes sense and typechecks.
Now that we assumed how exactly step should evaluate, let's continue the evaluation. Here's how all reduction steps in our foldr evaluation look like:
3 `step` done -- becomes
(\(y:ys) -> (3,y) : done ys)
2 `step` (\(y:ys) -> (3,y) : done ys) -- becomes
(\(y:ys) -> (2,y) : (\(y:ys) -> (3,y) : done ys) ys)
1 `step` (\(y:ys) -> (2,y) : (\(y:ys) -> (3,y) : done ys) ys) -- becomes
(\(y:ys) -> (1,y) : (\(y:ys) -> (2,y) : (\(y:ys) -> (3,y) : done ys) ys) ys)
The evaluation gives rise to this implementation of step (note that we account for ys running out of elements early by returning an empty list):
step x f = \[] -> []
step x f = \(y:ys) -> (x,y) : f ys
Thus, the full function zip is implemented as follows:
zip :: [a] -> [b] -> [(a,b)]
zip xs ys = foldr step done xs ys
where done = const []
step x f [] = []
step x f (y:ys) = (x,y) : f ys
P.S.: If you are inspired by elegance of folds, read Writing foldl using foldr and then Graham Hutton's A tutorial on the universality and expressiveness of fold.
A simple approach:
lZip, rZip :: Foldable t => [b] -> t a -> [(a, b)]
-- implement zip using fold?
lZip xs ys = reverse.fst $ foldl f ([],xs) ys
where f (zs, (y:ys)) x = ((x,y):zs, ys)
-- Or;
rZip xs ys = fst $ foldr f ([],reverse xs) ys
where f x (zs, (y:ys)) = ((x,y):zs, ys)