Strictness of pattern matching vs. deconstructing - haskell

I'm trying to define primitive recursion in terms of foldr, as explained in chapter 4.1 of "A tutorial on the universality and expressiveness of fold".
Here is my first attempt:
simpleRecursive f v xs = fst $ foldr g (v,[]) xs
  where
    g x (acc, xs) = (f x xs acc,x:xs)
However, the above definition does not halt for head $ simpleRecursive (\x xs acc -> x:xs) [] [1..]
Below is a definition that does halt:
simpleRecursive f v xs = fst $ foldr g (v,[]) xs
  where
    g x r = let (acc,xs) = r
            in (f x xs acc,x:xs)
The two definitions are almost identical but give different results. Why do they differ? Does it have to do with how Haskell pattern matches?

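The crucial difference between the two functions is that in
g x r = let (acc, xs) = r
        in (f x xs acc, x:xs)
the pattern match on the tuple constructor is irrefutable (lazy), whereas in
g x (acc, xs) = (f x xs acc, x:xs)
it is not. In other words, the first definition of g is equivalent to
g x ~(acc, xs) = (f x xs acc, x:xs)
With the strict pattern, g has to evaluate its second argument (the fold over the rest of the list) to a pair before it can return anything, so on an infinite list the recursion never produces a result. With the irrefutable pattern, g returns the outer pair immediately and only forces r when acc or xs is actually demanded.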
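The difference can be reproduced with a minimal sketch. The names simpleRecursiveStrict, simpleRecursiveLazy and firstOfInfinite are chosen here for illustration and do not appear in the question; the lazy version simply spells out the irrefutable pattern with ~.

```haskell
-- Strict tuple pattern: g must force the fold over the rest of the
-- list before it can build its result, so infinite input never yields.
simpleRecursiveStrict :: (a -> [a] -> b -> b) -> b -> [a] -> b
simpleRecursiveStrict f v xs = fst $ foldr g (v, []) xs
  where
    g x (acc, ys) = (f x ys acc, x : ys)

-- Irrefutable pattern: g returns the outer pair at once and touches
-- the recursive result only if acc or ys is demanded.
simpleRecursiveLazy :: (a -> [a] -> b -> b) -> b -> [a] -> b
simpleRecursiveLazy f v xs = fst $ foldr g (v, []) xs
  where
    g x ~(acc, ys) = (f x ys acc, x : ys)

-- Demands only the first cons cell of the result, so it terminates
-- even though the input list is infinite.
firstOfInfinite :: Integer
firstOfInfinite = head (simpleRecursiveLazy (\x ys _ -> x : ys) [] [1 ..])
```

On finite lists the two versions agree; swapping simpleRecursiveStrict into firstOfInfinite would loop until memory runs out.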

Related

Using foldr to define map (develop)

I'm having a hard time understanding fold... Is the expansion correct? I would also appreciate any links or analogies that would make fold more digestible.
foldMap :: (a -> b) -> [a] -> [b]
foldMap f [] = []
foldMap f xs = foldr (\x ys -> (f x) : ys) [] xs
b = (\x ys -> (f x):ys)
foldMap (*2) [1,2,3]
= b 1 (b 2 (foldr b [] 3))
= b 1 (b 2 (b 3 ( b [] [])))
= b 1 (b 2 ((*2 3) : []))
= b 1 ((*2 2) : (6 :[]))
= (* 2 1) : (4 : (6 : []))
= 2 : (4 : (6 : []))
First, let's not use the name foldMap, since that's already a standard function different from map. If you want to re-implement an existing function with the same or similar semantics, the convention is to give it the same name, but either in a separate module or with a prime ' appended to the name. Also, we can omit the empty-list case, since foldr handles the empty list just as well:
map' :: (a -> b) -> [a] -> [b]
map' f xs = foldr (\x ys -> f x : ys) [] xs
Now if you want to evaluate this function by hand, first just use the definition without inserting anything more:
map' (*2) [1,2,3,4]
≡ let f = (*2)
      xs = [1,2,3,4]
  in foldr (\x ys -> (f x) : ys) [] xs
≡ foldr (\x ys -> (*2) x : ys) [] [1,2,3,4]
Now just prettify a bit:
≡ foldr (\x ys -> x*2 : ys) [] [1,2,3,4]
Now to evaluate this through, you also need the definition of foldr. It's actually a bit different in GHC, but effectively
foldr _ z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
So with your example
...
≡ foldr (\x ys -> x*2 : ys) [] (1:[2,3,4])
≡ (\x ys -> x*2 : ys) 1 (foldr (\x ys -> x*2 : ys) [] [2,3,4])
Now we can perform a β-reduction:
≡ 1*2 : foldr (\x ys -> x*2 : ys) [] [2,3,4]
≡ 2 : foldr (\x ys -> x*2 : ys) [] [2,3,4]
...and repeat for the recursion.
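Carrying the reduction through to the end can be checked mechanically. The following is a small sketch; doubled and firstThree are names introduced here for illustration only.

```haskell
-- map written as a right fold, as defined in the answer.
map' :: (a -> b) -> [a] -> [b]
map' f xs = foldr (\x ys -> f x : ys) [] xs

-- The worked example, fully evaluated.
doubled :: [Integer]
doubled = map' (* 2) [1, 2, 3, 4]

-- Each step yields a cons cell before recursing, so the fold is
-- productive even on an infinite list.
firstThree :: [Integer]
firstThree = take 3 (map' (* 2) [1 ..])
```

In GHCi, doubled evaluates to [2,4,6,8] and firstThree to [2,4,6].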
foldr defines a family of equations,
foldr g n [] = n
foldr g n [x] = g x (foldr g n []) = g x n
foldr g n [x,y] = g x (foldr g n [y]) = g x (g y n)
foldr g n [x,y,z] = g x (foldr g n [y,z]) = g x (g y (g z n))
                        ----- r ---------
and so on. g is a reducer function,
g x r = ....
accepting as x an element of the input list, and as r the result of recursively processing the rest of the input list (as can be seen in the equations).
map, on the other hand, defines a family of equations
map f [] = []
map f [x] = [f x] = (:) (f x) [] = ((:) . f) x []
map f [x,y] = [f x, f y] = ((:) . f) x (((:) . f) y [])
map f [x,y,z] = [f x, f y, f z] = ((:) . f) x (((:) . f) y (((:) . f) z []))
= (:) (f x) ( (:) (f y) ( (:) (f z) []))
The two families match exactly with
g = ((:) . f) = (\x -> (:) (f x)) = (\x r -> f x : r)
and n = [], and thus
foldr ((:) . f) [] xs == map f xs
We can prove this rigorously by mathematical induction on the input list's length, following the defining laws of foldr,
foldr g n [] = n
foldr g n (x:xs) = g x (foldr g n xs)
which are the basis for the equations at the top of this post.
Modern Haskell has the Foldable type class, whose basic fold follows the laws
fold(<>,n) [] = n
fold(<>,n) (xs ++ ys) = fold(<>,n) xs <> fold(<>,n) ys
and the map is naturally defined in its terms as
map f xs = foldMap (\x -> [f x]) xs
turning [x, y, z, ...] into [f x] ++ [f y] ++ [f z] ++ ..., since for lists (<>) == (++). This follows from the equivalence
f x : ys == [f x] ++ ys
This also lets us define filter along the same lines easily, as
filter p xs = foldMap (\x -> [x | p x]) xs
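As a quick sanity check, the three formulations can be written side by side. This is a sketch with made-up names (mapViaFoldr, mapViaFoldMap, filterViaFoldMap) so that nothing clashes with the Prelude; foldMap here is the standard Foldable method specialised to lists.

```haskell
-- map as a right fold with reducer (:) . f and initial value [].
mapViaFoldr :: (a -> b) -> [a] -> [b]
mapViaFoldr f = foldr ((:) . f) []

-- map via foldMap: each element becomes a singleton list, and the
-- list monoid concatenates them.
mapViaFoldMap :: (a -> b) -> [a] -> [b]
mapViaFoldMap f = foldMap (\x -> [f x])

-- filter via foldMap: an element becomes a singleton list if it
-- passes the predicate and an empty list otherwise.
filterViaFoldMap :: (a -> Bool) -> [a] -> [a]
filterViaFoldMap p = foldMap (\x -> [x | p x])
```

For example, mapViaFoldMap (+ 1) [1, 2, 3] and map (+ 1) [1, 2, 3] should both give [2,3,4].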
To your specific question, the expansion is correct, except that (*2 x) should be written as ((*2) x), which is the same as (x * 2). (* 2 x) is not valid Haskell (though it is valid Lisp :) ).
Functions like (*2) are known as "operator sections" -- the missing argument goes into the empty slot: (* 2) 3 = (3 * 2) = (3 *) 2 = (*) 3 2.
You also asked for some links: see e.g. this, this and this.

How to prove this equation? (FOLDR function) [duplicate]

Is the following a definition of structural induction?
foldr f a (xs::ys) = foldr f (foldr f a ys) xs
Can someone give me an example of structural induction in Haskell?
You did not specify it, but I will assume :: means list concatenation and use ++, since that is the operator used in Haskell.
To prove this, we will perform induction on xs. First, we show that the statement holds for the base case (i.e. xs = []):
foldr f a (xs ++ ys)
{- By definition of xs -}
= foldr f a ([] ++ ys)
{- By definition of ++ -}
= foldr f a ys
and
foldr f (foldr f a ys) xs
{- By definition of xs -}
= foldr f (foldr f a ys) []
{- By definition of foldr -}
= foldr f a ys
Now, we assume that the induction hypothesis foldr f a (xs ++ ys) = foldr f (foldr f a ys) xs holds for xs and show that it will hold for the list x:xs as well.
foldr f a ((x:xs) ++ ys)
{- By definition of ++ -}
= foldr f a (x:(xs ++ ys))
{- By definition of foldr -}
= x `f` foldr f a (xs ++ ys)
        ^------------------ call this k1
= x `f` k1
and
foldr f (foldr f a ys) (x:xs)
{- By definition of foldr -}
= x `f` foldr f (foldr f a ys) xs
        ^----------------------- call this k2
= x `f` k2
Now, by our induction hypothesis, we know that k1 and k2 are equal, therefore
x `f` k1 = x `f` k2
which proves the statement for x:xs and completes the induction.
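A proof by induction cannot be replaced by testing, but spot checks are a cheap way to catch a misstated law. The sketch below packages the equation as a Boolean property; appendLaw is a name made up for this example.

```haskell
-- The law being proved, as a testable property:
-- folding over xs ++ ys equals folding over xs, starting from the
-- result of folding over ys.
appendLaw :: Eq b => (a -> b -> b) -> b -> [a] -> [a] -> Bool
appendLaw f a xs ys =
  foldr f a (xs ++ ys) == foldr f (foldr f a ys) xs
```

Note that f does not need to be associative or commutative here: appendLaw (-) 0 [1, 2] [3, 4] holds, with both sides evaluating to -2.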

Laziness of (>>=) in folding

Consider the following 2 expressions in Haskell:
foldl' (>>=) Nothing (repeat (\y -> Just (y+1)))
foldM (\x y -> if x==0 then Nothing else Just (x+y)) (-10) (repeat 1)
The first one takes forever, because it's trying to evaluate the infinite expression
...(((Nothing >>= f) >>= f) >>= f)...
and Haskell will just try to evaluate it inside out.
The second expression, however, gives Nothing right away. I've always thought foldM was just doing fold using (>>=), but then it would run into the same problem. So it's doing something more clever here - once it hits Nothing it knows to stop. How does foldM actually work?
foldM can't be implemented using foldl. It needs the power of foldr to be able to stop short. Before we get there, here's a version without anything fancy.
foldM f b [] = return b
foldM f b (x : xs) = f b x >>= \q -> foldM f q xs
We can transform this into a version that uses foldr. First we flip it around:
foldM f b0 xs = foldM' xs b0 where
  foldM' [] b = return b
  foldM' (x : xs) b = f b x >>= foldM' xs
Then move the last argument over:
foldM' [] = return
foldM' (x : xs) = \b -> f b x >>= foldM' xs
And then recognize the foldr pattern:
foldM' = foldr go return where
  go x r = \b -> f b x >>= r
Finally, we can inline foldM' and move b back to the left:
foldM f b0 xs = foldr go return xs b0 where
  go x r b = f b x >>= r
This same general approach works for all sorts of situations where you want to pass an accumulator from left to right within a right fold. You first shift the accumulator all the way over to the right so you can use foldr to build a function that takes an accumulator, instead of trying to build the final result directly. Joachim Breitner did a lot of work to create the Call Arity compiler analysis for GHC 7.10 that helps GHC optimize functions written this way. The main reason to want to do so is that it allows them to participate in the GHC list libraries' fusion framework.
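To see the early exit in action, here is a self-contained sketch of the final definition. It is renamed foldMViaFoldr to avoid clashing with Control.Monad.foldM, and stopsEarly and sumsUp are example values introduced only for this illustration.

```haskell
-- foldM written with foldr: the fold builds a function from the
-- accumulator to the monadic result, then applies it to b0.
foldMViaFoldr :: Monad m => (b -> a -> m b) -> b -> [a] -> m b
foldMViaFoldr f b0 xs = foldr go return xs b0
  where
    -- r is the (lazily built) continuation for the rest of the list.
    go x r b = f b x >>= r

-- The example from the question: the accumulator counts up from -10
-- and the step returns Nothing once it reaches 0, so (>>=) for Maybe
-- never demands the continuation over the rest of the infinite list.
stopsEarly :: Maybe Integer
stopsEarly =
  foldMViaFoldr (\x y -> if x == 0 then Nothing else Just (x + y)) (-10) (repeat 1)

-- On a finite list with no failure it behaves like an ordinary left fold.
sumsUp :: Maybe Integer
sumsUp = foldMViaFoldr (\acc x -> Just (acc + x)) 0 [1, 2, 3]
```

Here stopsEarly evaluates to Nothing and sumsUp to Just 6.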
One way to define foldl in terms of foldr is:
foldl f z xn = foldr (\ x g y -> g (f y x)) id xn z
It's probably worth working out why that is for yourself. It can be re-written using >>> from Control.Arrow as
foldl f z xn = foldr (>>>) id (map (flip f) xn) z
The monadic equivalent of >>> is
f >=> g = \ x -> f x >>= \ y -> g y
which allows us to guess that foldM might be
foldM f z xn = foldr (>=>) return (map (flip f) xn) z
which turns out to be the correct definition. Fusing the foldr with the map, it can be rewritten as
foldM f z xn = foldr (\ x g y -> f y x >>= g) return xn z
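These definitions can be tried out directly. The sketch below uses the names foldlViaFoldr and foldMViaKleisli, plus a helper safeDiv, all of which are made up for this example.

```haskell
import Control.Monad ((>=>))

-- foldl expressed through foldr: the fold builds a function that
-- threads the accumulator y from left to right.
foldlViaFoldr :: (b -> a -> b) -> b -> [a] -> b
foldlViaFoldr f z xs = foldr (\x g y -> g (f y x)) id xs z

-- foldM as a right fold of Kleisli compositions.
foldMViaKleisli :: Monad m => (b -> a -> m b) -> b -> [a] -> m b
foldMViaKleisli f z xs = foldr (>=>) return (map (flip f) xs) z

-- Example step function: division that fails on a zero divisor.
safeDiv :: Integer -> Integer -> Maybe Integer
safeDiv _ 0 = Nothing
safeDiv n d = Just (n `div` d)
```

For instance, foldlViaFoldr (-) 10 [1, 2, 3] gives 4, the same as foldl (-) 10 [1, 2, 3], while foldMViaKleisli safeDiv 100 [5, 2] gives Just 10 and foldMViaKleisli safeDiv 100 [5, 0, 2] gives Nothing.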

Proving foldr f st (xs++ys) = f (foldr f st xs) (foldr f st ys)

I am trying to prove the following statement by structural induction:
foldr f st (xs++ys) = f (foldr f st xs) (foldr f st ys) (foldr.3)
However I am not even sure how to define foldr, so I am stuck as no definitions have been provided to me. I now believe that foldr can be defined as
foldr f st [] = st (foldr.1)
foldr f st x:xs = f x (foldr f st xs) (foldr.2)
Now I want to start working on the base case passing the empty list to foldr. I have this, but I don't think it is correct.
foldr f st ([]++[]) = f (foldr f st []) (foldr f st [])
LHS:
foldr f st ([]++[]) = foldr f st [] by (++)
foldr f st [] = st by (foldr.1)
RHS:
f (foldr f st []) (foldr f st []) = f st st by (foldr.1)
= st by definition of identity, st = 0
LHS = RHS, therefore base case holds
Now this is what I have for my inductive step:
Assume that:
foldr f st (xs ++ ys) = f (foldr f st xs) (foldr f st ys) (ind. hyp)
Show that:
foldr f st (x:xs ++ ys) = f (foldr f st x:xs) (foldr f st ys) (inductive step)
LHS:
foldr f st (x:xs ++ ys) = f x (foldr f st xs) (foldr f st ys) (by foldr.2)
RHS:
f (foldr f st x:xs) (foldr f st ys) =
= f f x (foldr f st xs) (foldr f st ys) (by foldr.2)
= f x (foldr f st xs) (foldr f st ys)
LHS = RHS, therefore inductive step holds. End of proof.
I am not sure if this proof is valid. I need some help in determining if it correct and if not - what part of it is not.
First: you can find the definition for many basic Haskell functions via the API documentation, which is available on Hackage. The documentation for base is here. foldr is exported in Prelude, which has a link to its source code:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr k z = go
  where
    go [] = z
    go (y:ys) = y `k` go ys
It's defined like this for efficiency reasons; look up "worker-wrapper." It's equivalent to
foldr f st [] = st
foldr f st (y:ys) = f y (foldr f st ys)
Second: In your desired proof, the type of f must be a -> a -> a, which is less general than a -> b -> b.
Let's work through the base case (xs = ys = []).
foldr f st ([]++[]) = f (foldr f st []) (foldr f st [])
-- Definition of ++
foldr f st [] = f (foldr f st []) (foldr f st [])
-- Equation 1 of foldr
st = f st st
This equation does not hold in general. To proceed with the proof, you'll have to assume that st is an identity for f.
You'll also have to assume, in the non-base case, that f is associative, I believe. These two assumptions, combined, indicate that f and st form a monoid. Are you trying to prove something about foldMap?
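The role of both assumptions shows up quickly in concrete cases. The following sketch states the equation as a Boolean property; splitLaw is a hypothetical name used only here.

```haskell
-- The statement from the question as a testable property. Note the
-- restricted type of f: a -> a -> a rather than a -> b -> b.
splitLaw :: Eq a => (a -> a -> a) -> a -> [a] -> [a] -> Bool
splitLaw f st xs ys =
  foldr f st (xs ++ ys) == f (foldr f st xs) (foldr f st ys)
```

With (+) and 0, which form a monoid, splitLaw (+) 0 [1, 2] [3, 4] holds. With (+) and 1 the base case already fails, since st = f st st would require 1 = 2. With (-) and 0, splitLaw (-) 0 [1, 2] [3, 4] fails because subtraction is not associative: the left side is -2 and the right side is 0.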

Fusion law for foldr1?

For foldr we have the fusion law: if f is strict, f a = b, and
f (g x y) = h x (f y) for all x, y, then f . foldr g a = foldr h b.
How can one discover/derive a similar law for foldr1? (It clearly can't even take the same form - consider the case when both sides act on [x].)
You can use free theorems to derive statements like the fusion law. The Automatic generation of free theorems tool does this work for you: it derives the following statement if you enter foldr1 or the type (a -> a -> a) -> [a] -> a.
If f is strict and f (p x y) = q (f x) (f y) for all x and y, then f (foldr1 p z) = foldr1 q (map f z). That is, in contrast to your statement about foldr, you get an additional map f on the right-hand side.
Also note that the free theorem for foldr is slightly more general than your fusion law and therefore looks quite similar to the law for foldr1. Namely, for strict functions g and f, if g (p x y) = q (f x) (g y) for all x and y, then g (foldr p z v) = foldr q (g z) (map f v).
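Both free theorems can be spot-checked on small inputs. This is only a sketch: foldr1Law and foldrLaw are names invented here, and passing a few examples is of course not a proof.

```haskell
-- Free theorem for foldr1: if f (p x y) = q (f x) (f y), then
-- f (foldr1 p z) = foldr1 q (map f z). Only meaningful for non-empty z.
foldr1Law :: Eq b => (a -> b) -> (a -> a -> a) -> (b -> b -> b) -> [a] -> Bool
foldr1Law f p q z = f (foldr1 p z) == foldr1 q (map f z)

-- Free theorem for foldr: if g (p x y) = q (f x) (g y), then
-- g (foldr p z v) = foldr q (g z) (map f v).
foldrLaw
  :: Eq d
  => (a -> c) -> (b -> d) -> (a -> b -> b) -> (c -> d -> d) -> b -> [a] -> Bool
foldrLaw f g p q z v = g (foldr p z v) == foldr q (g z) (map f v)
```

For example, doubling distributes over addition, so foldr1Law (* 2) (+) (+) [1, 2, 3] holds, whereas foldr1Law (+ 1) (+) (+) [1, 2] does not, because (+ 1) violates the precondition. For foldr, taking f = length, g = (* 2), p x y = length x + y and q c d = 2 * c + d satisfies the precondition, and foldrLaw holds on ["ab", "cde"] with z = 0.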
I don't know if there's going to be anything satisfying for foldr1. For non-empty lists it uses the last element as the seed, i.e. it is equivalent to
foldr1 f xs = foldr f (last xs) (init xs)
Let's first expand what you have above to work on the entire list,
f (foldr g x xs) = foldr h (f x) xs
For foldr1, you could say,
f (foldr1 g xs) = f (foldr g (last xs) (init xs))
= foldr h (f (last xs)) (init xs)
To recondense into foldr1, you can create some imaginary function that applies f to the last element only (this needs f :: a -> a so that the types line up), for a result of,
f . foldr1 g = foldr1 h . mapLast where
  mapLast xs = init xs ++ [f (last xs)]
