Why does eta-reduction not apply to filter even in the following example? - haskell

I am following 'Learn Haskell Fast and Hard' and I was able to follow most of it, but I have two questions for the following code sample.
In the first function, why don't I need l but in the second version I do need l?
In evenSum1, when the function is called recursively will filter be called on the list again and again or will filter be called only once on the first call?
evenSum = accumSum 0
  where
    accumSum n [] = n
    accumSum n (x:xs) =
      if even x
        then accumSum (n+x) xs
        else accumSum n xs

evenSum1 l = mysum 0 (filter even l)
  where
    mysum n [] = n
    mysum n (x:xs) = mysum (n+x) xs

You can actually drop the l in the second example too, but you need to switch to what is called point-free notation and use the function composition operator (.):
evenSum1 = mysum 0 . filter even
  where
    mysum n [] = n
    mysum n (x:xs) = mysum (n + x) xs
And in evenSum1, filter even will only be called once, not once per recursive step: the recursion happens inside mysum, which never calls filter. What happens is that filter even runs over the list passed in, and its output is then passed to mysum 0.
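A minimal sketch that puts the three definitions side by side so they can be compared. The names evenSumPointed and evenSumPointFree are made up here to avoid clashing definitions, and the type signatures are an assumption (the original code has none). The last line hints at why the single call to filter is cheap: Haskell is lazy, so filter even hands out elements on demand.

```haskell
module Main where

-- First version from the question: the list argument is never named.
evenSum :: Integral a => [a] -> a
evenSum = accumSum 0
  where
    accumSum n [] = n
    accumSum n (x:xs) =
      if even x
        then accumSum (n + x) xs
        else accumSum n xs

-- Second version with the explicit argument l (hypothetical name).
evenSumPointed :: Integral a => [a] -> a
evenSumPointed l = mysum 0 (filter even l)
  where
    mysum n [] = n
    mysum n (x:xs) = mysum (n + x) xs

-- Point-free rewrite of the second version (hypothetical name).
evenSumPointFree :: Integral a => [a] -> a
evenSumPointFree = mysum 0 . filter even
  where
    mysum n [] = n
    mysum n (x:xs) = mysum (n + x) xs

main :: IO ()
main = do
  print (evenSum [1 .. 10 :: Int])
  print (evenSumPointed [1 .. 10 :: Int])
  print (evenSumPointFree [1 .. 10 :: Int])
  -- filter is lazy, so it even works on an infinite list.
  print (take 3 (filter even [(1 :: Int) ..]))
```

All three sums should agree on the same input.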
A quick primer on point free notation
Say you have a function add:
add :: Int -> Int -> Int
add x y = x + y
And then you want to make a function add5 that always adds 5 to an Int. You could do it as
add5 :: Int -> Int
add5 y = add 5 y
But since functions are first class objects in Haskell and we can partially apply a function, this is equivalent to saying
add5 :: Int -> Int
add5 = add 5
Another way to look at it is to add some optional parentheses to the type signature of add:
add :: Int -> (Int -> Int)
add x y = x + y
Written like this, we can say that add is a function that accepts a single Int argument and returns a new function of Int -> Int. So if we give add a single Int, we get a new function back. This is also what lets us write expressions like
filter even list
Instead of
filter (\x -> even x) list
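A good rule of thumb for point-free notation is that the last variable can be dropped off the end by turning the $ operators in the chain into .:
f x y = h x $ g y
f x = h x . g
f x y z = h x $ g y $ j z
f x y = h x . g y . j
(Changing only the last $ in the second example would give h x $ g y . j, which parses as h x (g y . j) and is a different function.)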
This doesn't always work with multi-argument functions:
f x y = h $ g x y
Is not the same as
f = h . g
Because h . g won't type check. This is because of implicit parentheses:
f x y = h $ (g x) y
f x = h . (g x)
And now there are parentheses in the way, which prevent us from dropping the x argument.
Also, keep in mind that f x y = h (g x y) is equivalent to f x y = h $ g x y, so you can usually turn the outermost parentheses into a $ instead, potentially letting you eta-reduce and change the $ to a .. If all this seems confusing, you can also grab the pointfree package from Hackage, which contains a command-line tool for automatically performing eta-reductions for you.
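To make the two-argument case concrete, here is a small sketch. The functions h and g are arbitrary stand-ins chosen for illustration, not anything from the question. It shows that dropping y is straightforward, while dropping x as well requires a section of the composition operator rather than plain h . g.

```haskell
module Main where

-- Hypothetical one-argument function.
h :: Int -> Int
h = (* 2)

-- Hypothetical two-argument function.
g :: Int -> Int -> Int
g x y = x + y

-- Fully pointed version.
f1 :: Int -> Int -> Int
f1 x y = h $ g x y

-- y dropped: the $ becomes a composition with the partially applied g x.
f2 :: Int -> Int -> Int
f2 x = h . g x

-- x dropped as well: h . g would not type check here, because g x is a
-- function, not an Int. Composing with the section (h .) works instead.
f3 :: Int -> Int -> Int
f3 = (h .) . g

main :: IO ()
main = print (f1 3 4, f2 3 4, f3 3 4)
```

Whether f3 is more readable than f1 is a matter of taste; many people stop at f2.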

Related

Defining foldl in terms of foldr in Standard ML

The defined code is
fun foldl f e l = let
    fun g(x, f'') = fn y => f''(f(x, y))
  in foldr g (fn x => x) l e end
I don't understand how this works;
what is the purpose of g(x, f'')?
I also find a similar example in Haskell,
the definition is quite short
myFoldl f z xs = foldr step id xs z
  where
    step x g a = g (f a x)
Let's dissect the Haskell implementation of myFoldl and then take a look at the SML code. First, we'll look at some type signatures:
foldr :: (a -> b -> b) -- the step function
      -> b             -- the initial value of the accumulator
      -> [a]           -- the list to fold
      -> b             -- the result
It should be noted that although the foldr function accepts only three arguments, we are applying it to four arguments:
foldr step id xs z
However, as you can see, the second argument to foldr (i.e. the initial value of the accumulator) is id, which is a function of the type x -> x. Therefore, the result is also of the type x -> x, and it can be applied to one more argument. Hence, foldr effectively accepts four arguments here.
Similarly, the step function is now of the type a -> (x -> x) -> x -> x. Hence, it accepts three arguments instead of two. The accumulator is an endofunction (i.e. a function whose domain and codomain is the same).
Endofunctions have a special property, they are composed from left to right instead of from right to left. For example, let's compose a bunch of Int -> Int functions:
inc :: Int -> Int
inc n = n + 1
dbl :: Int -> Int
dbl n = n * 2
The normal way to compose these functions is to use the function composition operator as follows:
incDbl :: Int -> Int
incDbl = inc . dbl
The incDbl function first doubles a number and then increments it. Note that this reads from right to left.
Another way to compose them is to use continuations (denoted by k):
inc' :: (Int -> Int) -> Int -> Int
inc' k n = k (n + 1)
dbl' :: (Int -> Int) -> Int -> Int
dbl' k n = k (n * 2)
Notice that the first argument is a continuation. If we want to recover the original functions then we can do:
inc :: Int -> Int
inc = inc' id
dbl :: Int -> Int
dbl = dbl' id
However, if we want to compose them then we do it as follows:
incDbl' :: (Int -> Int) -> Int -> Int
incDbl' = dbl' . inc'
incDbl :: Int -> Int
incDbl = incDbl' id
Notice that although we are still using the dot operator to compose the functions, it now reads from left to right.
This is the key behind making foldr behave as foldl. We fold the list from right to left but instead of folding it into a value, we fold it into an endofunction which when applied to an initial accumulator value actually folds the list from left to right.
Consider our incDbl function:
incDbl = incDbl' id
       = (dbl' . inc') id
       = dbl' (inc' id)
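As a quick check of the claim that the continuation-style composition reads left to right, here is a small sketch. The name incDblK is made up so that both versions can live in one module.

```haskell
module Main where

inc :: Int -> Int
inc n = n + 1

dbl :: Int -> Int
dbl n = n * 2

-- Direct composition: reads right to left (double, then increment).
incDbl :: Int -> Int
incDbl = inc . dbl

-- Continuation-passing versions: k is what to do with the result.
inc' :: (Int -> Int) -> Int -> Int
inc' k n = k (n + 1)

dbl' :: (Int -> Int) -> Int -> Int
dbl' k n = k (n * 2)

-- Composition of the continuation-passing versions: reads left to right
-- (double, then increment). incDblK is a hypothetical name.
incDblK :: Int -> Int
incDblK = (dbl' . inc') id

main :: IO ()
main = print (incDbl 5, incDblK 5)
```

Both functions double first and increment second, even though the operands of the dot appear in opposite orders.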
Now consider the definition of foldr:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr _ acc [] = acc
foldr fun acc (y:ys) = fun y (foldr fun acc ys)
In the base case we simply return the accumulated value. However, in the inductive case we return fun y (foldr fun acc ys). Our step function is defined as follows:
step :: a -> (x -> x) -> x -> x
step x g a = g (f a x)
Here f is the reducer function of foldl and is of the type x -> a -> x. Notice that step x is an endofunction of the type (x -> x) -> x -> x which we know can be composed left to right.
Hence the folding operation (i.e. foldr step id) on a list [y1, y2, ..., yn] looks like:
step y1 (step y2 (... (step yn id)))
-- or
(step y1 . step y2 . ... . step yn) id
Each step yi is an endofunction. Hence, this is equivalent to composing the endofunctions from left to right.
When this result is applied to an initial accumulator value then the list folds from left to right. Hence, myFoldl f z xs = foldr step id xs z.
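A runnable sketch of the definition under discussion, compared against the Prelude's foldl. The type signature is an assumption added here. Subtraction and flip (:) are used as test operators because they are not commutative, so a wrong fold direction would show up immediately.

```haskell
module Main where

-- foldl expressed through foldr: the fold builds a function,
-- which is then applied to the initial accumulator z.
myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl f z xs = foldr step id xs z
  where
    step x g a = g (f a x)

main :: IO ()
main = do
  print (myFoldl (-) 10 [1, 2, 3 :: Int])   -- ((10 - 1) - 2) - 3
  print (foldl (-) 10 [1, 2, 3 :: Int])
  print (myFoldl (flip (:)) [] "abc")       -- reverses the string
```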
Now consider the foldl function (which is written in Standard ML and not OCaml). It is defined as:
fun foldl f e l = let fun g (x, f'') = fn y => f'' (f (x, y))
in foldr g (fn x => x) l e end
The biggest difference between the foldr functions of Haskell and SML is the type of the reducer function:
In Haskell the reducer function has the type a -> b -> b.
In SML the reducer function has the type (a, b) -> b.
Both are correct. It's only a matter of preference. In SML instead of passing two separate arguments, you pass one single tuple which contains both arguments.
Now, the similarities:
The id function in Haskell is the anonymous fn x => x function in SML.
The step function in Haskell is the function g in SML which takes a tuple containing the first two arguments.
The step function in Haskell, step x g a, has been split into two functions in SML, g (x, f'') = fn y => f'' (f (x, y)), for more clarity.
If we rewrite the SML function to use the same names as in Haskell then we have:
fun myFoldl f z xs = let fun step (x, g) = fn a => g (f (a, x))
in foldr step (fn x => x) xs z end
Hence, they are the same function, apart from the order in which f receives the accumulator and the element. The expression g (x, f'') simply applies the function g to the tuple (x, f''). Here f'' is a valid identifier.
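For readers who only have GHC at hand, the SML definition can be transliterated into Haskell. This is a sketch, not the SML code itself: foldlPair is a made-up name, and the reducer takes a pair (element, accumulator), which is the argument order the original SML definition uses in f (x, y).

```haskell
module Main where

-- Tuple-style transliteration of the SML foldl into Haskell.
-- g mirrors the SML helper: it takes a pair of an element and a
-- continuation k, and returns a function of the accumulator y.
foldlPair :: ((a, b) -> b) -> b -> [a] -> b
foldlPair f e l = foldr (curry g) id l e
  where
    g (x, k) = \y -> k (f (x, y))

main :: IO ()
main = do
  -- Consing onto the accumulator from the left reverses the list.
  print (foldlPair (\(x, acc) -> x : acc) [] [1, 2, 3 :: Int])
  -- ((10 - 1) - 2) - 3
  print (foldlPair (\(x, acc) -> acc - x) 10 [1, 2, 3 :: Int])
```

curry is needed only because Haskell's foldr expects a curried step function, whereas g is written in the tupled SML style.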
Intuition
The foldl function traverses the list head to tail while combining elements with an accumulator:
(...(a⊗x1)⊗...⊗xn-1)⊗xn
And you want to define it via a foldr:
x1⊕(x2⊕...⊕(xn⊕e)...)
Rather unintuitive. The trick is that your foldr will not produce a value, but rather a function. The list traversal will combine the elements so as to produce a function that, when applied to the accumulator, performs the computation you desire.
Let's see a simple example to illustrate how this works. Consider the sum foldl (+) 0 [1,2,3] = ((0+1)+2)+3. We may calculate it via foldr as follows.
foldr ⊕ id [1,2,3]
-> 1⊕(2⊕(3⊕id))
-> 1⊕(2⊕(id.(+3)))
-> 1⊕(id.(+3).(+2))
-> (id.(+3).(+2).(+1))
So when we apply this function to 0 we get
(id.(+3).(+2).(+1)) 0
= ((0+1)+2)+3
We began with the identity function and successively changed it as we traversed the list, using ⊕ where,
n ⊕ g = g . (+n)
Using this intuition, it isn't hard to define a sum with an accumulator via foldr. We built the computation for a given list via foldr ⊕ id xs. Then to calculate the sum we applied it to 0, foldr ⊕ id xs 0. So we have,
foldl (+) 0 xs = foldr ⊕ id xs 0
where n ⊕ g = g . (+n)
or equivalently, denoting n ⊕ g in prefix form by (⊕) n g and noting that (⊕) n g a = (g . (+n)) a = g (a+n),
foldl (+) 0 xs = foldr ⊕ id xs 0
where (⊕) n g a = g (a+n)
Note that the ⊕ is your step function, and that you can obtain the generic result you're looking for by substituting a function f for +, and accumulator a for 0.
Next let us show that the above really is correct.
Formal derivation
Moving on to a more formal approach. It is useful, for simplicity, to be aware of the following universal property of foldr.
h [] = e
h (x:xs) = f x (h xs)
iff
h = foldr f e
This means that rather than defining foldr directly, we may instead and more simply define a function h in the form above.
We want to define such an h so that,
h xs a = foldl f a xs
or equivalently,
h xs = \a -> foldl f a xs
So let's determine h. The empty case is simple:
h [] = \a -> foldl f a []
     = \a -> a
     = id
The non-empty case results in:
h (x:xs) = \a -> foldl f a (x:xs)
         = \a -> foldl f (f a x) xs
         = \a -> h xs (f a x)
         = step x (h xs) where step x g = \a -> g (f a x)
         = step x (h xs) where step x g a = g (f a x)
So we conclude that,
h [] = id
h (x:xs) = step x (h xs) where step x g a = g (f a x)
satisfies h xs a = foldl f a xs
And by the universal property above (noting that the f in the universal property formula corresponds to step here, and e to id) we know that h = foldr step id. Therefore,
h = foldr step id
h xs a = foldl f a xs
-----------------------
foldl f a xs = foldr step id xs a
where step x g a = g (f a x)
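The conclusion can be spot-checked in code. In this sketch hRec is h written by explicit recursion in the shape the universal property asks for, and hFold is foldr step id; both names and the type signatures are assumptions made for the example. A test on one input is of course not a proof, only a sanity check of the derivation.

```haskell
module Main where

-- h defined by explicit recursion, in the form
--   h []     = e
--   h (x:xs) = step x (h xs)
hRec :: (b -> a -> b) -> [a] -> b -> b
hRec _ [] = id
hRec f (x:xs) = step x (hRec f xs)
  where
    step y g a = g (f a y)

-- The same h obtained through the universal property.
hFold :: (b -> a -> b) -> [a] -> b -> b
hFold f = foldr step id
  where
    step x g a = g (f a x)

main :: IO ()
main = do
  let xs = [1 .. 5] :: [Int]
  -- All three components should be equal to 100 - 15.
  print (hRec (-) xs 100, hFold (-) xs 100, foldl (-) 100 xs)
```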

map and/or filter to only return a single element instead of list. HASKELL

The following function can use ONLY map and/or filter. No fold/foldr etc. The function should have the following signature and types: apply::n f x and it should apply f to x only n times. A bit more formally described, it looks like so: apply n f x = f (f...(f x)...), where f is applied n times.
This is very easily achievable with map, but the problem is that map will take and return a list. And I want it to only take a single integer, transform it by f and then return that new integer.
I so far wrote this: (works by taking and returning a list)
apply :: Int -> (Int -> Int) -> [Int] -> [Int]
apply n f x
  | n == 1    = map f x
  | n > 1     = apply (n-1) f (map f x)
  | otherwise = x
This is how I am calling it:
main = do
  print (apply 2 (*2) [3])
How can I modify this function so that it no longer takes and returns a list, but instead takes a single integer and returns the new modified integer? Thanks
You don't need map or filter for this. If you enter the realm of the list monad, there's no escape (if you can only use filter and map). Here's a very simple implementation you can study:
apply :: Int -> (a -> a) -> a -> a
apply 0 _ = id
apply 1 f = f
apply n f = (apply (n - 1) f) . f
apply 0 _ x = x
apply 1 f x = f x
apply 2 f x = apply 1 (fmap f f) x
apply 3 f x = apply 1 (fmap f f) $ f x
apply 4 f x = apply 2 (fmap f f) x
apply 5 f x = apply 2 (fmap f f) $ f x
See if you can generalize.
Pro tips:
forall f. x = apply 0 (fmap f f) x and f x = apply 0 (fmap f f) $ f x
If f . g is well-typed, then fmap f g = f . g.
Not sure why you need map / filter; perhaps rephrase the question?
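One possible generalization of the pattern in the numbered cases above, written with f . f in place of fmap f f (the two coincide for functions, as the second pro tip says). This is only a sketch: it halves n at each step, and it does not use map or filter, so it does not address the constraint stated in the question.

```haskell
module Main where

-- Apply f to x exactly n times, halving n at each step:
-- an even n applies the doubled function n/2 times,
-- an odd n peels off one application of f first.
apply :: Int -> (a -> a) -> a -> a
apply n f x
  | n <= 0    = x
  | even n    = apply (n `div` 2) (f . f) x
  | otherwise = apply (n `div` 2) (f . f) (f x)

main :: IO ()
main = do
  print (apply 2 (* 2) (3 :: Int))
  print (map (\n -> apply n (+ 1) (0 :: Int)) [0 .. 6])
```

Treating a negative n like 0 is a choice made here for totality; the original answers do not say what should happen in that case.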

High order function returning result and modified itself

My goal is to create a function that takes an argument, computes a result, and returns it in a tuple together with a modified version of itself.
My first try looked like this:
f x = (x,f') where
  f' y = (y+1,f')
cl num func = let (nu,fu) = func num in nu:fu num
The result I wanted from calling the function cl with 0 and f was
[0,1,2,3,4,5,6,7,8,9,10,11,12,13 ... infinity]
Unfortunately, Haskell cannot construct the infinite type. It is hard for me to devise another way of doing it. Maybe I'm just looking at the problem from the wrong side, which is why I posted this question.
EDIT:
This is the state of my functions:
newtype InFun = InFun { innf :: Int -> (Int,InFun) }
efunc x = (x,InFun deep) where
  deep y = (y+1, InFun deep)
crli n (InFun f) = let (n',f') = f n in n':crli n f'
main = putStrLn $ show (take 10 (crli 0 (InFun efunc)))
The result is [0,1,1,1,1,1,1,1,1,1]. That's better, but I want the modification made by the deep function to be applied recursively.
Probably you are looking for
{-# LANGUAGE RankNTypes #-}
newtype F = F { f :: Int -> (Int, F) }
g y = (y + 1, F g)
then
*Main> fst $ (f $ snd $ g 3) 4
5
or
*Main> map fst $ take 10 $ iterate (\(x, F h) -> h x) (g 0)
[1,2,3,4,5,6,7,8,9,10]
or more complex modification (currying)
h = g False
  where g x y = (y', F g')
          where y' = if x then y + 1
                          else 2 * y
                g' = if x then g False
                          else g True
then
*Main> map fst $ take 10 $ iterate (\(x, F h) -> h x) (h 0)
[0,1,2,3,6,7,14,15,30,31]
You can use iterate:
iterate (+1) 0
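For the code in the question's EDIT, one possible small change seems to give the desired sequence: pass the updated value n' to the recursive call instead of the original n. The sketch below keeps the asker's names; the type signatures are added here as an assumption.

```haskell
module Main where

newtype InFun = InFun { innf :: Int -> (Int, InFun) }

-- First call returns its argument unchanged; every later call
-- (through deep) returns the argument plus one.
efunc :: Int -> (Int, InFun)
efunc x = (x, InFun deep)
  where
    deep y = (y + 1, InFun deep)

-- Thread the updated value n' into the next step. The version in
-- the question passed the original n, which is why it got stuck at 1.
crli :: Int -> InFun -> [Int]
crli n (InFun f) = let (n', f') = f n in n' : crli n' f'

main :: IO ()
main = print (take 10 (crli 0 (InFun efunc)))
```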

Why does GHC make fix so confounding?

Looking at the GHC source code I can see that the definition for fix is:
fix :: (a -> a) -> a
fix f = let x = f x in x
In an example fix is used like this:
fix (\f x -> let x' = x+1 in x:f x')
This basically yields a sequence of numbers that increase by one to infinity. For this to happen fix must be currying the function that it receives right back to that very function as its first parameter. It isn't clear to me how the definition of fix listed above could be doing that.
This definition is how I came to understand how fix works:
fix :: (a -> a) -> a
fix f = f (fix f)
So now I have two questions:
How does x ever come to mean fix x in the first definition?
Is there any advantage to using the first definition over the second?
It's easy to see how this definition works by applying equational reasoning.
fix :: (a -> a) -> a
fix f = let x = f x in x
What will x evaluate to when we try to evaluate fix f? It's defined as f x, so fix f = f x. But what is x here? It's f x, just as before. So you get fix f = f x = f (f x). Reasoning in this way you get an infinite chain of applications of f: fix f = f (f (f (f ...))).
Now, substituting (\f x -> let x' = x+1 in x:f x') for f you get
fix (\f x -> let x' = x+1 in x:f x')
    = (\f x -> let x' = x+1 in x:f x') (f ...)
    = (\x -> let x' = x+1 in x:((f ...) x'))
    = (\x -> x:((f ...) (x + 1)))
    = (\x -> x:((\x -> let x' = x+1 in x:(f ...) x') (x + 1)))
    = (\x -> x:((\x -> x:(f ...) (x + 1)) (x + 1)))
    = (\x -> x:(x + 1):((f ...) (x + 2)))
    = ...
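The result of this expansion can be observed directly. In the sketch below, fixLet and fixRec are made-up names for the two definitions from the question, and countFrom is a name given here to the lambda from the example; fix itself comes from Data.Function.

```haskell
module Main where

import Data.Function (fix)

-- The let-based definition quoted from the GHC source.
fixLet :: (a -> a) -> a
fixLet f = let x = f x in x

-- The directly recursive definition from the question.
fixRec :: (a -> a) -> a
fixRec f = f (fixRec f)

-- The function from the example: given "the rest of the recursion" f,
-- emit x and continue from x + 1.
countFrom :: (Int -> [Int]) -> Int -> [Int]
countFrom f x = let x' = x + 1 in x : f x'

main :: IO ()
main = do
  print (take 5 (fix countFrom 0))
  print (take 5 (fixLet countFrom 0))
  print (take 5 (fixRec countFrom 0))
```

All three produce the same list; the definitions differ in how much work and allocation is needed, not in the value they denote.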
Edit: Regarding your second question, @is7s pointed out in the comments that the first definition is preferable because it is more efficient.
To find out why, let's look at the Core for fix1 (1:) !! 10^8:
a_r1Ko :: Type.Integer
a_r1Ko = __integer 1
main_x :: [Type.Integer]
main_x =
  : @ Type.Integer a_r1Ko main_x
main3 :: Type.Integer
main3 =
  !!_sub @ Type.Integer main_x 100000000
As you can see, after the transformations fix1 (1:) essentially became main_x = 1 : main_x. Note how this definition refers to itself - this is what "tying the knot" means. This self-reference is represented as a simple pointer indirection at runtime.
Now let's look at fix2 (1:) !! 100000000:
main6 :: Type.Integer
main6 = __integer 1
main5
  :: [Type.Integer] -> [Type.Integer]
main5 = : @ Type.Integer main6
main4 :: [Type.Integer]
main4 = fix2 @ [Type.Integer] main5
main3 :: Type.Integer
main3 =
  !!_sub @ Type.Integer main4 100000000
Here the fix2 application is actually preserved.
The result is that the second program needs to do allocation for each element of the list (but since the list is immediately consumed, the program still effectively runs in constant space):
$ ./Test2 +RTS -s
2,400,047,200 bytes allocated in the heap
133,012 bytes copied during GC
27,040 bytes maximum residency (1 sample(s))
17,688 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
[...]
Compare that to the behaviour of the first program:
$ ./Test1 +RTS -s
47,168 bytes allocated in the heap
1,756 bytes copied during GC
42,632 bytes maximum residency (1 sample(s))
18,808 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
[...]
How does x ever come to mean fix x in the first definition?
fix f = let x = f x in x
Let bindings in Haskell are recursive
First of all, realize that Haskell allows recursive let bindings. What Haskell calls "let", some other languages call "letrec". This feels pretty normal for function definitions. For example:
ghci> let fac n = if n == 0 then 1 else n * fac (n - 1) in fac 5
120
But it can seem pretty weird for value definitions. Nevertheless, values can be recursively defined, due to Haskell's non-strictness.
ghci> take 5 (let ones = 1 : ones in ones)
[1,1,1,1,1]
See A gentle introduction to Haskell sections 3.3 and 3.4 for more elaboration on Haskell's laziness.
Thunks in GHC
In GHC, an as-yet-unevaluated expression is wrapped up in a "thunk": a promise to perform the computation. Thunks are only evaluated when they absolutely must be. Suppose we want to fix someFunction. According to the definition of fix, that's
let x = someFunction x in x
Now, what GHC sees is something like this.
let x = MAKE A THUNK in x
So it happily makes a thunk for you and moves right along until you demand to know what x actually is.
Sample evaluation
That thunk's expression just happens to refer to itself. Let's take the ones example and rewrite it to use fix.
ghci> take 5 (let ones recur = 1 : recur in fix ones)
[1,1,1,1,1]
So what will that thunk look like?
We can inline ones as the anonymous function \recur -> 1 : recur for a clearer demonstration.
take 5 (fix (\recur -> 1 : recur))
-- expand definition of fix
take 5 (let x = (\recur -> 1 : recur) x in x)
Now then, what is x? Well, even though we're not quite sure what x is, we can still go through with the function application:
take 5 (let x = 1 : x in x)
Hey look, we're back at the definition we had before.
take 5 (let ones = 1 : ones in ones)
So if you believe you understand how that one works, then you have a good feel of how fix works.
Is there any advantage to using the first definition over the second?
Yes. The problem is that the second version can cause a space leak, even with optimizations. See GHC trac ticket #5205, for a similar problem with the definition of forever. This is why I mentioned thunks: because let x = f x in x allocates only one thunk: the x thunk.
The difference is in sharing vs copying.1
fix1 f = x where x = f x -- more visually apparent way to write the same thing
fix2 f = f (fix2 f)
If we substitute the definition into itself, both are reduced as the same infinite application chain f (f (f (f (f ...)))). But the first definition uses explicit naming; in Haskell (as in most other languages) sharing is enabled by the ability to name things: one name is more or less guaranteed to refer to one "entity" (here, x). The 2nd definition does not guarantee any sharing - the result of a call fix2 f is substituted into the expression, so it might as well be substituted as a value.
But a given compiler could in theory be smart about it and use sharing in the second case as well.
The related issue is the "Y combinator". In untyped lambda calculus, where there are no naming constructs (and thus no self-reference), the Y combinator emulates self-reference by arranging for the definition to be copied, so referring to the copy of self becomes possible. But in implementations which use an environment model to allow for named entities in a language, direct reference by name becomes possible.
To see a more drastic difference between the two definitions, compare
fibs1 = fix1 ( (0:) . (1:) . g ) where g (a:t@(b:_)) = (a+b):g t
fibs2 = fix2 ( (0:) . (1:) . g ) where g (a:t@(b:_)) = (a+b):g t
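A self-contained version of the comparison, for trying it out. As a sketch it deviates slightly from the lines above: g is lifted to the top level so both lists can share it, and a catch-all clause is added to keep it total (it is never reached for these infinite lists). Both lists have the same elements; the difference only shows in running time and allocation as you ask for longer prefixes.

```haskell
module Main where

-- Sharing: the name x refers to one and the same list.
fix1 :: (a -> a) -> a
fix1 f = x where x = f x

-- No guaranteed sharing: each unfolding calls fix2 f afresh.
fix2 :: (a -> a) -> a
fix2 f = f (fix2 f)

-- Pairwise sums of neighbouring elements.
g :: [Integer] -> [Integer]
g (a : t@(b : _)) = (a + b) : g t
g _ = []  -- added for totality only

fibs1, fibs2 :: [Integer]
fibs1 = fix1 ((0 :) . (1 :) . g)
fibs2 = fix2 ((0 :) . (1 :) . g)

main :: IO ()
main = do
  print (take 10 fibs1)
  print (take 10 fibs2)
```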
See also:
In Scheme, how do you use lambda to create a recursive function?
Y combinator discussion in "The Little Schemer"
Can fold be used to create infinite lists?
(especially try to work out the last two definitions in the last link above).
1 Working from the definitions, for your example fix (\g x -> let x2 = x+1 in x : g x2) we get
fix1 (\g x -> let x2 = x+1 in x : g x2)
  = fix1 (\g x -> x : g (x+1))
  = fix1 f where {f = \g x -> x : g (x+1)}
  = fix1 f where {f g x = x : g (x+1)}
  = x where {x = f x ; f g x = x : g (x+1)}
  = g where {g = f g ; f g x = x : g (x+1)} -- both g in {g = f g} are the same g
  = g where {g = \x -> x : g (x+1)} -- and so, here as well
  = g where {g x = x : g (x+1)}
and thus a proper recursive definition for g is actually created. (in the above, we write ....x.... where {x = ...} for let {x = ...} in ....x...., for legibility).
But the second derivation proceeds with a crucial distinction of substituting a value back, not a name, as
fix2 (\g x -> x : g (x+1))
  = fix2 f where {f g x = x : g (x+1)}
  = f (fix2 f) where {f g x = x : g (x+1)}
  = (\x-> x : g (x+1)) where {g = fix2 f ; f g x = x : g (x+1)}
  = h where {h x = x : g (x+1) ; g = fix2 f ; f g x = x : g (x+1)}
so the actual call will proceed as e.g.
take 3 $ fix2 (\g x -> x : g (x+1)) 10
  = take 3 (h 10) where {h x = x : g (x+1) ; g = fix2 f ; f g x = x : g (x+1)}
  = take 3 (x:g (x+1)) where {x = 10 ; g = fix2 f ; f g x = x : g (x+1)}
  = x:take 2 (g x2) where {x2 = x+1 ; x = 10 ; g = fix2 f ; f g x = x : g (x+1)}
  = x:take 2 (g x2) where {x2 = x+1 ; x = 10 ; g = f (fix2 f) ; f g x = x : g (x+1)}
  = x:take 2 (x2 : g2 (x2+1)) where { g2 = fix2 f ;
                                      x2 = x+1 ; x = 10 ; f g x = x : g (x+1)}
  = ......
and we see that a new binding (for g2) is established here, instead of the previous one (for g) being reused as with the fix1 definition.
I have a perhaps somewhat simplified explanation that comes from the inlining optimization. If we have
fix :: (a -> a) -> a
fix f = f (fix f)
then fix is a recursive function and this means it cannot be inlined in places where it is used (an INLINE pragma will be ignored, if given).
However
fix' f = let x = f x in x
is not a recursive function - it never calls itself. Only x inside is recursive. So when calling
fix' (\r x -> let x' = x+1 in x:r x')
the compiler can inline it into
(\f -> (let y = f y in y)) (\r x -> let x' = x+1 in x:r x')
and then continue simplifying it, for example
let y = (\r x -> let x' = x+1 in x:r x') y in y
let y = (\ x -> let x' = x+1 in x:y x') in y
which is just as if the function were defined using the standard recursive notation without fix:
y x = let x' = x+1 in x:y x'

How to call the same function 'n' times? [duplicate]

Closed 11 years ago.
Possible Duplicate:
Library function to compose a function with itself n times
I need a function to call another function n number of times.
so it would look something like this
f n = g(g(g(g(l))))
where n equals to the number of function g nested.
how should I go about this? thanks!
iterate is a common solution:
> :t iterate
iterate :: (a -> a) -> a -> [a]
So, given a function with a domain the same as its range, a -> a, and an initial input a, produce an infinite list of results in the form:
iterate f a --> [a, f(a), f(f(a)), ...]
And you can access the nth element of the list using !!:
iterate f a !! n
NB iterate f a !! 0 == a.
This is a function that I use often at the ghci prompt. There are a few ways to write it, none of which I am particularly fond of, but they are all reasonably clean:
fpow n f x = iterate f x !! n
fpow n f = foldr (.) id $ replicate n f
fpow n = foldr (.) id . replicate n -- just eta the above
fpow 0 f = id
fpow n f = f . fpow (n-1) f
The middle two appeal to me because my brain has chunked foldr (.) id to mean "compose a list of functions".
I kinda just wish it were in the prelude :-).
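To try the variants side by side, they need distinct names; the ones below (fpowIterate, fpowFold, fpowPointFree, fpowRec) are made up for this sketch. All of them assume n is non-negative: the iterate version fails on a negative index, and the recursive version would not terminate.

```haskell
module Main where

-- Index into the infinite list of iterates.
fpowIterate :: Int -> (a -> a) -> a -> a
fpowIterate n f x = iterate f x !! n

-- Compose a list of n copies of f.
fpowFold :: Int -> (a -> a) -> a -> a
fpowFold n f = foldr (.) id $ replicate n f

-- Same as fpowFold with f eta-reduced away.
fpowPointFree :: Int -> (a -> a) -> a -> a
fpowPointFree n = foldr (.) id . replicate n

-- Direct recursion on n.
fpowRec :: Int -> (a -> a) -> a -> a
fpowRec 0 _ = id
fpowRec n f = f . fpowRec (n - 1) f

main :: IO ()
main =
  print
    [ fpow 3 (* 2) (1 :: Int)
    | fpow <- [fpowIterate, fpowFold, fpowPointFree, fpowRec]
    ]
```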
f 0 = l
f n = g (f (n-1))
But more functional would be:
f 0 l = l
f n l = g (f (n-1) l)
This could also be done with folds or morphisms, but this is easier to understand.
For example here's using a hylomorphism, but it doesn't make it clearer really:
f g l = hylo l (.) (\n -> (g, n-1)) (==0)
It says something like: compose (.) g(l) until n==0
Can be done using fold:
applyNTimes :: Int -> (a -> a) -> a -> a
applyNTimes n f val = foldl (\s e -> e s) val [f | x <- [1..n]]
