Different fold in Haskell and SML/NJ - haskell

Following up on the question "foldl definition possible wrong in SML/NJ 110.75", I found that the relation foldl (op -) 2 [1] = foldr (op -) 2 [1] holds. But when I tried the same thing in Haskell, the relation rewritten as foldl (-) 2 [1] == foldr (-) 2 [1] doesn't hold. Why is this? Does Haskell define fold differently from SML/NJ?
Thanks

In ML, both folds have the same type signature:
val foldl : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
val foldr : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
whereas in Haskell they're different:
foldl :: (a -> b -> a) -> a -> [b] -> a
foldr :: (a -> b -> b) -> b -> [a] -> b
so Haskell's foldl is necessarily doing something different with the operation it's been given.
Similarities
The two languages agree on both the type and the value computed by foldr - a list folded into a value by moving rightwards along the list, bracketed from the right-hand end:
foldr f init [x1, x2, ..., xn]
==> f(x1, f(x2, ..., f(xn, init)...))
Differences
First, ML has
foldl f init [x1, x2, ..., xn]
==> f(xn,...,f(x2, f(x1, init))...)
So ML's foldl is a left fold in the sense that it folds the list leftwards instead of rightwards.
In Haskell, on the other hand, you have
foldl f init [x1,x2,.....,xn]
==> f(f(...f(f(init,x1),x2),.....),xn)
In Haskell, foldl is a left fold in the sense that it puts the initial value at the left and brackets the list from the left, but retains its order.
Your example
With a list with just a single element, ML does f(x1,init) which gives you x1 - init which happens to be the same as foldr's xn - init because the first and last elements are the same.
Conversely, Haskell does f(init,x1) which gives you init - x1. That's why you get the opposite answer.
Slightly longer example
ML's foldl:
foldl (op -) 100 [1,2,3,4]
==> 4 - (3 - (2 - (1 - 100)))
==> 102
ML/Haskell's foldr:
foldr (-) 100 [1,2,3,4] or foldr (op -) 100 [1,2,3,4]
==> 1 - (2 - (3 - (4 - 100)))
==> 98
Haskell's foldl:
foldl (-) 100 [1,2,3,4]
==> (((100 - 1) - 2) - 3) - 4
==> 90
Conclusion
Yes the two definitions are different for foldl. ML's left means opposite order of elements, whereas Haskell's left means opposite order of bracketing.
This isn't a big problem as long as you remember which one you're using. (If the types of init and x1 are different, the type checker will tell you when you get it wrong.)
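The three results above can be reproduced from Haskell alone. This is a minimal sketch: mlFoldl is a helper name chosen here to imitate ML's argument order, not a library function.

```haskell
-- Helper (name assumed for illustration): an ML-style left fold written
-- with Haskell's foldl by flipping the combining function's arguments.
mlFoldl :: (a -> b -> b) -> b -> [a] -> b
mlFoldl f = foldl (flip f)

main :: IO ()
main = do
  print (mlFoldl (-) 100 [1, 2, 3, 4 :: Int])  -- 4 - (3 - (2 - (1 - 100)))
  print (foldr (-) 100 [1, 2, 3, 4 :: Int])    -- 1 - (2 - (3 - (4 - 100)))
  print (foldl (-) 100 [1, 2, 3, 4 :: Int])    -- (((100 - 1) - 2) - 3) - 4
```

Run as a program, this should print 102, 98 and 90, matching the hand evaluation.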

Does this help?
mlFoldl :: (a -> b -> b) -> b -> [a] -> b
mlFoldl f = foldl (flip f)

Long story short, they are essentially the same, with one minor difference: the order of the arguments passed to the operator (the combining function you pass to fold) is flipped. And since subtraction is not commutative, it will produce different results.
In Haskell (as well as OCaml, C++, Clojure, Common Lisp, F#, JavaScript, PHP, Python, Ruby, Scala, and many others), for foldl, the supplied function's first argument is the initial value, or the "folded value so far", while the second argument is an element from the list.
However, in Standard ML, the supplied function's first argument is the element from the list, and the second argument is the initial value, or the "folded value so far".
Neither is "correct" or "incorrect". The order of arguments is purely a design decision. The way Haskell does it is more commonly used today across languages. And in a certain "graphical" way of looking at folding, it makes more sense. Why did SML define theirs the way they did? I am not sure. Perhaps so that the signatures of foldl and foldr will be the same.

Expanding on Some Other Guy's answer:
From the Haskell Wiki:
-- if the list is empty, the result is the initial value z; else
-- apply f to the first element and the result of folding the rest
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
-- if the list is empty, the result is the initial value; else
-- we recurse immediately, making the new initial value the result
-- of combining the old initial value with the first element.
foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs
so foldl (-) 2 [1] is (2 - 1) and foldr (-) 2 [1] is (1 - 2)
From the SML Basis Library
foldl f init [x1, x2, ..., xn]
returns
f(xn,...,f(x2, f(x1, init))...)
or init if the list is empty.
foldr f init [x1, x2, ..., xn]
returns
f(x1, f(x2, ..., f(xn, init)...))
or init if the list is empty.
so foldl (op -) 2 [1] is xn - init, or 1 - 2, and foldr (op -) 2 [1] is x1 - init. It is still 1 - 2, but only by coincidence: with a single element, x1 and xn are the same. The answers diverge with a longer list, but not as much as the answers between Haskell and SML.
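To check the Basis Library equations without an SML compiler, they can be transcribed into Haskell. This is only a sketch: smlFoldl, smlFoldr and minus are names invented here, and the combining function takes an (element, accumulator) pair the way an SML function would.

```haskell
-- Hypothetical names: Haskell renderings of the SML Basis Library equations.
smlFoldl :: ((a, b) -> b) -> b -> [a] -> b
smlFoldl f = foldl (\acc x -> f (x, acc))

smlFoldr :: ((a, b) -> b) -> b -> [a] -> b
smlFoldr f = foldr (\x acc -> f (x, acc))

-- Stand-in for SML's (op -), which takes its operands as a pair.
minus :: (Int, Int) -> Int
minus (x, y) = x - y

main :: IO ()
main = do
  print (smlFoldl minus 2 [1], smlFoldr minus 2 [1])        -- both are 1 - 2
  print (smlFoldl minus 2 [1, 2], smlFoldr minus 2 [1, 2])  -- 2 - (1 - 2) vs 1 - (2 - 2)
  print (foldl (-) 2 [1 :: Int])                            -- Haskell's foldl: 2 - 1
```

With one element the two SML-style folds agree; with two elements they already differ.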

Related

Understanding foldr and foldl functions

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f v [] = v
foldr f v (x:xs) = f x (foldr f v xs)
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f v [] = v
foldl f v (x:xs) = foldl f (f v x) xs
I am trying to wrap my head around these two functions. I have two questions. One regarding function f. In general,
foldr f v xs
f has access to the first element of xs and the recursively processed tail. Here:
foldl f v xs
f has access to the last element of xs and the recursively processed tail.
Is this a useful (and correct) way to think about it?
My second question is related to fold "right" or "left". In many places, they say that foldr "starts from the right". For example, if I expand the expression
foldr (+) 0 [1,2,3]
I get
(+) 1 (foldr (+) 0 [2,3])
So, I see it is "starting from the left" of the list. The first element and the recursively processed tail are the arguments to the function. Could someone shed some light on this issue?
EDIT: One focus of my question is the function f passed to fold; the linked answer doesn't address that point.
"Starting from the right" is good basic intuition, but it can also mislead, as you've already just discovered. The truth of the matter is that lists in Haskell are singly linked and we only have access to one side directly, so in some sense every list operation in Haskell "starts" from the left. But what it does from there is what's important. Let's finish expanding your foldr example.
foldr (+) 0 [1, 2, 3]
1 + foldr (+) 0 [2, 3]
1 + (2 + foldr (+) 0 [3])
1 + (2 + (3 + foldr (+) 0 []))
1 + (2 + (3 + 0))
Now the same for foldl.
foldl (+) 0 [1, 2, 3]
foldl (+) (0 + 1) [2, 3]
foldl (+) ((0 + 1) + 2) [3]
foldl (+) (((0 + 1) + 2) + 3) []
((0 + 1) + 2) + 3
In the foldr case, we make our recursive call directly, so we take the head and make it an argument to our accumulating function, and then we make the other argument our recursive call.
In the foldl case, we make our recursive call by changing the accumulator argument. Since we're changing the argument rather than the result, the order of evaluation gets flipped around.
The difference is in the way the parentheses "associate". In the foldr case, the parentheses associate to the right, while in the foldl case they associate to the left. Likewise, the "initial" value is on the right for foldr and on the left for foldl.
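One way to see the two bracketings without doing any arithmetic is to fold with a function that builds a string instead of a sum. A small sketch, where showOp is a name made up for this illustration:

```haskell
-- Hypothetical helper: render one application of (+) as a parenthesised string.
showOp :: String -> String -> String
showOp l r = "(" ++ l ++ " + " ++ r ++ ")"

main :: IO ()
main = do
  putStrLn (foldr showOp "0" ["1", "2", "3"])  -- (1 + (2 + (3 + 0)))
  putStrLn (foldl showOp "0" ["1", "2", "3"])  -- (((0 + 1) + 2) + 3)
```

The printed strings show the initial value on the right for foldr and on the left for foldl.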
The general advice for the use of folds on lists in Haskell is this.
Use foldr if you want lazy evaluation that respects the list structure. foldr does its recursion inside the function call, so if the folding function happens to be guarded (e.g. by a data constructor), then our foldr call is guarded. For instance, we can use foldr to efficiently construct an infinite list out of another infinite list.
Use foldl' (note the ' at the end), the strict left fold, for situations where you want the operation to be strict. foldl' forces each step of the fold to weak head normal form before continuing, preventing thunks from building up. So whereas foldl will build up the entire internal expression and then potentially evaluate it at the end, foldl' will do the work as we go, which saves a ton of memory on large lists.
Don't use foldl on lists. The laziness gained by foldl is almost never useful: the only way to get anything useful out of a left fold is to force the whole fold anyway, so building up the thunks internally gains nothing.
For other data structures which are not right-biased, the rules may be different. All of this is running on the assumption that your Foldable is a Haskell list.
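The first two pieces of advice can be tried out with a short program. This is a sketch under simple assumptions: doubled is a name invented here, and the list sizes are arbitrary.

```haskell
import Data.List (foldl')

-- foldr with a guarded step (the result starts with a (:) constructor)
-- can consume an infinite list lazily.
doubled :: [Integer]
doubled = foldr (\x acc -> 2 * x : acc) [] [1 ..]

main :: IO ()
main = do
  print (take 5 doubled)                          -- only five cells are forced
  print (foldl' (+) 0 [1 .. 1000000 :: Integer])  -- accumulator forced at each step
```

The first line terminates even though the input list is infinite; the second sums a million numbers without building a chain of unevaluated additions.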

Partial functions application and folds in haskell

I'm trying to learn Haskell by solving exercises and looking at others' solutions when I'm stuck. I've been having trouble understanding as functions get more complex.
-- Ex 5: given a list of lists, return the longest list. If there
-- are multiple lists of the same length, return the list that has
-- the smallest _first element_.
--
-- (If multiple lists have the same length and same first element,
-- you can return any one of them.)
--
-- Give the longest function a suitable type.
--
-- Examples:
-- longest [[1,2,3],[4,5],[6]] ==> [1,2,3]
-- longest ["bcd","def","ab"] ==> "bcd"
longest :: (Foldable t, Ord a) => t [a] -> [a]
longest xs = foldl1 comp xs
  where
    comp acc x | length acc > length x = acc
               | length acc == length x = if head acc < head x then acc else x
               | otherwise = x
So foldl1 works as follows - input: foldl1 (+) [1,2,3,4] output: 10. As I understand it, it takes a function applies it to a list and "folds" it. The thing I don't understand is that comp acc x compares two lists and outputs the larger length list.
The thing I don't understand is with longest xs = foldl1 comp xs. How are two lists provided to comp to compare and what is foldl1 "folding" and what is the start accumulator?
Here is another shorter example of another fold that I thought I understood.
foldl - input: foldl (\x y -> x + y) 0 [1,2,3] output: 6
It starts at 0 and adds each element from the left one by one. How exactly does foldl apply the two variables in the anonymous function? For instance, if the anonymous function were (\x y z -> x + y + z), it would fail, and I don't yet understand why.
I think your current notion of what foldl1/foldl does is not quite accurate. As others already explained, foldl1 f (x:xs) == foldl f x xs, so the first value in the list is taken as the accumulator.
You say that foldl1 (+) list takes each value of the list "one by one" and computes the sum. I think this notion is misleading: you actually always take two values, add them, and get an intermediate result. You repeat that over and over, with one of the values being the intermediate result of the previous step. I really like the following illustration:
[illustration not preserved]
If you start to think about these intermediate values, it will make more sense that you always get the largest one.
I think it is easiest to understand if you look at a symbolic example:
foldl k z [a, b, c] = k (k (k z a) b) c
foldl1 k [a, b, c] = k (k a b) c
As you can see foldl1 just starts with the first two arguments and then adds on the rest one by one using k to combine it with the accumulator.
And foldl starts by applying k to the initial accumulator z and the first element a and then adds on the rest one by one.
The k function only ever gets two arguments, so you cannot use a function with three arguments for that.
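The symbolic equations can be reproduced by folding with a function that records its own applications, and scanl1 exposes the accumulator that comp receives at each step. A sketch only: the string-building k and the sample input are made up here, while comp is copied from the question.

```haskell
-- Hypothetical k: records each application as text instead of computing.
k :: String -> String -> String
k acc x = "(k " ++ acc ++ " " ++ x ++ ")"

-- comp as given in the question.
comp :: Ord a => [a] -> [a] -> [a]
comp acc x
  | length acc > length x = acc
  | length acc == length x = if head acc < head x then acc else x
  | otherwise = x

main :: IO ()
main = do
  putStrLn (foldl k "z" ["a", "b", "c"])           -- (k (k (k z a) b) c)
  putStrLn (foldl1 k ["a", "b", "c"])              -- (k (k a b) c)
  print (scanl1 comp ["ab", "def", "bcd", "x"])    -- accumulator after each step
```

The last line lists every intermediate accumulator; its final element is what foldl1 comp returns for the same input.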

Haskell code explanation

xs = [1,2,3]::[Float]
ys = map (+) xs
This was a question in an old test and there is no solution sheet.
The questions:
1) What kind of signature does ys have?
2) Explain why and draw what ys looks like
For the first question I know that xs is of type float and so should ys (I ran the program in ghci too).
As for the second one I have no idea, because when I run the code nothing happens. When I run it and then evaluate ys on a separate line I get an error.
Can someone help me with a hint?
"For the first question I know that xs is of type float"
Er, no. xs has type [Float]: a list of floats.
"and so should ys"
ys does not have the same type as xs. You probably think so because you've read that + requires the arguments and result to have the same type:
(+) :: Num a => a -> a -> a
...or if you instantiate it to Float numbers
(+) :: Float -> Float -> Float
This is correct, nevertheless (+) is not an endomorphism (a function mapping a type to itself, as it would have to be if ys was the same type as xs) because it has two number arguments.
With map (+) you're considering (+) as a function of a single argument, not of two arguments. In most programming languages this would actually be an error, but not so in Haskell: in Haskell, all functions actually have only one argument. Functions with “multiple arguments” are really just functions on interesting types, that make it seem as if you're passing multiple arguments. In particular, the signature of (+) is actually shorthand for:
(+) :: Float -> (Float -> Float)
So, considered as a one-argument function, (+) actually maps numbers to number-endomorphisms. Hence,
map (+) :: [Float] -> [Float -> Float]
and
ys :: [Float -> Float]
– a list of number-functions. Specifically, it's this list:
ys = [(+) 1 , (+) 2 , (+) 3 ]
≡ [(1+) , (2+) , (3+) ]
≡ [\n -> 1+n, \n -> 2+n, \n -> 3+n]
I could, for example, use it like this:
GHCi> let [f,g,h] = ys in [f 3, g 2, h 1]
[4.0,4.0,4.0]
GHCi> map ($ 10) ys -- applies all functions separately to the number 10
[11.0,12.0,13.0]
GHCi> foldr ($) 0 ys -- applies all the functions one after another to 0
6.0
BTW, IMO you're asking the question the wrong way around. In Haskell, you don't want to consider some code and wonder what type it has – that is more an ML or even Lisp approach. I'd always start with the type signature, and work out the implementation “outside to in” (typed holes are very handy for this). This possibility is one of the big advantages of functional programming in comparison to procedural languages.
I don't have ghci at the moment, apologies if something I say is wrong.
xs has type [Float] and ys has type [Float -> Float] (it's a list of functions that each take a Float and return a Float). ys will be [(+) 1, (+) 2, (+) 3] because map applies (+) to each element of xs. But you cannot print ys because functions have no Show instance.
The type of ys is [Float -> Float]: a list of functions that each take a number and return that number + 1 (the first element), the number + 2 (the second), and the number + 3 (the last).
Please bear in mind that (+) is applied to a single argument for each list element, so it returns another function.
If you wanted to add all the items in the List, you should use a reduce function, such as foldl.
let zs = foldl (+) 0 xs
I hope this helps.
Cristóbal

Composing a chain of 2-argument functions

So I have a list of functions of two arguments, of type [a -> a -> a]
I want to write a function which will take the list and compose them into a chain of functions which takes length+1 arguments composed on the left. For example if I have [f,g,h], all of type a -> a -> a, I need to write a function which gives:
chain [f,g,h] = \a b c d -> f ( g ( h a b ) c ) d
Also if it helps, the functions are commutative in their arguments ( i.e. f x y = f y x for all x y ).
I can do this inside of a list comprehension given that I know the number of functions in question; it would be almost exactly like the definition. It's the stretch from a fixed number of functions to a dynamic number that has me stumped.
This is what I have so far:
f xs = f' xs
  where
    f' [] = id
    f' (x:xs) = \z -> x (f' xs) z
I think the logic is along the right path, it just doesn't type-check.
Thanks in advance!
The comment from n.m. is correct--this can't be done in any conventional way, because the result's type depends on the length of the input list. You need a much fancier type system to make that work. You could compromise in Haskell by using a list that encodes its length in the type, but that's painful and awkward.
Instead, since your arguments are all of the same type, you'd be much better served by creating a function that takes a list of values instead of multiple arguments. So the type you want is something like this: chain :: [a -> a -> a] -> [a] -> a
There are several ways to write such a function. Conceptually you want to start from the front of the argument list and the end of the function list, then apply the first function to the first argument to get something of type a -> a. From there, apply that function to the next argument, then apply the next function to the result, removing one element from each list and giving you a new function of type a -> a.
You'll need to handle the case where the list lengths don't match up correctly, as well. There's no way around that, other than the aforementioned type-encoded lengths and the hassle associated with them.
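One possible shape for such a function is sketched below. It is not the only way to write it, and wrapping the result in Maybe to report mismatched lengths is a choice made here rather than something the suggested type requires.

```haskell
-- Sketch: chain [f, g, h] [a, b, c, d] computes f (g (h a b) c) d.
-- Returning Nothing for mismatched lengths is an assumption of this sketch.
chain :: [a -> a -> a] -> [a] -> Maybe a
chain fs (x : xs)
  | length fs == length xs = Just (foldl apply x (zip (reverse fs) xs))
  where
    -- Apply the next function to the running result and the next argument.
    apply acc (f, y) = f acc y
chain _ _ = Nothing

main :: IO ()
main = do
  print (chain [(+), (*), (-)] [5, 3, 2, 1 :: Int])  -- ((5 - 3) * 2) + 1
  print (chain [(+)] [1 :: Int])                     -- one argument too few
```

The function list is reversed so that the last function is applied first, which matches the nesting in the question; a length mismatch falls through to the final clause.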
I wonder whether your "have a list of functions" requirement is a real requirement or a workaround. I was faced with the same problem, but in my case the set of functions was small and known at compile time. To be more precise, my task was to zip 4 lists with xor, and all I wanted was a compact notation to compose 3 binary functions. What I used is a small helper:
-- Binary Function Chain
bfc :: (c -> d) -> (a -> b -> c) -> a -> b -> d
bfc f g = \a b -> f (g a b)
For example:
ghci> ((+) `bfc` (*)) 5 3 2 -- (5 * 3) + 2
17
ghci> ((+) `bfc` (*) `bfc` (-)) 5 3 2 1 -- ((5 - 3) * 2) + 1
5
ghci> zipWith3 ((+) `bfc` (+)) [1,2] [3,4] [5,6]
[9,12]
ghci> getZipList $ (xor `bfc` xor `bfc` xor) <$> ZipList [1,2] <*> ZipList [3,4] <*> ZipList [5,6] <*> ZipList [7,8]
[0,8]
That doesn't answer the original question as it is, but I hope it can still be helpful, since it covers pretty much what the question's subject line is about.

Is this a correct way of writing the Haskell foldr function?

I was doing the exercises from YAHT's Recursive Datatype section, and found writing the listFoldr function a bit challenging (mainly because I didn't really understand the difference between foldl and foldr at first). When I finally realized exactly how the foldr function worked, I decided that a simple swap of function arguments would be all that'd be needed to change my listFoldl function to a listFoldr function:
listFoldl f i [] = i
listFoldl f i (x:xs) = listFoldl f (f i x) xs
listFoldr f i [] = i
listFoldr f i (x:xs) = listFoldr f (f x i) xs
This appears to work (I did more tests than this):
Main> foldr (-) 4 [1, 2, 3]
-2
Main> listFoldr (-) 4 [1, 2, 3]
-2
But the solution given for the exercise is much different than mine. Their listFoldl is exactly the same as mine, but look at their listFoldr:
listFoldr f i [] = i
listFoldr f i (x:xs) = f x (listFoldr f i xs)
Which solution is better, mine or theirs? Is one of them incorrect? (In my tests, they both end up with the exact same result...)
Your solution is definitely incorrect. You have simply implemented a foldl in which the function f takes arguments in the opposite order. As an example of what is wrong, foldr (:) [] is supposed to be an identity function on lists, but your function reverses the list. There are lots of other reasons why your function is not foldr, like how foldr works on infinite lists and yours does not. It is a pure coincidence that they are the same in your example, because 3 - (2 - (1 - 4)) == 1 - (2 - (3 - 4)). I think you should start from scratch and look at how foldr is supposed to work.
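The list-reversal claim is easy to check. In the sketch below the definition from the question is copied under the made-up name wrongFoldr so it can sit next to the real foldr.

```haskell
-- The definition from the question, renamed (hypothetical name) for comparison.
wrongFoldr :: (a -> b -> b) -> b -> [a] -> b
wrongFoldr _ i [] = i
wrongFoldr f i (x : xs) = wrongFoldr f (f x i) xs

main :: IO ()
main = do
  print (foldr (:) [] [1, 2, 3 :: Int])          -- identity on lists
  print (wrongFoldr (:) [] [1, 2, 3 :: Int])     -- comes back reversed
  print (take 3 (foldr (:) [] [(1 :: Int) ..]))  -- foldr copes with an infinite list
```

Calling wrongFoldr on the infinite list instead would never return, because it only produces a result once it reaches the end of the list.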
I think you are processing the elements in the 'opposite order', and so yours is not right.
You should be able to demonstrate this with an example where 'order matters'. For example, something like
listFoldr f "" ["a", "b", "c"]
where 'f' is a function along the lines of
f s1 s2 = "now processing f(" ++ s1 ++ "," ++ s2 ++ ")\n"
where '++' is Haskell's string-append operator. The point is just to 'instrument' the function so you can see what order it is getting called with the various args.
(Note that this didn't show up in your example because 3 - (2 - (1 - 4)) happens to yield the same answer as 1 - (2 - (3 - 4)).)
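A runnable version of this instrumentation idea might look as follows. It is a sketch: the instrumented function is shortened to just record its arguments, "i" stands in for the initial value, and questionFoldr is a name invented here for the definition from the question.

```haskell
-- Instrumented combining function: records its arguments instead of computing.
f :: String -> String -> String
f s1 s2 = "f(" ++ s1 ++ "," ++ s2 ++ ")"

-- The definition from the question, under a hypothetical name.
questionFoldr :: (a -> b -> b) -> b -> [a] -> b
questionFoldr _ i [] = i
questionFoldr g i (x : xs) = questionFoldr g (g x i) xs

main :: IO ()
main = do
  putStrLn (foldr f "i" ["a", "b", "c"])          -- f(a,f(b,f(c,i)))
  putStrLn (questionFoldr f "i" ["a", "b", "c"])  -- f(c,f(b,f(a,i)))
```

The two strings make the difference in processing order visible even though the subtraction test hid it.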
Yours is broken. Try it with something that doesn't end up with a single numeric result.
eg: listFoldr (++) "a" ["b", "c", "d"]
You're processing in the wrong direction.
On a list [x1, x2, ..., xk], your listFoldr computes
f xk (... (f x2 (f x1 i)) ...)
whereas foldr should compute
f x1 (f x2 (... (f xk i) ...))
(In comparison, foldl computes
f (... (f (f i x1) x2) ...) xk
Essentially, listFoldr f = foldl (flip f).)
Your test case is unfortunate, because
3 - (2 - (1 - 4)) = 1 - (2 - (3 - 4))
When you are testing functions like these, be sure to pass in an f that is non-commutative and non-associative (i.e., argument and application order matter), so you can be sure the expression is evaluated correctly. Of course, subtraction is non-commutative and non-associative and you just got unlucky.
