foldr - further explanation and example with a map function - haskell

I've looked at different folds and folding in general as well as a few others and they explain it fairly well.
I'm still having trouble on how it would work in this case.
length :: [t] -> Int
length list = foldr (+) 0 (map (\x ->1) list)
Could someone go through that step by step and explain it to me?
Also, how would foldl work here?

(map (\x ->1) list) takes the list and turns it into a list of 1 values:
(map (\x ->1) ["a", "b", "c"]) == [1, 1, 1]
Now, if you substitute that in the original foldr, it looks like this:
foldr (+) 0 [1, 1, 1]
The starting point is 0 and the aggregation function is (+). As it steps through each element in the list, you are basically adding up all the 1 values, and that's how you end up returning the length.
foldr combines the elements starting from the right end of the list, and foldl combines them starting from the left. Because the aggregation function is (+) :: Num a => a -> a -> a, which is associative and commutative for ordinary numeric types such as Int, the direction does not change the result here (with the caveat that the lazy foldl can build up a long chain of thunks and overflow the stack on large lists; the strict foldl' from Data.List avoids that).
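To make the step-by-step view concrete, here is a small sketch that writes out how both folds reduce for a three-element list. The names lengthR and lengthL are made up for this illustration, and foldl' (the strict variant from Data.List) is swapped in for foldl to sidestep the large-list caveat mentioned above.

```haskell
import Data.List (foldl')

-- Right fold: the additions nest to the right.
--   lengthR "abc"
--     = foldr (+) 0 [1, 1, 1]
--     = 1 + (1 + (1 + 0))
--     = 3
lengthR :: [t] -> Int
lengthR list = foldr (+) 0 (map (\_ -> 1) list)

-- Left fold: the additions nest to the left.
--   lengthL "abc"
--     = foldl' (+) 0 [1, 1, 1]
--     = ((0 + 1) + 1) + 1
--     = 3
lengthL :: [t] -> Int
lengthL list = foldl' (+) 0 (map (\_ -> 1) list)
```

Both versions return 3 for "abc"; only the nesting of the additions differs.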

Related

Infinite lists that depend on each other in Haskell?

I am working on a programming exercise where the goal is to write a function to get the term at the Nth index of Hofstadter's Figure-Figure sequence.
Rather than come up with a basic solution using the formula, I thought it would be an interesting challenge to generate an infinite list to represent the sequence and then index it.
This was my initial approach, however, it hangs when trying to calculate anything past the first two terms.
hof :: Int -> Int
hof = (!!) seqA
  where
    seqA = 1 : zipWith (+) seqA seqB
    seqB = 2 : filter (`notElem` seqA) [3..]
seqA represents the sequence of terms, and seqB is the differences between them.
Though I don't really understand how to use seq, I tried using it to strictly evaluate the terms that come before the desired one, like shown below.
hof :: Int -> Int
hof 0 = 1
hof n = seq (hof $ n - 1) $ seqA !! n
  where
    seqA = 1 : zipWith (+) seqA seqB
    seqB = 2 : filter (`notElem` seqA) [3..]
This also hangs when trying to calculate values past the first index.
After playing around in ghci, I found a way to get this to work in a weird way
ghci> seqB = [2, 4, 5, 6]
ghci> seqA = 1 : zipWith (+) seqA seqB
ghci> seqB = 2 : filter (`notElem` seqA) [3..]
ghci> seqA = 1 : zipWith (+) seqA seqB
ghci> hof = (!!) seqA
By giving seqB an initial value and redefining both seqA and seqB afterwards, it seems to function normally. I did notice, however, that passing larger values to hof seems to give different results based on how many terms I initially put in the seqB list. When I redefine the function in ghci, does it still use the older version for functions that call it previous to its redefinition?
I would like to know why this works in ghci and whether it's possible to write a working version of this code using a similar technique. Thanks in advance!
The problem is that seqA is infinite, and so
(`notElem` seqA) x
can never return True. If it sees that x is the first element of seqA, then great: it can return False. But if it doesn't see x, it wants to keep looking: maybe x is the next element! The list never ends, so there's no way it can conclude x is definitely not present.
This is a classic mistake beginners make, trying to filter an infinite list and expecting the list to end at some point. Often, the answer is to use something like
x `notElem` (takeWhile (<= x) infList)
instead. This way, your program gives up on searching for x once it's found a number above x. This only works if your lists are sorted, of course. Your equations look like they probably produce ascending lists, in which case it would work, but I haven't worked through the algebra. If your lists aren't in ascending order, you'll need to design some other stopping condition to avoid the infinite recursion.
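As a rough sketch of how that could look for the sequences in the question (an illustration, not something the answer above worked through): dropping the takeWhile guard into the original definitions is not quite enough on its own, because deciding whether 4 belongs to seqB requires the third term of seqA, which in turn requires the second term of seqB. Seeding seqB with its first two terms, 2 and 4, is one assumed way around that. The helper name absentFromA is hypothetical.

```haskell
seqA, seqB :: [Int]
seqA = 1 : zipWith (+) seqA seqB
-- The first two terms of seqB are written out by hand (an assumption of
-- this sketch) so that seqA always stays far enough ahead of the search.
seqB = 2 : 4 : filter absentFromA [5 ..]
  where
    -- Stop scanning seqA at the first term larger than x.
    -- This relies on seqA being strictly ascending.
    absentFromA x = x `notElem` takeWhile (<= x) seqA

hof :: Int -> Int
hof = (seqA !!)
```

With these definitions, take 10 seqA in ghci should give [1,3,7,12,18,26,35,45,56,69].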
The other answer tells you the problem with your approach, and suggests a great fix. I thought it might be fun to try to work out a slightly more efficient solution, though; it seems a shame to keep checking the beginning of seqA over and over during our membership calls. Here's the idea I had: the point is for seqB to be the complement of seqA, right? Well, what if we just directly define a complement function? Like this:
complement :: [Integer] -> [Integer]
complement = go 1 where
  go i xs@(x:xt) = case compare i x of
    LT -> i : go (i+1) xs
    EQ -> i+1 : go (i+2) xt
    GT -> go i xt -- this case should be impossible
  go i [] = [i..] -- this case is irrelevant for our purposes
The EQ case is a bit suspect; it doesn't work for general increasing input sequences. (But see below.) Anyway, with this definition in place, the two sequences can be quite naturally defined:
seqA, seqB :: [Integer]
seqA = 1 : zipWith (+) seqA seqB
seqB = complement seqA
Try it in ghci:
> take 10 seqA
[1,3,7,12,18,26,35,45,56,69]
Nice. Now, if we fix up the EQ case to work properly for all (increasing) input sequences, it would have to look like this:
complement :: [Integer] -> [Integer]
complement = go 1 where
  go i xs@(x:xt) = case compare i x of
    LT -> i : go (i+1) xs
    EQ -> go (i+1) xt
    GT -> go i xt -- still impossible
  go i [] = [i..] -- still irrelevant
Unfortunately, our definitions of seqA and seqB above don't quite work any more. The right first value for seqB depends on whether 2 is in seqA, but whether 2 is in seqA depends on whether the first value of seqB is 1 or not... Luckily, because seqA grows much faster than seqB, we only have to prime the pump a little.
seqA, seqB :: [Integer]
seqA = 1 : 3 : 7 : zipWith (+) (drop 2 seqA) (drop 2 seqB)
seqB = complement seqA
-- OR
seqA = 1 : zipWith (+) seqA seqB
seqB = 2 : 4 : drop 2 (complement seqA)
Try it in ghci:
> take 10 seqA
[1,3,7,12,18,26,35,45,56,69]
The definition of seqX is a bit less natural, but the definition of complement is a bit more natural, so there seems to be something of a tradeoff there.
As an answer to this part:
When I redefine the function in ghci, does it still use the older version for functions that call it previous to its redefinition?
Yes, that's the way it has to work. Bindings at the ghci prompt are not mutable variables as you would have in an imperative language, they're supposed to work the same way as variables do in every other part of Haskell.
So when you have this:
ghci> a = 1
ghci> b = [a]
ghci> b
[1]
a is just a name for 1, and b is just a name for [1]. The latter was calculated from the expression [a] by seeing what value a was a name for, but it is absolutely the value [1] and not the expression [a] that b refers to.
ghci> a = 2
ghci> b
[1]
Executing a = 2 doesn't change the value referred to by a, it just changes the state of the environment available at the ghci prompt. This cannot affect any values that were calculated when a was a name for 1; they were and remain pure values.
An easy way to think about it is that a = 2 is not "changing a", it's just introducing a new and separate binding. Because it happens to have the same name as an existing one the new one shadows the old one, making the old one impossible for you to refer to in any future expressions. But nothing about the old one has been changed.
And you will in fact see exactly the same behaviour in a compiled module in contexts where you can have multiple bindings for one name (if you shadow a function argument with a let, or nest lets, etc). All but one of them will be inaccessible, but things that were defined in terms of the shadowed binding remain exactly the same; they aren't re-evaluated as if they were defined in terms of the new binding.
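A minimal sketch of that compiled-module situation, using a let to shadow a top-level name. The name shadowed is invented for this example; a and b mirror the ghci session above.

```haskell
a :: Int
a = 1

-- b is built from the top-level a, so it is [1].
b :: [Int]
b = [a]

-- The let introduces a second, unrelated binding that also happens to be
-- called a. It shadows the top-level a inside the let body only.
shadowed :: ([Int], Int)
shadowed =
  let a = 2
  in (b, a)
```

Evaluating shadowed gives ([1],2): inside the let, a refers to the new binding, while b keeps the value it was defined with.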
So with that in mind, it becomes easy to explain why this works:
ghci> seqB = [2, 4, 5, 6]
ghci> seqA = 1 : zipWith (+) seqA seqB
ghci> seqB = 2 : filter (`notElem` seqA) [3..]
ghci> seqA = 1 : zipWith (+) seqA seqB
ghci> hof = (!!) seqA
It's much the same as if you had defined it this way:
ghci> seqB_old = [2, 4, 5, 6]
ghci> seqA_old = 1 : zipWith (+) seqA_old seqB_old
ghci> seqB_new = 2 : filter (`notElem` seqA_old) [3..]
ghci> seqA_new = 1 : zipWith (+) seqA_new seqB_new
ghci> hof = (!!) seqA_new
seqB_old is just a finite list.
Because zipWith stops at the length of the shortest list, seqA_old is also just a finite list, even though it's defined in terms of itself.
seqB_new is an infinite list that just has to filter each element against the elements of the finite list seqA_old; this doesn't get caught up in the problem amalloy points out, but it isn't actually the correct list you were trying to define.
seqA_new is defined in terms of itself, but seqB_new was defined in terms of seqA_old, not this new version. There is simply no mutual recursion happening.
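The four points above can be checked with a module-level version of the same bindings. This is only a sketch; the camel-case names stand in for the _old and _new names used above.

```haskell
seqBOld, seqAOld, seqBNew, seqANew :: [Int]
seqBOld = [2, 4, 5, 6]
-- zipWith stops when seqBOld runs out, so this is the finite list
-- [1,3,7,12,18].
seqAOld = 1 : zipWith (+) seqAOld seqBOld
-- Only 3, 7, 12 and 18 are ever filtered out; 26, which is a term of the
-- real sequence, slips through.
seqBNew = 2 : filter (`notElem` seqAOld) [3 ..]
seqANew = 1 : zipWith (+) seqANew seqBNew
```

The first ten terms of seqANew still agree with the real sequence, which is consistent with the results looking right for small indices and only drifting for larger ones, depending on how many terms were seeded.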
This problem doesn’t really lend itself to a mutually recursive solution. filter + notElem will continue searching beyond where they could ever return a result, because they can’t make any use of the fact that the sequence is strictly ascending.
Rather than searching for the next element that we haven’t seen, we can turn the problem around: start by assuming we will see every number, and use delete to prune out those numbers that we know we will want to exclude.
import Data.List (delete, scanl', unfoldr)

hof :: Int -> Int
hof = (!!) seqA
  where
    -- By definition, one is the cumulative sum of the other.
    seqA = scanl' (+) 1 seqB
    -- Iteratively build the sequence.
    seqB = unfoldr (infinitely step) (1, [2 ..])
    step c (d, xs) = (c, (c + d, delete (c + d) xs))
    -- Helper for when ‘unfoldr’ is known to have
    -- unbounded input (‘x : xs’ always matches)
    -- and unbounded output (we always return ‘Just’).
    infinitely f (d, x : xs) = Just (f x (d, xs))

Understanding foldr and foldl functions

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f v [] = v
foldr f v (x:xs) = f x (foldr f v xs)
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f v [] = v
foldl f v (x:xs) = foldl f (f v x) xs
I am trying to wrap my head around these two functions. I have two questions. One regarding function f. In general,
foldr f v xs
f has access to the first element of xs and the recursively processed tail. Here:
foldl f v xs
f has access to the last element of xs and the recursively processed tail.
Is this a useful (and correct) way to think about it?
My second question is related to fold "right" or "left". In many places, they say that foldr "starts from the right". For example, if I expand the expression
foldr (+) 0 [1,2,3]
I get
(+) 1 (foldr (+) 0 [2,3])
So, I see it is "starting from the left" of the list. The first element and the recursively processed tail are the arguments to the function. Could someone shed some light on this issue?
EDIT: One focus of my question is the function f passed to fold; the linked answer doesn't address that point.
"Starting from the right" is good basic intuition, but it can also mislead, as you've already just discovered. The truth of the matter is that lists in Haskell are singly linked and we only have access to one side directly, so in some sense every list operation in Haskell "starts" from the left. But what it does from there is what's important. Let's finish expanding your foldr example.
foldr (+) 0 [1, 2, 3]
1 + foldr (+) 0 [2, 3]
1 + (2 + foldr (+) 0 [3])
1 + (2 + (3 + foldr (+) 0 []))
1 + (2 + (3 + 0))
Now the same for foldl.
foldl (+) 0 [1, 2, 3]
foldl (+) (0 + 1) [2, 3]
foldl (+) ((0 + 1) + 2) [3]
foldl (+) (((0 + 1) + 2) + 3) []
((0 + 1) + 2) + 3
In the foldr case, we make our recursive call directly, so we take the head and make it an argument to our accumulating function, and then we make the other argument our recursive call.
In the foldl case, we make our recursive call by changing the accumulator argument. Since we're changing the argument rather than the result, the order of evaluation gets flipped around.
The difference is in the way the parentheses "associate". In the foldr case, the parentheses associate to the right, while in the foldl case they associate to the left. Likewise, the "initial" value is on the right for foldr and on the left for foldl.
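One way to see the two associations side by side is to fold with a function that builds the parenthesised expression as a string instead of adding. This is just an illustrative sketch; showR and showL are made-up names.

```haskell
-- Render the expression that a right fold of (+) with seed 0 builds.
showR :: [Int] -> String
showR = foldr (\x acc -> "(" ++ show x ++ " + " ++ acc ++ ")") "0"

-- Render the expression that a left fold of (+) with seed 0 builds.
showL :: [Int] -> String
showL = foldl (\acc x -> "(" ++ acc ++ " + " ++ show x ++ ")") "0"
```

Here showR [1,2,3] gives "(1 + (2 + (3 + 0)))" and showL [1,2,3] gives "(((0 + 1) + 2) + 3)", matching the expansions above.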
The general advice for the use of folds on lists in Haskell is this.
Use foldr if you want lazy evaluation that respects the list structure. Since foldr does its recursion inside the function call, if the folding function happens to be guarded (i.e. by a data constructor), then our foldr call is guarded as well. For instance, we can use foldr to efficiently construct an infinite list out of another infinite list.
Use foldl' (note the ' at the end), the strict left fold, for situations where you want the operation to be strict. foldl' forces each step of the fold to weak head normal form before continuing, preventing thunks from building up. So whereas foldl will build up the entire internal expression and then potentially evaluate it at the end, foldl' will do the work as we go, which saves a ton of memory on large lists.
Don't use foldl on lists. The laziness gained by foldl is almost never useful: the only way to get anything useful out of a left fold is to force the whole fold anyway, so building up the thunks internally gains nothing.
For other data structures which are not right-biased, the rules may be different. All of this is running on the assumption that your Foldable is a Haskell list.
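A short sketch of the first two pieces of advice, with hypothetical names doubleAll and sumStrict. The first uses foldr with a folding function guarded by the (:) constructor, so it can work on an infinite list; the second uses foldl' for a strict running total.

```haskell
import Data.List (foldl')

-- The recursive result acc sits behind the (:) constructor, so each
-- element is produced lazily, even for an infinite input.
doubleAll :: [Int] -> [Int]
doubleAll = foldr (\x acc -> 2 * x : acc) []

-- The accumulator is forced at every step, so no chain of thunks builds up.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0
```

For example, take 3 (doubleAll [1 ..]) returns [2,4,6] without ever reaching the end of the list.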

Church Using foldr

I came across this code and it works but I don't know why. I will attempt to convey the parts that I understand but I don't get the full picture. Here is the code:
church :: Int -> (c -> c) -> c -> c
church 0 _ arg = arg
church n f arg = foldr (\x acc -> f acc) arg [1..n]
When running this input on the Prelude,
church 4 tail "ABCDEFGH"
this is the answer:
"EFGH"
I know how foldr works, I can walk through an example of foldr:
foldr (/) 2 [8,12,24,4]
What happens here is:
4/2 = 2, 24/2 = 12, 12/12 = 1, 8/1 = 8
I get the desired output 8, as described in the second example of this page:
As for this question, I know why "EFGH" is the answer. The tail is applied four times and it goes like this:
tail "ABCDEFGH" = "BCDEFGH",
tail "BCDEFGH" = "CDEFGH"
tail "CDEFGH" = "DEFGH"
tail "DEFGH" = "EFGH"
But, in this code, this is the procedure when I write it out:
foldr (\x acc -> tail acc) "ABCDEFGH" [1, 2, 3, 4]
From what I have described for foldr above, foldr applies tail and "ABCDEFGH" to 4, since 4 is the last element. But, I can't wrap my head around how "ABCDEFGH" is applied with a tail to 4. From my example, it was easy because (/) divides 2 elements, one from the list and the other being the second argument. However, in this code's case, tail is used between an element from a list and another list. I do not understand that. Can anybody help me out by going through element by element, like how I described in my example?
Note that x isn't used anywhere to the right of the ->. Thus, the numbers in the list aren't being used. It could be a list of units and still work the same:
foldr (\x acc -> tail acc) "ABCDEFGH" [(), (), (), ()]
The only purpose of the list is to encode the number of times to do tail in its length. It may also help your understanding if you replace \x acc -> tail acc with the equivalent expression const tail.
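Going through it element by element, as the question asks, might look like the sketch below. The expansion is written in the comments. churchUnits is a made-up name for the variant that uses const and a list of units, and the special case for 0 from the question is left out because folding over the empty list [1 .. 0] already returns arg.

```haskell
-- The element of the list is ignored; only the list's length matters.
--   foldr g "ABCDEFGH" [1, 2, 3, 4]        where g _ acc = tail acc
--     = g 1 (g 2 (g 3 (g 4 "ABCDEFGH")))
--     = tail (tail (tail (tail "ABCDEFGH")))
--     = "EFGH"
church :: Int -> (c -> c) -> c -> c
church n f arg = foldr (\_ acc -> f acc) arg [1 .. n]

-- Same idea with const and a list of units.
churchUnits :: Int -> (c -> c) -> c -> c
churchUnits n f arg = foldr (const f) arg (replicate n ())
```

So the list elements never meet the string at all: tail is only ever applied to the accumulator.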

Haskell: Purpose of the flip function?

I am a bit surprised that this was not asked before. Maybe it is a stupid question.
I know that flip is changing the order of two arguments.
Example:
(-) 5 3
= 5 - 3
= 2
flip (-) 5 3
= 3 - 5
= -2
But why would I need such a function? Why not just change the inputs manually?
Why not just write:
(-) 3 5
= 3 - 5
= -2
One is unlikely to ever use the flip function on a function that is immediately applied to two or more arguments, but flip can be useful in two situations:
If the function is passed higher-order to a different function, one cannot simply reverse the arguments at the call site, since the call site is in another function! For example, these two expressions produce very different results:
ghci> foldl (-) 0 [1, 2, 3, 4]
-10
ghci> foldl (flip (-)) 0 [1, 2, 3, 4]
2
In this case, we cannot swap the arguments of (-) because we do not apply (-) directly; foldl applies it for us. So we can use flip (-) instead of writing out the whole lambda \x y -> y - x.
Additionally, it can be useful to use flip to partially apply a function to its second argument. For example, we could use flip to write a function that builds an infinite list using a builder function that is provided the element’s index in the list:
buildList :: (Integer -> a) -> [a]
buildList = flip map [0..]
ghci> take 10 (buildList (\x -> x * x))
[0,1,4,9,16,25,36,49,64,81]
Perhaps more frequently, this is used when we want to partially apply the second argument of a function that will be used higher-order, like in the first example:
ghci> map (flip map [1, 2, 3]) [(+ 1), (* 2)]
[[2,3,4],[2,4,6]]
Sometimes, instead of using flip in a case like this, people will use infix syntax instead, since operator sections have the unique property that they can supply the first or second argument to a function. Therefore, writing (`f` x) is equivalent to writing flip f x. Personally, I think writing flip directly is usually easier to read, but that’s a matter of taste.
One very useful example of flip usage is sorting in descending order. You can see how it works in ghci:
ghci> import Data.List
ghci> :t sortBy
sortBy :: (a -> a -> Ordering) -> [a] -> [a]
ghci> :t compare
compare :: Ord a => a -> a -> Ordering
ghci> sortBy compare [2,1,3]
[1,2,3]
ghci> sortBy (flip compare) [2,1,3]
[3,2,1]
Sometimes you'll want to use a function by supplying the second parameter but take its first parameter from somewhere else. For example:
map (flip (-) 5) [1..5]
Though this can also be written as:
map (\x -> x - 5) [1..5]
Another use case is when the second argument is long:
flip (-) 5 $
  if odd x
    then x + 1
    else x
But you can always use a let expression to name the first parameter computation and then not use flip.
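For comparison, here are the two styles side by side as a sketch. The function names withFlip and withLet and the local name rounded are invented for this example.

```haskell
-- Using flip so that the long expression can come last.
withFlip :: Int -> Int
withFlip x =
  flip (-) 5 $
    if odd x
      then x + 1
      else x

-- Naming the long expression with let and subtracting directly.
withLet :: Int -> Int
withLet x =
  let rounded = if odd x then x + 1 else x
  in rounded - 5
```

Both compute the same value, for instance -1 for an input of 3.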

Getting the gcd of a list

I am new to Haskell, actually I just started, and I would like to get a small hint to the question I am about to ask.
I am currently trying to get the GCD of a given list. For example, having the list [3, 6, 9] it will return 3.
For the moment, I thought of the following approach. Am I going in a good direction?
let getGCD l = map (\x y -> gcd x y) l
Not quite, you don't want map but rather a fold. map will let you transform every element in the list uniformly, so you give it a local transformation a -> b and it gives you a global transformation ([a] -> [b]). This isn't really what you want.
As a quick primer on folds, there's a whole family of them, which all let us express computations that we build up by repeatedly applying a function to an initial value and the next element of the list, and then repeating with the result of that application as the new initial value. So foldl' (+) 0 [1, 2, 3, 4] would do something like
foldl' (+) 0 [1, 2, 3, 4] ==>
foldl' (+) 1 [2, 3, 4] ==>
foldl' (+) 3 [3, 4] ==>
foldl' (+) 6 [4] ==>
foldl' (+) 10 [] ==> -- For empty lists we just return the seed given
10
Can you see how to slot your problem into this framework?
More hints
You want to take a list and compute a result which depends on every element of the list, something like
gcdAll :: [Int] -> Int
gcdAll l = foldl' step initial l
is closer to what you want, where step takes the current gcd of the list you've processed so far and the next element of the list and returns the next value, and initial is the value to start with (and what is returned if l is empty). Since there isn't really a sane value, I'd instead split this into
gcdAll :: [Int] -> Maybe Int
gcdAll [] = Nothing
gcdAll (h : rest) = Just $ foldl' step h rest
so that you correctly signal the possibility of failure, after all, what's the gcd of nothing?
Note that foldl' is imported from Data.List.
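Filling in the hint, and assuming that step is simply the Prelude's gcd (the answer above leaves step for the reader to work out), a complete version could look like this:

```haskell
import Data.List (foldl')

-- Nothing for the empty list; otherwise fold gcd over the remaining
-- elements, starting from the first one.
gcdAll :: [Int] -> Maybe Int
gcdAll [] = Nothing
gcdAll (h : rest) = Just (foldl' gcd h rest)
```

With this, gcdAll [3, 6, 9] gives Just 3 and gcdAll [] gives Nothing.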
You can recursively use gcd on a list (essentially a fold implementation)
gcd' :: (Integral a) => [a] -> a
gcd' [] = 1
gcd' [x] = x
gcd' (x:xs) = gcd x (gcd' xs)
A GCD is a property of a pair of numbers. So, really, you want to look at pairs of numbers drawn from your list. Ultimately you want to end up with a single GCD for the entire list, but as a first step, you want pairs.
There's a widely-known trick for working with consecutive pairs:
f1 list = zipWith f2 list (tail list)
The zipWith function is a bit like map, but works with a pair of lists. In this case, the original list, and the tail of the original list. (Note that this fails if the list is empty.) If you replace f2 with your gcd function, you now have a new list which is the GCD of each consecutive pair of numbers. And this list is one element shorter than the original:
f1 [x, y, z, w] ==> [gcd x y, gcd y z, gcd z w]
So each time you apply f1 to a list, you get a new, shorter list of GCDs. Apply it enough times, and you should end up with just one element...
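A sketch of where that idea leads, with f2 replaced by gcd. The names pairGcds and gcdByPairs are hypothetical, and returning Maybe for the empty list is a choice made for this example rather than part of the trick itself.

```haskell
-- One round of the trick: the gcd of each consecutive pair.
pairGcds :: [Int] -> [Int]
pairGcds list = zipWith gcd list (tail list)

-- Keep shrinking the list until a single element is left.
gcdByPairs :: [Int] -> Maybe Int
gcdByPairs [] = Nothing
gcdByPairs [x] = Just x
gcdByPairs xs = gcdByPairs (pairGcds xs)
```

For [12, 18, 30] the rounds are [6, 6] and then [6], so the result is Just 6. This does more work than a single fold, but it follows the pairwise reasoning directly.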
I just tackled this one, and this would be the quickest, simplest solution:
myGCDMultiple = foldr1 gcd
> myGCDMultiple [3,6,9]
3
