Define min function using Foldr - haskell

I want to define a min function that returns the minimum of a list, using foldr:
min xs = foldr (\ x y -> if x<y then x else y) xs
Although I understand how foldr works for simple functions like the one below,
sum = foldr (+) 0
I am confused about how to do it for a function like min.

The type signature of foldr is:
foldr :: (a -> b -> b) -> b -> [a] -> b
In particular, foldr takes three arguments, so your definition for min is missing
one value:
min xs = foldr (\ x y -> if x<y then x else y) ??? xs
In the case of sum = foldr (+) 0, the argument 0 is the value of sum on the empty list.
Likewise, the missing argument ??? should be the value for min on the empty list. But does min [] even make any sense?
The way to resolve this is to realize that min should only be called on non-empty lists and write:
min [] = error "min called on an empty list"
min (a:as) = foldr (\x y -> if x < y then x else y) ??? as
To determine what ??? should be, just ask yourself: what should min (a:as) be when as = []?
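For reference, here is one way to fill in the ???. This is only a sketch: it assumes that the minimum of a one-element list is that element, so a itself serves as the initial value. The name minimumR is made up here to avoid clashing with the Prelude's min.

```haskell
-- Hypothetical name; a right fold seeded with the head of the list.
minimumR :: Ord a => [a] -> a
minimumR []     = error "minimumR called on an empty list"
minimumR (a:as) = foldr (\x y -> if x < y then x else y) a as
```

When as = [], the fold returns a unchanged, which matches the question asked above.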

min is best viewed as a left fold rather than a right fold. The trouble with the right fold approach is that no comparisons happen until the list has already been entirely traversed. If the list is generated lazily, this will lead to a lot of excess memory use. Even if it's not, it will likely lead to poor use of cache and general slowness. The rest of this answer is much less practical.
As http://www.haskell.org/haskellwiki/Foldl_as_foldr shows, it's actually possible to write foldl in terms of foldr:
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f a bs =
  foldr (\b g x -> g (f x b)) id bs a
This is not going to be such a great implementation in general, but recent developments in program transformation have actually made it work okay, apparently, although maybe not in an actual released compiler!
Then
foldl1 f (a:as) = foldl f a as
                = foldr (\b g x -> g (f x b)) id as a
Now write
min2 x y
  | x <= y = x
  | otherwise = y
Then
min = foldl1 min2
So we can write
min (a:as) = foldr (\b g x -> g (min2 x b)) id as a
and (on a bleeding edge research compiler) this is probably the best way to use foldr to implement min.
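A minimal sketch putting the pieces of this answer together, next to a plain strict left fold for comparison. The names minViaFoldr and minStrict are hypothetical; foldl' is the strict left fold from Data.List.

```haskell
import Data.List (foldl')

min2 :: Ord a => a -> a -> a
min2 x y
  | x <= y    = x
  | otherwise = y

-- foldr builds a function that threads the running minimum left to right.
minViaFoldr :: Ord a => [a] -> a
minViaFoldr []     = error "minViaFoldr called on an empty list"
minViaFoldr (a:as) = foldr (\b g x -> g (min2 x b)) id as a

-- The practical alternative: a strict left fold.
minStrict :: Ord a => [a] -> a
minStrict []     = error "minStrict called on an empty list"
minStrict (a:as) = foldl' min2 a as
```

Both functions should agree on every non-empty list; the second is the one to reach for in ordinary code.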

Related

How to break out from a fold function in haskell when the accumulator met a certain condition?

I'm calculating the sum of a list after applying someFunction to every element of it like so:
sum (map someFunction myList)
someFunction is very resource heavy so to optimise it I want to stop calculating the sum if it goes above a certain threshold.
It seems like I need to use fold but I don't know how to break out of it if the accumulator reaches the threshold. My guess is to somehow compose fold and takeWhile but I'm not exactly sure how.
Another technique is to use a foldM with Either to capture the early termination effect. Left signals early termination.
import Control.Monad(foldM)
sumSome :: (Num n,Ord n) => n -> [n] -> Either n n
sumSome thresh = foldM f 0
  where
    f a n
      | a >= thresh = Left a
      | otherwise   = Right (a+n)
To ignore the exit status, just compose with either id id.
sumSome' :: (Num n,Ord n) => n -> [n] -> n
sumSome' n = either id id . sumSome n
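To connect this back to the question, here is a sketch that applies the expensive function inside the step, so it is only called while the threshold has not been reached. The name sumMappedUntil and its argument order are assumptions made for illustration.

```haskell
import Control.Monad (foldM)

-- Hypothetical helper: sum (map g xs), stopping once the running
-- total has reached the threshold. Left carries the early exit.
sumMappedUntil :: (Num n, Ord n) => n -> (a -> n) -> [a] -> n
sumMappedUntil thresh g = either id id . foldM step 0
  where
    step acc x
      | acc >= thresh = Left acc
      | otherwise     = Right (acc + g x)
```

Because Left short-circuits the fold, this also terminates on an infinite list such as [1..] as long as the threshold is eventually reached.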
One of the options would be using scanl function, which returns a list of intermediate calculations of foldl.
Thus, scanl1 (+) (map someFunction myList) will return the intermediate sums of your calculations. And since Haskell is a lazy language it won't calculate all the values of myList until you need it. For example:
take 5 $ scanl1 (+) (map someFunction myList)
will call someFunction 5 times and return the list of the first 5 partial sums.
After that you can use either takeWhile or dropWhile and stop the calculation, when a certain condition is True. For example:
head $ dropWhile (< 1000) $ scanl1 (+) [1..1000000000]
will stop the calculation once the running sum reaches 1000, and returns 1035.
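The same idea can be wrapped in a small function that avoids head, in case the list runs out before the limit is reached. This is a sketch; the name firstSumAtLeast is made up here.

```haskell
-- Hypothetical helper: the first partial sum that is >= limit,
-- or Nothing if the list ends before that happens.
firstSumAtLeast :: (Num a, Ord a) => a -> [a] -> Maybe a
firstSumAtLeast limit xs =
  case dropWhile (< limit) (scanl1 (+) xs) of
    (s:_) -> Just s
    []    -> Nothing
```

With the expensive function from the question, the call would look like firstSumAtLeast threshold (map someFunction myList).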
This will do what you ask about without building the intermediate list as scanl' would (and scanl would even cause a thunks build-up on top of that):
foldl'Breaking break reduced reducer acc list =
    foldr cons (\acc -> acc) list acc
  where
    cons x r acc | break acc x = reduced acc x
                 | otherwise   = r $! reducer acc x
cf. related wiki page.
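A usage sketch for the function above. The definition is restated so the block stands alone; the type signature and the helper sumUpTo are assumptions added for illustration, not part of the original answer.

```haskell
-- Restated from the answer above; the signature is an assumption.
-- Note that the parameter named break shadows Prelude's break.
foldl'Breaking :: (b -> a -> Bool) -> (b -> a -> b) -> (b -> a -> b) -> b -> [a] -> b
foldl'Breaking break reduced reducer acc list =
    foldr cons id list acc
  where
    cons x r acc'
      | break acc' x = reduced acc' x
      | otherwise    = r $! reducer acc' x

-- Hypothetical example: sum elements until adding the next one
-- would exceed the limit, then return the sum so far.
sumUpTo :: Int -> [Int] -> Int
sumUpTo limit = foldl'Breaking (\acc x -> acc + x > limit) (\acc _ -> acc) (+) 0
```

Here the first argument decides when to stop, the second computes the final result at the stopping point, and the third is the ordinary reducer.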
Use a bounded addition operator instead of (+) with foldl.
foldl (\b a -> b + if b > someThreshold then 0 else a) 0 (map someFunction myList)
Because Haskell is non-strict, only calls to someFunction that are necessary to evaluate the if-then-else are themselves evaluated. fold still traverses the entire list.
> foldl (\b a -> b + if b > 10 then 0 else a) 0 (map (trace "foo") [1..20])
foo
foo
foo
foo
foo
15
sum [1..5] > 10, and you can see that trace "foo" only executes 5 times, not 20.
Instead of foldl, though, you should use the strict version foldl' from Data.Foldable.
You could try making your own sum function, maybe call it boundedSum, that takes an Integer upper bound, an [Integer] to sum over, and a "sum up until this point" value to be compared with the upper bound, and returns the sum of the list.
boundedSum :: Integer -> [Integer] -> Integer -> Integer
boundedSum upperBound (x : xs) prevSum =
  let currentSum = prevSum + x
  in
    if currentSum > upperBound
      then upperBound
      else boundedSum upperBound xs currentSum
boundedSum upperBound [] prevSum =
  prevSum
I think this way you won't "eat up" more of the list if the sum up until the current element exceeds upperBound.
EDIT: The answers to this question suggest better techniques than mine and the question itself looks rather similar to yours.
This is a possible solution:
last . takeWhile (<=100) . scanl (+) 0 . map (^2) $ [1..]
Dissected:
take your starting list ([1..] in the example)
map your expensive function ((^2))
compute partial sums scanl (+) 0
stop after the partial sums become too large (keep those (<=100))
take the last one
If performance matters, also try scanl', which might improve it.
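A sketch of the same pipeline using scanl' from Data.List, packaged as a function. The name sumWhileAtMost is hypothetical, and the code assumes the limit is non-negative so that the initial 0 always survives takeWhile and last never sees an empty list.

```haskell
import Data.List (scanl')

-- Hypothetical helper: largest partial sum of (map g xs) that is <= limit.
-- Assumes limit >= 0; otherwise takeWhile yields [] and last fails.
sumWhileAtMost :: (Num b, Ord b) => b -> (a -> b) -> [a] -> b
sumWhileAtMost limit g = last . takeWhile (<= limit) . scanl' (+) 0 . map g
```

For the example above, sumWhileAtMost 100 (^2) [1..] gives 91, since 1 + 4 + 9 + 16 + 25 + 36 = 91 and adding 49 would exceed 100.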
Something like this using until :: (a -> Bool) -> (a -> a) -> a -> a from the Prelude
sumUntil :: Real a => a -> [a] -> a
sumUntil threshold u = result
  where
    (_, result) = until stopCondition next (u, 0)
    -- No local type signatures: a signature mentioning `a` here would
    -- introduce a fresh type variable, distinct from the `a` of threshold.
    next (x:xs, y) = (xs, x + y)
    stopCondition (ls, x) = null ls || x > threshold
Then apply
sumUntil 10 (map someFunction myList)
This post is already a bit older, but I'd like to mention a way to generalize the nice code of @trevor-cook above into a breakable fold that can return not only a default value or the accumulator but also the index and element of the list where the breaking condition was satisfied:
import Control.Monad (foldM)
breakFold step initialValue list exitCondition exitFunction =
  either id (exitFunction (length list) (last list))
    (foldM f initialValue (zip [0..] list))
  where
    f acc (index,x)
      | exitCondition index x acc = Left (exitFunction index x acc)
      | otherwise = Right (step index x acc)
It only requires importing foldM. Usage examples:
mysum thresh list = breakFold (\i x acc -> x + acc) 0 list
                      (\i x acc -> x + acc > thresh)
                      (\i x acc -> acc)
myprod thresh list = breakFold (\i x acc -> x * acc) 1 list
                       (\i x acc -> acc == thresh)
                       (\i x acc -> (i,x,acc))
returning
*myFile> mysum 42 [1,1..]
42
*myFile> myprod 0 ([1..5]++[0,0..])
(6,0,0)
*myFile> myprod 0 (map (\n->1/n) [1..])
(178,5.58659217877095e-3,0.0)
In this way, one can use the index and the last evaluated list value as input for further functions.
Despite the age of this post, I'll add a possible solution. I like continuations because I find them very useful in terms of flow control.
import Data.Function (fix)
breakableFoldl
  :: (b -> a -> (b -> r) -> (b -> r) -> r)
  -> b
  -> [a]
  -> (b -> r)
  -> r
breakableFoldl f b (x : xs) = \ exit ->
  f b x exit $ \ acc ->
    breakableFoldl f acc xs exit
breakableFoldl _ b _ = ($ b)
breakableFoldr
  :: (a -> b -> (b -> r) -> (b -> r) -> r)
  -> b
  -> [a]
  -> (b -> r)
  -> r
breakableFoldr f b l = \ exit ->
  fix (\ fold acc xs next ->
    case xs of
      x : xs' -> fold acc xs' (\ acc' -> f x acc' exit next)
      _ -> next acc) b l exit
exampleL = breakableFoldl (\ acc x exit next ->
    ( if acc > 15
      then exit
      else next . (x +)
    ) acc
  ) 0 [1..9] print
exampleR = breakableFoldr (\ x acc exit next ->
    ( if acc > 15
      then exit
      else next . (x +)
    ) acc
  ) 0 [1..9] print

Generalizing fold such that it becomes expressive enough to define any finite recursion?

So, there is something known as the "universal property of fold", stating exactly the following:
g [] = i; g (x:xs) = f x (g xs) <=> g = fold f i
However, as you probably know, there are rare cases like dropWhile, which cannot be redefined as fold f i unless you generalize it.
The simplest and most obvious way to generalize is to redefine the universal property:
g' y [] = j y; g' y (x:xs) = h y x xs (g' y xs) <=> g' y = fold (?) l
At this point I can make my assumption: I assume the existence of some function p :: a -> b -> b which satisfies the equation g' y = fold p l. Let's try to solve the given equation with the help of the universal property mentioned at the very beginning:
g' y [] = j y = fold p l [] = l => j y = l
g' y (x:xs) = h y x xs (g' y xs) = fold p l (x:xs) = p x (fold p l xs) = p x (g' y xs) => letting rs = (g' y xs), h y x xs rs = p x rs, which is wrong: xs occurs freely from the left and thus equality can't hold.
Now let me try to interpret the result I've come up with and ask a question.
I see that the problem is xs emerging as unbound variable; it's true for various situations, including above mentioned dropWhile. Does it mean that the only way that equation can be solved is by "extending" rs to a pair of (rs, xs)? In other words, fold accumulates into tuple rather than a single type (ignoring the fact that tuple itself is a single type)? Is there any other way to generalize bypassing pairing?
It is as you say. The universal property says that g [] = i; g (x:xs) = f x (g xs) iff g = fold f i. This can't apply for a straightforward definition of dropWhile, as the would-be f :: a -> [a] -> [a] depends not just on the element and accumulated value at the current fold step, but also on the whole list suffix left to process (in your words, "xs emerg[es] as an unbound variable"). What can be done is twisting dropWhile so that this dependency on the list suffix becomes manifest in the accumulated value, be it through a tuple -- cf. dropWhilePair from this question, with f :: a -> ([a], [a]) -> ([a], [a]) -- or a function -- as in chi's implementation...
dropWhileFun = foldr (\x k -> \p -> if p x then k p else x : k (const False)) (const [])
... with f :: a -> ((a -> Bool) -> [a]) -> ((a -> Bool) -> [a]).
At the end of the day, the universal property is what it is -- a fundamental fact about foldr. It is no accident that not all recursive functions are immediately expressible through foldr. In fact, the tupling workaround your question brings to the table directly reflects the notion of paramorphism (for an explanation of them, see What are paramorphisms? and its exquisite answer by Conor McBride). At face value, paramorphisms are generalisations of catamorphisms (i.e. a straightforward fold); however, it only takes a slight contortion to implement paramorphisms in terms of catamorphisms. (Additional technical commentary on that might be found, for instance, in Chapter 3 of Categorical Programming With Inductive and Coinductive Types, Varmo Vene's PhD thesis.)
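To make the paramorphism remark concrete, here is a minimal sketch for lists. The names para, dropWhilePara and paraViaFoldr are chosen here for illustration; they are not taken from a particular library. The step function receives the current element, the remaining suffix, and the result for that suffix, which is exactly the extra xs the question ran into.

```haskell
-- A list paramorphism: like foldr, but the step also sees the suffix xs.
para :: (a -> [a] -> b -> b) -> b -> [a] -> b
para _ z []     = z
para f z (x:xs) = f x xs (para f z xs)

-- dropWhile needs the suffix once the predicate first fails.
dropWhilePara :: (a -> Bool) -> [a] -> [a]
dropWhilePara p = para (\x xs rest -> if p x then rest else x : xs) []

-- The "slight contortion": para expressed with foldr by pairing the
-- result with a rebuilt copy of the suffix. The lazy pattern keeps
-- the fold from forcing the whole list up front.
paraViaFoldr :: (a -> [a] -> b -> b) -> b -> [a] -> b
paraViaFoldr f z = snd . foldr (\x ~(xs, r) -> (x : xs, f x xs r)) ([], z)
```

The tupled accumulator in paraViaFoldr plays the same role as the (rs, xs) pair proposed in the question.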

Defining foldl in terms of foldr in Standard ML

The defined code is
fun foldl f e l = let
fun g(x, f'') = fn y => f''(f(x, y))
in foldr g (fn x => x) l e end
I don't understand how this works;
what is the purpose of g(x, f'')?
I also find a similar example in Haskell,
the definition is quite short
myFoldl f z xs = foldr step id xs z
  where
    step x g a = g (f a x)
Let's dissect the Haskell implementation of myFoldl and then take a look at the SML code. First, we'll look at some type signatures:
foldr :: (a -> b -> b)  -- the step function
      -> b              -- the initial value of the accumulator
      -> [a]            -- the list to fold
      -> b              -- the result
It should be noted that although the foldr function accepts only three arguments, we are applying it to four arguments:
foldr step id xs z
However, as you can see, the second argument to foldr (i.e. the initial value of the accumulator) is id, which is a function of the type x -> x. Therefore, the result is also of the type x -> x. Hence, it accepts four arguments.
Similarly, the step function is now of the type a -> (x -> x) -> x -> x. Hence, it accepts three arguments instead of two. The accumulator is an endofunction (i.e. a function whose domain and codomain is the same).
Endofunctions have a special property, they are composed from left to right instead of from right to left. For example, let's compose a bunch of Int -> Int functions:
inc :: Int -> Int
inc n = n + 1
dbl :: Int -> Int
dbl n = n * 2
The normal way to compose these functions is to use the function composition operator as follows:
incDbl :: Int -> Int
incDbl = inc . dbl
The incDbl function first doubles a number and then increments it. Note that this reads from right to left.
Another way to compose them is to use continuations (denoted by k):
inc' :: (Int -> Int) -> Int -> Int
inc' k n = k (n + 1)
dbl' :: (Int -> Int) -> Int -> Int
dbl' k n = k (n * 2)
Notice that the first argument is a continuation. If we want to recover the original functions then we can do:
inc :: Int -> Int
inc = inc' id
dbl :: Int -> Int
dbl = dbl' id
However, if we want to compose them then we do it as follows:
incDbl' :: (Int -> Int) -> Int -> Int
incDbl' = dbl' . inc'
incDbl :: Int -> Int
incDbl = incDbl' id
Notice that although we are still using the dot operator to compose the functions, it now reads from left to right.
This is the key behind making foldr behave as foldl. We fold the list from right to left but instead of folding it into a value, we fold it into an endofunction which when applied to an initial accumulator value actually folds the list from left to right.
Consider our incDbl function:
incDbl = incDbl' id
= (dbl' . inc') id
= dbl' (inc' id)
Now consider the definition of foldr:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr _ acc [] = acc
foldr fun acc (y:ys) = fun y (foldr fun acc ys)
In the basis case we simply return the accumulated value. However, in the inductive case we return fun y (foldr fun acc ys). Our step function is defined as follows:
step :: a -> (x -> x) -> x -> x
step x g a = g (f a x)
Here f is the reducer function of foldl and is of the type x -> a -> x. Notice that step x is an endofunction of the type (x -> x) -> x -> x which we know can be composed left to right.
Hence the folding operation (i.e. foldr step id) on a list [y1,y2..yn] looks like:
step y1 (step y2 (... (step yn id)))
-- or
(step y1 . step y2 . ... . step yn) id
Each step yx is an endofunction. Hence, this is equivalent to composing the endofunctions from left to right.
When this result is applied to an initial accumulator value then the list folds from left to right. Hence, myFoldl f z xs = foldr step id xs z.
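To see the direction of the fold, it can help to run myFoldl with a reducer that records the parenthesization as a string. This is only an illustrative sketch; the helper paren is made up here.

```haskell
myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl f z xs = foldr step id xs z
  where
    -- step takes an element, the continuation built from the rest of
    -- the list, and the accumulator arriving from the left.
    step x g a = g (f a x)

-- Hypothetical reducer that shows how the operands are grouped.
paren :: String -> String -> String
paren l r = "(" ++ l ++ "+" ++ r ++ ")"

leftGrouping :: String
leftGrouping = myFoldl paren "0" ["1", "2", "3"]   -- "(((0+1)+2)+3)"

rightGrouping :: String
rightGrouping = foldr paren "0" ["1", "2", "3"]    -- "(1+(2+(3+0)))"
```

The first result nests to the left, as foldl would, even though the traversal is done by foldr.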
Now consider the foldl function (which is written in Standard ML and not OCaml). It is defined as:
fun foldl f e l = let fun g (x, f'') = fn y => f'' (f (x, y))
in foldr g (fn x => x) l e end
The biggest differences between the foldr functions of Haskell and SML are:
In Haskell the reducer function has the type a -> b -> b.
In SML the reducer function has the type (a, b) -> b.
Both are correct. It's only a matter of preference. In SML instead of passing two separate arguments, you pass one single tuple which contains both arguments.
Now, the similarities:
The id function in Haskell is the anonymous fn x => x function in SML.
The step function in Haskell is the function g in SML which takes a tuple containing the first two arguments.
The step function in Haskell, step x g a, has been split into two functions in SML, g (x, f'') = fn y => f'' (f (x, y)), for more clarity.
If we rewrite the SML function to use the same names as in Haskell then we have:
fun myFoldl f z xs = let fun step (x, g) = fn a => g (f (a, x))
                     in foldr step (fn x => x) xs z end
Hence, they are exactly the same function. The expression g (x, f'') simply applies the function g to the tuple (x, f''). Here f'' is a valid identifier.
Intuition
The foldl function traverses the list head to tail while operating elements with an accumulator:
(...(a⊗x1)⊗...⊗xn-1)⊗xn
And you want to define it via a foldr:
x1⊕(x2⊕...⊕(xn⊕e)...)
Rather unintuitive. The trick is that your foldr will not produce a value, but rather a function. The list traversal will operate the elements as to produce a function that, when applied to the accumulator, performs the computation you desire.
Let's see a simple example to illustrate how this works. Consider the sum foldl (+) 0 [1,2,3] = ((0+1)+2)+3. We may calculate it via foldr as follows.
foldr (⊕) id [1,2,3]
-> 1⊕(2⊕(3⊕id))
-> 1⊕(2⊕(id.(+3)))
-> 1⊕(id.(+3).(+2))
-> (id.(+3).(+2).(+1))
So when we apply this function to 0 we get
(id.(+3).(+2).(+1)) 0
= ((0+1)+2)+3
We began with the identity function and successively changed it as we traversed the list, using ⊕ where,
n ⊕ g = g . (+n)
Using this intuition, it isn't hard to define a sum with an accumulator via foldr. We built the computation for a given list via foldr ⊕ id xs. Then to calculate the sum we applied it to 0, foldr ⊕ id xs 0. So we have,
foldl (+) 0 xs = foldr ⊕ id xs 0
where n ⊕ g = g . (+n)
or equivalently, denoting n ⊕ g in prefix form by (⊕) n g and noting that (⊕) n g a = (g . (+n)) a = g (a+n),
foldl (+) 0 xs = foldr ⊕ id xs 0
where (⊕) n g a = g (a+n)
Note that the ⊕ is your step function, and that you can obtain the generic result you're looking for by substituting a function f for +, and accumulator a for 0.
Next let us show that the above really is correct.
Formal derivation
Moving on to a more formal approach. It is useful, for simplicity, to be aware of the following universal property of foldr.
h [] = e
h (x:xs) = f x (h xs)
iff
h = foldr f e
This means that rather than defining foldr directly, we may instead and more simply define a function h in the form above.
We want to define such an h so that,
h xs a = foldl f a xs
or equivalently,
h xs = \a -> foldl f a xs
So let's determine h. The empty case is simple:
h [] = \a -> foldl f a []
= \a -> a
= id
The non-empty case results in:
h (x:xs) = \a -> foldl f a (x:xs)
= \a -> foldl f (f a x) xs
= \a -> h xs (f a x)
= step x (h xs) where step x g = \a -> g (f a x)
= step x (h xs) where step x g a = g (f a x)
So we conclude that,
h [] = id
h (x:xs) = step x (h xs) where step x g a = g (f a x)
satisfies h xs a = foldl f a xs
And by the universal property above (noting that the f in the universal property formula corresponds to step here, and e to id) we know that h = foldr step id. Therefore,
h = foldr step id
h xs a = foldl f a xs
-----------------------
foldl f a xs = foldr step id xs a
where step x g a = g (f a x)

Find the K'th element of a list using foldr

I am trying to implement my own safe lookup of a list element by index.
I think my function should have this signature:
safe_search :: [a] -> Int -> Maybe a
safe_search xs n = foldr iteration init_val xs n
iteration = undefined
init_val = undefined
I have a problem with the implementation of iteration. I think it has to look like this:
safe_search :: [a] -> Int -> Maybe a
safe_search xs n = foldr iteration init_val xs n
where
iteration :: a -> (Int -> [a]) -> Int -> a
iteration x g 0 = []
iteration x g n = x (n - 1)
init_val :: Int -> a
init_val = const 0
But it has too many errors. My intuition about Haskell is wrong.
You have
safe_search :: [a] -> Int -> Maybe a
safe_search xs n = foldr iteration init_val xs n
if null xs holds, foldr iteration init_val [] => init_val, so
init_val n
must make sense. Nothing to return, so
= Nothing
is all we can do here, to fit the return type.
So init_val is a function, :: Int -> Maybe a. By the definition of foldr, this is also what the "recursive" argument to the combining function is, "coming from the right":
iteration x r
but then this call must also return just such a function itself (again, by the definition of foldr, foldr f z [a,b,c,...,n] == f a (f b (f c (...(f n z)...))), f :: a -> b -> b i.e. it must return a value of the same type as it gets in its 2nd argument ), so
n | n==0 = Just x
That was easy, 0-th element is the one at hand, x; what if n > 0?
| n>0 = ... (n-1)
Right? Just one more step left for you to do on your own... :) It's not x (the list's element) that goes on the dots there; it must be a function. We've already received such a function, as an argument...
To see what's going on here, it might help to check the case when the input is a one-element list, first,
safe_search [x] n = foldr iteration init_val [x] n
                  = iteration x init_val n
and with two elements,
safe_search [x1, x2] n = iteration x1 (iteration x2 init_val) n
--                       iteration x  r                       n
Hope it is clear now.
edit: So, this resembles the usual foldr-based implementation of zip fused with the descending enumeration from n down, indeed encoding the higher-level definition of
foo xs n = ($ zip xs [n,n-1..]) $
             dropWhile ((>0) . snd) >>>
             map fst >>>
             take 1 >>> listToMaybe
         = drop n >>> take 1 >>> listToMaybe $ xs
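For readers who want to check their own solution to the exercise above, here is one possible completion. It is a sketch, not the only answer: the name safeSearchR is made up, and returning Nothing for negative indices is an assumption the original discussion does not cover.

```haskell
-- Hypothetical completed version: foldr builds a function of the index.
safeSearchR :: [a] -> Int -> Maybe a
safeSearchR xs n = foldr iteration initVal xs n
  where
    -- Empty list: nothing to return, whatever the index is.
    initVal _ = Nothing
    -- r is the function built from the rest of the list.
    iteration x r k
      | k == 0    = Just x
      | k > 0     = r (k - 1)
      | otherwise = Nothing   -- assumption: negative index means no result
```

Since r is only called when the index is still positive, the fold stops at the requested element and also works on infinite lists.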
Think about a few things.
What type should init_val have?
What do you need to do with g? g is the trickiest part of this code. If you've ever learned about continuation-passing style, you should probably think of both init_val and g as continuations.
What does x represent? What will you need to do with it?
I wrote up an explanation some time ago about how the definition of foldl in terms of foldr works. You may find it helpful.
I suggest to use standard foldr pattern, because it is easier to read and understand the code, when you use standard functions:
foldr has the type foldr :: (a -> b -> b) -> b -> [a] -> b,
where the second argument, of type b, is the initial value of the accumulator acc. Here the accumulator collects elements of your list, so b is itself a list type.
You need to stop adding elements of your list to acc after you've added the desired element. Then you take the head of the resulting list and thus get the desired element of the original list.
To get n'th element of the list xs, you need to add length xs - n elements of xs to the accumulator acc, counting from the end of the list.
But where do we keep a counter if we want to use the standard foldr function to improve the readability of our code? We can keep it in our accumulator, representing it as a tuple (acc, iterator). We subtract 1 from the iterator each time we add an element from our initial list xs to acc, and stop adding elements of xs to acc when the iterator equals 0.
Then we apply head . fst to the result of our foldr function to get the desired element of the initial list xs and wrap it with the Just constructor.
Of course, if length xs - 1 is less than the index n of the desired element, the result of the whole function safeSearch will be Nothing.
Here is the code of the function safeSearch:
safeSearch :: Int -> [a] -> Maybe a
safeSearch n xs
  | (length xs - 1) < n = Nothing
  | otherwise = return $ findElem n' xs
  where
    findElem num =
      head .
      fst .
      foldr (\x (acc,iterator) ->
               if iterator /= 0
                 then (x : acc,iterator - 1)
                 else (acc,iterator))
            ([],num)
    n' = length xs - n

Haskell Fold with anonymous function

I have a problem with one of the Haskell basics: Fold + anonymous functions
I'm developing a bin2dec program with foldl.
The solution looks like this:
bin2dec :: String -> Int
bin2dec = foldl (\x y -> if y=='1' then x*2 + 1 else x*2) 0
I understand the basic idea of foldl / foldr but I can't understand what the parameters x and y stand for.
See the type of foldl
foldl :: (a -> b -> a) -> a -> [b] -> a
Consider foldl f z list
foldl works incrementally on the list (or anything Foldable): it takes one element at a time from the left and applies f z element to get the new accumulator, which is used for the next step while folding over the rest of the elements. A simple definition of foldl may help in understanding it.
foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs
The diagram from Haskell wiki might help building a better intuition.
Consider your function f = (\x y -> if y=='1' then x*2 + 1 else x*2) and try to write the trace for foldl f 0 "11". Here "11" is same as ['1','1']
foldl f 0 ['1','1']
= foldl f (f 0 '1') ['1']
Now f is a function which takes 2 arguments, first an integer and second a character, and returns an integer.
So in this case x=0 and y='1', so f x y = 0*2 + 1 = 1
= foldl f 1 ['1']
= foldl f (f 1 '1') []
Now again applying f 1 '1'. Here x=1 and y='1' so f x y = 1*2 + 1 = 3.
= foldl f 3 []
Using the first definition of foldl for empty list.
= 3
Which is the decimal representation of "11".
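The intermediate accumulators in a trace like this can also be inspected directly with scanl, which returns every value the accumulator takes. A small sketch, where the names step and bin2decTrace are introduced here for illustration:

```haskell
-- The anonymous function from the question, given a name.
-- x is the number computed so far, y is the next character.
step :: Int -> Char -> Int
step x y = if y == '1' then x * 2 + 1 else x * 2

bin2dec :: String -> Int
bin2dec = foldl step 0

-- Same fold, but keeping every intermediate accumulator.
bin2decTrace :: String -> [Int]
bin2decTrace = scanl step 0
```

For the input "1011" the trace is [0,1,2,5,11]: each element is the value of x just before the next character is consumed, and the last one is the final result.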
Use the types! You can type :t in GHCi followed by any function or value to see its type. Here's what happens if we ask for the type of foldl
Prelude> :t foldl
foldl :: (a -> b -> a) -> a -> [b] -> a
The input list is of type [b], so it's a list of bs. The output type is a, which is what we're going to produce. You also have to supply an initial value for the fold, also of type a. The function is of type
a -> b -> a
The first parameter (a) is the value of the fold computed so far. The second parameter (b) is the next element of the list. So in your example
\x y -> if y == '1' then x * 2 + 1 else x * 2
the parameter x is the binary number you've computed so far, and y is the next character in the list (either a '1' or a '0').

Resources