Identity of the "accumulating parameter" of the foldr function - haskell

the foldr function:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr func acc [] = acc
foldr func acc (x:xs) = func x (foldr func acc xs)
captures recursion patterns like those on the left
and simplifies them to the versions on the right:
not using folds                  | using folds
----------------------------------------------------------------------
sum :: [Integer] -> Integer      | sum :: [Integer] -> Integer
sum [] = 0                       | sum xs = foldr (+) 0 xs
sum (x:xs) = x + sum xs          |
                                 |
product :: [Integer] -> Integer  | product :: [Integer] -> Integer
product [] = 1                   | product xs = foldr (*) 1 xs
product (x:xs) = x * product xs  |
                                 |
concat :: [[a]] -> [a]           | concat :: [[a]] -> [a]
concat [] = []                   | concat xs = foldr (++) [] xs
concat (x:xs) = x ++ concat xs   |
One thing I noticed is that the acc argument passed to the fold
seems to be exactly the neutral (identity) element of the function being folded.
In mathematics, the neutral element of addition is 0,
because n + 0 = n for all n ∈ ℝ.
It doesn't change anything. In other words:
when the neutral element is one of the arguments to addition, the sum equals the other summand.
(+) summand 0 = summand or summand + 0 = summand
The same goes for multiplication: the product of a factor and the identity equals the factor itself:
(*) factor 1 = factor
So is this just a coincidence, or is there something bigger behind it?

You're exactly right. We very often want to pass an "identity"-like element to foldr, so that the "starting point" doesn't affect the result at all. In fact, this is codified in Haskell with the Monoid typeclass. A monoid is an associative binary operation with an identity. The examples you provide are all examples of a monoid, and they all exist in Haskell.
+ on any Num is codified as a monoid over the Sum newtype.
* on any Num is codified as a monoid over the Product newtype.
++ on any list is codified as a monoid on [a].
And in fact we can go one step further. Folding over a monoid is such a common practice that we can do it automatically with fold (or foldMap, if you need to disambiguate). For instance,
import Prelude hiding (sum, product, concat)
import Data.Foldable (fold, foldMap)
import Data.Monoid (Sum(..), Product(..))
sum :: Num a => [a] -> a
sum = getSum . foldMap Sum
product :: Num a => [a] -> a
product = getProduct . foldMap Product
concat :: [[a]] -> [a]
concat = fold
If you look in the source for Foldable, you can see that fold and foldMap are actually defined in terms of foldr on a monoid, so this is doing the exact same thing you just described.
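To make the connection with foldr concrete, here is a minimal sketch restricted to lists. The names foldMapViaFoldr, sumViaMonoid and productViaMonoid are made up for illustration; this mirrors the idea behind the Foldable defaults rather than quoting the library source. The point is that mempty is used as the seed, which is exactly the acc argument from the question.

```haskell
import Data.Monoid (Product (..), Sum (..))

-- Hypothetical helper: foldMap for lists, written directly with foldr.
-- Each element is mapped into the monoid and combined with (<>);
-- mempty is the seed, so it plays the role of the acc argument.
foldMapViaFoldr :: Monoid m => (a -> m) -> [a] -> m
foldMapViaFoldr f = foldr (\x acc -> f x <> acc) mempty

-- sum: the identity of (+) comes from the Sum monoid's mempty.
sumViaMonoid :: Num a => [a] -> a
sumViaMonoid = getSum . foldMapViaFoldr Sum

-- product: the identity of (*) comes from the Product monoid's mempty.
productViaMonoid :: Num a => [a] -> a
productViaMonoid = getProduct . foldMapViaFoldr Product
```

On an empty list each function returns the identity of its operation, since nothing is combined with the seed.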
You can find the full list of (built-in) Monoid instances on Hackage, but a few others that you might find of interest:
|| on Booleans is a monoid with the Any newtype.
&& on Booleans is a monoid with the All newtype.
Function composition is a monoid with the Endo newtype (short for "endomorphism")
As an exercise, you might consider trying to pinpoint the identity of each of these operations.
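If you want to check your answers afterwards, one way is to ask each monoid for its mempty and unwrap it. The names anyIdentity, allIdentity and endoIdentity below are made up for this sketch; getAny, getAll and appEndo are the field accessors exported by Data.Monoid.

```haskell
import Data.Monoid (All (..), Any (..), Endo (..))

-- Identity of (||): combining it with any Bool leaves that Bool unchanged.
anyIdentity :: Bool
anyIdentity = getAny mempty

-- Identity of (&&).
allIdentity :: Bool
allIdentity = getAll mempty

-- Identity of function composition, here specialised to Int -> Int.
endoIdentity :: Int -> Int
endoIdentity = appEndo mempty
```

Evaluating these in ghci (and applying endoIdentity to a few numbers) shows what each identity is.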

Related

Is there any terminating fold in Haskell?

I need some kind of fold which can terminate if I already have the data I want.
For example I need to find first 3 numbers which are greater than 5. I decided to use Either for termination and my code looks like this:
terminatingFold :: ([b] -> a -> Either [b] [b]) -> [a] -> [b]
terminatingFold f l = reverse $ either id id $ fold [] l
  where fold acc [] = Right acc
        fold acc (x:xs) = f acc x >>= flip fold xs
first3NumsGreater5 acc x =
  if length acc >= 3
    then Left acc
    else Right (if x > 5 then (x : acc) else acc)
Are there some more clever/generic approaches?
The result of your function is a list, and it would be desirable if it were produced lazily, that is, extracting one item from the result should only require evaluating the input list up until the item is found there.
Unfolds are under-appreciated for these kinds of tasks. Instead of focusing on "consuming" the input list, let's think of it as a seed from which (paired with some internal accumulator) we can produce the result, element by element.
Let's define a Seed type that contains a generic accumulator paired with the as-yet unconsumed parts of the input:
{-# LANGUAGE NamedFieldPuns #-}
import Data.List (unfoldr)
data Seed acc input = Seed {acc :: acc, pending :: [input]}
Now let's reformulate first3NumsGreater5 as a function that either produces the next output element from the Seed, or signals that there aren't any more elements:
type Counter = Int
first3NumsGreater5 :: Seed Counter Int -> Maybe (Int, Seed Counter Int)
first3NumsGreater5 (Seed {acc, pending})
  | acc >= 3 =
      Nothing
  | otherwise =
      case dropWhile (<= 5) pending of
        [] -> Nothing
        x : xs -> Just (x, Seed {acc = succ acc, pending = xs})
Now our main function can be written in terms of unfoldr:
unfoldFromList ::
  (Seed acc input -> Maybe (output, Seed acc input)) ->
  acc ->
  [input] ->
  [output]
unfoldFromList next acc pending = unfoldr next (Seed {acc, pending})
Putting it to work:
main :: IO ()
main = print $ unfoldFromList first3NumsGreater5 0 [0, 6, 2, 7, 9, 10, 11]
-- [6,7,9]
Normally, a fold capable of early termination is a foldr whose combining function is non-strict in its second argument. But its information flow, if any, is right-to-left, while you want left-to-right.
A possible solution is to make foldr behave as a left fold, which can then be made to stop early:
foldlWhile :: Foldable t
           => (a -> Bool) -> (r -> a -> r) -> r
           -> t a -> r
foldlWhile t f a xs = foldr cons (\acc -> acc) xs a
  where
    cons x r acc | t x = r (f acc x)
                 | otherwise = acc
You will need to tweak this for t to test the acc instead of x, to fit your purposes.
This function is foldlWhile from https://wiki.haskell.org/Foldl_as_foldr_alternative, re-written a little. foldl'Breaking from there might fit the bill a bit better.
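As a rough sketch of the tweak mentioned above, the predicate can inspect the accumulator instead of the current element. The names foldlUntil, first3Above5 and keep are made up for this example, and the stopping rule (stop once the accumulator holds three items) is just one way to express the original task.

```haskell
-- Hypothetical variant of foldlWhile: the predicate looks at the accumulator
-- rather than the current element, and the fold stops as soon as it holds.
foldlUntil :: Foldable t => (r -> Bool) -> (r -> a -> r) -> r -> t a -> r
foldlUntil done f a xs = foldr cons id xs a
  where
    cons x r acc
      | done acc  = acc          -- stop: the rest of the input is never touched
      | otherwise = r (f acc x)  -- continue leftwards with the updated accumulator

-- The first three numbers greater than 5, collected in reverse and flipped once.
first3Above5 :: [Int] -> [Int]
first3Above5 = reverse . foldlUntil ((>= 3) . length) keep []
  where
    keep acc x = if x > 5 then x : acc else acc
```

Because the fold stops once the accumulator is full, this also terminates on an infinite input such as [1..], although the result is only available once the fold has finished.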
foldr with the lazy reducer function can express corecursion perfectly fine just like unfoldr does.
And your code is already lazy: terminatingFold (\acc x -> Left acc) [1..] => []. That's why I'm not sure if this answer is "more clever", as you've requested.
edit: following a comment by @danidiaz, to make it properly lazy you'd have to code it as e.g.
first3above5 :: (Foldable t, Ord a, Num a)
             => t a -> [a]
first3above5 xs = foldr cons (const []) xs 0
  where
    cons x r i | x > 5 = if i==2 then [x]
                                 else x : r (i+1)
               | otherwise = r i
This can be generalized further by abstracting the test and the count.
Of course it's just reimplementing take 3 . filter (> 5), but shows how to do it in general with foldr.
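One possible shape for that generalization is sketched below. The name firstNWhere and its argument order are assumptions made for illustration; the counter-passing structure is the same as in first3above5.

```haskell
-- Hypothetical generalization of first3above5: the count n and the test p
-- are parameters. The counter i is threaded left-to-right through foldr.
firstNWhere :: Foldable t => Int -> (a -> Bool) -> t a -> [a]
firstNWhere n p xs
  | n <= 0    = []
  | otherwise = foldr cons (const []) xs 0
  where
    cons x r i
      | p x       = if i == n - 1 then [x]      -- last wanted item: stop here
                                  else x : r (i + 1)
      | otherwise = r i                         -- skip, counter unchanged
```

Like the specialised version, it produces its output lazily, so taking a prefix of the result only inspects as much of the input as needed.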

Fold that's both constant-space and short-circuiting

I'm trying to build a Haskell function that does basically the same thing as Prelude's product. Unlike that function, however, it should have these two properties:
It should operate in constant space (ignoring the fact that some numeric types like Integer aren't). For example, I want myProduct (replicate 100000000 1) to eventually return 1, unlike Prelude's product which uses up all of my RAM and then gives *** Exception: stack overflow.
It should short-circuit when it encounters a 0. For example, I want myProduct (0:undefined) to return 0, unlike Prelude's product which gives *** Exception: Prelude.undefined.
Here's what I've come up with so far:
myProduct :: (Eq n, Num n) => [n] -> n
myProduct = go 1
  where go acc (x:xs) = if x == 0 then 0 else acc `seq` go (acc * x) xs
        go acc [] = acc
That works exactly how I want it to for lists, but I'd like to generalize it to have type (Foldable t, Eq n, Num n) => t n -> n. Is it possible to do this with any of the folds? If I just use foldr, then it will short-circuit but won't be constant-space, and if I just use foldl', then it will be constant-space but won't short-circuit.
If you spell your function slightly differently, it's more obvious how to turn it into a foldr. Namely:
myProduct :: (Eq n, Num n) => [n] -> n
myProduct = flip go 1 where
  go (x:xs) = if x == 0 then \acc -> 0 else \acc -> acc `seq` go xs (acc * x)
  go [] = \acc -> acc
Now go has got that foldr flavor, and we can just fill in the holes.
myProduct :: (Foldable t, Eq n, Num n) => t n -> n
myProduct = flip go 1 where
  go = foldr
    (\x f -> if x == 0 then \acc -> 0 else \acc -> acc `seq` f (acc * x))
    (\acc -> acc)
Hopefully you can see where each of those pieces came from in the previous explicit-recursion style and how mechanical the transformation is. Then I'd make a few aesthetic tweaks:
myProduct :: (Foldable t, Eq n, Num n) => t n -> n
myProduct xs = foldr step id xs 1 where
  step 0 f acc = 0
  step x f acc = f $! acc * x
And we're all done! A bit of quick testing in ghci reveals that it still short-circuits on 0 as required and uses constant space when specialized to lists.
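The kind of quick check described above might look like the following sketch. The function is the same fold as the final version, renamed to the made-up shortProduct so that it does not clash with other definitions; the test values are only examples.

```haskell
-- Same fold as the final version above, under a hypothetical name.
shortProduct :: (Foldable t, Eq n, Num n) => t n -> n
shortProduct xs = foldr step id xs 1
  where
    step 0 _ _   = 0             -- short-circuit: the continuation is dropped
    step x k acc = k $! acc * x  -- force the running product before moving on

-- Short-circuiting: the undefined tail is never inspected.
shortCircuits :: Bool
shortCircuits = shortProduct (0 : undefined) == (0 :: Integer)

-- An ordinary product still comes out as expected.
ordinaryCase :: Bool
ordinaryCase = shortProduct [1, 2, 3, 4 :: Integer] == 24
```

Both shortCircuits and ordinaryCase should evaluate to True in ghci.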
You might be looking for foldM. Instantiate it with m = Either b and you get short circuiting behavior (or Maybe, depends if you have many possible early exit values, or one known in advance).
foldM :: (Foldable t, Monad m) => (b -> a -> m b) -> b -> t a -> m b
I recall discussions whether there should be foldM', but IIRC GHC does the right thing most of the time.
import Control.Monad
import Data.Maybe
myProduct :: (Foldable t, Eq n, Num n) => t n -> n
myProduct = fromMaybe 0 . foldM go 1
  where go acc x = if x == 0 then Nothing else Just $! acc * x
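For the Either variant mentioned above, here is a small sketch where the early-exit value is not known in advance. The function sumUntilNegative and its task (sum the numbers, but bail out with the first negative one) are invented for illustration.

```haskell
import Control.Monad (foldM)

-- Hypothetical example: Left carries the value that caused the early exit,
-- Right carries the finished sum.
sumUntilNegative :: [Int] -> Either Int Int
sumUntilNegative = foldM step 0
  where
    step acc x
      | x < 0     = Left x             -- abort the fold with the offending element
      | otherwise = Right $! acc + x   -- keep going with a forced accumulator
```

For lists, a Left result stops the fold, so elements after the first negative number are never examined.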

Converting a foldl into foldl1

I am using the following fold to get the final monotonically decreasing sequence of a list.
foldl (\acc x -> if x<=(last acc) then acc ++ [x] else [x]) [(-1)] a
So [9,5,3,6,2,1] would return [6,2,1]
However, with foldl I needed to supply a start value for the fold, namely [(-1)]. I was trying to turn it into a foldl1 so that it could handle any range of integers, as well as any Ord a, like so:
foldl1 (\acc x -> if x<=(last acc) then acc ++ [x] else [x]) a
But I get the error:
cannot construct infinite type: a ~ [a]
in the second argument of (<=) namely last acc
I was under the impression that foldl1 was basically :
foldl (function) [head a] a
But I guess this isn't so? How would you go about making this fold generic for any Ord type?
I was under the impression that foldl1 was basically :
foldl (function) [head a] a
No, foldl1 is basically:
foldl function (head a) (tail a)
So the initial element is not a list of head a, but head a.
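For lists, that relationship can be written down as a short sketch. The name foldl1ViaFoldl is made up, and the empty-list error message is an assumption of this example, not the library's wording.

```haskell
-- Hypothetical list-only version of foldl1, expressed with foldl.
-- The seed is the first element itself, so the accumulator has the
-- same type as the elements.
foldl1ViaFoldl :: (a -> a -> a) -> [a] -> a
foldl1ViaFoldl _ []       = error "foldl1ViaFoldl: empty list"
foldl1ViaFoldl f (x : xs) = foldl f x xs
```

For example, foldl1ViaFoldl (-) [10, 2, 3] computes (10 - 2) - 3, the same as foldl1 (-) [10, 2, 3].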
How would you go about making this fold generic for any Ord type?
Well a quick fix is:
foldl (\acc x -> if x<=(last acc) then acc ++ [x] else [x]) [head a] (tail a)
But there are still two problems:
in case a is an empty list, this function will error (while you probably want to return the empty list); and
the code is not terribly efficient since both last and (++) run in O(n).
The first problem can easily be addressed by pattern matching on the empty list. For the second, it is better to build the list in reverse and reverse it once at the end, for instance:
f :: Ord t => [t] -> [t]
f [] = [] -- case when the empty list is given
f a = reverse $ foldl (\acc@(ac:_) x -> if x <= ac then (x:acc) else [x]) [head a] (tail a)
Furthermore, I am personally not a huge fan of if-then-else in functional programming; you can instead define a helper function with guards:
f :: Ord t => [t] -> [t]
f [] = [] -- case when the empty list is given
f a = reverse $ foldl g [head a] (tail a)
  where g acc@(ac:_) x | x <= ac = (x:acc)
                       | otherwise = [x]
Now reverse runs in O(n), but it is done only once. Furthermore, (:) runs in O(1), so every step of g runs in O(1) (assuming the comparison itself is efficient), making the whole algorithm O(n).
For your sample input it gives:
*Main> f [9,5,3,6,2,1]
[6,2,1]
The type of foldl1 is:
Foldable t => (a -> a -> a) -> t a -> a
Your function argument,
\acc x -> if x<=(last acc) then acc ++ [x] else [x]
has type:
(Ord a) => [a] -> a -> [a]
When Haskell's typechecker tries typechecking your function, it'll try unifying the type a -> a -> a (the type of the first argument of foldl1) with the type [a] -> a -> [a] (the type of your function).
To unify these types would require unifying a with [a], which would lead to the infinite type a ~ [a] ~ [[a]] ~ [[[a]]]... and so on.
The reason this works while using foldl is that the type of foldl is:
Foldable t => (b -> a -> b) -> b -> t a -> b
So [a] gets unified with b and a gets unified with the other a, leading to no problem at all.
foldl1 is limited in that it can only take functions that deal with a single type. In other words, the accumulator must have the same type as the elements of the input list: when folding a list of Ints, foldl1 can only return an Int, while foldl can use an arbitrary accumulator type. So you can't do this using foldl1.
With regard to making this generic for all Ord values, one possible solution is to make a new typeclass for types that supply their own lower-bound value, which your function would then use as the seed. You can't make the function generic over all Ord types as it stands, because not every Ord type has a lower bound you can use.
class LowerBounded a where
  lowerBound :: a
instance LowerBounded Int where
  lowerBound = -1
finalDecreasingSequence :: (Ord a, LowerBounded a) => [a] -> [a]
finalDecreasingSequence = foldl buildSequence [lowerBound]
  where buildSequence acc x
          | x <= (last acc) = acc ++ [x]
          | otherwise = [x]
You might also want to read a bit about how Haskell does its type inference, as it helps a lot in figuring out errors like the one you got.

Higher-order function that takes a list of functions and a list of elements and applies the functions to the elements

As the title suggests, I am trying to implement a higher-order function declared as
Ord u => [v->u]->[v]->[u]
that takes as input a) a list of functions, each of type v -> u, and b) a list of elements of type v. It should return a list of all results obtained by applying a function from the first list to an element of the second list, in ascending order and without repeated values.
I was trying to implement it with the foldr function, with no luck.
I thought I could index the functions with zip, as pairs, so that foldr would apply them one by one. Below that I wrote an insertion sort so I can sort the final list.
apply :: Ord u => [v->u]->[v]->[u]
apply f y = insSort (foldr(\(i, x) y -> x:y ) (zip [1..] f))
insSort :: Ord u => [u] -> [u]
insSort (h:t) = insert h (insSort t)
insSort [] = []
insert :: Ord u => u -> [u] -> [u]
insert n (h:t)
  | n <= h = n : h : t
  | otherwise = h : insert n t
insert n [] = [n]
For example, some inputs with their outputs:
>apply [abs] [-1]
[1]
>apply [(^2)] [1..5]
[1,4,9,16,25]
>apply [(^0),(0^),(\x->div x x),(\x->mod x x)] [1..1000]
[0,1]
>apply [head.tail,last.init] ["abc","aaaa","cbbc","cbbca"]
"abc"
> apply [(^2),(^3),(^4),(2^)] [10]
[100,1000,1024,10000]
>apply [(*5)] (apply [(`div`5)] [1..100])
[0,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100]
apply :: [a -> b] -> [a] -> [b]
First of all, this signature matches that of the standard <*> function, which is part of the Applicative class.
class Applicative f where
  pure :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b
Setting f ~ [] we have <*> :: [a -> b] -> [a] -> [b].
There are at least two sensible ways of writing an Applicative instance for lists. The first one takes the Cartesian product of its inputs, pairing every function with every value. If <*>'s input lists have length N and M, the output list will have length N*M. pure for this specification would put an element in a singleton list, so that pure id <*> xs = xs.
instance Applicative [] where
  pure x = [x]
  [] <*> _ = []
  (f:fs) <*> xs = map f xs ++ (fs <*> xs)
This is equivalent to the standard Applicative instance for [].
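Building on that Cartesian-product behaviour, the sorted, duplicate-free result the question asks for could be sketched as follows. The names applySorted and dedupSorted are made up for this example, and sorting first and then dropping adjacent duplicates is just one reasonable reading of "ascending order without repeated values".

```haskell
import Data.List (sort)

-- Hypothetical helper: drop adjacent duplicates from an already sorted list.
dedupSorted :: Eq a => [a] -> [a]
dedupSorted (x : y : rest)
  | x == y    = dedupSorted (y : rest)
  | otherwise = x : dedupSorted (y : rest)
dedupSorted xs = xs

-- Apply every function to every element using the standard list (<*>),
-- then sort the results and remove repeats.
applySorted :: Ord u => [v -> u] -> [v] -> [u]
applySorted fs xs = dedupSorted (sort (fs <*> xs))
```

With the inputs from the question, applySorted [(^2),(^3),(^4),(2^)] [10] gives [100,1000,1024,10000].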
The other sensible way of implementing Applicative zips the two lists together by applying functions to elements pointwise. If <*>'s input lists have length N and M, the output list will have length min(N, M). pure creates an infinite list, so once again pure id <*> xs = xs.
instance Applicative [] where
  pure x = let xs = x:xs in xs
  [] <*> _ = []
  _ <*> [] = []
  (f:fs) <*> (x:xs) = f x : (fs <*> xs)
This instance is available in base under the ZipList newtype.
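To see the two behaviours side by side, here is a small sketch using the instances that ship with base. The names cartesian and zipped, and the sample functions and values, are made up for illustration.

```haskell
import Control.Applicative (ZipList (..))

-- Standard list instance: every function is applied to every value,
-- so 2 functions and 3 values give 6 results.
cartesian :: [Int]
cartesian = [(+ 1), (* 2)] <*> [10, 20, 30]

-- ZipList instance: functions and values are paired up pointwise,
-- so the result has min 2 3 = 2 elements.
zipped :: [Int]
zipped = getZipList (ZipList [(+ 1), (* 2)] <*> ZipList [10, 20, 30])
```

Here cartesian is [11,21,31,20,40,60] and zipped is [11,40].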

Haskell / Miranda: Find the type of the function

Brief: This is a past exam question from a Miranda exam but the syntax is very similar to Haskell.
Question: What is the type of the following expression and what does it do? (The definitions
of the functions length and swap are given below).
(foldr (+) 0) . (foldr ((:) . length . (swap (:) [] )) [])
length [] = 0
length (x:xs) = 1 + length xs
swap f x y = f y x
Note:
Please feel free to reply in Haskell syntax. Sorry about using stars for polytypes, but I didn't want to translate them incorrectly into Haskell. Basically, if two variables both have type *, they can be any type but must be the same type. If one has type **, it can be, but does not need to be, the same type as *. I think these correspond to a, b, c, etc. in Haskell usage.
My working so far
From the definition of length you can see that it finds the length of a list of anything so this gives
length :: [*] -> num.
From the definition I think swap takes in a function and two parameters and produces the function with the two parameters swapped over, so this gives
swap :: (* -> ** -> ***) -> ** -> [*] -> ***
foldr takes a binary function (like plus) a starting value and list and folds the list from right to left using that function. This gives
foldr :: (* -> ** -> **) -> ** -> [*] -> **
I know that function composition is right-associative, so everything to the right of the first dot (.) needs to produce a list, because it will be given as an argument to the first foldr.
The foldr function outputs a single value ( the result of folding up the list) so I know that the return type is going to be some sort of polytype and not a list of polytype.
My problem
I'm unsure where to go from here really. I can see that swap needs to take in another argument, so does this partial application imply that the whole thing is a function? I'm quite confused!
You've already got the answer, I'll just write down the derivation step by step so it's easy to see all at once:
xxf xs = foldr (+) 0 . foldr ((:) . length . flip (:) []) [] $ xs
       = sum $ foldr ((:) . length . (: [])) [] xs
       = sum $ foldr (\x -> (:) (length [x])) [] xs
       = sum $ foldr (\x r -> length [x]:r) [] xs
       = sum $ map (\x -> length [x] ) xs
       = sum [length [x] | x <- xs]
       = sum [ 1 | x <- xs]
       -- = length xs
xxf :: (Num n) => [a] -> n
So that, in Miranda, xxf xs = #xs. I guess its type is :: [*] -> num in Miranda syntax.
Haskell's length is :: [a] -> Int, but as defined here, it is :: (Num n) => [a] -> n because it uses Num's (+) and two literals, 0 and 1.
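To check the derivation mechanically, the exam expression can be transcribed into Haskell as the sketch below. The name lengthNum is made up here to avoid clashing with Prelude's length; swap is kept from the question and xxf from the derivation above.

```haskell
-- The exam's length, renamed; it works for any Num result type
-- because it only uses (+) and numeric literals.
lengthNum :: Num n => [a] -> n
lengthNum []       = 0
lengthNum (_ : xs) = 1 + lengthNum xs

-- The exam's swap, which behaves like Haskell's flip.
swap :: (a -> b -> c) -> b -> a -> c
swap f x y = f y x

-- The expression from the question, transcribed unchanged apart from
-- the renamed length.
xxf :: Num n => [a] -> n
xxf = foldr (+) 0 . foldr ((:) . lengthNum . swap (:) []) []
```

Loading this in ghci and comparing xxf with length on a few lists is a quick way to confirm that the two agree.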
If you're having trouble visualizing foldr, it is simply
foldr (+) 0 (a:(b:(c:(d:(e:(...:(z:[])...))))))
  = a+(b+(c+(d+(e+(...+(z+ 0)...)))))
  = sum [a, b, c, d, e, ..., z]
Let's go through this step-by-step.
The length function obviously has the type that you described; in Haskell it's Num n => [a] -> n. The equivalent Haskell function is length (It uses Int instead of any Num n).
The swap function takes a function to invoke and reverses its first two arguments. You didn't get the signature quite right; it's (a -> b -> c) -> b -> a -> c. The equivalent Haskell function is flip.
The foldr function has the type that you described; namely (a -> b -> b) -> b -> [a] -> b. The equivalent Haskell function is foldr.
Now, let's see what each sub expression in the main expression means.
The expression swap (:) [] takes the (:) function and swaps its arguments. The (:) function has type a -> [a] -> [a], so swapping it yields [a] -> a -> [a]; the whole expression thus has type a -> [a] because the swapped function is applied to []. What the resulting function does is that it constructs a list of one item given that item.
For simplicity, let's extract that part into a function:
singleton :: a -> [a]
singleton = swap (:) []
Now, the next expression is (:) . length . singleton. The (:) function still has type a -> [a] -> [a]; what the (.) function does is compose functions, so if you have a function foo :: a -> ... and a function bar :: b -> a, foo . bar will have type b -> .... The expression (:) . length thus has type Num n => [a] -> [n] -> [n] (remember that length returns a Num), and the expression (:) . length . singleton has type Num n => a -> [n] -> [n]. What the resulting expression does is kind of strange: given any value of type a and some list, it will ignore the a and prepend the number 1 to that list.
For simplicity, let's make a function out of that:
constPrependOne :: Num n => a -> [n] -> [n]
constPrependOne = (:) . length . singleton
You should already be familiar with foldr. It performs a right-fold over a list using a function. In this situation, it calls constPrependOne on each element, so the expression foldr constPrependOne [] just constructs a list of ones with equal length to the input list. So let's make a function out of that:
listOfOnesWithSameLength :: Num n => [a] -> [n]
listOfOnesWithSameLength = foldr constPrependOne []
If you have a list [2, 4, 7, 2, 5], you'll get [1, 1, 1, 1, 1] when applying listOfOnesWithSameLength.
Then, the foldr (+) 0 function is another right-fold. It is equivalent to the sum function in Haskell; it sums the elements of a list.
So, let's make a function:
sum :: Num n => [n] -> n
sum = foldr (+) 0
If you now compose the functions:
func = sum . listOfOnesWithSameLength
... you get the resulting expression. Given some list, it creates a list of equal length consisting of only ones, and then sums the elements of that list. It does in other words behave exactly like length, only using a much slower algorithm. So, the final function is:
inefficientLength :: Num n => [a] -> n
inefficientLength = sum . listOfOnesWithSameLength
