Combining foldl and foldr - haskell

I've figured out for myself that foldl (or foldl') is the best approach when you want to summarise a list into one result (e.g. sum), and foldr is the best approach when you want to produce another (perhaps even infinite) list (e.g. filter).
So I was considering processing that combines the two, and wrote the function sum_f. sum_f is fairly simple: it adds up the elements of a list, but whenever it finds an element x such that f x is true, it emits the current sum (including x) as an element of the output list and starts summing again from zero.
The code is here:
sum_f :: (Num a) => (a -> Bool) -> [a] -> [a]
sum_f f =
    let
        sum_f_worker s (x:xs) =
            let
                rec_call z = sum_f_worker z xs
                next_sum = s + x
            in
                next_sum `seq` if (f x) then next_sum : (rec_call 0) else rec_call next_sum
        sum_f_worker _ [] = []
    in
        sum_f_worker 0
Now, for example, let's sum all the positive integers, grouped so that each group ends at a power of two. This should output the following:
[1, 2, 3+4, 5+6+7+8, 9+10+11+12+13+14+15+16, ...]
i.e.
[1, 2, 7, 26, 100, ...]
We can do this like the following:
import Data.Bits
main =
    let
        power_of_two x = (x .&. (x - 1)) == 0 -- .&. is bitwise and
    in
        print $ take 25 $ sum_f power_of_two [(1::Integer)..]
Now this above function (I believe) runs in constant space (like foldl'), even though the groups grow exponentially. Also, it works on infinite lists (like foldr).
I was wondering whether I could write the above using Prelude functions without explicit recursion (i.e. using only the recursion inside Prelude functions). Or does combining the ideas of foldl and foldr mean that the recursion can't be done with standard Prelude functions and needs to be explicit?

What you want can be expressed using only a right fold as follows:
{-# LANGUAGE BangPatterns #-}
sum_f :: (Num a) => (a -> Bool) -> [a] -> [a]
sum_f p xs = foldr g (const []) xs 0
    where
    g x f !a = if p x then x+a:f 0 else f (x+a)
Prelude Data.Bits> sum_f (\x -> x .&. pred x == 0) [1..10]
[1,2,7,26]
And it works on infinite lists:
Prelude Data.Bits> take 10 . sum_f (\x -> x .&. pred x == 0) $ [1..]
[1,2,7,26,100,392,1552,6176,24640,98432]
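The trick in the answer above is that foldr builds a function which is then applied to the starting accumulator, so the accumulator flows left to right. A minimal sketch of the same idea for a plain strict sum (sumViaFoldr is a hypothetical name used only for illustration):

```haskell
{-# LANGUAGE BangPatterns #-}

-- A left-to-right sum with a strict accumulator, written with foldr.
-- Each step receives the element x, the continuation k for the rest
-- of the list, and the running accumulator a.
sumViaFoldr :: Num a => [a] -> a
sumViaFoldr xs = foldr step id xs 0
  where
    step x k !a = k (a + x)
```

sum_f differs only in that its step function may emit a list cell before calling the continuation, which is what lets it work on infinite input.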

Related

How does the bind operator get invoked multiple times

First I would like to apologize if I am not asking the correct question. I realize that there are some fundamental concepts that I don't quite understand about Monads and the bind operator which is making it difficult to formulate my question. I am having a hard time wrapping my head around how the following code is creating a list of tuples.
ordPairs :: Ord a => [a] -> [(a, a)]
ordPairs xs =
    xs >>= \x1 ->
    xs >>= \x2 ->
    if x1 < x2 then [(x1, x2)] else []
main = print $ ordPairs [1, 2, 4, 6, 7, 8, 3, 4, 5, 6, 2, 9, 7, 8, 45, 4]
I understand the type declaration states that it returns a list of tuples [(a, a)]. What I can't figure out is how is this code "looping" through each item in the list? Looking at this as a beginner it looks as if it only passes the first and second item forward x1 and x2 and then ends with the if then else expression. Is this code being desugared into multiple iterations and building the list under the hood? I guess what I am asking is how is this code iterating through each item in the list and building a list of tuples at the end?
It might help to understand "where the parentheses are". The right-hand side of the ordPairs definitions is parsed like this:
xs >>= (\x1 -> xs >>= (\x2 -> if x1 < x2 then [(x1, x2)] else []))
As you can see here, the if-then-else expression does not stand alone, it's actually the body of an anonymous function:
\x2 -> if x1 < x2 then [(x1, x2)] else []
which can, obviously, be invoked multiple times for different values of x2. What invokes it? Well, the second >>= operator, of course. The "outer" loop works similarly, with the first >>= operator invoking another anonymous function multiple times:
\x1 -> xs >>= (\x2 -> ...)
For this example, you could replace the >>= operator with your own custom bind function. Note that it's just a plain function. There's no special desugaring or secret iterating going on. The function itself does the iterating using recursion:
bind :: [a] -> (a -> [b]) -> [b]
bind (x:xs) f = f x ++ bind xs f
bind [] _ = []
You could also write bind like this, if you prefer:
bind xs f = concatMap f xs
-- or even `bind = flip concatMap`, as per @WillemVanOnsem's comment
Or as a list comprehension.
bind xs f = [y | x <- xs, y <- f x]
This last one is the actual definition of the >>= operator for the list monad. See GHC/Base.hs.
With any of these definitions, the following will work just like your original:
bind :: [a] -> (a -> [b]) -> [b]
bind (x:xs) f = f x ++ bind xs f
bind [] _ = []
ordPairs :: Ord a => [a] -> [(a, a)]
ordPairs xs =
    xs `bind` \x1 ->
    xs `bind` \x2 ->
    if x1 < x2 then [(x1, x2)] else []
main = print $ ordPairs [1, 2, 4, 6, 7, 8, 3, 4, 5, 6, 2, 9, 7, 8, 45, 4]
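For comparison, here is a sketch of the same function in do-notation and as a list comprehension. For lists, do-notation is shorthand for the chain of >>= calls in the original. The names ordPairsDo and ordPairsComp are made up for this illustration:

```haskell
-- do-notation: each `<-` line corresponds to one `>>=` with a lambda.
ordPairsDo :: Ord a => [a] -> [(a, a)]
ordPairsDo xs = do
  x1 <- xs
  x2 <- xs
  if x1 < x2 then [(x1, x2)] else []

-- List comprehension: the guard plays the role of the if-then-else.
ordPairsComp :: Ord a => [a] -> [(a, a)]
ordPairsComp xs = [(x1, x2) | x1 <- xs, x2 <- xs, x1 < x2]
```

For the input [1, 2, 3] both produce [(1,2),(1,3),(2,3)]: the outer "loop" fixes x1 and the inner one runs through every x2.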

Is there any terminating fold in Haskell?

I need some kind of fold which can terminate if I already have the data I want.
For example I need to find first 3 numbers which are greater than 5. I decided to use Either for termination and my code looks like this:
terminatingFold :: ([b] -> a -> Either [b] [b]) -> [a] -> [b]
terminatingFold f l = reverse $ either id id $ fold [] l
  where fold acc [] = Right acc
        fold acc (x:xs) = f acc x >>= flip fold xs
first3NumsGreater5 acc x =
  if length acc >= 3
    then Left acc
    else Right (if x > 5 then (x : acc) else acc)
Are there some more clever/generic approaches?
The result of your function is a list, and it would be desirable if it were produced lazily, that is, extracting one item from the result should only require evaluating the input list up until the item is found there.
Unfolds are under-appreciated for these kinds of tasks. Instead of focusing on "consuming" the input list, let's think of it as a seed from which (paired with some internal accumulator) we can produce the result, element by element.
Let's define a Seed type that contains a generic accumulator paired with the as-yet unconsumed parts of the input:
{-# LANGUAGE NamedFieldPuns #-}
import Data.List (unfoldr)
data Seed acc input = Seed {acc :: acc, pending :: [input]}
Now let's reformulate first3NumsGreater5 as a function that either produces the next output element from the Seed, or signals that there aren't any more elements:
type Counter = Int
first3NumsGreater5 :: Seed Counter Int -> Maybe (Int, Seed Counter Int)
first3NumsGreater5 (Seed {acc, pending})
  | acc >= 3 =
    Nothing
  | otherwise =
    case dropWhile (<= 5) pending of
      [] -> Nothing
      x : xs -> Just (x, Seed {acc = succ acc, pending = xs})
Now our main function can be written in terms of unfoldr:
unfoldFromList ::
  (Seed acc input -> Maybe (output, Seed acc input)) ->
  acc ->
  [input] ->
  [output]
unfoldFromList next acc pending = unfoldr next (Seed {acc, pending})
Putting it to work:
main :: IO ()
main = print $ unfoldFromList first3NumsGreater5 0 [0, 6, 2, 7, 9, 10, 11]
-- [6,7,9]
Normally an early termination-capable fold is foldr with the combining function which is non-strict in its second argument. But, its information flow is right-to-left (if any), while you want it left-to-right.
A possible solution is to make foldr function as a left fold, which can then be made to stop early:
foldlWhile :: Foldable t
           => (a -> Bool) -> (r -> a -> r) -> r
           -> t a -> r
foldlWhile t f a xs = foldr cons (\acc -> acc) xs a
  where
    cons x r acc | t x = r (f acc x)
                 | otherwise = acc
You will need to tweak this for t to test the acc instead of x, to fit your purposes.
This function is foldlWhile from https://wiki.haskell.org/Foldl_as_foldr_alternative, re-written a little. foldl'Breaking from there might fit the bill a bit better.
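A sketch of the tweak mentioned above, where the stopping test looks at the accumulator rather than at the current element. The name foldlUntil and its argument order are assumptions made for this illustration, not something taken from the wiki page:

```haskell
-- Left fold written with foldr that stops as soon as `done` holds for
-- the accumulator. The rest of the input is never inspected after that,
-- so it also terminates on infinite lists.
foldlUntil :: Foldable t => (r -> Bool) -> (r -> a -> r) -> r -> t a -> r
foldlUntil done f a xs = foldr cons id xs a
  where
    cons x r acc
      | done acc  = acc
      | otherwise = r (f acc x)

-- The example from the question: the first three numbers greater than 5.
first3 :: [Int] -> [Int]
first3 = reverse . foldlUntil ((>= 3) . length) step []
  where
    step acc x = if x > 5 then x : acc else acc
```

Note that the result is still only available once the fold has stopped; it is not produced element by element.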
foldr with the lazy reducer function can express corecursion perfectly fine just like unfoldr does.
And your code is already lazy: terminatingFold (\acc x -> Left acc) [1..] => []. That's why I'm not sure if this answer is "more clever", as you've requested.
edit: following a comment by @danidiaz, to make it properly lazy you'd have to code it as e.g.
first3above5 :: (Foldable t, Ord a, Num a)
             => t a -> [a]
first3above5 xs = foldr cons (const []) xs 0
  where
    cons x r i | x > 5 = if i==2 then [x]
                         else x : r (i+1)
               | otherwise = r i
This can be generalized further by abstracting the test and the count.
Of course it's just reimplementing take 3 . filter (> 5), but shows how to do it in general with foldr.
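One way the generalization might look, with the count and the test passed in as arguments. This is a sketch only; firstNWhere is a hypothetical name and the handling of n <= 0 is an assumption:

```haskell
-- Lazily collect the first n elements satisfying p, using foldr with a
-- counter threaded through as an extra argument.
firstNWhere :: Foldable t => Int -> (a -> Bool) -> t a -> [a]
firstNWhere n p xs
  | n <= 0    = []
  | otherwise = foldr cons (const []) xs 0
  where
    cons x r i
      | p x       = if i == n - 1 then [x] else x : r (i + 1)
      | otherwise = r i
```

With these arguments, firstNWhere 3 (> 5) behaves like first3above5, and it still works on infinite input because the continuation r is not called after the last wanted element.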

Haskell generate random numbers with a property without explicit recursion

Here I have a function to generate a stream of random numbers between 0 and 999.
randomHelp :: RandomGen g => g -> [Int]
randomHelp g = zipWith (mod) (map fst $ iterate (next . snd) $ next $ snd $ split g) $ repeat 1000
I would like to select numbers from the stream defined above such that each pair of consecutive selected elements, elem(i) and elem(i + 1), satisfies a property, for example that their gcd is one. All I can think of is a fold, because I can start with an accumulator containing the number 1 (let's assume 1 is the first element I want to show), check the property in the fold's function, and add the element to the accumulator if the property holds. The problem is that the program hangs, because of a stack overflow I think.
Here is the function:
randomFunc :: RandomGen g => g -> [Int]
randomFunc g = foldl (\acc x -> if (gcd x (last acc) == 1) then acc ++ [x] else acc) [1] (randomHelp g)
Note: I don't want to use explicit recursion.
A right fold would probably fit better, something like:
import System.Random (RandomGen, randomRs, mkStdGen)
randomFunc :: RandomGen g => g -> [Int]
randomFunc g = foldr go (const []) (randomRs (1, 20) g) 1
    where go x f lst = if gcd x lst == 1 then x: f x else f lst
then
\> take 20 . randomFunc $ mkStdGen 1
[16,7,6,19,8,15,16,1,9,2,15,17,14,3,11,17,15,8,1,5]
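This way the list is built with : rather than ++ (repeatedly appending to the end of a growing list costs quadratic time overall), the call to last is avoided because the previously kept element is passed along as an argument, and the result is produced lazily, so it also works on the infinite stream.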
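The same fold can be tried without a random generator, which makes the behaviour easy to check by hand. In this sketch coprimeChain is a hypothetical name, and the starting "previous element" of 1 is carried over from the answer above:

```haskell
-- Keep an element only if it is coprime to the previously kept one.
-- `lst` is the last kept element; it starts at 1, which is coprime to
-- everything, so the first input element is always kept.
coprimeChain :: [Int] -> [Int]
coprimeChain xs = foldr go (const []) xs 1
  where
    go x f lst = if gcd x lst == 1 then x : f x else f lst
```

For [2, 4, 6, 9, 3, 10, 5, 7] this keeps [2, 9, 10, 7], and because the output is built lazily, take 3 (coprimeChain (cycle [2, 4, 9])) terminates.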

Boolean selection of list

Suppose we want those elements of list x for which the corresponding element of list y is strictly positive. Any of the three solutions below work:
let x = [1..4]
let y = [1, -1, 2, -2]
[ snd both | both <- zip (map (> 0) y) x, fst both ]
or
map snd $ filter fst $ zip (map (>0) y) x
or
sel :: [Bool] -> [a] -> [a]
sel [] _ = []
sel (True : xs) (y : ys) = y : sel xs ys
sel (False : xs) (y : ys) = sel xs ys
sel (map (> 0) y) x
however, what prompted this was that in the R language this can be written compactly like this:
x[y > 0]
and given how much shorter that is I was wondering if there is a shorter/better way to do this in Haskell?
I'm not a haskell specialist, but why not use list comprehension?
[i | (i,j) <- zip x y, j > 0 ]
If you are willing to use a language extension, I can offer the alternative
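{-# LANGUAGE ParallelListComp #-}
bfilter :: (b -> Bool) -> [a] -> [b] -> [a]
bfilter cond xs ys = [x | (x, y) <- [(x, y) | x <- xs | y <- ys], cond y]
-- the guard sits outside the parallel branches: a guard written inside the
-- `y <- ys` branch would filter ys before the zip and misalign the pairs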
Nothing in Haskell will be nearly as short as the R version, because in R, it's a language built-in, but in Haskell it isn't. Apparently whoever designed R found there to be good reasons to include such a primitive, but none of the Haskell designers found there to be convincing reasons to include such a construct in the language (and it wouldn't fit in nicely, so I fully endorse that decision - it may fit in well in R, I don't know that language).
zip x y >>= \(a,b) -> filter(const(b>0)) [a]
Or pointlessly using Applicative...
import Control.Applicative
zip x y >>= filter <$> const.(>0).snd <*> (:[]).fst
As Daniel Fischer says, there isn't any special syntax for this.
If you're going to be doing this operation often, it's best to define your own single reusable function, instead of having to assemble the list comprehension or map/filter chain manually every time. (Your sel doesn't pass this test because the caller has to apply the map separately.)
So
selectWhere :: [a] -> (a -> Bool) -> [b] -> [b]
selectWhere ys pred = map snd . filter (pred . fst) . zip ys
-- call it like this: selectWhere y (> 0) x
or whichever clearer definition you prefer. The important thing is that you wrap it up inside a function.
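If the goal is to get close to the shape of R's x[y > 0], one option is a helper that takes the list first and a Boolean mask second, so it reads well infix. This is only a sketch, and selectBy is a hypothetical name:

```haskell
-- Keep the elements of xs whose corresponding mask entry is True.
-- Elements paired with False fail the (x, True) pattern and are skipped.
selectBy :: [a] -> [Bool] -> [a]
selectBy xs mask = [x | (x, True) <- zip xs mask]
```

With the lists from the question, x `selectBy` map (> 0) y gives [1,3].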

Recursively sort non-contiguous list to list of contiguous lists

I've been trying to learn a bit of functional programming (with Haskell & Erlang) lately and I'm always amazed at the succinct solutions people can come up with when they can think recursively and know the tools.
I want a function to convert a list of sorted, unique, non-contiguous integers into a list of contiguous lists, i.e:
[1,2,3,6,7,8,10,11]
to:
[[1,2,3], [6,7,8], [10,11]]
This was the best I could come up with in Haskell (two functions):
make_ranges :: [[Int]] -> [Int] -> [[Int]]
make_ranges ranges [] = ranges
make_ranges [] (x:xs)
    | null xs = [[x]]
    | otherwise = make_ranges [[x]] xs
make_ranges ranges (x:xs)
    | (last (last ranges)) + 1 == x =
        make_ranges ((init ranges) ++ [(last ranges ++ [x])]) xs
    | otherwise = make_ranges (ranges ++ [[x]]) xs
rangify :: [Int] -> [[Int]]
rangify lst = make_ranges [] lst
It might be a bit subjective but I'd be interested to see a better, more elegant, solution to this in either Erlang or Haskell (other functional languages too but I might not understand it.) Otherwise, points for just fixing my crappy beginner's Haskell style!
Most straightforward way in my mind is a foldr:
ranges = foldr step []
  where step x [] = [[x]]
        step x acc@((y:ys):zs) | y == x + 1 = (x:y:ys):zs
                               | otherwise = [x]:acc
Or, more concisely:
ranges = foldr step []
  where step x ((y:ys):zs) | y == x + 1 = (x:y:ys):zs
        step x acc = [x]:acc
But wait, there's more!
abstractRanges f = foldr step []
  where step x ((y:ys):zs) | f x y = (x:y:ys):zs
        step x acc = [x]:acc
ranges = abstractRanges (\x y -> y == x + 1)
powerRanges = abstractRanges (\x y -> y == x*x) -- mighty morphin
By turning the guard function into a parameter, you can group more interesting things than just +1 sequences.
*Main> powerRanges [1,1,1,2,4,16,3,9,81,5,25]
[[1,1,1],[2,4,16],[3,9,81],[5,25]]
The utility of this particular function is questionable...but fun!
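As another illustration of passing the guard as a parameter, here is a sketch that splits a list into non-decreasing runs. abstractRanges is repeated from above so the block stands alone, its type signature is an added assumption, and nonDecreasingRuns is a hypothetical name:

```haskell
-- Group neighbouring elements x, y for as long as `f x y` holds.
abstractRanges :: (a -> a -> Bool) -> [a] -> [[a]]
abstractRanges f = foldr step []
  where
    step x ((y:ys):zs) | f x y = (x:y:ys):zs
    step x acc = [x] : acc

-- A run continues while the next element is not smaller.
nonDecreasingRuns :: Ord a => [a] -> [[a]]
nonDecreasingRuns = abstractRanges (<=)
```

For example, nonDecreasingRuns [1, 2, 2, 1, 3, 0] gives [[1,2,2],[1,3],[0]].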
I can't believe I got the shortest solution. I know this is no code golf, but I think it is still quite readable:
import GHC.Exts
range xs = map (map fst) $ groupWith snd $ zipWith (\a b -> (a, a-b)) xs [0..]
or pointfree
range = map (map snd) . groupWith fst . zipWith (\a b -> (b-a, b)) [0..]
BTW, groupWith snd can be replaced with groupBy (\a b -> snd a == snd b) if you prefer Data.List over GHC.Exts
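Written out with Data.List only, that variant might look like the following sketch (rangeBy is a hypothetical name, and the element type is fixed to Int for simplicity). Subtracting the position from each element gives the same key for every member of a contiguous run:

```haskell
import Data.List (groupBy)

-- Pair each element with (element - index); contiguous runs share a key,
-- so grouping neighbours by that key recovers the runs.
rangeBy :: [Int] -> [[Int]]
rangeBy xs =
  map (map fst)
    . groupBy (\a b -> snd a == snd b)
    $ zipWith (\a b -> (a, a - b)) xs [0 ..]
```

For [1,2,3,6,7,8,10,11] the keys are 1,1,1,3,3,3,4,4, which yields [[1,2,3],[6,7,8],[10,11]]. As in the original, this relies on the input being sorted and free of duplicates.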
[Edit]
BTW: Is there a nicer way to get rid of the lambda (\a b -> (b-a, b)) than (curry $ (,) <$> ((-) <$> snd <*> fst) <*> snd) ?
[Edit 2]
Yeah, I forgot (,) is a functor. So here is the obfuscated version:
range = map (map fst) . groupWith snd . (flip $ zipWith $ curry $ fmap <$> (-).fst <*> id) [0..]
Suggestions are welcome...
import Data.List (groupBy)
ranges xs = (map.map) snd
          . groupBy (const fst)
          . zip (True : zipWith ((==) . succ) xs (tail xs))
          $ xs
As to how to come up with such a thing: I started with the zipWith f xs (tail xs), which is a common idiom when you want to do something on consecutive elements of a list. Likewise is zipping up a list with information about the list, and then acting (groupBy) upon it. The rest is plumbing.
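The zipWith f xs (tail xs) idiom on its own, as a small sketch. The names gaps and continues are made up for this illustration, and drop 1 is used in place of tail so the empty list needs no special thought:

```haskell
-- Differences between neighbouring elements.
gaps :: Num a => [a] -> [a]
gaps xs = zipWith (-) (drop 1 xs) xs

-- For each neighbouring pair: does the second element directly
-- follow the first?
continues :: (Eq a, Enum a) => [a] -> [Bool]
continues xs = zipWith ((==) . succ) xs (drop 1 xs)
```

For [1,2,3,6,7,8,10,11], gaps gives [1,1,3,1,1,2,1] and continues gives [True,True,False,True,True,False,True]; the False entries mark where a new range starts.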
Then, of course, you can feed it through @pl and get:
import Data.List (groupBy)
import Control.Monad (ap)
import Control.Monad.Instances()
ranges = (((map.map) snd)
        . groupBy (const fst))
        .) =<< zip
        . (True:)
        . ((zipWith ((==) . succ)) `ap` tail)
, which, by my authoritative definition, is evil due to Monad ((->) a). Twice, even. The data flow is meandering too much to lay it out in any sensible way. zipaptail is an Aztec god, and Aztec gods aren't to be messed with.
Another version in Erlang:
part(List) -> part(List,[]).
part([H1,H2|T],Acc) when H1 =:= H2 - 1 ->
part([H2|T],[H1|Acc]);
part([H1|T],Acc) ->
[lists:reverse([H1|Acc]) | part(T,[])];
part([],Acc) -> Acc.
k z = map (fst <$>) . groupBy (const snd) .
    zip z . (False:) . (zipWith ((==) . succ) <*> tail) $ z
Try reusing standard functions.
import Data.List (groupBy)
rangeify :: (Eq a, Num a) => [a] -> [[a]]
rangeify l = map (map fst) $ groupBy (const snd) $ zip l contigPoints
    where contigPoints = False : zipWith (==) (map (+1) l) (drop 1 l)
Or, following (mixed) advice to use unfoldr, stop abusing groupBy, and be happy using partial functions when it doesn't matter:
import Control.Arrow ((***))
import Data.List (unfoldr)
spanContig :: (Eq a, Num a) => [a] -> ([a], [a])
spanContig l =
    map fst *** map fst $ span (\(a, b) -> a == b + 1) $ zip l (head l - 1 : l)
rangeify :: (Eq a, Num a) => [a] -> [[a]]
rangeify = unfoldr $ \l -> if null l then Nothing else Just $ spanContig l
Erlang using foldr:
ranges(List) ->
    lists:foldr(fun (X, [[Y | Ys] | Acc]) when Y == X + 1 ->
                        [[X, Y | Ys] | Acc];
                    (X, Acc) ->
                        [[X] | Acc]
                end, [], List).
This is my v0.1 and I can probably make it better:
makeCont :: [Int] -> [[Int]]
makeCont [] = []
makeCont [a] = [[a]]
makeCont (a:b:xs) = if b - a == 1
                        then (a : head next) : tail next
                        else [a] : next
    where
        next :: [[Int]]
        next = makeCont (b:xs)
And I will try and make it better. Edits coming I think.
As a comparison, here's an implementation in Erlang:
partition(L) -> [lists:reverse(T) || T <- lists:reverse(partition(L, {[], []}))].
partition([E|L], {R, [EL|_] = T}) when E == EL + 1 -> partition(L, {R, [E|T]});
partition([E|L], {R, []}) -> partition(L, {R, [E]});
partition([E|L], {R, T}) -> partition(L, {[T|R], [E]});
partition([], {R, []}) -> R;
partition([], {R, T}) -> [T|R].
The standard paramorphism recursion scheme isn't in Haskell's Data.List module, though I think it should be. Here's a solution using a paramorphism. Because you are building a list of lists from a list, the cons-ing is a little tricksy:
contig :: (Eq a, Num a) => [a] -> [[a]]
contig = para phi [] where
phi x ((y:_),(a:acc)) | x + 1 == y = (x:a):acc
phi x (_, acc) = [x]:acc
Paramorphism is general recursion or a fold with lookahead:
para :: (a -> ([a], b) -> b) -> b -> [a] -> b
para phi b [] = b
para phi b (x:xs) = phi x (xs, para phi b xs)
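To see what the "lookahead" buys you, here is a sketch that uses the same para to list every suffix of a list, something a plain foldr step cannot do directly because it never sees the unprocessed tail. para is repeated so the block stands alone; suffixes is a hypothetical name:

```haskell
para :: (a -> ([a], b) -> b) -> b -> [a] -> b
para _   b []     = b
para phi b (x:xs) = phi x (xs, para phi b xs)

-- At each step, xs is the remaining input and acc the result for it.
suffixes :: [a] -> [[a]]
suffixes = para (\x (xs, acc) -> (x : xs) : acc) [[]]
```

suffixes [1,2,3] evaluates to [[1,2,3],[2,3],[3],[]]. In contig above, the tail is only inspected for its first element y.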
It can be pretty clear and simple in Erlang:
partition([]) -> [];
partition([A|T]) -> partition(T, [A]).
partition([A|T], [B|_]=R) when A =:= B+1 -> partition(T, [A|R]);
partition(L, P) -> [lists:reverse(P)|partition(L)].
Edit: Out of curiosity I compared my version with Lukas's. Mine seems about 10% faster, both in native code and in bytecode, on a test set generated by lists:usort([random:uniform(1000000)||_<-lists:seq(1,1000000)]) on the 64-bit R14B01 release on my notebook. (The test set is 669462 elements long and was partitioned into 232451 sublists.)
Edit2: Another test set, lists:usort([random:uniform(1000000)||_<-lists:seq(1,10000000)]), with length 999963 and 38 partitions, makes a bigger difference in native code: my version finishes in less than half the time. The bytecode version is only about 20% faster.
Edit3: Some micro-optimizations that give additional performance but lead to uglier and less maintainable code:
part4([]) -> [];
part4([A|T]) -> part4(T, A, []).
part4([A|T], B, R) when A =:= B+1 -> part4(T, A, [B|R]);
part4([A|T], B, []) -> [[B]|part4(T, A, [])];
part4([A|T], B, R) -> [lists:reverse(R, [B])|part4(T, A, [])];
part4([], B, R) -> [lists:reverse(R,[B])].
Here's an attempt from a haskell noob
ranges ls = let (a, r) = foldl (\(r, a@(h:t)) e -> if h + 1 == e then (r, e:a) else (a:r, [e])) ([], [head ls]) (tail ls)
            in reverse . map reverse $ r : a
