Data Parallel Haskell Prefix Sum - haskell

I'm playing with some Data Parallel Haskell code and found myself in need of a prefix sum. However, I didn't see any basic prefix sum operator in the dph package.
I rolled my own but, since I'm new to dph, I'm not sure whether it takes proper advantage of parallelization:
{-# LANGUAGE ParallelArrays #-}
{-# OPTIONS_GHC -fvectorise #-}
module PrefixSum ( scanP ) where
import Data.Array.Parallel (lengthP, indexedP, mapP, zipWithP, concatP, filterP, singletonP, sliceP, (+:+), (!:))
import Data.Array.Parallel.Prelude.Int ((<=), (-), (==), Int, mod)
-- hide prelude
import qualified Prelude
-- assuming zipWithP (a -> b -> c) given
-- [:a:] of length n and
-- [:b:] of length m, n /= m
-- will return
-- [:c:] of length min n m
scanP :: (a -> a -> a) -> [:a:] -> [:a:]
scanP f xs = if lengthP xs <= 1
             then xs
             else head +:+ tail
  where -- [: x_0, x_2, ..., x_2n :]
        evens = mapP snd . filterP (even . fst) $ indexedP xs
        -- [: x_1, x_3 ... :]
        odds = mapP snd . filterP (odd . fst) $ indexedP xs
        lenEvens = lengthP evens
        lenOdds = lengthP odds
        -- calculate the prefix sums [:w:] of the pair sums [:z:]
        psums = scanP f $ zipWithP f evens odds
        -- calculate the total prefix sums as
        -- [: x_0, w_0, f w_0 x_2, w_1, f w_1 x_4, ...,
        head = singletonP (evens !: 0)
        body = concatP . zipWithP (\p e -> [: p, f p e :]) psums $ sliceP 1 lenOdds evens
        -- ending at either
        -- ... w_{n-1}, f w_{n-1} x_2n :]
        -- or
        -- ... w_{n-1}, f w_{n-1} x_2n, w_n :]
        -- depending on whether the length of [:x:] is 2n+1 or 2n+2
        tail = if lenEvens == lenOdds then body +:+ singletonP (psums !: (lenEvens - 1)) else body
-- reimplement some of Prelude so it can be vectorised
f $ x = f x
infixr 0 $
(.) f g y = f (g y)
snd (a,b) = b
fst (a,b) = a
even n = n `mod` 2 == 0
odd n = n `mod` 2 == 1

Parallel prefix scans are supported; in fact, they're rather fundamental. So just pass (+) as your associative operator.
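As a sequential reference for checking a hand-rolled scan, here is a minimal sketch on ordinary lists rather than parallel arrays. The names scanPairs, splitAlt and interleave are hypothetical and introduced only for illustration. scanPairs mirrors the even/odd pairing scheme from the question and should agree with scanl1 whenever the operator is associative.

```haskell
-- List model of the pairwise prefix-scan scheme; this is not DPH code.

-- Split a list into its even-indexed and odd-indexed elements.
splitAlt :: [a] -> ([a], [a])
splitAlt (a : b : rest) = let (es, os) = splitAlt rest in (a : es, b : os)
splitAlt rest           = (rest, [])

-- Alternate elements of two lists, starting with the first list.
interleave :: [a] -> [a] -> [a]
interleave (p : ps) (r : rs) = p : r : interleave ps rs
interleave ps       _        = ps

-- Inclusive prefix scan: scan the pair sums recursively, then fill in
-- the even positions from the scanned pairs.
scanPairs :: (a -> a -> a) -> [a] -> [a]
scanPairs _ []  = []
scanPairs _ [x] = [x]
scanPairs f xs  = case splitAlt xs of
  (e0 : es, odds) ->
    let psums = scanPairs f (zipWith f (e0 : es) odds) -- w_i = x_0 .. x_{2i+1}
        rest  = zipWith f psums es                     -- f w_i x_{2i+2}
    in  e0 : interleave psums rest
  ([], _) -> []
```

Comparing scanPairs (+) against scanl1 (+) on a range of input lengths is a cheap way to catch off-by-one mistakes in the odd/even bookkeeping before porting the logic to parallel arrays.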

Related

Haskell generate random numbers with a property without explicit recursion

Here I have a function to generate a stream of random numbers between 0 and 999.
randomHelp :: RandomGen g => g -> [Int]
randomHelp g = zipWith (mod) (map fst $ iterate (next . snd) $ next $ snd $ split g) $ repeat 1000
I would like to select numbers from the stream defined above such that each pair elem(i) and elem(i + 1) satisfies a property, for example that their gcd is one. All I can think of is a fold, because I can start with an accumulator containing the number 1 (let's assume 1 is the first element I want to show), then check the property in the fold's function and, if it holds, add the element to the accumulator. The problem is that the program hangs, because of a stack overflow I think.
Here is the function:
randomFunc :: RandomGen g => g -> [Int]
randomFunc g = foldl (\acc x -> if (gcd x (last acc) == 1) then acc ++ [x] else acc) [1] (randomHelp g)
Note: I don't want to use explicit recursion.
A right fold would probably fit better, something like:
import System.Random (RandomGen, randomRs, mkStdGen)
randomFunc :: RandomGen g => g -> [Int]
randomFunc g = foldr go (const []) (randomRs (1, 20) g) 1
  where go x f lst = if gcd x lst == 1 then x: f x else f lst
then
> take 20 . randomFunc $ mkStdGen 1
[16,7,6,19,8,15,16,1,9,2,15,17,14,3,11,17,15,8,1,5]
This way you build the list with (:) instead of (++), which can cost quadratic time, and you avoid the call to last.
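The same right fold can be tried without a random generator. The following is a minimal sketch that takes the input stream as a plain list; coprimeChain is a hypothetical name, and the starting value 1 is the assumption from the question. Because foldr produces each kept element before looking at the rest of the input, it also works on an infinite list, whereas foldl has to consume the whole list before returning anything.

```haskell
-- Keep an element only if it is coprime to the previously kept one.
-- The fold builds a function from "last kept value" to the output list.
coprimeChain :: [Int] -> [Int]
coprimeChain xs = foldr go (const []) xs 1
  where
    go x k prev
      | gcd x prev == 1 = x : k x   -- keep x; it becomes the new "previous"
      | otherwise       = k prev    -- skip x; the previous value is unchanged
```

For example, coprimeChain [2,4,3,9,5,10,7] keeps 2, 3, 5 and 7, and take 3 (coprimeChain (cycle [2,3])) terminates even though the input is infinite.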

Elegant implementation of n-dimensional matrix multiplication using lists?

List functions allow us to implement arbitrarily-dimensional vector math quite elegantly. For example:
import Control.Monad (join)
on = (.) . (.)
add = zipWith (+)
sub = zipWith (-)
mul = zipWith (*)
dist = len `on` sub
dot = sum `on` mul
len = sqrt . join dot
And so on.
main = print $ add [1,2,3] [1,1,1] -- [2,3,4]
main = print $ len [1,1,1] -- 1.7320508075688772
main = print $ dot [2,0,0] [2,0,0] -- 4
Of course, this is not the most efficient solution, but it is insightful to look at, as one can say that map, zipWith and the like generalize those vector operations. There is one function I couldn't implement elegantly, though: the cross product. Since a possible n-dimensional generalization of the cross product is the n-dimensional matrix determinant, how can I implement matrix multiplication elegantly?
Edit: yes, I asked a completely unrelated question to the problem I set up. Fml.
It just so happens I have some code lying around for doing n-dimensional matrix operations which I thought was quite cute when I wrote it at least:
{-# LANGUAGE NoMonomorphismRestriction #-}
module MultiArray where
import Control.Arrow
import Control.Monad
import Data.Ix
import Data.Maybe
import Data.Array (Array)
import qualified Data.Array as A
-- {{{ from Dmwit.hs
deleteAt n xs = take n xs ++ drop (n + 1) xs
insertAt n x xs = take n xs ++ x : drop n xs
doublify f g xs ys = f (uncurry g) (zip xs ys)
any2 = doublify any
all2 = doublify all
-- }}}
-- makes the most sense when ls and hs have the same length
instance Ix a => Ix [a] where
    range = sequence . map range . uncurry zip
    inRange = all2 inRange . uncurry zip
    rangeSize = product . uncurry (zipWith (curry rangeSize))
    index (ls, hs) xs = fst . foldr step (0, 1) $ zip indices sizes where
        indices = zipWith index (zip ls hs) xs
        sizes = map rangeSize $ zip ls hs
        step (i, b) (s, p) = (s + p * i, p * b)
fold :: (Enum i, Ix i) => ([a] -> b) -> Int -> Array [i] a -> Array [i] b
fold f n a = A.array newBound assocs where
    (oldLowBound, oldHighBound) = A.bounds a
    (newLowBoundBeg , dimLow : newLowBoundEnd ) = splitAt n oldLowBound
    (newHighBoundBeg, dimHigh: newHighBoundEnd) = splitAt n oldHighBound
    assocs = [(beg ++ end, f [a A.! (beg ++ i : end) | i <- [dimLow..dimHigh]])
             | beg <- range (newLowBoundBeg, newHighBoundBeg)
             , end <- range (newLowBoundEnd, newHighBoundEnd)
             ]
    newBound = (newLowBoundBeg ++ newLowBoundEnd, newHighBoundBeg ++ newHighBoundEnd)
flatten a = check a >> return value where
    check = guard . (1==) . length . fst . A.bounds
    value = A.ixmap ((head *** head) . A.bounds $ a) return a
elementWise :: (MonadPlus m, Ix i) => (a -> b -> c) -> Array i a -> Array i b -> m (Array i c)
elementWise f a b = check >> return value where
    check = guard $ A.bounds a == A.bounds b
    value = A.listArray (A.bounds a) (zipWith f (A.elems a) (A.elems b))
unsafeFlatten a = fromJust $ flatten a
unsafeElementWise f a b = fromJust $ elementWise f a b
matrixMult a b = fold sum 1 $ unsafeElementWise (*) a' b' where
    aBounds = (join (***) (!!0)) $ A.bounds a
    bBounds = (join (***) (!!1)) $ A.bounds b
    a' = copy 2 bBounds a
    b' = copy 0 aBounds b
bijection f g a = A.ixmap ((f *** f) . A.bounds $ a) g a
unFlatten = bijection return head
matrixTranspose = bijection reverse reverse
copy n (low, high) a = A.ixmap (newBounds a) (deleteAt n) a where
    newBounds = (insertAt n low *** insertAt n high) . A.bounds
The cute bit here is matrixMult, which is one of the only operations that is specialized to two-dimensional arrays. It expands its first argument along one dimension (by putting a copy of the two-dimensional object into each slice of the three-dimensional object); expands its second along another; does pointwise multiplication (now in a three-dimensional array); then collapses the fabricated third dimension by summing. Quite nice.
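For comparison, the plain two-dimensional case can be written with lists only, in the spirit of the zipWith helpers from the question. This is a minimal sketch, not part of the array code above: mmul and dot are names chosen here for illustration, matrices are assumed to be lists of rows, and the inner dimensions are assumed to match (zipWith silently truncates if they do not).

```haskell
import Data.List (transpose)

-- Dot product of two vectors given as lists.
dot :: Num a => [a] -> [a] -> a
dot xs ys = sum (zipWith (*) xs ys)

-- Matrix product: entry (i, j) is row i of a dotted with column j of b.
mmul :: Num a => [[a]] -> [[a]] -> [[a]]
mmul a b = [[dot row col | col <- transpose b] | row <- a]
```

With this definition, mmul [[1,2],[3,4]] [[5,6],[7,8]] evaluates to [[19,22],[43,50]].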

Higher-order function returning a result and a modified version of itself

My goal is to create a function which takes an argument, computes a result, and returns it in a tuple together with a modified version of itself.
My first try looked like this:
f x = (x,f') where
  f' y = (y+1,f')
cl num func = let (nu,fu) = func num in nu:fu num
The result I want from calling cl with 0 and f is
[0,1,2,3,4,5,6,7,8,9,10,11,12,13 ... infinity]
Unfortunately, Haskell cannot construct the infinite type. It is hard for me to devise another way of doing it. Maybe I'm just looking at the problem from the wrong side, which is why I posted this question.
EDIT:
This is the state of my functions:
newtype InFun = InFun { innf :: Int -> (Int,InFun) }
efunc x = (x,InFun deep) where
  deep y = (y+1, InFun deep)
crli n (InFun f) = let (n',f') = f n in n':crli n f'
main = putStrLn $ show (take 10 (crli 0 (InFun efunc)))
The result is [0,1,1,1,1,1,1,1,1,1]. That's better, but I want the modification made by the deep function to be recursive.
Probably you are looking for
{-# LANGUAGE RankNTypes #-}
newtype F = F { f :: Int -> (Int, F) }
g y = (y + 1, F g)
then
*Main> fst $ (f $ snd $ g 3) 4
5
or
*Main> map fst $ take 10 $ iterate (\(x, F h) -> h x) (g 0)
[1,2,3,4,5,6,7,8,9,10]
or more complex modification (currying)
h = g False
  where g x y = (y', F g')
          where y' = if x then y + 1
                          else 2 * y
                g' = if x then g False
                          else g True
then
*Main> map fst $ take 10 $ iterate (\(x, F h) -> h x) (h 0)
[0,1,2,3,6,7,14,15,30,31]
You can use iterate:
iterate (+1) 0
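The newtype from the question's edit can also produce the desired list directly if each result is fed back in as the next argument. A minimal sketch under that assumption follows; runInFun, counter and unroll are hypothetical names. The only change from crli in the question is that the recursive call passes n' rather than n.

```haskell
-- A function that returns a result together with its own replacement.
newtype InFun = InFun { runInFun :: Int -> (Int, InFun) }

-- The first call returns its argument unchanged; every later call adds one.
counter :: InFun
counter = InFun first
  where
    first x = (x, InFun step)
    step y  = (y + 1, InFun step)

-- Repeatedly apply the current function, threading each result into the
-- next call and switching to the function returned alongside it.
unroll :: Int -> InFun -> [Int]
unroll n (InFun f) = let (n', f') = f n in n' : unroll n' f'
```

Here take 10 (unroll 0 counter) gives [0,1,2,3,4,5,6,7,8,9].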

Combining foldl and foldr

I've figured out myself that foldl (or foldl') is the best approach when you want to summarise a list into one result (e.g. sum), and foldr is the best approach when you want to produce another (perhaps even infinite) list (e.g. filter).
So I was considering processing that combines these two, and wrote the function sum_f. sum_f is fairly simple: all it does is add up the elements of a list, but when it finds an element such that f x is true, it emits the current sum as an element of the output list and starts summing again from that point.
The code is here:
sum_f :: (Num a) => (a -> Bool) -> [a] -> [a]
sum_f f =
  let
    sum_f_worker s (x:xs) =
      let
        rec_call z = sum_f_worker z xs
        next_sum = s + x
      in
        next_sum `seq` if (f x) then next_sum : (rec_call 0) else rec_call next_sum
    sum_f_worker _ [] = []
  in
    sum_f_worker 0
Now, for example, let's sum all the positive integers grouped by powers of two. This should output the following:
[1, 2, 3+4, 5+6+7+8, 9+10+11+12+13+14+15+16, ...]
i.e.
[1, 2, 7, 26, 100, ...]
We can do this like the following:
import Data.Bits
main =
  let
    power_of_two x = (x .&. (x - 1)) == 0 -- .&. is bitwise and
  in
    print $ take 25 $ sum_f power_of_two [(1::Integer)..]
Now this above function (I believe) runs in constant space (like foldl'), even though the groups grow exponentially. Also, it works on infinite lists (like foldr).
I was wondering whether I could write the above using prelude functions without explicit recursion (i.e. only the recursion inside prelude functions). Or does combining the ideas of foldl and foldr here mean that the recursion here can't be done with standard prelude functions and needs to be explicit?
What you want can be expressed using only a right fold as follows:
{-# LANGUAGE BangPatterns #-}
sum_f :: (Num a) => (a -> Bool) -> [a] -> [a]
sum_f p xs = foldr g (const []) xs 0
  where
    g x f !a = if p x then x+a:f 0 else f (x+a)
Prelude Data.Bits> sum_f (\x -> x .&. pred x == 0) [1..10]
[1,2,7,26]
And it works on infinite lists:
Prelude Data.Bits> take 10 . sum_f (\x -> x .&. pred x == 0) $ [1..]
[1,2,7,26,100,392,1552,6176,24640,98432]
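The trick in the answer is that the fold's result is itself a function of the running sum, so foldr supplies the laziness while the extra argument plays the role of the foldl accumulator. A smaller instance of the same pattern, as a sketch with a hypothetical name runningTotals, is a running-total function that emits every partial sum:

```haskell
-- Each step receives the element x, the continuation k for the rest of
-- the list, and the accumulator acc carried in from the left.
runningTotals :: Num a => [a] -> [a]
runningTotals xs = foldr step (const []) xs 0
  where
    step x k acc =
      let s = acc + x
      in  s `seq` (s : k s)   -- force the sum, emit it, pass it onward
```

Because each output element is produced before the continuation is entered, take 3 (runningTotals [1..]) returns [1,3,6] without traversing the infinite input.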

Recursively sort non-contiguous list to list of contiguous lists

I've been trying to learn a bit of functional programming (with Haskell & Erlang) lately and I'm always amazed at the succinct solutions people can come up with when they can think recursively and know the tools.
I want a function to convert a list of sorted, unique, non-contiguous integers into a list of contiguous lists, i.e.:
[1,2,3,6,7,8,10,11]
to:
[[1,2,3], [6,7,8], [10,11]]
This was the best I could come up with in Haskell (two functions):
make_ranges :: [[Int]] -> [Int] -> [[Int]]
make_ranges ranges [] = ranges
make_ranges [] (x:xs)
    | null xs = [[x]]
    | otherwise = make_ranges [[x]] xs
make_ranges ranges (x:xs)
    | (last (last ranges)) + 1 == x =
        make_ranges ((init ranges) ++ [(last ranges ++ [x])]) xs
    | otherwise = make_ranges (ranges ++ [[x]]) xs
rangify :: [Int] -> [[Int]]
rangify lst = make_ranges [] lst
It might be a bit subjective but I'd be interested to see a better, more elegant, solution to this in either Erlang or Haskell (other functional languages too but I might not understand it.) Otherwise, points for just fixing my crappy beginner's Haskell style!
Most straightforward way in my mind is a foldr:
ranges = foldr step []
  where step x [] = [[x]]
        step x acc@((y:ys):zs) | y == x + 1 = (x:y:ys):zs
                               | otherwise = [x]:acc
Or, more concisely:
ranges = foldr step []
  where step x ((y:ys):zs) | y == x + 1 = (x:y:ys):zs
        step x acc = [x]:acc
But wait, there's more!
abstractRanges f = foldr step []
  where step x ((y:ys):zs) | f x y = (x:y:ys):zs
        step x acc = [x]:acc
ranges = abstractRanges (\x y -> y == x + 1)
powerRanges = abstractRanges (\x y -> y == x*x) -- mighty morphin
By turning the guard function into a parameter, you can group more interesting things than just +1 sequences.
*Main> powerRanges [1,1,1,2,4,16,3,9,81,5,25]
[[1,1,1],[2,4,16],[3,9,81],[5,25]]
The utility of this particular function is questionable...but fun!
I can't believe I got the shortest solution. I know this is no code golf, but I think it is still quite readable:
import GHC.Exts
range xs = map (map fst) $ groupWith snd $ zipWith (\a b -> (a, a-b)) xs [0..]
or pointfree
range = map (map snd) . groupWith fst . zipWith (\a b -> (b-a, b)) [0..]
BTW, groupWith snd can be replaced with groupBy (\a b -> snd a == snd b) if you prefer Data.List over GHC.Exts
[Edit]
BTW: Is there a nicer way to get rid of the lambda (\a b -> (b-a, b)) than (curry $ (,) <$> ((-) <$> snd <*> fst) <*> snd) ?
[Edit 2]
Yeah, I forgot (,) is a functor. So here is the obfuscated version:
range = map (map fst) . groupWith snd . (flip $ zipWith $ curry $ fmap <$> (-).fst <*> id) [0..]
Suggestions are welcome...
import Data.List (groupBy)
ranges xs = (map.map) snd
          . groupBy (const fst)
          . zip (True : zipWith ((==) . succ) xs (tail xs))
          $ xs
As to how to come up with such a thing: I started with zipWith f xs (tail xs), which is a common idiom when you want to do something on consecutive elements of a list. So is zipping a list with information about itself and then acting (here with groupBy) on the result. The rest is plumbing.
Then, of course, you can feed it through @pl and get:
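To make the intermediate values of that idiom visible, here is a step-by-step sketch with each stage named. The names continues, tagged and rangesStepwise are hypothetical; drop 1 is used in place of tail only so that the empty list needs no special care.

```haskell
import Data.List (groupBy)

-- True where an element continues the run started by its predecessor.
-- The first element is flagged True by convention.
continues :: [Int] -> [Bool]
continues xs = True : zipWith (\a b -> succ a == b) xs (drop 1 xs)

-- Pair every element with its flag.
tagged :: [Int] -> [(Bool, Int)]
tagged xs = zip (continues xs) xs

-- A group keeps growing while the next element's flag is True.
rangesStepwise :: [Int] -> [[Int]]
rangesStepwise = map (map snd) . groupBy (\_ (flag, _) -> flag) . tagged
```

For the input [1,2,4], tagged yields [(True,1),(True,2),(False,4)], and rangesStepwise turns that into [[1,2],[4]].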
import Data.List (groupBy)
import Control.Monad (ap)
import Control.Monad.Instances()
ranges = (((map.map) snd)
        . groupBy (const fst))
        .) =<< zip
        . (True:)
        . ((zipWith ((==) . succ)) `ap` tail)
, which, by my authoritative definition, is evil due to Monad ((->) a). Twice, even. The data flow is meandering too much to lay it out in any sensible way. zipaptail is an Aztec god, and Aztec gods aren't to be messed with.
Another version in Erlang:
part(List) -> part(List,[]).
part([H1,H2|T],Acc) when H1 =:= H2 - 1 ->
part([H2|T],[H1|Acc]);
part([H1|T],Acc) ->
[lists:reverse([H1|Acc]) | part(T,[])];
part([],Acc) -> Acc.
k z = map (fst <$>) . groupBy (const snd) .
      zip z . (False:) . (zipWith ((==) . succ) <*> tail) $ z
Try reusing standard functions.
import Data.List (groupBy)
rangeify :: (Eq a, Num a) => [a] -> [[a]]
rangeify l = map (map fst) $ groupBy (const snd) $ zip l contigPoints
  where contigPoints = False : zipWith (==) (map (+1) l) (drop 1 l)
Or, following (mixed) advice to use unfoldr, stop abusing groupBy, and be happy using partial functions when it doesn't matter:
import Control.Arrow ((***))
import Data.List (unfoldr)
spanContig :: (Eq a, Num a) => [a] -> ([a], [a])
spanContig l =
  map fst *** map fst $ span (\(a, b) -> a == b + 1) $ zip l (head l - 1 : l)
rangeify :: (Eq a, Num a) => [a] -> [[a]]
rangeify = unfoldr $ \l -> if null l then Nothing else Just $ spanContig l
Erlang using foldr:
ranges(List) ->
    lists:foldr(fun (X, [[Y | Ys] | Acc]) when Y == X + 1 ->
                        [[X, Y | Ys] | Acc];
                    (X, Acc) ->
                        [[X] | Acc]
                end, [], List).
This is my v0.1 and I can probably make it better:
makeCont :: [Int] -> [[Int]]
makeCont [] = []
makeCont [a] = [[a]]
makeCont (a:b:xs) = if b - a == 1
                    then (a : head next) : tail next
                    else [a] : next
  where
    next :: [[Int]]
    next = makeCont (b:xs)
And I will try and make it better. Edits coming I think.
As a comparison, here's an implementation in Erlang:
partition(L) -> [lists:reverse(T) || T <- lists:reverse(partition(L, {[], []}))].
partition([E|L], {R, [EL|_] = T}) when E == EL + 1 -> partition(L, {R, [E|T]});
partition([E|L], {R, []}) -> partition(L, {R, [E]});
partition([E|L], {R, T}) -> partition(L, {[T|R], [E]});
partition([], {R, []}) -> R;
partition([], {R, T}) -> [T|R].
The standard paramorphism recursion scheme isn't in Haskell's Data.List module, though I think it should be. Here's a solution using a paramorphism; because you are building a list of lists from a list, the consing is a little tricky:
contig :: (Eq a, Num a) => [a] -> [[a]]
contig = para phi [] where
  phi x ((y:_),(a:acc)) | x + 1 == y = (x:a):acc
  phi x (_, acc) = [x]:acc
A paramorphism is primitive recursion, i.e. a fold with lookahead: at each step the function also sees the rest of the input.
para :: (a -> ([a], b) -> b) -> b -> [a] -> b
para phi b [] = b
para phi b (x:xs) = phi x (xs, para phi b xs)
It can be pretty clear and simple in Erlang:
partition([]) -> [];
partition([A|T]) -> partition(T, [A]).
partition([A|T], [B|_]=R) when A =:= B+1 -> partition(T, [A|R]);
partition(L, P) -> [lists:reverse(P)|partition(L)].
Edit: Just out of curiosity, I compared my version with Lukas's, and mine seems about 10% faster, both in native code and in bytecode, on a test set generated by lists:usort([random:uniform(1000000)||_<-lists:seq(1,1000000)]) on the 64-bit version of R14B01 on my notebook. (The test set is 669462 elements long and was partitioned into 232451 sublists.)
Edit2: Another test set, lists:usort([random:uniform(1000000)||_<-lists:seq(1,10000000)]), of length 999963 with 38 partitions, shows a bigger difference in native code: my version finishes in less than half the time. The bytecode version is only about 20% faster.
Edit3: Some micro-optimizations which give additional performance but lead to uglier and less maintainable code:
part4([]) -> [];
part4([A|T]) -> part4(T, A, []).
part4([A|T], B, R) when A =:= B+1 -> part4(T, A, [B|R]);
part4([A|T], B, []) -> [[B]|part4(T, A, [])];
part4([A|T], B, R) -> [lists:reverse(R, [B])|part4(T, A, [])];
part4([], B, R) -> [lists:reverse(R,[B])].
Here's an attempt from a Haskell noob
ranges ls = let (a, r) = foldl (\(r, a@(h:t)) e -> if h + 1 == e then (r, e:a) else (a:r, [e])) ([], [head ls]) (tail ls)
            in reverse . map reverse $ r : a
