Here is a short Haskell program that generates a 440 Hz sound. It uses pulseaudio as an audio backend.
import GHC.Float
import Control.Arrow
import Sound.Pulse.Simple
import qualified Data.List.Stream as S
import Data.List
type Time = Double
type Frequency = Double
type Sample = Double
type CV = Double
chunksize = 441 * 2
sampleRate :: (Fractional a) => a
sampleRate = 44100
integral :: [Double] -> [Double]
integral = scanl1 (\acc x -> acc + x / sampleRate)
chunks :: Int -> [a] -> [[a]]
chunks n = S.takeWhile (not . S.null) . S.unfoldr (Just . S.splitAt n)
pulseaudioOutput :: [Sample] -> IO ()
pulseaudioOutput sx = do
pa <- simpleNew Nothing "Synths" Play Nothing "Synths PCM output"
(SampleSpec (F32 LittleEndian) 44100 1) Nothing Nothing
mapM_ (simpleWrite pa . S.map double2Float) $ chunks 1000 sx
simpleDrain pa
simpleFree pa
oscSine :: Frequency -> [CV] -> [Sample]
oscSine f = S.map sin <<< integral <<< S.map ((2 * pi * f *) . (2**))
music ::[Sample]
music = oscSine 440 (S.repeat 0)
main = do
pulseaudioOutput music
If I compile and run this, I see an ever growing CPU consumption.
If I change "S.splitAt" to "splitAt" in the definition of "chunks", everything is fine.
Can anyone guess why this can be?
Thank you.
Update
In the following code all three version of chunks can produce the aforementioned behaviour:
import GHC.Float
import Control.Arrow
import Sound.Pulse.Simple
import Data.List.Stream
import Prelude hiding ( unfoldr
, map
, null
, scanl1
, takeWhile
, repeat
, splitAt
, drop
, take
)
type Time = Double
type Frequency = Double
type Sample = Double
type CV = Double
chunksize = 441 * 2
sampleRate :: (Fractional a) => a
sampleRate = 44100
integral :: [Double] -> [Double]
integral = scanl1 (\acc x -> acc + x / sampleRate)
chunks :: Int -> [a] -> [[a]]
--chunks n = takeWhile (not . null) . unfoldr (Just . splitAt n)
--chunks n xs = take n xs : chunks n (drop n xs)
chunks n xs = h : chunks n t
where
(h, t) = splitAt n xs
pulseaudioOutput :: [Sample] -> IO ()
pulseaudioOutput sx = do
pa <- simpleNew Nothing "Synths" Play Nothing "Synths PCM output"
(SampleSpec (F32 LittleEndian) 44100 1) Nothing Nothing
mapM_ (simpleWrite pa . map double2Float) $ chunks 1000 sx
simpleDrain pa
simpleFree pa
oscSine :: Frequency -> [CV] -> [Sample]
oscSine f = map sin <<< integral <<< map ((2 * pi * f *) . (2**))
music ::[Sample]
music = oscSine 440 (repeat 0)
main = do
pulseaudioOutput music
I cleaned up the code to avoid mixing plain old lists and stream-fusion lists. The memory / cpu leak is still there. To see that the code is working on old lists, just remove the Prelude import and ".Stream" after "Data.List".
The splitAt on streams that is substituted by the fusion rules (http://hackage.haskell.org/package/stream-fusion-0.1.2.5/docs/Data-Stream.html#g:12) has the following signature:
splitAt :: Int -> Stream a -> ([a], [a])
From this we can see that since it produces lists and not streams, that obstructs further fusion. The correct thing to do, I think, is to produce either a splitAt that generates streams, or better yet to write a chunks function directly on streams with the appropriate fusion rules from the list version.
Here is a splitAt on streams that I think should be good. You would of course need to pair it with the appropriate rewrite rules from a splitAt on lists, and if those rewrite rules get tricky, perhaps write the chunks function directly, though it seems a bit tricky to do so as well:
splitAt :: Int -> Stream a -> (Stream a, Stream a)
splitAt n0 (Stream next s0)
| n0 < 0 = (nilStream, (Stream next s0))
| otherwise = loop_splitAt n0 s0
where
nilStream = Stream (const Done) s0
loop_splitAt 0 !s = (nilStream, (Stream next s))
loop_splitAt !n !s = case next s of
Done -> (nilStream, nilStream)
Skip s' -> loop_splitAt n s'
Yield x s' -> (cons x xs', xs'')
where
(xs', xs'') = loop_splitAt (n-1) s'
Related
I have an algorithm for parallel sorting a list of a given length:
import Control.Parallel (par, pseq)
import Data.Time.Clock (diffUTCTime, getCurrentTime)
import System.Environment (getArgs)
import System.Random (StdGen, getStdGen, randoms)
parSort :: (Ord a) => [a] -> [a]
parSort (x:xs) = force greater `par` (force lesser `pseq`
(lesser ++ x:greater))
where lesser = parSort [y | y <- xs, y < x]
greater = parSort [y | y <- xs, y >= x]
parSort _ = []
sort :: (Ord a) => [a] -> [a]
sort (x:xs) = lesser ++ x:greater
where lesser = sort [y | y <- xs, y < x]
greater = sort [y | y <- xs, y >= x]
sort _ = []
parSort2 :: (Ord a) => Int -> [a] -> [a]
parSort2 d list#(x:xs)
| d <= 0 = sort list
| otherwise = force greater `par` (force lesser `pseq`
(lesser ++ x:greater))
where lesser = parSort2 d' [y | y <- xs, y < x]
greater = parSort2 d' [y | y <- xs, y >= x]
d' = d - 1
parSort2 _ _ = []
force :: [a] -> ()
force xs = go xs `pseq` ()
where go (_:xs) = go xs
go [] = 1
randomInts :: Int -> StdGen -> [Int]
randomInts k g = let result = take k (randoms g)
in force result `seq` result
testFunction = parSort
main = do
args <- getArgs
let count | null args = 500000
| otherwise = read (head args)
input <- randomInts count `fmap` getStdGen
start <- getCurrentTime
let sorted = testFunction input
putStrLn $ "Sort list N = " ++ show (length sorted)
end <- getCurrentTime
putStrLn $ show (end `diffUTCTime` start)
I want to get the time to perform parallel sorting on 2, 3 and 4 processor cores less than 1 core.
At the moment, this result I can not achieve.
Here are my program launches:
1. SortList +RTS -N1 -RTS 10000000
time = 41.2 s
2.SortList +RTS -N3 -RTS 10000000
time = 39.55 s
3.SortList +RTS -N4 -RTS 10000000
time = 54.2 s
What can I do?
Update 1:
testFunction = parSort2 60
Here's one idea you can play around with, using Data.Map. For simplicity and performance, I assume substitutivity for the element type, so we can count occurrences rather than storing lists of elements. I'm confident that you can get better results using some fancy array algorithm, but this is simple and (essentially) functional.
When writing a parallel algorithm, we want to minimize the amount of work that must be done sequentially. When sorting a list, there's one thing that we really can't avoid doing sequentially: splitting up the list into pieces for multiple threads to work on. We'd like to get that done with as little effort as possible, and then try to work mostly in parallel from then on.
Let's start with a simple sequential algorithm.
{-# language BangPatterns, TupleSections #-}
import qualified Data.Map.Strict as M
import Data.Map (Map)
import Data.List
import Control.Parallel.Strategies
type Bag a = Map a Int
ssort :: Ord a => [a] -> [a]
ssort xs =
let m = M.fromListWith (+) $ (,1) <$> xs
in concat [replicate c x | (x,c) <- M.toList m]
How can we parallelize this? First, let's break up the list into pieces. There are various ways to do this, none of them great. Assuming a small number of capabilities, I think it's reasonable to let each of them walk the list itself. Feel free to experiment with other approaches.
-- | Every Nth element, including the first
everyNth :: Int -> [a] -> [a]
everyNth n | n <= 0 = error "What you doing?"
everyNth n = go 0 where
go !_ [] = []
go 0 (x : xs) = x : go (n - 1) xs
go k (_ : xs) = go (k - 1) xs
-- | Divide up a list into N pieces fairly. Walking each list in the
-- result will walk the original list.
splatter :: Int -> [a] -> [[a]]
splatter n = map (everyNth n) . take n . tails
Now that we have pieces of list, we spark threads to convert them to bags.
parMakeBags :: Ord a => [[a]] -> Eval [Bag a]
parMakeBags xs =
traverse (rpar . M.fromListWith (+)) $ map (,1) <$> xs
Now we can repeatedly merge pairs of bags until we have just one.
parMergeBags_ :: Ord a => [Bag a] -> Eval (Bag a)
parMergeBags_ [] = pure M.empty
parMergeBags_ [t] = pure t
parMergeBags_ q = parMergeBags_ =<< go q where
go [] = pure []
go [t] = pure [t]
go (t1:t2:ts) = (:) <$> rpar (M.unionWith (+) t1 t2) <*> go ts
But ... there's a problem. In each round of merges, we use only half as many capabilities as we did in the previous one, and perform the final merge with just one capability. Ouch! To fix this, we'll need to parallelize unionWith. Fortunately, this is easy!
import Data.Map.Internal (Map (..), splitLookup, link)
parUnionWith
:: Ord k
=> (v -> v -> v)
-> Int -- Number of threads to spark
-> Map k v
-> Map k v
-> Eval (Map k v)
parUnionWith f n t1 t2 | n <= 1 = rseq $ M.unionWith f t1 t2
parUnionWith _ !_ Tip t2 = rseq t2
parUnionWith _ !_ t1 Tip = rseq t1
parUnionWith f n (Bin _ k1 x1 l1 r1) t2 = case splitLookup k1 t2 of
(l2, mb, r2) -> do
l1l2 <- parEval $ parUnionWith f (n `quot` 2) l1 l2
r1r2 <- parUnionWith f (n `quot` 2) r1 r2
case mb of
Nothing -> rseq $ link k1 x1 l1l2 r1r2
Just x2 -> rseq $ link k1 fx1x2 l1l2 r1r2
where !fx1x2 = f x1 x2
Now we can fully parallelize bag merging:
-- Uses the given number of capabilities per merge, initially,
-- doubling for each round.
parMergeBags :: Ord a => Int -> [Bag a] -> Eval (Bag a)
parMergeBags !_ [] = pure M.empty
parMergeBags !_ [t] = pure t
parMergeBags n q = parMergeBags (n * 2) =<< go q where
go [] = pure []
go [t] = pure [t]
go (t1:t2:ts) = (:) <$> parEval (parUnionWith (+) n t1 t2) <*> go ts
We can then implement a parallel merge like this:
parMerge :: Ord a => [[a]] -> Eval [a]
parMerge xs = do
bags <- parMakeBags xs
-- Why 2 and not one? We only have half as many
-- pairs as we have lists (capabilities we want to use)
-- so we double up.
m <- parMergeBags 2 bags
pure $ concat [replicate c x | (x,c) <- M.toList m]
Putting the pieces together,
parSort :: Ord a => Int -> [a] -> Eval [a]
parSort n = parMerge . splatter n
pSort :: Ord a => Int -> [a] -> [a]
pSort n = runEval . parMerge . splatter n
There's just one sequential piece remaining that we can parallelize: converting the final bag to a list. Is it worth parallelizing? I'm pretty sure that in practice it is not. But let's do it anyway, just for fun! To avoid considerable extra complexity, I'll assume that there aren't large numbers of equal elements; repeated elements in the result will lead to some work (thunks) remaining in the result list.
We'll need a basic partial list spine forcer:
-- | Force the first n conses of a list
walkList :: Int -> [a] -> ()
walkList n _ | n <= 0 = ()
walkList _ [] = ()
walkList n (_:xs) = walkList (n - 1) xs
And now we can convert the bag to a list in parallel chunks without paying for concatenation:
-- | Use up to the given number of threads to convert a bag
-- to a list, appending the final list argument.
parToListPlus :: Int -> Bag k -> [k] -> Eval [k]
parToListPlus n m lst | n <= 1 = do
rseq (walkList (M.size m) res)
pure res
-- Note: the concat and ++ should fuse away when compiling with
-- optimization.
where res = concat [replicate c x | (x,c) <- M.toList m] ++ lst
parToListPlus _ Tip lst = pure lst
parToListPlus n (Bin _ x c l r) lst = do
r' <- parEval $ parToListPlus (n `quot` 2) r lst
res <- parToListPlus (n `quot` 2) l $ replicate c x ++ r'
rseq r' -- make sure the right side is finished
pure res
And then we modify the merger accordingly:
parMerge :: Ord a => Int -> [[a]] -> Eval [a]
parMerge n xs = do
bags <- parMakeBags xs
m <- parMergeBags 2 bags
parToListPlus n m []
I have these data types:
data Card = Card One | Card Two | ...
data Deck = Deck [Card]
I want to shuffle the list cards in a deck
Here's where I am so far:
import System.Random
shuffle :: Deck -> Deck
shuffle (Deck c) = Deck $ shuffleOnce randomN c
where randomN = randomIO >>= (\x -> return (x `mod` 52))
shuffleOnce n c = (findNth n c : deleteNth n c)
findNth 0 (x:_) = x
findNth n (_:xs) = findNth (n-1) xs
deleteNth 0 (_:xs) = xs
deleteNth n (x:xs) = x : (deleteNth (n-1) xs)
The problem (obviously) is here:
where randomN = randomIO >>= (\x -> return (x `mod` 52))
I have no idea how to use the IO monad to generate a random number. Once I wrap the random number from ` to 52 in the monadic context, how do I get it out?
Or better question, how do I actually shuffle the deck?
Please help. full code here
Compiler said, that the problem is in (randomN 1). randomN function is called with an argument, but it doesn't accept one.
Then it couldn't figure out, which type of random number is expected, so we need to provide one here:
randomIO >>= (\x -> return (x `mod` 52)) :: IO Int
Then No instance for (Eq (IO Int)) error is received, which actually means, that Int is expected, but IO Int is provided. For convenience, we can swap the arguments of shuffleOnce n c and call this function as:
shuffleOnce c <$> randomN
After that, the error says:
Expected type: [Card]
Actual type: IO [Card]
We need to use <$> again instead of $, just to get "inside" IO:
Deck <$> shuffleOnce c <$> randomN
As a result, we receive:
Couldn't match expected type ‘Deck’ with actual type ‘IO Deck’
We can't do anything about it, but change the type of shuffle function:
shuffle :: Deck -> IO Deck
shuffle (Deck c) = Deck <$> shuffleOnce c <$> randomN
where randomN = randomIO >>= (\x -> return (x `mod` 52)) :: IO Int
shuffleOnce c n = (findNth n c : deleteNth n c)
findNth 0 (x:_) = x
findNth n (_:xs) = findNth (n-1) xs
deleteNth 0 (_:xs) = xs
deleteNth n (x:xs) = x : (deleteNth (n-1) xs)
Then we realise, that shuffle is a function with side effects, because it contains random numbers, which are "received from outside world". And actually having this information in the type definition seems logical.
I am implementing an approximate counting algorithm where we:
Maintain t counters {X1,..., Xt} using log (log n) bits for each
Initialize all counters to 0
When an item arrives, increase each Xi by 1 independently with probability (½)Xi
When the stream is over, output Z = 1/t (2X1 − 1 + ... + 2Xt − 1)
repeat the above steps m times independently and in parallel and output the median.
Here it is in haskell, using the conduit library.
import Data.Random
import Data.Conduit
import Data.List
import Data.Ord (comparing)
import qualified Data.Conduit.List as Cl
import Control.Monad.Identity
type Prob = Double
type Counter = Float
type Delta = Double
type Eps = Double
-- * Run Morris alpha on stream inputs `xs`
morrisA :: [a] -> IO Counter
morrisA xs = flip runRVar StdRandom $ Cl.sourceList xs $$ alpha
-- * Run Morris beta on stream inputs `xs` for `t` independent trials and average
morrisB :: Int -> [a] -> IO Counter
morrisB t = fmap rmean . replicateM t . morrisA
-- * final morris algorithm
-- * Run on stream inputs `xs` for t independent trials for `t = 1/eps`,
-- * and `m` times in parralell, for `m = 1/(e^2 * d)`
-- * and take the median
morris :: Eps -> Delta -> [a] -> IO Counter
morris e d = fmap rmedian . replicateM m . morrisB t
where (t,m) = (round $ 1/(e^2*d), round $ 1/d)
-- * Utils * --
-- * A step in morris Algorithm alpha
alpha :: Sink a RVar Counter
alpha = (\x -> 2^(round x) - 1) <$> Cl.foldM (\x _ -> incr x) 0
-- * Increment a counter `x` with probability 1/2^x
incr :: Counter -> RVar Counter
incr x = do
h <- (\q -> q <= (0.5^(round x) :: Prob)) <$> uniform 0 1
return $ if h then (seq () succ x) else seq () x
rmean, rmedian :: (Floating a, Ord a, RealFrac a) => [a] -> Float
rmean = fromIntegral . round . mean
rmedian = fromIntegral . round . median
-- |Numerically stable mean
mean :: Floating a => [a] -> a
mean x = fst $ foldl' (\(!m, !n) x -> (m+(x-m)/(n+1),n+1)) (0,0) x
-- |Median
median :: (Floating a, Ord a) => [a] -> a
median x | odd n = head $ drop (n `div` 2) x'
| even n = mean $ take 2 $ drop i x'
where i = (length x' `div` 2) - 1
x' = sort x
n = length x
The problem is that runtime of morris is linear in both length of stream and number of iterations t*m. So eg morrisA takes about 100μs for 100 items. Now if we want to be 95% confident of 5% error, we have to run morris for n=160000 times.
Could someone suggest how to optimize this code? maybe:
implement morris in something other than conduit
some faster way to repeat morrisA other than replicateM.
For the 99 Haskell questions, specifically the 23rd one, I need to
"Extract a given number of randomly selected elements from a list.
Example (in lisp):
(rnd-select '(a b c d e f g h) 3)
(E D A)
"
Which I have implemented like so:
import System.Random
import Control.Monad
removeAt :: [a] -> Int -> [a]
removeAt (x:xs) i
| i > 0 = x : removeAt xs (i-1)
| otherwise = xs
rndSelect :: (RandomGen g) => [a] -> Int -> g -> IO [a]
rndSelect _ 0 _ = return []
rndSelect xs n gen = do
let (pos, newGen) = randomR (0, length xs - 1) gen
rest <- rndSelect (removeAt xs pos) (n-1) newGen
return $ (xs!!pos):rest
-- for an explanation of what this is doing see EXPLANATION below
As far as I can tell this works, but what I'm concerned about are those last two lines. I'm new to this and I don't know the associated costs of the '<-' operator is or bouncing in and out of IO repeatedly like I'm doing. Is this efficient, is there a better way to do this that doesn't involve bouncing IO, or is there no real overheads involved?
Any insight you have is appreciated, since I've only recently started learning these more sophisticated concepts in Haskell and haven't yet gotten used to reasoning about Haskell's IO system.
EXPLANATION: In order to do this I've decided that I should randomly select one element from the list using the randomR function (returns a random number in a given range), and keep doing this recursively until I've taken n elements.
I've made a couple assumptions about the problem that have lead me to this approach. Firstly I've assumed that rndSelect can select a specific element from the list only once, and secondly I've assumed that each element should have an equal probability of being picked.
PS: it's my first question on SO so if I've formatted the question poorly feel free to tell me.
You do not need IO for this, since randomR does not require it. What you need to do however, is to thread the random number generator through your computation:
import System.Random
import Control.Monad
removeAt :: [a] -> Int -> [a]
removeAt (x:xs) i
| i > 0 = x : removeAt xs (i-1)
| otherwise = xs
rndSelect :: (RandomGen t, Num a) => [a1] -> a -> t -> ([a1], t)
rndSelect _ 0 g = ([],g)
rndSelect xs n gen =
let (pos, newGen) = randomR (0, length xs - 1) gen
(rest,ng) = rndSelect (removeAt xs pos) (n-1) newGen
in ((xs!!pos):rest, ng)
If you're concerned about overheads going from IO to pure code, don't be. Instead you can try mwc-random package which will be atleast an order of magnitude faster in this case. Further, you could get additional benefit using any random access data structure instead of list if you have many elements.
You can avoid IO as :
rndSelect :: (RandomGen g) => [a] -> Int -> g -> [a]
rndSelect _ 0 _ = return []
rndSelect xs n gen = do
let (pos, newGen) = randomR (0, length xs - 1) gen
rest = rndSelect (removeAt xs pos) (n-1) newGen
in (xs!!pos):rest
I am new to Haskell and I wonder how/if I can make this code more efficient and tidy. It seems unnecessarily long and untidy.
My script generates a list of 10 averages of 10 coin flips.
import Data.List
import System.Random
type Rand a = StdGen -> Maybe (a,StdGen)
output = do
gen <- newStdGen
return $ distBernoulli 10 10 gen
distBernoulli :: Int -> Int -> StdGen -> [Double]
distBernoulli m n gen = [fromIntegral (sum x) / fromIntegral (length x) | x <- lst]
where lst = splitList (randomList (n*m) gen) n
splitList :: [Int] -> Int -> [[Int]]
splitList [] n = []
splitList lst n = take n lst : splitList (drop n lst) n
randomList :: Int -> StdGen -> [Int]
randomList n = take n . unfoldr trialBernoulli
trialBernoulli :: Rand Int
trialBernoulli gen = Just ((2*x)-1,y)
where (x,y) = randomR (0,1) gen
Any help would be appreciated, thanks.
I'd tackle this problem in a slightly different way. First I'd define a function that would give me an infinite sampling of flips from a Bernoulli distribution with success probability p:
flips :: Double -> StdGen -> [Bool]
flips p = map (< p) . randoms
Then I'd write distBernoulli as follows:
distBernoulli :: Int -> Int -> StdGen -> [Double]
distBernoulli m n = take m . map avg . splitEvery n . map val . flips 0.5
where
val True = 1
val False = -1
avg = (/ fromIntegral n) . sum
I think this matches your definition of distBernoulli:
*Main> distBernoulli 10 10 $ mkStdGen 0
[-0.2,0.4,0.4,0.0,0.0,0.2,0.0,0.6,0.2,0.0]
(Note that I'm using splitEvery from the handy split package, so you'd have to install the package and add import Data.List.Split (splitEvery) to your imports.)
This approach is slightly more general, and I think a little neater, but really the main difference is just that I'm using randoms and splitEvery.
EDIT: I posted this too fast and didn't match behavior, it should be good now.
import Control.Monad.Random
import Control.Monad (liftM, replicateM)
KNOWLEDGE: If you like randoms then use MonadRandom - it rocks.
STYLE: Only importing symbols you use helps readability and sometimes maintainability.
output :: IO [Double]
output = liftM (map dist) getLists
Note: I've given output an explicit type, but know it doesn't have to be IO.
STYLE:
1) Its usually good to separate your IO from pure functions. Here I've divided out the getting of random lists from the calculation of distributions. In your case it was pure but you combined getting "random" lists via a generator with the distribution function; I would divide those parts up.
2) Read Do notation considered harmful. Consider using >>= instead of
output = do
gen <- new
return $ dist gen
you can do:
output = new >>= dist
Wow!
dist :: [Int] -> Double
dist lst = (fromIntegral (sum lst) / fromIntegral (length lst))
getLists :: MonadRandom m => Int -> Int -> m [[Int]]
getLists m n= replicateM m (getList n)
KNOWLEDGE In Control.Monad anything ending in an M is like the original but for monads. In this case, replicateM should be familiar if you used the Data.List replicate function.
getList :: MonadRandom m => Int -> m [Int]
getList m = liftM (map (subtract 1 . (*2)) . take m) (getRandomRs (0,1::Int))
STYLE: If I do something lots of times I like to have a single instance in its own function (getList) then the repetition in a separate function.
I'm not sure I understand your code or your question...
But it seems to me all you'd need to do is generate a list of random ones and zeroes, and then divide each of them by their length with a map and add them together with a foldl.
Something like:
makeList n lis = if n /= 0 then
makeList (n-1) randomR(0,1) : lis
else
lis
And then make it apply a Map and Foldl or Foldr to it.
Using the above, I am now using this.
import Data.List
import System.Random
type Rand a = [a]
distBernoulli :: Int -> Int -> StdGen -> [Double]
distBernoulli m n gen = [fromIntegral (sum x) / fromIntegral (length x) | x <- lst]
where lst = take m $ splitList (listBernoulli gen) n
listBernoulli :: StdGen -> Rand Int
listBernoulli = map (\x -> (x*2)-1) . randomRs (0,1)
splitList :: [Int] -> Int -> [[Int]]
splitList lst n = take n lst : splitList (drop n lst) n
Thanks for your help, and I welcome any further comments :)