Implementing an efficient sliding-window algorithm in Haskell

I needed an efficient sliding window function in Haskell, so I wrote the following:
windows n xz@(x:xs)
    | length v < n = []
    | otherwise = v : windows n xs
  where
    v = take n xz
My problem with this is that I think the complexity is O(n*m), where m is the length of the list and n is the window size. Each step walks n elements once for take and again for length, and this happens at roughly m-n positions along the list. It seems like it could be more efficient than this, but I'm at a loss for how to make it closer to linear. Any takers?

You can't get better than O(m*n), since this is the size of the output data structure.
But you can avoid checking the lengths of the windows if you reverse the order of operations: First create n shifted lists and then just zip them together. Zipping will get rid of those that don't have enough elements automatically.
import Control.Applicative
import Data.Traversable (sequenceA)
import Data.List (tails)
transpose' :: [[a]] -> [[a]]
transpose' = getZipList . sequenceA . map ZipList
Zipping a list of lists is just a transposition, but unlike transpose from Data.List it throws away outputs that would have fewer than n elements.
Now it's easy to make the window function: Take m lists, each shifted by 1, and just zip them:
windows :: Int -> [a] -> [[a]]
windows m = transpose' . take m . tails
This also works for infinite lists.
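As a quick check of the zip-based version, the following self-contained sketch repeats the definitions above and tries them on a finite and an infinite list. The names exampleFinite and exampleInfinite are only illustrative.

```haskell
import Control.Applicative (ZipList (..))
import Data.List (tails)

-- Zip a list of lists together; the result stops at the shortest input.
transpose' :: [[a]] -> [[a]]
transpose' = getZipList . sequenceA . map ZipList

-- Window i is built from element i of each of the first m tails.
windows :: Int -> [a] -> [[a]]
windows m = transpose' . take m . tails

exampleFinite :: [[Int]]
exampleFinite = windows 3 [1 .. 5]            -- [[1,2,3],[2,3,4],[3,4,5]]

exampleInfinite :: [[Int]]
exampleInfinite = take 2 (windows 2 [1 ..])   -- [[1,2],[2,3]]
```

The shortest of the shifted lists decides how many windows come out, which is why no explicit length check is needed.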

You can use Seq from Data.Sequence, which has O(1) enqueue and dequeue at both ends:
import Data.Foldable (toList)
import qualified Data.Sequence as Seq
import Data.Sequence ((|>))
windows :: Int -> [a] -> [[a]]
windows n0 = go 0 Seq.empty
  where
    go n s (a:as)
        | n' <  n0  = go n' s' as
        | n' == n0  = toList s' : go n' s' as
        | otherwise = toList s'' : go n s'' as
      where
        n'  = n + 1         -- O(1)
        s'  = s |> a        -- O(1)
        s'' = Seq.drop 1 s' -- O(1)
    go _ _ [] = []
Note that if you materialize the entire result your algorithm is necessarily O(N*M) since that is the size of your result. Using Seq just improves performance by a constant factor.
Example use:
>>> windows 3 [1..5]
[[1,2,3],[2,3,4],[3,4,5]]

First let's get the windows without worrying about the short ones at the end:
import Data.List (tails)
windows' :: Int -> [a] -> [[a]]
windows' n = map (take n) . tails
> windows' 3 [1..5]
[[1,2,3],[2,3,4],[3,4,5],[4,5],[5],[]]
Now we want to get rid of the short ones without checking the length of every one.
Since we know they are at the end, we could lose them like this:
windows n xs = take (length xs - n + 1) (windows' n xs)
But that's not great since we still go through xs an extra time to get its length. It also doesn't work on infinite lists, which your original solution did.
Instead let's write a function for using one list as a ruler to measure the amount to take from another:
takeLengthOf :: [a] -> [b] -> [b]
takeLengthOf = zipWith (flip const)
> takeLengthOf ["elements", "get", "ignored"] [1..10]
[1,2,3]
Now we can write this:
windows :: Int -> [a] -> [[a]]
windows n xs = takeLengthOf (drop (n-1) xs) (windows' n xs)
> windows 3 [1..5]
[[1,2,3],[2,3,4],[3,4,5]]
Works on infinite lists too:
> take 5 (windows 3 [1..])
[[1,2,3],[2,3,4],[3,4,5],[4,5,6],[5,6,7]]
As Gabriella Gonzalez says, the time complexity is no better if you want to use the whole result. But if you only use some of the windows, we now manage to avoid doing the work of take and length on the ones you don't use.
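The laziness claim can be checked with a small sketch, assuming the definitions above. The list elements here are deliberately errors, so the count only comes out if counting the windows never inspects them. windowCount and firstWindow are hypothetical names.

```haskell
import Data.List (tails)

windows' :: Int -> [a] -> [[a]]
windows' n = map (take n) . tails

takeLengthOf :: [a] -> [b] -> [b]
takeLengthOf = zipWith (flip const)

windows :: Int -> [a] -> [[a]]
windows n xs = takeLengthOf (drop (n - 1) xs) (windows' n xs)

-- Counting windows walks the list spine only; no element and no `take` is forced.
windowCount :: Int
windowCount = length (windows 3 (replicate 5 (error "element forced") :: [Int]))

-- Asking for one window of an infinite list builds just that window.
firstWindow :: [Int]
firstWindow = head (windows 3 [1 ..])
```

Here windowCount evaluates to 3 and firstWindow to [1,2,3].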

If you want O(1) length then why not use a structure that provides O(1) length? Assuming you aren't looking for windows from an infinite list, consider using:
import qualified Data.Vector as V
import Data.Vector (Vector)
import Data.List(unfoldr)
windows :: Int -> [a] -> [[a]]
windows n = map V.toList . unfoldr go . V.fromList
  where
    go xs | V.length xs < n = Nothing
          | otherwise =
              let (a,b) = V.splitAt n xs
              in Just (a,b)
Conversion of each window from a vector to a list might cost you something; I won't hazard an optimistic guess there, but I will bet that the performance is better than the list-only version.

For the sliding window I also used unboxed Vectors, since length, take, drop and splitAt are all O(1) operations on them.
The code from Thomas M. DuBuisson produces windows shifted by n, not a sliding window, except when n = 1. To slide by one element a (++) is needed, but it has a cost of O(n+m), so be careful where you put it.
{-# LANGUAGE BangPatterns #-}
import qualified Data.Vector.Unboxed as V
import Data.Vector.Unboxed (Vector)
import Data.List
windows :: Int -> Vector Double -> [[Double]]
windows n = unfoldr go
  where
    go !xs | V.length xs < n = Nothing
           | otherwise =
               let (a,b) = V.splitAt 1 xs
                   c = V.toList a ++ V.toList (V.take (n-1) b)
               in Just (c,b)
I tried it out with +RTS -sstderr and:
putStrLn $ show (sum $ concat $ windows 10 (V.fromList [1..1000000]))
and got a real time of 1.051s and 96.9% usage, keeping in mind that after the sliding window two O(m) operations are performed.
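The difference between advancing by n and advancing by one element can be seen with plain lists. This is only an illustrative sketch, not the Vector code above: chunked and sliding are hypothetical names, and both assume n >= 1.

```haskell
import Data.List (unfoldr)

-- Advance by n after each window: disjoint chunks.
chunked :: Int -> [a] -> [[a]]
chunked n = unfoldr step
  where
    step xs
      | length w < n = Nothing
      | otherwise    = Just (w, rest)
      where
        (w, rest) = splitAt n xs

-- Advance by 1 after each window: overlapping (sliding) windows.
sliding :: Int -> [a] -> [[a]]
sliding n = unfoldr step
  where
    step xs
      | length w < n = Nothing
      | otherwise    = Just (w, drop 1 xs)
      where
        w = take n xs
```

For example, chunked 2 [1..5] gives [[1,2],[3,4]], while sliding 2 [1..4] gives [[1,2],[2,3],[3,4]].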

Related

How to efficiently generate all lists of length `n^2` containing `n` copies of every `x < n`?

Given an integer n, how can I build the list containing all lists of length n^2 containing exactly n copies of each integer x < n? For example, for n = 2, we have:
[0,0,1,1], [0,1,0,1], [1,0,0,1], [0,1,1,0], [1,0,1,0], [1,1,0,0]
This can be easily done combining permutations and nub:
f :: Int -> [[Int]]
f n = nub . permutations $ concatMap (replicate n) [0..n-1]
But that is way too inefficient. Is there any simple way to encode the efficient/direct algorithm?

Sure, it's not too hard. We'll start with a list of n copies of each number less than n, and repeatedly choose one to start our result with. First, a function for choosing an element from a list:
zippers :: [a] -> [([a], a, [a])]
zippers = go [] where
    go l (h:r) = (l,h,r) : go (h:l) r
    go _ [] = []
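To see what zippers produces, here is a small check, assuming the definition above. Each triple holds the elements to the left of the chosen one (in reverse order, because they are consed on as go walks the list), the chosen element, and the elements to its right.

```haskell
zippers :: [a] -> [([a], a, [a])]
zippers = go []
  where
    go l (h:r) = (l, h, r) : go (h : l) r
    go _ []    = []

example :: [(String, Char, String)]
example = zippers "abc"
-- [("",'a',"bc"),("a",'b',"c"),("ba",'c',"")]
```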
Now we'll write a function that produces all possible interleavings of some input lists. Internally we'll maintain the invariant that each [a] is non-empty; hence we'll have to establish that invariant before we start recursing. In fact, this will be wasted work in the way we intend to call this function, but for good abstraction we might as well handle all inputs correctly, right?
interleavings :: [[a]] -> [[a]]
interleavings = go . filter (not . null) where
    go [] = [[]]
    go xss = do
        (xssl, x:xs, xssr) <- zippers xss
        (x:) <$> interleavings ([xs | not (null xs)] ++ xssl ++ xssr)
And now we're basically done. All we have to do is feed in an appropriate starting list.
f :: Int -> [[Int]]
f n = interleavings (replicate n <$> [1..n])
Try it in ghci:
> f 2
[[1,1,2,2],[1,2,2,1],[1,2,1,2],[2,2,1,1],[2,1,1,2],[2,1,2,1]]
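One sanity check, sketched with the definitions above repeated so that it stands alone: the number of such lists is the multinomial coefficient (n^2)! / (n!)^n, so f 2 should have 6 elements and f 3 should have 1680. The helper expectedCount is hypothetical and exists only for this comparison.

```haskell
zippers :: [a] -> [([a], a, [a])]
zippers = go []
  where
    go l (h:r) = (l, h, r) : go (h : l) r
    go _ []    = []

interleavings :: [[a]] -> [[a]]
interleavings = go . filter (not . null)
  where
    go []  = [[]]
    go xss = do
      (xssl, x:xs, xssr) <- zippers xss
      (x:) <$> interleavings ([xs | not (null xs)] ++ xssl ++ xssr)

f :: Int -> [[Int]]
f n = interleavings (replicate n <$> [1 .. n])

-- Number of arrangements of n copies each of n distinct symbols.
expectedCount :: Int -> Integer
expectedCount n = factorial (n * n) `div` (factorial n ^ n)
  where
    factorial k = product [1 .. toInteger k]
```

Comparing length (f n) with expectedCount n for small n is a cheap way to confirm that nothing is duplicated or missing.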

Computing Moving Average in Haskell

I'm working on learning Haskell, so I tried to implement a moving average function. Here is my code:
mAverage :: Int -> [Int] -> [Float]
mAverage x a = [fromIntegral k / fromIntegral x | k <- rawAverage]
  where
    rawAverage = mAverage' x a a
-- First list contains original values; second list contains moving average computations
mAverage' :: Int -> [Int] -> [Int] -> [Int]
mAverage' 1 a b = b
mAverage' x a b = mAverage' (x - 1) a' b'
  where
    a' = init a
    b' = zipWith (+) a' (tail b)
where the user calls mAverage with a length for each average and the list of values (e.g. mAverage 4 [1,2..100]).
However, when I run the code on the input mAverage 4 [1,2..100000], I get that it takes 3.6 seconds in ghci (using :set +s) and uses a gigabyte of memory. This seems very inefficient to me, as the equivalent function takes a fraction of a second in Python. Is there some way that I could make my code more efficient?

If you want to learn something new, take a look at this nice solution to the moving average problem. It was written by one of my students, so I won't claim authorship. I really like it because it's very short. The only problem here is the average function: such functions are known to be bad. Instead you can use Beautiful folds by Gabriel Gonzalez. And yes, this function takes O(k) time (where k is the size of the window) to calculate the average of each window. I find that better, because you can run into floating-point errors if you only add the new element to the window sum and subtract the last one. Oh, it also uses the State monad :)
{-# LANGUAGE UnicodeSyntax #-}
module MovingAverage where
import Control.Monad (forM)
import Control.Monad.State (evalState, gets, modify)
moving :: Fractional a ⇒ Int → [a] → [a]
moving n _ | n <= 0 = error "non-positive argument"
moving n xs = evalState (forM xs $ \x → modify ((x:) . take (n-1)) >> gets average) []
  where
    average xs = sum xs / fromIntegral n

Here is a straightforward list-based solution which is idiomatic and fast enough, though it requires more memory.
import Data.List (tails)
mavg :: Fractional b => Int -> [b] -> [b]
mavg k lst = take (length lst - k + 1) $ map average $ tails lst
  where average = (/ fromIntegral k) . sum . take k
This solution lets you use any function in place of average over a moving window.
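For instance, the same tails-and-take pattern gives a moving maximum. This is a sketch rather than the answer's code: movingWith is a hypothetical helper, it assumes k >= 1, and it trims the short trailing windows by zipping against drop (k-1) xs instead of calling length.

```haskell
import Data.List (tails)

-- Apply f to every full window of size k.
movingWith :: ([a] -> b) -> Int -> [a] -> [b]
movingWith f k xs = zipWith (\w _ -> f (take k w)) (tails xs) (drop (k - 1) xs)

movingMax :: Ord a => Int -> [a] -> [a]
movingMax = movingWith maximum

movingAvg :: Fractional a => Int -> [a] -> [a]
movingAvg k = movingWith (\w -> sum w / fromIntegral k) k
```

With these definitions, movingMax 3 [1,3,2,5,4] is [3,5,5] and movingAvg 2 [1,2,3,4] is [1.5,2.5,3.5].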
The following solution is less universal but it is constant in space and seems to be the fastest one.
import Data.List (scanl')
mavg :: Fractional b => Int -> [b] -> [b]
mavg k lst = map (/ fromIntegral k) $ scanl' (+) (sum h) $ zipWith (-) t lst
  where (h, t) = splitAt k lst
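To make the running-sum trick concrete, here is the same definition with the intermediate values for a small input spelled out in comments. It assumes the list has at least k elements; for a shorter list it would return a single average over whatever is there.

```haskell
import Data.List (scanl')

mavg :: Fractional b => Int -> [b] -> [b]
mavg k lst = map (/ fromIntegral k) $ scanl' (+) (sum h) $ zipWith (-) t lst
  where
    (h, t) = splitAt k lst

-- For mavg 2 [1,2,3,4]:
--   h = [1,2], t = [3,4], sum h = 3
--   zipWith (-) t lst  = [3-1, 4-2] = [2,2]   (entering minus leaving element)
--   scanl' (+) 3 [2,2] = [3,5,7]              (window sums)
--   dividing by 2 gives [1.5,2.5,3.5]
example :: [Double]
example = mavg 2 [1, 2, 3, 4]
```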
Finally, here is a solution which uses a kind of Okasaki's persistent functional queue to keep the moving window. It makes sense when dealing with streaming data, such as conduits or pipes.
mavg k lst = map average $ scanl' enq ([], take k lst) $ drop k lst
  where
    average (l,r) = (sum l + sum r) / fromIntegral k
    enq (l, []) x = enq ([], reverse l) x
    enq (l, (_:r)) x = (x:l, r)
And as was mentioned in the comments on the original post, do not use ghci for profiling. For example, you won't be able to see any benefit from scanl' in ghci.

Here's a solution for you.
The idea is to scan two lists, one where the averaging window starts, and another where it ends. Getting the tail end of a list costs as much as scanning the part we're skipping, and we're not copying anything. (If the window size were usually quite large, we could calculate remaining_data while summing initial_data, in one go.)
We generate a list of partial sums as described in my comment, then divide them by the window width to get averages.
While slidingAverage calculates averages for a biased position (window width to the right), centeredSlidingAverage calculates centered averages, using half the window width to the left and to the right.
import Data.List (splitAt, replicate)
slidingAverage :: Int -> [Int] -> [Double] -- window size, source list -> list of averages
slidingAverage w xs = map divide $ initial_sum : slidingSum initial_sum xs remaining_data
  where
    divide = (\n -> (fromIntegral n) / (fromIntegral w)) -- divides the sums by window size
    initial_sum = sum initial_data
    (initial_data, remaining_data) = splitAt w xs
centeredSlidingAverage :: Int -> [Int] -> [Double] -- window size, source list -> list of averages
centeredSlidingAverage w xs = slidingAverage w $ left_padding ++ xs ++ right_padding
  where
    left_padding = replicate half_width 0
    right_padding = replicate (w - half_width) 0
    half_width = (w `quot` 2) -- quot is integer division
slidingSum :: Int -> [Int] -> [Int] -> [Int] -- window_sum before_window after_window -> list of sums
slidingSum _ _ [] = []
slidingSum window_sum before_window after_window = new_sum : slidingSum new_sum new_before new_after
  where
    value_to_go = head before_window
    new_before = tail before_window
    value_to_come = head after_window
    new_after = tail after_window
    new_sum = window_sum - value_to_go + value_to_come
When I try length $ slidingAverage 10 [1..1000000], it takes less than a second on my MBP. Due to the laziness, centeredSlidingAverage takes about the same time.

One simple way of doing it, which also has O(n) complexity:
movingAverage :: (Fractional a) => Int -> [a] -> [a]
movingAverage n _ | n <= 0 = error "non-positive argument"
movingAverage n xs = fmap average $ groupBy n xs
  where average xs' = sum xs' / fromIntegral (length xs')
groupBy :: Int -> [a] -> [[a]]
groupBy _ [] = []
groupBy n xs = go [] xs
  where
    go _ [] = []
    go l (x:xs') = (x:t) : go (x:l) xs'
      where t = take (n-1) l

Another way is to use STUArray.
import Data.Array.Unboxed
import Data.Array.ST
import Data.STRef
import Control.Monad
import Control.Monad.ST
movingAverage :: [Double] -> IO [Double]
movingAverage vals = stToIO $ do
    let end = length vals - 1
    myArray <- newArray (1, end) 0 :: ST s (STArray s Int Double)
    forM_ [1 .. end] $ \i -> do
        let cval = vals !! i
        let lval = vals !! (i-1)
        writeArray myArray i ((cval + lval)/2)
    getElems myArray

Generating all combinations of 6 Xs with 3 Qs in Haskell

I am trying to generate a list of all strings that consist of 6 Xs and 3 Qs.
A subset of the list I am trying to generate is as follows:
["XXXXXXQQQ", "XQXXQXXQX", "QXQXQXXXX",...
What is a good way to go about this?

Here is a dynamic programming solution using Data.Array. mem just stores memoized values.
import Data.Array
strings :: Int -> Int -> [String]
strings n m = strings' n m
  where
    mem :: Array (Int,Int) [String]
    mem = array ((0,0),(n,m)) [ ((i,j), strings' i j) | i <- [0..n], j <- [0..m] ]
    strings' 0 m = [replicate m 'X']
    strings' n 0 = [replicate n 'Q']
    strings' n m = (('Q':) <$> mem ! (n-1,m)) ++ (('X':) <$> mem ! (n,m-1))

The naive solution is to recursively choose one of X or Q until we run out of choices to make. This is especially convenient when using the list monad to model the nondeterministic choice, and leads to quite short code:
stringsNondet m 0 = [replicate m 'X']
stringsNondet 0 n = [replicate n 'Q']
stringsNondet m n = do
    (char, m', n') <- [('X', m-1, n), ('Q', m, n-1)]
    rest <- stringsNondet m' n'
    return (char:rest)
The disadvantage of this approach is that it does a lot of extra work. If we choose an X and then choose a Q, the continuations are the same as if we had chosen a Q and then an X, but these continuations will be recomputed in the above. (And similarly for other choice paths that lead to shared continuations.)
Alec has posted a dynamic programming solution which solves this problem by introducing a recursively-defined array to share the subcomputations. I like this solution, but the recursive definition is a bit mind-bending. The following solution is also a dynamic programming solution -- subcomputations are also shared -- but uses no hand-written recursion. It does make use of standard recursive patterns (map, zip, iterate, ++, and !!) but notably does not require "tying the knot" as Alec's solution does.
As a warmup, let's discuss the type of the function of interest to us:
step :: [[String]] -> [[String]]
The final result of interest to us is [String], a collection of strings with a fixed number m of 'X's and a fixed number n of 'Q's. The step function will expect a collection of results, all of the same length, and will assume that the result at index m has m copies of 'X'. It will also produce a result with these properties, and where each result is one longer than the input results.
We implement step by producing two intermediate [[String]]s, one with an extra 'X' compared to the input results and one with an extra 'Q'. These two intermediates can then be zipped together with a little "stutter" to represent the slight difference in 'X' count between them. Thus:
step css = zipWith (++)
    ([[]] ++ map (map ('X':)) css)
    (map (map ('Q':)) css ++ [[]])
The top-level function is now easy to write: we simply index into the iterated version of step by the length of the final string we want, then index into the list of results we get that way by the number of 'X's we want.
strings m n = iterate step [[[]]] !! (m+n) !! m
A bonus of this approach is the single, aesthetically pleasing base case of [[[]]].
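A short check of this construction, with the definitions above repeated so that it stands alone: the first few iterations of step can be inspected directly, and the number of strings with 6 Xs and 3 Qs should be the binomial coefficient C(9,3) = 84. The name levelTwo is only illustrative.

```haskell
step :: [[String]] -> [[String]]
step css = zipWith (++)
    ([[]] ++ map (map ('X':)) css)
    (map (map ('Q':)) css ++ [[]])

strings :: Int -> Int -> [String]
strings m n = iterate step [[[]]] !! (m + n) !! m

-- After two steps, index i holds the length-2 strings with i copies of 'X'.
levelTwo :: [[String]]
levelTwo = iterate step [[[]]] !! 2
-- [["QQ"],["XQ","QX"],["XX"]]
```

Each application of step lengthens every string by one character, and the "stutter" keeps the strings at index i at exactly i copies of 'X'.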

Use permutations and nub functions from Data.List:
Prelude Data.List> nub $ permutations "XXXXXXQQQ"
["XXXXXXQQQ","QXXXXXXQQ","XQXXXXXQQ","XXQXXXXQQ","XXXQXXXQQ","XXXXQXXQQ","XXXXXQXQQ","QQXXXXXXQ","QXQXXXXXQ","QXXQXXXXQ","QXXXQXXXQ","QXXXXQXXQ","QXXXXXQXQ","XQQXXXXXQ","XQXQXXXXQ","XQXXQXXXQ","XQXXXQXXQ","XQXXXXQXQ","XXQQXXXXQ","XXQXQXXXQ","XXQXXQXXQ","XXQXXXQXQ","XXXQQXXXQ","XXXQXQXXQ","XXXQXXQXQ","XXXXQQXXQ","XXXXQXQXQ","XXXXXQQXQ","QQQXXXXXX","QQXQXXXXX","QQXXQXXXX","QQXXXQXXX","QQXXXXQXX","QQXXXXXQX","QXQQXXXXX","XQQQXXXXX","XQQXQXXXX","XQQXXQXXX","XQQXXXQXX","XQQXXXXQX","QXQXQXXXX","QXQXXQXXX","QXQXXXQXX","QXQXXXXQX","QXXQQXXXX","XQXQQXXXX","XXQQQXXXX","XXQQXQXXX","XXQQXXQXX","XXQQXXXQX","XQXQXQXXX","XQXQXXQXX","XQXQXXXQX","QXXQXQXXX","QXXQXXQXX","QXXQXXXQX","QXXXQQXXX","XQXXQQXXX","XXQXQQXXX","XXXQQQXXX","XXXQQXQXX","XXXQQXXQX","XXQXQXQXX","XXQXQXXQX","XQXXQXQXX","XQXXQXXQX","QXXXQXQXX","QXXXQXXQX","QXXXXQQXX","XQXXXQQXX","XXQXXQQXX","XXXQXQQXX","XXXXQQQXX","XXXXQQXQX","XXXQXQXQX","XQXXXQXQX","QXXXXQXQX","XXQXXQXQX","QXXXXXQQX","XQXXXXQQX","XXQXXXQQX","XXXQXXQQX","XXXXQXQQX","XXXXXQQQX"]
We can have a faster implementation as well:
insertAtEvery x [] = [[x]]
insertAtEvery x (y:ys) = (x:y:ys) : map (y:) (insertAtEvery x ys)
combinations [] = [[]]
combinations (x:xs) = nub . concatMap (insertAtEvery x) . combinations $ xs
Comparison with the previous solution in ghci:
Prelude Data.List> (sort . nub . permutations $ "XXXXXXQQQ") == (sort . combinations $ "XXXXXXQQQ")
True
Prelude Data.List> :set +s
Prelude Data.List> combinations "XXXXXXQQQ"
["XXXXXXQQQ","XXXXXQXQQ","XXXXXQQXQ","XXXXXQQQX","XXXXQXXQQ","XXXXQXQXQ","XXXXQXQQX","XXXXQQXXQ","XXXXQQXQX","XXXXQQQXX","XXXQXXXQQ","XXXQXXQXQ","XXXQXXQQX","XXXQXQXXQ","XXXQXQXQX","XXXQXQQXX","XXXQQXXXQ","XXXQQXXQX","XXXQQXQXX","XXXQQQXXX","XXQXXXXQQ","XXQXXXQXQ","XXQXXXQQX","XXQXXQXXQ","XXQXXQXQX","XXQXXQQXX","XXQXQXXXQ","XXQXQXXQX","XXQXQXQXX","XXQXQQXXX","XXQQXXXXQ","XXQQXXXQX","XXQQXXQXX","XXQQXQXXX","XXQQQXXXX","XQXXXXXQQ","XQXXXXQXQ","XQXXXXQQX","XQXXXQXXQ","XQXXXQXQX","XQXXXQQXX","XQXXQXXXQ","XQXXQXXQX","XQXXQXQXX","XQXXQQXXX","XQXQXXXXQ","XQXQXXXQX","XQXQXXQXX","XQXQXQXXX","XQXQQXXXX","XQQXXXXXQ","XQQXXXXQX","XQQXXXQXX","XQQXXQXXX","XQQXQXXXX","XQQQXXXXX","QXXXXXXQQ","QXXXXXQXQ","QXXXXXQQX","QXXXXQXXQ","QXXXXQXQX","QXXXXQQXX","QXXXQXXXQ","QXXXQXXQX","QXXXQXQXX","QXXXQQXXX","QXXQXXXXQ","QXXQXXXQX","QXXQXXQXX","QXXQXQXXX","QXXQQXXXX","QXQXXXXXQ","QXQXXXXQX","QXQXXXQXX","QXQXXQXXX","QXQXQXXXX","QXQQXXXXX","QQXXXXXXQ","QQXXXXXQX","QQXXXXQXX","QQXXXQXXX","QQXXQXXXX","QQXQXXXXX","QQQXXXXXX"]
(0.01 secs, 3,135,792 bytes)
Prelude Data.List> nub $ permutations "XXXXXXQQQ"
["XXXXXXQQQ","QXXXXXXQQ","XQXXXXXQQ","XXQXXXXQQ","XXXQXXXQQ","XXXXQXXQQ","XXXXXQXQQ","QQXXXXXXQ","QXQXXXXXQ","QXXQXXXXQ","QXXXQXXXQ","QXXXXQXXQ","QXXXXXQXQ","XQQXXXXXQ","XQXQXXXXQ","XQXXQXXXQ","XQXXXQXXQ","XQXXXXQXQ","XXQQXXXXQ","XXQXQXXXQ","XXQXXQXXQ","XXQXXXQXQ","XXXQQXXXQ","XXXQXQXXQ","XXXQXXQXQ","XXXXQQXXQ","XXXXQXQXQ","XXXXXQQXQ","QQQXXXXXX","QQXQXXXXX","QQXXQXXXX","QQXXXQXXX","QQXXXXQXX","QQXXXXXQX","QXQQXXXXX","XQQQXXXXX","XQQXQXXXX","XQQXXQXXX","XQQXXXQXX","XQQXXXXQX","QXQXQXXXX","QXQXXQXXX","QXQXXXQXX","QXQXXXXQX","QXXQQXXXX","XQXQQXXXX","XXQQQXXXX","XXQQXQXXX","XXQQXXQXX","XXQQXXXQX","XQXQXQXXX","XQXQXXQXX","XQXQXXXQX","QXXQXQXXX","QXXQXXQXX","QXXQXXXQX","QXXXQQXXX","XQXXQQXXX","XXQXQQXXX","XXXQQQXXX","XXXQQXQXX","XXXQQXXQX","XXQXQXQXX","XXQXQXXQX","XQXXQXQXX","XQXXQXXQX","QXXXQXQXX","QXXXQXXQX","QXXXXQQXX","XQXXXQQXX","XXQXXQQXX","XXXQXQQXX","XXXXQQQXX","XXXXQQXQX","XXXQXQXQX","XQXXXQXQX","QXXXXQXQX","XXQXXQXQX","QXXXXXQQX","XQXXXXQQX","XXQXXXQQX","XXXQXXQQX","XXXXQXQQX","XXXXXQQQX"]
(0.71 secs, 161,726,128 bytes)

Random numbers without duplicates

Simulating a lottery of 6 numbers chosen from 40, I want to create a list of numbers in Haskell using the system random generator but eliminate duplicates, which often arise.
If I have the following:
import Control.Monad (forM)
import System.Random
main :: IO ()
main = do
    rs <- forM [1..6] $ \_x -> randomRIO (1, 40) :: (IO Int)
    print rs
this is halfway. But how do I filter out duplicates? It seems to me I need a while loop of some sort to construct a list filtering elements that are already in the list until the list is the required size. If I can generate an infinite list of random numbers and filter it inside the IO monad I am sure that would work, but I do not know how to approach this. It seems while loops are generally deprecated in Haskell, so I am uncertain of the true Haskeller's way here. Is this a legitimate use case for a while loop, and if so, how does one do that?

The function you are looking for is nub from Data.List, which filters out duplicates.
import Data.List
import System.Random
main = do
    g <- newStdGen
    print . take 6 . nub $ (randomRs (1,40) g :: [Int])

If you don't mind using a library, then install the random-shuffle package and use it like this:
import System.Random.Shuffle
import Control.Monad.Random
main1 = do
perm <- evalRandIO $ shuffleM [1..10]
print perm
If you want to see how to implement a naive Fisher-Yates shuffle using lists in Haskell, have a look at this code:
shuffle2 xs g = go [] g (length xs) xs
  where
    go perm g n avail
      | n == 0    = (perm,g)
      | otherwise = let (i, g') = randomR (0,n-1) g
                        a = avail !! i
                        -- can also use splitAt to define avail':
                        avail' = take i avail ++ drop (i+1) avail
                    in go (a:perm) g' (n-1) avail'
main = do
    perm <- evalRandIO $ liftRand $ shuffle2 [1..10]
    print perm
The parameters to the go helper function are:
perm - the constructed permutation so far
g - the current generator value
n - the length of the available items
avail - the available items - i.e. items not yet selected to be part of the permutation
go simply adds a random element from avail to the permutation being constructed and recursively calls itself with the new avail list and new generator.
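To draw only k random elements from xs, stop after k picks. Note that n also bounds the random index, so it has to stay equal to the length of avail; starting go at k instead of length xs would only ever pick among the first k items. Count the picks with a separate argument instead, so that go stops when the count reaches 0 and otherwise recurses with the count and n each reduced by one:
shuffle2 xs k g = go [] g k (length xs) xs
...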
You could also use a temporary array (in the ST or IO monad) to implement a Fisher-Yates type algorithm. The shuffleM function in random-shuffle uses yet another, completely different approach which you might find interesting.
Update: Here is an example of using an ST-array in a F-Y style algorithm:
import Control.Monad.Random
import Data.Array.ST
import Control.Monad
import Control.Monad.ST (runST, ST)
shuffle3 :: RandomGen g => Int -> g -> ([Int], g)
shuffle3 n g0 = runST $ do
    arr <- newListArray (1,n) [1..n] :: ST s (STUArray s Int Int)
    let step g i = do -- pick j from i..n; picking from 1..n would bias the result
                      let (j,g') = randomR (i,n) g
                      -- swap i and j
                      a <- readArray arr i
                      b <- readArray arr j
                      writeArray arr j a
                      writeArray arr i b
                      return g'
    g' <- foldM step g0 [1..n]
    perm <- getElems arr
    return (perm, g')
main = do
    perm <- evalRandIO $ liftRand $ shuffle3 20
    print perm

I've used the Fisher Yates Shuffle in C++ with a decent random number generator to great success. This approach is very efficient if you are willing to allocate an array for holding numbers 1 to 40.

Going the strict IO way requires breaking nub down and moving its condition into the tail recursion.
import System.Random
randsf :: (Eq a, Num a, Random a) => [a] -> IO [a]
randsf rs
    | length rs >= 6 = return rs
    | otherwise = do
        r <- randomRIO (1,40)
        if elem r rs
            then randsf rs
            else randsf (r:rs)
main = do
    rs <- randsf [] :: IO [Int]
    print rs
If you know what you are doing, unsafeInterleaveIO from System.IO.Unsafe can be handy, allowing you to generate lazy lists from IO. Functions like getContents work this way.
import Control.Monad
import System.Random
import System.IO.Unsafe
import Data.List
rands :: (Eq a, Num a, Random a) => IO [a]
rands = do
    r <- randomRIO (1,40)
    unsafeInterleaveIO $ liftM (r:) rands
main = do
    rs <- rands :: IO [Int]
    print . take 6 $ nub rs

You commented:
The goal is to learn how to build a list monadically using filtering. It's a raw newbie question
Maybe you should change the question title then! Anyway, this is quite a common task. I usually define a combinator with a general monadic type that does what I want, give it a descriptive name (which I didn't quite succeed in here :-) and then use it, like below:
import Control.Monad
import System.Random
-- | 'myUntil': accumulate a list with unique action results until the list
-- satisfies a test
myUntil :: (Monad m, Eq a) => ([a] -> Bool) -> m a -> m [a]
myUntil test action = myUntil' test [] action where
    myUntil' test resultSoFar action = do
        if test resultSoFar then
            return resultSoFar
        else do
            x <- action
            let newResults = if x `elem` resultSoFar then resultSoFar
                             else resultSoFar ++ [x] -- x:resultSoFar
            myUntil' test newResults action
main :: IO ()
main = do
    let enough xs = length xs == 6
        drawNumber = randomRIO (1, 40::Int)
    numbers <- myUntil enough drawNumber
    print numbers
NB: this is not the optimal way to get your 6 distinct numbers, but it is meant as an example of how, in general, to build a list monadically using a filter that works on the entire list.
It is in essence the same as Vektorweg's longest answer, but uses a combinator with a much more general type (which is the way I like to do it, and which may be more useful for you, given your comment at the top of this answer).
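Because myUntil only needs a Monad, it can be exercised without any randomness. The sketch below is an illustration under assumptions, not part of the answer: it restates myUntil in a slightly shorter form, and scriptedDraws is a hypothetical stand-in for randomRIO that replays a fixed, non-empty list of draws.

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)

-- Same behaviour as the combinator above, written with an inner loop.
myUntil :: (Monad m, Eq a) => ([a] -> Bool) -> m a -> m [a]
myUntil test action = go []
  where
    go acc
      | test acc  = return acc
      | otherwise = do
          x <- action
          go (if x `elem` acc then acc else acc ++ [x])

-- Hypothetical stand-in for randomRIO: replays a fixed list of draws, cyclically.
-- Assumes the list is non-empty.
scriptedDraws :: [a] -> IO (IO a)
scriptedDraws xs = do
  ref <- newIORef (cycle xs)
  return $ do
    ys <- readIORef ref
    case ys of
      (y : rest) -> writeIORef ref rest >> return y
      []         -> error "scriptedDraws: empty script"
```

With the draws 5, 5, 7, 5, 9, 1 and the test ((== 3) . length), myUntil skips the repeated 5s and returns [5,7,9].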

How can this haskell rolling sum implementation be improved?

How can I improve the following rolling sum implementation?
import Control.Monad.State
type Buffer = State BufferState (Maybe Double)
type BufferState = ( [Double] , Int, Int )
-- circular buffer
buff :: Double -> Buffer
buff newVal = do
    ( list, ptr, len) <- get
    -- if the list is not full yet just accumulate the new value
    if length list < len
        then do
            put ( newVal : list , ptr, len)
            return Nothing
        else do
            let nptr = (ptr - 1) `mod` len
                (as,(v:bs)) = splitAt ptr list
                nlist = as ++ (newVal : bs)
            put (nlist, nptr, len)
            return $ Just v
-- create initial state for circular buffer
initBuff l = ( [] , l-1 , l)
-- use the circular buffer to calculate a rolling sum
rollSum :: Double -> State (Double,BufferState) (Maybe Double)
rollSum newVal = do
    (acc,bState) <- get
    let (lv , bState' ) = runState (buff newVal) bState
        acc' = acc + newVal
    -- subtract the old value if the circular buffer is full
    case lv of
        Just x -> put ( acc' - x , bState') >> (return $ Just (acc' - x))
        Nothing -> put ( acc' , bState') >> return Nothing
test :: (Double,BufferState) -> [Double] -> [Maybe Double] -> [Maybe Double]
test state [] acc = acc
test state (x:xs) acc =
    let (a,s) = runState (rollSum x) state
    in test s xs (a:acc)
main :: IO()
main = print $ test (0,initBuff 3) [1,1,1,2,2,0] []
Buffer uses the State monad to implement a circular buffer. rollSum uses the State monad again to keep track of the rolling sum value and the state of the circular buffer.
How could I make this more elegant?
I'd like to implement other functions like rolling average or a difference, what could I do to make this easy?
Thanks!
EDIT
I forgot to mention I am using a circular buffer as I intend to use this code on-line and process updates as they arrive - hence the need to record state. Something like
newRollingSum = update rollingSum newValue

I haven't managed to decipher all of your code, but here is the plan I would take for solving this problem. First, an English description of the plan:
1. We need windows into the list of length n starting at each index.
   a. Make windows of arbitrary length.
   b. Truncate long windows to length n.
   c. Drop the last n-1 of these, which will be too short.
2. For each window, add up the entries.
This was the first idea I had; for windows of length three it's an okay approach because step 2 is cheap on such a short list. For longer windows, you may want an alternate approach, which I will discuss below; but this approach has the benefit that it generalizes smoothly to functions other than sum. The code might look like this:
import Data.List
rollingSums n xs
    = map sum -- add up the entries
    . zipWith (flip const) (drop (n-1) xs) -- drop the last n-1
    . map (take n) -- truncate long windows
    . tails -- make arbitrarily long windows
    $ xs
If you're familiar with the "equational reasoning" approach to optimization, you might spot a first place we can improve the performance of this function: by swapping the first map and zipWith, we can produce a function with the same behavior but with a map f . map g subterm, which can be replaced by map (f . g) to get slightly less allocation.
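Spelled out, that rewrite might look like the following sketch (rollingSumsFused is a hypothetical name). zipWith (flip const) only trims the list of tails, so it commutes with map (take n), and the two maps can then be merged into one.

```haskell
import Data.List (tails)

rollingSumsFused :: Num a => Int -> [a] -> [a]
rollingSumsFused n xs
    = map (sum . take n)                     -- truncate, then add up, in one pass
    . zipWith (flip const) (drop (n - 1) xs) -- keep only the full-length windows
    . tails                                  -- make arbitrarily long windows
    $ xs
```

It should agree with rollingSums on every input; for example, rollingSumsFused 3 [10^n | n <- [0..5]] is [111,1110,11100,111000].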
Unfortunately, for large n, this adds n numbers together in the inner loop; we would prefer to simply add the value at the "front" of the window and subtract the one at the "back". So we need to get trickier. Here's a new idea: we'll traverse the list twice in parallel, n positions apart. Then we'll use a simple function for getting the rolling sum (of unbounded window length) of prefixes of a list, namely, scanl (+), to convert this traversal into the actual sums we're interested in.
rollingSumsEfficient n xs = scanl (+) firstSum deltas where
    firstSum = sum (take n xs)
    deltas = zipWith (-) (drop n xs) xs -- front - back
There's one twist, which is that scanl never returns an empty list. So if it's important that you be able to handle short lists, you'll want another equation that checks for these. Don't use length, as that forces the entire input list into memory before starting the computation -- a potentially lethal performance mistake. Instead add a line like this above the previous definition:
rollingSumsEfficient n xs | null (drop (n-1) xs) = []
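Putting the two equations together gives the following sketch, which assumes n >= 1. The guard inspects at most n cells of the list, so the function stays lazy and still works on infinite input.

```haskell
rollingSumsEfficient :: Num a => Int -> [a] -> [a]
rollingSumsEfficient n xs
  | null (drop (n - 1) xs) = []            -- fewer than n elements: no window
  | otherwise              = scanl (+) firstSum deltas
  where
    firstSum = sum (take n xs)
    deltas   = zipWith (-) (drop n xs) xs  -- front - back
```

For example, rollingSumsEfficient 3 [1,2] is [] and rollingSumsEfficient 3 [1..5] is [6,9,12].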
We can try these two out in ghci. You'll notice that they do not quite have the same behavior as yours:
*Main> rollingSums 3 [10^n | n <- [0..5]]
[111,1110,11100,111000]
*Main> rollingSumsEfficient 3 [10^n | n <- [0..5]]
[111,1110,11100,111000]
On the other hand, the implementations are considerably more concise and are fully lazy in the sense that they work on infinite lists:
*Main> take 5 . rollingSums 10 $ [1..]
[55,65,75,85,95]
*Main> take 5 . rollingSumsEfficient 10 $ [1..]
[55,65,75,85,95]

An efficient implementation of a rolling sum in Haskell:
rollingSums :: Num a => Int -> [a] -> Maybe [a]
rollingSums n xs | n <= 0 = Nothing
                 | otherwise = Just $ if length as == n then go (sum as) xs bs else []
  where
    (as, bs) = splitAt n xs
    go s xs [] = [s]
    go s xs (y:ys) = s : go (s + y - head xs) (tail xs) ys
This assumes that sum((i+1)..(i+1+n)) = sum(i..(i+n)) - arr[i] + arr[i+n+1].
