Here is the list of lists: [[1,2,3],[1,2,3,4],[1,2,3]]
How can I increment each element of the second list by the length of the first list, and increment the third list by the length of the first list + second list? The first list should remain unchanged.
Intended output: [[1,2,3],[4,5,6,7],[8,9,10]]
Since the first list has length 3, the second list is generated by [1+3, 2+3, 3+3, 4+3].
Since the first list + second list combined have length 7, the third list is generated by [1+7, 2+7, 3+7].
Ideally it should work with any number of lists.
So far, I've had slight success using this:
scanl1 (\xs ys -> [y + length xs | y <- ys]) [[1,2,3],[1,2,3,4],[1,2,3]]
which outputs: [[1,2,3],[4,5,6,7],[5,6,7]]
scanl1 is a good idea, but it's not quite right, because you don't want your accumulator to be a list, but rather to be an integer. So you really want scanl, not scanl1. I'll leave it as an exercise for you to see how to adjust your solution - given that you managed to write something almost-right with scanl1, I don't think you'll find it too hard once you have the right function.
In the comments, jpmariner suggests mapAccumL :: (s -> a -> (s, b)) -> s -> [a] -> (s, [b]). That's perfectly typed for what we want to do, so let's see how it would look.
import Data.Traversable (mapAccumL)
addPreviousLengths :: [[Int]] -> [[Int]]
addPreviousLengths = snd . mapAccumL go 0
  where go n xs = (n + length xs, map (+ n) xs)
λ> addPreviousLengths [[1,2,3],[1,2,3,4],[1,2,3]]
[[1,2,3],[4,5,6,7],[8,9,10]]
mapAccumL really is the best tool for this job - there's not much unnecessary complexity involved in using it. But if you're trying to implement this from scratch, you might try the recursive approach Francis King suggested. I'd suggest a lazy algorithm instead of the tail-recursive algorithm, though:
incrLength :: [[Int]] -> [[Int]]
incrLength = go 0
  where go _ [] = []
        go amount (x:xs) =
          map (+ amount) x : go (amount + length x) xs
It works the same as the mapAccumL version. Note that both versions are lazy: they consume only as much of the input list as necessary. This is an advantage not shared by a tail-recursive approach.
λ> take 3 . incrLength $ repeat [1]
[[1],[2],[3]]
λ> take 3 . addPreviousLengths $ repeat [1]
[[1],[2],[3]]
There are many ways to solve this. A simple recursion is one approach:
lst :: [[Int]]
lst = [[1,2,3],[1,2,3,4],[1,2,3]]
incrLength :: [[Int]] -> Int -> [[Int]] -> [[Int]]
incrLength [] _ result = result
incrLength (x:xs) amount result =
  incrLength xs (amount + length x) (result ++ [map (+amount) x])
(Edit: it is more efficient to use (:) in this function. See @amalloy's comment below. The result then has to be reversed.
incrLength :: [[Int]] -> Int -> [[Int]] -> [[Int]]
incrLength [] _ result = reverse result
incrLength (x:xs) amount result =
  incrLength xs (amount + length x) (map (+amount) x : result)
End Edit)
Another approach is to use scanl. We use length to get the length of the inner lists, then accumulate using scanl.
map length lst -- [3,4,3]
scanl (+) 0 $ map length lst -- [0,3,7,10]
init $ scanl (+) 0 $ map length lst -- [0,3,7]
Then we zip the lst and the accumulated value together, and map one over the other.
incrLength' :: [[Int]] -> [[Int]]
incrLength' lst =
  [map (+ snd y) (fst y) | y <- zip lst addlst]
  where
    addlst = init $ scanl (+) 0 $ map length lst
main = do
  print $ incrLength lst 0 [] -- [[1,2,3],[4,5,6,7],[8,9,10]]
I'm working on learning Haskell, so I tried to implement a moving average function. Here is my code:
mAverage :: Int -> [Int] -> [Float]
mAverage x a = [fromIntegral k / fromIntegral x | k <- rawAverage]
  where
    rawAverage = mAverage' x a a
-- First list contains original values; second list contains moving average computations
mAverage' :: Int -> [Int] -> [Int] -> [Int]
mAverage' 1 a b = b
mAverage' x a b = mAverage' (x - 1) a' b'
  where
    a' = init a
    b' = zipWith (+) a' (tail b)
where the user calls mAverage with a length for each average and the list of values (e.g. mAverage 4 [1,2..100]).
However, when I run the code on the input mAverage 4 [1,2..100000], I get that it takes 3.6 seconds in ghci (using :set +s) and uses a gigabyte of memory. This seems very inefficient to me, as the equivalent function takes a fraction of a second in Python. Is there some way that I could make my code more efficient?
If you want to learn something new, take a look at this nice solution to the moving average problem. It was written by one of my students, so I won't claim authorship. I really like it because it's very short. The only problem here is the average function; such functions are known to be bad. Instead you can use Beautiful Folds by Gabriel Gonzalez. And yes, this function takes O(k) time (where k is the size of the window) to calculate the average of each window (I find that better, because you can run into floating-point errors if you only add the new element to the window and subtract the last one). Oh, it also uses the State monad :)
{-# LANGUAGE UnicodeSyntax #-}
module MovingAverage where
import Control.Monad (forM)
import Control.Monad.State (evalState, gets, modify)
moving :: Fractional a ⇒ Int → [a] → [a]
moving n _ | n <= 0 = error "non-positive argument"
moving n xs = evalState (forM xs $ \x → modify ((x:) . take (n-1)) >> gets average) []
  where
    average xs = sum xs / fromIntegral n
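The remark about average functions can be illustrated without any library. What follows is only a minimal sketch of the single-pass idea, not the Beautiful Folds API: the name meanOnePass is hypothetical, and it assumes Double inputs. It carries the running sum and the element count through one strict left fold, so the list is traversed once.

```haskell
import Data.List (foldl')

-- | Hypothetical helper: mean of a list in one pass, carrying the running
-- sum and the element count together. Returns Nothing for an empty list
-- instead of dividing by zero.
meanOnePass :: [Double] -> Maybe Double
meanOnePass xs
  | n == 0    = Nothing
  | otherwise = Just (s / fromIntegral n)
  where
    (s, n) = foldl' step (0, 0 :: Int) xs
    -- Force both components at every step so no chain of thunks builds up.
    step (acc, cnt) x =
      let acc' = acc + x
          cnt' = cnt + 1
      in acc' `seq` cnt' `seq` (acc', cnt')
```

For example, meanOnePass [1,2,3,4] evaluates to Just 2.5.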
Here is a straightforward list-based solution which is idiomatic and fast enough, though it requires more memory.
import Data.List (tails)
mavg :: Fractional b => Int -> [b] -> [b]
mavg k lst = take (length lst - k + 1) $ map average $ tails lst -- a list of length n has n - k + 1 full windows
  where average = (/ fromIntegral k) . sum . take k
This solution allows any function to be used in place of average over the moving window.
The following solution is less universal but it is constant in space and seems to be the fastest one.
import Data.List (scanl')
mavg :: Fractional b => Int -> [b] -> [b]
mavg k lst = map (/ fromIntegral k) $ scanl' (+) (sum h) $ zipWith (-) t lst
  where (h, t) = splitAt k lst
Finally, the solution which uses a kind of Okasaki's persistent functional queue, to keep the moving window. It does make sense when dealing with streaming data, like conduits or pipes.
mavg k lst = map average $ scanl' enq ([], take k lst) $ drop k lst
  where
    average (l,r) = (sum l + sum r) / fromIntegral k
    enq (l, []) x = enq ([], reverse l) x
    enq (l, (_:r)) x = (x:l, r)
And as it was mentioned in comments to the original post, do not use ghci for profiling. For example, you won't be able to see any benefits of scanl' in ghci.
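To act on the advice about ghci, one option is to put the function in a file, compile it with ghc -O2, and time the resulting executable. The sketch below repeats the scanl' version under the hypothetical name mavgScan and adds a hypothetical checksum helper whose only purpose is to force every average; the Double element type is an assumption made for the example.

```haskell
import Data.List (scanl')

-- Same technique as the scanl' version above, under a hypothetical name.
mavgScan :: Int -> [Double] -> [Double]
mavgScan k lst = map (/ fromIntegral k) $ scanl' (+) (sum h) $ zipWith (-) t lst
  where (h, t) = splitAt k lst

-- | Hypothetical helper: adds up all averages so that the whole result list
-- is actually evaluated when the value is printed.
checksum :: Int -> Int -> Double
checksum k n = sum (mavgScan k (map fromIntegral [1 .. n]))
```

Calling print (checksum 4 100000) from main in a compiled program gives a measurement that reflects optimized code rather than the interpreter.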
Here's a solution for you.
The idea is to scan two lists, one where the averaging window starts, and another where it ends. Getting the tail end of a list costs as much as scanning the part we're skipping, and we're not copying anything. (If the window size were usually quite large, we could compute remaining_data while summing initial_data, in one go.)
We generate a list of partial sums as described in my comment, then divide them by the window width to get averages.
While slidingAverage calculates averages for biased position (window width to the right), centeredSlidingAverage calculates centered averages, using half window width to the left and to the right.
import Data.List (splitAt, replicate)
slidingAverage :: Int -> [Int] -> [Double] -- window size, source list -> list of averages
slidingAverage w xs = map divide $ initial_sum : slidingSum initial_sum xs remaining_data
  where
    divide = (\n -> (fromIntegral n) / (fromIntegral w)) -- divides the sums by window size
    initial_sum = sum initial_data
    (initial_data, remaining_data) = splitAt w xs
centeredSlidingAverage :: Int -> [Int] -> [Double] -- window size, source list -> list of averages
centeredSlidingAverage w xs = slidingAverage w $ left_padding ++ xs ++ right_padding
  where
    left_padding = replicate half_width 0
    right_padding = replicate (w - half_width) 0
    half_width = (w `quot` 2) -- quot is integer division
slidingSum :: Int -> [Int] -> [Int] -> [Int] -- window_sum before_window after_window -> list of sums
slidingSum _ _ [] = []
slidingSum window_sum before_window after_window = new_sum : slidingSum new_sum new_before new_after
  where
    value_to_go = head before_window
    new_before = tail before_window
    value_to_come = head after_window
    new_after = tail after_window
    new_sum = window_sum - value_to_go + value_to_come
When I try length $ slidingAverage 10 [1..1000000], it takes less than a second on my MBP. Due to the laziness, centeredSlidingAverage takes about the same time.
One simple way of doing it, which makes a single pass over the list (O(n) for a fixed window size):
movingAverage :: (Fractional a) => Int -> [a] -> [a]
movingAverage n _ | n <= 0 = error "non-positive argument"
movingAverage n xs = fmap average $ groupBy n xs
  where average xs' = sum xs' / fromIntegral (length xs')
groupBy :: Int -> [a] -> [[a]]
groupBy _ [] = []
groupBy n xs = go [] xs
  where
    go _ [] = []
    go l (x:xs') = (x:t) : go (x:l) xs'
      where t = take (n-1) l
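To see which windows this helper builds, here is a small sketch with the same logic under the hypothetical name trailingWindows (renamed only to avoid confusion with Data.List.groupBy). Each window holds the current element followed by up to n-1 earlier ones, newest first, so the first few windows are shorter than n. The windowMeans wrapper and the Double element type are assumptions made for the example.

```haskell
-- | Hypothetical name; same logic as the groupBy helper above.
-- Each result window is the current element plus up to n-1 predecessors.
trailingWindows :: Int -> [a] -> [[a]]
trailingWindows n = go []
  where
    go _ [] = []
    go seen (x : rest) = (x : take (n - 1) seen) : go (x : seen) rest

-- | Mean of every trailing window; early windows are averaged over their
-- actual (shorter) length.
windowMeans :: Int -> [Double] -> [Double]
windowMeans n = map mean . trailingWindows n
  where mean w = sum w / fromIntegral (length w)
```

For example, trailingWindows 3 [1,2,3,4] gives [[1],[2,1],[3,2,1],[4,3,2]]. Note that the accumulator of earlier elements is never trimmed, so it keeps the whole prefix of the input reachable.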
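Another way is to use a mutable array in the ST monad (the code below uses a boxed STArray; STUArray is its unboxed counterpart). Note that this version averages each pair of adjacent values, that is, a window of two.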
import Data.Array.Unboxed
import Data.Array.ST
import Data.STRef
import Control.Monad
import Control.Monad.ST
movingAverage :: [Double] -> IO [Double]
movingAverage vals = stToIO $ do
  let end = length vals - 1
  myArray <- newArray (1, end) 0 :: ST s (STArray s Int Double)
  forM_ [1 .. end] $ \i -> do
    let cval = vals !! i
    let lval = vals !! (i-1)
    writeArray myArray i ((cval + lval)/2)
  getElems myArray
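A possible variation on the array approach, sketched under a few assumptions: it uses the unboxed STUArray through runSTUArray, so the result is pure rather than in IO, and it walks the list with zip3 instead of indexing with !!, which costs O(i) per lookup on a list. The name pairAverages is hypothetical; like the code above, it averages adjacent pairs only.

```haskell
import Control.Monad (forM_)
import Data.Array.ST (newArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, elems)

-- | Hypothetical helper: average of each pair of adjacent values,
-- written into an unboxed mutable array and then frozen.
pairAverages :: [Double] -> [Double]
pairAverages vals
  | n < 2     = []
  | otherwise = elems arr
  where
    n = length vals
    arr :: UArray Int Double
    arr = runSTUArray $ do
      out <- newArray (1, n - 1) 0
      -- Pair every element with its successor; i is the output slot.
      forM_ (zip3 [1 ..] vals (tail vals)) $ \(i, prev, cur) ->
        writeArray out i ((prev + cur) / 2)
      return out
```

For example, pairAverages [1,3,5] evaluates to [2.0,4.0].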
How can I improve the following rolling sum implementation?
import Control.Monad.State
type Buffer = State BufferState (Maybe Double)
type BufferState = ( [Double] , Int, Int )
-- circular buffer
buff :: Double -> Buffer
buff newVal = do
  ( list, ptr, len) <- get
  -- if the list is not full yet just accumulate the new value
  if length list < len
    then do
      put ( newVal : list , ptr, len)
      return Nothing
    else do
      let nptr = (ptr - 1) `mod` len
          (as,(v:bs)) = splitAt ptr list
          nlist = as ++ (newVal : bs)
      put (nlist, nptr, len)
      return $ Just v
-- create initial state for circular buffer
initBuff l = ( [] , l-1 , l)
-- use the circular buffer to calculate a rolling sum
rollSum :: Double -> State (Double,BufferState) (Maybe Double)
rollSum newVal = do
  (acc,bState) <- get
  let (lv , bState' ) = runState (buff newVal) bState
      acc' = acc + newVal
  -- subtract the old value if the circular buffer is full
  case lv of
    Just x -> put ( acc' - x , bState') >> (return $ Just (acc' - x))
    Nothing -> put ( acc' , bState') >> return Nothing
test :: (Double,BufferState) -> [Double] -> [Maybe Double] -> [Maybe Double]
test state [] acc = acc
test state (x:xs) acc =
  let (a,s) = runState (rollSum x) state
  in test s xs (a:acc)
main :: IO()
main = print $ test (0,initBuff 3) [1,1,1,2,2,0] []
Buffer uses the State monad to implement a circular buffer. rollSum uses the State monad again to keep track of the rolling sum value and the state of the circular buffer.
How could I make this more elegant?
I'd like to implement other functions like rolling average or a difference, what could I do to make this easy?
Thanks!
EDIT
I forgot to mention I am using a circular buffer as I intend to use this code on-line and process updates as they arrive - hence the need to record state. Something like
newRollingSum = update rollingSum newValue
I haven't managed to decipher all of your code, but here is the plan I would take for solving this problem. First, an English description of the plan:
1. We need windows into the list of length n starting at each index.
   a. Make windows of arbitrary length.
   b. Truncate long windows to length n.
   c. Drop the last n-1 of these, which will be too short.
2. For each window, add up the entries.
This was the first idea I had; for windows of length three it's an okay approach because step 2 is cheap on such a short list. For longer windows, you may want an alternate approach, which I will discuss below; but this approach has the benefit that it generalizes smoothly to functions other than sum. The code might look like this:
import Data.List
rollingSums n xs
  = map sum -- add up the entries
  . zipWith (flip const) (drop (n-1) xs) -- drop the last n-1
  . map (take n) -- truncate long windows
  . tails -- make arbitrarily long windows
  $ xs
If you're familiar with the "equational reasoning" approach to optimization, you might spot a first place we can improve the performance of this function: by swapping the first map and zipWith, we can produce a function with the same behavior but with a map f . map g subterm, which can be replaced by map (f . g) to get slightly less allocation.
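The rewrite described in this paragraph might look like the following sketch. The name rollingSumsFused is hypothetical. The zipWith step is moved to the outside, which leaves map sum . map (take n) adjacent so they can be merged into a single map.

```haskell
import Data.List (tails)

-- | Hypothetical name: rollingSums with zipWith and the first map swapped,
-- and the two maps fused into one.
rollingSumsFused :: Num a => Int -> [a] -> [a]
rollingSumsFused n xs
  = zipWith (flip const) (drop (n - 1) xs) -- drop the last n-1
  . map (sum . take n)                     -- truncate and add up in one pass
  . tails                                  -- make arbitrarily long windows
  $ xs
```

The swap is safe because zipWith (flip const) only decides how many elements to keep, so it does not matter whether the windows have been summed before or after it runs.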
Unfortunately, for large n, this adds n numbers together in the inner loop; we would prefer to simply add the value at the "front" of the window and subtract the one at the "back". So we need to get trickier. Here's a new idea: we'll traverse the list twice in parallel, n positions apart. Then we'll use a simple function for getting the rolling sum (of unbounded window length) of prefixes of a list, namely, scanl (+), to convert this traversal into the actual sums we're interested in.
rollingSumsEfficient n xs = scanl (+) firstSum deltas where
  firstSum = sum (take n xs)
  deltas = zipWith (-) (drop n xs) xs -- front - back
There's one twist, which is that scanl never returns an empty list. So if it's important that you be able to handle short lists, you'll want another equation that checks for these. Don't use length, as that forces the entire input list into memory before starting the computation -- a potentially lethal performance mistake. Instead add a line like this above the previous definition:
rollingSumsEfficient n xs | null (drop (n-1) xs) = []
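Put together, the guard and the scanl definition could read as follows. This is a sketch under the hypothetical name rollingSumsGuarded, and it assumes n is at least 1.

```haskell
-- | Hypothetical name: the scanl-based rolling sum with the short-list guard.
-- Assumes n >= 1.
rollingSumsGuarded :: Num a => Int -> [a] -> [a]
rollingSumsGuarded n xs
  | null (drop (n - 1) xs) = []          -- fewer than n elements: no full window
  | otherwise = scanl (+) firstSum deltas
  where
    firstSum = sum (take n xs)
    deltas = zipWith (-) (drop n xs) xs  -- front - back
```

The guard inspects at most n cells of the list, so the function still works on infinite input.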
We can try these two out in ghci. You'll notice that they do not quite have the same behavior as yours:
*Main> rollingSums 3 [10^n | n <- [0..5]]
[111,1110,11100,111000]
*Main> rollingSumsEfficient 3 [10^n | n <- [0..5]]
[111,1110,11100,111000]
On the other hand, the implementations are considerably more concise and are fully lazy in the sense that they work on infinite lists:
*Main> take 5 . rollingSums 10 $ [1..]
[55,65,75,85,95]
*Main> take 5 . rollingSumsEfficient 10 $ [1..]
[55,65,75,85,95]
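For the on-line use case mentioned in the question's edit, where values arrive one at a time, a pure update function over a small state record is one possible shape. The following is only a sketch: the Rolling type, emptyRolling, update and rollingSumsOnline are hypothetical names, and Data.Sequence from the containers package is swapped in for the hand-written circular buffer.

```haskell
import Data.List (mapAccumL)
import Data.Sequence (Seq, ViewL (..), viewl, (|>))
import qualified Data.Sequence as Seq

-- | Hypothetical state for an on-line rolling sum: window size, the values
-- currently in the window (oldest first), and their running total.
data Rolling = Rolling Int (Seq Double) Double

emptyRolling :: Int -> Rolling
emptyRolling n = Rolling n Seq.empty 0

-- | Feed one value. Returns the new state, plus the window sum once the
-- window holds n values.
update :: Rolling -> Double -> (Rolling, Maybe Double)
update (Rolling n w t) x = (Rolling n w2 t2, out)
  where
    w1 = w |> x
    t1 = t + x
    -- If the window grew past n, drop the oldest value and subtract it.
    (w2, t2) = case viewl w1 of
      old :< rest | Seq.length w1 > n -> (rest, t1 - old)
      _                               -> (w1, t1)
    out
      | Seq.length w2 == n = Just t2
      | otherwise          = Nothing

-- | Replays a whole list through 'update', mainly for testing.
rollingSumsOnline :: Int -> [Double] -> [Maybe Double]
rollingSumsOnline n = snd . mapAccumL update (emptyRolling n)
```

Unlike the code in the question, this sketch reports a sum as soon as the window is full, so rollingSumsOnline 3 [1,1,1,2,2,0] gives [Nothing,Nothing,Just 3.0,Just 4.0,Just 5.0,Just 4.0]. A rolling average or difference could reuse the same state and change only what out returns.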
An efficient implementation of a rolling sum in Haskell:
rollingSums :: Num a => Int -> [a] -> Maybe [a]
rollingSums n xs | n <= 0 = Nothing
                 | otherwise = Just $ if length as == n then go (sum as) xs bs else []
  where
    (as, bs) = splitAt n xs
    go s xs [] = [s]
    go s xs (y:ys) = s : go (s + y - head xs) (tail xs) ys
This assumes that sum(arr[i+1 .. i+n+1]) = sum(arr[i .. i+n]) - arr[i] + arr[i+n+1].
I want to generate a vector space from a basis pair, which looks something like:
genFromPair (e1, e2) = [x*e1 + y*e2 | x <- [0..], y <- [0..]]
When I examine the output though, it seems like I'm getting [0, e2, 2*e2,...] (i.e. x never gets above 0). Which sort of makes sense when I think about how I would write the code to do this list comprehension.
I wrote some code to take expanding "shells" from the origin (first the ints with norm 0, then with norm 1, then norm 2...) but this is kind of annoying and specific to Z^2 - I'd have to rewrite it for Z^3 or Z[i] etc. Is there a cleaner way of doing this?
The data-ordlist package has some functions which are extremely useful for working with sorted infinite lists. One of these is mergeAllBy, which combines an infinite list of infinite lists using some comparison function.
The idea is then to build an infinite list of lists such that y is fixed in each list, while x grows. As long as we can guarantee that each list is sorted, and that the heads of the lists are sorted, according to our ordering, we get a merged sorted list back.
Here's a quick example:
import Data.List.Ordered
import Data.Ord
genFromPair (e1, e2) = mergeAllBy (comparing norm) [[x.*e1 + y.*e2 | x <- [0..]] | y <- [0..]]
-- The rest just defines a simple vector type so we have something to play with
data Vec a = Vec a a
  deriving (Eq, Show)
instance Num a => Num (Vec a) where
  (Vec x1 y1) + (Vec x2 y2) = Vec (x1+x2) (y1+y2)
  -- ...
s .* (Vec x y) = Vec (s*x) (s*y)
norm (Vec x y) = sqrt (x^2 + y^2)
Trying this in GHCi we get the expected result:
*Main> take 5 $ genFromPair (Vec 0 1, Vec 1 0)
[Vec 0.0 0.0,Vec 0.0 1.0,Vec 1.0 0.0,Vec 1.0 1.0,Vec 0.0 2.0]
You could look at your space as a tree. At the root of the tree one picks the first element, and in its child one picks the second element.
Here's your tree defined using the ListTree package:
import Control.Monad.ListT
import Data.List.Class
import Data.List.Tree
import Prelude hiding (scanl)
infiniteTree :: ListT [] Integer
infiniteTree = repeatM [0..]
spacesTree :: ListT [] [Integer]
spacesTree = scanl (\xs x -> xs ++ [x]) [] infiniteTree
twoDimSpaceTree = genericTake 3 spacesTree
It's an infinite tree, but we could enumerate over it for example in DFS order:
ghci> take 10 (dfs twoDimSpaceTree)
[[],[0],[0,0],[0,1],[0,2],[0,3],[0,4],[0,5],[0,6],[0,7]]
The order you want, in tree-speak, is a variant of best-first-search for infinite trees, where one assumes that the children of tree nodes are sorted (you can't compare all the node's children as in normal best-first-search because there are infinitely many of those). Luckily, this variant is already implemented:
ghci> take 10 $ bestFirstSearchSortedChildrenOn sum $ genericTake 3 $ spacesTree
[[],[0],[0,0],[0,1],[1],[1,0],[1,1],[0,2],[2],[2,0]]
You can use any norm you like for your expanding shells, instead of sum above.
Using the diagonal snippet from CodeCatalog:
genFromPair (e1, e2) = diagonal [[x*e1 + y*e2 | x <- [0..]] | y <- [0..]]
diagonal :: [[a]] -> [a]
diagonal = concat . stripe
  where
    stripe [] = []
    stripe ([]:xss) = stripe xss
    stripe ((x:xs):xss) = [x] : zipCons xs (stripe xss)
    zipCons [] ys = ys
    zipCons xs [] = map (:[]) xs
    zipCons (x:xs) (y:ys) = (x:y) : zipCons xs ys
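To see the order in which diagonal visits an infinite grid, here is a small self-contained sketch. The diagonal function is copied from the snippet above; grid and firstSix are hypothetical names, and plain coordinate pairs stand in for the vectors x*e1 + y*e2.

```haskell
-- Copy of the diagonal snippet above, so the example is self-contained.
diagonal :: [[a]] -> [a]
diagonal = concat . stripe
  where
    stripe [] = []
    stripe ([] : xss) = stripe xss
    stripe ((x : xs) : xss) = [x] : zipCons xs (stripe xss)

    zipCons [] ys = ys
    zipCons xs [] = map (: []) xs
    zipCons (x : xs) (y : ys) = (x : y) : zipCons xs ys

-- | Hypothetical example grid: row y holds the pairs (0,y), (1,y), (2,y), ...
grid :: [[(Integer, Integer)]]
grid = [[(x, y) | x <- [0 ..]] | y <- [0 ..]]

-- | The first six pairs in diagonal order.
firstSix :: [(Integer, Integer)]
firstSix = take 6 (diagonal grid)
```

Here firstSix is [(0,0),(1,0),(0,1),(2,0),(1,1),(0,2)]: every pair with x + y = k is emitted before any pair with x + y = k + 1, so both coordinates grow.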
Piggybacking on hammar's reply: His approach seems fairly easy to extend to higher dimensions:
Prelude> import Data.List.Ordered
Prelude Data.List.Ordered> import Data.Ord
Prelude Data.List.Ordered Data.Ord> let norm (x,y,z) = sqrt (fromIntegral x^2+fromIntegral y^2+fromIntegral z^2)
Prelude Data.List.Ordered Data.Ord> let mergeByNorm = mergeAllBy (comparing norm)
Prelude Data.List.Ordered Data.Ord> let sorted = mergeByNorm (map mergeByNorm [[[(x,y,z)| x <- [0..]] | y <- [0..]] | z <- [0..]])
Prelude Data.List.Ordered Data.Ord> take 20 sorted
[(0,0,0),(1,0,0),(0,1,0),(0,0,1),(1,1,0),(1,0,1),(0,1,1),(1,1,1),(2,0,0),(0,2,0),(0,0,2),(2,1,0),(1,2,0),(2,0,1),(0,2,1),(1,0,2),(0,1,2),(2,1,1),(1,2,1),(1,1,2)]
Just starting out with Haskell! As an exercise, the current problem I'm trying to implement is as follows:
We have n squares, print all possible world configurations where :
(1) Each square could have a "P" (pit) or not (2^n possibilities).
(2) There can be at most one "W" (wumpus) in all n squares (n+1 possibilities).
Representing two squares as two strings, here is an output example for n=2. We have (2^n)·(n+1) = (2^2)·(2+1) = 12 configurations.
[[" W"," "],[" "," W"],[" "," "],
[" W","P"],[" ","PW"],[" ","P"],
["PW"," "],["P"," W"],["P"," "],
["PW","P"],["P","PW"],["P","P"]]
Condition (1) is easily implemented. Looking around, I've found a few ways to express it :
p 0 = [[]]
p n = [x:xs | x <- [" ","P"], xs <- p (n-1)]
or
p n = mapM (\x -> [" ","P"]) [1..n]
or
p n = replicateM n [" ","P"]
I cannot claim to understand the last two yet, but here they are for completeness.
Question : How can I add condition (2)? Can it be done with list comprehension?
My not-so-good-looking novice solution involved these functions:
insertw :: Int -> [String] -> [String]
insertw n xs
  | n < 0 = xs
  | n >= lgth = xs
  | otherwise = (take (n) xs) ++ [xs!!n++"W"] ++ (drop (n+1) xs)
  where lgth = length xs
duplicate :: Int -> [String] -> [[String]]
duplicate i squares
  | i > lgth = []
  | otherwise = (insertw i squares) : duplicate (i+1) squares
  where lgth = length squares
worlds :: Int -> [[String]]
worlds n = concat . map (duplicate 0) . p $ n
Condition 2 isn't an obvious candidate for a list comprehension, but the working code you have already written can be cleaned up.
The iteration from 0 to lgth in duplicate can be done with a map instead of explicit recursion:
duplicate squares = map (\i -> insertw i squares) [0 .. length squares]
duplicate no longer takes an index parameter, and concat . map is the same as concatMap:
worlds = concatMap duplicate . p
If you do both a drop and a take, then splitAt is often the better operation.
insertw n xs =
  case splitAt n xs of
    (as, []) -> as
    (as, b : bs) -> as ++ ((b ++ "W") : bs)
Note that we got rid of the length xs and xs !! n operations too.
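Assembled into one module, the cleaned-up pieces might look like this. It is a sketch that picks the replicateM definition of p from the question; the type signatures are added for clarity.

```haskell
import Control.Monad (replicateM)

-- | All pit configurations for n squares (condition 1).
p :: Int -> [[String]]
p n = replicateM n [" ", "P"]

-- | Append a "W" to square n; an index past the end leaves the row unchanged.
insertw :: Int -> [String] -> [String]
insertw n xs =
  case splitAt n xs of
    (as, [])     -> as
    (as, b : bs) -> as ++ ((b ++ "W") : bs)

-- | One copy per wumpus position, plus one copy with no wumpus (index = length).
duplicate :: [String] -> [[String]]
duplicate squares = map (\i -> insertw i squares) [0 .. length squares]

worlds :: Int -> [[String]]
worlds = concatMap duplicate . p
```

For n = 2 this yields the 12 configurations listed in the question, starting with [" W"," "], [" "," W"], [" "," "].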
As an exercise, another short duplicate function can be written by zipping over the inits and tails of the squares list.
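One possible answer to that exercise, as a sketch under the hypothetical name duplicateZip: zipping inits with tails yields every way of splitting the row into a prefix and a suffix, and the wumpus goes on the first square of the suffix. The final split has an empty suffix, which gives the configuration without a wumpus.

```haskell
import Data.List (inits, tails)

-- | Hypothetical name: duplicate written with inits and tails.
duplicateZip :: [String] -> [[String]]
duplicateZip squares =
  [ before ++ place after
  | (before, after) <- zip (inits squares) (tails squares)
  ]
  where
    place []       = []               -- empty suffix: no wumpus anywhere
    place (b : bs) = (b ++ "W") : bs  -- wumpus on the first square of the suffix
```

For example, duplicateZip [" ","P"] gives [[" W","P"],[" ","PW"],[" ","P"]].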
Seems obvious to me :). In list comprehensions, the later lists can depend on the values generated in the earlier ones. The second function generates your set by calling the first when it adds a wumpus.
p 0 = [[]]
p n = [[x,' ']:xs | x <- [' ','P'], xs <- p (n-1)]
pw 0 = [[]]
pw n = [[x,w]:xs | w <- [' ','W'], x <- [' ','P'], xs <- if w == 'W' then p (n-1) else pw (n-1)]
It isn't as clean as possible, but I always find list comprehensions bring an elegance to the problem :). Totally worth it.