Building up a list in Haskell - haskell

What I am wanting to do is create a list of random integers, with no duplicates. As a first step, I have a function which makes a list of n random samples. How does one write this in a more Haskell idiomatic way, where an empty list does not need to be passed in to start the list off? I am sure I am missing something basic and fundamental.
-- make a list of random integers.
-- takes a size, and an empty list.
-- returns a list of that length of random numbers.
f :: Int -> [Int] -> IO [Int]
f l xs | length xs >= l = return (xs)
f l xs = do
r <- randomRIO (1, 40) :: IO Int
f l $ r : x
Usage:
*Main> f 6 []
[10,27,33,35,31,28]
Ultimately this function will have filtering to check for duplicate insertions, but that is a separate question. Although this may look like homework, it is not, but part of my own attempt to come to grips with the State monad as used for random number generation, and finding I am stuck at a much earlier spot.

Well, you can operate on the output of the recursive call:
f :: Int -> IO [Int]
f 0 = return []
f n = do
r <- randomRIO (1, 40)
xs <- f (n-1)
return $ r : xs
Note however that it's important the the operation you perform on the result is fast. In this case r : xs is constant time. However if you replace the last line with (say):
return $ xs ++ [r]
this would change the complexity of the function from linear to quadratic because every ++ call will have to scan all the sequence of previously generated numbers before appending the new one.
However you could simply do:
f n = sequence $ replicate n (randomRIO (1, 40))
replicate creates a [IO Int] list of length n made of randomRIO actions and sequence takes an [IO a] and turns it into an IO [a] by executing all the actions in order and collecting the results.
Even simpler, you could use replicateM which is already the function you want:
import Control.Monad(replicateM)
f n = replicateM n (randomRIO (1, 40))
or in point-free style:
f :: Int -> IO [Int]
f = flip replicateM $ randomRIO (1, 40)

This uses a Set to keep track of numbers already generated:
import System.Random
import qualified Data.Set as Set
generateUniqueRandoms :: (Int, Int) -> Int -> IO [Int]
generateUniqueRandoms range#(low, high) n =
let maxN = min (high - low) n
in
go maxN Set.empty
where
go 0 _ = return []
go n s = do
r <- getUniqueRandom s
xs <- go (n-1) (Set.insert r s)
return $ r : xs
getUniqueRandom s = do
r <- randomRIO range
if (Set.member r s) then getUniqueRandom s
else return r
Here is some sample output:
Main> generateUniqueRandoms (1, 40) 23
[29,22,2,17,5,8,24,27,10,16,6,3,14,37,25,34,30,28,7,31,15,20,36]
Main> generateUniqueRandoms (1, 40) 1000
[33,35,24,16,13,1,26,7,14,11,15,2,4,30,28,6,32,25,38,22,17,12,20,5,18,40,36,39,27,9,37,31,21,29,8,34,10,23,3]
Main> generateUniqueRandoms (1, 40) 0
[]
However, it is worth noting that if n is close to the width of the range, it'd be much more efficient to shuffle a list of all numbers in the range and take the first n of that.

Related

Backtracking with state

The list monad provides an excellent abstraction for backtracking in
search problems. However, the problem I am facing now is one which
involves state in addition to backtracking. (It also involves
constraints related to previous choices made in the search path, but I
will attack that issue later.)
The following simplified example illustrates the problematics. The
function sumTo is given a nonnegative integer and a list with pairs
of integers. The first element in each pair is a positive integer, the
second element is the number of such integers available. The search
problem is to express the first argument using the integers in the
list, with count constraints. For example, here the integer 8 is
represented in different ways as sums of five 1s, three 2s and two
4s with the contraint that all numbers making up the sum have to be
even (so the 1s can not be used).
λ> sumTo 8 [(1,5), (4,2), (2,3)]
[[4,4],[4,2,2],[2,2,4],[2,4,2]]
The following is my current recursive solution to the problem.
sumTo :: Int -> [(Int, Int)] -> [[Int]]
sumTo = go []
where
go :: [(Int, Int)] -> Int -> [(Int, Int)] -> [[Int]]
go _ 0 _ = [[]] -- base case: success
go _ _ [] = [] -- base case: out of options, failure
-- recursion step: use the first option if it has counts left and
-- is suitable; append to this cases where current option is not
-- used at this point
go prevOpts n (opt#(val,cnt):opts) =
(if cnt > 0 && val <= n && even val
then map (val:) $ go [] (n - val) $ (val,cnt-1):(prevOpts ++ opts)
else [])
++ go (opt:prevOpts) n opts
While the function seems to work ok, it is much more complicated
than one without state, employing the list monad.
sumToN :: Int -> [Int] -> [[Int]]
sumToN 0 _ = [[]]
sumToN n opts = do
val <- opts
guard $ val <= n
guard $ even val
map (val:) $ sumToN (n - val) opts
Not having constraints, this one gives one additional solution.
λ> sumToN 8 [1, 4, 2]
[[4,4],[4,2,2],[2,4,2],[2,2,4],[2,2,2,2]]
Now I am wondering if some higher order abstraction, such as
StateT or something similar, could be utilized to simplify the case
of backtracking with this kind of state constraints.
There are two versions below, the first that just uses lists, and the second with StateT.
import Control.Applicative
import Control.Monad.State
The list type is the type of nondeterministic computations.
Given a set of elements (given in compact form as a list of (element, nb_copies)), we can pick any one, and return it together with the updated set. The result is a pair (Int, [(Int, Int)]). As a regular function, pick returns all possible results of that action.
Internally, we can also follow the definition with an "imperative" point of view. If the list is empty, there is nothing to pick (the empty list is the failing computation). Otherwise, there is at least one element x (implicitly, i > 0). Then we either pick one x (pickx), or we pick one element from the rest (pickxs), being careful to put x back at the end.
pick :: [(Int, Int)] -> [(Int, [(Int, Int)])]
pick [] = []
pick ((x, i) : xs) = pickx ++ pickxs
where
pickx = if i == 1 then [ (x, xs) ] else [ (x, (x, i-1) : xs) ]
pickxs = do
(x', xs') <- pick xs
return (x', (x, i) : xs')
Then sumTo is defined as follows: if n = 0 then the only solution is the empty sum ([]) and we return it. Otherwise, we pick one element i from the set, check its validity, and recursively look for a solution for n-i with the updated set.
sumTo :: Int -> [(Int, Int)] -> [[Int]]
sumTo = go
where
go 0 _ = return []
go n xs = do
(i, xs') <- pick xs
guard $ i <= n
guard $ even i
s' <- go (n-i) xs'
return (i : s')
Now threading the set around can be tedious. StateT transforms a type of computation to be stateful. [] is nondeterministic computation. StateT s [] is stateful nondeterministic computation, with state type s. Here the state will be the set of remaining elements.
Interestingly, pick can directly be interpreted as such a stateful computation. The intuition is that executing pickState removes an element from the state, which updates the state, and returns that element. pickState automatically fails if there are no more elements.
pickState :: StateT [(Int, Int)] [] Int
pickState = StateT pick
Then we repeatedly pick elements until we reach 0.
sumToState :: Int -> StateT [(Int, Int)] [] [Int]
sumToState = go
where
go 0 = return []
go n = do
i <- pickState
guard $ i <= n
guard $ even i
s' <- go (n-i)
return (i : s')
main = do
let n = 8
xs = [(1, 5), (4, 2), (2, 3)]
print $ sumTo n xs
print $ evalStateT (sumToState n) xs
(full source)
It's not much work to add the StateT monad transformer to your clean solution. You just need to add a layer of wrapping and unwrapping to lift the values into the StateT type, and then take them back out using evalStateT.
Your code would also benefit from internally using a more specialized type for the opts than [(Int, Int)]. MultiSet would be a good choice since it automatically manages occurrences.
Here's a tested example of what it could look like:
import Control.Monad.State (StateT, evalStateT, get, modify, lift, guard)
import Data.MultiSet (MultiSet, fromOccurList, distinctElems, delete)
sumToN :: Int -> [(Int, Int)] -> [[Int]]
sumToN nStart optsStart =
evalStateT (go nStart) (fromOccurList optsStart)
where
go :: Int -> StateT (MultiSet Int) [] [Int]
go 0 = return []
go n = do
val <- lift . distinctElems =<< get
guard (val <= n && even val)
modify (delete val)
(val:) <$> go (n - val)
λ> sumToN 8 [(1,5), (4,2), (2,3)]
[[2,2,4],[2,4,2],[4,2,2],[4,4]]
And actually, the StateT isn't benefiting us very much here. You could refactor it to take the MultiSet Int as a parameter and it would work just as well.
import Control.Monad (guard)
import Data.MultiSet (fromOccurList, distinctElems, delete)
sumToN :: Int -> [(Int, Int)] -> [[Int]]
sumToN nStart optsStart =
go nStart (fromOccurList optsStart)
where
go 0 _ = return []
go n opts = do
val <- distinctElems opts
guard (val <= n && even val)
(val:) <$> go (n - val) (delete val opts)

Computing Moving Average in Haskell

I'm working on learning Haskell, so I tried to implement a moving average function. Here is my code:
mAverage :: Int-> [Int] -> [Float]
mAverage x a = [fromIntegral k / fromIntegral x | k <- rawAverage]
where
rawAverage = mAverage' x a a
-- First list contains original values; second list contains moving average computations
mAverage' :: Int -> [Int] -> [Int] -> [Int]
mAverage' 1 a b = b
mAverage' x a b = mAverage' (x - 1) a' b'
where
a' = init a
b' = zipWith (+) a' (tail b)
where the user calls mAverage with a length for each average and the list of values (e.g. mAverage 4 [1,2..100]).
However, when I run the code on the input mAverage 4 [1,2..100000], I get that it takes 3.6 seconds in ghci (using :set +s) and uses a gigabyte of memory. This seems very inefficient to me, as the equivalent function takes a fraction of a second in Python. Is there some way that I could make my code more efficient?
If you want to learn something new you can take a look at this nice solution for Moving Average problem. It is written by one of my students so I won't claim authorship. I really like it because it's very short. The only problem here is average function. Such functions are known to be bad. Instead you can use Beautiful folds by Gabriel Gonzalez. And yes, this function takes O(k) time (where k is size of window) for calculating average of window (I find it better because you can face floating point errors if you try to add only new element to window and subtract last). Oh, it also uses State monad :)
{-# LANGUAGE UnicodeSyntax #-}
module MovingAverage where
import Control.Monad (forM)
import Control.Monad.State (evalState, gets, modify)
moving :: Fractional a ⇒ Int → [a] → [a]
moving n _ | n <= 0 = error "non-positive argument"
moving n xs = evalState (forM xs $ \x → modify ((x:) . take (n-1)) >> gets average) []
where
average xs = sum xs / fromIntegral n
Here is a straightforward list-based solution which is idiomatic and fast enough, though requires more memory.
import Data.List (tails)
mavg :: Fractional b => Int -> [b] -> [b]
mavg k lst = take (length lst-k) $ map average $ tails lst
where average = (/ fromIntegral k) . sum . take k
This solution allows to use any function instead of average in a moving window.
The following solution is less universal but it is constant in space and seems to be the fastest one.
import Data.List (scanl')
mavg :: Fractional b => Int -> [b] -> [b]
mavg k lst = map (/ fromIntegral k) $ scanl' (+) (sum h) $ zipWith (-) t lst
where (h, t) = splitAt k lst
Finally, the solution which uses a kind of Okasaki's persistent functional queue, to keep the moving window. It does make sense when dealing with streaming data, like conduits or pipes.
mavg k lst = map average $ scanl' enq ([], take k lst) $ drop k lst
where
average (l,r) = (sum l + sum r) / fromIntegral k
enq (l, []) x = enq ([], reverse l) x
enq (l, (_:r)) x = (x:l, r)
And as it was mentioned in comments to the original post, do not use ghci for profiling. For example, you won't be able to see any benefits of scanl' in ghci.
Here's a solution for you.
The idea is to scan two lists, one where the averaging window starts, and another where it ends. Getting a tail end of a list costs as much as scanning the part we're skipping, and we're not copying anything. (If the windows size was usually quite large, we could calculate the remaining_data along with counting the sum initial_data, in one go.)
We generate a list of partial sums as described in my comment, then divide them by the windows width to get averages.
While slidingAverage calculates averages for biased position (window width to the right), centeredSlidingAverage calculates centered averages, using half window width to the left and to the right.
import Data.List (splitAt, replicate)
slidingAverage :: Int -> [Int] -> [Double] -- window size, source list -> list of averages
slidingAverage w xs = map divide $ initial_sum : slidingSum initial_sum xs remaining_data
where
divide = (\n -> (fromIntegral n) / (fromIntegral w)) -- divides the sums by window size
initial_sum = sum initial_data
(initial_data, remaining_data) = splitAt w xs
centeredSlidingAverage :: Int -> [Int] -> [Double] -- window size, source list -> list of averages
centeredSlidingAverage w xs = slidingAverage w $ left_padding ++ xs ++ right_padding
where
left_padding = replicate half_width 0
right_padding = replicate (w - half_width) 0
half_width = (w `quot` 2) -- quot is integer division
slidingSum :: Int -> [Int] -> [Int] -> [Int] -- window_sum before_window after_window -> list of sums
slidingSum _ _ [] = []
slidingSum window_sum before_window after_window = new_sum : slidingSum new_sum new_before new_after
where
value_to_go = head before_window
new_before = tail before_window
value_to_come = head after_window
new_after = tail after_window
new_sum = window_sum - value_to_go + value_to_come
When I try length $ slidingAverage 10 [1..1000000], it takes less than a second on my MBP. Due to the laziness, centeredSlidingAverage takes about the same time.
One simple way of doing it that also uses O(n) complexity
movingAverage :: (Fractional a) => Int -> [a] -> [a]
movingAverage n _ | n <= 0 = error "non-positive argument"
movingAverage n xs = fmap average $ groupBy n xs
where average xs' = sum xs' / fromIntegral (length xs')
groupBy :: Int -> [a] -> [[a]]
groupBy _ [] = []
groupBy n xs = go [] xs
where
go _ [] = []
go l (x:xs') = (x:t) : go (x:l) xs'
where t = take (n-1) l
Another way is to use STUArray.
import Data.Array.Unboxed
import Data.Array.ST
import Data.STRef
import Control.Monad
import Control.Monad.ST
movingAverage :: [Double] -> IO [Double]
movingAverage vals = stToIO $ do
let end = length vals - 1
myArray <- newArray (1, end) 0 :: ST s (STArray s Int Double)
forM_ [1 .. end] $ \i -> do
let cval = vals !! i
let lval = vals !! (i-1)
writeArray myArray i ((cval + lval)/2)
getElems myArray

How to generate an infinite list of random numbers using randomRIO?

I'm trying to generate an infinite list of random numbers using randomRIO.
import System.Random
g :: IO [Integer]
g = do
n <- randomRIO (1,6) :: IO Integer
f <- g
return (n : f)
It compiles but hangs when run.
The problem with your code is that it never teminates. Whenever you execute g, it initially calcuates the first random number and stores it in n. Then it again recursively calls back to the same function g which goes on and on. To get an infinite list you can use the randomRs function:
g2 :: IO [Integer]
g2 = do
g <- newStdGen
return $ randomRs (1,6) g
And then you can produce any number of random numbers using it:
λ> g2 >>= \x -> return $ take 5 $ x
[5,4,6,5,6]
Well, you can't, for much the same reason why you can't lazily mapM over State (and then use the result state afterwards). randomRIO produces a single value and only hands on the monadic program flow after this value has been produced.
The way to produce infinite streams of random values is of course randomRs. You can easily couple it with getStdRandom to produce such streams right in IO:
import Control.Arrow
randomRsIO :: Random a => (a,a) -> IO [a]
randomRsIO range = getStdRandom $ split >>> first (randomRs range)
With that,
g :: IO [Integer]
g = randomRsIO (1,6)
You wanted an answer using randomRIO. I know its been a long time since the question was asked.
This is using liftM where you can generate a random number in range low l to high h and keep appending to a list num times. So we have a list of length num of random numbers.
import Control.Monad
import System.Random
gx :: Int -> Int -> Int -> IO [Int]
gx l h num = do
x <- randomRIO(l,h)
if num <=1 then return [x] else liftM (x:) (gx l h (num-1))

Random numbers without duplicates

Simulating a lottery of 6 numbers chosen from 40, I want to create a list of numbers in Haskell using the system random generator but eliminate duplicates, which often arise.
If I have the following:
import System.Random
main :: IO ()
main = do
rs <- forM [1..6] $ \_x -> randomRIO (1, 40) :: (IO Int)
print rs
this is halfway. But how do I filter out duplicates? It seems to me I need a while loop of some sort to construct a list filtering elements that are already in the list until the list is the required size. If I can generate an infinite list of random numbers and filter it inside the IO monad I am sure that would work, but I do not know how to approach this. It seems while loops are generally deprecated in Haskell, so I am uncertain of the true Haskeller's way here. Is this a legitimate use case for a while loop, and if so, how does one do that?
The function you are looking for is nub from Data.List, to filter dublicates.
import Data.List
import System.Random
main = do
g <- newStdGen
print . take 6 . nub $ (randomRs (1,40) g :: [Int])
If you don't mind using a library, then install the random-shuffle package and use it like this:
import System.Random.Shuffle
import Control.Monad.Random
main1 = do
perm <- evalRandIO $ shuffleM [1..10]
print perm
If you want to see how to implement a naive Fischer-Yates shuffle using lists in Haskell, have a look at this code:
shuffle2 xs g = go [] g (length xs) xs
where
go perm g n avail
| n == 0 = (perm,g)
| otherwise = let (i, g') = randomR (0,n-1) g
a = avail !! i
-- can also use splitAt to define avail':
avail' = take i avail ++ drop (i+1) avail
in go (a:perm) g' (n-1) avail'
main = do
perm <- evalRandIO $ liftRand $ shuffle2 [1..10]
print perm
The parameters to the go helper function are:
perm - the constructed permutation so far
g - the current generator value
n - the length of the available items
avail - the available items - i.e. items not yet selected to be part of the permutation
go simply adds a random element from avail to the permutation being constructed and recursively calls itself with the new avail list and new generator.
To only draw k random elements from xs, just start go at k instead of length xs:
shuffle2 xs k g = go [] g k xs
...
You could also use a temporary array (in the ST or IO monad) to implement a Fischer-Yates type algorithm. The shuffleM function in random-shuffle uses a yet completely different approach which you might find interesting.
Update: Here is an example of using an ST-array in a F-Y style algorithm:
import Control.Monad.Random
import Data.Array.ST
import Control.Monad
import Control.Monad.ST (runST, ST)
shuffle3 :: RandomGen g => Int -> g -> ([Int], g)
shuffle3 n g0 = runST $ do
arr <- newListArray (1,n) [1..n] :: ST s (STUArray s Int Int)
let step g i = do let (j,g') = randomR (1,n) g
-- swap i and j
a <- readArray arr i
b <- readArray arr j
writeArray arr j a
writeArray arr i b
return g'
g' <- foldM step g0 [1..n]
perm <- getElems arr
return (perm, g')
main = do
perm <- evalRandIO $ liftRand $ shuffle3 20
print perm
I've used the Fisher Yates Shuffle in C++ with a decent random number generator to great success. This approach is very efficient if you are willing to allocate an array for holding numbers 1 to 40.
Going the strict IO way requires to break down nub, bringing the condition into the tail recursion.
import System.Random
randsf :: (Eq a, Num a, Random a) => [a] -> IO [a]
randsf rs
| length rs > 6 = return rs
| otherwise = do
r <- randomRIO (1,40)
if elem r rs
then randsf rs
else randsf (r:rs)
main = do
rs <- randsf [] :: IO [Int]
print rs
If you know what you do unsafeInterleaveIO from System.IO.Unsafe can be handy, allowing you to generate lazy lists from IO. Functions like getContents work this way.
import Control.Monad
import System.Random
import System.IO.Unsafe
import Data.List
rands :: (Eq a, Num a, Random a) => IO [a]
rands = do
r <- randomRIO (1,40)
unsafeInterleaveIO $ liftM (r:) rands
main = do
rs <- rands :: IO [Int]
print . take 6 $ nub rs
You commented:
The goal is to learn how to build a list monadically using filtering. It's a raw newbie question
Maybe you should change the question title then! Anyways, this is quite a common task. I usually define a combinator with a general monadic type that does what I want, give it a descriptive name (which I didn't quite succeed in here :-) and then use it, like below
import Control.Monad
import System.Random
-- | 'myUntil': accumulate a list with unique action results until the list
-- satisfies a test
myUntil :: (Monad m, Eq a) => ([a] -> Bool) -> m a -> m [a]
myUntil test action = myUntil' test [] action where
myUntil' test resultSoFar action = do
if test resultSoFar then
return resultSoFar
else do
x <- action
let newResults = if x `elem` resultSoFar then resultSoFar
else resultSoFar ++ [x] -- x:resultSoFar
myUntil' test newResults action
main :: IO ()
main = do
let enough xs = length xs == 6
drawNumber = randomRIO (0, 40::Int)
numbers <- myUntil enough drawNumber
print numbers
NB: this is not the optimal way to get your 6 distinct numbers, but meant as an example how to, in general, build a list monadically using a filter that works on the entire list
It is in essence the same as Vektorweg's longest answer, but uses a combinator with a much more general type (which is the way I like to do it, which may be more useful for you, given your comment at the top of this answer)

Haskell - How do I get Random Points (Int,Int)

I'm trying to get a set of random points (x,y) for drawing graph nodes to a screen. I need one randomly generated point for each node name passed in.
I found this code on a SO page, and modified it slightly to work for me, but it doesn't really do what I need.
I need a list of random (as random as possible) (Int,Int).
Anyway, here is what I have so far, and of course, it gives the same values every time, so it isn't particularly random :)
rndPoints :: [String] -> [Point]
rndPoints [] = []
rndPoints xs = zip x y where
size = length xs
x = take size (tail (map fst $ scanl (\(r, gen) _ -> randomR (25::Int,1000::Int) gen) (random (mkStdGen 1)) $ repeat ()))
y = take size (tail (map fst $ scanl (\(r, gen) _ -> randomR (25::Int,775::Int) gen) (random (mkStdGen 1)) $ repeat ()))
Any help would be much appreciated.
First, let's clean up your code a bit. There is a plural version of randomR that delivers an infinite list of random values: randomRs. This simplifies things a bit:
rndPoints1 :: [String] -> [Point]
rndPoints1 [] = []
rndPoints1 xs = zip x y
where
size = length xs
x = take size $ randomRs (25, 1000) (mkStdGen 1)
y = take size $ randomRs (25, 775) (mkStdGen 1)
We can simplify that further, by using zip's property that it stops after the shorter list is exhausted:
rndPoints2 :: [a] -> [Point]
rndPoints2 xs = map snd $ zip xs $ zip x y
where
x = randomRs (25, 1000) (mkStdGen 1)
y = randomRs (25, 775) (mkStdGen 1)
Notice I've also generalized the type of incoming list to just [a]. Since the values are never used, they needn't be Strings!
Now, it gives the same value every time because it uses mkStdGen to create a pseudo-random generator from the same seed (1) each time. If you want it to be different each time, then you need to create a generator in IO which can be based on the radom state of the computer. Rather than put the whole computation in IO, it is cleaner to pass in a StdGen:
rndPoints3 :: StdGen -> [Point]
rndPoints3 sg = zip x y
where
(sg1, sg2) = split sg
x = randomRs (25, 1000) sg1
y = randomRs (25, 775) sg2
pointsForLabels :: [a] -> StdGen -> [(a, Point)]
pointsForLabels xs sg = zip xs $ rndPoints3 sg
example3 :: [a] -> IO [(a, Point)]
example3 xs = newStdGen >>= return . pointsForLabels xs
Here, newStdGen creates a new pseudo-random generator each time, but it is in IO. That is passed eventually to a pure (non-IO) function rndPoints3 that takes the generator, and returns an infinite list of random Points. Within that function, split is used to create two generators from it, and each is used to derive the random list of coordinates.
pointsForLables now separates out the logic of matching up a new random point for each label. I also changed it to return the more likely useful pairs of labels and Points.
Finally, example3 lives in IO, and creates the generator and passes it all into the otherwise pure code.
I ended up using MonadRandom for this. I think the code was a little clearer and easier for me to understand. You could adapt the following code to address the original question.
import Control.Applicative
import Control.Monad.Random
type Point = (Float, Float)
type Poly = [Point]
randomScalar :: (RandomGen g) => Rand g Float
randomScalar = getRandomR (-500, 500)
randomPoint :: (RandomGen g) => Rand g Point
randomPoint = (,) <$> randomScalar <*> randomScalar
randomPoly :: (RandomGen g) => Int -> Rand g Poly
randomPoly n = sequence (replicate n randomPoint)

Resources