Generating a List of random Integers with condition in Haskell - haskell

Here I am trying to generate a list of Integers where I want to add random numbers to the list while the condition where md<=1 is not satisfied.
I tried several times but with no luck.
fx :: Int -> Int -> IO [Int]
fx md s = do
x <- randomRIO (1,min md s)
if md<=1
then return [md]
else return md:(fx (md-x) s)

The error is at the last line:
else return md:(fx (md-x) s)
The result of fx is in the IO monad, and with that result you want to perform a pure operation of prepending md to it. This is exactly what liftM does, your last line should be
else liftM (md :) (fx (md-x) s)
Other small improvements:
You don't really need IO, you only need to generate random numbers within the function. For this you could use the MonadRandom type class (residing in the package with the same name). This will make the function more specific, type-safe and flexible.
It's redundant to generate a random number and only after then check if you need it - the comparison md <= 1 doesn't need x at all.
So an improved version might look like:
import Control.Monad
import Control.Monad.Random
fx :: (MonadRandom m) => Int -> Int -> m [Int]
fx md s | md <= 1 = return [md]
| otherwise = do
x <- getRandomR (1, min md s)
liftM (md :) (fx (md-x) s)
alternative without liftM
maybe this is easier to understand:
fx :: (MonadRandom m) => Int -> Int -> m [Int]
fx md s | md <= 1 = return [md]
| otherwise = do
x <- getRandomR (1, min md s)
xs <- fx (md-x) s
return (md : xs)

Related

Parallelize computation of mutable vector in ST

How can computations done in ST be made to run in parallel?
I have a vector which needs to be filled in by random access, hence the use of ST, and the computation runs correctly single-threaded, but have been unable to figure out how to use more than one core.
Random access is needed because of the meaning of the indices into the vector. There are n things and every possible way of choosing among n things has an entry in the vector, as in the choice function. Each of these choices corresponds to a binary number (conceptually, a packed [Bool]) and these Int values are the indices. If there are n things, then the size of the vector is 2^n. The natural way the algorithm runs is for every entry corresponding to "n choose 1" to be filled in, then every entry for "n choose 2," etc. The entries corresponding to "n choose k" depends on the entries corresponding to "n choose (k-1)." The integers for the different choices do not occur in numerical order, and that's why random access is needed.
Here's a pointless (but slow) computation that follows the same pattern. The example function shows how I tried to break the computation up so that the bulk of the work is done in a pure world (no ST monad). In the code below, bogus is where most of the work is done, with the intent of calling that in parallel, but only one core is ever used.
import qualified Data.Vector as Vb
import qualified Data.Vector.Mutable as Vm
import qualified Data.Vector.Generic.Mutable as Vg
import qualified Data.Vector.Generic as Gg
import Control.Monad.ST as ST ( ST, runST )
import Data.Foldable(forM_)
import Data.Char(digitToInt)
main :: IO ()
main = do
putStrLn $ show (example 9)
example :: Int -> Vb.Vector Int
example n = runST $ do
m <- Vg.new (2^n) :: ST s (Vm.STVector s Int)
Vg.unsafeWrite m 0 (1)
forM_ [1..n] $ \i -> do
p <- prev m n (i-1)
let newEntries = (choiceList n i) :: [Int]
forM_ newEntries $ \e -> do
let v = bogus p e
Vg.unsafeWrite m e v
Gg.unsafeFreeze m
choiceList :: Int -> Int -> [Int]
choiceList _ 0 = [0]
choiceList n 1 = [ 2^k | k <- [0..(n-1) ] ]
choiceList n k
| n == k = [2^n - 1]
| otherwise = (choiceList (n-1) k) ++ (map ((2^(n-1)) +) $ choiceList (n-1) (k-1))
prev :: Vm.STVector s Int -> Int -> Int -> ST s Integer
prev m n 0 = return 1
prev m n i = do
let chs = choiceList n i
v <- mapM (\k -> Vg.unsafeRead m k ) chs
let e = map (\k -> toInteger k ) v
return (sum e)
bogus :: Integer -> Int -> Int
bogus prior index = do
let f = fac prior
let g = (f^index) :: Integer
let d = (map digitToInt (show g)) :: [Int]
let a = fromIntegral (head d)^2
a
fac :: Integer -> Integer
fac 0 = 1
fac n = n * fac (n - 1)
If anyone tests this, using more than 9 or 10 in show (example 9) will take much longer than you want to wait for such a pointless sequence of numbers.
Just do it in IO. If you need to use the result in pure code, then unsafePerformIO is available.
The following version runs about 3-4 times faster with +RTS -N16 than +RTS -N1. My changes involved converting the ST vectors to IO, changing the forM_ to forConcurrently_, and adding a bang annotation to let !v = bogus ....
Full code:
import qualified Data.Vector as Vb
import qualified Data.Vector.Mutable as Vm
import qualified Data.Vector.Generic.Mutable as Vg
import qualified Data.Vector.Generic as Gg
import Control.Monad.ST as ST ( ST, runST )
import Data.Foldable(forM_)
import Data.Char(digitToInt)
import Control.Concurrent.Async
import System.IO.Unsafe
main :: IO ()
main = do
let m = unsafePerformIO (example 9)
putStrLn $ show m
example :: Int -> IO (Vb.Vector Int)
example n = do
m <- Vg.new (2^n)
Vg.unsafeWrite m 0 (1)
forM_ [1..n] $ \i -> do
p <- prev m n (i-1)
let newEntries = (choiceList n i) :: [Int]
forConcurrently_ newEntries $ \e -> do
let !v = bogus p e
Vg.unsafeWrite m e v
Gg.unsafeFreeze m
choiceList :: Int -> Int -> [Int]
choiceList _ 0 = [0]
choiceList n 1 = [ 2^k | k <- [0..(n-1) ] ]
choiceList n k
| n == k = [2^n - 1]
| otherwise = (choiceList (n-1) k) ++ (map ((2^(n-1)) +) $ choiceList (n-1) (k-1))
prev :: Vm.IOVector Int -> Int -> Int -> IO Integer
prev m n 0 = return 1
prev m n i = do
let chs = choiceList n i
v <- mapM (\k -> Vg.unsafeRead m k ) chs
let e = map (\k -> toInteger k ) v
return (sum e)
bogus :: Integer -> Int -> Int
bogus prior index = do
let f = fac prior
let g = (f^index) :: Integer
let d = (map digitToInt (show g)) :: [Int]
let a = fromIntegral (head d)^2
a
fac :: Integer -> Integer
fac 0 = 1
fac n = n * fac (n - 1)
I think this can not be done in a safe way. In the general case, it seems it would break Haskell's referential transparency.
If we could perform multi-threaded computations within ST s, then we could spawn two threads that race over the same STRef s Bool. Let's say one thread is writing False and the other one True.
After we use runST on the computation, we get an expression of type Bool which is sometimes False and sometimes True. That should not be possible.
If you are absolutely certain that your parallelization does not break referential transparency, you could try using unsafe primitives like unsafeIOToST to spawn new threads. Use with extreme care.
There might be safer ways to achieve something similar. Outside ST, we do have some parallelism available in Control.Parallel.Strategies.
There are a number of ways to do parallelization in Haskell. Usually they will give comparable performance improvements, however some are better then the others and it mostly depends on problem that needs parallelization. This particular use case looked very interesting to me, so I decided to investigate a few approaches.
Approaches
vector-strategies
We are using a boxed vector, therefore we can utilize laziness and built-in spark pool for parallelization. One very simple approach is provided by vector-strategies package, which can iterate over any immutable boxed vector and evaluate all of the thunks in parallel. It is also possible to split the vector in chunks, but as it turns out the chunk size of 1 is the optimal one:
exampleParVector :: Int -> Vb.Vector Int
exampleParVector n = example n `using` parVector 1
parallel
parVector uses par underneath and requires one extra iteration over the vector. In this case we are already iterating over thee vector, thus it would actually make more sense to use par from parallel directly. This would allow us to perform computation in parallel while continue using ST monad:
import Control.Parallel (par)
...
forM_ [1..n] $ \i -> do
p <- prev m n (i-1)
let newEntries = choiceList n i :: [Int]
forM_ newEntries $ \e -> do
let v = bogus p e
v `par` Vg.unsafeWrite m e v
It is important to note that the computation of each element of the vector is expensive when compared to the total number of elements in the vector. That is why using par is a very good solution here. If it was the opposite, namely the vector was very large, but elements weren't too expensive to compute, it would be better to use an unboxed vector and switch it to a different parallelization method.
async
Another way was described by #K.A.Buhr. Switch to IO from ST and use async:
import Control.Concurrent.Async (forConcurrently_)
...
forM_ [1..n] $ \i -> do
p <- prev m n (i-1)
let newEntries = choiceList n i :: [Int]
forConcurrently_ newEntries $ \e -> do
let !v = bogus p e
Vg.unsafeWrite m e v
The concern that #chi has raised is a valid one, however in this particular implementation it is safe to use unsafePerformIO instead of runST, because parallelization does not violate the invariant of deterministic computation. Namely, we can promise that regardless of the input supplied to example function, the output will always be exactly the same.
scheduler
Green threads are pretty cheap in Haskell, but they aren't free. The solution above with async package has one slight drawback: it will spin up at least as many threads as there are elements in the newEntries list each time forConcurrently_ is called. It would be better to spin up as many threads as there are capabilities (the -N RTS option) and let them do all the work. For this we can use scheduler package, which is a work stealing scheduler:
import Control.Scheduler (Comp(Par), runBatch_, withScheduler_)
...
withScheduler_ Par $ \scheduler ->
forM_ [1..n] $ \i -> runBatch_ scheduler $ \_ -> do
p <- prev m n (i-1)
let newEntries = choiceList n i :: [Int]
forM_ newEntries $ \e -> scheduleWork_ scheduler $ do
let !v = bogus p e
Vg.unsafeWrite m e v
Spark pool in GHC also uses a work stealing scheduler, which is built into RTS and is unrelated to the package above in any shape or form, but the idea is very similar: few threads with many units of computation.
Benchmarks
Here are some benchmarks on a 16-core machine for all of the approaches with example 7 (value 9 takes on the order of seconds, which introduces too much noise for criterion). We only get about x5 speedup, because a significant part of the algorithm is sequential in nature and can't be parallelized.

Haskell - read list of persons

The problem sounds like this: write a program that reads a number n and then n persons, for each persons, read their name and age and then return the oldest persons/persons.
Example input:
3
Ion Ionel Ionescu
70
Gica Petrescu
99
Mustafa ben Muhamad
7
Example output
Oldest is Gica Petrescu (99 years).
My code so far:
readPers :: IO(String, Int)
readPers = do
name <- getLine
age <- readLn :: IO Int
return (name, age)
readPerss :: (Ord t, Num t) => t -> [IO (String, Int)]
readPerss n
| n > 0 = readPers : readPerss(n-1)
| otherwise = []
pFunc = do
print "Numer of persons:"
n <- readLn :: IO Int
let persons = readPerss n
return persons
I first read n, then I try to make a list of persons using readPers and readPerss, but I am stuck, I don't know how to tackle it from that point forward and I guess that my implementation thus far is not quite right.
How should I solve the problem?
You are very close! What you are doing in readPerss :: (Ord t, Num t) => t -> [IO (String, Int)] is returning a list of IO actions; each action returns a pair of String and Int when it is executed. Currently in pFunc you are only building this list of actions, storing it in a variable with let, and returning it from pFunc; you are never executing them with a <- “bind” statement.
There are a few simple ways to do what you want. The smallest change to your code that does what you want is to add sequence, which takes a container of actions and produces an action that returns a container:
sequence :: (Traversable t, Monad m) => t (m a) -> m (t a)
Here t is [], m is IO, and a is (String, Int):
sequence :: [IO (String, Int)] -> IO [(String, Int)]
Another way is to rewrite readPerss so that it executes the actions directly, accumulating the (String, Int) results in a list instead of accumulating the IO actions:
readPerss :: (Ord t, Num t) => t -> IO [(String, Int)]
-- Change [IO …] to IO […]: ~~~~~~~~~~~~~~~~~~
readPerss n
| n > 0 = do
pers <- readPers
perss <- readPerss (n - 1)
return (pers : perss)
| otherwise = return []
I know you may not be supposed to use library functions if this is a homework assignment or exercise, but in typical code “repeat x action n times and accumulate the results” is often represented with replicateM n x:
replicateM :: Applicative m => Int -> m a -> m [a]
This is how I always do this (it is from a code challenge isn’t it). I always seperate IO and logic as soon as possible. Works perfect (unless N is very big).
import Data.List.Split (chunksOf)
type Person = (String, Int)
main = do
x <- getContents
putStrLn $ program x
program :: String -> String
program s = “Oldest is “ ++ x ++ “ (“ ++ (show y) ++ “ years old).”
where
(x, y) = solve persons
persons = [(name, read age :: Int) | [name, age] <- chunksOf 2 . tail . lines $ s]
solve :: [Person] -> Person
solve ls = undefined
I leave the undefined to you.

Haskell: RandomGen drops half of values

I am writing a simple deterministic Random Number generator, based on the xorshift. The goal here is not to get a cryptographically secure or statistically perfect (pseudo-)random number generator, but to be able to archieve the same deterministic sequence of semi-random numbers across programming languages.
My Haskell program looks like follows:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module SimpleRNG where
import Data.Word (Word32)
import Data.Bits (xor, shift)
import System.Random (RandomGen(..))
import Control.Arrow
(|>) :: a -> (a -> b) -> b
(|>) x f = f x
infixl 0 |>
newtype SeedState = SeedState Word32
deriving (Eq, Show, Enum, Bounded)
seed :: Integral a => a -> SeedState
seed = SeedState . fromIntegral
rand_r :: SeedState -> (Word32, SeedState)
rand_r (SeedState num) = (res, SeedState res)
where
res = num
|> xorshift 13
|> xorshift (-17)
|> xorshift 5
xorshift :: Int -> Word32 -> Word32
xorshift amount x = x `xor` (shift x amount)
instance RandomGen SeedState where
next seed_state = (first fromIntegral) $ rand_r seed_state
where
genRange seed_state = (fromEnum (minBound `asTypeOf` seed_state),
fromEnum (maxBound `asTypeOf` seed_state))
split seed_state#(SeedState num) = (seed_state', inverted_seed_state')
where
(_, seed_state') = next seed_state
(_, inverted_seed_state') = next inverted_seed_state
inverted_seed_state = SeedState (maxBound - num)
Now, for some reason, when running
take 10 $ System.Random.randoms (seed 42) :: [Word32]
it returns only the 'odd' results, compared to the output of the following Python program:
class SeedState(object):
def __init__(self, seed = 42):
self.data = seed
def rand_r(rng_state):
num = rng_state.data
num ^= (num << 13) % (2 ** 32)
num ^= (num >> 17) % (2 ** 32)
num ^= (num << 5) % (2 ** 32)
rng_state.data = num
return num
__global_rng_state = SeedState(42)
def rand():
global __global_rng_state
return rand_r(__global_rng_state)
def seed(seed):
global __global_rng_state
__global_rng_state = SeedState(seed)
if __name__ == '__main__':
for x in range(0, 10):
print(rand())
It seems like the internals of the System.Random module do some weird trickery with the return result of the generator
(calling
map fst $ take 10 $ iterate (\(_, rng) -> rand_r rng) (rand_r $ seed 42)
gives the result I'd expect).
This is odd, since the type returned by the generator is already a Word32, so it could/should just be passed on unaltered without any remapping happening.
What is going on here, and is there a way to plug this xorshift-generator into System.Random in a way that returns the same results?
This is to do with the behaviour of System.Random.randoms, which repeatedly applies random to a RandomGen, not next.
class Random a where
...
random :: (RandomGen g) => g -> (a, g)
The Random class is what allows you to reuse RandomGen instances across different enums, and the instance for Word32 (as well as nearly all other types) is defined as
instance Random Word32 where randomR = randomIvalIntegral; random = randomBounded
randomBounded just calls randomR, so the behaviour of random is decided by `
randomIvalIntegral (l,h) = randomIvalInteger (toInteger l, toInteger h)
randomIvalInteger is an interesting function, you can read the source here. It's actually causing your problem because the function will discard a certain number of intermediate values based on the range of the generator and the range being generated over.
To get the values you want, you just need to use next instead - the easiest way would just be to define
randoms' g = x : (randoms' g') where (x, g') = next g

Working with Data.Map.StrictMap.Maps using Control.Parallel

I have the following code. The M prefix designates functions from Data.Map.Strict, and Table is a type alias for Data.Map.Strict.Map Mapping Bool, where Mapping is an arbitrary opaque structure.
computeCoverage :: Table -> Expr -> Maybe Coverage
computeCoverage t e = go t True M.empty
where go src flag targ
| null src = if flag
then Nothing
else Just (M.size t, targ)
| otherwise = let ((m, b), rest) = M.deleteFindMin src
result = interpret e m
flag' = result && flag in
go rest flag' (if b == result then targ else M.insert m b targ)
I would like to be able to use Control.Parallel to perform this with as much parallelism as possible. However, I'm not sure how to do this. Based on reading Data.Map.Strict, it seems what you're supposed to do is call splitRoot, then do whatever parallel stuff you want on the resulting list, then recombine (I guess?). Have I basically got the right idea? If not, what should I do instead to parallelize the code above?
Here's a contrived example. You just use parMap over M.splitRoot m:
import qualified Data.Map.Strict as M
import Control.Parallel.Strategies
import System.Environment
fib 0 = 0
fib 1 = 1
fib n = fib (n-2) + fib (n-1)
theMap :: Int -> M.Map Int Int
theMap n = M.fromList [ (x, 33 + mod x 3) | x <- [1..n] ]
isInteresting n = mod (fib n) 2 == 0
countInteresting :: M.Map Int Int -> Int
countInteresting m = length $ filter isInteresting (M.elems m)
doit :: Int -> [Int]
doit n = parMap rseq countInteresting (M.splitRoot $ theMap n)
main :: IO ()
main = do
( arg1 : _) <- getArgs
let n = read arg1
print $ doit n
Note, however these caveats:
the splits may not be of equal size
use splitRoot if working with a Map is helpful for your computation; this particular example doesn't benefit from the Map structure of root - it could have just parMapped over the elements.

How to make a random list using IO in Haskell

I'm trying to do a flocking simulation in order to better teach myself haskell. I'm running into trouble when trying to generate the initial state for the simulation which requires randomness. I'm trying to generate a list of Boids which all have random initial positions and directions.
In the main function I call this using
let numBoids = 10
rBoids <- randomBoids numBoids
And rBoids I indend to store in an IORef which I can then update every frame, which I think is the right way to do things?
And here is the code which fails:
-- Type for the flocking algorithm
data Boid = Boid {
boidPosition :: Vector2(GLfloat)
, boidDirection :: Vector2(GLfloat)
} deriving Show
randomBoids :: Int -> IO ([Boid])
randomBoids 0 = do
return []
randomBoids n = do
b <- randomBoid
bs <- (randomBoids (n-1))
return b : bs
randomBoid = do
pos <- randomVector
vel <- randomVector
return (Boid pos vel)
randomVector = do
x <- randomRIO(-1.0, 1.0)
y <- randomRIO(-1.0, 1.0)
return (Vector2 x y)
What actually fails is return b : bs. If I change this into return [b] it compiles. The error given is:
Couldn't match expected type `IO [Boid]' with actual type `[a0]'
In the expression: return b : bs
In the expression:
do { b <- randomBoid;
bs <- (randomBoids (n - 1));
return b : bs }
In an equation for `randomBoids':
randomBoids n
= do { b <- randomBoid;
bs <- (randomBoids (n - 1));
return b : bs }
I'm pretty lost here, and my understanding of the whole imperative-code-in-a-functional language (and monads) is shaky to say the least. Any help would be most appreciated!
Gangadahr is correct. I only wanted to mention that you can shorten your code a LOT:
import Control.Applicative
import Control.Monad
randomBoids n = replicateM n randomBoid
randomBoid = Boid <$> randomVector <*> randomVector
randomVector = Vector2 <$> randomRIO (-1, 1) <*> randomRIO (-1, 1)
The first function takes advantage of replicateM, which is a very useful function when you want to repeat a monadic action and collect the results. The latter two functions use Applicative style, which is enormously useful.
The reason you are getting the error is because return b : bs will make the compiler interpret it as (return b): bs To fix this, you can change the statement to return (b:bs). That will make the statement return an IO[Boid]
The typechecker is reading return x : xs as (return x) : xs. If you write return (x:xs) it will typecheck.

Resources