MonadRandom: why stack overflow happens? - haskell

This question is certainly for stackoverflow.com
here is the sample
module Main where
import Control.Monad.Random
import Control.Exception
data Tdata = Tdata Int Int Integer String
randomTdata :: (Monad m, RandomGen g) => RandT g m Tdata
randomTdata = do
a <- getRandom
b <- getRandom
c <- getRandom
return $ Tdata a b c "random"
manyTdata :: IO [Tdata]
manyTdata = do
g <- newStdGen
evalRandT (sequence $ repeat randomTdata) g
main = do
a <- manyTdata
b <- evaluate $ take 1 a
return ()
after compilation this return
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it
How can it happen ? Is MonadRandom not lazy or what else ? And how to define the cause of stack overflow in cases like that ?

The issue arises because you are building IO into your manyTdata function.
The monad transformer ends up being of type RandT g IO Tdata. Because each element of
your infinite list can consist of IO actions, the entirety of the infinite list
returned by manyTdata must be evaluated completely before the function can return
any results.
The simplest solution would be to use Rand instead of RandT, as using the tranformer
isn't really useful here, anyway; you could also change the base monad to something like
the Identity monad by changing manyTdata to
manyTdata :: IO [Tdata]
manyTdata = do
g <- newStdGen
return $ runIdentity $ evalRandT (sequence $ repeat randomTdata) g
Which will terminate in a finite amount of time. The error concerning your stack size
is simply a result of recursively expanding your list of IO actions. Your code says to sequence all of these actions, so they all have to be performed, it has nothing to do with laziness.
Something else to think about, rather than using randomTdata, consider
making Tdata an instance of the Random class.

Related

Create random numbers in a reproducible way and hide generator threading (Using Haskell Monad)

I need to create random data in Haskell.
I want my code to be:
a) reproducible from a seed
b) the threading of generators to be implicit
I understand Monads generally and the way that Random Generators work.
My approach is to thread the generator through the code so I can reproduce the random numbers but want to hide the threading of the generators in a Monad.
I'm thinking that the State Monad is a good approach.
Here's some simple code:
type Gen a = State StdGen a
roll :: Gen Int
roll = state $ randomR (1, 6)
roll2 :: Gen Int
roll2 = (+) <$> roll <*> roll
test :: Int -> IO ()
test seed = do
let gen = mkStdGen seed
print (evalState roll gen)
print (evalState roll gen)
print (evalState roll2 gen)
print (evalState roll2 gen)
I'm trying to use State so that I can push the threading of the generator into the State Monad but the results of roll are the same and results of roll2 are the same. I can see that this is because I'm passing gen into the functions multiple times so of course it would produce the same output. So that makes me think I need to get a new generator from each function. But then I'm back to having to thread the generator through the code which is what I'm trying to avoid by using State. I feel like I'm missing a trick!
I explored MonadRandom too and that did push the threading away from my code but I couldn't see how to make that approach be reproducible.
I've hunted a lot and tried many things but seem to always either be able to hide the generators OR make the code reproducible but not both.
I'm keen to use a Monad more specific than IO.
I'm also going to build a series of more complex functions which will generate random lists of numbers so I need to have a simple way to make these random functions rely on each other. I managed that with MonadRandom but again I couldn't see how that could be reproducible.
Any help appreciated.
If you needn't interleave IO with randomness, as here, then the answer is just to lump your State actions together into one with the Monad operations (they're the thing passing the state around for you!).
test :: Int -> IO ()
test seed = do
print a
print b
print c
print d
where
(a,b,c,d) = flip evalState (mkStdGen seed) $ do
a <- roll
b <- roll
c <- roll2
d <- roll2
return (a,b,c,d)
If you will need to interleave IO and randomness, then you will want to look into StateT StdGen IO as your monad instead of using State StdGen and IO separately. That might look like this, say:
roll :: MonadState StdGen m => m Int
roll = state (randomR (1,6))
roll2 :: MonadState StdGen m => m Int
roll2 = (+) <$> roll <*> roll
test :: (MonadState StdGen m, MonadIO m) => m ()
test = do
roll >>= liftIO . print
roll >>= liftIO . print
roll2 >>= liftIO . print
roll2 >>= liftIO . print
(You could then use e.g. evalStateT test (mkStdGen seed) to turn this back into an IO () action, or embed it into a larger computation if there were further random things you needed to generate and do IO about.)
MonadRandom does little more than package up StateT StdGen in a way that lets you still use non-seed state, so I encourage you to reconsider using it. evalRand and evalRandT from Control.Monad.Random.Lazy (or .Strict) shouldy give you the repeatability you need; if they don't, you should open a fresh question with the full details of what you tried and how it went wrong.
Normally, it's pretty much the whole point of a random generator that you don't always get the same result. And that's the reason why you use a state monad: to pass on the updated generator, so that the next random event will actually be different.
If you want always the same value, then there's not really any reason to use special random tooling at all – just generate one value once (or two values), then pass it on whereever needed, like you would pass on another variable.
test :: IO ()
test = do
[dice0, dice1] <- replicateM 2 $ randomRIO (1,6)
print dice0
print dice0
print $ dice0+dice1
print $ dice0+dice1

Haskell random numbers

I am trying to generate a sample of random numbers in Haskell
import System.Random
getSample n = take n $ randoms g where
g = newStdGen
but it seems I am not quite using newStdGen the right way. What am I missing?
First off, you probably don't want to use newStdGen. The biggest problem is that you'll get a different seed every time you run your program, so no results will be reproducible. In my opinion, mkStdGen is a better choice as it requires you to give it a seed. This means you will get the same sequence of (pseudo)random numbers every time. If you want a different sequence, just change the seed.
The second problem with newStdGen is that since it's impure, you'll end up in the IO monad which can be a bit inconvenient.
sample :: Int -> IO [Int]
sample n = do
gen <- newStdGen
return $ take n $ randoms gen
You can use do-notation to 'extract' the values and then sum them:
main :: IO ()
main = do
xs <- sample 10
s = sum xs
print s
Or you could 'fmap' the function over the result (but notice that at some point you will probably need to extract the value):
main :: IO ()
main = do
s <- fmap sum $ sample 10
print s
The fmap function is a generalized version of map. Just like map applies a function to the values inside a list, fmap can apply a function to values inside IO.
Another problem with this sample function is that if we call it again, it starts with a fresh seed instead of continuing the previous (pseudo)random sequence. Again, this make reproducing results impossible. In order to fix this problem, we need to pass in the seed and return a new seed. Unfortunately, randoms does not return the next seed for us, so we'll have to write this from scratch using random.
sample :: Int -> StdGen -> ([Int],StdGen)
sample n seed1 = case n of
0 -> ([],seed1)
k -> let (rs,seed2) = sample (k-1) seed1
(r, seed3) = random seed2
in ((r:rs),seed3)
Our main function is now
main :: IO ()
main = do
let seed1 = mkStdGen 123456
(xs,seed2) = sample 10 seed1
s = sum xs
(ys,seed3) = sample 10 seed2
t = sum ys
print s
print t
I know this seems like an awful lot of work just to to use random numbers, but the advantages are worth it. We can generate all of our randomness with a single seed which guarantees that the results can be reproduced.
Of course, this being Haskell, we can take advantage of Monads to get rid of all the manual threading of the seed values. This is a slightly more advanced method, but well worth learning since monads are ubiquitous in Haskell code.
We need these imports:
import System.Random
import Control.Monad
import Control.Applicative
Then we'll create a newtype which represents the action of turning a seed into a value and the next seed.
newtype Rand a = Rand { runRand :: StdGen -> (a,StdGen) }
We need Functor and Applicative instances or GHC will complain, but we can avoid implementing them for this example.
instance Functor Rand
instance Applicative Rand
And now for the Monad instance. This is where the magic happens. The >>= function (called bind) is the one place where we specify how to thread the seed value through the computation.
instance Monad Rand where
return x = Rand ( \seed -> (x,seed) )
ra >>= f = Rand ( \s1 -> let (a,s2) = runRand ra s1
in runRand (f a) s2 )
newRand :: Rand Int
newRand = Rand ( \seed -> random seed )
Now our sample function is extremely simple! We can take advantage of replicateM from Control.Monad which repeats a given action and accumulates the results in a list. All that funny business with the seed values is taken care of behind the scenes
sample :: Int -> Rand [Int]
sample n = replicateM n newRand
main :: IO ()
main = do
let seed1 = mkStdGen 124567
(xs,seed2) = runRand (sample 10) seed1
s = sum xs
print s
We can even stay inside the Rand monad if we need to generate random values multiple times.
main :: IO ()
main = do
let seed1 = mkStdGen 124567
(xs,seed2) = flip runRand seed1 $ do
x <- newRand
bs <- sample 5
cs <- sample 10
return $ x : (bs ++ cs)
s = sum xs
print s
I hope this helps!

How does IO monad work in System.Random

import System.Random
main = do
g <- newStdGen
a <-take 5 (randoms g :: [Double])
return ()
So this code doesn't work because apparently what I'm assigning to a has type [Double] instead of IO [Double] but I thought that you can't escape from IO ever? So how come I seemed to have escaped from IO even though g is type IO? I'm still confused on how IO monads work inside do notation.
You can't escape from IO, but inside a do block you're not actually escaping per se.
Loosely: when you write g <- newStdGen in a do block, you can then use g later in the block as if it just had type StdGen, instead of IO StdGen. At the end of the block, whatever you return will be wrapped back up in IO.
Use let a = instead of a <- since the RHS is a pure value.
import System.Random
main = do
g <- newStdGen
let a = take 5 (randoms g :: [Double])
print a

Maybe monad inside stack of transformers

I have the following code:
import Control.Monad
import Control.Monad.Trans
import Control.Monad.Trans.State
type T = StateT Int IO Int
someMaybe = Just 3
f :: T
f = do
x <- get
val <- lift $ do
val <- someMaybe
-- more code in Maybe monad
-- return 4
return 3
When I use do notation inside to work in Maybe monad it fails. From the error it gives it looks like type signature for this do doesn't match. However I have no idea how to fix it. I tried some lift combinations, but none of them worked and I don't want to guess anymore.
The problem is that Maybe is not part of your transformer stack. If your transformer only knows about StateT Int and IO, it does not know anything about how to lift Maybe.
You can fix this by changing your type T to something like:
type T = StateT Int (MaybeT IO) Int
(You'll need to import Control.Monad.Trans.Maybe.)
You will also need to change your inner do to work with MaybeT rather than Maybe. This means wrapping raw Maybe a values with MaybeT . return:
f :: T
f = do
x <- get
val <- lift $ do
val <- MaybeT $ return someMaybe
-- more code in Maybe monad
return 4
return 3
This is a little awkward, so you probably want to write a function like liftMaybe:
liftMaybe = MaybeT . return
If you used lift to lift IO a values in other parts of your code, this will now break because you have three levels in your transformer stack now. You will get an error that looks like this:
Couldn't match expected type `MaybeT IO t0'
with actual type `IO String'
To fix this, you should use liftIO for all your raw IO a values. This uses a typeclass to life IO actions through any number of transformer layers.
In response to your comment: if you only have a bit of code depending on Maybe, it would be easier just to put the result of the do notation into a variable and match against that:
let maybeVal = do val <- someMaybe
-- more Maybe code
return 4
case maybeVal of
Just res -> ...
Nothing -> ...
This means that the Maybe code will not be able to do an IO. You can also naturally use a function like fromMaybe instead of case.
If you want to run the code in the inner do purely in the Maybe monad, you will not have access to the StateT Int or IO monads (which might be a good thing). Doing so will return a Maybe value, which you will have to scrutinize:
import Control.Monad
import Control.Monad.Trans
import Control.Monad.Trans.State
type T = StateT Int IO Int
someMaybe = Just 3
f :: T
f = do
x <- get
-- no need to use bind
let mval = do
-- this code is purely in the Maybe monad
val <- someMaybe
-- more code in Maybe monad
return 4
-- scrutinize the resulting Maybe value now we are back in the StateT monad
case mval of
Just val -> liftIO . putStrLn $ "I got " ++ show val
Nothing -> liftIO . putStrLn $ "I got a rock"
return 3

Is mapM in Haskell strict? Why does this program get a stack overflow?

The following program terminates correctly:
import System.Random
randomList = mapM (\_->getStdRandom (randomR (0, 50000::Int))) [0..5000]
main = do
randomInts <- randomList
print $ take 5 randomInts
Running:
$ runhaskell test.hs
[26156,7258,29057,40002,26339]
However, feeding it with an infinite list, the program never terminates, and when compiled, eventually gives a stack overflow error!
import System.Random
randomList = mapM (\_->getStdRandom (randomR (0, 50000::Int))) [0..]
main = do
randomInts <- randomList
print $ take 5 randomInts
Running,
$ ./test
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
I expected the program to lazily evaluate getStdRandom each time I pick an item off the list, finishing after doing so 5 times. Why is it trying to evaluate the whole list?
Thanks.
Is there a better way to get an infinite list of random numbers? I want to pass this list into a pure function.
EDIT: Some more reading revealed that the function
randomList r = do g <- getStdGen
return $ randomRs r g
is what I was looking for.
EDIT2: after reading camccann's answer, I realized that getStdGen is getting a new seed on every call. Instead, better to use this function as a simple one-shot random list generator:
import System.Random
randomList :: Random a => a -> a -> IO [a]
randomList r g = do s <- newStdGen
return $ randomRs (r,g) s
main = do r <- randomList 0 (50::Int)
print $ take 5 r
But I still don't understand why my mapM call did not terminate. Evidently not related to random numbers, but something to do with mapM maybe.
For example, I found that the following also does not terminate:
randomList = mapM (\_->return 0) [0..]
main = do
randomInts <- randomList
print $ take 50000 randomInts
What gives? By the way, IMHO, the above randomInts function should be in System.Random. It's extremely convenient to be able to very simply generate a random list in the IO monad and pass it into a pure function when needed, I don't see why this should not be in the standard library.
Random numbers in general are not strict, but monadic binding is--the problem here is that mapM has to sequence the entire list. Consider its type signature, (a -> m b) -> [a] -> m [b]; as this implies, what it does is first map the list of type [a] into a list of type [m b], then sequence that list to get a result of type m [b]. So, when you bind the result of applying mapM, e.g. by putting it on the right-hand side of <-, what this means is "map this function over the list, then execute each monadic action, and combine the results back into a single list". If the list is infinite, this of course won't terminate.
If you simply want a stream of random numbers, you need to generate the list without using a monad for each number. I'm not entirely sure why you've used the design you have, but the basic idea is this: Given a seed value, use a pseudo-random number generator to produce a pair of 1) a random number 2) a new seed, then repeat with the new seed. Any given seed will of course provide the same sequence each time. So, you can use the function getStdGen, which will provide a fresh seed in the IO monad; you can then use that seed to create an infinite sequence in completely pure code.
In fact, System.Random provides functions for precisely that purpose, randoms or randomRs instead of random and randomR.
If for some reason you want to do it yourself, what you want is essentially an unfold. The function unfoldr from Data.List has the type signature (b -> Maybe (a, b)) -> b -> [a], which is fairly self-explanatory: Given a value of type b, it applies the function to get either something of type a and a new generator value of type b, or Nothing to indicate the end of the sequence.
You want an infinite list, so will never need to return Nothing. Thus, partially applying randomR to the desired range and composing it with Just gives this:
Just . randomR (0, 50000::Int) :: (RandomGen a) => a -> Maybe (Int, a)
Feeding that into unfoldr gives this:
unfoldr (Just . randomR (0, 50000::Int)) :: (RandomGen a) => a -> [Int]
...which does exactly as it claims: Given an instance of RandomGen, it will produce an infinite (and lazy) list of random numbers generated from that seed.
I would do something more like this, letting randomRs do the work with an initial RandomGen:
#! /usr/bin/env runhaskell
import Control.Monad
import System.Random
randomList :: RandomGen g => g -> [Int]
randomList = randomRs (0, 50000)
main :: IO ()
main = do
randomInts <- liftM randomList newStdGen
print $ take 5 randomInts
As for the laziness, what's happening here is that mapM is (sequence . map)
Its type is: mapM :: (Monad m) => (a -> m b) -> [a] -> m [b]
It's mapping the function, giving a [m b] and then needs to execute all those actions to make an m [b]. It's the sequence that'll never get through the infinite list.
This is explained better in the answers to a prior question: Is Haskell's mapM not lazy?

Resources