How can I generate different random values in Haskell?

How can I generate different random values in Haskell? - haskell

Suppose that I have a list like this:
let list = ["random", "foo", "random", "bar", "random", "boo"]
I want to iterate over a list and map all "random" elements to different random strings:
let newList = fmap randomize list
print newList
-- ["dasidias", "foo", "gasekir", "bar", "nabblip", "boo"]
My randomize function looks like this:
randomize :: String -> String
randomize str =
case str of
"random" -> randStr
_ -> str
where
randStr = take 10 $ randomRs ('a','z') $ unsafePerformIO newStdGen
But I get the same random string for every "random" element:
["abshasb", "foo", "abshasb", "bar", "abshasb", "boo"]
I can't figure out why is this happening and how to get a different random value for each occurrence of "random".

There are two problems with your code:
You are calling unsafePerformIO, but explicitly violating the contract of that function. It is on you to prove that the thing you provide to unsafePerformIO is actually pure, and the compiler is within its rights to act as if that's the case, and here it is definitely not.
You are not carefully tracking the updated random number generator state after using it. Indeed, it is not possible to do this correctly with randomRs; if you use randomRs, then to a first approximation, that must be the last randomness your program needs.
The simplest fix to both of these is to admit that you really, truly are doing IO. So:
import Control.Monad
import System.Random
randomize :: String -> IO String
randomize "random" = replicateM 10 (randomRIO ('a', 'z'))
randomize other = pure other
Try it out in ghci:
> traverse randomize ["random", "foo", "random", "bar", "random", "boo"]
["xytuowzanb","foo","lzhasynexf","bar","dceuvoxkyh","boo"]
There is no call to unsafePerformIO, and so no proof burden to shirk; and randomRIO tracks the updated generator state for you in a hidden IORef, and so you correctly continue advancing it on each call.

How not to involve IO in random number generation:
This question has received excellent answers. However, it might leave some readers under the impression that pseudo-random number generation (PRNG) within Haskell is necessarily linked to IO.
Well, it's not. It is just that in Haskell, the default random number generator happens to be "hosted" in the IO type. But this is by choice, not by necessity.
For reference, here is a recent review paper on the subject of PRNGs.
PRNGs are deterministic mathematical automata. They do not involve IO. Using PRNGs in Haskell does not need to involve the IO type. At the bottom of this answer, I provide code that solves the problem at hand without involving the IO type, except for printing the result.
The Haskell libraries provide functions such as mkStdGen that take an integer seed and return a pseudo-random number generator, that is an object of the RandomGen class, whose state is dependent on the value of seed. Note that there is nothing magic about mkStdGen. If for some reason you do not like it, there are alternatives, such as mkTFGen which is based on the Threefish block cipher.
Now, pseudo-random number generation is not managed in the same way in imperative languages such as C++ and in Haskell. In C++, you would extract a random value like this: rval = rng.nextVal();. On top of just returning the value, calling nextVal() has the side effect of altering the state of the rng object, ensuring that next time it will return a different random number.
But in Haskell, functions have no side effects. So you need to have something like this:
(rval, rng2) = nextVal rng1
That is, the evaluation function needs to return both the pseudo-random value and the updated state of the generator. A minor consequence is that, if the state is large (such as for the common Mersenne Twister generator), Haskell might need a bit more memory than C++.
So, we expect that solving the problem at hand, that is randomly transforming a list of strings, will involve a function with the following type signature: RandomGen tg => [String] -> tg -> ([String], tg).
For illustration purposes, let's get a generator and use it to generate a couple of "random" integers between 0 and 100. For this, we need the randomR function:
$ ghci
Prelude> import System.Random
Prelude System.Random> :t randomR
randomR :: (RandomGen g, Random a) => (a, a) -> g -> (a, g)
Prelude System.Random>
Prelude System.Random> let rng1 = mkStdGen 544
Prelude System.Random> let (v, rng2) = randomR (0,100) rng1
Prelude System.Random> v
23
Prelude System.Random> let (v, rng2) = randomR (0,100) rng1
Prelude System.Random> v
23
Prelude System.Random> let (w, rng3) = randomR (0,100) rng2
Prelude System.Random> w
61
Prelude System.Random>
Note that above, when we forget to feed the updated state of the generator, rng2, into the next computation, we get the same "random" number 23 a second time. This is a very common mistake and a very common complaint. Function randomR is a pure Haskell function that does not involve IO. Hence it has referential transparency, that is when given the same arguments, it returns the same output value.
A possible way to deal with this situation is to pass the updated state around manually within the source code. This is cumbersome and error prone, but can be managed. That gives this style of code:
-- stateful map of randomize function for a list of strings:
fmapRandomize :: RandomGen tg => [String] -> tg -> ([String], tg)
fmapRandomize [] rng = ([], rng)
fmapRandomize(str:rest) rng = let (str1, rng1) = randomize str rng
(rest1, rng2) = fmapRandomize rest rng1
in (str1:rest1, rng2)
Thankfully, there is a better way, which involves the runRand function or its evalRand sibling. Function runRand takes a monadic computation plus (an initial state of) a generator. It returns the pseudo-random value and the updated state of the generator. It is much easier to write the code for monadic computations than to pass the generator state manually around.
This is a possible way to solve the random string substitution problem from the question text:
import System.Random
import Control.Monad.Random
-- generic monadic computation to get a sequence of "count" random items:
mkRandSeqM :: (RandomGen tg, Random tv) => (tv,tv) -> Int -> Rand tg [tv]
mkRandSeqM range count = sequence (replicate count (getRandomR range))
-- monadic computation to get our sort of random string:
mkRandStrM :: RandomGen tg => Rand tg String
mkRandStrM = mkRandSeqM ('a', 'z') 10
-- monadic single string transformation:
randomizeM :: RandomGen tg => String -> Rand tg String
randomizeM str = if (str == "random") then mkRandStrM else (pure str)
-- monadic list-of-strings transformation:
mapRandomizeM :: RandomGen tg => [String] -> Rand tg [String]
mapRandomizeM = mapM randomizeM
-- non-monadic function returning the altered string list and generator:
mapRandomize :: RandomGen tg => [String] -> tg -> ([String], tg)
mapRandomize lstr rng = runRand (mapRandomizeM lstr) rng
main = do
let inpList = ["random", "foo", "random", "bar", "random", "boo", "qux"]
-- get a random number generator:
let mySeed = 54321
let rng1 = mkStdGen mySeed
-- execute the string substitutions:
let (outList, rng2) = mapRandomize inpList rng1
-- display results:
putStrLn $ "inpList = " ++ (show inpList)
putStrLn $ "outList = " ++ (show outList)
Note that above, RandomGen is the class of the generator, while Random is just the class of the generated value.
Program output:
$ random1.x
inpList = ["random","foo","random","bar","random","boo","qux"]
outList = ["gahuwkxant","foo","swuxjgapni","bar","zdjqwgpgqa","boo","qux"]
$

The fundamental problem with your approach is that Haskell is a pure language, and you're trying to use it as if its not. In fact this isn't the only fundamental misunderstanding of the language that your code displays.
In your randomise function:
randomize :: String -> String
randomize str =
case str of
"random" -> randStr
_ -> str
where
randStr = take 10 $ randomRs ('a','z') $ unsafePerformIO newStdGen
you clearly intend that randStr takes a different value each time it is used. But in Haskell, when you use the = sign, you are not "assigning a value to a variable", as would be the case in an imperative language. You are saying that these two values are equal. Since all "variables" in Haskell are actually "constant" and immutable, the compiler is perfectly entitled to assume that every occurrence of randStr in your program can be replaced by whatever value it first calculates for it.
Unlike an imperative language, Haskell programs are not a sequence of statements to execute, which perform side effects such as updating state. Haskell programs consist of expressions, which are evaluated more or less in whatever order the compiler deems best. (In particular there is the main expression, which describes what your entire program will do - this is then converted by the compiler and runtime into executable machine code.) So when you assign a complex expression to a variable, you are not saying "at this point in the execution flow, do this calculation and assign the result to this variable". You are saying that "this is the value of the variable", for "all time" - that value isn't allowed to change.
Indeed the only reason that it seems to change here is because you have used unsafePerformIO. As the name itself says, this function is "unsafe" - it should basically never be used, at least unless you really know exactly what you're doing. It is not supposed to be a way of "cheating", as you use it here, to use IO, and thereby generate an "impure" result that may be different in different parts of the program, but pretend the result is pure. It's hardly surprising that this doesn't work.
Since generating random values is inherently impure, you need to do the whole thing in the IO monad, as #DanielWagner has shown one approach for in his answer.
(There is actually another way, involving taking a random generator and functions like randomR to generate a random value together with a new generator. This allows you to do more in pure code, which is generally preferable - but it takes more effort, likely including using the State monad to simplify the threading through of the generator values, and you'll still need IO in the end to make sure you get a new random sequence each time you run the program.)

Related

Randomness in a nested pure function

I want to provide a function that replaces each occurrence of # in a string with a different random number. In a non-pure language, it's trivial. However, how should it be designed in a pure language? I don't want to use unsafePerformIO, as it rather looks like a hack and not a proper design.
Should this function require a random generator as one of its parameters? And if so, would that generator have to be passed through the whole stack of invocations? Are there other possible approaches? Should I use the State monad, here? I would appreciate a toy example demonstrating a viable approach...

You would, in fact, use a variant of the state monad to pass the random generator around behind the scenes. The Rand type in Control.Monad.Random helps with this. The API is a bit confusing, but more because it's polymorphic over the type of random generator you use than because it has to be functional. This extra bit of scaffolding is useful, however, because you can easily reuse your existing code with different random generators which lets you test different algorithms as well as explicitly controlling whether the generator is deterministic (good for testing) or seeded with outside data (in IO).
Here's a simple example of Rand in action. The RandomGen g => in the type signature tells us that we can use any type of random generator for it. We have to explicitly annotate n as an Int because otherwise GHC only knows that it has to be some numeric type that can be generated and turned into a string, which can be one of multiple possible options (like Double).
randomReplace :: RandomGen g => String -> Rand g String
randomReplace = foldM go ""
where go str '#' = do
n :: Int <- getRandomR (0, 10)
return (str ++ show n)
go str chr = return $ str ++ [chr]
To run this, we need to get a random generator from somewhere and pass it into evalRand. The easiest way to do this is to get the global system generator which we can do in IO:
main :: IO ()
main = do gen <- getStdGen
print $ evalRand (randomReplace "ab#c#") gen
This is such a common pattern that the library provides an evalRandIO function which does it for you:
main :: IO ()
main = do res <- evalRandIO $ randomReplace "ab#c#"
print res
In the end, the code is a bit more explicit about having a random generator and passing it around, but it's still reasonably easy to follow. For more involved code, you could also use RandT, which allows you to extend other monads (like IO) with the ability to generate random values, letting you relegate all the plumbing and setup to one part of your code.

It's just a monadic mapping
import Control.Applicative
import Control.Monad.Random
import Data.Char
randomReplace :: RandomGen g => String -> Rand g String
randomReplace = mapM f where
f '#' = intToDigit <$> getRandomR (0, 10)
f c = return c
main = evalRandIO (randomReplace "#abc#def#") >>= print

Random numbers and getStdGen

From the Haskell wikibook we have:
import Control.Monad
import Control.Monad.Trans.State
import System.Random
type GeneratorState = State StdGen
rollDie :: GeneratorState Int
rollDie = do
generator <- get
let (value, newGenerator) = randomR (1,6) generator
put newGenerator
return value
If we execute:
evalState rollDie (mkStdGen 0)
then we get a return type of Int.
This much I understand, but I am wondering if it is possible to wire into this logic the use of the system generator accessed by the function getStdGen. The getStdGen function operates in the IO monad, and my question is (surely this must be the MOST often asked Haskell question) how can you get the generator out of the IO context to use in the non IO monad code above?
Apologies for the newbie question. I am aware that one should not use unsafePerformIO, but otherwise perplexed.

The core of the matter is that it's impossible to produce random numbers out of thin air. You either seed a generator with some input (that's what you do with mkStdGen 0), or you can take the system one with getStdGen.
The problem with mkStdGen 0 is that it's a constant, so your stream of random numbers will show its pseudo-randomness very clearly by being a constant too.
The problem with getStdGen is that it's in the IO monad.
You can't get the generator out of IO to use in a pure computation, that's the whole point of making IO a monad. But within IO, you can bind it using the normal do-notation:
main = do
gen <- getStrGen
print $ evalState rollDie gen
Of course, it'll only produce unique results once.
I heartily concur with the comment recommending MonadRandom. I've been using it for my random-based computations since I found out about it, and haven't looked back.

Reading a file into an array of strings

I'm pretty new to Haskell, and am trying to simply read a file into a list of strings. I'd like one line of the file per element of the list. But I'm running into a type issue that I don't understand. Here's what I've written for my function:
readAllTheLines hdl = (hGetLine hdl):(readAllTheLines hdl)
That compiles fine. I had thought that the file handle needed to be the same one returned from openFile. I attempted to simply show the list from the above function by doing the following:
displayFile path = show (readAllTheLines (openFile path ReadMode))
But when I try to compile it, I get the following error:
filefun.hs:5:43:
Couldn't match expected type 'Handle' with actual type 'IO Handle'
In the return type of a call of 'openFile'
In the first argument of 'readAllTheLines', namely
'(openFile path ReadMode)'
In the first argument of 'show', namely
'(readAllTheLines (openFile path ReadMode))'
So it seems like openFile returns an IO Handle, but hGetLine needs a plain old Handle. Am I misunderstanding the use of these 2 functions? Are they not intended to be used together? Or is there just a piece I'm missing?

Use readFile and lines for a better alternative.
readLines :: FilePath -> IO [String]
readLines = fmap lines . readFile
Coming back to your solution openFile returns IO Handle so you have to run the action to get the Handle. You also have to check if the Handle is at eof before reading something from that. It is much simpler to just use the above solution.
import System.IO
readAllTheLines :: Handle -> IO [String]
readAllTheLines hndl = do
eof <- hIsEOF hndl
notEnded eof
where notEnded False = do
line <- hGetLine hndl
rest <- readAllTheLines hndl
return (line:rest)
notEnded True = return []
displayFile :: FilePath -> IO [String]
displayFile path = do
hndl <- openFile path ReadMode
readAllTheLines hndl

To add on to Satvik's answer, the example below shows how you can utilize a function to populate an instance of Haskell's STArray typeclass in case you need to perform computations on a truly random access data type.
Code Example
Let's say we have the following problem. We have lines in a text file "test.txt", and we need to load it into an array and then display the line found in the center of that file. This kind of computation is exactly the sort situation where one would want to use a random access array over a sequentially structured list. Granted, in this example, there may not be a huge difference between using a list and an array, but, generally speaking, list accesses will cost O(n) in time whereas array accesses will give you constant time performance.
First, let's create our sample text file:
test.txt
This
is
definitely
a
test.
Given the file above, we can use the following Haskell program (located in the same directory as test.txt) to print out the middle line of text, i.e. the word "definitely."
Main.hs
{-# LANGUAGE BlockArguments #-} -- See footnote 1
import Control.Monad.ST (runST, ST)
import Data.Array.MArray (newArray, readArray, writeArray)
import Data.Array.ST (STArray)
import Data.Foldable (for_)
import Data.Ix (Ix) -- See footnote 2
populateArray :: (Integral i, Ix i) => STArray s i e -> [e] -> ST s () -- See footnote 3
populateArray stArray es = for_ (zip [0..] es) (uncurry (writeArray stArray))
middleWord' :: (Integral i, Ix i) => i -> STArray s i String -> ST s String
middleWord' arrayLength = flip readArray (arrayLength `div` 2)
middleWord :: [String] -> String
middleWord ws = runST do
let len = length ws
array <- newArray (0, len - 1) "" :: ST s (STArray s Int String)
populateArray array ws
middleWord' len array
main :: IO ()
main = do
ws <- words <$> readFile "test.txt"
putStrLn $ middleWord ws
Explanation
Starting with the top of Main.hs, the ST s monad and its associated function runST allow us to extract pure values from imperative-style computations with in-place updates in a referentially transparent manner. The module Data.Array.MArray exports the MArray typeclass as an interface for instantiating mutable array data types and provides helper functions for creating, reading, and writing MArrays. These functions can be used in conjunction with STArrays since there is an instance of MArray defined for STArray.
The populateArray function is the crux of our example. It uses for_ to "applicatively" loop over a list of tuples of indices and list elements to fill the given STArray with those list elements, producing a value of type () in the ST s monad.
The middleWord' helper function uses readArray to produce a String (wrapped in the ST s monad) that corresponds to the middle element of a given STArray of Strings.
The middleWord function instantiates a new STArray, uses populateArray to fill the array with values from a provided list of strings, and calls middleWord' to obtain the middle string in the array. runST is applied to this whole ST s monadic computation to extract the pure String result.
We finally use our middleWord function in main to find the middle word in the text file "test.txt".
Further Reading
Haskell's STArray is not the only way to work with arrays in Haskell. There are in fact Arrays, IOArrays, DiffArrays and even "unboxed" versions of all of these array types that avoid using the indirection of pointers to simply store "raw" values. There is a page on the Haskell Wikibook on this topic that may be worth some study. Before that, however, looking at the Wikibook page on mutable objects may give you some insight as to why the ST s monad allows us to safely compute pure values from functions that use imperative/destructive operations.
Footnotes
1 The BlockArguments language extension is what allows us to pass a do block directly to a function without any parentheses or use of the function application operator $.
2 As suggested by the Hackage documentation, Ix is a typeclass mainly meant to be used to specify types for indexing arrays.
3 The use of the Integral and Ix type constraints may be a bit of overkill, but it's used to make our type signatures as general as possible.

Dice Game in Haskell

I'm trying to spew out randomly generated dice for every roll that the user plays. The user has 3 rolls per turn and he gets to play 5 turns (I haven't implemented this part yet and I would appreciate suggestions).
I'm also wondering how I can display the colors randomly. I have the list of tuples in place, but I reckon I need some function that uses random and that list to match those colors. I'm struggling as to how.
module Main where
import System.IO
import System.Random
import Data.List
diceColor = [("Black",1),("Green",2),("Purple",3),("Red",4),("White",5),("Yellow",6)]
{-
randomList :: (RandomGen g) -> Int -> g -> [Integer]
random 0 _ = []
randomList n generator = r : randomList (n-1) newGenerator
where (r, newGenerator) = randomR (1, 6) generator
-}
rand :: Int -> [Int] -> IO ()
rand n rlst = do
num <- randomRIO (1::Int, 6)
if n == 0
then doSomething rlst
else rand (n-1) (num:rlst)
doSomething x = putStrLn (show (sort x))
main :: IO ()
main = do
--hSetBuffering stdin LineBuffering
putStrLn "roll, keep, score?"
cmd <- getLine
doYahtzee cmd
--rand (read cmd) []
doYahtzee :: String -> IO ()
doYahtzee cmd = do
if cmd == "roll"
then rand 5 []
else do print "You won"

There's really a lot of errors sprinkled throughout this code, which suggests to me that you tried to build the whole thing at once. This is a recipe for disaster; you should be building very small things and testing them often in ghci.
Lecture aside, you might find the following facts interesting (in order of the associated errors in your code):
List is deprecated; you should use Data.List instead.
No let is needed for top-level definitions.
Variable names must begin with a lower case letter.
Class prerequisites are separated from a type by =>.
The top-level module block should mainly have definitions; you should associate every where clause (especially the one near randomList) with a definition by either indenting it enough not to be a new line in the module block or keeping it on the same line as the definition you want it to be associated with.
do introduces a block; those things in the block should be indented equally and more than their context.
doYahtzee is declared and used as if it has three arguments, but seems to be defined as if it only has one.
The read function is used to parse a String. Unless you know what it does, using read to parse a String from another String is probably not what you want to do -- especially on user input.
putStrLn only takes one argument, not four, and that argument has to be a String. However, making a guess at what you wanted here, you might like the (!!) and print functions.
dieRoll doesn't seem to be defined anywhere.
It's possible that there are other errors, as well. Stylistically, I recommend that you check out replicateM, randomRs, and forever. You can use hoogle to search for their names and read more about them; in the future, you can also use it to search for functions you wish existed by their type.

Haskell and random numbers

I've been messing with Haskell few days and stumbled into a problem.
I need a method that returns a random list of integers ( Rand [[Int]] ).
So, I defined a type: type Rand a = StdGen -> (a, StdGen).
I was able to produce Rand IO Integer and Rand [IO Integer] ( (returnR lst) :: StdGen -> ([IO Integer], StdGen) ) somehow. Any tips how to produce Rand [[Int]]?

How to avoid the IO depends on why it's being introduced in the first place. While pseudo-random number generators are inherently state-oriented, there's no reason IO needs to be involved.
I'm going to take a guess and say that you're using newStdGen or getStdGen to get your initial PRNG. If that's the case, then there's no way to completely escape IO. You could instead seed the PRNG directly with mkStdGen, keeping in mind that the same seed will result in the same "random" number sequence.
More likely, what you want to do is get a PRNG inside IO, then pass that as an argument to a pure function. The entire thing will still be wrapped in IO at the end, of course, but the intermediate computations won't need it. Here's a quick example to give you the idea:
import System.Random
type Rand a = StdGen -> (a, StdGen)
getPRNG = do
rng <- newStdGen
let x = usePRNG rng
print x
usePRNG :: StdGen -> [[Int]]
usePRNG rng = let (x, rng') = randomInts 5 rng
(y, _) = randomInts 10 rng'
in [x, y]
randomInts :: Int -> Rand [Int]
randomInts 0 rng = ([], rng)
randomInts n rng = let (x, rng') = next rng
(xs, rng'') = randomInts (n - 1) rng'
in (x:xs, rng'')
You might notice that the code using the PRNG gets pretty ugly due to passing the current value back and forth constantly. It's also potentially error prone, since it'd be easy to accidentally reuse an old value. As mentioned above, using the same PRNG value will give the same sequence of numbers, which is usually not what you want. Both problems are a perfect example of where it makes sense to use a State monad--which is getting off topic here, but you may want to look into it next.

You are recreating MonadRandom on Hackage. If this is more than just an experiment to see if you can do it, you might want to use that library instead.

If you want to get an infinite list of Integers you're going to run into problems as you won't ever get a StdGen value back out. What you want to do here is split the StdGen first, pass one half out again and 'use up' the other half to generate an infinite list of integers. Something like this:
infiniteRandomInts :: Rand [Int]
infiniteRandomInts g = (ints, g2) where
(g1,g2) = split g
ints = randoms g1
However, if you then repeat this approach to get an infinite matrix of Integers (which you seem to want, by using Rand [[Int]]), you might run into problems of a statistical nature: I don't know how well StdGen deals with repeated splitting. Maybe another implementation of RandomGen might be better, or you could try to use some sort of diagonal walk to turn a [Int] into a [[Int]].

Using monadic notation, you should be able to write something like
randomList gen = do
randomLength <- yourRandomInteger
loop gen (randomLength + 1)
where
loop gen 1 = gen
loop gen n = do { x <- gen; xs <- loop gen (n - 1); return (x:xs) }
And with this
randomInts :: Rand [Int]
randomInts = randomList yourRandomInteger
randomLists :: Rand [[Int]]
randomLists = randomList randomInts
Concerning the monadic computations itself, take a look at this article. Note that random generators are pure, you shouldn't need any IO just for this purpose.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string