Linear congruential generator in Haskell

This is a very simple linear congruential pseudo-random number generator. It works fine when I seed it, but I want it to self-seed with every number it produces. The problem is that I don't know how to do that in Haskell, where mutable variables do not exist. I can feed the produced number back in recursively, but then my result would be a list of integers instead of a single number.
linCongGen :: Int -> Int
linCongGen seed = ((2*seed) + 3) `mod` 100

I'll summarize the comments a bit more meaningfully. The simplest solution is, like you observed, an infinite list of the sequence of generated elements. Then, every time you want to get a new number, pop off the head of that list.
linCongGen :: Integral a => a -> [a]
linCongGen = iterate $ \x -> ((2*x) + 3) `mod` 100
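As a rough sketch of how that list might be consumed: the caller takes what it needs from the front and keeps the tail for later. The names linCongStream, firstFive and pop are made up for this illustration, and the seed 71 is arbitrary.

```haskell
-- Same recurrence as above, renamed so it does not clash with linCongGen.
linCongStream :: Integral a => a -> [a]
linCongStream = iterate (\x -> (2 * x + 3) `mod` 100)

-- Take the first five values; laziness means nothing beyond them is computed.
firstFive :: [Int]
firstFive = take 5 (linCongStream 71)

-- "Popping" a number: return the head together with the remaining stream.
-- A stream built by iterate is infinite, so the first case always applies.
pop :: [a] -> (a, [a])
pop (x : rest) = (x, rest)
pop []         = error "pop: empty stream"
```

With seed 71, firstFive evaluates to [71,45,93,89,81]. Note that the caller has to carry the remaining stream around explicitly, which is exactly the state-passing the question was hoping to avoid.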
That said, here is a solution (which I do not agree with), but which does what I think you want. For mutable state, we usually use IORef, which is sort of like a reference or pointer. Here is the code. Please read the disclaimer afterwards though.
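import Data.IORef
import System.IO.Unsafe (unsafePerformIO)
-- NOINLINE keeps GHC from inlining seed and creating more than one IORef
{-# NOINLINE seed #-}
seed :: IORef Int
seed = unsafePerformIO $ newIORef 71
linCongGen :: IO Int
linCongGen = do
    previous <- readIORef seed
    modifyIORef' seed $ \x -> ((2*x) + 3) `mod` 100
    return previous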
And here is a sample usage printing out the first hundred numbers generated: main = replicateM_ 100 $ linCongGen >>= print (you'll need to have Control.Monad imported too for replicateM_).
DISCLAIMER
This is a bit of a hacky approach described here. As the link says "Maybe the need for global mutable state is a symptom of bad design." The link also has a good description of a more intelligent workaround. Making an IORef is an inherently IO operation, and we really shouldn't be using unsafePerformIO on it. If you find yourself fighting Haskell in this way, it's because Haskell was designed to get in your way when you are doing things you shouldn't.
That said, I find comfort in knowing that this approach is also the one used in System.Random (the standard random number module) to define the initial seed (check out the source).
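For comparison, here is a minimal sketch of one way to avoid the global altogether: create the IORef inside IO and pass it to whoever needs numbers. This is only an illustration and not necessarily the workaround the link describes; newGen, nextValue and firstThree are hypothetical names.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef')

-- Create a generator state holding the given seed (hypothetical helper).
newGen :: Int -> IO (IORef Int)
newGen = newIORef

-- Return the current value and advance the state with the same recurrence.
nextValue :: IORef Int -> IO Int
nextValue ref = atomicModifyIORef' ref step
  where
    step x = ((2 * x + 3) `mod` 100, x)  -- (new state, returned value)

-- Draw three numbers from a generator seeded with 71.
firstThree :: IO [Int]
firstThree = do
  gen <- newGen 71
  a <- nextValue gen
  b <- nextValue gen
  c <- nextValue gen
  return [a, b, c]
```

No unsafePerformIO is needed here, at the cost of threading the reference through every function that wants a number.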

Related

Unit testing IO Int and similar in Haskell

From Ninety-Nine Haskell Problems:
Question 23: Extract a given number of randomly selected elements from a list.
This is a partial solution. For simplicity, this code just selects one element from a list.
import System.Random (randomRIO)
randItem :: [a] -> IO a
randItem xs = do
    i <- randomRIO (0,length xs - 1)
    return $ xs !! i
so randItem [1..10] would return an IO Int that corresponds to (but does not equal) some Int from 1 through 10.
So far, so good. But what kind of tests can I write for my randItem function? What relationship--if any--can I confirm between the input and output?
I can use the same logic as the above function to produce m Bool, but I cannot figure out how to test for m Bool. Thank you.
There's a couple of things you can do. If you're using QuickCheck, you could write the following properties:
The length of the returned list should be equal to the input length.
All elements in the returned list should be elements of the list of candidates.
Apart from that, the beauty of Haskell's Random library is that (like most other Haskell code) it's deterministic. Instead of basing your implementation on randomRIO, you could base it on randomR or randomRs. This would enable you to pass known RandomGen values to deterministic unit test cases (not QuickCheck), which could serve as regression tests.
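A minimal sketch of that idea, assuming the random package is available. randItemPure and prop_fixedSeed are names invented for this example, and the list is assumed to be non-empty, as in the original randItem.

```haskell
import System.Random (RandomGen, mkStdGen, randomR)

-- Pure variant of randItem: the generator is passed in and handed back.
-- Precondition (as in the original): the list must be non-empty.
randItemPure :: RandomGen g => [a] -> g -> (a, g)
randItemPure xs gen = (xs !! i, gen')
  where
    (i, gen') = randomR (0, length xs - 1) gen

-- Deterministic check: a fixed seed always yields the same element,
-- and that element is drawn from the candidate list.
prop_fixedSeed :: Int -> Bool
prop_fixedSeed s = x == y && x `elem` candidates
  where
    candidates = [1 .. 10] :: [Int]
    (x, _) = randItemPure candidates (mkStdGen s)
    (y, _) = randItemPure candidates (mkStdGen s)
```

The IO version can then be recovered by feeding in a generator obtained in IO, while the tests keep using mkStdGen with known seeds.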
I've now published an article about the above approach, complete with source code.

Writing fusible O(1) update for vector

This is a continuation of this question. Since the vector library doesn't seem to have a fusible O(1) update function, I am wondering whether it is possible to write one that doesn't involve unsafeFreeze and unsafeThaw. I guess it would use the vector stream representation, but I am not familiar with how to write one using stream and unstream, hence this question. The motivation is a cache-friendly update function on vectors where only a narrow region of the vector is modified, so we don't want to walk through the entire vector just to process that region (and this operation can happen billions of times in each function call, so the overhead has to be really low). Transformation functions like map process the entire vector, so they will be too slow.
I have a toy example of what I want to do, but the upd function below uses unsafeThaw and unsafeFreeze - it doesn't seem to be optimized away in the core, and also breaks the promise of not using the buffer further:
module Main where
import Data.Vector.Unboxed as U
import Data.Vector.Unboxed.Mutable as MU
import Control.Monad.ST
upd :: Vector Int -> Int -> Int -> Vector Int
upd v i x = runST $ do
    v' <- U.unsafeThaw v
    MU.write v' i x
    U.unsafeFreeze v'
sum :: Vector Int -> Int
sum = U.sum . (\x -> upd x 0 73) . (\x -> upd x 1 61)
main = print $ Main.sum $ U.fromList [1..3]
I know how to implement imperative algorithms using STVector. In case you are wondering why this alternative approach, I want to try out this approach of using pure vectors to check how GHC transformation of a particular algorithm differs when written using fusible pure vector streams (with monadic operations under the hood of course).
When the algorithm is written using STVector, it doesn't seem to be as nicely iterative as I would like it to be (I guess it is harder for the GHC optimizer to spot loops when there is a lot of mutability strewn around). So, I am investigating this alternative approach to see if I can get a nicer loop in there.
The upd function you have written does not look correct, let alone fusible. Fusion is a library-level optimization and requires you to build your code out of certain primitives. In this case what you want is not just fusion but recycling, which can easily be achieved via bulk update operations such as // and update. These operations will fuse, and even happen in place much of the time.
If you really want to write your own destructive-update code, DO NOT use unsafeThaw; use modify.
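As a hedged sketch of what that could look like for the toy example: modify and (//) are existing functions in Data.Vector.Unboxed, while the helper names updModify, updBulk and total are invented here. Whether the update actually happens in place depends on what the optimizer can prove, so inspect the core before relying on it.

```haskell
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as MU

-- Single-element update via modify: the vector is copied only if it
-- cannot safely be updated in place.
updModify :: U.Vector Int -> Int -> Int -> U.Vector Int
updModify v i x = U.modify (\mv -> MU.write mv i x) v

-- Bulk update of several indices at once with (//).
updBulk :: U.Vector Int -> U.Vector Int
updBulk v = v U.// [(0, 73), (1, 61)]

-- The pipeline from the question, without unsafeThaw/unsafeFreeze.
total :: U.Vector Int -> Int
total = U.sum . (\v -> updModify v 0 73) . (\v -> updModify v 1 61)
```

Unlike the unsafeThaw version, the input vector is still safe to use after the call, because modify never mutates a vector that is shared.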
Any function is a fusible update function; you seem to be trying to escape from the programming model the vector package is trying to get you to use
module Main where
import Data.Vector.Unboxed as U
change :: Int -> Int -> Int
change 0 n = 73
change 1 n = 61
change m n = n
myfun2 = U.sum . U.imap change . U.enumFromStepN 1 1
main = print $ myfun2 30000000
-- this doesn't create any vectors much less 'update' them, as you will see if you study the core.

Procedurally generating large list of values in Haskell -- most idiomatic approach? memory management?

I have a function that takes a series of random numbers/floats, and uses them to generate a value/structure (ie, taking a random velocity and position of the point a ball is thrown from and outputting the coordinates of where it would land). And I need to generate several thousands in succession.
The way I have everything implemented is that each calculation takes in a StdGen, uses it to generate several numbers, and passes out a new StdGen so that it can be chained to the next one.
To do this for 10000 items, I build a sort of list from generate_item n, which outputs a (value, gen) tuple (value being the thing I'm trying to calculate), where gen is the StdGen returned by the calculations involved in generate_item (n-1).
However, this program slows to a crawl at around a thousand results or so, and definitely does not seem scalable. Could it have to do with the fact that I am storing all of the generate_item results in memory?
Or is there a more idiomatic way of approaching this problem in Haskell, using monads or something, than what I have described above?
Note that the code to generate the algorithm from the random value generates 10k within seconds even in high-level scripting languages like ruby and python; these calculations are hardly intensive.
Code
-- helper functions that take in StdGen and return (Result,new StdGen)
plum_radius :: StdGen -> (Float,StdGen)
unitpoint :: Float -> StdGen -> ((Float,Float,Float),StdGen)
plum_speed :: Float -> StdGen -> (Float,StdGen)
-- The overall calculation of the value
plum_point :: StdGen -> (((Float,Float,Float),(Float,Float,Float)),StdGen)
plum_point gen = (((px,py,pz),(vx,vy,vz)),gen_out)
  where
    (r, gen2) = plum_radius gen
    ((px,py,pz),gen3) = unitpoint r gen2
    (s, gen4) = plum_speed r gen3
    ((vx,vy,vz),gen5) = unitpoint s gen4
    gen_out = gen5
-- Turning it into some kind of list
plum_data_list :: StdGen -> Int -> (((Float,Float,Float),(Float,Float,Float)),StdGen)
plum_data_list seed_gen 0 = plum_point seed_gen
plum_data_list seed_gen i = plum_point gen2
  where
    (_,gen2) = plum_data_list seed_gen (i-1)
-- Getting 100 results
main = do
    gen <- getStdGen
    let data_list = map (plum_data_list gen) [1..100]
    -- assumes: import qualified Data.List as List
    putStrLn (List.intercalate " " (map show data_list))
Consider just using the mersenne-twister and the vector-random package, which is specifically optimized to generate large amounts of high-quality random data.
Lists are unsuitable for allocating large amounts of data -- better to use a packed representation -- unless you're streaming.
First of all, the pattern you are describing -- taking an StdGen and then returning a tuple with a value and another StdGen to be chained into the next computation -- is exactly the pattern the State monad encodes. Refactoring your code to use it might be a good way to start to become familiar with monadic patterns.
As for your performance problem, StdGen is notoriously slow. I haven't done a lot with this stuff, but I've heard mersenne twister is faster.
However, you might also want to post your code, since in cases where you are generating large lists, laziness can really work to your advantage or disadvantage depending on how you are doing it. But it is hard to give specific advice without seeing what you are doing. One rule of thumb, just in case you are coming from another functional language such as Lisp: when generating a list (or another lazy data structure, e.g. a tree, but not an Int), avoid tail recursion. The intuition for it being faster does not transfer to lazy languages. E.g. use (written without the monadic style that I would actually use in practice)
-- assumes: import System.Random hiding (randoms)
randoms :: Int -> StdGen -> (StdGen, [Int])
randoms 0 g = (g, [])
randoms n g = let (x, g') = next g   -- next returns (value, new generator)
                  (g'', xs) = randoms (n-1) g'
              in (g'', x : xs)
This will allow the result list to be "streamed", so you can access the earlier parts of it before generating the later parts. (In this state case, it's a little subtle because accessing the resulting StdGen will have to generate the whole list, so you'll have to carefully avoid doing that until after you have consumed the list -- I wish there was a fast random generator that supported a good split operation, then you could get around having to return a generator at all).
Oh, just in case you're having trouble getting going with the monads thing, here's the above function written with a state monad:
randomsM :: Int -> State StdGen [Int]
randomsM 0 = return []
randomsM n = do
    x <- state next
    xs <- randomsM (n-1)
    return (x : xs)
See the correspondence?
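To make the correspondence something you can run without the random package, here is a sketch under stated assumptions: the state is a plain Int, lcgNext is a toy linear congruential step standing in for next, and Control.Monad.State comes from the mtl package. None of these stand-ins are part of the original code.

```haskell
import Control.Monad.State (State, evalState, state)

-- Stand-in for `next`: a toy linear congruential step on an Int state.
-- It returns (value, new state), the same shape as next :: g -> (Int, g).
lcgNext :: Int -> (Int, Int)
lcgNext s = (s', s')
  where
    s' = (1103515245 * s + 12345) `mod` 1073741824

-- Same structure as randomsM above, with the toy generator as the state.
randomsM :: Int -> State Int [Int]
randomsM 0 = return []
randomsM n = do
  x  <- state lcgNext
  xs <- randomsM (n - 1)
  return (x : xs)

-- Run the computation from a seed, discarding the final state.
sample :: Int -> Int -> [Int]
sample n seed = evalState (randomsM n) seed
```

The generator never appears by name in randomsM; the monad threads it from one state call to the next, which is the whole point of the refactoring.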
The other posters have good points, StdGen doesn't perform very well, and you should probably try to use State instead of manually passing the generator along. But I think the biggest problem is your plum_data_list function.
It seems to be intended to be some kind of lookup, but since it's implemented recursively without any memoization, the calls you make have to recurse to the base case. That is, plum_data_list seed_gen 100 needs the random generator from plum_data_list seed_gen 99 and so on, until plum_data_list seed_gen 0. This will give you quadratic performance when you try to generate a list of these values.
Probably the more idiomatic way is to let plum_data_list seed_gen generate an infinite list of points like so:
plum_data_list :: StdGen -> [((Float,Float,Float),(Float,Float,Float))]
plum_data_list seed_gen = first_point : plum_data_list seed_gen'
  where
    (first_point, seed_gen') = plum_point seed_gen
Then you just need to modify the code in main to something like take 100 $ plum_data_list gen, and you are back to linear performance.
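The same idea can also be written with unfoldr from Data.List. The sketch below is self-contained only because it swaps in a made-up toyPoint (a toy linear congruential step over an Int) for plum_point and StdGen; with the real functions the shape would presumably be unfoldr (Just . plum_point).

```haskell
import Data.List (unfoldr)

-- Hypothetical stand-in for plum_point: takes a generator state and
-- returns (value, new state). Here the "generator" is just an Int.
toyPoint :: Int -> (Int, Int)
toyPoint g = (g' `mod` 100, g')
  where
    g' = (1103515245 * g + 12345) `mod` 1073741824

-- Infinite lazy list of values; each element is produced once, in order,
-- so taking n elements costs n generator steps rather than n^2.
toyDataList :: Int -> [Int]
toyDataList = unfoldr (Just . toyPoint)

-- Only the first n elements are ever forced.
firstN :: Int -> Int -> [Int]
firstN n seed = take n (toyDataList seed)
```

Because every element reuses the generator left behind by the previous one, nothing is recomputed from the seed.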

SIMPLE random number generation

I'm writing this after a good while of frustrating research, and I'm hoping someone here can enlighten me about the topic.
I want to generate a simple random number in a Haskell function, but alas, this seems impossible to do without all sorts of non-trivial elements, such as monads, assignment in "do", creating generators, etc.
Ideally I was looking for an equivalent of C's "rand()". But after much searching I'm pretty convinced there is no such thing, because of how the language is designed. (If there is, please someone enlighten me). As that doesn't seem feasible, I'd like to find a way to get a random number for my particular problem, and a general explanation on how it works to get a random number.
prefixGenerator :: (Ord a, Arbitrary a) => Gen ([a],[a])
prefixGenerator = frequency [
    (1, return ([],[])),
    (2, do {
        xs1 <- orderedListEj13 ;
        xs2 <- orderedListEj13 ;
        return (xs1,xs2)
    }),
    (2, do {
        xs2 <- orderedListEj13 ;
        return ((take RANDOMNUMBERHERE xs2),xs2)
    })
    ]
I'm trying to get to grips with QuickCheck, but my inability to use random numbers is making it hard. I've tried something like this (putting a drawInt 0 (length xs2) in place of RANDOMNUMBERHERE), but I get stuck on the fact that take requires an Int and that method leaves me with an IO Int, which seems impossible to transform into an Int according to this.
As Haskell is a pure functional programming language, functions are referentially transparent, which essentially means that only a function's arguments determine its result. If you were able to pull a random number out of the air, you can imagine how that would cause problems.
I suppose you need something like this:
prefixGenerator :: (Ord a, Arbitrary a) => Gen ([a],[a])
prefixGenerator = do
    randn <- choose (1,999) -- number in range 1-999
    frequency [
        (1, return ([],[])),
        (2, do {
            xs1 <- orderedListEj13 ;
            xs2 <- orderedListEj13 ;
            return (xs1,xs2)
        }),
        (2, do {
            xs2 <- orderedListEj13 ;
            return ((take randn xs2),xs2)
        })
        ]
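A variation on the generator above, sketched under some assumptions: QuickCheck's own orderedList stands in for orderedListEj13 (which is not shown in the question), the element type is fixed to Int, and the prefix length is drawn with choose inside the branch so that it can depend on the length of xs2. The names prefixGeneratorInt and samplePair are invented for this example.

```haskell
import Test.QuickCheck (Gen, choose, frequency, generate, orderedList)

-- orderedList is used here in place of the question's orderedListEj13.
prefixGeneratorInt :: Gen ([Int], [Int])
prefixGeneratorInt = frequency
  [ (1, return ([], []))
  , (2, do
        xs1 <- orderedList
        xs2 <- orderedList
        return (xs1, xs2))
  , (2, do
        xs2 <- orderedList
        -- choose runs in Gen, so n is a plain Int by the time take sees it
        n <- choose (0, length xs2)
        return (take n xs2, xs2))
  ]

-- Draw one sample in IO, e.g. to inspect the generator in GHCi.
samplePair :: IO ([Int], [Int])
samplePair = generate prefixGeneratorInt
```

The key point is that the random number never has to leave the Gen monad: binding it with <- inside the do block gives take the Int it needs.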
In general in haskell you approach random number generation by either pulling some randomness from the IO monad, or by maintaining a PRNG that is initialized with some integer seed hard-coded, or pulled from IO (gspr's comment is excellent).
Reading about how pseudo random number generators work might help you understand System.Random, and this might help as well (scroll down to section on randomness).
You're right in that nondeterministic random (by which I mean "pseudo-random") number generation is impossible without trickery. Functions in Haskell are pure which means that the same input will always produce the same output.
The good news is that you don't seem to need a nondeterministic PRNG. In fact, it would be better if your QuickCheck test used the same sequence of "random" numbers each time, to make your tests fully reproducible.
You can do this with the mkStdGen function from System.Random. Adapted from the Haskell wiki:
import System.Random
import Data.List
randomInts :: Int -> [Int]
randomInts n = take n $ unfoldr (Just . random) (mkStdGen 4)
Here, 4 is the seed that you may want to choose by a fair dice roll.
The standard library provides a monad for random-number generation. The monadic stuff is not that hard to learn, but if you want to avoid it, find a pseudo-random function next that takes an Int to an Int in a pseudorandom way, and then just create and pass an infinite list of random numbers:
next :: Int -> Int
randoms :: [Int]
randoms = iterate next 73
You can then pass this list of random numbers wherever you need it.
Here's a linear congruential next from Wikipedia:
next n = (1103515245 * n + 12345) `mod` 1073741824
And here are the first 20 pseudorandom numbers following 73:
Prelude> take 20 $ iterate next 73
[73,25988430,339353199,182384508,910120965,1051209818,737424011,14815080,325218177,1034483750,267480167,394050068,4555453,647786674,916350979,980771712,491556281,584902142,110461279,160249772]

Storing values in a data structure Haskell

I'm trying to store randomly generated dice values in some data structure, but don't know how exactly to do it in Haskell. I have so far, only been able to generate random ints, but I want to be able to compare them to the corresponding color values and store the colors instead (can't really conceive what the function would look like). Here is the code I have --
module Main where
import System.IO
import System.Random
import Data.List
diceColor = [("Black",1),("Green",2),("Purple",3),("Red",4),("White",5),("Yellow",6)]
diceRoll = []
rand :: Int -> [Int] -> IO ()
rand n rlst = do
    num <- randomRIO (1::Int, 6)
    if n == 0
        then printList rlst -- here is where I need to do something to store the values
        else rand (n-1) (num:rlst)
printList x = putStrLn (show (sort x))
--matchColor x = doSomething()
main :: IO ()
main = do
    --hSetBuffering stdin LineBuffering
    putStrLn "roll, keep, score?"
    cmd <- getLine
    doYahtzee cmd
    --rand (read cmd) []
doYahtzee :: String -> IO ()
doYahtzee cmd = do
    if cmd == "roll"
        then do rand 5 []
        else putStrLn "Whatever"
After this, I want to give the user the ability to keep identical dice (as in accumulate points for them) and the choice to re-roll the leftover dice. I'm thinking this can be done by traversing the data structure (with the dice values), counting the repeated dice as points, and storing them in yet another data structure. If the user chooses to re-roll, he must be able to call random again and replace values in the original data structure.
I'm coming from an OOP background and Haskell is new territory for me. Help is much appreciated.
So, several questions; let's take them one by one:
First: how to generate something other than integers with the functions from System.Random (which is a slow generator, but for your application, performance isn't vital).
There are several approaches. With your list, you would have to write a function intToColor:
intToColor :: Int -> String
intToColor n = fst . head . filter (\p -> snd p == n) $ [("Black",1),("Green",2),("Purple",3),("Red",4),("White",5),("Yellow",6)]
Not really nice. Though you could do better if you wrote the pairs in (key, value) order instead, since there's a little bit of support for "association lists" in Data.List with the lookup function (fromJust comes from Data.Maybe):
intToColor n = fromJust . lookup n $ [(1,"Black"),(2,"Green"),(3,"Purple"),(4,"Red"),(5,"White"),(6,"Yellow")]
Or of course you could just forget this business of Int key from 1 to 6 in a list since lists are already indexed by Int :
intToColor n = ["Black","Green","Purple","Red","White","Yellow"] !! n
(Note that this function is a bit different, since intToColor 0 is now "Black" rather than intToColor 1. This is not really important given your objective, but if it really bothers you, you can write "!! (n-1)" instead.)
But since your colors are not really Strings and more like symbols, you should probably create a Color type :
data Color = Black | Green | Purple | Red | White | Yellow deriving (Eq, Ord, Show, Read, Enum)
So now Black is a value of type Color, you can use it anywhere in your program (and GHC will protest if you write Blak) and thanks to the magic of automatic derivation, you can compare Color values, or show them, or use toEnum to convert an Int into a Color !
So now you can write :
randColorIO :: IO Color
randColorIO = do
    n <- randomRIO (0,5)
    return (toEnum n)
Second, you want to store dice values (colors) in a data structure and give the option to keep identical throws. So first you should store the results of several throws. Given the maximum number of simultaneous throws (5) and the complexity of your data, a simple list is plenty, and given the number of functions for handling lists in Haskell, it is a good choice.
So you want to throw several dice:
nThrows :: Int -> IO [Color]
nThrows 0 = return []
nThrows n = do
    c <- randColorIO
    rest <- nThrows (n-1)
    return (c : rest)
That's a good first approach, and it's more or less what you do, except that you use if instead of pattern matching and you have an explicit accumulator argument (were you going for tail recursion?), which is not really better except with a strict accumulator (an Int rather than a list).
Of course, Haskell promotes higher-order functions rather than direct recursion, so let's see the combinators, searching "Int -> IO a -> IO [a]" with Hoogle gives you :
replicateM :: Monad m => Int -> m a -> m [a]
Which does exactly what you want :
nThrows n = replicateM n randColorIO
(I'm not sure I would even write this as a function since I find the explicit expression clearer and almost as short)
Once you have the results of the throws, you should check which are identical. I suggest you look at sort, group, map and length to achieve this (transforming your list of results into a list of lists of identical results; not the most efficient data structure, but at this scale the most appropriate choice). Then keeping the colors you got several times is just a matter of using filter.
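A small sketch of how those pieces could fit together. The Color type is repeated so the snippet stands on its own; groupThrows, repeated and counts are names made up for illustration, not part of any library.

```haskell
import Data.List (group, sort)

data Color = Black | Green | Purple | Red | White | Yellow
  deriving (Eq, Ord, Show, Read, Enum)

-- Sort the throws so equal colors are adjacent, then group them.
groupThrows :: [Color] -> [[Color]]
groupThrows = group . sort

-- Keep only the colors that came up more than once.
repeated :: [[Color]] -> [[Color]]
repeated = filter ((> 1) . length)

-- Count how often each color appeared (hypothetical helper).
counts :: [Color] -> [(Color, Int)]
counts xs = [(c, length g) | g@(c : _) <- groupThrows xs]
```

For example, groupThrows [Red, Black, Red, White, Black] gives [[Black,Black],[Red,Red],[White]], and applying repeated to that drops the lone White.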
Then you should write some more functions to handle interaction and scoring :
type Score = Int
yahtzee :: IO Score
yahtzeeStep :: Int -> [[Color]] -> IO [[Color]] -- recursive
scoring :: [[Color]] -> Score
So I recommend keeping and passing along a [[Color]] to keep track of what was put aside. This should be enough for your needs.
You are basically asking two different questions here. The first question can be answered with a function like getColor n = fst . head $ filter (\x -> snd x == n) diceColor.
Your second question, however, is much more interesting. You can't replace elements. You need a function that can call itself recursively, and this function will be driving your game. It needs to accept as parameters the current score and the list of kept dice. On entry the score will be zero and the kept dice list will be empty. It will then roll as many dice as needed to fill the list (I'm not familiar with the rules of Yahtzee), output it to the user, and ask for choice. If the user chooses to end the game, the function returns the score. If he chooses to keep some dice, the function calls itself with the current score and the list of kept dice. So, to sum it up, playGame :: Score -> [Dice] -> IO Score.
Disclaimer: I am, too, very much a beginner in Haskell.
at first thought:
rand :: Int -> IO [Int]
rand n = mapM id (take n (repeat (randomRIO (1::Int, 6))))
although the haskellers could remove the parens
