What's the best way to handle random number generation in Haskell (or what are the tradeoffs)?
I haven't really seen an authoritative answer.
Consider: minimizing the impact on otherwise pure functions, how and when to seed, performance, and thread safety.
IMHO, the best idea is to keep the generator in a strict State monad. Then you can use ordinary do-notation to work with the generator. Seeding is done only once, at the beginning of the main program (or at the beginning of each thread). You can avoid going back to IO for further seeds by using the split operation, which yields two distinct random generators from one.
Since the state is still pure, thread safety can be guaranteed. Additionally, you can always escape the state by passing a random generator directly to a function. This is useful, for instance, in automated unit tests.
I have a function with a lot of calls to random.choice, random.choices, random.randint, random.uniform, etc., and I have to put random.seed before every one of them.
I use Python 3.8.6. Is there any way to keep a seed initialized just within the function, or at least a way to toggle it, instead of setting it every time?
It sounds like you have a misconception about how PRNGs work. They are generators (it's right there in the name!) which produce a sequence of values that are deterministic but are constructed in an attempt to be indistinguishable from true randomness based on statistical tests. In essence, they attempt to pass a Turing test/imitation game for randomness. They do this by maintaining an internal state of bits, which gets updated via a deterministic algorithm with every call to the generator. The only role a seed is supposed to play is to set the initial internal state so that separate runs create reproducible sequences. Repeating the seed for separate runs can be useful for debugging and for playing tricks to reduce the variability of some classes of estimators in Monte Carlo simulation.
All PRNGs eventually cycle. Since the internal state is composed of a finite number of bits and the state update mechanism is deterministic, the entire sequence will repeat from the point where any state duplication occurs. In other words, the output sequence is actually a loop of values. The seed value has nothing to do with the quality of the pseudo-random numbers; that is determined by the state transition algorithm. Conceptually you can think of the seed as just being the entry point to the PRNG's cycle. (Note that this doesn't mean you have a cycle just because you observe the same output. Cycling occurs only when the internal state that produces the output repeats. That's why the 1980s and '90s saw an emergence of PRNGs whose state space contained more bits than the output space, allowing duplicate output values as predicted by the birthday problem without having the sequence repeat verbatim from that point on.)
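To make the "seed is just an entry point into the cycle" idea concrete, here is a minimal sketch using a toy linear congruential generator with a 4-bit state. The constants (a=5, c=3, m=16) and the names `lcg_step` and `cycle_from` are illustrative assumptions, chosen only so the whole cycle fits on one line; no real PRNG should be this small.

```python
def lcg_step(state, a=5, c=3, m=16):
    """One deterministic state update of a toy linear congruential generator."""
    return (a * state + c) % m


def cycle_from(seed, m=16):
    """Collect successive states starting at `seed` until a state repeats."""
    seen = []
    state = seed % m
    while state not in seen:
        seen.append(state)
        state = lcg_step(state, m=m)
    return seen


# Two different seeds walk the same loop of states; they only start at
# different positions on it.
cycle_a = cycle_from(0)
cycle_b = cycle_from(7)
offset = cycle_a.index(7)
same_loop = cycle_b == cycle_a[offset:] + cycle_a[:offset]
```

With these particular constants every one of the 16 possible states is visited before the sequence repeats, so changing the seed changes where you join the loop and nothing else.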
If you mess with a good PRNG by reseeding it multiple times, you're undoing all of the hard work that went into designing an algorithm which passes statistically based Turing tests. Since the seed does not determine the quality of the results, you're incurring additional cost (to spawn a new state from the seed), gaining no statistical benefit, and quite likely harming the ability to pass statistical testing. Don't do that!
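For the Python question specifically, one way to follow this advice is to create a single `random.Random` instance, seed it once, and pass it to whatever needs random values. The sketch below assumes hypothetical function names `simulate` and `run`; only `random.Random` and its `choice`, `randint`, and `uniform` methods come from the standard library.

```python
import random


def simulate(rng):
    """Draw a few values from the generator that was passed in."""
    return (
        rng.choice(["red", "green", "blue"]),
        rng.randint(1, 6),
        rng.uniform(0.0, 1.0),
    )


def run(seed=None):
    # Seed exactly once, when the generator is created. Passing seed=None
    # lets Python choose an unpredictable seed, which serves as the "toggle"
    # between reproducible and non-reproducible runs.
    rng = random.Random(seed)
    return [simulate(rng) for _ in range(3)]


reproducible_1 = run(seed=42)
reproducible_2 = run(seed=42)
```

Because the generator is a local object, it neither reads nor disturbs the module-level state used by plain `random.random()` calls elsewhere in the program.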
In his article "Why Functional Programming Matters," John Hughes argues that "Lazy evaluation is perhaps the most powerful tool for modularization in the functional programmer's repertoire." To do so, he provides an example like this:
Suppose you have two functions, "infiniteLoop" and "terminationCondition." You can do the following:
terminationCondition(infiniteLoop input)
Lazy evaluation, in Hughes' words "allows termination conditions to be separated from loop bodies." This is definitely true, since "terminationCondition" using lazy evaluation here means this condition can be defined outside the loop -- infiniteLoop will stop executing when terminationCondition stops asking for data.
But couldn't higher-order functions achieve the same thing as follows?
infiniteLoop(input, terminationCondition)
How does lazy evaluation provide modularization here that's not provided by higher-order functions?
Yes, you could use a passed-in termination check, but for that to work the author of infiniteLoop would have had to foresee the possibility of wanting to terminate the loop with that sort of condition, and hardwire a call to the termination condition into their function.
And even if the specific condition can be passed in as a function, the "shape" of it is predetermined by the author of infiniteLoop. What if they give me a termination condition "slot" that is called on each element, but I need access to the last several elements to check some sort of convergence condition? Maybe for a simple sequence generator you could come up with "the most general possible" termination condition type, but it's not obvious how to do so and remain efficient and easy to use. Do I repeatedly pass the entire sequence so far into the termination condition, in case that's what it's checking? Do I force my callers to wrap their simple termination conditions up in a more complicated package so they fit the most general condition type?
The callers certainly have to know exactly how the termination condition is called in order to supply a correct condition. That could be quite a bit of dependence on this specific implementation. If they switch to a different implementation of infiniteLoop written by another third party, how likely is it that exactly the same design for the termination condition would be used? With a lazy infiniteLoop, I can drop in any implementation that is supposed to produce the same sequence.
And what if infiniteLoop isn't a simple sequence generator, but actually generates a more complex infinite data structure, like a tree? If all the branches of the tree are independently recursively generated (think of a move tree for a game like chess) it could make sense to cut different branches at different depths, based on all sorts of conditions on the information generated thus far.
If the original author didn't prepare (either specifically for my use case or for a sufficiently general class of use cases), I'm out of luck. The author of the lazy infiniteLoop can just write it the natural way, and let each individual caller lazily explore what they want; neither has to know much about the other at all.
Furthermore, what if the decision to stop lazily exploring the infinite output is actually interleaved with (and dependent on) the computation the caller is doing with that output? Think of the chess move tree again; how far I want to explore one branch of the tree could easily depend on my evaluation of the best option I've found in other branches of the tree. So either I do my traversal and calculation twice (once in the termination condition to return a flag telling infiniteLoop to stop, and then once again with the finite output so I can actually have my result), or the author of infiniteLoop had to prepare for not just a termination condition, but a complicated function that also gets to return output (so that I can push my entire computation inside the "termination condition").
Taken to extremes, I could explore the output and calculate some results, display them to a user and get input, and then continue exploring the data structure based on the user's input (without calling infiniteLoop again). The original author of the lazy infiniteLoop need have no idea that I would ever think of doing such a thing, and it will still work. If purity is enforced by the type system, that would be impossible with the passed-in termination condition approach unless the whole of infiniteLoop is allowed to have side effects whenever the termination condition needs them (say, by giving the whole thing a monadic interface).
In short, getting the same flexibility as lazy evaluation from a strict infiniteLoop that takes higher-order functions to control it can mean a large amount of extra complexity for both the author of infiniteLoop and its caller (unless a variety of simpler wrappers are exposed, and one of them matches the caller's use case). Lazy evaluation can allow producers and consumers to be almost completely decoupled, while still giving the consumer the ability to control how much output the producer generates. Everything you can do that way you can do with extra function arguments, as you say, but it requires the producer and consumer to essentially agree on a protocol for how the control functions work. That protocol is almost always either specialised to the use case at hand (tying the consumer and producer together) or so complicated in order to be fully general that the producer and consumer end up tied to that protocol, which is unlikely to be recreated elsewhere, and so they're still tied together.
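As a rough illustration of a termination condition that needs more than one element, here is a sketch loosely modelled on the Newton-Raphson square root example from Hughes' paper. Python generators stand in for lazy evaluation here, which is only an approximation of what a lazy language gives you; the names `newton_sqrt_approximations` and `within` are assumptions of this sketch.

```python
from itertools import islice


def newton_sqrt_approximations(n, guess=1.0):
    """Infinite producer: successive Newton-Raphson approximations of sqrt(n)."""
    while True:
        yield guess
        guess = (guess + n / guess) / 2.0


def within(eps, approximations):
    """Consumer-side termination: stop once two consecutive values are close."""
    previous = next(approximations)
    for current in approximations:
        if abs(current - previous) <= eps:
            return current
        previous = current


# The producer knows nothing about convergence checks; each consumer
# decides for itself how much of the sequence to demand.
root = within(1e-9, newton_sqrt_approximations(2.0))
first_three = list(islice(newton_sqrt_approximations(2.0), 3))
```

Note that the producer has no "termination slot" at all. A different consumer, such as the `islice` call, can cut the same sequence off in a completely different way without the producer changing.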
I'm trying out the random number generation from the new library in C++11 for a simple dice class. I'm not really grasping what actually happens but the reference shows an easy example:
std::default_random_engine generator;
std::uniform_int_distribution<int> distribution(1,6);
int dice_roll = distribution(generator);
I read somewhere that with the "old" way you should only seed once (e.g. in the main function) in your application ideally. However I'd like an easily reusable dice class. So would it be okay to use this code block in a dice::roll() method although multiple dice objects are instantiated and destroyed multiple times in an application?
Currently I have made the generator a class member, and the last two lines are in the dice::roll() method. It looks okay, but before I compute statistics I thought I'd ask here...
Think of instantiating a pseudo-random number generator (PRNG) as digging a well: it's the overhead you have to go through to be able to get access to water. Generating instances of a pseudo-random number is like dipping into the well. Most people wouldn't dig a new well every time they want a drink of water, so why incur the unnecessary overhead of multiple instantiations to get additional pseudo-random numbers?
Beyond the unnecessary overhead, there's a statistical risk. The underlying implementations of PRNGs are deterministic functions that update some internally maintained state to generate the next value. The functions are very carefully crafted to give a sequence of uncorrelated (but not independent!) values. However, if the state of two or more PRNGs is initialized identically via seeding, they will produce the exact same sequences. If the seeding is based on the clock (a common default), PRNGs initialized within the same tick of the clock will produce identical results. If your statistical results have independence as a requirement then you're hosed.
Unless you really know what you're doing and are trying to use correlation induction strategies for variance reduction, best practice is to use a single instantiation of a PRNG and keep going back to it for additional values.
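The identical-seed risk and the single-generator recommendation can be sketched in a few lines. This is written in Python rather than C++11 `<random>`, purely as an illustration of the idea; the `Dice` class and the seed value 12345 are assumptions of the sketch, not a translation of the asker's code.

```python
import random


class Dice:
    """A die that borrows a shared generator instead of creating its own."""

    def __init__(self, rng, sides=6):
        self.rng = rng
        self.sides = sides

    def roll(self):
        return self.rng.randint(1, self.sides)


# Two generators initialised with the same seed produce the same sequence,
# so dice built on them are perfectly correlated.
twin_a = random.Random(12345)
twin_b = random.Random(12345)
rolls_a = [Dice(twin_a).roll() for _ in range(10)]
rolls_b = [Dice(twin_b).roll() for _ in range(10)]

# One shared generator: dice objects are created and discarded freely,
# while the underlying sequence simply keeps advancing.
shared = random.Random(12345)
shared_rolls = [Dice(shared).roll() for _ in range(20)]
```

In this arrangement a dice object is cheap to create and destroy because it holds only a reference to the generator, which is constructed and seeded once by whoever owns it.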
I don't quite understand determinism in the context of concurrency and parallelism in Haskell. Some examples would be helpful.
Thanks
When dealing with pure values, the order of evaluation does not matter. That is essentially what parallelism does: evaluating pure values in parallel. As opposed to pure values, order usually matters for actions with side effects. Running actions simultaneously is called concurrency.
As an example, consider the two actions putStr "foo" and putStr "bar". Depending on the order in which those two actions are executed, the output is either "foobar", "barfoo", or some interleaving of the two. The output is nondeterministic, as it depends on the specific order of execution.
As another example, consider the two values sum [1..10] and 5 * 3. Regardless of the order in which those two get evaluated, they always reduce to the same results. This determinism is something you can usually only guarantee with pure values.
Concurrency and parallelism are two different things.
Concurrency means that you have multiple threads interacting non-deterministically. For example, you might have a chat server where each client is handled by one thread. The non-determinism is essential to the system you're trying to model.
Parallelism is about using multiple threads for simply making your program run faster. However, the end result should be exactly the same as if you run the algorithm sequentially.
Many languages don't have primitives for parallelism, so you have to implement it using concurrency primitives like threads and locks. However, this means that you, the programmer, have to be careful to ensure that you don't accidentally introduce unwanted non-determinism or other concurrency issues. With explicit parallelism primitives like par and pseq, many of these concerns simply go away.
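The "same result as the sequential run" property can be sketched outside Haskell too. The example below uses Python's `concurrent.futures` as a stand-in for par and pseq, which is an assumption of this sketch and not an equivalent mechanism; the function `expensive` is hypothetical. The point is determinism, not speed, since CPython threads generally do not accelerate CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def expensive(n):
    """A pure function: the result depends only on the argument."""
    return sum(i * i for i in range(n))


inputs = [10, 1000, 100, 10000]

# Plain sequential evaluation.
sequential = [expensive(n) for n in inputs]

with ThreadPoolExecutor(max_workers=4) as pool:
    # Executor.map yields results in input order, regardless of the order
    # in which the worker threads happen to finish.
    parallel = list(pool.map(expensive, inputs))
```

Because `expensive` has no side effects, the scheduling of the threads cannot be observed in the result, which is the sense of determinism described above.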
I have a game tree that is too big to walk in its entirety.
How can I write a function that will evaluate the tree until a time limit or depth limit is reached?
It would help to have a bit more detail, I think. Also, you raise two entirely separate issues--do you want both limits applied simultaneously, or are you looking for how to do each independently? That said, in rough terms:
Time limit: This is clearly impossible without using IO, to start with. Assuming your game tree traversal function is largely pure you'd probably prefer not to intertwine it with a bunch of time-tracking that determines control flow. I think the simplest thing here is probably to have the traversal produce a stream of progressively better results, place each "best result so far" into an MVar or such, and run it on a separate thread. If the time limit is reached, just kill the thread and take the current value from the MVar.
Depth limit: The most thorough way to do this would be to simply perform a breadth-first search, wouldn't it? If that's not viable for whatever reason, I don't think there's any better solution than the obvious one of simply keeping a counter to indicate the current depth and not continuing deeper when the maximum is reached. Note that this is a case where the code can potentially be tidied up using a Reader-style monad, where each recursive call is wrapped in something like local (subtract 1).
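The depth-counter approach might look roughly like the following sketch. It is written in Python for brevity and does a plain maximisation instead of a real game-tree search; the function `evaluate`, the toy tree in `children`, and the scoring rule in `score` are all hypothetical.

```python
def evaluate(node, depth_limit, children, score):
    """Best score reachable from `node`, looking at most `depth_limit` levels down."""
    # Children are only generated while there is depth budget left, so an
    # infinite tree is never expanded beyond the limit.
    successors = children(node) if depth_limit > 0 else []
    if not successors:
        return score(node)
    return max(
        evaluate(child, depth_limit - 1, children, score) for child in successors
    )


def children(n):
    """Toy infinite binary tree over the positive integers."""
    return [2 * n, 2 * n + 1]


def score(n):
    """Arbitrary static evaluation of a position."""
    return n % 7


depth_0 = evaluate(1, 0, children, score)
depth_1 = evaluate(1, 1, children, score)
depth_2 = evaluate(1, 2, children, score)
```

Each recursive call passes `depth_limit - 1`, which plays the same role as wrapping the call in local (subtract 1) with a Reader-style monad.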
The timeout function from System.Timeout in the base package allows you to kill a computation after a certain period. Interleaving timeout with a stream of increasingly deeper results, such that the most recent result is stored in an MVar, is a relatively common trick for search problems in Haskell.
You can also use a lazy writer monad for your traversal, generating a list of improving answers. Now you've simplified your problem somewhat, to just taking the first "good enough" or "best so far" result from the list by some criteria. On top of that you can use the timeout trick that dons described, or any other approach you think appropriate...
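The "take the best result so far from a stream of improving answers" idea can be sketched as below, in Python. This is a simplification under stated assumptions: it checks the clock only between results, so unlike killing a thread or using timeout it cannot interrupt a single long-running step. The name `best_before` and the injectable `clock` parameter are hypothetical.

```python
import itertools
import time


def best_before(results, time_limit, clock=time.monotonic):
    """Return the last result produced before `time_limit` seconds elapse.

    `results` is any iterable of progressively better answers, possibly
    infinite. At least one result is always consumed.
    """
    deadline = clock() + time_limit
    best = None
    for result in results:
        best = result
        if clock() >= deadline:
            break
    return best


# A finite stream is simply consumed to the end if time allows.
best_of_finite = best_before([10, 20, 30], time_limit=60.0)

# An infinite stream of "deeper" results, cut off by a fake clock that
# advances one unit per call so the behaviour is repeatable.
fake_ticks = itertools.count()
best_of_infinite = best_before(
    itertools.count(1), time_limit=3, clock=lambda: next(fake_ticks)
)
```

Passing the clock in as a parameter keeps the consumer testable without real waiting; in normal use the default `time.monotonic` would be left in place.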