Write a prime module using file content in Haskell - haskell

I have a file containing many prime numbers.
I'd like to write a module with the following functions:
module Primes
  ( init,
    primes,
    is_prime
  ) where ...
where init should read the file and initialize the primes which should be a list and also is_prime.
My problem is, how should I write it? Isn't there a way to "hide" the IO monad?
More generally I believe I think like an OO programmer. What is the good functional way to handle this?

The way to 'hide' the IO monad is to write isPrime (and any other functions that use the list of prime numbers) as a pure function, and only introduce IO as late as you possibly need. The fact that you are using a list of pregenerated prime numbers and reading them from a file is just an implementation detail that is irrelevant to general functions dealing with prime numbers.
Here's a simple implementation of a pure isPrime that takes an integer to test and a list of prime numbers. Notice that it doesn't care where the list of primes came from, just that it's a list of integers.
isPrime :: Integer -> [Integer] -> Bool
isPrime n ps = n `elem` ps
Now let's introduce a function that will read the prime numbers from disk. Its return type must be in the IO monad because, well, we're performing I/O.
readPrimesFromFile :: String -> IO [Integer]
readPrimesFromFile filename = ...
Now we can use this function in combination with our isPrime function. Once we start using readPrimesFromFile we are forever 'trapped' in the IO monad.
main :: IO ()
main = do
  primeList <- readPrimesFromFile "primes.txt"
  let result = isPrime 123 primeList
  print result
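The body of readPrimesFromFile is left elided above. As a minimal sketch, assuming the file holds whitespace-separated decimal numbers (the question does not say what the format is), the parsing can itself be kept pure and only the file read placed in IO. The helper name parsePrimes is hypothetical.

```haskell
-- Pure part: turn the file's text into a list of integers.
-- Assumption: the numbers are separated by spaces or newlines.
parsePrimes :: String -> [Integer]
parsePrimes = map read . words

-- IO part: only the file read itself needs to live in IO.
readPrimesFromFile :: FilePath -> IO [Integer]
readPrimesFromFile filename = fmap parsePrimes (readFile filename)
```

Because parsePrimes is pure, it can be tested on a literal string without touching the disk.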

Why exactly do you start with a file of primes? The IO time to read the file is probably slower than just generating the primes from scratch.
See Data.Numbers.Primes for (afaik) the most efficient way to generate the list of all prime numbers.

Related

Unit testing IO Int and similar in Haskell

From Ninety-Nine Haskell Problems:
Question 23: Extract a given number of randomly selected elements from a list.
This is a partial solution. For simplicity, this code just selects one element from a list.
import System.Random (randomRIO)
randItem :: [a] -> IO a
randItem xs = do
  i <- randomRIO (0,length xs - 1)
  return $ xs !! i
so randItem [1..10] would return an IO Int that corresponds to (but does not equal) some Int from 1 through 10.
So far, so good. But what kind of tests can I write for my randItem function? What relationship--if any--can I confirm between the input and output?
I can use the same logic as the above function to produce m Bool, but I cannot figure out how to test for m Bool. Thank you.
There's a couple of things you can do. If you're using QuickCheck, you could write the following properties:
The length of the returned list should be equal to the input length.
All elements in the returned list should be elements of the list of candidates.
Apart from that, the beauty of Haskell's Random library is that (like most other Haskell code) it's deterministic. Instead of basing your implementation on randomRIO, you could base it on randomR or randomRs. This would enable you to pass some known RandomGen values to some deterministic unit test cases (not QuickCheck). These could serve as regression tests.
I've now published an article about the above approach, complete with source code.
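To make the randomR suggestion concrete, here is a minimal sketch of a pure variant of randItem. The names randItemPure and sample are hypothetical, and this is not taken from the linked article; it only assumes randomR and mkStdGen from System.Random.

```haskell
import System.Random (RandomGen, mkStdGen, randomR)

-- Pure variant: the caller supplies the generator and gets the updated one back.
-- Like the original randItem, this fails on an empty list.
randItemPure :: RandomGen g => [a] -> g -> (a, g)
randItemPure xs gen =
  let (i, gen') = randomR (0, length xs - 1) gen
  in (xs !! i, gen')

-- With a fixed seed the result is the same on every run, so a test can pin it down.
sample :: Int
sample = fst (randItemPure [1 .. 10] (mkStdGen 42))
```

The IO version can then be recovered by feeding in a generator obtained from IO, while tests use mkStdGen with a known seed.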

linear congruent generator in haskell

This is a very simple linear-congruent pseudo-random number generator. It works fine when I seed it, but I want to make it so that it self-seeds with every produced number. Problem is that I don't know how to do that in Haskell where the notion of variables does not exist. I can feed the produced number recursively, but then my result would be a list of integers instead of a single number.
linCongGen :: Int -> Int
linCongGen seed = ((2*seed) + 3) `mod` 100
I'll summarize the comments a bit more meaningfully. The simplest solution is, like you observed, an infinite list of the sequence of generated elements. Then, every time you want to get a new number, pop off the head of that list.
linCongGen :: Integral a => a -> [a]
linCongGen = iterate $ \x -> ((2*x) + 3) `mod` 100
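As a rough sketch of what "popping off the head" can look like in practice, a consumer can take what it needs from the stream and hand back the unused remainder for the next consumer. The helper name takeRandoms is hypothetical.

```haskell
-- The infinite stream of generated values, starting with the seed itself.
linCongGen :: Integral a => a -> [a]
linCongGen = iterate (\x -> (2 * x + 3) `mod` 100)

-- Take n numbers and return them together with the rest of the stream,
-- so the next consumer continues where this one stopped.
takeRandoms :: Int -> [a] -> ([a], [a])
takeRandoms = splitAt
```

For example, `let (firstThree, rest) = takeRandoms 3 (linCongGen 71)` gives `[71,45,93]`, and `rest` starts at 89.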
That said, here is a solution (which I do not agree with), but which does what I think you want. For mutable state, we usually use IORef, which is sort of like a reference or pointer. Here is the code. Please read the disclaimer afterwards though.
import Data.IORef
import System.IO.Unsafe
seed :: IORef Int
seed = unsafePerformIO $ newIORef 71
{-# NOINLINE seed #-}  -- keep GHC from inlining and duplicating the IORef
linCongGen :: IO Int
linCongGen = do previous <- readIORef seed
                modifyIORef' seed $ \x -> ((2*x) + 3) `mod` 100
                return previous
And here is a sample usage printing out the first hundred numbers generated: main = replicateM_ 100 $ linCongGen >>= print (you'll need to have Control.Monad imported too for replicateM_).
DISCLAIMER
This is a bit of a hacky approach described here. As the link says "Maybe the need for global mutable state is a symptom of bad design." The link also has a good description of a more intelligent workaround. Making an IORef is an inherently IO operation, and we really shouldn't be using unsafePerformIO on it. If you find yourself fighting Haskell in this way, it's because Haskell was designed to get in your way when you are doing things you shouldn't.
That said, I find comfort in knowing that this approach is also the one used in System.Random (the standard random number module) to define the initial seed (check out the source).

Procedurally generating large list of values in Haskell -- most idiomatic approach? memory management?

I have a function that takes a series of random numbers/floats, and uses them to generate a value/structure (ie, taking a random velocity and position of the point a ball is thrown from and outputting the coordinates of where it would land). And I need to generate several thousands in succession.
The way I have everything implemented is each calculation takes in an stdGen, uses it to generate several numbers, and passes out a new stdGen to allow it to be chained to another one.
And to do this for 10000 items, I make a sort of list from generate_item n, which basically outputs a (value,gen) tuple (the value being the value I'm trying to calculate), where gen is the StdGen recursively output by the calculations involved in getting the value from generate_item n-1.
However, this program slows to an impractical crawl at around a thousand results or so, and definitely does not seem scalable. Could it have to do with the fact that I am storing all of the generate_item results in memory?
Or is there a more idiomatic way of approaching this problem in Haskell, using monads or something, than what I have described above?
Note that the code to generate the algorithm from the random value generates 10k within seconds even in high-level scripting languages like ruby and python; these calculations are hardly intensive.
Code
-- helper functions that take in StdGen and return (Result,new StdGen)
plum_radius :: StdGen -> (Float,StdGen)
unitpoint :: Float -> StdGen -> ((Float,Float,Float),StdGen)
plum_speed :: Float -> StdGen -> (Float,StdGen)
-- The overall calculation of the value
plum_point :: StdGen -> (((Float,Float,Float),(Float,Float,Float)),StdGen)
plum_point gen = (((px,py,pz),(vx,vy,vz)),gen_out)
  where
    (r, gen2) = plum_radius gen
    ((px,py,pz),gen3) = unitpoint r gen2
    (s, gen4) = plum_speed r gen3
    ((vx,vy,vz),gen5) = unitpoint s gen4
    gen_out = gen5
-- Turning it into some kind of list
plum_data_list :: StdGen -> Int -> (((Float,Float,Float),(Float,Float,Float)),StdGen)
plum_data_list seed_gen 0 = plum_point seed_gen
plum_data_list seed_gen i = plum_point gen2
  where
    (_,gen2) = plum_data_list seed_gen (i-1)
-- Getting 100 results
-- (requires: import qualified Data.List as List)
main = do
  gen <- getStdGen
  let data_list = map (plum_data_list gen) [1..100]
  putStrLn $ List.intercalate " " (map show data_list)
Consider just using the mersenne-twister and the vector-random package, which is specifically optimized to generate large amounts of high-quality random data.
Lists are unsuitable for allocating large amounts of data -- better to use a packed representation -- unless you're streaming.
First of all, the pattern you are describing -- taking an StdGen and then returning a tuple with a value and another StdGen to be chained into the next computation -- is exactly the pattern the State monad encodes. Refactoring your code to use it might be a good way to start to become familiar with monadic patterns.
As for your performance problem, StdGen is notoriously slow. I haven't done a lot with this stuff, but I've heard mersenne twister is faster.
However, you might also want to post your code, since in cases where you are generating large lists, laziness can really work to your advantage or disadvantage depending on how you are doing it. But it is hard to give specific advice without seeing what you are doing. One rule of thumb just in case you are coming from another functional language such as Lisp -- when generating a list (or other lazy data structure -- e.g. a tree, but not an Int), avoid tail recursion. The intuition for it being faster does not transfer to lazy languages. E.g. use (written without the monadic style that I would actually use in practice)
randoms :: Int -> StdGen -> (StdGen, [Int])
randoms 0 g = (g, [])
randoms n g = let (x, g') = next g   -- next returns (value, new generator)
                  (g'', xs) = randoms (n-1) g'
              in (g'', x : xs)
This will allow the result list to be "streamed", so you can access the earlier parts of it before generating the later parts. (In this state case, it's a little subtle because accessing the resulting StdGen will have to generate the whole list, so you'll have to carefully avoid doing that until after you have consumed the list -- I wish there was a fast random generator that supported a good split operation, then you could get around having to return a generator at all).
Oh, just in case you're having trouble getting going with the monads thing, here's the above function written with a state monad:
randomsM :: Int -> State StdGen [Int]
randomsM 0 = return []
randomsM n = do
  x <- state next
  xs <- randomsM (n-1)
  return (x : xs)
See the correspondence?
The other posters have good points, StdGen doesn't perform very well, and you should probably try to use State instead of manually passing the generator along. But I think the biggest problem is your plum_data_list function.
It seems to be intended to be some kind of lookup, but since it's implemented recursively without any memoization, the calls you make have to recurse to the base case. That is, plum_data_list seed_gen 100 needs the random generator from plum_data_list seed_gen 99 and so on, until plum_data_list seed_gen 0. This will give you quadratic performance when you try to generate a list of these values.
Probably the more idiomatic way is to let plum_data_list seed_gen generate an infinite list of points like so:
plum_data_list :: StdGen -> [((Float,Float,Float),(Float,Float,Float))]
plum_data_list seed_gen = first_point : plum_data_list seed_gen'
  where
    (first_point, seed_gen') = plum_point seed_gen
Then you just need to modify the code in main to something like take 100 $ plum_data_list gen, and you are back to linear performance.
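The same "one value, then the next generator" shape can also be written with unfoldr from Data.List, which threads the state for you. The sketch below uses a hypothetical stand-in, stepGen, in place of plum_point (whose helpers are not shown in the question); the Int state and the constants are illustrative only.

```haskell
import Data.List (unfoldr)

-- Hypothetical stand-in for plum_point: a value plus the next generator state.
stepGen :: Int -> (Int, Int)
stepGen g = (g' `mod` 100, g')
  where
    g' = (1103515245 * g + 12345) `mod` 1073741824

-- Explicit recursion, in the shape of the plum_data_list above.
pointsRec :: Int -> [Int]
pointsRec g = x : pointsRec g'
  where
    (x, g') = stepGen g

-- The same stream via unfoldr; each element costs one step, so n elements cost n steps.
pointsUnfold :: Int -> [Int]
pointsUnfold = unfoldr (Just . stepGen)
```

Both definitions produce the same lazy, infinite list, so `take 100 (pointsUnfold seed)` does linear work.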

Haskell checking if number is from Fibonacci sequence

I'm a Haskell beginner. I recently learned about Fibonacci sequences, so I can generate the sequence. Now I'm wondering how to write a function that checks whether a number belongs to it.
I mean a function like:
belongToFib :: Int -> Bool
I don't really need code. Some hints on how to approach this would be enough. Thanks in advance.
I will give you some hints for a solution involving lazy evaluation:
Define the list of all fibonacci numbers.
Check whether your input number belongs to the sequence.
These are the signatures for the two things you'll need to define:
fib :: [Int]
belongToFib :: Int -> Bool
Of course you will need some tricks to make this work. Even though your list has a (theoretically) infinite sequence of numbers, if you make sure that you only need to work on a finite subsequence, thanks to its laziness, Haskell will generate only the strictly needed part, and your function will not loop forever. So, when checking for the membership of your number to fib, make sure you return False at some point.
Another possible solution is to try to find out whether your number is in the fibonacci sequence without actually generating it up to the input, but rather by relying on arithmetic only. As a hint for this, have a look at this thread.
On Wikipedia you'll find a number of other ways to check membership to the fibonacci sequence.
edit: by the way, beware of overflows with Int. You may wish to switch to Integer instead.
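For readers who do want to see the arithmetic route spelled out, one well-known test (which may or may not be the one the linked thread describes) is that a non-negative n is a Fibonacci number exactly when 5n^2 + 4 or 5n^2 - 4 is a perfect square. A minimal sketch, with hypothetical helper names and Integer used to avoid the overflow mentioned above:

```haskell
-- Integer square root (floor) by Newton's method; avoids floating-point rounding.
isqrt :: Integer -> Integer
isqrt n
  | n < 2 = n
  | otherwise = go n
  where
    go x =
      let y = (x + n `div` x) `div` 2
      in if y >= x then x else go y

isPerfectSquare :: Integer -> Bool
isPerfectSquare m = m >= 0 && r * r == m
  where
    r = isqrt m

-- n is a Fibonacci number iff 5n^2 + 4 or 5n^2 - 4 is a perfect square.
isFibArith :: Integer -> Bool
isFibArith n =
  n >= 0 && (isPerfectSquare (5 * n * n + 4) || isPerfectSquare (5 * n * n - 4))
```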
Here is a skeleton of a function that tests if a number occurs in an increasing list of numbers:
contains _ [] = False
contains n (x:xs)
  | n == x = True
  | n < x = ???
  | otherwise = ???
Think about what should happen in the cases I left open...
Or, if you are both lazy and allowed to use Prelude functions, you may have a look at dropWhile instead.
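If you get stuck on the hints, here is one possible way the dropWhile idea could be put together. It is only a sketch: it uses Integer rather than the Int in the question's signature, to sidestep overflow, and it relies on the list being increasing so the scan can stop at the first element that is not smaller than n.

```haskell
-- The infinite, lazily generated Fibonacci sequence.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

-- Skip everything smaller than n; n is a member only if the next element equals it.
belongToFib :: Integer -> Bool
belongToFib n = case dropWhile (< n) fibs of
  (x : _) -> x == n
  []      -> False  -- unreachable for an infinite list, kept for totality
```

Only the prefix of fibs up to the first element that is at least n is ever evaluated, which is why the function terminates despite the infinite list.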

SIMPLE random number generation

I'm writing this after a good while of frustrating research, and I'm hoping someone here can enlighten me about the topic.
I want to generate a simple random number in a Haskell function, but alas, this seems impossible to do without all sorts of non-trivial elements, such as monads, assignment in "do", creating generators, etc.
Ideally I was looking for an equivalent of C's "rand()". But after much searching I'm pretty convinced there is no such thing, because of how the language is designed. (If there is, please someone enlighten me). As that doesn't seem feasible, I'd like to find a way to get a random number for my particular problem, and a general explanation on how it works to get a random number.
prefixGenerator :: (Ord a, Arbitrary a) => Gen ([a],[a])
prefixGenerator = frequency [
    (1, return ([],[])),
    (2, do {
      xs1 <- orderedListEj13 ;
      xs2 <- orderedListEj13 ;
      return (xs1,xs2)
    }),
    (2, do {
      xs2 <- orderedListEj13 ;
      return ((take RANDOMNUMBERHERE xs2),xs2)
    })
  ]
I'm trying to get to grips with QuickCheck, but my inability to use random numbers is making it hard. I've tried something like this (putting drawInt 0 (length xs2) in place of RANDOMNUMBERHERE), but I get stuck on the fact that take requires an Int and that method leaves me with an IO Int, which seems impossible to transform into an Int according to this.
As haskell is a pure functional programming language, functions are referentially transparent which means essentially that only a function's arguments determine its result. If you were able to pull a random number out of the air, you can imagine how that would cause problems.
I suppose you need something like this:
prefixGenerator :: (Ord a, Arbitrary a) => Gen ([a],[a])
prefixGenerator = do
  randn <- choose (1,999) -- number in range 1-999
  frequency [
    (1, return ([],[])),
    (2, do {
      xs1 <- orderedListEj13 ;
      xs2 <- orderedListEj13 ;
      return (xs1,xs2)
    }),
    (2, do {
      xs2 <- orderedListEj13 ;
      return ((take randn xs2),xs2)
    })
    ]
In general in haskell you approach random number generation by either pulling some randomness from the IO monad, or by maintaining a PRNG that is initialized with some integer seed hard-coded, or pulled from IO (gspr's comment is excellent).
Reading about how pseudo random number generators work might help you understand System.Random, and this might help as well (scroll down to section on randomness).
You're right in that nondeterministic random (by which I mean "pseudo-random") number generation is impossible without trickery. Functions in Haskell are pure which means that the same input will always produce the same output.
The good news is that you don't seem to need a nondeterministic PRNG. In fact, it would be better if your QuickCheck test used the same sequence of "random" numbers each time, to make your tests fully reproducible.
You can do this with the mkStdGen function from System.Random. Adapted from the Haskell wiki:
import System.Random
import Data.List
randomInts :: Int -> [Int]
randomInts n = take n $ unfoldr (Just . random) (mkStdGen 4)
Here, 4 is the seed that you may want to choose by a fair dice roll.
The standard library provides a monad for random-number generation. The monadic stuff is not that hard to learn, but if you want to avoid it, find a pseudo-random function next that takes an Int to an Int in a pseudorandom way, and then just create and pass an infinite list of random numbers:
next :: Int -> Int
randoms :: [Int]
randoms = iterate next 73
You can then pass this list of random numbers wherever you need it.
Here's a linear congruential next from Wikipedia:
next n = (1103515245 * n + 12345) `mod` 1073741824
And here are the first 20 pseudorandom numbers following 73:
Prelude> take 20 $ iterate next 73
[73,25988430,339353199,182384508,910120965,1051209818,737424011,14815080,325218177,1034483750,267480167,394050068,4555453,647786674,916350979,980771712,491556281,584902142,110461279,160249772]

Resources