Only generate positive integers with QuickCheck - haskell

We have two functions that compare two different power functions, and return true if they return the same value (on the same input).
Then we have two other functions that test these functions against two lists to see if there is any value that doesn't return true.
But instead of using lists that use a range [1..100], we would like to use QuickCheck.
Is it possible to make QuickCheck only return positive integers?
Code:
comparePower1 :: Integer -> Integer -> Bool
comparePower1 n k = power n k == power1 n k
comparePower2 :: Integer -> Integer -> Bool
comparePower2 n k = power n k == power2 n k
testing1 = and [comparePower1 n k | n <- [0..100], k <- [0..100]]
testing2 = and [comparePower2 n k | n <- [0..100], k <- [0..100]]

QuickCheck has support for Positive numbers, but for the sake of this tutorial I will show you how to create your own Generator. One of the main features of QuickCheck is that you can design your own generator to output just what you need. For instance
genPos :: Gen Int
genPos = abs `fmap` (arbitrary :: Gen Int) `suchThat` (> 0)
Then you can create your own list genereator
genListOfPos :: Gen [Int]
genListOfPos = listOf genPos
Finally you can use forAll, pass the generator and profit.
main :: IO ()
main = do
quickCheck $ forAll genPos $ \x -> x > 0
quickCheck $ forAll genListOfPos $ all (> 0)

Related

Parallelize computation of mutable vector in ST

How can computations done in ST be made to run in parallel?
I have a vector which needs to be filled in by random access, hence the use of ST, and the computation runs correctly single-threaded, but have been unable to figure out how to use more than one core.
Random access is needed because of the meaning of the indices into the vector. There are n things and every possible way of choosing among n things has an entry in the vector, as in the choice function. Each of these choices corresponds to a binary number (conceptually, a packed [Bool]) and these Int values are the indices. If there are n things, then the size of the vector is 2^n. The natural way the algorithm runs is for every entry corresponding to "n choose 1" to be filled in, then every entry for "n choose 2," etc. The entries corresponding to "n choose k" depends on the entries corresponding to "n choose (k-1)." The integers for the different choices do not occur in numerical order, and that's why random access is needed.
Here's a pointless (but slow) computation that follows the same pattern. The example function shows how I tried to break the computation up so that the bulk of the work is done in a pure world (no ST monad). In the code below, bogus is where most of the work is done, with the intent of calling that in parallel, but only one core is ever used.
import qualified Data.Vector as Vb
import qualified Data.Vector.Mutable as Vm
import qualified Data.Vector.Generic.Mutable as Vg
import qualified Data.Vector.Generic as Gg
import Control.Monad.ST as ST ( ST, runST )
import Data.Foldable(forM_)
import Data.Char(digitToInt)
main :: IO ()
main = do
putStrLn $ show (example 9)
example :: Int -> Vb.Vector Int
example n = runST $ do
m <- Vg.new (2^n) :: ST s (Vm.STVector s Int)
Vg.unsafeWrite m 0 (1)
forM_ [1..n] $ \i -> do
p <- prev m n (i-1)
let newEntries = (choiceList n i) :: [Int]
forM_ newEntries $ \e -> do
let v = bogus p e
Vg.unsafeWrite m e v
Gg.unsafeFreeze m
choiceList :: Int -> Int -> [Int]
choiceList _ 0 = [0]
choiceList n 1 = [ 2^k | k <- [0..(n-1) ] ]
choiceList n k
| n == k = [2^n - 1]
| otherwise = (choiceList (n-1) k) ++ (map ((2^(n-1)) +) $ choiceList (n-1) (k-1))
prev :: Vm.STVector s Int -> Int -> Int -> ST s Integer
prev m n 0 = return 1
prev m n i = do
let chs = choiceList n i
v <- mapM (\k -> Vg.unsafeRead m k ) chs
let e = map (\k -> toInteger k ) v
return (sum e)
bogus :: Integer -> Int -> Int
bogus prior index = do
let f = fac prior
let g = (f^index) :: Integer
let d = (map digitToInt (show g)) :: [Int]
let a = fromIntegral (head d)^2
a
fac :: Integer -> Integer
fac 0 = 1
fac n = n * fac (n - 1)
If anyone tests this, using more than 9 or 10 in show (example 9) will take much longer than you want to wait for such a pointless sequence of numbers.
Just do it in IO. If you need to use the result in pure code, then unsafePerformIO is available.
The following version runs about 3-4 times faster with +RTS -N16 than +RTS -N1. My changes involved converting the ST vectors to IO, changing the forM_ to forConcurrently_, and adding a bang annotation to let !v = bogus ....
Full code:
import qualified Data.Vector as Vb
import qualified Data.Vector.Mutable as Vm
import qualified Data.Vector.Generic.Mutable as Vg
import qualified Data.Vector.Generic as Gg
import Control.Monad.ST as ST ( ST, runST )
import Data.Foldable(forM_)
import Data.Char(digitToInt)
import Control.Concurrent.Async
import System.IO.Unsafe
main :: IO ()
main = do
let m = unsafePerformIO (example 9)
putStrLn $ show m
example :: Int -> IO (Vb.Vector Int)
example n = do
m <- Vg.new (2^n)
Vg.unsafeWrite m 0 (1)
forM_ [1..n] $ \i -> do
p <- prev m n (i-1)
let newEntries = (choiceList n i) :: [Int]
forConcurrently_ newEntries $ \e -> do
let !v = bogus p e
Vg.unsafeWrite m e v
Gg.unsafeFreeze m
choiceList :: Int -> Int -> [Int]
choiceList _ 0 = [0]
choiceList n 1 = [ 2^k | k <- [0..(n-1) ] ]
choiceList n k
| n == k = [2^n - 1]
| otherwise = (choiceList (n-1) k) ++ (map ((2^(n-1)) +) $ choiceList (n-1) (k-1))
prev :: Vm.IOVector Int -> Int -> Int -> IO Integer
prev m n 0 = return 1
prev m n i = do
let chs = choiceList n i
v <- mapM (\k -> Vg.unsafeRead m k ) chs
let e = map (\k -> toInteger k ) v
return (sum e)
bogus :: Integer -> Int -> Int
bogus prior index = do
let f = fac prior
let g = (f^index) :: Integer
let d = (map digitToInt (show g)) :: [Int]
let a = fromIntegral (head d)^2
a
fac :: Integer -> Integer
fac 0 = 1
fac n = n * fac (n - 1)
I think this can not be done in a safe way. In the general case, it seems it would break Haskell's referential transparency.
If we could perform multi-threaded computations within ST s, then we could spawn two threads that race over the same STRef s Bool. Let's say one thread is writing False and the other one True.
After we use runST on the computation, we get an expression of type Bool which is sometimes False and sometimes True. That should not be possible.
If you are absolutely certain that your parallelization does not break referential transparency, you could try using unsafe primitives like unsafeIOToST to spawn new threads. Use with extreme care.
There might be safer ways to achieve something similar. Outside ST, we do have some parallelism available in Control.Parallel.Strategies.
There are a number of ways to do parallelization in Haskell. Usually they will give comparable performance improvements, however some are better then the others and it mostly depends on problem that needs parallelization. This particular use case looked very interesting to me, so I decided to investigate a few approaches.
Approaches
vector-strategies
We are using a boxed vector, therefore we can utilize laziness and built-in spark pool for parallelization. One very simple approach is provided by vector-strategies package, which can iterate over any immutable boxed vector and evaluate all of the thunks in parallel. It is also possible to split the vector in chunks, but as it turns out the chunk size of 1 is the optimal one:
exampleParVector :: Int -> Vb.Vector Int
exampleParVector n = example n `using` parVector 1
parallel
parVector uses par underneath and requires one extra iteration over the vector. In this case we are already iterating over thee vector, thus it would actually make more sense to use par from parallel directly. This would allow us to perform computation in parallel while continue using ST monad:
import Control.Parallel (par)
...
forM_ [1..n] $ \i -> do
p <- prev m n (i-1)
let newEntries = choiceList n i :: [Int]
forM_ newEntries $ \e -> do
let v = bogus p e
v `par` Vg.unsafeWrite m e v
It is important to note that the computation of each element of the vector is expensive when compared to the total number of elements in the vector. That is why using par is a very good solution here. If it was the opposite, namely the vector was very large, but elements weren't too expensive to compute, it would be better to use an unboxed vector and switch it to a different parallelization method.
async
Another way was described by #K.A.Buhr. Switch to IO from ST and use async:
import Control.Concurrent.Async (forConcurrently_)
...
forM_ [1..n] $ \i -> do
p <- prev m n (i-1)
let newEntries = choiceList n i :: [Int]
forConcurrently_ newEntries $ \e -> do
let !v = bogus p e
Vg.unsafeWrite m e v
The concern that #chi has raised is a valid one, however in this particular implementation it is safe to use unsafePerformIO instead of runST, because parallelization does not violate the invariant of deterministic computation. Namely, we can promise that regardless of the input supplied to example function, the output will always be exactly the same.
scheduler
Green threads are pretty cheap in Haskell, but they aren't free. The solution above with async package has one slight drawback: it will spin up at least as many threads as there are elements in the newEntries list each time forConcurrently_ is called. It would be better to spin up as many threads as there are capabilities (the -N RTS option) and let them do all the work. For this we can use scheduler package, which is a work stealing scheduler:
import Control.Scheduler (Comp(Par), runBatch_, withScheduler_)
...
withScheduler_ Par $ \scheduler ->
forM_ [1..n] $ \i -> runBatch_ scheduler $ \_ -> do
p <- prev m n (i-1)
let newEntries = choiceList n i :: [Int]
forM_ newEntries $ \e -> scheduleWork_ scheduler $ do
let !v = bogus p e
Vg.unsafeWrite m e v
Spark pool in GHC also uses a work stealing scheduler, which is built into RTS and is unrelated to the package above in any shape or form, but the idea is very similar: few threads with many units of computation.
Benchmarks
Here are some benchmarks on a 16-core machine for all of the approaches with example 7 (value 9 takes on the order of seconds, which introduces too much noise for criterion). We only get about x5 speedup, because a significant part of the algorithm is sequential in nature and can't be parallelized.

Haskell: Difference between "(Int a, Bool b) => a -> b" and Int -> Bool [duplicate]

I wrote my first program in Haskell today. It compiles and runs successfully. And since it is not a typical "Hello World" program, it in fact does much more than that, so please congrats me :D
Anyway, I've few doubts regarding my code, and the syntax in Haskell.
Problem:
My program reads an integer N from the standard input and then, for each integer i in the range [1,N], it prints whether i is a prime number or not. Currently it doesn't check for input error. :-)
Solution: (also doubts/questions)
To solve the problem, I wrote this function to test primality of an integer:
is_prime :: Integer -> Bool
is_prime n = helper n 2
where
helper :: Integer -> Integer -> Bool
helper n i
| n < 2 * i = True
| mod n i > 0 = helper n (i+1)
| otherwise = False
It works great. But my doubt is that the first line is a result of many hit-and-trials, as what I read in this tutorial didn't work, and gave this error (I suppose this is an error, though it doesn't say so):
prime.hs:9:13:
Type constructor `Integer' used as a class
In the type signature for `is_prime':
is_prime :: Integer a => a -> Bool
According to the tutorial (which is a nicely-written tutorial, by the way), the first line should be: (the tutorial says (Integral a) => a -> String, so I thought (Integer a) => a -> Bool should work as well.)
is_prime :: (Integer a) => a -> Bool
which doesn't work, and gives the above posted error (?).
And why does it not work? What is the difference between this line (which doesn't work) and the line (which works)?
Also, what is the idiomatic way to loop through 1 to N? I'm not completely satisfied with the loop in my code. Please suggest improvements. Here is my code:
--read_int function
read_int :: IO Integer
read_int = do
line <- getLine
readIO line
--is_prime function
is_prime :: Integer -> Bool
is_prime n = helper n 2
where
helper :: Integer -> Integer -> Bool
helper n i
| n < 2 * i = True
| mod n i > 0 = helper n (i+1)
| otherwise = False
main = do
n <- read_int
dump 1 n
where
dump i x = do
putStrLn ( show (i) ++ " is a prime? " ++ show (is_prime i) )
if i >= x
then putStrLn ("")
else do
dump (i+1) x
You are misreading the tutorial. It would say the type signature should be
is_prime :: (Integral a) => a -> Bool
-- NOT Integer a
These are different types:
Integer -> Bool
This is a function that takes a value of type Integer and gives back a value of type Bool.
Integral a => a -> Bool
This is a function that takes a value of type a and gives back a value of type Bool.
What is a? It can be any type of the caller's choice that implements the Integral type class, such as Integer or Int.
(And the difference between Int and Integer? The latter can represent an integer of any magnitude, the former wraps around eventually, similar to ints in C/Java/etc.)
The idiomatic way to loop depends on what your loop does: it will either be a map, a fold, or a filter.
Your loop in main is a map, and because you're doing i/o in your loop, you need to use mapM_.
let dump i = putStrLn ( show (i) ++ " is a prime? " ++ show (is_prime i) )
in mapM_ dump [1..n]
Meanwhile, your loop in is_prime is a fold (specifically all in this case):
is_prime :: Integer -> Bool
is_prime n = all nondivisor [2 .. n `div` 2]
where
nondivisor :: Integer -> Bool
nondivisor i = mod n i > 0
(And on a minor point of style, it's conventional in Haskell to use names like isPrime instead of names like is_prime.)
Part 1: If you look at the tutorial again, you'll notice that it actually gives type signatures in the following forms:
isPrime :: Integer -> Bool
-- or
isPrime :: Integral a => a -> Bool
isPrime :: (Integral a) => a -> Bool -- equivalent
Here, Integer is the name of a concrete type (has an actual representation) and Integral is the name of a class of types. The Integer type is a member of the Integral class.
The constraint Integral a means that whatever type a happens to be, a has to be a member of the Integral class.
Part 2: There are plenty of ways to write such a function. Your recursive definition looks fine (although you might want to use n < i * i instead of n < 2 * i, since it's faster).
If you're learning Haskell, you'll probably want to try writing it using higher-order functions or list comprehensions. Something like:
module Main (main) where
import Control.Monad (forM_)
isPrime :: Integer -> Bool
isPrime n = all (\i -> (n `rem` i) /= 0) $ takeWhile (\i -> i^2 <= n) [2..]
main :: IO ()
main = do n <- readLn
forM_ [1..n] $ \i ->
putStrLn (show (i) ++ " is a prime? " ++ show (isPrime i))
It is Integral a, not Integer a. See http://www.haskell.org/haskellwiki/Converting_numbers.
map and friends is how you loop in Haskell. This is how I would re-write the loop:
main :: IO ()
main = do
n <- read_int
mapM_ tell_prime [1..n]
where tell_prime i = putStrLn (show i ++ " is a prime? " ++ show (is_prime i))

How to randomly shuffle a list

I have random number generator
rand :: Int -> Int -> IO Int
rand low high = getStdRandom (randomR (low,high))
and a helper function to remove an element from a list
removeItem _ [] = []
removeItem x (y:ys) | x == y = removeItem x ys
| otherwise = y : removeItem x ys
I want to shuffle a given list by randomly picking an item from the list, removing it and adding it to the front of the list. I tried
shuffleList :: [a] -> IO [a]
shuffleList [] = []
shuffleList l = do
y <- rand 0 (length l)
return( y:(shuffleList (removeItem y l) ) )
But can't get it to work. I get
hw05.hs:25:33: error:
* Couldn't match expected type `[Int]' with actual type `IO [Int]'
* In the second argument of `(:)', namely
....
Any idea ?
Thanks!
Since shuffleList :: [a] -> IO [a], we have shuffleList (xs :: [a]) :: IO [a].
Obviously, we can't cons (:) :: a -> [a] -> [a] an a element onto an IO [a] value, but instead we want to cons it onto the list [a], the computation of which that IO [a] value describes:
do
y <- rand 0 (length l)
-- return ( y : (shuffleList (removeItem y l) ) )
shuffled <- shuffleList (removeItem y l)
return y : shuffled
In do notation, values to the right of <- have types M a, M b, etc., for some monad M (here, IO), and values to the left of <- have the corresponding types a, b, etc..
The x :: a in x <- mx gets bound to the pure value of type a produced / computed by the M-type computation which the value mx :: M a denotes, when that computation is actually performed, as a part of the combined computation represented by the whole do block, when that combined computation is performed as a whole.
And if e.g. the next line in that do block is y <- foo x, it means that a pure function foo :: a -> M b is applied to x and the result is calculated which is a value of type M b, denoting an M-type computation which then runs and produces / computes a pure value of type b to which the name y is then bound.
The essence of Monad is thus this slicing of the pure inside / between the (potentially) impure, it is these two timelines going on of the pure calculations and the potentially impure computations, with the pure world safely separated and isolated from the impurities of the real world. Or seen from the other side, the pure code being run by the real impure code interacting with the real world (in case M is IO). Which is what computer programs must do, after all.
Your removeItem is wrong. You should pick and remove items positionally, i.e. by index, not by value; and in any case not remove more than one item after having picked one item from the list.
The y in y <- rand 0 (length l) is indeed an index. Treat it as such. Rename it to i, too, as a simple mnemonic.
Generally, with Haskell it works better to maximize the amount of functional code at the expense of non-functional (IO or randomness-related) code.
In your situation, your “maximum” functional component is not removeItem but rather a version of shuffleList that takes the input list and (as mentioned by Will Ness) a deterministic integer position. List function splitAt :: Int -> [a] -> ([a], [a]) can come handy here. Like this:
funcShuffleList :: Int -> [a] -> [a]
funcShuffleList _ [] = []
funcShuffleList pos ls =
if (pos <=0) || (length(take (pos+1) ls) < (pos+1))
then ls -- pos is zero or out of bounds, so leave list unchanged
else let (left,right) = splitAt pos ls
in (head right) : (left ++ (tail right))
Testing:
λ>
λ> funcShuffleList 4 [0,1,2,3,4,5,6,7,8,9]
[4,0,1,2,3,5,6,7,8,9]
λ>
λ> funcShuffleList 5 "#ABCDEFGH"
"E#ABCDFGH"
λ>
Once you've got this, you can introduce randomness concerns in simpler fashion. And you do not need to involve IO explicitely, as any randomness-friendly monad will do:
shuffleList :: MonadRandom mr => [a] -> mr [a]
shuffleList [] = return []
shuffleList ls =
do
let maxPos = (length ls) - 1
pos <- getRandomR (0, maxPos)
return (funcShuffleList pos ls)
... IO being just one instance of MonadRandom.
You can run the code using the default IO-hosted random number generator:
main = do
let inpList = [0,1,2,3,4,5,6,7,8]::[Integer]
putStrLn $ "inpList = " ++ (show inpList)
-- mr automatically instantiated to IO:
outList1 <- shuffleList inpList
putStrLn $ "outList1 = " ++ (show outList1)
outList2 <- shuffleList outList1
putStrLn $ "outList2 = " ++ (show outList2)
Program output:
$ pickShuffle
inpList = [0,1,2,3,4,5,6,7,8]
outList1 = [6,0,1,2,3,4,5,7,8]
outList2 = [8,6,0,1,2,3,4,5,7]
$
$ pickShuffle
inpList = [0,1,2,3,4,5,6,7,8]
outList1 = [4,0,1,2,3,5,6,7,8]
outList2 = [2,4,0,1,3,5,6,7,8]
$
The output is not reproducible here, because the default generator is seeded by its launch time in nanoseconds.
If what you need is a full random permutation, you could have a look here and there - Knuth a.k.a. Fisher-Yates algorithm.

Why does this SBV code stop before hitting the limit I set?

I have this theorem (not sure if that's the right word), and I want to get all the solutions.
pairCube limit = do
m <- natural exists "m"
n <- natural exists "n"
a <- natural exists "a"
constrain $ m^3 .== n^2
constrain $ m .< limit
return $ m + n .== a^2
res <- allSat (pairCube 1000)
-- Run from ghci
extractModels res :: [[Integer]]
This is trying to solve the problem:
There are infinite pairs of integers (m, n) such that m^3 = n^2 and m + n is a perfect square. What is the pair with the greatest m less than 1000?
I know the actual answer, just through brute forcing, but I want to do with SBV.
However, when I run the code it gives only the following values (in the form [m, n, a]):
[[9,27,6],[64,512,24],[]]
However, there are several other solutions with an m value less than 1000 that aren't included.
It's always good to give a full program:
{-# LANGUAGE ScopedTypeVariables #-}
import Data.SBV
pairCube :: SInteger -> Symbolic SBool
pairCube limit = do
(m :: SInteger) <- exists "m"
(n :: SInteger) <- exists "n"
(a :: SInteger) <- exists "a"
constrain $ m^(3::Integer) .== n^(2::Integer)
constrain $ m .< limit
return $ m + n .== a^(2::Integer)
main :: IO ()
main = print =<< allSat (pairCube 1000)
When I run it, I get:
Main> main
Solution #1:
m = 0 :: Integer
n = 0 :: Integer
a = 0 :: Integer
Solution #2:
m = 9 :: Integer
n = 27 :: Integer
a = -6 :: Integer
Solution #3:
m = 1 :: Integer
n = -1 :: Integer
a = 0 :: Integer
Solution #4:
m = 9 :: Integer
n = 27 :: Integer
a = 6 :: Integer
Solution #5:
m = 64 :: Integer
n = 512 :: Integer
a = -24 :: Integer
Solution #6:
m = 64 :: Integer
n = 512 :: Integer
a = 24 :: Integer
Unknown
Note the final Unknown.
Essentially, SBV queried Z3, and got 6 solutions; when SBV asked for the 7th, Z3 said "I don't know if there's any other solution." With non-linear arithmetic, this behavior is expected.
To answer the original question (i.e., find the max m), I changed the constraint to read:
constrain $ m .== limit
and coupled it with the following "driver:"
main :: IO ()
main = loop 1000
where loop (-1) = putStrLn "Can't find the largest m!"
loop m = do putStrLn $ "Trying: " ++ show m
mbModel <- extractModel `fmap` sat (pairCube m)
case mbModel of
Nothing -> loop (m-1)
Just r -> print (r :: (Integer, Integer, Integer))
After running about 50 minutes on my machine, Z3 produced:
(576,13824,-120)
So, clearly the allSat based approach is causing Z3 to give-up way earlier than what it can actually achieve if we fix m and iterate ourself. With a non-linear problem, expecting anything faster/better would be too much to ask of a general purpose SMT solver..

Haskell State monadic function using recursion

TL:DR: Is there a way to do example 3 without passing an argument
I'm trying to understand the state monad in haskell (Control.Monad.State). I made an extremely simple function:
Example 1
example :: State Int Int
example = do
e <- get
put (e*5)
return e
This example works in ghci...
runState example 3
(3,15)
I modified it to be able to take arguments....
Example 2
example :: Int -> State Int Int
example n = do
e <- get
put (e*n)
return e
also works in ghci...
runState (example 5) 3
(3,15)
I made it recursive, counting the number of steps it takes for a computation to satisfy some condition
Example 3
example :: Int -> State Int Int
example n = do
e <- get
if (n /= 1)
then do
put (succ e)
example (next n)
else return (succ e)
next :: Int -> Int
next n
| even n = div n 2
| otherwise = 3*n+1
ghci
evalState (example 13) 0
10
My question is, is there a way to do the previous example without explicitly passing a value?
You can store n in the state along side of e, for example, something like:
example = do
(e,n) <- get
if n /= 1
then do put (succ e, next n); example
else return e
There is some overhead to using the State monad, so you should compare this with the alternatives.
For instance, a more Haskelly way of approaching this problem is compose list operations to compute the answer, e.g.:
collatz :: Int -> [Int]
collatz n = iterate next n
collatzLength n = length $ takeWhile (/= 1) $ collatz n

Resources