Arbitrary String generator in Haskell (Test.QuickCheck.Gen) - haskell

I am struggling on Real World Haskell Chapter 11 quickCheck generator implementation for a an algebraic data type.
Following the book implementation (which was published in 2008), I came up with the following:
-- file: ch11/Prettify2.hs
module Prettify2(
Doc(..)
) where
data Doc = Empty
| Char Char
| Text String
| Line
| Concat Doc Doc
| Union Doc Doc
deriving (Show, Eq)
And my Arbitrary implementation:
-- file: ch11/Arbitrary.hs
import System.Random
import Test.QuickCheck.Gen
import qualified Test.QuickCheck.Arbitrary
class Arbitrary a where
arbitrary :: Gen a
-- elements' :: [a] => Gen a {- Expected a constraint, but ‘[a]’ has kind ‘*’ -}
-- choose' :: Random a => (a, a) -> Gen a
-- oneof' :: [Gen a] -> a
data Ternary = Yes
| No
| Unknown
deriving(Eq, Show)
instance Arbitrary Ternary where
arbitrary = do
n <- choose (0, 2) :: Gen Int
return $ case n of
0 -> Yes
1 -> No
_ -> Unknown
instance (Arbitrary a, Arbitrary b) => Arbitrary (a, b) where
arbitrary = do
x <- arbitrary
y <- arbitrary
return (x, y)
instance Arbitrary Char where
arbitrary = elements (['A'..'Z'] ++ ['a' .. 'z'] ++ " ~!##$%^&*()")
I tried the two following implementation with no success:
import Prettify2
import Control.Monad( liftM, liftM2 )
instance Arbitrary Doc where
arbitrary = do
n <- choose (1,6) :: Gen Int
case n of
1 -> return Empty
2 -> do x <- arbitrary
return (Char x)
3 -> do x <- arbitrary
return (Text x)
4 -> return Line
5 -> do x <- arbitrary
y <- arbitrary
return (Concat x y)
6 -> do x <- arbitrary
y <- arbitrary
return (Union x y)
instance Arbitrary Doc where
arbitrary =
oneof [ return Empty
, liftM Char arbitrary
, liftM Text arbitrary
, return Line
, liftM2 Concat arbitrary arbitrary
, liftM2 Union arbitrary arbitrary ]
But it doesn't compile since No instance for (Arbitrary String)
I tried then to implement the instance for Arbitrary String in the following ways:
import qualified Test.QuickCheck.Arbitrary but it does not implement Arbitrary String neither
installing Test.RandomStrings hackage link
instance Arbitrary String where
arbitrary = do
n <- choose (8, 16) :: Gen Int
return $ randomWord randomASCII n :: Gen String
With the following backtrace:
$ ghci
GHCi, version 7.10.3: http://www.haskell.org/ghc/ :? for help
Prelude> :l Arbitrary.hs
[1 of 2] Compiling Prettify2 ( Prettify2.hs, interpreted )
[2 of 2] Compiling Main ( Arbitrary.hs, interpreted )
Arbitrary.hs:76:9:
The last statement in a 'do' block must be an expression
return <- randomWord randomASCII n :: Gen String
Failed, modules loaded: Prettify2
Would you have any good suggestion about how to implement this particular generator and - more in general - how to proceed in these cases?
Thank you in advance

Don't define a new Arbitrary type class, import Test.QuickCheck instead. It defines most of these instances for you. Also be careful about the version of quickcheck, RWH assumes version 1.
The resulting full implementation will be:
-- file: ch11/Arbitrary.hs
import Test.QuickCheck
import Prettify2
import Control.Monad( liftM, liftM2 )
data Ternary = Yes
| No
| Unknown
deriving(Eq, Show)
instance Arbitrary Ternary where
arbitrary = do
n <- choose (0, 2) :: Gen Int
return $ case n of
0 -> Yes
1 -> No
_ -> Unknown
instance Arbitrary Doc where
arbitrary =
oneof [ return Empty
, liftM Char arbitrary
, liftM Text arbitrary
, return Line
, liftM2 Concat arbitrary arbitrary
, liftM2 Union arbitrary arbitrary ]

Related

How can I create a random 2d array in Haskell?

I am fairly new to Haskell and I'm always confused when it comes to dealing with random values. This time I'm trying to create a 5x5 2d-array in which every cell contains a random value between 4 options.
data Content = Bomb | One | Two | Three deriving (Eq, Show)
data Board = Array (Int, Int) Cell
data Cell = Cell {
content :: Content,
state :: CellState,
notes :: Notes
}
type Notes = [String]
type CellState = Bool
then in the main function I typed
main :: IO()
main = do
let cellMatrix = createMatrix
print cellMatrix
How can I create a function that creates a Board, and where do I need to do all the g<-newStdGen stuff?
Any advice would be greatly appreciated!
Focusing solely on the randomness aspect of the problem here, we see that the System.Random module in its version 1.2 provides an uniformR function that can produce one pseudo-random integer value from a specified range. Typically, we would need 5*5=25 values from the (0,3) range.
Side Note: Beware the changes between Random v1.1 and v1.2 are massive ones.
Alternatively, the same module provides a randomRs function, but that one provides an infinite list of integer values, hence cannot return an updated value of the generator. That might not be what you require.
We can use as our workhorse a function that returns a finite count of random integer values, together with an updated generator:
import System.Random
getManyInts :: RandomGen g => g -> Int -> (Int,Int) -> ([Int],g)
getManyInts g0 count range =
if (count <= 0) then ([], g0)
else let
(v,g1) = uniformR range g0
(vs,gf) = getManyInts g1 (count-1) range
in
(v:vs, gf)
In order to navigate from integer to Enum values, we will use a slightly improved Content type:
data Content = Bomb | One | Two | Three
deriving (Eq, Show, Enum, Bounded)
At that point, we can write a small test program that provides 5*5=25 random values of Content type:
main :: IO ()
main = do
g0 <- newStdGen
let count = 5*5
(loC, hiC) = (minBound :: Content, maxBound :: Content)
intRange = (fromEnum loC, fromEnum hiC)
(xs, g1) = getManyInts g0 count intRange
cs = (map toEnum xs) :: [Content]
putStrLn $ "contents: " ++ (show cs)
Test program output:
contents: [Two,Three,One,One,One,Two,One,Two,One,Bomb,Two,Two,One,Three,Bomb,Two,Two,Bomb,Three,One,One,Bomb,One,One,Bomb]
Addendum:
A better, polymorphic function:
Next, we can provide a function that returns a list of random values for any similar type:
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE ExplicitForAll #-}
getManyEnums :: forall g e. (RandomGen g, Bounded e, Enum e) => g -> Int -> ([e],g)
getManyEnums g0 count = (map toEnum xs, g1)
where
intRange = (fromEnum (minBound :: e), fromEnum (maxBound :: e))
(xs, g1) = getManyInts g0 count intRange
This function can be specialized as required. For example:
Testing under ghci:
$ ghci
GHCi, version 8.10.5: https://www.haskell.org/ghc/ :? for help
...
λ>
λ> :load q73268595.hs
[1 of 1] Compiling Main ( q73268595.hs, interpreted )
Ok, one module loaded.
λ>
λ> getManyContents = getManyEnums :: StdGen -> Int -> ([Content],StdGen)
λ>
λ> g0 <- newStdGen
λ>
λ> (cs,gf) = getManyContents g0 10
λ>
λ> cs
[One,One,Two,One,One,Three,Three,Three,Bomb,One]
λ>
λ> :q
Leaving GHCi.
$

How to randomly shuffle a list

I have random number generator
rand :: Int -> Int -> IO Int
rand low high = getStdRandom (randomR (low,high))
and a helper function to remove an element from a list
removeItem _ [] = []
removeItem x (y:ys) | x == y = removeItem x ys
| otherwise = y : removeItem x ys
I want to shuffle a given list by randomly picking an item from the list, removing it and adding it to the front of the list. I tried
shuffleList :: [a] -> IO [a]
shuffleList [] = []
shuffleList l = do
y <- rand 0 (length l)
return( y:(shuffleList (removeItem y l) ) )
But can't get it to work. I get
hw05.hs:25:33: error:
* Couldn't match expected type `[Int]' with actual type `IO [Int]'
* In the second argument of `(:)', namely
....
Any idea ?
Thanks!
Since shuffleList :: [a] -> IO [a], we have shuffleList (xs :: [a]) :: IO [a].
Obviously, we can't cons (:) :: a -> [a] -> [a] an a element onto an IO [a] value, but instead we want to cons it onto the list [a], the computation of which that IO [a] value describes:
do
y <- rand 0 (length l)
-- return ( y : (shuffleList (removeItem y l) ) )
shuffled <- shuffleList (removeItem y l)
return y : shuffled
In do notation, values to the right of <- have types M a, M b, etc., for some monad M (here, IO), and values to the left of <- have the corresponding types a, b, etc..
The x :: a in x <- mx gets bound to the pure value of type a produced / computed by the M-type computation which the value mx :: M a denotes, when that computation is actually performed, as a part of the combined computation represented by the whole do block, when that combined computation is performed as a whole.
And if e.g. the next line in that do block is y <- foo x, it means that a pure function foo :: a -> M b is applied to x and the result is calculated which is a value of type M b, denoting an M-type computation which then runs and produces / computes a pure value of type b to which the name y is then bound.
The essence of Monad is thus this slicing of the pure inside / between the (potentially) impure, it is these two timelines going on of the pure calculations and the potentially impure computations, with the pure world safely separated and isolated from the impurities of the real world. Or seen from the other side, the pure code being run by the real impure code interacting with the real world (in case M is IO). Which is what computer programs must do, after all.
Your removeItem is wrong. You should pick and remove items positionally, i.e. by index, not by value; and in any case not remove more than one item after having picked one item from the list.
The y in y <- rand 0 (length l) is indeed an index. Treat it as such. Rename it to i, too, as a simple mnemonic.
Generally, with Haskell it works better to maximize the amount of functional code at the expense of non-functional (IO or randomness-related) code.
In your situation, your “maximum” functional component is not removeItem but rather a version of shuffleList that takes the input list and (as mentioned by Will Ness) a deterministic integer position. List function splitAt :: Int -> [a] -> ([a], [a]) can come handy here. Like this:
funcShuffleList :: Int -> [a] -> [a]
funcShuffleList _ [] = []
funcShuffleList pos ls =
if (pos <=0) || (length(take (pos+1) ls) < (pos+1))
then ls -- pos is zero or out of bounds, so leave list unchanged
else let (left,right) = splitAt pos ls
in (head right) : (left ++ (tail right))
Testing:
λ>
λ> funcShuffleList 4 [0,1,2,3,4,5,6,7,8,9]
[4,0,1,2,3,5,6,7,8,9]
λ>
λ> funcShuffleList 5 "#ABCDEFGH"
"E#ABCDFGH"
λ>
Once you've got this, you can introduce randomness concerns in simpler fashion. And you do not need to involve IO explicitely, as any randomness-friendly monad will do:
shuffleList :: MonadRandom mr => [a] -> mr [a]
shuffleList [] = return []
shuffleList ls =
do
let maxPos = (length ls) - 1
pos <- getRandomR (0, maxPos)
return (funcShuffleList pos ls)
... IO being just one instance of MonadRandom.
You can run the code using the default IO-hosted random number generator:
main = do
let inpList = [0,1,2,3,4,5,6,7,8]::[Integer]
putStrLn $ "inpList = " ++ (show inpList)
-- mr automatically instantiated to IO:
outList1 <- shuffleList inpList
putStrLn $ "outList1 = " ++ (show outList1)
outList2 <- shuffleList outList1
putStrLn $ "outList2 = " ++ (show outList2)
Program output:
$ pickShuffle
inpList = [0,1,2,3,4,5,6,7,8]
outList1 = [6,0,1,2,3,4,5,7,8]
outList2 = [8,6,0,1,2,3,4,5,7]
$
$ pickShuffle
inpList = [0,1,2,3,4,5,6,7,8]
outList1 = [4,0,1,2,3,5,6,7,8]
outList2 = [2,4,0,1,3,5,6,7,8]
$
The output is not reproducible here, because the default generator is seeded by its launch time in nanoseconds.
If what you need is a full random permutation, you could have a look here and there - Knuth a.k.a. Fisher-Yates algorithm.

Haskell: RandomGen drops half of values

I am writing a simple deterministic Random Number generator, based on the xorshift. The goal here is not to get a cryptographically secure or statistically perfect (pseudo-)random number generator, but to be able to archieve the same deterministic sequence of semi-random numbers across programming languages.
My Haskell program looks like follows:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module SimpleRNG where
import Data.Word (Word32)
import Data.Bits (xor, shift)
import System.Random (RandomGen(..))
import Control.Arrow
(|>) :: a -> (a -> b) -> b
(|>) x f = f x
infixl 0 |>
newtype SeedState = SeedState Word32
deriving (Eq, Show, Enum, Bounded)
seed :: Integral a => a -> SeedState
seed = SeedState . fromIntegral
rand_r :: SeedState -> (Word32, SeedState)
rand_r (SeedState num) = (res, SeedState res)
where
res = num
|> xorshift 13
|> xorshift (-17)
|> xorshift 5
xorshift :: Int -> Word32 -> Word32
xorshift amount x = x `xor` (shift x amount)
instance RandomGen SeedState where
next seed_state = (first fromIntegral) $ rand_r seed_state
where
genRange seed_state = (fromEnum (minBound `asTypeOf` seed_state),
fromEnum (maxBound `asTypeOf` seed_state))
split seed_state#(SeedState num) = (seed_state', inverted_seed_state')
where
(_, seed_state') = next seed_state
(_, inverted_seed_state') = next inverted_seed_state
inverted_seed_state = SeedState (maxBound - num)
Now, for some reason, when running
take 10 $ System.Random.randoms (seed 42) :: [Word32]
it returns only the 'odd' results, compared to the output of the following Python program:
class SeedState(object):
def __init__(self, seed = 42):
self.data = seed
def rand_r(rng_state):
num = rng_state.data
num ^= (num << 13) % (2 ** 32)
num ^= (num >> 17) % (2 ** 32)
num ^= (num << 5) % (2 ** 32)
rng_state.data = num
return num
__global_rng_state = SeedState(42)
def rand():
global __global_rng_state
return rand_r(__global_rng_state)
def seed(seed):
global __global_rng_state
__global_rng_state = SeedState(seed)
if __name__ == '__main__':
for x in range(0, 10):
print(rand())
It seems like the internals of the System.Random module do some weird trickery with the return result of the generator
(calling
map fst $ take 10 $ iterate (\(_, rng) -> rand_r rng) (rand_r $ seed 42)
gives the result I'd expect).
This is odd, since the type returned by the generator is already a Word32, so it could/should just be passed on unaltered without any remapping happening.
What is going on here, and is there a way to plug this xorshift-generator into System.Random in a way that returns the same results?
This is to do with the behaviour of System.Random.randoms, which repeatedly applies random to a RandomGen, not next.
class Random a where
...
random :: (RandomGen g) => g -> (a, g)
The Random class is what allows you to reuse RandomGen instances across different enums, and the instance for Word32 (as well as nearly all other types) is defined as
instance Random Word32 where randomR = randomIvalIntegral; random = randomBounded
randomBounded just calls randomR, so the behaviour of random is decided by `
randomIvalIntegral (l,h) = randomIvalInteger (toInteger l, toInteger h)
randomIvalInteger is an interesting function, you can read the source here. It's actually causing your problem because the function will discard a certain number of intermediate values based on the range of the generator and the range being generated over.
To get the values you want, you just need to use next instead - the easiest way would just be to define
randoms' g = x : (randoms' g') where (x, g') = next g

Visiting GHC AST with SYB

I wrote a program that visited the AST with Haskell-src-exts. I'm trying to convert it to use the GHC API. The former uses Uniplate, while for the latter it seems that unfortunately I'm forced with SYB (the documentation is horribly scarce).
Here is the original code:
module Argon.Visitor (funcsCC)
where
import Data.Data (Data)
import Data.Generics.Uniplate.Data (childrenBi, universeBi)
import Language.Haskell.Exts.Syntax
import Argon.Types (ComplexityBlock(..))
-- | Compute cyclomatic complexity of every function binding in the given AST.
funcsCC :: Data from => from -> [ComplexityBlock]
funcsCC ast = map funCC [matches | FunBind matches <- universeBi ast]
funCC :: [Match] -> ComplexityBlock
funCC [] = CC (0, 0, "<unknown>", 0)
funCC ms#(Match (SrcLoc _ l c) n _ _ _ _:_) = CC (l, c, name n, complexity ms)
where name (Ident s) = s
name (Symbol s) = s
sumWith :: (a -> Int) -> [a] -> Int
sumWith f = sum . map f
complexity :: Data from => from -> Int
complexity node = 1 + visitMatches node + visitExps node
visitMatches :: Data from => from -> Int
visitMatches = sumWith descend . childrenBi
where descend :: [Match] -> Int
descend x = length x - 1 + sumWith visitMatches x
visitExps :: Data from => from -> Int
visitExps = sumWith inspect . universeBi
where inspect e = visitExp e + visitOp e
visitExp :: Exp -> Int
visitExp (If {}) = 1
visitExp (MultiIf alts) = length alts - 1
visitExp (Case _ alts) = length alts - 1
visitExp (LCase alts) = length alts - 1
visitExp _ = 0
visitOp :: Exp -> Int
visitOp (InfixApp _ (QVarOp (UnQual (Symbol op))) _) =
case op of
"||" -> 1
"&&" -> 1
_ -> 0
visitOp _ = 0
I need to visit function bindings, matches and expressions. This is what I managed to write (not working):
import Data.Generics
import qualified GHC
import Outputable -- from the GHC package
funcs :: (Data id, Typeable id, Outputable id, Data from, Typeable from) => from -> [GHC.HsBindLR id id]
funcs ast = everything (++) (mkQ [] (\fun#(GHC.FunBind {}) -> [fun])) ast
It complains that there are too many instances for id, but I don't know what the heck it is. The relevant GHC module is:
http://haddock.stackage.org/lts-3.10/ghc-7.10.2/HsBinds.html
I'm getting insane from this. The goal is to count complexity (as you can see in the original code). I'd like to switch to the GHC API because it uses the same parser as the compiler, so it can parse every module without worrying about extensions.
EDIT: Here is why the current code does not work:
λ> :m +Language.Haskell.GHC.ExactPrint.Parsers GHC Data.Generics Outputable
λ> r <- Language.Haskell.GHC.ExactPrint.parseModule src/Argon/Visitor.hs
λ> let ast = snd $ (\(Right t) -> t) r
.>
λ> :t ast
ast :: Located (HsModule RdrName)
λ> let funcs = everything (++) (mkQ [] (un#(FunBind _ _ _ _ _ _) -> [fun])) ast :: (Data id, Typeable id, Outputable id) => [HsBindLR id id]
.>
λ> length funcs
<interactive>:12:8:
No instance for (Data id0) arising from a use of ‘funcs’
The type variable ‘id0’ is ambiguous
Note: there are several potential instances:
instance Data aeson-0.8.0.2:Data.Aeson.Types.Internal.Value
-- Defined in ‘aeson-0.8.0.2:Data.Aeson.Types.Internal’
instance Data attoparsec-0.12.1.6:Data.Attoparsec.Number.Number
-- Defined in ‘attoparsec-0.12.1.6:Data.Attoparsec.Number’
instance Data a => Data (Data.Complex.Complex a)
-- Defined in ‘Data.Complex’
../..plus 367 others
In the first argument of ‘length’, namely ‘funcs’
In the expression: length funcs
In an equation for ‘it’: it = length funcs
The GHC AST is parametrised on the type of names used in the tree: the parser outputs an AST with RdrName names which it seems you're working with. The GHC Commentary and the Haddocks have more information.
You might have more luck if you tell the compiler that you are working with HsBindLR RdrName RdrName.
Like this:
import Data.Generics
import GHC
import Outputable -- from the GHC package
funcs :: (Data from, Typeable from) => from -> [GHC.HsBindLR RdrName RdrName]
funcs ast = everything (++) (mkQ [] (\fun#(GHC.FunBind {}) -> [fun])) ast

Using a custom generator vs Arbitrary instance in QuickCheck

Here's a simple function. It takes an input Int and returns a (possibly empty) list of (Int, Int) pairs, where the input Int is the sum of the cubed elements of any of the pairs.
cubeDecomposition :: Int -> [(Int, Int)]
cubeDecomposition n = [(x, y) | x <- [1..m], y <- [x..m], x^3 + y^3 == n]
where m = truncate $ fromIntegral n ** (1/3)
-- cubeDecomposition 1729
-- [(1,12),(9,10)]
I want to test the property that the above is true; if I cube each element and sum any of the return tuples, then I get my input back:
import Control.Arrow
cubedElementsSumToN :: Int -> Bool
cubedElementsSumToN n = all (== n) d
where d = map (uncurry (+) . ((^3) *** (^3))) (cubeDecomposition n)
For runtime considerations, I'd like to limit the input Ints to a certain size when testing this with QuickCheck. I can define an appropriate type and Arbitrary instance:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Test.QuickCheck
newtype SmallInt = SmallInt Int
deriving (Show, Eq, Enum, Ord, Num, Real, Integral)
instance Arbitrary SmallInt where
arbitrary = fmap SmallInt (choose (-10000000, 10000000))
And then I guess I have to define versions of the function and property that use SmallInt rather than Int:
cubeDecompositionQC :: SmallInt -> [(SmallInt, SmallInt)]
cubeDecompositionQC n = [(x, y) | x <- [1..m], y <- [x..m], x^3 + y^3 == n]
where m = truncate $ fromIntegral n ** (1/3)
cubedElementsSumToN' :: SmallInt -> Bool
cubedElementsSumToN' n = all (== n) d
where d = map (uncurry (+) . ((^3) *** (^3))) (cubeDecompositionQC n)
-- cubeDecompositionQC 1729
-- [(SmallInt 1,SmallInt 12),(SmallInt 9,SmallInt 10)]
This works fine, and the standard 100 tests pass as expected. But it seems unnecessary to define a new type, instance, and function when all I really need is a custom generator. So I tried this:
smallInts :: Gen Int
smallInts = choose (-10000000, 10000000)
cubedElementsSumToN'' :: Int -> Property
cubedElementsSumToN'' n = forAll smallInts $ \m -> all (== n) (d m)
where d = map (uncurry (+) . ((^3) *** (^3)))
. cubeDecomposition
Now, the first few times I ran this, everything worked fine, and all tests pass. But on subsequent runs I observed failures. Bumping up the test size reliably finds one:
*** Failed! Falsifiable (after 674 tests and 1 shrink):
0
8205379
I'm a bit confused here due to the presence of two shrunken inputs - 0 and 8205379 - returned from QuickCheck, where I would intuitively expect one. Also, those inputs work as predicted (on my show-able property, at least):
*Main> cubedElementsSumToN 0
True
*Main> cubedElementsSumToN 8205379
True
So it seems like obviously there's a problem in the property that uses the custom Gen I defined.
What have I done wrong?
I quickly realized that the property as I've written it is obviously incorrect. Here's the proper way to do it, using the original cubedElementsSumToN property:
quickCheck (forAll smallInts cubedElementsSumToN)
which reads quite naturally.

Resources