QuickCheck with Dynamic Element Sets - haskell

Is there a way to programmatically control the set of values passed to an elements call within an arbitrary definition? I want to generate an arbitrary variable reference as part of a random expression, but the set of variable identifiers to choose from should be configurable.
As an example, imagine the following data type:
data Expr = Num Int
          | Var String
          | BinOp Op Expr Expr
data Op = Add | Sub | Mul | Div deriving (Eq, Ord, Enum)
And then I want to define an arbitrary instance for this type that would look something like this:
instance Arbitrary Op where
  arbitrary = elements [Add .. ]
instance Arbitrary Expr where
  arbitrary = oneof [ fmap Num arbitrary
                    , arbitraryBinOp
                    , fmap Var (elements varNames)
                    ]
arbitraryBinOp = do (op, e0, e1) <- arbitrary
                    return (BinOp op e0 e1)
Now the tricky thing is the "varNames" part. Conceptually I would like to be able to do something like this:
do args <- getArgs
   tests <- generate $ vectorOf 10 ((arbitrary args)::Gen Expr)
But obviously I can't propagate that args-vector down through the arbitrary calls as "arbitrary" does not take such an argument...

Arbitrary is really only a convenience for when the generator does not require any context. If you need to parameterize your generators, you can define them as regular functions; QuickCheck has combinators, such as forAll, that take an explicit generator instead of relying on an Arbitrary instance.
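import System.Environment (getArgs)
import Test.QuickCheck
genExpr :: [String] -> Gen Expr
genExpr varNames =
  oneof [ fmap Num arbitrary
        , genBinOp
        , fmap Var (elements varNames)
        ]
  where
    -- Recurse through genExpr (not arbitrary) so that subexpressions
    -- also draw their variables from varNames.
    genBinOp = BinOp <$> arbitrary <*> genExpr varNames <*> genExpr varNames
main :: IO ()
main = do
  args <- getArgs
  tests <- generate $ vectorOf 10 (genExpr args)
  {- do stuff -}
  return ()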
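The generator above can recurse without bound, and elements fails on an empty list. The following is a minimal sketch of one way to address both, assuming the Expr and Op types from the question (with Show and Bounded instances added here for illustration). The names varsOf and prop_onlyKnownVars are hypothetical, and halving the size at each BinOp is just one common convention.

```haskell
import Test.QuickCheck

data Op = Add | Sub | Mul | Div deriving (Show, Eq, Ord, Enum, Bounded)

data Expr = Num Int
          | Var String
          | BinOp Op Expr Expr
          deriving (Show, Eq)

-- | Size-bounded generator: the size is halved at each BinOp, so the
-- recursion always terminates.
-- Assumption: varNames is non-empty, because 'elements' fails on [].
genExpr :: [String] -> Gen Expr
genExpr varNames = sized go
  where
    leaf = oneof [Num <$> arbitrary, Var <$> elements varNames]
    go n
      | n <= 1    = leaf
      | otherwise = oneof [leaf, BinOp <$> elements [minBound .. maxBound] <*> sub <*> sub]
      where sub = go (n `div` 2)

-- | All variable names occurring in an expression (hypothetical helper).
varsOf :: Expr -> [String]
varsOf (Num _)       = []
varsOf (Var v)       = [v]
varsOf (BinOp _ l r) = varsOf l ++ varsOf r

-- | Hypothetical property: generated expressions only mention the
-- configured names. 'forAll' plugs the explicit generator into a property.
prop_onlyKnownVars :: [String] -> Property
prop_onlyKnownVars names =
  forAll (genExpr names) $ \e -> all (`elem` names) (varsOf e)
```

With this in scope, quickCheck (prop_onlyKnownVars ["x", "y"]) runs the property against expressions built from the two given names.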

Related

How can I declare the types for this problem?

I'm trying to create a simple programming language with some primitives and user defined functions.
These are the types I created:
data Type = IntT | BoolT
data Value = IntV Int | BoolV Bool | OperatorCall String [Value]
data Expr = LetE String Value | ProcedureCall String [Value]
As you can see, I've divided functions into operators (which return a value) and procedures (which don't return anything and act as expressions instead of values). A function call contains the string id of the function being called and the list of arguments being passed. Also, a program is just a list of expressions (I've omitted user-defined functions here for the sake of simplicity).
My problem comes from the fact that I need to write a function that parses a function call from a string:
parseFunctionCall :: String -> ???
...
The return type of that function can be a Value (for operator calls) or an Expr (for procedure calls). This function is rather complicated and I'd prefer to avoid writing it twice, or polluting it with an Either return type. What should I do? How can I change my types so that this can be achieved cleanly? Something like this perhaps, but I don't think this is the way:
type FunctionCall = (String, [Value])
data Value = ... | OperatorCall FunctionCall
data Expr = ... | ProcedureCall FunctionCall
parseAsFunctionCall :: String -> FunctionCall
...
You can have the function call parser return (String, [Value]), and let the caller fix that up into whatever data structure they like best -- in your case, by applying \(s, vs) -> OperatorCall s vs if parsing a value or \(s, vs) -> ProcedureCall s vs if parsing an expression.
parseFunctionCall :: Parser (String, [Value])
parseLiteralInt :: Parser Int
parseLiteralBool :: Parser Bool
parseLet :: Parser (String, Value)
(parseFunctionCall, parseLiteralInt, parseLiteralBool, parseLet) = {- ... -}
parseValue :: Parser Value
parseValue =
  ((\(s, vs) -> OperatorCall s vs) <$> parseFunctionCall)
  <|>
  (IntV <$> parseLiteralInt)
  <|>
  (BoolV <$> parseLiteralBool)
parseExpr :: Parser Expr
parseExpr =
  ((\(s, vs) -> ProcedureCall s vs) <$> parseFunctionCall)
  <|>
  ((\(s, v) -> LetE s v) <$> parseLet)
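To make the idea concrete, here is a small runnable sketch using ReadP from base as a stand-in for the unspecified Parser type. The concrete syntax (name(arg, ...), let x = v, true/false) and the helpers token, identifier and runP are assumptions made for illustration only; the point is that parseFunctionCall is written once and reused for both Value and Expr via uncurry.

```haskell
import Data.Char (isAlpha, isDigit)
import Text.ParserCombinators.ReadP

data Value = IntV Int | BoolV Bool | OperatorCall String [Value]
  deriving (Show, Eq)
data Expr = LetE String Value | ProcedureCall String [Value]
  deriving (Show, Eq)

-- | Run a parser, then skip trailing whitespace (hypothetical helper).
token :: ReadP a -> ReadP a
token p = p <* skipSpaces

identifier :: ReadP String
identifier = token (munch1 isAlpha)

-- | The single shared call parser: name(arg, arg, ...).
parseFunctionCall :: ReadP (String, [Value])
parseFunctionCall = do
  name <- identifier
  args <- between (token (char '(')) (token (char ')'))
                  (parseValue `sepBy` token (char ','))
  return (name, args)

parseLet :: ReadP (String, Value)
parseLet = do
  _ <- token (string "let")
  name <- identifier
  _ <- token (char '=')
  value <- parseValue
  return (name, value)

-- | Each caller wraps the shared result in its own constructor.
parseValue :: ReadP Value
parseValue =
      (uncurry OperatorCall <$> parseFunctionCall)
  <++ (BoolV True  <$ token (string "true"))
  <++ (BoolV False <$ token (string "false"))
  <++ (IntV . read <$> token (munch1 isDigit))

parseExpr :: ReadP Expr
parseExpr =
      (uncurry LetE <$> parseLet)
  <++ (uncurry ProcedureCall <$> parseFunctionCall)

-- | Accept only parses that consume the whole input.
runP :: ReadP a -> String -> Maybe a
runP p s = case [x | (x, "") <- readP_to_S (skipSpaces *> p <* eof) s] of
  (x : _) -> Just x
  _       -> Nothing
```

For example, runP parseExpr "print(true)" yields a ProcedureCall, while runP parseValue "add(1, mul(2, 3))" yields nested OperatorCall values from the same call parser.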

Deriving Eq and Show for an ADT that contains fields that can't have Eq or Show

I'd like to be able to derive Eq and Show for an ADT that contains multiple fields. One of them is a function field. When doing Show, I'd like it to display something bogus, like e.g. "<function>"; when doing Eq, I'd like it to ignore that field. How can I best do this without hand-writing a full instance for Show and Eq?
I don't want to wrap the function field inside a newtype and write my own Eq and Show for that - it would be too bothersome to use like that.
One way you can get proper Eq and Show instances is to, instead of hard-coding that function field, make it a type parameter and provide a function that just “erases” that field. I.e., if you have
data Foo = Foo
  { fooI :: Int
  , fooF :: Int -> Int }
you change it to
data Foo' f = Foo
  { _fooI :: Int
  , _fooF :: f }
  deriving (Eq, Show)
type Foo = Foo' (Int -> Int)
eraseFn :: Foo -> Foo' ()
eraseFn foo = foo{ _fooF = () }
Then, Foo will still not be Eq- or Showable (which after all it shouldn't be), but to make a Foo value showable you can just wrap it in eraseFn.
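As a usage sketch of this type-parameter approach: the helpers sameData and showFoo below are hypothetical names, not part of the answer. They simply erase the function field before comparing or printing.

```haskell
data Foo' f = Foo
  { _fooI :: Int
  , _fooF :: f
  } deriving (Eq, Show)

type Foo = Foo' (Int -> Int)

-- | Replace the function field with unit; the record update changes
-- the type parameter from (Int -> Int) to ().
eraseFn :: Foo -> Foo' ()
eraseFn foo = foo { _fooF = () }

-- | Hypothetical helper: equality on everything except the function.
sameData :: Foo -> Foo -> Bool
sameData a b = eraseFn a == eraseFn b

-- | Hypothetical helper: printable rendering with the function erased.
showFoo :: Foo -> String
showFoo = show . eraseFn
```

Because the comparison goes through an explicitly named helper rather than an Eq instance on Foo itself, call sites make it visible that the function field is being ignored.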
Typically what I do in this circumstance is exactly what you say you don’t want to do, namely, wrap the function in a newtype and provide a Show for that:
data T1 = T1
  { f :: X -> Y
  , xs :: [String]
  , ys :: [Bool]
  }
data T2 = T2
  { f :: OpaqueFunction X Y
  , xs :: [String]
  , ys :: [Bool]
  }
  deriving (Show)
newtype OpaqueFunction a b = OpaqueFunction (a -> b)
instance Show (OpaqueFunction a b) where
  show = const "<function>"
If you don’t want to do that, you can instead make the function a type parameter, and substitute it out when Showing the type:
data T3' a = T3'
  { f :: a
  , xs :: [String]
  , ys :: [Bool]
  }
  deriving (Functor, Show) -- needs the DeriveFunctor extension
newtype T3 = T3 (T3' (X -> Y))
data Opaque = Opaque
instance Show Opaque where
  show = const "..."
instance Show T3 where
  show (T3 t) = show (Opaque <$ t)
Or I’ll refactor my data type to derive Show only for the parts I want to be Showable by default, and override the other parts:
data T4 = T4
  { f :: X -> Y
  , xys :: T4' -- Move the other fields into another type.
  }
instance Show T4 where
  show (T4 f xys) = "T4 <function> " <> show xys
data T4' = T4'
  { xs :: [String]
  , ys :: [Bool]
  }
  deriving (Show) -- Derive ‘Show’ for the showable fields.
Or if my type is small, I’ll use a newtype instead of data, and derive Show via something like OpaqueFunction:
{-# LANGUAGE DerivingVia #-}
newtype T5 = T5 (X -> Y, [String], [Bool])
  deriving (Show) via (OpaqueFunction X Y, [String], [Bool])
You can use the iso-deriving package to do this for data types using lenses if you care about keeping the field names / record accessors.
As for Eq (or Ord), it’s not a good idea to have an instance that equates values that can be observably distinguished in some way, since some code will treat them as identical and other code will not, and now you’re forced to care about stability: in some circumstance where I have a == b, should I pick a or b? This is why substitutability is a law for Eq: forall x y f. (x == y) ==> (f x == f y) if f is a “public” function that upholds the invariants of the type of x and y (although floating-point also violates this). A better choice is something like T4 above, having equality only for the parts of a type that can satisfy the laws, or explicitly using comparison modulo some function at use sites, e.g., comparing someField.
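The "comparison modulo some function at use sites" option can be sketched as follows. The Config type and the names comparable and sameSettings are hypothetical, chosen only to illustrate projecting out the lawfully comparable part and comparing that.

```haskell
import Data.Function (on)

-- | Hypothetical record with one function field that has no lawful Eq.
data Config = Config
  { name    :: String
  , retries :: Int
  , onError :: String -> IO ()
  }

-- | Project out the part of the record that can be compared lawfully.
comparable :: Config -> (String, Int)
comparable c = (name c, retries c)

-- | Equality modulo the projection; the function field is ignored,
-- and the call site says so explicitly.
sameSettings :: Config -> Config -> Bool
sameSettings = (==) `on` comparable
```

The same projection works for ordering via comparing comparable from Data.Ord, without ever giving Config an Eq or Ord instance.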
The module Text.Show.Functions in base provides a Show instance for functions that displays <function>. To use it, just:
import Text.Show.Functions
It just defines an instance something like:
instance Show (a -> b) where
show _ = "<function>"
Similarly, you can define your own Eq instance:
import Text.Show.Functions
instance Eq (a -> b) where
  -- all functions are equal...
  -- ...though some are more equal than others
  _ == _ = True
data Foo = Foo Int Double (Int -> Int) deriving (Show, Eq)
main = do
  print $ Foo 1 2.0 (+1)
  print $ Foo 1 2.0 (+1) == Foo 1 2.0 (+2) -- is True
This will be an orphan instance, so you'll get a warning with -Wall.
Obviously, these instances will apply to all functions. You can write instances for a more specialized function type (e.g., only for Int -> String, if that's the type of the function field in your data type), but there is no way to simultaneously (1) use the built-in Eq and Show deriving mechanisms to derive instances for your datatype, (2) not introduce a newtype wrapper for the function field (or some other type polymorphism as mentioned in the other answers), and (3) only have the function instances apply to the function field of your data type and not other function values of the same type.
If you really want to limit applicability of the custom function instances without a newtype wrapper, you'd probably need to build your own generics-based solution, which wouldn't make much sense unless you wanted to do this for a lot of data types. If you go this route, then the Generics.Deriving.Show and Generics.Deriving.Eq modules in generic-deriving provide templates for these instances which could be modified to treat functions specially, allowing you to derive per-datatype instances using some stub instances something like:
instance Show Foo where showsPrec = myGenericShowsPrec
instance Eq Foo where (==) = myGenericEquality
I proposed an idea for adding annotations to fields via fields, that allows operating on behaviour of individual fields.
data A = A
  { a :: Int
  , b :: Int
  , c :: Int -> Int via Ignore (Int->Int)
  }
  deriving
    stock GHC.Generic
  deriving (Eq, Show)
    via Generically A -- assuming Eq (Generically A)
                      --          Show (Generically A)
This is already possible with the "microsurgery" library, though you might have to write some boilerplate to get it going. Another solution is to write the separate behaviour in "sums-of-products style":
data A = A Int Int (Int->Int)
  deriving
    stock GHC.Generic
  deriving
    anyclass SOP.Generic
  deriving (Eq, Show)
    via A <-𝈖-> '[ '[ Int, Int, Ignore (Int->Int) ] ]

How can I find out which functions can be run on an expression?

I'd like to programmatically find out which functions in a module could possibly apply to a particular expression.
Let's make this concrete.
{-# LANGUAGE TemplateHaskell #-}
module Test where
-- we'll import template-haskell from Lens
-- so we can create prisms automatically for our 'AST'
import qualified Control.Lens.TH as LTH
--- some 'AST' in a toy language
data CExp
  = CLit Int -- a literal integer
  | CAdd CExp CExp -- addition
  | CMul CExp CExp -- multiplication
  | CSub CExp CExp -- subtraction
  deriving Show
-- an eval for our AST
eval :: CExp -> Int
eval exp =
  case exp of
    CLit i -> i
    CAdd e1 e2 ->
      eval e1 + eval e2
    CMul e1 e2 ->
      eval e1 * eval e2
    CSub e1 e2 ->
      eval e1 - eval e2
-- a function to build a sum using add with our AST, from a list of Int values
listToSums :: [Int] -> CExp
listToSums =
  foldr CAdd (CLit 0) . fmap CLit
-- here we make prisms for looking at particular values
-- in the CExp AST
LTH.makePrisms ''CExp
-- let's have an expression:
theList1 :: CExp
theList1 = listToSums [1..38]
Now, at this point, I'd like a function that can give me a list of all the top level functions of a particular module (including this one) that are able to be applied to the expression theList1. This will include the prisms that were created with makePrisms.
It would be fine if it uses the hint library's Interpreter monad. I've been experimenting with it a bit, and while I can get a list of all of the definitions at the top level of any module, and I can find the types of them, too (more or less), I'm a bit lost about how to pass an expression in as an argument to these functions then check if those expressions will typecheck.
If I can do that, I can run filter across all of the functions in a module, which lets me find out which ones are applicable.
Many thanks in advance.
Hint has a function called typeChecks :: MonadInterpreter m => String -> m Bool, which tells you whether the expression in the string is well typed in the interpreter's context.
fnsAccepting :: MonadInterpreter m => String -> m [Id]
fnsAccepting expr = do
  moduleContents <- getLoadedModules >>= traverse getModuleExports
  let importedFns = [fn | exports <- moduleContents, Fun fn <- exports]
  filterM (\fn -> typeChecks $ fn ++ " " ++ parens expr) importedFns

Uniqueness and other restrictions for Arbitrary in QuickCheck

I'm trying to write a modified Arbitrary instance for my data type, where (in my case) a subcomponent has a type [String]. I would ideally like to bring uniqueness in the instance itself, that way I don't need ==> headers / prerequisites for every test I write.
Here's my data type:
data Foo = Vars [String]
and the trivial Arbitrary instance:
instance Arbitrary Foo where
arbitrary = Vars <$> (:[]) <$> choose ('A','z')
This instance is strange, I know. In the past, I've had difficulty when quickcheck combinatorically explodes, so I'd like to keep these values small. Another request - how can I make an instance where the generated strings are under 4 characters, for instance?
All of this, fundamentally requires (boolean) predicates to augment Arbitrary instances. Is this possible?
You definitely want the instance to produce only values that match the intention of the data type. If all the variables should be distinct, the Arbitrary instance must reflect this. (Another question is whether it would make more sense in this case to define Vars as a set, like newtype Vars = Vars (Set String).)
I'd suggest checking for duplicates using a Set or a hash table, as nub has O(n^2) complexity, which might slow down your tests considerably for larger inputs. For example:
import Control.Applicative
import Data.List (nub)
import qualified Data.Set as Set
import Test.QuickCheck
newtype Foo = Vars [String]
  deriving (Show) -- QuickCheck needs Show to print counter-examples
-- | Checks if a given list has no duplicates in _O(n log n)_.
hasNoDups :: (Ord a) => [a] -> Bool
hasNoDups = loop Set.empty
  where
    loop _ [] = True
    loop s (x:xs) | s' <- Set.insert x s, Set.size s' > Set.size s
                  = loop s' xs
                  | otherwise
                  = False
-- | Always worth testing whether we wrote `hasNoDups` properly.
prop_hasNoDups :: [Int] -> Property
prop_hasNoDups xs = hasNoDups xs === (nub xs == xs)
Your instance then needs to create a list of lists, and each list should be randomized. So instead of (: []), which creates just a singleton list (and just one level), you need to call listOf twice:
instance Arbitrary Foo where
  arbitrary = Vars <$> (listOf . listOf $ choose ('A','z'))
                         `suchThat` hasNoDups
Also notice that choose ('A', 'z') allows all characters between A and z, which includes the punctuation characters [, \, ], ^, _ and ` that sit between Z and a in ASCII. My guess is that you rather want something like
oneof [choose ('A','Z'), choose ('a','z')]
If you really want, you could also make hasNoDups O(n) using hash tables in the ST monad.
Concerning limiting the size: you could always have your own parametrized functions that produce different Gen Foo, but I'd say in most cases it's not necessary. Gen has its own internal size parameter, which is increased throughout the tests (see this answer), so different sizes (as generated using listOf) of lists are covered.
But I'd suggest implementing shrink, as this will give you much nicer counter-examples. For example, if we define a (wrong) test that tries to verify that no instance of Vars contains 'a' in any of its variables:
prop_Foo_hasNoDups :: Foo -> Property
prop_Foo_hasNoDups (Vars xs) = all (notElem 'a') xs === True
we'll get ugly counter-examples such as
Vars ["RhdaJytDWKm","FHHhrqbI","JVPKGTqNCN","awa","DABsOGNRYz","Wshubp","Iab","pl"]
But adding
shrink (Vars xs) = map Vars $ shrink xs
to Arbitrary Foo reduces the counter-example to just
Vars ["a"]
suchThat :: Gen a -> (a -> Bool) -> Gen a is a way to embed Boolean predicates in a Gen. See the haddocks for more info.
Here's how you would make the instance unique:
instance Arbitrary Foo where
  arbitrary = Vars <$> (singletons `suchThat` isUnique)
    where
      -- one single-character string, wrapped in a list
      singletons = (\c -> [[c]]) <$> choose ('A','z')
      isUnique x = nub x == x
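Both requests from the question (strings under 4 characters, unique entries) can also be met by construction instead of by filtering with suchThat, which avoids discarded attempts. The following is a sketch under that assumption. The names genShortName, genFoo and prop_fooValid are hypothetical, and restricting names to ASCII letters is a choice made here for illustration.

```haskell
import qualified Data.Set as Set
import Test.QuickCheck

newtype Foo = Vars [String] deriving (Show, Eq)

-- | A name of 1 to 3 ASCII letters.
genShortName :: Gen String
genShortName = do
  n <- choose (1, 3)
  vectorOf n (elements (['A' .. 'Z'] ++ ['a' .. 'z']))

-- | Unique by construction: deduplicate through a Set, then shuffle
-- so the result is not always sorted.
genFoo :: Gen Foo
genFoo = do
  names <- listOf genShortName
  Vars <$> shuffle (Set.toList (Set.fromList names))

instance Arbitrary Foo where
  arbitrary = genFoo
  -- Only drop whole names when shrinking, so uniqueness and the
  -- length bound are preserved.
  shrink (Vars xs) = [Vars ys | ys <- shrinkList (const []) xs]

-- | The invariant the generator is meant to guarantee.
prop_fooValid :: Foo -> Bool
prop_fooValid (Vars xs) =
  all (\s -> not (null s) && length s < 4) xs
    && Set.size (Set.fromList xs) == length xs
```

Running quickCheck prop_fooValid is a cheap sanity check that the generator and the shrinker both respect the invariant.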

Counting occurrences in an expression

I'm pretty new to Haskell and I have an assessment which involves a manipulator and evaluator of boolean expressions.
The expression type is:
type Variable = String
data Expr = T | Var Variable | And Expr Expr | Not Expr
I've worked through a lot of the questions, but I am stuck on how to approach the following function. I need to count the occurrences of all the variables in an expression:
addCounter :: Expr -> Expr
addCounter = undefined
prop_addCounter1 = addCounter (And (Var "y") (And (Var "x") (Var "y"))) ==
And (Var "y1") (And (Var "x2") (Var "y1"))
prop_addCounter2 = addCounter (Not (And (Var "y") T)) ==
Not (And (Var "y1") T)
I'm not looking for an answer on exactly how to do this as it is an assessment question but I would like some tips on how I would go about approaching this?
In my head I imagine incrementing a counter so that I can get the y1, x2 part but this isn't really something that is possible in Haskell (or not advised to do anyway!) Would I go about this through recursion and if so how do I know what number to add to the variable?
As you say, you cannot keep a shared mutable counter, which would be very natural in this case. What you can do instead is pass the current counter value down the tree as you recursively visit each Expr, and receive back the incremented counter value from the function being called. It must be a two-way communication: you pass down the current value and receive back the updated Expr and the new counter value.
If you want each unique variable name to have the same counter value you need to keep a mapping of variable names to assigned counter values. You need to pass that one around just like the current counter value.
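The two-way communication described here can be sketched on a simpler task than the assessment's: numbering every Var occurrence from left to right. The function name label is hypothetical, and this deliberately does not implement the per-variable mapping mentioned above.

```haskell
type Variable = String

data Expr = T | Var Variable | And Expr Expr | Not Expr
  deriving (Show, Eq)

-- | Take the next free counter value, return the rewritten expression
-- together with the counter value to use afterwards.
label :: Int -> Expr -> (Expr, Int)
label n T         = (T, n)
label n (Var v)   = (Var (v ++ show n), n + 1)
label n (Not e)   = let (e', n') = label n e in (Not e', n')
label n (And l r) =
  let (l', n1) = label n l    -- left subtree first
      (r', n2) = label n1 r   -- right subtree continues from n1
  in (And l' r', n2)
```

Note how the And case feeds the counter returned by the left subtree into the right one; passing a map of already seen names follows exactly the same pattern.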
Hope that helps.
Atomize your stateful updates
So, this is definitely a great time to use a State monad. In particular, the atomic transform you're looking for is a way to take String -> String enumerating strings by a unique id for each string. Let's call it enumerate
import Control.Monad.State
-- | This is the only function which is going to touch our 'Variable's
enumerate :: Variable -> State OurState Variable
To do this, we'll need to track state that maps Strings to counts (Ints)
import qualified Data.Map as M
type OurState = M.Map String Int
runOurState :: State OurState a -> a
runOurState = flip evalState M.empty
runOurState $ mapM enumerate ["x", "y", "z", "x" ,"x", "x", "y"]
-- ["x1", "y1", "z1", "x2", "x3", "x4", "y2"]
so we can implement enumerate pretty directly as a stateful action.
enumerate :: Variable -> State OurState Variable
enumerate var = do m <- get
                   let n = 1 + M.findWithDefault 0 var m
                   put $ M.insert var n m
                   return $ var ++ show n
Cool!
Folding generically over an expression tree
Now we really ought to write an elaborate folding apparatus which maps Expr -> State OurState Expr by applying enumerate on each Var-type leaf.
enumerateExpr :: Expr -> State OurState Expr
enumerateExpr T = return T
enumerateExpr (Var s) = fmap Var (enumerate s)
enumerateExpr (And e1 e2) = do em1 <- enumerateExpr e1
                               em2 <- enumerateExpr e2
                               return (And em1 em2)
enumerateExpr (Not expr) = fmap Not (enumerateExpr expr)
But this is pretty tedious, so we'll use the Uniplate library to stay DRY.
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data
import Data.Generics.Uniplate.Data
data Expr = T | Var Variable | And Expr Expr | Not Expr
deriving (Show,Eq,Ord,Data)
onVarStringM :: (Variable -> State OurState Variable) -> Expr -> State OurState Expr
onVarStringM action = transformM go
  where go :: Expr -> State OurState Expr
        go (Var s) = fmap Var (action s)
        go x = return x
The transformM operator does just what we want—apply a monadic transformation over all the pieces of a generic tree (our Expr).
So now, we just unpack the Stateful action to make addCounter
addCounter :: Expr -> Expr
addCounter = runOurState . onVarStringM enumerate
Oh, wait!
Just noticed, this doesn't actually have the right behavior—it doesn't enumerate your variables quite right (prop_addCounter1 fails but prop_addCounter2 passes). Unfortunately, I'm not really sure how it ought to be done... but given this separation of concerns laid out here it'd be very easy to just write the appropriate enumerate Stateful action and apply it to the same generic Expr-transforming machinery.
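One reading of the two properties in the question is that each distinct variable gets the index of its first appearance (y is 1, x is 2, and the second y reuses 1). Under that assumption, a possible enumerate could look like the sketch below. It uses explicit recursion rather than Uniplate so that the left-to-right visiting order is under direct control; the state type name Seen is hypothetical.

```haskell
import Control.Monad.State
import qualified Data.Map as M

type Variable = String

data Expr = T | Var Variable | And Expr Expr | Not Expr
  deriving (Show, Eq)

-- | Map from each variable seen so far to the index it was assigned.
type Seen = M.Map Variable Int

-- | Reuse the index of a known variable; otherwise assign the next one.
enumerate :: Variable -> State Seen Variable
enumerate var = do
  seen <- get
  case M.lookup var seen of
    Just n  -> return (var ++ show n)
    Nothing -> do
      let n = M.size seen + 1
      put (M.insert var n seen)
      return (var ++ show n)

-- | Visit the tree left to right, renaming every Var leaf.
enumerateExpr :: Expr -> State Seen Expr
enumerateExpr T         = return T
enumerateExpr (Var v)   = Var <$> enumerate v
enumerateExpr (Not e)   = Not <$> enumerateExpr e
enumerateExpr (And l r) = And <$> enumerateExpr l <*> enumerateExpr r

addCounter :: Expr -> Expr
addCounter e = evalState (enumerateExpr e) M.empty
```

Only enumerate changed relative to the per-occurrence version; the tree-walking machinery stays the same, which is the separation of concerns the answer argues for.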
