How to organize Arbitrary instance definitions for QuickCheck in Haskell

I am learning to use QuickCheck and find it a very powerful library. I wrote a toy program with stack involving a few custom data types, and on my first attempt I defined all the Arbitrary instances in one file, like this:
-- MyModule.hs
module MyModule
  ( A(..)
  , B(..)
  , ...
  ) where
data A = ...
data B = ... -- depends on type A
-- in test program
module TestMyModule where
import Test.QuickCheck
import MyModule
  ( A
  , B
  )
instance Arbitrary A where
  arbitrary = ...
instance Arbitrary B where
  arbitrary = ... -- depends on arbitrary for type A
prop_A :: A -> Bool
prop_A = undefined
prop_B :: B -> Bool
prop_B = undefined
main = do
  quickCheck prop_A
  quickCheck prop_B
This works quite well. However, when I tried to split the tests into two files, one for A and one for B, I ran into issues. I think they were caused by splitting up the test program.
-- TestA program
import Test.QuickCheck
import MyModule (A)
instance Arbitrary A where
  arbitrary = ...
...
-- TestB program
import Test.QuickCheck
import MyModule (A, B)
instance Arbitrary B where
  arbitrary = ... -- code here depends on type A
I got errors like: No instance for (Arbitrary A) arising from a use of ‘quickCheck’, which makes sense, because in the TestB program A is not an instance of Arbitrary. But how should I organize the files without putting the Arbitrary instances in MyModule itself? I want to avoid putting test-related definitions in the module.
Thank you in advance.

As Fyodor Soikin writes, you can package these Arbitrary instances in a module that you put in your test library.
-- in test program
module MyTests.Arbs where
import Test.QuickCheck
import MyModule
-- Arbitrary instances go here...
-- Import MyTests.Arbs in your other test modules
The compiler ought to complain about orphan instances (it must have done so for the code in the OP as well), so you may want to consider wrapping those types in newtype definitions.
I use a naming scheme for that so that I'd typically have ValidA, InvalidB, AnyB, etc.
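A minimal sketch of the newtype approach, assuming stand-in definitions for A and B (the real ones live in MyModule and are elided in the question). The wrapper names AnyA and AnyB follow the naming scheme mentioned above; the field layouts and the placeholder property are purely illustrative.

```haskell
import Test.QuickCheck

-- Hypothetical stand-ins for the types exported by MyModule.
newtype A = A Int deriving (Show, Eq)
data B = B A Bool deriving (Show, Eq)

-- Wrappers owned by the test suite, so the instances below are not orphans.
newtype AnyA = AnyA A deriving (Show, Eq)
newtype AnyB = AnyB B deriving (Show, Eq)

instance Arbitrary AnyA where
  arbitrary = AnyA . A <$> arbitrary

instance Arbitrary AnyB where
  -- Reuses the generator for AnyA, then unwraps the result.
  arbitrary = do
    AnyA a <- arbitrary
    flag <- arbitrary
    pure (AnyB (B a flag))

-- Properties take the wrapper and unwrap it by pattern matching.
-- The body is a trivial placeholder for a real property about B.
prop_B :: AnyB -> Bool
prop_B (AnyB b) = b == b
```

Any test module that imports these wrappers can run quickCheck prop_B without the library module ever mentioning QuickCheck.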

Related

How can I make my type an instance of Arbitrary?

I have the following data and function
data Foo = A | B deriving (Show)
foolist :: Maybe Foo -> [Foo]
foolist Nothing = [A]
foolist (Just x) = [x]
prop_foolist x = (length (foolist x)) == 1
when running quickCheck prop_foolist, ghc tells me that Foo needs to be an instance of Arbitrary.
No instance for (Arbitrary Foo) arising from a use of ‘quickCheck’
In the expression: quickCheck prop_foolist
In an equation for ‘it’: it = quickCheck prop_foolist
I tried data Foo = A | B deriving (Show, Arbitrary), but this results in
Can't make a derived instance of ‘Arbitrary Foo’:
‘Arbitrary’ is not a derivable class
Try enabling DeriveAnyClass
In the data declaration for ‘Foo’
However, I can't figure out how to enable DeriveAnyClass. I just want to use QuickCheck with my simple function! The possible values of x are Nothing, Just A and Just B. Surely this should be possible to test?
There are two reasonable approaches:
Reuse an existing instance
If there's another instance that looks similar, you can use it. The Gen type is an instance of Functor, Applicative, and even Monad, so you can easily build generators from other ones. This is probably the most important general technique for writing Arbitrary instances. Most complex instances will be built up from one or more simpler ones.
boolToFoo :: Bool -> Foo
boolToFoo False = A
boolToFoo True = B
instance Arbitrary Foo where
  arbitrary = boolToFoo <$> arbitrary
In this case, Foo can't be "shrunk" to subparts in any meaningful way, so the default trivial implementation of shrink will work fine. If it were a more interesting type, you could have used some analogue of
shrink = map boolToFoo . shrink . fooToBool
Use the pieces available in Test.QuickCheck.Arbitrary and/or Test.QuickCheck.Gen
In this case, it's pretty easy to just put together the pieces:
import Test.QuickCheck.Arbitrary
data Foo = A | B
  deriving (Show,Enum,Bounded)
instance Arbitrary Foo where
  arbitrary = arbitraryBoundedEnum
As mentioned, the default shrink implementation would be fine in this case. In the case of a recursive type, you'd likely want to add
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics (Generic)
and then derive Generic for your type and use
instance Arbitrary ... where
  ...
  shrink = genericShrink
As the documentation warns, genericShrink does not respect any internal validity conditions you may wish to impose, so some care may be required in some cases.
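To make the recursive case concrete, here is a small sketch. The Tree type is a hypothetical example (it does not appear in the question), and the sized generator is just one reasonable way to keep generation finite.

```haskell
{-# LANGUAGE DeriveGeneric #-}

import GHC.Generics (Generic)
import Test.QuickCheck

-- Hypothetical recursive type, used only for illustration.
data Tree = Leaf | Node Tree Int Tree
  deriving (Show, Eq, Generic)

instance Arbitrary Tree where
  -- Halve the size budget at each level so generation terminates.
  arbitrary = sized go
    where
      go :: Int -> Gen Tree
      go 0 = pure Leaf
      go n = oneof
        [ pure Leaf
        , Node <$> go (n `div` 2) <*> arbitrary <*> go (n `div` 2)
        ]
  -- Offers immediate subtrees, plus trees with a single field shrunk.
  shrink = genericShrink
```

With this instance, a failing Node can shrink to one of its subtrees, which the default shrink (returning no candidates) would never try.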
You asked about DeriveAnyClass. If you wanted that, you'd add
{-# LANGUAGE DeriveAnyClass #-}
to the top of your file. But you don't want that. You certainly don't want it here, anyway. It only works for classes that have a full complement of defaults based on Generics, typically using the DefaultSignatures extension. In this case, there is no default arbitrary :: Generic a => Gen a line in the Arbitrary class definition, and arbitrary is mandatory. So an instance of Arbitrary produced by DeriveAnyClass will produce a runtime error as soon as QuickCheck tries to call its arbitrary method.

HTF does not test props generated by TH

I want to do a number of similar tests on various types in my library.
To simplify things, assume I have a number of vector types implementing Num class, and I want to generate the same QuickCheck property check prop_absNorm x y = abs x + abs y >= abs (x+y) that would work on all of the types in library.
I generate such properties using TH:
$(writeTests
    (\t ->
      [d| prop_absNorm :: $(t) -> $(t) -> Bool
          prop_absNorm x y = abs x + abs y >= abs (x+y)
        |])
 )
My function to generate tests has the following signature:
writeTests :: (TypeQ -> Q [Dec]) -> Q [Dec]
This function looks for all instances of my vector class VectorMath (n::Nat) t (and, at the same time, instances of Num) through reify ''VectorMath and generates all prop functions accordingly.
-ddump-splices shows something like this:
prop_absNormIntX4 :: Vector 4 Int -> Vector 4 Int -> Bool
prop_absNormIntX4 x y = abs x + abs y >= abs (x+y)
prop_absNormCIntX4 :: Vector 4 CInt -> Vector 4 CInt -> Bool
prop_absNormCIntX4 x y = abs x + abs y >= abs (x+y)
...
prop_absNormFloatX4 :: Vector 4 Float -> Vector 4 Float -> Bool
prop_absNormFloatX4 x y = abs x + abs y >= abs (x+y)
prop_absNormFloatX3 :: Vector 3 Float -> Vector 3 Float -> Bool
prop_absNormFloatX3 x y = abs x + abs y >= abs (x+y)
The problem is that all manually written properties are checked, but generated ones are not.
Note 1: I have generated and non-generated properties in the same file (i.e. TH expression $(..) is in the same file as the other props).
Note 2: the list of types for creation of prop functions is variable - I want to add other instances of VectorMath later, so they are automatically added into the test list.
I believe that the problem is that HTF (which presumably uses TH too) parses the original file, not the one with generated code - but I cannot get why this happens.
So my question is: how can I solve this problem? If it is not possible to use TH-generated props, is it possible to run QuickCheck tests on various types (i.e. have it substitute them into prop_absNorm :: Vector 4 a -> Vector 4 a -> Bool)?
Another alternative may be to use TH further to add test entries manually to htf_Main, but I have not figured out how to do this yet, and it does not look like a nice, clean solution.
If you know in advance what the names of the generated property tests are, then you could always manually define stubs so that HTF sees them, e.g.:
$(generate prop test for Int)
$(generate prop test for CInt)
prop_p1 = prop_absNormInt
prop_p2 = prop_absNormCInt
HTF will see the tests as prop_p1 and prop_p2. You shouldn't have to put type signatures on these stubs.
Another idea is to create your own source pre-processor to add these stubs for you (and give them better names). Your source pre-processor would automatically call htfpp to complete the pre-processing.
If you show me how your TH is invoked I can show you how to write the pre-processor.
Update:
Given your comment I would look at doing the following:
Write a program to generate the test module source.
Include that program and the output it generates in your cabal project.
Tell users to run the program if they want to update the test module.
So - the test cases remain fixed until the program is run to regenerate the test module.
Having a static test module has the advantage that you can tell exactly what is being tested.
Having a program to recreate the test module gives you the ability to easily update it when new Num instances become available.
Ok, I managed to solve this problem.
The idea is to use TH to aggregate the tests and insert them into htfMain.
On top of what I have in the question, this includes the following steps:
1. Convert all testable properties into IO actions running QuickCheck tests;
2. Aggregate all tests into a TestSuite;
3. Aggregate all test suites into one list and put it into htfMain.
For step 1 I had to use a semi-internal HTF function called qcAssertion :: (QCAssertion t) => t -> Assertion.
This function is available, but not recommended for external use; it runs QuickCheck tests nicely and integrates them into the report.
To proceed with step 2, I use two functions from HTF: makeTestSuite and makeQuickCheckTest.
I also use location function from TH to provide filename and line of the place where the splice with test template is inserted (for nicer test logs).
Step 3 is a tricky one: for this we need to find all generated test suites.
The problem is that TH does not allow browsing through all functions (including generated ones) in a module.
To overcome this, I added the following type class:
class MultitypeTestSuite name where
  multitypeTestSuite :: name -> TestSuite
So my function writeTests generates a new data type data MTS[prop_name] and an instance of MultitypeTestSuite for that data type.
This allows me later to use another splice function in htfMain that will generate a list of test suites out of instances of that class using reify:
aggregateTests :: ExpQ
aggregateTests = do
  ClassI _ instances <- reify ''MultitypeTestSuite
  liftM ListE . forM instances
    $ \... -> [e| multitypeTestSuite $(...) |]
In the end, including all generated tests together with manually written ones looks pretty simple:
main :: IO ()
main = htfMain $ htf_importedTests ++ $(aggregateTests)
So, by adjusting the function $(writeTests), I am now able to generate and test properties that vary in argument type, for all types available in scope at the same time.
Test results and logs are reported the same way as for the original tests.
With that, the problem is fully solved.
HTF does not use TemplateHaskell for collecting the tests, because this would slow down compilation significantly. Instead, HTF uses a custom preprocessor called htfpp. htfpp runs before the compiler (and thus before TemplateHaskell splices are expanded). This means that you cannot use automatic test discovery with htfpp when generating your tests with TemplateHaskell.
My suggestion: since you are using TemplateHaskell anyway, just use TemplateHaskell to collect your generated test cases. This functionality is not built into HTF, but such a function is not difficult to implement. Here it is:
-- file TH.hs
{-# LANGUAGE TemplateHaskell #-}
module TH ( genTestSuiteFromQcProps ) where
import Language.Haskell.TH
import Test.Framework
import Test.Framework.Location
genTestSuiteFromQcProps :: String -> [Name] -> Q Exp
genTestSuiteFromQcProps suiteName names =
    [| makeTestSuite $(stringE suiteName) $(listE genTests) |]
  where
    genTests :: [ExpQ]
    genTests =
      map genTest names
    genTest :: Name -> Q Exp
    genTest name =
      [| makeQuickCheckTest $(stringE (show name)) unknownLocation
           (qcAssertion $(varE name)) |]
The function genTestSuiteFromQcProps takes the name of the test suite to generate and a list of names referring to your QC properties. genTestSuiteFromQcProps returns an expression of type TestSuite. TestSuite is one of the types HTF uses to organize tests.
(The htfpp preprocessor also uses the TestSuite type in its output.)
Here is how you would use genTestSuiteFromQcProps:
-- file Main.hs
{-# OPTIONS_GHC -F -pgmF htfpp #-}
{-# LANGUAGE TemplateHaskell #-}
module Main where
import TH
import Test.Framework
import {-# HTF_TESTS #-} OtherTests
prop_additionCommutative :: Int -> Int -> Bool
prop_additionCommutative x y = (x + y) == (y + x)
prop_reverseReverseIdentity :: [Int] -> Bool
prop_reverseReverseIdentity l = l == reverse (reverse l)
myTestSuite :: TestSuite
myTestSuite =
  $(genTestSuiteFromQcProps
      "MyTestSuite"
      ['prop_additionCommutative
      ,'prop_reverseReverseIdentity])
main :: IO ()
main = htfMain (myTestSuite : htf_importedTests)
For your case, you would pass genTestSuiteFromQcProps the names of the QC properties you generated with TemplateHaskell.
The example also shows that you can mix test cases generated with the TemplateHaskell function with test cases collected by htfpp. For completeness, here is the content of OtherTests:
{-# OPTIONS_GHC -F -pgmF htfpp #-}
module OtherTests ( htf_thisModulesTests ) where
import Test.Framework
test_someOtherTest :: IO ()
test_someOtherTest =
  assertEqual 1 1

Which dictionary does GHC choose when more than one is in scope?

Consider the following example:
import Data.Constraint
class Bar a where
  bar :: a -> a
foo :: (Bar a) => Dict (Bar a) -> a -> a
foo Dict = bar
GHC has two choices for the dictionary to use when selecting a Bar instance in foo: it could use the dictionary from the Bar a constraint on foo, or it could use the runtime Dict to get a dictionary. See this question for an example where the dictionaries correspond to different instances.
Which dictionary does GHC use, and why is it the "correct" choice?
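For readers unfamiliar with the two sources of a dictionary, the following sketch separates them. It assumes a hand-rolled, Bar-specific BarDict type in place of Data.Constraint's Dict (so it needs only base), and the names viaDict and viaConstraint and the Int instance are hypothetical.

```haskell
{-# LANGUAGE GADTs #-}

class Bar a where
  bar :: a -> a

-- Hypothetical instance, only so the example can be run.
instance Bar Int where
  bar = (+ 1)

-- Hand-rolled stand-in for Dict (Bar a): the constructor captures
-- the Bar a dictionary that is in scope where BarDict is built.
data BarDict a where
  BarDict :: Bar a => BarDict a

-- No Bar a constraint here: matching on BarDict supplies the dictionary.
viaDict :: BarDict a -> a -> a
viaDict BarDict = bar

-- Here the dictionary is passed implicitly by the caller.
viaConstraint :: Bar a => a -> a
viaConstraint = bar
```

The foo in the question has both sources available at once, which is what makes the choice ambiguous. With a single coherent instance, as here, viaDict and viaConstraint agree.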
It just picks one. This isn't the correct choice; it's a pretty well-known wart. You can cause crashes this way, so it's a pretty bad state of affairs. Here is a short example using nothing but GADTs that demonstrates that it is possible to have two different instances in scope at once:
-- file Class.hs
{-# LANGUAGE GADTs #-}
module Class where
data Dict a where
  Dict :: C a => Dict a
class C a where
  test :: a -> Bool
-- file A.hs
module A where
import Class
instance C Int where
  test _ = True
v :: Dict Int
v = Dict
-- file B.hs
module B where
import Class
instance C Int where
  test _ = False
f :: Dict Int -> Bool
f Dict = test (0 :: Int)
-- file Main.hs
import A
import B
main = print (f v)
You will find that Main.hs compiles just fine, and even runs. It prints True on my machine with GHC 7.10.1, but that's not a stable outcome. Turning this into a crash is left to the reader.
GHC just picks one, and this is the correct choice. Any two dictionaries for the same constraint are supposed to be equal.
OverlappingInstances and IncoherentInstances are basically equivalent in destructive power; they both lose instance coherence by design (any two equal constraints in your program being satisfied by the same dictionary). OverlappingInstances gives you a little more ability to work out which instances will be used on a case-by-case basis, but this isn't that useful when you get to the point of passing around Dicts as first class values and so on. I would only consider using OverlappingInstances when I consider the overlapping instances extensionally equivalent (e.g., a more efficient but otherwise equal implementation for a specific type like Int), but even then, if I care enough about performance to write that specialized implementation, isn't it a performance bug if it doesn't get used when it could be?
In short, if you use OverlappingInstances, you give up the right to ask the question of which dictionary will be selected here.
Now it's true that you can break instance coherence without OverlappingInstances. In fact you can do it without orphans and without any extensions other than FlexibleInstances (arguably the problem is that the definition of "orphan" is wrong when FlexibleInstances is enabled). This is a very long-standing GHC bug, which hasn't been fixed in part because (a) it actually can't cause crashes directly as far as anybody seems to know, and (b) there might be a lot of programs that actually rely on having multiple instances for the same constraint in separate parts of the program, and that might be hard to avoid.
Getting back to the main topic, in principle it's important that GHC can select any dictionary that it has available to satisfy a constraint, because even though they are supposed to be equal, GHC might have more static information about some of them than others. Your example is a little bit too simple to be illustrative, but imagine that you passed an argument to bar. In general GHC doesn't know anything about the dictionary passed in via Dict, so it has to treat this as a call to an unknown function. But if you called foo at a specific type T for which there was a Bar T instance in scope, then GHC would know that the bar from the Bar a constraint dictionary was T's bar and could generate a call to a known function, and potentially inline T's bar and do more optimizations as a result.
In practice, GHC is currently not this smart and it just uses the innermost dictionary available. It would probably already be better to always use the outermost dictionary. But cases like this where there are multiple dictionaries available are not very common, so we don't have good benchmarks to test on.
Here's a test:
{-# LANGUAGE FlexibleInstances, OverlappingInstances, IncoherentInstances #-}
import Data.Constraint
class C a where foo :: a -> String
instance C [a] where foo _ = "[a]"
instance C [()] where foo _ = "[()]"
aDict :: Dict (C [a])
aDict = Dict
bDict :: Dict (C [()])
bDict = Dict
bar1 :: String
bar1 = case (bDict, aDict :: Dict (C [()])) of
  (Dict,Dict) -> foo [()] -- output: "[a]"
bar2 :: String
bar2 = case (aDict :: Dict (C [()]), bDict) of
  (Dict,Dict) -> foo [()] -- output: "[()]"
GHC above happens to use the "last" dictionary which was brought into scope. I wouldn't rely on this, though.
If you limit yourself to overlapping instances, only, then you wouldn't be able to bring in scope two different dictionaries for the same type (as far as I can see), and everything should be fine since the choice of the dictionary becomes immaterial.
However, incoherent instances are another beast, since they allow you to commit to a generic instance and then use it at a type which has a more specific instance. This makes it very hard to understand which instance will be used.
In short, incoherent instances are evil.
Update: I ran some further tests. Using only overlapping instances and an orphan instance in a separate module you can still obtain two different dictionaries for the same type. So, we need even more caveats. :-(

Plug new FFI method into GHC

Is there a way to plug a Haskell function of type
myFFI :: (C a) => String -> IO a
(where C is some typeclass describing the types of variables I can import) into GHC as an FFI scheme so that I can write in my Haskell program stuff like
foreign import myFFI "foo" foo :: T1 -> T2
that gets compiled into a call to foo = unsafePerformIO $ myFFI "foo" :: T1 -> T2?
I imagine this could be done by modifying GHC, but is there a way to do it via a plugin I can write without touching the GHC codebase proper?
To answer the question in the comments (since the main question is answered with "use TH"), you can use TH as well to collect a list of all the names you've thus bound. Then, at startup, an init call can walk through that and force them.
There is no requirement that the second argument be in the IO monad in the first place.
foreign import ccall sin :: Double -> Double
is perfectly legit, but leads to undefined behavior if sin is impure.
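As a sketch of that point, the same C function can be imported with either a pure or an IO result type. The Haskell-side names c_sin and c_sinIO are hypothetical, and the example assumes a platform where GHC links the C math library by default.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

-- Pure import: a promise to GHC that C's sin has no side effects.
foreign import ccall unsafe "math.h sin"
  c_sin :: Double -> Double

-- The same C function imported in IO, the safe choice when in doubt.
foreign import ccall unsafe "math.h sin"
  c_sinIO :: Double -> IO Double
```

The pure form is only appropriate because sin really is a pure function; for anything that reads or writes state, the IO form avoids the undefined behavior mentioned above.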

Explicitly import instances

How do I explicitly import typeclass instances? Also, how do I do this with a qualified import?
Currently, I'm doing
import Control.Monad.Error ()
to import the monad instance that I can use for (Either String). Previously, I used
import Control.Monad.Error
I'm not satisfied with either one, because the Monad instance is implicitly imported.
The inability to control imports of instances is one of the trade-offs the Haskell typeclass system makes. Here's an example in a hypothetical Haskell dialect where you can:
Foo.hs:
module Foo where
data Foo = FooA | FooB deriving (Eq, Ord)
Bar.hs:
module Bar (myMap) where
import Data.Map (Map)
import qualified Data.Map as Map
import Foo
myMap :: Map Foo Int
myMap = Map.singleton FooA 42
Baz.hs:
module Baz where
import Data.Map (Map)
import qualified Data.Map as Map
import Foo hiding (instance Ord Foo)
import Bar (myMap)
instance Ord Foo where
  FooA > FooB = True
  FooB > FooA = False
ouch :: Map Foo Int
ouch = Map.insert FooB 42 myMap
Yikes! The map myMap was created with the proper instance Ord Foo, but the insertion into it uses a different, contradictory instance.
Being able to do this would violate Haskell's open world assumption. Unfortunately, I don't know of a good, centralised resource for learning about it. This section of RWH might be helpful (I searched for "haskell open world assumption").
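The same failure can be imitated in real Haskell by passing the orderings explicitly. This is only an analogy to the hypothetical Baz.hs, assuming a sorted list as a stand-in for the Map's internal invariant; the names ordered, ouch, and isSortedBy are illustrative.

```haskell
import Data.List (insertBy, sortBy)
import Data.Ord (Down (..), comparing)

data Foo = FooA | FooB deriving (Eq, Ord, Show)

-- Built with the derived ordering, like myMap in Bar.hs.
ordered :: [Foo]
ordered = sortBy compare [FooB, FooA, FooA]

-- Updated with the reversed ordering, like ouch in Baz.hs.
ouch :: [Foo]
ouch = insertBy (comparing Down) FooB ordered

-- True when no adjacent pair is out of order under cmp.
isSortedBy :: (a -> a -> Ordering) -> [a] -> Bool
isSortedBy cmp xs = and (zipWith (\a b -> cmp a b /= GT) xs (drop 1 xs))
```

After the mixed update, ouch is sorted under neither ordering, so any later lookup that relies on the invariant can go wrong. Tying the ordering to the type through a single global instance is what rules this out for Data.Map.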
You can't. Instances are always implicitly exported and hence you can't explicitly import them. By the way, Either e's Monad instance is nowadays in Control.Monad.Instances.
Although the generally correct answer would be "no, you can't", I suggest this horrendous solution:
copy + paste
Take a look at the library source code for the desired module, and copy/paste the necessary data declarations, imports, and function definitions into your own code. Don't copy the instances you don't want.
Depending on the problem at hand, the GHC type system extensions OverlappingInstances or IncoherentInstances might be an alternative solution, though this probably won't solve any problems with the base libraries.
