Safest way to generate random GADT with Hedgehog (or any other property-based testing framework) - haskell

I have GADT like this one:
data TType a where
TInt :: TType Int
TBool :: TType Bool
I want to have a function like this one:
genTType :: Gen (TType a)
Which can generate random constructor of TType type. I can do this simply by creating existentially qualified data type like
data AnyType = forall a . MkAnyType (TType a)
then generate random number from 0 to 1 (including) and create AnyType depending on the integer value. Like this:
intToAnyType :: Int -> AnyType
intToAnyType 0 = MkAnyType TInt
intToAnyType 1 = MkAnyType TBool
intToAnyType _ = error "Impossible happened"
But this approach has couple drawbacks to me:
No external type safety. If I add another constructor to TType data type I can forgot to fix tests and compiler won't warn me about this.
Compiler can't stop me from writing intToAnyType 1 = MkAnyType TInt.
I don't like this error. Int type is too broad to me. It would be nice to make this pattern-matching exhaustive.
What can I do in Haskell to eliminate as much as possible drawbacks here? Preferably using generators from this module:
https://hackage.haskell.org/package/hedgehog-0.5.1/docs/Hedgehog-Gen.html

Generating genTType with Template Haskell is probably your best bet to automate maintenance of the generators, because there is no generic programming support for GADTs.
For your last point, instead of generating an integer and then mapping it to a value, use oneof or element.
element [MkAnyType TInt, MkAnyType TBool]

Related

How does the :: operator syntax work in the context of bounded typeclass?

I'm learning Haskell and trying to understand the reasoning behind it's syntax design at the same time. Most of the syntax is beautiful.
But since :: normally is like a type annotation, How is it that this works:
Input: minBound::Int
Output: -2147483648
There is no separate operator: :: is a type annotation in that example. Perhaps the best way to understand this is to consider this code:
main = print (f minBound)
f :: Int -> Int
f = id
This also prints -2147483648. The use of minBound is inferred to be an Int because it is the parameter to f. Once the type has been inferred, the value for that type is known.
Now, back to:
main = print (minBound :: Int)
This works in the same way, except that minBound is known to be an Int because of the type annotation, rather than for some more complex reason. The :: isn't some binary operation; it just directs the compiler that the expression minBound has the type Int. Once again, since the type is known, the value can be determined from the type class.
:: still means "has type" in that example.
There are two ways you can use :: to write down type information. Type declarations, and inline type annotations. Presumably you've been used to seeing type declarations, as in:
plusOne :: Integer -> Integer
plusOne = (+1)
Here the plusOne :: Integer -> Integer line is a separate declaration about the identifier plusOne, informing the compiler what its type should be. It is then actually defined on the following line in another declaration.
The other way you can use :: is that you can embed type information in the middle of any expression. Any expression can be followed by :: and then a type, and it means the same thing as the expression on its own except with the additional constraint that it must have the given type. For example:
foo = ('a', 2) :: (Char, Integer)
bar = ('a', 2 :: Integer)
Note that for foo I attached the entire expression, so it is very little different from having used a separate foo :: (Char, Integer) declaration. bar is more interesting, since I gave a type annotation for just the 2 but used that within a larger expression (for the whole pair). 2 :: Integer is still an expression for the value 2; :: is not an operator that takes 2 as input and computes some result. Indeed if the 2 were already used in a context that requires it to be an Integer then the :: Integer annotation changes nothing at all. But because 2 is normally polymorphic in Haskell (it could fit into a context requiring an Integer, or a Double, or a Complex Float) the type annotation pins down that the type of this particular expression is Integer.
The use is that it avoids you having to restructure your code to have a separate declaration for the expression you want to attach a type to. To do that with my simple example would have required something like this:
two :: Integer
two = 2
baz = ('a', two)
Which adds a relatively large amount of extra code just to have something to attach :: Integer to. It also means when you're reading bar, you have to go read a whole separate definition to know what the second element of the pair is, instead of it being clearly stated right there.
So now we can answer your direct question. :: has no special or particular meaning with the Bounded type class or with minBound in particular. However it's useful with minBound (and other type class methods) because the whole point of type classes is to have overloaded names that do different things depending on the type. So selecting the type you want is useful!
minBound :: Int is just an expression using the value of minBound under the constraint that this particular time minBound is used as an Int, and so the value is -2147483648. As opposed to minBound :: Char which is '\NUL', or minBound :: Bool which is False.
None of those options mean anything different from using minBound where there was already some context requiring it to be an Int, or Char, or Bool; it's just a very quick and simple way of adding that context if there isn't one already.
It's worth being clear that both forms of :: are not operators as such. There's nothing terribly wrong with informally using the word operator for it, but be aware that "operator" has a specific meaning in Haskell; it refers to symbolic function names like +, *, &&, etc. Operators are first-class citizens of Haskell: we can bind them to variables1 and pass them around. For example I can do:
(|+|) = (+)
x = 1 |+| 2
But you cannot do this with ::. It is "hard-wired" into the language, just as the = symbol used for introducing definitions is, or the module Main ( main ) where syntax for module headers. As such there are lots of things that are true about Haskell operators that are not true about ::, so you need to be careful not to confuse yourself or others when you use the word "operator" informally to include ::.
1 Actually an operator is just a particular kind of variable name that is applied by writing it between two arguments instead of before them. The same function can be bound to operator and ordinary variables, even at the same time.
Just to add another example, with Monads you can play a little like this:
import Control.Monad
anyMonad :: (Monad m) => Int -> m Int
anyMonad x = (pure x) >>= (\x -> pure (x*x)) >>= (\x -> pure (x+2))
$> anyMonad 4 :: [Int]
=> [18]
$> anyMonad 4 :: Either a Int
=> Right 18
$> anyMonad 4 :: Maybe Int
=> Just 18
it's a generic example telling you that the functionality may change with the type, another example:

Practical applications of Rank 2 polymorphism?

I'm covering polymorphism and I'm trying to see the practical uses of such a feature.
My basic understanding of Rank 2 is:
type MyType = ∀ a. a -> a
subFunction :: a -> a
subFunction el = el
mainFunction :: MyType -> Int
mainFunction func = func 3
I understand that this is allowing the user to use a polymorphic function (subFunction) inside mainFunction and strictly specify it's output (Int). This seems very similar to GADT's:
data Example a where
ExampleInt :: Int -> Example Int
ExampleBool :: Bool -> Example Bool
1) Given the above, is my understanding of Rank 2 polymorphism correct?
2) What are the general situations where Rank 2 polymorphism can be used, as opposed to GADT's, for example?
If you pass a polymorphic function as and argument to a Rank2-polymorphic function, you're essentially passing not just one function but a whole family of functions – for all possible types that fulfill the constraints.
Typically, those forall quantifiers come with a class constraint. For example, I might wish to do number arithmetic with two different types simultaneously (for comparing precision or whatever).
data FloatCompare = FloatCompare {
singlePrecision :: Float
, doublePrecision :: Double
}
Now I might want to modify those numbers through some maths operation. Something like
modifyFloat :: (Num -> Num) -> FloatCompare -> FloatCompare
But Num is not a type, only a type class. I could of course pass a function that would modify any particular number type, but I couldn't use that to modify both a Float and a Double value, at least not without some ugly (and possibly lossy) converting back and forth.
Solution: Rank-2 polymorphism!
modifyFloat :: (∀ n . Num n => n -> n) -> FloatCompare -> FloatCompare
mofidyFloat f (FloatCompare single double)
= FloatCompare (f single) (f double)
The best single example of how this is useful in practice are probably lenses. A lens is a “smart accessor function” to a field in some larger data structure. It allows you to access fields, update them, gather results... while at the same time composing in a very simple way. How it works: Rank2-polymorphism; every lens is polymorphic, with the different instantiations corresponding to the “getter” / “setter” aspects, respectively.
The go-to example of an application of rank-2 types is runST as Benjamin Hodgson mentioned in the comments. This is a rather good example and there are a variety of examples using the same trick. For example, branding to maintain abstract data type invariants across multiple types, avoiding confusion of differentials in ad, a region-based version of ST.
But I'd actually like to talk about how Haskell programmers are implicitly using rank-2 types all the time. Every type class whose methods have universally quantified types desugars to a dictionary with a field with a rank-2 type. In practice, this is virtually always a higher-kinded type class* like Functor or Monad. I'll use a simplified version of Alternative as an example. The class declaration is:
class Alternative f where
empty :: f a
(<|>) :: f a -> f a -> f a
The dictionary representing this class would be:
data AlternativeDict f = AlternativeDict {
empty :: forall a. f a,
(<|>) :: forall a. f a -> f a -> f a }
Sometimes such an encoding is nice as it allows one to use different "instances" for the same type, perhaps only locally. For example, Maybe has two obvious instances of Alternative depending on whether Just a <|> Just b is Just a or Just b. Languages without type classes, such as Scala, do indeed use this encoding.
To connect to leftaroundabout's reference to lenses, you can view the hierarchy there as a hierarchy of type classes and the lens combinators as simply tools for explicitly building the relevant type class dictionaries. Of course, the reason it isn't actually a hierarchy of type classes is that we usually will have multiple "instances" for the same type. E.g. _head and _head . _tail are both "instances" of Traversal' s a.
* A higher-kinded type class doesn't necessarily lead to this, and it can happen for a type class of kind *. For example:
-- Higher-kinded but doesn't require universal quantification.
class Sum c where
sum :: c Int -> Int
-- Not higher-kinded but does require universal quantification.
class Length l where
length :: [a] -> l
If you are using modules in Haskell, you are already using Rank-2 types. Theoretically speaking, modules are records with rank-2 type properties.
For example, the Foo module below in Haskell ...
module Foo(id) where
id :: forall a. a -> a
id x = x
import qualified Foo
main = do
putStrLn (Foo.id "hello")
return ()
... can actually be thought as a record as follows:
type FooType = FooType {
id :: forall a. a -> a
}
Foo :: FooType
Foo = Foo {
id = \x -> x
}
P/S (unrelated this question): from a language design perspective, if you are going to support module system, then you might as well support higher-rank types (i.e. allow arbitrary quantification of type variables on any level) to reduce duplication of efforts (i.e. type checking a module should be almost the same as type checking a record with higher rank types).

Data type design in Haskell

Learning Haskell, I write a formatter of C++ header files. First, I parse all class members into a-collection-of-class-members which is then passed to the formatting routine. To represent class members I have
data ClassMember = CmTypedef Typedef |
CmMethod Method |
CmOperatorOverload OperatorOverload |
CmVariable Variable |
CmFriendClass FriendClass |
CmDestructor Destructor
(I need to classify the class members this way because of some peculiarities of the formatting style.)
The problem that annoys me is that to "drag" any function defined for the class member types to the ClassMember level, I have to write a lot of redundant code. For example,
instance Formattable ClassMember where
format (CmTypedef td) = format td
format (CmMethod m) = format m
format (CmOperatorOverload oo) = format oo
format (CmVariable v) = format v
format (CmFriendClass fc) = format fc
format (CmDestructor d) = format d
instance Prettifyable ClassMember where
-- same story here
On the other hand, I would definitely like to have a list of ClassMember objects (at least, I think so), hence defining it as
data ClassMember a = ClassMember a
instance Formattable ClassMember a
format (ClassMember a) = format a
doesn't seem to be an option.
The alternatives I'm considering are:
Store in ClassMember not object instances themselves, but functions defined on the corresponding types, which are needed by the formatting routine. This approach breaks the modularity, IMO, as the parsing results, represented by [ClassMember], need to be aware of all their usages.
Define ClassMember as an existential type, so [ClassMember] is no longer a problem. I doubt whether this design is strict enough and, again, I need to specify all constraints in the definition, like data ClassMember = forall a . Formattable a => ClassMember a. Also, I would prefer a solution without using extensions.
Is what I'm doing a proper way to do it in Haskell or there is a better way?
First, consider trimming down that ADT a bit. Operator overloads and destructors are special kinds of methods, so it might make more sense to treat all three in CmMethod; Method will then have special ways to separate them. Alternatively, keep all three CmMethod, CmOperatorOverload, and CmDestructor, but let them all contain the same Method type.
But of course, you can reduce the complexity only so much.
As for the specific example of a Show instance: you really don't want to write that yourself except in some special cases. For your case, it's much more reasonable to have the instance derived automatically:
data ClassMember = CmTypedef Typedef
| CmMethod Method
| ...
| CmDestructor Destructor
deriving (Show)
This will give different results from your custom instance – because yours is wrong: showing a contained result should also give information about the constructor.
If you're not really interested in Show but talking about another class C that does something more specific to ClassMembers – well, then you probably shouldn't have defined C in the first place! The purpose of type classes is to express mathematical concepts that hold for a great variety of types.
A possible solution is to use records.
It can be used without extensions and preserves flexibility.
There is still some boilerplate code, but you need to type it only once for all. So if you would need to perform another set of operations over your ClassMember, it would be very easy and quick to do it.
Here is an example for your particular case (template Haskell and Control.Lens makes things easier but are not mandatory):
{-# LANGUAGE TemplateHaskell #-}
module Test.ClassMember
import Control.Lens
-- | The class member as initially defined.
data ClassMember =
CmTypedef Typedef
| CmMethod Method
| CmOperatorOverload OperatorOverload
| CmVariable Variable
| CmFriendClass FriendClass
| CmDestructor Destructor
-- | Some dummy definitions of the data types, so the code will compile.
data Typedef = Typedef
data Method = Method
data OperatorOverload = OperatorOverload
data Variable = Variable
data FriendClass = FriendClass
data Destructor = Destructor
{-|
A data type which defines one function per constructor.
Note the type a, which means that for a given Hanlder "a" all functions
must return "a" (as for a type class!).
-}
data Handler a = Handler
{
_handleType :: Typedef -> a
, _handleMethod :: Method -> a
, _handleOperator :: OperatorOverload -> a
, _handleVariable :: Variable -> a
, _handleFriendClass :: FriendClass -> a
, _handleDestructor :: Destructor -> a
}
{-|
Here I am using lenses. This is not mandatory at all, but makes life easier.
This is also the reason of the TemplateHaskell language pragma above.
-}
makeLenses ''Handler
{-|
A function acting as a dispatcher (the boilerplate code!!!), telling which
function of the handler must be used for a given constructor.
-}
handle :: Handler a -> ClassMember -> a
handle handler member =
case member of
CmTypedef a -> handler^.handleType $ a
CmMethod a -> handler^.handleMethod $ a
CmOperatorOverload a -> handler^.handleOperator $ a
CmVariable a -> handler^.handleVariable $ a
CmFriendClass a -> handler^.handleFriendClass $ a
CmDestructor a) -> handler^.handleDestructor $ a
{-|
A dummy format method.
I kept things simple here, but you could define much more complicated
functions.
You could even define some generic functions separately and... you could define
them with some extra arguments that you would only provide when building
the Handler! An (dummy!) example is the way the destructor function is
constructed.
-}
format :: Handler String
format = Handler
(\x -> "type")
(\x -> "method")
(\x -> "operator")
(\x -> "variable")
(\x -> "Friend")
(destructorFunc $ (++) "format ")
{-|
A dummy function showcasing partial application.
It has one more argument than handleDestructor. In practice you are free
to add as many as you wish as long as it ends with the expected type
(Destructor -> String).
-}
destructorFunc :: (String -> String) -> Destructor -> String
destructorFunc f _ = f "destructor"
{-|
Construction of the pretty handler which illustrates the reason why
using lens by keeping a nice and concise syntax.
The "&" is the backward operator and ".~" is the set operator.
All we do here is to change the functions of the handleType and the
handleDestructor.
-}
pretty :: Handler String
pretty = format & handleType .~ (\x -> "Pretty type")
& handleDestructor .~ (destructorFunc ((++) "Pretty "))
And now we can run some tests:
test1 = handle format (CmDestructor Destructor)
> "format destructor"
test2 = handle pretty (CmDestructor Destructor)
> "Pretty destructor"

Why does Haskell not have records with structural typing?

I have heard Haskell described as having structural typing. Records are an exception to that though as I understand. For example foo cannot be called with something of type HRec2 even though HRec and HRec2 are only nominally different in their fields.
data HRec = HRec { x :: Int, y :: Bool }
data HRec2 = HRec2 { p :: Int, q :: Bool }
foo :: HRec -> Bool
Is there some explanation for rejecting extending structural typing to everything including records?
Are there statically typed languages with structural typing even for records? Is there maybe some debate on this I can read about for all statically typed languages in general?
Haskell has structured types, but not structural typing, and that's not likely to change.*
The refusal to permit nominally different but structurally similar types as interchangeable arguments is called type safety. It's a good thing. Haskell even has a newtype declaration to provide types which are only nominally different, to allow you to enforce more type safety. Type safety is an easy way to catch bugs early rather than permit them at runtime.
In addition to amindfv's good answer which includes ad hoc polymorphism via typeclasses (effectively a programmer-declared feature equivalence), there's parametric polymorphism where you allow absolutely any type, so [a] allows any type in your list and BTree a allows any type in your binary tree.
This gives three answers to "are these types interchangeable?".
No; the programmer didn't say so.
Equivalent for a specific purpose because the programmer said so.
Don't care - I can do the same thing to this collection of data because it doesn't use any property of the data itself.
There's no 4: compiler overrules programmer because they happened to use a couple of Ints and a String like in that other function.
*I said Haskell is unlikely to change to structural typing. There is some discussion to introduce some form of extensible records, but no plans to make (Int,(Int,Int)) count as the same as (Int, Int, Int) or Triple {one::Int, two::Int, three::Int} the same as Triple2 {one2::Int, two2::Int, three2::Int}.
Haskell records aren't really "less structural" than the rest of the type system. Every type is either completely specified, or "specifically vague" (i.e. defined with a typeclass).
To allow both HRec and HRec2 to f, you have a couple of options:
Algebraic types:
Here, you define HRec and HRec2 records to both be part of the HRec type:
data HRec = HRec { x :: Int, y :: Bool }
| HRec2 { p :: Int, q :: Bool }
foo :: HRec -> Bool
(alternately, and maybe more idiomatic:)
data HRecType = Type1 | Type2
data HRec = HRec { hRecType :: HRecType, x :: Int, y :: Bool }
Typeclasses
Here, you define foo as able to accept any type as input, as long as a typeclass instance has been written for that type:
data HRec = HRec { x :: Int, y :: Bool }
data HRec2 = HRec2 { p :: Int, q :: Bool }
class Flexible a where
foo :: a -> Bool
instance Flexible HRec where
foo (HRec a _) = a == 5 -- or whatever
instance Flexible HRec2 where
foo (HRec2 a _) = a == 5
Using typeclasses allows you to go further than regular structural typing -- you can accept types that have the necessary information embedded in them, even if the types don't superficially look similar, e.g.:
data Foo = Foo { a :: String, b :: Float }
data Bar = Bar { c :: String, d :: Integer }
class Thing a where
doAThing :: a -> Bool
instance Thing Foo where
doAThing (Foo x y) = (x == "hi") && (y == 0)
instance Thing Bar where
doAThing (Bar x y) = (x == "hi") && ((fromInteger y) == 0)
We can run fromInteger (or any arbitrary function) to get the data we need from what we have!
I'm aware of two library implementations of structurally typed records in Haskell:
HList is older, and is explained in an excellent paper: Haskell's overlooked object system (free online, but SO won't let me include more links)
vinyl is newer, and uses fancy new GHC features. There's at least one library, vinyl-gl, using it.
I cannot answer the language-design part of your question, though.
To answer your last question, Go and Scalas definitely have structural typing. Some people (including me) would call that "statically unsafe typing", since it implicitly declares all samely- named methods in a program to have the same semantics, which implies "spooky action at a distance", relating code in on source file to code in some library that the program has never seen.
IMO, better to require that the same-named methods to explicitly declare that they conform to a named semantic "model" of behavior.
Yes, the compiler would guarantee that the method is callable, but it isn't much safer than saying:
f :: [a] -> Int
And letting the compiler choose an arbitrary implementation that may or may not be length.
(A similar idea can be made safe with Scala "implicits" or Haskell (GHC?) "reflection" package.)

Alias for datatypes in Haskell

So I've got a structure like this:
data Maybe a = Nothing | Just a
but I want a structure that is defined as
data MaybeInt = Nothing | Just Int
is there a way to define MaybeInt using Maybe a, and if so how?
There are a few ways to define MaybeInt. I'll state them then have some commentary.
Direct
data MaybeInt = NothingInt | JustInt Int
Newtype
newtype MaybeInt = MI (Maybe Int)
Type synonym
type MaybeInt = Maybe Int
Plain
-- just use `(Maybe Int)` wherever you would write `MaybeInt`
Commentary
Most commonly, one would use the plain method since most people are familiar with Maybe and thus know to use Just and Nothing to match it. This makes it good for libraries—very transparent. The type synonym method is a common documentation method, but is basically useless for your synonym. It makes it so that foo :: Int -> Maybe Int and bar :: Int -> MaybeInt have identical type signatures. It also means that as soon as someone knows that MaybeInt === Maybe Int they can use the Just/Nothing constructors for matching.
The newtype method gets fairly interesting. Here you have to begin "wrapping" and "unwrapping" the MI constructor every time you want to use the MaybeInt type. Compare:
baz :: MaybeInt -> Bool
baz (MI Nothing) = False
baz (MI (Just int)) = True
this is nice because if you don't export MI then nobody will be able to match on MaybeInt (despite having a pretty good guess at what's going on inside of it). This is really useful for making stable APIs. Another interesting property of newtype is that you can write new instances for MaybeInt which are different from Maybe's built-in ones. For instance, you could override the Monoid instance
instance Monoid MaybeInt where
mempty = MI Nothing
mi `mappend` (MI Nothing) = mi
_ `mappend` mi = mi
which is just the same as the Last a newtype built-in to Data.Monoid which wraps Maybe as.
Finally, we get the full-blown data instance. It's more verbose, more likely to error, marginally slower (since the compiler has to track a new, unique data type), and requires that people learn new constructors. For functionality so obviously identical to Maybe Int there's really no reason to use it at all.

Resources