How to create an instance of Arbitrary for parametric types in Haskell

How to create an instance of Arbitrary for parametric types in Haskell - haskell

I'm following haskellbook.com and there is an exercise for QuickCheck, long history short I can't figure out how to implement a instance for arbitrary for my type because it has a parametric type
Here is the code
module First where
import Test.QuickCheck
-- I have this type that has a type parameter, it's a Maybe like
data Optional a = Some a | None deriving (Show, Eq)
-- I want to check that this is monoidal, but that is not the problem yet
newtype First' a =
First' { getFirst' :: Optional a }
deriving (Eq, Show)
-- Here I have an undefined. I simply cant do `instance Arbitrary (First String)`
-- And I cant place a concrete type on the undefined place. How can I implement
-- this type class for First' a?
instance (Arbitrary a) => Arbitrary (First' a) where
arbitrary = frequency [ (1, return (First' (Some undefined))) --- how I get rid of this undefined ?????
, (1, return (First' None))
]
I would like to have something like Some "foo" in place of Some undefined, but I cant
make a a concrete type, I'm struggling with this for some hours and just cant
come up with a solution.

What you want to do is use the Arbitrary instance for a. I can tell you know you need to do this because you already added Arbitrary a as a constraint to the instance, but you need to actually use it. For instance:
instance Arbitrary a => Arbitrary (First' a) where
arbitrary = frequency [ (1, First' . Some <$> arbitrary) -- here we have `arbitrary :: Gen a`
, (1, return (First' None))
]
But really, you should go one step further. Rather than just making an instance for First' a, you can first make an instance for Optional a that will make your First' instance even easier. Consider:
instance Arbitrary a => Arbitrary (Optional a) where
arbitrary = oneof [ Some <$> arbitrary -- here we have `arbitrary :: Gen a`
, return None
]
instance Arbitrary a => Arbitrary (First' a) where
arbitrary = First' <$> arbitrary -- this one is `arbitrary :: Gen (Optional a)`
(Note that oneof is like frequency where all the frequency numbers are the same.)

Related

Set specific properties for data in Haskell

Let us say I want to make a ADT as follows in Haskell:
data Properties = Property String [String]
deriving (Show,Eq)
I want to know if it is possible to give the second list a bounded and enumerated property? Basically the first element of the list will be the minBound and the last element will be the maxBound. I am trying,
data Properties a = Property String [a]
deriving (Show, Eq)
instance Bounded (Properties a) where
minBound a = head a
maxBound a = (head . reverse) a
But not having much luck.

Well no, you can't do quite what you're asking, but maybe you'll find inspiration in this other neat trick.
{-# language ScopedTypeVariables, FlexibleContexts, UndecidableInstances #-}
import Data.Reflection -- from the reflection package
import qualified Data.List.NonEmpty as NE
import Data.List.NonEmpty (NonEmpty (..))
import Data.Proxy
-- Just the plain string part
newtype Pstring p = P String deriving Eq
-- Those properties you're interested in. It will
-- only be possible to produce bounds if there's at
-- least one property, so NonEmpty makes more sense
-- than [].
type Props = NonEmpty String
-- This is just to make a Show instance that does
-- what you seem to want easier to write. It's not really
-- necessary.
data Properties = Property String [String] deriving Show
Now we get to the key part, where we use reflection to produce class instances that can depend on run-time values. Roughly speaking, you can think of
Reifies x t => ...
as being a class-level version of
\(x :: t) -> ...
Because it operates at the class level, you can use it to parametrize instances. Since Reifies x t binds a type variable x, rather than a term variable, you need to use reflect to actually get the value back. If you happen to have a value on hand whose type ends in p, then you can just apply reflect to that value. Otherwise, you can always magic up a Proxy :: Proxy p to do the job.
-- If some Props are "in the air" tied to the type p,
-- then we can show them along with the string.
instance Reifies p Props => Show (Pstring p) where
showsPrec k p#(P str) =
showsPrec k $ Property str (NE.toList $ reflect p)
-- If some Props are "in the air" tied to the type p,
-- then we can give Pstring p a Bounded instance.
instance Reifies p Props => Bounded (Pstring p) where
minBound = P $ NE.head (reflect (Proxy :: Proxy p))
maxBound = P $ NE.last (reflect (Proxy :: Proxy p))
Now we need to have a way to actually bind types that can be passed to the type-level lambdas. This is done using the reify function. So let's throw some Props into the air and then let the butterfly nets get them back.
main :: IO ()
main = reify ("Hi" :| ["how", "are", "you"]) $
\(_ :: Proxy p) -> do
print (minBound :: Pstring p)
print (maxBound :: Pstring p)
./dfeuer#squirrel:~/src> ./WeirdBounded
Property "Hi" ["Hi","how","are","you"]
Property "you" ["Hi","how","are","you"]
You can think of reify x $ \(p :: Proxy p) -> ... as binding a type p to the value x; you can then pass the type p where you like by constraining things to have types involving p.
If you're just doing a couple of things, all this machinery is way more than necessary. Where it gets nice is when you're performing lots of operations with values that have phantom types carrying extra information. In many cases, you can avoid most of the explicit applications of reflect and the explicit proxy handling, because type inference just takes care of it all for you. For a good example of this technique in action, see the hyperloglog package. Configuration information for the HyperLogLog data structure is carried in a type parameter; this guarantees, at compile time, that only similarly configured structures are merged with each other.

Default to a typeclass when a data type does not instantiate it [duplicate]

What I'd like to achieve is that any instance of the following class (SampleSpace) should automatically be an instance of Show, because SampleSpace contains the whole interface necessary to create a String representation and hence all possible instances of the class would be virtually identical.
{-# LANGUAGE FlexibleInstances #-}
import Data.Ratio (Rational)
class SampleSpace space where
events :: Ord a => space a -> [a]
member :: Ord a => a -> space a -> Bool
probability :: Ord a => a -> space a -> Rational
instance (Ord a, Show a, SampleSpace s) => Show (s a) where
show s = showLines $ events s
where
showLines [] = ""
showLines (e:es) = show e ++ ": " ++ (show $ probability e s)
++ "\n" ++ showLines es
Since, as I found out already, while matching instance declarations GHC only looks at the head, and not at contraints, and so it believes Show (s a) is about Rational as well:
[1 of 1] Compiling Helpers.Probability ( Helpers/Probability.hs, interpreted )
Helpers/Probability.hs:21:49:
Overlapping instances for Show Rational
arising from a use of ‘show’
Matching instances:
instance (Integral a, Show a) => Show (GHC.Real.Ratio a)
-- Defined in ‘GHC.Real’
instance (Ord a, Show a, SampleSpace s) => Show (s a)
-- Defined at Helpers/Probability.hs:17:10
In the expression: show
In the first argument of ‘(++)’, namely ‘(show $ probability e s)’
In the second argument of ‘(++)’, namely
‘(show $ probability e s) ++ "" ++ showLines es
Question: is it possible (otherwise than by enabling overlapping instances) to make any instance of a typeclass automatically an instance of another too?

tl;dr: don't do that, or, if you insist, use -XOverlappingInstances.
This is not what the Show class is there for. Show is for simply showing plain data, in a way that is actually Haskell code and can be used as such again, yielding the original value.
SampleSpace should perhaps not be a class in the first place. It seems to be basically the class of types that have something like Map a Rational associated with them. Why not just use that as a field in a plain data type?
Even if we accept the design... such a generic Show instance (or, indeed, generic instance for any single-parameter class) runs into problems when someone makes another instance for a concrete type – in the case of Show, there are of course already plenty of instances around. Then how should the compiler decide which of the two instances to use? GHC can do it, in fact: if you turn on the -XOverlappingInstances extension, it will select the more specific one (i.e. instance SampleSpace s => Show (s a) is “overridden” by any more specific instance), but really this isn't as trivial as may seem – what if somebody defined another such generic instance? Crucial to recall: Haskell type classes are always open, i.e. basically the compiler has to assume that all types could possibly in any class. Only when a specific instance is invoke will it actually need the proof for that, but it can never proove that a type isn't in some class.
What I'd recommend instead – since that Show instance doesn't merely show data, it should be made a different function. Either
showDistribution :: (SampleSpace s, Show a, Ord a) => s a -> String
or indeed
showDistribution :: (Show a, Ord a) => SampleSpace a -> String
where SampleSpace is a single concrete type, instead of a class.

Make a typeclass instance automatically an instance of another

What I'd like to achieve is that any instance of the following class (SampleSpace) should automatically be an instance of Show, because SampleSpace contains the whole interface necessary to create a String representation and hence all possible instances of the class would be virtually identical.
{-# LANGUAGE FlexibleInstances #-}
import Data.Ratio (Rational)
class SampleSpace space where
events :: Ord a => space a -> [a]
member :: Ord a => a -> space a -> Bool
probability :: Ord a => a -> space a -> Rational
instance (Ord a, Show a, SampleSpace s) => Show (s a) where
show s = showLines $ events s
where
showLines [] = ""
showLines (e:es) = show e ++ ": " ++ (show $ probability e s)
++ "\n" ++ showLines es
Since, as I found out already, while matching instance declarations GHC only looks at the head, and not at contraints, and so it believes Show (s a) is about Rational as well:
[1 of 1] Compiling Helpers.Probability ( Helpers/Probability.hs, interpreted )
Helpers/Probability.hs:21:49:
Overlapping instances for Show Rational
arising from a use of ‘show’
Matching instances:
instance (Integral a, Show a) => Show (GHC.Real.Ratio a)
-- Defined in ‘GHC.Real’
instance (Ord a, Show a, SampleSpace s) => Show (s a)
-- Defined at Helpers/Probability.hs:17:10
In the expression: show
In the first argument of ‘(++)’, namely ‘(show $ probability e s)’
In the second argument of ‘(++)’, namely
‘(show $ probability e s) ++ "" ++ showLines es
Question: is it possible (otherwise than by enabling overlapping instances) to make any instance of a typeclass automatically an instance of another too?

tl;dr: don't do that, or, if you insist, use -XOverlappingInstances.
This is not what the Show class is there for. Show is for simply showing plain data, in a way that is actually Haskell code and can be used as such again, yielding the original value.
SampleSpace should perhaps not be a class in the first place. It seems to be basically the class of types that have something like Map a Rational associated with them. Why not just use that as a field in a plain data type?
Even if we accept the design... such a generic Show instance (or, indeed, generic instance for any single-parameter class) runs into problems when someone makes another instance for a concrete type – in the case of Show, there are of course already plenty of instances around. Then how should the compiler decide which of the two instances to use? GHC can do it, in fact: if you turn on the -XOverlappingInstances extension, it will select the more specific one (i.e. instance SampleSpace s => Show (s a) is “overridden” by any more specific instance), but really this isn't as trivial as may seem – what if somebody defined another such generic instance? Crucial to recall: Haskell type classes are always open, i.e. basically the compiler has to assume that all types could possibly in any class. Only when a specific instance is invoke will it actually need the proof for that, but it can never proove that a type isn't in some class.
What I'd recommend instead – since that Show instance doesn't merely show data, it should be made a different function. Either
showDistribution :: (SampleSpace s, Show a, Ord a) => s a -> String
or indeed
showDistribution :: (Show a, Ord a) => SampleSpace a -> String
where SampleSpace is a single concrete type, instead of a class.

Why constraints on data are a bad thing?

I know this question has been asked and answered lots of times but I still don't really understand why putting constraints on a data type is a bad thing.
For example, let's take Data.Map k a. All of the useful functions involving a Map need an Ord k constraint. So there is an implicit constraint on the definition of Data.Map. Why is it better to keep it implicit instead of letting the compiler and programmers know that Data.Map needs an orderable key.
Also, specifying a final type in a type declaration is something common, and one can see it as a way of "super" constraining a data type.
For example, I can write
data User = User { name :: String }
and that's acceptable. However is that not a constrained version of
data User' s = User' { name :: s }
After all 99% of the functions I'll write for the User type don't need a String and the few which will would probably only need s to be IsString and Show.
So, why is the lax version of User considered bad:
data (IsString s, Show s, ...) => User'' { name :: s }
while both User and User' are considered good?
I'm asking this, because lots of the time, I feel I'm unnecessarily narrowing my data (or even function) definitions, just to not have to propagate constraints.
Update
As far as I understand, data type constraints only apply to the constructor and don't propagate. So my question is then, why do data type constraints not work as expected (and propagate)? It's an extension anyway, so why not have a new extension doing data properly, if it was considered useful by the community?

TL;DR:
Use GADTs to provide implicit data contexts.
Don't use any kind of data constraint if you could do with Functor instances etc.
Map's too old to change to a GADT anyway.
Scroll to the bottom if you want to see the User implementation with GADTs
Let's use a case study of a Bag where all we care about is how many times something is in it. (Like an unordered sequence. We nearly always need an Eq constraint to do anything useful with it.
I'll use the inefficient list implementation so as not to muddy the waters over the Data.Map issue.
GADTs - the solution to the data constraint "problem"
The easy way to do what you're after is to use a GADT:
Notice below how the Eq constraint not only forces you to use types with an Eq instance when making GADTBags, it provides that instance implicitly wherever the GADTBag constructor appears. That's why count doesn't need an Eq context, whereas countV2 does - it doesn't use the constructor:
{-# LANGUAGE GADTs #-}
data GADTBag a where
GADTBag :: Eq a => [a] -> GADTBag a
unGADTBag (GADTBag xs) = xs
instance Show a => Show (GADTBag a) where
showsPrec i (GADTBag xs) = showParen (i>9) (("GADTBag " ++ show xs) ++)
count :: a -> GADTBag a -> Int -- no Eq here
count a (GADTBag xs) = length.filter (==a) $ xs -- but == here
countV2 a = length.filter (==a).unGADTBag
size :: GADTBag a -> Int
size (GADTBag xs) = length xs
ghci> count 'l' (GADTBag "Hello")
2
ghci> :t countV2
countV2 :: Eq a => a -> GADTBag a -> Int
Now we didn't need the Eq constraint when we found the total size of the bag, but it didn't clutter up our definition anyway. (We could have used size = length . unGADTBag just as well.)
Now lets make a functor:
instance Functor GADTBag where
fmap f (GADTBag xs) = GADTBag (map f xs)
oops!
DataConstraints_so.lhs:49:30:
Could not deduce (Eq b) arising from a use of `GADTBag'
from the context (Eq a)
That's unfixable (with the standard Functor class) because I can't restrict the type of fmap, but need to for the new list.
Data Constraint version
Can we do as you asked? Well, yes, except that you have to keep repeating the Eq constraint wherever you use the constructor:
{-# LANGUAGE DatatypeContexts #-}
data Eq a => EqBag a = EqBag {unEqBag :: [a]}
deriving Show
count' a (EqBag xs) = length.filter (==a) $ xs
size' (EqBag xs) = length xs -- Note: doesn't use (==) at all
Let's go to ghci to find out some less pretty things:
ghci> :so DataConstraints
DataConstraints_so.lhs:1:19: Warning:
-XDatatypeContexts is deprecated: It was widely considered a misfeature,
and has been removed from the Haskell language.
[1 of 1] Compiling Main ( DataConstraints_so.lhs, interpreted )
Ok, modules loaded: Main.
ghci> :t count
count :: a -> GADTBag a -> Int
ghci> :t count'
count' :: Eq a => a -> EqBag a -> Int
ghci> :t size
size :: GADTBag a -> Int
ghci> :t size'
size' :: Eq a => EqBag a -> Int
ghci>
So our EqBag count' function requires an Eq constraint, which I think is perfectly reasonable, but our size' function also requires one, which is less pretty. This is because the type of the EqBag constructor is EqBag :: Eq a => [a] -> EqBag a, and this constraint must be added every time.
We can't make a functor here either:
instance Functor EqBag where
fmap f (EqBag xs) = EqBag (map f xs)
for exactly the same reason as with the GADTBag
Constraintless bags
data ListBag a = ListBag {unListBag :: [a]}
deriving Show
count'' a = length . filter (==a) . unListBag
size'' = length . unListBag
instance Functor ListBag where
fmap f (ListBag xs) = ListBag (map f xs)
Now the types of count'' and show'' are exactly as we expect, and we can use standard constructor classes like Functor:
ghci> :t count''
count'' :: Eq a => a -> ListBag a -> Int
ghci> :t size''
size'' :: ListBag a -> Int
ghci> fmap (Data.Char.ord) (ListBag "hello")
ListBag {unListBag = [104,101,108,108,111]}
ghci>
Comparison and conclusions
The GADTs version automagically propogates the Eq constraint everywhere the constructor is used. The type checker can rely on there being an Eq instance, because you can't use the constructor for a non-Eq type.
The DatatypeContexts version forces the programmer to manually propogate the Eq constraint, which is fine by me if you want it, but is deprecated because it doesn't give you anything more than the GADT one does and was seen by many as pointless and annoying.
The unconstrained version is good because it doesn't prevent you from making Functor, Monad etc instances. The constraints are written exactly when they're needed, no more or less. Data.Map uses the unconstrained version partly because unconstrained is generally seen as most flexible, but also partly because it predates GADTs by some margin, and there needs to be a compelling reason to potentially break existing code.
What about your excellent User example?
I think that's a great example of a one-purpose data type that benefits from a constraint on the type, and I'd advise you to use a GADT to implement it.
(That said, sometimes I have a one-purpose data type and end up making it unconstrainedly polymorphic just because I love to use Functor (and Applicative), and would rather use fmap than mapBag because I feel it's clearer.)
{-# LANGUAGE GADTs #-}
import Data.String
data User s where
User :: (IsString s, Show s) => s -> User s
name :: User s -> s
name (User s) = s
instance Show (User s) where -- cool, no Show context
showsPrec i (User s) = showParen (i>9) (("User " ++ show s) ++)
instance (IsString s, Show s) => IsString (User s) where
fromString = User . fromString
Notice since fromString does construct a value of type User a, we need the context explicitly. After all, we composed with the constructor User :: (IsString s, Show s) => s -> User s. The User constructor removes the need for an explicit context when we pattern match (destruct), becuase it already enforced the constraint when we used it as a constructor.
We didn't need the Show context in the Show instance because we used (User s) on the left hand side in a pattern match.

Constraints
The problem is that constraints are not a property of the data type, but of the algorithm/function that operates on them. Different functions might need different and unique constraints.
A Box example
As an example, let's assume we want to create a container called Box which contains only 2 values.
data Box a = Box a a
We want it to:
be showable
allow the sorting of the two elements via sort
Does it make sense to apply the constraint of both Ord and Show on the data type? No, because the data type in itself could be only shown or only sorted and therefore the constraints are related to its use, not it's definition.
instance (Show a) => Show (Box a) where
show (Box a b) = concat ["'", show a, ", ", show b, "'"]
instance (Ord a) => Ord (Box a) where
compare (Box a b) (Box c d) =
let ca = compare a c
cb = compare b d
in if ca /= EQ then ca else cb
The Data.Map case
Data.Map's Ord constraints on the type is really needed only when we have > 1 elements in the container. Otherwise the container is usable even without an Ord key. For example, this algorithm:
transf :: Map NonOrd Int -> Map NonOrd Int
transf x =
if Map.null x
then Map.singleton NonOrdA 1
else x
Live demo
works just fine without the Ord constraint and always produce a non empty map.

Using DataTypeContexts reduces the number of programs you can write. If most of those illegal programs are nonsense, you might say it's worth the runtime cost associated with ghc passing in a type class dictionary that isn't used. For example, if we had
data Ord k => MapDTC k a
then #jefffrey's transf is rejected. But we should probably have transf _ = return (NonOrdA, 1) instead.
In some sense the context is documentation that says "every Map must have ordered keys". If you look at all of the functions in Data.Map you'll get a similar conclusion "every useful Map has ordered keys". While you can create maps with unordered keys using
mapKeysMonotonic :: (k1 -> k2) -> Map k1 a -> Map k2 a
singleton :: k2 a -> Map k2 a
But the moment you try to do anything useful with them, you'll wind up with No instance for Ord k2 somewhat later.

Generate triple (Network.HTTP.ResponseCode) using an instance of Arbitrary

I have a function that takes a ResponseCode from Network.HTTP. In order to test it with QuickCheck, I wanted to write an instance of Arbitrary for ResponseCode. (In case you don't know, a ResponseCode is just a triple of ints in that library: type ResponseCode = (Int, Int, Int)).
So I wrote something like this:
instance Arbitrary ResponseCode where
arbitrary = triple ( elements [1..6] )
where triple f = (f, f, f)
First of all, GHC complains that the way I'm using types is not standard haskell so I would have to use some compiler flags (which is not really what I want since I feel like there must be an easy solution for this simple problem without flags).
Secondly, my arbitrary function has the wrong type, which is pretty obvious. But then I really didn't figure out how to write a function that returns a triple with random Ints ranging from 1-6.
I would appreciate if somebody could help me out here.
Thank you.

First of all, there is already are these two instances:
instance Arbitrary Int
instance (Arbitrary a, Arbitrary b, Arbitrary c) =>
Arbitrary (a, b, c)
This means that (Int,Int,Int) is already an instance of Arbitrary. This means that the type synonym ResponseCode is already an instance. You cannot define and use a second instance.
You could try using Test.QuickCheck.Gen.suchThat but I hypothesize that it will not work well in this case. If you can, I suggest using a newtype wrapper:
import Test.QuickCheck
import Network.HTTP.Base
import Control.Applicative
import System.Random
newtype Arb'ResponseCode = Arb'ResponseCode { arb'ResponseCode :: ResponseCode }
deriving (Show)
instance Arbitrary Arb'ResponseCode where
arbitrary = do
responseCode <- (,,) <$> f <*> f <*> f
return (Arb'ResponseCode responseCode)
where f = elements [1..6]
-- To use this you can call
-- (fmap arb'ResponseCode arbitrary)
-- instead of just (arbitrary)
Or perhaps use the built-in (Int,Int,Int) instance and post-process the three elements with (succ . (mod 6)).

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string