Constraints for instances of typeclasses not mentioning the constrained type variable - haskell

Let's say we're writing a simple bitset that uses a field of type c (think Word) to signify the presence of different values of type a:
data BitSet c a = BitSet c
deriving (Eq, Ord, Data, Typeable, Generic, NFData)
isSet :: (Bits c, Enum a) => a -> BitSet c a -> Bool
isSet e (BitSet c) = testBit c $ fromEnum e
a is assumed to be Enum so that it could be converted to an integer corresponding to the bit index.
Let's now try writing a Foldable instance for BitSet c, starting with the following:
instance FiniteBits c => Foldable (BitSet c) where
foldr f b0 (BitSet c) = foldr f b0 list
where (list :: [a]) = toEnum <$> filter (testBit c) [0..finiteBitSize c - 1]
But this won't work: it doesn't have the constraint that a should satisfy Enum a, and I don't see any way to provide it (and I have a gut feel that it doesn't make much sense from more theoretical point of view if a isn't listed in the instance declaration, but that's probably another story).
The most voted answer here suggests that using GADTs might help, but GADTs have their own shortcomings:
One would need to use standalone deriving, which gets particularly ugly for Data and Typeable, and, what's worse,
GADTs seem to break derivation of Generic for this type at all.
Is there any other way to write a Foldable instance while still keeping autoderived Generic instance?

Related

True isomorphisms in Haskell

Are the following assertions true:
The only real isomorphism, accessible programatically to the user, verified by Haskell type system, and that the Haskell compiler is/can be made aware of, is between:
the set of values of a Haskell datatype
the set of values of types those required by its constructors
Even generic programming can't produce "true" isomorphism, whose composition results at run time in an identity (thus staged-sop - and similarly in Ocaml)
Haskell itself is the only producing isomorphism, Coercible, but those isomorphism are restricted to the identity isomorphism
By "real isomorphism, accessible programatically to the user, verified by Haskell type system, and that the Haskell compiler is/can be made aware of" I mean a pair of function u : a -> b and v : b -> a such that Haskell knows (by being informed or otherwise) that u.v=id and v.u=id. Just like it knows (at compile time) how to rewrite some code to do "fold fusion", which is akin to, at once, recognize and apply it.
Look into Homotopy Type Theory/Cubical Agda where an "equality is isomorphism". I am not familiar enough with it to know what happens operationally, even if Agda knows isomorphic types are equal I still think your "true isomorphism" (i.e. with a proof and fusion) is too tall of an order.
In GHC it is possible to derive via "isomorphisms" but we need to wait for dependent types to properly verify isomorphisms in the type system. Even so they can be used to produce bone fide code even if you have to do some work operationally.
You already mentioned "representational equality" (Coercible) but it is worth discussing it. It underpins the two coerce-based deriving strategies: GeneralizedNewtypeDeriving and DerivingVia which generalizes GND.
GND is the simplest way to turn an isomorphism (Coercible USD Int) into code:
type USD :: Type
newtype USD = MkUSD Int
deriving
newtype (Eq, Ord, Show, Num)
Operationally coerce is zero-cost at so they incur no cost at run-time. This is the only way you will get what you want in Haskell.
Isomorphisms can also be done through user-defined type classes.
An instance of Representable f means f is (naturally) isomorphic to functions from its representing object (Rep f ->). The newtype Co uses this isomorphism to derive function instances for representable functor. A Pair a of two values is represented by Bool, and is thus isomorphic to Bool -> a.
This isomorphism lets Pair derive Functor, Applicative and Monad by roundtripping through (Bool ->):
type Pair :: Type -> Type
data Pair a = a :# a
deriving (Functor, Applicative, Monad)
via Co Pair
instance Distributive Pair where
distribute :: Functor f => f (Pair a) -> Pair (f a)
distribute = distributeRep
instance Representable Pair where
type Rep Pair = Bool
index :: Pair a -> (Bool -> a)
index (false :# true) = \case
False -> false
True -> true
tabulate :: (Bool -> a) -> Pair a
tabulate make = make False :# make True
When you derive Generic/Generic1 the compiler generates an isomorphism between a generic type and its generic representation Rep/Rep1 (not to be confused with the representing object Rep from the above example).
The class laws state that to/from and to1/from1 witness that isomorphism. The type system does not enforce these laws but if you derive them they should hold.
They are the main way to define generic implementations in Haskell. I recently introduced two newtypes Generically and Generically1 to base, as standard names for generic behaviour (use generic-data until the next GHC release). You can derive a generic isomorphism and programmatically use it in the next line without leaving the data declaration:
type Lists :: Type -> Type
data Lists a = Lists [a] [a] [a]
deriving
stock (Show, Generic, Generic1)
deriving (Semigroup, Monoid)
via Generically (Lists a)
deriving (Functor, Applicative, Alternative)
via Generically1 Lists
>> mempty #(Lists _)
Lists [] [] []
>> empty #Lists
Lists [] [] []
>> Lists "a" "b" "c" <> Lists "!" "." "?"
Lists "a!" "b." "c?"
>> pure #Lists 'a'
Lists "a" "a" "a"
You will however have to pay for the converstion cost, it's not as simple as adding {-# Rules "to/from" to . from = id #-} because the actual instances will appear with intermediate terms like to (from a <> from b). Even your "true isomorphisms" GHC could not fuse away the conversion since it's not of the form to . from.
There is also a library iso-deriving (blog) that allows deriving via arbitrary isomorphisms.

Type Constraint in Constructor [duplicate]

This question already has answers here:
Type Constraints in Data Declaration Haskell
(3 answers)
Closed 3 years ago.
I am trying to make a normal form game solver for game theory, and I'm trying to make it as generic as possible for good practice and for my own convenience. I would like to use the same functions to solve both zero-sum and non-zero-sum games, so I am using the following data type:
data Payoffs = (Num a, Eq a, Ord a) => ZS a
| (Num a, Eq a, Ord a) => NZS (a,a)
However, this is not correct syntax. Is there any way to constrain a so that it must satisfy those type constraints?
Short answer (and probably not the one you need):
To make your code work as is, you need a forall quantifier (for which you need to enable ExistentialQuantification):
{-# LANGUAGE ExistentialQuantification #-}
data Payoffs =
forall a. (Num a, Eq a, Ord a) => ZS a
| forall a. (Num a, Eq a, Ord a) => NZS (a,a)
If you have a type variable in the data constructor (i.e. ZS a), then you have two choices: either that variable has to appear in the type constructor (i.e. data Payoffs a =), or you need to say "I don't care what type it is, as long as it supports these classes" - which is achieved via the forall quantifier.
But this looks kinda useless to me, which suggests that you may be misunderstanding what it means. If you write the above code, every value of your Payoffs type will be able to wrap a value of any type, as long as that type supports Num, Eq, and Ord. One subtle consequence of this is that, if you have two values of Payoffs lying around, they will not necessarily wrap the same type. For example:
let x = ZS (42 :: Int) -- wraps an Int
let y = NZS (2.71 :: Double, 3.14) -- wraps two Doubles
This means that, upon unpacking them, you won't be able to, for example, add them together, because, even though they both implement Num, the compiler doesn't have any proof that they're actually the same type.
What I suspect you actually need is a parametrized type, like this:
data Payoffs a = ZS a | NZS (a, a)
But then, of course, you lose the constraints: anybody can go and create ZS String or something. You can use the GADT syntax (with the GADTs extension) to bring them back:
{-# LANGUAGE GADTs #-}
data Payoffs a where
ZS :: (Num a, Ord a, Eq a) => a -> Payoffs a
NZS :: (Num a, Ord a, Eq a) => (a, a) -> Payoffs a
This notation is equivalent to ZS a | NZS (a, a), except you get to define each constructor with the same syntax as any function - including constraints. A type defined like this won't allow for creating values of type Payoffs a unless a satisfies the constraints.
At the same time, if you have a value of a type like this lying around, you know what type it wraps inside. And this allows you to tell if two Payoffs values wrap the same type or different. And then, if you know that they're the same, you can do things with them using the supported classes, for example:
addPayoffs :: Payoffs a -> Payoffs a -> Payoffs a
addPayoffs (ZS a) (ZS b) = ZS (a + b)
addPayoffs (ZS a) (NZS (x,y)) = NZS (a+x, a+y)
... etc.

Data families vs Injective type families

Now that we have injective type families, is there any remaining use case for using data families over type families?
Looking at past StackOverflow questions about data families, there is this question from a couple years ago discussing the difference between type families and data families, and this answer about use cases of data families. Both say that the injectivity of data families is their greatest strength.
Looking at the docs on data families, I see reason not to rewrite all uses of data families using injective type families.
For example, say I have a data family (I've merged some examples from the docs to try to squeeze in all the features of data families)
data family G a b
data instance G Int Bool = G11 Int | G12 Bool deriving (Eq)
newtype instance G () a = G21 a
data instance G [a] b where
G31 :: c -> G [Int] b
G32 :: G [a] Bool
I might as well rewrite it as
type family G a b = g | g -> a b
type instance G Int Bool = G_Int_Bool
type instance G () a = G_Unit_a a
type instance G [a] b = G_lal_b a b
data G_Int_Bool = G11 Int | G12 Bool deriving (Eq)
newtype G_Unit_a a = G21 a
data G_lal_b a b where
G31 :: c -> G_lal_b [Int] b
G32 :: G_lal_b [a] Bool
It goes without saying that associated instances for data families correspond to associated instances with type families in the same way. Then is the only remaining difference that we have less things in the type-namespace?
As a followup, is there any benefit to having less things in the type-namespace? All I can think of is that this will become debugging hell for someone playing with this on ghci - the types of the constructors all seem to indicate that the constructors are all under one GADT...
type family T a = r | r -> a
data family D a
An injective type family T satisfies the injectivity axiom
if T a ~ T b then a ~ b
But a data family satisfies the much stronger generativity axiom
if D a ~ g b then D ~ g and a ~ b
(If you like: Because the instances of D define new types that are different from any existing types.)
In fact D itself is a legitimate type in the type system, unlike a type family like T, which can only ever appear in a fully saturated application like T a. This means
D can be the argument to another type constructor, like MaybeT D. (MaybeT T is illegal.)
You can define instances for D, like instance Functor D. (You can't define instances for a type family Functor T, and it would be unusable anyway because instance selection for, e.g., map :: Functor f => (a -> b) -> f a -> f b relies on the fact that from the type f a you can determine both f and a; for this to work f cannot be allowed to vary over type families, even injective ones.)
You're missing one other detail - data families create new types. Type families can only refer to other types. In particular, every instance of a data family declares new constructors. And it's nicely generic. You can create a data instance with newtype instance if you want newtype semantics. Your instance can be a record. It can have multiple constructors. It can even be a GADT if you want.
It's exactly the difference between the type and data/newtype keywords. Injective type families don't give you new types, rendering them useless in the case where you need that.
I understand where you're coming from. I had this same issue with the difference initially. Then I finally ran into a use case where they're useful, even without a type class getting involved.
I wanted to write an api for dealing with mutable cells in a few different contexts, without using classes. I knew I wanted to do it with a free monad with interpreters in IO, ST, and maybe some horrible hacks with unsafeCoerce to even go so far as shoehorning it into State. This wasn't for any practical purpose, of course - I was just exploring API designs.
So I had something like this:
data MutableEnv (s :: k) a ...
newRef :: a -> MutableEnv s (Ref s a)
readRef :: Ref s a -> MutableEnv s a
writeRef :: Ref s a -> a -> MutableEnv s ()
The definition of MutableEnv wasn't important. Just standard free/operational monad stuff with constructors matching the three functions in the api.
But I was stuck on what to define Ref as. I didn't want some sort of class, I wanted it to be a concrete type as far as the type system was concerned.
Then late one night I was out for a walk and it hit me - what I essentially want is a type whose constructors are indexed by an argument type. But it had to be open, unlike a GADT - new interpreters could be added at will. And then it hit me. That's exactly what a data family is. An open, type-indexed family of data values. I could complete the api with just the following:
data family Ref (s :: k) :: * -> *
Then, dealing with the underlying representation for a Ref was no big deal. Just create a data instance (or newtype instance, more likely) whenever an interpreter for MutableEnv is defined.
This exact example isn't really useful. But it clearly illustrates something data families can do that injective type families can't.
The answer by Reid Barton explains the distinction between my two examples perfectly. It has reminded me of something I read in Richard Eisenberg's thesis about adding dependent types to Haskell and I thought that since the heart of this question is injectivity and generativity, it would be worth mentioning how DependentHaskell will deal with this (when it eventually gets implemented, and if the quantifiers proposed now are the ones eventually implemented).
What follows is based on pages 56 and 57 (4.3.4 Matchability) of the aforementioned thesis:
Definition (Generativity). If f and g are generative, then f a ~ g b implies f ~ g
Definition (Injectivity). If f is injective, then f a ~ f b implies a ~ b
Definition (Matchability). A function f is matchable iff it is generative and injective
In Haskell as we know it now (8.0.1) the matchable (type-level) functions consist exactly of newtype, data, and data family type constructors. In the future, under DependentHaskell, one of the new quantifiers we will get will be '-> and this will be used to denote matchable functions. In other words, there will be a way to inform the compiler a type-level function is generative (which currently can only be done by making sure that function is a type constructor).

Convert from type `T a` to `T b` without boilerplate

So, I have an AST data type with a large number of cases, which is parameterized by an "annotation" type
data Expr a = Plus a Int Int
| ...
| Times a Int Int
I have annotation types S and T, and some function f :: S -> T. I want to take an Expr S and convert it to an Expr T using my conversion f on each S which occurs within an Expr value.
Is there a way to do this using SYB or generics and avoid having to pattern match on every case? It seems like the type of thing that this is suited for. I just am not familiar enough with SYB to know the specific way to do it.
It sounds like you want a Functor instance. This can be automatically derived by GHC using the DeriveFunctor extension.
Based on your follow-up question, it seems that a generics library is more appropriate to your situation than Functor. I'd recommend just using the function given on SYB's wiki page:
{-# LANGUAGE DeriveDataTypeable, ScopedTypeVariables, FlexibleContexts #-}
import Data.Generics
import Unsafe.Coerce
newtype C a = C a deriving (Data,Typeable)
fmapData :: forall t a b. (Typeable a, Data (t (C a)), Data (t a)) =>
(a -> b) -> t a -> t b
fmapData f input = uc . everywhere (mkT $ \(x::C a) -> uc (f (uc x)))
$ (uc input :: t (C a))
where uc = unsafeCoerce
The reason for the extra C type is to avoid a problematic corner case where there are occurrences of fields at the same type as a (more details on the wiki). The caller of fmapData doesn't need to ever see it.
This function does have a few extra requirements compared to the real fmap: there must be instances of Typeable for a, and Data for t a. In your case t a is Expr a, which means that you'll need to add a deriving Data to the definition of Expr, as well as have a Data instance in scope for whatever a you're using.

Use of 'unsafeCoerce'

In Haskell, there is a function called unsafeCoerce, that turns anything into any other type of thing. What exactly is this used for? Like, why we would you want to transform things into each other in such an "unsafe" way?
Provide an example of a way that unsafeCoerce is actually used. A link to Hackage would help. Example code in someones question would not.
unsafeCoerce lets you convince the type system of whatever property you like. It's thus only "safe" exactly when you can be completely certain that the property you're declaring is true. So, for instance:
unsafeCoerce True :: Int
is a violation and can lead to wonky, bad runtime behavior.
unsafeCoerce (3 :: Int) :: Int
is (obviously) fine and will not lead to runtime misbehavior.
So what's a non-trivial use of unsafeCoerce? Let's say we've got an typeclass-bound existential type
module MyClass ( SomethingMyClass (..), intSomething ) where
class MyClass x where {}
instance MyClass Int where {}
data SomethingMyClass = forall a. MyClass a => SomethingMyClass a
Let's also say, as noted here, that the typeclass MyClass is not exported and thus nobody else can ever create instances of it. Indeed, Int is the only thing that instantiates it and the only thing that ever will.
Now when we pattern match to destruct a value of SomethingMyClass we'll be able to pull a "something" out from inside
foo :: SomethingMyClass -> ...
foo (SomethingMyClass a) =
-- here we have a value `a` with type `exists a . MyClass a => a`
--
-- this is totally useless since `MyClass` doesn't even have any
-- methods for us to use!
...
Now, at this point, as the comment suggests, the value we've pulled out has no type information—it's been "forgotten" by the existential context. It could be absolutely anything which instantiates MyClass.
Of course, in this very particular situation we know that the only thing implementing MyClass is Int. So our value a must actually have type Int. We could never convince the typechecker that this is true, but due to an outside proof we know that it is.
Therefore, we can (very carefully)
intSomething :: SomethingMyClass -> Int
intSomething (SomethingMyClass a) = unsafeCoerce a -- shudder!
Now, hopefully I've suggested that this is a terrible, dangerous idea, but it also may give a taste of what kind of information we can take advantage of in order to know things that the typechecker cannot.
In non-pathological situations, this is rare. Even rarer is a situation where using something we know and the typechecker doesn't isn't itself pathological. In the above example, we must be completely certain that nobody ever extends our MyClass module to instantiate more types to MyClass otherwise our use of unsafeCoerce becomes instantly unsafe.
> instance MyClass Bool where {}
> intSomething (SomethingMyClass True)
6917529027658597398
Looks like our compiler internals are leaking!
A more common example where this sort of behavior might be valuable is when using newtype wrappers. It's a fairly common idea that we might wrap a type in a newtype wrapper in order to specialize its instance definitions.
For example, Int does not have a Monoid definition because there are two natural monoids over Ints: sums and products. Instead, we use newtype wrappers to be more explicit.
newtype Sum a = Sum { getSum :: a }
instance Num a => Monoid (Sum a) where
mempty = Sum 0
mappend (Sum a) (Sum b) = Sum (a+b)
Now, normally the compiler is pretty smart and recognizes that it can eliminate all of those Sum constructors in order to produce more efficient code. Sadly, there are times when it cannot, especially in highly polymorphic situations.
If you (a) know that some type a is actually just a newtype-wrapped b and (b) know that the compiler is incapable of deducing this itself, then you might want to do
unsafeCoerce (x :: a) :: b
for a slight efficiency gain. This, for instance, occurs frequently in lens and is expressed in the Data.Profunctor.Unsafe module of profunctors, a dependency of lens.
But let me again suggest that you really need to know what's going on before using unsafeCoerce like this is anything but highly unsafe.
One final thing to compare is the "typesafe cast" available in Data.Typeable. This function looks a bit like unsafeCoerce, but with much more ceremony.
unsafeCoerce :: a -> b
cast :: (Typeable a, Typeable b) => a -> Maybe b
Which, you might think of as being implemented using unsafeCoerce and a function typeOf :: Typeable a => a -> TypeRep where TypeRep are unforgeable, runtime tokens which reflect the type of a value. Then we have
cast :: (Typeable a, Typeable b) => a -> Maybe b
cast a = if (typeOf a == typeOf b) then Just b else Nothing
where b = unsafeCoerce a
Thus, cast is able to ensure that the types of a and b really are the same at runtime, and it can decide to return Nothing if they are not. As an example:
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE ExistentialQuantification #-}
data A = A deriving (Show, Typeable)
data B = B deriving (Show, Typeable)
data Forget = forall a . Typeable a => Forget a
getAnA :: Forget -> Maybe A
getAnA (Forget something) = cast something
which we can run as follows
> getAnA (Forget A)
Just A
> getAnA (Forget B)
Nothing
So if we compare this usage of cast with unsafeCoerce we see that it can achieve some of the same functionality. In particular, it allows us to rediscover information that may have been forgotten by ExistentialQuantification. However, cast manually checks the types at runtime to ensure that they are truly the same and thus cannot be used unsafely. To do this, it demands that both the source and target types allow for runtime reflection of their types via the Typeable class.
The only time I ever felt compelled to use unsafeCoerce was on finite natural numbers.
{-# LANGUAGE DataKinds, GADTs, TypeFamilies, StandaloneDeriving #-}
data Nat = Z | S Nat deriving (Eq, Show)
data Fin (n :: Nat) :: * where
FZ :: Fin (S n)
FS :: Fin n -> Fin (S n)
deriving instance Show (Fin n)
Fin n is a singly linked data structure that is statically ensured to be smaller than the n type level natural number by which it is parametrized.
-- OK, 1 < 2
validFin :: Fin (S (S Z))
validFin = FS FZ
-- type error, 2 < 2 is false
invalidFin :: Fin (S (S Z))
invalidFin = FS (FS FZ)
Fin can be used to safely index into various data structures. It's pretty standard in dependently typed languages, though not in Haskell.
Sometimes we want to convert a value of Fin n to Fin m where m is greater than n.
relaxFin :: Fin n -> Fin (S n)
relaxFin FZ = FZ
relaxFin (FS n) = FS (relaxFin n)
relaxFin is a no-op by definition, but traversing the value is still required for the types to check out. So we might just use unsafeCoerce instead of relaxFin. More pronounced gains in speed can result from coercing larger data structures that contain Fin-s (for example, you could have lambda terms with Fin-s as bound variables).
This is an admittedly exotic example, but I find it interesting in the sense that it's pretty safe: I can't really think of ways for external libraries or safe user code to mess this up. I might be wrong though and I'd be eager to hear about potential safety issues.
There is no use of unsafeCoerce I can really recommend, but I can see that in some cases such a thing might be useful.
The first use that springs to mind is the implementation of the Typeable-related routines. In particular cast :: (Typeable a, Typeable b) => a -> Maybe b achieves a type-safe behaviour, so it is safe to use, yet it has to play dirty tricks in its implementation.
Maybe unsafeCoerce can find some use when importing FFI subroutines to force types to match. After all, FFI already allows to import impure C functions as pure ones, so it is intrinsecally usafe. Note that "unsafe" does not mean impossible to use, but just "putting the burden of proof on the programmer".
Finally, pretend that sortBy did not exist. Consider then this example:
-- Like Int, but using the opposite ordering
newtype Rev = Rev { unRev :: Int }
instance Ord Rev where compare (Rev x) (Rev y) = compare y x
sortDescending :: [Int] -> [Int]
sortDescending = map unRev . sort . map Rev
The code above works, but feels silly IMHO. We perform two maps using functions such as Rev,unRev which we know to be no-ops at runtime. So we just scan the list twice for no reason, but that of convincing the compiler to use the right Ord instance.
The performance impact of these maps should be small since we also sort the list. Yet it is tempting to rewrite map Rev as unsafeCoerce :: [Int]->[Rev] and save some time.
Note that having a coercing function
castNewtype :: IsNewtype t1 t2 => f t2 -> f t1
where the constraint means that t1 is a newtype for t2 would help, but it would be quite dangerous. Consider
castNewtype :: Data.Set Int -> Data.Set Rev
The above would cause the data structure invariant to break, since we are changing the ordering underneath! Since Data.Set is implemented as a binary search tree, it would cause quite a large damage.

Resources