Is this use of GADTs fully equivalent to existential types? - haskell

Existentially quantified data constructors like
data Foo = forall a. MkFoo a (a -> Bool)
| Nil
can be easily translated to GADTs:
data Foo where
MkFoo :: a -> (a -> Bool) -> Foo
Nil :: Foo
Are there any differences between them: code which compiles with one but not another, or gives different results?

They are nearly equivalent, albeit not completely so, depending on which extensions you turn on.
First of all, note that you don't need to enable the GADTs extension to use the data .. where syntax for existential types. It suffices to enable the following lesser extensions.
{-# LANGUAGE GADTSyntax #-}
{-# LANGUAGE ExistentialQuantification #-}
With these extensions, you can compile
data U where
U :: a -> (a -> String) -> U
foo :: U -> String
foo (U x f) = f x
g x = let h y = const y x
in (h True, h 'a')
The above code also compiles if we replace the extensions and the type definition with
{-# LANGUAGE ExistentialQuantification #-}
data U = forall a . U a (a -> String)
The above code, however, does not compile with the GADTs extension turned on! This is because GADTs also turns on the MonoLocalBinds extension, which prevents the above definition of g to compile. This is because the latter extension prevents h to receive a polymorphic type.

From the documentation:
Notice that GADT-style syntax generalises existential types (Existentially quantified data constructors). For example, these two declarations are equivalent:
data Foo = forall a. MkFoo a (a -> Bool)
data Foo' where { MKFoo :: a -> (a->Bool) -> Foo' }
(emphasis on the word equivalent)

The latter isn't actually a GADT - it's an existentially quantified data type declared with GADT syntax. As such, it is identical to the former.
The reason it's not a GADT is that there is no type variable that gets refined based on the choice of constructor. That's the key new functionality added by GADTs. If you have a GADT like this:
data Foo a where
StringFoo :: String -> Foo String
IntFoo :: Int -> Foo Int
Then pattern-matching on each constructor reveals additional information that can be used inside the matching clause. For instance:
deconstructFoo :: Foo a -> a
deconstructFoo (StringFoo s) = "Hello! " ++ s ++ " is a String!"
deconstructFoo (IntFoo i) = i * 3 + 1
Notice that something very interesting is happening there, from the point of view of the type system. deconstructFoo promises it will work for any choice of a, as long as it's passed a value of type Foo a. But then the first equation returns a String, and the second equation returns an Int.
This is what you cannot do with a regular data type, and the new thing GADTs provide. In the first equation, the pattern match adds the constraint (a ~ String) to its context. In the second equation, the pattern match adds (a ~ Int).
If you haven't created a type where pattern-matching can cause type refinement, you don't have a GADT. You just have a type declared with GADT syntax. Which is fine - in a lot of ways, it's a better syntax than the basic data type syntax. It's just more verbose for the easiest cases.

Related

Parameterized Types in Haskell

Why do types in Haskell have to be explicitly parameterized in the type constructor parameter?
For example:
data Maybe a = Nothing | Just a
Here a has to be specified with the type. Why can't it be specified only in the constructor?
data Maybe = Nothing | Just a
Why did they make this choice from a design point of view? Is one better than the other?
I do understand that first is more strongly typed than the second, but there isn't even an option for the second one.
Edit :
Example function
data Maybe = Just a | Nothing
div :: (Int -> Int -> Maybe)
div a b
| b == 0 = Nothing
| otherwise = Just (a / b)
It would probably clear things up to use GADT notation, since the standard notation kind of mangles together the type- and value-level languages.
The standard Maybe type looks thus as a GADT:
{-# LANGUAGE GADTs #-}
data Maybe a where
Nothing :: Maybe a
Just :: a -> Maybe a
The “un-parameterised” version is also possible:
data EMaybe where
ENothing :: EMaybe
EJust :: a -> EMaybe
(as Joseph Sible commented, this is called an existential type). And now you can define
foo :: Maybe Int
foo = Just 37
foo' :: EMaybe
foo' = EJust 37
Great, so why don't we just use EMaybe always?
Well, the problem is when you want to use such a value. With Maybe it's fine, you have full control of the contained type:
bhrar :: Maybe Int -> String
bhrar Nothing = "No number 😞"
bhrar (Just i)
| i<0 = "Negative 😖"
| otherwise = replicate i '😌'
But what can you do with a value of type EMaybe? Not much, it turns out, because EJust contains a value of some unknown type. So whatever you try to use the value for, will be a type error, because the compiler has no way to confirm it's actually the right type.
bhrar :: EMaybe -> String
bhrar' (EJust i) = replicate i '😌'
=====> Error couldn't match expected type Int with a
If a variable is not reflected in the return type it is considered existential. This is possible to define data ExMaybe = ExNothing | forall a. ExJust a but the argument to ExJust is completely useless. ExJust True and ExJust () both have type ExMaybe and are indistinguisable from the type system's perspective.
Here is the GADT syntax for both the original Maybe and the existential ExMaybe
{-# Language GADTs #-}
{-# Language LambdaCase #-}
{-# Language PolyKinds #-}
{-# Language ScopedTypeVariables #-}
{-# Language StandaloneKindSignatures #-}
{-# Language TypeApplications #-}
import Data.Kind (Type)
import Prelude hiding (Maybe(..))
type Maybe :: Type -> Type
data Maybe a where
Nothing :: Maybe a
Just :: a -> Maybe a
type ExMaybe :: Type
data ExMaybe where
ExNothing :: ExMaybe
ExJust :: a -> ExMaybe
You're question is like asking why a function f x = .. needs to specify its argument, there is the option of making the type argument invisible but this is very odd but the argument is still there even if invisible.
-- >> :t JUST
-- JUST :: a -> MAYBE
-- >> :t JUST 'a'
-- JUST 'a' :: MAYBE
type MAYBE :: forall (a :: Type). Type
data MAYBE where
NOTHING :: MAYBE #a
JUST :: a -> MAYBE #a
mAYBE :: b -> (a -> b) -> MAYBE #a -> b
mAYBE nOTHING jUST = \case
NOTHING -> nOTHING
JUST a -> jUST a
Having explicit type parameters makes it much more expressive. You lose so much information without it. For example, how would you write the type of map? Or functors in general?
map :: (a -> b) -> [a] -> [b]
This version says almost nothing about what’s going on
map :: (a -> b) -> [] -> []
Or even worse, head:
head :: [] -> a
Now we suddenly have access to unsafe coerce and zero type safety at all.
unsafeCoerce :: a -> b
unsafeCoerce x = head [x]
But we don’t just lose safety, we also lose the ability to do some things. For example if we want to read something into a list or Maybe, we can no longer specify what kind of list we want.
read :: Read a => a
example :: [Int] -> String
main = do
xs <- getLine
putStringLine (example xs)
This program would be impossible to write without lists having an explicit type parameter. (Or rather, read would be unable to have different implementations for different list types, since content type is now opaque)
It is however, as was mentioned by others, still possible to define a similar type by using the ExistentialQuantification extension. But in those cases you are very limited in how you can use those data types, since you cannot know what they contain.

Why do Haskell's scoped type variables not allow binding of type variables in pattern bindings?

I noticed that GHC's ScopedTypeVariables is able to bind type variables in function patterns but not let patterns.
As a minimal example, consider the type
data Foo where Foo :: Typeable a => a -> Foo
If I want to gain access to the type inside a Foo, the following function does not compile:
fooType :: Foo -> TypeRep
fooType (Foo x) =
let (_ :: a) = x
in typeRep (Proxy::Proxy a)
But using this trick to move the type variable binding to a function call, it works without issue:
fooType (Foo x) =
let helper (_ :: a) = typeRep (Proxy::Proxy a)
in helper x
Since let bindings are actually function bindings in disguise, why aren't the above two code snippets equivalent?
(In this example, other solutions would be to create the TypeRep with typeOf x, or bind the variable directly as x :: a in the top-level function. Neither of those options are available in my real code, and using them doesn't answer the question.)
The big thing is, functions are case expressions in disguise, not let expressions. case matching and let matching have different semantics. This is also why you can't match a GADT constructor that does type refinement in a let expression.
The difference is that case matches evaluate the scrutinee before continuing, whereas let matches throw a thunk onto the heap that says "do this evaluation when the result is demanded". GHC doesn't know how to preserve locally-scoped types (like a in your example) across all the potential ways laziness may interact with them, so it just doesn't try. If locally-scoped types are involved, use a case expression such that laziness can't become a problem.
As for your code, ScopedTypeVariables actually provides you a far more succinct option:
{-# Language ScopedTypeVariables, GADTs #-}
import Data.Typeable
import Data.Proxy
data Foo where
Foo :: Typeable a => a -> Foo
fooType :: Foo -> TypeRep
fooType (Foo (x :: a)) = typeRep (Proxy :: Proxy a)

What can type families do that multi param type classes and functional dependencies cannot

I have played around with TypeFamilies, FunctionalDependencies, and MultiParamTypeClasses. And it seems to me as though TypeFamilies doesn't add any concrete functionality over the other two. (But not vice versa). But I know type families are pretty well liked so I feel like I am missing something:
"open" relation between types, such as a conversion function, which does not seem possible with TypeFamilies. Done with MultiParamTypeClasses:
class Convert a b where
convert :: a -> b
instance Convert Foo Bar where
convert = foo2Bar
instance Convert Foo Baz where
convert = foo2Baz
instance Convert Bar Baz where
convert = bar2Baz
Surjective relation between types, such as a sort of type safe pseudo-duck typing mechanism, that would normally be done with a standard type family. Done with MultiParamTypeClasses and FunctionalDependencies:
class HasLength a b | a -> b where
getLength :: a -> b
instance HasLength [a] Int where
getLength = length
instance HasLength (Set a) Int where
getLength = S.size
instance HasLength Event DateDiff where
getLength = dateDiff (start event) (end event)
Bijective relation between types, such as for an unboxed container, which could be done through TypeFamilies with a data family, although then you have to declare a new data type for every contained type, such as with a newtype. Either that or with an injective type family, which I think is not available prior to GHC 8. Done with MultiParamTypeClasses and FunctionalDependencies:
class Unboxed a b | a -> b, b -> a where
toList :: a -> [b]
fromList :: [b] -> a
instance Unboxed FooVector Foo where
toList = fooVector2List
fromList = list2FooVector
instance Unboxed BarVector Bar where
toList = barVector2List
fromList = list2BarVector
And lastly a surjective relations between two types and a third type, such as python2 or java style division function, which can be done with TypeFamilies by also using MultiParamTypeClasses. Done with MultiParamTypeClasses and FunctionalDependencies:
class Divide a b c | a b -> c where
divide :: a -> b -> c
instance Divide Int Int Int where
divide = div
instance Divide Int Double Double where
divide = (/) . fromIntegral
instance Divide Double Int Double where
divide = (. fromIntegral) . (/)
instance Divide Double Double Double where
divide = (/)
One other thing I should also add is that it seems like FunctionalDependencies and MultiParamTypeClasses are also quite a bit more concise (for the examples above anyway) as you only have to write the type once, and you don't have to come up with a dummy type name which you then have to type for every instance like you do with TypeFamilies:
instance FooBar LongTypeName LongerTypeName where
FooBarResult LongTypeName LongerTypeName = LongestTypeName
fooBar = someFunction
vs:
instance FooBar LongTypeName LongerTypeName LongestTypeName where
fooBar = someFunction
So unless I am convinced otherwise it really seems like I should just not bother with TypeFamilies and use solely FunctionalDependencies and MultiParamTypeClasses. Because as far as I can tell it will make my code more concise, more consistent (one less extension to care about), and will also give me more flexibility such as with open type relationships or bijective relations (potentially the latter is solver by GHC 8).
Here's an example of where TypeFamilies really shines compared to MultiParamClasses with FunctionalDependencies. In fact, I challenge you to come up with an equivalent MultiParamClasses solution, even one that uses FlexibleInstances, OverlappingInstance, etc.
Consider the problem of type level substitution (I ran across a specific variant of this in Quipper in QData.hs). Essentially what you want to do is recursively substitute one type for another. For example, I want to be able to
substitute Int for Bool in Either [Int] String and get Either [Bool] String,
substitute [Int] for Bool in Either [Int] String and get Either Bool String,
substitute [Int] for [Bool] in Either [Int] String and get Either [Bool] String.
All in all, I want the usual notion of type level substitution. With a closed type family, I can do this for any types (albeit I need an extra line for each higher-kinded type constructor - I stopped at * -> * -> * -> * -> *).
{-# LANGUAGE TypeFamilies #-}
-- Subsitute type `x` for type `y` in type `a`
type family Substitute x y a where
Substitute x y x = y
Substitute x y (k a b c d) = k (Substitute x y a) (Substitute x y b) (Substitute x y c) (Substitute x y d)
Substitute x y (k a b c) = k (Substitute x y a) (Substitute x y b) (Substitute x y c)
Substitute x y (k a b) = k (Substitute x y a) (Substitute x y b)
Substitute x y (k a) = k (Substitute x y a)
Substitute x y a = a
And trying at ghci I get the desired output:
> :t undefined :: Substitute Int Bool (Either [Int] String)
undefined :: Either [Bool] [Char]
> :t undefined :: Substitute [Int] Bool (Either [Int] String)
undefined :: Either Bool [Char]
> :t undefined :: Substitute [Int] [Bool] (Either [Int] String)
undefined :: Either [Bool] [Char]
With that said, maybe you should be asking yourself why am I using MultiParamClasses and not TypeFamilies. Of the examples you gave above, all except Convert translate to type families (albeit you will need an extra line per instance for the type declaration).
Then again, for Convert, I am not convinced it is a good idea to define such a thing. The natural extension to Convert would be instances such as
instance (Convert a b, Convert b c) => Convert a c where
convert = convert . convert
instance Convert a a where
convert = id
which are as unresolvable for GHC as they are elegant to write...
To be clear, I am not saying there are no uses of MultiParamClasses, just that when possible you should be using TypeFamilies - they let you think about type-level functions instead of just relations.
This old HaskellWiki page does an OK job of comparing the two.
EDIT
Some more contrasting and history I stumbled upon from augustss blog
Type families grew out of the need to have type classes with
associated types. The latter is not strictly necessary since it can be
emulated with multi-parameter type classes, but it gives a much nicer
notation in many cases. The same is true for type families; they can
also be emulated by multi-parameter type classes. But MPTC gives a
very logic programming style of doing type computation; whereas type
families (which are just type functions that can pattern match on the
arguments) is like functional programming.
Using closed type families
adds some extra strength that cannot be achieved by type classes. To
get the same power from type classes we would need to add closed type
classes. Which would be quite useful; this is what instance chains
gives you.
Functional dependencies only affect the process of constraint solving, while type families introduced the notion of non-syntactic type equality, represented in GHC's intermediate form by coercions. This means type families interact better with GADTs. See this question for the canonical example of how functional dependencies fail here.

Is there a way to represent the function unpack on Showable in GHC's Haskell?

I have a type named Showable like this:
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE ExplicitForAll #-}
data Showable = forall a . Show a => Showable a
Then making a function that packs it is trivial. I just have to write:
pack :: forall a . Show a => a -> Showable
pack x = Showable x
However it seems impossible to create the inverse function that would unpack the data from Showable. If I try to just invert what I wrote for pack and write:
unpack :: exists a . Show a => Showable -> a
unpack (Showable x) = x
then I get an error from GHC.
I have looked into the docs on GHC Language extensions and there seems to be no support for the exists keyword. I have seen that it might be possible in some other Haskell compilers, but I would prefer to be able to do it in GHC as well.
Funnily enough, I can still pattern match on Showable and extract the data from inside it in that way. So I could get the value out of it that way, but if I wanted to make a pointfree function involving Showable, then I would need unpack.
So is there some way to implement unpack in GHC's Haskell, maybe using Type Families or some other arcane magic that are GHC extensions?
Your unpack, as you have typed, cannot be written. The reason is that the existential variable a "escapes" the scope of the pattern match and there is no facility to track that behavior. Quote the docs:
data Baz = forall a. Eq a => Baz1 a a
| forall b. Show b => Baz2 b (b -> b)
When pattern matching, each pattern match introduces a new, distinct, type for each existential type variable. These types cannot be unified with any other type, nor can they escape from the scope of the pattern match. For example, these fragments are incorrect:
f1 (MkFoo a f) = a
What is this "a" in the result type? Clearly we don't mean this:
f1 :: forall a. Foo -> a -- Wrong!
The original program is just plain wrong. Here's another sort of error
f2 (Baz1 a b) (Baz1 p q) = a==q
It's ok to say a==b or p==q, but a==q is wrong because it equates the two distinct types arising from the two Baz1 constructors.
You can however rewrite the unpack type equivalently so that the existential does not escape:
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE RankNTypes #-}
module Lib where
data Showable = forall a. Show a => Showable a
pack :: Show a => a -> Showable
pack = Showable
unpack :: Showable -> (forall a. Show a => a -> r) -> r
unpack (Showable a) a2r = a2r a

Convert from type `T a` to `T b` without boilerplate

So, I have an AST data type with a large number of cases, which is parameterized by an "annotation" type
data Expr a = Plus a Int Int
| ...
| Times a Int Int
I have annotation types S and T, and some function f :: S -> T. I want to take an Expr S and convert it to an Expr T using my conversion f on each S which occurs within an Expr value.
Is there a way to do this using SYB or generics and avoid having to pattern match on every case? It seems like the type of thing that this is suited for. I just am not familiar enough with SYB to know the specific way to do it.
It sounds like you want a Functor instance. This can be automatically derived by GHC using the DeriveFunctor extension.
Based on your follow-up question, it seems that a generics library is more appropriate to your situation than Functor. I'd recommend just using the function given on SYB's wiki page:
{-# LANGUAGE DeriveDataTypeable, ScopedTypeVariables, FlexibleContexts #-}
import Data.Generics
import Unsafe.Coerce
newtype C a = C a deriving (Data,Typeable)
fmapData :: forall t a b. (Typeable a, Data (t (C a)), Data (t a)) =>
(a -> b) -> t a -> t b
fmapData f input = uc . everywhere (mkT $ \(x::C a) -> uc (f (uc x)))
$ (uc input :: t (C a))
where uc = unsafeCoerce
The reason for the extra C type is to avoid a problematic corner case where there are occurrences of fields at the same type as a (more details on the wiki). The caller of fmapData doesn't need to ever see it.
This function does have a few extra requirements compared to the real fmap: there must be instances of Typeable for a, and Data for t a. In your case t a is Expr a, which means that you'll need to add a deriving Data to the definition of Expr, as well as have a Data instance in scope for whatever a you're using.

Resources