Existential types in Haskell and generics in other languages - haskell

I was trying to grasp the concept of existential types in Haskell using the article Haskell/Existentially quantified types. At the first glance, the concept seems clear and somewhat similar to generics in object oriented languages. The main example there is something called "heterogeneous list", defined as follows:
data ShowBox = forall s. Show s => SB s
heteroList :: [ShowBox]
heteroList = [SB (), SB 5, SB True]
instance Show ShowBox where
show (SB s) = show s
f :: [ShowBox] -> IO ()
f xs = mapM_ print xs
main = f heteroList
I had a different notion of a "heterogeneous list", something like Shapeless in Scala. But here, it's just a list of items wrapped in an existential type that only adds a type constraint. The exact type of its elements is not manifested in its type signature, the only thing we know is that they all conform to the type constraint.
In object-oriented languages, it seems very natural to write something like this (example in Java). This is a ubiquitous use case, and I don't need to create a wrapper type to process a list of objects that all implement a certain interface. The animals list has a generic type List<Vocal>, so I can assume that its elements all conform to this Vocal interface:
interface Vocal {
void voice();
}
class Cat implements Vocal {
public void voice() {
System.out.println("meow");
}
}
class Dog implements Vocal {
public void voice() {
System.out.println("bark");
}
}
var animals = Arrays.asList(new Cat(), new Dog());
animals.forEach(Vocal::voice);
I noticed that existential types are only available as a language extension, and they are not described in most of the "basic" Haskell books or tutorials, so my suggestion is that this is quite an advanced language feature.
My question is, why? Something that seems basic in languages with generics (constructing and using a list of objects whose types implement some interface and accessing them polymorphically), in Haskell requires a language extension, custom syntax and creating an additional wrapper type? Is there no way of achieving something like that without using existential types, or is there just no basic-level use cases for this?
Or maybe I'm just mixing up the concepts, and existential types and generics mean completely different things. Please help me make sense of it.

Yes,existential types and generic mean different things. An existential type can be used similarly to an interface in an object-oriented language. You can put one in a list of course, but a list or any other generic type is not needed to use an interface. It is enough to have a variable of type Vocal to demonstrate its usage.
It is not widely used in Haskell because it is not really needed most of the time.
nonHeteroList :: [IO ()]
nonHeteroList = [print (), print 5, print True]
does the same thing without any language extension.
An existential type (or an interface in an object-oriented language) is nothing but a piece of data with a bundled dictionary of methods. If you only have one method in your dictionary, just use a function. If you have more than one, you can use a tuple or a record of those. So if you have something like
interface Shape {
void Draw();
double Area();
}
you can express it in Haskell as, for example,
type Shape = (IO (), Double)
and say
circle center radius = (drawCircle center radius, pi * radius * radius)
rectangle topLeft bottomRight = (drawRectangle topLeft bottomRight,
abs $ (topLeft.x-bottomRight.x) * (topLeft.y-bottomRight.y))
shapes = [circle (P 2.0 3.5) 4.2, rectangle (P 3.3 7.2) (P -2.0 3.1)]
though you can express exactly the same thing with type classes, instances and existentials
class Shape a where
draw :: a -> IO ()
area :: a -> Double
data ShapeBox = forall s. Shape s => SB s
instance Shape ShapeBox where
draw (SB a) = draw a
area (SB a) = area a
data Circle = Circle Point Double
instance Shape Circle where
draw (Circle c r) = drawCircle c r
area (Circle _ r) = pi * r * r
data Rectangle = Rectangle Point Point
instance Shape Rectangle where
draw (Rectangle tl br) = drawRectangle tl br
area (Rectangle tl br) = abs $ (tl.x - br.x) * (tl.y - br.y)
shapes = [Circle (P 2.0 3.5) 4.2, Rectangle (P 3.3 7.2) (P -2.0 3.1)]
and there you have it, N times longer.

is there just no basic-level use cases for this?
Sort-of, yeah. While in Java, you have no choice but to have open classes, Haskell has ADTs which you'd normally use for these kind of use-cases. In your example, Haskell can represent it in one of two ways:
data Cat = Cat
data Dog = Dog
class Animal a where
voice :: a -> String
instance Animal Cat where
voice Cat = "meow"
instance Animal Dog where
voice Dog = "woof"
or
data Animal = Cat | Dog
voice Cat = "meow"
voice Dog = "woof"
If you needed something extensible, you'd use the former, but if you need to be able to case on the type of animal, you'd use the latter. If you wanted the former, but wanted a list, you don't have to use existential types, you could instead capture what you wanted in a list, like:
voicesOfAnimals :: [() -> String]
voicesOfAnimals = [\_ -> voice Cat, \_ -> voice Dog]
Or even more simply
voicesOfAnimals :: [String]
voicesOfAnimals = [voice Cat, voice Dog]
This is kind-of what you're doing with Heterogenous lists anyway, you have a constraint, in this case Animal a on each element, which lets you call voice on each element, but nothing else, since the constraint doesn't give you any more information about the value (well if you had the constraint Typeable a you'd be able to do more, but let's not worry about dynamic types here).
As for the reason for why Haskell doesn't support Heterogenous lists without extensions and wrappers, I'll let someone else explain it but key topics are:
subtyping
variance
inference
https://gitlab.haskell.org/ghc/ghc/-/wikis/impredicative-polymorphism (I think)
In your Java example, what's the type of Arrays.asList(new Cat())? Well, it depends on what you declare it as. If you declare the variable with List<Cat>, it typechecks, you can declare it with List<Animal>, and you can declare it with List<Object>. If you declared it as a List<Cat>, you wouldn't be able to reassign it to List<Animal> as that would be unsound.
In Haskell, typeclasses can't be used as the type within a list (so [Cat] is valid in the first example and [Animal] is valid in the second example, but [Animal] isn't valid in the first example), and this seems to be due to impredicative polymorphism not being supported in Haskell (not 100% sure). Haskell lists are defined something like [a] = [] | a : [a]. [x, y, z] is just syntatic sugar for x : (y : (z : [])). So consider the example in Haskell. Let's say you type [Dog] in the repl (this is equivalent to Dog : [] btw). Haskell infers this to have the type [Dog]. But if you were to give it Cat at the front, like [Cat, Dog] (Cat : Dog : []), it would match the 2nd constructor (:), and would infer the type of Cat : ... to [Cat], which Dog : [] would fail to match.

Since others have explained how you can avoid existential types in many cases, I figured I'd point out why you might want them. The simplest example I can think of is called Coyoneda:
data Coyoneda f a = forall x. Coyoneda (x -> a) (f x)
Coyoneda f a holds a container (or other functor) full of some type x and a function that can be mapped over it to produce an f a. Here's the Functor instance:
instance Functor (Coyoneda f) where
fmap f (Coyoneda g x) = Coyoneda (f . g) x
Note that this does not have a Functor f constraint! What makes it useful? To explain that takes two more functions:
liftCoyoneda :: f a -> Coyoneda f a
liftCoyoneda = Coyoneda id
lowerCoyoneda :: Functor f => Coyoneda f a -> f a
lowerCoyoneda (Coyoneda f x) = fmap f x
The cool thing is that fmap applications get built up and performed all together:
lowerCoyoneda . fmap f . fmap g . fmap h . liftCoyoneda
is operationally
fmap (f . g . h)
rather than
fmap f . fmap g . fmap h
This can be useful if fmap is expensive in the underlying functor.

Related

Expressing infinite kinds

When expressing infinite types in Haskell:
f x = x x -- This doesn't type check
One can use a newtype to do it:
newtype Inf = Inf { runInf :: Inf -> * }
f x = x (Inf x)
Is there a newtype equivalent for kinds that allows one to express infinite kinds?
I already found that I can use type families to get something similar:
{-# LANGUAGE TypeFamilies #-}
data Inf (x :: * -> *) = Inf
type family RunInf x :: * -> *
type instance RunInf (Inf x) = x
But I'm not satisfied with this solution - unlike the types equivalent, Inf doesn't create a new kind (Inf x has the kind *), so there's less kind safety.
Is there a more elegant solution to this problem?
Responding to:
Like recursion-schemes, I want a way to construct ASTs, except I want my ASTs to be able to refer to each other - that is a term can contain a type (for a lambda parameter), a type can contain a row-type in it and vice-versa. I'd like the ASTs to be defined with an external fix-point (so one can have "pure" expressions or ones annotated with types after type inference), but I also want these fix-points to be able to contain other types of fix-points (just like terms can contain terms of different types). I don't see how Fix helps me there
I have a method, which maybe is near what you are asking for, that I have been experimenting with. It seems to be quite powerful, though the abstractions around this construction need some development. The key is that there is a kind Label which indicates from where the recursion will continue.
{-# LANGUAGE DataKinds #-}
import Data.Kind (Type)
data Label = Label ((Label -> Type) -> Type)
type L = 'Label
L is just a shortcut to construct labels.
Open-fixed-point definitions are of kind (Label -> Type) -> Type, that is, they take a "label interpretation (type) function" and give back a type. I called these "shape functors", and abstractly refer to them with the letter h. The simplest shape functor is one that does not recurse:
newtype LiteralF a f = Literal a -- does not depend on the interpretation f
type Literal a = L (LiteralF a)
Now we can make a little expression grammar as an example:
data Expr f
= Lit (f (Literal Integer))
| Let (f (L Defn)) (f (L Expr))
| Var (f (L String))
| Add (f (L Expr)) (f (L Expr))
data Defn f
= Defn (f (Literal String)) (f (L Expr))
Notice how we pass labels to f, which is in turn responsible for closing off the recursion. If we just want a normal expression tree, we can use Tree:
data Tree :: Label -> Type where
Tree :: h Tree -> Tree (L h)
Then a Tree (L Expr) is isomorphic to the normal expression tree you would expect. But this also allows us to, e.g., annotate the tree with a label-dependent annotation at each level of the tree:
data Annotated :: (Label -> Type) -> Label -> Type where
Annotated :: ann (L h) -> h (Annotated ann) -> Annotated ann (L h)
In the repo ann is indexed by a shape functor rather than a label, but this seems cleaner to me now. There are a lot of little decisions like this to be made, and I have yet to find the most convenient pattern. The best abstractions to use around shape functors still needs exploration and development.
There are many other possibilities, many of which I have not explored. If you end up using it I would love to hear about your use case.
With data-kinds, we can use a regular newtype:
import Data.Kind (Type)
newtype Inf = Inf (Inf -> Type)
And promote it (with ') to create new kinds to represent loops:
{-# LANGUAGE DataKinds #-}
type F x = x ('Inf x)
For a type to unpack its 'Inf argument we need a type-family:
{-# LANGUAGE TypeFamilies #-}
type family RunInf (x :: Inf) :: Inf -> Type
type instance RunInf ('Inf x) = x
Now we can express infinite kinds with a new kind for them, so no kind-safety is lost.
Thanks to #luqui for pointing out the DataKinds part in his answer!
I think you're looking for Fix which is defined as
data Fix f = Fix (f (Fix f))
Fix gives you the fixpoint of the type f. I'm not sure what you're trying to do but such infinite types are generally used when you use recursion scehemes (patterns of recursion that you can use) see recursion-schemes package by Edward Kmett or with free monads which among other things allow you to construct ASTs in a monadic style.

What are Prisms?

I'm trying to achieve a deeper understanding of lens library, so I play around with the types it offers. I have already had some experience with lenses, and know how powerful and convenient they are. So I moved on to Prisms, and I'm a bit lost. It seems that prisms allow two things:
Determining if an entity belongs to a particular branch of a sum type, and if it does, capturing the underlying data in a tuple or a singleton.
Destructuring and reconstructing an entity, possibly modifying it in process.
The first point seems useful, but usually one doesn't need all the data from an entity, and ^? with plain lenses allows getting Nothing if the field in question doesn't belong to the branch the entity represents, just like it does with prisms.
The second point... I don't know, might have uses?
So the question is: what can I do with a Prism that I can't with other optics?
Edit: thank you everyone for excellent answers and links for further reading! I wish I could accept them all.
Lenses characterise the has-a relationship; Prisms characterise the is-a relationship.
A Lens s a says "s has an a"; it has methods to get exactly one a from an s and to overwrite exactly one a in an s. A Prism s a says "a is an s"; it has methods to upcast an a to an s and to (attempt to) downcast an s to an a.
Putting that intuition into code gives you the familiar "get-set" (or "costate comonad coalgebra") formulation of lenses,
data Lens s a = Lens {
get :: s -> a,
set :: a -> s -> s
}
and an "upcast-downcast" representation of prisms,
data Prism s a = Prism {
up :: a -> s,
down :: s -> Maybe a
}
up injects an a into s (without adding any information), and down tests whether the s is an a.
In lens, up is spelled review and down is preview. There’s no Prism constructor; you use the prism' smart constructor.
What can you do with a Prism? Inject and project sum types!
_Left :: Prism (Either a b) a
_Left = Prism {
up = Left,
down = either Just (const Nothing)
}
_Right :: Prism (Either a b) b
_Right = Prism {
up = Right,
down = either (const Nothing) Just
}
Lenses don't support this - you can't write a Lens (Either a b) a because you can't implement get :: Either a b -> a. As a practical matter, you can write a Traversal (Either a b) a, but that doesn't allow you to create an Either a b from an a - it'll only let you overwrite an a which is already there.
Aside: I think this subtle point about Traversals is the source of your confusion about partial record fields.
^? with plain lenses allows getting Nothing if the field in question doesn't belong to the branch the entity represents
Using ^? with a real Lens will never return Nothing, because a Lens s a identifies exactly one a inside an s.
When confronted with a partial record field,
data Wibble = Wobble { _wobble :: Int } | Wubble { _wubble :: Bool }
makeLenses will generate a Traversal, not a Lens.
wobble :: Traversal' Wibble Int
wubble :: Traversal' Wibble Bool
For an example of this how Prisms can be applied in practice, look to Control.Exception.Lens, which provides a collection of Prisms into Haskell's extensible Exception hierarchy. This lets you perform runtime type tests on SomeExceptions and inject specific exceptions into SomeException.
_ArithException :: Prism' SomeException ArithException
_AsyncException :: Prism' SomeException AsyncException
-- etc.
(These are slightly simplified versions of the actual types. In reality these prisms are overloaded class methods.)
Thinking at a higher level, certain whole programs can be thought of as being "basically a Prism". Encoding and decoding data is one example: you can always convert structured data to a String, but not every String can be parsed back:
showRead :: (Show a, Read a) => Prism String a
showRead = Prism {
up = show,
down = listToMaybe . fmap fst . reads
}
To summarise, Lenses and Prisms together encode the two core design tools of object-oriented programming: composition and subtyping. Lenses are a first-class version of Java's . and = operators, and Prisms are a first-class version of Java's instanceof and implicit upcasting.
One fruitful way of thinking about Lenses is that they give you a way of splitting up a composite s into a focused value a and some context c. Pseudocode:
type Lens s a = exists c. s <-> (a, c)
In this framework, a Prism gives you a way to look at an s as being either an a or some context c.
type Prism s a = exists c. s <-> Either a c
(I'll leave it to you to convince yourself that these are isomorphic to the simple representations I demonstrated above. Try implementing get/set/up/down for these types!)
In this sense a Prism is a co-Lens. Either is the categorical dual of (,); Prism is the categorical dual of Lens.
You can also observe this duality in the "profunctor optics" formulation - Strong and Choice are dual.
type Lens s t a b = forall p. Strong p => p a b -> p s t
type Prism s t a b = forall p. Choice p => p a b -> p s t
This is more or less the representation which lens uses, because these Lenses and Prisms are very composable. You can compose Prisms to get bigger Prisms ("a is an s, which is a p") using (.); composing a Prism with a Lens gives you a Traversal.
I just wrote a blog post, which might help build some intuition about Prisms: Prisms are constructors (Lenses are fields). http://oleg.fi/gists/posts/2018-06-19-prisms-are-constructors.html
Prisms could be introduced as first-class pattern matching, but that is a
one-sided view. I'd say they are generalised constructors, though maybe
more often used for pattern matching than for actual construction.
The important property of constructors (and lawful prisms), is their
injectivity. Though the usual prism laws don't state that directly,
injectivity property can be deduced.
To quote lens-library documentation, the prisms laws are:
First, if I review a value with a Prism and then preview, I will get it back:
preview l (review l b) ≡ Just b
Second, if you can extract a value a using a Prism l from a value s, then
the value s is completely described by l and a:
preview l s ≡ Just a ⇒ review l a ≡ s
In fact, the first law alone is enough to prove the injectivity of construction
via Prism:
review l x ≡ review l y ⇒ x ≡ y
The proof is straight-forward:
review l x ≡ review l y
-- x ≡ y -> f x ≡ f y
preview l (review l x) ≡ preview l (review l y)
-- rewrite both sides with the first law
Just x ≡ Just y
-- injectivity of Just
x ≡ y
We can use injectivity property as an additional tool in the equational
reasoning toolbox. Or we can use it as a easy property to check to decide
whether something is a lawful Prism. The check is easy as we only the
review side of Prism. Many smart constructors, which for example
normalise the input data, aren't lawful prisms.
An example using case-insensitive:
-- Bad!
_CI :: FoldCase s => Prism' (CI s) s
_CI = prism' ci (Just . foldedCase)
λ> review _CI "FOO" == review _CI "foo"
True
λ> "FOO" == "foo"
False
The first law is also violated:
λ> preview _CI (review _CI "FOO")
Just "foo"
In addition to the other excellent answers, I feel Isos provide a nice vantage point for considering this matter.
There being some i :: Iso' s a means if you have an s value you also (virtually) have an a value, and vice versa. The Iso' gives you two conversion functions, view i :: s -> a and review i :: a -> s which are both guaranteed to succeed and lossless.
There being some l :: Lens' s a means if you have an s you also have an a, but not vice versa. view l :: s -> a may drop information along the way, as the conversion isn't required to be lossless, and so you can't go the other way if all you have is an a (cf. set l :: a -> s -> s, which also requires an s in addition to the a value in order to provide the missing information).
There being some p :: Prism' s a means if you have an s value you might also have an a, but there are no guarantees. The conversion preview p :: s -> Maybe a is not guaranteed to succeed. Still, you do have the other direction, review p :: a -> s.
In other words, an Iso is invertible and always succeeds. If you drop the invertibility requirement, you get a Lens; if you drop the success guarantee, you get a Prism. If you drop both, you get an affine traversal (which is not in lens as a separate type), and if you go a step further and give up on having at most one target you end up with a Traversal. That is reflected in one of the diamonds of the lens subtype hierarchy:
Traversal
/ \
/ \
/ \
Lens Prism
\ /
\ /
\ /
Iso

Is nested pair a good idea in Haskell

AFAIK there is no way to do heterogeneous arrays in Haskell or extend data type.
However it seems it could be achieved easily by using nested pair (like CONS are).
For example
data Point2D a = Point2D a a
data Point3D a = Point3D a a a
Could be written using nested pair like this;
type Point2D = (a, (a, ())
type Point3D = (a, (a, (a, ()))
That way accessors can be common for Point2D and Point3D
x = fst
y = fst.snd
z = fst.snd.snd
This technique can also be used to extend record like this
type Person = (String, ())
name = fst
type User = (String, (String, ())
email = fst.snd
etc ...
Is it a good idea and if so , why is there no built-in support for such thing in Haskell ?
Is it what GADT is about ?
It can be a good idea, if that's what you need.
GADTs are not directly to do with this idea (if you want to know more about GADTs, probably better to google GADTs and read some of the links, and/or ask a separate Stackoverflow question).
There's no built in support for the same reason there's no built in support for graphics, matrix operations, or most other things; there's no need for support to be built in, because it can be added perfectly well by libraries, such as the HList package (indeed, even most of the functionality that is "built in" to Haskell is actually merely implemented in libraries; just libraries that happen to be always distributed with Haskell).
HList has fancy types for representing "heterogenous lists" and convenient functions for performing operations on them, and also uses these to develop "extensible records". HLists basically are equivalent to what you could do by developing your nested tuple idea further, so your basic idea is good enough that someone's already thought of it. :)
Here is a way to have arbitrarily nested pairs:
data Y f = In (f (Y f))
data P a = Pair Int a | STOP
type PY = Y P
a, b :: PY
a = In (Pair 23 (In (Pair 24 (In STOP))))
b = In (Pair 23 (In (Pair 24 (In (Pair 25 (In STOP))))))
c = [a,b] -- a and b have same type

Subtypes for natural language types

I'm a linguist working on the formal syntax/semantics of Natural Languages. I've started
using Haskell quite recently and very soon I realized that I needed to add subtyping. For example given the types Human
and Animal,
I would like to have Human as a subtype of Animal. I found that this is possible using a coerce function where the instances are declared by the user, but I do not know how to define coerce in the instances I'm interested in. So basically I do not know what to add after 'coerce =' to make it work'. Here is the code up to that point:
{-# OPTIONS
-XMultiParamTypeClasses
-XFlexibleInstances
-XFunctionalDependencies
-XRankNTypes
-XTypeSynonymInstances
-XTypeOperators
#-}
module Model where
import Data.List
data Animal = A|B deriving (Eq,Show,Bounded,Enum)
data Man = C|D|E|K deriving (Eq,Show,Bounded,Enum)
class Subtype a b where
coerce :: a->b
instance Subtype Man Animal where
coerce=
animal:: [Animal]
animal = [minBound..maxBound]
man:: [Man]
man = [minBound..maxBound]
Thanks in advance
Just ignore the Subtype class for a second and examine the type of the coerce function you are writing. If the a is a Man and the b is an Animal, then the type of the coerce function you are writing should be:
coerce :: Man -> Animal
This means that all you have to do is write a sensible function that converts each one of your Man constructors (i.e. C | D | E | K) to a corresponding Animal constructor (i.e. A | B). That's what it means to subtype, where you define some function that maps the "sub" type onto the original type.
Of course, you can imagine that because you have four constructors for your Man type and only two constructors for your Animal type then you will end up with more than one Man constructor mapping to the same Animal constructor. There's nothing wrong with that and it just means that the coerce function is not reversible. I can't comment more on that without knowing exactly what those constructors were meant to represent.
The more general answer to your question is that there is no way to automatically know which constructors in Man should map to which constructors in Animal. That's why you have to write the coerce function to tell it what the relationship between men and animals is.
Note also that there is nothing special about the 'Subtype' class and 'coerce' function. You can just skip them and write an 'manToAnimal' function. After all there is no built-in language or compiler support for sub-typing and Subtype is just another class that some random guy came up with (and frankly, subtyping is not really idiomatic Haskell, but you didn't really ask about that). All that defining the class instance does is allow you to overload the function coerce to work on the Man type.
I hope that helps.
What level of abstraction are you working where you "need to add subtyping"?
Are you trying to create world model for your program encoded by Haskell types? (I can see this if your types are actually Animal, Dog, etc.)
Are you trying to create more general software, and you think subtypes would be a good design?
Or are you just learning haskell and playing around with things.
If (1), I think that will not work out for you so well. Haskell does not have very good reflective abilities -- i.e. ability to weave type logic into runtime logic. Your model would end up pretty deeply entangled with the implementation. I would suggest creaing a "world model" (set of) types, as opposed to a set of types corresponding to a specific world model. I.e., answer this question for Haskell: what is a world model?
If (2), think again :-). Subtyping is part of a design tradition in which Haskell does not participate. There are other ways to design your program, and they will end up playing nicer with the functional mindset then subtyping would have. It takes times to develop your functional design sense, so be patient with it. Just remember: keep it simple, stupid. Use data types and functions over them (but remember to use higher-order functions to generalize and share code). If you are reaching for advanced features (even typeclasses are fairly advanced in the sense I mean), you are probably doing it wrong.
If (3), see Doug's answer, and play with stuff. There are lots of ways to fake it, and they all kind of suck eventually.
I don't know much about Natural Languages so my suggestion may be missing the point, but this may be what you are looking for.
{-# OPTIONS
-XMultiParamTypeClasses
-XFlexibleContexts
#-}
module Main where
data Animal = Mammal | Reptile deriving (Eq, Show)
data Dog = Terrier | Hound deriving (Eq, Show)
data Snake = Cobra | Rattle deriving (Eq, Show)
class Subtype a b where
coerce :: a -> b
instance Subtype Animal Animal where
coerce = id
instance Subtype Dog Animal where
coerce _ = Mammal
instance Subtype Snake Animal where
coerce _ = Reptile
isWarmBlooded :: (Subtype a Animal) => a -> Bool
isWarmBlooded = (Mammal == ) . coerce
main = do
print $ isWarmBlooded Hound
print $ isWarmBlooded Cobra
print $ isWarmBlooded Mammal
Gives you:
True
False
True
Is that kind of what you are shooting for? Haskell doesn't have subtyping built-in, but this might do as a work-around. Admittedly, there are probably better ways to do this.
Note: This answer is not intended to point out the best, correct or idomatic way to solve the problem at hand. It is intended to answer the question which was "what to add after 'coerce=' to make it work."
You can't write the coerce function you're looking for — at least, not sensibly. There aren't any values in Animal that correspond with the values in Man, so you can't write a definition for coerce.
Haskell doesn't have subtyping as an explicit design decision, for various reasons (it allows type inference to work better, and allowing subtyping vastly complicates the language's type system). Instead, you should express relationships like this using aggregation:
data Animal = A | B | AnimalMan Man deriving (Eq, Show, Bounded, Enum)
data Man = C | D | E | K deriving (Eq, Show, Bounded, Enum)
AnimalMan now has the type Man -> Animal, exactly as you wanted coerce to have.
If I understood you correctly, it is quite possible. We will use type classes and generalized algebraic data types to implement this functionality.
If you want to be able to do something like this (where animals and humans can be fed but only humans can think):
animals :: [AnyAnimal]
animals = (replicate 5 . AnyAnimal $ SomeAnimal 10) ++ (replicate 5 . AnyAnimal $ SomeHuman 10 10)
humans :: [AnyHuman]
humans = replicate 5 . AnyHuman $ SomeHuman 10 10
animals' :: [AnyAnimal]
animals' = map coerce humans
animals'' :: [AnyAnimal]
animals'' = (map (\(AnyAnimal x) -> AnyAnimal $ feed 50 x) animals) ++
(map (\(AnyAnimal x) -> AnyAnimal $ feed 50 x) animals') ++
(map (\(AnyHuman x) -> AnyAnimal $ feed 50 x) humans)
humans' :: [AnyHuman]
humans' = (map (\(AnyHuman x) -> AnyHuman . think 100 $ feed 50 x) humans)
Then it's possible, for example:
{-# LANGUAGE GADTs #-}
{-# LANGUAGE MultiParamTypeClasses #-}
-- | The show is there only to make things easier
class (Show a) => IsAnimal a where
feed :: Int -> a -> a
-- other interface defining functions
class (IsAnimal a) => IsHuman a where
think :: Int -> a -> a
-- other interface defining functions
class Subtype a b where
coerce :: a -> b
data AnyAnimal where
AnyAnimal :: (IsAnimal a) => a -> AnyAnimal
instance Show AnyAnimal where
show (AnyAnimal x) = "AnyAnimal " ++ show x
data AnyHuman where
AnyHuman :: (IsHuman a) => a -> AnyHuman
instance Show AnyHuman where
show (AnyHuman x) = "AnyHuman " ++ show x
data SomeAnimal = SomeAnimal Int deriving Show
instance IsAnimal SomeAnimal where
feed = flip const
data SomeHuman = SomeHuman Int Int deriving Show
instance IsAnimal SomeHuman where
feed = flip const
instance IsHuman SomeHuman where
think = flip const
instance Subtype AnyHuman AnyAnimal where
coerce (AnyHuman x) = AnyAnimal x
animals :: [AnyAnimal]
animals = (replicate 5 . AnyAnimal $ SomeAnimal 10) ++ (replicate 5 . AnyAnimal $ SomeHuman 10 10)
humans :: [AnyHuman]
humans = replicate 5 . AnyHuman $ SomeHuman 10 10
animals' :: [AnyAnimal]
animals' = map coerce humans
Few comments:
You can make AnyAnimal and AnyHuman instances of their respective classes for convenience (atm. you have to unpack them first and pack them afterwards).
We could have single GADT AnyAnimal like this (both approaches have their use I would guess):
data AnyAnimal where
AnyAnimal :: (IsAnimal a) => a -> AnyAnimal
AnyHuman :: (IsHuman a) => a -> AnyAnimal
instance Show AnyHuman where
show (AnyHuman x) = "AnyHuman " ++ show x
show (AnyAnimal x) = "AnyAnimal " ++ show x
instance Subtype AnyAnimal AnyAnimal where
coerce (AnyHuman x) = AnyAnimal x
coerce (AnyAnimal x) = AnyAnimal x
It's rather advanced, but have a look at Edward Kmett's work on using the new Constraint kinds for this kind of functionality.

Would a type class "between" Category and Arrow make sense?

Often you have something like an Applicative without pure, or something like a Monad, but without return. The semigroupoid package covers these cases with Apply and Bind. Now I'm in a similar situation concerning Arrow, where I can't define a meaningful arr function, but I think the other functions would make perfect sense.
I defined a type that holds a function and it's reverse function:
import Control.Category
data Rev a b = Rev (a -> b) (b -> a)
reverse (Rev f g) = Rev g f
apply (Rev f _) x = f x
applyReverse (Rev _ g) y = g y
compose (Rev f f') (Rev g g') = Rev ((Prelude..) f g) ((Prelude..) g' f')
instance Category Rev where
id = Rev Prelude.id Prelude.id
(.) x y = compose x y
Now I can't implement Arrow, but something weaker:
--"Ow" is an "Arrow" without "arr"
class Category a => Ow a where
first :: a b c -> a (b,d) (c,d)
first f = stars f Control.Category.id
second :: a b c -> a (d,b) (d,c)
second f = stars Control.Category.id f
--same as (***)
stars :: a b c -> a b' c' -> a (b,b') (c,c')
...
import Control.Arrow
instance Ow Rev where
stars (Rev f f') (Rev g g') = Rev (f *** g) (f' *** g')
I think I can't implement the equivalent of &&&, as it is defined as f &&& g = arr (\b -> (b,b)) >>> f *** g, and (\b -> (b,b)) isn't reversable. Still, do you think this weaker type class could be useful? Does it even make sense from a theoretical point of view?
This approach was explored in "There and Back Again: Arrows for invertible programming": http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.153.9383
For precisely the reasons you're running into, this turned out to be a bad approach that wasn't picked up more widely. More recently, Tillmann Rendel produced a delightful approach to invertible syntax that substituted partial isomorphisms for biarrows ( http://www.informatik.uni-marburg.de/~rendel/rendel10invertible.pdf ) . This has been packaged up on hackage for folks to use and play with: http://hackage.haskell.org/package/invertible-syntax
That said, I think an arrow without arr makes a certain amount of sense. I just don't think that such a thing is an appropriate vehicle to capture invertible functions.
Edit: There's also Adam Megacz's Generalized Arrows (http://www.cs.berkeley.edu/~megacz/garrows/). These are maybe not useful for invertible programming either (though the basic typeclass does seem to be inveritble), but they do have uses in other situations where arr is too strong, but other arrow operations may make sense.
From a Category Theory standpoint, the Category type class describes any category whose arrows can be describedn straightforwardly in Haskell by a type constructor. Almost any additional feature you want to build on top of this, in the form of new primitive arrows or arrow-building functions, will make sense to some extent if you can implement it using total functions. The only caveat is that adding expressive power can break other requirements, as often happens with arr.
Your specific example of invertible functions describes a category where all arrows are isomorphisms. In a shocking twist of the completely and utterly expected, Edward Kmett already has an implementation of this on Hackage.
The arr function roughly amounts to a functor (in the category theory sense) from Haskell functions to the Arrow instance, leaving objects the same (i.e., the type parameters). Simply removing arr from Arrow gives you... something else, which is probably not very useful on its own, without at least adding the equivalents of arr fst and arr snd as primitives.
I believe that adding primitives for fst and snd, along with (&&&) to build a new arrow from two inputs, should give you a category with products, which is absolutely sensible from a theoretical point of view, as well as not being compatible with the invertible arrows you're using for the reasons you've found.

Resources