Name for Subtype Relationship with Same Size - subtype

Is there a word for when type A is a subtype of B, but also has the same size in memory?
For instance, types A and B both have one member that is an int, but type A has all the member functions of B plus some additional ones.

Related

Clarification of Terms around Haskell Type system

The type system in Haskell seems to be very important, and I wanted to clarify some terms revolving around the Haskell type system.
Some type classes
Functor
Applicative
Monad
After using :info I found that Functor is a type class, Applicative is a type class with a => Functor constraint (is that "deriving"?), and Monad is a type class with a => Applicative constraint.
I've read that Maybe is a Monad, does that mean Maybe is also Applicative and Functor?
-> operator
When I define a type
data Maybe a = Just a | Nothing
and check :t Just, I get Just :: a -> Maybe a. How should I read this -> operator?
It confuses me, because for a function a -> b means it takes an a and evaluates to a b (so Just sort of returns a Maybe) – I tend to read it left to right, but does that change when defining types?
The term "type" is used in ambiguous ways: Type, Type Class, Type Constructor, Concrete Type, etc. I would like to know exactly what they mean.
Indeed the word “type” is used in somewhat ambiguous ways.
The perhaps most practical way to look at it is that a type is just a set of values. For example, Bool is the finite set containing the values True and False. (Mathematically, there are subtle differences between the concepts of set and type, but they aren't really important for a programmer to worry about.) Note that these sets can in general also be infinite: for example, Integer contains arbitrarily big numbers.
The most obvious way to define a type is with a data declaration, which in the simplest case just lists all the values:
data Colour = Red | Green | Blue
There we have a type which, as a set, contains three values.
Concrete type is basically what we say to make it clear that we mean the above: a particular type that corresponds to a set of values. Bool is a concrete type that can easily be understood as a data definition, but String, Maybe Integer and Double -> IO String are also concrete types, though they don't correspond to any single data declaration.
What a concrete type can't have is type variables†, nor can it be an incompletely applied type constructor. For example, Maybe is not a concrete type.
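You can see the difference in GHCi with the :kind command (a quick check; depending on settings the kind * may also be printed as Type):
> :kind Maybe
Maybe :: * -> *
> :kind Maybe Int
Maybe Int :: *
Only the fully applied Maybe Int has kind *, i.e. is a concrete type that can actually have values.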
So what is a type constructor? It's the type-level analogue to value constructors. What we mean mathematically by “constructor” in Haskell is an injective function, i.e. a function f where if you're given f(x) you can clearly identify what was x. Furthermore, any different constructors are assumed to have disjoint ranges, which means you can also identify f.‡
Just is an example of a value constructor, but it complicates the discussion that it also has a type parameter. Let's consider a simplified version:
data MaybeInt = JustI Int | NothingI
Now we have
JustI :: Int -> MaybeInt
That's how JustI is a function. Like any function of the same signature, it can be applied to argument values of the right type, e.g. you can write JustI 5. What it means for this function to be injective is that I can define a variable, say,
quoxy :: MaybeInt
quoxy = JustI 9328
and then I can pattern match with the JustI constructor:
> case quoxy of { JustI n -> print n }
9328
This would not be possible with a general function of the same signature:
foo :: Int -> MaybeInt
foo i = JustI $ negate i
> case quoxy of { foo n -> print n }
<interactive>:5:17: error: Parse error in pattern: foo
Note that constructors can be nullary, in which case the injective property is meaningless because there is no contained data / arguments of the injective function. Nothing and True are examples of nullary constructors.
Type constructors are the same idea as value constructors: type-level functions that can be pattern-matched. Any type-name defined with data is a type constructor, for example Bool, Colour and Maybe are all type constructors. Bool and Colour are nullary, but Maybe is a unary type constructor: it takes a type argument and only the result is then a concrete type.
So unlike value-level functions, type-level functions are kind of by default type constructors. There are also type-level functions that aren't constructors, but they require -XTypeFamilies.
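For completeness, here is a minimal sketch of such a non-constructor type-level function; the name Elem is made up for illustration:
{-# LANGUAGE TypeFamilies #-}

-- A closed type family: it computes a result type instead of wrapping its
-- argument, so unlike Maybe it cannot be pattern-matched to recover its input.
type family Elem c where
  Elem [a]       = a
  Elem (Maybe a) = a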
A type class may be understood as a set of types, in the same vein as a type can be seen as a set of values. This is not quite accurate (it's closer to the truth to say a class is a set of type constructors), but again it's not that useful to ponder the mathematical details – better to look at examples.
There are two main differences between type-as-set-of-values and class-as-set-of-types:
How you define the “elements”: when writing a data declaration, you need to immediately describe what values are allowed. By contrast, a class is defined “empty”, and then the instances are defined later on, possibly in a different module.
How the elements are used. A data type basically enumerates all the values so they can be identified again. Classes meanwhile aren't generally concerned with identifying types, rather they specify properties that the element-types fulfill. These properties come in the form of methods of a class. For example, the instances of the Num class are types that have the property that you can add elements together.
You could say, Haskell is statically typed on the value level (fixed sets of values in each type), but duck-typed on the type level (classes just require that somebody somewhere implements the necessary methods).
A simplified version of the Num example:
class Num a where
  (+) :: a -> a -> a

instance Num Int where
  0 + x = x
  x + y = ...
If the + operator weren't already defined in the prelude, you would now be able to use it with Int numbers. Then later on, perhaps in a different module, you could also make it usable with new, custom number types:
data MyNumberType = BinDigits [Bool]
instance Num MyNumberType where
  BinDigits [] + BinDigits l = BinDigits l
  BinDigits (False:ds) + BinDigits (False:es)
    = BinDigits (False : ...)
Unlike Num, the Functor...Monad type classes are not classes of types, but of 1-ary type constructors. I.e. every functor is a type constructor taking one argument to make it a concrete type. For instance, recall that Maybe is a 1-ary type constructor.
class Functor f where
  fmap :: (a -> b) -> f a -> f b

instance Functor Maybe where
  fmap f (Just a) = Just (f a)
  fmap _ Nothing  = Nothing
As you have concluded yourself, Applicative is a subclass of Functor. D being a subclass of C means basically that D is a subset of the set of type constructors in C. Therefore, yes, if Maybe is an instance of Monad it also is an instance of Functor.
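Concretely, the subclass relationship is written as a superclass constraint in the class head. A simplified version of what the standard library does:
class Functor f => Applicative f where
  pure  :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b

class Applicative m => Monad m where
  return :: a -> m a
  (>>=)  :: m a -> (a -> m b) -> m b
So you can't make a type constructor an instance of Monad without it also having Functor and Applicative instances.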
†That's not quite true: if you consider the universal quantifier explicitly as part of the type, then a concrete type can contain variables. This is a bit of an advanced subject though.
‡This is not guaranteed to be true if the -XPatternSynonyms extension is used.

Does the Either type constructor contain a phantom type each for the left/right case?

AFAIK, only types are inhabited by values in Haskell, not type constructors. Either is a binary type constructor of kind * -> * -> *. Left and Right both apply this type constructor to a single type, which is provided by the passed value. Doesn't that mean that in both cases Either is merely partially applied and thus still a type constructor awaiting the missing type argument?
let x = Right 'x' -- Either a Char
x has the type Either a Char. I would assume that this type would have the kind * -> *. This is clearly a polymorphic type, not a ground one. Yet Either a Char can be inhabited by values like 'x'.
My suspicion is that the type variable a is a phantom type for the Right case resp. b for Left. I know phantom types in connection with Const, where the respective type variable isn't used at all. Am I on the right tack?
AFAIK, only types are inhabited by values in Haskell, not type constructors.
Spot on.
Left and Right both apply this type constructor to a single type
You can't say that. Left and Right don't live in the type language at all, so they don't apply anything to any types, they only apply themselves to values.
x has the type Either a Char. I would assume that this type would have the kind * -> *
You need to distinguish between function/constructor arguments, and type variables. It's basically the distinction between free and bound variables. Either a Char still has kind *, not * -> *, because it is already applied to a. Yes, that's a type variable, but it still is an argument that's already applied.
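You can check this in GHCi by plugging any concrete type in for a (a rough sketch; the kind * may also be printed as Type):
> :kind Either Bool Char
Either Bool Char :: *
> :kind Either Bool
Either Bool :: * -> *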
Yet Either a Char can be inhabited by values like 'x'.
Not quite – it can be inhabited by values like Right 'x'.
My suspicion is that the type variable a is a phantom type for the Right case resp. b for Left
Kind of, but I wouldn't call it "phantom", because you can't just leave out Left or Right. At least not unless you choose Either Void b, but in that case you don't have the a variable.
I would argue that a type variable is phantom if and only if the choice of the type variable does not restrict what values can be passed to the type's constructors. The important part is that this is a type-centric definition. It is determined by looking only at the type definition, not at some particular value of the type.
So does it matter that no value of type String appears in the value Left 5 :: Either Int String? Not at all. What matters is that the choice of String in Either Int String prevents Right () from type-checking.
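For contrast, here is a minimal sketch of a genuinely phantom parameter; the names Distance, Metres and Feet are made up for illustration:
newtype Distance unit = Distance Double

data Metres  -- empty tag types, used only at the type level
data Feet

walk :: Distance Metres
walk = Distance 5.0  -- any choice of unit accepts exactly the same Doubles
Here the choice of unit never restricts what can be passed to the Distance constructor, which is exactly what fails for a and b in Either.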
Haskell has "implicit universal quantification", which means that type variables have an implicit forall. Either a Int is equivalent to forall a. Either a Int.
One way to consider a forall is that it's like a lambda, but for type variables. If we use the syntax # for type application, then, you can "apply" a type to this and get a new type out.
let foo = Right 1 :: forall a. Either a Int
foo #Char :: Either Char Int
foo #Double :: Either Double Int
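GHC's real syntax for this kind of explicit instantiation is the TypeApplications extension, which uses @ rather than the made-up # above (a sketch):
{-# LANGUAGE TypeApplications, ScopedTypeVariables #-}

foo :: forall a. Either a Int
foo = Right 1

fooChar :: Either Char Int
fooChar = foo @Char  -- instantiate the quantified variable a to Char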

Why do We Need Sum Types?

Imagine a language which doesn't allow multiple value constructors for a data type. Instead of writing
data Color = White | Black | Blue
we would have
data White = White
data Black = Black
data Blue = Blue
type Color = White :|: Black :|: Blue
where :|: (here it's not | to avoid confusion with sum types) is a built-in type union operator. Pattern matching would work in the same way
show :: Color -> String
show White = "white"
show Black = "black"
show Blue = "blue"
As you can see, in contrast to coproducts it results in a flat structure, so you don't have to deal with injections. And, unlike sum types, it allows you to freely combine types, resulting in greater flexibility and granularity:
type ColorsStartingWithB = Black :|: Blue
I believe it wouldn't be a problem to construct recursive data types as well
data Nil = Nil
data Cons a = Cons a (List a)
type List a = Cons a :|: Nil
I know union types are present in TypeScript and probably other languages, but why did the Haskell committee choose ADTs over them?
Haskell's sum type is very similar to your :|:.
The difference between the two is that the Haskell sum type | is a tagged union, while your "sum type" :|: is untagged.
Tagged means every instance is unique - you can distinguish Int | Int from Int (actually, this holds for any type a):
data EitherIntInt = Left Int | Right Int
In this case, EitherIntInt carries more information than Int, because a value can be a Left Int or a Right Int.
In your :|:, you cannot distinguish those two:
type EitherIntInt = Int :|: Int
How do you know if it was a left or right Int?
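In the tagged version you read the answer straight off the constructor when pattern matching. A small sketch (constructor and function names are made up, renamed to avoid clashing with Prelude's Left/Right):
data IntOrInt = LeftInt Int | RightInt Int

whichInt :: IntOrInt -> String
whichInt (LeftInt n)  = "a left "  ++ show n
whichInt (RightInt n) = "a right " ++ show n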
Tagged unions have another advantage: The compiler can verify whether you as the programmer handled all cases, which is implementation-dependent for general untagged unions. Did you handle all cases in Int :|: Int? Either this is isomorphic to Int by definition or the compiler has to decide which Int (left or right) to choose, which is impossible if they are indistinguishable.
Consider another example:
type (Integral a, Num b) => IntegralOrNum a b = a :|: b -- untagged
data (Integral a, Num b) => IntegralOrNum a b = Left a | Right b -- tagged (essentially Either a b)
What is 5 :: IntegralOrNum Int Double in the untagged union? It is both an instance of Integral and Num, so we can't decide for sure and have to rely on implementation details. On the other hand, the tagged union knows exactly what 5 should be because it is branded with either Left or Right.
As for naming: The disjoint union in Haskell is a union type. ADTs are only a means of implementing these.
I will try to expand the categorical argument mentioned by BenjaminHodgson.
Haskell can be seen as the category Hask, in which objects are types and morphisms are functions between types (disregarding bottom).
We can define a product in Hask as the tuple type - categorically speaking it meets the definition of the product:
A product of a and b is the type c equipped with projections p and q such that p :: c -> a and q :: c -> b and for any other candidate c' equipped with p' and q' there exists a morphism m :: c' -> c such that we can write p' as p . m and q' as q . m.
Read up on this in Bartosz' Category Theory for Programmers for further information.
Now for every category, there exists the opposite category, which has the same objects and morphisms but with all the arrows reversed. The coproduct is thus:
The coproduct c of a and b is the type c equipped with injections i :: a -> c and j :: b -> c such that for all other candidates c' with i' and j' there exists a morphism m :: c -> c' such that i' = m . i and j' = m . j.
Let's see how the tagged and untagged union perform given this definition:
The untagged union of a and b is the type a :|: b such that:
i :: a -> a :|: b is defined as i a = a and
j :: b -> a :|: b is defined as j b = b
However, we know that a :|: a is isomorphic to a. Based on that observation we can define a second candidate for the coproduct, a :|: a :|: b, which is equipped with the exact same injections. Therefore there is no single best candidate: the morphism m between a :|: a :|: b and a :|: b is id, and id is a bijection, which implies that m is invertible and can "convert" the types either way. (For a visual representation of that argument, draw the usual coproduct diagram and replace p with i and q with j.)
Restricting ourselves to Either avoids this problem, as you can verify yourself with:
i = Left and
j = Right
This shows that the categorical dual of the product type is the disjoint union, not the set-based union.
The set union can still express the disjoint union, because we can define it as follows:
data Left a = Left a
data Right b = Right b
type DisjUnion a b = Left a :|: Right b
Because we have shown above that the set union is not a valid candidate for the coproduct of two types, we would lose many "free" properties (which follow from parametricity, as leftaroundabout mentioned) by not choosing the disjoint union in the category Hask (because there would be no coproduct).
This is an idea I've thought a lot about myself: a language with “first-class type algebra”. Pretty sure we could do about everything this way that we do in Haskell. Certainly if these disjunctions were, like Haskell alternatives, tagged unions; then you could directly rewrite any ADT to use them. In fact GHC can do this for you: if you derive a Generic instance, a variant type will be represented by a :+: construct, which is in essence just Either.
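A quick sketch of what that looks like, reusing the Color type from the question (this assumes the DeriveGeneric extension):
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics

data Color = White | Black | Blue deriving (Generic)

-- In GHCi, ":kind! Rep Color" prints a representation built from :+:,
-- essentially a nested Either of the three nullary constructors.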
I'm not so sure if untagged unions would also do. As long as you require the types participating in a sum to be discernibly different, the explicit tagging should in principle not be necessary. The language would then need a convenient way to match on types at runtime. Sounds a lot like what dynamic languages do – obviously comes with quite some overhead though.
The biggest problem would be that if the types on both sides of :|: must be unequal then you lose parametricity, which is one of Haskell's nicest traits.
Given that you mention TypeScript, it is instructive to have a look at what its docs have to say about its union types. The example there starts from a function...
function padLeft(value: string, padding: any) { //etc.
... that has a flaw:
The problem with padLeft is that its padding parameter is typed as any. That means that we can call it with an argument that’s neither a number nor a string
One plausible solution is then suggested, and rejected:
In traditional object-oriented code, we might abstract over the two types by creating a hierarchy of types. While this is much more explicit, it’s also a little bit overkill.
Rather, the handbook suggests...
Instead of any, we can use a union type for the padding parameter:
function padLeft(value: string, padding: string | number) { // etc.
Crucially, the concept of union type is then described in this way:
A union type describes a value that can be one of several types.
A string | number value in TypeScript can be either of string type or of number type, as string and number are subtypes of string | number (cf. Alexis King's comment to the question). An Either String Int value in Haskell, however, is neither of String type nor of Int type -- its only, monomorphic, type is Either String Int. Further implications of that difference show up in the remainder of the discussion:
If we have a value that has a union type, we can only access members that are common to all types in the union.
In a roughly analogous Haskell scenario, if we have, say, an Either Double Int, we cannot apply (2*) directly on it, even though both Double and Int have instances of Num. Rather, something like bimap is necessary.
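For instance, a rough sketch of that Haskell side of the comparison (doubleEither is a made-up name):
import Data.Bifunctor (bimap)

-- Each side of the sum is handled explicitly; there is no common
-- "numeric supertype" to reach into, as there is in the TypeScript union.
doubleEither :: Either Double Int -> Either Double Int
doubleEither = bimap (2*) (2*)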
What happens when we need to know specifically whether we have a Fish? [...] we’ll need to use a type assertion:
let pet = getSmallPet();
if ((<Fish>pet).swim) {
    (<Fish>pet).swim();
} else {
    (<Bird>pet).fly();
}
This sort of downcasting/runtime type checking is at odds with how the Haskell type system ordinarily works, even though it can be implemented using the very same type system (also cf. leftaroundabout's answer). In contrast, there is nothing to figure out at runtime about the type of an Either Fish Bird: the case analysis happens at value level, and there is no need to deal with anything failing and producing Nothing (or worse, null) due to runtime type mismatches.
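For comparison, a rough Haskell rendering of the pet example (Fish and Bird are stand-in types invented here):
data Fish = Fish
data Bird = Bird

move :: Either Fish Bird -> String
move (Left Fish)  = "swim"
move (Right Bird) = "fly"
The match is exhaustive by construction: no runtime type test, and no branch that can fail with a null-like value.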

haskell sum type multiple declaration error

data A=A
data B=B
data AB=A|B
Which makes a sum type AB from A and B.
But the last line induces a compile error: "multiple declarations of B".
I also tried something like this:
data A=Int|Bool
It compiles, but why does GHC disallow me from making sum types out of user-defined types?
You're getting fooled. You think when you write data A=Int|Bool that you are saying that a value of type A can be a value of type Int or a value of type Bool; but what you are actually saying is that there are two new value-level constructors named Int and Bool, each containing no information at all, of type A. Similarly, you think that data AB=A|B says you can either be of type A or type B, but in fact you are saying you can either have value A or value B.
The key thing to keep in mind is that there are two namespaces, type-level and term-level, and that they are distinct.
Here is a simple example of how to do it right:
data A=A
data B=B
data AB=L A|R B
The last line declares two new term-level constructors, L and R. The L constructor carries a value of type A, while the R constructor carries a value of type B.
You might also like the Either type, defined as follows:
data Either a b = Left a | Right b
You could use this to implement your AB if you wanted:
type AB = Either A B
Similarly, you could use Either Int Bool for your tagged union of Int and Bool.
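Putting the pieces together, a small self-contained sketch of how the Either-based AB is used (describeAB is a made-up name):
data A = A
data B = B

type AB = Either A B

describeAB :: AB -> String
describeAB (Left  A) = "got an A"
describeAB (Right B) = "got a B"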
When you say data AB = A | B, you are not referring to the types A and B, but rather are defining data constructors A and B. These conflict with the constructors defined on the previous lines.
If you want to create a type AB that is the sum of A and B, you must provide data constructors that wrap the types A and B, e.g.:
data AB = ABA A | ABB B
Because the type of the value created using data constructor A or B would be ambiguous. When I have a = B, for instance, what is the type of a? Is it A or AB?
You should consider using different data constructor as follows:
data A = MkA
data B = MkB
data AB = A A | B B
Sum types have to be tagged. a+a has to have two injections from a.
To understand how algebraic data types work, take a simple example:
data X = A | B C
This defines a new type constructor, X, along with data constructors A and B. The B constructor takes/holds an argument of type C.
The primary canonical sum type in Haskell is Either:
data Either a b = Left a | Right b

Understanding Polytypes in Hindley-Milner Type Inference

I'm reading the Wikipedia article on Hindley–Milner Type Inference trying to make some sense out of it. So far this is what I've understood:
Types are classified as either monotypes or polytypes.
Monotypes are further classified as either type constants (like int or string) or type variables (like α and β).
Type constants can either be concrete types (like int and string) or type constructors (like Map and Set).
Type variables (like α and β) behave as placeholders for concrete types (like int and string).
Now I'm having a little difficulty understanding polytypes but after learning a bit of Haskell this is what I make of it:
Types themselves have types. Formally types of types are called kinds (i.e. there are different kinds of types).
Concrete types (like int and string) and type variables (like α and β) are of kind *.
Type constructors (like Map and Set) are lambda abstractions of types (e.g. Set is of kind * -> * and Map is of kind * -> * -> *).
What I don't understand is what quantifiers signify. For example, what does ∀α.σ represent? I can't seem to make heads or tails of it, and the more I read the following paragraph the more confused I get:
A function with polytype ∀α.α -> α by contrast can map any value of the same type to itself, and the identity function is a value for this type. As another example ∀α.(Set α) -> int is the type of a function mapping all finite sets to integers. The count of members is a value for this type. Note that quantifiers can only appear top level, i.e. a type ∀α.α -> ∀α.α, for instance, is excluded by the syntax of types, and that monotypes are included in the polytypes, thus a type has the general form ∀α₁ … ∀αₙ.τ.
First, kinds and polymorphic types are different things. You can have a HM type system where all types are of the same kind (*), you could also have a system without polymorphism but with complex kinds.
If a term M is of type ∀a.t, it means that for whatever type s we can substitute s for a in t (often written as t[a:=s]), and we'll have that M is of type t[a:=s]. This is somewhat similar to logic, where we can substitute any term for a universally quantified variable, but here we're dealing with types.
This is precisely what happens in Haskell, just that in Haskell you don't see the quantifiers. All type variables that appear in a type signature are implicitly quantified, just as if you had forall in front of the type. For example, map would have type
map :: forall a . forall b . (a -> b) -> [a] -> [b]
etc. Without this implicit universal quantification, type variables a and b would have to have some fixed meaning and map wouldn't be polymorphic.
The HM algorithm distinguishes types (without quantifiers, monotypes) and type schemas (universally quantified types, polytypes). It's important that at some places it uses type schemas (like in let), but at other places only types are allowed. This makes the whole thing decidable.
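A small sketch of why the special treatment of let matters (names are made up; this assumes GHC's default behaviour, i.e. MonoLocalBinds off): the let-bound identity is generalized to a type schema, so it can be used at two different types in one expression.
pair :: (Int, Bool)
pair =
  let identity = \x -> x           -- generalized to forall a. a -> a
  in  (identity 3, identity True)  -- instantiated at Int and at Bool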
I also suggest reading the article about System F. It is a more complex system, which allows forall anywhere in types (therefore everything there is just called a type), but type inference/checking is undecidable. It can help you understand how forall works. System F is described in depth in Girard, Lafont and Taylor, Proofs and Types.
Consider l = \x -> t in Haskell. It is a lambda, which represents a term t with a variable x that will be substituted later (e.g. in l 1, whatever that would mean). Similarly, ∀α.σ represents a type with a type variable α; that is, f : ∀α.σ is a function parameterized by a type α. In some sense, σ depends on α, so f returns a value of type σ(α), where α will be substituted in σ(α) later, and we will get some concrete type.
In Haskell you are allowed to omit ∀ and just define functions like id :: a -> a. The reason you may omit the quantifier is basically that quantifiers are only allowed at top level anyway (without the RankNTypes extension). You can try this piece of code:
id2 :: a -> a -- I named it id2 since id is already defined in Prelude
id2 x = x
If you ask GHCi for the type of id2 (:t id2), it will return a -> a. To be more precise (more type-theoretic), id2 has the type ∀a. a -> a. Now, if you add to your code:
val = id2 3
then 3 has the type Int, so the type Int will be substituted into σ and we will get the concrete type Int -> Int.
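If you want to see that substitution explicitly, GHC's TypeApplications extension lets you instantiate the quantified variable by hand (a sketch building on the id2 definition above; idInt is a made-up name):
{-# LANGUAGE TypeApplications #-}

idInt :: Int -> Int
idInt = id2 @Int  -- σ is a -> a; with a := Int this is Int -> Int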

Resources