Haskell Polymorphism With Kinds and Type Variables - haskell

In Haskell I have learnt that there are type variables (ex. id :: a -> a), applied to type signatures, and kinds (ex. Maybe :: * -> *), applied to type constructors and type classes. A type must have kind * (be a concrete type) in order to hold values.
We use type variables to enable polymorphism: Nothing :: Maybe a means that the constant Nothing can belong to a family of possible types. This leads me to believe that kinding and type variables serve the same purpose; wouldn't the last code sample work as simply Nothing :: Maybe, where type class Maybe remains with the kind * -> * to signify that the type belongs to generic family?
What it seems we are doing is taking an empty parameter (* -> *) and filling it in with a type variable (a) that represents the same level of variance.
We see this behavior in another example:
>>> :k Either
Either :: * -> * -> *
>>> :t Left ()
Left () :: Either () b
>>> :t Right ()
Right () :: Either a ()
Why is it theoretically necessary to make the distinction between kinds and type variables?

They are distinct just like typing and usual (value-level) variables are distinct. Variables have types, but they aren't types. So also, type variables have kinds. Type variables are the primary notion: without them you don't have parametric polymorphism, and they exist also in many other languages like Java, C#, etc. But Haskell goes further in allowing types-which-take-parameters ([], Maybe, ->, etc.) to exist on their own and to have type variables which represent such non-concrete types. And this means it needs a kind system to disallow things like Maybe Int Int.
From the example, it seems that you suggest that you can write a signature without type variables and restore it to the signature with them. But then how could you distinguish a -> b -> a and a -> b -> b?

Related

Which is a polymorphic type: a type or a set of types?

Programming in Haskell by Hutton says:
A type that contains one or more type variables is called polymorphic.
Which is a polymorphic type: a type or a set of types?
Is a polymorphic type with a concrete type substituting its type variable a type?
Is a polymorphic type with different concrete types substituting its type variable considered the same or different types?
Is a polymorphic type with a concrete type substituting its type variable a type?
That's the point, yes. However, you need to be careful. Consider:
id :: a -> a
That's polymorphic. You can substitute a := Int and get Int -> Int, and a := Float -> Float and get (Float -> Float) -> Float -> Float. However, you cannot say a := Maybe and get id :: Maybe -> Maybe. That just doesn't make sense. Instead, we have to require that you can only substitute concrete types like Int and Maybe Float for a, not abstract ones like Maybe. This is handled with the kind system. This is not too important for your question, so I'll just summarize. Int and Float and Maybe Float are all concrete types (that is, they have values), so we say that they have type Type (the type of a type is often called its kind). Maybe is a function that takes a concrete type as an argument and returns a new concrete type, so we say Maybe :: Type -> Type. In the type a -> a, we say the type variable a must have type Type, so now the substitutions a := Int, a := String, etc. are allowed, while stuff like a := Maybe isn't.
Is a polymorphic type with different concrete types substituting its type variable considered the same or different types?
No. Back to a -> a: a := Int gives Int -> Int, but a := Float gives Float -> Float. Not the same.
Which is a polymorphic type: a type or a set of types?
Now that's a loaded question. You can skip to the TL;DR at the end, but the question of "what is a polymorphic type" is actually really confusing in Haskell, so here's a wall of text.
There are two ways to see it. Haskell started with one, then moved to the other, and now we have a ton of old literature referring to the old way, so the syntax of the modern system tries to maintain compatibility. It's a bit of a hot mess. Consider
id x = x
What is the type of id? One point of view is that id :: Int -> Int, and also id :: Float -> Float, and also id :: (Int -> Int) -> Int -> Int, ad infinitum, all simultaneously. This infinite family of types can be summed up with one polymorphic type, id :: a -> a. This point of view gives you the Hindley-Milner type system. This is not how modern GHC Haskell works, but this system is what Haskell was based on at its creation.
In Hindley-Milner, there is a hard line between polymorphic types and monomorphic types, and the union of these two groups gives you "types" in general. It's not really fair to say that, in HM, polymorphic types (in HM jargon, "polytypes") are types. You can't take polytypes as arguments, or return them from functions, or place them in a list. Instead, polytypes are only templates for monotypes. If you squint, in HM, a polymorphic type can be seen as a set of those monotypes that fit the schema.
Modern Haskell is built on System F (plus extensions). In System F,
id = \x -> x -- rewriting the example
is not a complete definition. Therefore we can't even think about giving it a type. Every lambda-bound variable needs a type annotation, but x has no annotation. Worse, we can't even decide on one: \(x :: Int) -> x is just as good as \(x :: Float) -> x. In System F, what we do is we write
id = /\(a :: Type) -> \(x :: a) -> x
using /\ to represent Λ (upper-case lambda) much as we use \ to represent λ.
id is a function taking two arguments. The first argument is a Type, named a. The second argument is an a. The result is also an a. The type signature is:
id :: forall (a :: Type). a -> a
forall is a new kind of function arrow, basically. Note that it provides a binder for a. In HM, when we said id :: a -> a, we didn't really define what a was. It was a fresh, global variable. By convention, more than anything else, that variable is not used anywhere else (otherwise the Generalization rule doesn't apply and everything breaks down). If I had written e.g. inject :: a -> Maybe a, afterwards, the textual occurrences of a would be referring to a new global entity, different from the one in id. In System F, the a in forall a. a -> a actually has scope. It's a "local variable" available only for use underneath that forall. The a in inject :: forall a. a -> Maybe a may or may not be the "same" a; it doesn't matter, because we have actual scoping rules that keep everything from falling apart.
Because System F has hygienic scoping rules for type variables, polymorphic types are allowed to do everything other types can do. You can take them as arguments
runCont :: forall (a :: Type). (forall (r :: Type). (a -> r) -> r) -> a
runCons a f = f a (id a) -- omitting type signatures; you can fill them in
You put them in data constructors
newtype Yoneda f a = Yoneda (forall b. (a -> b) -> f b)
You can place them in polymorphic containers:
type Bool = forall a. a -> a -> a
true, false :: Bool
true a t f = t
false a t f = f
thueMorse :: [Bool]
thueMorse = false : true : true : false : _etc
There's an important difference from HM. In HM, if something has polymorphic type, it also has, simultaneously, an infinity of monomorphic types. In System F, a thing can only have one type. id = /\a -> \(x :: a) -> x has type forall a. a -> a, not Int -> Int or Float -> Float. In order to get an Int -> Int out of id, you have to actually give it an argument: id Int :: Int -> Int, and id Float :: Float -> Float.
Haskell is not System F, however. System F is closer to what GHC calls Core, which is an internal language that GHC compiles Haskell to—basically Haskell without any syntax sugar. Haskell is a Hindley-Milner flavored veneer on top of a System F core. In Haskell, nominally a polymorphic type is a type. They do not act like sets of types. However, polymorphic types are still second class. Haskell doesn't let you actually type forall without -XExplicitForalls. It emulates Hindley-Milner's wonky implicit global variable creation by inserting foralls in certain places. The places where it does so are changed by -XScopedTypeVariables. You can't take polymorphic arguments or have polymorphic fields unless you enable -XRankNTypes. You cannot say things like [forall a. a -> a -> a], nor can you say id (forall a. a -> a -> a) :: (forall a. a -> a -> a) -> (forall a. a -> a -> a)—you must define e.g. newtype Bool = Bool { ifThenElse :: forall a. a -> a -> a } to wrap the polymorphism under something monomorphic. You cannot explicitly give type arguments unless you enable -XTypeApplications, and then you can write id #Int :: Int -> Int. You cannot write type lambdas (/\), period; instead, they are inserted implicitly whenever possible. If you define id :: forall a. a -> a, then you cannot even write id in Haskell. It will always be implicitly expanded to an application, id #_.
TL;DR: In Haskell, a polymorphic type is a type. It's not treated as a set of types, or a rule/schema for types, or whatever. However, due to historical reasons, they are treated as second class citizens. By default, it looks like they are treated as mere sets of types, if you squint a bit. Most restrictions on them can be lifted with suitable language extensions, at which point they look more like "just types". The one remaining big restriction (no impredicative instantiations allowed) is rather fundamental and cannot be erased, but that's fine because there's a workaround.
There is some nuance in the word "type" here. Values have concrete types, which cannot be polymorphic. Expressions, on the other hand, have general types, which can be polymorphic. If you're thinking of types for values, then a polymorphic type can be thought of loosely as defining sets of possible concrete types. (At least first-order polymorphic types! Higher-order polymorphism breaks this intuition.) But that's not always a particularly useful way of thinking, and it's not a sufficient definition. It doesn't capture which sets of types can be described in this way (and related notions like parametricity.)
It's a good observation, though, that the same word, "type", is used in these two related, but different, ways.
EDIT: The answer below turns out not to answer the question. The difference is a subtle mistake in terminology: types like Maybe and [] are higher-kinded, whereas types like forall a. a -> a and forall a. Maybe a are polymorphic. The answer below relates to higher-kinded types, but the question was asked about polymorphic types. I’m still leaving this answer up in case it helps anyone else, but I realise now it’s not really an answer to the question.
I would argue that a polymorphic higher-kinded type is closer to a set of types. For instance, you could see Maybe as the set {Maybe Int, Maybe Bool, …}.
However, strictly speaking, this is a bit misleading. To address this in more detail, we need to learn about kinds. Similarly to how types describe values, we say that kinds describe types. The idea is:
A concrete type (that is, one which has values) has a kind of *. Examples include Bool, Char, Int and Maybe String, which all have type *. This is denoted e.g. Bool :: *. Note that functions such as Int -> String also have kind *, as these are concrete types which can contain values such as show!
A type with a type parameter has a kind containing arrows. For instance, in the same way that id :: a -> a, we can say that Maybe :: * -> *, since Maybe takes a concrete type as an argument (such as Int), and produces a concrete type as a result (such as Maybe Int). Something like a -> a also has kind * -> *, since it has one type parameter (a) and produces a concrete result (a -> a). You can get more complex kinds as well: for instance, data Foo f x = FooConstr (f x x) has kind Foo :: (* -> * -> *) -> * -> *. (Can you see why?)
(If the above explanation doesn’t make sense, the Learn You a Haskell book has a great section on kinds as well.)
So now we can answer your questions properly:
Which is a polymorphic higher-kinded type: a type or a set of types?
Neither: a polymorphic higher-kinded type is a type-level function, as indicated by the arrows in its kind. For instance, Maybe :: * -> * is a type-level function which converts e.g. Int → Maybe Int, Bool → Maybe Bool etc.
Is a polymorphic higher-kinded type with a concrete type substituting its type variable a type?
Yes, when your polymorphic higher-kinded type has a kind * -> * (i.e. it has one type parameter, which accepts a concrete type). When you apply a concrete type Conc :: * to a type Poly :: * -> *, it’s just function application, as detailed above, with the result being Poly Conc :: * i.e. a concrete type.
Is a polymorphic higher-kinded type with different concrete types substituting its type variable considered the same or different types?
This question is a bit out of place, as it doesn’t have anything to do with kinds. The answer is definitely no: two types like Maybe Int and Maybe Bool are not the same. Nothing may be a member of both types, but only the former contains a value Just 4, and only the latter contains a value Just False.
On the other hand, it is possible to have two different substitutions where the resulting types are isomorphic. (An isomorphism is where two types are different, but equivalent in some way. For instance, (a, b) and (b, a) are isomorphic, despite being the same type. The formal condition is that two types p,q are isomorphic when you can write two inverse functions p -> q and q -> p.)
One example of this is Const:
data Const a b = Const { getConst :: a }
This type just ignores its second type parameter; as a result, two types like Const Int Char and Const Int Bool are isomorphic. However, they are not the same type: if you make a value of type Const Int Char, but then use it as something of type Const Int Bool, this will result in a type error. This sort of functionality is incredibly useful, as it means you can ‘tag’ a type a using Const a tag, then use the tag as a marker of information on the type level.

Bind type parameter

I am currently working with the quite nice haskell-eigen library and stumbled upon a fundamental yet probably basic problem (I am quite new to practical haskell development).
I use their basic matrix type
data Matrix a b :: * -> * -> *
where a denotes the haskell and b the internal C type. This is realized via the restriction
Elem a b
with
Elem Double CDouble
Elem Float CFloat
-- more for complex types...
Although not really the question I want to ask here I kind of don't understand why this is done this way. Since it is obviously a kind of functional mapping I already don't understand why this is formulated as an equivalency relation, but anyway...
I now want to define (as a simple example - I got several) an instance of Key from the keys package. It defines the index key for a given container, for example
type instance [] = Int
So instances of Key are defined over types of kind * -> *.
However due to that requirement, this won't work:
type instance Key Matrix = (Int, Int)
I have to in some way make Matrix be of kind * -> *. So (coming from c++ where I would do this using traits classes), I tried this:
type family CType a where
CType Double = CDouble
CType Float = CFloat
type MatX a = Matrix a (CType a)
In other words I tried to use type synonyms as a means of realizing that above mentioned functional type map.
Now I tried the following:
type instance Key MatX = (Int, Int)
which gives me "The type synonym ‘MatX’ should have 1 argument, but has been given none" and I even tried the obviously wrong
type instance Key (MatX a) = (Int, Int)
which gives me "Expected kind * -> *, but MatX a has kind *". This sounds to me like "I the compiler expect a type with more than 0 but - being a type synonym - less than 1 argument".
So my question is: How does one commonly map types in haskell in order to solve such a kind mismatch or get rid of it in another way.
P.S.: I am well aware that the eigen matrix has an indexing function, but
I want it to be a common one with other data types
I have this problem in other variants for other type instances.
Edit: Added reference links to mentioned packages.
You're nearly there. The one missing piece is that type synonyms must be used saturated - that is, you have to supply all of its arguments. MatX on its own is not a valid type, only MatX a. The reason for this is that type synonyms are just synonyms - they're expanded at compile time, which means that the compiler needs to know all of the type synonym's arguments in order to get a valid type after expansion.
The fix is to change your type synonym to a newtype.
newtype MatX a = MatX { getMatX :: Matrix a (CType a) }
newtypes can be partially applied, because MatX a is now a different type to Matrix a (CType a).
type instance Key MatX = (Int, Int)
The other answer shows the general case for converting type synonyms into things that can be used in instance declarations. But in this specific case it can be much simpler: since the index type is the same for all different matrices, you can supply just the arguments needed to get the kind correct. Thus:
type instance Key (Matrix a) = (Int, Int)
No extra type families relating Haskell and C types needed, no new types needed. This will also make working with the keys' library's API much simpler, as you won't need to do any newtype wrapping and unwrapping around each call.

What exactly is the kind "*" in Haskell?

In Haskell, (value-level) expressions are classified into types, which can be notated with :: like so: 3 :: Int, "Hello" :: String, (+ 1) :: Num a => a -> a. Similarly, types are classified into kinds. In GHCi, you can inspect the kind of a type expression using the command :kind or :k:
> :k Int
Int :: *
> :k Maybe
Maybe :: * -> *
> :k Either
Either :: * -> * -> *
> :k Num
Num :: * -> Constraint
> :k Monad
Monad :: (* -> *) -> Constraint
There are definitions floating around that * is the kind of "concrete types" or "values" or "runtime values." See, for example, Learn You A Haskell. How true is that? We've had a few questions about kinds that address the topic in passing, but it'd be nice to have a canonical and precise explanation of *.
What exactly does * mean? And how does it relate to other more complex kinds?
Also, do the DataKinds or PolyKinds extensions change the answer?
First off, * is not a wildcard! It's also typically pronounced "star."
Bleeding edge note: There is as of Feb. 2015 a proposal to simplify GHC's subkind system (in 7.12 or later). That page contains a good discussion of the GHC 7.8/7.10 story. Looking forward, GHC may drop the distinction between types and kinds, with * :: *. See Weirich, Hsu, and Eisenberg, System FC with Explicit Kind Equality.
The Standard: A description of type expressions.
The Haskell 98 report defines * in this context as:
The symbol * represents the kind of all nullary type constructors.
In this context, "nullary" simply means that the constructor takes no parameters. Either is binary; it can be applied to two parameters: Either a b. Maybe is unary; it can be applied to one parameter: Maybe a. Int is nullary; it can be applied to no parameters.
This definition is a little bit incomplete on its own. An expression containing a fully-applied unary, binary, etc. type constructor also has kind *, e.g. Maybe Int :: *.
In GHC: Something that contains values?
If we poke around the GHC documentation, we get something closer to the "can contain a runtime value" definition. The GHC Commentary page "Kinds" states that "'*' is the kind of boxed values. Things like Int and Maybe Float have kind *." The GHC user's guide for version 7.4.1, on the other hand, stated that * is the kind of "lifted types". (That passage wasn't retained when the section was revised for
PolyKinds.)
Boxed values and lifted types are a bit different. According to the GHC Commentary page "TypeType",
A type is unboxed iff its representation is other than a pointer. Unboxed types are also unlifted.
A type is lifted iff it has bottom as an element. Closures always have lifted types: i.e. any let-bound identifier in Core must have a lifted type. Operationally, a lifted object is one that can be entered. Only lifted types may be unified with a type variable.
So ByteArray#, the type of raw blocks of memory, is boxed because it is represented as a pointer, but unlifted because bottom is not an element.
> undefined :: ByteArray#
Error: Kind incompatibility when matching types:
a0 :: *
ByteArray# :: #
Therefore it appears that the old User's Guide definition is more accurate than the GHC Commentary one: * is the kind of lifted types. (And, conversely, # is the kind of unlifted types.)
Note that if types of kind * are always lifted, for any type t :: * you can construct a "value" of sorts with undefined :: t or some other mechanism to create bottom. Therefore even "logically uninhabited" types like Void can have a value, i.e. bottom.
So it seems that, yes, * represents the kind of types that can contain runtime values, if undefined is your idea of a runtime value. (Which isn't a totally crazy idea, I don't think.)
GHC Extensions?
There are several extensions which liven up the kind system a bit. Some of these are mundane: KindSignatures lets us write kind annotations, like type annotations.
ConstraintKinds adds the kind Constraint, which is, roughly, the kind of the left-hand side of =>.
DataKinds lets us introduce new kinds besides * and #, just as we can introduce new types with data, newtype, and type.
With DataKinds every data declaration (terms and conditions may apply) generates a promoted kind declaration. So
data Bool = True | False
introduces the usual value constructor and type name; additionally, it produces a new kind, Bool, and two types: True :: Bool and False :: Bool.
PolyKinds introduces kind variables. This just a way to say "for any kind k" just like we say "for any type t" at the type level. As regards our friend * and whether it still means "types with values", I suppose you could say a type t :: k where k is a kind variable could contain values, if k ~ * or k ~ #.
In the most basic form of the kind language, where there are only the kind * and the kind constructor ->, then * is the kind of things that can stand in a type-of relationship to values; nothing with a different kind can be a type of values.
Types exist to classify values. All values with the same type are interchangeable for the purpose of type-checking, so the type checker only has to care about types, not specific values. So we have the "value level" where all the actual values live, and the "type level" where their types live. The "type-of" relationship forms links between the two levels, with a single type being the type of (usually) many values. Haskell makes these two levels quite explicit; it's why you can have declarations like data Foo = Foo Int Chat Bool where you've declared a type-level thing Foo (a type with kind *) and a value-level thing Foo (a constructor with type Int -> Char -> Bool -> Foo). The two Foos involved simply refer to different entities on different levels, and Haskell separates these so completely that it can always tell what level you're referring to and thus can allow (sometimes confusingly) things on the different levels to have the same name.
But as soon as we introduce types that themselves have structure (like Maybe Int, which is a type constructor Maybe applied to a type Int), then we have things that exist at the type level which do not actually stand in a type-of relationship to any values. There are no values whose type is just Maybe, only values with type Maybe Int (and Maybe Bool, Maybe (), even Maybe Void, etc). So we need to classify our type-level things for the same reason we need to classify our values; only certain type-expressions actually represent something that can be the type of values, but many of them work interchangeably for the purpose of "kind-checking" (whether it's a correct type for the value-level thing it's declared to be the type of is a problem for a different level).1
So * (which is often stated to be pronounced "type") is the basic kind; it's the kind of all type-level things that can be stated to be the type of values. Int has values; therefore its type is *. Maybe does not have values, but it takes an argument and produces a type that has values; this gets us a kind like ___ -> *. We can fill in the blank by observing that Maybe's argument is used as the type of the value appearing in Just a, so its argument must also be a type of values (with kind *), and so Maybe must have kind * -> *. And so on.
When you're dealing with kinds that only involve stars and arrows, then only type-expressions of kind * are types of values. Any other kind (e.g. * -> (* -> * -> *) -> (* -> *)) only contains other "type-level entities" that are not actual types that contain values.
PolyKinds, as I understand it, doesn't really change this picture at all. It just allows you to make polymorphic declarations at the kind-level, meaning it adds variables to our kind language (in addition to stars and arrows). So now I can contemplate type-level things of kind k -> *; this could be instantiated to work as either kind * -> * or (* -> *) -> * or (* -> (* -> *)) -> *. We've gained exactly the same kind of power as having (a -> b) -> [a] -> [b] at the type level gained us; we can write one map function with a type that contains variables, instead of having to write every possible map function separately. But there's still only one kind that contains type-level things that are the types of values: *.
DataKinds also introduces new things to the kind language. Effectively what it does though is to let us declare arbitrary new kinds, which contain new type-level entities (just as ordinary data declarations allow us to declare arbitrary new types, which contain new value-level entities). But it doesn't let us declare things with a correspondence of entities across all 3 levels; if I have data Nat :: Z | S Nat and use DataKinds to lift it to the kind level, then we have two different things named Nat that exist on the type level (as the type of value-level Z, S Z, S (S Z), etc), and at the kind level (as the kind of type-level Z, S Z, S (S Z)). The type-level Z is not the type of any values though; the value Z inhabits the type-level Nat (which in turn is of kind *), not the type-level Z. So DataKinds adds new user defined things to the kind language, which can be the kind of new user-defined things at the type level, but it remains the case that the only type-level things that can be the types of values are of kind *.
The only addition to the kind language that I'm aware of which truly does change this are the kinds mentioned in #ChristianConkle's answer, such as # (I believe there are a couple more now too? I'm not really terribly knowledgeable about "low level" types such as ByteArray#). These are the kinds of types that have values that GHC needs to know to treat differently (such as not assuming they can be boxed and lazily evaluated), even when polymorphic functions are involved, so we can't just attach the knowledge that they need to be treated differently to these values' types, or it would be lost when calling polymorphic functions on them.
1 The word "type" can thus be a little confusing. Sometimes it is used to refer to things that actually stand in a type-of relationship to things on the value level (this is the interpretation used when people say "Maybe is not a type, it's a type-constructor"). And sometimes it's used to refer to anything that exists at the type-level (under this interpretation Maybe is in fact a type). In this post I'm trying to very explicitly refer to "type-level things" rather than use "type" as a short-hand.
For beginners that are trying to learn about kinds (you can think of them as the type of a type) I recommend this chapter of the Learn you a Haskell book.
I personally think of kinds in this way:
You have concrete types, e.g. Int, Bool,String, [Int], Maybe Int or Either Int String.
All of these have the kind *. Why? Because they can't take any more types as a parameter; an Int, is an Int; a Maybe Int is a Maybe Int. What about Maybe or [] or Either, though?
When you say Maybe, you do not have a concrete type, because you didn't specify its parameter. Maybe Int or Maybe String are different but both have a * kind, but Maybe is waiting for a type of kind * to return a kind *. To clarify, let's look at what GHCI's :kind command can tell us:
Prelude> :kind Maybe Int
Maybe Int :: *
Prelude> :kind Maybe
Maybe :: * -> *
With lists it's the same:
Prelude> :k [String]
[String] :: *
Prelude> :k []
[] :: * -> *
What about Either?
Prelude> :k Either Int String
Either Int String :: *
Prelude> :k Either Int
Either Int :: * -> *
You could think of intuitively think of Either as a function that takes parameters, but the parameters are types:
Prelude> :k Either Int
Either Int :: * -> *
means Either Int is waiting for a type parameter.

Understanding Polytypes in Hindley-Milner Type Inference

I'm reading the Wikipedia article on Hindley–Milner Type Inference trying to make some sense out of it. So far this is what I've understood:
Types are classified as either monotypes or polytypes.
Monotypes are further classified as either type constants (like int or string) or type variables (like α and β).
Type constants can either be concrete types (like int and string) or type constructors (like Map and Set).
Type variables (like α and β) behave as placeholders for concrete types (like int and string).
Now I'm having a little difficulty understanding polytypes but after learning a bit of Haskell this is what I make of it:
Types themselves have types. Formally types of types are called kinds (i.e. there are different kinds of types).
Concrete types (like int and string) and type variables (like α and β) are of kind *.
Type constructors (like Map and Set) are lambda abstractions of types (e.g. Set is of kind * -> * and Map is of kind * -> * -> *).
What I don't understand is what do qualifiers signify. For example what does ∀α.σ represent? I can't seem to make heads or tails of it and the more I read the following paragraph the more confused I get:
A function with polytype ∀α.α -> α by contrast can map any value of the same type to itself, and the identity function is a value for this type. As another example ∀α.(Set α) -> int is the type of a function mapping all finite sets to integers. The count of members is a value for this type. Note that qualifiers can only appear top level, i.e. a type ∀α.α -> ∀α.α for instance, is excluded by syntax of types and that monotypes are included in the polytypes, thus a type has the general form ∀α₁ . . . ∀αₙ.τ.
First, kinds and polymorphic types are different things. You can have a HM type system where all types are of the same kind (*), you could also have a system without polymorphism but with complex kinds.
If a term M is of type ∀a.t, it means that for whatever type s we can substitute s for a in t (often written as t[a:=s] and we'll have that M is of type t[a:=s]. This is somewhat similar to logic, where we can substitute any term for a universally quantified variable, but here we're dealing with types.
This is precisely what happens in Haskell, just that in Haskell you don't see the quantifiers. All type variables that appear in a type signature are implicitly quantified, just as if you had forall in front of the type. For example, map would have type
map :: forall a . forall b . (a -> b) -> [a] -> [b]
etc. Without this implicit universal quantification, type variables a and b would have to have some fixed meaning and map wouldn't be polymorphic.
The HM algorithm distinguishes types (without quantifiers, monotypes) and type schemas (universaly quantified types, polytypes). It's important that at some places it uses type schemas (like in let), but at other places only types are allowed. This makes the whole thing decidable.
I also suggest you to read the article about System F. It is a more complex system, which allows forall anywhere in types (therefore everything there is just called type), but type inference/checking is undecidable. It can help you understand how forall works. System F is described in depth in Girard, Lafont and Taylor, Proofs and Types.
Consider l = \x -> t in Haskell. It is a lambda, which represents a term t fith a variable x, which will be substituted later (e.g. l 1, whatever it would mean) . Similarly, ∀α.σ represents a type with a type variable α, that is, f : ∀α.σ if a function parameterized by a type α. In some sense, σ depends on α, so f returns a value of type σ(α), where α will be substituted in σ(α) later, and we will get some concrete type.
In Haskell you are allowed to omit ∀ and define functions just like id : a -> a. The reason to allowing omitting the quantifier is basically since they are allowed only top level (without RankNTypes extension). You can try this piece of code:
id2 : a -> a -- I named it id2 since id is already defined in Prelude
id2 x = x
If you ask ghci for the type of id(:t id), it will return a -> a. To be more precise (more type theoretic), id has the type ∀a. a -> a. Now, if you add to your code:
val = id2 3
, 3 has the type Int, so the type Int will be substituted into σ and we will get the concrete type Int -> Int.

Functions don't just have types: They ARE Types. And Kinds. And Sorts. Help put a blown mind back together

I was doing my usual "Read a chapter of LYAH before bed" routine, feeling like my brain was expanding with every code sample. At this point I was convinced that I understood the core awesomeness of Haskell, and now just had to understand the standard libraries and type classes so that I could start writing real software.
So I was reading the chapter about applicative functors when all of a sudden the book claimed that functions don't merely have types, they are types, and can be treated as such (For example, by making them instances of type classes). (->) is a type constructor like any other.
My mind was blown yet again, and I immediately jumped out of bed, booted up the computer, went to GHCi and discovered the following:
Prelude> :k (->)
(->) :: ?? -> ? -> *
What on earth does it mean?
If (->) is a type constructor, what are the value constructors? I can take a guess, but would have no idea how define it in traditional data (->) ... = ... | ... | ... format. It's easy enough to do this with any other type constructor: data Either a b = Left a | Right b. I suspect my inability to express it in this form is related to the extremly weird type signature.
What have I just stumbled upon? Higher kinded types have kind signatures like * -> * -> *. Come to think of it... (->) appears in kind signatures too! Does this mean that not only is it a type constructor, but also a kind constructor? Is this related to the question marks in the type signature?
I have read somewhere (wish I could find it again, Google fails me) about being able to extend type systems arbitrarily by going from Values, to Types of Values, to Kinds of Types, to Sorts of Kinds, to something else of Sorts, to something else of something elses, and so on forever. Is this reflected in the kind signature for (->)? Because I've also run into the notion of the Lambda cube and the calculus of constructions without taking the time to really investigate them, and if I remember correctly it is possible to define functions that take types and return types, take values and return values, take types and return values, and take values which return types.
If I had to take a guess at the type signature for a function which takes a value and returns a type, I would probably express it like this:
a -> ?
or possibly
a -> *
Although I see no fundamental immutable reason why the second example couldn't easily be interpreted as a function from a value of type a to a value of type *, where * is just a type synonym for string or something.
The first example better expresses a function whose type transcends a type signature in my mind: "a function which takes a value of type a and returns something which cannot be expressed as a type."
You touch so many interesting points in your question, so I am
afraid this is going to be a long answer :)
Kind of (->)
The kind of (->) is * -> * -> *, if we disregard the boxity GHC
inserts. But there is no circularity going on, the ->s in the
kind of (->) are kind arrows, not function arrows. Indeed, to
distinguish them kind arrows could be written as (=>), and then
the kind of (->) is * => * => *.
We can regard (->) as a type constructor, or maybe rather a type
operator. Similarly, (=>) could be seen as a kind operator, and
as you suggest in your question we need to go one 'level' up. We
return to this later in the section Beyond Kinds, but first:
How the situation looks in a dependently typed language
You ask how the type signature would look for a function that takes a
value and returns a type. This is impossible to do in Haskell:
functions cannot return types! You can simulate this behaviour using
type classes and type families, but let us for illustration change
language to the dependently typed language
Agda. This is a
language with similar syntax as Haskell where juggling types together
with values is second nature.
To have something to work with, we define a data type of natural
numbers, for convenience in unary representation as in
Peano Arithmetic.
Data types are written in
GADT style:
data Nat : Set where
Zero : Nat
Succ : Nat -> Nat
Set is equivalent to * in Haskell, the "type" of all (small) types,
such as Natural numbers. This tells us that the type of Nat is
Set, whereas in Haskell, Nat would not have a type, it would have
a kind, namely *. In Agda there are no kinds, but everything has
a type.
We can now write a function that takes a value and returns a type.
Below is a the function which takes a natural number n and a type,
and makes iterates the List constructor n applied to this
type. (In Agda, [a] is usually written List a)
listOfLists : Nat -> Set -> Set
listOfLists Zero a = a
listOfLists (Succ n) a = List (listOfLists n a)
Some examples:
listOfLists Zero Bool = Bool
listOfLists (Succ Zero) Bool = List Bool
listOfLists (Succ (Succ Zero)) Bool = List (List Bool)
We can now make a map function that operates on listsOfLists.
We need to take a natural number that is the number of iterations
of the list constructor. The base cases are when the number is
Zero, then listOfList is just the identity and we apply the function.
The other is the empty list, and the empty list is returned.
The step case is a bit move involving: we apply mapN to the head
of the list, but this has one layer less of nesting, and mapN
to the rest of the list.
mapN : {a b : Set} -> (a -> b) -> (n : Nat) ->
listOfLists n a -> listOfLists n b
mapN f Zero x = f x
mapN f (Succ n) [] = []
mapN f (Succ n) (x :: xs) = mapN f n x :: mapN f (Succ n) xs
In the type of mapN, the Nat argument is named n, so the rest of
the type can depend on it. So this is an example of a type that
depends on a value.
As a side note, there are also two other named variables here,
namely the first arguments, a and b, of type Set. Type
variables are implicitly universally quantified in Haskell, but
here we need to spell them out, and specify their type, namely
Set. The brackets are there to make them invisible in the
definition, as they are always inferable from the other arguments.
Set is abstract
You ask what the constructors of (->) are. One thing to point out
is that Set (as well as * in Haskell) is abstract: you cannot
pattern match on it. So this is illegal Agda:
cheating : Set -> Bool
cheating Nat = True
cheating _ = False
Again, you can simulate pattern matching on types constructors in
Haskell using type families, one canoical example is given on
Brent Yorgey's blog.
Can we define -> in the Agda? Since we can return types from
functions, we can define an own version of -> as follows:
_=>_ : Set -> Set -> Set
a => b = a -> b
(infix operators are written _=>_ rather than (=>)) This
definition has very little content, and is very similar to doing a
type synonym in Haskell:
type Fun a b = a -> b
Beyond kinds: Turtles all the way down
As promised above, everything in Agda has a type, but then
the type of _=>_ must have a type! This touches your point
about sorts, which is, so to speak, one layer above Set (the kinds).
In Agda this is called Set1:
FunType : Set1
FunType = Set -> Set -> Set
And in fact, there is a whole hierarchy of them! Set is the type of
"small" types: data types in haskell. But then we have Set1,
Set2, Set3, and so on. Set1 is the type of types which mentions
Set. This hierarchy is to avoid inconsistencies such as Girard's
paradox.
As noticed in your question, -> is used for types and kinds in
Haskell, and the same notation is used for function space at all
levels in Agda. This must be regarded as a built in type operator,
and the constructors are lambda abstraction (or function
definitions). This hierarchy of types is similar to the setting in
System F omega, and more
information can be found in the later chapters of
Pierce's Types and Programming Languages.
Pure type systems
In Agda, types can depend on values, and functions can return types,
as illustrated above, and we also had an hierarchy of
types. Systematic investigation of different systems of the lambda
calculi is investigated in more detail in Pure Type Systems. A good
reference is
Lambda Calculi with Types by Barendregt,
where PTS are introduced on page 96, and many examples on page 99 and onwards.
You can also read more about the lambda cube there.
Firstly, the ?? -> ? -> * kind is a GHC-specific extension. The ? and ?? are just there to deal with unboxed types, which behave differently from just * (which has to be boxed, as far as I know). So ?? can be any normal type or an unboxed type (e.g. Int#); ? can be either of those or an unboxed tuple. There is more information here: Haskell Weird Kinds: Kind of (->) is ?? -> ? -> *
I think a function can't return an unboxed type because functions are lazy. Since a lazy value is either a value or a thunk, it has to be boxed. Boxed just means it is a pointer rather than just a value: it's like Integer() vs int in Java.
Since you are probably not going to be using unboxed types in LYAH-level code, you can imagine that the kind of -> is just * -> * -> *.
Since the ? and ?? are basically just more general version of *, they do not have anything to do with sorts or anything like that.
However, since -> is just a type constructor, you can actually partially apply it; for example, (->) e is an instance of Functor and Monad. Figuring out how to write these instances is a good mind-stretching exercise.
As far as value constructors go, they would have to just be lambdas (\ x ->) or function declarations. Since functions are so fundamental to the language, they get their own syntax.

Resources