What does Functor's fmap tell about types?

What does Functor's fmap tell about types? - haskell

What does f a and f b tell me about its type?
class Functor f where
fmap :: (a -> b) -> f a -> f b
I think I get the idea behind standard instances of a functor. However I'm having hard time understanding what f a and f actually represent.
I understand that f a and f b are just types and they must carry information what type constructor was used to create them and type arguments that were used.
Is f a type constructor of kind * -> *? Is (->) r a type constructor just like Maybe is?

I understand that f a and f b are just types and they must carry information what type constructor was used to create them and type arguments that were used.
Good explanation.
Is f a type constructor of kind * -> *?
In effect.
Is (->) r a type constructor just like Maybe is?
In effect, yes:
Yes in the sense that you can apply it to a type like String and get r -> String, just like you can apply Maybe to String to get Maybe String. You can use for f anything that gives you a type from any other type.
..but no...
No, in the sense that Daniel Wagner points out; To be precise, Maybe and [] are type constructors, but (->) r and Either a are sort of like partially applied type constructors. Nevertheless they make good functors, because you can freely apply functions "inside" them and change the type of "the contents".
(Stuff in inverted commas is very hand-wavy imprecise terminology.)

My (possibly mildly tortured) reading of chapter 4 of the Haskell 2010 Report is that Maybe and (->) r are both types, of kind * -> *. Alternatively, the Report also labels them as type expressions—but I can't discern a firm difference in how the Report uses the two terms, except perhaps for surface syntax details. (->) and Maybe are type constructors; type expressions are assembled from type constructors and type variables.
For example, section 4.1.1 ("Kinds") of the 2010 report says (my boldface):
To ensure that they are valid, type expressions are classified into different kinds, which take one of two possible forms:
The symbol ∗ represents the kind of all nullary type constructors.
If κ1 and κ2 are kinds, then κ1 → κ2 is the kind of types that take a type of kind κ1 and return a type of kind κ2.
Section 4.3.2, "Instance Declarations" (my boldface):
An instance declaration that makes the type T to be an instance of class C is called a C-T instance declaration and is subject to these static restrictions:
A type may not be declared as an instance of a particular class more than once in the program.
The class and type must have the same kind; this can be determined using kind inference as described in Section 4.6.
So going by that language, the following instance declaration makes the type (->) r to be an instance of the class Functor:
instance Functor ((->) r) where
fmap f g = f . g
The funny thing about this terminology is that we call (->) r a "type" even though there are no expressions in Haskell that have that type—not even undefined:
foo :: (->) r
foo = undefined
{-
[1 of 1] Compiling Main ( ../src/scratch.hs, interpreted )
../src/scratch.hs:1:8:
Expecting one more argument to `(->) r'
In the type signature for `foo': foo :: (->) r
-}
But I think that's not a big deal. Basically, all declarations in Haskell must have types of kind *.
As a side note, from my limited understanding of dependently typed languages, many of these lack Haskell's firm distinction between terms and types, so that something like (->) Boolean is an expression whose value is a function that takes a type as its argument and produces a type as its result.

Related

Clarification of Terms around Haskell Type system

Type system in haskell seem to be very Important and I wanted to clarify some terms revolving around haskell type system.
Some type classes
Functor
Applicative
Monad
After using :info I found that Functor is a type class, Applicative is a type class with => (deriving?) Functor and Monad deriving Applicative type class.
I've read that Maybe is a Monad, does that mean Maybe is also Applicative and Functor?
-> operator
When i define a type
data Maybe = Just a | Nothing
and check :t Just I get Just :: a -> Maybe a. How to read this -> operator?
It confuses me with the function where a -> b means it evaluates a to b (sort of returns a maybe) – I tend to think lhs to rhs association but it turns when defining types?
The term type is used in ambiguous ways, Type, Type Class, Type Constructor, Concrete Type etc... I would like to know what they mean to be exact

Indeed the word “type” is used in somewhat ambiguous ways.
The perhaps most practical way to look at it is that a type is just a set of values. For example, Bool is the finite set containing the values True and False.Mathematically, there are subtle differences between the concepts of set and type, but they aren't really important for a programmer to worry about. But you should in general consider the sets to be infinite, for example Integer contains arbitrarily big numbers.
The most obvious way to define a type is with a data declaration, which in the simplest case just lists all the values:
data Colour = Red | Green | Blue
There we have a type which, as a set, contains three values.
Concrete type is basically what we say to make it clear that we mean the above: a particular type that corresponds to a set of values. Bool is a concrete type, that can easily be understood as a data definition, but also String, Maybe Integer and Double -> IO String are concrete types, though they don't correspond to any single data declaration.
What a concrete type can't have is type variables†, nor can it be an incompletely applied type constructor. For example, Maybe is not a concrete type.
So what is a type constructor? It's the type-level analogue to value constructors. What we mean mathematically by “constructor” in Haskell is an injective function, i.e. a function f where if you're given f(x) you can clearly identify what was x. Furthermore, any different constructors are assumed to have disjoint ranges, which means you can also identify f.‡
Just is an example of a value constructor, but it complicates the discussion that it also has a type parameter. Let's consider a simplified version:
data MaybeInt = JustI Int | NothingI
Now we have
JustI :: Int -> MaybeInt
That's how JustI is a function. Like any function of the same signature, it can be applied to argument values of the right type, like, you can write JustI 5.What it means for this function to be injective is that I can define a variable, say,
quoxy :: MaybeInt
quoxy = JustI 9328
and then I can pattern match with the JustI constructor:
> case quoxy of { JustI n -> print n }
9328
This would not be possible with a general function of the same signature:
foo :: Int -> MaybeInt
foo i = JustI $ negate i
> case quoxy of { foo n -> print n }
<interactive>:5:17: error: Parse error in pattern: foo
Note that constructors can be nullary, in which case the injective property is meaningless because there is no contained data / arguments of the injective function. Nothing and True are examples of nullary constructors.
Type constructors are the same idea as value constructors: type-level functions that can be pattern-matched. Any type-name defined with data is a type constructor, for example Bool, Colour and Maybe are all type constructors. Bool and Colour are nullary, but Maybe is a unary type constructor: it takes a type argument and only the result is then a concrete type.
So unlike value-level functions, type-level functions are kind of by default type constructors. There are also type-level functions that aren't constructors, but they require -XTypeFamilies.
A type class may be understood as a set of types, in the same vein as a type can be seen as a set of values. This is not quite accurate, it's closer to true to say a class is a set of type constructors but again it's not as useful to ponder the mathematical details – better to look at examples.
There are two main differences between type-as-set-of-values and class-as-set-of-types:
How you define the “elements”: when writing a data declaration, you need to immediately describe what values are allowed. By contrast, a class is defined “empty”, and then the instances are defined later on, possibly in a different module.
How the elements are used. A data type basically enumerates all the values so they can be identified again. Classes meanwhile aren't generally concerned with identifying types, rather they specify properties that the element-types fulfill. These properties come in the form of methods of a class. For example, the instances of the Num class are types that have the property that you can add elements together.
You could say, Haskell is statically typed on the value level (fixed sets of values in each type), but duck-typed on the type level (classes just require that somebody somewhere implements the necessary methods).
A simplified version of the Num example:
class Num a where
(+) :: a -> a -> a
instance Num Int where
0 + x = x
x + y = ...
If the + operator weren't already defined in the prelude, you would now be able to use it with Int numbers. Then later on, perhaps in a different module, you could also make it usable with new, custom number types:
data MyNumberType = BinDigits [Bool]
instance Num MyNumberType where
BinDigits [] + BinDigits l = BinDigits l
BinDigits (False:ds) + BinDigits (False:es)
= BinDigits (False : ...)
Unlike Num, the Functor...Monad type classes are not classes of types, but of 1-ary type constructors. I.e. every functor is a type constructor taking one argument to make it a concrete type. For instance, recall that Maybe is a 1-ary type constructor.
class Functor f where
fmap :: (a->b) -> f a -> f b
instance Functor Maybe where
fmap f (Just a) = Just (f a)
fmap _ Nothing = Nothing
As you have concluded yourself, Applicative is a subclass of Functor. D being a subclass of C means basically that D is a subset of the set of type constructors in C. Therefore, yes, if Maybe is an instance of Monad it also is an instance of Functor.
†That's not quite true: if you consider the _universal quantor_ explicitly as part of the type, then a concrete type can contain variables. This is a bit of an advanced subject though.
‡This is not guaranteed to be true if the -XPatternSynonyms extension is used.

Types constructors and existential types

Only polymorphic function can be applied to values of existential types.
Those properties can be expressed by the corresponding quantifiers for expressions, and characterized by natural transformations.
Similarly, when we define a type constructor
data List a = Nil | Cons a (List a)
This type constructor works for all a whereas type families allows to have non uniform type constructors
type family TRes i o
type instance TRes Bool = String
type instance TRes String = Bool
What natural transformation characterizes precisely this idea of "uniformity" at type level ?
Is there an equivalent of forcing naturality like we have at value level with rank-n types ?
ApplyNat :: (forall a. a -> F a) -> b -> F b

I think you've confused a couple of different ideas here.
This type constructor works for all a.
That's totality. List :: * -> * produces a valid type of kind * given any argument a of kind *. Haskell 98 datatypes are always total, but, as you point out, in modern Haskell you can write type families which don't cover all possible cases. TRes Int is not a "real" type, in the sense that it contains no values, it doesn't reduce to any other type, and it's not equal to any type other than TRes Int.
Haskell has no totality checker at the value level or the type level (apart from the rules about undecidable instances, which are a blunt instrument), so, just as there is no way to rule out undefined values, there is no way to rule out "stuck" type families like TRes Int. (For more on "stuck" type families see this blog post by Richard Eisenberg, the designer of TypeInType.)
Naturality is an altogether different idea. In value-level Haskell, a natural transformation between f and g is a polymorphic function mapping values of type f x to values of type g x, without knowing anything about x.
type f ~> g = forall x. f x -> g x
With GHC 8 and TypeInType we can talk about kinds using the same language we use to talk about types, because kinds are types. The type expression forall x. f x -> g x has kind * ((~>) :: forall k. (k -> *) -> (k -> *) -> *), so it's a perfectly valid classifier for types as well. A type with that kind is a polymorphic type function mapping types of kind f x to types of kind g x.
What would you use a type-level natural transformation for, in the real world? I dunno. You wouldn't, probably.

Understanding Polytypes in Hindley-Milner Type Inference

I'm reading the Wikipedia article on Hindley–Milner Type Inference trying to make some sense out of it. So far this is what I've understood:
Types are classified as either monotypes or polytypes.
Monotypes are further classified as either type constants (like int or string) or type variables (like α and β).
Type constants can either be concrete types (like int and string) or type constructors (like Map and Set).
Type variables (like α and β) behave as placeholders for concrete types (like int and string).
Now I'm having a little difficulty understanding polytypes but after learning a bit of Haskell this is what I make of it:
Types themselves have types. Formally types of types are called kinds (i.e. there are different kinds of types).
Concrete types (like int and string) and type variables (like α and β) are of kind *.
Type constructors (like Map and Set) are lambda abstractions of types (e.g. Set is of kind * -> * and Map is of kind * -> * -> *).
What I don't understand is what do qualifiers signify. For example what does ∀α.σ represent? I can't seem to make heads or tails of it and the more I read the following paragraph the more confused I get:
A function with polytype ∀α.α -> α by contrast can map any value of the same type to itself, and the identity function is a value for this type. As another example ∀α.(Set α) -> int is the type of a function mapping all finite sets to integers. The count of members is a value for this type. Note that qualifiers can only appear top level, i.e. a type ∀α.α -> ∀α.α for instance, is excluded by syntax of types and that monotypes are included in the polytypes, thus a type has the general form ∀α₁ . . . ∀αₙ.τ.

First, kinds and polymorphic types are different things. You can have a HM type system where all types are of the same kind (*), you could also have a system without polymorphism but with complex kinds.
If a term M is of type ∀a.t, it means that for whatever type s we can substitute s for a in t (often written as t[a:=s] and we'll have that M is of type t[a:=s]. This is somewhat similar to logic, where we can substitute any term for a universally quantified variable, but here we're dealing with types.
This is precisely what happens in Haskell, just that in Haskell you don't see the quantifiers. All type variables that appear in a type signature are implicitly quantified, just as if you had forall in front of the type. For example, map would have type
map :: forall a . forall b . (a -> b) -> [a] -> [b]
etc. Without this implicit universal quantification, type variables a and b would have to have some fixed meaning and map wouldn't be polymorphic.
The HM algorithm distinguishes types (without quantifiers, monotypes) and type schemas (universaly quantified types, polytypes). It's important that at some places it uses type schemas (like in let), but at other places only types are allowed. This makes the whole thing decidable.
I also suggest you to read the article about System F. It is a more complex system, which allows forall anywhere in types (therefore everything there is just called type), but type inference/checking is undecidable. It can help you understand how forall works. System F is described in depth in Girard, Lafont and Taylor, Proofs and Types.

Consider l = \x -> t in Haskell. It is a lambda, which represents a term t fith a variable x, which will be substituted later (e.g. l 1, whatever it would mean) . Similarly, ∀α.σ represents a type with a type variable α, that is, f : ∀α.σ if a function parameterized by a type α. In some sense, σ depends on α, so f returns a value of type σ(α), where α will be substituted in σ(α) later, and we will get some concrete type.
In Haskell you are allowed to omit ∀ and define functions just like id : a -> a. The reason to allowing omitting the quantifier is basically since they are allowed only top level (without RankNTypes extension). You can try this piece of code:
id2 : a -> a -- I named it id2 since id is already defined in Prelude
id2 x = x
If you ask ghci for the type of id(:t id), it will return a -> a. To be more precise (more type theoretic), id has the type ∀a. a -> a. Now, if you add to your code:
val = id2 3
, 3 has the type Int, so the type Int will be substituted into σ and we will get the concrete type Int -> Int.

What are type quantifiers?

Many statically typed languages have parametric polymorphism. For example in C# one can define:
T Foo<T>(T x){ return x; }
In a call site you can do:
int y = Foo<int>(3);
These types are also sometimes written like this:
Foo :: forall T. T -> T
I have heard people say "forall is like lambda-abstraction at the type level". So Foo is a function that takes a type (for example int), and produces a value (for example a function of type int -> int). Many languages infer the type parameter, so that you can write Foo(3) instead of Foo<int>(3).
Suppose we have an object f of type forall T. T -> T. What we can do with this object is first pass it a type Q by writing f<Q>. Then we get back a value with type Q -> Q. However, certain f's are invalid. For example this f:
f<int> = (x => x+1)
f<T> = (x => x)
So if we "call" f<int> then we get back a value with type int -> int, and in general if we "call" f<Q> then we get back a value with type Q -> Q, so that's good. However, it is generally understood that this f is not a valid thing of type forall T. T -> T, because it does something different depending on which type you pass it. The idea of forall is that this is explicitly not allowed. Also, if forall is lambda for the type level, then what is exists? (i.e. existential quantification). For these reasons it seems that forall and exists are not really "lambda at the type level". But then what are they? I realize this question is rather vague, but can somebody clear this up for me?
A possible explanation is the following:
If we look at logic, quantifiers and lambda are two different things. An example of a quantified expression is:
forall n in Integers: P(n)
So there are two parts to forall: a set to quantify over (e.g. Integers), and a predicate (e.g. P). Forall can be viewed as a higher order function:
forall n in Integers: P(n) == forall(Integers,P)
With type:
forall :: Set<T> -> (T -> bool) -> bool
Exists has the same type. Forall is like an infinite conjunction, where S[n] is the n-th elemen to of the set S:
forall(S,P) = P(S[0]) ∧ P(S[1]) ∧ P(S[2]) ...
Exists is like an infinite disjunction:
exists(S,P) = P(S[0]) ∨ P(S[1]) ∨ P(S[2]) ...
If we do an analogy with types, we could say that the type analogue of ∧ is computing the intersection type ∩, and the type analogue of ∨ computing the union type ∪. We could then define forall and exists on types as follows:
forall(S,P) = P(S[0]) ∩ P(S[1]) ∩ P(S[2]) ...
exists(S,P) = P(S[0]) ∪ P(S[1]) ∪ P(S[2]) ...
So forall is an infinite intersection, and exists is an infinite union. Their types would be:
forall, exists :: Set<T> -> (T -> Type) -> Type
For example the type of the polymorphic identity function. Here Types is the set of all types, and -> is the type constructor for functions and => is lambda abstraction:
forall(Types, t => (t -> t))
Now a thing of type forall T:Type. T -> T is a value, not a function from types to values. It is a value whose type is the intersection of all types T -> T where T ranges over all types. When we use such a value, we do not have to apply it to a type. Instead, we use a subtype judgement:
id :: forall T:Type. T -> T
id = (x => x)
id2 = id :: int -> int
This downcasts id to have type int -> int. This is valid because int -> int also appears in the infinite intersection.
This works out nicely I think, and it clearly explains what forall is and how it is different from lambda, but this model is incompatible with what I have seen in languages like ML, F#, C#, etc. For example in F# you do id<int> to get the identity function on ints, which does not make sense in this model: id is a function on values, not a function on types that returns a function on values.
Can somebody with knowledge of type theory explain what exactly are forall and exists? And to what extent is it true that "forall is lambda at the type level"?

Let me address your questions separately.
Calling forall "a lambda at the type level" is inaccurate for two reasons. First, it is the type of a lambda, not the lambda itself. Second, that lambda lives on the term level, even though it abstracts over types (lambdas on the type level exist as well, they provide what is often called generic types).
Universal quantification does not necessarily imply "same behaviour" for all instantiations. That is a particular property called "parametricity" that may or may not be present. The plain polymorphic lambda calculus is parametric, because you simply cannot express any non-parametric behaviour. But if you add constructs like typecase (a.k.a. intensional type analysis) or checked casts as a weaker form of that, then you loose parametricity. Parametricity implies nice properties, e.g. it allows a language to be implemented without any runtime representation of types. And it induces very strong reasoning principles, see e.g. Wadler's paper "Theorems for free!". But it's a trade-off, sometimes you want dispatch on types.
Existential types essentially denote pairs of a type (the so-called witness) and a term, sometimes called packages. One common way to view these is as implementation of abstract data types. Here is a simple example:
pack (Int, (λx. x, λx. x)) : ∃ T. (Int → T) × (T → Int)
This is a simple ADT whose representation is Int and that only provides two operations (as a nested tuple), for converting ints in and out of the abstract type T. This is the basis of type theories for modules, for example.
In summary, universal quantification provides client-side data abstraction, while existential types dually provides implementor-side data abstraction.
As an additional remark, in the so-called lambda cube, forall and arrow are generalised to the unified notion of Π-type (where T1→T2 = Π(x:T1).T2 and ∀A.T = Π(A:&ast;).T) and likewise exists and tupling can be generalised to Σ-types (where T1×T2 = Σ(x:T1).T2 and ∃A.T = Σ(A:&ast;).T). Here, the type &ast; is the "type of types".

A few remarks to complement the two already-excellent answers.
First, one cannot say that forall is lambda at the type-level because there already is a notion of lambda at the type level, and it is different from forall. It appears in system F_omega, an extension of System F with type-level computation, that is useful to explain ML modules systems for example (F-ing modules, by Andreas Rossberg, Claudio Russo and Derek Dreyer, 2010).
In (a syntax for) System F_omega you can write for example:
type prod =
lambda (a : *). lambda (b : *).
forall (c : *). (a -> b -> c) -> c
This is a definition of the "type constructor" prod, such as prod a b is the type of the church-encoding of the product type (a, b). If there is computation at the type level, then you need to control it if you want to ensure termination of type-checking (otherwise you could define the type (lambda t. t t) (lambda t. t t). This is done by using a "type system at the type level", or a kind system. prod would be of kind * -> * -> *. Only the types at kind * can be inhabited by values, types at higher-kind can only be applied at the type level. lambda (c : k) . .... is a type-level abstraction that cannot be the type of a value, and may live at any kind of the form k -> ..., while forall (c : k) . .... classify values that are polymorphic in some type c : k and is necessarily of ground kind *.
Second, there is an important difference between the forall of System F and the Pi-types of Martin-Löf type theory. In System F, polymorphic values do the same thing on all types. As a first approximation, you could say that a value of type forall a . a -> a will (implicitly) take a type t as input and return a value of type t -> t. But that suggest that there may be some computation happening in the process, which is not the case. Morally, when you instantiate a value of type forall a. a -> a into a value of type t -> t, the value does not change. There are three (related) ways to think about it:
System F quantification has type erasure, you can forget about the types and you will still know what the dynamic semantic of the program is. When we use ML type inference to leave the polymorphism abstraction and instantiation implicit in our programs, we don't really let the inference engine "fill holes in our program", if you think of "program" as the dynamic object that will be run and compute.
A forall a . foo is not a something that "produces an instance of foo for each type a, but a single type foo that is "generic in an unknown type a".
You can explain universal quantification as an infinite conjunction, but there is an uniformity condition that all conjuncts have the same structure, and in particular that their proofs are all alike.
By contrast, Pi-types in Martin-Löf type theory are really more like function types that take something and return something. That's one of the reason why they can easily be used not only to depend on types, but also to depend on terms (dependent types).
This has very important implications once you're concerned about the soundness of those formal theories. System F is impredicative (a forall-quantified type quantifies on all types, itself included), and the reason why it's still sound is this uniformity of universal quantification. While introducing non-parametric constructs is reasonable from a programmer's point of view (and we can still reason about parametricity in an generally-non-parametric language), it very quickly destroys the logical consistency of the underlying static reasoning system. Martin-Löf predicative theory is much simpler to prove correct and to extend in correct way.
For a high-level description of this uniformity/genericity aspect of System F, see Fruchart and Longo's 97 article Carnap's remarks on Impredicative Definitions and the Genericity Theorem. For a more technical study of System F failure in presence of non-parametric constructs, see Parametricity and variants of Girard's J operator by Robert Harper and John Mitchell (1999). Finally, for a description, from a language design point of view, on how to abandon global parametricity to introduce non-parametric constructs but still be able to locally discuss parametricity, see Non-Parametric Parametricity by George Neis, Derek Dreyer and Andreas Rossberg, 2011.
This discussion of the difference between "computational abstraction" and "uniform abstract" has been revived by the large amount of work on representing variable binders. A binding construction feels like an abstraction (and can be modeled by a lambda-abstraction in HOAS style) but has an uniform structure that makes it rather like a data skeleton than a family of results. This has been much discussed, for example in the LF community, "representational arrows" in Twelf, "positive arrows" in Licata&Harper's work, etc.
Recently there have been several people working on the related notion of "irrelevance" (lambda-abstractions where the result "does not depend" on the argument), but it's still not totally clear how closely this is related to parametric polymorphism. One example is the work of Nathan Mishra-Linger with Tim Sheard (eg. Erasure and Polymorphism in Pure Type Systems).

if forall is lambda ..., then what is exists
Why, tuple of course!
In Martin-Löf type theory you have Π types, corresponding to functions/universal quantification and Σ-types, corresponding to tuples/existential quantification.
Their types are very similar to what you have proposed (I am using Agda notation here):
Π : (A : Set) -> (A -> Set) -> Set
Σ : (A : Set) -> (A -> Set) -> Set
Indeed, Π is an infinite product and Σ is infinite sum. Note that they are not "intersection" and "union" though, as you proposed because you can't do that without additionally defining where the types intersect. (which values of one type correspond to which values of the other type)
From these two type constructors you can have all of normal, polymorphic and dependent functions, normal and dependent tuples, as well as existentially and universally-quantified statements:
-- Normal function, corresponding to "Integer -> Integer" in Haskell
factorial : Π ℕ (λ _ → ℕ)
-- Polymorphic function corresponding to "forall a . a -> a"
id : Π Set (λ A -> Π A (λ _ → A))
-- A universally-quantified logical statement: all natural numbers n are equal to themselves
refl : Π ℕ (λ n → n ≡ n)
-- (Integer, Integer)
twoNats : Σ ℕ (λ _ → ℕ)
-- exists a. Show a => a
someShowable : Σ Set (λ A → Σ A (λ _ → Showable A))
-- There are prime numbers
aPrime : Σ ℕ IsPrime
However, this does not address parametricity at all and AFAIK parametricity and Martin-Löf type theory are independent.
For parametricity, people usually refer to the Philip Wadler's work.

Functions don't just have types: They ARE Types. And Kinds. And Sorts. Help put a blown mind back together

I was doing my usual "Read a chapter of LYAH before bed" routine, feeling like my brain was expanding with every code sample. At this point I was convinced that I understood the core awesomeness of Haskell, and now just had to understand the standard libraries and type classes so that I could start writing real software.
So I was reading the chapter about applicative functors when all of a sudden the book claimed that functions don't merely have types, they are types, and can be treated as such (For example, by making them instances of type classes). (->) is a type constructor like any other.
My mind was blown yet again, and I immediately jumped out of bed, booted up the computer, went to GHCi and discovered the following:
Prelude> :k (->)
(->) :: ?? -> ? -> *
What on earth does it mean?
If (->) is a type constructor, what are the value constructors? I can take a guess, but would have no idea how define it in traditional data (->) ... = ... | ... | ... format. It's easy enough to do this with any other type constructor: data Either a b = Left a | Right b. I suspect my inability to express it in this form is related to the extremly weird type signature.
What have I just stumbled upon? Higher kinded types have kind signatures like * -> * -> *. Come to think of it... (->) appears in kind signatures too! Does this mean that not only is it a type constructor, but also a kind constructor? Is this related to the question marks in the type signature?
I have read somewhere (wish I could find it again, Google fails me) about being able to extend type systems arbitrarily by going from Values, to Types of Values, to Kinds of Types, to Sorts of Kinds, to something else of Sorts, to something else of something elses, and so on forever. Is this reflected in the kind signature for (->)? Because I've also run into the notion of the Lambda cube and the calculus of constructions without taking the time to really investigate them, and if I remember correctly it is possible to define functions that take types and return types, take values and return values, take types and return values, and take values which return types.
If I had to take a guess at the type signature for a function which takes a value and returns a type, I would probably express it like this:
a -> ?
or possibly
a -> *
Although I see no fundamental immutable reason why the second example couldn't easily be interpreted as a function from a value of type a to a value of type *, where * is just a type synonym for string or something.
The first example better expresses a function whose type transcends a type signature in my mind: "a function which takes a value of type a and returns something which cannot be expressed as a type."

You touch so many interesting points in your question, so I am
afraid this is going to be a long answer :)
Kind of (->)
The kind of (->) is * -> * -> *, if we disregard the boxity GHC
inserts. But there is no circularity going on, the ->s in the
kind of (->) are kind arrows, not function arrows. Indeed, to
distinguish them kind arrows could be written as (=>), and then
the kind of (->) is * => * => *.
We can regard (->) as a type constructor, or maybe rather a type
operator. Similarly, (=>) could be seen as a kind operator, and
as you suggest in your question we need to go one 'level' up. We
return to this later in the section Beyond Kinds, but first:
How the situation looks in a dependently typed language
You ask how the type signature would look for a function that takes a
value and returns a type. This is impossible to do in Haskell:
functions cannot return types! You can simulate this behaviour using
type classes and type families, but let us for illustration change
language to the dependently typed language
Agda. This is a
language with similar syntax as Haskell where juggling types together
with values is second nature.
To have something to work with, we define a data type of natural
numbers, for convenience in unary representation as in
Peano Arithmetic.
Data types are written in
GADT style:
data Nat : Set where
Zero : Nat
Succ : Nat -> Nat
Set is equivalent to * in Haskell, the "type" of all (small) types,
such as Natural numbers. This tells us that the type of Nat is
Set, whereas in Haskell, Nat would not have a type, it would have
a kind, namely *. In Agda there are no kinds, but everything has
a type.
We can now write a function that takes a value and returns a type.
Below is a the function which takes a natural number n and a type,
and makes iterates the List constructor n applied to this
type. (In Agda, [a] is usually written List a)
listOfLists : Nat -> Set -> Set
listOfLists Zero a = a
listOfLists (Succ n) a = List (listOfLists n a)
Some examples:
listOfLists Zero Bool = Bool
listOfLists (Succ Zero) Bool = List Bool
listOfLists (Succ (Succ Zero)) Bool = List (List Bool)
We can now make a map function that operates on listsOfLists.
We need to take a natural number that is the number of iterations
of the list constructor. The base cases are when the number is
Zero, then listOfList is just the identity and we apply the function.
The other is the empty list, and the empty list is returned.
The step case is a bit move involving: we apply mapN to the head
of the list, but this has one layer less of nesting, and mapN
to the rest of the list.
mapN : {a b : Set} -> (a -> b) -> (n : Nat) ->
listOfLists n a -> listOfLists n b
mapN f Zero x = f x
mapN f (Succ n) [] = []
mapN f (Succ n) (x :: xs) = mapN f n x :: mapN f (Succ n) xs
In the type of mapN, the Nat argument is named n, so the rest of
the type can depend on it. So this is an example of a type that
depends on a value.
As a side note, there are also two other named variables here,
namely the first arguments, a and b, of type Set. Type
variables are implicitly universally quantified in Haskell, but
here we need to spell them out, and specify their type, namely
Set. The brackets are there to make them invisible in the
definition, as they are always inferable from the other arguments.
Set is abstract
You ask what the constructors of (->) are. One thing to point out
is that Set (as well as * in Haskell) is abstract: you cannot
pattern match on it. So this is illegal Agda:
cheating : Set -> Bool
cheating Nat = True
cheating _ = False
Again, you can simulate pattern matching on types constructors in
Haskell using type families, one canoical example is given on
Brent Yorgey's blog.
Can we define -> in the Agda? Since we can return types from
functions, we can define an own version of -> as follows:
_=>_ : Set -> Set -> Set
a => b = a -> b
(infix operators are written _=>_ rather than (=>)) This
definition has very little content, and is very similar to doing a
type synonym in Haskell:
type Fun a b = a -> b
Beyond kinds: Turtles all the way down
As promised above, everything in Agda has a type, but then
the type of _=>_ must have a type! This touches your point
about sorts, which is, so to speak, one layer above Set (the kinds).
In Agda this is called Set1:
FunType : Set1
FunType = Set -> Set -> Set
And in fact, there is a whole hierarchy of them! Set is the type of
"small" types: data types in haskell. But then we have Set1,
Set2, Set3, and so on. Set1 is the type of types which mentions
Set. This hierarchy is to avoid inconsistencies such as Girard's
paradox.
As noticed in your question, -> is used for types and kinds in
Haskell, and the same notation is used for function space at all
levels in Agda. This must be regarded as a built in type operator,
and the constructors are lambda abstraction (or function
definitions). This hierarchy of types is similar to the setting in
System F omega, and more
information can be found in the later chapters of
Pierce's Types and Programming Languages.
Pure type systems
In Agda, types can depend on values, and functions can return types,
as illustrated above, and we also had an hierarchy of
types. Systematic investigation of different systems of the lambda
calculi is investigated in more detail in Pure Type Systems. A good
reference is
Lambda Calculi with Types by Barendregt,
where PTS are introduced on page 96, and many examples on page 99 and onwards.
You can also read more about the lambda cube there.

Firstly, the ?? -> ? -> * kind is a GHC-specific extension. The ? and ?? are just there to deal with unboxed types, which behave differently from just * (which has to be boxed, as far as I know). So ?? can be any normal type or an unboxed type (e.g. Int#); ? can be either of those or an unboxed tuple. There is more information here: Haskell Weird Kinds: Kind of (->) is ?? -> ? -> *
I think a function can't return an unboxed type because functions are lazy. Since a lazy value is either a value or a thunk, it has to be boxed. Boxed just means it is a pointer rather than just a value: it's like Integer() vs int in Java.
Since you are probably not going to be using unboxed types in LYAH-level code, you can imagine that the kind of -> is just * -> * -> *.
Since the ? and ?? are basically just more general version of *, they do not have anything to do with sorts or anything like that.
However, since -> is just a type constructor, you can actually partially apply it; for example, (->) e is an instance of Functor and Monad. Figuring out how to write these instances is a good mind-stretching exercise.
As far as value constructors go, they would have to just be lambdas (\ x ->) or function declarations. Since functions are so fundamental to the language, they get their own syntax.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string