Omitting data type constructors - haskell

I'm trying to implement the following in Haskell:
0,1,2,...:N
x,y,z,...:V
+,*,-,/,...:F
F alias for Expr -> Expr -> Expr
Expr := N|V|F Expr Expr
My question is first:
Is the grammar flawed at type level? Does it make sense? All terms look like they'd type check (allowing for 0,1,... to be both Expr and N subtype, and x,y,... to be both Expr and V subtype).
And secondarily, what's the closest Haskell implementation? My current Haskell implementation is:
data F = +|-|*|...
data Expr = N|V|MakeExpr F Expr Expr
Any suggestions?
EDIT -
The key difference between the grammar and implementation is that type constructor is implicit /omitted in the grammar. Why are type constructors compulsory in Haskell?

The key difference between the grammar and implementation is that type constructor is implicit /omitted in the grammar. Why are type constructors compulsory in Haskell?
.
Is the grammar flawed at type level? Does it make sense? All terms look like they'd type check (allowing for 0,1,... to be both Expr and N subtype, and x,y,... to be both Expr and V subtype
The reason data constructors1 are compulsory in Haskell is specifically to ensure that you can't have x, y, .. be both Expr and V subtypes.
So your grammar looks like a reasonable model for how you want your language terms to work. But it doesn't make sense as a direct design for how you want to represent your language terms as Haskell data types.
Basically, Haskell deliberately does not have subtypes. It ensures that when you create a new type (with newtype or data) that all of the values of the new type are distinct from the values of all other existing types (and all types that will be created in future). It does this by having values of user-defined types always appear inside constructors (and making it impossible to "reuse" constructors; you always make new ones whenever you make a new type).
The way Haskell's type system works depends on this lack of subtyping. You could design a language that allowed subtypes (see Scala, perhaps). But it just fundamentally wouldn't be Haskell.
But what you can do instead is define something like:
data Expr
= ExprN N
| ExprV V
| ExprF Expr Expr
You still can't have a N value and just use it as an Expr. But you can just apply ExprN to it, and then you have an Expr. And it's really no more burden than if Haskell allowed you to use some n of type N as an Expr as well, but only required you to add a type annotation clarifying that that's what you meant; you just have to say ExprN n instead of n :: Expr.
Similarly when you have an Expr and you want to apply a function on N to it, the case statement to extract the N from the ExprN constructor (if it's there) isn't really any more code than you'd have to write to check if your Expr was actually an N.
1 "Type constructor" is a specific term in Haskell, which isn't what we're talking about here. I'm pretty sure what you meant by that was "the constructors for a type", but to be pedantic you accidentally referred to a different thing by using that term.
To clear it up, when you declare a type like data Maybe a = Nothing | Just a, Nothing and Just are new data constructors ("constructor" on its on is extremely likely to mean a data constructor) and Maybe is a new type constructor.

Related

Can Haskell type synonyms be used as type constructors?

I'm writing a benchmark to compare the performance of a number of Haskell collections, including STArray, on a given task. To eliminate repetition, I'm trying to write a set of functions that provide a uniform interface to these collections, so that I can implement the task as a polymorphic higher-order function. More specifically, the task is implemented in terms of a polymorphic monad, which is ST s for STArray, and Identity for collections like HashMap, that do not typically need to be manipulated within a monad.
Due to uniformity requirements, I can't use the Identity and HashMap types directly, as I need their kinds to match the kinds of ST and STArray. I thought that the simplest way to achieve this would be to define type synonyms with phantom parameters:
type Identity' s a = Identity a
type HashMap' s i e = HashMap i e
-- etc.
Unfortunately this doesn't work, because when I try to use these synonyms as type constructors in places where I use ST and STArray as type constructors, GHC gives errors like:
The type synonym ‘Identity'’ should have 2 arguments, but has been given none
I came across the -XLiberalTypeSynonyms GHC extension, and thought it would allow me to do this, as the documentation says:
You can apply a type synonym to a partially applied type synonym
and gives this example of doing so:
type Generic i o = forall x. i x -> o x
type Id x = x
foo :: Generic Id []
That example works in GHC 8.0.2 (with -XExistentialQuantification and -XRank2Types). But replacing Generic with a newtype or data declaration, as needed in my use case, does not work.
I.e. the following code leads to the same kind of error that I reported above:
newtype Generic i o = Generic (forall x. i x -> o x)
type Id x = x
foo :: Generic Id []
foo = Generic (\x -> [x])
Question
Is there some other extension that I need to enable to get this to work? If not, is there a good reason why this doesn't work, or is it just an oversight?
Workaround
I'm aware that I can work around this by defining Identity', etc. as fully-fledged types, e.g.:
newtype Identity' s a = Identity' a
newtype Collection collection s i e = Collection (collection i e)
-- etc.
This is not ideal though, as it means that I have to reimplement Identity's Functor, Applicative and Monad instances for Identity', and it means that I have to write additional wrapping and unwrapping code for the collections.

What is the best practice to generate data which satisfy specific property in QuickCheck?

When we are using QuickCheck to check our programs, we need to define generators for our data, there is some generic way to define them, but the generic way usually become useless when we need the generated data to satisfy some constraints to work.
e.g.
data Expr
= LitI Int
| LitB Bool
| Add Expr Expr
| And Expr Expr
data TyRep = Number | Boolean
typeInfer :: Expr -> Maybe TyRep
typeInfer e = case e of
LitI _ -> Number
LitB _ -> Boolean
Add e1 e2 -> case (typeInfer e1, typeInfer e2) of
(Just Number, Just Number) -> Just Number
_ -> Nothing
And e1 e2 -> case (typeInfer e1, typeInfer e2) of
(Just Boolean, Just Boolean) -> Just Boolean
_ -> Nothing
now I need to define generator of Expr (i.e. Gen Expr or instance Arbitrary Expr), but also want it generates the type correct ones (i.e. isJust (typeInfer generatedExpr))
a naive way to do that is use suchThat to filter out the invalid ones, but that is obviously inefficient when Expr and TyRep becomes complicated with more cases.
Another similar situation is about reference integrity, e.g.
data Expr
= LitI Int
| LitB Bool
| Add Expr Expr
| And Expr Expr
| Ref String -- ^ reference another Expr via it's name
type Context = Map String Expr
In this case, we want all the referenced names in the generated Expr are contained in some specific Context, now I have to generate Expr for specific Context:
arbExpr :: Context -> Gen Expr
but now shrink will be a problem, and to solve this problem, I have to define a specific version of shrink, and use forAllShrink everytime I use arbExpr, that means a lot of work.
So I want to know, is there a best practice to do such things?
For well-typed terms, a simple approach in many cases is to have one generator for each type, or, equivalently, a function TyRep -> Gen Expr. Adding variables on top of that, this usually turns into a function Context -> TyRep -> Gen Expr.
In the case of generating terms with variables (and with no or very simple types), indexing the type of terms by the context (e.g., like you would do using the bound library) should make it quite easy to derive a generator generically.
For shrinking, hedgehog's approach can work quite well, where Gen generates a value together with shrunk versions, sparing you from defining a separate shrinking function.
Note that as the well-formedness/typing relation becomes more complex, you start hitting the theoretical wall where generating terms is at least as hard as arbitrary proof search.
For more advanced techniques/related literature, with my own comments about possibly using it in Haskell:
Generating Constrained Data with Uniform Distribution, by Claessen et al., FLOPS'14 (PDF). I believe the Haskell package lazy-search has most of the machinery described by the paper, but it seems aimed at enumeration rather than random generation.
Making Random Judgments: Automatically Generating Well-Typed Terms from the Definition of a Type-System, by Fetscher et al., ESOP'15 (PDF), the title says it all. I don't know about a Haskell implementation though; you might want to ask the authors.
Beginner's Luck: A Language for Property-Based Generators, by Lampropoulos et al., POPL'17 (PDF) (disclaimer: I'm a coauthor). A language of properties (more concretely, functions T -> Bool, e.g., a typechecker) that can be interpreted as random generators (Gen T). The language's syntax is strongly inspired by Haskell, but there are still a few differences. The implementation has an interface to extract the generated values in Haskell (github repo).
Generating Good Generators for Inductive Relations, by Lampropoulos et al. POPL'18 (PDF). It's in Coq QuickChick, but tying it to Haskell QuickCheck by extraction seems reasonably feasible.

Haskell AST with recursive types

I'm currently trying to build a lambda calculus solver, and I'm having a slight problem with constructing the AST. A lambda calculus term is inductively defined as:
1) A variable
2) A lambda, a variable, a dot, and a lambda expression.
3) A bracket, a lambda expression, a lambda expression and a bracket.
What I would like to do (and at first tried) is this:
data Expr =
Variable
| Abstract Variable Expr
| Application Expr Expr
Now obviously this doesn't work, since Variable is not a type, and Abstract Variable Expr expects types. So my hacky solution to this is to have:
type Variable = String
data Expr =
Atomic Variable
| Abstract Variable Expr
| Application Expr Expr
Now this is really annoying since I don't like the Atomic Variable on its own, but Abstract taking a string rather than an expr. Is there any way I can make this more elegant, and do it like the first solution?
Your first solution is just an erroneous definition without meaning. Variable is not a type there, it's a nullary value constructor. You can't refer to Variable in a type definition much like you can't refer to any value, like True, False or 100.
The second solution is in fact the direct translation of something we could write in BNF:
var ::= <string>
term ::= λ <var>. <term> | <term> <term> | <var>
And thus there is nothing wrong with it.
What you exactly want is to have some type like
data Expr
= Atomic Variable
| Abstract Expr Expr
| Application Expr Expr
But constrain first Expr in Abstract constructor to be only Atomic. There is no straightforward way to do this in Haskell because value of some type can be created by any constructor of this type. So the only approach is to make some separate data type or type alias for existing type (like your Variable type alias) and move all common logic into it. Your solution with Variable seems very ok to me.
But. You can use some other advanced features in Haskell to achieve you goal in different way. You can be inspired by glambda package which uses GADT to create typed lambda calculus. Also see this answer: https://stackoverflow.com/a/39931015/2900502
I can come up with next solution to achieve you minimal goals (if you only want to constrain first argument of Abstract):
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
data AtomicT
data AbstractT
data ApplicationT
data Expr :: * -> * where
Atomic :: String -> Expr AtomicT
Abstract :: Expr AtomicT -> Expr a -> Expr AbstractT
Application :: Expr a -> Expr b -> Expr ApplicationT
And next example works fine:
ex1 :: Expr AbstractT
ex1 = Abstract (Atomic "x") (Atomic "x")
But this example won't compile because of type mismatch:
ex2 :: Expr AbstractT
ex2 = Abstract ex1 ex1

Why do I have to use newtype when my data type declaration only has one constructor? [duplicate]

This question already has answers here:
Difference between `data` and `newtype` in Haskell
(2 answers)
Closed 8 years ago.
It seems that a newtype definition is just a data definition that obeys some restrictions (e.g., only one constructor), and that due to these restrictions the runtime system can handle newtypes more efficiently. And the handling of pattern matching for undefined values is slightly different.
But suppose Haskell would only knew data definitions, no newtypes: couldn't the compiler find out for itself whether a given data definition obeys these restrictions, and automatically treat it more efficiently?
I'm sure I'm missing out on something, there must be some deeper reason for this.
Both newtype and the single-constructor data introduce a single value constructor, but the value constructor introduced by newtype is strict and the value constructor introduced by data is lazy. So if you have
data D = D Int
newtype N = N Int
Then N undefined is equivalent to undefined and causes an error when evaluated. But D undefined is not equivalent to undefined, and it can be evaluated as long as you don't try to peek inside.
Couldn't the compiler handle this for itself.
No, not really—this is a case where as the programmer you get to decide whether the constructor is strict or lazy. To understand when and how to make constructors strict or lazy, you have to have a much better understanding of lazy evaluation than I do. I stick to the idea in the Report, namely that newtype is there for you to rename an existing type, like having several different incompatible kinds of measurements:
newtype Feet = Feet Double
newtype Cm = Cm Double
both behave exactly like Double at run time, but the compiler promises not to let you confuse them.
According to Learn You a Haskell:
Instead of the data keyword, the newtype keyword is used. Now why is
that? Well for one, newtype is faster. If you use the data keyword to
wrap a type, there's some overhead to all that wrapping and unwrapping
when your program is running. But if you use newtype, Haskell knows
that you're just using it to wrap an existing type into a new type
(hence the name), because you want it to be the same internally but
have a different type. With that in mind, Haskell can get rid of the
wrapping and unwrapping once it resolves which value is of what type.
So why not just use newtype all the time instead of data then? Well,
when you make a new type from an existing type by using the newtype
keyword, you can only have one value constructor and that value
constructor can only have one field. But with data, you can make data
types that have several value constructors and each constructor can
have zero or more fields:
data Profession = Fighter | Archer | Accountant
data Race = Human | Elf | Orc | Goblin
data PlayerCharacter = PlayerCharacter Race Profession
When using newtype, you're restricted to just one constructor with one
field.
Now consider the following type:
data CoolBool = CoolBool { getCoolBool :: Bool }
It's your run-of-the-mill algebraic data type that was defined with
the data keyword. It has one value constructor, which has one field
whose type is Bool. Let's make a function that pattern matches on a
CoolBool and returns the value "hello" regardless of whether the Bool
inside the CoolBool was True or False:
helloMe :: CoolBool -> String
helloMe (CoolBool _) = "hello"
Instead of applying this function to a normal CoolBool, let's throw it a curveball and apply it to undefined!
ghci> helloMe undefined
"*** Exception: Prelude.undefined
Yikes! An exception! Now why did this exception happen? Types defined
with the data keyword can have multiple value constructors (even
though CoolBool only has one). So in order to see if the value given
to our function conforms to the (CoolBool _) pattern, Haskell has to
evaluate the value just enough to see which value constructor was used
when we made the value. And when we try to evaluate an undefined
value, even a little, an exception is thrown.
Instead of using the data keyword for CoolBool, let's try using
newtype:
newtype CoolBool = CoolBool { getCoolBool :: Bool }
We don't have to
change our helloMe function, because the pattern matching syntax is
the same if you use newtype or data to define your type. Let's do the
same thing here and apply helloMe to an undefined value:
ghci> helloMe undefined
"hello"
It worked! Hmmm, why is that? Well, like we've said, when we use
newtype, Haskell can internally represent the values of the new type
in the same way as the original values. It doesn't have to add another
box around them, it just has to be aware of the values being of
different types. And because Haskell knows that types made with the
newtype keyword can only have one constructor, it doesn't have to
evaluate the value passed to the function to make sure that it
conforms to the (CoolBool _) pattern because newtype types can only
have one possible value constructor and one field!
This difference in behavior may seem trivial, but it's actually pretty
important because it helps us realize that even though types defined
with data and newtype behave similarly from the programmer's point of
view because they both have value constructors and fields, they are
actually two different mechanisms. Whereas data can be used to make
your own types from scratch, newtype is for making a completely new
type out of an existing type. Pattern matching on newtype values isn't
like taking something out of a box (like it is with data), it's more
about making a direct conversion from one type to another.
Here's another source. According to this Newtype article:
A newtype declaration creates a new type in much the same way as data.
The syntax and usage of newtypes is virtually identical to that of
data declarations - in fact, you can replace the newtype keyword with
data and it'll still compile, indeed there's even a good chance your
program will still work. The converse is not true, however - data can
only be replaced with newtype if the type has exactly one constructor
with exactly one field inside it.
Some Examples:
newtype Fd = Fd CInt
-- data Fd = Fd CInt would also be valid
-- newtypes can have deriving clauses just like normal types
newtype Identity a = Identity a
deriving (Eq, Ord, Read, Show)
-- record syntax is still allowed, but only for one field
newtype State s a = State { runState :: s -> (s, a) }
-- this is *not* allowed:
-- newtype Pair a b = Pair { pairFst :: a, pairSnd :: b }
-- but this is:
data Pair a b = Pair { pairFst :: a, pairSnd :: b }
-- and so is this:
newtype Pair' a b = Pair' (a, b)
Sounds pretty limited! So why does anyone use newtype?
The short version The restriction to one constructor with one field
means that the new type and the type of the field are in direct
correspondence:
State :: (s -> (a, s)) -> State s a
runState :: State s a -> (s -> (a, s))
or in mathematical terms they are isomorphic. This means that after
the type is checked at compile time, at run time the two types can be
treated essentially the same, without the overhead or indirection
normally associated with a data constructor. So if you want to declare
different type class instances for a particular type, or want to make
a type abstract, you can wrap it in a newtype and it'll be considered
distinct to the type-checker, but identical at runtime. You can then
use all sorts of deep trickery like phantom or recursive types without
worrying about GHC shuffling buckets of bytes for no reason.
See the article for the messy bits...
Simple version for folks obsessed with bullet lists (failed to find one, so have to write it by myself):
data - creates new algebraic type with value constructors
Can have several value constructors
Value constructors are lazy
Values can have several fields
Affects both compilation and runtime, have runtime overhead
Created type is a distinct new type
Can have its own type class instances
When pattern matching against value constructors, WILL be evaluated at least to weak head normal form (WHNF) *
Used to create new data type (example: Address { zip :: String, street :: String } )
newtype - creates new “decorating” type with value constructor
Can have only one value constructor
Value constructor is strict
Value can have only one field
Affects only compilation, no runtime overhead
Created type is a distinct new type
Can have its own type class instances
When pattern matching against value constructor, CAN be not evaluated at all *
Used to create higher level concept based on existing type with distinct set of supported operations or that is not interchangeable with original type (example: Meter, Cm, Feet is Double)
type - creates an alternative name (synonym) for a type (like typedef in C)
No value constructors
No fields
Affects only compilation, no runtime overhead
No new type is created (only a new name for existing type)
Can NOT have its own type class instances
When pattern matching against data constructor, behaves the same as original type
Used to create higher level concept based on existing type with the same set of supported operations (example: String is [Char])
[*] On pattern matching laziness:
data DataBox a = DataBox Int
newtype NewtypeBox a = NewtypeBox Int
dataMatcher :: DataBox -> String
dataMatcher (DataBox _) = "data"
newtypeMatcher :: NewtypeBox -> String
newtypeMatcher (NewtypeBox _) = "newtype"
ghci> dataMatcher undefined
"*** Exception: Prelude.undefined
ghci> newtypeMatcher undefined
“newtype"
Off the top of my head; data declarations use lazy evaluation in access and storage of their "members", whereas newtype does not. Newtype also strips away all previous type instances from its components, effectively hiding its implementation; whereas data leaves the implementation open.
I tend to use newtype's when avoiding boilerplate code in complex data types where I don't necessarily need access to the internals when using them. This speeds up both compilation and execution, and reduces code complexity where the new type is used.
When first reading about this I found this chapter of a Gentle Introduction to Haskell rather intuitive.

Functions don't just have types: They ARE Types. And Kinds. And Sorts. Help put a blown mind back together

I was doing my usual "Read a chapter of LYAH before bed" routine, feeling like my brain was expanding with every code sample. At this point I was convinced that I understood the core awesomeness of Haskell, and now just had to understand the standard libraries and type classes so that I could start writing real software.
So I was reading the chapter about applicative functors when all of a sudden the book claimed that functions don't merely have types, they are types, and can be treated as such (For example, by making them instances of type classes). (->) is a type constructor like any other.
My mind was blown yet again, and I immediately jumped out of bed, booted up the computer, went to GHCi and discovered the following:
Prelude> :k (->)
(->) :: ?? -> ? -> *
What on earth does it mean?
If (->) is a type constructor, what are the value constructors? I can take a guess, but would have no idea how define it in traditional data (->) ... = ... | ... | ... format. It's easy enough to do this with any other type constructor: data Either a b = Left a | Right b. I suspect my inability to express it in this form is related to the extremly weird type signature.
What have I just stumbled upon? Higher kinded types have kind signatures like * -> * -> *. Come to think of it... (->) appears in kind signatures too! Does this mean that not only is it a type constructor, but also a kind constructor? Is this related to the question marks in the type signature?
I have read somewhere (wish I could find it again, Google fails me) about being able to extend type systems arbitrarily by going from Values, to Types of Values, to Kinds of Types, to Sorts of Kinds, to something else of Sorts, to something else of something elses, and so on forever. Is this reflected in the kind signature for (->)? Because I've also run into the notion of the Lambda cube and the calculus of constructions without taking the time to really investigate them, and if I remember correctly it is possible to define functions that take types and return types, take values and return values, take types and return values, and take values which return types.
If I had to take a guess at the type signature for a function which takes a value and returns a type, I would probably express it like this:
a -> ?
or possibly
a -> *
Although I see no fundamental immutable reason why the second example couldn't easily be interpreted as a function from a value of type a to a value of type *, where * is just a type synonym for string or something.
The first example better expresses a function whose type transcends a type signature in my mind: "a function which takes a value of type a and returns something which cannot be expressed as a type."
You touch so many interesting points in your question, so I am
afraid this is going to be a long answer :)
Kind of (->)
The kind of (->) is * -> * -> *, if we disregard the boxity GHC
inserts. But there is no circularity going on, the ->s in the
kind of (->) are kind arrows, not function arrows. Indeed, to
distinguish them kind arrows could be written as (=>), and then
the kind of (->) is * => * => *.
We can regard (->) as a type constructor, or maybe rather a type
operator. Similarly, (=>) could be seen as a kind operator, and
as you suggest in your question we need to go one 'level' up. We
return to this later in the section Beyond Kinds, but first:
How the situation looks in a dependently typed language
You ask how the type signature would look for a function that takes a
value and returns a type. This is impossible to do in Haskell:
functions cannot return types! You can simulate this behaviour using
type classes and type families, but let us for illustration change
language to the dependently typed language
Agda. This is a
language with similar syntax as Haskell where juggling types together
with values is second nature.
To have something to work with, we define a data type of natural
numbers, for convenience in unary representation as in
Peano Arithmetic.
Data types are written in
GADT style:
data Nat : Set where
Zero : Nat
Succ : Nat -> Nat
Set is equivalent to * in Haskell, the "type" of all (small) types,
such as Natural numbers. This tells us that the type of Nat is
Set, whereas in Haskell, Nat would not have a type, it would have
a kind, namely *. In Agda there are no kinds, but everything has
a type.
We can now write a function that takes a value and returns a type.
Below is a the function which takes a natural number n and a type,
and makes iterates the List constructor n applied to this
type. (In Agda, [a] is usually written List a)
listOfLists : Nat -> Set -> Set
listOfLists Zero a = a
listOfLists (Succ n) a = List (listOfLists n a)
Some examples:
listOfLists Zero Bool = Bool
listOfLists (Succ Zero) Bool = List Bool
listOfLists (Succ (Succ Zero)) Bool = List (List Bool)
We can now make a map function that operates on listsOfLists.
We need to take a natural number that is the number of iterations
of the list constructor. The base cases are when the number is
Zero, then listOfList is just the identity and we apply the function.
The other is the empty list, and the empty list is returned.
The step case is a bit move involving: we apply mapN to the head
of the list, but this has one layer less of nesting, and mapN
to the rest of the list.
mapN : {a b : Set} -> (a -> b) -> (n : Nat) ->
listOfLists n a -> listOfLists n b
mapN f Zero x = f x
mapN f (Succ n) [] = []
mapN f (Succ n) (x :: xs) = mapN f n x :: mapN f (Succ n) xs
In the type of mapN, the Nat argument is named n, so the rest of
the type can depend on it. So this is an example of a type that
depends on a value.
As a side note, there are also two other named variables here,
namely the first arguments, a and b, of type Set. Type
variables are implicitly universally quantified in Haskell, but
here we need to spell them out, and specify their type, namely
Set. The brackets are there to make them invisible in the
definition, as they are always inferable from the other arguments.
Set is abstract
You ask what the constructors of (->) are. One thing to point out
is that Set (as well as * in Haskell) is abstract: you cannot
pattern match on it. So this is illegal Agda:
cheating : Set -> Bool
cheating Nat = True
cheating _ = False
Again, you can simulate pattern matching on types constructors in
Haskell using type families, one canoical example is given on
Brent Yorgey's blog.
Can we define -> in the Agda? Since we can return types from
functions, we can define an own version of -> as follows:
_=>_ : Set -> Set -> Set
a => b = a -> b
(infix operators are written _=>_ rather than (=>)) This
definition has very little content, and is very similar to doing a
type synonym in Haskell:
type Fun a b = a -> b
Beyond kinds: Turtles all the way down
As promised above, everything in Agda has a type, but then
the type of _=>_ must have a type! This touches your point
about sorts, which is, so to speak, one layer above Set (the kinds).
In Agda this is called Set1:
FunType : Set1
FunType = Set -> Set -> Set
And in fact, there is a whole hierarchy of them! Set is the type of
"small" types: data types in haskell. But then we have Set1,
Set2, Set3, and so on. Set1 is the type of types which mentions
Set. This hierarchy is to avoid inconsistencies such as Girard's
paradox.
As noticed in your question, -> is used for types and kinds in
Haskell, and the same notation is used for function space at all
levels in Agda. This must be regarded as a built in type operator,
and the constructors are lambda abstraction (or function
definitions). This hierarchy of types is similar to the setting in
System F omega, and more
information can be found in the later chapters of
Pierce's Types and Programming Languages.
Pure type systems
In Agda, types can depend on values, and functions can return types,
as illustrated above, and we also had an hierarchy of
types. Systematic investigation of different systems of the lambda
calculi is investigated in more detail in Pure Type Systems. A good
reference is
Lambda Calculi with Types by Barendregt,
where PTS are introduced on page 96, and many examples on page 99 and onwards.
You can also read more about the lambda cube there.
Firstly, the ?? -> ? -> * kind is a GHC-specific extension. The ? and ?? are just there to deal with unboxed types, which behave differently from just * (which has to be boxed, as far as I know). So ?? can be any normal type or an unboxed type (e.g. Int#); ? can be either of those or an unboxed tuple. There is more information here: Haskell Weird Kinds: Kind of (->) is ?? -> ? -> *
I think a function can't return an unboxed type because functions are lazy. Since a lazy value is either a value or a thunk, it has to be boxed. Boxed just means it is a pointer rather than just a value: it's like Integer() vs int in Java.
Since you are probably not going to be using unboxed types in LYAH-level code, you can imagine that the kind of -> is just * -> * -> *.
Since the ? and ?? are basically just more general version of *, they do not have anything to do with sorts or anything like that.
However, since -> is just a type constructor, you can actually partially apply it; for example, (->) e is an instance of Functor and Monad. Figuring out how to write these instances is a good mind-stretching exercise.
As far as value constructors go, they would have to just be lambdas (\ x ->) or function declarations. Since functions are so fundamental to the language, they get their own syntax.

Resources