User-defined tuple-based data constructors - haskell

Me trying to grok The Little MLer again. TLMLer has this SML code
datatype a pizza = Bottom | Topping of (a * (a pizza))
datatype fish = Anchovy | Lox | Tuna
which I've translated as
data PizzaSh a = CrustSh | ToppingSh a (PizzaSh a)
data FishPSh = AnchovyPSh | LoxPSh | TunaPSh
and then an alternative closer to TLMLer perhaps
data PizzaSh2 a = CrustSh2 | ToppingSh2 (a, PizzaSh2 a)
And from each I create a pizza
fpizza1 = ToppingSh AnchovyPSh (ToppingSh TunaPSh (ToppingSh LoxPSh CrustSh))
fpizza2 = ToppingSh2 (AnchovyPSh, ToppingSh2 (LoxPSh, ToppingSh2 (TunaPSh, CrustSh2)))
respectively, which are of type PizzaSh FishPSh and PizzaSh2 FishPSh respectively.
But the second version (which is arguably the closer to the original ML version) seems "offbeat." It's as if I'm creating a 2-tuple when I "cons" toppings together where the second member recursively expands . May I assume the parametric data constructor "function" of PizzaSh2 doesn't literally build a tuple, it's just borrowing the tuple as a cons strategy, correct? Which is preferable in Haskell, PizzaSh or PizzaSh2? As I understand, a tuple (cartesian product) data type will have a single constructor, e.g., data Point a b = Pt a b, not the disjoint union of ored-together (|) constructors. In SML the "*" indicates product, i.e., tuple, but, again, is this just a "tuple-like thing," i.e., it's just a tuple-looking way to cons a pizza together?

In Haskell, we prefer this style:
data PizzaSh a = CrustSh | ToppingSh a (PizzaSh a)
There is no need in Haskell to use a tuple there, since a data constructor like ToppingSh can take multiple arguments.
Using an additional pair as in
data PizzaSh2 a = CrustSh2 | ToppingSh2 (a, PizzaSh2 a)
creates a type which is almost isomorphic to the previous one, but is more cumbersome to handle since it requires to use more parentheses. E.g.
foo (ToppingSh x y)
-- vs
foo (ToppingSh2 (x, y))
bar :: PizzaSh a -> ...
bar (ToppingSh x y) = ....
-- vs
bar (ToppingSh2 (x, y)) = ...
Further, the type is indeed only almost isomorphic. When using an additional pair, because of laziness, we have one more value which can be represented in the type: we have a correpondence
ToppingSh x y <-> ToppingSh2 (x, y)
which breaks down in the case
??? <-> ToppingSh2 undefined
That is, ToppinggSh2 can be applied to a non-terminating (or otherwise exceptional), pair-valued expression, and that constructs a value which can not be represented using ToppingSh.
Operationally, to achieve that GHC uses a double indirection (roughly pointer-to-pointer, or thunk-returning-pair-of-thunks), which further slows down the code. Hence, it's also a bad choice from a performance point of view, if one cares about such micro-optimizations.

As far as the Haskell side, it absolutely is nesting a (,) constructor inside the ToppingSh constructor. It would violate Haskell's non-strict semantics to not do the nesting you requested. If the nesting was removed, you'd be unable to distinguish between undefined :: PizzaSh2 () and ToppingSh undefined :: PizzaSh2 (). And yes, most of the time, that isn't what you want. PizzaSh is the much more natural formulation in Haskell unless you have a particular need to be able to introduce another bottom into the evaluation process.
I can't address what's going on behind the scenes in any particular ML implementation. Though I can say that with strict evaluation semantics, there isn't a behavioral difference to observe, meaning compilers are free to use a wider variety of approaches.

Related

Product type convert to tuple

I've got this
data Pair = P Int Double deriving Show
myP1 = P 1 2.0
getPairVal (P i d) = (i,d)
getPairVal' (P i d) = (,) i d
which work, both producing (1,2.0). Is there any way to make a straight (,) myP1 work? But then this would require some sort of "cast" I'm guessing.
data Pair2 = P2 (Int, Double) deriving Show
is a tuple version that is "handier" sort of
myP21 = (5,5.0)
So what would be the advantage of a product type like Pair with or without a tuple type definition? I seem to remember SML having a closer relationship, i.e., every product type was a tuple type. In math, anything that's a "product" produces a tuple. Haskell apparently not.
why shouldn't a straight (,) myP1 work?
Well, (,) is a function that takes two values and makes a tuple out of them. P 1 2.0 is not two values: it is a single value of type Pair, which contains two fields. But (,) (P 1 2.0) is a well-typed expression: it has built the left half of a tuple, and needs one more argument for the right half. So:
let mkTuple = (,) (P 1 2.0)
in mkTuple "hello"
evaluates to
(P 1 2.0, "hello")
Why should we define types like Pair instead of using tuples? user202729 linked to Haskell: Algebraic data vs Tuple in a comment, which is a good answer to that question.
But one other approach you could use would be to define a newtype wrapping a tuple, so that you have a distinct named type, but the same runtime representation as a tuple:
newtype Pair2 = P2 (Int, Double) deriving Show
If you use newtype instead of data, you do get a simple conversion function, which you might choose to think of as a cast: Data.Coerce.coerce. With that, you could write
myP21 = coerce (5, 5.0)
and vice versa. This approach is primarily used when you want new typeclass instances for existing types. If you need a new domain type composed of two values, I'd still encourage you to default to using a data type with two fields unless there is a compelling reason not to.

How are tail-position contexts GHC join points paper formed?

Compiling without Continuations describes a way to extend ANF System F with join points. GHC itself has join points in Core (an intermediate representation) rather than exposing join points directly in the surface language (Haskell). Out of curiosity, I started trying to write a language that simply extends System F with join points. That is, the join points are user facing. However, there's something about the typing rules in the paper that I don't understand. Here are the parts that I do understand:
There are two environments, one for ordinary values/functions and one that only has join points.
The rational for ∆ being ε in several of the rules. In the expression let x:σ = u in ..., u cannot reference any join points (VBIND) because it join points cannot return to arbitrary locations.
The strange typing rule for JBIND. The paper does a good job explaining this.
Here's what I don't get. The paper introduces a notation that I will call the "overhead arrow", but the paper itself does not explicitly give it a name or mention it. Visually, it looks like a arrow pointing to the right, and it goes above an expression. Roughly, this seems to indicate a "tail context" (the paper does use this term). In the paper, these overhead arrows can be applied to terms, types, data constructors, and even environments. They can be nested as well. Here's the main difficulty I'm having. There are several rules with premises that include type environments under an overhead arrow. JUMP, CASE, RVBIND, and RJBIND all include premises with such a type environment (Figure 2 in the paper). However, none of the typing rules have a conclusion where the type environment is under an overhead arrow. So, I cannot see how JUMP, CASE, etc. can ever be used since the premises cannot be derived by any of the other rules.
That's the question, but if anyone has any supplementary material that provides more context are the overhead arrow convention or if anyone is aware an implementation of the System-F-with-join-points type system (other than in GHC's IR), that would be helpful too.
In this paper, x⃗ means “A sequence of x, separated by appropriate delimiters”.
A few examples:
If x is a variable, λx⃗. e is an abbreviation for λx1. λx2. … λxn e. In other words, many nested 1-argument lambdas, or a many-argument lambda.
If σ and τ are types, σ⃗ → τ is an abbreviation for σ1 → σ2 → … → σn → τ. In other words, a function type with many parameter types.
If a is a type variable and σ is a type, ∀a⃗. σ is an abbreviation for ∀a1. ∀a2. … ∀an. σ. In other words, many nested polymorphic functions, or a polymorphic function with many type parameters.
In Figure 1 of the paper, the syntax of a jump expression is defined as:
e, u, v ⩴ … | jump j ϕ⃗ e⃗ τ
If this declaration were translated into a Haskell data type, it might look like this:
data Term
-- | A jump expression has a label that it jumps to, a list of type argument
-- applications, a list of term argument applications, and the return type
-- of the overall `jump`-expression.
= Jump LabelVar [Type] [Term] Type
| ... -- Other syntactic forms.
That is, a data constructor that takes a label variable j, a sequence of type arguments ϕ⃗, a sequence of term arguments e⃗, and a return type τ.
“Zipping” things together:
Sometimes, multiple uses of the overhead arrow place an implicit constraint that their sequences have the same length. One place that this occurs is with substitutions.
{ϕ/⃗a} means “replace a1 with ϕ1, replace a2 with ϕ2, …, replace an with ϕn”, implicitly asserting that both a⃗ and ϕ⃗ have the same length, n.
Worked example: the JUMP rule:
The JUMP rule is interesting because it provides several uses of sequencing, and even a sequence of premises. Here’s the rule again:
(j : ∀a⃗. σ⃗ → ∀r. r) ∈ Δ
(Γ; ε ⊢⃗ u : σ {ϕ/⃗a})
Γ; Δ ⊢ jump j ϕ⃗ u⃗ τ : τ
The first premise should be fairly straightforward, now: lookup j in the label context Δ, and check that the type of j starts with a bunch of ∀s, followed by a bunch of function types, ending with a ∀r. r.
The second “premise” is actually a sequence of premises. What is it looping over? So far, the sequences we have in scope are ϕ⃗, σ⃗, a⃗, and u⃗.
ϕ⃗ and a⃗ are used in a nested sequence, so probably not those two.
On the other hand, u⃗ and σ⃗ seem quite plausible if you consider what they mean.
σ⃗ is the list of argument types expected by the label j, and u⃗ is the list of argument terms provided to the label j, and it makes sense that you might want to iterate over argument types and argument terms together.
So this “premise” actually means something like this:
for each pair of σ and u:
Γ; ε ⊢ u : σ {ϕ/⃗a}
Pseudo-Haskell implementation
Finally, here’s a somewhat-complete code sample illustrating what this typing rule might look like in an actual implementation. x⃗ is implemented as a list of x values, and some monad M is used to signal failure when a premise is not satisfied.
data LabelVar
data Type
= ...
data Term
= Jump LabelVar [Type] [Term] Type
| ...
typecheck :: TermContext -> LabelContext -> Term -> M Type
typecheck gamma delta (Jump j phis us tau) = do
-- Look up `j` in the label context. If it's not there, throw an error.
typeOfJ <- lookupLabel j delta
-- Check that the type of `j` has the right shape: a bunch of `foralls`,
-- followed by a bunch of function types, ending with `forall r.r`. If it
-- has the correct shape, split it into a list of `a`s, a list of `\sigma`s
-- and the return type, `forall r.r`.
(as, sigmas, ret) <- splitLabelType typeOfJ
-- exactZip is a helper function that "zips" two sequences together.
-- If the sequences have the same length, it produces a list of pairs of
-- corresponding elements. If not, it raises an error.
for each (u, sigma) in exactZip (us, sigmas):
-- Type-check the argument `u` in a context without any tail calls,
-- and assert that its type has the correct form.
sigma' <- typecheck gamma emptyLabelContext u
-- let subst = { \sequence{\phi / a} }
subst <- exactZip as phis
assert (applySubst subst sigma == sigma')
-- After all the premises have been satisfied, the type of the `jump`
-- expression is just its return type.
return tau
-- Other syntactic forms
typecheck gamma delta u = ...
-- Auxiliary definitions
type M = ...
instance Monad M
lookupLabel :: LabelVar -> LabelContext -> M Type
splitLabelType :: Type -> M ([TypeVar], [Type], Type)
exactZip :: [a] -> [b] -> M [(a, b)]
applySubst :: [(TypeVar, Type)] -> Type -> Type
As far as I know SPJ’s style for notation, and this does align with what I see in the paper, it simply means “0 or more”. E.g. you can replace \overarrow{a} with a_1, …, a_n, n >= 0.
It may be “1 or more” in some cases, but it shouldn’t be hard to figure one which one of the two.

Can any recursive definition be rewritten using foldr?

Say I have a general recursive definition in haskell like this:
foo a0 a1 ... = base_case
foo b0 b1 ...
| cond1 = recursive_case_1
| cond2 = recursive_case_2
...
Can it always rewritten using foldr? Can it be proved?
If we interpret your question literally, we can write const value foldr to achieve any value, as #DanielWagner pointed out in a comment.
A more interesting question is whether we can instead forbid general recursion from Haskell, and "recurse" only through the eliminators/catamorphisms associated to each user-defined data type, which are the natural generalization of foldr to inductively defined data types. This is, essentially, (higher-order) primitive recursion.
When this restriction is performed, we can only compose terminating functions (the eliminators) together. This means that we can no longer define non terminating functions.
As a first example, we lose the trivial recursion
f x = f x
-- or even
a = a
since, as said, the language becomes total.
More interestingly, the general fixed point operator is lost.
fix :: (a -> a) -> a
fix f = f (fix f)
A more intriguing question is: what about the total functions we can express in Haskell? We do lose all the non-total functions, but do we lose any of the total ones?
Computability theory states that, since the language becomes total (no more non termination), we lose expressiveness even on the total fragment.
The proof is a standard diagonalization argument. Fix any enumeration of programs in the total fragment so that we can speak of "the i-th program".
Then, let eval i x be the result of running the i-th program on the natural x as input (for simplicity, assume this is well typed, and that the result is a natural). Note that, since the language is total, then a result must exist. Moreover, eval can be implemented in the unrestricted Haskell language, since we can write an interpreter of Haskell in Haskell (left as an exercise :-P), and that would work as fine for the fragment. Then, we simply take
f n = succ $ eval n n
The above is a total function (a composition of total functions) which can be expressed in Haskell, but not in the fragment. Indeed, otherwise there would be a program to compute it, say the i-th program. In such case we would have
eval i x = f x
for all x. But then,
eval i i = f i = succ $ eval i i
which is impossible -- contradiction. QED.
In type theory, it is indeed the case that you can elaborate all definitions by dependent pattern-matching into ones only using eliminators (a more strongly-typed version of folds, the generalisation of lists' foldr).
See e.g. Eliminating Dependent Pattern Matching (pdf)

How to work around F#'s type system

In Haskell, you can use unsafeCoerce to override the type system. How to do the same in F#?
For example, to implement the Y-combinator.
I'd like to offer a different solution, based on embedding the untyped lambda calculus in a typed functional language. The idea is to create a data type that allows us to change between types α and α → α, which subsequently allows to escape the restrictions of a type system. I'm not very familiar with F# so I'll give my answer in Haskell, but I believe it could be adapted easily (perhaps the only complication could be F#'s strictness).
-- | Roughly represents morphism between #a# and #a -> a#.
-- Therefore we can embed a arbitrary closed λ-term into #Any a#. Any time we
-- need to create a λ-abstraction, we just nest into one #Any# constructor.
--
-- The type parameter allows us to embed ordinary values into the type and
-- retrieve results of computations.
data Any a = Any (Any a -> a)
Note that the type parameter isn't significant for combining terms. It just allows us to embed values into our representation and extract them later. All terms of a particular type Any a can be combined freely without restrictions.
-- | Embed a value into a λ-term. If viewed as a function, it ignores its
-- input and produces the value.
embed :: a -> Any a
embed = Any . const
-- | Extract a value from a λ-term, assuming it's a valid value (otherwise it'd
-- loop forever).
extract :: Any a -> a
extract x#(Any x') = x' x
With this data type we can use it to represent arbitrary untyped lambda terms. If we want to interpret a value of Any a as a function, we just unwrap its constructor.
First let's define function application:
-- | Applies a term to another term.
($$) :: Any a -> Any a -> Any a
(Any x) $$ y = embed $ x y
And λ abstraction:
-- | Represents a lambda abstraction
l :: (Any a -> Any a) -> Any a
l x = Any $ extract . x
Now we have everything we need for creating complex λ terms. Our definitions mimic the classical λ-term syntax, all we do is using l to construct λ abstractions.
Let's define the Y combinator:
-- λf.(λx.f(xx))(λx.f(xx))
y :: Any a
y = l (\f -> let t = l (\x -> f $$ (x $$ x))
in t $$ t)
And we can use it to implement Haskell's classical fix. First we'll need to be able to embed a function of a -> a into Any a:
embed2 :: (a -> a) -> Any a
embed2 f = Any (f . extract)
Now it's straightforward to define
fix :: (a -> a) -> a
fix f = extract (y $$ embed2 f)
and subsequently a recursively defined function:
fact :: Int -> Int
fact = fix f
where
f _ 0 = 1
f r n = n * r (n - 1)
Note that in the above text there is no recursive function. The only recursion is in the Any data type, which allows us to define y (which is also defined non-recursively).
In Haskell, unsafeCoerce has the type a -> b and is generally used to assert to the compiler that the thing being coerced actually has the destination type and it's just that the type-checker doesn't know it.
Another, less common use, is to reinterpret a pattern of bits as another type. For example an unboxed Double# could be reinterpreted as an unboxed Int64#. You have to be sure about the underlying representations for this to be safe.
In F#, the first application can be achieved with box |> unbox as John Palmer said in a comment on the question. If possible use explicit type arguments to make sure that you don't accidentally have the wrong coercion inferred, e.g. box<'a> |> unbox<'b> where 'a and 'b are type variables or concrete types that are already in scope in your code.
For the second application, look at the BitConverter class for specific conversions of bit-patterns. In theory you could also do something like interfacing with unmanaged code to achieve this, but that seems very heavyweight.
These techniques won't work for implementing the Y combinator because the cast is only valid if the runtime objects actually do have the target type, but with the Y combinator you actually need to call the same function again but with a different type. For this you need the kinds of encoding tricks mentioned in the question John Palmer linked to.

Functions don't just have types: They ARE Types. And Kinds. And Sorts. Help put a blown mind back together

I was doing my usual "Read a chapter of LYAH before bed" routine, feeling like my brain was expanding with every code sample. At this point I was convinced that I understood the core awesomeness of Haskell, and now just had to understand the standard libraries and type classes so that I could start writing real software.
So I was reading the chapter about applicative functors when all of a sudden the book claimed that functions don't merely have types, they are types, and can be treated as such (For example, by making them instances of type classes). (->) is a type constructor like any other.
My mind was blown yet again, and I immediately jumped out of bed, booted up the computer, went to GHCi and discovered the following:
Prelude> :k (->)
(->) :: ?? -> ? -> *
What on earth does it mean?
If (->) is a type constructor, what are the value constructors? I can take a guess, but would have no idea how define it in traditional data (->) ... = ... | ... | ... format. It's easy enough to do this with any other type constructor: data Either a b = Left a | Right b. I suspect my inability to express it in this form is related to the extremly weird type signature.
What have I just stumbled upon? Higher kinded types have kind signatures like * -> * -> *. Come to think of it... (->) appears in kind signatures too! Does this mean that not only is it a type constructor, but also a kind constructor? Is this related to the question marks in the type signature?
I have read somewhere (wish I could find it again, Google fails me) about being able to extend type systems arbitrarily by going from Values, to Types of Values, to Kinds of Types, to Sorts of Kinds, to something else of Sorts, to something else of something elses, and so on forever. Is this reflected in the kind signature for (->)? Because I've also run into the notion of the Lambda cube and the calculus of constructions without taking the time to really investigate them, and if I remember correctly it is possible to define functions that take types and return types, take values and return values, take types and return values, and take values which return types.
If I had to take a guess at the type signature for a function which takes a value and returns a type, I would probably express it like this:
a -> ?
or possibly
a -> *
Although I see no fundamental immutable reason why the second example couldn't easily be interpreted as a function from a value of type a to a value of type *, where * is just a type synonym for string or something.
The first example better expresses a function whose type transcends a type signature in my mind: "a function which takes a value of type a and returns something which cannot be expressed as a type."
You touch so many interesting points in your question, so I am
afraid this is going to be a long answer :)
Kind of (->)
The kind of (->) is * -> * -> *, if we disregard the boxity GHC
inserts. But there is no circularity going on, the ->s in the
kind of (->) are kind arrows, not function arrows. Indeed, to
distinguish them kind arrows could be written as (=>), and then
the kind of (->) is * => * => *.
We can regard (->) as a type constructor, or maybe rather a type
operator. Similarly, (=>) could be seen as a kind operator, and
as you suggest in your question we need to go one 'level' up. We
return to this later in the section Beyond Kinds, but first:
How the situation looks in a dependently typed language
You ask how the type signature would look for a function that takes a
value and returns a type. This is impossible to do in Haskell:
functions cannot return types! You can simulate this behaviour using
type classes and type families, but let us for illustration change
language to the dependently typed language
Agda. This is a
language with similar syntax as Haskell where juggling types together
with values is second nature.
To have something to work with, we define a data type of natural
numbers, for convenience in unary representation as in
Peano Arithmetic.
Data types are written in
GADT style:
data Nat : Set where
Zero : Nat
Succ : Nat -> Nat
Set is equivalent to * in Haskell, the "type" of all (small) types,
such as Natural numbers. This tells us that the type of Nat is
Set, whereas in Haskell, Nat would not have a type, it would have
a kind, namely *. In Agda there are no kinds, but everything has
a type.
We can now write a function that takes a value and returns a type.
Below is a the function which takes a natural number n and a type,
and makes iterates the List constructor n applied to this
type. (In Agda, [a] is usually written List a)
listOfLists : Nat -> Set -> Set
listOfLists Zero a = a
listOfLists (Succ n) a = List (listOfLists n a)
Some examples:
listOfLists Zero Bool = Bool
listOfLists (Succ Zero) Bool = List Bool
listOfLists (Succ (Succ Zero)) Bool = List (List Bool)
We can now make a map function that operates on listsOfLists.
We need to take a natural number that is the number of iterations
of the list constructor. The base cases are when the number is
Zero, then listOfList is just the identity and we apply the function.
The other is the empty list, and the empty list is returned.
The step case is a bit move involving: we apply mapN to the head
of the list, but this has one layer less of nesting, and mapN
to the rest of the list.
mapN : {a b : Set} -> (a -> b) -> (n : Nat) ->
listOfLists n a -> listOfLists n b
mapN f Zero x = f x
mapN f (Succ n) [] = []
mapN f (Succ n) (x :: xs) = mapN f n x :: mapN f (Succ n) xs
In the type of mapN, the Nat argument is named n, so the rest of
the type can depend on it. So this is an example of a type that
depends on a value.
As a side note, there are also two other named variables here,
namely the first arguments, a and b, of type Set. Type
variables are implicitly universally quantified in Haskell, but
here we need to spell them out, and specify their type, namely
Set. The brackets are there to make them invisible in the
definition, as they are always inferable from the other arguments.
Set is abstract
You ask what the constructors of (->) are. One thing to point out
is that Set (as well as * in Haskell) is abstract: you cannot
pattern match on it. So this is illegal Agda:
cheating : Set -> Bool
cheating Nat = True
cheating _ = False
Again, you can simulate pattern matching on types constructors in
Haskell using type families, one canoical example is given on
Brent Yorgey's blog.
Can we define -> in the Agda? Since we can return types from
functions, we can define an own version of -> as follows:
_=>_ : Set -> Set -> Set
a => b = a -> b
(infix operators are written _=>_ rather than (=>)) This
definition has very little content, and is very similar to doing a
type synonym in Haskell:
type Fun a b = a -> b
Beyond kinds: Turtles all the way down
As promised above, everything in Agda has a type, but then
the type of _=>_ must have a type! This touches your point
about sorts, which is, so to speak, one layer above Set (the kinds).
In Agda this is called Set1:
FunType : Set1
FunType = Set -> Set -> Set
And in fact, there is a whole hierarchy of them! Set is the type of
"small" types: data types in haskell. But then we have Set1,
Set2, Set3, and so on. Set1 is the type of types which mentions
Set. This hierarchy is to avoid inconsistencies such as Girard's
paradox.
As noticed in your question, -> is used for types and kinds in
Haskell, and the same notation is used for function space at all
levels in Agda. This must be regarded as a built in type operator,
and the constructors are lambda abstraction (or function
definitions). This hierarchy of types is similar to the setting in
System F omega, and more
information can be found in the later chapters of
Pierce's Types and Programming Languages.
Pure type systems
In Agda, types can depend on values, and functions can return types,
as illustrated above, and we also had an hierarchy of
types. Systematic investigation of different systems of the lambda
calculi is investigated in more detail in Pure Type Systems. A good
reference is
Lambda Calculi with Types by Barendregt,
where PTS are introduced on page 96, and many examples on page 99 and onwards.
You can also read more about the lambda cube there.
Firstly, the ?? -> ? -> * kind is a GHC-specific extension. The ? and ?? are just there to deal with unboxed types, which behave differently from just * (which has to be boxed, as far as I know). So ?? can be any normal type or an unboxed type (e.g. Int#); ? can be either of those or an unboxed tuple. There is more information here: Haskell Weird Kinds: Kind of (->) is ?? -> ? -> *
I think a function can't return an unboxed type because functions are lazy. Since a lazy value is either a value or a thunk, it has to be boxed. Boxed just means it is a pointer rather than just a value: it's like Integer() vs int in Java.
Since you are probably not going to be using unboxed types in LYAH-level code, you can imagine that the kind of -> is just * -> * -> *.
Since the ? and ?? are basically just more general version of *, they do not have anything to do with sorts or anything like that.
However, since -> is just a type constructor, you can actually partially apply it; for example, (->) e is an instance of Functor and Monad. Figuring out how to write these instances is a good mind-stretching exercise.
As far as value constructors go, they would have to just be lambdas (\ x ->) or function declarations. Since functions are so fundamental to the language, they get their own syntax.

Resources