Product type convert to tuple - haskell

I've got this
data Pair = P Int Double deriving Show
myP1 = P 1 2.0
getPairVal (P i d) = (i,d)
getPairVal' (P i d) = (,) i d
which work, both producing (1,2.0). Is there any way to make a straight (,) myP1 work? But then this would require some sort of "cast" I'm guessing.
data Pair2 = P2 (Int, Double) deriving Show
is a tuple version that is "handier" sort of
myP21 = (5,5.0)
So what would be the advantage of a product type like Pair with or without a tuple type definition? I seem to remember SML having a closer relationship, i.e., every product type was a tuple type. In math, anything that's a "product" produces a tuple. Haskell apparently not.

why shouldn't a straight (,) myP1 work?
Well, (,) is a function that takes two values and makes a tuple out of them. P 1 2.0 is not two values: it is a single value of type Pair, which contains two fields. But (,) (P 1 2.0) is a well-typed expression: it has built the left half of a tuple, and needs one more argument for the right half. So:
let mkTuple = (,) (P 1 2.0)
in mkTuple "hello"
evaluates to
(P 1 2.0, "hello")
Why should we define types like Pair instead of using tuples? user202729 linked to Haskell: Algebraic data vs Tuple in a comment, which is a good answer to that question.
But one other approach you could use would be to define a newtype wrapping a tuple, so that you have a distinct named type, but the same runtime representation as a tuple:
newtype Pair2 = P2 (Int, Double) deriving Show
If you use newtype instead of data, you do get a simple conversion function, which you might choose to think of as a cast: Data.Coerce.coerce. With that, you could write
myP21 = coerce (5 :: Int, 5.0 :: Double) :: Pair2
and vice versa. This approach is primarily used when you want new typeclass instances for existing types. If you need a new domain type composed of two values, I'd still encourage you to default to using a data type with two fields unless there is a compelling reason not to.
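As a minimal sketch of the coerce approach described above: the helper names toPair2 and fromPair2 are hypothetical and only there for illustration. Giving both ends an explicit type is what lets the compiler resolve the Coercible constraint, since numeric literals alone leave the source type ambiguous.

```haskell
import Data.Coerce (coerce)

newtype Pair2 = P2 (Int, Double) deriving Show

-- Hypothetical helper: wrap a plain tuple at no runtime cost.
toPair2 :: (Int, Double) -> Pair2
toPair2 = coerce

-- Hypothetical helper: unwrap back to the plain tuple.
fromPair2 :: Pair2 -> (Int, Double)
fromPair2 = coerce

myP21 :: Pair2
myP21 = toPair2 (5, 5.0)
```

Note that coerce only works here because the P2 constructor is in scope; hiding the constructor in an export list also hides the coercion.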

Related

User-defined tuple-based data constructors

Me trying to grok The Little MLer again. TLMLer has this SML code
datatype a pizza = Bottom | Topping of (a * (a pizza))
datatype fish = Anchovy | Lox | Tuna
which I've translated as
data PizzaSh a = CrustSh | ToppingSh a (PizzaSh a)
data FishPSh = AnchovyPSh | LoxPSh | TunaPSh
and then an alternative closer to TLMLer perhaps
data PizzaSh2 a = CrustSh2 | ToppingSh2 (a, PizzaSh2 a)
And from each I create a pizza
fpizza1 = ToppingSh AnchovyPSh (ToppingSh TunaPSh (ToppingSh LoxPSh CrustSh))
fpizza2 = ToppingSh2 (AnchovyPSh, ToppingSh2 (LoxPSh, ToppingSh2 (TunaPSh, CrustSh2)))
respectively, which are of type PizzaSh FishPSh and PizzaSh2 FishPSh respectively.
But the second version (which is arguably the closer to the original ML version) seems "offbeat." It's as if I'm creating a 2-tuple when I "cons" toppings together, where the second member recursively expands. May I assume the parametric data constructor "function" of PizzaSh2 doesn't literally build a tuple, it's just borrowing the tuple as a cons strategy, correct? Which is preferable in Haskell, PizzaSh or PizzaSh2? As I understand, a tuple (cartesian product) data type will have a single constructor, e.g., data Point a b = Pt a b, not the disjoint union of or-ed together (|) constructors. In SML the "*" indicates product, i.e., tuple, but, again, is this just a "tuple-like thing," i.e., it's just a tuple-looking way to cons a pizza together?
In Haskell, we prefer this style:
data PizzaSh a = CrustSh | ToppingSh a (PizzaSh a)
There is no need in Haskell to use a tuple there, since a data constructor like ToppingSh can take multiple arguments.
Using an additional pair as in
data PizzaSh2 a = CrustSh2 | ToppingSh2 (a, PizzaSh2 a)
creates a type which is almost isomorphic to the previous one, but is more cumbersome to handle since it requires more parentheses. E.g.
foo (ToppingSh x y)
-- vs
foo (ToppingSh2 (x, y))
bar :: PizzaSh a -> ...
bar (ToppingSh x y) = ....
-- vs
bar (ToppingSh2 (x, y)) = ...
Further, the type is indeed only almost isomorphic. When using an additional pair, because of laziness, we have one more value which can be represented in the type: we have a correspondence
ToppingSh x y <-> ToppingSh2 (x, y)
which breaks down in the case
??? <-> ToppingSh2 undefined
That is, ToppingSh2 can be applied to a non-terminating (or otherwise exceptional), pair-valued expression, and that constructs a value which cannot be represented using ToppingSh.
Operationally, to achieve that GHC uses a double indirection (roughly pointer-to-pointer, or thunk-returning-pair-of-thunks), which further slows down the code. Hence, it's also a bad choice from a performance point of view, if one cares about such micro-optimizations.
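The extra value can be observed with a small sketch. The names isTopping2, extraValue and toPizzaSh are hypothetical, introduced only for this illustration; the two data types are the ones from the question.

```haskell
data PizzaSh a = CrustSh | ToppingSh a (PizzaSh a)
data PizzaSh2 a = CrustSh2 | ToppingSh2 (a, PizzaSh2 a)

-- Looks only at the outer constructor; the pair inside is never forced.
isTopping2 :: PizzaSh2 a -> Bool
isTopping2 (ToppingSh2 _) = True
isTopping2 CrustSh2 = False

-- The value with no counterpart in PizzaSh: a defined constructor
-- wrapped around an undefined pair.
extraValue :: PizzaSh2 ()
extraValue = ToppingSh2 undefined

-- Converting has to match on the pair, so forcing
-- toPizzaSh extraValue would hit the undefined.
toPizzaSh :: PizzaSh2 a -> PizzaSh a
toPizzaSh CrustSh2 = CrustSh
toPizzaSh (ToppingSh2 (x, rest)) = ToppingSh x (toPizzaSh rest)
```

Here isTopping2 extraValue evaluates to True without touching the pair, which is exactly the distinction that a flattened representation could not preserve.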
As far as the Haskell side, it absolutely is nesting a (,) constructor inside the ToppingSh2 constructor. It would violate Haskell's non-strict semantics to not do the nesting you requested. If the nesting was removed, you'd be unable to distinguish between undefined :: PizzaSh2 () and ToppingSh2 undefined :: PizzaSh2 (). And yes, most of the time, that isn't what you want. PizzaSh is the much more natural formulation in Haskell unless you have a particular need to be able to introduce another bottom into the evaluation process.
I can't address what's going on behind the scenes in any particular ML implementation. Though I can say that with strict evaluation semantics, there isn't a behavioral difference to observe, meaning compilers are free to use a wider variety of approaches.

Check if tuple or triple in haskell

Is there a way to check how many elements (,) has? I know I can access the first and second elements of a tuple with fst and snd, and I thought I could somehow sum the elements, compare the result with fst tuple + snd tuple, and check like this:
tuple = (1,2)
sum tuple == fst tuple + snd tuple
and then I get True for this case and False for triple = (1,2,3). Either way, I can't call fst (1,2,3), nor can I do sum tuple.
Is there a way to check if I have a tuple or not?
Something like this:
is_tuple :: (a,b) -> a
is_tuple (a,_) = a
but to get True when I input a tuple and False when I give (1,2,3) or (1,2,3,4) and so on as input.
i.e:
is_tuple :: Tuple -> Bool
is_tuple x = if x is Tuple
then True
else False
?
Is there a way to check how many elements (,) has?
No, because the answer is always 2.
The type (,) is the type constructor for a tuple of two values, or a 2-tuple. It is a distinct type from (,,), the constructor for 3-tuples. Likewise, both those types are distinct from (,,,), the constructor for 4-tuples, and so on.
When you write a function with type (Foo, Bar) -> Baz, the typechecker will reject any attempts to call the function with a tuple of a different number of values (or something that isn’t a tuple at all). Therefore, your isTuple function only has one logical implementation,
isTuple :: (a, b) -> Bool
isTuple _ = True
…since it is impossible to ever actually call isTuple with a value that is not a 2-tuple.
Without using typeclasses, it is impossible in Haskell to write a function that accepts a tuple of arbitrary size; that is, you cannot be polymorphic over the size of a tuple. This is because, unlike lists, tuples are heterogenous—they can contain values of different types. A function that accepts a tuple of varying length would have no way to predict which elements of the tuple are of which type, and therefore it wouldn’t be able to actually do anything useful.
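For completeness, here is a rough sketch of what the typeclass route could look like. The class name TupleSize and its method tupleSize are hypothetical, not part of any standard library, and each tuple arity still needs its own instance written by hand.

```haskell
-- Hypothetical class: one instance per tuple arity.
class TupleSize t where
  tupleSize :: t -> Int

instance TupleSize (a, b) where
  tupleSize _ = 2

instance TupleSize (a, b, c) where
  tupleSize _ = 3

instance TupleSize (a, b, c, d) where
  tupleSize _ = 4
```

The size is still decided entirely at compile time from the argument's type, so this is instance selection rather than a runtime check; calling tupleSize on something without an instance is a type error.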
Very rarely, when doing advanced, type-level trickery, it can be useful to have a type that represents a tuple of varying length, which in Haskell is frequently known as an HList (for heterogenous list). These can be implemented as a library using fancy typeclass machinery and type-level programming. However, if you are a beginner, this is definitely not what you want.
It is difficult to actually give advice on what you should do because, as a commenter points out, your question reads like an XY problem. Consider asking a different question that gives a little more context about the problem you were actually trying to solve that made you want to find the length of a tuple in the first place, and you’ll most likely get more helpful answers.

Redundancy regarding product types and tuples in Haskell

In Haskell you have product types and you have tuples.
You use tuples if you don't want to associate a dedicated type with the value, and you can use product types if you wish to do so.
However I feel there is redundancy in the notation of product types
data Foo = Foo (String, Int, Char)
data Bar = Bar String Int Char
Why are there both kinds of notations? Is there any case where you would prefer one over the other?
I guess you can't use record notation when using tuples, but that's just a convenience problem. Another thing might be the notion of order in tuples, as opposed to product types, but I think that's just due to the naming of the functions fst and snd.
@chi's answer is about the technical differences in terms of Haskell's evaluation model. I hope to give you some insight into the philosophy of this sort of typed programming.
In category theory we generally work with objects "up to isomorphism". Your Bar is of course isomorphic to (String, Int, Char), so from a categorical perspective they're the same thing.
bar_tuple :: Iso' Bar (String, Int, Char)
bar_tuple = iso to from
    where to (Bar s i c) = (s, i, c)
          from (s, i, c) = Bar s i c
In some sense tuples are a Platonic form of product type, in that they have no meaning beyond being a collection of disparate values. All the other product types can be mapped to and from a plain old tuple.
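The same isomorphism can be sketched without the lens library, as a pair of plain functions. The names barToTuple and tupleToBar are hypothetical; the Eq and Show instances are derived here only so that the round trip can be checked.

```haskell
data Bar = Bar String Int Char deriving (Eq, Show)

-- One direction of the isomorphism.
barToTuple :: Bar -> (String, Int, Char)
barToTuple (Bar s i c) = (s, i, c)

-- The other direction.
tupleToBar :: (String, Int, Char) -> Bar
tupleToBar (s, i, c) = Bar s i c
```

For fully defined values, tupleToBar . barToTuple and barToTuple . tupleToBar both behave as the identity, which is what "isomorphic" means in practice here.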
So why not use tuples everywhere, when all Haskell types ultimately boil down to a sum of products? It's about communication. As Martin Fowler says,
Any fool can write code that a computer can understand. Good programmers write code that humans can understand.
Names are important! Writing down a custom product type like
data Customer = Customer { name :: String, address :: String }
imbues the type Customer with meaning to the person reading the code, unlike (String, String) which just means "two strings".
Custom types are particularly useful when you want to enforce invariants by hiding the representation of your data and using smart constructors:
newtype NonEmpty a = NonEmpty [a]
nonEmpty :: [a] -> Maybe (NonEmpty a)
nonEmpty [] = Nothing
nonEmpty xs = Just (NonEmpty xs)
Now, if you don't export the NonEmpty constructor, you can force people to go through the nonEmpty smart constructor. If someone hands you a NonEmpty value you may safely assume that it has at least one element.
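A small sketch of how that guarantee gets used. The accessor neHead is hypothetical, and the export list is only shown as a comment so that the snippet stays a single loadable file; in a real module the constructor would be left out of the exports.

```haskell
-- In a real module the header would hide the constructor, e.g.
--   module NonEmptyList (NonEmpty, nonEmpty, neHead) where

newtype NonEmpty a = NonEmpty [a]

-- Smart constructor: the only sanctioned way to build a NonEmpty.
nonEmpty :: [a] -> Maybe (NonEmpty a)
nonEmpty [] = Nothing
nonEmpty xs = Just (NonEmpty xs)

-- Hypothetical accessor relying on the invariant.
neHead :: NonEmpty a -> a
neHead (NonEmpty (x:_)) = x
neHead (NonEmpty []) = error "unreachable when values come from nonEmpty"
```

The error branch exists only to keep the pattern match total; it cannot be reached by clients that have no access to the raw constructor.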
You can of course represent Customer as a tuple under the hood and expose evocatively-named field accessors,
newtype Customer = Customer (String, String)
name, address :: Customer -> String
name (Customer (n, a)) = n
address (Customer (n, a)) = a
but this doesn't really buy you much, except that it's now cheaper to convert Customer to a tuple (if, say, you're writing performance-sensitive code that works with a tuple-oriented API).
If your code is intended to solve a particular problem - which of course is the whole point of writing code - it pays to not just solve the problem, but make it look like you've solved it too. Someone - maybe you in a couple of years - is going to have to read this code and understand it with no a priori knowledge of how it works. Custom types are a very important communication tool in this regard.
The type
data Foo = Foo (String, Int, Char)
represents a double-lifted tuple. Its values comprise
undefined
Foo undefined
Foo (undefined, undefined, undefined)
etc.
This is usually troublesome. Because of this, it's rare to see such definitions in actual code. We either have plain data types
data Foo = Foo String Int Char
or newtypes
newtype Foo = Foo (String, Int, Char)
The newtype can be just as inconvenient to use, but at least it does not double-lift the tuple: undefined and Foo undefined are now equal values.
The newtype also provides zero-cost conversion between a plain tuple and Foo, in both directions.
You can see such newtypes in use e.g. when the programmer needs a different instance for some type class, than the one already associated with the tuple. Or, perhaps, it is used in a "smart constructor" idiom.
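The difference in lifting can be observed through pattern matching. In this sketch the two variants are renamed DFoo and NFoo (hypothetical names) so that both can live in one file.

```haskell
data DFoo = DFoo (String, Int, Char)
newtype NFoo = NFoo (String, Int, Char)

-- Matching a data constructor forces the argument to weak head normal form.
matchD :: DFoo -> String
matchD (DFoo _) = "matched"

-- Matching a newtype constructor forces nothing: the wrapper
-- does not exist at runtime.
matchN :: NFoo -> String
matchN (NFoo _) = "matched"
```

With these definitions, matchD (DFoo undefined) and matchN undefined both return "matched", whereas matchD undefined diverges. That is the extra level of undefinedness the data version carries.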
I would not expect the pattern used in Foo to be frequent. There is a slight difference in what the constructor acts like: Foo :: (String, Int, Char) -> Foo as opposed to Bar :: String -> Int -> Char -> Bar. Then Foo undefined and Foo (undefined, ..., ...) are strictly speaking different things, whereas you miss one level of undefinedness in Bar.

Does Haskell have return type overloading?

Based on what I've read about Haskell, and the experimentation I've done with GHC, it seems like Haskell has return type overloading (aka ad hoc polymorphism). One example of this is the fromInteger function which can give you a Double or an Integer depending on where the result is used. For example:
fd :: Double -> String
fd x = "Double"
fi :: Integer -> String
fi x = "Integer"
fd (fromInteger 5) -- returns "Double"
fi (fromInteger 5) -- returns "Integer"
A Gentle Introduction to Haskell seems to agree with this when it says:
The kind of polymorphism that we have talked about so far is commonly called parametric polymorphism. There is another kind called ad hoc polymorphism, better known as overloading. Here are some examples of ad hoc polymorphism:
The literals 1, 2, etc. are often used to represent both fixed and arbitrary precision integers.
If the numeric literals are considered to be an example of ad hoc polymorphism (aka overloading), then it seems that the same is true for the result of functions like fromInteger.
And in fact, I've found some answers to other questions on Stack Overflow that suggest that Haskell has overloading by return type.
However, at least one Haskell programmer has told me that this isn't return type overloading, and is instead an example of "parametric polymorphism, where the parameter is bound by a universal quantifier".
I think what he's getting at is that fromInteger is returning a value from every instance of Num (sort of a nondeterministic type).
That seems like a reasonable interpretation, but as far as I can tell, Haskell never lets us look at more than one of these instance values (thanks in part to the Monomorphism restriction). It also seems like the actual instance whose value we look at can be determined statically. Because of all of this, it seems reasonable to say that in the expression fd (fromInteger 5) the subexpression fromInteger 5 is of type Double, while in the expression fi (fromInteger 5) the subexpression fromInteger 5 is of type Integer.
So, does Haskell have return type overloading?
If not, please provide an example of one of the following:
valid Haskell code that would have different behavior if Haskell had return type overloading
valid Haskell code that would be invalid if Haskell had return type overloading
invalid Haskell code that would be valid if Haskell had return type overloading
Well, one way to look at it is that Haskell translates the return type polymorphism that you're thinking of into parametric polymorphism, using something called the dictionary-passing translation for type classes. (Though this is not the only way to implement type classes or reason about them; it's just the most popular.)
Basically, fromInteger has this type in Haskell:
fromInteger :: Num a => Integer -> a
That might be translated internally into something like this:
fromInteger# :: NumDictionary# a -> Integer -> a
fromInteger# NumDictionary# { fromInteger = method } x = method x
data NumDictionary# a = NumDictionary# { ...
                                       , fromInteger :: Integer -> a
                                       , ... }
So for each concrete type T with a Num instance, there's a NumDictionary# T value that contains a function fromInteger :: Integer -> T, and all code that uses the Num type class is translated into code that takes a dictionary as the argument.
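The translation can be imitated by hand in ordinary Haskell. This is only a sketch of the idea, not what GHC literally generates: NumDict, its two fields, and the functions five and twice are hypothetical names, and the record carries just two of the Num methods.

```haskell
-- A hand-written "dictionary" holding two of the Num methods.
data NumDict a = NumDict
  { dictFromInteger :: Integer -> a
  , dictPlus        :: a -> a -> a
  }

-- One dictionary value per instance.
numDictInteger :: NumDict Integer
numDictInteger = NumDict { dictFromInteger = id, dictPlus = (+) }

numDictDouble :: NumDict Double
numDictDouble = NumDict { dictFromInteger = fromInteger, dictPlus = (+) }

-- Class version would be:  five :: Num a => a;  five = fromInteger 5
five :: NumDict a -> a
five d = dictFromInteger d 5

-- Class version would be:  twice :: Num a => a -> a;  twice x = x + x
twice :: NumDict a -> a -> a
twice d x = dictPlus d x x
```

Seen this way, five is an ordinary parametrically polymorphic function; the "overloading" lives entirely in which dictionary the caller passes, e.g. five numDictDouble versus five numDictInteger.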
The seminal paper on Haskell-style typeclasses is called "How to make ad-hoc polymorphism less ad hoc". So, the answer to your question is a qualified "yes" -- depending on just how ad hoc you require your return-type overloading to be...
In other words: there is no question that ad hoc polymorphism is relevant to typeclasses, since that was a motivating example for inventing them. But whether you think the result still qualifies as "return-type overloading" depends on the fiddly details of your favored definition.
I'd like to address one small part of your question:
It also seems like the actual instance whose value we look at can be determined statically.
This isn't really accurate. Consider the following wacky data type:
data PerfectlyBalancedTree a
    = Leaf a
    | Branch (PerfectlyBalancedTree (a,a))
    deriving (Eq, Ord, Show, Read)
Let's gawk at that type for a second first before we move on to the good bits. Here are a few typical values of the type PerfectlyBalancedTree Integer:
Leaf 0
Branch (Leaf (0, 0))
Branch (Branch (Leaf ((0,0),(0,0))))
Branch (Branch (Branch (Leaf (((0,0),(0,0)),((0,0),(0,0))))))
In fact, you can visualize any value of this type as being an initial sequence of n Branch tags followed by a "we're finally done, thank goodness" Leaf tag followed by a 2^n-tuple of the contained type. Cool.
Now, we're going to write a function to parse a slightly more convenient representation for these. Here's a couple example invocations:
*Main> :t fromString
fromString :: String -> PerfectlyBalancedTree Integer
*Main> fromString "0"
Leaf 0
*Main> fromString "b(42,69)"
Branch (Leaf (42,69))
*Main> fromString "bbb(((0,0),(0,0)),((0,0),(0,0)))"
Branch (Branch (Branch (Leaf (((0,0),(0,0)),((0,0),(0,0))))))
Along the way, it will be convenient to write a slightly more polymorphic function. Here it is:
fromString' :: Read a => String -> PerfectlyBalancedTree a
fromString' ('b':rest) = Branch (fromString' rest)
fromString' leaf = Leaf (read leaf)
Now our real function is just the same thing with a different type signature:
fromString :: String -> PerfectlyBalancedTree Integer
fromString = fromString'
But wait a second... what just happened here? I slipped something by you big time! Why didn't we just write this directly?
fromStringNoGood :: String -> PerfectlyBalancedTree Integer
fromStringNoGood ('b':rest) = Branch (fromStringNoGood rest)
fromStringNoGood leaf = Leaf (read leaf)
The reason is that in the recursive call, fromStringNoGood has a different type. It's not being called on to return a PerfectlyBalancedTree Integer, it's being called on to return a PerfectlyBalancedTree (Integer, Integer). We could write ourselves such a function...
fromStringStillNoGood :: String -> PerfectlyBalancedTree Integer
fromStringStillNoGood ('b':rest) = Branch (helper rest)
fromStringStillNoGood leaf = Leaf (read leaf)
helper :: String -> PerfectlyBalancedTree (Integer, Integer)
helper ('b':rest) = Branch ({- ... what goes here, now? -})
helper leaf = Leaf (read leaf)
... but this way lies an infinite regress into writing deeperly and deeperly nested types.
The problem is that, even though we're interested in a monomorphic top-level function, we nevertheless cannot determine statically what type read is being called at in the polymorphic function it uses! The data we're passed determines what type of tuple read should return: more bs in the String means a deeper-nested tuple.
You're right: Haskell does have overloading and it provides it through its type-class mechanism.
Consider the following signatures:
f :: [a] -> a
g :: Num a => [a] -> a
The first signature tells you that given a list of elements of any type a, f will produce a value of type a. This means that the implementation of f cannot make any assumptions about the type a and what operations it admits. This is an example of parametric polymorphism. A moment's reflection reveals that there are actually very few options for implementing f: the only thing you can do is select an element from the provided list. Conceptually, there is a single generic implementation of f that works for all types a.
The second signature tells you that given a list of elements of some type a that belongs to the type class Num, g will produce a value of that type a. This means that the implementation of g can consume, produce, and manipulate values of type a using all operations that come with the type class Num. For example, g can add or multiply the elements of the list, negate them, return a lifted constant, ... (Selecting the minimum of the list would additionally need an Ord constraint.) This is an example of overloading, which is typically taken to be a form of ad-hoc polymorphism (the other main form being coercion). Conceptually, there is a different implementation for g for all types a in Num.
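To make the contrast concrete, here is one possible implementation for each signature. The names firstElem and total are hypothetical stand-ins for f and g.

```haskell
-- Parametric: knows nothing about a, so it can only pick an element.
firstElem :: [a] -> a
firstElem (x:_) = x
firstElem [] = error "firstElem: empty list"

-- Overloaded: may use the Num operations, here (+) and the literal 0.
total :: Num a => [a] -> a
total = foldr (+) 0
```

The same total works on [Int] and [Double], but each use goes through a different Num instance, while firstElem runs the same code whatever the element type is.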
It has return type overloading. For a good example see the read function. It has the type Read a => String -> a. It can read and return a value of any type that implements the Read type class.
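A short sketch of read being driven purely by the expected result type; the binding names are hypothetical.

```haskell
-- The same function, three different result types chosen by annotation.
asInt :: Int
asInt = read "42"

asDouble :: Double
asDouble = read "42"

asList :: [Int]
asList = read "[1,2,3]"
```

Without an annotation (or some other use that pins the type down), a bare read "42" is ambiguous and is rejected by the type checker.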

Functions don't just have types: They ARE Types. And Kinds. And Sorts. Help put a blown mind back together

I was doing my usual "Read a chapter of LYAH before bed" routine, feeling like my brain was expanding with every code sample. At this point I was convinced that I understood the core awesomeness of Haskell, and now just had to understand the standard libraries and type classes so that I could start writing real software.
So I was reading the chapter about applicative functors when all of a sudden the book claimed that functions don't merely have types, they are types, and can be treated as such (For example, by making them instances of type classes). (->) is a type constructor like any other.
My mind was blown yet again, and I immediately jumped out of bed, booted up the computer, went to GHCi and discovered the following:
Prelude> :k (->)
(->) :: ?? -> ? -> *
What on earth does it mean?
If (->) is a type constructor, what are the value constructors? I can take a guess, but would have no idea how to define it in the traditional data (->) ... = ... | ... | ... format. It's easy enough to do this with any other type constructor: data Either a b = Left a | Right b. I suspect my inability to express it in this form is related to the extremely weird type signature.
What have I just stumbled upon? Higher kinded types have kind signatures like * -> * -> *. Come to think of it... (->) appears in kind signatures too! Does this mean that not only is it a type constructor, but also a kind constructor? Is this related to the question marks in the type signature?
I have read somewhere (wish I could find it again, Google fails me) about being able to extend type systems arbitrarily by going from Values, to Types of Values, to Kinds of Types, to Sorts of Kinds, to something else of Sorts, to something else of something elses, and so on forever. Is this reflected in the kind signature for (->)? Because I've also run into the notion of the Lambda cube and the calculus of constructions without taking the time to really investigate them, and if I remember correctly it is possible to define functions that take types and return types, take values and return values, take types and return values, and take values which return types.
If I had to take a guess at the type signature for a function which takes a value and returns a type, I would probably express it like this:
a -> ?
or possibly
a -> *
Although I see no fundamental immutable reason why the second example couldn't easily be interpreted as a function from a value of type a to a value of type *, where * is just a type synonym for string or something.
The first example better expresses a function whose type transcends a type signature in my mind: "a function which takes a value of type a and returns something which cannot be expressed as a type."
You touch so many interesting points in your question, so I am
afraid this is going to be a long answer :)
Kind of (->)
The kind of (->) is * -> * -> *, if we disregard the boxity GHC
inserts. But there is no circularity going on, the ->s in the
kind of (->) are kind arrows, not function arrows. Indeed, to
distinguish them kind arrows could be written as (=>), and then
the kind of (->) is * => * => *.
We can regard (->) as a type constructor, or maybe rather a type
operator. Similarly, (=>) could be seen as a kind operator, and
as you suggest in your question we need to go one 'level' up. We
return to this later in the section Beyond Kinds, but first:
How the situation looks in a dependently typed language
You ask how the type signature would look for a function that takes a
value and returns a type. This is impossible to do in Haskell:
functions cannot return types! You can simulate this behaviour using
type classes and type families, but for illustration let us switch
to the dependently typed language Agda. This is a language with
syntax similar to Haskell's, where juggling types together with
values is second nature.
To have something to work with, we define a data type of natural
numbers, for convenience in unary representation as in
Peano Arithmetic. Data types are written in GADT style:
data Nat : Set where
    Zero : Nat
    Succ : Nat -> Nat
Set is equivalent to * in Haskell, the "type" of all (small) types,
such as Natural numbers. This tells us that the type of Nat is
Set, whereas in Haskell, Nat would not have a type, it would have
a kind, namely *. In Agda there are no kinds, but everything has
a type.
We can now write a function that takes a value and returns a type.
Below is a function which takes a natural number n and a type,
and applies the List constructor n times to this type. (In Agda,
[a] is usually written List a.)
listOfLists : Nat -> Set -> Set
listOfLists Zero a = a
listOfLists (Succ n) a = List (listOfLists n a)
Some examples:
listOfLists Zero Bool = Bool
listOfLists (Succ Zero) Bool = List Bool
listOfLists (Succ (Succ Zero)) Bool = List (List Bool)
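For comparison, a rough Haskell analogue of listOfLists can be sketched with a closed type family over a promoted Nat. This is an approximation under assumptions: it relies on the GHC extensions named in the pragma, the number lives at the type level rather than being an ordinary run-time value, and the names ListOfLists and twoDeep are hypothetical.

```haskell
{-# LANGUAGE DataKinds, TypeFamilies, KindSignatures #-}

import Data.Kind (Type)

data Nat = Zero | Succ Nat

-- Type-level function: wrap a in n layers of lists.
type family ListOfLists (n :: Nat) (a :: Type) :: Type where
  ListOfLists 'Zero a = a
  ListOfLists ('Succ n) a = [ListOfLists n a]

-- ListOfLists ('Succ ('Succ 'Zero)) Bool reduces to [[Bool]].
twoDeep :: ListOfLists ('Succ ('Succ 'Zero)) Bool
twoDeep = [[True, False], [True]]
```

Unlike the Agda version, this cannot take a Nat computed at run time; that gap is what separates simulating dependent types from having them.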
We can now make a map function that operates on listOfLists.
We need to take a natural number that is the number of iterations
of the list constructor. There are two base cases: when the number
is Zero, listOfLists is just the identity and we apply the function;
when the list is empty, the empty list is returned.
The step case is a bit more involved: we apply mapN to the head
of the list, which has one layer less of nesting, and mapN
to the rest of the list.
mapN : {a b : Set} -> (a -> b) -> (n : Nat) ->
       listOfLists n a -> listOfLists n b
mapN f Zero x = f x
mapN f (Succ n) [] = []
mapN f (Succ n) (x :: xs) = mapN f n x :: mapN f (Succ n) xs
In the type of mapN, the Nat argument is named n, so the rest of
the type can depend on it. So this is an example of a type that
depends on a value.
As a side note, there are also two other named variables here,
namely the first arguments, a and b, of type Set. Type
variables are implicitly universally quantified in Haskell, but
here we need to spell them out, and specify their type, namely
Set. The brackets are there to make them invisible in the
definition, as they are always inferable from the other arguments.
Set is abstract
You ask what the constructors of (->) are. One thing to point out
is that Set (as well as * in Haskell) is abstract: you cannot
pattern match on it. So this is illegal Agda:
cheating : Set -> Bool
cheating Nat = True
cheating _ = False
Again, you can simulate pattern matching on type constructors in
Haskell using type families; one canonical example is given on
Brent Yorgey's blog.
Can we define -> in Agda? Since we can return types from
functions, we can define our own version of -> as follows:
_=>_ : Set -> Set -> Set
a => b = a -> b
(infix operators are written _=>_ rather than (=>)) This
definition has very little content, and is very similar to doing a
type synonym in Haskell:
type Fun a b = a -> b
Beyond kinds: Turtles all the way down
As promised above, everything in Agda has a type, but then
the type of _=>_ must have a type! This touches your point
about sorts, which is, so to speak, one layer above Set (the kinds).
In Agda this is called Set1:
FunType : Set1
FunType = Set -> Set -> Set
And in fact, there is a whole hierarchy of them! Set is the type of
"small" types: data types in Haskell. But then we have Set1,
Set2, Set3, and so on. Set1 is the type of types which mention
Set. This hierarchy is there to avoid inconsistencies such as
Girard's paradox.
As noticed in your question, -> is used for types and kinds in
Haskell, and the same notation is used for function space at all
levels in Agda. This must be regarded as a built in type operator,
and the constructors are lambda abstraction (or function
definitions). This hierarchy of types is similar to the setting in
System F omega, and more
information can be found in the later chapters of
Pierce's Types and Programming Languages.
Pure type systems
In Agda, types can depend on values, and functions can return types,
as illustrated above, and we also saw a hierarchy of types. The
different systems of the lambda calculus are studied systematically,
and in more detail, under the name Pure Type Systems. A good
reference is
Lambda Calculi with Types by Barendregt,
where PTS are introduced on page 96, with many examples from page 99 onwards.
You can also read more about the lambda cube there.
Firstly, the ?? -> ? -> * kind is a GHC-specific extension. The ? and ?? are just there to deal with unboxed types, which behave differently from just * (which has to be boxed, as far as I know). So ?? can be any normal type or an unboxed type (e.g. Int#); ? can be either of those or an unboxed tuple. There is more information here: Haskell Weird Kinds: Kind of (->) is ?? -> ? -> *
I think a function can't return an unboxed type because functions are lazy. Since a lazy value is either a value or a thunk, it has to be boxed. Boxed just means it is a pointer rather than just a value: it's like Integer vs int in Java.
Since you are probably not going to be using unboxed types in LYAH-level code, you can imagine that the kind of -> is just * -> * -> *.
Since the ? and ?? are basically just more general versions of *, they do not have anything to do with sorts or anything like that.
However, since -> is just a type constructor, you can actually partially apply it; for example, (->) e is an instance of Functor and Monad. Figuring out how to write these instances is a good mind-stretching exercise.
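If you would rather see one possible answer to that exercise, here is a sketch of the Functor case (spoiler ahead). The wrapper Fn is hypothetical; it exists only because base already defines the instance for (->) e itself, so a second one cannot be declared directly.

```haskell
-- Hypothetical wrapper around functions from e to a.
newtype Fn e a = Fn { runFn :: e -> a }

-- Mapping over a function's result is just composition.
instance Functor (Fn e) where
  fmap f (Fn g) = Fn (f . g)

-- The built-in instance for (->) e behaves the same way.
builtIn :: Int -> String
builtIn = fmap show (+ 1)

wrapped :: Int -> String
wrapped = runFn (fmap show (Fn (+ 1)))
```

Both builtIn and wrapped add one to their argument and then render the result as a string, so fmap for functions coincides with (.).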
As far as value constructors go, they would have to just be lambdas (\ x ->) or function declarations. Since functions are so fundamental to the language, they get their own syntax.

Resources