Please forgive me if this question is dumb.
While reading about Haskell kinds, I notice a theme:
*
* -> *
* -> * -> *
I get the impression that kinds in Haskell ultimately boil down to how many asterisks there are. You might say that the kind of a type is really just the number of types you need to apply to it before it becomes *. In other words, you could count all *'s but the last one, and define a type's kind by an integer. Say 0, 1, 2, etc.
Here's my question:
Is this a correct observation about Haskell's type system? Or does it allow something other than *'s to go where you typically see *'s? For example:
* -> a -> *
I imagine someone might want to do this to constrain type variables to have an instance of a type class, for example.
Functor a, Applicative b => * -> a -> b -> *
Is that a thing?
The most basic form of the kind language contains only * (or Type in more modern Haskell; I suspect we'll eventually move away from *) and ->.
But there are more things you can build with that language than you can express by just "counting the number of *s". It's not just the number of * or -> that matters, but how they are nested. For example * -> * -> * is the kind of things that take two type arguments to produce a type, but (* -> *) -> * is the kind of things that take a single argument to produce a type where that argument itself must be a thing that takes a type argument to produce a type. data ThreeStars a b = Cons a b makes a type constructor with kind * -> * -> *, while data AlsoThreeStars f = AlsoCons (f Integer) makes a type constructor with kind (* -> *) -> *.
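A quick way to see the difference is to ask GHCi (a sketch; the prompt is illustrative, but these are the kinds GHC reports):

ghci> data ThreeStars a b = Cons a b
ghci> :kind ThreeStars
ThreeStars :: * -> * -> *
ghci> data AlsoThreeStars f = AlsoCons (f Integer)
ghci> :kind AlsoThreeStars
AlsoThreeStars :: (* -> *) -> *

Both kinds contain three *s, but the parentheses make them very different kinds.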
There are several language extensions that add more features to the kind language.
PolyKinds adds kind variables that work exactly the same way type variables work. Now we can have kinds like forall k. (* -> k) -> k.
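For example, with PolyKinds the familiar Proxy type (the one in Data.Proxy, shown here as a standalone sketch; intProxy and maybeProxy are made-up helpers) gets an inferred kind-polymorphic kind:

{-# LANGUAGE PolyKinds #-}
data Proxy a = Proxy  -- inferred kind: Proxy :: forall k. k -> *

intProxy :: Proxy Int      -- k instantiated to *
intProxy = Proxy

maybeProxy :: Proxy Maybe  -- k instantiated to * -> *
maybeProxy = Proxy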
ConstraintKinds makes constraints (the stuff to the left of the => in type signatures, like Eq a) become ordinary type-level entities in a new kind: Constraint. Rather than the stuff left of the => being special purpose syntax fairly disconnected from the rest of the language, now what is acceptable there is anything with kind Constraint. Classes like Eq become type constructors with kind * -> Constraint; you apply it to a type like Eq Bool to produce a Constraint. The advantage is now we can use all of the language features for manipulating type-level entities to manipulate constraints (including PolyKinds!).
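A minimal sketch (the names Showy and describeMatch are made up): a constraint synonym is just a type synonym whose right-hand side has kind Constraint.

{-# LANGUAGE ConstraintKinds #-}
type Showy a = (Eq a, Show a)  -- Showy :: * -> Constraint

describeMatch :: Showy a => a -> a -> String
describeMatch x y
  | x == y    = "equal: " ++ show x
  | otherwise = show x ++ " /= " ++ show y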
DataKinds adds the ability to create new user-defined kinds containing new type-level things, in exactly the same way that in vanilla Haskell we can create new user-defined types containing new term-level things. (Exactly the same way; the way DataKinds actually works is that it lets you use a data declaration as normal and then you can use the resulting type constructor at either the type or the kind level)
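For example (a small sketch; Vec is the usual length-indexed vector idiom and also needs GADTs), one data declaration produces both a type and a kind:

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
data Nat = Zero | Succ Nat  -- also gives a kind Nat with type-level 'Zero and 'Succ

data Vec (n :: Nat) a where
  VNil  :: Vec 'Zero a
  VCons :: a -> Vec n a -> Vec ('Succ n) a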
There are also kinds used for unboxed/unlifted types, which must never be mixed with "normal" Haskell types because they have a different memory layout; they can't contain thunks to implement lazy evaluation, so the runtime has to know never to try to "enter" them as a code pointer, or look for additional header bits, etc. They need to be kept separate at the kind level so that ordinary type variables of kind * can't be instantiated with these unlifted/unboxed types (which would allow you to pass types that need special handling to generic code that doesn't know to provide that handling). I'm vaguely aware of this stuff but have never actually had to use it, so I won't add any more lest I get something wrong. (Anyone who knows what they're talking about enough to write a brief summary paragraph here, please feel free to edit the answer.)
There are probably some others I'm forgetting. But certainly the kind language is richer than the OP is imagining just with the basic Haskell features, and there is much more to it once you turn on a few (quite widely used) extensions.
Related
Browsing the Haddocks of various packages, I often come across instance documentation that looks like this (Control.Category):
Category k (Coercion k)
Category * (->)
or this (Control.Monad.Trans.Identity):
MonadTrans (IdentityT *)
What exactly does the kind signature here mean? It doesn't show up in the source, but I have already noticed that it seems to occur in modules that use the PolyKinds extension. I suspect it is probably like a TypeApplication but with a kind, so that e.g. the last example means that IdentityT is a monad transformer if its first argument has kind *.
So my questions are:
Is my interpretation correct and what exactly does the kind signature refer to?
In the first Category instance, how am I supposed to know that k is a kind and not a type? Or do I just have to know the arity of Category?
What is the source code analog to this syntax?
I am not asking for an explanation of kinds.
To quote Richard Eisenberg’s recent post on the haskell-cafe mailing list:
Haddock struggles sometimes to render types with -XPolyKinds enabled. The problem is that GHC generally does not require kind arguments to be written and it does not print them out (unless you say -fprint-explicit-kinds). But Haddock, I believe, prints out kinds whenever -XPolyKinds is on. So the two different definitions are really the same: it's just that one module has -XPolyKinds and the other doesn't.
The * is the kind of ordinary types. So Int has kind * (we write Int :: *) while Maybe has kind * -> *. Typeable actually has kind forall k. k -> Constraint, meaning that it's polykinded. In the first snippet below, the * argument to Typeable instantiates k with *, because type variable a has kind *.
So yes, as you guessed, it has to do with PolyKinds. Haddock renders these poly-kinded types with a sort of “explicit kind application”. It just so happens that Category is poly-kinded, having the kind forall k. (k -> k -> *) -> Constraint, so Haddock renders the kind application alongside each instance.
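You can reproduce what Haddock is doing from GHCi (a sketch; the exact rendering differs between GHC versions, and newer GHCs print kind applications with @ instead):

ghci> import Control.Category
ghci> :set -fprint-explicit-kinds
ghci> :info Category
class Category (cat :: k -> k -> *) where
  ...
instance Category * (->)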
In my opinion, this is a bug or misfeature of Haddock, since there is no equivalent source code analog as far as I know. It is confusing, and I don’t know of a better way to understand it than to recognize the way it usually manifests and visually infer what’s going on from the context.
I have pretty decent intuition about types Haskell prohibits as "impredicative": namely ones where a forall appears in an argument to a type constructor other than ->. But just what is predicativity? What makes it important? How does it relate to the word "predicate"?
The central question of these type systems is: "Can you substitute a polymorphic type in for a type variable?". Predicative type systems are the no-nonsense schoolmarm answering, "ABSOLUTELY NOT", while impredicative type systems are your carefree buddy who thinks that sounds like a fun idea and what could possibly go wrong?
Now, Haskell muddies the discussion a bit because it believes polymorphism should be useful but invisible. So for the remainder of this post, I will be writing in a dialect of Haskell where uses of forall are not just allowed but required. This way we can distinguish between the type a, which is a monomorphic type which draws its value from a typing environment that we can define later, and the type forall a. a, which is one of the harder polymorphic types to inhabit. We'll also allow forall to go pretty much anywhere in a type -- as we'll see, GHC restricts its type syntax as a "fail-fast" mechanism rather than as a technical requirement.
Suppose we have told the compiler id :: forall a. a -> a. Can we later ask to use id as if it had type (forall b. b) -> (forall b. b)? Impredicative type systems are okay with this, because we can instantiate the quantifier in id's type to forall b. b, and substitute forall b. b for a everywhere in the result. Predicative type systems are a bit more wary of that: only monomorphic types are allowed in. (So if we had a particular b, we could write id :: b -> b.)
There's a similar story about [] :: forall a. [a] and (:) :: forall a. a -> [a] -> [a]. While your carefree buddy may be okay with [] :: [forall b. b] and (:) :: (forall b. b) -> [forall b. b] -> [forall b. b], the predicative schoolmarm isn't, so much. In fact, as you can see from the only two constructors of lists, there is no way to produce lists containing polymorphic values without instantiating the type variable in their constructors to a polymorphic value. So although the type [forall b. b] is allowed in our dialect of Haskell, it isn't really sensible -- there are no (terminating) terms of that type. This motivates GHC's decision to complain if you even think about such a type -- it's the compiler's way of telling you "don't bother".*
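For the record, modern GHC (9.2 and later, via the Quick Look algorithm) does let you opt in to such types; a minimal sketch, using the inhabited type forall a. a -> a rather than the uninhabited forall b. b:

{-# LANGUAGE ImpredicativeTypes #-}
-- forall under the list constructor, not under (->): impredicative
ids :: [forall a. a -> a]
ids = [id, \x -> x]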
Well, what makes the schoolmarm so strict? As usual, the answer is about keeping type-checking and type-inference doable. Type inference for impredicative types is right out. Type checking seems like it might be possible, but it's bloody complicated and nobody wants to maintain that.
On the other hand, some might object that GHC is perfectly happy with some types that appear to require impredicativity:
> :set -XRank2Types
> :t id :: (forall b. b) -> (forall b. b)
{- no complaint, but very chatty -}
It turns out that some slightly-restricted versions of impredicativity are not too bad: specifically, type-checking higher-rank types (which allow type variables to be substituted by polymorphic types when they are only arguments to (->)) is relatively simple. You do lose type inference above rank-2, and principal types above rank-1, but sometimes higher rank types are just what the doctor ordered.
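A classic rank-2 example (a sketch; the name both is made up): the argument keeps its polymorphism inside the body, so it can be used at two different types, something no rank-1 type can express:

{-# LANGUAGE RankNTypes #-}
both :: (forall a. a -> a) -> (Int, Bool)
both g = (g 0, g True)  -- both id == (0, True)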
I don't know about the etymology of the word, though.
* You might wonder whether you can do something like this:
data FooTy a where
  FooTm :: FooTy (forall a. a)
Then you would get a term (FooTm) whose type had something polymorphic as an argument to something other than (->) (namely, FooTy); you wouldn't have to cross the schoolmarm to do it, and so the belief "applying non-(->) stuff to polymorphic types isn't useful because you can't make them" would be invalidated. GHC doesn't let you write FooTy, and I will admit I'm not sure whether there's a principled reason for the restriction or not.
(Quick update some years later: there is a good, principled reason that FooTm is still not okay. Namely, the way that GADTs are implemented in GHC is via type equalities, so the expanded type of FooTm is actually FooTm :: forall a. (a ~ forall b. b) => FooTy a. Hence to actually use FooTm, one would indeed need to instantiate a type variable with a polymorphic type. Thanks to Stephanie Weirich for pointing this out to me.)
Let me just add a point regarding the "etymology" issue, since the other answer by @DanielWagner covers much of the technical ground.
A predicate on something like a is a -> Bool. Now a predicate logic is one that can in some sense reason about predicates -- so if we have some predicate P and we can talk about, for a given a, P(a), now in a "predicate logic" (such as first-order logic) we can also say ∀a. P(a). So we can quantify over variables and discuss the behavior of predicates over such things.
Now, in turn, we say a statement is predicative if all of the things a predicate is applied to are introduced prior to it. So statements are "predicated on" things that already exist. In turn, a statement is impredicative if it can in some sense refer to itself by its "bootstraps".
So in the case of e.g. the id example above, we find that we can give a type to id such that it takes something of the type of id to something else of the type of id. So now we can give a function a type where a quantified variable (introduced by forall a.) can "expand" to be the same type as that of the entire function itself!
Hence impredicativity introduces the possibility of a certain "self reference". But wait, you might say, wouldn't such a thing lead to contradiction? The answer is: "well, sometimes." In particular, "System F", the polymorphic lambda calculus that is the essential core of GHC's Core language, allows a form of impredicativity that nonetheless has two levels: the value level, and the type level, which is allowed to quantify over itself. In this two-level stratification, we can have impredicativity and not contradiction/paradox.
Although note that this neat trick is very delicate and easy to screw up by the addition of more features, as this collection of articles by Oleg indicates: http://okmij.org/ftp/Haskell/impredicativity-bites.html
I'd like to make a comment on the etymology issue, since @sclv's answer isn't quite right (etymologically, not conceptually).
Go back in time, to the days of Russell when everything is set theory, including logic. One of the logical notions of particular import is the "principle of comprehension"; that is, given some logical predicate φ:A→2 we would like to have some principle to determine the set of all elements satisfying that predicate, written as "{x | φ(x) }" or some variation thereon. The key point to bear in mind is that "sets" and "predicates" are viewed as being fundamentally different things: predicates are mappings from objects to truth values, and sets are objects. Thus, for example, we may allow quantifying over sets but not quantifying over predicates.
Now, Russell was rather concerned by his eponymous paradox, and sought some way to get rid of it. There are numerous fixes, but the one of interest here is to restrict the principle of comprehension. But first, the formal definition of the principle: ∃S.∀x.S x ↔︎ φ(x); that is, for our particular φ there exists some object (i.e., set) S such that for every object (also a set, but thought of as an element) x, we have that S x (you can think of this as meaning "x∈S", though logicians of the time gave "∈" a different meaning than mere juxtaposition) is true just in case φ(x) is true. If we take the principle exactly as written then we end up with an impredicative theory. However, we can place restrictions on which φ we're allowed to take the comprehension of. (For example, if we say that φ must not contain any second-order quantifiers.) Thus, for any restriction R, if a set S is determined (i.e., generated via comprehension) by some R-predicate, then we say that S is "R-predicative". If every set in our language is R-predicative then we say that our language is "R-predicative". And then, as is often the case with hyphenated prefix things, the prefix gets dropped off and left implicit, whence "predicative" languages. And, naturally, languages which are not predicative are "impredicative".
That's the old school etymology. Since those days the terms have gone off and gotten lives of their own. The ways we use "predicative" and "impredicative" today are quite different, because the things we're concerned about have changed. So it can sometimes be a bit hard to see how the heck our modern usage ties back to this stuff. Honestly, I don't think knowing the etymology really helps any in terms of figuring out what the words are really about (these days).
I have read a book called Clean Code. One of the strongest messages that I took from the book is that code must be readable. I do not understand why functional languages such as F# do not include function parameter names in the IntelliSense hints or in the type definition.
Compare
val copy: string -> string -> unit
with
val copy: sourceFile:string -> destinationFile:string -> unit
What is the best practice in the functional world? Shall we prefer single-parameter functions? This is what Clean Code promotes. Or shall we use records for all functions of 2+ parameters?
I know that one work-around is to use a static member instead of let functions, but this is not a functional approach, is it?
EDIT:
Just to give more information:
Haskell: addThree :: Int -> Int -> Int -> Int
OCaml: Unix.unlink: (string -> unit)
and surely others. They just do not show parameter names in the type definitions.
If you practice typeful programming, which is the idea that a lot of the semantic content of programs can be reflected statically in the type system, you will find out that in many (but not all) cases, named arguments are not necessary for readability.
Consider the following examples in the List standard library of OCaml. By knowing that they operate on lists, and with the (hopefully clear: we're all for good name choices) name of the function, you will find that you don't need explanations for what the arguments do.
val append : α list -> α list -> α list
val flatten : α list list -> α list
val exists : (α -> bool) -> α list -> bool
val map: (α -> β) -> α list -> β list
val combine : α list -> β list -> (α * β) list
Note that the last example is interesting because it is not exactly clear what the code will do; there would in fact be several plausible interpretations. combine [1;2] [3;4] returns [(1,3); (2,4)] and not, for example, [(1,3); (1,4); (2,3); (2,4)]. It is also not clear what happens when the two lists do not have the same length (in fact combine raises an exception in that case).
Functions that are not total, that may raise an exception or not terminate, usually need more documentation about what the failure cases are and how they will behave. This is one strong argument in favor of what we call pure programming, where all the behavior of a function is expressed in terms of returning a value (no exceptions, observable state mutation, or non-termination), and can therefore be statically captured by the type system.
Of course this only works really well for functions that are parametric enough; they have a type that makes it very clear what they do. This is not the case for all functions; consider for example the blit function of the String module (I'm sure your favorite language has such examples as well):
val blit : string -> int -> string -> int -> int -> unit
huh?
Programming languages add support for named parameters for this reason. In OCaml, for example, we have "labels" that allow you to name parameters. The same function is exported in the StringLabels module as:
val blit : src:string -> src_pos:int -> dst:string -> dst_pos:int -> len:int -> unit
That's better. Yes, named parameters are useful in some cases.
Note however that named arguments can be used to hide bad API design (maybe the example above is open to this criticism as well). Consider:
val add : float -> float -> float -> float -> float -> float -> float * float * float
obscure, huh? But then:
type coord = {x:float; y:float; z:float}
val add : coord -> coord -> coord
That's much better, and I didn't need any parameter labeling (arguably record labels provide naming, but in fact I could equally use float * float * float here; the fact that value records may subsume named (and optional?) parameters is also an interesting remark).
David M. Barbour develops the argument that named parameters are a "crutch" of language design, used to paper over the laziness of API designers, and that not having them encourages better design. I am not sure I agree that named parameters can be profitably avoided in all situations, but he certainly has a point that agrees with the typeful propaganda at the beginning of my post. By raising the level of abstraction (through more polymorphic/parametric types or better problem-domain abstractions), you'll find you decrease the need for parameter naming.
Haskell's type synonyms can help in making the type signatures more self-documenting. Consider for example the function writeFile, that just writes a string to a file. It has two parameters: the string to write, and the filename to write to. Or was it the other way around? Both parameters are of type String, so it's not easy to tell which is which!
However, when you look at the documentation, you see the following type signature:
writeFile :: FilePath -> String -> IO ()
That makes it clear (to me, at least!) how the function is intended to be used. Now, since FilePath is just a synonym for String, there's nothing preventing you from using it like this:
writeFile "The quick brown fox jumped over the lazy dog" "test.txt"
but if you get the type FilePath -> String -> IO () as a hint in your IDE, I think that's at least a big push in the right direction!
You could even go a step further and make a newtype for filepaths so you don't accidentally mix up filenames and contents, but I guess that adds more hassle than it's worth, and there's probably also historical reasons why this isn't done.
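For what it's worth, here is a sketch of that newtype approach (the names Path and writePath are made up; Prelude's writeFile still takes a plain String underneath):

newtype Path = Path FilePath

writePath :: Path -> String -> IO ()
writePath (Path p) contents = writeFile p contents

-- The two arguments can no longer be swapped silently:
-- writePath (Path "test.txt") "The quick brown fox jumped over the lazy dog"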
Or am I completely wrong and it has nothing to do with differences between functional and imperative programming?
You're not completely wrong, insofar as functional languages with HM type inference often do not require type annotations at all, or at least not everywhere.
Add to this that type expressions are not necessarily function types, hence the concept of a "parameter name" is not applicable. All in all, a name there is just redundant, it does not add any information to a type, and that could be the reason for not allowing it.
Conversely, in imperative languages, type inference was until recently almost unknown. Thus you must declare everything (in statically typed languages), and so it happens that name and type tend to appear in one place. Moreover, as functions are not first-class citizens, the concept of a function type, let alone an expression that describes a function type, is simply unknown.
Observe that with recent developments (like "lambda" syntax, etc.) the concept of an argument whose type is already known or can easily be inferred also appears in those languages. And if I remember correctly, there is even syntactic sugar to avoid long names: the lambda argument is just it or even _.
What is the best practise in the functional world?
There is alternative standard library for OCaml called Core. It uses labeled parameters almost everywhere. For example
val fold_left : 'a t -> init:'b -> f:('b -> 'a -> 'b) -> 'b
P.S. I have no information about other functional languages.
What it says in the title. If I write a type signature, is it possible to algorithmically generate an expression which has that type signature?
It seems plausible that it might be possible to do this. We already know that if the type is a special-case of a library function's type signature, Hoogle can find that function algorithmically. On the other hand, many simple problems relating to general expressions are actually unsolvable (e.g., it is impossible to know if two functions do the same thing), so it's hardly implausible that this is one of them.
It's probably bad form to ask several questions all at once, but I'd like to know:
Can it be done?
If so, how?
If not, are there any restricted situations where it becomes possible?
It's quite possible for two distinct expressions to have the same type signature. Can you compute all of them? Or even some of them?
Does anybody have working code which does this stuff for real?
Djinn does this for a restricted subset of Haskell types, corresponding to a first-order logic. It can't manage recursive types or types that require recursion to implement, though; so, for instance, it can't write a term of type (a -> a) -> a (the type of fix), which corresponds to the proposition "if a implies a, then a", which is clearly false; you can use it to prove anything. Indeed, this is why fix gives rise to ⊥.
If you do allow fix, then writing a program to give a term of any type is trivial; the program would simply print fix id for every type.
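Spelled out (a sketch): with general recursion, divergence inhabits every type, which is exactly why the Curry-Howard reading collapses once fix is available:

import Data.Function (fix)

anything :: a
anything = fix id  -- typechecks at every type, but never terminates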
Djinn is mostly a toy, but it can do some fun things, like deriving the correct Monad instances for Reader and Cont given the types of return and (>>=). You can try it out by installing the djinn package, or using lambdabot, which integrates it as the #djinn command.
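A taste of a Djinn session, from memory (so treat the exact prompt and output formatting as approximate):

Djinn> f ? (a -> b) -> (b -> c) -> a -> c
f :: (a -> b) -> (b -> c) -> a -> c
f a b c = b (a c)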
Oleg at okmij.org has an implementation of this. There is a short introduction here but the literate Haskell source contains the details and the description of the process. (I'm not sure how this corresponds to Djinn in power, but it is another example.)
There are cases where there is no unique function:
fst', snd' :: (a, a) -> a
fst' (a,_) = a
snd' (_,b) = b
Not only this; there are cases where there are an infinite number of functions:
list0, list1, list2 :: [a] -> a
list0 l = l !! 0
list1 l = l !! 1
list2 l = l !! 2
-- etc.
-- Or
mkList0, mkList1, mkList2 :: a -> [a]
mkList0 _ = []
mkList1 a = [a]
mkList2 a = [a,a]
-- etc.
(If you only want total functions, then consider [a] as restricted to infinite lists for list0, list1 etc, i.e. data List a = Cons a (List a))
In fact, if you have recursive types, any types involving these correspond to an infinite number of functions. However, at least in the case above, there is a countable number of functions, so it is possible to create an (infinite) list containing all of them. But, I think the type [a] -> [a] corresponds to an uncountably infinite number of functions (again restrict [a] to infinite lists) so you can't even enumerate them all!
(Summary: there are types that correspond to a finite, countably infinite and uncountably infinite number of functions.)
This is impossible in general (especially for languages like Haskell, which does not even have the strong normalization property), and only possible in some (very) special cases (and for more restricted languages), such as when the codomain type has only one constructor (for example, a function f :: forall a. a -> () can be determined uniquely). To reduce the set of possible definitions for a given signature to a singleton set with just one definition, you need to impose more restrictions in the form of additional properties (and it is still difficult to imagine how this could help without giving an example of use).
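For instance, parametricity pins these signatures down to a single total, exception-free implementation (a sketch; the names are made up):

ignore :: a -> ()
ignore _ = ()  -- the only total definition

swap :: (a, b) -> (b, a)
swap (x, y) = (y, x)  -- likewise forced by the type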
From the (n-)categorical point of view, types correspond to objects, terms correspond to arrows (constructors also correspond to arrows), and function definitions correspond to 2-arrows. The question is analogous to the question of whether one can construct a 2-category with the required properties by specifying only a set of objects. It's impossible, since you need either an explicit construction for arrows and 2-arrows (i.e., writing terms and definitions), or a deductive system which allows you to deduce the necessary structure from a certain set of properties (which still need to be defined explicitly).
There is also an interesting question: given an ADT (i.e., a subcategory of Hask), is it possible to automatically derive instances for Typeable, Data (yes, using SYB), Traversable, Foldable, Functor, Pointed, Applicative, Monad, etc.? In this case, we have the necessary signatures as well as additional properties (for example, the monad laws; these properties cannot be expressed in Haskell, but they can be expressed in a language with dependent types). There are some interesting constructions:
http://ulissesaraujo.wordpress.com/2007/12/19/catamorphisms-in-haskell
which shows what can be done for the list ADT.
The question is actually rather deep and I'm not sure of the answer, if you're asking about the full glory of Haskell types including type families, GADT's, etc.
What you're asking is whether a program can automatically prove that an arbitrary type is inhabited (contains a value) by exhibiting such a value. A principle called the Curry-Howard Correspondence says that types can be interpreted as mathematical propositions, and the type is inhabited if the proposition is constructively provable. So you're asking if there is a program that can prove a certain class of propositions to be theorems. In a language like Agda, the type system is powerful enough to express arbitrary mathematical propositions, and proving arbitrary ones is undecidable by Gödel's incompleteness theorem. On the other hand, if you drop down to (say) pure Hindley-Milner, you get a much weaker and (I think) decidable system. With Haskell 98, I'm not sure, because type classes are supposed to be able to encode the equivalent of GADTs.
With GADT's, I don't know if it's decidable or not, though maybe some more knowledgeable folks here would know right away. For example it might be possible to encode the halting problem for a given Turing machine as a GADT, so there is a value of that type iff the machine halts. In that case, inhabitability is clearly undecidable. But, maybe such an encoding isn't quite possible, even with type families. I'm not currently fluent enough in this subject for it to be obvious to me either way, though as I said, maybe someone else here knows the answer.
(Update:) Oh a much simpler interpretation of your question occurs to me: you may be asking if every Haskell type is inhabited. The answer is obviously not. Consider the polymorphic type
a -> b
There is no function with that signature (not counting something like unsafeCoerce, which makes the type system inconsistent).
Can anyone explain why these both compile happily:
data A a b = A { a :: a, b :: b }
newtype B a = B (A a (B a))
newtype C = C (A Int C)
But I cannot create similarly recursively defined types via type synonyms:
type B a = A a (B a)
type C = A Int C
Although obviously data B a = A { a :: a, b :: B a } works just fine.
Is there any way to avoid dealing with that extra constructor X everywhere I want the type recursive? I'm mostly passing in accessor functions that pick out the b anyways, so I'm mostly okay, but if an easy circumvention mechanism exists I'd like to know about it.
Any pragmas I should be using to improve performance with the specialized data type C? Just specialize stuff?
Any clever trick for copying between A a b and A c d defining only the a -> b and c -> d mapping without copying over the record twice? I'm afraid that A's fields will change in future. Template Haskell perhaps?
This has to do with equi-recursive types versus iso-recursive types. Haskell implements recursive types using iso-recursive types, which require the programmer to tell the type-checker where type recursion is happening. The way you mark it is with a specific constructor, which a simple type synonym doesn't give you.
Equi-recursive types allow the compiler to infer where recursion is happening, but it leads to a much more complicated type-checker and in some seemingly simple cases the problem is undecidable.
If you'd like a good discussion of equi vs. iso recursive types, check out Benjamin Pierce's excellent Types and Programming Languages
Short answer: because type synonyms don't introduce constructors, and Haskell needs constructors to explicitly mark recursion at the type level, you can't use recursive type synonyms.
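Concretely, the constructor is the iso-recursive "roll" and pattern matching is the "unroll"; a sketch using the question's own types (roll and unroll are made-up names):

data A a b = A { a :: a, b :: b }
newtype C = C (A Int C)

roll :: A Int C -> C    -- mark one level of type recursion
roll = C

unroll :: C -> A Int C  -- unmark it again
unroll (C x) = x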
I will answer your first and second questions.
Expanded as a type synonym, B a would be the infinite type (A a (A a (A a (A a (...))))).
The "type inference engine" could be designed to infer and handle infinite types. Unfortunately many errors (typographical or logical) by the programmer create code that fails to have the desired finite type and accidentally & unexpectedly has an infinite type. Right now the compiler rejects such code, which is nearly always what the programmer wants. Changing it to allow infinite types would create much more difficult to understand errors at compile time (at least as bad as C++ templates) and in rare cases you might make it compile and perform incorrectly at runtime.
Is there any way to avoid dealing with that extra constructor X everywhere I want the type recursive?
No. Haskell has chosen to allow recursive types only with explicit type constructors from data or newtype. These make the code more verbose but newtype should have little runtime penalty. It is a design decision.