I started reading about GADTs on the Haskell Wiki but didn't feel I really understood them. Do you recommend a specific book chapter or blog post explaining GADTs to a Haskell beginner?
Apfelmus has made a video tutorial for GADTs, which might be helpful.
I like the example in the GHC manual. It's simple, and it illustrates some key points:
GADTs let you use Haskell's type system to model the type system of a language you're implementing (the "object language")
This allows Haskell's static checking to assert that your "compiler passes" or what-not are type preserving. Functions taking object-language terms can assume those terms are well-typed. Functions returning object-language terms are required to produce well-typed terms.
Pattern matching on a GADT constructor causes type refinement. eval has type Term a -> a overall, but the right-hand side of eval (Lit i) has type Int, because the pattern on the left has type Term Int.
The Haskell system doesn't care what types you give your GADT constructors. We could just as easily make every constructor in data Term a give a result of type Term a, or Term Bool, and the data definition would still go through. But we wouldn't be able to write eval :: Term a -> a. You choose the GADT "tag types" to model your problem, so that the useful functions you want to write are well-typed.
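To make these points concrete, here is a minimal sketch in the spirit of the GHC manual's example (the constructor set is trimmed; treat it as an illustration rather than the manual's exact text):

{-# LANGUAGE GADTs #-}

data Term a where
  Lit    :: Int -> Term Int
  Succ   :: Term Int -> Term Int
  IsZero :: Term Int -> Term Bool
  If     :: Term Bool -> Term a -> Term a -> Term a

eval :: Term a -> a
eval (Lit i)    = i                                   -- refined to Int here
eval (Succ t)   = 1 + eval t
eval (IsZero t) = eval t == 0                         -- refined to Bool here
eval (If c t e) = if eval c then eval t else eval e

main :: IO ()
main = print (eval (If (IsZero (Lit 0)) (Lit 1) (Lit 2)))   -- prints 1

Note how each right-hand side is checked at the type dictated by the matched constructor's tag type.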
Other links:
http://www.haskell.org/haskellwiki/Generalised_algebraic_datatype
http://www.comlab.ox.ac.uk/people/ralf.hinze/talks/FOP.pdf
The Haskell wiki's GADTs for dummies is the best explanation I have seen.
The problem I (and I suspect others) have with most introductions is that they show examples of GADTs in terms of syntax which is non-obvious until you understand GADTs. This makes the simplest examples on which everything is built especially hard to fully understand—you can guess at what many of the patterns are doing, but understanding the exact role of every statement is challenging.
The "for dummies" post dissects and builds up the meaning of the syntax along the way to explaining its own basic examples, which makes it a far more useful starting point. I highly recommend it.
I am learning Haskell and trying to understand the Monoid typeclass.
At the moment I am reading the haskellbook, and it says the following about the (monoid) pattern:
One of the finer points of the Haskell community has been its
propensity for recognizing abstract patterns in code which have
well-defined, lawful representations in mathematics.
What does the author mean by abstract patterns?
Abstract in this sense is the opposite of concrete. This is probably one of the key things to understand about Haskell.
What is a concrete thing? Well, most values in Haskell are concrete. For example 'a' :: Char. The letter 'a' is a Char value, and Char is a concrete type. But in 1 :: Num a => a, the number 1 can be a value of any type, so long as that type provides the set of functions that the Num typeclass sets out as mandatory. This is an abstract value! We can have abstract values, abstract types, and therefore abstract functions. When the program is compiled, the Haskell compiler will pick a particular concrete type that supports all of our requirements.
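As a tiny illustration (my own example, not from the book): the same abstract literal can be used at two different concrete types, and the compiler picks the concrete type from the context.

x :: Int
x = 1          -- here the abstract literal 1 is used as an Int

y :: Double
y = 1          -- here the very same literal is used as a Double

main :: IO ()
main = print (x, y)   -- prints (1,1.0)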
Haskell, at its core, has a very simple, small but incredibly flexible language. It's very similar to an expression of maths, actually. This makes it very powerful. Why? because most things that would be built in language constructs in other languages are not directly built into Haskell, but defined in terms of this simple core.
One of the core pieces is the function, which, it turns out, most of computation is expressible in terms of. Because so much of Haskell is just defined in terms of this small simple core, it means we can extend it to almost anywhere we can imagine.
Typeclasses are probably the best example of this. Monoid and Num are typeclasses. These are constructs that allow programmers to use an abstraction, like a function, across a great many types while only having to define it once. Typeclasses let us use the same function names across a whole range of types, as long as we can define those functions for those types.
Why is that important or useful? Well, if we can recognise a pattern across, for example, all numbers, and we have a mechanism for talking about all numbers in the language itself, then we can write functions that work with all numbers at once. This is an abstract pattern. You'll notice some Haskellers are quite interested in a branch of mathematics called Category Theory, which is pretty much the mathematical study of abstract patterns.
Contrast this with languages that cannot encode such things: there, the patterns the community notices are usually far less rigorous, have to be written out by hand each time, and lose their mathematical character. The beauty of following the mathematics is the extremely large body of results we get for free by aligning our language more closely with it.
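For instance (my own tiny example, not from the book), mconcat is written once against the Monoid typeclass and then works for every type with a Monoid instance:

import Data.Monoid (Sum(..))

main :: IO ()
main = do
  print (mconcat ["abstract ", "patterns"])             -- String monoid: concatenation
  print (getSum (mconcat (map Sum [1, 2, 3 :: Int])))   -- Sum monoid: addition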
This is a good explanation of these basics including typeclasses in a book that I helped author: http://www.happylearnhaskelltutorial.com/1/output_other_things.html
Because functions are written in a very general way (because Haskell puts hardly any limits on our ability to express things generally), we can write functions that use types which express such things as "any type, so long as it's a Monoid". These are called type constraints, as above.
Generally, abstractions are very useful because we can, for example, write one single function to operate on an entire range of types, which means we can often find functions that do exactly what we want on our types if we just make them instances of specific typeclasses. The Ord typeclass is a great example of this. Making a type we define ourselves an instance of Ord gives us a whole bunch of sorting and comparing functions for free.
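A small sketch of that last point (again my own example): deriving Ord for our own type immediately lets library functions such as sort and maximum work on it.

import Data.List (sort)

data Priority = Low | Medium | High
  deriving (Show, Eq, Ord)

main :: IO ()
main = do
  print (sort [High, Low, Medium])   -- [Low,Medium,High]
  print (maximum [Low, High])        -- High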
This is, in my opinion, one of the most exciting parts of Haskell: while most other languages also allow you to be very general, they usually lose a great deal of expressiveness in the process, and are therefore less powerful, because they are less precise about what they are talking about (their types are less well defined).
This is how we're able to reason about the "possible values" of a function, and it's not limited to Haskell. The more information we encode at the type level, the more toward the specificity end of the spectrum of expressivity we veer. For example, to take a classic case, the function const :: a -> b -> a. This function requires that a and b can be of absolutely any type at all, including the same type if we like. From that, because the second parameter can be a different type than the first, we can work out that it really only has one possible functionality. It can't return an Int, unless we give it an Int as its first value, because that's not any type, right? So therefore we know the only value it can return is the first value! The functionality is defined right there in the type! If that's not mindblowing, then I don't know what is. :)
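To make that argument tangible (a sketch of my own, not from the original answer): the only useful implementation the type allows is returning the first argument.

myConst :: a -> b -> a
myConst x _ = x        -- the types force us to hand back the first argument

main :: IO ()
main = do
  print (myConst 42 "ignored")   -- 42
  print (myConst "kept" True)    -- "kept"

(Strictly speaking, undefined or an infinite loop would also type-check; parametricity rules out everything else.)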
As we move to dependent types (that is, a type system where types are first class, which also means ordinary values can be encoded in the type system), we can get closer and closer to having the type system specify exactly what the possible functionality is. The kicker, however, is that it need not say anything about the implementation unless we want it to: we stay in control of how abstract the type is, while keeping expressiveness and a great deal of precision. That's pretty fascinating, and amazingly powerful.
Much math can be expressed in the language that underpins Haskell, the lambda calculus.
This post poses the question for the case of !!. The accepted answer tells us that what you are actually doing is creating a new function !!, and that you should then avoid importing the standard one.
But why do that, if the new function is meant to be applied to different types than the standard one? Isn't the compiler able to choose the right one according to its parameters?
Is there any compiler flag to allow this?
For instance, if * is not defined for [Float] * Float,
why does the compiler complain
> Ambiguous occurrence *
> It could refer to either `Main.*', defined at Vec.hs:4:1
> or `Prelude.*',
for this code:
(*) :: [Float] -> Float -> [Float]
(*) as k = map (\a -> a*k) as -- here: clearly Float*Float
r = [1.0, 2.0, 3.0] :: [Float]
s = r * 2.0 -- here: clearly [Float] * Float
main = do
  print r
  print s
Allowing the compiler to choose the correct implementation of a function based on its type is the purpose of typeclasses. It is not possible without them.
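As a sketch of what that looks like for the example in the question (the class and operator names here are made up for illustration), a typeclass lets the compiler pick the implementation from the argument type, which two unrelated top-level definitions of * cannot do:

{-# LANGUAGE FlexibleInstances #-}

class Scale v where
  (.*) :: v -> Float -> v

instance Scale Float where
  x .* k = x * k

instance Scale [Float] where
  xs .* k = map (* k) xs

main :: IO ()
main = do
  print ((2.0 :: Float) .* 3.0)            -- 6.0
  print ([1.0, 2.0, 3.0 :: Float] .* 2.0)  -- [2.0,4.0,6.0]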
For a justification of this approach, you might read the paper that introduced them: How to make ad-hoc polymorphism less ad hoc [PDF].
Really, the reason is this: in Haskell, there is not necessarily a clear association “variable x has type T”.
Haskell is almost as flexible as dynamic languages, in the sense that any type can be a type variable, i.e. can have polymorphic type. But whereas in dynamic languages (and also e.g. OO polymorphism or C++ templates) the types of such type variables are basically just extra information attached to the value variables in your code (so an overloaded operator can see: argument is an Int -> do this, is a String -> do that), in Haskell the type variables live in a completely separate scope, in the type language. This gives you many advantages; for instance, higher-kinded polymorphism is pretty much impossible without such a system. However, it also means it's harder to reason about how overloaded functions should be resolved. If Haskell allowed you to just write overloads and assume the compiler does its best guess at resolving the ambiguity, you'd often end up with strange error messages in unexpected places. (Actually, this can easily happen with overloads even if you have no Hindley-Milner type system; C++ is notorious for it.)
Instead, Haskell chooses to force overloads to be explicit. You must first define a type class before you can overload methods, and though this can't completely preclude confusing compilation errors it makes them much easier to avoid. Also, it lets you express polymorphic methods with type resolution that couldn't be expressed with traditional overloading, in particular polymorphic results (which is great for writing very easily reusable code).
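A small sketch of that last point (the class name Default here is my own, not a standard one): the instance is chosen by the result type alone, something argument-based overloading cannot express.

class Default a where
  def :: a

instance Default Int  where def = 0
instance Default Bool where def = False

main :: IO ()
main = do
  print (def :: Int)    -- 0
  print (def :: Bool)   -- False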
It is a design decision, not a theoretical problem, not to include this in Haskell. As you say, many other languages use types to disambiguate between terms in an ad-hoc way. But type classes have similar functionality and additionally allow abstraction over the things that are overloaded. Type-directed name resolution does not.
Nevertheless, forms of type-directed name resolution have been discussed for Haskell (for example in the context of resolving record field selectors) and are supported by some languages similar to Haskell such as Agda (for data constructors) or Idris (more generally).
I'd like to write a Haskell program that uses GADTs interactively on a platform not supported by GHCi (namely, GNU/Linux on mipsel). The problem is, the construct that can be used to define a GADT in GHC, for example:
data Term a where
  Lit  :: Int -> Term Int
  Pair :: Term a -> Term b -> Term (a,b)
  ...
doesn't seem to work in Hugs.
Can GADTs really not be defined in Hugs? My TA in a Haskell class said it was possible in Hugs, but he seemed unsure.
If not, can GADTs be encoded using other syntax or semantics supported by Hugs, just as GADTs can be encoded in OCaml?
GADTs are not implemented in Hugs.
Instead, you should use a port of GHC to mips if you are attempting to run code using GADTs. Note that you won't be able to use ghci on all platforms, due to lack of bytecode loading on more exotic architectures.
Regarding your question 2 (how to encode GADT use cases in Haskell 98), you may want to look at this 2006 paper by Sulzmann and Wang: GADTless programming in Haskell 98.
Like the OCaml work you're referring to, this works by factoring GADTs through an equality type. There are various ways to define an equality type; they use a form of Leibniz equality, as in the OCaml work, which allows substituting through any application of a type operator of kind * -> *.
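For reference, a minimal sketch of such a Leibniz-style equality type (this particular formulation needs a polymorphic field, an extension beyond plain Haskell 98; in GHC that is RankNTypes):

{-# LANGUAGE RankNTypes #-}   -- polymorphic field: beyond Haskell 98

newtype Equal a b = Equal (forall f. f a -> f b)

refl :: Equal a a
refl = Equal id

-- substitute an equality proof through any type constructor f
subst :: Equal a b -> f a -> f b
subst (Equal f) = f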
Depending on how a given type checker reasons about GADT equalities, this may not be expressive enough to cover all uses of GADTs: the checker may implement equality reasoning rules that are not necessarily captured by this definition. For example, a*b = c*d implies a = c and b = d, and this form of decomposition does not follow if you can only substitute through type constructors of kind * -> *. Later, in 2010, Oleg discussed how you can use type families to apply "type deconstructors" through Leibniz equality, recovering decomposition properties for this definition -- but of course that is again outside Haskell 98.
That's something to keep in mind for type system designers: is your language complete for Leibniz equality, in the sense that it can express everything a specialized equality solver can do?
Even if you find an encoding of the equality type that is expressive enough, you will have very practical convenience issues: when you use GADTs, all uses of the equality witnesses are inferred from type annotations. With this explicit encoding you'll have much more work to do.
Finally (no pun intended), a lot of use cases of GADTs can equally be expressed by tagless-final embeddings (again due to Oleg), which IIRC can often be done in Haskell 98. The blog post by Martin Van Steenbergen that dons points to in a comment on his reply is in this spirit, but Oleg has considerably improved the technique.
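A minimal sketch of what a tagless-final embedding of the Term example might look like, staying within Haskell 98 (the class name follows Oleg's "Symantics" convention, but this is my own toy version):

class Symantics repr where
  lit  :: Int -> repr Int
  add  :: repr Int -> repr Int -> repr Int
  pair :: repr a -> repr b -> repr (a, b)

-- one possible interpreter: direct evaluation
newtype Eval a = Eval { runEval :: a }

instance Symantics Eval where
  lit n    = Eval n
  add x y  = Eval (runEval x + runEval y)
  pair x y = Eval (runEval x, runEval y)

example :: Symantics repr => repr (Int, Int)
example = pair (lit 1) (add (lit 2) (lit 3))

main :: IO ()
main = print (runEval example)   -- (1,5)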
Reading Disadvantages of Scala type system versus Haskell?, I have to ask: what is it, specifically, that makes Haskell's type system more powerful than other languages' type systems (C, C++, Java)? Apparently even Scala doesn't have some of the powers that Haskell's type system has. What is it, specifically, that makes Haskell's type system (Hindley–Milner type inference) so powerful? Can you give an example?
What is it, specifically, that makes Haskell's type system
It has been engineered for the past decade to be both flexible -- as a logic for property verification -- and powerful.
Haskell's type system has been developed over the years to encourage a relatively flexible, expressive static checking discipline, with several groups of researchers identifying type system techniques that enable powerful new classes of compile-time verification. Scala's is relatively undeveloped in that area.
That is, Haskell/GHC provides a logic that is both powerful and designed to encourage type level programming. Something fairly unique in the world of functional programming.
Some papers that give a flavor of the direction the engineering effort on Haskell's type system has taken:
Fun with type functions
Associated types with class
Fun with functional dependencies
Hindley-Milner is not a type system, but a type inference algorithm. Haskell's type system, back in the day, used to be able to be fully inferred using HM, but that ship has long sailed for modern Haskell with extensions. (ML remains capable of being fully inferred).
Arguably, the ability to mainly or entirely infer all types yields power in terms of expressiveness.
But that's largely not what I think the question is really about.
The papers that dons linked point to the other aspect: the extensions to Haskell's type system make it Turing complete (and modern type families make that Turing-complete type-level language much more closely resemble value-level programming). Another nice paper on this topic is McBride's Faking It: Simulating Dependent Types in Haskell.
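A tiny sketch of what "type-level programming that resembles value-level programming" looks like with type families (my own example, not taken from those papers):

{-# LANGUAGE TypeFamilies, DataKinds #-}

data Nat = Z | S Nat

-- addition as a closed type family, mirroring the value-level definition
type family Add (m :: Nat) (n :: Nat) :: Nat where
  Add 'Z     n = n
  Add ('S m) n = 'S (Add m n)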
The paper in the other thread on Scala: "Type Classes as Objects and Implicits" goes into why you can in fact do most of this in Scala as well, although with a bit more explicitness. I tend to feel, but this is more a gut sense than from real Scala experience, that its more ad-hoc and explicit approach (what the C++ discussion called "nominal") is ultimately a bit messier.
Let's go with a very simple example: Haskell's Maybe.
data Maybe a = Nothing | Just a
In C++:
template <typename T>
struct Maybe {
    bool isJust;
    T value; // IMPORTANT: must be ignored when !isJust
};
Let's consider these two function signatures, in Haskell:
sumJusts :: Num a => [Maybe a] -> a
and C++:
template <typename T> T sumJusts(vector<Maybe<T>>);
Differences:
In C++ there are more possible mistakes to make. The compiler doesn't check the usage rule of Maybe (that value may only be read when isJust is true).
The C++ type of sumJusts does not specify that it requires + and a conversion from 0. The error messages that show up when things do not work are cryptic and odd. In Haskell the compiler will just complain that the type is not an instance of Num, which is very straightforward.
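For completeness, a minimal Haskell implementation matching the signature above (the body is my own; the answer only gives the type):

import Data.Maybe (catMaybes)

sumJusts :: Num a => [Maybe a] -> a
sumJusts = sum . catMaybes

main :: IO ()
main = print (sumJusts [Just 1, Nothing, Just 2 :: Maybe Int])   -- 3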
In short, Haskell has:
ADTs
Type-classes
A very friendly syntax and good support for generics (which in C++ people try to avoid because of all their cryptickynessishisms)
The Haskell language allows you to write safer code without giving up functionality. Most languages nowadays trade features for safety; Haskell is there to show that it's possible to have both.
We can live without null pointers, explicit castings, loose typing and still have a perfectly expressive language, able to produce efficient final code.
Moreover, the Haskell type system, along with its lazy-by-default evaluation and its purity, gives you a boost in complicated but important matters like parallelism and concurrency.
Just my two cents.
One thing I really like and miss in other languages is support for typeclasses, which are an elegant solution for many problems (including, for instance, polyvariadic functions).
Using typeclasses, it's extremely easy to define very abstract functions, which are still completely type-safe - like for instance this Fibonacci-function:
fibs :: Num a => [a]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
For instance:
map (`div` 2) fibs -- integral context
(fibs !! 10) + 1.234 -- rational context
map (:+ 1.0) fibs -- Complex context
You may even define your own numeric type for this.
What is expressiveness? To my understanding, it is what constraints the type system allows us to put on our code, or in other words, what properties of our code we can prove. The more expressive a type system is, the more information we can embed at the type level (which the type checker can then use at compile time to check our code).
Here are some properties of Haskell's type system that other languages don't have.
Purity.
Purity allows Haskell to distinguish pure code from IO-capable code.
Parametricity.
Haskell enforces parametricity for parametrically polymorphic functions, so they must obey certain laws. (Some languages do let you express polymorphic function types, but they don't enforce parametricity; for example, Scala lets you pattern match on a specific type even if the argument is polymorphic.)
ADT
Extensions
Haskell's base type system is a weaker version of λ2, which by itself isn't all that impressive. But with the extensions below it becomes really powerful (even able to express dependent types with singletons); a small sketch follows the list:
existential types
rank-n types (full λ2)
type families
data kinds (allows "typed" programming at type level)
GADT
...
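Here is a minimal sketch combining data kinds and GADTs (my own example): a length-indexed vector whose head function only accepts non-empty vectors, so the "empty list" error is ruled out at compile time.

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Zero | Succ Nat

data Vec (n :: Nat) a where
  Nil  :: Vec 'Zero a
  Cons :: a -> Vec n a -> Vec ('Succ n) a

-- safeHead only accepts vectors of length Succ n, so it is total
safeHead :: Vec ('Succ n) a -> a
safeHead (Cons x _) = x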
I was reading a research paper about Haskell and how HList is implemented and wondering when the techniques described are and are not decidable for the type checker. Also, because you can do similar things with GADTs, I was wondering if GADT type checking is always decidable.
I would prefer citations if you have them so I can read/understand the explanations.
Thanks!
I believe GADT type checking is always decidable; it's inference which is undecidable, as it requires higher-order unification. But a GADT type checker is a restricted form of the proof checkers you see in e.g. Coq, where the constructors build up the proof term. For example, the classic example of embedding the lambda calculus into GADTs has a constructor for each reduction rule, so if you want to find the normal form of a term, you have to tell it which constructors will get you there. The halting problem has been moved into the user's hands :-)
You've probably already seen this but there are a collection of papers on this issue at Microsoft research: Type Checking papers. The first one describes the decidable algorithm actually used in the Glasgow Haskell compiler.