Understanding `GHC.TypeLits` - haskell

I'm trying to wrap my head around the GHC extensions KindSignatures and DataKinds. Looking at the Data.Modular package, I understand roughly that
newtype i `Mod` (n :: Nat) = Mod i deriving (Eq, Ord)
is sort of equivalent to declaring a c++ template <typename T, int N> (with the constructor taking only one argument of type T). However, looking at the GHC.TypeLits package, I don't understand most of what is happening. Any general explanation about this package would be helpful. Before this question gets flagged as off topic though, here are some specific sub-questions:
a KnownNat class makes sense, with required function letting you extract the type variable from the type, but what does natVal do, and what is the proxy type variable?
Where would you use someNatVal?
Finally, what is SomeNat - how can a type level number be unknown? Isn't the whole point of a type level number that it is known at compile time?

This question is quite broad -- I'll address a few points, only.
The proxy type variable is just a type variable of kind * -> *, i.e. the kind of type contructors. Pragmatically, if you have a function
foo :: proxy a -> ...
you can pass to it values of type e.g. Maybe Int, choosing proxy = Maybe and a = Int. You can pass also values of type [] Char (also written as [Char]). Or, more commonly, a value of type Proxy Int, where Proxy is a data type defined as
data Proxy a = Proxy
i.e. a data type which does not carry any run-time information (there's a single value for it!), but which carries compile-time information (the phantom type variable a).
Assume N is a type of kind Nat -- a compile-time natural. We can write a function
bar :: N -> ...
but calling this would require us to build a value of type N -- which is immaterial. The purpose of type N is to carry compile-time information, only, and its run-time values are not things that we really want to use. In fact, N could have no values at all, except for bottom. We could call
bar (undefined :: N)
but this looks weird. Reading this, we must realize that bar is lazy in its first argument and that it will not cause divergence trying to use it. The problem is that the bar :: N -> ... type signature is misleading: it claims that the result may depend on the value of the argument of type N, when this is not really the case. Instead, if we use
baz :: Proxy N -> ...
the intent is clear -- there's only a single run-time value for that: Proxy :: Proxy N. It is equally clear that the N value is only present at compile time.
Sometimes, instead of using the specific Proxy N, the code is slightly generalized to
foo :: proxy N -> ...
which achieves the same goal, but permits also different Proxy types. (Personally, I'm not terribly thrilled by this generalization.)
Back to the question: natVal is a function which turns a compile-time only natural into a run-time value. I.e. it converts Proxy N into Int, returning just the constant.
You analogy with C++ templates might get closer if you used type template arguments to model compile time naturals. E.g.
template <typename N> struct S { using pred = N; };
struct Z {};
template <typename N> int natVal();
template <typename N> int natVal() { return 1 + natVal<typename N::pred>(); }
template <> int natVal<Z>() { return 0; }
int main() {
cout << natVal<S<S<Z>>>() << endl; // outputs 2
return 0;
}
Just pretend there are no public constructors for S and Z: their run-time values are unimportant, only their compile-time information matters.

Related

Assigning constrained literal to a polymorphic variable

While learning haskell with Haskell Programming from first principles found an exercise that puzzles me.
Here is the short version:
For the following definition:
a) i :: Num a => a
i = 1
b) Try replacing the type signature with the following:
i :: a
The replacement gives me an error:
error:
• No instance for (Num a) arising from the literal ‘1’
Possible fix:
add (Num a) to the context of
the type signature for:
i' :: forall a. a
• In the expression: 1
In an equation for ‘i'’: i' = 1
|
38 | i' = 1
| ^
It is more or less clear for me how Num constraint arises.
What is not clear why assigning 1 to polymorphic variable i' gives the error.
Why this works:
id 1
while this one doesn't:
i' :: a
i' = 1
id i'
Should it be possible to assign a more specific value to a less specific and lose some type info if there are no issues?
This is a common misunderstanding. You probably have something in mind like, in a class-OO language,
class Object {};
class Num: Object { public: Num add(...){...} };
class Int: Num { int i; ... };
And then you would be able to use an Int value as the argument to a function that expects a Num argument, or a Num value as the argument to a function that expects an Object.
But that's not at all how Haskell's type classes work. Num is not a class of values (like, in the above example it would be the class of all values that belong to one of the subclasses). Instead, it's the class of all types that represent specific flavours of numbers.
How is that different? Well, a polymorphic literal like 1 :: Num a => a does not generate a specific Num value that can then be upcasted to a more general class. Instead, it expects the caller to first pick a concrete type in which you want to render the number, then generates the number immediately in that type, and afterwards the type never changes.
In other words, a polymorphic value has an implicit type-level argument. Whoever wants to use i needs to do so in a context where both
It is unambiguous what type a should be used. (It doesn't necessarily need to be fixed right there: the caller could also itself be a polymorphic function.)
The compiler can prove that this type a has a Num instance.
In C++, the analogue of Haskell typeclasses / polymorphic literal is not [sub]classes and their objects, but instead templates that are constrained to a concept:
#include <concepts>
template<typename A>
concept Num = std::constructible_from<A, int>; // simplified
template<Num A>
A poly_1() {
return 1;
}
Now, poly_1 can be used in any setting that demands a type which fulfills the Num concept, i.e. in particular a type that is constructible_from an int, but not in a context which requires some other type.
(In older C++ such a template would just be duck-typed, i.e. it's not explicit that it requires a Num setting but the compiler would just try to use it as such and then give a type error upon noticing that 1 can't be converted to the specified type.)
tl;dr
A value i' declared as i' :: a must be usable¹ in place of any other value, with no exception. 1 is no such a value, as it can't be used, say, where a String is expected, just to make one example.
Longer version
Let's start form a less uncontroversial scenario where you do need a type constraint:
plus :: a -> a -> a
plus x y = x + y
This does not compile, because the signature is equivalent to plus :: forall a. a -> a -> a, and it is plainly not true that the RHS, x + y, is meaningful for any common type a that x and y are inhabitants of. So you can fix the above by providing a constraint guaranteeing that + is possible between two as, and you can do so by putting Num a => right after :: (or even by giving up on polymorphic types and just change a to Int).
But there are functions that don't require any constraints on their arguments. Here's three of them:
id :: a -> a
id x = x
const :: a -> b -> a
const x _ = x
Data.Tuple.swap :: (a, b) -> (b, a)
Data.Tuple.swap (a, b) = (b, a)
You can pass anything to these functions, and they'll always work, because their definitions make no assumption whatsoever on what can be done with those objects, as they just shuffle/ditch them.
Similarly,
i' :: a
i' = 1
cannot compile because it's not true that 1 can represent a value of any type a. It can't represent a String, for instance, whereas the signature i' :: a is expressing the idea that you can put i' in any place, e.g. where a Int is expected, as much as where a generic Num is expected, or where a String is expected, and so on.
In other words, the above signature says that you can use i' in both of these statements:
j = i' + 1
k = i' ++ "str"
So the question is: just like we found some functions that have signatures not constraining their arguments in any way, do we have a value that inhabits every single type you can think of?
Yes, there are some values like that, and here are two of them:
i' :: a
i' = error ""
j' :: a
j' = undefined
They're all "bottoms", or ⊥.
(¹) By "usable" I mean that when you write it in some place where the code compiles, the code keeps compiling.

Haskell Show type class in a function

I have a function
mySucc :: (Enum a, Bounded a, Eq a, Show a) => a -> Maybe a
mySucc int
| int == maxBound = Nothing
| otherwise = Just $ succ int
When I want to print the output of this function in ghci, Haskell seems to be confused as to which instance of Show to use. Why is that? Shouldn't Haskell automatically resolve a's type during runtime and use it's Show?
My limited understanding of type class is that, if you mention a type (in my case a) and say that it belongs to a type class (Show), Haskell should automatically resolve the type. Isn't that how it resolves Bounded, Enum and Eq? Please correct me if my understanding is wrong.
Shouldn't Haskell automatically resolve a's type during runtime and use it's Show?
Generally speaking, types don't exist at runtime. The compiler typechecks your code, resolves any polymorphism, and then erases the types. But, this is kind of orthogonal to your main question.
My limited understanding of type class is that, if you mention a type (in my case a) and say that it belongs to a type class (Show), Haskell should automatically resolve the type
No. The compiler will automatically resolve the instance. What that means is, you don't need to explicitly pass a showing-method into your function. For example, instead of the function
showTwice :: Show a => a -> String
showTwice x = show x ++ show x
you could have a function that doesn't use any typeclass
showTwice' :: (a -> String) -> a -> String
showTwice' show' x = show' x ++ show' x
which can be used much the same way as showTwice if you give it the standard show as the first argument. But that argument would need to be manually passed around at each call site. That's what you can avoid by using the type class instead, but this still requires the type to be known first.
(Your mySucc doesn't actually use show in any way at all, so you might as well omit the Show a constraint completely.)
When your call to mySucc appears in a larger expression, chances are the type will in fact also be inferred automatically. For example, mySucc (length "bla") will use a ~ Int, because the result of length is fixed to Int; or mySucc 'y' will use a ~ Char. However, if all the subexpressions are polymorphic (and in Haskell, even number literals are polymorphic), then the compiler won't have any indication what type you actually want. In that case you can always specify it explicitly, either in the argument
> mySucc (3 :: Int)
Just 4
or in the result
> mySucc 255 :: Maybe Word8
Nothing
Are you writing mySucc 1? In this case, you get a error because 1 literal is a polymorphic value of type Num a => a.
Try calling mySucc 1 :: Maybe Int and it will work.

How does the :: operator syntax work in the context of bounded typeclass?

I'm learning Haskell and trying to understand the reasoning behind it's syntax design at the same time. Most of the syntax is beautiful.
But since :: normally is like a type annotation, How is it that this works:
Input: minBound::Int
Output: -2147483648
There is no separate operator: :: is a type annotation in that example. Perhaps the best way to understand this is to consider this code:
main = print (f minBound)
f :: Int -> Int
f = id
This also prints -2147483648. The use of minBound is inferred to be an Int because it is the parameter to f. Once the type has been inferred, the value for that type is known.
Now, back to:
main = print (minBound :: Int)
This works in the same way, except that minBound is known to be an Int because of the type annotation, rather than for some more complex reason. The :: isn't some binary operation; it just directs the compiler that the expression minBound has the type Int. Once again, since the type is known, the value can be determined from the type class.
:: still means "has type" in that example.
There are two ways you can use :: to write down type information. Type declarations, and inline type annotations. Presumably you've been used to seeing type declarations, as in:
plusOne :: Integer -> Integer
plusOne = (+1)
Here the plusOne :: Integer -> Integer line is a separate declaration about the identifier plusOne, informing the compiler what its type should be. It is then actually defined on the following line in another declaration.
The other way you can use :: is that you can embed type information in the middle of any expression. Any expression can be followed by :: and then a type, and it means the same thing as the expression on its own except with the additional constraint that it must have the given type. For example:
foo = ('a', 2) :: (Char, Integer)
bar = ('a', 2 :: Integer)
Note that for foo I attached the entire expression, so it is very little different from having used a separate foo :: (Char, Integer) declaration. bar is more interesting, since I gave a type annotation for just the 2 but used that within a larger expression (for the whole pair). 2 :: Integer is still an expression for the value 2; :: is not an operator that takes 2 as input and computes some result. Indeed if the 2 were already used in a context that requires it to be an Integer then the :: Integer annotation changes nothing at all. But because 2 is normally polymorphic in Haskell (it could fit into a context requiring an Integer, or a Double, or a Complex Float) the type annotation pins down that the type of this particular expression is Integer.
The use is that it avoids you having to restructure your code to have a separate declaration for the expression you want to attach a type to. To do that with my simple example would have required something like this:
two :: Integer
two = 2
baz = ('a', two)
Which adds a relatively large amount of extra code just to have something to attach :: Integer to. It also means when you're reading bar, you have to go read a whole separate definition to know what the second element of the pair is, instead of it being clearly stated right there.
So now we can answer your direct question. :: has no special or particular meaning with the Bounded type class or with minBound in particular. However it's useful with minBound (and other type class methods) because the whole point of type classes is to have overloaded names that do different things depending on the type. So selecting the type you want is useful!
minBound :: Int is just an expression using the value of minBound under the constraint that this particular time minBound is used as an Int, and so the value is -2147483648. As opposed to minBound :: Char which is '\NUL', or minBound :: Bool which is False.
None of those options mean anything different from using minBound where there was already some context requiring it to be an Int, or Char, or Bool; it's just a very quick and simple way of adding that context if there isn't one already.
It's worth being clear that both forms of :: are not operators as such. There's nothing terribly wrong with informally using the word operator for it, but be aware that "operator" has a specific meaning in Haskell; it refers to symbolic function names like +, *, &&, etc. Operators are first-class citizens of Haskell: we can bind them to variables1 and pass them around. For example I can do:
(|+|) = (+)
x = 1 |+| 2
But you cannot do this with ::. It is "hard-wired" into the language, just as the = symbol used for introducing definitions is, or the module Main ( main ) where syntax for module headers. As such there are lots of things that are true about Haskell operators that are not true about ::, so you need to be careful not to confuse yourself or others when you use the word "operator" informally to include ::.
1 Actually an operator is just a particular kind of variable name that is applied by writing it between two arguments instead of before them. The same function can be bound to operator and ordinary variables, even at the same time.
Just to add another example, with Monads you can play a little like this:
import Control.Monad
anyMonad :: (Monad m) => Int -> m Int
anyMonad x = (pure x) >>= (\x -> pure (x*x)) >>= (\x -> pure (x+2))
$> anyMonad 4 :: [Int]
=> [18]
$> anyMonad 4 :: Either a Int
=> Right 18
$> anyMonad 4 :: Maybe Int
=> Just 18
it's a generic example telling you that the functionality may change with the type, another example:

Confused about Haskell polymorphic types

I have defined a function :
gen :: a -> b
So just trying to provide a simple implementation :
gen 2 = "test"
But throws error :
gen.hs:51:9:
Couldn't match expected type ‘b’ with actual type ‘[Char]’
‘b’ is a rigid type variable bound by
the type signature for gen :: a -> b at gen.hs:50:8
Relevant bindings include gen :: a -> b (bound at gen.hs:51:1)
In the expression: "test"
In an equation for ‘gen’: gen 2 = "test"
Failed, modules loaded: none.
So my function is not correct. Why is a not typed as Int and b not typed as String ?
This is a very common misunderstanding.
The key thing to understand is that if you have a variable in your type signature, then the caller gets to decide what type that is, not you!
So you cannot say "this function returns type x" and then just return a String; your function actually has to be able to return any possible type that the caller may ask for. If I ask your function to return an Int, it has to return an Int. If I ask it to return a Bool, it has to return a Bool.
Your function claims to be able to return any possible type, but actually it only ever returns String. So it doesn't do what the type signature claims it does. Hence, a compile-time error.
A lot of people apparently misunderstand this. In (say) Java, you can say "this function returns Object", and then your function can return anything it wants. So the function decides what type it returns. In Haskell, the caller gets to decide what type is returned, not the function.
Edit: Note that the type you're written, a -> b, is impossible. No function can ever have this type. There's no way a function can construct a value of type b out of thin air. The only way this can work is if some of the inputs also involve type b, or if b belongs to some kind of typeclass which allows value construction.
For example:
head :: [x] -> x
The return type here is x ("any possible type"), but the input type also mentions x, so this function is possible; you just have to return one of the values that was in the original list.
Similarly, gen :: a -> a is a perfectly valid function. But the only thing it can do is return it's input unchanged (i.e., what the id function does).
This property of type signatures telling you what a function does is a very useful and powerful property of Haskell.
gen :: a -> b does not mean "for some type a and some type b, foo must be of type a -> b", it means "for any type a and any type b, foo must be of type a -> b".
to motivate this: If the type checker sees something like let x :: Int = gen "hello", it sees that gen is used as String -> Int here and then looks at gen's type to see whether it can be used that way. The type is a -> b, which can be specialized to String -> Int, so the type checker decides that this is fine and allows this call. That is since the function is declared to have type a -> b, the type checker allows you to call the function with any type you want and allows you to use the result as any type you want.
However that clearly does not match the definition you gave the function. The function knows how to handle numbers as arguments - nothing else. And likewise it knows how to produce strings as its result - nothing else. So clearly it should not be possible to call the function with a string as its argument or to use the function's result as an Int. So since the type a -> b would allow that, it's clearly the wrong type for that function.
Your type signature gen :: a -> b is stating, that your function can work for any type a (and provide any type b the caller of the function demands).
Besides the fact that such a function is hard to come by, the line gen 2 = "test" tries to return a String which very well may not be what the caller demands.
Excellent answers. Given your profile, however, you seem to know Java, so I think it's valuable to connect this to Java as well.
Java offers two kinds of polymorphism:
Subtype polymorphism: e.g., every type is a subtype of java.lang.Object
Generic polymorphism: e.g., in the List<T> interface.
Haskell's type variables are a version of (2). Haskell doesn't really have a version of (1).
One way to think of generic polymorphism is in terms of templates (which is what C++ people call them): a type that has a type variable parameter is a template that can be specialized into a variety of monomorphic types. So for example, the interface List<T> is a template for constructing monomorphic interfaces like List<String>, List<List<String>> and so on, all of which have the same structure but differ only because the type variable T gets replaced uniformly throughout the signatures with the instantiation type.
The concept that "the caller chooses" that several responders have mentioned here is basically a friendly way of referring to instantiation. In Java, for example, the most common point where the type variable gets "chosen" is when an object is instantiated:
List<String> myList = new ArrayList<String>();
Second common point is that a subtype of a generic type may instantiate the supertype's variables:
class MyFunction implements Function<Integer, String> {
public String apply(Integer i) { ... }
}
Third one is methods that allow the caller to instantiate a variable that's not a parameter of its enclosing type:
/**
* Visitor-pattern style interface for a simple arithmetical language
* abstract syntax tree.
*/
interface Expression {
// The caller of `accept` implicitly chooses which type `R` is,
// by supplying a `Visitor<R>` with `R` instantiated to something
// of its choice.
<R> accept(Expression.Visitor<R> visitor);
static interface Visitor<R> {
R constant(int i);
R add(Expression a, Expression b);
R multiply(Expression a, Expression b);
}
}
In Haskell, instantiation is carried out implicitly by the type inference algorithm. In any expression where you use gen :: a -> b, type inference will infer what types need to be instantiated for a and b, given the context in which gen is used. So basically, "caller chooses" means that any code that uses gen controls the types to which a and b will be instantiated; if I write gen [()], then I'm implicitly instantiating a to [()]. The error here means that your type declaration says that gen [()] is allowed, but your equation gen 2 = "test" implies that it's not.
In Haskell, type variables are implicitly quantified, but we can make this explicit:
{-# LANGUAGE ScopedTypeVariables #-}
gen :: forall a b . a -> b
gen x = ????
The "forall" is really just a type level version of a lambda, often written Λ. So gen is a function taking three arguments: a type, bound to the name a, another type, bound to the name b, and a value of type a, bound to the name x. When your function is called, it is called with those three arguments. Consider a saner case:
fst :: (a,b) -> a
fst (x1,x2) = x1
This gets translated to
fst :: forall (a::*) (b::*) . (a,b) -> a
fst = /\ (a::*) -> /\ (b::*) -> \ (x::(a,b)) ->
case x of
(x1, x2) -> x1
where * is the type (often called a kind) of normal concrete types. If I call fst (3::Int, 'x'), that gets translated into
fst Int Char (3Int, 'x')
where I use 3Int to represent specifically the Int version of 3. We could then calculate it as follows:
fst Int Char (3Int, 'x')
=
(/\ (a::*) -> /\ (b::*) -> \(x::(a,b)) -> case x of (x1,x2) -> x1) Int Char (3Int, 'x')
=
(/\ (b::*) -> \(x::(Int,b)) -> case x of (x1,x2) -> x1) Char (3Int, 'x')
=
(\(x::(Int,Char)) -> case x of (x1,x2) -> x1) (3Int, x)
=
case (3Int,x) of (x1,x2) -> x1
=
3Int
Whatever types I pass in, as long as the value I pass in matches, the fst function will be able to produce something of the required type. If you try to do this for a->b, you will get stuck.

in haskell, why do I need to specify type constraints, why can't the compiler figure them out?

Consider the function,
add a b = a + b
This works:
*Main> add 1 2
3
However, if I add a type signature specifying that I want to add things of the same type:
add :: a -> a -> a
add a b = a + b
I get an error:
test.hs:3:10:
Could not deduce (Num a) from the context ()
arising from a use of `+' at test.hs:3:10-14
Possible fix:
add (Num a) to the context of the type signature for `add'
In the expression: a + b
In the definition of `add': add a b = a + b
So GHC clearly can deduce that I need the Num type constraint, since it just told me:
add :: Num a => a -> a -> a
add a b = a + b
Works.
Why does GHC require me to add the type constraint? If I'm doing generic programming, why can't it just work for anything that knows how to use the + operator?
In C++ template programming, you can do this easily:
#include <string>
#include <cstdio>
using namespace std;
template<typename T>
T add(T a, T b) { return a + b; }
int main()
{
printf("%d, %f, %s\n",
add(1, 2),
add(1.0, 3.4),
add(string("foo"), string("bar")).c_str());
return 0;
}
The compiler figures out the types of the arguments to add and generates a version of the function for that type. There seems to be a fundamental difference in Haskell's approach, can you describe it, and discuss the trade-offs? It seems to me like it would be resolved if GHC simply filled in the type constraint for me, since it obviously decided it was needed. Still, why the type constraint at all? Why not just compile successfully as long as the function is only used in a valid context where the arguments are in Num?
If you don't want to specify the function's type, just leave it out and the compiler will infer the types automatically. But if you choose to specify the types, they have to be correct and accurate.
The entire point of types is to have a formal way to declare the right and wrong way to use a function. A type of (Num a) => a -> a -> a describes exactly what is required of the arguments. If you omitted the class constraint, you would have a more general function that could be used (erroneously) in more places.
And it’s not just preventing you from passing non-Num values to add. Everywhere the function goes, the type is sure to go. Consider this:
add :: a -> a -> a
add a b = a + b
foo :: [a -> a -> a]
foo = [add]
value :: [String]
value = [f "hello" "world" | f <- foo]
You want the compiler to reject this, right? How does it do that? By adding class constraints, and checking that they are not removed, even if you don’t directly name the function.
What’s different in the C++ version? There are no class constraints. The compiler substitutes int or std::string for T, then tries to compile the resulting code and looks for a matching + operator that it can use. The template system is “looser”, since it accepts more invalid programs, and this is a symptom of it being a separate stage before compilation. I would love to modify C++ to add the <? extends T> semantics from Java’s generics. Just learn the type system and recognize that parametric polymorphism is “stronger” than C++ templates, namely it will reject more invalid programs.
I think you might be being tripped up by the "crazy moon poetry" of GHC's error messages. It's not saying that it (being GHC) couldn't deduce the (Num a) constraint. It is saying that the (Num a) constraint can't be deduced from your type signature, which it knows must be there from the use of +. Hence, you're are stating that this function has a type more general than the compiler knows it can have. The compiler doesn't want you lying about your functions to the world!
In the first example you gave, without the type signature, if you run :t add in ghci, you'll see that the compiler knows full well that the (Num a) constraint is there.
As for C++'s templates, remember that they are syntactic templates and are only fully type checked in each instance as they are used. Your add template will work with any types so long as, at each place it is used, there is a suitable + operator and perhaps conversions, to make an instance of the template viable. No guarantees can be made about the template until then... which is why the body of the template must be "visible" to each module that uses it.
Basically, all C++ can do is validate the syntax of the template, and then keep it around as a kind of very hygienic macro. Whereas Haskell generates a real function for add (leaving aside that it may choose to also generate type specific specializations for optimization).
There are cases where the compiler can't figure out the right type for you and where it needs your help. Consider
f s = show $ read s
The compiler says:
Ambiguous type variable `a' in the constraints:
Read a' arising from a use of `read' at src\Main.hs:20:13-18
`Show a' arising from a use of `show' at src\Main.hs:20:6-9
Probable fix: add a type signature that fixes these type variable(s)
(Strange enough, it seems you can define this function in ghci, but it seems there is no way to actually use it)
If you want that something like f "1" works, you need to specify the type:
f s = show $ (read s :: Int)

Resources