in haskell, why do I need to specify type constraints, why can't the compiler figure them out?

Consider the function,
add a b = a + b
This works:
*Main> add 1 2
3
However, if I add a type signature specifying that I want to add things of the same type:
add :: a -> a -> a
add a b = a + b
I get an error:
test.hs:3:10:
Could not deduce (Num a) from the context ()
arising from a use of `+' at test.hs:3:10-14
Possible fix:
add (Num a) to the context of the type signature for `add'
In the expression: a + b
In the definition of `add': add a b = a + b
So GHC clearly can deduce that I need the Num type constraint, since it just told me:
add :: Num a => a -> a -> a
add a b = a + b
Works.
Why does GHC require me to add the type constraint? If I'm doing generic programming, why can't it just work for anything that knows how to use the + operator?
In C++ template programming, you can do this easily:
#include <string>
#include <cstdio>
using namespace std;
template<typename T>
T add(T a, T b) { return a + b; }
int main()
{
    printf("%d, %f, %s\n",
           add(1, 2),
           add(1.0, 3.4),
           add(string("foo"), string("bar")).c_str());
    return 0;
}
The compiler figures out the types of the arguments to add and generates a version of the function for each type. There seems to be a fundamental difference in Haskell's approach; can you describe it and discuss the trade-offs? It seems to me like it would be resolved if GHC simply filled in the type constraint for me, since it obviously decided it was needed. Still, why the type constraint at all? Why not just compile successfully as long as the function is only used in a valid context where the arguments are in Num?

If you don't want to specify the function's type, just leave it out and the compiler will infer the types automatically. But if you choose to specify the types, they have to be correct and accurate.

The entire point of types is to have a formal way to declare the right and wrong way to use a function. A type of (Num a) => a -> a -> a describes exactly what is required of the arguments. If you omitted the class constraint, you would have a more general function that could be used (erroneously) in more places.
And it’s not just preventing you from passing non-Num values to add. Everywhere the function goes, the type is sure to go. Consider this:
add :: a -> a -> a
add a b = a + b
foo :: [a -> a -> a]
foo = [add]
value :: [String]
value = [f "hello" "world" | f <- foo]
You want the compiler to reject this, right? How does it do that? By adding class constraints, and checking that they are not removed, even if you don’t directly name the function.
What’s different in the C++ version? There are no class constraints. The compiler substitutes int or std::string for T, then tries to compile the resulting code and looks for a matching + operator that it can use. The template system is “looser”, since it accepts more invalid programs, and this is a symptom of it being a separate stage before compilation. I would love to modify C++ to add the <? extends T> semantics from Java’s generics. Just learn the type system and recognize that parametric polymorphism is “stronger” than C++ templates, namely it will reject more invalid programs.

I think you might be being tripped up by the "crazy moon poetry" of GHC's error messages. It's not saying that it (being GHC) couldn't deduce the (Num a) constraint. It is saying that the (Num a) constraint can't be deduced from your type signature, even though it knows the constraint must be there from the use of +. Hence, you're stating that this function has a type more general than the compiler knows it can have. The compiler doesn't want you lying about your functions to the world!
In the first example you gave, without the type signature, if you run :t add in ghci, you'll see that the compiler knows full well that the (Num a) constraint is there.
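For example, with the unannotated definition loaded:
*Main> :t add
add :: Num a => a -> a -> a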
As for C++'s templates, remember that they are syntactic templates and are only fully type checked in each instance as they are used. Your add template will work with any types so long as, at each place it is used, there is a suitable + operator and perhaps conversions, to make an instance of the template viable. No guarantees can be made about the template until then... which is why the body of the template must be "visible" to each module that uses it.
Basically, all C++ can do is validate the syntax of the template, and then keep it around as a kind of very hygienic macro. Whereas Haskell generates a real function for add (leaving aside that it may choose to also generate type specific specializations for optimization).
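One way to picture the difference (a rough sketch with made-up names, not GHC's actual machinery): the class constraint becomes an ordinary extra argument, a record of the methods the function needs.
-- illustrative "dictionary" standing in for the Num class
data NumDict a = NumDict { dictPlus :: a -> a -> a }
-- add :: Num a => a -> a -> a corresponds roughly to this explicit version
addD :: NumDict a -> a -> a -> a
addD d a b = dictPlus d a b
-- each instance supplies its own dictionary
intNumDict :: NumDict Int
intNumDict = NumDict { dictPlus = (+) }
So a single compiled function serves every instance, unlike a C++ template, which is re-instantiated per type.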

There are cases where the compiler can't figure out the right type for you and where it needs your help. Consider
f s = show $ read s
The compiler says:
Ambiguous type variable `a' in the constraints:
`Read a' arising from a use of `read' at src\Main.hs:20:13-18
`Show a' arising from a use of `show' at src\Main.hs:20:6-9
Probable fix: add a type signature that fixes these type variable(s)
(Strangely enough, it seems you can define this function in ghci, but there is no way to actually use it.)
If you want something like f "1" to work, you need to specify the type:
f s = show $ (read s :: Int)
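With that annotation the function can actually be used, for example:
*Main> f "1"
"1"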

Assigning constrained literal to a polymorphic variable

While learning Haskell with Haskell Programming from First Principles, I found an exercise that puzzles me.
Here is the short version:
For the following definition:
a) i :: Num a => a
i = 1
b) Try replacing the type signature with the following:
i :: a
The replacement gives me an error:
error:
• No instance for (Num a) arising from the literal ‘1’
Possible fix:
add (Num a) to the context of
the type signature for:
i' :: forall a. a
• In the expression: 1
In an equation for ‘i'’: i' = 1
|
38 | i' = 1
| ^
It is more or less clear to me how the Num constraint arises.
What is not clear is why assigning 1 to the polymorphic variable i' gives the error.
Why this works:
id 1
while this one doesn't:
i' :: a
i' = 1
id i'
Shouldn't it be possible to assign a more specific value to a less specific one, losing some type info, if there are no issues?
This is a common misunderstanding. You probably have something in mind like, in a class-OO language,
class Object {};
class Num: Object { public: Num add(...){...} };
class Int: Num { int i; ... };
And then you would be able to use an Int value as the argument to a function that expects a Num argument, or a Num value as the argument to a function that expects an Object.
But that's not at all how Haskell's type classes work. Num is not a class of values (like, in the above example it would be the class of all values that belong to one of the subclasses). Instead, it's the class of all types that represent specific flavours of numbers.
How is that different? Well, a polymorphic literal like 1 :: Num a => a does not generate a specific Num value that can then be upcasted to a more general class. Instead, it expects the caller to first pick a concrete type in which you want to render the number, then generates the number immediately in that type, and afterwards the type never changes.
In other words, a polymorphic value has an implicit type-level argument. Whoever wants to use i needs to do so in a context where both
It is unambiguous which type should be used for a. (It doesn't necessarily need to be fixed right there: the caller could also itself be a polymorphic function.)
The compiler can prove that this type a has a Num instance.
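For instance, a small illustration of the caller picking the type:
i :: Num a => a
i = 1
asInt :: Int
asInt = i        -- the use site fixes a ~ Int, and Int has a Num instance
asDouble :: Double
asDouble = i     -- the same polymorphic value, rendered at Double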
In C++, the analogue of Haskell typeclasses / polymorphic literal is not [sub]classes and their objects, but instead templates that are constrained to a concept:
#include <concepts>
template<typename A>
concept Num = std::constructible_from<A, int>; // simplified
template<Num A>
A poly_1() {
return 1;
}
Now, poly_1 can be used in any setting that demands a type which fulfills the Num concept, i.e. in particular a type that is constructible_from an int, but not in a context which requires some other type.
(In older C++ such a template would just be duck-typed, i.e. it's not explicit that it requires a Num setting but the compiler would just try to use it as such and then give a type error upon noticing that 1 can't be converted to the specified type.)
tl;dr
A value i' declared as i' :: a must be usable¹ in place of any other value, with no exception. 1 is no such value, as it can't be used, say, where a String is expected, to give just one example.
Longer version
Let's start from a less controversial scenario, where you do need a type constraint:
plus :: a -> a -> a
plus x y = x + y
This does not compile, because the signature is equivalent to plus :: forall a. a -> a -> a, and it is plainly not true that the RHS, x + y, is meaningful for any common type a that x and y are inhabitants of. So you can fix the above by providing a constraint guaranteeing that + is possible between two as, and you can do so by putting Num a => right after :: (or even by giving up on polymorphic types and just changing a to Int).
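With the constraint added it compiles:
plus :: Num a => a -> a -> a
plus x y = x + y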
But there are functions that don't require any constraints on their arguments. Here are three of them:
id :: a -> a
id x = x
const :: a -> b -> a
const x _ = x
Data.Tuple.swap :: (a, b) -> (b, a)
Data.Tuple.swap (a, b) = (b, a)
You can pass anything to these functions, and they'll always work, because their definitions make no assumption whatsoever on what can be done with those objects, as they just shuffle/ditch them.
Similarly,
i' :: a
i' = 1
cannot compile because it's not true that 1 can represent a value of any type a. It can't represent a String, for instance, whereas the signature i' :: a is expressing the idea that you can put i' in any place, e.g. where an Int is expected, as much as where a generic Num is expected, or where a String is expected, and so on.
In other words, the above signature says that you can use i' in both of these statements:
j = i' + 1
k = i' ++ "str"
So the question is: just like we found some functions that have signatures not constraining their arguments in any way, do we have a value that inhabits every single type you can think of?
Yes, there are some values like that, and here are two of them:
i' :: a
i' = error ""
j' :: a
j' = undefined
They're all "bottoms", or ⊥.
(¹) By "usable" I mean that when you write it in some place where the code compiles, the code keeps compiling.

Haskell Show type class in a function

I have a function
mySucc :: (Enum a, Bounded a, Eq a, Show a) => a -> Maybe a
mySucc int
| int == maxBound = Nothing
| otherwise = Just $ succ int
When I want to print the output of this function in ghci, Haskell seems to be confused as to which instance of Show to use. Why is that? Shouldn't Haskell automatically resolve a's type during runtime and use its Show?
My limited understanding of type class is that, if you mention a type (in my case a) and say that it belongs to a type class (Show), Haskell should automatically resolve the type. Isn't that how it resolves Bounded, Enum and Eq? Please correct me if my understanding is wrong.
Shouldn't Haskell automatically resolve a's type during runtime and use its Show?
Generally speaking, types don't exist at runtime. The compiler typechecks your code, resolves any polymorphism, and then erases the types. But, this is kind of orthogonal to your main question.
My limited understanding of type class is that, if you mention a type (in my case a) and say that it belongs to a type class (Show), Haskell should automatically resolve the type
No. The compiler will automatically resolve the instance. What that means is, you don't need to explicitly pass a showing-method into your function. For example, instead of the function
showTwice :: Show a => a -> String
showTwice x = show x ++ show x
you could have a function that doesn't use any typeclass
showTwice' :: (a -> String) -> a -> String
showTwice' show' x = show' x ++ show' x
which can be used much the same way as showTwice if you give it the standard show as the first argument. But that argument would need to be manually passed around at each call site. That's what you can avoid by using the type class instead, but this still requires the type to be known first.
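For example, once the type is known both behave the same way (a hypothetical ghci session):
> showTwice (42 :: Int)
"4242"
> showTwice' show (42 :: Int)
"4242"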
(Your mySucc doesn't actually use show in any way at all, so you might as well omit the Show a constraint completely.)
When your call to mySucc appears in a larger expression, chances are the type will in fact also be inferred automatically. For example, mySucc (length "bla") will use a ~ Int, because the result of length is fixed to Int; or mySucc 'y' will use a ~ Char. However, if all the subexpressions are polymorphic (and in Haskell, even number literals are polymorphic), then the compiler won't have any indication what type you actually want. In that case you can always specify it explicitly, either in the argument
> mySucc (3 :: Int)
Just 4
or in the result
> mySucc 255 :: Maybe Word8
Nothing
Are you writing mySucc 1? In this case, you get an error because the literal 1 is a polymorphic value of type Num a => a.
Try calling mySucc 1 :: Maybe Int and it will work.
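For example:
> mySucc 1 :: Maybe Int
Just 2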

Bind type parameter

I am currently working with the quite nice haskell-eigen library and stumbled upon a fundamental yet probably basic problem (I am quite new to practical haskell development).
I use their basic matrix type
data Matrix a b :: * -> * -> *
where a denotes the Haskell type and b the internal C type. This is realized via the constraint
Elem a b
with
Elem Double CDouble
Elem Float CFloat
-- more for complex types...
Although this is not really the question I want to ask here, I kind of don't understand why it is done this way. Since it is obviously a kind of functional mapping, I don't understand why it is formulated as an equivalence relation, but anyway...
I now want to define (as a simple example - I got several) an instance of Key from the keys package. It defines the index key for a given container, for example
type instance Key [] = Int
So instances of Key are defined over types of kind * -> *.
However due to that requirement, this won't work:
type instance Key Matrix = (Int, Int)
I have to somehow make Matrix have kind * -> *. So (coming from C++, where I would do this using traits classes), I tried this:
type family CType a where
CType Double = CDouble
CType Float = CFloat
type MatX a = Matrix a (CType a)
In other words, I tried to use type synonyms as a means of realizing the above-mentioned functional type map.
Now I tried the following:
type instance Key MatX = (Int, Int)
which gives me "The type synonym ‘MatX’ should have 1 argument, but has been given none" and I even tried the obviously wrong
type instance Key (MatX a) = (Int, Int)
which gives me "Expected kind * -> *, but MatX a has kind *". This sounds to me like "I the compiler expect a type with more than 0 but - being a type synonym - less than 1 argument".
So my question is: How does one commonly map types in Haskell in order to solve such a kind mismatch, or get rid of it in another way?
P.S.: I am well aware that the eigen matrix has an indexing function, but
1. I want it to be a common one with other data types, and
2. I have this problem in other variants for other type instances.
You're nearly there. The one missing piece is that type synonyms must be used saturated - that is, you have to supply all of their arguments. MatX on its own is not a valid type, only MatX a is. The reason for this is that type synonyms are just synonyms - they're expanded at compile time, which means that the compiler needs to know all of the type synonym's arguments in order to get a valid type after expansion.
The fix is to change your type synonym to a newtype.
newtype MatX a = MatX { getMatX :: Matrix a (CType a) }
newtypes can be partially applied, because MatX a is now a different type to Matrix a (CType a).
type instance Key MatX = (Int, Int)
The other answer shows the general case for converting type synonyms into things that can be used in instance declarations. But in this specific case it can be much simpler: since the index type is the same for all different matrices, you can supply just the arguments needed to get the kind correct. Thus:
type instance Key (Matrix a) = (Int, Int)
No extra type families relating Haskell and C types needed, no new types needed. This will also make working with the keys library's API much simpler, as you won't need to do any newtype wrapping and unwrapping around each call.

How are variable names chosen in type signatures inferred by GHC?

When I play with checking types of functions in Haskell with :t, for example like those in my previous question, I tend to get results such as:
Eq a => a -> [a] -> Bool
(Ord a, Num a, Ord a1, Num a1) => a -> a1 -> a
(Num t2, Num t1, Num t, Enum t2, Enum t1, Enum t) => [(t, t1, t2)]
It seems that this is not such a trivial question - how does the Haskell interpreter pick names for type variables? When would it choose a rather than t? When would it choose a1 rather than b? Is it important from the programmer's point of view?
The names of the type variables aren't significant. The type:
Eq element => element -> [element] -> Bool
Is exactly the same as:
Eq a => a -> [a] -> Bool
Some names are simply easier to read/remember.
Now, how can an inferencer choose the best names for types?
Disclaimer: I'm absolutely not a GHC developer. However I'm working on a type-inferencer for Haskell in my bachelor thesis.
During inferencing, the names chosen for the variables probably aren't that readable. In fact they are almost surely something along the lines of _N with N a number, or aN with N a number.
This is due to the fact that you often have to "refresh" type variables in order to complete inferencing, so you need a fast way to create new names. And using numbered variables is pretty straightforward for this purpose.
The names displayed when inference is completed can be "pretty printed". The inferencer can rename the variables to use a, b, c and so on instead of _1, _2 etc.
The trick is that most operations have explicit type signatures. Some definitions require you to quantify some type variables (class, data and instance for example).
All these names that the user explicitly provides can be used to display the type in a better way.
When inferencing you can somehow keep track of where the fresh type variables came from, in order to be able to rename them with something more sensible when displaying them to the user.
Another option is to refresh variables by adding a number to them. For example a fresh type of return could be Monad m0 => a0 -> m0 a0 (here we know to use m and a simply because the class definition for Monad uses those names). When inferencing is finished you can get rid of the numbers and obtain the pretty names.
In general the inferencer will try to use names that were explicitly provided through signatures. If such a name was already used it might decide to add a number instead of using a different name (e.g. use b1 instead of c if b was already bound).
There are probably some other ad hoc rules. For example, the fact that tuple elements get names like t, t1, t2, t3, etc. is probably done with a custom rule. In fact, t doesn't appear in the signature of (,,), for example.
How does GHCi pick names for type variables? explains how many of these variable names come about. As Ganesh Sittampalam pointed out in a comment, something strange seems to be happening with arithmetic sequences. Both the Haskell 98 report and the Haskell 2010 report indicate that
[e1..] = enumFrom e1
GHCi, however, gives the following:
Prelude> :t [undefined..]
[undefined..] :: Enum t => [t]
Prelude> :t enumFrom undefined
enumFrom undefined :: Enum a => [a]
This makes it clear that the weird behavior has nothing to do with the Enum class itself, but rather comes in from some stage in translating the syntactic sequence to the enumFrom form. I wondered if maybe GHC wasn't really using that translation, but it really is:
{-# LANGUAGE NoMonomorphismRestriction #-}
module X (aoeu,htns) where
aoeu = [undefined..]
htns = enumFrom undefined
compiled using ghc -ddump-simpl enumlit.hs gives
X.htns :: forall a_aiD. GHC.Enum.Enum a_aiD => [a_aiD]
[GblId, Arity=1]
X.htns =
  \ (@ a_aiG) ($dEnum_aiH :: GHC.Enum.Enum a_aiG) ->
    GHC.Enum.enumFrom @ a_aiG $dEnum_aiH (GHC.Err.undefined @ a_aiG)
X.aoeu :: forall t_aiS. GHC.Enum.Enum t_aiS => [t_aiS]
[GblId, Arity=1]
X.aoeu =
  \ (@ t_aiV) ($dEnum_aiW :: GHC.Enum.Enum t_aiV) ->
    GHC.Enum.enumFrom @ t_aiV $dEnum_aiW (GHC.Err.undefined @ t_aiV)
so the only difference between these two representations is the assigned type variable name. I don't know enough about how GHC works to know where that t comes from, but at least I've narrowed it down!
Ørjan Johansen has noted in a comment that something similar seems to happen with function definitions and lambda abstractions.
Prelude> :t \x -> x
\x -> x :: t -> t
but
Prelude> :t map (\x->x) $ undefined
map (\x->x) $ undefined :: [b]
In the latter case, the type b comes from an explicit type signature given to map.
Are you familiar with the concepts of alpha equivalence and alpha substitution? This captures the notion that, for example, both of the following are completely equivalent and interconvertible (in certain circumstances) even though they differ:
\x -> (x, x)
\y -> (y, y)
The same concept can be extended to the level of types and type variables (see "System F" for further reading). Haskell in fact has a notion of "lambdas at the type level" for binding type variables, but it's hard to see because they're implicit by default. However, you can make them explicit by using the ExplicitForAll extension, and play around with explicitly binding your type variables:
ghci> :set -XExplicitForAll
ghci> let f x = x; f :: forall a. a -> a
In the second line, I use the forall keyword to introduce a new type variable, which is then used in a type.
In other words, it doesn't matter whether you choose a or t in your example, as long as the type expressions satisfy alpha-equivalence. Choosing type variable names so as to maximize human convenience is an entirely different topic, and probably far more complicated!

Functional Dependency in Haskell

I cannot really get it. Why do we need it at all? I mean if I use the same type parameter, I think that means they should be the same type.
I heard it can help the compiler avoid infinite loops. Can someone tell me some more details about that?
In the end, are there any 'patterns and practices' we should follow on the usage of functional dependency in Real World Haskell?
[Follow-up Question]
class Extract container element where
extract :: container -> element
instance Extract (a,b) a where
extract (x,_) = x
In the code above, I used the same type variable 'a' for both container and element, so I think the compiler can infer that these two types are the same type.
But when I tried this code in GHCi, I got the following feedback:
*Main> extract('x',3)
<interactive>:1:0:
No instance for (Extract (Char, t) element)
arising from a use of `extract' at <interactive>:1:0-13
Possible fix:
add an instance declaration for (Extract (Char, t) element)
In the expression: extract ('x', 3)
In the definition of `it': it = extract ('x', 3)
When one of them has been specified to be type 'Char', why is the other one still the unresolved type 'element'?
I thought this explains it fairly well. So basically, if you have an FD relation a -> b, all it means is that for a type-class instance there can only be one 'b' for any given 'a': you can have an instance at Int Int, but then not one at Int Float as well. That's what is meant when it's said that 'b' is uniquely determined by 'a'. This extends to any number of type parameters. The reasons it is needed are 1. type inference, 2. sometimes you want a constraint like that.
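To make that concrete for the follow-up example, here is a sketch of the class with a functional dependency added (extensions and behaviour as I understand them, not taken from the original answer):
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
-- the "| container -> element" part says: the container type alone
-- determines the element type
class Extract container element | container -> element where
  extract :: container -> element
instance Extract (a, b) a where
  extract (x, _) = x
Now extract ('x', 3) type checks: from the container type (Char, t) the compiler improves element to Char, and the leftover Num t constraint on the second component is defaulted away by GHCi.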
An alternative to FDs is the type families extension, though it doesn't cover all use cases of FDs.
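For instance, a minimal sketch of the same class using an associated type family instead of a functional dependency:
{-# LANGUAGE TypeFamilies #-}
class Extract container where
  type Elem container
  extract :: container -> Elem container
instance Extract (a, b) where
  type Elem (a, b) = a
  extract (x, _) = x
Here the element type is computed by the type family rather than fixed by a dependency on the class head.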
