How does Haskell actually define the + function? - haskell

While reading Real world Haskell I came up with this note:
ghci> :info (+)
class (Eq a, Show a) => Num a where
(+) :: a -> a -> a
...
-- Defined in GHC.Num
infixl 6 +
But how can Haskell define + as a non-native function? At some level you have to say that 2 + 3 will become assembler i.e. machine code.

The + function is overloaded and for some types, like Int and Double the definition of + is something like
instance Num Int where
x + y = primAddInt x y
where primAddInt is a function the compiler knows about and will generate machine code for.
The details of how this looks and works depends on the Haskell implementation you're looking at.

It is in fact possible to define numbers without ANY native primitives. There are many ways, but the simplest is:
data Peano = Z | S Peano
Then you can define instance Num for this type using pattern-matching. The second common representation of numbers is so called Church encoding using only functions (all numbers will be represented by some obscure functions, and + will 'add' two functions together to form third one).
Very interesting encodings are possible indeed. For example, you can represent arbitrary precision reals in [0,1) using sequences of bits:
data RealReal = RealReal Bool RealReal | RealEnd
In GHC of course it is defined in a machine-specific way by using either primitives or FFI.

Related

Haskell: What are the types of these expressions (if they have any)?

I'm new to haskell and I am doing some exercises to learn as much as I can about types, but some of the questions are really confusing to me. The exercise I'm struggling with reads:
What are the types of the following expressions? If the expression has no type, do state as much. Also
be sure to declare necessary class restrictions where needed.
5 + 8 :: ?
(+) 2 :: ?
(+2) :: ?
(2+) :: ?
I get that 5 + 8 will return an Int, but the others are not valid expressions by them selves. Does that mean that they have no type, or should i think of them as functions (f :: Int -> Int)(f x = x + 2)?
First, the answers:
5 + 8 has the type forall a. Num a => a which can be specialized to Int.
(+) 2 and (2 +) are the same and both have the type forall a. Num a => a -> a which can be specialized to Int -> Int.
(+ 2) is different but also has the type forall a. Num a => a -> a which can be specialized to Int -> Int.
(+) has the type forall a. Num a => a -> a -> a which can be specialized to Int -> Int -> Int.
For further explanation, read on.
One Literal, Many Types
In Haskell, a numeric literal like 114514 does not have a concrete type like Int. This is good because we have many different types for numbers, incl. Int, Integer, Float, Double and so on, and we do not want to have different notations for each type.
The literals 5, 114514, and 1919810 all have the type
forall a. Num a => a
You can read it like this: "for any type a, if a is an instance of the Num typeclass, then the value can have the type a." Int, Integer, Float and Double are all instances of Num, and because Haskell has (relatively) strong type inference, in different contexts it will be specialized to concrete types like Int.
So what is typeclass?
Typeclasses
The way we express some type "supports" some operation(s) in Haskell is by typeclasses. A typeclass is a set of function signatures, without real implementations. They represent operations we want to make on some types (e.g. Num represents operations we want to make on numeric types), but the actual implementations on different types may differ (the actual calculation of integers and floating point numbers are really different).
We can make a type be an instance of a typeclass (note that this has nothing to do with the instances and classes in object-oriented programming), by actually defining these functions for this type. In this way, we defined this type to support these operations.
Num is one of the typeclasses, which represents the support of numeric operations. It is partially defined as below (I did not put the full one here, in order to reduce verbosity):
class Num a where
(+) :: a -> a -> a
(-) :: a -> a -> a
(*) :: a -> a -> a
You can see that in the signatures we have as instead of real concrete types, these functions are said to be polymorphic, that is, they are generic and can be specialized to different types.
For example, if a and b are both Ints, then a + b has type Int too, because haskell inferred that the + we used here should be the one defined for Int, given that both of its arguments are Ints.
So, if some type is an instance of Num, it means that the +, - and * operator is defined for this type. Being an instance of Num means supporting these operators.
Sections
One good thing of Haskell is its (relatively) flexible infix operators. In Haskell, infix operators are nothing but normal functions with an infix notation (which is merely a syntactic sugar). We can also use infix operators in a prefix manner. For example, 5 + 8 is equivalent to (+) 5 8.
It seems that you are confused with (+) 5, (+5) and (5+). Remember that if we put parentheses on both sides of an infix function, we make it prefix. And as you may already know, prefix functions can be partially applied, that is, give the function only some of its parameters, so it becomes a functions with less parameters to be given later. (+) 5 means that we partially applied (+) by only giving its first argument, so it becomes a function waiting for another one argument, or originally its second argument. We can apply (+) 5 to 8 so it becomes ((+) 5) 8, which is equivalent to 5 + 8.
On the other hand, (5+) and (+5) are called sections, which are another syntactic sugar for infix operators. (5+) means you filled the left hand side of the operator, and it becomes a function waiting for its right hand side. (5+) 8 means 5 + 8 or (+) 5 8. (+5) is flipped, i.e. you filled the right hand side, so it is a function waiting for the left hand side. (+5) 8 means 8 + 5 or (+) 8 5.
Haskell is a language very different from other languages you may have learned; every expression that compiles has a type. Functions are first-class and every function has a type too. Thinking in types will really help in your learning progress.
In Haskell:
operators are just functions
functions can be provided less arguments then declaration
parenthesis can be used to disambiguate order and components of arguments
Taken together it means that:
(+) Is just + function, and will take 2 more arguments
(1 +) Is function with it's first argument provided and will take one more
(+ 2) Is another function that have is argument provided, but this time second one. It also awaits one more argument

Haskell Type Coercion

I trying to wrap my head around Haskell type coercion. Meaning, when does can one pass a value into a function without casting and how that works. Here is a specific example, but I am looking for a more general explanation I can use going forward to try and understand what is going on:
Prelude> 3 * 20 / 4
15.0
Prelude> let c = 20
Prelude> :t c
c :: Integer
Prelude> 3 * c / 4
<interactive>:94:7:
No instance for (Fractional Integer)
arising from a use of `/'
Possible fix: add an instance declaration for (Fractional Integer)
In the expression: 3 * c / 4
In an equation for `it': it = 3 * c / 4
The type of (/) is Fractional a => a -> a -> a. So, I'm guessing that when I do "3 * 20" using literals, Haskell somehow assumes that the result of that expression is a Fractional. However, when a variable is used, it's type is predefined to be Integer based on the assignment.
My first question is how to fix this. Do I need to cast the expression or convert it somehow?
My second question is that this seems really weird to me that you can't do basic math without having to worry so much about int/float types. I mean there's an obvious way to convert automatically between these, why am I forced to think about this and deal with it? Am I doing something wrong to begin with?
I am basically looking for a way to easily write simple arithmetic expressions without having to worry about the neaty greaty details and keeping the code nice and clean. In most top-level languages the compiler works for me -- not the other way around.
If you just want the solution, look at the end.
You nearly answered your own question already. Literals in Haskell are overloaded:
Prelude> :t 3
3 :: Num a => a
Since (*) also has a Num constraint
Prelude> :t (*)
(*) :: Num a => a -> a -> a
this extends to the product:
Prelude> :t 3 * 20
3 * 20 :: Num a => a
So, depending on context, this can be specialized to be of type Int, Integer, Float, Double, Rational and more, as needed. In particular, as Fractional is a subclass of Num, it can be used without problems in a division, but then the constraint will become
stronger and be for class Fractional:
Prelude> :t 3 * 20 / 4
3 * 20 / 4 :: Fractional a => a
The big difference is the identifier c is an Integer. The reason why a simple let-binding in GHCi prompt isn't assigned an overloaded type is the dreaded monomorphism restriction. In short: if you define a value that doesn't have any explicit arguments,
then it cannot have overloaded type unless you provide an explicit type signature.
Numeric types are then defaulted to Integer.
Once c is an Integer, the result of the multiplication is Integer, too:
Prelude> :t 3 * c
3 * c :: Integer
And Integer is not in the Fractional class.
There are two solutions to this problem.
Make sure your identifiers have overloaded type, too. In this case, it would
be as simple as saying
Prelude> let c :: Num a => a; c = 20
Prelude> :t c
c :: Num a => a
Use fromIntegral to cast an integral value to an arbitrary numeric value:
Prelude> :t fromIntegral
fromIntegral :: (Integral a, Num b) => a -> b
Prelude> let c = 20
Prelude> :t c
c :: Integer
Prelude> :t fromIntegral c
fromIntegral c :: Num b => b
Prelude> 3 * fromIntegral c / 4
15.0
Haskell will never automatically convert one type into another when you pass it to a function. Either it's compatible with the expected type already, in which case no coercion is necessary, or the program fails to compile.
If you write a whole program and compile it, things generally "just work" without you having to think too much about int/float types; so long as you're consistent (i.e. you don't try to treat something as an Int in one place and a Float in another) the constraints just flow through the program and figure out the types for you.
For example, if I put this in a source file and compile it:
main = do
let c = 20
let it = 3 * c / 4
print it
Then everything's fine, and running the program prints 15.0. You can see from the .0 that GHC successfully figured out that c must be some kind of fractional number, and made everything work, without me having to give any explicit type signatures.
c can't be an integer because the / operator is for mathematical division, which isn't defined on integers. The operation of integer division is represented by the div function (usable in operator fashion as x `div` y). I think this might be what is tripping you up in your whole program? This is unfortunately just one of those things you have to learn by getting tripped up by it, if you're used to the situation in many other languages where / is sometimes mathematical division and sometimes integer division.
It's when you're playing around in the interpreter that things get messy, because there you tend to bind values with no context whatsoever. In interpreter GHCi has to execute let c = 20 on its own, because you haven't entered 3 * c / 4 yet. It has no way of knowing whether you intend that 20 to be an Int, Integer, Float, Double, Rational, etc
Haskell will pick a default type for numeric values; otherwise if you never use any functions that only work on one particular type of number you'd always get an error about ambiguous type variables. This normally works fine, because these default rules are applied while reading the whole module and so take into account all the other constraints on the type (like whether you've ever used it with /). But here there are no other constraints it can see, so the type defaulting picks the first cab off the rank and makes c an Integer.
Then, when you ask GHCi to evaluate 3 * c / 4, it's too late. c is an Integer, so must 3 * c be, and Integers don't support /.
So in the interpreter, yes, sometimes if you don't give an explicit type to a let binding GHC will pick an incorrect type, especially with numeric types. After that, you're stuck with whatever operations are supported by the concrete type GHCi picked, but when you get this kind of error you can always rebind the variable; e.g. let c = 20.0.
However I suspect in your real program the problem is simply that the operation you wanted was actually div rather than /.
Haskell is a bit unusual in this way. Yes you can't divide to integers together but it's rarely a problem.
The reason is that if you look at the Num typeclass, there's a function fromIntegral this allows you to convert literals into the appropriate type. This with type inference alleviates 99% of the cases where it'd be a problem. Quick example:
newtype Foo = Foo Integer
deriving (Show, Eq)
instance Num Foo where
fromInteger _ = Foo 0
negate = undefined
abs = undefined
(+) = undefined
(-) = undefined
(*) = undefined
signum = undefined
Now if we load this into GHCi
*> 0 :: Foo
Foo 0
*> 1 :: Foo
Foo 0
So you see we are able to do some pretty cool things with how GHCi parses a raw integer. This has a lot of practical uses in DSL's that we won't talk about here.
Next question was how to get from a Double to an Integer or vice versa. There's a function for that.
In the case of going from an Integer to a Double, we'd use fromInteger as well. Why?
Well the type signature for it is
(Num a) => Integer -> a
and since we can use (+) with Doubles we know they're a Num instance. And from there it's easy.
*> 0 :: Double
0.0
Last piece of the puzzle is Double -> Integer. Well a brief search on Hoogle shows
truncate
floor
round
-- etc ...
I'll leave that to you to search.
Type coercion in Haskell isn't automatic (or rather, it doesn't actually exist). When you write the literal 20 it's inferred to be of type Num a => a (conceptually anyway. I don't think it works quite like that) and will, depending on the context in which it is used (i.e. what functions you pass it to) be instantiated with an appropitiate type (I believe if no further constraints are applied, this will default to Integer when you need a concrete type at some point). If you need a different kind of Num, you need to convert the numbers e.g. (3* fromIntegral c / 4) in your example.
The type of (/) is Fractional a => a -> a -> a.
To divide Integers, use div instead of (/). Note that the type of div is
div :: Integral a => a -> a -> a
In most top-level languages the compiler works for me -- not the other way around.
I argue that the Haskell compiler works for you just as much, if not more so, than those of other languages you have used. Haskell is a very different language than the traditional imperative languages (such as C, C++, Java, etc.) you are probably used to. This means that the compiler works differently as well.
As others have stated, Haskell will never automatically coerce from one type to another. If you have an Integer which needs to be used as a Float, you need to do the conversion explicitly with fromInteger.

Haskell - add typeclass?

Consider the following example:
data Dot = Dot Double Double
data Vector = Vector Double Double
First, i would like to overload + operator for Vector addition. If i wanted to overload equality(==) operator, i would write it like:
instance Eq Vector where ...blahblahblah
But I can't find if there is Add typeclass to make Vector behave like a type with addition operation. I can't even find a complete list of Haskell typeclasses, i know only few from different tutorials. Does such a list exist?
Also, can I overload + operator for adding Vector to Dot(it seems rather logical, doesn't it?).
An easy way to discover information about which typeclass (if any) a function belongs to is to use GHCi:
Prelude> :i (+)
class (Eq a, Show a) => Num a where
(+) :: a -> a -> a
...
-- Defined in GHC.Num
infixl 6 +
The operator + in Prelude is defined by the typeclass Num. However as the name suggests, this not only defines addition, but also a lots of other numeric operations (in particular the other arithmetic operators as well as the ability to use numeric literals), so this doesn't fit your use case.
There is no way to overload just + for your type, unless you want to hide Prelude's + operator (which would mean you have to create your own Addable instance for Integer, Double etc. if you still want to be able to use + on numbers).
You can write an instance Num Vector to overload + for vector addition (and the other operators that make sense).
instance Num Vector where
(Vector x1 y1) + (Vector x2 y2) = Vector (x1 + x2) (y1 + y2)
-- and so on
However, note that + has the type Num a => a -> a -> a, i.e. both operands and the result all have to be the same type. This means that you cannot have a Dot plus a Vector be a Dot.
While you can hide Num from the Prelude and specify your own +, this is likely to cause confusion and make it harder to use your code together with regular arithmetic.
I suggest you define your own operator for vector-point addition, for example
(Dot x y) `offsetBy` (Vector dx dy) = Dot (x + dx) (y + dy)
or some variant using symbols if you prefer something shorter.
I sometimes see people defining their own operators that kind of look like ones from the Prelude. Even ++ probably uses that symbol because they wanted something that conveyed the idea of "adding" two lists together, but it didn't make sense for lists to be an instance of Num. So you could use <+> or |+| or something.

What is this abstract data type called?

I'm writing Haskell, but this could be applied to any OO or functional language with a concept of ADT. I'll give the template in Haskell, ignoring the fact that the arithmetic operators are already taken:
class Thing a where
(+) :: a -> a -> a
(-) :: a -> a -> a
x - y = x + negate y
(*) :: (RealFrac b) => a -> b -> a
negate :: a -> a
negate x = x * (-1)
Basically these are things that can be added and subtracted and also multiplied by real fractional values. One example might be a simple list of numbers: addition and subtraction are pairwise (in Haskell, "(+) = zipWith (+)"), and multiplication by a real multiplies every item in the list by the same amount. I've come across enough other examples to want to define it as a class, but I don't know exactly what to call it.
In Haskell its usually a monoid provided there is some kind of zero value.
Is this some known kind of object in the zoo of algebraic types? I've looked through rings, semirings, nearsemirings, groups etc without finding it.
This is a vector space: http://en.wikipedia.org/wiki/Vector_space. You have addition and scalar multiplication.

Typed FP: Tuple Arguments and Curriable Arguments

In statically typed functional programming languages, like Standard ML, F#, OCaml and Haskell, a function will usually be written with the parameters separated from each other and from the function name simply by whitespace:
let add a b =
a + b
The type here being "int -> (int -> int)", i.e. a function that takes an int and returns a function which its turn takes and int and which finally returns an int. Thus currying becomes possible.
It's also possible to define a similar function that takes a tuple as an argument:
let add(a, b) =
a + b
The type becomes "(int * int) -> int" in this case.
From the point of view of language design, is there any reason why one could not simply identify these two type patterns in the type algebra? In other words, so that "(a * b) -> c" reduces to "a -> (b -> c)", allowing both variants to be curried with equal ease.
I assume this question must have cropped up when languages like the four I mentioned were designed. So does anyone know any reason or research indicating why all four of these languages chose not to "unify" these two type patterns?
I think the consensus today is to handle multiple arguments by currying (the a -> b -> c form) and to reserve tuples for when you genuinely want values of tuple type (in lists and so on). This consensus is respected by every statically typed functional language since Standard ML, which (purely as a matter of convention) uses tuples for functions that take multiple arguments.
Why is this so? Standard ML is the oldest of these languages, and when people were first writing ML compilers, it was not known how to handle curried arguments efficiently. (At the root of the problem is the fact that any curried function could be partially applied by some other code you haven't seen yet, and you have to compile it with that possibility in mind.) Since Standard ML was designed, compilers have improved, and you can read about the state of the art in an excellent paper by Simon Marlow and Simon Peyton Jones.
LISP, which is the oldest functional language of them all, has a concrete syntax which is extremely unfriendly to currying and partial application. Hrmph.
At least one reason not to conflate curried and uncurried functional types is that it would be very confusing when tuples are used as returned values (a convenient way in these typed languages to return multiple values). In the conflated type system, how can function remain nicely composable? Would a -> (a * a) also transform to a -> a -> a? If so, are (a * a) -> a and a -> (a * a) the same type? If not, how would you compose a -> (a * a) with (a * a) -> a?
You propose an interesting solution to the composition issue. However, I don't find it satisfying because it doesn't mesh well with partial application, which is really a key convenience of curried functions. Let me attempt to illustrate the problem in Haskell:
apply f a b = f a b
vecSum (a1,a2) (b1,b2) = (a1+b1,a2+b2)
Now, perhaps your solution could understand map (vecSum (1,1)) [(0,1)], but what about the equivalent map (apply vecSum (1,1)) [(0,1)]? It becomes complicated! Does your fullest unpacking mean that the (1,1) is unpacked with apply's a & b arguments? That's not the intent... and in any case, reasoning becomes complicated.
In short, I think it would be very hard to conflate curried and uncurried functions while (1) preserving the semantics of code valid under the old system and (2) providing a reasonable intuition and algorithm for the conflated system. It's an interesting problem, though.
Possibly because it is useful to have a tuple as a separate type, and it is good to keep different types separate?
In Haskell at least, it is quite easy to go from one form to the other:
Prelude> let add1 a b = a+b
Prelude> let add2 (a,b) = a+b
Prelude> :t (uncurry add1)
(uncurry add1) :: (Num a) => (a, a) -> a
Prelude> :t (curry add2)
(curry add2) :: (Num a) => a -> a -> a
so uncurry add1 is same as add2 and curry add2 is the same as add1. I guess similar things are possible in the other languages as well. What greater gains would there be to automatically currying every function defined on a tuple? (Since that's what you seem to be asking.)
Expanding on the comments under namin's good answer:
So assume a function which has type 'a -> ('a * 'a):
let gimme_tuple(a : int) =
(a*2, a*3)
Then assume a function which has type ('a * 'a) -> 'b:
let add(a : int, b) =
a + b
Then composition (assuming the conflation that I propose) wouldn't pose any problem as far as I can see:
let foo = add(gimme_tuple(5))
// foo gets value 5*2 + 5*3 = 25
But then you could conceive of a polymorphic function that takes the place of add in the last code snippet, for instance a little function that just takes a 2-tuple and makes a list of the two elements:
let gimme_list(a, b) =
[a, b]
This would have the type ('a * 'a) -> ('a list). Composition now would be problematic. Consider:
let bar = gimme_list(gimme_tuple(5))
Would bar now have the value [10, 15] : int list or would bar be a function of type (int * int) -> ((int * int) list), which would eventually return a list whose head would be the tuple (10, 15)? For this to work, I posited in a comment to namin's answer that one would need an additional rule in the type system that the binding of actual to formal parameters be the "fullest possible", i.e. that the system should attempt a non-partial binding first and only try a partial binding if a full binding is unattainable. In our example, it would mean that we get the value [10, 15] because a full binding is possible in this case.
Is such a concept of "fullest possible" inherently meaningless? I don't know, but I can't immediately see a reason it would be.
One problem is of course if you want the second interpretation of the last code snippet, then you'd need to go jump through an extra hoop (typically an anonymous function) to get that.
Tuples are not present in these languages simply to be used as function parameters. They are a really convenient way to represent structured data, e.g., a 2D point (int * int), a list element ('a * 'a list), or a tree node ('a * 'a tree * 'a tree). Remember also that structures are just syntactic sugar for tuples. I.e.,
type info =
{ name : string;
age : int;
address : string; }
is a convenient way of handling a (string * int * string) tuple.
There's no fundamental reason you need tuples in a programming language (you can build up tuples in the lambda calculus, just as you can booleans and integers—indeed using currying*), but they are convenient and useful.
* E.g.,
tuple a b = λf.f a b
fst x = x (λa.λb.a)
snd x = x (λa.λb.b)
curry f = λa.λb.f (λg.g a b)
uncurry f = λx.x f

Resources