I am playing around with zipWith and encountered the following:
Prelude Control.Applicative> :t zipWith id
zipWith id :: [b -> c] -> [b] -> [c]
Why does the compiler expect a list of functions as the next argument?
I tried to analyze it, but could not work out why the next argument must be a list of functions.
How is the signature derived when I pass id to zipWith?
The type of zipWith is:
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
And the type of id is:
id :: d -> d
So if we now want to derive the type of zipWith id, we push the type of id :: d -> d into the type of the first argument of zipWith:
d -> d
~ a -> (b -> c)
So that means that d ~ a and d ~ b -> c, and hence a ~ b -> c. The type of zipWith id is therefore:
zipWith id :: [a] -> [b] -> [c]
-> zipWith id :: [b -> c] -> [b] -> [c]
How does this work? The first list has to contain functions f :: b -> c, and the second list elements x :: b; the result is a list of elements f x :: c.
For example:
Prelude> zipWith id [(+1),(5/),(3*),(3-)] [1,4,2,5]
[2.0,1.25,6.0,-2.0]
since 1+1 is 2.0, 5/4 is 1.25, 3*2 is 6.0 and 3-5 is -2.0.
So zipWith id will take two elements f and x, and apply id f x on these, or, more verbosely, (id f) x. Since id f is f, it will thus calculate f x.
We can thus conclude that zipWith id is elementwise function application: the i-th function is applied to the i-th element.
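As a small sketch of this conclusion (the names applyAll and applyAll' are made up for illustration and do not come from any library), zipWith id can be given the inferred signature explicitly, and it should behave like zipWith ($):

```haskell
-- zipWith id with the inferred type written out explicitly.
-- applyAll is a hypothetical name used only for this illustration.
applyAll :: [b -> c] -> [b] -> [c]
applyAll = zipWith id

-- The same operation spelled with the application operator ($),
-- since id f x and f $ x both reduce to f x.
applyAll' :: [b -> c] -> [b] -> [c]
applyAll' = zipWith ($)
```

Loaded into GHCi, applyAll [(+1), (*2)] [10, 20] evaluates to [11,40], and applyAll' gives the same result.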
Thank you, Willem Van Onsem for the great answer.
Let's understand zipWith id through the eyes of GHC's type inference system.
First, consider the type of zipWith:
Prelude> :info zipWith
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
-- Defined in ‘GHC.List’
The first argument of zipWith is a function which takes two arguments.
(a -> b -> c) can also be rewritten as a -> (b -> c).
Now consider zipWith id. The type of id is a -> a.
We have put id in a place where a two-argument function must go.
So type inference treats (a -> b -> c) as a -> (b -> c) (notice that a -> (b -> c) takes one argument a and returns b -> c, i.e. a single-argument function).
But making a -> (b -> c) an identity function is possible only if a is (b -> c).
When a is (b -> c), the function type a -> b -> c becomes ((b -> c) -> (b -> c)).
So the type inference system infers a as (b -> c), and the resulting type is [a] -> [b] -> [c] with a replaced by b -> c.
In steps: replace a with (b -> c).
This makes (a -> b -> c) look like id:
((b -> c) -> b -> c), which can also be written as ((b -> c) -> (b -> c)), which is id :: x -> x where x is (b -> c).
zipWith :: ((b -> c) -> b -> c) -> [b -> c] -> [b] -> [c]
So finally we get the type [b -> c] -> [b] -> [c].
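The substitution can be checked by asking the compiler to accept id at the instantiated type. This is only a sketch; idAtFunction, idAsTwoArgs and zipApply are hypothetical names introduced for the illustration:

```haskell
-- id used at the instantiated type x -> x with x = (b -> c).
idAtFunction :: (b -> c) -> (b -> c)
idAtFunction = id

-- Dropping the redundant parentheses gives the two-argument view
-- that zipWith expects for its first parameter.
idAsTwoArgs :: (b -> c) -> b -> c
idAsTwoArgs = idAtFunction

-- With a replaced by (b -> c), zipWith's type becomes this one.
zipApply :: [b -> c] -> [b] -> [c]
zipApply = zipWith idAsTwoArgs
```

If any of these signatures were wrong, the definitions would be rejected, so a successful load is a cheap confirmation of the reasoning above.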
Given is the Haskell function:
head . filter fst
The question now is how to find the type by hand. If I let Haskell tell me the type I get:
head . filter fst :: [(Bool, b)] -> (Bool, b)
But I want to understand how this works using only the signatures of the used functions which are defined as follows:
head :: [a] -> a
(.) :: (b -> c) -> (a -> b) -> a -> c
filter :: (a -> Bool) -> [a] -> [a]
fst :: (a, b) -> a
Edit: so many very good explanations ... it's not easy to select the best one!
Types are inferred using a process generally called unification.
Haskell's type system belongs to the Hindley-Milner family, whose inference
algorithm uses unification to determine the type of an expression.
If unification fails, then the expression is a type error.
The expression
head . filter fst
passes. Let's do the unification manually to see why we get
what we get.
Let's start with filter fst:
filter :: (a -> Bool) -> [a] -> [a]
fst :: (a' , b') -> a' -- using a', b' to prevent confusion
filter takes a (a -> Bool), then a [a] to give another [a]. In the expression
filter fst, we pass to filter the argument fst, whose type is (a', b') -> a'.
For this to work, the type of fst must unify with the type of filter's first argument:
(a -> Bool) UNIFY? ((a', b') -> a')
The algorithm unifies the two type expressions and tries to bind as many type variables (such as a or a') to actual types (such as Bool) as possible.
Only then does filter fst lead to a validly typed expression:
filter fst :: [a] -> [a]
a' is clearly Bool. So the type variable a' resolves to Bool.
And (a', b') can unify with a. So if a is (a', b') and a' is Bool,
then a is just (Bool, b').
If we had passed an incompatible argument to filter, such as True (a Bool),
unification of Bool with a -> Bool would have failed, as the two expressions
can never unify to a correct type expression.
Coming back to
filter fst :: [a] -> [a]
This is the same a we are talking about, so we substitute in its place
the result of the previous unification:
filter fst :: [(Bool, b')] -> [(Bool, b')]
The next bit,
head . (filter fst)
Can be written as
(.) head (filter fst)
So take (.)
(.) :: (b -> c) -> (a -> b) -> a -> c
So for unification to succeed,
(1) head :: [a] -> a must unify with (b -> c)
(2) filter fst :: [(Bool, b')] -> [(Bool, b')] must unify with (a -> b)
From (2) we get that a IS b in the expression
(.) :: (b -> c) -> (a -> b) -> a -> c
So the values of the type variables a and c in the
expression (.) head (filter fst) :: a -> c are easy to tell, since
(1) gives us the relation between b and c: b is a list of c.
And as we know a to be [(Bool, b')], c can only unify to (Bool, b').
So head . filter fst successfully type-checks as:
head . filter fst :: [(Bool, b')] -> (Bool, b')
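The two unification steps can be mirrored in code by annotating each intermediate expression with the type derived for it. This is a minimal sketch; keepFlagged and firstFlagged are hypothetical names chosen for the illustration:

```haskell
-- Step 1: unifying fst with filter's first argument forces the element
-- type to be a pair whose first component is Bool.
keepFlagged :: [(Bool, b)] -> [(Bool, b)]
keepFlagged = filter fst

-- Step 2: composing with head yields one element of that list type.
firstFlagged :: [(Bool, b)] -> (Bool, b)
firstFlagged = head . keepFlagged
```

Changing either signature to something incompatible, for example [(Int, b)] -> [(Int, b)] for keepFlagged, should make the compiler report the failed unification.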
UPDATE
It's interesting to see how you can unify starting the process from various points.
I chose filter fst first, then went on to (.) and head but as the other examples
show, unification can be carried out in several ways, not unlike the way a mathematical
proof or a theorem derivation can be done in more than one way!
filter :: (a -> Bool) -> [a] -> [a] takes a function (a -> Bool), a list of the same type a, and also returns a list of that type a.
In your definition you use filter fst with fst :: (a,b) -> a, so the type
filter (fst :: (Bool,b) -> Bool) :: [(Bool,b)] -> [(Bool,b)]
is inferred.
Next, you compose filter fst, whose result has type [(Bool,b)], with head :: [a] -> a.
(.) :: (b -> c) -> (a -> b) -> a -> c is the composition of two functions, func2 :: (b -> c) and func1 :: (a -> b). In your case, you have
func2 = head :: [ a ] -> a
and
func1 = filter fst :: [(Bool,b)] -> [(Bool,b)]
so head here takes [(Bool,b)] as argument and returns (Bool,b) per definition. In the end you have:
head . filter fst :: [(Bool,b)] -> (Bool,b)
Let's start with (.). Its type signature is
(.) :: (b -> c) -> (a -> b) -> a -> c
which says
"given a function from b to c, and a function from a to b,
and an a, I can give you a c". We want to use that with head and
filter fst, so:
(.) :: (b -> c) -> (a -> b) -> a -> c
       ^^^^^^^^    ^^^^^^^^
         head     filter fst
Now head is a function from a list of something to a
single something. So now we know that b is going to be a list,
and c is going to be an element of that list. So for the purpose of
our expression, we can think of (.) as having the signature:
(.) :: ([d] -> d) -> (a -> [d]) -> a -> d -- Equation (1)
                     ^^^^^^^^^^
                     filter fst
The signature for filter is:
filter :: (e -> Bool) -> [e] -> [e] -- Equation (2)
          ^^^^^^^^^^^
              fst
(Note that I've changed the name of the type variable to avoid confusion
with the as that we already have!) This says "Given a function from e to a Bool,
and a list of es, I can give you a list of es". The function fst
has the signature:
fst :: (f, g) -> f
which says, "given a pair containing an f and a g, I can give you an f".
Comparing this with Equation 2, we know that
e is going to be a pair of values, the first element of
which must be a Bool. So in our expression, we can think of filter
as having the signature:
filter :: ((Bool, g) -> Bool) -> [(Bool, g)] -> [(Bool, g)]
(All I've done here is to replace e with (Bool, g) in Equation 2.)
And the expression filter fst has the type:
filter fst :: [(Bool, g)] -> [(Bool, g)]
Going back to Equation 1, we can see that (a -> [d]) must now be
[(Bool, g)] -> [(Bool, g)], so a must be [(Bool, g)] and d
must be (Bool, g). So in our expression, we can think of (.) as
having the signature:
(.) :: ([(Bool, g)] -> (Bool, g)) -> ([(Bool, g)] -> [(Bool, g)]) -> [(Bool, g)] -> (Bool, g)
To summarise:
(.) :: ([(Bool, g)] -> (Bool, g)) -> ([(Bool, g)] -> [(Bool, g)]) -> [(Bool, g)] -> (Bool, g)
       ^^^^^^^^^^^^^^^^^^^^^^^^^^    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                  head                        filter fst
head :: [(Bool, g)] -> (Bool, g)
filter fst :: [(Bool, g)] -> [(Bool, g)]
Putting it all together:
head . filter fst :: [(Bool, g)] -> (Bool, g)
Which is equivalent to what you had, except that I've used g as the type variable rather than b.
This probably all sounds very complicated, because I described it in gory detail. However, this sort of reasoning quickly becomes second nature and you can do it in your head.
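To double-check the specialised signature for (.), one option is to write it down and let the compiler confirm that (.) can be used at that type. The names composeAtPairs and headOfFiltered below are assumptions made for this sketch:

```haskell
-- (.) restricted to the instantiation a ~ [(Bool, g)], d ~ (Bool, g)
-- worked out from Equations (1) and (2).
composeAtPairs
  :: ([(Bool, g)] -> (Bool, g))
  -> ([(Bool, g)] -> [(Bool, g)])
  -> [(Bool, g)]
  -> (Bool, g)
composeAtPairs = (.)

-- head and filter fst fit the first and second argument slots.
headOfFiltered :: [(Bool, g)] -> (Bool, g)
headOfFiltered = composeAtPairs head (filter fst)
```

For instance, headOfFiltered [(False, "x"), (True, "y")] evaluates to (True,"y").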
(skip down for a manual derivation)
Find the type of head . filter fst == ((.) head) (filter fst), given
head :: [a] -> a
(.) :: (b -> c) -> ((a -> b) -> (a -> c))
filter :: (a -> Bool) -> ([a] -> [a])
fst :: (a, b) -> a
This is achieved in a purely mechanical manner by a small Prolog program:
type(head, arrow(list(A) , A)). %% -- known facts
type(compose, arrow(arrow(B, C) , arrow(arrow(A, B), arrow(A, C)))).
type(filter, arrow(arrow(A, bool), arrow(list(A) , list(A)))).
type(fst, arrow(pair(A, B) , A)).
type([F, X], T):- type(F, arrow(A, T)), type(X, A). %% -- application rule
which automagically produces, when run in a Prolog interpreter,
3 ?- type([[compose, head], [filter, fst]], T).
T = arrow(list(pair(bool, A)), pair(bool, A)) %% -- [(Bool,a)] -> (Bool,a)
where types are represented as compound data terms, in a purely syntactical manner. E.g. the type [a] -> a is represented by arrow(list(A), A), with possible Haskell equivalent Arrow (List (Logvar "a")) (Logvar "a"), given the appropriate data definitions.
Only one inference rule, that of an application, was used, as well as Prolog's structural unification whereby compound terms match if they have the same shape and their constituents match: f(a1, a2, ... an) and g(b1, b2, ... bm) match iff f is the same as g, n == m and ai matches bi, with logical variables being able to take on any value as needed, but only once (can't be changed).
4 ?- type([compose, head], T1). %% -- (.) head :: (a -> [b]) -> (a -> b)
T1 = arrow(arrow(A, list(B)), arrow(A, B))
5 ?- type([filter, fst], T2). %% -- filter fst :: [(Bool,a)] -> [(Bool,a)]
T2 = arrow(list(pair(bool, A)), list(pair(bool, A)))
Performing type inference manually in a mechanical fashion involves writing things one under another, noting equivalences on the side, and performing the substitutions, thus mimicking the operations of Prolog. We can treat any ->, (_,_), [] etc. purely as syntactical markers, without understanding their meaning at all, and perform the process mechanically using structural unification and, here, only one rule of type inference, viz. the rule of application: (a -> b) c ⊢ b {a ~ c} (replace a juxtaposition of (a -> b) and c with b, under the equivalence of a and c). It is important to rename logical variables, consistently, to avoid name clashes:
(.)  :: (b    -> c ) -> ((a -> b  ) -> (a -> c ))      b ~ [a1],
head ::  [a1] -> a1                                    c ~ a1
(.) head ::              (a ->[a1]) -> (a -> c )
                         (a ->[c] ) -> (a -> c )
---------------------------------------------------------
filter :: (   a    -> Bool) -> ([a]        -> [a])     a ~ (a1,b),
fst    ::  (a1, b) -> a1                               Bool ~ a1
filter fst ::                   [(a1,b)]   -> [(a1,b)]
                                [(Bool,b)] -> [(Bool,b)]
---------------------------------------------------------
(.) head   :: (      a    -> [     c  ]) -> (a -> c)   a ~ [(Bool,b)]
filter fst ::  [(Bool,b)] -> [(Bool,b)]                c ~ (Bool,b)
((.) head) (filter fst) ::                   a -> c
                                    [(Bool,b)] -> (Bool,b)
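For readers who prefer Haskell to Prolog, the same idea (type terms as plain data, structural unification, and a single application rule) might be sketched as follows. Everything here is an assumption made for illustration: the names Ty, Subst, unify, applyRule and inferred are invented, type variables must be named apart by hand, and the caller supplies each fresh result variable. It is not how GHC itself is implemented.

```haskell
-- Type terms: logical variables and constructors applied to arguments.
data Ty = TVar String | TCon String [Ty]
  deriving (Eq, Show)

-- A substitution maps variable names to types (association list).
type Subst = [(String, Ty)]

arrow :: Ty -> Ty -> Ty
arrow a b = TCon "->" [a, b]

list :: Ty -> Ty
list a = TCon "[]" [a]

pair :: Ty -> Ty -> Ty
pair a b = TCon "(,)" [a, b]

bool :: Ty
bool = TCon "Bool" []

-- Replace bound variables, following chains of bindings.
subst :: Subst -> Ty -> Ty
subst s t@(TVar v)  = maybe t (subst s) (lookup v s)
subst s (TCon c ts) = TCon c (map (subst s) ts)

occurs :: String -> Ty -> Bool
occurs v (TVar w)    = v == w
occurs v (TCon _ ts) = any (occurs v) ts

-- Structural unification: same constructor, same arity, arguments unify.
unify :: Subst -> Ty -> Ty -> Maybe Subst
unify s a b = go (subst s a) (subst s b)
  where
    go (TVar v) t = bind v t
    go t (TVar v) = bind v t
    go (TCon c xs) (TCon d ys)
      | c == d && length xs == length ys = unifyAll s (zip xs ys)
      | otherwise                        = Nothing
    bind v t
      | t == TVar v = Just s
      | occurs v t  = Nothing            -- occurs check rejects infinite types
      | otherwise   = Just ((v, t) : s)  -- a variable is bound only once

unifyAll :: Subst -> [(Ty, Ty)] -> Maybe Subst
unifyAll s []            = Just s
unifyAll s ((a, b) : ps) = unify s a b >>= \s' -> unifyAll s' ps

-- Application rule: from f :: a -> t and x :: a conclude f x :: t.
applyRule :: String -> Ty -> Ty -> Maybe Ty
applyRule fresh f x = do
  s <- unify [] f (arrow x (TVar fresh))
  pure (subst s (TVar fresh))

-- Known facts, with variables renamed apart.
headT, composeT, filterT, fstT :: Ty
headT    = arrow (list (TVar "a1")) (TVar "a1")
composeT = arrow (arrow (TVar "b") (TVar "c"))
                 (arrow (arrow (TVar "a") (TVar "b")) (arrow (TVar "a") (TVar "c")))
filterT  = arrow (arrow (TVar "e") bool) (arrow (list (TVar "e")) (list (TVar "e")))
fstT     = arrow (pair (TVar "f") (TVar "g")) (TVar "f")

-- Type of ((.) head) (filter fst).
inferred :: Maybe Ty
inferred = do
  composeHead <- applyRule "r1" composeT headT
  filterFst   <- applyRule "r2" filterT fstT
  applyRule "r3" composeHead filterFst
```

Here inferred comes out as Just (arrow (list (pair bool (TVar "g"))) (pair bool (TVar "g"))), i.e. [(Bool,g)] -> (Bool,g), matching the derivation above. An ill-typed application such as applyRule "r" headT bool gives Nothing, because a list type cannot unify with Bool.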
You can do this the "technical" way, with lots of complicated unification steps. Or you can do it the "intuitive" way, just looking at the thing and thinking "OK, what have I got here? What is this expecting?" and so on.
Well, filter expects a function and a list, and returns a list. filter fst specifies a function, but there's no list given - so we're still waiting for the list input. So filter fst is taking a list and returning another list. (This is quite a common Haskell phrase, by the way.)
Next, the . operator "pipes" the output to head, which expects a list and returns one of the elements from that list. (The first one, as it happens.) So whatever filter comes up with, head gives you the first element of it. At this point, we can conclude
head . filter foobar :: [x] -> x
But what is x? Well, filter fst applies fst to every element of the list (to decide whether to keep it or throw it). So fst must be applicable to the list elements. And fst expects a 2-element tuple, and returns the first element of that tuple. Now filter is expecting fst to return a Bool, so that means the first element of the tuple must be a Bool.
Putting all that together, we conclude
head . filter fst :: [(Bool, y)] -> (Bool, y)
What is y? We don't know. We don't actually care! The above functions will work whatever it is. So that's our type signature.
In more complicated examples it can be harder to figure out what's going on. (Especially when weird class instances get involved!) But for smallish ones like this, involving common functions, you can usually just think "OK, what goes in here? What comes out there? What does this function expect?" and walk right up to the answer without too much manual algorithm-chasing.
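To see that y really can be anything, here is a small sketch that uses the same function at two different element types. The names firstMarked, exampleChar and exampleList are hypothetical:

```haskell
-- y stays fully polymorphic; only the Bool component is fixed.
firstMarked :: [(Bool, y)] -> (Bool, y)
firstMarked = head . filter fst

-- The same function used with y = Char ...
exampleChar :: (Bool, Char)
exampleChar = firstMarked [(False, 'a'), (True, 'b'), (True, 'c')]

-- ... and with y = [Int].
exampleList :: (Bool, [Int])
exampleList = firstMarked [(True, [1, 2]), (False, [])]
```

exampleChar is (True,'b') and exampleList is (True,[1,2]). Note that head raises an error on an empty list, so firstMarked fails at runtime when no pair has True as its first component; the type does not rule that out.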
Reading "Real World Haskell", on page 95 the author provides an example:
myFoldl f z xs = foldr step id xs z
where step x g a = g (f a x)
My question is: Why does this code compile? foldr takes only three arguments - but here, it is passed four: step, id, xs, z.
For example, this doesn't work (because sum expects one):
sum filter odd [1,2,3]
instead I must write:
sum $ filter odd [1,2,3]
Here's the type of foldr:
Prelude> :t foldr
foldr :: (a -> b -> b) -> b -> [a] -> b
Can we figure out how it becomes a four-argument function? Let's give it a try!
We're giving it id :: d -> d as the second parameter (b), so let's substitute that into the type:
(a -> (d -> d) -> (d -> d)) -> (d -> d) -> [a] -> (d -> d)
In Haskell, a -> a -> a is the same as a -> (a -> a), which gives us (removing the last set of parentheses):
(a -> (d -> d) -> (d -> d)) -> (d -> d) -> [a] -> d -> d
Let's simplify, by substituting e for (a -> (d -> d) -> (d -> d)) and f for (d -> d), to make it easier to read:
e -> f -> [a] -> d -> d
So we can plainly see that we've constructed a four-argument function! My head hurts.
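The substituted type can be checked by giving foldr exactly that signature and then defining myFoldl through it. This is only a sketch; foldrAtFunctions is a made-up name, and the signature given to myFoldl is the one I would expect for it rather than something quoted from the book:

```haskell
-- foldr with b instantiated to (d -> d), trailing parentheses removed.
foldrAtFunctions :: (a -> (d -> d) -> (d -> d)) -> (d -> d) -> [a] -> d -> d
foldrAtFunctions = foldr

-- The book's definition, using the four-argument view of foldr.
myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl f z xs = foldrAtFunctions step id xs z
  where
    -- step builds a function that waits for the accumulator.
    step x g acc = g (f acc x)
```

As a sanity check, myFoldl (-) 10 [1, 2, 3] gives 4, the same as foldl (-) 10 [1, 2, 3].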
Here's a simpler example of creating an n + 1-argument function from an n-arg func:
Prelude> :t id
id :: a -> a
id is a function of one argument.
Prelude> id id id id id 5
5
But I just gave it 5 args!
It's because of how polymorphic foldr is:
foldr :: (a -> b -> b) -> b -> [a] -> b
Here, we've instantiated b to a function type, let's call it c -> c, so the type of foldr specializes to (for example)
foldr :: (a -> (c -> c) -> (c -> c)) -> (c -> c) -> [a] -> c -> c
"foldr takes only three arguments"
Wrong. All functions in Haskell take exactly one argument and produce exactly one result.
foldr :: (a -> b -> b) -> b -> [a] -> b
See, foldr takes one argument (a -> b -> b), and produces 1 result: b -> [a] -> b. When you see this:
foldr step id xs z
Remember, it is just shorthand for this:
((((foldr step) id) xs) z)
This explains why this is nonsense:
sum filter odd [1,2,3]
(((sum filter) odd) [1,2,3])
sum :: Num a => [a] -> a takes a list as its input, but you gave it a function.
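The one-argument-at-a-time view can be made concrete by naming each partial application and writing down its type. This is a sketch with hypothetical names (foldrInt, withPlus, withZero, total, sumOfOdds), specialised to Int to keep the signatures short:

```haskell
-- foldr at a concrete type: it takes ONE argument, the combining function.
foldrInt :: (Int -> Int -> Int) -> Int -> [Int] -> Int
foldrInt = foldr

-- After one application we have a function still waiting for the start value.
withPlus :: Int -> [Int] -> Int
withPlus = foldrInt (+)

-- After two applications we have a function waiting for the list.
withZero :: [Int] -> Int
withZero = withPlus 0

-- After three applications we finally have a plain value.
total :: Int
total = withZero [1, 2, 3]

-- The well-typed version of the sum example, fully parenthesised:
-- sum receives a list, not the function filter.
sumOfOdds :: Int
sumOfOdds = sum ((filter odd) [1, 2, 3])
```

Here total is 6 and sumOfOdds is 4.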