Why is the type of this function (a -> a) -> a? - haskell

Why is the type of this function (a -> a) -> a?
Prelude> let y f = f (y f)
Prelude> :t y
y :: (t -> t) -> t
Shouldn't it be an infinite/recursive type?
I was going to try and put into words what I think it's type should be, but I just can't do it for some reason.
y :: (t -> t) -> ?WTFIsGoingOnOnTheRHS?
I don't get how f (y f) resolves to a value. The following makes a little more sense to me:
Prelude> let y f x = f (y f) x
Prelude> :t y
y :: ((a -> b) -> a -> b) -> a -> b
But it's still ridiculously confusing. What's going on?

Well, y has to be of type (a -> b) -> c, for some a, b and c we don't know yet; after all, it takes a function, f, and applies it to an argument, so it must be a function taking a function.
Since y f = f x (again, for some x), we know that the return type of y must be the return type of f itself. So, we can refine the type of y a bit: it must be (a -> b) -> b for some a and b we don't know yet.
To figure out what a is, we just have to look at the type of the value passed to f. It's y f, which is the expression we're trying to figure out the type of right now. We're saying that the type of y is (a -> b) -> b (for some a, b, etc.), so we can say that this application of y f must be of type b itself.
So, the type of the argument to f is b. Put it all back together, and we get (b -> b) -> b — which is, of course, the same thing as (a -> a) -> a.
Here's a more intuitive, but less precise view of things: we're saying that y f = f (y f), which we can expand to the equivalent y f = f (f (y f)), y f = f (f (f (y f))), and so on. So, we know that we can always apply another f around the whole thing, and since the "whole thing" in question is the result of applying f to an argument, f has to have the type a -> a; and since we just concluded that the whole thing is the result of applying f to an argument, the return type of y must be that of f itself — coming together, again, as (a -> a) -> a.

Just two points to add to other people's answers.
The function you're defining is usually called fix, and it is a fixed-point combinator: a function that computes the fixed point of another function. In mathematics, the fixed point of a function f is an argument x such that f x = x. This already allows you to infer that the type of fix has to be (a -> a) -> a; "function that takes a function from a to a, and returns an a."
You've called your function y, which seems to be after the Y combinator, but this is an inaccurate name: the Y combinator is one specific fixed point combinator, but not the same as the one you've defined here.
I don't get how f (y f) resolves to a value.
Well, the trick is that Haskell is a non-strict (a.k.a. "lazy") language. The calculation of f (y f) can terminate if f doesn't need to evaluate its y f argument in all cases. So, if you're defining factorial (as John L illustrates), fac (y fac) 1 evaluates to 1 without evaluating y fac.
Strict languages can't do this, so in those languages you cannot define a fixed-point combinator in this way. In those languages, the textbook fixed-point combinator is the Y combinator proper.

#ehird's done a good job of explaining the type, so I'd like to show how it can resolve to a value with some examples.
f1 :: Int -> Int
f1 _ = 5
-- expansion of y applied to f1
y f1
f1 (y f1) -- definition of y
5 -- definition of f1 (the argument is ignored)
-- here's an example that uses the argument, a factorial function
fac :: (Int -> Int) -> (Int -> Int)
fac next 1 = 1
fac next n = n * next (n-1)
y fac :: Int -> Int
fac (y fac) -- def. of y
-- at this point, further evaluation requires the next argument
-- so let's try 3
fac (y fac) 3 :: Int
3 * (y fac) 2 -- def. of fac
3 * (fac (y fac) 2) -- def. of y
3 * (2 * (y fac) 1) -- def. of fac
3 * (2 * (fac (y fac) 1) -- def. of y
3 * (2 * 1) -- def. of fac
You can follow the same steps with any function you like to see what will happen. Both of these examples converge to values, but that doesn't always happen.

Let me tell about a combinator. It's called the "fixpoint combinator" and it has the following property:
The Property: the "fixpoint combinator" takes a function f :: (a -> a) and discovers a "fixed point" x :: a of that function such that f x == x. Some implementations of the fixpoint combinator might be better or worse at "discovering", but assuming it terminates, it will produce a fixed point of the input function. Any function that satisfies The Property can be called a "fixpoint combinator".
Call this "fixpoint combinator" y. Based on what we just said, the following are true:
-- as we said, y's input is f :: a -> a, and its output is x :: a, therefore
y :: (a -> a) -> a
-- let x be the fixed point discovered by applying f to y
y f == x -- because y discovers x, a fixed point of f, per The Property
f x == x -- the behavior of a fixed point, per The Property
-- now, per substitution of "x" with "f x" in "y f == x"
y f == f x
-- again, per substitution of "x" with "y f" in the previous line
y f == f (y f)
So there you go. You have defined y in terms of the essential property of the fixpoint combinator:
y f == f (y f). Instead of assuming that y f discovers x, you can assume that x represents a divergent computation, and still come to the same conclusion (iinm).
Since your function satisfies The Property, we can conclude that it is a fixpoint combinator, and that the other properties we have stated, including the type, are applicable to your function.
This isn't exactly a solid proof, but I hope it provides additional insight.

Related

How do you determine the way to write functions so that they are compatible for composing?

Let's say we have these 2 functions
f1 :: Int -> [Int] -> Int -> [Int]
f1 _ [] _ = []
f1 x (h:t) y = (x * h + y):(f1 x t y)
f2 :: [Int] -> Int
f2 (h:t) = h
Why does (f2 . f1 1 [1..10]) 1 work, but (f2 . f1 1) [1..10] 1 doesn't work?
All functions in Haskell take exactly one argument and return one value. The argument and/or the return type could be another function.
f1 has an argument type of Int and a return type of [Int] -> Int -> [Int]. The right associativity of (->) means we don't have to explicitly write this as
f1 :: Int -> ([Int] -> (Int -> [Int]))
but can instead drop the parentheses.
(Yes, (->) is an operator. You can use :k in GHCi to see the kind, but unfortunately what you get back is more complicated than we want to explain here:
> :k (->)
(->) :: TYPE q -> TYPE r -> *
Don't worry about what TYPE q and TYPE r stand for. Suffice it to say that (->) takes two types and returns a new type, and we can assume a simpler kind like
(->) :: * -> * -> *
The kind * is the kind of an ordinary type, more frequently written as Type these days.
)
In order to compose two functions, the return type of one must match the argument type of the other. We can see this from the type of (.) itself:
(.) :: (b -> c) -> (a -> b) -> a -> c
^ ^
That's not the case with f1 and f2:
-- vvvvvvvvvvvvvvvvvvvvv
f1 :: Int -> ([Int] -> Int -> [Int])
f2 :: [Int] -> Int
-- ^^^^^
-- [Int] -> Int -> [Int] and [Int] are different types
nor with f1 1 and f2
-- vvvvvvvvvvvv
f1 1 :: [Int] -> (Int -> [Int])
f2 :: [Int] -> Int
-- ^^^^^
-- Int -> [Int] and [Int] are different types
but it is true of f1 1 [1..10] and f2:
-- vvvvv
f1 1 [1..10] :: Int -> [Int]
f2 :: [Int] -> Int
-- ^^^^^
-- [Int] and [Int] are the same type
While (->) is right-associative, function application is left-associative, which is why we can write
((f1 1) [1..10]) 1
as f1 1 [1..10] 1 instead, leading to the appearance of f1 as taking 3 arguments, rather than an expression involving 3 separate function calls. You can see the three calls more clearly if you use a let expression to name each intermediate function explicitly:
let t1 = f1 1
t2 = t1 [1..10]
t3 = t2 1
in t3
As a complement to chepner's excellent answer, here's a less formal way of looking at it (which might be more intuitive if you haven't assimilated the formalities yet).
You seem to be thinking of (f . g) arg as a special syntactic form that will pass the argument to g and then call f on the result1. In fact this isn't what happens (or at best it's a shorthand for what happens).
Remember that functions are first class in Haskell. Frequently when that is brought up it means that we are passing functions as values to other functions, or storing them in data structures. But here what we need to remember is that first-class values can be computed as the result of expressions. f . g is an expression that computes a new function, in precisely the same manner that 7 + 9 is an expression that computes a new integer.
So (f . g) arg is not passing arg to g. It's passing arg to the function that is computed by the expression f . g. And if we had multiple arguments where arg is written, as in (f . g) x y z they would all be passed to the whole function f . g. There is no way for any of them to be passed to g directly.
The . operator builds a new function out of f and g by producing a function that behaves as follows: it takes a single argument x, passes it to g and then passes the result of that to f. In Haskell syntax that is basically (f . g) x = f (g x)2. So if we try to pass multiple arguments, as in (f . g) x y z, we end up with this:
(f . g) x y z = (f (g x)) y z
You can see this doesn't end up passing more arguments to g before g's result is fed to f; rather y and z end up being passed to the final result after x has been fed all the way through g and f (this is only going to work if f itself returns a function after being applied to one argument3). A simple way to think of it is that the first argument to the composed function "flows through the composition pipeline"; any other arguments will be passed to the result, after x has "flowed through". If we wanted the extra arguments to be passed to g, we should write it this way:
(f . g x y) z = f (g x y z)
Note that this reframes the "flow"; now it's z that is feeding through the composed functions. x and y have moved to being part of the definition of one of the two composed functions (g x y is every bit as much of an "expression that computes a function" as f . g is). If you really want the emphasis to be on x "flowing through" the composition pipeline, you'll need to change the way you defined g so that it takes arguments in a different order. This is a big part of why Haskellers frequently write functions to take the "main thing being operated on" as their last argument.
With all that in mind, your question should be pretty clear now.
In (f2 . f1 1 [1..10]) 1 the outermost 1 will "flow through the pipeline"; the first stage of the pipeline is f1 1 [1..10] (so the 1 will end up as a third argument to f1).
Whereas in (f2 . f1 1) [1..10] 1, the [1..10] will "flow through the pipeline", and the outer 1 will only be passed to the final result of the whole pipeline, not to the first stage of the pipeline. And the final result of the composition pipeline is the result of f2, which is a simple Int and cannot be applied to more arguments.
1 Haskell does have syntax for "pass multiple arguments to one function, then call another function on the result of that". However it's much more pedestrian. Just write f (g x y z). If this looks simpler to you than trying to twist it into a chain of function compositions, then just write it directly.
2 (f . g) x = f (g x) is almost the literal definition of the . operator. The real definition is f . g = \x -> f (g x). This definition is equivalent in the sense that it always produces exactly the same result, but it is usually a bit more efficient.
3 And your f2 example does not return a function after being applied to one argument (it just returns an Int), so we can see that (f2 . _anything_) arg1 arg2 is definitely going to be a type error with no further analysis, no matter what other function you're composing or what the two arguments are.

Too many arguments for fmap

The first argument, that fmap is expected is a function with one argument.
fmap :: Functor f => (a -> b) -> f a -> f b
Then I tried as follow in prelude:
Prelude> x = fmap (\x y -> x * y)
As you can see, the first argument to fmap is a function, that has two arguments. Why does the compiler let it pass?
The function that I pass to fmap above has two arguments not one!
Haskell does not actually have functions with more (or less) than one argument. A "two-argument function" is really just a function that takes one argument and produces another function that takes another argument. That is, \x y -> x * y is just a syntactic short cut for \x -> \y -> x * y. This concept is known as currying.
So this should explain what's happening in your example. Your fmap will simply turn an f of numbers into an f of functions. So, for example, x [1,2,3] would produce the list [\y -> 1 * y, \y -> 2 * y, \y -> 3 * y] (a.k.a. [(1*), (2*), (3*)]).
You have defined a function. One of the fundamental aspects of functional programming that functions can be parameters, stored into variables, etc.
If we then query the type of x, we get:
Prelude> :t x
x :: (Functor f, Num a) => f a -> f (a -> a)
So x is now a function that takes as input a Functor with a applied on that function, and returns the an element of a type with the same functor, but applied with a -> a.
So you can for instance apply a list on x, like:
Prelude> :t x [1,4,2,5]
x [1,4,2,5] :: Num a => [a -> a]
So now we have a list of functions, that is equivalent to:
[\x -> 1*x, \x -> 4*x, \x -> 2*x, \x -> 5*x]

Starting with Haskell

Can someone explain to me whats going on in this function?
applyTwice :: (a -> a) -> a -> a
applyTwice f x = f (f x)
I do understand what curried functions are, this could be re-written like this:
applyTwice :: ((a -> a) -> (a -> (a)))
applyTwice f x = f (f x)
However I dont fully understand the (+3) operator and how it works. Maybe it's something really stupid but I can't figure it out.
Can someone explain step by step how the function works? Thanks =)
applyTwice :: ((a -> a) -> (a -> (a)))
applyTwice f x = f (f x)
Haskell has "operator slicing": if you omit one or both of the arguments to an operator, Haskell automatically turns it into a function for you.
Specifically, (+3) is missing the first argument (Haskell has no unary +). So, Haskell makes that expression into a function that takes the missing argument, and returns the input value plus 3:
-- all the following functions are the same
f1 x = x + 3
f2 = (+3)
f3 = \ x -> x + 3
Similarly, if you omit both arguments, Haskell turns it into a function with two (curried) arguments:
-- all the following functions are the same
g1 x y = x + y
g2 = (+)
g3 = \ x y -> x + y
From comments: note that Haskell does have unary -. So, (-n) is not an operator slice, it just evaluates the negative (same as negate n).
If you want to slice binary - the way you do +, you can use (subtract n) instead:
-- all the following functions are the same
h1 x = x - 3
h2 = subtract 3
h3 = \ x -> x - 3

Reusing patterns in pattern guards or case expressions

My Haskell project includes an expression evaluator, which for the purposes of this question can be simplified to:
data Expression a where
I :: Int -> Expression Int
B :: Bool -> Expression Bool
Add :: Expression Int -> Expression Int -> Expression Int
Mul :: Expression Int -> Expression Int -> Expression Int
Eq :: Expression Int -> Expression Int -> Expression Bool
And :: Expression Bool -> Expression Bool -> Expression Bool
Or :: Expression Bool -> Expression Bool -> Expression Bool
If :: Expression Bool -> Expression a -> Expression a -> Expression a
-- Reduces an Expression down to the simplest representation.
reduce :: Expression a -> Expression a
-- ... implementation ...
The straightforward approach to implementing this is to write a case expression to recursively evaluate and pattern match, like so:
reduce (Add x y) = case (reduce x, reduce y) of
(I x', I y') -> I $ x' + y'
(x', y') -> Add x' y'
reduce (Mul x y) = case (reduce x, reduce y) of
(I x', I y') -> I $ x' * y'
(x', y') -> Mul x' y'
reduce (And x y) = case (reduce x, reduce y) of
(B x', B y') -> B $ x' && y'
(x', y') -> And x' y'
-- ... and similarly for other cases.
To me, that definition looks somewhat awkward, so I then rewrote the definition using pattern guards, like so:
reduce (Add x y) | I x' <- reduce x
, I y' <- reduce y
= I $ x' + y'
I think this definition looks cleaner compared to the case expression, but when defining multiple patterns for different constructors, the pattern is repeated multiple times.
reduce (Add x y) | I x' <- reduce x
, I y' <- reduce y
= I $ x' + y'
reduce (Mul x y) | I x' <- reduce x
, I y' <- reduce y
= I $ x' * y'
Noting these repeated patterns, I was hoping there would be some syntax or structure that could cut down on the repetition in the pattern matching. Is there a generally accepted method to simplify these definitions?
Edit: after reviewing the pattern guards, I've realised they don't work as a drop-in replacement here. Although they provide the same result when x and y can be reduced to I _, they do not reduce any values when the pattern guards do not match. I would still like reduce to simplify subexpressions of Add et al.
One partial solution, which I've used in a similar situation, is to extract the logic into a "lifting" function that takes a normal Haskell operation and applies it to your language's values. This abstracts over the wrappping/unwrapping and resulting error handling.
The idea is to create two typeclasses for going to and from your custom type, with appropriate error handling. Then you can use these to create a liftOp function that could look like this:
liftOp :: (Extract a, Extract b, Pack c) => (a -> b -> c) ->
(Expression a -> Expression b -> Expression c)
liftOp err op a b = case res of
Nothing -> err a' b'
Just res -> pack res
where res = do a' <- extract $ reduce' a
b' <- extract $ reduce' b
return $ a' `op` b'
Then each specific case looks like this:
Mul x y -> liftOp Mul (*) x y
Which isn't too bad: it isn't overly redundant. It encompasses the information that matters: Mul gets mapped to *, and in the error case we just apply Mul again.
You would also need instances for packing and unpacking, but these are useful anyhow. One neat trick is that these can also let you embed functions in your DSL automatically, with an instance of the form (Extract a, Pack b) => Pack (a -> b).
I'm not sure this will work exactly for your example, but I hope it gives you a good starting point. You might want to wire additional error handling through the whole thing, but the good news is that most of that gets folded into the definition of pack, unpack and liftOp, so it's still pretty centralized.
I wrote up a similar solution for a related (but somewhat different) problem. It's also a way to handle going back and forth between native Haskell values and an interpreter, but the interpreter is structured differently. Some of the same ideas should still apply though!
This answer is inspired by rampion's follow-up question, which suggests the following function:
step :: Expression a -> Expression a
step x = case x of
Add (I x) (I y) -> I $ x + y
Mul (I x) (I y) -> I $ x * y
Eq (I x) (I y) -> B $ x == y
And (B x) (B y) -> B $ x && y
Or (B x) (B y) -> B $ x || y
If (B b) x y -> if b then x else y
z -> z
step looks at a single term, and reduces it if everything needed to reduce it is present. Equiped with step, we only need a way to replace a term everywhere in the expression tree. We can start by defining a way to apply a function inside every term.
{-# LANGUAGE RankNTypes #-}
emap :: (forall a. Expression a -> Expression a) -> Expression x -> Expression x
emap f x = case x of
I a -> I a
B a -> B a
Add x y -> Add (f x) (f y)
Mul x y -> Mul (f x) (f y)
Eq x y -> Eq (f x) (f y)
And x y -> And (f x) (f y)
Or x y -> Or (f x) (f y)
If x y z -> If (f x) (f y) (f z)
Now, we need to apply a function everywhere, both to the term and everywhere inside the term. There are two basic possibilities, we could apply the function to the term before applying it inside or we could apply the function afterwards.
premap :: (forall a. Expression a -> Expression a) -> Expression x -> Expression x
premap f = emap (premap f) . f
postmap :: (forall a. Expression a -> Expression a) -> Expression x -> Expression x
postmap f = f . emap (postmap f)
This gives us two possibilities for how to use step, which I will call shorten and reduce.
shorten = premap step
reduce = postmap step
These behave a little differently. shorten removes the innermost level of terms, replacing them with literals, shortening the height of the expression tree by one. reduce completely evaluates the expression tree to a literal. Here's the result of iterating each of these on the same input
"shorten"
If (And (B True) (Or (B False) (B True))) (Add (I 1) (Mul (I 2) (I 3))) (I 0)
If (And (B True) (B True)) (Add (I 1) (I 6)) (I 0)
If (B True) (I 7) (I 0)
I 7
"reduce"
If (And (B True) (Or (B False) (B True))) (Add (I 1) (Mul (I 2) (I 3))) (I 0)
I 7
Partial reduction
Your question implies that you sometimes expect that expressions can't be reduced completely. I'll extend your example to include something to demonstrate this case, by adding a variable, Var.
data Expression a where
Var :: Expression Int
...
We will need to add support for Var to emap:
emap f x = case x of
Var -> Var
...
bind will replace the variable, and evaluateFor performs a complete evaluation, traversing the expression only once.
bind :: Int -> Expression a -> Expression a
bind a x = case x of
Var -> I a
z -> z
evaluateFor :: Int -> Expression a -> Expression a
evaluateFor a = postmap (step . bind a)
Now reduce iterated on an example containing a variable produces the following output
"reduce"
If (And (B True) (Or (B False) (B True))) (Add (I 1) (Mul Var (I 3))) (I 0)
Add (I 1) (Mul Var (I 3))
If the output expression from the reduction is evaluated for a specific value of Var, we can reduce the expression all the way to a literal.
"evaluateFor 5"
Add (I 1) (Mul Var (I 3))
I 16
Applicative
emap can instead be written in terms of an Applicative Functor, and postmap can be made into a generic piece of code suitable for other data types than expressions. How to do so is described in this answer to rampion's follow-up question.

Y Combinator in Haskell

Is it possible to write the Y Combinator in Haskell?
It seems like it would have an infinitely recursive type.
Y :: f -> b -> c
where f :: (f -> b -> c)
or something. Even a simple slightly factored factorial
factMaker _ 0 = 1
factMaker fn n = n * ((fn fn) (n -1)
{- to be called as
(factMaker factMaker) 5
-}
fails with "Occurs check: cannot construct the infinite type: t = t -> t2 -> t1"
(The Y combinator looks like this
(define Y
(lambda (X)
((lambda (procedure)
(X (lambda (arg) ((procedure procedure) arg))))
(lambda (procedure)
(X (lambda (arg) ((procedure procedure) arg)))))))
in scheme)
Or, more succinctly as
(λ (f) ((λ (x) (f (λ (a) ((x x) a))))
(λ (x) (f (λ (a) ((x x) a))))))
For the applicative order
And
(λ (f) ((λ (x) (f (x x)))
(λ (x) (f (x x)))))
Which is just a eta contraction away for the lazy version.
If you prefer short variable names.
Here's a non-recursive definition of the y-combinator in haskell:
newtype Mu a = Mu (Mu a -> a)
y f = (\h -> h $ Mu h) (\x -> f . (\(Mu g) -> g) x $ x)
hat tip
The Y combinator can't be typed using Hindley-Milner types, the polymorphic lambda calculus on which Haskell's type system is based. You can prove this by appeal to the rules of the type system.
I don't know if it's possible to type the Y combinator by giving it a higher-rank type. It would surprise me, but I don't have a proof that it's not possible. (The key would be to identify a suitably polymorphic type for the lambda-bound x.)
If you want a fixed-point operator in Haskell, you can define one very easily because in Haskell, let-binding has fixed-point semantics:
fix :: (a -> a) -> a
fix f = f (fix f)
You can use this in the usual way to define functions and even some finite or infinite data structures.
It is also possible to use functions on recursive types to implement fixed points.
If you're interested in programming with fixed points, you want to read Bruce McAdam's technical report That About Wraps it Up.
The canonical definition of the Y combinator is as follows:
y = \f -> (\x -> f (x x)) (\x -> f (x x))
But it doesn't type check in Haskell because of the x x, since it would require an infinite type:
x :: a -> b -- x is a function
x :: a -- x is applied to x
--------------------------------
a = a -> b -- infinite type
If the type system were to allow such recursive types, it would make type checking undecidable (prone to infinite loops).
But the Y combinator will work if you force it to typecheck, e.g. by using unsafeCoerce :: a -> b:
import Unsafe.Coerce
y :: (a -> a) -> a
y = \f -> (\x -> f (unsafeCoerce x x)) (\x -> f (unsafeCoerce x x))
main = putStrLn $ y ("circular reasoning works because " ++)
This is unsafe (obviously). rampion's answer demonstrates a safer way to write a fixpoint combinator in Haskell without using recursion.
Oh
this wiki page and
This Stack Overflow answer seem to answer my question.
I will write up more of an explanation later.
Now, I've found something interesting about that Mu type. Consider S = Mu Bool.
data S = S (S -> Bool)
If one treats S as a set and that equals sign as isomorphism, then the equation becomes
S ⇋ S -> Bool ⇋ Powerset(S)
So S is the set of sets that are isomorphic to their powerset!
But we know from Cantor's diagonal argument that the cardinality of Powerset(S) is always strictly greater than the cardinality of S, so they are never isomorphic.
I think this is why you can now define a fixed point operator, even though you can't without one.
Just to make rampion's code more readable:
-- Mu :: (Mu a -> a) -> Mu a
newtype Mu a = Mu (Mu a -> a)
w :: (Mu a -> a) -> a
w h = h (Mu h)
y :: (a -> a) -> a
y f = w (\(Mu x) -> f (w x))
-- y f = f . y f
in which w stands for the omega combinator w = \x -> x x, and y stands for the y combinator y = \f -> w . (f w).

Resources