Y Combinator in Haskell - haskell

Is it possible to write the Y Combinator in Haskell?
It seems like it would have an infinitely recursive type.
Y :: f -> b -> c
where f :: (f -> b -> c)
or something. Even a simple slightly factored factorial
factMaker _ 0 = 1
factMaker fn n = n * ((fn fn) (n -1)
{- to be called as
(factMaker factMaker) 5
-}
fails with "Occurs check: cannot construct the infinite type: t = t -> t2 -> t1"
(The Y combinator looks like this
(define Y
(lambda (X)
((lambda (procedure)
(X (lambda (arg) ((procedure procedure) arg))))
(lambda (procedure)
(X (lambda (arg) ((procedure procedure) arg)))))))
in scheme)
Or, more succinctly as
(λ (f) ((λ (x) (f (λ (a) ((x x) a))))
(λ (x) (f (λ (a) ((x x) a))))))
For the applicative order
And
(λ (f) ((λ (x) (f (x x)))
(λ (x) (f (x x)))))
Which is just a eta contraction away for the lazy version.
If you prefer short variable names.

Here's a non-recursive definition of the y-combinator in haskell:
newtype Mu a = Mu (Mu a -> a)
y f = (\h -> h $ Mu h) (\x -> f . (\(Mu g) -> g) x $ x)
hat tip

The Y combinator can't be typed using Hindley-Milner types, the polymorphic lambda calculus on which Haskell's type system is based. You can prove this by appeal to the rules of the type system.
I don't know if it's possible to type the Y combinator by giving it a higher-rank type. It would surprise me, but I don't have a proof that it's not possible. (The key would be to identify a suitably polymorphic type for the lambda-bound x.)
If you want a fixed-point operator in Haskell, you can define one very easily because in Haskell, let-binding has fixed-point semantics:
fix :: (a -> a) -> a
fix f = f (fix f)
You can use this in the usual way to define functions and even some finite or infinite data structures.
It is also possible to use functions on recursive types to implement fixed points.
If you're interested in programming with fixed points, you want to read Bruce McAdam's technical report That About Wraps it Up.

The canonical definition of the Y combinator is as follows:
y = \f -> (\x -> f (x x)) (\x -> f (x x))
But it doesn't type check in Haskell because of the x x, since it would require an infinite type:
x :: a -> b -- x is a function
x :: a -- x is applied to x
--------------------------------
a = a -> b -- infinite type
If the type system were to allow such recursive types, it would make type checking undecidable (prone to infinite loops).
But the Y combinator will work if you force it to typecheck, e.g. by using unsafeCoerce :: a -> b:
import Unsafe.Coerce
y :: (a -> a) -> a
y = \f -> (\x -> f (unsafeCoerce x x)) (\x -> f (unsafeCoerce x x))
main = putStrLn $ y ("circular reasoning works because " ++)
This is unsafe (obviously). rampion's answer demonstrates a safer way to write a fixpoint combinator in Haskell without using recursion.

Oh
this wiki page and
This Stack Overflow answer seem to answer my question.
I will write up more of an explanation later.
Now, I've found something interesting about that Mu type. Consider S = Mu Bool.
data S = S (S -> Bool)
If one treats S as a set and that equals sign as isomorphism, then the equation becomes
S ⇋ S -> Bool ⇋ Powerset(S)
So S is the set of sets that are isomorphic to their powerset!
But we know from Cantor's diagonal argument that the cardinality of Powerset(S) is always strictly greater than the cardinality of S, so they are never isomorphic.
I think this is why you can now define a fixed point operator, even though you can't without one.

Just to make rampion's code more readable:
-- Mu :: (Mu a -> a) -> Mu a
newtype Mu a = Mu (Mu a -> a)
w :: (Mu a -> a) -> a
w h = h (Mu h)
y :: (a -> a) -> a
y f = w (\(Mu x) -> f (w x))
-- y f = f . y f
in which w stands for the omega combinator w = \x -> x x, and y stands for the y combinator y = \f -> w . (f w).

Related

Applicative functor evaluation is not clear to me

I am currently reading Learn You a Haskell for Great Good! and am stumbling on the explanation for the evaluation of a certain code block. I've read the explanations several times and am starting to doubt if even the author understands what this piece of code is doing.
ghci> (+) <$> (+3) <*> (*100) $ 5
508
An applicative functor applies a function in some context to a value in some context to get some result in some context. I have spent a few hours studying this code block and have come up with a few explanations for how this expression is evaluated, and none of them are satisfactory. I understand that (5+3)+(5*100) is 508, but the problem is getting to this expression. Does anyone have a clear explanation for this piece of code?
The other two answers have given the detail of how this is calculated - but I thought I might chime in with a more "intuitive" answer to explain how, without going through a detailed calculation, one can "see" that the result must be 508.
As you implied, every Applicative (in fact, even every Functor) can be viewed as a particular kind of "context" which holds values of a given type. As simple examples:
Maybe a is a context in which a value of type a might exist, but might not (usually the result of a computation which may fail for some reason)
[a] is a context which can hold zero or more values of type a, with no upper limit on the number - representing all possible outcomes of a particular computation
IO a is a context in which a value of type a is available as a result of interacting with "the outside world" in some way. (OK that one isn't so simple...)
And, relevant to this example:
r -> a is a context in which a value of type a is available, but its particular value is not yet known, because it depends on some (as yet unknown) value of type r.
The Applicative methods can be very well understood on the basis of values in such contexts. pure embeds an "ordinary value" in a "default context" in which it behaves as closely as possible in that context to a "context-free" one. I won't go through this for each of the 4 examples above (most of them are very obvious), but I will note that for functions, pure = const - that is, a "pure value" a is represented by the function which always produces a no matter what the source value.
Rather than dwell on how <*> can best be described using the "context" metaphor though, I want to dwell on the particular expression:
f <$> a <*> b
where f is a function between 2 "pure values" and a and b are "values in a context". This expression in fact has a synonym as a function: liftA2. Although using the liftA2 function is generally considered less idiomatic than the "applicative style" using <$> and <*>, the name emphasies that the idea is to "lift" a function on "ordinary values" to one on "values in a context". And when thought of like this, I think it is usually very intuitive what this does, given a particular "context" (ie. a particular Applicative instance).
So the expression:
(+) <$> a <*> b
for values a and b of type say f Int for an Applicative f, behaves as follows for different instances f:
if f = Maybe, then the result, if a and b are both Just values, is to add up the underlying values and wrap them in a Just. If either a or b is Nothing, then the whole expression is Nothing.
if f = [] (the list instance) then the above expression is a list containing all sums of the form a' + b' where a' is in a and b' is in b.
if f = IO, then the above expression is an IO action that performs all the I/O effects of a followed by those of b, and results in the sum of the Ints produced by those two actions.
So what, finally, does it do if f is the function instance? Since a and b are both functions describing how to get a given Int given an arbitrary (Int) input, it is natural that lifting the (+) function over them should be the function that, given an input, gets the result of both the a and b functions, and then adds the results.
And that is, of course, what it does - and the explicit route by which it does that has been very ably mapped out by the other answers. But the reason why it works out like that - indeed, the very reason we have the instance that f <*> g = \x -> f x (g x), which might otherwise seem rather arbitrary (although in actual fact it's one of the very few things, if not the only thing, that will type-check), is so that the instance matches the semantics of "values which depend on some as-yet-unknown other value, according to the given function". And in general, I would say it's often better to think "at a high level" like this than to be forced to go down to the low-level details of exactly how computations are performed. (Although I certainly don't want to downplay the importance of also being able to do the latter.)
[Actually, from a philosophical point of view, it might be more accurate to say that the definition is as it is just because it's the "natural" definition that type-checks, and that it's just happy coincidence that the instance then takes on such a nice "meaning". Mathematics is of course full of just such happy "coincidences" which turn out to have very deep reasons behind them.]
It is using the applicative instance for functions. Your code
(+) <$> (+3) <*> (*100) $ 5
is evaluated as
( (\a->\b->a+b) <$> (\c->c+3) <*> (\d->d*100) ) 5 -- f <$> g
( (\x -> (\a->\b->a+b) ((\c->c+3) x)) <*> (\d->d*100) ) 5 -- \x -> f (g x)
( (\x -> (\a->\b->a+b) (x+3)) <*> (\d->d*100) ) 5
( (\x -> \b -> (x+3)+b) <*> (\d->d*100) ) 5
( (\x->\b->(x+3)+b) <*> (\d->d*100) ) 5 -- f <*> g
(\y -> ((\x->\b->(x+3)+b) y) ((\d->d*100) y)) 5 -- \y -> (f y) (g y)
(\y -> (\b->(y+3)+b) (y*100)) 5
(\y -> (y+3)+(y*100)) 5
(5+3)+(5*100)
where <$> is fmap or just function composition ., and <*> is ap if you know how it behaves on monads.
Let us first take a look how fmap and (<*>) are defined for a function:
instance Functor ((->) r) where
fmap = (.)
instance Applicative ((->) a) where
pure = const
(<*>) f g x = f x (g x)
liftA2 q f g x = q (f x) (g x)
The expression we aim to evaluate is:
(+) <$> (+3) <*> (*100) $ 5
or more verbose:
((+) <$> (+3)) <*> (*100) $ 5
If we thus evaluate (<$>), which is an infix synonym for fmap, we thus see that this is equal to:
(+) . (+3)
so that means our expression is equivalent to:
((+) . (+3)) <*> (*100) $ 5
Next we can apply the sequential application. Here f is thus equal to (+) . (+3) and g is (*100). This thus means that we construct a function that looks like:
\x -> ((+) . (+3)) x ((*100) x)
We can now simplify this and rewrite this into:
\x -> ((+) (x+3)) ((*100) x)
and then rewrite it to:
\x -> (+) (x+3) ((*100) x)
We thus have constructed a function that looks like:
\x -> (x+3) + 100 * x
or simpler:
\x -> 101 * x + 3
If we then calculate:
(\x -> 101*x + 3) 5
then we of course obtain:
101 * 5 + 3
and thus:
505 + 3
which is the expected:
508
For any applicative,
a <$> b <*> c = liftA2 a b c
For functions,
liftA2 a b c x
= a (b x) (c x) -- by definition;
= (a . b) x (c x)
= ((a <$> b) <*> c) x
Thus
(+) <$> (+3) <*> (*100) $ 5
=
liftA2 (+) (+3) (*100) 5
=
(+) ((+3) 5) ((*100) 5)
=
(5+3) + (5*100)
(the long version of this answer follows.)
Pure math has no time. Pure Haskell has no time. Speaking in verbs ("applicative functor applies" etc.) can be confusing ("applies... when?...").
Instead, (<*>) is a combinator which combines a "computation" (denoted by an applicative functor) carrying a function (in the context of that type of computations) and a "computation" of the same type, carrying a value (in like context), into one combined "computation" that carries out the application of that function to that value (in such context).
"Computation" is used to contrast it with a pure Haskell "calculations" (after Philip Wadler's "Calculating is better than Scheming" paper, itself referring to David Turner's Kent Recursive Calculator language, one of predecessors of Miranda, the (main) predecessor of Haskell).
"Computations" might or might not be pure themselves, that's an orthogonal issue. But mainly what it means, is that "computations" embody a generalized function call protocol. They might "do" something in addition to / as part of / carrying out the application of a function to its argument. Or in types,
( $ ) :: (a -> b) -> a -> b
(<$>) :: (a -> b) -> f a -> f b
(<*>) :: f (a -> b) -> f a -> f b
(=<<) :: (a -> f b) -> f a -> f b
With functions, the context is application (another one), and to recover the value -- be it a function or an argument -- the application to a common argument is to be performed.
(bear with me, we're almost there).
The pattern a <$> b <*> c is also expressible as liftA2 a b c. And so, the "functions" applicative functor "computation" type is defined by
liftA2 h x y s = let x' = x s -- embellished application of h to x and y
y' = y s in -- in context of functions, or Reader
h x' y'
-- liftA2 h x y = let x' = x -- non-embellished application, or Identity
-- y' = y in
-- h x' y'
-- liftA2 h x y s = let (x',s') = x s -- embellished application of h to x and y
-- (y',s'') = y s' in -- in context of
-- (h x' y', s'') -- state-passing computations, or State
-- liftA2 h x y = let (x',w) = x -- embellished application of h to x and y
-- (y',w') = y in -- in context of
-- (h x' y', w++w') -- logging computations, or Writer
-- liftA2 h x y = [h x' y' | -- embellished application of h to x and y
-- x' <- x, -- in context of
-- y' <- y ] -- nondeterministic computations, or List
-- ( and for Monads we define `liftBind h x k =` and replace `y` with `k x'`
-- in the bodies of the above combinators; then liftA2 becomes liftBind: )
-- liftA2 :: (a -> b -> c) -> f a -> f b -> f c
-- liftBind :: (a -> b -> c) -> f a -> (a -> f b) -> f c
-- (>>=) = liftBind (\a b -> b) :: f a -> (a -> f b) -> f b
And in fact all the above snippets can be just written with ApplicativeDo as liftA2 h x y = do { x' <- x ; y' <- y ; pure (h x' y') } or even more intuitively as
liftA2 h x y = [h x' y' | x' <- x, y' <- y], with Monad Comprehensions, since all the above computation types are monads as well as applicative functors. This shows by the way that (<*>) = liftA2 ($), which one might find illuminating as well.
Indeed,
> :t let liftA2 h x y r = h (x r) (y r) in liftA2
:: (a -> b -> c) -> (t -> a) -> (t -> b) -> (t -> c)
> :t liftA2 -- the built-in one
liftA2 :: Applicative f => (a -> b -> c) -> f a -> f b -> f c
i.e. the types match when we take f a ~ (t -> a) ~ (->) t a, i.e. f ~ (->) t.
And so, we're already there:
(+) <$> (+3) <*> (*100) $ 5
=
liftA2 (+) (+3) (*100) 5
=
(+) ((+3) 5) ((*100) 5)
=
(+) (5+3) (5*100)
=
(5+3) + (5*100)
It's just how liftA2 is defined for this type, Applicative ((->) t) => ...:
instance Applicative ((->) t) where
pure x t = x
liftA2 h x y t = h (x t) (y t)
There's no need to define (<*>). The source code says:
Minimal complete definition
pure, ((<*>) | liftA2)
So now you've been wanting to ask for a long time, why is it that a <$> b <*> c is equivalent to liftA2 a b c?
The short answer is, it just is. One can be defined in terms of the other -- i.e. (<*>) can be defined via liftA2,
g <*> x = liftA2 id g x -- i.e. (<*>) = liftA2 id = liftA2 ($)
-- (g <*> x) t = liftA2 id g x t
-- = id (g t) (x t)
-- = (id . g) t (x t) -- = (id <$> g <*> x) t
-- = g t (x t)
(which is exactly as it is defined in the source),
and it is a law that every Applicative Functor must follow, that h <$> g = pure h <*> g.
Lastly,
liftA2 h g x == pure h <*> g <*> x
-- h g x == (h g) x
because <*> associates to the left: it is infixl 4 <*>.

Notational confusion in a basic type declaration exercise

I'm going through the new book Haskell Programming from First Principles. It seems decent, but I feel that there are some confusing holes in the explanations. I apologize if I'm missing something basic.
The last problem in chapter 5 is to fill in the ??? below so that things make sense:
munge :: (x -> y) -> (y -> (w, z)) -> x -> w
munge = ???
The solution which was explained to me (after much head-scratching) goes:
g :: y -> (w, z)
g = undefined
f :: x -> y
f = undefined
munge :: (x -> y) -> (y -> (w, z)) -> x -> w
munge g f v = fst (g (f v))
I'm getting hung up on this example in two ways.
First, it seems like the munge function ought to take a function as input which takes x -> y. But the way munge is defined, it seems like we supply an additional argument v to the function f first. But if f :: x -> y, then won't the expression f v be of type just y instead of x -> y?
Second, I'm struggling to understand why the x appears in the second-to-last position in the type declaration. At that point I feel like the logical next piece after the (y -> (w,x)) step should just be w, since at that stage the function g is being applied to fst and w ought to be the type of what fst returns. I can feel that I'm close, but can't quite close the gap.
Clearly I'm not understanding the notation correctly. Can anyone set me straight?
EDIT: Ok, here is a clarifying question to the second part. Is it possible to revise the munge function so that it has the following type (i.e. original type with second-to-last x application omitted)? If so what would it look like?
munge :: (x -> y) -> (y -> (w, z)) -> w
The answer is incorrect and ill-typed. f and g should be swapped:
munge :: (x -> y) -> (y -> (w, z)) -> x -> w
munge g f v = fst (f (g v))
I'm not sure if that clears up your confusion.
EDIT In case it's interesting, here are more equivalent ways of writing this function and its type:
-- notice parens in type signature; `->` associates right
munge :: (x -> y) -> ((y -> (w, z)) -> (x -> w))
munge g f v = -- omitted
-- type signature omitted
munge boop _plort zOWY = fst (_plort (boop zOWY))
munge g f = fst . f . g
munge g = \f v -> fst . f . g $ v
-- don't do this please
munge = ((fst .) .) . (.)
EDIT2 It might be helpful to play around with this in GHCi, asking the inferred type of different expressions:
Prelude> let munge g f v = fst (f (g v))
Prelude> :t munge
munge :: (t1 -> t) -> (t -> (a, b)) -> t1 -> a
Prelude> :t munge head
munge head :: (t -> (a, b)) -> [t] -> a
Prelude> :t munge head (\x-> (x, not x))
munge head (\x-> (x, not x)) :: [Bool] -> Bool
Prelude> :t munge ((+1) . fst . snd . head)
munge ((+1) . fst . snd . head)
:: Num t => (t -> (a, b)) -> [(a1, (t, b1))] -> a
The solution, confusingly, is using the same variables f and g for two different things: as global names for two functions, and as parameter names in defining munge. Making a change of variable should make it clearer:
g :: y -> (w, z)
g = undefined
f :: x -> y
f = undefined
munge :: (x -> y) -> (y -> (w, z)) -> x -> w
munge f1 f2 v = fst (f2 (f1 v)) -- fst . f2 . f1 $ v
Then you would call munge on f and g will thing like
munge f g someArgumentForF
Inside munge, f (called f1) is first applied to someArgumentForF (called v) to get a value that can be passed to g (called f2). This produces a tuple, and applying fst to the tuple returns the value of type w needed as the final result.

Reusing patterns in pattern guards or case expressions

My Haskell project includes an expression evaluator, which for the purposes of this question can be simplified to:
data Expression a where
I :: Int -> Expression Int
B :: Bool -> Expression Bool
Add :: Expression Int -> Expression Int -> Expression Int
Mul :: Expression Int -> Expression Int -> Expression Int
Eq :: Expression Int -> Expression Int -> Expression Bool
And :: Expression Bool -> Expression Bool -> Expression Bool
Or :: Expression Bool -> Expression Bool -> Expression Bool
If :: Expression Bool -> Expression a -> Expression a -> Expression a
-- Reduces an Expression down to the simplest representation.
reduce :: Expression a -> Expression a
-- ... implementation ...
The straightforward approach to implementing this is to write a case expression to recursively evaluate and pattern match, like so:
reduce (Add x y) = case (reduce x, reduce y) of
(I x', I y') -> I $ x' + y'
(x', y') -> Add x' y'
reduce (Mul x y) = case (reduce x, reduce y) of
(I x', I y') -> I $ x' * y'
(x', y') -> Mul x' y'
reduce (And x y) = case (reduce x, reduce y) of
(B x', B y') -> B $ x' && y'
(x', y') -> And x' y'
-- ... and similarly for other cases.
To me, that definition looks somewhat awkward, so I then rewrote the definition using pattern guards, like so:
reduce (Add x y) | I x' <- reduce x
, I y' <- reduce y
= I $ x' + y'
I think this definition looks cleaner compared to the case expression, but when defining multiple patterns for different constructors, the pattern is repeated multiple times.
reduce (Add x y) | I x' <- reduce x
, I y' <- reduce y
= I $ x' + y'
reduce (Mul x y) | I x' <- reduce x
, I y' <- reduce y
= I $ x' * y'
Noting these repeated patterns, I was hoping there would be some syntax or structure that could cut down on the repetition in the pattern matching. Is there a generally accepted method to simplify these definitions?
Edit: after reviewing the pattern guards, I've realised they don't work as a drop-in replacement here. Although they provide the same result when x and y can be reduced to I _, they do not reduce any values when the pattern guards do not match. I would still like reduce to simplify subexpressions of Add et al.
One partial solution, which I've used in a similar situation, is to extract the logic into a "lifting" function that takes a normal Haskell operation and applies it to your language's values. This abstracts over the wrappping/unwrapping and resulting error handling.
The idea is to create two typeclasses for going to and from your custom type, with appropriate error handling. Then you can use these to create a liftOp function that could look like this:
liftOp :: (Extract a, Extract b, Pack c) => (a -> b -> c) ->
(Expression a -> Expression b -> Expression c)
liftOp err op a b = case res of
Nothing -> err a' b'
Just res -> pack res
where res = do a' <- extract $ reduce' a
b' <- extract $ reduce' b
return $ a' `op` b'
Then each specific case looks like this:
Mul x y -> liftOp Mul (*) x y
Which isn't too bad: it isn't overly redundant. It encompasses the information that matters: Mul gets mapped to *, and in the error case we just apply Mul again.
You would also need instances for packing and unpacking, but these are useful anyhow. One neat trick is that these can also let you embed functions in your DSL automatically, with an instance of the form (Extract a, Pack b) => Pack (a -> b).
I'm not sure this will work exactly for your example, but I hope it gives you a good starting point. You might want to wire additional error handling through the whole thing, but the good news is that most of that gets folded into the definition of pack, unpack and liftOp, so it's still pretty centralized.
I wrote up a similar solution for a related (but somewhat different) problem. It's also a way to handle going back and forth between native Haskell values and an interpreter, but the interpreter is structured differently. Some of the same ideas should still apply though!
This answer is inspired by rampion's follow-up question, which suggests the following function:
step :: Expression a -> Expression a
step x = case x of
Add (I x) (I y) -> I $ x + y
Mul (I x) (I y) -> I $ x * y
Eq (I x) (I y) -> B $ x == y
And (B x) (B y) -> B $ x && y
Or (B x) (B y) -> B $ x || y
If (B b) x y -> if b then x else y
z -> z
step looks at a single term, and reduces it if everything needed to reduce it is present. Equiped with step, we only need a way to replace a term everywhere in the expression tree. We can start by defining a way to apply a function inside every term.
{-# LANGUAGE RankNTypes #-}
emap :: (forall a. Expression a -> Expression a) -> Expression x -> Expression x
emap f x = case x of
I a -> I a
B a -> B a
Add x y -> Add (f x) (f y)
Mul x y -> Mul (f x) (f y)
Eq x y -> Eq (f x) (f y)
And x y -> And (f x) (f y)
Or x y -> Or (f x) (f y)
If x y z -> If (f x) (f y) (f z)
Now, we need to apply a function everywhere, both to the term and everywhere inside the term. There are two basic possibilities, we could apply the function to the term before applying it inside or we could apply the function afterwards.
premap :: (forall a. Expression a -> Expression a) -> Expression x -> Expression x
premap f = emap (premap f) . f
postmap :: (forall a. Expression a -> Expression a) -> Expression x -> Expression x
postmap f = f . emap (postmap f)
This gives us two possibilities for how to use step, which I will call shorten and reduce.
shorten = premap step
reduce = postmap step
These behave a little differently. shorten removes the innermost level of terms, replacing them with literals, shortening the height of the expression tree by one. reduce completely evaluates the expression tree to a literal. Here's the result of iterating each of these on the same input
"shorten"
If (And (B True) (Or (B False) (B True))) (Add (I 1) (Mul (I 2) (I 3))) (I 0)
If (And (B True) (B True)) (Add (I 1) (I 6)) (I 0)
If (B True) (I 7) (I 0)
I 7
"reduce"
If (And (B True) (Or (B False) (B True))) (Add (I 1) (Mul (I 2) (I 3))) (I 0)
I 7
Partial reduction
Your question implies that you sometimes expect that expressions can't be reduced completely. I'll extend your example to include something to demonstrate this case, by adding a variable, Var.
data Expression a where
Var :: Expression Int
...
We will need to add support for Var to emap:
emap f x = case x of
Var -> Var
...
bind will replace the variable, and evaluateFor performs a complete evaluation, traversing the expression only once.
bind :: Int -> Expression a -> Expression a
bind a x = case x of
Var -> I a
z -> z
evaluateFor :: Int -> Expression a -> Expression a
evaluateFor a = postmap (step . bind a)
Now reduce iterated on an example containing a variable produces the following output
"reduce"
If (And (B True) (Or (B False) (B True))) (Add (I 1) (Mul Var (I 3))) (I 0)
Add (I 1) (Mul Var (I 3))
If the output expression from the reduction is evaluated for a specific value of Var, we can reduce the expression all the way to a literal.
"evaluateFor 5"
Add (I 1) (Mul Var (I 3))
I 16
Applicative
emap can instead be written in terms of an Applicative Functor, and postmap can be made into a generic piece of code suitable for other data types than expressions. How to do so is described in this answer to rampion's follow-up question.

Is there a way to elegantly represent this pattern in Haskell?

Mind the pure function below, in an imperative language:
def foo(x,y):
x = f(x) if a(x)
if c(x):
x = g(x)
else:
x = h(x)
x = f(x)
y = f(y) if a(y)
x = g(x) if b(y)
return [x,y]
That function represents a style where you have to incrementally update variables. It can be avoided in most cases, but there are situations where that pattern is unavoidable - for example, writing a cooking procedure for a robot, which inherently requires a series of steps and decisions. Now, imagine we were trying to represent foo in Haskell.
foo x0 y0 =
let x1 = if a x0 then f x0 else x0 in
let x2 = if c x1 then g x1 else h x1 in
let x3 = f x2 in
let y1 = if a y0 then f y0 else y0 in
let x4 = if b y1 then g x3 else x3 in
[x4,y1]
That code works, but it is too complicated and error prone due to the need for manually managing the numeric tags. Notice that, after x1 is set, x0's value should never be used again, but it still can. If you accidentally use it, that will be an undetected error.
I've managed to solve this problem using the State monad:
fooSt x y = execState (do
(x,y) <- get
when (a x) (put (f x, y))
(x,y) <- get
if c x
then put (g x, y)
else put (h x, y)
(x,y) <- get
put (f x, y)
(x,y) <- get
when (a y) (put (x, f y))
(x,y) <- get
when (b y) (put (g x, x))) (x,y)
This way, need for tag-tracking goes away, as well as the risk of accidentally using an outdated variable. But now the code is verbose and much harder to understand, mainly due to the repetition of (x,y) <- get.
So: what is a more readable, elegant and safe way to express this pattern?
Full code for testing.
Your goals
While the direct transformation of imperative code would usually lead to the ST monad and STRef, lets think about what you actually want to do:
You want to manipulate values conditionally.
You want to return that value.
You want to sequence the steps of your manipulation.
Requirements
Now this indeed looks first like the ST monad. However, if we follow the simple monad laws, together with do notation, we see that
do
x <- return $ if somePredicate x then g x
else h x
x <- return $ if someOtherPredicate x then a x
else b x
is exactly what you want. Since you need only the most basic functions of a monad (return and >>=), you can use the simplest:
The Identity monad
foo x y = runIdentity $ do
x <- return $ if a x then f x
else x
x <- return $ if c x then g x
else h x
x <- return $ f x
y <- return $ if a x then f y
else y
x <- return $ if b y then g x
else y
return (x,y)
Note that you cannot use let x = if a x then f x else x, because in this case the x would be the same on both sides, whereas
x <- return $ if a x then f x
else x
is the same as
(return $ if a x then (f x) else x) >>= \x -> ...
and the x in the if expression is clearly not the same as the resulting one, which is going to be used in the lambda on the right hand side.
Helpers
In order to make this more clear, you can add helpers like
condM :: Monad m => Bool -> a -> a -> m a
condM p a b = return $ if p then a else b
to get an even more concise version:
foo x y = runIdentity $ do
x <- condM (a x) (f x) x
x <- fmap f $ condM (c x) (g x) (h x)
y <- condM (a y) (f y) y
x <- condM (b y) (g x) x
return (x , y)
Ternary craziness
And while we're up to it, lets crank up the craziness and introduce a ternary operator:
(?) :: Bool -> (a, a) -> a
b ? ie = if b then fst ie else snd ie
(??) :: Monad m => Bool -> (a, a) -> m a
(??) p = return . (?) p
(#) :: a -> a -> (a, a)
(#) = (,)
infixr 2 ??
infixr 2 #
infixr 2 ?
foo x y = runIdentity $ do
x <- a x ?? f x # x
x <- fmap f $ c x ?? g x # h x
y <- a y ?? f y # y
x <- b y ?? g x # x
return (x , y)
But the bottomline is, that the Identity monad has everything you need for this task.
Imperative or non-imperative
One might argue whether this style is imperative. It's definitely a sequence of actions. But there's no state, unless you count the bound variables. However, then a pack of let … in … declarations also gives an implicit sequence: you expect the first let to bind first.
Using Identity is purely functional
Either way, the code above doesn't introduce mutability. x doesn't get modified, instead you have a new x or y shadowing the last one. This gets clear if you desugar the do expression as noted above:
foo x y = runIdentity $
a x ?? f x # x >>= \x ->
c x ?? g x # h x >>= \x ->
return (f x) >>= \x ->
a y ?? f y # y >>= \y ->
b y ?? g x # x >>= \x ->
return (x , y)
Getting rid of the simplest monad
However, if we would use (?) on the left hand side and remove the returns, we could replace (>>=) :: m a -> (a -> m b) -> m b) by something with type a -> (a -> b) -> b. This just happens to be flip ($). We end up with:
($>) :: a -> (a -> b) -> b
($>) = flip ($)
infixr 0 $> -- same infix as ($)
foo x y = a x ? f x # x $> \x ->
c x ? g x # h x $> \x ->
f x $> \x ->
a y ? f y # y $> \y ->
b y ? g x # x $> \x ->
(x, y)
This is very similar to the desugared do expression above. Note that any usage of Identity can be transformed into this style, and vice-versa.
The problem you state looks like a nice application for arrows:
import Control.Arrow
if' :: (a -> Bool) -> (a -> a) -> (a -> a) -> a -> a
if' p f g x = if p x then f x else g x
foo2 :: (Int,Int) -> (Int,Int)
foo2 = first (if' c g h . if' a f id) >>>
first f >>>
second (if' a f id) >>>
(\(x,y) -> (if b y then g x else x , y))
in particular, first lifts a function a -> b to (a,c) -> (b,c), which is more idiomatic.
Edit: if' allows a lift
import Control.Applicative (liftA3)
-- a functional if for lifting
if'' b x y = if b then x else y
if' :: (a -> Bool) -> (a -> a) -> (a -> a) -> a -> a
if' = liftA3 if''
I'd probably do something like this:
foo x y = ( x', y' )
where x' = bgf y' . cgh . af $ x
y' = af y
af z = (if a z then f else id) z
cgh z = (if c z then g else h) z
bg y x = (if b y then g else id) x
For something more complicated, you may want to consider using lens:
whenM :: Monad m => m Bool -> m () -> m ()
whenM c a = c >>= \res -> when res a
ifM :: Monad m => m Bool -> m a -> m a -> m a
ifM mb ml mr = mb >>= \b -> if b then ml else mr
foo :: Int -> Int -> (Int, Int)
foo = curry . execState $ do
whenM (uses _1 a) $
_1 %= f
ifM (uses _1 c)
(_1 %= g)
(_1 %= h)
_1 %= f
whenM (uses _2 a) $
_2 %= f
whenM (uses _2 b) $ do
_1 %= g
And there's nothing stopping you from using more descriptive variable names:
foo :: Int -> Int -> (Int, Int)
foo = curry . execState $ do
let x :: Lens (a, c) (b, c) a b
x = _1
y :: Lens (c, a) (c, b) a b
y = _2
whenM (uses x a) $
x %= f
ifM (uses x c)
(x %= g)
(x %= h)
x %= f
whenM (uses y a) $
y %= f
whenM (uses y b) $ do
x %= g
This is a job for the ST (state transformer) library.
ST provides:
Stateful computations in the form of the ST type. These look like ST s a for a computation that results in a value of type a, and may be run with runST to obtain a pure a value.
First-class mutable references in the form of the STRef type. The newSTRef a action creates a new STRef s a reference with an initial value of a, and which can be read with readSTRef ref and written with writeSTRef ref a. A single ST computation can use any number of STRef references internally.
Together, these let you express the same mutable variable functionality as in your imperative example.
To use ST and STRef, we need to import:
{-# LANGUAGE NoMonomorphismRestriction #-}
import Control.Monad.ST.Safe
import Data.STRef
Instead of using the low-level readSTRef and writeSTRef all over the place, we can define the following helpers to match the imperative operations that the Python-style foo example uses:
-- STRef assignment.
(=:) :: STRef s a -> ST s a -> ST s ()
ref =: x = writeSTRef ref =<< x
-- STRef function application.
($:) :: (a -> b) -> STRef s a -> ST s b
f $: ref = f `fmap` readSTRef ref
-- Postfix guard syntax.
if_ :: Monad m => m () -> m Bool -> m ()
action `if_` guard = act' =<< guard
where act' b = if b then action
else return ()
This lets us write:
ref =: x to assign the value of ST computation x to the STRef ref.
(f $: ref) to apply a pure function f to the STRef ref.
action `if_` guard to execute action only if guard results in True.
With these helpers in place, we can faithfully translate the original imperative definition of foo into Haskell:
a = (< 10)
b = even
c = odd
f x = x + 3
g x = x * 2
h x = x - 1
f3 x = x + 2
-- A stateful computation that takes two integer STRefs and result in a final [x,y].
fooST :: Integral n => STRef s n -> STRef s n -> ST s [n]
fooST x y = do
x =: (f $: x) `if_` (a $: x)
x' <- readSTRef x
if c x' then
x =: (g $: x)
else
x =: (h $: x)
x =: (f $: x)
y =: (f $: y) `if_` (a $: y)
x =: (g $: x) `if_` (b $: y)
sequence [readSTRef x, readSTRef y]
-- Pure wrapper: simply call fooST with two fresh references, and run it.
foo :: Integral n => n -> n -> [n]
foo x y = runST $ do
x' <- newSTRef x
y' <- newSTRef y
fooST x' y'
-- This will print "[9,3]".
main = print (foo 0 0)
Points to note:
Although we first had to define some syntactical helpers (=:, $:, if_) before translating foo, this demonstrates how you can use ST and STRef as a foundation to grow your own little imperative language that's directly suited to the problem at hand.
Syntax aside, this matches the structure of the original imperative definition exactly, without any error-prone restructuring. Any minor changes to the original example can be mirrored directly to Haskell. (The addition of the temporary x' <- readSTRef x binding in the Haskell code is only in order to use it with the native if/else syntax: if desired, this can be replaced with an appropriate ST-based if/else construct.)
The above code demonstrates giving both pure and stateful interfaces to the same computation: pure callers can use foo without knowing that it uses mutable state internally, while ST callers can directly use fooST (and for example provide it with existing STRefs to modify).
#Sibi said it best in his comment:
I would suggest you to stop thinking imperatively and rather think in a functional way. I agree that it will take some time to getting used to the new pattern, but try to translate imperative ideas to functional languages isn't a great approach.
Practically speaking, your chain of let can be a good starting point:
foo x0 y0 =
let x1 = if a x0 then f x0 else x0 in
let x2 = if c x1 then g x1 else h x1 in
let x3 = f x2 in
let y1 = if a y0 then f y0 else y0 in
let x4 = if b y1 then g x3 else x3 in
[x4,y1]
But I would suggest using a single let and giving descriptive names to the intermediate stages.
In this example unfortunately I don't have a clue what the various x's and y's do, so I cannot suggest meaningful names. In real code you would use names such as x_normalized, x_translated, or such, instead of x1 and x2, to describe what those values really are.
In fact, in a let or where you don't really have variables: they're just shorthand names you give to intermediate results, to make it easy to compose the final expression (the one after in or before the where.)
This is the spirit behind the x_bar and x_baz below. Try to come up with names that are reasonably descriptive, given the context of your code.
foo x y =
let x_bar = if a x then f x else x
x_baz = f if c x_bar then g x_bar else h x_bar
y_bar = if a y then f y else y
x_there = if b y_bar then g x_baz else x_baz
in [x_there, y_bar]
Then you can start recognizing patterns that were hidden in the imperative code. For example, x_bar and y_bar are basically the same transformation, applied respectively to x and y: that's why they have the same suffix "_bar" in this nonsensical example; then your x2 probably doesn't need an intermediate name , since you can just apply f to the result of the entire "if c then g else h".
Going on with the pattern recognition, you should factor out the transformations that you are applying to variables into sub-lambdas (or whatever you call the auxiliary functions defined in a where clause.)
Again, I don't have a clue what the original code did, so I cannot suggest meaningful names for the auxiliary functions. In a real application, f_if_a would be called normalize_if_needed or thaw_if_frozen or mow_if_overgrown... you get the idea:
foo x y =
let x_bar = f_if_a x
y_bar = f_if_a y
x_baz = f (g_if_c_else_h x_bar)
x_there = g_if_b x_baz y_bar
in [x_there, y_bar]
where
f_if_a x
| a x = f x
| otherwise = x
g_if_c_else_h x
| c x = g x
| otherwise = h x
g_if_b x y
| b y = g x
| otherwise = x
Don't disregard this naming business.
The whole point of Haskell and other pure functional languages is to express algorithms without the assignment operator, meaning the tool that can modify the value of an existing variable.
The names you give to things inside a function definition, whether introduced as arguments, let, or where, can only refer to one value (or auxiliary function) throughout the entire definition, so that your code can be more easily reasoned about and proven correct.
If you don't give them meaningful names (and conversely giving your code a meaningful structure) then you're missing out on the entire purpose of Haskell.
(IMHO the other answers so far, citing monads and other shenanigans, are barking up the wrong tree.)
I always prefer layering state transformers to using a single state over a tuple: it definitely declutters things by letting you "focus" on a specific layer (representations of the x and y variables in our case):
import Control.Monad.Trans.Class
import Control.Monad.Trans.State
foo :: x -> y -> (x, y)
foo x y =
(flip runState) y $ (flip execStateT) x $ do
get >>= \v -> when (a v) (put (f v))
get >>= \v -> put ((if c v then g else h) v)
modify f
lift $ get >>= \v -> when (a v) (put (f v))
lift get >>= \v -> when (b v) (modify g)
The lift function allows us to focus on the inner state layer, which is y.

Why is the type of this function (a -> a) -> a?

Why is the type of this function (a -> a) -> a?
Prelude> let y f = f (y f)
Prelude> :t y
y :: (t -> t) -> t
Shouldn't it be an infinite/recursive type?
I was going to try and put into words what I think it's type should be, but I just can't do it for some reason.
y :: (t -> t) -> ?WTFIsGoingOnOnTheRHS?
I don't get how f (y f) resolves to a value. The following makes a little more sense to me:
Prelude> let y f x = f (y f) x
Prelude> :t y
y :: ((a -> b) -> a -> b) -> a -> b
But it's still ridiculously confusing. What's going on?
Well, y has to be of type (a -> b) -> c, for some a, b and c we don't know yet; after all, it takes a function, f, and applies it to an argument, so it must be a function taking a function.
Since y f = f x (again, for some x), we know that the return type of y must be the return type of f itself. So, we can refine the type of y a bit: it must be (a -> b) -> b for some a and b we don't know yet.
To figure out what a is, we just have to look at the type of the value passed to f. It's y f, which is the expression we're trying to figure out the type of right now. We're saying that the type of y is (a -> b) -> b (for some a, b, etc.), so we can say that this application of y f must be of type b itself.
So, the type of the argument to f is b. Put it all back together, and we get (b -> b) -> b — which is, of course, the same thing as (a -> a) -> a.
Here's a more intuitive, but less precise view of things: we're saying that y f = f (y f), which we can expand to the equivalent y f = f (f (y f)), y f = f (f (f (y f))), and so on. So, we know that we can always apply another f around the whole thing, and since the "whole thing" in question is the result of applying f to an argument, f has to have the type a -> a; and since we just concluded that the whole thing is the result of applying f to an argument, the return type of y must be that of f itself — coming together, again, as (a -> a) -> a.
Just two points to add to other people's answers.
The function you're defining is usually called fix, and it is a fixed-point combinator: a function that computes the fixed point of another function. In mathematics, the fixed point of a function f is an argument x such that f x = x. This already allows you to infer that the type of fix has to be (a -> a) -> a; "function that takes a function from a to a, and returns an a."
You've called your function y, which seems to be after the Y combinator, but this is an inaccurate name: the Y combinator is one specific fixed point combinator, but not the same as the one you've defined here.
I don't get how f (y f) resolves to a value.
Well, the trick is that Haskell is a non-strict (a.k.a. "lazy") language. The calculation of f (y f) can terminate if f doesn't need to evaluate its y f argument in all cases. So, if you're defining factorial (as John L illustrates), fac (y fac) 1 evaluates to 1 without evaluating y fac.
Strict languages can't do this, so in those languages you cannot define a fixed-point combinator in this way. In those languages, the textbook fixed-point combinator is the Y combinator proper.
#ehird's done a good job of explaining the type, so I'd like to show how it can resolve to a value with some examples.
f1 :: Int -> Int
f1 _ = 5
-- expansion of y applied to f1
y f1
f1 (y f1) -- definition of y
5 -- definition of f1 (the argument is ignored)
-- here's an example that uses the argument, a factorial function
fac :: (Int -> Int) -> (Int -> Int)
fac next 1 = 1
fac next n = n * next (n-1)
y fac :: Int -> Int
fac (y fac) -- def. of y
-- at this point, further evaluation requires the next argument
-- so let's try 3
fac (y fac) 3 :: Int
3 * (y fac) 2 -- def. of fac
3 * (fac (y fac) 2) -- def. of y
3 * (2 * (y fac) 1) -- def. of fac
3 * (2 * (fac (y fac) 1) -- def. of y
3 * (2 * 1) -- def. of fac
You can follow the same steps with any function you like to see what will happen. Both of these examples converge to values, but that doesn't always happen.
Let me tell about a combinator. It's called the "fixpoint combinator" and it has the following property:
The Property: the "fixpoint combinator" takes a function f :: (a -> a) and discovers a "fixed point" x :: a of that function such that f x == x. Some implementations of the fixpoint combinator might be better or worse at "discovering", but assuming it terminates, it will produce a fixed point of the input function. Any function that satisfies The Property can be called a "fixpoint combinator".
Call this "fixpoint combinator" y. Based on what we just said, the following are true:
-- as we said, y's input is f :: a -> a, and its output is x :: a, therefore
y :: (a -> a) -> a
-- let x be the fixed point discovered by applying f to y
y f == x -- because y discovers x, a fixed point of f, per The Property
f x == x -- the behavior of a fixed point, per The Property
-- now, per substitution of "x" with "f x" in "y f == x"
y f == f x
-- again, per substitution of "x" with "y f" in the previous line
y f == f (y f)
So there you go. You have defined y in terms of the essential property of the fixpoint combinator:
y f == f (y f). Instead of assuming that y f discovers x, you can assume that x represents a divergent computation, and still come to the same conclusion (iinm).
Since your function satisfies The Property, we can conclude that it is a fixpoint combinator, and that the other properties we have stated, including the type, are applicable to your function.
This isn't exactly a solid proof, but I hope it provides additional insight.

Resources