`seq` on partially applied functions

Let's say I have the following:
f :: a -> b -> c
g :: b -> c
g = f 10
Now let's say f is actually:
f x y = f1 x + y
Would:
g `seq` ...
actually evaluate f1 10, so later when running
g 9
it's actually just a simple addition?
If not, is there a way to "evaluate" parts of a partially applied function?
I'm looking for a generic solution, one that doesn't depend on knowing how f and g work.

No, it will not, because in general, the choice of right hand side for f might depend on y. If you want to share the result of f1 x between calls to g, you would have to write f like this:
f x = let z = f1 x in \y -> z + y
Of course, due to laziness this will not evaluate f1 x until the first time g is called. To have g `seq` ... force evaluation of f1 x, you would have to write:
f x = let z = f1 x in z `seq` (\y -> z + y)
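For example, a minimal self-contained sketch of that fix (Int types chosen just for illustration):
f1 :: Int -> Int
f1 = undefined  -- stands in for the expensive part we want to force early

f :: Int -> Int -> Int
f x = let z = f1 x in z `seq` \y -> z + y

g :: Int -> Int
g = f 10

main :: IO ()
main = g `seq` putStrLn "not reached"  -- forcing g now forces f1 10 and throws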

seq is shallow:
Prelude> let f1 = undefined
Prelude> let f = \x -> \y -> f1 x + y
Prelude> let g = f 10
Prelude> g `seq` 1
1
Prelude> g 9
*** Exception: Prelude.undefined
Prelude>
I'd take a look at Control.DeepSeq: http://hackage.haskell.org/packages/archive/deepseq/1.2.0.1/doc/html/Control-DeepSeq.html
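For a quick illustration of the difference (a sketch): deepseq forces the entire structure where seq stops at the outermost constructor. Note, though, that deepseq cannot look inside a function closure, so it won't force f1 10 inside the partial application above either.
import Control.DeepSeq (deepseq)

main :: IO ()
main = do
    -- seq only forces the outermost (:) constructor, so this succeeds:
    (1 : undefined :: [Int]) `seq` putStrLn "seq: ok"
    -- deepseq forces the whole spine and elements, so this throws:
    (1 : undefined :: [Int]) `deepseq` putStrLn "deepseq: never reached"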

Related

Constant space short circuiting `foldM` over `Maybe`

Let's say I have the following:
f :: b -> a -> b
x :: b
l :: [a]
and
foldl' f x l
runs in constant space. That is, f is suitably strict.
Now consider if I have:
f2 :: b -> a -> Maybe b
f2 x y = if (pred x y) then Just $! (f x y) else Nothing
will
foldM f2 x l
reliably run in constant space? Or is there something else I need to do to ensure I have both constant space but still the short circuiting behaviour of Maybe?
(Note whilst I've asked this question about Maybe, I actually want to do this with Either, but I suspect the approach is similar)
In the library source code foldM is defined as foldlM, which in turn is defined as
foldlM :: (Foldable t, Monad m) => (b -> a -> m b) -> b -> t a -> m b
foldlM f z0 xs = foldr c return xs z0
  where c x k z = f z x >>= k
Assuming c x k z = f2 z x >>= k, let's see what happens when we call it. To see whether it runs in constant space, we will only reduce the expressions by applying the topmost function, without reducing the subexpressions.
foldlM f2 z0 (x:xs)
=
foldr c return (x:xs) z0
=
c x (foldr c return xs) z0
=
f2 z0 x >>= foldr c return xs
Since >>= for Maybe is strict in its first argument, we evaluate f2 z0 x first. If that returns Nothing, we ignore the rest (short-circuiting, as you mentioned). If it returns Just y, we have
Just y >>= foldr c return xs
=
foldr c return xs y
and we are ready for the next loop.
This did not cause our term to grow, so it looks like it runs in constant space (provided f2 keeps the size of y constant, of course).
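To make that concrete, here is a small sketch with Either in place of Maybe, as anticipated in the question (the name step is mine):
import Control.Monad (foldM)

-- Strict step function: the accumulator is forced with $! at each step,
-- and a Left short-circuits the whole fold.
step :: Int -> Int -> Either String Int
step acc x
    | x < 0     = Left ("negative element: " ++ show x)
    | otherwise = Right $! acc + x

main :: IO ()
main = do
    print (foldM step 0 [1 .. 1000000])          -- Right 500000500000, constant space
    print (foldM step 0 (1 : (-2) : undefined))  -- Left ..., the tail is never forced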

How to understand nested lambda functions in Haskell

I am trying to understand the meaning of the following 2 lambda expressions in Haskell:
f = \x -> x (\y -> x y)
g = \x -> (\y -> y) x
I tried to convert them, and I got this:
f x y = x x y
g x y = y x
Is this correct? I assumed the arguments of both functions have to be x and y, as they are both found in a lambda expression in the function description. I basically understood it this way: f(x) = x f(y) and f(y) = y x. And for g, g(x) = g(y) x and g(y) = y. But as I am new to Haskell, I'm not very confident about these kinds of conversions. If not correct, what would be a correct conversion?
Neither is correct. Your solution uses the functions
f x y = x x y
g x y = y x
which actually mean
f = \x -> (\y -> x x y)
g = \x -> (\y -> y x)
and those differ from the original expressions
f = \x -> x (\y -> x y)
g = \x -> (\y -> y) x
The above two equations can be rewritten as
f x = x (\y -> x y)
g x = (\y -> y) x
But from here, there is no way to turn the remaining lambdas into more arguments for f or g. At best, we can simplify them using beta/eta conversion and get
f x = x x -- eta (\y -> x y) = x
g x = x -- beta (\y -> y) x = x
(Also see the comment below by Will Ness, who points out that through an additional eta expansion in f we could reach the OP's definition. Still, that is incidental.)
Finally, note that Haskell will not accept f x = x x, since that cannot be typed unless we use rank-2 types and explicitly provide a type annotation like f :: (forall a. a) -> b. The original code f = \x -> x (\y -> x y) suffers from the same issue. Such self-application is fine in untyped languages, e.g. the untyped lambda calculus of programming language theory.
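For the record, a sketch showing that such an annotation indeed compiles (assuming the RankNTypes extension; the only arguments you could ever pass are bottoms like undefined):
{-# LANGUAGE RankNTypes #-}

f :: (forall a. a) -> b
f x = x x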
The :type command at the GHCi prompt is your friend. Let's take your second example first.
λ> :type let g = \x -> (\y -> y) x in g
let g = \x -> (\y -> y) x in g :: p -> p
So g is well-typed and is a convoluted way to write an identity function :: p -> p. Specifically, g takes some x and applies an identity function (\y -> y) to x, resulting in x. GHCi uses a fresh type name p in the reported type, to avoid confusion. No, your g x y = ... is not equivalent. (Check it with :type.)
You can abbreviate :type to just :t. Then let's take your first example.
λ> :t let f = \x -> x (\y -> x y) in f
* Occurs check: cannot construct the infinite type: t2 ~ t2 -> t3
* In the first argument of `x', namely `(\ y -> x y)'
  In the expression: x (\ y -> x y)
  In the expression: \ x -> x (\ y -> x y)
* Relevant bindings include
    x :: t2 -> t3 (bound at <interactive>:1:10)
    f :: (t2 -> t3) -> t3 (bound at <interactive>:1:5)
Errk. Is your suggested f the same as that?
λ> :t let f x y = x x y in f
* Occurs check: cannot construct the infinite type:
    t3 ~ t3 -> t4 -> t5
* In the first argument of `x', namely `x'
It at least looks like a similar error message. What are these t2, t3, t4, t5? Again it's GHCi using fresh names for the types, to avoid confusion.
Looking at the let f = ..., GHCi sees x is applied to something, so it gives x :: t2 -> t3 where t2 is the type of its argument, t3 is the return type. It also sees f = \x -> x (blah). So the return type of f must be whatever x returns, i.e. t3, and the argument to f is x. So f :: (t2 -> t3) -> t3.
Inside the (blah), there's x applied to something. So the something (i.e. y) must be the type of x's argument, and the return type must be x's return type. I.e. (\y -> x y) :: t2 -> t3. Errk: then we must have x's argument type same as that, because x is applied to it. And the way we write 'same as' is with ~.
Then the error message tells you GHCi is trying to make sense of t2 ~ (t2 -> t3). (-> binds tighter than ~.) And if you try to substitute that equivalence for t2 into the RHS you'll get t2 ~ (((... -> t3) -> t3) -> t3) ad infinitum.
Your suggested f x y = x x y is not equivalent (the message/typing is a little different). But both involve infinite types, so neither is allowed.

Explain (.)(.) to me

Diving into Haskell, and while I am enjoying the language I'm finding the pointfree style completely illegible. I've come across this function which only consists of these ASCII boobies as seen below.
f = (.)(.)
And while I understand its type signature and what it does, I can't for the life of me understand why it does it. So could someone please write out the de-pointfreed version of it for me, and maybe step by step work back to the pointfree version sorta like this:
f g x y = (g x) + y
f g x = (+) (g x)
f g = (+) . g
f = (.) (+)
Generally (?) (where ? stands for an arbitrary infix operator) is the same as \x y -> x ? y. So we can rewrite f as:
f = (\a b -> a . b) (\c d -> c . d)
Now if we apply the argument to the function, we get:
f = (\b -> (\c d -> c . d) . b)
Now b is just an argument to f, so we can rewrite this as:
f b = (\c d -> c . d) . b
The definition of . is f . g = \x -> f (g x). If we replace the outer . with its definition, we get:
f b = \x -> (\c d -> c . d) (b x)
Again we can turn x into a regular parameter:
f b x = (\c d -> c . d) (b x)
Now let's replace the other .:
f b x = (\c d y -> c (d y)) (b x)
Now let's apply the argument:
f b x = \d y -> (b x) (d y)
Now let's move the parameters again:
f b x d y = (b x) (d y)
Done.
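To double-check, a small sketch comparing the point-free original with the final expansion (the signature below is just the inferred one):
f1 :: (a -> b -> c) -> a -> (d -> b) -> d -> c
f1 = (.) (.)

f2 :: (a -> b -> c) -> a -> (d -> b) -> d -> c
f2 b x d y = (b x) (d y)

main :: IO ()
main = print (f1 (+) 1 (* 2) 3, f2 (+) 1 (* 2) 3)  -- prints (7,7)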
You can also gradually append arguments to f:
f = ((.) . )
f x = (.) . x
f x y = ((.) . x) y
      = (.) (x y)
      = ((x y) . )
f x y z = (x y) . z
f x y z t = ((x y) . z) t
          = (x y) (z t)
          = x y (z t)
          = x y $ z t
The result reveals that x and z are actually (binary and unary, respectively) functions, so I'll use different identifiers:
f g x h y = g x (h y)
We can work backwards by "pattern matching" over the combinators' definitions. Given
f a b c d = a b (c d)
          = (a b) (c d)
we proceed
          = B (a b) c d
          = B B a b c d   -- writing B for (.)
so by eta-contraction
f = B B
because
a (b c) = B a b c -- bidirectional equation
by definition. Haskell's (.) is actually the B combinator (see BCKW combinators).
Edit: Potentially, many combinators can match the same code. That's why there are many possible combinatory encodings for the same piece of code. For example, (ab)(cd) = (ab)(I(cd)) is a valid transformation, which might lead to some other combinator definition matching that. Choosing the "most appropriate" one is an art (or a search in a search space with a somewhat high branching factor).
That's about going backwards, as you asked. But if you want to go "forward", personally, I like the combinatory approach much better than the lambda-notation fidgeting. I would even just write many arguments right away, and get rid of the extra ones at the end:
BBabcdefg = B(ab)cdefg = (ab)(cd)efg
hence,
BBabcd = B(ab)cd = (ab)(cd)
is all there is to it.

Is there a way to elegantly represent this pattern in Haskell?

Mind the pure function below, in an imperative language:
def foo(x,y):
    x = f(x) if a(x)
    if c(x):
        x = g(x)
    else:
        x = h(x)
    x = f(x)
    y = f(y) if a(y)
    x = g(x) if b(y)
    return [x,y]
That function represents a style where you have to incrementally update variables. It can be avoided in most cases, but there are situations where that pattern is unavoidable - for example, writing a cooking procedure for a robot, which inherently requires a series of steps and decisions. Now, imagine we were trying to represent foo in Haskell.
foo x0 y0 =
    let x1 = if a x0 then f x0 else x0 in
    let x2 = if c x1 then g x1 else h x1 in
    let x3 = f x2 in
    let y1 = if a y0 then f y0 else y0 in
    let x4 = if b y1 then g x3 else x3 in
    [x4,y1]
That code works, but it is too complicated and error-prone due to the need to manually manage the numeric tags. Notice that, after x1 is set, x0's value should never be used again, but it still can be. If you accidentally use it, that will be an undetected error.
I've managed to solve this problem using the State monad:
fooSt x y = execState (do
    (x,y) <- get
    when (a x) (put (f x, y))
    (x,y) <- get
    if c x
        then put (g x, y)
        else put (h x, y)
    (x,y) <- get
    put (f x, y)
    (x,y) <- get
    when (a y) (put (x, f y))
    (x,y) <- get
    when (b y) (put (g x, y))) (x,y)
This way, the need for tag tracking goes away, as well as the risk of accidentally using an outdated variable. But now the code is verbose and much harder to understand, mainly due to the repetition of (x,y) <- get.
So: what is a more readable, elegant and safe way to express this pattern?
Full code for testing.
Your goals
While the direct transformation of imperative code would usually lead to the ST monad and STRef, let's think about what you actually want to do:
You want to manipulate values conditionally.
You want to return that value.
You want to sequence the steps of your manipulation.
Requirements
Now at first this indeed looks like the ST monad. However, if we use the monad laws together with do notation, we see that
do
    x <- return $ if somePredicate x then g x
                                     else h x
    x <- return $ if someOtherPredicate x then a x
                                          else b x
is exactly what you want. Since you need only the most basic functions of a monad (return and >>=), you can use the simplest:
The Identity monad
import Data.Functor.Identity (runIdentity)

foo x y = runIdentity $ do
    x <- return $ if a x then f x
                         else x
    x <- return $ if c x then g x
                         else h x
    x <- return $ f x
    y <- return $ if a y then f y
                         else y
    x <- return $ if b y then g x
                         else x
    return (x,y)
Note that you cannot use let x = if a x then f x else x, because in this case the x would be the same on both sides, whereas
x <- return $ if a x then f x
else x
is the same as
(return $ if a x then (f x) else x) >>= \x -> ...
and the x in the if expression is clearly not the same as the resulting one, which is going to be used in the lambda on the right hand side.
Helpers
In order to make this more clear, you can add helpers like
condM :: Monad m => Bool -> a -> a -> m a
condM p a b = return $ if p then a else b
to get an even more concise version:
foo x y = runIdentity $ do
    x <- condM (a x) (f x) x
    x <- fmap f $ condM (c x) (g x) (h x)
    y <- condM (a y) (f y) y
    x <- condM (b y) (g x) x
    return (x , y)
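With the concrete definitions used in the ST answer further down (a = (< 10), b = even, c = odd, f = (+ 3), g = (* 2), h = subtract 1), this gives foo 0 0 == (9, 3), matching the imperative result [9,3].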
Ternary craziness
And while we're at it, let's crank up the craziness and introduce a ternary operator:
(?) :: Bool -> (a, a) -> a
b ? ie = if b then fst ie else snd ie

(??) :: Monad m => Bool -> (a, a) -> m a
(??) p = return . (?) p

(#) :: a -> a -> (a, a)
(#) = (,)

infixr 2 ??
infixr 2 #
infixr 2 ?

foo x y = runIdentity $ do
    x <- a x ?? f x # x
    x <- fmap f $ c x ?? g x # h x
    y <- a y ?? f y # y
    x <- b y ?? g x # x
    return (x , y)
But the bottom line is that the Identity monad has everything you need for this task.
Imperative or non-imperative
One might argue whether this style is imperative. It's definitely a sequence of actions. But there's no state, unless you count the bound variables. Then again, a chain of let … in … declarations also gives an implicit sequence: you expect the first let to bind first.
Using Identity is purely functional
Either way, the code above doesn't introduce mutability. x doesn't get modified; instead you have a new x or y shadowing the last one. This becomes clear if you desugar the do expression as noted above:
foo x y = runIdentity $
    a x ?? f x # x   >>= \x ->
    c x ?? g x # h x >>= \x ->
    return (f x)     >>= \x ->
    a y ?? f y # y   >>= \y ->
    b y ?? g x # x   >>= \x ->
    return (x , y)
Getting rid of the simplest monad
However, if we use (?) on the left-hand side and remove the returns, we can replace (>>=) :: m a -> (a -> m b) -> m b by something of type a -> (a -> b) -> b. This just happens to be flip ($). We end up with:
($>) :: a -> (a -> b) -> b
($>) = flip ($)

infixr 0 $>  -- same fixity as ($)

foo x y = a x ? f x # x $> \x ->
          c x ? g x # h x $> \x ->
          f x $> \x ->
          a y ? f y # y $> \y ->
          b y ? g x # x $> \x ->
          (x, y)
This is very similar to the desugared do expression above. Note that any usage of Identity can be transformed into this style, and vice-versa.
The problem you state looks like a nice application for arrows:
import Control.Arrow

if' :: (a -> Bool) -> (a -> a) -> (a -> a) -> a -> a
if' p f g x = if p x then f x else g x

foo2 :: (Int,Int) -> (Int,Int)
foo2 = first (if' c g h . if' a f id) >>>
       first f >>>
       second (if' a f id) >>>
       (\(x,y) -> (if b y then g x else x , y))
in particular, first lifts a function a -> b to (a,c) -> (b,c), which is more idiomatic.
Edit: if' can also be written as a lift:
import Control.Applicative (liftA3)

-- a functional if for lifting
if'' b x y = if b then x else y

if' :: (a -> Bool) -> (a -> a) -> (a -> a) -> a -> a
if' = liftA3 if''
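For reference, a quick sketch of how the arrow combinators used above specialize to plain functions:
import Control.Arrow (first, second, (>>>))

-- first  f (x, y) = (f x, y)
-- second f (x, y) = (x, f y)
-- (f >>> g) x     = g (f x)    -- left-to-right composition
demo :: (Int, Int) -> (Int, Int)
demo = first (+ 1) >>> second (* 2)

main :: IO ()
main = print (demo (3, 4))  -- prints (4,8)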
I'd probably do something like this:
foo x y = ( x', y' )
  where x' = bg y' . f . cgh . af $ x
        y' = af y
        af z = (if a z then f else id) z
        cgh z = (if c z then g else h) z
        bg y x = (if b y then g else id) x
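A quick sanity check of this version (a sketch, borrowing the concrete a, b, c, f, g, h from the ST answer below):
a, b, c :: Int -> Bool
a = (< 10); b = even; c = odd

f, g, h :: Int -> Int
f = (+ 3); g = (* 2); h = subtract 1

main :: IO ()
main = print (foo 0 0)  -- prints (9,3), matching the imperative [9,3]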
For something more complicated, you may want to consider using lens:
import Control.Lens
import Control.Monad (when)
import Control.Monad.State (execState)

whenM :: Monad m => m Bool -> m () -> m ()
whenM c a = c >>= \res -> when res a

ifM :: Monad m => m Bool -> m a -> m a -> m a
ifM mb ml mr = mb >>= \b -> if b then ml else mr

foo :: Int -> Int -> (Int, Int)
foo = curry . execState $ do
    whenM (uses _1 a) $
        _1 %= f
    ifM (uses _1 c)
        (_1 %= g)
        (_1 %= h)
    _1 %= f
    whenM (uses _2 a) $
        _2 %= f
    whenM (uses _2 b) $ do
        _1 %= g
And there's nothing stopping you from using more descriptive variable names:
foo :: Int -> Int -> (Int, Int)
foo = curry . execState $ do
    let x :: Lens (a, c) (b, c) a b
        x = _1
        y :: Lens (c, a) (c, b) a b
        y = _2
    whenM (uses x a) $
        x %= f
    ifM (uses x c)
        (x %= g)
        (x %= h)
    x %= f
    whenM (uses y a) $
        y %= f
    whenM (uses y b) $ do
        x %= g
This is a job for the ST (state transformer) library.
ST provides:
Stateful computations in the form of the ST type. These look like ST s a for a computation that results in a value of type a, and may be run with runST to obtain a pure a value.
First-class mutable references in the form of the STRef type. The newSTRef a action creates a new STRef s a reference with an initial value of a, and which can be read with readSTRef ref and written with writeSTRef ref a. A single ST computation can use any number of STRef references internally.
Together, these let you express the same mutable variable functionality as in your imperative example.
To use ST and STRef, we need to import:
{-# LANGUAGE NoMonomorphismRestriction #-}
import Control.Monad.ST.Safe
import Data.STRef
Instead of using the low-level readSTRef and writeSTRef all over the place, we can define the following helpers to match the imperative operations that the Python-style foo example uses:
-- STRef assignment.
(=:) :: STRef s a -> ST s a -> ST s ()
ref =: x = writeSTRef ref =<< x

-- STRef function application.
($:) :: (a -> b) -> STRef s a -> ST s b
f $: ref = f `fmap` readSTRef ref

-- Postfix guard syntax.
if_ :: Monad m => m () -> m Bool -> m ()
action `if_` guard = act' =<< guard
    where act' b = if b then action
                        else return ()
This lets us write:
ref =: x to assign the value of ST computation x to the STRef ref.
(f $: ref) to apply a pure function f to the STRef ref.
action `if_` guard to execute action only if guard results in True.
With these helpers in place, we can faithfully translate the original imperative definition of foo into Haskell:
a = (< 10)
b = even
c = odd
f x = x + 3
g x = x * 2
h x = x - 1
f3 x = x + 2

-- A stateful computation that takes two integer STRefs and results in a final [x,y].
fooST :: Integral n => STRef s n -> STRef s n -> ST s [n]
fooST x y = do
    x =: (f $: x) `if_` (a $: x)
    x' <- readSTRef x
    if c x'
        then x =: (g $: x)
        else x =: (h $: x)
    x =: (f $: x)
    y =: (f $: y) `if_` (a $: y)
    x =: (g $: x) `if_` (b $: y)
    sequence [readSTRef x, readSTRef y]

-- Pure wrapper: simply call fooST with two fresh references, and run it.
foo :: Integral n => n -> n -> [n]
foo x y = runST $ do
    x' <- newSTRef x
    y' <- newSTRef y
    fooST x' y'

-- This will print "[9,3]".
main = print (foo 0 0)
Points to note:
Although we first had to define some syntactical helpers (=:, $:, if_) before translating foo, this demonstrates how you can use ST and STRef as a foundation to grow your own little imperative language that's directly suited to the problem at hand.
Syntax aside, this matches the structure of the original imperative definition exactly, without any error-prone restructuring. Any minor changes to the original example can be mirrored directly to Haskell. (The addition of the temporary x' <- readSTRef x binding in the Haskell code is only in order to use it with the native if/else syntax: if desired, this can be replaced with an appropriate ST-based if/else construct.)
The above code demonstrates giving both pure and stateful interfaces to the same computation: pure callers can use foo without knowing that it uses mutable state internally, while ST callers can directly use fooST (and for example provide it with existing STRefs to modify).
@Sibi said it best in his comment:
I would suggest you to stop thinking imperatively and rather think in a functional way. I agree that it will take some time to get used to the new pattern, but trying to translate imperative ideas to functional languages isn't a great approach.
Practically speaking, your chain of let can be a good starting point:
foo x0 y0 =
    let x1 = if a x0 then f x0 else x0 in
    let x2 = if c x1 then g x1 else h x1 in
    let x3 = f x2 in
    let y1 = if a y0 then f y0 else y0 in
    let x4 = if b y1 then g x3 else x3 in
    [x4,y1]
But I would suggest using a single let and giving descriptive names to the intermediate stages.
In this example unfortunately I don't have a clue what the various x's and y's do, so I cannot suggest meaningful names. In real code you would use names such as x_normalized, x_translated, or such, instead of x1 and x2, to describe what those values really are.
In fact, in a let or where you don't really have variables: they're just shorthand names you give to intermediate results, to make it easy to compose the final expression (the one after in or before the where.)
This is the spirit behind the x_bar and x_baz below. Try to come up with names that are reasonably descriptive, given the context of your code.
foo x y =
    let x_bar   = if a x then f x else x
        x_baz   = f (if c x_bar then g x_bar else h x_bar)
        y_bar   = if a y then f y else y
        x_there = if b y_bar then g x_baz else x_baz
    in [x_there, y_bar]
Then you can start recognizing patterns that were hidden in the imperative code. For example, x_bar and y_bar are basically the same transformation, applied respectively to x and y: that's why they have the same suffix "_bar" in this nonsensical example. And your x2 probably doesn't need an intermediate name, since you can just apply f to the result of the entire "if c then g else h".
Going on with the pattern recognition, you should factor out the transformations that you are applying to variables into sub-lambdas (or whatever you call the auxiliary functions defined in a where clause.)
Again, I don't have a clue what the original code did, so I cannot suggest meaningful names for the auxiliary functions. In a real application, f_if_a would be called normalize_if_needed or thaw_if_frozen or mow_if_overgrown... you get the idea:
foo x y =
    let x_bar   = f_if_a x
        y_bar   = f_if_a y
        x_baz   = f (g_if_c_else_h x_bar)
        x_there = g_if_b x_baz y_bar
    in [x_there, y_bar]
    where
        f_if_a x
            | a x       = f x
            | otherwise = x
        g_if_c_else_h x
            | c x       = g x
            | otherwise = h x
        g_if_b x y
            | b y       = g x
            | otherwise = x
Don't disregard this naming business.
The whole point of Haskell and other pure functional languages is to express algorithms without the assignment operator, meaning the tool that can modify the value of an existing variable.
The names you give to things inside a function definition, whether introduced as arguments, let, or where, can only refer to one value (or auxiliary function) throughout the entire definition, so that your code can be more easily reasoned about and proven correct.
If you don't give them meaningful names (and, conversely, a meaningful structure to your code), then you're missing out on the entire purpose of Haskell.
(IMHO the other answers so far, citing monads and other shenanigans, are barking up the wrong tree.)
I always prefer layering state transformers to using a single state over a tuple: it definitely declutters things by letting you "focus" on a specific layer (representations of the x and y variables in our case):
import Control.Monad (when)
import Control.Monad.Trans.Class
import Control.Monad.Trans.State

foo :: x -> y -> (x, y)
foo x y =
    flip runState y $ flip execStateT x $ do
        get >>= \v -> when (a v) (put (f v))
        get >>= \v -> put ((if c v then g else h) v)
        modify f
        lift $ get >>= \v -> when (a v) (put (f v))
        lift get >>= \v -> when (b v) (modify g)
The lift function allows us to focus on the inner state layer, which is y.

Why does GHC make fix so confounding?

Looking at the GHC source code I can see that the definition for fix is:
fix :: (a -> a) -> a
fix f = let x = f x in x
In an example fix is used like this:
fix (\f x -> let x' = x+1 in x:f x')
This basically yields a sequence of numbers that increase by one to infinity. For this to happen, fix must be currying the function that it receives right back to that very function as its first parameter. It isn't clear to me how the definition of fix listed above could be doing that.
This definition is how I came to understand how fix works:
fix :: (a -> a) -> a
fix f = f (fix f)
So now I have two questions:
How does x ever come to mean fix x in the first definition?
Is there any advantage to using the first definition over the second?
It's easy to see how this definition works by applying equational reasoning.
fix :: (a -> a) -> a
fix f = let x = f x in x
What will x evaluate to when we try to evaluate fix f? It's defined as f x, so fix f = f x. But what is x here? It's f x, just as before. So you get fix f = f x = f (f x). Reasoning in this way you get an infinite chain of applications of f: fix f = f (f (f (f ...))).
Now, substituting (\f x -> let x' = x+1 in x:f x') for f you get
fix (\f x -> let x' = x+1 in x:f x')
= (\f x -> let x' = x+1 in x:f x') (f ...)
= (\x -> let x' = x+1 in x:((f ...) x'))
= (\x -> x:((f ...) x + 1))
= (\x -> x:((\x -> let x' = x+1 in x:(f ...) x') x + 1))
= (\x -> x:((\x -> x:(f ...) x + 1) x + 1))
= (\x -> x:(x + 1):((f ...) x + 1))
= ...
Edit: Regarding your second question, @is7s pointed out in the comments that the first definition is preferable because it is more efficient.
To find out why, let's look at the Core for fix1 (1:) !! 10^8:
a_r1Ko :: Type.Integer
a_r1Ko = __integer 1

main_x :: [Type.Integer]
main_x =
  : @ Type.Integer a_r1Ko main_x

main3 :: Type.Integer
main3 =
  !!_sub @ Type.Integer main_x 100000000
As you can see, after the transformations fix1 (1:) essentially became main_x = 1 : main_x. Note how this definition refers to itself - this is what "tying the knot" means. This self-reference is represented as a simple pointer indirection at runtime.
Now let's look at fix2 (1:) !! 100000000:
main6 :: Type.Integer
main6 = __integer 1

main5 :: [Type.Integer] -> [Type.Integer]
main5 = : @ Type.Integer main6

main4 :: [Type.Integer]
main4 = fix2 @ [Type.Integer] main5

main3 :: Type.Integer
main3 =
  !!_sub @ Type.Integer main4 100000000
Here the fix2 application is actually preserved in the compiled program.
The result is that the second program needs to do allocation for each element of the list (but since the list is immediately consumed, the program still effectively runs in constant space):
$ ./Test2 +RTS -s
2,400,047,200 bytes allocated in the heap
133,012 bytes copied during GC
27,040 bytes maximum residency (1 sample(s))
17,688 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
[...]
Compare that to the behaviour of the first program:
$ ./Test1 +RTS -s
47,168 bytes allocated in the heap
1,756 bytes copied during GC
42,632 bytes maximum residency (1 sample(s))
18,808 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
[...]
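For completeness, a sketch of what the two test programs presumably look like (compiled with -O2; Test2 swaps in fix2):
fix1, fix2 :: (a -> a) -> a
fix1 f = let x = f x in x
fix2 f = f (fix2 f)

main :: IO ()
main = print (fix1 (1 :) !! (10 ^ 8))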
How does x ever come to mean fix x in the first definition?
fix f = let x = f x in x
Let bindings in Haskell are recursive
First of all, realize that Haskell allows recursive let bindings. What Haskell calls "let", some other languages call "letrec". This feels pretty normal for function definitions. For example:
ghci> let fac n = if n == 0 then 1 else n * fac (n - 1) in fac 5
120
But it can seem pretty weird for value definitions. Nevertheless, values can be recursively defined, due to Haskell's non-strictness.
ghci> take 5 (let ones = 1 : ones in ones)
[1,1,1,1,1]
See A gentle introduction to Haskell sections 3.3 and 3.4 for more elaboration on Haskell's laziness.
Thunks in GHC
In GHC, an as-yet-unevaluated expression is wrapped up in a "thunk": a promise to perform the computation. Thunks are only evaluated when they absolutely must be. Suppose we want to fix someFunction. According to the definition of fix, that's
let x = someFunction x in x
Now, what GHC sees is something like this.
let x = MAKE A THUNK in x
So it happily makes a thunk for you and moves right along until you demand to know what x actually is.
Sample evaluation
That thunk's expression just happens to refer to itself. Let's take the ones example and rewrite it to use fix.
ghci> take 5 (let ones recur = 1 : recur in fix ones)
[1,1,1,1,1]
So what will that thunk look like?
We can inline ones as the anonymous function \recur -> 1 : recur for a clearer demonstration.
take 5 (fix (\recur -> 1 : recur))
-- expand definition of fix
take 5 (let x = (\recur -> 1 : recur) x in x)
Now then, what is x? Well, even though we're not quite sure what x is, we can still go through with the function application:
take 5 (let x = 1 : x in x)
Hey look, we're back at the definition we had before.
take 5 (let ones = 1 : ones in ones)
So if you believe you understand how that one works, then you have a good feel of how fix works.
Is there any advantage to using the first definition over the second?
Yes. The problem is that the second version can cause a space leak, even with optimizations. See GHC trac ticket #5205 for a similar problem with the definition of forever. This is why I mentioned thunks: because let x = f x in x allocates only one thunk: the x thunk.
The difference is in sharing vs. copying. [1]
fix1 f = x where x = f x -- more visually apparent way to write the same thing
fix2 f = f (fix2 f)
If we substitute the definition into itself, both reduce to the same infinite application chain f (f (f (f (f ...)))). But the first definition uses explicit naming; in Haskell (as in most other languages) sharing is enabled by the ability to name things: one name is more or less guaranteed to refer to one "entity" (here, x). The second definition does not guarantee any sharing - the result of a call fix2 f is substituted into the expression, so it might as well be substituted as a value.
But a given compiler could in theory be smart about it and use sharing in the second case as well.
The related issue is the "Y combinator". In untyped lambda calculus, where there are no naming constructs (and thus no self-reference), the Y combinator emulates self-reference by arranging for the definition to be copied, so referring to the copy of self becomes possible. But in implementations that use an environment model to allow for named entities in a language, direct reference by name becomes possible.
To see a more drastic difference between the two definitions, compare
fibs1 = fix1 ( (0:) . (1:) . g ) where g (a:t@(b:_)) = (a+b):g t
fibs2 = fix2 ( (0:) . (1:) . g ) where g (a:t@(b:_)) = (a+b):g t
See also:
In Scheme, how do you use lambda to create a recursive function?
Y combinator discussion in "The Little Schemer"
Can fold be used to create infinite lists?
(especially try to work out the last two definitions in the last link above).
[1] Working from the definitions, for your example fix (\g x -> let x2 = x+1 in x : g x2) we get
fix1 (\g x -> let x2 = x+1 in x : g x2)
= fix1 (\g x -> x : g (x+1))
= fix1 f where {f = \g x -> x : g (x+1)}
= fix1 f where {f g x = x : g (x+1)}
= x where {x = f x ; f g x = x : g (x+1)}
= g where {g = f g ; f g x = x : g (x+1)} -- both g in {g = f g} are the same g
= g where {g = \x -> x : g (x+1)} -- and so, here as well
= g where {g x = x : g (x+1)}
and thus a proper recursive definition for g is actually created. (in the above, we write ....x.... where {x = ...} for let {x = ...} in ....x...., for legibility).
But the second derivation proceeds with a crucial distinction of substituting a value back, not a name, as
fix2 (\g x -> x : g (x+1))
= fix2 f where {f g x = x : g (x+1)}
= f (fix2 f) where {f g x = x : g (x+1)}
= (\x-> x : g (x+1)) where {g = fix2 f ; f g x = x : g (x+1)}
= h where {h x = x : g (x+1) ; g = fix2 f ; f g x = x : g (x+1)}
so the actual call will proceed as e.g.
take 3 $ fix2 (\g x -> x : g (x+1)) 10
= take 3 (h 10) where {h x = x : g (x+1) ; g = fix2 f ; f g x = x : g (x+1)}
= take 3 (x:g (x+1)) where {x = 10 ; g = fix2 f ; f g x = x : g (x+1)}
= x:take 2 (g x2) where {x2 = x+1 ; x = 10 ; g = fix2 f ; f g x = x : g (x+1)}
= x:take 2 (g x2) where {x2 = x+1 ; x = 10 ; g = f (fix2 f) ; f g x = x : g (x+1)}
= x:take 2 (x2 : g2 (x2+1)) where { g2 = fix2 f ;
x2 = x+1 ; x = 10 ; f g x = x : g (x+1)}
= ......
and we see that a new binding (for g2) is established here, instead of the previous one (for g) being reused as with the fix1 definition.
Here is a perhaps somewhat simplified explanation, based on inlining. If we have
fix :: (a -> a) -> a
fix f = f (fix f)
then fix is a recursive function and this means it cannot be inlined in places where it is used (an INLINE pragma will be ignored, if given).
However
fix' f = let x = f x in x
is not a recursive function - it never calls itself. Only x inside is recursive. So when calling
fix' (\r x -> let x' = x+1 in x:r x')
the compiler can inline it into
(\f -> (let y = f y in y)) (\r x -> let x' = x+1 in x:r x')
and then continue simplifying it, for example
let y = (\r x -> let x' = x+1 in x:r x') y in y
let y = (\ x -> let x' = x+1 in x:y x') in y
which is just as if the function were defined using the standard recursive notation without fix:
y x = let x' = x+1 in x:y x'
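As a quick check that the inlined version behaves like the running example from the question (a sketch):
fix' :: (a -> a) -> a
fix' f = let x = f x in x

main :: IO ()
main = print (take 5 (fix' (\r x -> let x' = x + 1 in x : r x') 0))
-- prints [0,1,2,3,4]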
