Abstract Problem:
I'd like to implement a self-reference / pointer in Elm.
Specific Problem:
I'm writing a toy LISP interpreter in Elm inspired by mal.
I'm attempting to implement something like letrec to support recursive and mutually-recursive bindings (the "self reference" and "pointers" I'm mentioning above).
Here's some example code:
(letrec
([count (lambda (items)
(if (empty? items)
0
(+ 1 (count (cdr items)))
)
)
])
(count (quote (1 2 3)))
)
;=>3
Note how the body of the lambda refers to the binding count. In other words, the function needs a reference to itself.
Deeper Background:
When a lambda is defined, we need to create a function closure which consists of three components:
The function body (the expression to be evaluated when the function is called).
A list of function arguments (local variables that will be bound upon calling).
A closure (the values of all non-local variables that may be referenced within the body of the function).
From the wikipedia article:
Closures are typically implemented with [...] a representation of the function's lexical environment (i.e., the set of available variables) at the time when the closure was created. The referencing environment binds the non-local names to the corresponding variables in the lexical environment at the time the closure is created, additionally extending their lifetime to at least as long as the lifetime of the closure itself. When the closure is entered at a later time, possibly with a different lexical environment, the function is executed with its non-local variables referring to the ones captured by the closure, not the current environment.
Based on the above lisp code, in creating the lambda, we create a closure whose count variable must be bound to the lambda, thereby creating an infinite/circular/self-reference. This problem gets further complicated by mutually-recursive definitions which must be supported by letrec as well.
Elm, being a pure functional language, does not support imperative modification of state. Therefore, I believe that it is impossible to represent self-referencing values in Elm. Can you provide some guidance on alternatives to implementing letrec in Elm?
Research and Attempts
Mal in Elm
Jos von Bakel has already implemented mal in Elm. See his notes here and the environment implementation here. He's gone to great lengths to manually build a pointer system with its own internal GC mechanism. While this works, it seems like a massive amount of struggle. I'm craving a pure functional implementation.
Mal in Haskell
The mal implementation in Haskell (see code here) uses Data.IORef to emulate pointers. This also seems like a hack to me.
Y-Combinator/Fixed Points
It seems possible that the Y-Combinator can be used to implement these self references. There seems to be a Y* Combinator that works for mutually recursive functions as well. It seems logical to me that there must also exist a Z* combinator (equivalent to Y* but supports the eager evaluation model of Elm). Should I transform all of my letrec instances so that each binding is wrapped in a Z*?
The Y-Combinator is new to me and my intuitive mind simply does not understand it so I'm not sure if the above solution will work.
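For reference, in Haskell (which is lazy) the fixed point that Y computes can be defined directly. A minimal sketch for intuition only; it does not port to Elm as-is, which is exactly the difficulty:

import Data.Function (fix)

-- fix f = let x = f x in x: laziness ties the knot that Y builds manually.
factorial :: Integer -> Integer
factorial = fix (\self n -> if n <= 1 then 1 else n * self (n - 1))

main :: IO ()
main = print (factorial 10) -- 3628800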
Conclusion
Thank you very much for reading! I have been unable to sleep well for days as I struggle with this problem.
Thank You!
-Advait
In Haskell, this is fairly straightforward thanks to lazy evaluation. Because Elm is strict, to use the technique below, you would need to introduce laziness explicitly, which would be more or less equivalent to adding a pointer indirection layer of the sort you mentioned in your question.
Anyway, the Haskell answer might be useful to someone, so here goes...
Fundamentally, a self-referencing Haskell value is easily constructed by introducing a recursive binding, such as:
let mylist = [1,2] ++ mylist in mylist
The same principle can be used in writing an interpreter to construct self-referencing values.
Given the following simple S-expression language for constructing potentially recursive / self-referencing data structures with integer atoms:
data Expr = Atom Int | Var String | Cons Expr Expr | LetRec [String] [Expr] Expr
we can write an interpreter to evaluate it to the following type, which doesn't use IORefs or ad hoc pointers or anything weird like that:
data Value = AtomV Int | ConsV Value Value deriving (Show)
One such interpreter is:
type Context = [(String,Value)]
interp :: Context -> Expr -> Value
interp _ (Atom x) = AtomV x
interp ctx (Var v) = fromJust (lookup v ctx)
interp ctx (Cons ca cd) = ConsV (interp ctx ca) (interp ctx cd)
interp ctx (LetRec vs es e)
= let ctx' = zip vs (map (interp ctx') es) ++ ctx
in interp ctx' e
This is effectively a computation in a reader monad, but I've written it explicitly because a Reader version would require using the MonadFix instance either explicitly or via the RecursiveDo syntax and so would obscure the details.
The key bit of code is the case for LetRec. Note that a new context is constructed by introducing a set of potentially mutually recursive bindings. Because evaluation is lazy, the values themselves can be computed with the expression map (interp ctx') es using the newly created ctx' of which they are a part, tying the recursive knot.
We can use our interpreter to create a self-referencing value like so:
car :: Value -> Value
car (ConsV ca _cd) = ca
cdr :: Value -> Value
cdr (ConsV _ca cd) = cd
main = do
let v = interp [] $ LetRec ["ones"] [Cons (Atom 1) (Var "ones")] (Var "ones")
print $ car $ v
print $ car . cdr $ v
print $ car . cdr . cdr $ v
print $ car . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr $ v
Here's the full code, also showing an alternative interp' using the Reader monad with recursive-do notation:
{-# LANGUAGE RecursiveDo #-}
{-# OPTIONS_GHC -Wall #-}
module SelfRef where
import Control.Monad.Reader
import Data.Maybe
data Expr = Atom Int | Var String | Cons Expr Expr | LetRec [String] [Expr] Expr
data Value = AtomV Int | ConsV Value Value deriving (Show)
type Context = [(String,Value)]
interp :: Context -> Expr -> Value
interp _ (Atom x) = AtomV x
interp ctx (Var v) = fromJust (lookup v ctx)
interp ctx (Cons ca cd) = ConsV (interp ctx ca) (interp ctx cd)
interp ctx (LetRec vs es e)
= let ctx' = zip vs (map (interp ctx') es) ++ ctx
in interp ctx' e
interp' :: Expr -> Reader Context Value
interp' (Atom x) = pure $ AtomV x
interp' (Var v) = asks (fromJust . lookup v)
interp' (Cons ca cd) = ConsV <$> interp' ca <*> interp' cd
interp' (LetRec vs es e)
= mdo let go = local (zip vs vals ++)
vals <- go $ traverse interp' es
go $ interp' e
car :: Value -> Value
car (ConsV ca _cd) = ca
cdr :: Value -> Value
cdr (ConsV _ca cd) = cd
main = do
let u = interp [] $ LetRec ["ones"] [Cons (Atom 1) (Var "ones")] (Var "ones")
let v = runReader (interp' $ LetRec ["ones"] [Cons (Atom 1) (Var "ones")] (Var "ones")) []
print $ car . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr $ u
print $ car . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr $ v
A binding construct in which the expressions can see the bindings doesn't require any exotic self-reference mechanisms.
How it works is that an environment is created for the variables, and then the values are assigned to them. The initializing expressions are evaluated in the environment in which those variables are already visible. Thus if those expressions happen to be lambda expressions, then they capture that environment, and that's how the functions can refer to each other.
An interpreter does this by extending the environment with the new variables, and then using the extended environment for evaluating the assignments. Similarly, a compiler extends the compile-time lexical environment, and then compiles the assignments under that environment, so the running code will store values into the correct frame locations. If you have working lexical closures, the correct behavior of functions being able to mutually recurse just pops out.
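For illustration, here is a hedged Haskell sketch of that bind-then-assign scheme, with IORef playing the role of the mutable environment slots (essentially the Data.IORef approach the question mentions; every name here is invented for this example):

import Data.IORef

data Expr  = Num Int | Var String | Lam String Expr | App Expr Expr
           | LetRec [(String, Expr)] Expr
data Value = NumV Int | Closure String Expr Env
type Env   = [(String, IORef Value)]

eval :: Env -> Expr -> IO Value
eval _   (Num n)   = pure (NumV n)
eval env (Var x)   = maybe (fail ("unbound: " ++ x)) readIORef (lookup x env)
eval env (Lam x b) = pure (Closure x b env)            -- the closure captures env
eval env (App f a) = do
  Closure x b cenv <- eval env f
  v   <- eval env a
  ref <- newIORef v
  eval ((x, ref) : cenv) b
eval env (LetRec binds body) = do
  -- 1. bind every name first, each to a placeholder slot
  slots <- mapM (const (newIORef (error "letrec: used before init"))) binds
  let env' = zip (map fst binds) slots ++ env
  -- 2. evaluate the init expressions in the extended environment,
  --    so any lambdas close over env' and can see one another
  vals <- mapM (eval env' . snd) binds
  -- 3. assign the computed values into the slots
  sequence_ (zipWith writeIORef slots vals)
  eval env' body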
Note that if the assignments are performed in left-to-right order, and one of the lambdas happens to be dispatched during initialization, and then happens to make a forward call to one of the lambdas through a not-yet-assigned variable, that will be a problem; e.g.
(letrec
([alpha (lambda () (omega))]
[beta (alpha)] ;; problem: alpha calls omega, not yet stored in variable.
[omega (lambda ())])
...)
Note that in the R7RS Scheme Report (pp. 16-17), letrec is in fact documented as working like this. All the variables are bound, and then they are assigned the values. If the evaluation of an init expression refers to the same variable that is being initialized, or to later variables not yet initialized, R7RS says that it is an error. The document also specifies a restriction regarding the use of continuations captured in the initializing expressions.
The U combinator
I am late to the party here, but I got interested and spent some time working out how to do this in a Lisp-family language, specifically Racket, and thought perhaps other people might be interested.
I suspect that there is lots of information about this out there, but it's seriously hard to search for anything which looks like '*-combinator' now (even now I am starting a set of companies called 'Integration by parts' and so on).
You can, as you say, do this with the Y combinator, but I didn't want to do that because Y is something I find I can understand for a few hours at a time and then I have to work it all out again. But it turns out that you can use something much simpler: the U combinator. It seems to be even harder to search for this than Y, but here is a quote about it:
In the theory of programming languages, the U combinator, U, is the mathematical function that applies its argument to its argument; that is U(f) = f(f), or equivalently, U = λ f . f(f).
Self-application permits the simulation of recursion in the λ-calculus, which means that the U combinator enables universal computation. (The U combinator is actually more primitive than the more well-known fixed-point Y combinator.)
The expression U(U), read U of U, is the smallest non-terminating program, [...].
(Text from here, which unfortunately is not a site all about the U combinator other than this quote.)
Prerequisites
All of the following code samples are in Racket. The macros are certainly Racket-specific. To make the macros work you will need syntax-parse via:
(require (for-syntax syntax/parse))
However note that my use of syntax-parse is naïve in the extreme: I'm really just an unfrozen CL caveman pretending to understand Racket's macro system.
Also note I have not ruthlessly turned everything into λ: there are lets in this code, use of multiple values including let-values, (define (f ...) ...) and so on.
Two versions of U
The first version of U is the obvious one:
(define (U f)
(f f))
But this will run into some problems with an applicative-order language, which Racket is by default. To avoid that we can make the assumption that (f f) is going to be a function, and wrap that form in another function to delay its evaluation until it's needed: this is the standard trick that you have to do for Y in an applicative-order language as well. I'm only going to use the applicative-order U when I have to, so I'll give it a different name:
(define (U/ao f)
(λ args (apply (f f) args)))
Note also that I'm allowing more than one argument rather than doing the pure-λ-calculus thing.
Using U to construct a recursive function
To do this we use a trick similar to the one used with Y: write a function which, given as an argument a function that handles the recursive cases, will return a recursive function. And obviously I'll use the Fibonacci function as the canonical recursive function.
So, consider this thing:
(define fibber
(λ (f)
(λ (n)
(if (<= n 2)
1
(+ ((U f) (- n 1))
((U f) (- n 2)))))))
This is a function which, given another function f such that (U f) computes smaller Fibonacci numbers, will return a function which computes the Fibonacci number for n.
In other words, U of this function is the Fibonacci function!
And we can test this:
> (define fibonacci (U fibber))
> (fibonacci 10)
55
So that's very nice.
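An aside for Haskell readers: the same self-application typechecks in Haskell once it is mediated by a recursive newtype. A hedged sketch with invented names; note that Haskell's laziness means the applicative-order wrapper (U/ao) is not needed:

newtype Self a = Self (Self a -> a)

-- U f = f f, with the newtype wrapping the self-application
u :: Self a -> a
u s@(Self f) = f s

fibber :: Self (Integer -> Integer)
fibber = Self $ \self ->
  let fib = u self
  in \n -> if n <= 2 then 1 else fib (n - 1) + fib (n - 2)

main :: IO ()
main = print (u fibber 10) -- 55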
Wrapping U in a macro
So, to hide all this the first thing to do is to remove the explicit calls to U in the recursion. We can lift them out of the inner function completely:
(define fibber/broken
(λ (f)
(let ([fib (U f)])
(λ (n)
(if (<= n 2)
1
(+ (fib (- n 1))
(fib (- n 2))))))))
Don't try to compute U of this: it will recurse endlessly because (U fibber/broken) -> (fibber/broken fibber/broken) and this involves computing (U fibber/broken), and we're doomed.
Instead we can use U/ao:
(define fibber
(λ (f)
(let ([fib (U/ao f)])
(λ (n)
(if (<= n 2)
1
(+ (fib (- n 1))
(fib (- n 2))))))))
And this is all fine ((U fibber) 10) is 55 (and terminates!).
And this is really all you need to be able to write the macro:
(define-syntax (with-recursive-binding stx)
(syntax-parse stx
[(_ (name:id value:expr) form ...+)
#'(let ([name (U (λ (f)
(let ([name (U/ao f)])
value)))])
form ...)]))
And this works fine:
(with-recursive-binding (fib (λ (n)
(if (<= n 2)
1
(+ (fib (- n 1))
(fib (- n 2))))))
(fib 10))
A caveat on bindings
One fairly obvious thing here is that there are two bindings constructed by this macro: the outer one, and an inner one of the same name. And these are not bound to the same function in the sense of eq?:
(with-recursive-binding (ts (λ (it)
(eq? ts it)))
(ts ts))
is #f. This matters only in a language where bindings can be mutated: a language with assignment in other words. Both the outer and inner bindings, unless they have been mutated, are to functions which are identical as functions: they compute the same values for all values of their arguments. In fact, it's hard to see what purpose eq? would serve in a language without assignment.
This caveat will apply below as well.
Two versions of U for many functions
The obvious generalization of U, U*, to many functions is that U*(f1, ..., fn) is the tuple (f1(f1, ..., fn), f2(f1, ..., fn), ...). And a nice way of expressing that in Racket is to use multiple values:
(define (U* . fs)
(apply values (map (λ (f)
(apply f fs))
fs)))
And we need the applicative-order one as well:
(define (U*/ao . fs)
(apply values (map (λ (f)
(λ args (apply (apply f fs) args)))
fs)))
Note that U* is a true generalization of U: (U f) and (U* f) are the same.
Using U* to construct mutually-recursive functions
I'll work with a trivial pair of functions:
an object is a numeric tree if it is a cons and its car and cdr are numeric objects;
an object is a numeric object if it is a number, or if it is a numeric tree.
So we can define 'maker' functions (with an '-er' convention: a function which makes an x is an xer, or, if x has hyphens in it, an x-er) which will make suitable functions:
(define numeric-tree-er
(λ (nter noer)
(λ (o)
(let-values ([(nt? no?) (U* nter noer)])
(and (cons? o)
(no? (car o))
(no? (cdr o)))))))
(define numeric-object-er
(λ (nter noer)
(λ (o)
(let-values ([(nt? no?) (U* nter noer)])
(cond
[(number? o) #t]
[(cons? o) (nt? o)]
[else #f])))))
Note that for both of these I've raised the call to U* a little, simply to make the call to the appropriate value of U* less opaque.
And this works:
(define-values (numeric-tree? numeric-object?)
(U* numeric-tree-er numeric-object-er))
And now:
> (numeric-tree? 1)
#f
> (numeric-object? 1)
#t
> (numeric-tree? '(1 . 2))
#t
> (numeric-tree? '(1 2 . (3 4)))
#f
Wrapping U* in a macro
The same problem as previously happens when we raise the inner call to U* with the same result: we need to use U*/ao. In addition the macro becomes significantly more hairy and I'm moderately surprised that I got it right so easily. It's not conceptually hard: it's just not obvious to me that the pattern-matching works.
(define-syntax (with-recursive-bindings stx)
(syntax-parse stx
[(_ ((name:id value:expr) ...) form ...+)
#:fail-when (check-duplicate-identifier (syntax->list #'(name ...)))
"duplicate variable name"
(with-syntax ([(argname ...) (generate-temporaries #'(name ...))])
#'(let-values
([(name ...) (U* (λ (argname ...)
(let-values ([(name ...)
(U*/ao argname ...)])
value)) ...)])
form ...))]))
And now, in a shower of sparks, we can write:
(with-recursive-bindings ((numeric-tree?
(λ (o)
(and (cons? o)
(numeric-object? (car o))
(numeric-object? (cdr o)))))
(numeric-object?
(λ (o)
(cond [(number? o) #t]
[(cons? o) (numeric-tree? o)]
[else #f]))))
(numeric-tree? '(1 2 3 (4 (5 . 6) . 7) . 8)))
and get #t.
As I said, I am sure there are well-known better ways to do this, but I thought this was interesting enough not to lose.
Related
I'm working on a Haskell lambda calculus interpreter. I have a method which reduces an expression to its normal form.
type Var = String
data Term =
Variable Var
| Lambda Var Term
| Apply Term Term
deriving Show
normal :: Term -> Term
normal (Variable index) = Variable index
normal (Lambda var body) = Lambda var (normal body)
normal (Apply left right) = case normal left of
Lambda var body -> normal (substitute var (normal right) body)
otherwise -> Apply (normal left) (normal right)
How can I save the steps taken into a collection?
My normal function outputs this:
\a. \b. a (a (a (a b)))
and my goal would be to get all the steps as:
[(\f. \x. f (f x)) (\f. \x. f (f x)),
\a. (\f. \x. f (f x)) ((\f. \x. f (f x)) a),
\a. \b. (\f. \x. f (f x)) a ((\f. \x. f (f x)) a b),
\a. \b. (\b. a (a b)) ((\f. \x. f (f x)) a b),
\a. \b. a (a ((\f. \x. f (f x)) a b)),
\a. \b. a (a ((\b. a (a b)) b)),
\a. \b. a (a (a (a b)))]
I've tried encapsulating the normal method into lists as follows:
normal :: Term -> [Term]
normal (Variable index) = Variable index
normal (Lambda var body) = term: [Lambda var (normal body)]
normal (Apply left right) = case normal left of
Lambda var body -> normal (substitute var (normal right) body)
otherwise -> Apply (normal left) (normal right)
But that doesn't seem to be the right approach.
I think you're putting the cart before the horse. normal repeatedly reduces a term until it cannot be reduced any more. Where, then, is the function that actually reduces a term once?
import Control.Applicative ((<|>)) -- for the fallback choice in the Apply case

reduce :: Term -> Maybe Term -- returns Nothing if no reduction
reduce (Variable _) = Nothing -- variables don't reduce
reduce (Lambda v b) = Lambda v <$> reduce b -- an abstraction reduces as its body does
reduce (Apply (Lambda v b) x) = Just $ substitute v x b -- actual meaning of reduction
reduce (Apply f x) = flip Apply x <$> reduce f <|> Apply f <$> reduce x -- try to reduce f, else try x
Then
normal :: Term -> [Term]
normal x = x : maybe [] normal (reduce x)
Or, to be slightly more accurate
import Data.List.NonEmpty as NE
normal :: Term -> NonEmpty Term
normal = NE.unfoldr $ (,) <*> reduce
Note that this definition of reduce also corrects a bug in your original normal. There are terms which have normal forms that your normal fails to evaluate. Consider the term
(\x y. y) ((\x. x x) (\x. x x)) -- (const id) omega
This normalizes to \y. y. Depending on how substitute is implemented, your normal might succeed or fail to normalize this term. If it succeeds, it will have been saved by laziness. A hypothetical "stepping" version of your normal, which normalizes arguments before substitution, would definitely fail to normalize this.
Refraining from reducing the argument before substitution guarantees you will find the normal form of any term, if a normal form exists. You can restore the eager behavior with
eagerReduce t@(Apply f@(Lambda v b) x) = Apply f <$> eagerReduce x <|> Just (substitute v x b)
-- other equations...
eagerNormal = NE.unfoldr $ (,) <*> eagerReduce
As promised, eagerNormal produces an infinite list on my example term and never finds the normal form.
You're on the right track, but there's just a lot more you need to do. Remember, if you change the type of your function from Term -> Term to Term -> [Term], then you need to make sure that for any input Term, the function produces an output [Term]. However, in your new implementation of normal, you only made the change for one case (and in doing so, you made up some new value called term — not sure why you did that).
So, let's think through the whole problem. What is the list of Term that should be produced by normal when the input is Variable index? Well, there's no work to do, so it should probably be a singleton list:
normal' (Variable index) = [Variable index]
How about for Lambda var body? You wrote term: [Lambda var (normal' body)], but this doesn't make any sense. Remember that normal' body is now producing a list, but Lambda can't take a list of terms as its body argument. And what is this term value you're trying to cons onto your singleton list?
Calling normal' body is a great idea. This produces a list of Terms, which represent partial normalizations of the body. But, we want to produce partial normalizations of the lambda, not just the body. So, we need to modify each element in the list, converting it from a body to a lambda:
normal' (Lambda var body) = Lambda var <$> normal' body
Hooray! (Note that since we're not doing any actual normalization in this step, we don't need to increase the length of the list.)
(For the sake of coding convenience, for the final case, we will construct the list of partial terms in reverse order. We can always reverse it later.)
The last case is the hardest, but it follows the same principles. We begin by recognizing that the recursive calls normal' left and normal' right will return lists of results rather than simply the final term:
normal' (Apply left right) =
let lefts = normal' left
rights = normal' right
One question this raises is: Which evaluation steps do we take first? Let's choose to evaluate left first. This means that all of the evaluation steps for left must be paired with the original right, and all of the evaluation steps for right must be paired with the most evaluated left.
normal' (Apply left right) =
let lefts = normal' left
rights = normal' right
lefts' = flip Apply right <$> lefts
rights' = Apply (head lefts) <$> init rights
evalSoFar = rights' ++ lefts'
Note the use of init rights at the end — since the last element of rights should be equal to right and we already have a value with the head of lefts and the last element of rights in lefts', we omit the last element of rights when building rights'.
From here, all we need to do is actually perform our substitution (assuming that head lefts is indeed a Lambda expression) and concatenate our evalSoFar list to what it produces:
normal' (Apply left right) =
let lefts = normal' left
rights = normal' right
lefts' = flip Apply right <$> lefts
rights' = Apply (head lefts) <$> init rights
evalSoFar = rights' ++ lefts'
in case (lefts, rights) of
(Lambda var body : _, right' : _) -> normal' (substitute var right' body) ++ evalSoFar
_ -> evalSoFar
Remember that this produces the list backwards, so we can define normal as follows:
normal :: Term -> [Term]
normal = reverse . normal'
It's hard for me to test this exactly considering you didn't provide a definition of substitute, but I'm pretty sure it should do what you want.
That said, I will note that the resulting evaluation steps I get from evaluating your sample term are not the same as those given in the question. Specifically, your second evaluation step, going from
\a. (\f. \x. f (f x)) ((\f. \x. f (f x)) a),
\a. \b. (\f. \x. f (f x)) a ((\f. \x. f (f x)) a b),
seems wrong based on your implementation. Note that here you're performing a substitution before fully evaluating the argument. If you run the code in this answer, you'll see that the result you get does not perform this step but rather fully evaluates the function argument (that is, ((\f. \x. f (f x)) a)) before substituting.
When I think "I want a computation to produce extra information as it progresses", I say: "Why should I implement this by hand, when Writer is right there?" Writer is the way to produce a pair when you're mostly only interested in one half of the pair, and the other half is some structure you just append to like a log.
You didn't provide substitute, and I didn't want to implement it, so I wrote my own simple language to reduce. This also leaves you the exercise of applying the same technique to your language.
import Control.Monad.Trans.Writer.Lazy (Writer, runWriter, tell, censor)
data Sum = Number Int
| Sum :+ Sum
instance Show Sum where
show (Number n) = show n
show (x :+ y) = "(" ++ show x ++ " + " ++ show y ++ ")"
The key Writer functions we'll need to implement this, in addition to the usual Applicative/Monad combinators, are tell and censor. tell is pretty boring: it just says "append this to the log". censor is interesting: it runs another Writer and returns that Writer's output, but it also modifies the log produced by that Writer.
The idea is to use tell to log the term every time we make a reduction, and to use censor to ensure that logged sub-terms are properly placed into the context of the overall computation.
normalize :: Sum -> Writer [Sum] Sum
normalize root = tell [root] *> go root
where go n@(Number _) = pure n
go ((Number left) :+ (Number right)) =
let sum = (Number (left + right))
in tell [sum] *> pure sum
go (left :+ right) = do
left' <- censor (map (:+ right)) (go left)
right' <- censor (map (left' :+)) (go right)
go $ left' :+ right'
To that end, note that our only tell (other than at the root, to show the original expression) is in the case where both operands are already reduced, so that we can do some simplification. In the non-reduced case, we simply normalize both operands, placing any logs produced into the broader context. Your language has different reduction steps, of course; as long as you make sure to log something exactly when you directly simplify something, you won't wind up with duplicate or missing steps. I think the only reduction you ever do is a substitution, so that should be simple enough.
Observe that this produces a desirable result:
main = print . runWriter . normalize $
(Number 1 :+ Number 2) :+ (Number 3 :+ (Number 4 :+ Number 5))
$ runghc tmp.hs
(15,[((1 + 2) + (3 + (4 + 5))),(3 + (3 + (4 + 5))),(3 + (3 + 9)),(3 + 12),15])
Here the returned pair contains the expected result (Number 15) on the left, and the set of reductions applied on the right, in the order they were done.
I am trying to write a function which, when passed:
variables VX = ["v1",...,"vn"]
and a Term, will replace all Terms within the passed Term with a variable from VX respectively.
My function works to a certain extent, for the example:
S ((\a. \x. (\y. a c) x b) (\f. \x. x) 0)
It returns:
S (V1 V1 0)
Rather than what it should return:
S (V1 V2 0)
Here is my function along with the tests. Can anyone spot a mistake I have made perhaps?
termToExpression :: [Var] -> Term -> Expression
termToExpression [] a = termToExpr a
termToExpression _ (TermVar y) = ExpressionVar y
termToExpression (x : _) (TermLambda a b) = ExpressionVar x
termToExpression (x : xs) (TermApp n m) = ExpressionApp (termToExpression (x : xs) n) (termToExpression (x : xs) m)
The issue is that
ExpressionApp (termToExpression (x : xs) n) (termToExpression (x : xs) m)
makes two recursive calls, and intuitively the first one should "consume" any number of variables to generate its output. After that, the second recursive call should not use the variables already "consumed" by the first one.
In a sense, there is some sort of state which is being modified by each call: the list of variables gets partially consumed.
To model that, you will need to first write an auxiliary recursive function which returns, together with the new lambda term, the list of not-yet-consumed variables.
aux :: [Var] -> Term -> (Expression, [Var])
Now, when you need to make two recursive calls to aux, you can make the first one, get the list of not-consumed variables from its result, and make the second recursive call using that list.
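For concreteness, here is a hedged sketch of that threading, reusing the constructors from your code (termToExpr is your existing helper, assumed unchanged):

aux :: [Var] -> Term -> (Expression, [Var])
aux []       t                = (termToExpr t, [])     -- out of fresh variables
aux vs       (TermVar y)      = (ExpressionVar y, vs)  -- consumes nothing
aux (v : vs) (TermLambda _ _) = (ExpressionVar v, vs)  -- consumes one variable
aux vs       (TermApp n m)    =
  let (n', vs')  = aux vs  n   -- the first call consumes some variables...
      (m', vs'') = aux vs' m   -- ...the second call only sees the leftovers
  in (ExpressionApp n' m', vs'')

termToExpression :: [Var] -> Term -> Expression
termToExpression vs t = fst (aux vs t)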
(A more advanced solution would be to use a State [Var] monad, but I guess you want to write a basic solution.)
In order to refresh my 20-year-old experience with Haskell I am walking through https://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Adding_Variables_and_Assignment and at one point the following line is introduced to apply op to all the params, in order to implement e.g. (+ 1 2 3 4):
numericBinop op params = mapM unpackNum params >>= return . Number . foldl1 op
I do not understand the syntax, and the explanation in the text is a bit vague.
I understand what foldl1 does and how to dot functions (unpackNum is a helper function), but using Monads and the >>= operator leaves me a bit confused.
How is this to be read?
Essentially,
mapM unpackNum params >>= return . Number . foldl1 op
is made of two parts.
mapM unpackNum params means: take the list of parameters params. On each item, apply unpackNum: this will produce an Integer wrapped inside the ThrowsError monad. So, it's not exactly a plain Integer, since it has a chance to error out. Anyway, using unpackNum on each item either successfully produces all Integers, or throws some error. In the first case, we produce a new list of type [Integer], in the second one we (unsurprisingly) throw the error. So, the resulting type for this part is ThrowsError [Integer].
The second part is ... >>= return . Number . foldl1 op. Here >>= means: if the first part threw some error, the whole expression throws that error as well. If the part succeeded in producing [Integer] then proceed with foldl1 op, wrap the result as a Number, and finally use return to inject this value as a successful computation.
Overall there are monadic computations, but you should not think about those too much. The monadic stuff here is only propagating the errors around, or storing plain values if the computation is successful. With a bit of experience, one can concentrate on the successful values only, and let mapM, >>=, and return handle the error cases.
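Concretely, ThrowsError in the book is a type synonym for Either LispError, so you can watch mapM propagate a failure with plain Either in GHCi (the "bad" message is just an illustration):

Prelude> mapM (\x -> if x > 0 then Right x else Left "bad") [1,2,3]
Right [1,2,3]
Prelude> mapM (\x -> if x > 0 then Right x else Left "bad") [1,-2,3]
Left "bad"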
By the way, note that while the book uses code like action >>= return . f, this is arguably a bad style. One can use, to the same effect, fmap f action or f <$> action, which is a more direct way to express the same computation. E.g.
Number . foldl1 op <$> mapM unpackNum params
which is very close to a non-monadic code which ignores the error cases
-- this would work if there was no monad around, and errors did not have to be handled
Number . foldl1 op $ map unpackNum params
Your question is about syntax, so I'll just talk about how to parse that expression. Haskell's syntax is pretty simple. Informally:
identifiers separated by spaces are function application (the first identifier applied to the rest)
except identifiers that use non-alphanumeric characters (e.g. >>=, or .) are infix (i.e. their first argument is to the left of the identifier)
the first type of function application above (non-infix) binds more tightly than the second
operators can associate either to the left or right, and have different precedence (defined with an infix... declaration)
So only knowing this, if I see:
mapM unpackNum params >>= return . Number . foldl1 op
To begin with I know that it must parse like
(mapM unpackNum params) >>= return . Number . (foldl1 op)
To go further we need to inspect the fixity/precedence of the two operators we see in this expression:
Prelude> :info (.)
(.) :: (b -> c) -> (a -> b) -> a -> c -- Defined in ‘GHC.Base’
infixr 9 .
Prelude> :info (>>=)
class Applicative m => Monad (m :: * -> *) where
(>>=) :: m a -> (a -> m b) -> m b
...
-- Defined in ‘GHC.Base’
infixl 1 >>=
(.) has a higher precedence (9 vs 1 for >>=), so its arguments will bind more tightly (i.e. we parenthesize them first). But how do we know which of these is correct?
(mapM unpackNum params) >>= ((return . Number) . (foldl1 op))
(mapM unpackNum params) >>= (return . (Number . (foldl1 op)))
...? Because (.) was declared infixr it associates to the right, meaning the second parse above is correct.
As Will Ness points out in the comments, (.) is associative (like e.g. addition) so both of these happen to be semantically equivalent.
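A quick GHCi check of that associativity claim:

Prelude> (((+1) . (*2)) . (+3)) 10
27
Prelude> ((+1) . ((*2) . (+3))) 10
27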
With a little experience with a library (or the Prelude in this case) you learn to parse expressions with operators correctly without thinking too much.
If after doing this exercise you want to understand what a function does or how it works, then you can click through to the source of the functions you're interested in and replace occurrences of left-hand sides with right-hand sides (i.e. inline the bodies of the functions and terms). Obviously you can do this in your head or in an editor.
You could "sugar this up" with a more beginner-friendly syntax, with the do notation. Your function, numericBinop op params = mapM unpackNum params >>= return . Number . foldl1 op would become:
numericBinop op params = do
x <- mapM unpackNum params -- "<-" translates to ">>=", the bind operator
return . Number $ foldl1 op x
Well now the most mysterious part is the mapM function, which is essentially sequence . fmap: it takes a function, fmaps it over the container, and flips the type, in this case from [ThrowsError Integer] to ThrowsError [Integer], while preserving any errors (side effects) that arise during the flipping; in other words, if the 'flipping' caused any error, it would be represented in the result.
Not the simplest example, and you probably would be better off seeing how mapM (fmap (+1)) [Just 2, Just 3] differs from mapM (fmap (+1)) [Just 2, Nothing]. For more insights look into Traversable typeclass.
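Concretely, those two examples give in GHCi:

Prelude> mapM (fmap (+1)) [Just 2, Just 3]
Just [3,4]
Prelude> mapM (fmap (+1)) [Just 2, Nothing]
Nothing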
After that, you bind the [Integer] out of the ThrowsError monad and feed it to the function that takes care of doing the computation on the list, resulting in a single Integer, which in turn you need to re-embed into the ThrowsError monad with the return function after you wrap it into a Number.
If you still have trouble understanding monads, I suggest you take a look at the still relevant LYAH chapter that will gently introduce you to monads
>>= builds a computation that may fail at either end: its left argument can be an empty monad, in which case it does not even happen, otherwise its result can be empty as well. It has type
(>>=) :: m a -> (a -> m b) -> m b
See, its arguments are: a value (or values) immersed in a monad, and a function that accepts a pure value and returns an immersed result. This operator is a monadic version of what is known as flatMap in Scala, for instance; in Haskell, its particular implementation for lists is known as concatMap. If you have a list l, then l >>= f works as follows: for each element of l, f is applied to this element and returns a list; and all those resultant lists are concatenated to produce the result.
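A quick GHCi illustration of that list behaviour:

Prelude> [1,2,3] >>= \x -> [x, x * 10]
[1,10,2,20,3,30]
Prelude> concatMap (\x -> [x, x * 10]) [1,2,3]
[1,10,2,20,3,30]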
Consider this code in Java:
try {
function1();
function2();
}
catch(Exception e) {
}
What happens when function2 is called? See, after the call to function1 the program is probably in a valid state, so function2() is an operator that transforms this current state into some different next one. But the call to function1() could result in an exception thrown, so the control would immediately transfer to the catch-block—this can be regarded as null state, so there's nothing to apply function2 to. In other words, we have the following possible control paths:
[S0] --- function1() --> [S1] --- function2() --> [S2]
[S0] --- function1() --> [] --> catch
(For simplicity, exceptions thrown from function2 are not considered in this diagram.)
So, either [S1] is a (non-empty) valid machine state, and function2 transforms it further to a (non-empty) valid [S2], or it is empty, and thus function2() is a no-op and never run. This can be summarized in pseudo-code as
S2 <- [S0] >>= function1 >>= function2
First, syntax. Whitespace is application, semantically:
f x = f $ x -- "call" f with an argument x
so your expression is actually
numericBinop op params = ((mapM unpackNum) params) >>= return . Number . (foldl1 op)
Next, the operators are built from non-alphanumerical characters, without any whitespace. Here, there's . and >>=. Running :i (.) and :i (>>=) at GHCi reveals their fixity specs are infixr 9 . and infixl 1 >>=. 9 is above 1 so . binds more tightly than >>=; thus
= ((mapM unpackNum) params) >>= (return . Number . (foldl1 op))
infixr 9 . means . associates to the right, thus, finally, it is
= ((mapM unpackNum) params) >>= (return . (Number . (foldl1 op)))
The (.) is defined as (f . g) x = f (g x), thus (f . (g . h)) x = f ((g . h) x) = f (g (h x)) = (f . g) (h x) = ((f . g) . h) x; by eta-reduction we have
(f . (g . h)) = ((f . g) . h)
thus (.) is associative, and so parenthesization is optional. We'll drop the explicit parens with the "whitespace" application from now on as well. Thus we have
numericBinop op params = (mapM unpackNum params) >>=
(\ x -> return (Number (foldl1 op x))) -- '\' is ASCII for 'λ' (lambda)
Monadic sequences are easier written with do, and the above is equivalent to
= do
{ x <- mapM unpackNum params -- { ; } are optional, IF all 'do'
; return (Number (foldl1 op x)) -- lines are indented at the same level
}
Next, mapM can be defined as
mapM f [] = return []
mapM f (x:xs) = do { x <- f x ;
xs <- mapM f xs ;
return (x : xs) }
and the Monad Laws demand that
do { r <- do { x ; y } ; foo r }    ===    do { x ; r <- y ; foo r }
(you can find an overview of do notation e.g. in this recent answer of mine); thus,
numericBinop op [a, b, ..., z] =
do {
a <- unpackNum a ;
b <- unpackNum b ;
...........
z <- unpackNum z ;
return (Number (foldl1 op [a, b, ..., z]))
}
(you may have noticed my use of x <- x bindings -- we can use the same variable name on both sides of <-, because monadic bindings are not recursive -- thus introducing shadowing.)
This is now clearer, hopefully.
But, I said "first, syntax". So now, the meaning of it. By same Monad Laws,
numericBinop op [a, b, ..., y, z] =
do {
xs <- do { a <- unpackNum a ;
b <- unpackNum b ;
...........
y <- unpackNum y ;
return [a, b, ..., y] } ;
z <- unpackNum z ;
return (Number (op (foldl1 op xs) z))
}
thus, we need only understand the sequencing of two "computations", c and d,
do { a <- c ; b <- d ; return (foo a b) }
=
c >>= (\ a ->
d >>= (\ b ->
return (foo a b) ))
for a particular monad involved, which is determined by the bind (>>=) operator's implementation for a given monad.
Monads are EDSLs for generalized function composition. The sequencing of computations involves not only the explicit expressions appearing in the do sequence, but also the implicit effects peculiar to the particular monad in question, performed in principled and consistent manner behind the scenes. Which is the whole point to having monads in the first place (well, one of the main points, at least).
Here the monad involved appears to concern itself with the possibility of failure, and early bail-outs in the event that failure indeed happens.
So, with the do code we write the essence of what we intend to happen, and the possibility of intermittent failure is automatically taken care of, for us, behind the scenes.
In other words, if one of unpackNum computations fails, so will the whole of the combined computation fail, without attempting any of the subsequent unpackNum sub-computations. But if all of them succeed, so will the combined computation.
I am able to understand the basics of point-free functions in Haskell:
addOne x = 1 + x
As we see x on both sides of the equation, we simplify it:
addOne = (+ 1)
Incredibly it turns out that functions where the same argument is used twice in different parts can be written point-free!
Let me take as a basic example the average function written as:
average xs = realToFrac (sum xs) / genericLength xs
It may seem impossible to simplify xs, but http://pointfree.io/ comes out with:
average = ap ((/) . realToFrac . sum) genericLength
That works.
As far as I understand, this states that average is the same as calling ap on two functions: the composition (/) . realToFrac . sum, and genericLength.
Unfortunately the ap function makes no sense whatsoever to me; the docs http://hackage.haskell.org/package/base-4.8.1.0/docs/Control-Monad.html#v:ap state:
ap :: Monad m => m (a -> b) -> m a -> m b
In many situations, the liftM operations can be replaced by uses of ap,
which promotes function application.
return f `ap` x1 `ap` ... `ap` xn
is equivalent to
liftMn f x1 x2 ... xn
But writing:
let average = liftM2 ((/) . realToFrac . sum) genericLength
does not work (it gives a very long type error message; ask and I'll include it), so I do not understand what the docs are trying to say.
How does the expression ap ((/) . realToFrac . sum) genericLength work? Could you explain ap in simpler terms than the docs?
Any lambda term can be rewritten to an equivalent term that uses just a set of suitable combinators and no lambda abstractions. This process is called abstraction elimination. During the process you want to remove lambda abstractions from the inside out. So at one step you have λx.M where M is already free of lambda abstractions, and you want to get rid of x.
If M is x, you replace λx.x with id (id is usually denoted by I in combinatory logic).
If M doesn't contain x, you replace the term with const M (const is usually denoted by K in combinatory logic).
If M is PQ, that is the term is λx.PQ, you want to "push" x inside both parts of the function application so that you can recursively process both parts. This is accomplished by using the S combinator defined as λfgx.(fx)(gx), that is, it takes two functions and passes x to both of them, and applies the results together. You can easily verify that λx.PQ is equivalent to S(λx.P)(λx.Q), and we can recursively process both subterms (a sketch of the full procedure follows below).
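To make the three rules concrete, here is a hedged sketch of the procedure as a toy bracket-abstraction pass (the Tm type and all names are invented for illustration):

data Tm = V String | Ap Tm Tm | L String Tm  -- lambda terms
        | S | K | I                          -- combinators
        deriving Show

-- eliminate lambda abstractions, innermost-first
elim :: Tm -> Tm
elim (Ap p q) = Ap (elim p) (elim q)
elim (L x m)  = abstr x (elim m)
elim t        = t

-- abstr x m builds a lambda-free term equivalent to \x. m
abstr :: String -> Tm -> Tm
abstr x (V y) | x == y = I                            -- \x.x  =>  I
abstr x (Ap p q) | occurs x (Ap p q)
                 = Ap (Ap S (abstr x p)) (abstr x q)  -- \x.PQ  =>  S (\x.P) (\x.Q)
abstr _ m        = Ap K m                             -- x not free  =>  K m

occurs :: String -> Tm -> Bool
occurs x (V y)    = x == y
occurs x (Ap p q) = occurs x p || occurs x q
occurs _ _        = False

main :: IO ()
main = print (elim (L "x" (Ap (V "f") (V "x"))))
-- Ap (Ap S (Ap K (V "f"))) I, i.e. \x. f x  =>  S (K f) I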
As described in the other answers, the S combinator is available in Haskell as ap (or <*>) specialized to the reader monad.
The appearance of the reader monad isn't accidental: solving the task of replacing λx.M with an equivalent function is basically lifting M :: a to the reader monad r -> a (actually the reader Applicative part is enough), where r is the type of x. If we revise the process above:
The only case that is actually connected with the reader monad is when M is x. Then we "lift" x to id, to get rid of the variable. The other cases below are just mechanical applications of lifting an expression to an applicative functor:
In the other case, λx.M where M doesn't contain x, it's just lifting M to the reader applicative, which is pure M. Indeed, for (->) r, pure is equivalent to const.
In the last case, <*> :: f (a -> b) -> f a -> f b is function application lifted to a monad/applicative. And this is exactly what we do: We lift both parts P and Q to the reader applicative and then use <*> to bind them together.
The process can be further improved by adding more combinators, which allows the resulting term to be shorter. Most often, combinators B and C are used, which in Haskell correspond to functions (.) and flip. And again, (.) is just fmap/<$> for the reader applicative. (I'm not aware of such a built-in function for expressing flip, but it'd be viewed as a specialization of f (a -> b) -> a -> f b for the reader applicative.)
Some time ago I wrote a short article about this: The Monad Reader Issue 17, The Reader Monad and Abstraction Elimination.
When the monad m is (->) a, as in your case, you can define ap as follows:
ap f g = \x -> f x (g x)
We can see that this indeed "works" in your pointfree example.
average = ap ((/) . realToFrac . sum) genericLength
average = \x -> ((/) . realToFrac . sum) x (genericLength x)
average = \x -> (/) (realToFrac (sum x)) (genericLength x)
average = \x -> realToFrac (sum x) / genericLength x
We can also derive ap from the general law
ap f g = do ff <- f ; gg <- g ; return (ff gg)
that is, desugaring the do-notation
ap f g = f >>= \ff -> g >>= \gg -> return (ff gg)
If we substitute the definitions of the monad methods
m >>= f = \x -> f (m x) x
return x = \_ -> x
we get the previous definition of ap back (for our specific monad (->) a). Indeed:
ap f g
= f >>= \ff -> g >>= \gg -> return (ff gg)
= f >>= \ff -> g >>= \gg -> \_ -> ff gg
= f >>= \ff -> g >>= \gg _ -> ff gg
= f >>= \ff -> \x -> (\gg _ -> ff gg) (g x) x
= f >>= \ff -> \x -> (\_ -> ff (g x)) x
= f >>= \ff -> \x -> ff (g x)
= f >>= \ff x -> ff (g x)
= \y -> (\ff x -> ff (g x)) (f y) y
= \y -> (\x -> f y (g x)) y
= \y -> f y (g y)
The Simple Bit: fixing liftM2
The problem in the original example is that ap works a bit differently from the liftM functions. ap takes a function wrapped up in a monad, and applies it to an argument wrapped up in a monad. But the liftMn functions take a "normal" function (one which is not wrapped up in a monad) and apply it to argument(s) that are wrapped up in monads.
I'll explain more about what that means below, but the upshot is that if you want to use liftM2, then you have to pull (/) out and make it a separate argument at the beginning. (So in this case (/) is the "normal" function.)
let average = liftM2 ((/) . realToFrac . sum) genericLength -- does not work
let average = liftM2 (/) (realToFrac . sum) genericLength -- works
As posted in the original question, calling liftM2 should involve three arguments: liftM2 f x1 x2. Here the f is (/), x1 is (realToFrac . sum) and x2 is genericLength.
The version posted in the question (the one which doesn't work) was trying to call liftM2 with only two arguments.
The explanation
I'll build this up in a few stages. I'll start with some specific values, and build up to a function that can take any set of values. Jump to the last section for the TL;DR.
In this example, let's assume the list of numbers is [1,2,3,4]. The sum of these numbers is 10, and the length of the list is 4. The average is 10/4 or 2.5.
To shoe-horn this into the right form for ap, we're going to break this into a function, an input, and a result.
ourFunction = (10/) -- "divide 10 by"
ourInput = 4
ourResult = 2.5
Three kinds of Function Application
ap and liftM both involve monads. At this point in the explanation, you can think of a monad as something that a value can be "wrapped up in". I'll give a better definition below.
Normal function application applies a normal function to a normal input. liftM applies a normal function to an input wrapped in a monad, and ap applies a function wrapped in a monad to an input wrapped in a monad.
(10/) 4 -- returns 2.5
liftM (10/) monad(4) -- returns monad(2.5)
ap monad(10/) monad(4) -- returns monad(2.5)
(Note that this is pseudocode. monad(4) is not actually valid Haskell).
(Note that liftM is a different function from liftM2, which was used earlier. liftM takes a function and only one argument, which is a better fit for the pattern I'm describing.)
In the average function defined above, the monads were functions, but "functions-as-monads" can be hard to talk about, so I'll start with simpler examples.
So what's a monad?
A better description of a monad is "something which contains a value, or produces a value, or which you can somehow extract a value from, but which also has something more complicated going on."
That's a really vague description, but it kind of has to be, because the "something more complicated" can be a lot of different things.
Monads can be confusing, but the point of them is that when you use monad operations (like ap and liftM) they will take care of the "something more complicated" for you, so you can just concentrate on the values.
That's probably still not very clear, so let's do some examples:
The Maybe monad
ap (Just (10/)) (Just 4) -- result is (Just 2.5)
One of the simplest monads is 'Maybe'. The value is whatever is contained inside a Just. So if we call ap and give it (Just ourFunction) and (Just ourInput) then we get back (Just ourResult).
The "something more complicated" is the fact that there might not be a value there at all, and you have to allow for the Nothing case.
As mentioned, the point of using a function like ap is that it takes care of these extra complications for us. With the Maybe monad, ap handles this by returning Nothing if either the Maybe-function or the Maybe-input were Nothing.
ap (Just (10/)) Nothing -- result is Nothing
ap Nothing (Just 4) -- result is Nothing
The List Monad
ap [(10/)] [4] -- result is [2.5]
With the list Monad, the value is whatever is inside the list. So ap [ourFunction] [ourInput] returns [ourResult].
The "something more complicated" is that there may be more than one thing inside the list (or exactly one thing, or nothing at all).
With lists, that means ap takes a list of zero or more functions, and a list of zero or more inputs. It handles that by returning a list of zero or more results: one result for every possible combination of function and input.
ap [(10/), (100/)] [5,4,2] -- result is [2.0, 2.5, 5.0, 20.0, 25.0, 50.0]
Functions as Monads
A function like genericLength is considered a Monad because it has a value (the function's output), and it has a "something more complicated" (the fact that you have to supply an input before you can get the value).
This is where it gets a little confusing, because we're dealing with multiple functions, multiple inputs, and multiple results. It is all well defined, it's just hard to describe, so we have to be careful with our terminology.
Let's start with the list [1,2,3,4], and call that our "original input". That's the list we're trying to find the average of. It's the xs argument in the original average function.
If we give our original input ([1,2,3,4]) to genericLength then we get a value of '4'.
Our other function is ((/) . realToFrac . sum). It takes our list [1,2,3,4] and finds the sum (10), turns that into a fractional value, and then feeds it as the first argument to (/). The result is an incomplete division function that is waiting for another argument. ie it takes [1,2,3,4] as an input, and produces (10/) as its output.
This all fits with the way ap is defined for functions. With functions, ap takes two things. The first is a function that reads the original input and produces a new function. The second is a function that reads the original input and produces a new input. The final result is a function that takes the original input, and returns the same thing you would get if you applied the new function to the new input.
You might have to read that a few times to make sense of it. Alternatively, here it is in pseudocode:
average =
ap
(functionThatTakes [1,2,3,4] and returns "(10/)" )
(functionThatTakes [1,2,3,4] and returns " 4 " )
-- which means:
average =
(functionThatTakes [1,2,3,4] and returns "2.5" )
If you compare this to the simpler examples above, you'll see that it still has our function (10/), our input 4 and our result 2.5. And each of them is once again wrapped up in the "something more complicated". In this case, the "something more complicated" is the "function that takes [1,2,3,4] and returns...".
Of course, since they're functions, they don't have to take [1,2,3,4] as their input. If they took a different list of integers (eg [1,2,3,4,5]) then we would get different results (e.g. new function: (15/), new input 5 and new value 3).
Other examples
minPlusMax = ap ((+) . minimum) maximum
-- a function that adds the minimum element of a list, to the maximum element
upperAndLower = ap ((,) . toUpper) toLower
-- a function that takes a Char and returns a tuple, with the upper case and lower case versions of a character
These could all also be defined using liftM2.
average = liftM2 (/) sum genericLength
minPlusMax = liftM2 (+) minimum maximum
upperAndLower = liftM2 (,) toUpper toLower
I am the kind of person who prefers learning by looking at code instead of reading long explanations. This might be one of the reasons I dislike long academic papers. Code is unambiguous, compact, noise-free and if you don't get something you can just play with it - no need to ask the author.
This is a complete definition of the Lambda Calculus:
-- A Lambda Calculus term is a function, an application or a variable.
data Term = Lam Term | App Term Term | Var Int deriving (Show,Eq,Ord)
-- Reduces lambda term to its normal form.
reduce :: Term -> Term
reduce (Var index) = Var index
reduce (Lam body) = Lam (reduce body)
reduce (App left right) = case reduce left of
Lam body -> reduce (substitute (reduce right) body)
otherwise -> App (reduce left) (reduce right)
-- Replaces bound variables of `target` by `term` and adjusts de Bruijn indices.
-- Don't mind those variables, they just keep track of the de Bruijn indices.
substitute :: Term -> Term -> Term
substitute term target = go term True 0 (-1) target where
go t s d w (App a b) = App (go t s d w a) (go t s d w b)
go t s d w (Lam a) = Lam (go t s (d+1) w a)
go t s d w (Var a) | s && a == d = go (Var 0) False (-1) d t
go t s d w (Var a) | otherwise = Var (a + (if a > d then w else 0))
-- If the evaluator is correct, this test should print the Church numeral 4.
main = do
let two = (Lam (Lam (App (Var 1) (App (Var 1) (Var 0)))))
print $ reduce (App two two)
In my opinion, the "reduce" function above says much more about the Lambda Calculus than pages of explanations, and I wish I could just have looked at it when I started learning. You can also see it implements a very strict evaluation strategy that goes even inside abstractions. In that spirit, how could that code be modified in order to illustrate the many different evaluation strategies that the LC can have (call-by-name, lazy evaluation, call-by-value, call-by-sharing, partial evaluation and so on)?
Call-by-name requires only a few changes:
Not evaluating the body of a lambda abstraction: reduce (Lam body) = (Lam body).
Not evaluating the argument of the application. Instead, we should substitute it as is:
reduce (App left right) = case reduce left of
Lam body -> reduce (substitute right body)
Call-by-need (aka lazy evaluation) seems harder (or maybe impossible) to implement in a fully declarative manner because we need to memoize values of expressions. I do not see a way to achieve it with minor changes.
Call-by-sharing is not applicable to the simple lambda calculus because we do not have objects and assignments here.
We could also use full beta reduction, but we would need to choose some deterministic order of evaluation (we cannot pick an "arbitrary" redex and reduce it using the code we have now). This choice will yield some evaluation strategy (possibly one of those described above).
The topic is quite broad. I'll just write about a few ideas.
The proposed reduce performs parallel rewriting. That is, it maps App t1 t2 to App t1' t2' (provided t1' is not an abstraction). Some strategies such as CBV and CBN are more sequential, in that they only have a single redex.
To describe them, I would modify reduce so that it returns whether a reduction was actually done, or if instead the term was a normal form. This can be done by returning a Maybe Term, where Nothing means normal form.
In that way, CBN would be
reduce :: Term -> Maybe Term
reduce (Var index) = Nothing -- Vars are NF
reduce (Lam body) = Nothing -- no reduction under Lam
reduce (App (Lam body) right) = Just $ substitute right body
reduce (App left right) =
(flip App right <$> reduce left) <|> -- try reducing left
(App left <$> reduce right) -- o.w., try reducing right
while CBV would be
reduce :: Term -> Maybe Term
reduce (Var index) = Nothing
reduce (Lam body) = Nothing -- no reduction under Lam
reduce (App (Lam body) right)
| reduce right == Nothing -- right must be a NF
= Just $ substitute right body
reduce (App left right) =
(flip App right <$> reduce left) <|>
(App left <$> reduce right)
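Either variant can then be driven to a normal form with a small loop, the same maybe-driven pattern as in the stepping answer earlier:

normalize :: Term -> Term
normalize t = maybe t normalize (reduce t)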
Lazy evaluation (with sharing) cannot be expressed using terms alone, if I remember correctly. It requires graphs to denote that a subterm is being shared.