Implementing lazy functional languages - Haskell

When implementing a lazy functional language, it is necessary to store values as unevaluated thunks, to be evaluated only when needed.
One of the challenges of an efficient implementation, as discussed in e.g. The Spineless Tagless G-machine, is that this evaluation must be carried out only once for each thunk, and subsequent accesses must reuse the calculated value; failing to do so would give at least a quadratic slowdown (perhaps exponential? I'm not sure off the top of my head).
I'm looking for a simple example implementation whose operation is easily understood (as opposed to an industrial-strength implementation like GHC which is designed for performance over simplicity). I came across minihaskell at http://www.andrej.com/plzoo/ which contains the following code.
As it is dubbed "an efficient interpreter", I would assume it does indeed carry out each evaluation only once and save the calculated value for reuse, but I'm having difficulty seeing where and how; I can only see one assignment statement in the interpreter itself, and that doesn't look like it's overwriting part of a thunk record.
So my question is, is this interpreter indeed doing such caching, and if so where and how? (And if not, what's the simplest extant implementation that does do so?)
Code from http://www.andrej.com/plzoo/html/minihaskell.html
let rec interp env = function
  | Var x ->
      (try
         let r = List.assoc x env in
           match !r with
               VClosure (env', e) -> let v = interp env' e in r := v ; v
             | v -> v
       with
         Not_found -> runtime_error ("Unknown variable " ^ x))
  ... snipping the easy stuff ...
  | Fun _ as e -> VClosure (env, e)
  | Apply (e1, e2) ->
      (match interp env e1 with
           VClosure (env', Fun (x, _, e)) ->
             interp ((x, ref (VClosure (env, e2))) :: env') e
         | _ -> runtime_error "Function expected in application")
  | Pair _ as e -> VClosure (env, e)
  | Fst e ->
      (match interp env e with
           VClosure (env', Pair (e1, e2)) -> interp env' e1
         | _ -> runtime_error "Pair expected in fst")
  | Snd e ->
      (match interp env e with
           VClosure (env', Pair (e1, e2)) -> interp env' e2
         | _ -> runtime_error "Pair expected in snd")
  | Rec (x, _, e) ->
      let rec env' = (x, ref (VClosure (env', e))) :: env in
        interp env' e
  | Nil ty -> VNil ty
  | Cons _ as e -> VClosure (env, e)
  | Match (e1, _, e2, x, y, e3) ->
      (match interp env e1 with
           VNil _ -> interp env e2
         | VClosure (env', Cons (d1, d2)) ->
             interp ((x, ref (VClosure (env', d1))) :: (y, ref (VClosure (env', d2))) :: env) e3
         | _ -> runtime_error "List expected in match")

The key is the reference cells: notice !r and r := v. Whenever we look up a variable in the environment, we actually get back a reference, which we dereference to see if it's a thunk. If it is a thunk, we evaluate it and then save the result. We create thunks during application (notice the calls to ref), recursive definitions and pattern matching, because those are the constructs that bind variables.
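In Haskell terms, the same idiom can be sketched with an IORef that holds either a suspended computation or its result; this is only a minimal illustration of the !r / r := v pattern, not part of minihaskell:
import Data.IORef

-- A cell that starts out holding a suspended computation and is
-- overwritten with the result the first time it is forced.
data Thunk a = Delayed (IO a) | Forced a

force :: IORef (Thunk a) -> IO a
force r = do
  t <- readIORef r
  case t of
    Forced v     -> return v               -- later accesses reuse the value
    Delayed comp -> do
      v <- comp                            -- first access: evaluate once
      writeIORef r (Forced v)              -- overwrite the thunk with the value
      return v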

Here are two call-by-need interpreters; one in Haskell, and one in Scheme. The key to both is that you can suspend evaluation inside procedures of no arguments (thunks). Whether your host language is call-by-need (Haskell) or call-by-value (Scheme, ML), lambda forms are considered values, so nothing under the lambda will be evaluated until the thunk is applied.
So, when an interpreted function is applied to an argument, you just wrap the unevaluated syntactic representation of the argument in a new thunk. Then, when you come across a variable, you look it up in the environment and promptly evaluate the thunk, giving you the value of the argument.
Simply getting to this point makes your interpreter lazy, since arguments are not actually evaluated until they're used; this is a call-by-name interpreter. As you point out, though, an efficient lazy language will evaluate these arguments only once; such a language is call-by-need. To get this efficiency, you update the environment to instead contain a thunk containing just the value of the argument, rather than the entire argument expression.
The first interpreter here is in Haskell, and is fairly similar to the ML code you pasted. Of course, the challenges in Haskell are to 1) not trivially implement laziness, thanks to Haskell's built-in laziness, and 2) wrangle the side-effects into the code. Haskell's IORefs are used to allow the environment to be updated.
module Interp where

import Data.IORef

data Expr = ExprBool Bool
          | ExprInt Integer
          | ExprVar String
          | ExprZeroP Expr
          | ExprSub1 Expr
          | ExprMult Expr Expr
          | ExprIf Expr Expr Expr
          | ExprLam String Expr
          | ExprApp Expr Expr
          deriving (Show)

data Val = ValBool Bool
         | ValInt Integer
         | ValClos ((() -> IO Val) -> IO Val)

instance Show Val where
  show (ValBool b) = show b
  show (ValInt n)  = show n
  show (ValClos c) = "Closure"

data Envr = EnvrEmpty
          | EnvrExt String (IORef (() -> IO Val)) Envr

applyEnv :: Envr -> String -> IO (IORef (() -> IO Val))
applyEnv EnvrEmpty y = error $ "unbound variable " ++ y
applyEnv (EnvrExt x v env) y =
  if x == y
    then return v
    else applyEnv env y

eval :: Expr -> Envr -> IO Val
eval exp env = case exp of
  (ExprBool b) -> return $ ValBool b
  (ExprInt n)  -> return $ ValInt n
  (ExprVar y)  -> do
    thRef <- applyEnv env y
    th <- readIORef thRef
    v <- th ()
    writeIORef thRef (\() -> return v)
    return v
  (ExprZeroP e) -> do
    (ValInt n) <- eval e env
    return $ ValBool (n == 0)
  (ExprSub1 e) -> do
    (ValInt n) <- eval e env
    return $ ValInt (n - 1)
  (ExprMult e1 e2) -> do
    (ValInt n1) <- eval e1 env
    (ValInt n2) <- eval e2 env
    return $ ValInt (n1 * n2)
  (ExprIf te ce ae) -> do
    (ValBool t) <- eval te env
    if t then eval ce env else eval ae env
  (ExprLam x body) ->
    return $ ValClos (\a -> do
      a' <- newIORef a
      eval body (EnvrExt x a' env))
  (ExprApp rator rand) -> do
    (ValClos c) <- eval rator env
    c (\() -> eval rand env)

-- "poor man's Y" factorial definition
fact = ExprApp f f
  where f = (ExprLam "f" (ExprLam "n" (ExprIf (ExprZeroP (ExprVar "n"))
                                              (ExprInt 1)
                                              (ExprMult (ExprVar "n")
                                                        (ExprApp (ExprApp (ExprVar "f")
                                                                          (ExprVar "f"))
                                                                 (ExprSub1 (ExprVar "n")))))))

-- test factorial 5 = 120
testFact5 = eval (ExprApp fact (ExprInt 5)) EnvrEmpty

-- Omega, the delightful infinite loop
omega = ExprApp (ExprLam "x" (ExprApp (ExprVar "x") (ExprVar "x")))
                (ExprLam "x" (ExprApp (ExprVar "x") (ExprVar "x")))

-- show that ((\y -> 5) omega) does not diverge, because the
-- interpreter is lazy
testOmega = eval (ExprApp (ExprLam "y" (ExprInt 5)) omega) EnvrEmpty
The second interpreter is in Scheme, where the only real boilerplate is Oleg's pattern-matching macro. I find that it's much easier to see where the laziness is coming from in the Scheme version. The box functions allow the environment to be updated; Chez Scheme includes them, but I've included definitions that should work for others.
(define box
  (lambda (x)
    (cons x '())))

(define unbox
  (lambda (b)
    (car b)))

(define set-box!
  (lambda (b v)
    (set-car! b v)))

;; Oleg Kiselyov's linear pattern matcher
(define-syntax pmatch
  (syntax-rules (else guard)
    ((_ (rator rand ...) cs ...)
     (let ((v (rator rand ...)))
       (pmatch v cs ...)))
    ((_ v) (errorf 'pmatch "failed: ~s" v))
    ((_ v (else e0 e ...)) (begin e0 e ...))
    ((_ v (pat (guard g ...) e0 e ...) cs ...)
     (let ((fk (lambda () (pmatch v cs ...))))
       (ppat v pat (if (and g ...) (begin e0 e ...) (fk)) (fk))))
    ((_ v (pat e0 e ...) cs ...)
     (let ((fk (lambda () (pmatch v cs ...))))
       (ppat v pat (begin e0 e ...) (fk))))))

(define-syntax ppat
  (syntax-rules (uscore quote unquote)
    ((_ v uscore kt kf)
     ;; _ can't be listed in literals list in R6RS Scheme
     (and (identifier? #'uscore) (free-identifier=? #'uscore #'_))
     kt)
    ((_ v () kt kf) (if (null? v) kt kf))
    ((_ v (quote lit) kt kf) (if (equal? v (quote lit)) kt kf))
    ((_ v (unquote var) kt kf) (let ((var v)) kt))
    ((_ v (x . y) kt kf)
     (if (pair? v)
         (let ((vx (car v)) (vy (cdr v)))
           (ppat vx x (ppat vy y kt kf) kf))
         kf))
    ((_ v lit kt kf) (if (equal? v (quote lit)) kt kf))))

(define empty-env
  (lambda ()
    `(empty-env)))

(define extend-env
  (lambda (x v env)
    `(extend-env ,x ,v ,env)))

(define apply-env
  (lambda (env y)
    (pmatch env
      [(extend-env ,x ,v ,env)
       (if (eq? x y)
           v
           (apply-env env y))])))

(define value-of
  (lambda (exp env)
    (pmatch exp
      [,b (guard (boolean? b)) b]
      [,n (guard (integer? n)) n]
      [,y (guard (symbol? y))
       (let* ([box (apply-env env y)]
              [th (unbox box)]
              [v (th)])
         (begin (set-box! box (lambda () v)) v))]
      [(zero? ,e) (zero? (value-of e env))]
      [(sub1 ,e) (sub1 (value-of e env))]
      [(* ,e1 ,e2) (* (value-of e1 env) (value-of e2 env))]
      [(if ,t ,c ,a) (if (value-of t env)
                         (value-of c env)
                         (value-of a env))]
      [(lambda (,x) ,body)
       (lambda (a) (value-of body (extend-env x a env)))]
      [(,rator ,rand) ((value-of rator env)
                       (box (lambda () (value-of rand env))))])))

;; "poor man's Y" factorial definition
(define fact
  (let ([f '(lambda (f)
              (lambda (n)
                (if (zero? n)
                    1
                    (* n ((f f) (sub1 n))))))])
    `(,f ,f)))

;; test factorial 5 = 120
(define testFact5
  (lambda ()
    (value-of `(,fact 5) (empty-env))))

;; Omega, the delightful infinite loop
(define omega
  '((lambda (x) (x x)) (lambda (x) (x x))))

;; show that ((lambda (y) 5) omega) does not diverge, because the interpreter
;; is lazy
(define testOmega
  (lambda ()
    (value-of `((lambda (y) 5) ,omega) (empty-env))))

You should have a look at graph reduction using combinators (SKI). It's beautiful and simple and illustrates how lazy evaluation works.
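For a taste of it, here is a minimal Haskell sketch of normal-order SKI reduction. It rewrites a tree rather than updating a shared graph, so it shows the reduction rules but not the sharing that real graph reduction adds on top:
data SKI = S | K | I | SKI :@ SKI deriving Show
infixl 9 :@

-- Reduce to weak head normal form by unwinding the application spine.
whnf :: SKI -> SKI
whnf t = go t []
  where
    go (f :@ a) args        = go f (a : args)
    go I (x : args)         = go x args
    go K (x : _ : args)     = go x args
    go S (f : g : x : args) = go f (x : (g :@ x) : args)
    go h args               = foldl (:@) h args   -- not enough arguments left

-- For example, whnf (S :@ K :@ K :@ I) gives I, since S K K x = K x (K x) = x.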

You might be interested in Alef (Alef Lazily Evaluates Functions), which is a very simple, pure, lazy functional programming language that I originally created specifically for explaining lazy evaluation via graph reduction. It is implemented in less than 500 lines of Common Lisp, including some neat visualization functions.
http://gergo.erdi.hu/blog/2013-02-17-write_yourself_a_haskell..._in_lisp/
Unfortunately, I haven't gotten around to finishing 'Typecheck Yourself a Haskell... in Lisp' yet, even though most of the code was already written around the time I posted part 1.

Related

Recursion scheme for symbolic differentiation

Following terminology from this excellent series, let's represent an expression such as (1 + x^2 - 3x)^3 by a Term Expr, where the data types are the following:
data Expr a =
    Var
  | Const Int
  | Plus a a
  | Mul a a
  | Pow a Int
  deriving (Functor, Show, Eq)

data Term f = In { out :: f (Term f) }
Is there a recursion scheme suitable for performing symbolic differentiation? I feel like it's almost a Futumorphism specialized to Term Expr, i.e. futu deriveFutu for an appropriate function deriveFutu:
data CoAttr f a
  = Automatic a
  | Manual (f (CoAttr f a))

futu :: Functor f => (a -> f (CoAttr f a)) -> a -> Term f
futu f = In <<< fmap worker <<< f where
  worker (Automatic a) = futu f a
  worker (Manual g)    = In (fmap worker g)
This looks pretty good, except that the underscored variables are Terms instead of CoAttrs:
deriveFutu :: Term Expr -> Expr (CoAttr Expr (Term Expr))
deriveFutu (In (Var))      = (Const 1)
deriveFutu (In (Const _))  = (Const 0)
deriveFutu (In (Plus x y)) = (Plus (Automatic x) (Automatic y))
deriveFutu (In (Mul x y))  = (Plus (Manual (Mul (Automatic x) (Manual _y)))
                                   (Manual (Mul (Manual _x) (Automatic y)))
                             )
deriveFutu (In (Pow x c))  = (Mul (Manual (Const c)) (Manual (Mul (Manual (Pow _x (c-1))) (Automatic x))))
The version without recursion schemes looks like this:
derive :: Term Expr -> Term Expr
derive (In (Var)) = In (Const 1)
derive (In (Const _)) = In (Const 0)
derive (In (Plus x y)) = In (Plus (derive x) (derive y))
derive (In (Mul x y)) = In (Plus (In (Mul (derive x) y)) (In (Mul x (derive y))))
derive (In (Pow x c)) = In (Mul (In (Const c)) (In (Mul (In (Pow x (c-1))) (derive x))))
As an extension to this question, is there a recursion scheme for differentiating and eliminating "empty" Exprs such as Plus (Const 0) x that arise as a result of differentiation -- in one pass over the data?
Look at the differentiation rule for product:
(u v)' = u' v + v' u
What do you need to know to differentiate a product? You need to know the derivatives of the subterms (u', v'), as well as their values (u, v).
This is exactly what a paramorphism gives you.
para
  :: Functor f
  => (f (b, Term f) -> b)
  -> Term f -> b
para g (In a) = g $ (para g &&& id) <$> a

derivePara :: Term Expr -> Term Expr
derivePara = para $ In . \case
  Var      -> Const 1
  Const _  -> Const 0
  Plus x y -> Plus (fst x) (fst y)
  Mul x y  -> Plus
    (In $ Mul (fst x) (snd y))
    (In $ Mul (snd x) (fst y))
  Pow x c  -> Mul
    (In (Const c))
    (In (Mul
      (In (Pow (snd x) (c-1)))
      (fst x)))
Inside the paramorphism, fst gives you access to the derivative of a subterm, while snd gives you the term itself.
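As a quick sanity check (a sketch reusing the definitions above), differentiating x * x gives the product-rule result:
sq :: Term Expr
sq = In (Mul (In Var) (In Var))

-- derivePara sq
--   ==> In (Plus (In (Mul (In (Const 1)) (In Var)))
--                (In (Mul (In Var) (In (Const 1)))))
-- i.e. 1*x + x*1, with fst supplying the Const 1 derivatives and snd the
-- original In Var subterms.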
As an extension to this question, is there a recursion scheme for differentiating and eliminating "empty" Exprs such as Plus (Const 0) x that arise as a result of differentiation -- in one pass over the data?
Yes, it's still a paramorphism. The easiest way to see this is to have smart constructors such as
plus :: Term Expr -> Term Expr -> Expr (Term Expr)
plus (In (Const 0)) (In x) = x
plus (In x) (In (Const 0)) = x
plus x y = Plus x y
and use them when defining the algebra. You could probably express this as some kind of para-cata fusion, too.
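For instance, a sketch of the derivative algebra with plus wired in; the name deriveSimplify is mine, and a matching mul smart constructor for the 0 and 1 cases would be handled analogously:
deriveSimplify :: Term Expr -> Term Expr
deriveSimplify = para $ In . \case
  Var      -> Const 1
  Const _  -> Const 0
  Plus x y -> plus (fst x) (fst y)
  Mul x y  -> plus (In $ Mul (fst x) (snd y))
                   (In $ Mul (snd x) (fst y))
  Pow x c  -> Mul (In (Const c))
                  (In (Mul (In (Pow (snd x) (c-1))) (fst x)))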

SAT solving with haskell SBV library: how to generate a predicate from a parsed string?

I want to parse a String that depicts a propositional formula and then find all models of the propositional formula with a SAT solver.
Now I can parse a propositional formula with the hatt package; see the testParse function below.
I can also run a SAT solver call with the SBV library; see the testSat function below.
Question:
How do I, at runtime, generate a value of type Predicate like myPredicate within the SBV library that represents the propositional formula I just parsed from a String? I only know how to manually type the forSome_ $ \x y z -> ... expression, but not how to write a converter function from an Expr value to a value of type Predicate.
-- cabal install sbv hatt
import Data.Logic.Propositional
import Data.SBV
-- Random test formula:
-- (x or ~z) and (y or ~z)
-- graphical depiction, see: https://www.wolframalpha.com/input/?i=%28x+or+~z%29+and+%28y+or+~z%29
testParse = parseExpr "test source" "((X | ~Z) & (Y | ~Z))"
myPredicate :: Predicate
myPredicate = forSome_ $ \x y z -> ((x :: SBool) ||| (bnot z)) &&& (y ||| (bnot z))
testSat = do
  x <- allSat $ myPredicate
  putStrLn $ show x

main = do
  putStrLn $ show $ testParse
  testSat
{-
Need a function that dynamically creates a Predicate
(as I did with the function (like "\x y z -> ..") for an arbitrary expression of type "Expr" that is parsed from String.
-}
Information that might be helpful:
Here is the link to the BitVectors.Data:
http://hackage.haskell.org/package/sbv-3.0/docs/src/Data-SBV-BitVectors-Data.html
Here is example code form Examples.Puzzles.PowerSet:
import Data.SBV

genPowerSet :: [SBool] -> SBool
genPowerSet = bAll isBool
  where isBool x = x .== true ||| x .== false

powerSet :: [Word8] -> IO ()
powerSet xs = do putStrLn $ "Finding all subsets of " ++ show xs
                 res <- allSat $ genPowerSet `fmap` mkExistVars n
Here is the Expr data type (from hatt library):
data Expr = Variable Var
          | Negation Expr
          | Conjunction Expr Expr
          | Disjunction Expr Expr
          | Conditional Expr Expr
          | Biconditional Expr Expr
          deriving Eq
Working With SBV
Working with SBV requires that you follow the types and realize that a Predicate is just a Symbolic SBool. After that step it is important that you investigate and discover that Symbolic is a monad - yay, a monad!
Now that you know you have a monad, anything in the haddock that is Symbolic should be trivial to combine to build any SAT query you desire. For your problem you just need a simple interpreter over your AST that builds a Predicate.
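For instance, here is a tiny hand-written Predicate, just as a sketch (exists takes the variable's name, as in the walk-through below):
tinyPredicate :: Predicate
tinyPredicate = do
  x <- exists "x"          -- declare existential symbolic booleans
  y <- exists "y"
  return (x &&& bnot y)    -- combine them with SBV's boolean operators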
Code Walk-Through
The code is all included in one continuous form below but I will step through the fun parts. The entry point is solveExpr which takes expressions and produces a SAT result:
solveExpr :: Expr -> IO AllSatResult
solveExpr e0 = allSat prd
The application of SBV's allSat to the predicate is sort of obvious. To build that predicate we need to declare an existential SBool for every variable in our expression. For now let's assume we have vs :: [String] where each string corresponds to one of the Vars from the expression.
prd :: Predicate
prd = do
  syms <- mapM exists vs
  let env = M.fromList (zip vs syms)
  interpret env e0
Notice how programming language fundamentals are sneaking in here. We now need an environment that maps the expression's variable names to the symbolic booleans used by SBV.
Next we interpret the expression to produce our Predicate. The interpret function uses the environment and just applies the SBV function that matches the intent of each constructor from hatt's Expr type.
interpret :: Env -> Expr -> Predicate
interpret env expr = do
  let interp = interpret env
  case expr of
    Variable v -> return (envLookup v env)
    Negation e -> bnot `fmap` interp e
    Conjunction e1 e2 ->
      do r1 <- interp e1
         r2 <- interp e2
         return (r1 &&& r2)
    Disjunction e1 e2 ->
      do r1 <- interp e1
         r2 <- interp e2
         return (r1 ||| r2)
    Conditional e1 e2 -> error "And so on"
    Biconditional e1 e2 -> error "And so on"
And that is it! The rest is just boiler-plate.
Complete Code
import Data.Logic.Propositional hiding (interpret)
import Data.SBV
import Text.Parsec.Error (ParseError)
import qualified Data.Map as M
import qualified Data.Set as Set
import Data.Foldable (foldMap)
import Control.Monad ((<=<))

testParse :: Either ParseError Expr
testParse = parseExpr "test source" "((X | ~Z) & (Y | ~Z))"

type Env = M.Map String SBool

envLookup :: Var -> Env -> SBool
envLookup (Var v) e = maybe (error $ "Var not found: " ++ show v) id
                            (M.lookup [v] e)

solveExpr :: Expr -> IO AllSatResult
solveExpr e0 = allSat go
  where
    vs :: [String]
    vs = map (\(Var c) -> [c]) (variables e0)

    go :: Predicate
    go = do
      syms <- mapM exists vs
      let env = M.fromList (zip vs syms)
      interpret env e0

interpret :: Env -> Expr -> Predicate
interpret env expr = do
  let interp = interpret env
  case expr of
    Variable v -> return (envLookup v env)
    Negation e -> bnot `fmap` interp e
    Conjunction e1 e2 ->
      do r1 <- interp e1
         r2 <- interp e2
         return (r1 &&& r2)
    Disjunction e1 e2 ->
      do r1 <- interp e1
         r2 <- interp e2
         return (r1 ||| r2)
    Conditional e1 e2 -> error "And so on"
    Biconditional e1 e2 -> error "And so on"

main :: IO ()
main = do
  let expr = testParse
  putStrLn $ "Solving expr: " ++ show expr
  either (error . show) (print <=< solveExpr) expr
forSome_ is a member of the Provable class, so it seems it would suffice to define the instance Provable Expr. Almost all functions in SBV use Provable, so this would allow you to use all of those natively with Expr. First, we convert an Expr to a function which looks up variable values in a Vector. You could also use Data.Map.Map or something like that, but the environment is not changed once created and Vector gives constant-time lookup:
import Data.Logic.Propositional
import Data.SBV.Bridge.CVC4
import qualified Data.Vector as V
import Control.Monad
toFunc :: Boolean a => Expr -> V.Vector a -> a
toFunc (Variable (Var x)) = \env -> env V.! (fromEnum x)
toFunc (Negation x) = \env -> bnot (toFunc x env)
toFunc (Conjunction a b) = \env -> toFunc a env &&& toFunc b env
toFunc (Disjunction a b) = \env -> toFunc a env ||| toFunc b env
toFunc (Conditional a b) = \env -> toFunc a env ==> toFunc b env
toFunc (Biconditional a b) = \env -> toFunc a env <=> toFunc b env
Provable essentially defines four functions: forAll_, forAll, forSome_, forSome. We have to generate all possible maps of variables to values and apply the function to the maps. Choosing how exactly to handle the results will be done by the Symbolic monad:
forAllExp_ :: Expr -> Symbolic SBool
forAllExp_ e = m0 >>= f . V.accum (const id) (V.replicate (fromEnum maxV + 1) false)
  where f    = return . toFunc e
        maxV = maximum $ map (\(Var x) -> x) (variables e)
        m0   = mapM fresh (variables e)
Where fresh is a function which "quantifies" the given variable by associating it with all possible values.
fresh :: Var -> Symbolic (Int, SBool)
fresh (Var var) = forall_ >>= \a -> return (fromEnum var, a)
If you define one of these functions for each of the four functions you will have quite a lot of very repetitive code. So you can generalize the above as follows:
quantExp :: (String -> Symbolic SBool) -> Symbolic SBool -> [String] -> Expr -> Symbolic SBool
quantExp q q_ s e = m0 >>= f . V.accum (const id) (V.replicate (fromEnum maxV + 1) false)
  where f        = return . toFunc e
        maxV     = maximum $ map (\(Var x) -> x) (variables e)
        (v0, v1) = splitAt (length s) (variables e)
        m0       = zipWithM fresh (map q s) v0 >>= \r0 ->
                   mapM (fresh q_) v1 >>= \r1 ->
                   return (r0 ++ r1)

fresh :: Symbolic SBool -> Var -> Symbolic (Int, SBool)
fresh q (Var var) = q >>= \a -> return (fromEnum var, a)
If it is confusing exactly what is happening, the Provable instance may suffice to explain:
instance Provable Expr where
  forAll_  = quantExp forall forall_ []
  forAll   = quantExp forall forall_
  forSome_ = quantExp exists exists_ []
  forSome  = quantExp exists exists_
Then your test case:
myPredicate :: Predicate
myPredicate = forSome_ $ \x y z -> ((x :: SBool) ||| (bnot z)) &&& (y ||| (bnot z))
myPredicate' :: Predicate
myPredicate' = forSome_ $ let Right a = parseExpr "test source" "((X | ~Z) & (Y | ~Z))" in a
testSat = allSat myPredicate >>= print
testSat' = allSat myPredicate' >>= print

Why doesn't this lambda calculus reducer reduce succ 0 to 1?

data Term = Var Integer
          | Apply Term Term
          | Lambda Term
          deriving (Eq, Show)

sub :: Term -> Integer -> Term -> Term
sub e v r = case e of
  Var x       -> if x == v then r else e
  Apply m1 m2 -> Apply (sub m1 v r) (sub m2 v r)
  Lambda t    -> Lambda (sub t (v + 1) r)

beta :: Term -> Term
beta t = case t of
  Apply (Lambda e) e' -> sub e 0 e'
  otherwise           -> t

eta :: Term -> Term
eta t = case t of
  Lambda (Apply f (Var 0)) -> f
  otherwise                -> t

reduce :: Term -> Term
reduce t = if t == t'
             then t
             else reduce t'
  where t' = beta . eta $ t
I tried:
let zero = Lambda $ Lambda $ Var 0
let succ = Lambda $ Lambda $ Lambda $ Apply (Var 1) $ (Apply (Apply (Var 2) (Var 1)) (Var 0))
reduce (Apply succ zero)
In GHCi, but it didn't seem to give me the expression for one (Lambda (Lambda (Apply (Var 1) (Var 0)))) that I'm looking for. Instead it gives me:
Lambda (Lambda (Apply (Var 1) (Apply (Apply (Lambda (Lambda (Var 0))) (Var 1)) (Var 0))))
The variables are encoded not by name, but by how many lambdas you need to walk outwards to get the parameter.
Your reducer, in common with the way lambda calculus is normally evaluated, doesn't reduce inside Lambda terms - it only removes top-level redexes. The result it produces should be equivalent to one, in that if you apply both to the same arguments you will get the same result, but not syntactically identical.
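If you do want the syntactically reduced form, you have to reduce under binders, and that needs proper de Bruijn shifting (the sub above never shifts r, so it is only safe when the substituted argument is closed). Here is a hedged sketch of a full normalizer over the same Term type:
-- shift the free variables at or above cutoff c by d
shift :: Integer -> Integer -> Term -> Term
shift d c (Var x)     = Var (if x >= c then x + d else x)
shift d c (Apply a b) = Apply (shift d c a) (shift d c b)
shift d c (Lambda b)  = Lambda (shift d (c + 1) b)

-- substitute r for variable v and remove the binder that bound v
subst :: Integer -> Term -> Term -> Term
subst v r (Var x)
  | x == v    = r
  | x > v     = Var (x - 1)        -- one enclosing binder has disappeared
  | otherwise = Var x
subst v r (Apply a b) = Apply (subst v r a) (subst v r b)
subst v r (Lambda b)  = Lambda (subst (v + 1) (shift 1 0 r) b)

-- normal-order reduction to full normal form, including under lambdas
normalize :: Term -> Term
normalize (Apply f a) = case normalize f of
  Lambda b -> normalize (subst 0 a b)
  f'       -> Apply f' (normalize a)
normalize (Lambda b)  = Lambda (normalize b)
normalize t           = t

-- With the question's succ and zero:
-- normalize (Apply succ zero) ==> Lambda (Lambda (Apply (Var 1) (Var 0)))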

Conversion from lambda term to combinatorial term

Suppose there are some data types to express lambda and combinatorial terms:
data Lam α = Var α               -- v
           | Abs α (Lam α)       -- λv . e1
           | App (Lam α) (Lam α) -- e1 e2
           deriving (Eq, Show)

infixl 0 :#

data SKI α = V α            -- x
           | SKI α :# SKI α -- e1 e2
           | I              -- I
           | K              -- K
           | S              -- S
           deriving (Eq, Show)
There is also a function to get a list of lambda term's free variables:
fv ∷ Eq α ⇒ Lam α → [α]
fv (Var v) = [v]
fv (Abs x e) = filter (/= x) $ fv e
fv (App e1 e2) = fv e1 ++ fv e2
To convert lambda term to combinatorial term abstract elimination rules could be usefull:
convert ∷ Eq α ⇒ Lam α → SKI α
-- 1) T[x] => x
convert (Var x) = V x
-- 2) T[(E₁ E₂)] => (T[E₁] T[E₂])
convert (App e1 e2) = (convert e1) :# (convert e2)
-- 3) T[λx.E] => (K T[E]) (if x does not occur free in E)
convert (Abs x e) | x `notElem` fv e = K :# (convert e)
-- 4) T[λx.x] => I
convert (Abs x (Var y)) = if x == y then I else K :# V y
-- 5) T[λx.λy.E] => T[λx.T[λy.E]] (if x occurs free in E)
convert (Abs x (Abs y e)) | x `elem` fv e = convert (Abs x (convert (Abs y e)))
-- 6) T[λx.(E₁ E₂)] => (S T[λx.E₁] T[λx.E₂])
convert (Abs x (App y z)) = S :# (convert (Abs x y)) :# (convert (Abs x z))
convert _ = error ":["
This definition is not valid because of 5):
Couldn't match expected type `Lam α' with actual type `SKI α'
In the return type of a call of `convert'
In the second argument of `Abs', namely `(convert (Abs y e))'
In the first argument of `convert', namely
`(Abs x (convert (Abs y e)))'
So, what I have now is:
> convert $ Abs "x" $ Abs "y" $ App (Var "y") (Var "x")
*** Exception: :[
What I want is (hope I calculate it right):
> convert $ Abs "x" $ Abs "y" $ App (Var "y") (Var "x")
S :# (S (KS) (S (KK) I)) (S (KK) I)
Question:
If lambda term and combinatorial term have a different types of expression, how 5) could be formulated right?
Let's consider the equation T[λx.λy.E] => T[λx.T[λy.E]].
We know the result of T[λy.E] is an SKI expression. Since it has been produced by one of the cases 3, 4 or 6, it is either I or an application (:#).
Thus the outer T in T[λx.T[λy.E]] must be one of the cases 3 or 6. You can perform this case analysis in the code. I'm sorry but I don't have the time to write it out.
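Sketching what that case analysis could look like, using hypothetical helpers fvSKI and absSKI (bracket abstraction performed directly on SKI terms):
fvSKI :: Eq α => SKI α -> [α]
fvSKI (V x)    = [x]
fvSKI (a :# b) = fvSKI a ++ fvSKI b
fvSKI _        = []

-- absSKI x e builds T[λx.e] for an e that is already an SKI term
absSKI :: Eq α => α -> SKI α -> SKI α
absSKI x e | x `notElem` fvSKI e = K :# e                        -- rule 3
absSKI x (V y) | x == y          = I                             -- rule 4
absSKI x (a :# b)                = S :# absSKI x a :# absSKI x b -- rule 6
absSKI _ e                       = K :# e                        -- unreachable: constants hit rule 3
With that, rule 5 would become convert (Abs x (Abs y e)) | x `elem` fv e = absSKI x (convert (Abs y e)).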
Here it's better to have a common data type for combinators and lambda expressions. Notice that your types already have significant overlap (Var, App), and it doesn't hurt to have combinators in lambda expressions.
The only possibility we want to eliminate is having lambda abstractions in combinator terms. We can forbid them using indexed types.
In the following code the type of a term is parameterised by the number of nested lambda abstractions in that term. The convert function returns Term Z a, where Z means zero, so there are no lambda abstractions in the returned term.
For more information about singleton types (which are used a bit here), see the paper Dependently Typed Programming with Singletons.
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, GADTs, TypeOperators,
             ScopedTypeVariables, MultiParamTypeClasses, FlexibleInstances #-}

data Nat = Z | Inc Nat

data SNat :: Nat -> * where
  SZ :: SNat Z
  SInc :: NatSingleton n => SNat n -> SNat (Inc n)

class NatSingleton (a :: Nat) where
  sing :: SNat a

instance NatSingleton Z where sing = SZ
instance NatSingleton a => NatSingleton (Inc a) where sing = SInc sing

type family Max (a :: Nat) (b :: Nat) :: Nat
type instance Max Z a = a
type instance Max a Z = a
type instance Max (Inc a) (Inc b) = Inc (Max a b)

data Term (l :: Nat) a where
  Var :: a -> Term Z a
  Abs :: NatSingleton l => a -> Term l a -> Term (Inc l) a
  App :: (NatSingleton l1, NatSingleton l2)
      => Term l1 a -> Term l2 a -> Term (Max l1 l2) a
  I :: Term Z a
  K :: Term Z a
  S :: Term Z a

fv :: Eq a => Term l a -> [a]
fv (Var v)     = [v]
fv (Abs x e)   = filter (/= x) $ fv e
fv (App e1 e2) = fv e1 ++ fv e2
fv _           = []

eliminateLambda :: (Eq a, NatSingleton l) => Term (Inc l) a -> Term l a
eliminateLambda t =
  case t of
    Abs x t ->
      case t of
        Var y
          | y == x    -> I
          | otherwise -> App K (Var y)
        Abs {}  -> Abs x $ eliminateLambda t
        App a b -> S `App` (eliminateLambda $ Abs x a)
                     `App` (eliminateLambda $ Abs x b)
    App a b -> eliminateLambdaApp a b

eliminateLambdaApp
  :: forall a l1 l2 l .
     (Eq a, Max l1 l2 ~ Inc l,
      NatSingleton l1,
      NatSingleton l2)
  => Term l1 a -> Term l2 a -> Term l a
eliminateLambdaApp a b =
  case (sing :: SNat l1, sing :: SNat l2) of
    (SInc _, SZ    ) -> App (eliminateLambda a) b
    (SZ    , SInc _) -> App a (eliminateLambda b)
    (SInc _, SInc _) -> App (eliminateLambda a) (eliminateLambda b)

convert :: forall a l . Eq a => NatSingleton l => Term l a -> Term Z a
convert t =
  case sing :: SNat l of
    SZ     -> t
    SInc _ -> convert $ eliminateLambda t
The key insight is that S, K and I are just constant Lam terms, in the same way that 1, 2 and 3 are constant Ints. It would be pretty easy to make rule 5 type-check by making an inverse to the 'convert' function:
nvert :: SKI a -> Lam a
nvert S = Abs "x" (Abs "y" (Abs "z" (App (App (Var "x") (Var "z")) (App (Var "y") (Var "z")))))
nvert K = Abs "x" (Abs "y" (Var "x"))
nvert I = Abs "x" (Var "x")
nvert (V x) = Var x
nvert (x :# y) = App (nvert x) (nvert y)
Now we can use 'nvert' to make rule 5 type-check:
convert (Abs x (Abs y e)) | x `elem` fv e = convert (Abs x (nvert (convert (Abs y e))))
We can see that the left and the right are identical (we'll ignore the guard), except that 'Abs y e' on the left is replaced by 'nvert (convert (Abs y e))' on the right. Since 'convert' and 'nvert' are each others' inverse, we can always replace any Lam 'x' with 'nvert (convert x)' and likewise we can always replace any SKI 'x' with 'convert (nvert x)', so this is a valid equation.
Unfortunately, while it's a valid equation it's not a useful function definition because it won't cause the computation to progress: we'll just convert 'Abs y e' back and forth forever!
To break this loop we can replace the call to 'nvert' with a 'reminder' that we should do it later. We do this by adding a new constructor to Lam:
data Lam a = Var a               -- v
           | Abs a (Lam a)       -- \v . e1
           | App (Lam a) (Lam a) -- e1 e2
           | Com (SKI a)         -- Reminder to COMe back later and nvert
           deriving (Eq, Show)
Now rule 5 uses this reminder instead of 'nvert':
convert (Abs x (Abs y e)) | x `elem` fv e = convert (Abs x (Com (convert (Abs y e))))
Now we need to make good our promise to come back, by making a separate rule to replace reminders with actual calls to 'nvert', like this:
convert (Com c) = convert (nvert c)
Now we can finally break the loop: we know that 'convert (nvert c)' is always identical to 'c', so we can replace the above line with this:
convert (Com c) = c
Notice that our final definition of 'convert' doesn't actually use 'nvert' at all! It's still a handy function though, since other functions involving Lam can use it to handle the new 'Com' case.
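For example (a sketch), the free-variable function gains a Com case and can just delegate to nvert:
fv :: Eq a => Lam a -> [a]
fv (Var v)     = [v]
fv (Abs x e)   = filter (/= x) $ fv e
fv (App e1 e2) = fv e1 ++ fv e2
fv (Com c)     = fv (nvert c)   -- free variables of the wrapped combinator term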
You've probably noticed that I've actually named this constructor 'Com' because it's just a wrapped-up COMbinator, but I thought it would be more informative to take a slightly longer route than just saying "wrap up your SKIs in Lams" :)
If you're wondering why I called that function "nvert", see http://unapologetic.wordpress.com/2007/05/31/duality-terminology/ :)
Warbo is right: combinators are constant lambda terms, so the conversion function is
T[ ] : L -> C, with L the set of lambda terms, C the set of combinatory terms, and C ⊂ L.
So there is no typing problem for the rule T[λx.λy.E] => T[λx.T[λy.E]].
Here is an implementation in Scala.

Mutually recursive evaluator in Haskell

Update: I've added an answer that describes my final solution (hint: the single Expr data type wasn't sufficient).
I'm writing an evaluator for a little expression language, but I'm stuck on the LetRec construct.
This is the language:
type Var = String
type Binds = [(Var, Expr)]

data Expr
  = Var Var
  | Lam Var Expr
  | App Expr Expr
  | Con Int
  | Sub Expr Expr
  | If Expr Expr Expr
  | Let Var Expr Expr
  | LetRec Binds Expr
  deriving (Show, Eq)
And this is the evaluator so far:
data Value
  = ValInt Int
  | ValFun Env Var Expr
  deriving (Show, Eq)

type Env = [(Var, Value)]

eval :: Env -> Expr -> Either String Value
eval env (Var x) = maybe (throwError $ x ++ " not found")
                         return
                         (lookup x env)
eval env (Lam x e) = return $ ValFun env x e
eval env (App e1 e2) = do
  v1 <- eval env e1
  v2 <- eval env e2
  case v1 of
    ValFun env1 x e -> eval ((x, v2):env1) e
    _ -> throwError "First arg to App not a function"
eval _ (Con x) = return $ ValInt x
eval env (Sub e1 e2) = do
  v1 <- eval env e1
  v2 <- eval env e2
  case (v1, v2) of
    (ValInt x, ValInt y) -> return $ ValInt (x - y)
    _ -> throwError "Both args to Sub must be ints"
eval env (If p t f) = do
  v1 <- eval env p
  case v1 of
    ValInt x -> if x /= 0
                  then eval env t
                  else eval env f
    _ -> throwError "First arg of If must be an int"
eval env (Let x e1 e2) = do
  v1 <- eval env e1
  eval ((x, v1):env) e2
eval env (LetRec bs e) = do
    env' <- evalBinds
    eval env' e
  where
    evalBinds = mfix $ \env' -> do
      env'' <- mapM (\(x, e') -> eval env' e' >>= \v -> return (x, v)) bs
      return $ nub (env'' ++ env)
This is my test function I want to evaluate:
test3 :: Expr
test3 = LetRec [ ("even", Lam "x" (If (Var "x")
                                      (Var "odd" `App` (Var "x" `Sub` Con 1))
                                      (Con 1)
                                  ))
               , ("odd", Lam "x" (If (Var "x")
                                     (Var "even" `App` (Var "x" `Sub` Con 1))
                                     (Con 0)
                                 ))
               ]
               (Var "even" `App` Con 5)
EDIT:
Based on Travis' answer and Luke's comment, I've updated my code to use the MonadFix instance for the Error monad. The previous example works fine now! However, the example below doesn't work correctly:
test4 :: Expr
test4 = LetRec [ ("x", Con 3)
               , ("y", Var "x")
               ]
               (Con 0)
When evaluating this, the evaluator loops, and nothing happens. I'm guessing I've made something a bit too strict here, but I'm not sure what it is. Am I violating one of the MonadFix laws?
When Haskell throws a fit, that's usually an indication that you have not thought clearly about a core issue of your problem. In this case, the question is: which evaluation model do you want to use for your language? Call-by-value or call-by-need?
Your representation of environments as [(Var,Value)] suggests that you want to use call-by-value, since every Expr is evaluated to a Value right away before storing it in the environment. But letrec does not go well with that, and your second example shows!
Furthermore, note that the evaluation model of the host language (Haskell) will interfere with the evaluation model of the language you want to implement; in fact, that's what you are currently making use of for your examples: despite their purpose, your Values are not evaluated to weak head normal form.
Unless you have a clear picture of the evaluation model of your little expression language, you won't make much progress on letrec or on the error checking facilities.
Edit:
For an example specification of letrec in a call-by-value language, have a look at the Ocaml Manual. On the simplest level, they only allow right-hand sides that are lambda expressions, i.e. things that are syntactically known to be values.
Maybe I'm missing something, but doesn't the following work?
eval env (LetRec bs ex) = eval env' ex
  where
    env' = env ++ map (\(v, e) -> (v, eval env' e)) bs
For your updated version: What about the following approach? It works as desired on your test case, and doesn't throw away errors in LetRec expressions:
data Value
  = ValInt Int
  | ValFun EnvWithError Var Expr
  deriving (Show, Eq)

type Env = [(Var, Value)]
type EnvWithError = [(Var, Either String Value)]

eval :: Env -> Expr -> Either String Value
eval = eval' . map (second Right)
  where
    eval' :: EnvWithError -> Expr -> Either String Value
    eval' env (Var x) = maybe (throwError $ x ++ " not found")
                              (join . return)
                              (lookup x env)
    eval' env (Lam x e) = return $ ValFun env x e
    eval' env (App e1 e2) = do
      v1 <- eval' env e1
      v2 <- eval' env e2
      case v1 of
        ValFun env1 x e -> eval' ((x, Right v2):env1) e
        _ -> throwError "First arg to App not a function"
    eval' _ (Con x) = return $ ValInt x
    eval' env (Sub e1 e2) = do
      v1 <- eval' env e1
      v2 <- eval' env e2
      case (v1, v2) of
        (ValInt x, ValInt y) -> return $ ValInt (x - y)
        _ -> throwError "Both args to Sub must be ints"
    eval' env (If p t f) = do
      v1 <- eval' env p
      case v1 of
        ValInt x -> if x /= 0
                      then eval' env t
                      else eval' env f
        _ -> throwError "First arg of If must be an int"
    eval' env (Let x e1 e2) = do
      v1 <- eval' env e1
      eval' ((x, Right v1):env) e2
    eval' env (LetRec bs ex) = eval' env' ex
      where
        env' = env ++ map (\(v, e) -> (v, eval' env' e)) bs
Answering my own question; I wanted to share the final solution I came up with.
As Heinrich correctly pointed out, I didn't really think through the impact the evaluation order has.
In a strict (call-by-value) language, an expression that is already a value (weak head normal form) is different from an expression that still needs some evaluation. Once I encoded this distinction in my data type, everything fell into place:
type Var = String
type Binds = [(Var, Val)]

data Val
  = Con Int
  | Lam Var Expr
  deriving (Show, Eq)

data Expr
  = Val Val
  | Var Var
  | App Expr Expr
  | Sub Expr Expr
  | If Expr Expr Expr
  | Let Var Expr Expr
  | LetRec Binds Expr
  deriving (Show, Eq)
The only difference from my original Expr data type is that I pulled two constructors (Con and Lam) out into their own data type, Val. The Expr data type has a new constructor, Val, which represents the fact that a value is also a valid expression.
With values in their own data type, they can be handled separately from other expressions; for example, letrec bindings can only contain values, not arbitrary expressions.
This distinction is also made in other strict languages like C, where only functions and constants can be defined in global scope.
See the complete code for the updated evaluator function.
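As a rough sketch of where this leads (not the linked code, and assuming the Value and Env types from the question stay as they were), the LetRec case no longer needs mfix: the right-hand sides are syntactic values, so building the recursive environment requires no evaluation and cannot diverge or fail.
-- turn a syntactic value into a runtime value; this can never fail
evalVal :: Env -> Val -> Value
evalVal _   (Con n)   = ValInt n
evalVal env (Lam x e) = ValFun env x e

-- bindings are values, so the environment can be tied recursively
evalLetRec :: Env -> Binds -> Expr -> Either String Value
evalLetRec env bs e = eval env' e
  where env' = [ (x, evalVal env' v) | (x, v) <- bs ] ++ env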
