Reading through Why It’s Nice to be Quoted, in section 3 there's an example of splicing a variable identifier in a quasiquote.
subst [:lam | $exp:e1 $exp:e2 |] x y =
let e1' = subst e1 x y
e2' = subst e2 x y
in
[:lam | $exp:e1' $exp:e2' |]
I see why the recursive calls to subst are done outside the [:lam| ... |], it's because the function antiVarE in section 3.2 builds a TH.varE out of the variable name.
My question is how much work would be required to support arbitrary expression splices beyond just a variable name?
For example:
subst [:lam | $exp:e1 $exp:e2 |] x y =
[:lam | $exp:(subst e1 x y) $exp:(subst e2 x y) |]
Answering my own question for posterity.
Turns out it was quite simple. Using the parseExp function in haskell-src-meta package I was able to easily convert a string to AST fragment.
In the original paper, aside from the parser changes required to capture an expression string between parentheses, the antiExpE function could be rewritten as such.
antiExpE :: Exp -> Maybe TH.ExpQ
antiExpE (AE v) =
case parseExp v of
Right exp -> Just . return $ exp
Left _ -> Nothing
antiExpE = Nothing
Related
The do notation allows us to express monadic code without overwhelming nestings, so that
main = getLine >>= \ a ->
getLine >>= \ b ->
putStrLn (a ++ b)
can be expressed as
main = do
a <- getLine
b <- getLine
putStrLn (a ++ b)
Suppose, though, the syntax allows ... #expression ... to stand for do { x <- expression; return (... x ...) }. For example, foo = f a #(b 1) c would be desugared as: foo = do { x <- b 1; return (f a x c) }. The code above could, then, be expressed as:
main = let a = #getLine in
let b = #getLine in
putStrLn (a ++ b)
Which would be desugared as:
main = do
x <- getLine
let a = x in
return (do
x' <- getLine
let b = x' in
return (putStrLn (a ++ b)))
That is equivalent. This syntax is appealing to me because it seems to offer the same functionality as the do-notation, while also allowing some shorter expressions such as:
main = putStrLn (#(getLine) ++ #(getLine))
So, I wonder if there is anything defective with this proposed syntax, or if it is indeed complete and equivalent to the do-notation.
putStrLn is already String -> IO (), so your desugaring ... return (... return (putStrLn (a ++ b))) ends up having type IO (IO (IO ())), which is likely not what you wanted: running this program won't print anything!
Speaking more generally, your notation can't express any do-block which doesn't end in return. [See Derek Elkins' comment.]
I don't believe your notation can express join, which can be expressed with do without any additional functions:
join :: Monad m => m (m a) -> m a
join mx = do { x <- mx; x }
However, you can express fmap constrained to Monad:
fmap' :: Monad m => (a -> b) -> m a -> m b
fmap' f mx = f #mx
and >>= (and thus everything else) can be expressed using fmap' and join. So adding join would make your notation complete, but still not convenient in many cases, because you end up needing a lot of joins.
However, if you drop return from the translation, you get something quite similar to Idris' bang notation:
In many cases, using do-notation can make programs unnecessarily verbose, particularly in cases such as m_add above where the value bound is used once, immediately. In these cases, we can use a shorthand version, as follows:
m_add : Maybe Int -> Maybe Int -> Maybe Int
m_add x y = pure (!x + !y)
The notation !expr means that the expression expr should be evaluated and then implicitly bound. Conceptually, we can think of ! as being a prefix function with the following type:
(!) : m a -> a
Note, however, that it is not really a function, merely syntax! In practice, a subexpression !expr will lift expr as high as possible within its current scope, bind it to a fresh name x, and replace !expr with x. Expressions are lifted depth first, left to right. In practice, !-notation allows us to program in a more direct style, while still giving a notational clue as to which expressions are monadic.
For example, the expression:
let y = 42 in f !(g !(print y) !x)
is lifted to:
let y = 42 in do y' <- print y
x' <- x
g' <- g y' x'
f g'
Adding it to GHC was discussed, but rejected (so far). Unfortunately, I can't find the threads discussing it.
How about this:
do a <- something
b <- somethingElse a
somethingFinal a b
I'm working with the lambda calculus implemented in haskell. Expressions:
%x.e -- lambda abstraction, like \x->e in Haskell
e e' -- application
x,y,z -- variables
succ, pred, ifzero, 0, 1, 2....
Syntax:
type Id = String
-- The "builtin" functions.
data Op = Succ | Pred | IfZero
deriving (Show, Ord, Eq)
-- Expressions of L
data Exp = Var Id
| Nat Integer
| Op Op
| Lam Id Exp
| App Exp Exp
deriving (Ord, Eq, Show)
There's also a function that converts the usual representation into Exp:
parse "%f. f 1 2" = Lam "f" App (App (Var "f") (Nat 1)) (Nat 2)`
And I have to write the function subst x e e' that will substitute e for all "free" occurrences of (Var x) in e'. "Free" means that the variable it is not declared by a surrounding %. For example, in the expression
x (%x. x (%y.xz)) (%y.x)
there are 5 occurrences of x. The first is "free". The second is the parameter for the % expression and is never substituted for. The third and fourth occurrences refer to the parameter of the enclosing % expression. The fifth is free. Therefore if we substitute 0 for x we get 0 (%x. x (%y.xz)) (%y.0).
All I need to use is pattern-matching and recursion. All I was able to write is the function prototype as
subst :: Id -> Exp -> Exp -> Exp
subst x (App z z') (App e e') =
If somebody could give me a hint how to implement the function it would be great. Any help is highly appreciated
I'd like to first point out that the pattern matching (subst x (App z z') (App e e')) is non-exhaustive. (Most of) the other patterns are trivial, so it's easy to forget them. I'd suggest against pattern matching the third argument, because if all you're doing is subsituting it into the second argument, you don't care if it's an Application or Natural.
The first step in most recursive functions is to consider cases. What's the base case? In this situation, it's where the second argument is equal to Var x:
-- Replace x with y in a Var. If the Var is equal to x, replace
-- otherwise, leave alone.
subst x (Var a) y
| a == x = y
| otherwise = (Var a)
Then we need to consider the step cases, what if it's an application, and what if it's a lambda abstraction?
-- Replace x with y in an application. We just need to replace
-- x with y in z and in z', because an application doesn't
-- bind x (x stays free in z and z').
subst x (App z z') y = (App new_z new_z')
where new_z = -- Recursively call subst here.
new_z' = -- And recursively call subst here, too.
-- Replace x with y in a lambda abstraction. We *do* need to
-- check to make sure that the lambda doesn't bind x. If the
-- lambda does bind x, then there's no possibility of x being
-- free in e, and we can leave the whole lambda as-is.
subst x (Lam z e) y
| z == x = (Lam z e)
| otherwise = (Lam z new_e)
where new_e = -- Recursively call subst here.
Finally, all the trivial cases are the same (we leave alone both Nat and Op, since there's no chance we would be substituting them):
subst :: Id -> Exp -> Exp -> Exp
subst _ nat_or_op _ = nat_or_op
I have 2 lists Expressions and bindings (id = Expr), and trying to replace each expression with its equivalent from the bindings list in a new list called newE, where Expression = id ..
Initially, I have only one expression:
eq (div (add 2 7) (sub 5 2)) 3
I want to replace each identifier in this expression with its equivalent from the bindings list, so I tried to split this expression into a list of strings and removed brackets, to separate each identifier ..
This is how I tried to implement it:
newE = [\x -> getExp(b) | x <- eStr, b <- bs, x == getId(b)]
where es = getExpressions (prog)
bs = getBindings (prog)
-- Extracting expression into a list of strings
-- and removing brackets
eStr = map (delete ')')(map (delete ')')
(map (delete '(') (split " " (show es))))
newE' = join " " newE
Is this correct?
Now I'm getting errors that newE returns [t0 -> Expr] while it's supposed to return Expr only, why is that?
and the join function is expecting a [Char] .. while its type in the Data.List.Utils documentation is:
join :: [a] -> [[a]] -> [a]
so, isn't it supposed to accept any type not just list of characters? or did it get confused with a 'join' from another library?
I searched the libraries I've imported, but they don't have a join.
Any help to resolve these errors and modify the code to do what it's supposed to do?
Thank you
Since you asked, here is a sketch of the conventional approach:
Convert the string expression into a Expr value, e.g.
add 2 7 -> App (App (Var "add") (Var "2")) (Var "7")
Write a function lookupBinding to lookup a binding for a Symbol:
lookupBinding :: [Binding] -> Symbol -> Maybe Expr
Write a substitute function to substitute binding definitions into an expression:
substitute :: [Binding] -> Expr -> Expr
It will go something like this:
substitute bindings (App e1 e2) = App e1' e2'
where e1' = substitute bindings e1
e2' = substitute bindings e2
substitute bindings (Var sym) = ... lookupBinding sym ...
substitute bindings (Lam sym exp) = ... substitute bindings' exp ...
Your list comprehension body returns the function \x -> getExp b. That's why the compiler says it's returning a function type.
I am very new to Haskell and I need to make a working calculator what will give answers to expressions like: 2+3*(5+12)
I have something that manages to calculate more or less but I have a problem with order of operations. I have no idea how to do it. Here is my code:
import Text.Regex.Posix
import Data.Maybe
oblicz :: String -> Double
oblicz str = eval (Nothing, None) $ map convertToExpression $ ( tokenize str )
eval :: (Maybe Double,Expression)->[Expression]->Double
eval (Nothing, _) ((Variable v):reszta) = eval (Just v, None) reszta
eval (Just aktualnyWynik, None) ((Operator o):reszta) = eval ((Just aktualnyWynik), (Operator o)) reszta
eval (Just aktualnyWynik, (Operator o)) ((Variable v):reszta) = eval (Just $ o aktualnyWynik v , None) reszta
eval (aktualnyWynik, operator) (LeftParenthesis:reszta)
= eval (aktualnyWynik, operator) ((Variable (eval (Nothing, None) reszta)):(getPartAfterParentheses reszta))
eval (Just aktualnyWynik, _) [] = aktualnyWynik
eval (Just aktualnyWynik, _) (RightParenthesis:_) = aktualnyWynik
data Expression = Operator (Double->Double->Double)
| Variable Double
| LeftParenthesis
| RightParenthesis
| None
tokenize :: String -> [String]
tokenize expression = getAllTextMatches(expression =~ "([0-9]+|\\(|\\)|\\+|-|%|/|\\*)" :: AllTextMatches [] String)
convertToExpression :: String -> Expression
convertToExpression "-" = Operator (-)
convertToExpression "+" = Operator (+)
convertToExpression "*" = Operator (*)
convertToExpression "/" = Operator (/)
convertToExpression "(" = LeftParenthesis
convertToExpression ")" = RightParenthesis
convertToExpression variable = Variable (read variable)
getPartAfterParentheses :: [Expression] -> [Expression]
getPartAfterParentheses [] = []
getPartAfterParentheses (RightParenthesis:expressionsList) = expressionsList
getPartAfterParentheses (LeftParenthesis:expressionsList) = getPartAfterParentheses (getPartAfterParentheses expressionsList)
getPartAfterParentheses (expression:expressionsList) = getPartAfterParentheses expressionsList
I thought maybe I could create two stacks - one with numbers and one with operators. While reading the expression, I could push numbers on one stack and operators on another. When it is an operator I would check if there is something already on the stack and if there is check if I should pop it from the stack and do the math or not - just like in onp notation.
Unfortunately, as I said, I am VERY new to haskell and have no clue how to go about writing this.
Any hints or help would be nice :)
Pushing things on different stacks sure feels very much a prcedural thing to do, and that's generally not nice in Haskell. (Stacks can be realised as lists, which works quite nice in a purely functional fashion. Even real mutable state can be fine if only as an optimisation, but if more than one object needs to be modified at a time then this isn't exactly enjoyable.)
The preferrable way would be to build up a tree representing the expression.
type DInfix = Double -> Double -> Double -- for readability's sake
data ExprTree = Op DInfix ExprTree ExprTree
| Value Double
Evaluating this tree is basically evalTree (Op c t1 t2) = c (evalTree t1) (evalTree t2), i.e. ExprTree->Double right away.
To build the tree up, the crucial point: get the operator fixities right. Different operators have different fixity. I'd put that information in the Operator field:
type Fixity = Int
data Expression = Operator (Double->Double->Double) Fixity
| ...
which then requires e.g.
...
convertToExpression "+" = Operator (+) 6
convertToExpression "*" = Operator (*) 7
...
(Those are the fixities that Haskell itself has for the operators. You can :i + in GHCi to see them.)
Then you'd build the tree.
toExprTree :: [Expression] -> ExprTree
Obvious base case:
toExprTree [Variable v] = Value v
You might continue with
toExprTree (Variable v : Operator c _ : exprs) = Op c (Value v) (toExprTree exprs)
But that's actually not right: for e.g. 4 * 3 + 2 it would give 4 * (3 + 2). We actually need to bring the 4 * down the remaining expressions tree, as deep as the fixities are lower. So the tree needs to know about that as well
data ExprTree = Op DInfix Fixity ExprTree ExprTree
| Value Double
mergeOpL :: Double -> DInfix -> Fixity -> ExprTree -> ExprTree
mergeOpL v c f t#(Op c' f' t' t'')
| c > c' = Op c' f' (mergeOpL v c f t') t''
mergeOpL v c f t = Op c f (Value v) t
What remains to be done is handling parentheses. You'd need to take a whole matching-brackets expression and assign it a tree-fixity of, say tight = 100 :: Fixity.
As a note: such a tokenisation - manual parsing workflow is pretty cumbersome, regardless how nicely functional you do it. Haskell has powerful parser-combinator libraries like parsec, which take most of the work and bookkeeping off you.
What you need to solve this problem is the Shunting-yard Algorithm of Edsger Dijstra as described at http://www.wcipeg.com/wiki/Shunting_yard_algorithm. You can see my implementation at the bottom of this file.
If you are limiting your self to just +,-,*,/ you can also solve the problem using the usual trick in most intro to compiler examples simply parsing into two different non-terminals, ofter called term and product to build the correct tree. This get unwieldy if you have to deal with a lot of operators or they are user defined.
I want to implement an imperative language interpreter in Haskell (for educational purposes). But it's difficult for me to create right architecture for my interpreter: How should I store variables? How can I implement nested function calls? How should I implement variable scoping? How can I add debugging possibilities in my language? Should I use monads/monad transformers/other techniques? etc.
Does anybody know good articles/papers/tutorials/sources on this subject?
If you are new to writing this kind of processors, I would recommend to put off using monads for a while and first focus on getting a barebones implementation without any bells or whistles.
The following may serve as a minitutorial.
I assume that you have already tackled the issue of parsing the source text of the programs you want to write an interpreter for and that you have some types for capturing the abstract syntax of your language. The language that I use here is very simple and only consists of integer expressions and some basic statements.
Preliminaries
Let us first import some modules that we will use in just a bit.
import Data.Function
import Data.List
The essence of an imperative language is that it has some form of mutable variables. Here, variables simply represented by strings:
type Var = String
Expressions
Next, we define expressions. Expressions are constructed from integer constants, variable references, and arithmetic operations.
infixl 6 :+:, :-:
infixl 7 :*:, :/:
data Exp
= C Int -- constant
| V Var -- variable
| Exp :+: Exp -- addition
| Exp :-: Exp -- subtraction
| Exp :*: Exp -- multiplication
| Exp :/: Exp -- division
For example, the expression that adds the constant 2 to the variable x is represented by V "x" :+: C 2.
Statements
The statement language is rather minimal. We have three forms of statements: variable assignments, while loops, and sequences.
infix 1 :=
data Stmt
= Var := Exp -- assignment
| While Exp Stmt -- loop
| Seq [Stmt] -- sequence
For example, a sequence of statements for "swapping" the values of the variables x and y can be represented by Seq ["tmp" := V "x", "x" := V "y", "y" := V "tmp"].
Programs
A program is just a statement.
type Prog = Stmt
Stores
Now, let us move to the actual interpreter. While running a program, we need to keep track of the values assigned to the different variables in the programs. These values are just integers and as a representation of our "memory" we just use lists of pairs consisting of a variable and a value.
type Val = Int
type Store = [(Var, Val)]
Evaluating expressions
Expressions are evaluated by mapping constants to their value, looking up the values of variables in the store, and mapping arithmetic operations to their Haskell counterparts.
eval :: Exp -> Store -> Val
eval (C n) r = n
eval (V x) r = case lookup x r of
Nothing -> error ("unbound variable `" ++ x ++ "'")
Just v -> v
eval (e1 :+: e2) r = eval e1 r + eval e2 r
eval (e1 :-: e2) r = eval e1 r - eval e2 r
eval (e1 :*: e2) r = eval e1 r * eval e2 r
eval (e1 :/: e2) r = eval e1 r `div` eval e2 r
Note that if the store contains multiple bindings for a variable, lookup selects the bindings that comes first in the store.
Executing statements
While the evaluation of an expression cannot alter the contents of the store, executing a statement may in fact result in an update of the store. Hence, the function for executing a statement takes a store as an argument and produces a possibly updated store.
exec :: Stmt -> Store -> Store
exec (x := e) r = (x, eval e r) : r
exec (While e s) r | eval e r /= 0 = exec (Seq [s, While e s]) r
| otherwise = r
exec (Seq []) r = r
exec (Seq (s : ss)) r = exec (Seq ss) (exec s r)
Note that, in the case of assignments, we simply push a new binding for the updated variable to the store, effectively shadowing any previous bindings for that variable.
Top-level Interpreter
Running a program reduces to executing its top-level statement in the context of an initial store.
run :: Prog -> Store -> Store
run p r = nubBy ((==) `on` fst) (exec p r)
After executing the statement we clean up any shadowed bindings, so that we can easily read off the contents of the final store.
Example
As an example, consider the following program that computes the Fibonacci number of the number stored in the variable n and stores its result in the variable x.
fib :: Prog
fib = Seq
[ "x" := C 0
, "y" := C 1
, While (V "n") $ Seq
[ "z" := V "x" :+: V "y"
, "x" := V "y"
, "y" := V "z"
, "n" := V "n" :-: C 1
]
]
For instance, in an interactive environment, we can now use our interpreter to compute the 25th Fibonacci number:
> lookup "x" $ run fib [("n", 25)]
Just 75025
Monadic Interpretation
Of course, here, we are dealing with a very simple and tiny imperative language. As your language gets more complex, so will the implementation of your interpreter. Think for example about what additions you need when you add procedures and need to distinguish between local (stack-based) storage and global (heap-based) storage. Returning to that part of your question, you may then indeed consider the introduction of monads to streamline the implementation of your interpreter a bit.
In the example interpreter above, there are two "effects" that are candidates for being captured by a monadic structure:
The passing around and updating of the store.
Aborting running the program when a run-time error is encountered. (In the implementation above, the interpreter simply crashes when such an error occurs.)
The first effect is typically captured by a state monad, the second by an error monad. Let us briefly investigate how to do this for our interpreter.
We prepare by importing just one more module from the standard libraries.
import Control.Monad
We can use monad transformers to construct a composite monad for our two effects by combining a basic state monad and a basic error monad. Here, however, we simply construct the composite monad in one go.
newtype Interp a = Interp { runInterp :: Store -> Either String (a, Store) }
instance Monad Interp where
return x = Interp $ \r -> Right (x, r)
i >>= k = Interp $ \r -> case runInterp i r of
Left msg -> Left msg
Right (x, r') -> runInterp (k x) r'
fail msg = Interp $ \_ -> Left msg
Edit 2018: The Applicative Monad Proposal
Since the Applicative Monad Proposal (AMP) every Monad must also be an instance of Functor and Applicative. To do this we can add
import Control.Applicative -- Otherwise you can't do the Applicative instance.
to the imports and make Interp an instance of Functor and Applicative like this
instance Functor Interp where
fmap = liftM -- imported from Control.Monad
instance Applicative Interp where
pure = return
(<*>) = ap -- imported from Control.Monad
Edit 2018 end
For reading from and writing to the store, we introduce effectful functions rd and wr:
rd :: Var -> Interp Val
rd x = Interp $ \r -> case lookup x r of
Nothing -> Left ("unbound variable `" ++ x ++ "'")
Just v -> Right (v, r)
wr :: Var -> Val -> Interp ()
wr x v = Interp $ \r -> Right ((), (x, v) : r)
Note that rd produces a Left-wrapped error message if a variable lookup fails.
The monadic version of the expression evaluator now reads
eval :: Exp -> Interp Val
eval (C n) = do return n
eval (V x) = do rd x
eval (e1 :+: e2) = do v1 <- eval e1
v2 <- eval e2
return (v1 + v2)
eval (e1 :-: e2) = do v1 <- eval e1
v2 <- eval e2
return (v1 - v2)
eval (e1 :*: e2) = do v1 <- eval e1
v2 <- eval e2
return (v1 * v2)
eval (e1 :/: e2) = do v1 <- eval e1
v2 <- eval e2
if v2 == 0
then fail "division by zero"
else return (v1 `div` v2)
In the case for :/:, division by zero results in an error message being produced through the Monad-method fail, which, for Interp, reduces to wrapping the message in a Left-value.
For the execution of statements we have
exec :: Stmt -> Interp ()
exec (x := e) = do v <- eval e
wr x v
exec (While e s) = do v <- eval e
when (v /= 0) (exec (Seq [s, While e s]))
exec (Seq []) = do return ()
exec (Seq (s : ss)) = do exec s
exec (Seq ss)
The type of exec conveys that statements do not result in values but are executed only for their effects on the store or the run-time errors they may trigger.
Finally, in the function run we perform a monadic computation and process its effects.
run :: Prog -> Store -> Either String Store
run p r = case runInterp (exec p) r of
Left msg -> Left msg
Right (_, r') -> Right (nubBy ((==) `on` fst) r')
In the interactive environment, we can now revisit the interpretation of our example program:
> lookup "x" `fmap` run fib [("n", 25)]
Right (Just 75025)
> lookup "x" `fmap` run fib []
Left "unbound variable `n'"
A couple of good papers I've finally found:
Building Interpreters by Composing Monads
Monad Transformers Step by Step - how incrementally build tiny interpreter using
monad transformers
How to build a monadic interpreter in one day
Monad Transformers and Modular Interpreters