Haskell: Iterating through material in a (complex, recursive) ADT _with context_ - haskell

I have an ADT for, basically, first order logic:
data Function = Function String
data Predicate = Predicate String
data Type_ = TPoint | TSet | TFunction | TPositiveRealNumber | TSequence | TNaturalNumber | TGroup
data VariableType = VTNormal | VTDiamond | VTBullet
data Dependencies = Dependencies [Term] {-dep-} [Term] {-indep-}
data Variable = Variable String Int Type_ VariableType Dependencies
data Term = VariableTerm Variable
          | ApplyFn Function [Term]
data Formula = AtomicFormula Predicate [Term]
             | Not Formula
             | And [Formula]
             | Or [Formula]
             | Forall [Variable] Formula
             | UniversalImplies [Variable] [Formula] Formula
             | Exists [Variable] Formula
And I need to iterate through each Term (potentially deeply) nested inside a given Formula and do something to it -- said thing will depend on both the term and the 'context' in which it appears. So as a simple example we could print out the formula repeatedly, with each copy having a different term bolded. I don't want this particular behavior wired in; I want a higher-order function with the following type signature
f :: (FormulaWithTermShapedHole -> Term -> a) -> Formula -> [a]
I already have a function with this type signature which can do something to every term...
mapTermInFormulaM :: Monad m => (Term -> m Term) -> Formula -> m Formula
... but it can't utilise context in the way I now need. (So it could print every term that occurs somewhere inside the formula, but it couldn't print the whole formula with the term bolded.)
It feels like there should be a slick way of doing this... any suggestions would be welcome.

This can be done with generic programming in Haskell.
A good starting point for this is:
http://www.haskell.org/haskellwiki/Scrap_your_boilerplate
The original papers by Ralf Lämmel and Simon Peyton Jones, who introduced the approach, are collected here if you want to read further:
http://research.microsoft.com/en-us/um/people/simonpj/papers/hmap/
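If you would rather not pull in a generics library, here is a minimal hand-written sketch of the same idea. It assumes a cut-down version of the ADT from the question (the constructors are simplified), and it represents the "formula with a term-shaped hole" as a plain function Term -> Formula that plugs a term back in. The names Hole, holesList, termHoles, formulaHoles and mapWithContext are made up for this illustration.

```haskell
import Data.List (inits, tails)

-- Simplified stand-ins for the question's types (an assumption for brevity).
data Term = Var String | ApplyFn String [Term] deriving (Eq, Show)
data Formula = Atomic String [Term]
             | Not Formula
             | And [Formula]
             | Forall [String] Formula
             deriving (Eq, Show)

-- A formula with a term-shaped hole: give it a term, get the formula back.
type Hole = Term -> Formula

-- Every element of a list, paired with a function that rebuilds the list
-- with a replacement in that position.
holesList :: [x] -> [(x, x -> [x])]
holesList xs =
  [ (x, \x' -> pre ++ x' : post) | (pre, x : post) <- zip (inits xs) (tails xs) ]

-- Every subterm of a term (including the term itself) with its context.
termHoles :: Term -> [(Term, Term -> Term)]
termHoles t = (t, id) : case t of
  Var _          -> []
  ApplyFn f args -> [ (s, ApplyFn f . plugArg . plugSub)
                    | (arg, plugArg) <- holesList args
                    , (s, plugSub)   <- termHoles arg ]

-- Every term nested anywhere inside a formula, with its context.
formulaHoles :: Formula -> [(Term, Hole)]
formulaHoles (Atomic p ts) = [ (s, Atomic p . plugT . plugS)
                             | (t, plugT) <- holesList ts
                             , (s, plugS) <- termHoles t ]
formulaHoles (Not g)       = [ (s, Not . plug) | (s, plug) <- formulaHoles g ]
formulaHoles (And gs)      = [ (s, And . plugG . plug)
                             | (g, plugG) <- holesList gs
                             , (s, plug)  <- formulaHoles g ]
formulaHoles (Forall vs g) = [ (s, Forall vs . plug) | (s, plug) <- formulaHoles g ]

-- The higher-order function with the shape asked for in the question.
mapWithContext :: (Hole -> Term -> a) -> Formula -> [a]
mapWithContext k phi = [ k plug t | (t, plug) <- formulaHoles phi ]

-- P(f(x), y) contains three terms: f(x), x and y.
sample :: Formula
sample = Atomic "P" [ApplyFn "f" [Var "x"], Var "y"]
```

For the "bold one term at a time" example, the callback could plug a marked-up copy of the term back into the hole and pretty-print the result. Representing the context as a function is the simplest choice; a zipper-style data type would be needed if you want to inspect the context rather than just fill it.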

Related

Writing an interpreter for an imperative language in Haskell

I am trying to build an interpreter for a C-like language in Haskell. I have so far written and combined small monadic parsers following this paper, hence so far I can generate an AST representation of a program. I defined the abstract syntax as follows:
data LangType = TypeReal | TypeInt | TypeBool | TypeString deriving (Show)
type Id = String
data AddOp = Plus | Minus | Or deriving (Show)
data RelOp = LT | GT | LTE | GTE | NEq | Eq deriving (Show)
data MultOp = Mult | Div | And deriving (Show)
data UnOp = UnMinus | UnNot deriving (Show)
data BinOp = Rel RelOp | Mul MultOp | Add AddOp deriving (Show)
data AST = Program [Statement] deriving (Show)
data Block = StatsBlock [Statement] deriving (Show)
data Statement = VariableDecl Id LangType Expression
               | Assignment Id Expression
               | PrintStatement Expression
               | IfStatement Expression Block Block
               | WhileStatement Expression Block
               | ReturnStatement Expression
               | FunctionDecl Id LangType FormalParams Block
               | BlockStatement Block
               deriving (Show)
data Expression = RealLiteral Double
                | IntLiteral Int
                | BoolLiteral Bool
                | StringLiteral String
                | Unary UnOp Expression
                | Binary BinOp Expression Expression
                | FuncCall Id [Expression]
                | Var Id
                deriving (Show)
data FormalParams = IdentifierType [(Id, LangType)] deriving (Show)
I have yet to type-check my AST and build the interpreter to evaluate expressions and execute statements. My questions are the following:
Does the abstract syntax make sense/can it be improved? In particular, I've been running into a recurring problem. In the EBNF of this language I'm trying to interpret, a WhileStatement consists of an Expression (which I have no problem with) and a Block, which in the EBNF happens to be a Statement just like WhileStatement, and so I cannot refer to Block from my WhileStatement. I've worked around this by defining a separate data type Block (as is shown in the above code), but am not sure if this is the best way. I'm finding defining data types quite confusing.
Since I have to type-check my AST and evaluate/execute, do I implement these separately or can I define some function which does them both at the same time?
Any general tips on how I should go about type-checking and interpreting the language would also be greatly appreciated. Since the language has variable and function declarations, I am thinking of implementing some sort of symbol table, although again I am struggling with defining the type for this. So far I've tried
import qualified Data.Map as M
data Value = RealLit Double | IntLit Int | BoolLit Bool | StringLit String | Func [FormalParams] String
deriving (Show)
type TermEnv = M.Map String Value
but I'm unsure whether I should be using my LangType from before.
Addressing your question in the comments about how to proceed with type checking and evaluation.
If you don't have to do inference or polymorphism, type checking is pretty simple. Also type checking and evaluation mirror each other pretty closely in these conditions.
Begin by defining a monad with the features you need. For a type checker, you will need
A type environment, i.e. a Reader (Map Id LangType) component, to keep track of the types of local variables.
An error ability, e.g. Except String.
So you could define a monad like
type TypeEnv = Map.Map Id LangType
type TC = ReaderT TypeEnv (Except String)
And then your typechecker function would look like:
typeCheck :: AST -> TC ()
(We return () because there is nothing interesting to be gained from the typechecking process besides knowing whether the program passed.)
This will be largely structurally inductive, e.g.
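typeCheck (Program stmts) = ... -- run typeCheckStmt on each statement
typeCheckStmt :: Statement -> TC ()
typeCheckStmt (VariableDecl v ty defn) = ... -- `type` is a keyword, so the binder is named ty
typeCheckStmt (Assignment v exp) = do
    -- Look the variable up explicitly; a failable `Just t <- ...` pattern
    -- would need a MonadFail instance, which this TC monad does not have.
    mt <- asks (Map.lookup v)
    t  <- maybe (throwError ("Undeclared variable: " ++ v)) return mt
    t' <- typeCheckExp exp
    -- Comparing types requires LangType to derive Eq as well as Show.
    when (t /= t') $ throwError "Types do not match"
...
-- Return the type of a composite expression to use elsewhere
typeCheckExp :: Expression -> TC LangType
...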
There will be a bit of finesse required to make sure that variable declarations in a list of statements can be seen by later statements in the same list. I will leave that as a puzzle. (Hint: see the local function to provide an updated environment within a scope.)
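For readers who want to check their answer to the puzzle, here is one possible reading of the local hint, sketched on a deliberately tiny subset of the AST (two types, three expression forms, two statement forms). It assumes the mtl and containers packages; lookupVar, expect, typeCheckStmts and runTC are names invented for this sketch, not part of the question's code.

```haskell
import qualified Data.Map as Map
import Control.Monad (when)
import Control.Monad.Reader (ReaderT, runReaderT, asks, local)
import Control.Monad.Except (Except, runExcept, throwError)

-- A cut-down AST (assumption: only enough constructors to show scoping).
data LangType = TypeInt | TypeBool deriving (Eq, Show)
type Id = String
data Expression = IntLiteral Int | BoolLiteral Bool | Var Id deriving (Show)
data Statement = VariableDecl Id LangType Expression
               | Assignment Id Expression
               deriving (Show)

type TypeEnv = Map.Map Id LangType
type TC = ReaderT TypeEnv (Except String)

-- Look up a variable's declared type, failing with a readable error.
lookupVar :: Id -> TC LangType
lookupVar v =
  asks (Map.lookup v) >>= maybe (throwError ("Undeclared variable: " ++ v)) return

typeCheckExp :: Expression -> TC LangType
typeCheckExp (IntLiteral _)  = return TypeInt
typeCheckExp (BoolLiteral _) = return TypeBool
typeCheckExp (Var v)         = lookupVar v

-- Check that an expression has the expected type.
expect :: LangType -> Expression -> TC ()
expect t e = do
  t' <- typeCheckExp e
  when (t /= t') $ throwError "Types do not match"

-- Walk the statement list by hand so that a declaration can extend the
-- environment (via local) for the statements that follow it.
typeCheckStmts :: [Statement] -> TC ()
typeCheckStmts [] = return ()
typeCheckStmts (VariableDecl v t e : rest) = do
  expect t e
  local (Map.insert v t) (typeCheckStmts rest)
typeCheckStmts (Assignment v e : rest) = do
  t <- lookupVar v
  expect t e
  typeCheckStmts rest

-- Run the checker starting from an empty environment.
runTC :: [Statement] -> Either String ()
runTC stmts = runExcept (runReaderT (typeCheckStmts stmts) Map.empty)
```

The key point is that the rest of the list is checked inside the local call, so the new binding is visible to later statements but disappears again when the enclosing block ends.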
Evaluation is a similar story. You're correct that you need a type of run-time values. Without some cleverness that you are probably not ready for (and is of questionable utility even if you were) there is not really a way to use LangType in Value, so you're on the right track.
You will need a monad that supports keeping track of the values of variables and the ability to do whatever else your language needs. To start I recommend
type Eval = StateT (Map.Map Id Value) IO
and proceed structurally as before. There will again be some finesse required when handling variable scopes and shadowing, and you may need to change the environment type or mess with your Value type a bit to accommodate these subtleties, but thinking through these problems is important. Start simple, don't try to implement typechecking and evaluation for your whole language at once.

'Valuation a' data structure and 'eval' function

I'm studying haskell and I don't know how to complete one exercise:
We can define a data structure for generalised expressions as
follows:
data Expr a = Lit a | EVar Var | Op (Ops a) [Expr a]
type Ops a = [a] -> a
type Var = Char
To evaluate an expression, we need to know all the values of its variables.
Define a new type or datatype Valuation a, associating variables with values
of the type a. Then write a function:
eval :: Valuation a -> Expr a -> a
that, for the given variable valuation and expression, evaluates (folds) the
expression to a single value.
Extra information (given by my professor with the exercise):
Valuation could be any type associating Var (= Char) with a, for example [(Var, a)] or Var -> a.
The eval function, after receiving a structure (of type Valuation a) and an expression (of type Expr a), should simplify that expression to a single value of type a.
Expression (Expr Int) example: a standard expression such as (x + 10) * y would look like this: Op* [Op+ [EVar 'x', Lit 10], EVar 'y']. If a structure of type Valuation Int associates EVar 'x' with 15 and EVar 'y' with 2, then the end result should be 50.
My questions would be:
1) How should such data structure of Valuation a look like? I'm thinking about some kind of a map with keys and values, but I might be totally wrong.
2) For eval I was thinking about writing a function which would include all operators and their priorities, but overall this exercise looks a bit too difficult (in my mind) compared to other exercises we usually do - most of the time the hardest part is figuring out what the exercise wants and the solution itself takes up only about 5-15 lines of code; so, maybe my thinking for this exercise is off?
Any help would be appreciated.
Let's consolidate the comments into an answer:
1) How should such data structure of Valuation a look like? I'm thinking about some kind of a map with keys and values, but I might be totally wrong.
You are indeed looking for something that associates Var keys with a values. A [(Var, a)] list of pairs is one way of achieving that. You can then work with such a list using functions like lookup:
lookup :: Eq a => a -> [(a, b)] -> Maybe b
lookup key assocs looks up a key in an association list.
Alternatively, you might want to consider using a proper map type, such as Map from containers.
2) For eval I was thinking about writing a function which would include all operators and their priorities [...]
Given your problem statement, there is no list of "all operators" for you to include. A second look at the Expr definition...
data Expr a = Lit a | EVar Var | Op (Ops a) [Expr a]
type Ops a = [a] -> a
... shows that an Op expression already encapsulates the operation to be done, in the form of the [a] -> a function in its first field. Your task, then, boils down to converting the [Expr a] in the second field into an [a], handling each of the three kinds of Expr appropriately, and then applying the [a] -> a function to the resulting [a] list.
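To make that concrete, here is a minimal sketch using the association-list choice for Valuation. Calling error on an unbound variable is an assumption of this sketch (the exercise does not say what should happen), and example is just the professor's (x + 10) * y written with sum and product as the [a] -> a operations.

```haskell
data Expr a = Lit a | EVar Var | Op (Ops a) [Expr a]
type Ops a = [a] -> a
type Var = Char

-- One possible Valuation: an association list from variables to values.
type Valuation a = [(Var, a)]

eval :: Valuation a -> Expr a -> a
eval _   (Lit x)  = x
eval val (EVar v) = case lookup v val of
  Just x  -> x
  Nothing -> error ("unbound variable: " ++ [v])  -- assumption: fail loudly
-- Evaluate every argument first, then hand the values to the stored operation.
eval val (Op f args) = f (map (eval val) args)

-- (x + 10) * y
example :: Expr Int
example = Op product [Op sum [EVar 'x', Lit 10], EVar 'y']
```

With x bound to 15 and y bound to 2, eval [('x', 15), ('y', 2)] example gives 50, matching the exercise text.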

to overload a data type or to use a similar one?

This is more of a question about a programming style and common practices. But I feel that it does not fit into the code review forum...
My program parses regular expressions and processes them. A regular expression can have the usual elements (Kleene closure, concatenation, etc) and it also can have references to other regular expressions by their names, like macros:
data Regex a = Epsilon
             | Literal a
             | Ranges [(a, a)]
             | Ref String
             | Then (Regex a) (Regex a)
             | Or (Regex a) (Regex a)
             | Star (Regex a)
After I process a regular expression and resolve all macro references, and convert Literal elements to Range elements (this is needed for my purposes), I end up with a type that cannot and must not have Ref and Literal, so in my functions that work with it I do something like:
foo (Literal _) = error "unexpected literal"
foo (Ref _) = error "unexpected reference"
foo (Epsilon) = ...
foo (Star x) = ...
...
This looks ugly to me because it does runtime checks instead of checks during compilation. Not a very haskell kind of approach.
So maybe I can introduce another data type which is very similar to the original one and use that?
data RegexSimple a = Epsilon2
                   | Ranges2 [(a, a)]
                   | Then2 (RegexSimple a) (RegexSimple a)
                   | Or2 (RegexSimple a) (RegexSimple a)
                   | Star2 (RegexSimple a)
That would work, but here I have a lot of duplication and also the nice and descriptive names of constructors are taken now and I need to invent new ones...
What would the experts do here? I want to learn : )
I don't know what the rest of your code looks like, so this solution may require you to rethink certain aspects, but the most "haskell-ish" solution to this problem would probably be to use GADTs and phantom types. Together, they basically allow you to create arbitrary subtypes for more flexible type safety. You would redefine your types like so.
{-# LANGUAGE GADTs #-}
data Literal
data Ref
data Rangeable
data Regex t a where
    Epsilon :: Regex Rangeable a
    Literal :: a -> Regex Literal a
    Ranges  :: [(a, a)] -> Regex Rangeable a
    Ref     :: String -> Regex Ref a
    Then    :: Regex t' a -> Regex t' a -> Regex Rangeable a
    Or      :: Regex t' a -> Regex t' a -> Regex Rangeable a
    Star    :: Regex t' a -> Regex Rangeable a
Then you can define
foo :: Regex Rangeable a -> ...
foo (Epsilon) = ...
foo s@(Star a) = ...
Now, statements like foo $ Literal 'c' will fail compile-time type-checks.
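A variation on the same idea, offered as a sketch rather than a drop-in replacement: index the type by a processing stage instead of by constructor kind, so that Literal and Ref are ruled out at every depth of a processed regex and not only at the top. The stage tags Raw and Simple, and the functions simplify and size, are hypothetical names; the macro environment is assumed to be a plain function from names to raw regexes, and recursive macros would make simplify loop.

```haskell
{-# LANGUAGE GADTs #-}

-- Stage tags (empty types used only as indices).
data Raw
data Simple

data Regex s a where
  Epsilon :: Regex s a
  Literal :: a -> Regex Raw a          -- only before processing
  Ranges  :: [(a, a)] -> Regex s a
  Ref     :: String -> Regex Raw a     -- only before processing
  Then    :: Regex s a -> Regex s a -> Regex s a
  Or      :: Regex s a -> Regex s a -> Regex s a
  Star    :: Regex s a -> Regex s a

-- Resolve references and turn literals into one-element ranges.
simplify :: (String -> Regex Raw a) -> Regex Raw a -> Regex Simple a
simplify _   Epsilon     = Epsilon
simplify _   (Literal c) = Ranges [(c, c)]
simplify _   (Ranges rs) = Ranges rs
simplify env (Ref name)  = simplify env (env name)
simplify env (Then x y)  = Then (simplify env x) (simplify env y)
simplify env (Or x y)    = Or (simplify env x) (simplify env y)
simplify env (Star x)    = Star (simplify env x)

-- A consumer of processed regexes: no Literal or Ref cases are needed,
-- because those constructors cannot build a Regex Simple a.
size :: Regex Simple a -> Int
size Epsilon    = 1
size (Ranges _) = 1
size (Then x y) = 1 + size x + size y
size (Or x y)   = 1 + size x + size y
size (Star x)   = 1 + size x
```

Because the children of Then, Or and Star share the parent's stage, a Regex Simple a cannot hide a Literal or Ref further down the tree, and the constructor names stay the same across both stages.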
I'm not an expert, but this is a problem I run into myself (though more often with product types than with sum types).
The obvious solution is to reuse RegexSimple inside Regex, so that
data Regex a = Ref a | Literal a | SimpleR (RegexSimple a)
Another way is to parametrize Regex with a functor
data Regex f a = Literal (f a) | Ref (f a) | Epsilon a ...
and use either Regex Id or Regex Void.
Yet another way is just to use Maybe
data Regex a = Literal (Maybe a) | Epsilon a ...
But this is less clean, because you can't force a function to accept only simple regexes.

Extending propositional logic to modal logic in Haskell

I have written some code in Haskell for modeling propositional logic
data Formula = Prop {propName :: String}
             | Neg Formula
             | Conj Formula Formula
             | Disj Formula Formula
             | Impl Formula Formula
             | BiImpl Formula Formula
             deriving (Eq,Ord)
However, there is no natural way to extend this to Modal Logic, since the data type is closed.
Therefore, I thought I should use classes instead. That way, I can easily add new language features in different modules later on. The problem is that I don't exactly know how to write it. I would like something like the following
type PropValue = (String,Bool) -- for example ("p",True) states that proposition p is true
type Valuation = [PropValue]
class Formula a where
    evaluate :: a -> Valuation -> Bool
data Proposition = Prop String
instance Formula Proposition where
    evaluate (Prop s) val = (s,True) `elem` val
data Conjunction = Conj Formula Formula -- illegal syntax
instance Formula Conjunction where
    evaluate (Conj φ ψ) v = evaluate φ v && evaluate ψ v
The mistake is of course in the definition of Conjunction. However, it is unclear to me how I could rewrite it so that it works.
This should work:
data Conjunction f = Conj f f
instance Formula f => Formula (Conjunction f) where
    evaluate (Conj φ ψ) v = evaluate φ v && evaluate ψ v
However, I am not sure type classes are the right tool for what you are trying to achieve.
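For illustration, here is a small self-contained sketch of the type class route. It departs from the snippet above in one respect, which is my own assumption about what you will want: Conjunction takes two type parameters, so the two sides may be formulas of different types. The Negation type is likewise added only for the example.

```haskell
type Valuation = [(String, Bool)]

class Formula a where
  evaluate :: a -> Valuation -> Bool

data Proposition = Prop String

instance Formula Proposition where
  evaluate (Prop s) val = (s, True) `elem` val

-- Hypothetical extra connective, defined the same way a new module could.
data Negation f = Neg f

instance Formula f => Formula (Negation f) where
  evaluate (Neg f) v = not (evaluate f v)

-- Two parameters: the conjuncts need not have the same type.
data Conjunction f g = Conj f g

instance (Formula f, Formula g) => Formula (Conjunction f g) where
  evaluate (Conj a b) v = evaluate a v && evaluate b v
```

With this, Conj (Prop "p") (Neg (Prop "q")) is well typed, whereas with a single parameter both conjuncts would have to share one type. The cost is that the structure of every formula shows up in its type, which is part of why type classes may not be the right tool here.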
Maybe you could give explicit type-level functors a whirl, and recurse over them:
-- functor for plain formulae
data FormulaF f = Prop {propName :: String}
                | Neg f
                | Conj f f
                | Disj f f
                | Impl f f
                | BiImpl f f
-- plain formula
newtype Formula = F {unF :: FormulaF Formula}
-- functor adding a modality
data ModalF f = Plain f
              | MyModality f
-- modal formula
newtype Modal = M {unM :: ModalF Modal}
Yes, this is not terribly convenient, since constructors such as F, M and Plain sometimes get in the way. But, unlike with type classes, you can use pattern matching here.
As another option, use a GADT:
{-# LANGUAGE GADTs #-}
data Plain
data Mod
data Formula t where
    Prop       :: { propName :: String } -> Formula t
    Neg        :: Formula t -> Formula t
    Conj       :: Formula t -> Formula t -> Formula t
    Disj       :: Formula t -> Formula t -> Formula t
    Impl       :: Formula t -> Formula t -> Formula t
    BiImpl     :: Formula t -> Formula t -> Formula t
    MyModality :: Formula Mod -> Formula Mod
type PlainFormula = Formula Plain
type ModalFormula = Formula Mod

Recursive data structures in haskell: prolog-like terms

I have a question about recursive data structures in Haskell (a language that I'm currently trying to learn).
I would like to encode in Haskell Prolog-like terms, but every solution I came up with has different drawbacks that I would really like to avoid. I would like to find a cheap and elegant way of encoding a BNF grammar in Haskell types, if you wish to see my problem from this perspective.
Just as a reminder, some prolog terms could be male, sum(2, 3.1, 5.1), btree(btree(0, 1), Variable).
Solution 1
data Term = SConst String
          | IConst Integer
          | FConst Double
          | Var String
          | Predicate {predName :: String, predArgs :: [Term]}
With this solution I can have nested predicates (since predArgs are Term), but I can't distinguish predicates from other terms in type signatures.
Solution 2
data Term = SConst String
          | IConst Integer
          | FConst Double
          | Var String
data Predicate = Predicate {predName :: String, predArgs :: [Either Term Predicate]}
In this variant I can clearly distinguish predicates from basic terms, but the Either type in the predArgs list can be quite a nuisance to manage later in the code (I think... I'm new to Haskell).
Solution 3
data Term = SConst String
          | IConst Integer
          | FConst Double
          | Var String
          | Struct String [Term]
data Predicate = Predicate String [Term]
With this last solution, I split terms in two different types as before, but this time I avoid Either Term Predicate adding a Struct constructor in Term with basically the same semantics as Predicate.
It's just like solution 1 with two predicate constructors for terms. One is recursion-enabled, Struct, and the other one, Predicate is to be able to distinguish between predicates and regular terms.
The problem with this attempt is that Struct and Predicate are structurally equivalent and have almost the same meaning, but I will not be able to write functions that work, for example, on both (Predicate "p" []) and (Struct "p" []).
So again my question is: please, is there a better way to encode my predicates and terms such that:
I'm able to distinguish between predicate and terms in type signatures;
nested predicates like p(q(1), r(q(3), q(4))) are supported;
I can write functions that will work uniformly on predicates, without any
distinction like the one in solution #3?
Please feel free to ask me for further clarifications should you need any.
Thank you very much.
You could add a term constructor to wrap a predicate. Here, I also factored all of the literals into their own data type:
data Term = TLit Literal
          | TVar String
          | TPred Predicate
data Literal = LitS String
             | LitI Int
             | LitF Double
data Predicate = Predicate String [Term]
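As a quick check that this encoding meets the three requirements, here is a small sketch that builds the nested example p(q(1), r(q(3), q(4))) from the question and walks it with a function whose signature mentions Predicate only. The names nested and predNames are invented for the example, and the deriving clauses are added so values can be compared and printed.

```haskell
data Term = TLit Literal
          | TVar String
          | TPred Predicate
          deriving (Eq, Show)

data Literal = LitS String
             | LitI Int
             | LitF Double
             deriving (Eq, Show)

data Predicate = Predicate String [Term]
               deriving (Eq, Show)

-- p(q(1), r(q(3), q(4)))
nested :: Predicate
nested = Predicate "p" [q 1, TPred (Predicate "r" [q 3, q 4])]
  where
    q n = TPred (Predicate "q" [TLit (LitI n)])

-- Names of this predicate and of every predicate nested inside it,
-- in left-to-right, outermost-first order.
predNames :: Predicate -> [String]
predNames (Predicate name args) =
  name : concat [ predNames p | TPred p <- args ]
```

Here predNames nested yields ["p","q","r","q","q"]. The function accepts only a Predicate, nesting works through the TPred wrapper, and there is a single predicate representation to handle.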
Here's one way (that's probably not worth the trouble):
{-# LANGUAGE EmptyDataDecls #-}
-- 'T' and 'F' are short for 'True' and 'False'
data T = T
data F
-- 'p' is short for 'mayNotBeAPredicate'
data Term p
    = SConst !p String
    | IConst !p Integer
    | FConst !p Double
    | Var !p String
    | Predicate {predName :: String, predArgs :: [Term T]}
sconst :: String -> Term T
iconst :: Integer -> Term T
fconst :: Double -> Term T
var :: String -> Term T
predicate :: String -> [Term T] -> Term p
sconst = SConst T
iconst = IConst T
fconst = FConst T
var = Var T
predicate = Predicate
checkPredicate :: Term p -> Maybe (Term F)
checkPredicate (Predicate name args) = Just (Predicate name args)
checkPredicate _ = Nothing
forgetPredicate :: Term p -> Term T
forgetPredicate (SConst _ s) = sconst s
forgetPredicate (IConst _ i) = iconst i
forgetPredicate (FConst _ f) = fconst f
forgetPredicate (Var _ s) = var s
forgetPredicate (Predicate name args) = predicate name args
You can now write functions which only accept predicates by giving them an input type of Term F, and functions which accept any input type by giving them an input type of Term p.
