I am trying to write a recursive data structure in Haskell in order to represent an expression tree. I have this data type:
data Expr =
    And(Expr, Expr ) |
    Or (Expr, Expr ) |
    (/) Expr Expr
And I would like to pattern-match every data constructor in a function:
toStringE :: Expr -> String
toStringE e = case e of
    And(a,b) -> "and(" ++ toStringE a ++ ", " ++ toStringE b ++ ")"
    Or(a,b) -> "or(" ++ toStringE a ++ ", " ++ toStringE b ++ ")"
    (/) expr1 expr2 -> (toStringE expr1) ++ " / " ++ (toStringE expr2)
But when I try to compile, I get this error on the last line of the toStringE function:
Parse error in pattern: (/)
What do you think I am doing wrong?
(/) isn't a valid infix data constructor. All valid infix data constructors start with :. You could name the constructor (:/), for instance.
The reason for this is that pattern matching depends on the ability to determine whether something is a constructor or not lexically. It does that by examining the first character of the identifier. If it's a capital letter or :, it is a constructor. If it's not one of those, it's a variable to be bound by the pattern match.
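A minimal sketch of the fix, assuming the constructor is renamed to :/ (any operator name starting with : would do). The Var constructor is hypothetical; it is added only because the original type has no leaf case, so no finite value could be built without one:

```haskell
-- Infix data constructors must begin with ':'.
-- Var is a hypothetical leaf constructor, not part of the original type.
data Expr
  = And (Expr, Expr)
  | Or (Expr, Expr)
  | Expr :/ Expr
  | Var String

toStringE :: Expr -> String
toStringE e = case e of
  And (a, b) -> "and(" ++ toStringE a ++ ", " ++ toStringE b ++ ")"
  Or (a, b)  -> "or(" ++ toStringE a ++ ", " ++ toStringE b ++ ")"
  a :/ b     -> toStringE a ++ " / " ++ toStringE b
  Var name   -> name
```

Because :/ starts with a colon, the pattern a :/ b is recognised lexically as a constructor pattern rather than as a variable binding.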
I'm trying to add a parser for infix operators to a simple expressions parser. I have already looked at the documentation and at this question, but it seems like I am missing something.
import qualified Text.Parsec.Expr as Expr
import qualified Text.Parsec.Token as Tokens
import Text.ParserCombinators.Parsec
import Text.Parsec
data Expr = Number Integer
          | Op Expr Expr
          | Boolean Bool
instance Show Expr where
    show (Op l r) = "(+ " ++ (show l) ++ " " ++ (show r) ++ ")"
    show (Number r) = show r
    show (Boolean b) = show b
parens = Tokens.parens haskell
reserved = Tokens.reservedOp haskell
infix_ operator func =
    Expr.Infix (spaces >> reserved operator >> spaces >> return func) Expr.AssocLeft
infixOp =
    Expr.buildExpressionParser table parser
  where
    table = [[infix_ "+" Op]]
number :: Parser Expr
number =
    do num <- many1 digit
       return $ Number $ read num
bool :: Parser Expr
bool = (string "true" >> return (Boolean True)) <|> (string "false" >> return (Boolean False))
parser = parens infixOp <|> number <|> bool
run = Text.Parsec.runParser parser () ""
This parser is able to parse expressions like 1, false, (1 + 2), (1 + false), but not 1 + 2 (it's parsed as 1). If I try to change the parser to parens infixOp <|> infixOp <|> number <|> bool, it gets stuck.
What should I change in order to parse expressions like 1 + 2 without parentheses?
You have to run the infixOp parser at the top level like this:
run = Text.Parsec.runParser infixOp () ""
Otherwise your infix expressions can only be parsed when they occur in parentheses.
The attempt to use parens infixOp <|> infixOp <|> number <|> bool most likely gets stuck because it loops without consuming input: parser tries to parse using infixOp, which tries to parse using parser, and so on...
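Here is a self-contained sketch of that arrangement, not a drop-in patch for the code above. It assumes the parsec package is available and replaces the haskell token parser with a small hypothetical lexeme helper; the names term, expr and lexeme are my own:

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Expr as E

data Expr = Number Integer
          | Op Expr Expr
          | Boolean Bool
          deriving (Eq, Show)

-- Hypothetical helper: run a parser, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

-- Terms never start by calling expr directly, only after consuming '(',
-- so there is no loop.
term :: Parser Expr
term = between (lexeme (char '(')) (lexeme (char ')')) expr
   <|> lexeme (Number . read <$> many1 digit)
   <|> lexeme (Boolean True <$ string "true")
   <|> lexeme (Boolean False <$ string "false")

-- The operator table is built over terms.
expr :: Parser Expr
expr = E.buildExpressionParser table term
  where table = [[E.Infix (Op <$ lexeme (char '+')) E.AssocLeft]]

-- The expression parser, not the term parser, runs at the top level.
run :: String -> Either ParseError Expr
run = parse (spaces *> expr <* eof) ""
```

With this layout, run "1 + 2" reaches the operator table even without surrounding parentheses, because expr is the entry point.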
These tutorials might help you get started with parsec (they did for me):
https://wiki.haskell.org/Parsing_a_simple_imperative_language
http://dev.stephendiehl.com/fun/002_parsers.html
I am using GHC version 8.0.2 on Windows 7, & module Debug.Trace.
In the trace of the parse function below, my insertion of ++ show first results in the following error:
No instance for (Show a) arising from a use of `show'
Possible fix:
add (Show a) to the context of
the type signature for:
parse :: Parser a -> String -> [(a, String)]
In the first argument of `(++)', namely `show first'
In the second argument of `(++)', namely
`show first ++ "," ++ show second ++ ")]"'
In the second argument of `(++)', namely
`" -> [(" ++ show first ++ "," ++ show second ++ ")]"'
My question: is there a way to show the first element of the ordered pair (a,String) even though its type is not known at compile-time?
My source code is shown below:
{-# LANGUAGE MonomorphismRestriction #-}
import Data.Typeable
import Data.Char
import Debug.Trace
newtype Parser a = P ( String -> [(a,String)] )
parse :: Parser a -> String -> [(a,String)]
parse (P p) input | trace
    ( let result = (p input)
          element = head result
          first = fst element
          second = snd element
      in ("parse maps " ++ input ++ " -> [(" ++ show first ++ "," ++ show second ++ ")]")
    ) False = undefined
parse (P p) input = p input
nextChar :: Parser Char
nextChar = P ( \input -> case input of { [] -> [] ; (c:cs) -> [(c,cs)] } )
I am hoping to trace evaluation of parse nextChar "ABCD".
Yes, sure, just follow the instructions in the error:
parse :: Show a => Parser a -> String -> [(a,String)]
Once you're done debugging, you can delete the call to trace and the Show constraint; then you'll be able to parse un-Showable things again.
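For reference, a self-contained sketch of the traced parser with the constraint added. It shows the whole result list instead of head result, which is my own simplification so that an empty result does not crash the trace:

```haskell
import Debug.Trace (trace)

newtype Parser a = P (String -> [(a, String)])

-- The Show a constraint is what lets the trace message call show on the
-- parsed values; drop it together with the trace call when done debugging.
parse :: Show a => Parser a -> String -> [(a, String)]
parse (P p) input =
  trace ("parse maps " ++ show input ++ " -> " ++ show result) result
  where
    result = p input

nextChar :: Parser Char
nextChar = P $ \input -> case input of
  []       -> []
  (c : cs) -> [(c, cs)]
```

Evaluating parse nextChar "ABCD" then writes the trace message to stderr and returns the usual result list.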
I want to implement a function for showing a propositional formula in SML. The solutions that I have found so far are of this form:
fun show (Atom a) = a
| show (Neg p) = "(~ " ^ show p ^ ")"
| show (Conj(p,q)) = "(" ^ show p ^ " & " ^ show q ^ ")"
| show (Disj(p,q)) = "(" ^ show p ^ " | " ^ show q ^ ")";
This produces unnecessary parentheses:
((~p) & (q | r))
when what I'd like to have is:
~ p & (q | r)
I saw that Haskell has a function (display?) that does this nicely. Can someone help me out a little? How should I go about this?
If you want to eliminate redundant parentheses, you will need to pass around some precedence information. For example, in Haskell, the showsPrec function embodies this pattern; it has type
showsPrec :: Show a => Int -> a -> String -> String
where the first Int argument is the precedence of the current printing context. The extra String argument is a trick to get efficient list appending. I'll demonstrate how to write a similar function for your type, though admittedly in Haskell (since I know that language best) and without using the extra efficiency trick.
The idea is to first build a string that has no top-level parentheses -- but does have all the parentheses needed to disambiguate subterms -- then add parentheses only if necessary. The unbracketed computation below does the first step. Then the only question is: when should we put parentheses around our term? Well, the answer to that is that things should be parenthesized when a low-precedence term is an argument to a high-precedence operator. So we need to compare the precedence of our immediate "parent" -- called dCntxt in the code below -- to the precedence of the term we're currently rendering -- called dHere in the code below. The bracket function below either adds parentheses or leaves the string alone based on the result of this comparison.
data Formula
    = Atom String
    | Neg Formula
    | Conj Formula Formula
    | Disj Formula Formula
precedence :: Formula -> Int
precedence Atom{} = 4
precedence Neg {} = 3
precedence Conj{} = 2
precedence Disj{} = 1
displayPrec :: Int -> Formula -> String
displayPrec dCntxt f = bracket unbracketed where
    dHere = precedence f
    recurse = displayPrec dHere
    unbracketed = case f of
        Atom s -> s
        Neg p -> "~ " ++ recurse p
        Conj p q -> recurse p ++ " & " ++ recurse q
        Disj p q -> recurse p ++ " | " ++ recurse q
    bracket
        | dCntxt > dHere = \s -> "(" ++ s ++ ")"
        | otherwise = id
display :: Formula -> String
display = displayPrec 0
Here's how it looks in action.
*Main> display (Neg (Conj (Disj (Conj (Atom "a") (Atom "b")) (Atom "c")) (Conj (Atom "d") (Atom "e"))))
"~ ((a & b | c) & d & e)"
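For comparison, here is a sketch of the same idea written in the showsPrec style mentioned earlier, using ShowS and the Prelude's showParen. The name displaysPrec is my own, and the precedence numbers are the ones from the precedence function above:

```haskell
data Formula
  = Atom String
  | Neg Formula
  | Conj Formula Formula
  | Disj Formula Formula

-- The Int is the precedence of the surrounding context; showParen adds
-- parentheses only when that context binds tighter than this term.
displaysPrec :: Int -> Formula -> ShowS
displaysPrec _ (Atom s)   = showString s
displaysPrec d (Neg p)    = showParen (d > 3) $
  showString "~ " . displaysPrec 3 p
displaysPrec d (Conj p q) = showParen (d > 2) $
  displaysPrec 2 p . showString " & " . displaysPrec 2 q
displaysPrec d (Disj p q) = showParen (d > 1) $
  displaysPrec 1 p . showString " | " . displaysPrec 1 q

display :: Formula -> String
display f = displaysPrec 0 f ""
```

On the formula from the question, display (Conj (Neg (Atom "p")) (Disj (Atom "q") (Atom "r"))) gives "~ p & (q | r)".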
I'm programming a standard math notation -> DC POSIX-compliant format converter. It takes the input string, parses it into an intermediary data type and then turns it into the output string by showing it.
This is the data type used. I have no problems with the data type -> output string conversion; it works flawlessly:
data Expression = Expression :+ Expression
                | Expression :- Expression
                | Expression :* Expression
                | Expression :/ Expression
                | Expression :^ Expression
                | Cons String
infixr 0 :+
infixr 0 :-
infixr 1 :*
infixr 1 :/
infixr 2 :^
instance Show Expression where
    show (x :+ y) = unwords [show x, show y, "+"]
    show (x :- y) = unwords [show x, show y, "-"]
    show (x :* y) = unwords [show x, show y, "*"]
    show (x :/ y) = unwords [show x, show y, "/"]
    show (x :^ y) = unwords [show x, show y, "^"]
    show (Cons y) = y
The Parsec parser part, however, refuses to comply with the defined operator precedence rules, clearly because of the way chainl1 is used in the subexpression parser definition:
expression :: Parser Expression
expression = do
    spaces
    x <- subexpression
    spaces >> eof >> return x
subexpression :: Parser Expression
subexpression = (
        (bracketed subexpression) <|>
        constant
    ) `chainl1` (
        try addition <|>
        try substraction <|>
        try multiplication <|>
        try division <|>
        try exponentiation
    )
addition = operator '+' (:+)
substraction = operator '-' (:-)
multiplication = operator '*' (:*)
division = operator '/' (:/)
exponentiation = operator '^' (:^)
operator :: Char -> (a -> a -> a) -> Parser (a -> a -> a)
operator c op = do
    spaces >> char c >> spaces
    return op
bracketed :: Parser a -> Parser a
bracketed parser = do
    char '('
    x <- parser
    char ')'
    return x
constant :: Parser Expression
constant = do
    parity <- optionMaybe $ oneOf "-+"
    constant <- many1 (digit <|> char '.')
    return (if parity == Just '-'
        then (Cons $ '_':constant)
        else Cons constant)
Is there a way of making the parser take into account the operator precedence rules without having to rewrite the entirety of my code?
Well, you don't need to rewrite your entire code, but since your subexpression parser doesn't take precedence into account at all, you have to rewrite that - substantially.
One possibility is to build it from parsers for subexpressions with top-level operators with the same precedence,
atom :: Parser Expression
atom = bracketed subexpression <|> constant
-- highest precedence operator is exponentiation, usually that's
-- right-associative, hence I use chainr1 here
powers :: Parser Expression
powers = atom `chainr1` try exponentiation
-- a multiplicative expression is a product or quotient of powers,
-- left-associative
multis :: Parser Expression
multis = powers `chainl1` (try multiplication <|> try division)
-- a subexpression is a sum (or difference) of multiplicative expressions
subexpression :: Parser Expression
subexpression = multis `chainl1` (try addition <|> try substraction)
Another option is to let the precedence and associativities be taken care of by the library and use Text.Parsec.Expr, namely buildExpressionParser:
table = [ [binary "^" (:^) AssocRight]
        , [binary "*" (:*) AssocLeft, binary "/" (:/) AssocLeft]
        , [binary "+" (:+) AssocLeft, binary "-" (:-) AssocLeft]
        ]
binary name fun assoc = Infix (do{ string name; spaces; return fun }) assoc
subexpression = buildExpressionParser table atom
(which requires that bracketed parser and constant consume the spaces after the used tokens).
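Put together, the table-based variant might look like the following sketch. It assumes the parsec package; lexeme and toDc are hypothetical helpers of mine (toDc stands in for the Show instance from the question), and signed constants are left out to keep the example short:

```haskell
import Data.Functor.Identity (Identity)
import Text.Parsec
import Text.Parsec.Expr
import Text.Parsec.String (Parser)

data Expression
  = Expression :+ Expression
  | Expression :- Expression
  | Expression :* Expression
  | Expression :/ Expression
  | Expression :^ Expression
  | Cons String

-- Postfix rendering, mirroring the Show instance from the question.
toDc :: Expression -> String
toDc (x :+ y) = unwords [toDc x, toDc y, "+"]
toDc (x :- y) = unwords [toDc x, toDc y, "-"]
toDc (x :* y) = unwords [toDc x, toDc y, "*"]
toDc (x :/ y) = unwords [toDc x, toDc y, "/"]
toDc (x :^ y) = unwords [toDc x, toDc y, "^"]
toDc (Cons s) = s

-- Every token consumes the whitespace that follows it.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

atom :: Parser Expression
atom = between (lexeme (char '(')) (lexeme (char ')')) subexpression
   <|> lexeme (Cons <$> many1 (digit <|> char '.'))

binary :: Char -> (a -> a -> a) -> Assoc -> Operator String () Identity a
binary c f = Infix (f <$ lexeme (char c))

-- Rows are listed from highest to lowest precedence.
table :: [[Operator String () Identity Expression]]
table =
  [ [binary '^' (:^) AssocRight]
  , [binary '*' (:*) AssocLeft, binary '/' (:/) AssocLeft]
  , [binary '+' (:+) AssocLeft, binary '-' (:-) AssocLeft]
  ]

subexpression :: Parser Expression
subexpression = buildExpressionParser table atom

expression :: Parser Expression
expression = spaces *> subexpression <* eof
```

For example, fmap toDc (parse expression "" "1 + 2 * 3") yields Right "1 2 3 * +", since multiplication now binds tighter than addition.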
This function generates simple .dot files for visualizing automata transition functions using Graphviz. Its primary purpose is debugging large sets of automatically generated transitions (e.g., the inflections of Latin verbs).
prepGraph :: ( ... ) => NFA c b a -> [String]
prepGraph nfa = "digraph finite_state_machine {"
              : wrapSp "rankdir = LR"
              : wrapSp ("node [shape = circle]" ++ (mapSp (states nfa \\ terminal nfa)))
              : wrapSp ("node [shape = doublecircle]" ++ (mapSp $ terminal nfa))
              : formatGraph nfa ++ ["}"]
formatGraph :: ( ... ) => NFA c b a -> [String]
formatGraph = map formatDelta . deltaTuples
    where formatDelta (a, a', bc) = wrapSp (mkArrow a a' ++ " " ++ mkLabel bc)
          mkArrow x y = show x ++ " -> " ++ show y
          mkLabel (y, z) = case z of
              (Just t) -> "[ label = \"(" ++ show y ++ ", " ++ show t ++ ")\" ]"
              Nothing -> "[ label = \"(" ++ show y ++ ", " ++ "Null" ++ ")\" ]"
where wrap, wrapSp and mapSp are formatting functions, as is deltaTuples.
The problem is that formatGraph retains double quotes around Strings, which causes errors in Graphviz. E.g., when I print unlines $ prepGraph to a file, I get things like:
0 -> 1 [ label = "('a', "N. SF")" ];
instead of
0 -> 1 [ label = "('a', N. SF)" ];
(However, "Null" seems to work fine, and outputs perfectly well). Now of course the string "N. SF" isn't the actual form I use to store inflections, but that form does include a String or two. So how can I tell Haskell: when you show a String value, don't double-quote it?
Check out how Martin Erwig handled the same problem in Data.Graph.Inductive.Graphviz:
http://hackage.haskell.org/packages/archive/fgl/5.4.2.3/doc/html/src/Data-Graph-Inductive-Graphviz.html
The function you're looking for is "sq" at the bottom:
sq :: String -> String
sq s@[c] = s
sq ('"':s) | last s == '"' = init s
           | otherwise = s
sq ('\'':s) | last s == '\'' = init s
            | otherwise = s
sq s = s
(check out the context and adapt for your own code, of course)
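As a rough illustration of how sq could be combined with show, here is a small sketch. The wrapper unquoted is a hypothetical name of mine, not something from fgl:

```haskell
-- Strip one pair of surrounding double or single quotes, if present.
sq :: String -> String
sq s@[_]                     = s
sq ('"' : s)  | last s == '"'  = init s
              | otherwise      = s
sq ('\'' : s) | last s == '\'' = init s
              | otherwise      = s
sq s                         = s

-- Hypothetical wrapper: show a value, then drop the outer quotes.
unquoted :: Show a => a -> String
unquoted = sq . show
```

In mkLabel, show t could then be swapped for unquoted t. Note that this also strips the single quotes that show puts around a Char, and it only removes the outermost quotes, so a String nested inside a larger shown value keeps its quotes.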
Use the dotgen package; it has special safeguards in place to prevent forbidden chars from sneaking into attribute values.
You could define your own type class like this:
-- The String instance needs {-# LANGUAGE FlexibleInstances #-}.
class Show a => GShow a where
    gShow :: a -> String
    gShow = show
instance GShow String where
    gShow = id
instance GShow Integer
instance GShow Char
-- And so on for all the types you need.
The default implementation for "gShow" is "show", so you don't need a "where" clause for every instance. But you do need all the instances, which is a bit of a drag.
Alternatively you could use overlapping instances. I think (although I haven't tried it) that this will let you replace the list of instances using the default "gShow" by a single line:
instance (Show a) => GShow a
The idea is that with overlapping instances the compiler will choose the most specific instance available. So for strings it will pick the String instance over the more general one, and for everything else the general one is the only one that matches.
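A sketch of what the overlapping-instances variant could look like, assuming GHC 7.10 or later for the per-instance OVERLAPPABLE and OVERLAPPING pragmas. Here the class has no default method; the general instance supplies show instead:

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE UndecidableInstances #-}

class GShow a where
  gShow :: a -> String

-- General fallback: any Show-able type is rendered with show.
instance {-# OVERLAPPABLE #-} Show a => GShow a where
  gShow = show

-- More specific instance: a String is rendered as-is, without quotes.
instance {-# OVERLAPPING #-} GShow String where
  gShow = id
```

With these two instances, gShow "N. SF" returns the string unchanged, while gShow 'a' and gShow (42 :: Int) fall through to show.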
It seems a little ugly, but you could apply a filter to show t:
filter (/='"') (show t)