Haskell Stuck at parsing boolean logic - haskell

I'm currently writing a parser for a simple programming language. It's getting there, but I'm unable to parse a boolean logic statement such as "i == 0 AND j == 0". All I get back is "non exhaustive patterns in case".
When I parse a boolean expression on its own, it works fine, e.g. "i == 0". Note that "i == 0 a" also returns a boolean statement, but "i == 0 AND" does not return anything.
Can anyone help please?
Whilst the above works correctly for input such as run parseBoolean "i == 0"

As @hammar points out, you should use Text.Parsec.Expr for this kind of thing. However, since this is homework, maybe you have to do it the hard way!
The problem is in parseArithmetic: you allow anyChar to be an operator, but then in the case statement you only handle +, -, *, /, %, and ^. When parseArithmetic tries to parse i == 0, it takes the first = as the operator, but it can't parse an intExp2 from the second =, so it fails in the monad and backtracks before getting to the case statement. However, when you try to parse i == 0 AND j == 0, it gets the i == part, but then it thinks there is an arithmetic expression 0 A ND, where A is the operator and ND is the name of some variable, so it reaches the case statement, and boom.
Incidentally, instead of using the parser to match a string, and then using a case statement to match it a second time, you can have your parser return a function instead of a string, and then apply the function directly:
parseOp :: String -> a -> Parser a
parseOp op a = string op >> spaces >> return a

parseLogic :: Parser BoolExp
parseLogic = do
    boolExp1 <- parseBoolExp
    spaces
    operator <- choice [ try $ parseOp "AND" And
                       , parseOp "OR" Or
                       , parseOp "XOR" XOr
                       ]
    boolExp2 <- parseBoolExp
    return $ operator boolExp1 boolExp2

parseBoolean :: Parser BoolExp
parseBoolean = do
    intExp1 <- parseIntExp
    spaces
    operator <- choice [ try $ parseOp "==" Main.EQ
                       , parseOp "=>" GTorEQ
                       , parseOp "<=" LTorEQ
                       ]
    intExp2 <- parseIntExp
    return $ operator intExp1 intExp2
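For anyone who wants to try the idea end to end, here is a minimal self-contained sketch. The AST (IntExp, BoolExp, and the constructor name Equal in place of Main.EQ), parseIntExp, parseComparison and parseBoolExpString are all assumptions made for illustration, since the question's real types are not shown. It also chains the logic operators with chainl1, which is a choice of mine rather than something from the answer above.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical AST; the question's real data types are not shown.
data IntExp = Var String | Lit Integer
  deriving (Show, Eq)

data BoolExp
  = Equal IntExp IntExp
  | GTorEQ IntExp IntExp
  | LTorEQ IntExp IntExp
  | And BoolExp BoolExp
  | Or BoolExp BoolExp
  | XOr BoolExp BoolExp
  deriving (Show, Eq)

-- Match a fixed operator string, skip trailing spaces, return the given value.
parseOp :: String -> a -> Parser a
parseOp op a = string op >> spaces >> return a

-- A variable name or an integer literal, followed by optional spaces.
parseIntExp :: Parser IntExp
parseIntExp = (Var <$> many1 letter <|> Lit . read <$> many1 digit) <* spaces

-- One comparison; the operator parser hands back the constructor directly,
-- so there is no second case statement that could be non-exhaustive.
parseComparison :: Parser BoolExp
parseComparison = do
  lhs <- parseIntExp
  op  <- choice [ try (parseOp "==" Equal)
                , try (parseOp "=>" GTorEQ)
                , try (parseOp "<=" LTorEQ)
                ]
  rhs <- parseIntExp
  return (op lhs rhs)

-- Zero or more AND/OR/XOR between comparisons, left-associative.
parseLogic :: Parser BoolExp
parseLogic = parseComparison `chainl1` logicOp
  where
    logicOp = choice [ try (parseOp "AND" And)
                     , try (parseOp "OR" Or)
                     , try (parseOp "XOR" XOr)
                     ]

-- Hypothetical helper: parse a whole string, rejecting trailing input.
parseBoolExpString :: String -> Either ParseError BoolExp
parseBoolExpString = parse (spaces *> parseLogic <* eof) ""
```

With this sketch, parseBoolExpString "i == 0 AND j == 0" yields an And node with two Equal children, while "i == 0 AND" is reported as a parse error (a Left) instead of crashing in a case statement.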

Related

Haskell case statement grammar

How to understand the case args of in the following code?
main :: IO ()
main = do
    args <- getArgs
    case args of
        [dir, mbytes] | [(bytes ,"")] <- reads mbytes
                      , bytes >= 1    -> findDuplicates dir bytes
        (_) -> do
            name <- getProgName
            printf "Something went wrong - please use ./%s <dir> <bytes>\n" name
The guards in this case expression are making use of the PatternGuards extension, part of Haskell 2010 but not Haskell 98. The idea is that with this extension, your guards can do pattern-matching of their own, not just evaluate Bool expressions.
So your case expression has two patterns:
[dir, mbytes]
and the wildcard pattern. If [dir, mbytes] does match with args, the pattern match still might not succeed: its guards need to apply. It has two guards:
[(bytes ,"")] <- reads mbytes
which means that calling reads mbytes must match with [(bytes, "")], and
bytes >= 1
which is an ordinary boolean expression.
If all of those pattern matches and guards succeed, then the first clause of the case is the one that is used; otherwise, we fall through to the default clause and print an error message.
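To play with the same shape of case expression without the IO parts, here is a small pure sketch. The function classifyArgs and its usage message are made up for illustration; only the pattern and the two guards mirror the code in the question.

```haskell
-- Hypothetical pure variant of the case expression from the question.
classifyArgs :: [String] -> Either String (FilePath, Integer)
classifyArgs args =
  case args of
    -- The list pattern must match, then both guards must hold:
    -- a pattern guard on the result of reads, and a plain Bool guard.
    [dir, mbytes]
      | [(bytes, "")] <- reads mbytes
      , bytes >= 1 -> Right (dir, bytes)
    -- Anything else (wrong arity, non-number, number < 1) falls through.
    _ -> Left "usage: <dir> <bytes>"
```

For example, classifyArgs ["tmp", "10"] gives Right ("tmp", 10), while ["tmp", "10x"], ["tmp", "0"] and ["tmp"] all end up in the default clause.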

Haskell substring testing

When I use sub_string("abberr","habberyry") , it returns True, when obviously it should be False. The point of the function is to search for the first argument within the second one. Any ideas what's wrong?
sub_string :: (String, String) -> Bool
sub_string(_,[]) = False
sub_string([],_) = True
sub_string(a:x,b:y) | a /= b = sub_string(a:x,y)
                    | otherwise = sub_string(x,y)
Let me give you hints on why it's not working:
Your function consumes "abber" and "habber" of the input strings in the initial phase.
Now "r" and "yry" are left.
And "r" is a subsequence of "yry", so it returns True. To illustrate with a simpler example of your problem:
*Main> sub_string("rz","rwzf")
True
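A quick way to convince yourself of this is to compare the function with Data.List.isSubsequenceOf on the same inputs. The sketch below copies the question's function unchanged; the helper agreesWithSubsequence is a hypothetical name added only for the comparison.

```haskell
import Data.List (isSubsequenceOf)

-- The function from the question, unchanged apart from spacing.
sub_string :: (String, String) -> Bool
sub_string (_, []) = False
sub_string ([], _) = True
sub_string (a:x, b:y)
  | a /= b    = sub_string (a:x, y)
  | otherwise = sub_string (x, y)

-- Hypothetical helper: does the question's function give the same answer
-- as the library's subsequence test for this pair?
agreesWithSubsequence :: String -> String -> Bool
agreesWithSubsequence needle haystack =
  sub_string (needle, haystack) == (needle `isSubsequenceOf` haystack)
```

Both ("rz", "rwzf") and ("abberr", "habberyry") agree with isSubsequenceOf and return True. The two only differ in edge cases caused by the order of the first two clauses, e.g. sub_string ("abc", "abc") is False.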
First off, you need to switch your first two lines. _ will match [] and this will matter when you're matching, say, substring "abc" "abc". Secondly, it is idiomatic Haskell to write a function with two arguments instead of one with a pair argument. So your code should start out:
substring :: String -> String -> Bool
substring [] _ = True
substring _ [] = False
substring needle (h : aystack)
| ...
Now we get to the tricky case where both of these lists are non-empty. Here's the problem with simply recursing with substring on the two tails: you'll get results like "abc" being a substring of "axbxcx". That happens because "abc" will match 'a' first and then look for "bc" in the rest of the string; the algorithm then skips past the 'x' to look for "bc" in "bxcx", which will match 'b' and look for "c" in "xcx", which returns True.
Instead your condition needs to be more thorough. If you're willing to use functions from Data.List this is:
    | isPrefixOf needle (h : aystack) = True
    | otherwise = substring needle aystack
Otherwise you need to write your own isPrefixOf, for example:
isPrefixOf needle haystack = needle == take (length needle) haystack
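Putting the pieces of this answer together gives something like the following sketch. It uses Data.List.isPrefixOf rather than the hand-written version, and the as-pattern haystack@(_ : rest) is just my way of naming the whole list and its tail at once.

```haskell
import Data.List (isPrefixOf)

-- True if needle occurs as a contiguous run somewhere in the haystack.
substring :: String -> String -> Bool
substring [] _ = True          -- the empty string occurs everywhere
substring _ [] = False         -- a non-empty needle cannot occur in ""
substring needle haystack@(_ : rest)
  | needle `isPrefixOf` haystack = True                   -- match starts here
  | otherwise                    = substring needle rest  -- try one step later
```

With this definition substring "abberr" "habberyry" and substring "abc" "axbxcx" are both False, while substring "abc" "abc" is True.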
As Sibi already pointed out, your function tests for a subsequence. Review the previous exercise; it is probably isPrefixOf (see the Hackage documentation), which is just a fancy way of saying startsWith and looks very similar to the function you wrote.
If that is not the previous exercise, do that now!
Then write sub_string in terms of isPrefixOf:
sub_string (x, b:y) = isPrefixOf ... ?? ???
Fill in the dots and "?" yourself.

How to search a pattern from a file/String in Haskell

Old:
Suppose we have a pattern, e.g. "1101000111001110".
Now I have a pattern to be searched for, e.g. "1101". I am new to the Haskell world and am trying this on my own. I am able to do it in C but need to do it in Haskell.
Given Pattern := "1101000111001110"
Pattern To Be Searched :- "110
Desired Output :- "Pattern Found"
New:
import Data.List (isInfixOf)
main = do x <- readFile "read.txt"
          putStr x
isSubb :: [Char] -> [Char] -> Bool
isSubb sub str = isInfixOf sub str
This code reads a file named "read.txt", which contains the string 110100001101. Using isInfixOf you can check for the pattern "1101" in a string, and the result will be True.
But the problem is that I am not able to search for "1101" in the string read from "read.txt".
I need to compare the string in "read.txt" with a string provided by the user, i.e.
one string is in the file "read.txt"
and the second string is provided by the user; we then search to find out whether the user's string occurs in the string from "read.txt".
Answer to new:
To achieve this, you have to use readLn:
sub <- readLn
readLn accepts input until a \n is encountered, parses it with read, and <- binds the result to sub. Watch out: if the input should be a string, you have to explicitly type the quotation marks around it.
Alternatively, if you do not feel like typing the quotation marks every time, you can use getLine in place of readLn. It has the type IO String, so sub is a String after the bind.
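The difference is easy to see without any IO by using readMaybe from Text.Read, which applies the same kind of parsing to a string that readLn applies to a line of input. The helper name parseQuoted is made up for this sketch.

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: parse a line the way a Read-based input function
-- would when a String is expected. Only quoted input is accepted.
parseQuoted :: String -> Maybe String
parseQuoted = readMaybe
```

Here parseQuoted "\"1101\"" is Just "1101", but parseQuoted "1101" is Nothing, because 1101 without quotation marks is not valid syntax for a String. getLine has no such step and returns the typed characters as they are.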
For further information on all functions included in the standard libraries of Haskell see Hoogle. Using Hoogle you can search functions by various criteria and will often find functions which suit your needs.
Answer to old:
Use the isInfixOf function from Data.List to search for the pattern:
import Data.List (isInfixOf)
isInfixOf "1101" "1101000111001110" -- evaluates to True
It returns True if the first sequence occurs as a contiguous run in the second, and False otherwise.
To read a file and get its contents use readFile:
contents <- readFile "filename.txt"
You will get the whole file as one string, which you can now perform standard functions on.
Outputting "Pattern found" should be trivial then.
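For completeness, a minimal sketch that combines both answers might look like this. The names describeSearch and searchFile, the prompt text and the "Pattern Not Found" message are my own assumptions; only readFile, getLine and isInfixOf come from the answers.

```haskell
import Data.List (isInfixOf)

-- Pure part: pick the message for a given pattern and file contents.
-- "Pattern Not Found" is an assumed wording for the negative case.
describeSearch :: String -> String -> String
describeSearch needle contents
  | needle `isInfixOf` contents = "Pattern Found"
  | otherwise                   = "Pattern Not Found"

-- IO part: read the file, ask the user for a pattern, print the verdict.
searchFile :: FilePath -> IO ()
searchFile path = do
  contents <- readFile path
  putStrLn "Enter the pattern to search for:"
  needle <- getLine          -- no quotation marks needed, unlike readLn
  putStrLn (describeSearch needle contents)
```

To use it as a program, define main = searchFile "read.txt". Keeping the decision in a pure function makes it easy to try in GHCi, e.g. describeSearch "1101" "110100001101".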

Parsec.Expr repeated Prefix/Postfix operator not supported

The documentation for Parsec.Expr.buildExpressionParser says:
Prefix and postfix operators of the same precedence can only occur
once (i.e. --2 is not allowed if - is prefix negate).
and indeed, this is biting me, since the language I am trying to parse allows arbitrary repetition of its prefix and postfix operators (think of a C expression like **a[1][2]).
So, why does Parsec make this restriction, and how can I work around it?
I think I can move my prefix/postfix parsers down into the term parser since they have the highest precedence.
i.e.
**a + 1
is parsed as
(*(*(a)))+(1)
but what could I have done if I wanted it to parse as
*(*((a)+(1)))
if buildExpressionParser did what I want, I could simply have rearranged the order of the operators in the table.
Note See here for a better solution
I solved it myself by using chainl1:
prefix p = Prefix . chainl1 p $ return (.)
postfix p = Postfix . chainl1 p $ return (flip (.))
These combinators use chainl1 with an op parser that always succeeds and simply composes the functions returned by the operator parser p in left-to-right or right-to-left order. They can be used in the buildExpressionParser table; where you would have done this:
exprTable = [ [ Postfix subscr
              , Postfix dot
              ]
            , [ Prefix pos
              , Prefix neg
              ]
            ]
you now do this:
exprTable = [ [ postfix $ choice [ subscr
                                 , dot
                                 ]
              ]
            , [ prefix $ choice [ pos
                                , neg
                                ]
              ]
            ]
In this way, buildExpressionParser can still be used to set operator precedence, but it now only sees a single Prefix or Postfix operator at each precedence level. However, that operator has the ability to slurp up as many copies of itself as it can and return a function which makes it look as if there were only a single operator.
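Here is a small self-contained sketch of the trick on a toy grammar. The Expr type, the symbol/term/subscr parsers and parseExpr are assumptions made up for the example; prefix and postfix are the two combinators from above, with type signatures added for a String-based parser.

```haskell
import Data.Functor.Identity (Identity)
import Text.Parsec
import Text.Parsec.Expr
import Text.Parsec.String (Parser)

-- Hypothetical toy AST for the example.
data Expr
  = Var String
  | Num Integer
  | Deref Expr
  | Neg Expr
  | Index Expr Integer
  | Add Expr Expr
  deriving (Show, Eq)

-- Repeatable prefix/postfix operators, as defined in the answer.
prefix, postfix :: Parser (a -> a) -> Operator String () Identity a
prefix  p = Prefix  . chainl1 p $ return (.)
postfix p = Postfix . chainl1 p $ return (flip (.))

-- A single character followed by optional spaces.
symbol :: Char -> Parser Char
symbol c = char c <* spaces

-- Variables and integer literals.
term :: Parser Expr
term = (Var <$> many1 letter <|> Num . read <$> many1 digit) <* spaces

-- One subscript such as [1], returned as a function on the indexed expression.
subscr :: Parser (Expr -> Expr)
subscr = do
  i <- between (symbol '[') (symbol ']') (read <$> many1 digit)
  return (\e -> Index e i)

-- Highest precedence first: subscripts, then prefix operators, then +.
table :: OperatorTable String () Identity Expr
table =
  [ [ postfix subscr ]
  , [ prefix (choice [ Deref <$ symbol '*', Neg <$ symbol '-' ]) ]
  , [ Infix (Add <$ symbol '+') AssocLeft ]
  ]

parseExpr :: String -> Either ParseError Expr
parseExpr = parse (spaces *> buildExpressionParser table term <* eof) ""
```

In this sketch parseExpr "**a[1][2]" produces Deref (Deref (Index (Index (Var "a") 1) 2)), and "--2 + 1" produces Add (Neg (Neg (Num 2))) (Num 1). Moving the prefix row below the Infix row would instead make the prefix operators bind looser than +.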

Haskell: Delimit a string by chosen sub-strings and whitespace

I am still new to Haskell, so I apologize if there is an obvious answer to this...
I would like to make a function that splits up all of the following lists of strings, i.e. [String]:
["int x = 1", "y := x + 123"]
["int x= 1", "y:= x+123"]
["int x=1", "y:=x+123"]
All into the same list of lists of strings, i.e. [[String]]:
[["int", "x", "=", "1"], ["y", ":=", "x", "+", "123"]]
You can use map words.lines for the first [String].
But I do not know any really neat ways to also take into account the others - where you would be using the various sub-strings "=", ":=", "+" etc. to break up the main string.
Thank you for taking the time to enlighten me on Haskell :-)
The Prelude comes with a little-known handy function called lex, which is a lexer for Haskell expressions. These match the form you need.
lex :: String -> [(String,String)]
What a weird type though! The list is there for interfacing with a standard type of parser, but I'm pretty sure lex always returns either 1 or 0 elements (0 indicating a parse failure). The tuple is (token-lexed, rest-of-input), so lex only pulls off one token. So a simple way to lex a whole string would be:
lexStr :: String -> [String]
lexStr "" = []
lexStr s =
    case lex s of
        [(tok,rest)] -> tok : lexStr rest
        [] -> error "Failed lex"
To appease the pedants, this code is in terrible form. An explicit call to error instead of returning a reasonable error using Maybe, assuming lex only returns 1 or 0 elements, etc. The code that does this reliably is about the same length, but is significantly more abstract, so I spared your beginner eyes.
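For the curious, a somewhat more careful variant that reports failure with Maybe instead of calling error could look like the sketch below. The names lexAll and lexLines are made up here. It relies on lex returning an empty token once only whitespace is left, and it treats any result other than exactly one token as a failure.

```haskell
-- Lex a whole string into tokens, or Nothing if lex gives up somewhere.
lexAll :: String -> Maybe [String]
lexAll s =
  case lex s of
    [("", _)]     -> Just []                     -- nothing but whitespace left
    [(tok, rest)] -> (tok :) <$> lexAll rest     -- one token, keep going
    _             -> Nothing                     -- lex failure (or ambiguity)

-- Apply the lexer to every line; fails as a whole if any line fails.
lexLines :: [String] -> Maybe [[String]]
lexLines = traverse lexAll
```

All three inputs from the question, e.g. lexLines ["int x=1", "y:=x+123"], give Just [["int","x","=","1"],["y",":=","x","+","123"]]. Keep in mind that lex follows Haskell's lexical rules, so adjacent symbol characters such as =- are read as a single token.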
I would take a look at parsec and build a simple grammar for parsing your strings.
How about using words? :)
words :: String -> [String]
and words doesn't care how much whitespace there is:
words "Hello World"
= words "Hello World"
= ["Hello", "World"]
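As a quick check of what words does and does not handle for the inputs in the question, here is a tiny sketch; tokenizeSpaced is a hypothetical name for map words.

```haskell
-- Split every line on runs of whitespace. Works only when the tokens
-- are already separated by spaces.
tokenizeSpaced :: [String] -> [[String]]
tokenizeSpaced = map words
```

tokenizeSpaced ["int x = 1", "y := x + 123"] gives the desired [["int","x","=","1"],["y",":=","x","+","123"]], but words "int x=1" is ["int","x=1"], so the unspaced variants still need a real lexer or parser.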
