Checking if the remaining input is whitespace with parsec - haskell

I'm trying out the parsec library and I'm not sure how to handle this basic task.
Suppose I have the following:
data Foo = A | AB
and I want the string "a" to be parsed as A and "a b" as AB. If I just do this:
parseA :: Parser Foo
parseA = do
  reserved "a"
  return A

parseAB :: Parser Foo
parseAB = do
  reserved "a"
  reserved "b"
  return AB

parseFoo :: Parser Foo
parseFoo = parseA
       <|> parseAB
then parseFoo will parse "a b" as A since parseA doesn't care that there is non-whitespace still left after consuming the 'a'. How can this be fixed?

You need to change the grammar to AB | A and use try from Parsec, which gives your parser lookahead capability.
This should work:
parseFoo = try parseAB <|> parseA
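If you also want to reject trailing non-whitespace (the title asks about checking that only whitespace remains), a common trick is to demand end of input after the whole parse. A minimal sketch, assuming reserved comes from a Parsec token parser (so it already skips trailing whitespace) and using the standard eof combinator; parseFooOnly is a name introduced here just for illustration:
parseFooOnly :: Parser Foo
parseFooOnly = parseFoo <* eof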

Related

Haskell - Parsec :: Parse spaces until string literal

I am currently trying to design a Parser in Haskell using Parsec.
The syntax for declaring type should look something like this:
Fruit is a Apple
Types should also be able have parameters:
Fruit a b is a Apple
Where
Fruit has type Name
a b has type [Parameter]
Apple has type Value
The problem here is that my parser currently does not know when to stop parsing the parameters and start parsing the value.
The code is as follows:
newtype Name = Name String deriving (Show)
newtype Parameter = Parameter String deriving (Show)
newtype Value = Value String deriving (Show)
data TypeAssignment = TypeAssignment Name [Parameter] Value deriving (Show)
-- first variant using `sepBy`
typeAssigment :: Parser TypeAssignment
typeAssigment =
  TypeAssignment
    <$> name
    <*> (space *> parameter `sepBy` space)
    <*> (string "is a" *> value)

-- second variant using manyTill
typeAssigment2 :: Parser TypeAssignment
typeAssigment2 =
  TypeAssignment
    <$> name
    <*> (space *> manyTill parameter (string "is a"))
    <*> value
name :: Parser Name
name = Name <$> word
parameter :: Parser Parameter
parameter = Parameter <$> word
value :: Parser Value
value = Value <$> word
word :: Parser String
word = (:) <$> letter <*> many (letter <|> digit)
I have tried parsing the parameters/value in the two ways I know how to (once with sepBy and once with manyTill); both failed with nearly the same parse error:
*EParser> parseTest typeAssigment "Fruit a b is a Apple"
parse error at (line 1, column 21):
unexpected end of input
expecting space or "is a"
*EParser> parseTest typeAssigment2 "Fruit a b is a Apple"
parse error at (line 1, column 8):
unexpected " "
expecting letter, digit or "is a"
The problem with the first variant (typeAssigment) is that "is" and "a" are perfectly valid parameter parses. So the parameter parsing gobbles up the whole input until nothing is left, and then you get an error. In fact, if you look closely at that error you see this to be true: the parser is expecting either a space (for more parameters) or "is a" (the terminal of your whole parser).
On the other hand, typeAssigment2 is really close, but it seems that you're not handling spaces properly. In order to parse many parameters, you need to parse all the spaces between those parameters, not just the first one.
I think the following alternative should do the trick. Note that the terminator includes the leading space, so it isn't swallowed by the parameter parser before "is a" can match:
typeAssigment3 :: Parser TypeAssignment
typeAssigment3 =
  TypeAssignment
    <$> name
    <*> manyTill (space *> parameter) (try $ string " is a")
    <*> (space *> value)
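Assuming the definitions above are loaded, a quick check with parseTest should now parse the sample input in one go; something like (the exact rendering depends on the derived Show instances):
*EParser> parseTest typeAssigment3 "Fruit a b is a Apple"
TypeAssignment (Name "Fruit") [Parameter "a",Parameter "b"] (Value "Apple")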

Parser Combinators (ReadP) - return entire list if it fails, otherwise return only the passing one

I have a string, for example "MMMMABCNNNXYZPPPPP". I know that this string may have ABC in it and may have XYZ in it, but it is not required to have either. Additionally, the XYZ may be swapped for DEF (e.g. "MMMMABCNNNDEFPPPPP") and the behavior should remain the same.
I would like to parse the string and return the sequences between them, as well as which one of XYZ or DEF was present. Example:
data Divider1 = Abc
data Divider2 = Xyz | Def
--"MMMMABCNNNXYZPPPPP" should return ("MMMM", Just Abc, "NNN", Just Xyz, "PPPPP")
--"MMMMABCNNNDEFPPPPP" should return ("MMMM", Just Abc, "NNN", Just Def, "PPPPP")
Note that if ABC is not present, I would like to return everything before the divider2 and if XYZ and DEF are both not present, I would like to return everything after divider 1.
Example:
--"MMMMNNNXYZPPPPP" should return ("MMMM", Nothing, "NNN", Just Xyz, "PPPPP")
--"MMMMABCNNNPPPPP" should return ("MMMM", Just Abc, "NNN", Nothing, "PPPPP")
If neither ABC nor XYZ is present then I don't care if it returns nothing, or if it returns the entire string.
Currently my code is
parseEverything = many $ satisfy someGeneralCondition -- check if all characters are valid
parseAbc = (\str -> Abc) <$> string "ABC"
parseXyz = (\str -> Xyz) <$> string "XYZ"
parseDef = (\str -> Def) <$> string "DEF"
parseFull = do
  beforeAbc <- gather parseEverything
  parseAbc <- (Just <$> parseAbc) <++ return Nothing
  beforeDivider2 <- gather parseEverything
  parseDivider2 <- (Just <$> parseXyz) <++ (Just <$> parseDef) <++ return Nothing
  everythingElse <- look
  return (beforeAbc, parseAbc, beforeDivider2, parseDivider2, everythingElse)
But when I run this on the example string "MMMMABCNNNXYZPPPPP", I get mostly failed results with just one result that I want. The problem is that I need to return everything in beforeAbc if parseAbc fails, but if parseAbc passes then I just need to return that. And the same thing with parseXyz and parseDef. I don't think that <++ is the correct operator to do this. I also tried a variant of this code using option, but it gave the same result. Is there a simple solution that I am missing, and/or should I set up the parsers in a different way?
Thanks in advance!
Edit: does this have anything to do with chainl or chainr or manyTill?
Updated: See note on applicative parsers below.
Here's what's going wrong with your current approach. As you undoubtedly know, the parsers in Text.ParserCombinators.ReadP generate all possible valid parses of all possible prefixes of the string. If you write a parser:
letterAndOther = do
  letters <- many (satisfy isLetter)
  others <- many get
  return (letters, others)
which grabs an initial string of letters followed by the "rest" of the string and run it on a simple test string, you'll usually get way more than you bargained for:
> readP_to_S letterAndOther "abc"
[(("",""),"abc"),(("","a"),"bc"),(("a",""),"bc"),(("","ab"),"c"),
(("a","b"),"c"),(("ab",""),"c"),(("","abc"),""),(("a","bc"),""),
(("ab","c"),""),(("abc",""),"")]
In other words, in a do-block, each monadic action will typically generate a tree of possible parses. In your current code, the very first line of the do-block:
beforeAbc <- gather parseEverything
introduces a whole tree of parse branches, one branch for each possible initial prefix. These branches only get pruned if a later line of the do-block introduces a parse that fails. But, every line of your do-block represents a parser that always succeeds. For example, this always succeeds:
parseAbc <- (Just <$> parseAbc) <++ return Nothing
because even if the first divider isn't found, the right-hand side parser return Nothing will always succeed.
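A quick GHCi illustration of that point (not from the original code; it just shows the left-biased <++ falling back to return Nothing without consuming input):
> readP_to_S ((Just <$> string "ABC") <++ return Nothing) "XYZ"
[(Nothing,"XYZ")]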
I would suggest the following approach. First, as we discovered in the comments, the first thing you want to do is figure out what your parser should return. Instead of trying to shoehorn the result into a weird tuple, it's a good idea to leverage Haskell's best feature, its algebraic data types. Define a return type for your parse:
data Result
  = TwoDividers String Divider1 String Divider2 String
  | FirstDivider String Divider1 String
  | SecondDivider String Divider2 String
  | NoDividers String
This is unambiguous and covers all possibilities. Admittedly, including Divider1 in the first two constructors is redundant, since there's only one possible Divider1, but programs are for humans to read, too, and keeping Divider1 explicit improves readability.
Now, let's define parsers for the first and second dividers:
divider1 = Abc <$ string "ABC"
divider2 = (Def <$ string "DEF") +++ (Xyz <$ string "XYZ")
Note that I've chosen to define a single divider2 instead of separate parsers for Def and Xyz. Since, in your grammar, it's always the case that "DEF" can appear anywhere "XYZ" can and vice versa, it makes sense to combine them into one parser.
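As a quick sanity check of the combined parser (a hedged GHCi session, assuming the Show instances from the full program below are in scope):
> readP_to_S divider2 "DEFrest"
[(Def,"rest")]
> readP_to_S divider2 "XYZrest"
[(Xyz,"rest")]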
We'll also want a parser for arbitrary strings (basically your parseEverything):
anything = many $ satisfy isLetter -- valid characters
Now, let's write a parser for the full string. A key insight here is that we have four alternatives (i.e., the four constructors for our Result type). It's true that they share some structure, but a first crack at a parser can just treat them as independent alternatives. We'll use the <++ operator to choose the best match:
result =
  (TwoDividers <$> anything <*> divider1 <*> anything <*> divider2 <*> anything)
    <++ (FirstDivider <$> anything <*> divider1 <*> anything)
    <++ (SecondDivider <$> anything <*> divider2 <*> anything)
    <++ (NoDividers <$> anything)
A quick test of this will show we've forgotten something:
> readP_to_S result "MMMMABCNNNXYZPPPPP"
[(TwoDividers "MMMM" Abc "NNN" Xyz "","PPPPP"),...]
By default, the parser combinators will try every possible prefix of the input string, leaving more for later parsers. So, we should wrap this up in a final parser function that checks for the end-of-string:
parseResult = readP_to_S (result <* eof)
and with the tests:
main = mapM_ (print . parseResult)
  [ "MMMMABCNNNXYZPPPPP"
  , "MMMMABCNNNDEFPPPPP"
  , "MMMMNNNXYZPPPPP"
  , "MMMMABCNNNPPPPP"
  ]
we get the expected unique parsed output:
[(TwoDividers "MMMM" Abc "NNN" Xyz "PPPPP","")]
[(TwoDividers "MMMM" Abc "NNN" Def "PPPPP","")]
[(SecondDivider "MMMMNNN" Xyz "PPPPP","")]
[(FirstDivider "MMMM" Abc "NNNPPPPP","")]
Note on Applicative Parsers. I've used applicative syntax here, rather than the monad syntax. The difference isn't purely syntactical -- you can always write an applicative expression in monadic form, but there are monadic operations that can't be expressed applicatively, so the monadic syntax is strictly more powerful. However, when an expression can be written both ways, often the applicative syntax is more succinct and easier to write and understand, at least once you get used to it.
In a nutshell, the expression p <*> x <*> y <*> z creates a new parser that applies the parsers p, x, y, and z in order, and then applies the result from parser p (which needs to be a function f) to the results from the rest of the parsers (which must be appropriate arguments for f). In many cases, the function f is a known function and doesn't need to be returned by a parser, so a common variant is to write f <$> x <*> y <*> z. This applies the parsers x, y, and z in order, and then applies f (given directly instead of returned by a parser) to the results from those parsers. For example, the expression:
FirstDivider <$> anything <*> divider1 <*> anything
runs three parsers in order to get anything, followed by a divider1, followed by anything, and then applies the function/constructor FirstDivider to the three arguments resulting from those parsers.
The operators <* and *> can be thought of as variants of <*>. The expression p <*> x first parses p, then parses x, then applies the result of the former to the latter. The expression p <* x first parses p, then parses x, but instead of applying the former to the latter, it returns the value the arrow is pointing to (i.e., whatever p produced) and throws away the other value. Similarly p *> x parses p then parses x, then returns whatever x produced. In particular:
someParser <* eof
first runs someParser, then parses (i.e., checks for) EOF, then returns whatever someParser produced.
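For a small concrete check with ReadP (a hedged GHCi illustration, not part of the original answer):
> readP_to_S (string "ab" <* eof) "ab"
[("ab","")]
> readP_to_S (string "ab" <* eof) "abc"
[]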
This syntax can really shine when parsing more traditional languages into an abstract syntax tree. If you want to parse statements like:
let x = 1 + 5
into a Statement type like:
data Statement = ... | Let Var Expr | ...
you can write a Parsec parser that looks like:
statement = ...
  <|> Let <$ string "let" <*> var <* symbol "=" <*> expr
  ...
The monadic equivalent in do-notation looks like this:
do string "let"
v <- var
symbol "="
e <- expr
return $ Let v e
which is fine, I suppose, but kind of obscures the simple structure of the parse. The applicative version is basically just the list of tokens to parse, with a little bit of syntactic sugar sprinkled in.
Anyway, here's the full program:
import Data.Char
import Text.ParserCombinators.ReadP
data Divider1 = Abc deriving (Show)
data Divider2 = Xyz | Def deriving (Show)
data Result
  = TwoDividers String Divider1 String Divider2 String
  | FirstDivider String Divider1 String
  | SecondDivider String Divider2 String
  | NoDividers String
  deriving (Show)
anything :: ReadP String
anything = many $ satisfy isLetter -- valid characters
divider1 :: ReadP Divider1
divider1 = Abc <$ string "ABC"
divider2 :: ReadP Divider2
divider2 = (Def <$ string "DEF") +++ (Xyz <$ string "XYZ")
result :: ReadP Result
result =
  (TwoDividers <$> anything <*> divider1 <*> anything <*> divider2 <*> anything)
    <++ (FirstDivider <$> anything <*> divider1 <*> anything)
    <++ (SecondDivider <$> anything <*> divider2 <*> anything)
    <++ (NoDividers <$> anything)
parseResult :: String -> [(Result, String)]
parseResult = readP_to_S (result <* eof)
main :: IO ()
main = mapM_ (print . parseResult)
[ "MMMMABCNNNXYZPPPPP"
, "MMMMABCNNNDEFPPPPP"
, "MMMMNNNXYZPPPPP"
, "MMMMABCNNNPPPPP"
]

How can I force Parsec to return an error?

I'm making a parser with Parsec and I try to return a specific error during the parsing.
This is a minimal parser example to expose my problem :
parseA = try seq1
     <|> seq2
seq1 = do
  manyTill anyChar (try $ string "\n* ")
  many1 anyChar
  fail "My error message"
seq2 = do
  manyTill anyChar (try $ string "\n- ")
  many1 anyChar
I would like to perform some tests in the first try $ do sequence and stop the parsing and return a specific error message.
When I don't use fail I get :
ghci> parse parseA "" "aaaaaa\nbbbb\n* ccccc\n- ddd"
Right "ccccc\n- ddd"
When I use fail or unexpected, my parser doesn't stop (due to the try function) and execute the next do sequence:
ghci> parse parseA "" "aaaaaa\nbbbb\n* ccccc\n- ddd"
Right "ddd"
And it's not what I want!
I considered using the basic error function to stop the execution of my parser but I would like to have a "clean" error returned by the parsing function like this:
ghci> parse parseA "" "aaaaaa\nbbbb\n* ccccc\n- ddd"
Left "My error message"
Do you know how to properly stop a parser and return a custom error?
If you want the monad to behave differently then perhaps you should build a different monad. (N.B. I'm not entirely clear what you want, but moving forward anyway).
Solution: Use a Monad Transformer Stack
For example, to get a fail-like function that isn't caught and ignored by Parsec's try you could use an Except monad. Except allows you to throw errors much like exceptions but they are plumbed monadically instead of using the actual exception mechanism which demands IO to catch it.
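As a tiny standalone illustration of how Except short-circuits (a hedged example, not part of the original answer):
ghci> runExcept (throwE "boom" >> return (1 :: Int))
Left "boom"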
First, let's define our monad:
import Text.Parsec
import Text.Parsec.Combinator
import Text.Parsec.Char
import Control.Monad.Trans.Except
import Control.Monad.Trans
type EscParse a = ParsecT String () (Except String) a
So the monad is EscParse and combines features of Parsec (via the transformer ParsecT) and Except.
Second, let's define some helpers:
run :: EscParse a -> SourceName -> String -> Either String (Either ParseError a)
run op sn input = runExcept (runPT op () sn input)
escFail :: String -> EscParse a
escFail = lift . throwE
Our run is like runParser but also runs the Except monad. You might want to do something to avoid the nested Either, but that's an easy cosmetic change. escFail is what you'd use if you don't want the error to be ignored.
Third, we need to implement your parser using this new monad:
parseA :: EscParse String
parseA = try seq1 <|> seq2
seq1 :: EscParse String
seq1 = do
  manyTill anyChar (try $ string "\n* ")
  many1 anyChar
  escFail "My error message"
seq2 :: EscParse String
seq2 = do
  manyTill anyChar (try $ string "\n- ")
  many1 anyChar
Other than spacing and type signature, the above matches what you had but using escFail instead of fail.
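Running the new parser on the original input through the run helper defined above should now surface the custom error instead of falling through to seq2; something like:
ghci> run parseA "" "aaaaaa\nbbbb\n* ccccc\n- ddd"
Left "My error message"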

attoparsec: "nested" parsers -- parse a subset of the input with a different parser

Well in fact I'm pretty sure I'm using the wrong terminology. Here is the problem I want to solve: a parser for the markdown format, well a subset of it.
My problem is with the blockquote feature. Each line in a blockquote starts with >; otherwise everything is the normal structure in a markdown file.
You can't look at individual lines separately, because you need to separate paragraphs from normal lines, eg
> a
> b
is not the same as
> a
>
> b
and things like that (the same goes if a list is blockquoted: you don't want x lists but one list with x elements). A natural and trivial way is to "take off" the > signs, parse the blockquote on its own, ignoring anything around it, wrap it with a BlockQuote type constructor, put that in the outer AST and resume parsing of the original input. It's what pandoc does if I'm not wrong:
https://hackage.haskell.org/package/pandoc-1.14.0.4/docs/src/Text-Pandoc-Readers-Markdown.html#blockQuote
blockQuote :: MarkdownParser (F Blocks)
blockQuote = do
  raw <- emailBlockQuote
  -- parse the extracted block, which may contain various block elements:
  contents <- parseFromString parseBlocks $ (intercalate "\n" raw) ++ "\n\n"
  return $ B.blockQuote <$> contents
And then:
http://hackage.haskell.org/package/pandoc-1.5.1/docs/src/Text-Pandoc-Shared.html#parseFromString
-- | Parse contents of 'str' using 'parser' and return result.
parseFromString :: GenParser tok st a -> [tok] -> GenParser tok st a
parseFromString parser str = do
  oldPos <- getPosition
  oldInput <- getInput
  setInput str
  result <- parser
  setInput oldInput
  setPosition oldPos
  return result
Now parseFromString looks quite hacky to me and besides that it's also Parsec not attoparsec so I can't use it in my project. I'm not sure how I could take that Text from the blockquote and parse it and return the parsing result so that it "fits" within the current parsing. Seems impossible?
I've been googling on the issue and I think that pipes-parse and conduit can help on that area although I struggle to find examples and what I see appears considerably less nice to look at than "pure" parsec/attoparsec parsers.
Other options would be to rewrite the usual parsers to also handle the leading >, which means complicating and duplicating a lot; to parse blockquotes counting each line separately and write some messy "merge" function; or to parse to a first AST that keeps the blockquote contents as Text inside a BlockquoteText type constructor, waiting for a later pass where they would be parsed separately. The last option is not very elegant, but it has the benefit of simplicity, which does count for something.
I would probably go for the latter, but surely there's a better way?
I have asked myself the same question. Why is there no standard combinator for nested parsers like how you describe? My default mode is to trust the package author, especially when that author also co-wrote "Real World Haskell". If such an obvious capability is missing, perhaps it is by design and I should look for a better way. However, I've managed to convince myself that such a convenient combinator is mostly harmless. Useful whenever an all-or-nothing type parser is appropriate for the inner parse.
Implementation
import Data.Attoparsec.Text
import qualified Data.Text as T
import Data.Text(Text)
import Control.Applicative
I've divided the required functionality into two parsers. The first, constP, performs an "in place" parse of some given text. It substitutes the constant parser's fail with empty (from Alternative), but otherwise has no other side effects.
constP :: Parser a -> Text -> Parser a
constP p t = case parseOnly p t of
  Left _  -> empty
  Right a -> return a
The second part comes from parseOf, which performs the constant, inner parse based on the result of the outer parse. The empty alternative here allows a failed parse to return without consuming any input.
parseOf :: Parser Text -> Parser a -> Parser a
parseOf ptxt pa = bothParse <|> empty
  where
    bothParse = ptxt >>= constP pa
The block quote markdown can be written in the desired fashion. This implementation requires the resulting block to be totally parsed.
blockQuoteMarkdown :: Parser [[Double]]
blockQuoteMarkdown = parseOf blockQuote (markdownSurrogate <* endOfInput)
Instead of the actual markdown parser, I just implemented a quick parser of space separated doubles. The complication of the parser comes from allowing the last, non-empty line, either end in a new line or not.
markdownSurrogate :: Parser [[Double]]
markdownSurrogate = do
    lns <- many (mdLine <* endOfLine)
    option lns ((lns ++) . pure <$> mdLine1)
  where
    mdLine  = sepBy double (satisfy (== ' '))
    mdLine1 = sepBy1 double (satisfy (== ' '))
These two parsers are responsible for returning the text internal to block quotes.
blockQuote :: Parser Text
blockQuote = T.unlines <$> many blockLine
blockLine :: Parser Text
blockLine = char '>' *> takeTill isEndOfLine <* endOfLine
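To see exactly what the outer parse hands to the inner one, a quick GHCi check (hedged; T.pack is only used here to build the Text input):
*Main> parseOnly blockQuote (T.pack ">7 3 1\n>4 4 4\n")
Right "7 3 1\n4 4 4\n"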
Finally, a test of the parser.
parseMain :: IO ()
parseMain = do
putStrLn ""
doParse "a" markdownSurrogate a
doParse "_" markdownSurrogate ""
doParse "b" markdownSurrogate b
doParse "ab" markdownSurrogate ab
doParse "a_b" markdownSurrogate a_b
doParse "badMarkdown x" markdownSurrogate x
doParse "badMarkdown axb" markdownSurrogate axb
putStrLn ""
doParse "BlockQuote ab" blockQuoteMarkdown $ toBlockQuote ab
doParse "BlockQuote a_b" blockQuoteMarkdown $ toBlockQuote a_b
doParse "BlockQuote axb" blockQuoteMarkdown $ toBlockQuote axb
where
a = "7 3 1"
b = "4 4 4"
x = "a b c"
ab = T.unlines [a,b]
a_b = T.unlines [a,"",b]
axb = T.unlines [a,x,b]
doParse desc p str = do
print $ T.concat ["Parsing ",desc,": \"",str,"\""]
let i = parse (p <* endOfInput ) str
print $ feed i ""
toBlockQuote = T.unlines
. map (T.cons '>')
. T.lines
*Main> parseMain
"Parsing a: \"7 3 1\""
Done "" [[7.0,3.0,1.0]]
"Parsing _: \"\""
Done "" []
"Parsing b: \"4 4 4\""
Done "" [[4.0,4.0,4.0]]
"Parsing ab: \"7 3 1\n4 4 4\n\""
Done "" [[7.0,3.0,1.0],[4.0,4.0,4.0]]
"Parsing a_b: \"7 3 1\n\n4 4 4\n\""
Done "" [[7.0,3.0,1.0],[],[4.0,4.0,4.0]]
"Parsing badMarkdown x: \"a b c\""
Fail "a b c" [] "endOfInput"
"Parsing badMarkdown axb: \"7 3 1\na b c\n4 4 4\n\""
Fail "a b c\n4 4 4\n" [] "endOfInput"
"Parsing BlockQuote ab: \">7 3 1\n>4 4 4\n\""
Done "" [[7.0,3.0,1.0],[4.0,4.0,4.0]]
"Parsing BlockQuote a_b: \">7 3 1\n>\n>4 4 4\n\""
Done "" [[7.0,3.0,1.0],[],[4.0,4.0,4.0]]
"Parsing BlockQuote axb: \">7 3 1\n>a b c\n>4 4 4\n\""
Fail ">7 3 1\n>a b c\n>4 4 4\n" [] "Failed reading: empty"
Discussion
The notable difference comes in the semantics of failure. For instance, when parsing axb and blockquoted axb, which are the following two strings, respectively
7 3 1
a b c
4 4 4
and
> 7 3 1
> a b c
> 4 4 4
the markdown parse results in
Fail "a b c\n4 4 4\n" [] "endOfInput"
whereas the quoted results in
Fail ">7 3 1\n>a b c\n>4 4 4\n" [] "Failed reading: empty"
The markdown consumes "7 3 1\n", but this is nowhere reported in the quoted failure. Instead, fail becomes all or nothing.
Likewise, there is no allowance for handling unparsed text in the case of partial success. But I don't see a need for this, given the use case. For example, if a parse looked something like the following
"{ <tok> unhandled }more to parse"
where {} denotes the recognized block quote context, and <tok> is parsed within that inner context. A partial success then would have to lift "unhandled" out of that block quote context and somehow combine it with "more to parse".
I see no general way to do this, but it is allowed through the choice of the inner parser's return type, for instance with some parser parseOf blockP innP :: Parser (<tok>, Maybe Text). However, if this need arises I would expect that there is a better way to handle the situation than with nested parsers.
There may also be concerns about the loss of attoparsec's Partial parsing. That is, the implementation of constP uses parseOnly, which collapses the Fail and Partial results into a single Left failure state. In other words, we lose the ability to feed our inner parser with more text as it becomes available. However, note that the text to parse is itself the result of an outer parse; it will only be available after enough text has been fed to the outer parse. So this shouldn't be an issue either.

Haskell Parsec accounting for multiple expression occurrences in grammar

I have been trying to create a parser using details from the following tutorial; much of the code is copied directly from it with only a few names changed.
import qualified Text.ParserCombinators.Parsec.Token as P
reserved = P.reserved lexer
integer = P.integer lexer
whiteSpace = P.whiteSpace lexer
identifier = P.identifier lexer
data Express = Seq [Express]
             | ID String
             | Num Integer
             | BoolConst Bool
             deriving (Show)
whileParser :: Parser Express
whileParser = whiteSpace >> expr7
expr7 = seqOfStmt
    <|> expr8
seqOfStmt =
  do list <- (sepBy1 expr8 whiteSpace)
  return $ if length list == 1 then head list else Seq list
expr8 :: Parser Express
expr8 = name
    <|> number
    <|> bTerm
name :: Parser Express
name = fmap ID identifier
number :: Parser Express
number = fmap Num integer
bTerm :: Parser Express
bTerm = (reserved "True" >> return (BoolConst True))
    <|> (reserved "False" >> return (BoolConst False))
I understand that this code might be laughable but I would really like to learn a bit more about where I'm going wrong. I also think that this should provide enough info but if not let me know.
Error:
parse error on input `return'
I believe that the error has something to do with different return types, which is strange because I have tried to use the tutorial at the start of the post as a basis for all that I am attempting.
Thanks in advance,
Seán
If you are not comfortable with the layout rules, you may also use different syntax:
seqOfStmt =
do { list
<- (sepBy1 expr8 whiteSpace);
return $ if length
list == 1
then head list else Seq list;}
The layout without braces and semicolons is regarded superior, though, for 2 reasons:
You don't need to type ugly ; and braces
It forces you to write (mostly) readable code, unlike the distorted crap I gave as example above.
And the rules are really easy:
Don't use tabs, use spaces. Always. (Your editor can do that, if not, throw it away, it's crapware.)
Things that belong together must be aligned in the same column.
For example, you have 2 statements that belong to the do block, hence they must be aligned in the same column. But you have aligned the return with the do, hence the compiler sees this as:
do { list <- sepBy1 expr8 whiteSpace; };
return $ ....;
but what you want is this:
do {
  list <- sepBy1 ....;
  return $ .....;
}
(Note that you can just leave out the braces and the semicolons and it will be fine as long as you leave the indentation intact.)
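So a properly laid out version of the original definition (a sketch reusing the question's names, with both statements aligned under the do) would be:
seqOfStmt :: Parser Express
seqOfStmt = do
  list <- sepBy1 expr8 whiteSpace
  return $ if length list == 1 then head list else Seq list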
