Add IO functionality to imperative language parser - haskell

I'm trying to add IO functionality, such as read and write statements, to a parser for an imperative language like the one shown here: https://wiki.haskell.org/Parsing_a_simple_imperative_language
I want to add statements such as write "example", which will write "example" to stdout with something like putStrLn or print.
So far I've made the following changes:
-- Add a Write constructor so that a statement such as `write "test"` can be represented as a statement
data Stmt = Seq [Stmt]
          | ... (same as before)
          | Write String
            deriving (Show)
-- write needs to be added to reservedNames as it has a function in the language
languageDef =
  emptyDef { Token...
           , ...
           , Token.reservedNames = [ "if"
                                   , ...
                                   , "write"
                                   ...
-- whenever a statement is parsed, writeStmt function now needs to be called to parse a write statement
statement' :: Parser Stmt
statement' =   ifStmt
           <|> whileStmt
           <|> skipStmt
           <|> assignStmt
           <|> writeStmt
-- do reserved write, get the identifier, print it to stdout and return the Write statement
writeStmt :: Parser Stmt
writeStmt =
do reserved "write"
var <- identifier
print var
return $ Write var
I'm getting errors on var <- identifier in writeStmt and I'm not sure what else needs to be added or changed to get this to work. Thanks.

As was noted in the comments, you need to fix the indentation so all of the statements in your do-block line up:
writeStmt :: Parser Stmt
writeStmt =
  do reserved "write"
     var <- identifier
     print var
     return $ Write var
and then, you need to delete that print var statement. During parsing, you just want to identify write statements and store them in your abstract syntax tree (AST) (i.e., in your Stmt tree). The actual printing will take place in a separate function that executes the program represented by the AST.
Unfortunately, the wiki page you're working from only shows you how to perform the parsing, not how to actually execute the parsed representation.
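The separation between parsing and executing can be sketched roughly as follows. This is a minimal, hypothetical interpreter, not something the wiki page provides: the Stmt type is cut down to three constructors, and the function name exec is an assumption chosen for illustration.

```haskell
-- Cut-down, hypothetical version of the wiki's Stmt type:
-- only the constructors needed to show how Write is executed.
data Stmt = Seq [Stmt]
          | Skip
          | Write String
          deriving (Show)

-- exec walks the AST and performs the effects. The printing happens
-- here, in IO, and not inside the parser.
exec :: Stmt -> IO ()
exec (Seq stmts) = mapM_ exec stmts   -- run each statement in order
exec Skip        = return ()          -- nothing to do
exec (Write s)   = putStrLn s         -- the actual output
```

With this in place, a program such as Seq [Write "a", Skip, Write "b"] would be run by calling exec on the value the parser returns.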

Related

Make Parsec function fail instead of expecting more input

I'm using Parsec to parse some expressions (see this question for more context), and the most relevant part of my code is:
statement :: Parser Stmt
statement = assignStmt <|> simpleStmt
assignStmt :: Parser Stmt
assignStmt =
  do var <- identifier
     reservedOp "="
     expr <- expression
     return $ Assign var expr
simpleStmt :: Parser Stmt
simpleStmt =
  do expr <- expression
     return $ Simple expr
In action:
boobla> foo = 100 + ~100
167
boobla> foo
parser error: (line 1, column 4):
unexpected end of input
expecting letter or digit or "="
The second expression should have evaluated to 167, the value of foo.
I thought that when Parsec tried to match the token reservedOp "=", it would fail because there is no such token in the string, and would then try the second function, simpleStmt, and succeed with it. But it works differently: it expects more input and just throws this exception.
What should I use to make assignStmt fail if there are no more characters in the string (or in the current line)? foo = 10 should be parsed with assignStmt and foo should be parsed with simpleStmt.
You're missing the try function.
By default, the <|> operator will try the left parser, and if it fails without consuming any characters, it will try the right parser.
However, if the left parser fails after having consumed some characters, as in your case, the right parser is never tried. Note that this is often the behavior you want; if you had something like
parseForLoop <|> parseWhileLoop
then if the input is something like "for break", then that's not a valid for-loop, and there's no point trying to parse it as a while-loop, since that will surely fail also.
The try combinator changes this behaviour. Specifically, it makes a failed parser appear to have consumed no input. (This has a space penalty; the input could have been thrown away, but try makes it hang around.)
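Applied to the question, the fix might look like the sketch below. It is self-contained, so the Expr type, the lexeme helper, the simplified identifier and expression parsers, and parseStmt are all assumptions standing in for the asker's real definitions; only the use of try in statement is the point.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Simplified, hypothetical AST: the asker's real types are not shown.
data Expr = Var String | Lit Integer deriving (Show, Eq)
data Stmt = Assign String Expr | Simple Expr deriving (Show, Eq)

-- Run a parser, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

identifier :: Parser String
identifier = lexeme (many1 letter)

expression :: Parser Expr
expression = (Var <$> identifier) <|> (Lit . read <$> lexeme (many1 digit))

assignStmt :: Parser Stmt
assignStmt = do
  var  <- identifier
  _    <- lexeme (char '=')
  expr <- expression
  return (Assign var expr)

simpleStmt :: Parser Stmt
simpleStmt = Simple <$> expression

-- try rewinds the input if assignStmt fails after reading the
-- identifier, so simpleStmt gets to see the identifier again.
statement :: Parser Stmt
statement = try assignStmt <|> simpleStmt

parseStmt :: String -> Either ParseError Stmt
parseStmt = parse (statement <* eof) ""
```

Here parseStmt "foo = 10" goes through assignStmt, while parseStmt "foo" falls back to simpleStmt. Removing try from statement should bring back the "unexpected end of input" error from the question.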

Understanding I/O monad and the use of "do" notation

I am still struggling with Haskell and now I have encountered a problem with wrapping my mind around the Input/Output monad from this example:
main = do
    line <- getLine
    if null line
        then return ()
        else do
            putStrLn $ reverseWords line
            main
reverseWords :: String -> String
reverseWords = unwords . map reverse . words
I understand that because a functional language like Haskell cannot be based on side effects of functions, some solution had to be invented. In this case it seems that everything has to be wrapped in a do block. I get simple examples, but in this case I really need someone's explanation:
Why isn't it enough to use one, single do block for I/O actions?
Why do you have to open a completely new one in the if/else case?
Also, when does the -- I don't know what to call it -- "scope" of the do monad end, i.e. when can you just use standard Haskell terms/functions?
The do block concerns anything on the same indentation level as the first statement. So in your example it's really just linking two things together:
line <- getLine
and all the rest, which happens to be rather bigger:
if null line
    then return ()
    else do
        putStrLn $ reverseWords line
        main
but no matter how complicated, the do syntax doesn't look into these expressions. So all this is exactly the same as
main :: IO ()
main = do
    line <- getLine
    recurseMain line
with the helper function
recurseMain :: String -> IO ()
recurseMain line
    | null line = return ()
    | otherwise = do
        putStrLn $ reverseWords line
        main
Now, obviously the stuff in recurseMain can't know that the function is called within a do block from main, so you need to use another do.
do doesn't actually do anything, it's just syntactic sugar for easily combining statements. A dubious analogy is to compare do to []:
If you have multiple expressions you can combine them into a list using the (:) operator:
(1 + 2) : (3 * 4) : (5 - 6) : ...
However, this is annoying, so we can instead use [] notation, which compiles to the same thing:
[1+2, 3*4, 5-6, ...]
Similarly, if you have multiple IO statements, you can combine them using >> and >>=:
(putStrLn "What's your name?") >> getLine >>= (\name -> putStrLn $ "Hi " ++ name)
However, this is annoying, so we can instead use do notation, which compiles to the same thing:
do
    putStrLn "What's your name?"
    name <- getLine
    putStrLn $ "Hi " ++ name
Now the answer to why you need multiple do blocks is simple:
If you have multiple lists of values, you need multiple []s (even if they're nested).
If you have multiple sequences of monadic statements, you need multiple dos (even if they're nested).
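To make this concrete for the question's own program, here is a sketch of the same loop with both do blocks written out by hand using >>= and >>. The name loop is an assumption (it plays the role of main in the question); reverseWords is copied from the question.

```haskell
reverseWords :: String -> String
reverseWords = unwords . map reverse . words

-- The outer do block becomes one >>= feeding the line to a lambda.
-- The inner do block (the else branch) becomes a single >>.
loop :: IO ()
loop =
  getLine >>= \line ->
    if null line
      then return ()
      else putStrLn (reverseWords line) >> loop
```

Each do in the original corresponds to one chain of >>= or >> here, which is why the else branch needed its own do: it is a separate chain nested inside the if expression.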

Haskell Parsec accounting for multiple expression occurrences in grammar

I have been trying to create a parser using details from the following tutorial. Much of the code is copied directly from the tutorial, with only a few names changed.
import qualified Text.ParserCombinators.Parsec.Token as P
reserved = P.reserved lexer
integer = P.integer lexer
whiteSpace = P.whiteSpace lexer
identifier = P.identifier lexer
data Express = Seq [Express]
             | ID String
             | Num Integer
             | BoolConst Bool
             deriving (Show)
whileParser :: Parser Express
whileParser = whiteSpace >> expr7
expr7 = seqOfStmt
    <|> expr8
seqOfStmt =
    do list <- (sepBy1 expr8 whiteSpace)
    return $ if length list == 1 then head list else Seq list
expr8 :: Parser Express
expr8 = name
    <|> number
    <|> bTerm
name :: Parser Express
name = fmap ID identifier
number :: Parser Express
number = fmap Num integer
bTerm :: Parser Express
bTerm = (reserved "True" >> return (BoolConst True ))
    <|> (reserved "False" >> return (BoolConst False))
I understand that this code might be laughable but I would really like to learn a bit more about where I'm going wrong. I also think that this should provide enough info but if not let me know.
Error:
parse error on input `return'
I believe that the error has something to do with different return types, which is strange because I have tried to use the tutorial at the start of the post as a basis for all that I am attempting.
Thanks in advance,
Seán
If you are not comfortable with the layout rules, you may also use different syntax:
seqOfStmt =
do { list
<- (sepBy1 expr8 whiteSpace);
return $ if length
list == 1
then head list else Seq list;}
The layout without braces and semicolons is regarded superior, though, for 2 reasons:
You don't need to type ugly ; and braces
It forces you to write (mostly) readable code, unlike the distorted crap I gave as example above.
And the rules are really easy:
Don't use tabs, use spaces. Always. (Your editor can do that, if not, throw it away, it's crapware.)
Things that belong together must be aligned in the same column.
For example, you have 2 statements that belong to the do block, hence they must be aligned in the same column. But you have aligned the return with the do, hence the compiler sees this as:
do { list <- sepBy1 expr8 whiteSpace; };
return $ ....;
but what you want is this:
do {
    list <- sepBy1 ....;
    return $ .....;
}
(Note that you can just leave out the braces and the semicolons and it will be OK as long as you leave the indentation intact.)
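For reference, a version of seqOfStmt with the layout fixed might look like the sketch below. The question does not show its lexer, so the lexer definition here (emptyDef with "True" and "False" reserved) and the imports are assumptions; the only intended change to the question's logic is that return now lines up with list.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import Text.Parsec.Language (emptyDef)
import qualified Text.Parsec.Token as P

-- Assumed lexer: the question's own definition is not shown.
lexer :: P.TokenParser ()
lexer = P.makeTokenParser emptyDef { P.reservedNames = ["True", "False"] }

data Express = Seq [Express]
             | ID String
             | Num Integer
             | BoolConst Bool
             deriving (Show, Eq)

expr8 :: Parser Express
expr8 = fmap ID (P.identifier lexer)
    <|> fmap Num (P.integer lexer)
    <|> (P.reserved lexer "True"  >> return (BoolConst True))
    <|> (P.reserved lexer "False" >> return (BoolConst False))

-- Both statements of the do block start in the same column,
-- so the layout rule puts them in the same block.
seqOfStmt :: Parser Express
seqOfStmt =
    do list <- sepBy1 expr8 (P.whiteSpace lexer)
       return $ if length list == 1 then head list else Seq list
```

Something like parse seqOfStmt "" "foo 42 True" can be used to try it out.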

Parsec error in HaskellWiki tutorial

I was following the code in http://www.haskell.org/haskellwiki/Hitchhikers_guide_to_Haskell, and the code (in chapter 2) gives an error. There is no author name/email mentioned with the tutorial, so I am coming here for advice. The code is below, and the error occurs on the "eof" word.
module Main where
import Text.ParserCombinators.Parsec
parseInput =
  do dirs <- many dirAndSize
     eof
     return dirs
data Dir = Dir Int String deriving Show
dirAndSize =
  do size <- many1 digit
     spaces
     dir_name <- anyChar `manyTill` newline
     return (Dir (read size) dir_name)
main = do
  input <- getContents
  putStrLn ("Debug: got inputs: " ++ input)
That tutorial was written a long time ago, when parsec was simple. Nowadays, since parsec-3, the library can wrap monads, so you now have to specify (or otherwise disambiguate) the type to use at some points. This is one of them: giving eof an expression type signature, e.g. eof :: Parser (), makes it compile.
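A sketch of the tutorial's parser with the type pinned down follows. The inline annotation on eof is the one named above. The top-level signatures are an addition here, not part of the tutorial; they are another common way to fix the type and make the inline annotation redundant, but both are shown for comparison. Deriving Eq on Dir is also added only so results can be compared.

```haskell
import Text.ParserCombinators.Parsec

data Dir = Dir Int String deriving (Show, Eq)

-- Top-level signature (added here): fixes the parser type for the whole
-- definition.
parseInput :: Parser [Dir]
parseInput =
  do dirs <- many dirAndSize
     (eof :: Parser ())   -- the expression type signature from the answer
     return dirs

dirAndSize :: Parser Dir
dirAndSize =
  do size <- many1 digit
     spaces
     dirName <- anyChar `manyTill` newline
     return (Dir (read size) dirName)
```

It can be tried with parse parseInput "" applied to input in which each line holds a size, a space and a directory name, ending in a newline.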

Inferred type does not match expected type (IO a) with splitOn

I am trying to use the splitOn function in Haskell to split a string based on some delimiters. The splitOn function returns data of type [[Char]], but the compiler reports an error saying that the expected type is IO a.
The line of code which is failing is:
splitOn "x" "1 2, 3"
Please suggest.
If you've written something like xs <- splitOn "x" line and simply want to bind a name to splitOn "x" line, then you can use let xs = splitOn "x" line. <- is for binding values from IO actions, rather than assigning names to pure values.
Otherwise, you've presumably done something like this:
main = do
    line <- getLine
    splitOn "x" line
You're using a [String], but you need an IO action. You need to do something with the pure result you've got to turn it into an IO computation.
You could print the list out, as GHCi would:
print :: (Show a) => a -> IO ()
Or you could use return, if you simply want to return the pure computation to be used in other IO actions:
return :: a -> IO a
It all depends on how you want to process the data.
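The options above can be sketched as follows. To keep the example free of extra packages, splitOn is defined locally on top of Data.Text as a stand-in with the same String-based type as the one in Data.List.Split; if the split package is installed, that definition can presumably be replaced by import Data.List.Split (splitOn). The names showParts and getParts are hypothetical.

```haskell
import qualified Data.Text as T

-- Stand-in (assumption) for Data.List.Split.splitOn at type String,
-- built on Data.Text so that only GHC's bundled libraries are needed.
splitOn :: String -> String -> [String]
splitOn delim = map T.unpack . T.splitOn (T.pack delim) . T.pack

-- A pure result is named with let, not <-; print then turns it
-- into an IO action, so the do block ends in something of type IO ().
showParts :: String -> IO ()
showParts line = do
  let xs = splitOn "x" line
  print xs

-- return wraps the pure result so that a caller can bind it with <-.
getParts :: String -> IO [String]
getParts line = return (splitOn "x" line)
```

A caller inside another do block could then write parts <- getParts line and continue working with parts as an ordinary list.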
