Sequencing basic parsers in Haskell and Frege using do notation - haskell

I am trying to run snippets from chapter 8 (functional parsers) of Graham Hutton's 'Programming in Haskell' in both ghci and frege-repl.
I am not able to sequence parsers using do syntax.
I have the following definitions in Frege (the Haskell version differs only in having a simpler item definition that doesn't pack and unpack String and Char, and is the same as in the book):
module Parser where
type Parser a = String -> [(a, String)]
return :: a -> Parser a
return v = \inp -> [(v, inp)]
-- this is Frege version
item :: Parser Char
item = \inp ->
    let inp' = unpacked inp
    in
        case inp' of
            [] -> []
            (x:xs) -> [(x,packed xs)]
parse :: Parser a -> String -> [(a, String)]
parse p inp = p inp
-- sequencing
(>>=) :: Parser a -> (a -> Parser b) -> Parser b
p >>= f = \inp -> case (parse p inp) of
    [] -> []
    [(v,out)] -> parse (f v) out
p :: Parser (Char, Char)
p = do x <- Parser.item
       Parser.item
       y <- Parser.item
       Parser.return (x,y)
-- this works
p' :: Parser (Char, Char)
p' = item Parser.>>= \x ->
     item Parser.>>= \_ ->
     item Parser.>>= \y ->
     Parser.return (x,y)
p' works in both ghci and frege-repl. However, when I try to load the module I get the following messages. First, from ghci:
src/Parser.hs:38:8:
Couldn't match type ‘[(Char, String)]’ with ‘Char’
Expected type: String -> [((Char, Char), String)]
Actual type: Parser ([(Char, String)], [(Char, String)])
In a stmt of a 'do' block: Parser.return (x, y)
In the expression:
do { x <- item;
item;
y <- item;
Parser.return (x, y) }
Failed, modules loaded: none.
frege-repl is even less friendly: it simply kicks me out of the repl with an error stack trace:
Exception in thread "main" frege.runtime.Undefined: returnTypeN: too many arguments
at frege.prelude.PreludeBase.error(PreludeBase.java:18011)
at frege.compiler.Utilities.returnTypeN(Utilities.java:1937)
at frege.compiler.Utilities.returnTypeN(Utilities.java:1928)
at frege.compiler.GenJava7$80.eval(GenJava7.java:11387)
at frege.compiler.GenJava7$80.eval(GenJava7.java:11327)
at frege.runtime.Fun1$1.eval(Fun1.java:63)
at frege.runtime.Delayed.call(Delayed.java:198)
at frege.runtime.Delayed.forced(Delayed.java:267)
at frege.compiler.GenJava7$78.eval(GenJava7.java:11275)
at frege.compiler.GenJava7$78.eval(GenJava7.java:11272)
at frege.runtime.Fun1$1.eval(Fun1.java:63)
at frege.runtime.Delayed.call(Delayed.java:200)
at frege.runtime.Delayed.forced(Delayed.java:267)
at frege.control.monad.State$IMonad_State$4.eval(State.java:1900)
at frege.control.monad.State$IMonad_State$4.eval(State.java:1897)
at frege.runtime.Fun1$1.eval(Fun1.java:63)
at frege.runtime.Delayed.call(Delayed.java:198)
at frege.runtime.Delayed.forced(Delayed.java:267)
at frege.control.monad.State$IMonad_State$4.eval
...
My intuition is that I need something apart from >>= and return, or that there is something I should tell the compilers. Or maybe I need to put the definition of p into the State monad?

This is because the monad being used in your do notation is the function monad (->) String, since one of the instances of Monad in the Prelude is the function arrow.
Therefore, for example, the x in x <- Parser.item has type [(Char, String)] rather than Char.
You can get around this by making Parser a newtype and defining your own custom Monad instance for it.
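To see concretely what the type error is saying, here is a minimal sketch in plain Haskell. The name both is made up for illustration, and item is the book's definition rather than the Frege one. It type-checks precisely because the do block is interpreted in the function monad:

```haskell
-- A plain type synonym: Parser a is just a function type.
type Parser a = String -> [(a, String)]

item :: Parser Char
item []     = []
item (x:xs) = [(x, xs)]

-- This do block is typed in the Prelude's function monad ((->) String),
-- so x and y are each the full result of item on the same input.
both :: String -> ([(Char, String)], [(Char, String)])
both = do
  x <- item
  y <- item
  return (x, y)
```

In GHCi, both "ab" evaluates to ([('a',"b")],[('a',"b")]). Neither bind consumed any input, which matches the "Actual type" reported in the ghci message.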

The following works with Frege (and should work the same way with GHC language extension RebindableSyntax):
module P
    where
type Parser a = String -> [(a, String)]
return :: a -> Parser a
return v = \inp -> [(v, inp)]
-- this is Frege version
item :: Parser Char
item = maybeToList . uncons
parse :: Parser a -> String -> [(a, String)]
parse p inp = p inp
-- sequencing
(>>=) :: Parser a -> (a -> Parser b) -> Parser b
p >>= f = \inp -> case (parse p inp) of
    [] -> []
    [(v,out)] -> parse (f v) out
p :: Parser (Char, Char)
p = do
    x <- item
    item
    y <- item
    return (x,y)
main = println (p "Frege is cool")
It prints:
[(('F', 'r'), "ege is cool")]
The main difference to your version is a more efficient item function, but, as I said before, this is not the reason for the stack trace. And there was this small indentation problem with the do in your code.
So yes, you can use the do notation here, though some would call it "abuse".
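For GHC, a rough equivalent might look like the sketch below. It assumes the RebindableSyntax extension and hides the Prelude's monad operators; the name twoOfThree is made up for illustration. Under RebindableSyntax a bare statement such as item is desugared with whichever (>>) is in scope, so the sketch also defines (>>) in terms of the custom (>>=). If only (>>=) and return were replaced, the Prelude's (>>) for functions would be picked up and the middle item would not consume a character. That may also be why the output quoted above pairs 'F' with 'r' rather than with 'e'.

```haskell
{-# LANGUAGE RebindableSyntax #-}

import Prelude hiding ((>>=), (>>), return)

type Parser a = String -> [(a, String)]

return :: a -> Parser a
return v = \inp -> [(v, inp)]

item :: Parser Char
item inp = case inp of
  []     -> []
  (x:xs) -> [(x, xs)]

-- Sequencing: feed every result of p into the parser produced by f.
(>>=) :: Parser a -> (a -> Parser b) -> Parser b
p >>= f = \inp -> concat [f v out | (v, out) <- p inp]

-- Used by do notation for statements whose result is discarded.
(>>) :: Parser a -> Parser b -> Parser b
p >> q = p >>= \_ -> q

-- Hypothetical example parser: keep the first and third characters.
twoOfThree :: Parser (Char, Char)
twoOfThree = do
  x <- item
  item
  y <- item
  return (x, y)
```

With these definitions, twoOfThree "abcdef" evaluates to [(('a','c'),"def")].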

Related

haskell return [] for data from a parser function

In the code below, I am supposed to get [] after running parse noun:
parse noun "something" = []
Unfortunately I cannot change the signature of the function, so Maybe is not an option.
How can I return [] from the noun function when running
parse noun "something"
? (I don't want to return the 'c' variable.)
Many thanks for your help.
import Parsing
data Tree = Branch Sort [Tree]
          | Leaf Sort String deriving (Eq, Show)
nouns :: [String]
nouns = ["flight", "breeze", "trip", "morning"]
oneOf :: [String] -> Parser String
oneOf l = do
    cs <- token identifier
    guard (elem cs l)
    return cs
noun :: Parser Tree
noun = do
    cs <- token identifier
    let a = Leaf Noun cs
    let b = parse (oneOf nouns) cs
    let c = Leaf Noun []
    if null b then return c else return a
-- Parsing.hs
-- Functional parsing library from chapter 13 of Programming in Haskell,
-- Graham Hutton, Cambridge University Press, 2016.
module Parsing (module Parsing, module Control.Applicative) where
import Control.Applicative
import Data.Char
-- Basic definitions
newtype Parser a = P (String -> [(a,String)])
parse :: Parser a -> String -> [(a,String)]
parse (P p) inp = p inp
item :: Parser Char
item = P (\inp -> case inp of
                     [] -> []
                     (x:xs) -> [(x,xs)])
-- Sequencing parsers
instance Functor Parser where
   -- fmap :: (a -> b) -> Parser a -> Parser b
   fmap g p = P (\inp -> case parse p inp of
                            [] -> []
                            [(v,out)] -> [(g v, out)])
instance Applicative Parser where
   -- pure :: a -> Parser a
   pure v = P (\inp -> [(v,inp)])
   -- <*> :: Parser (a -> b) -> Parser a -> Parser b
   pg <*> px = P (\inp -> case parse pg inp of
                             [] -> []
                             [(g,out)] -> parse (fmap g px) out)
instance Monad Parser where
   -- (>>=) :: Parser a -> (a -> Parser b) -> Parser b
   p >>= f = P (\inp -> case parse p inp of
                           [] -> []
                           [(v,out)] -> parse (f v) out)
-- Making choices
instance Alternative Parser where
   -- empty :: Parser a
   empty = P (\inp -> [])
   -- (<|>) :: Parser a -> Parser a -> Parser a
   p <|> q = P (\inp -> case parse p inp of
                           [] -> parse q inp
                           [(v,out)] -> [(v,out)])
-- Derived primitives
sat :: (Char -> Bool) -> Parser Char
sat p = do x <- item
           if p x then return x else empty
digit :: Parser Char
digit = sat isDigit
lower :: Parser Char
lower = sat isLower
upper :: Parser Char
upper = sat isUpper
letter :: Parser Char
letter = sat isAlpha
alphanum :: Parser Char
alphanum = sat isAlphaNum
char :: Char -> Parser Char
char x = sat (== x)
string :: String -> Parser String
string [] = return []
string (x:xs) = do char x
                   string xs
                   return (x:xs)
ident :: Parser String
ident = do x <- lower
           xs <- many alphanum
           return (x:xs)
nat :: Parser Int
nat = do xs <- some digit
         return (read xs)
int :: Parser Int
int = do char '-'
         n <- nat
         return (-n)
       <|> nat
-- Handling spacing
space :: Parser ()
space = do many (sat isSpace)
           return ()
token :: Parser a -> Parser a
token p = do space
             v <- p
             space
             return v
identifier :: Parser String
identifier = token ident
natural :: Parser Int
natural = token nat
integer :: Parser Int
integer = token int
symbol :: String -> Parser String
symbol xs = token (string xs)
Found the solution thanks to chi:
noun :: Parser Tree
noun = do
    cs <- token identifier
    guard (elem cs nouns)
    let a = Leaf Noun cs
    return a
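The reason guard gives [] here is that guard False is the Alternative's empty, and for this kind of parser empty is the parser that returns no results. Below is a cut-down sketch of that mechanism. It assumes a stand-in Parser written in the style of the book's library, and the word parser is a hypothetical simplification of token identifier. It returns a plain String instead of a Tree so that it does not depend on the Sort type.

```haskell
import Control.Applicative (Alternative (..))
import Control.Monad (ap, guard, liftM)
import Data.Char (isAlpha)

-- Stand-in for the book's Parser; not the real Parsing module.
newtype Parser a = P (String -> [(a, String)])

parse :: Parser a -> String -> [(a, String)]
parse (P p) = p

instance Functor Parser where
  fmap = liftM

instance Applicative Parser where
  pure v = P (\inp -> [(v, inp)])
  (<*>)  = ap

instance Monad Parser where
  p >>= f = P (\inp -> concat [parse (f v) out | (v, out) <- parse p inp])

instance Alternative Parser where
  empty   = P (const [])   -- the parser that always fails
  p <|> q = P (\inp -> case parse p inp of
                         [] -> parse q inp
                         rs -> rs)

-- Hypothetical word parser: consumes a leading run of letters.
word :: Parser String
word = P (\inp -> case span isAlpha inp of
                    ("", _)   -> []
                    (w, rest) -> [(w, rest)])

nouns :: [String]
nouns = ["flight", "breeze", "trip", "morning"]

noun :: Parser String
noun = do
  w <- word
  guard (w `elem` nouns)   -- guard False is empty, so the whole parse is []
  return w
```

Here parse noun "something" gives [], while parse noun "trip home" gives [("trip"," home")].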

Understanding the filterM function

I am learning about the filterM function in the book "Learn You a Haskell for Great Good!" by Miran Lipovaca. For the following example:
keepSmall :: Int -> Writer [String] Bool
keepSmall x
    | x < 4 = do
        tell ["Keeping " ++ show x]
        return True
    | otherwise = do
        tell [show x ++ " is too large, throwing it away"]
        return False
The result obtained from using this function with filterM is the following:
> runWriter $ filterM keepSmall [9,1,5,2,10,3]
([1,2,3],["9 is too large, throwing it away","Keeping 1","5 is too large,
throwing it away","Keeping 2","10 is too large, throwing it away","Keeping 3"])
Regarding the type of the result of filterM, I know that filterM has the following type declaration:
filterM :: (Monad m) => (a -> m Bool) -> [a] -> m [a]
Since the monad used for this example is Writer [String], would the type of the list resulting from filterM be Writer [String] [Int]? If this is the case, is this why the result type is ([Int], [String]), since Writer w a is equivalent to the tuple (a,w)?
That's because of the type of runWriter:
runWriter :: Writer w a -> (a, w)
According to Hoogle, it simply unwraps a writer computation as a (result, output) pair. That's why you got the result as a pair.
A little example, just to see how it works in another context:
runWriter (tell "Hello")
=> ((),"Hello")
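To make the "Writer w a is a wrapped (a, w) pair" point concrete without depending on mtl, here is a sketch with a hand-rolled writer. The names W, runW and tellW are hypothetical stand-ins for Writer, runWriter and tell, not the library's actual definitions; filterM is the real one from Control.Monad.

```haskell
import Control.Monad (filterM)

-- Hypothetical minimal writer: literally a (result, log) pair.
newtype W w a = W { runW :: (a, w) }

instance Functor (W w) where
  fmap f (W (a, w)) = W (f a, w)

instance Monoid w => Applicative (W w) where
  pure a = W (a, mempty)
  W (f, w1) <*> W (a, w2) = W (f a, w1 <> w2)

instance Monoid w => Monad (W w) where
  W (a, w1) >>= k = let W (b, w2) = k a in W (b, w1 <> w2)

-- Append to the log and return unit.
tellW :: w -> W w ()
tellW w = W ((), w)

keepSmall :: Int -> W [String] Bool
keepSmall x
  | x < 4 = do
      tellW ["Keeping " ++ show x]
      return True
  | otherwise = do
      tellW [show x ++ " is too large, throwing it away"]
      return False

-- filterM produces a W [String] [Int]; runW exposes the pair inside it.
kept :: ([Int], [String])
kept = runW (filterM keepSmall [9, 1, 5])
```

So filterM keepSmall has result type W [String] [Int], and unwrapping it is what yields the ([Int], [String]) pair.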

Why does this loop with 'data' but not 'newtype'?

Here is the code :
import Control.Applicative
-- newtype Parser a = Parser { runParser :: String -> [(a, String)] }
data Parser a = Parser { runParser :: String -> [(a, String)] }
instance Functor Parser where
    fmap f (Parser p) = Parser (\s -> [(f x, s') | (x, s') <- p s ] )
instance Applicative Parser where
    pure a = Parser (\s -> [(a, s)])
    Parser q <*> Parser p = Parser (\s -> [(f x, s'') | (f, s') <- q s, (x, s'') <- p s'])
instance Alternative Parser where
    empty = Parser (\s -> [])
    Parser q <|> Parser p = Parser (\s -> q s ++ p s)
item = Parser (\s -> case s of
    (x:xs) -> [(x, xs)]
    _ -> []
    )
With the current code, runParser (some item) "abcd" loops, but if Parser is declared as newtype, it works just fine.
This is a great way of getting at one of the differences between data and newtype. The heart of the problem here is actually in the pattern matching of the <|> definition.
instance Alternative Parser where
    empty = Parser (\s -> [])
    Parser q <|> Parser p = Parser (\s -> q s ++ p s)
Remember that at runtime, a newtype becomes the same thing as the type it is wrapping. So when a newtype is pattern matched, GHC doesn't do anything: there is no constructor to evaluate to WHNF (weak head normal form).
By contrast, when a data type is matched, seeing the pattern Parser q tells GHC it needs to evaluate that parser to WHNF. That is a problem, because some is an infinite fold of <|>. There are two ways to solve the problem with data:
Don't have Parser patterns in <|>:
instance Alternative Parser where
    empty = Parser (\s -> [])
    q <|> p = Parser (\s -> runParser q s ++ runParser p s)
Use lazy patterns:
instance Alternative Parser where
    empty = Parser (\s -> [])
    ~(Parser q) <|> ~(Parser p) = Parser (\s -> q s ++ p s)
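The difference in forcing can be seen in isolation with a much smaller sketch. The types D and N and the three functions are made up for illustration and have nothing to do with the parser itself:

```haskell
data    D = D Int   -- a real constructor exists at runtime
newtype N = N Int   -- the constructor is erased at runtime

-- Matching on a data constructor forces the argument to WHNF.
fromData :: D -> Bool
fromData (D _) = True

-- Matching on a newtype constructor forces nothing.
fromNewtype :: N -> Bool
fromNewtype (N _) = True

-- A lazy (irrefutable) pattern defers the match until a field is used.
fromDataLazy :: D -> Bool
fromDataLazy ~(D _) = True
```

In GHCi, fromNewtype undefined and fromDataLazy undefined both return True, whereas fromData undefined throws, because only the plain data pattern has to evaluate its argument before the right-hand side can run.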

haskell, the same program using monads

I wrote a program that behaves as follows, e.g.:
test "12 124 212" = Right [12, 124, 212]
test "43 243 fs3d 2" = Left "fs3d is not a number"
import Data.Char (isDigit)
test :: String -> Either String [Int]
test w = iter [] $ words w
    where
        iter acc [] = Right (reverse acc)
        iter acc (x:xs) = if (all isDigit x) then
                              iter ((read x):acc) xs
                          else
                              Left (x ++ " is not a number")
I am starting to learn monads. Could you show me how to implement it using monads?
I think you are looking for traverse/mapM (they're the same for lists). Also you can use readEither for simplification:
import Data.Traversable (traverse)
import Data.Bifunctor (first)
import Text.Read (readEither)
test :: String -> Either String [Int]
test = traverse parseItem . words
parseItem :: String -> Either String Int
parseItem x = first (const $ x++" is not a number") $ readEither x
So what does mapM do? It basically implements the recursion over the list that you did manually. However, unlike the standard map function it takes a monadic function (parseItem in our case, where Either String is a monad) and applies one step on the list after the other:
iter [] = Right []
iter (x:xs) = do
    r <- parseItem x
    rs <- iter xs
    return (r:rs)
Bergi's answer is just right, but maybe you'll find it easier to understand presented this way:
test :: String -> Either String [Int]
test str = traverse parseNumber (words str)
parseNumber :: String -> Either String Int
parseNumber str
    | all isDigit str = Right (read str)
    | otherwise = Left (str ++ " is not a number")
The other thing I'd recommend is don't write tail-recursive accumulator loops like iter in your example. Instead, look at library documentation and try to find list functions that do what you want. In this case, as Bergi correctly pointed out, traverse is exactly what you want. It will take some study to get fully comfortable with this function, though. But given how the Monad instance of Either and the Traversable instance of lists work, the traverse in this example works like this:
-- This is the same as `traverse` for lists and `Either`
traverseListWithEither :: (a -> Either err b) -> [a] -> Either err [b]
traverseListWithEither f [] = Right []
traverseListWithEither f (a:as) =
    case f a of
        Left err -> Left err
        Right b -> mapEither (b:) (traverseListWithEither f as)
-- This is the same as the `fmap` function for `Either`
mapEither :: (a -> b) -> Either e a -> Either e b
mapEither f (Left e) = Left e
mapEither f (Right a) = Right (f a)
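One small difference between the two answers is worth checking: the readEither version and the all isDigit version do not accept exactly the same inputs. The sketch below puts them side by side; the names viaRead and viaDigits are made up for the comparison.

```haskell
import Data.Bifunctor (first)
import Data.Char (isDigit)
import Text.Read (readEither)

-- Item parser in the style of the readEither answer.
viaRead :: String -> Either String Int
viaRead x = first (const (x ++ " is not a number")) (readEither x)

-- Item parser in the style of the isDigit answer.
viaDigits :: String -> Either String Int
viaDigits x
  | all isDigit x = Right (read x)
  | otherwise     = Left (x ++ " is not a number")
```

For "-5", viaRead gives Right (-5) while viaDigits gives Left "-5 is not a number". Either way, traverse stops at the first failure: traverse viaRead (words "1 -5 x y") is Left "x is not a number", and "y" is never reported.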

Functional Parser example in Haskell using GHCi

I am a beginner learning Haskell. Here is a problem I've encountered when using GHCi.
p :: Parser (Char, Char)
p = do x <- item
       item
       y <- item
       return (x,y)
item :: Parser Char
item = P (\inp -> case inp of
                     [] -> []
                     (x:xs) -> [(x,xs)])
item is another parser, with item :: Parser Char; it simply parses the first character of a string.
When I load the file and then execute
parse p "abcdef"
an exception is shown:
*** Exception: You must implement (>>=)
Any idea how to fix this problem?
Updated information:
The Parser is defined as follow:
newtype Parser a = P (String -> [(a,String)])
instance Monad Parser where
    return :: a -> Parser a
    return v = P (\inp -> [(v,inp)])
    (>>=) :: Parser a -> (a -> Parser b) -> Parser b
    p >>= f = --...
In order to use do notation, your Parser must be an instance of Monad:
instance Monad Parser where
    return :: a -> Parser a
    return = -- ...
    (>>=) :: Parser a -> (a -> Parser b) -> Parser b
    p >>= f = -- ...
The compiler needs you to fill in definitions of return and >>=.
do notation is syntactic sugar that desugars to uses of >>= (pronounced "bind"). For example, your code desugars to:
p :: Parser (Char, Char)
p = item >>= \x ->
    item >>= \_ ->
    item >>= \y ->
    return (x,y)
Or, with more explicit parentheses:
p = item >>= (\x -> item >>= (\_ -> item >>= (\y -> return (x,y))))
>>= describes how to combine a Parser a along with a function a -> Parser b to create a new Parser b.
Using your definition of Parser, a working Monad instance is the following (runParser is assumed here to be the unwrapping function, i.e. runParser (P f) = f):
instance Monad Parser where
    return a = P $ \s -> [(a,s)]
    p >>= f = P $ concatMap (\(a,s') -> runParser (f a) s') . runParser p
    -- which is equivalent to
    -- p >>= f = P $ \s -> [(b,s'') | (a,s') <- runParser p s, (b,s'') <- runParser (f a) s']
Consider what >>= does in terms of a p :: Parser a and a function f :: a -> Parser b.
when unwrapped, p takes a String and returns a list of (a,String) pairs:
runParser p :: String -> [(a,String)]
for each (a,String) pair, we can run f on the a to get a new parser q:
map go . runParser p :: String -> [(Parser b,String)]
    where go :: (a, String) -> (Parser b, String)
          go (a,s') = let q = f a in (q, s')
if we unwrap q, we get a function that takes a String and returns a list of (b, String) pairs:
map go . runParser p :: String -> [(String -> [(b,String)],String)]
    where go :: (a, String) -> (String -> [(b,String)],String)
          go (a,s') = let q = f a in (runParser q, s')
we can run that function on the String that was paired with the a to get our list of (b, String) pairs immediately:
map go . runParser p :: String -> [[(b,String)]]
    where go :: (a, String) -> [(b,String)]
          go (a,s') = let q = f a in runParser q s'
and if we flatten the resulting list of lists, we get a String -> [(b,String)], which is just an unwrapped Parser b:
concat . map go . runParser p :: String -> [(b,String)]
    where go :: (a, String) -> [(b,String)]
          go (a,s') = let q = f a in runParser q s'
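Putting the pieces together, a complete file might look like the sketch below. It makes two assumptions beyond the text above: runParser is defined as a plain unwrapping function, and Functor and Applicative instances are added, since recent versions of GHC require them before a Monad instance is accepted. The name firstAndThird is made up; it is the p from the question.

```haskell
newtype Parser a = P (String -> [(a, String)])

-- Unwraps a parser; assumed definition, playing the role of parse.
runParser :: Parser a -> String -> [(a, String)]
runParser (P f) = f

instance Functor Parser where
  fmap g p = P (\s -> [(g a, s') | (a, s') <- runParser p s])

instance Applicative Parser where
  pure a    = P (\s -> [(a, s)])
  pf <*> pa = P (\s -> [ (g a, s'') | (g, s')  <- runParser pf s
                                    , (a, s'') <- runParser pa s' ])

instance Monad Parser where
  -- Run p, then run (f a) on each leftover input and flatten the results.
  p >>= f = P (concatMap (\(a, s') -> runParser (f a) s') . runParser p)

item :: Parser Char
item = P (\inp -> case inp of
                    []     -> []
                    (x:xs) -> [(x, xs)])

-- Keep the first and third characters, discarding the second.
firstAndThird :: Parser (Char, Char)
firstAndThird = do
  x <- item
  _ <- item
  y <- item
  return (x, y)
```

With this in place, runParser firstAndThird "abcdef" evaluates to [(('a','c'),"def")].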

Resources