I am going through Write Yourself a Scheme in Haskell. Its a great tutorial, but I've run into a wall with one of the parsing exercises:
parseNumber :: Parser LispVal
parseNumber = liftM (Number . read) $ many1 digit
Rewrite parseNumber using:
Do-notation
explicit sequencing with the >>= operator
I had no problems with do-notation:
parseNumber :: Parser LispVal
parseNumber = do x <- many1 digit
let y = read x
return $ Number y
For #2 I've tried a bunch of variations such as:
parseNumber :: Parser LispVal
parseNumber = (many1 digit) >>= (liftM (Number . read))
but I keep running into type errors. I have two questions.
Why am I getting type errors? Am I misunderstanding the monadic bind operator?
Why AREN'T I getting similar type errors with my do-notation solution?
I feel like I am missing a fundamental concept regarding types?
You're attempting a non-trivial transformation from do-notation to bind notation, I recommend doing it the "trivial" way, and then making it points-free.
Recall:
x <- m === m >>= \x ->
let x = e === let x = e in
Then you have:
parseNumber = many1 digit >>= \x ->
let y = read x in
return (Number y)
(I've removed the $ to avoid precedence problems.)
We can then convert this into:
parseNumber = many1 digit >>= \x -> return (Number (read x))
= many1 digit >>= return . Number . read
Now, if you want to use liftM, you need to stop using bind, since the lifted function expects a monadic value as its argument.
parseNumber = liftM (Number . read) (many1 digit)
In your case, bind has type:
(>>=) :: Parser a -> (a -> Parser b) -> Parser b
(since you're using Parser as the Monad)
You give bind two arguments: the first one, many1 digit, should be ok (regarding the type); but the type of the second argument is the result type of liftM, namely Parser a -> Parser b and this does not fit the second argument's expected type (a -> Parser b)!
Without having tested it: instead of using liftM (Number.read) as second argument of bind, try using return . Number . read - this should have the right type and gives probably what you want...
Related
I'm trying to write a simple parser via Haskell, but stuck at an infinite loop.
the code is:
import Control.Applicative (Alternative, empty, many, (<|>))
data Parser a = Parser {runParser :: String -> [(a, String)]}
instance Functor Parser where
fmap f (Parser p) = Parser $ \s -> [(f x', s') | (x', s') <- p s]
instance Applicative Parser where
pure x = Parser $ \s -> [(x, s)]
(Parser pf) <*> (Parser p) = Parser $ \s -> [(f' x, ss') | (f', ss) <- pf s, (x, ss') <- p ss]
instance Alternative Parser where
empty = Parser $ \s -> []
(Parser p1) <|> (Parser p2) = Parser $ \s ->
case p1 s of
[] -> p2 s
xs -> xs
singleSpaceParser :: Parser Char
singleSpaceParser = Parser $ \s ->
( case s of
x : xs -> if x == ' ' then [(' ', xs)] else []
[] -> []
)
multiSpaceParser :: Parser [Char]
multiSpaceParser = many singleSpaceParser
I just load this file in ghci, and run:
runParser multiSpaceParser " 123"
I expect it to get [(" ", "123")], but actually it got an infinite loop
I used trace to debug, and it seems that many is wrong
How can I fix this bug?
Let's assume
many p = (:) <$> p <*> many p <|> pure []
and consider the call
many singleSpaceParser " 123"
(The string does not actually matter here, the many singleSpaceParser call will always loop.)
One reduction step yields
((:) <$> singleSpaceParser <*> many singleSpaceParser <|> pure []) " 123"
Now observe that, in order to reduce the call to (<|>), we have to evaluate both arguments of (<|>) to be of the shape Parser ....
Let's consider doing that for (:) <$> singleSpaceParser <*> many singleSpaceParser. As both (<$>) and (<*>) are infixl 4, this is an application of <*> at the outermost level.
But now observe that in order to reduce (<*>), we again have to evaluate both arguments of (<*>) to be of the shape Parser ..., so in particular the recursive call many singleSpaceParser.
This is where we get the infinite loop.
By switching data to newtype (or alternatively, at least avoiding aggressively pattern-matching on the Parser constructor in all the second arguments), these problems can be avoided.
Sort of a followup to my last question. I'm going through Brent Yorgey's Haskell course and I'm trying to solve an exercise that asks us to create an instance of Applicative for the following type:
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }
runParser parsers a string and returns a token and the remaining string. p1 <*> p2 should in this case apply a function generated by runParser p1 to the token generated by runParser p2 (applied to what's left of the string after running runParser p1).
So far I have:
(Parser { runParser = run }) <*> (Parser { runParser = run' }) = Parser run''
where run'' s = (first <$> f) <*> (s' >>= run')
where f = fst <$> run s
s' = snd <$> run s
(first <$> f) <*> (s' >>= run') seems pretty concise to me, but the nested where's and the weird destructuring of run s look "off". Is there a nicer way to write this?
First, let me rewrite this a bit to avoid pattern matching:
p <*> q = Parser run
where run s = (first <$> f) <*> (s' >>= runParser q)
where f = fst <$> runParser p s
s' = snd <$> runParser p s
Here, I’ve simply used the field accessor runParser :: Parser a -> String -> Maybe (a, String) instead of directly pattern-matching on the arguments. This is considered a more idiomatic method of accessing newtyped functions in Haskell.
Next, there are some obvious simplifications which can be made, particularly inlining some functions:
p <*> q = Parser $ \s -> (first <$> f) <*> (s' >>= runParser q)
where
f = fst <$> runParser p s
s' = snd <$> runParser p s
(Note that s now has to be explicitly passed to the functions in the where block so they can access it. Don’t worry, I’ll get rid of this in a minute.)
One confusing thing in this implementation is the nested applicatives and monads. I’ll rewrite that section slightly to make it a bit clearer:
p <*> q = Parser $ \s ->
let qResult = s' s >>= runParser q
in first <$> f s <*> qResult
where
f s = fst <$> runParser p s
s' s = snd <$> runParser p s
Next, let’s get rid of those annoying f and s' definitions. We can do this using pattern-matching. By pattern-matching on the output of runParser p s, we can access these values directly:
p <*> q = Parser $ \s ->
case runParser p s of
Nothing -> Nothing
Just (f, s') ->
let qResult = runParser q s'
in first f <$> qOutput
(Note that, since f and s' are no longer in Maybe, much of the applicative and monadic plumbing which was required before is now unneeded. One <$> still remains, since runParser q s' can still fail).
Let’s rewrite this slightly, by inlining qResult:
p <*> q = Parser $ \s ->
case runParser p s of
Nothing -> Nothing
Just (f, s') -> first f <$> runParser q s'
Now observe a pattern in this code. It does runParser p s, fails if this fails; otherwise it uses the value in another computation which can fail. This just sounds like monadic sequencing! So let’s rewrite it with >>=:
p <*> q = Parser $ \s -> runParser p s >>= \(f, s') -> first f <$> runParser q s'
And finally, the whole thing can be rewritten in do-notation for readability:
p <*> q = Parser $ \s -> do
(f, s') <- runParser p s
qResult <- runParser q s'
return $ first f qResult
Much easier to read! And what makes this version particularly nice is that it’s easy to see what’s going on — run the first parser, get its output and use it to run the second parser, then combine the results.
In my eye, there's no shame in keeping it simple using basic pattern matching only, without relying too much on <*>, <$>, first, and other library functions.
Parser pF <*> Parser pX = Parser $ \s -> do
(f, s' ) <- pF s
(x, s'') <- pX s'
return (f x, s'')
The above do block is in the Maybe monad.
I'm trying to implement my own Applicative parser, here's the code I use:
{-# LANGUAGE ApplicativeDo, LambdaCase #-}
module Parser where
-- Implementation of an Applicative Parser
import Data.Char
import Control.Applicative (some, many, empty, (<*>), (<$>), (<|>), Alternative)
data Parser a = Parser { runParser :: String -> [(a, String)] }
instance Functor Parser where
-- fmap :: (a -> b) -> (Parser a -> Parser b)
fmap f (Parser p) = Parser (\s -> [(f a, s') | (a,s') <- p s])
instance Applicative Parser where
-- pure :: a -> Parser a
-- <*> :: Parser (a -> b) -> Parser a -> Parser b
pure x = Parser $ \s -> [(x, s)]
(Parser pf) <*> (Parser p) = Parser $ \s ->
[(f a, s'') | (f, s') <- pf s, (a, s'') <- p s']
instance Alternative Parser where
-- empty :: Parser a
-- <|> :: Parser a -> Parser a -> Parser a
empty = Parser $ \_s -> []
(Parser p1) <|> (Parser p2) = Parser $ \s ->
case p1 s of [] -> p2 s
xs -> xs
char :: Char -> Parser Char
char c = Parser $ \case (c':cs) | c == c' -> [(c,cs)] ; _ -> []
main = print $ runParser (some $ char 'A') "AAA"
When I run it, it gets stuck and never returns. After digging into the problem I pinpointed the root cause to be my implementation of the <|> method. If I use the following implementation then everything goes as expected:
instance Alternative Parser where
empty = Parser $ \_s -> []
p1 <|> p2 = Parser $ \s ->
case runParser p1 s of [] -> runParser p2 s
xs -> xs
These two implementations are, in my understanding, quite equivalent. What I guess is that this may have something to do with Haskell's lazy evaluation scheme. Can someone explain what's going on?
Fact "star": in your implementation of (<*>):
Parser p1 <*> Parser p2 = ...
...we must compute enough to know that both arguments are actually applications of the Parser constructor to something before we may proceed to the right-hand side of the equation.
Fact "pipe strict": in this implementation:
Parser p1 <|> Parser p2 = ...
...we must compute enough to know that both parsers are actually applications of the Parser constructor to something before we may proceed to the right-hand side of the equals sign.
Fact "pipe lazy": in this implementation:
p1 <|> p2 = Parser $ ...
...we may proceed to the right-hand side of the equals sign without doing any computation on p1 or p2.
This is important, because:
some v = some_v where
some_v = pure (:) <*> v <*> (some_v <|> pure [])
Let's take your first implementation, the one about which we know the "pipe strict" fact. We want to know if some_v is an application of Parser to something. Thanks to fact "star", we must therefore know whether pure (:), v, and some_v <|> pure [] are applications of Parser to something. To know this last one, by fact "pipe strict", we must know whether some_v and pure [] are applications of Parser to something. Whoops! We just showed that to know whether some_v is an application of Parser to something, we need to know whether some_v is an application of Parser to something -- an infinite loop!
On the other hand, with your second implementation, to check whether some_v is a Parser _, we still must check pure (:), v, and some_v <|> pure [], but thanks to fact "pipe lazy", that's all we need to check -- we can be confident that some_v <|> pure [] is a Parser _ without first checking recursively that some_v and pure [] are.
(And next, you will learn about newtype -- and be confused yet again when changing from data to newtype makes both implementation work!)
I'm just starting to learn Haskell from this wikibook and I had a little trouble with one of the exercises.
Specifically, the following doesn't work as I expect
parseNumber :: Parser LispVal
parseNumber = (many1 digit) >>= (return $ Number . read)
unless I change it slightly
parseNumber :: Parser LispVal
parseNumber = (many1 digit) >>= (\n -> return $ Number . read $ n)
I was hoping someone could explain why return $ Number . read doesn't evaluate to the same lambda function that I explicitly created in the second definition, since I thought that this is exactly what partial function evaluation does when it's used in point free style code (obviously not!)
Thanks for any help, hopefully it's not another beginner’s monad problem...
This is just an issue of how $ associates. Fundamentally, $ is just an operator for writing fewer parentheses; it's the same as adding parentheses wrapping to the end of the expression.
Using this idea, we can rewrite your second example:
parseNumber = (many1 digit) >>= (\n -> return (Number . read ( n)))
For reference, the original expression with parentheses looks like this:
parseNumber = (many1 digit) >>= (return (Number . read))
So the equivalent of the partial application is actually:
parseNumber = (many1 digit) >>= (\n -> (return (Number . read)) n)
Basically, combining multiple $ associates differently than what you expected.
Go to the definitions --
($) :: (a -> b) -> a -> b
($) = id
(.) :: (b -> c) -> (a -> b) -> (a -> c)
(.) f g x = f (g x)
Now you have
return $ Number . read = ($) return (Number . read) -- (.) has higher precedence
= return (Number . read)
and the monad you're in is the Parser monad, so this is trying to bind a parsed value to a function that returns a parser for another function (many layers of abstraction!)
Instead, what you want is
return . Number . read
which is equivalent to what you wrote, as you can see by doing
\n -> return $ Number . read $ n = \n -> return . Number . read $ n -- definition of (.)
= return . Number . read -- eta reduction
Finally, note that when you see the pattern
x >>= return . f
this can always be replaced with
fmap f x -- or liftM f x
i.e. it shows that you're not really using the Monad instance at all, but instead the weaker (and more general) Functor instance.
It looks like you want:
parseNumber = (many1 digit) >>= (return . Number . read)
or alteratively
parseNumber = (many1 digit) `fmap` (Number . read)
Number . read is a function String -> LispVal so the type of return $ Number . read is Parser (String -> LispVal), while you need the function to have type String -> Parser LispVal
How do I write this function using the >>= operator?
parseNumber2 :: Parser LispVal
parseNumber2 = do x <- many1 digit
return $ (Number . read) x
A straightforward desugaring of the do-notation gives
parseNumber2 :: Parser LispVal
parseNumber2 = many1 digit >>= (return . Number . read)
but the more idiomatic way is to use fmap or the equivalent <$> operator from Control.Applicative
parseNumber2 = Number . read <$> many1 digit
To desugar do-notation:
Flip any <- bindings over to the right side and add >>= and a lambda abstraction
do x <- a
y <- b
...
becomes
a >>= \x ->
b >>= \y ->
...
For any non-binding forms, add a >> on the right:
do a
b
...
becomes
a >>
b >>
...
Leave the last expression alone.
do a
becomes
a
Applying these rules to your code, we get
parseNumber2 =
many1 digit >>= \x ->
return $ (Number . read) x
Do some simplifications
parseNumber2 = many1 digit >>= \x -> (return . Number . read) x
parsenumber2 = many1 digit >>= (return . Number . read)
Now, for any monad, fmap or <$> can be defined as
f <$> x = x >>= (return . f)
Use this to get the idiomatic form
parseNumber2 = Number . read <$> many1 digit