Destructuring `Maybe (a, b)` - haskell

Sort of a followup to my last question. I'm going through Brent Yorgey's Haskell course and I'm trying to solve an exercise that asks us to create an instance of Applicative for the following type:
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }
runParser parsers a string and returns a token and the remaining string. p1 <*> p2 should in this case apply a function generated by runParser p1 to the token generated by runParser p2 (applied to what's left of the string after running runParser p1).
So far I have:
(Parser { runParser = run }) <*> (Parser { runParser = run' }) = Parser run''
where run'' s = (first <$> f) <*> (s' >>= run')
where f = fst <$> run s
s' = snd <$> run s
(first <$> f) <*> (s' >>= run') seems pretty concise to me, but the nested where's and the weird destructuring of run s look "off". Is there a nicer way to write this?

First, let me rewrite this a bit to avoid pattern matching:
p <*> q = Parser run
where run s = (first <$> f) <*> (s' >>= runParser q)
where f = fst <$> runParser p s
s' = snd <$> runParser p s
Here, I’ve simply used the field accessor runParser :: Parser a -> String -> Maybe (a, String) instead of directly pattern-matching on the arguments. This is considered a more idiomatic method of accessing newtyped functions in Haskell.
Next, there are some obvious simplifications which can be made, particularly inlining some functions:
p <*> q = Parser $ \s -> (first <$> f) <*> (s' >>= runParser q)
where
f = fst <$> runParser p s
s' = snd <$> runParser p s
(Note that s now has to be explicitly passed to the functions in the where block so they can access it. Don’t worry, I’ll get rid of this in a minute.)
One confusing thing in this implementation is the nested applicatives and monads. I’ll rewrite that section slightly to make it a bit clearer:
p <*> q = Parser $ \s ->
let qResult = s' s >>= runParser q
in first <$> f s <*> qResult
where
f s = fst <$> runParser p s
s' s = snd <$> runParser p s
Next, let’s get rid of those annoying f and s' definitions. We can do this using pattern-matching. By pattern-matching on the output of runParser p s, we can access these values directly:
p <*> q = Parser $ \s ->
case runParser p s of
Nothing -> Nothing
Just (f, s') ->
let qResult = runParser q s'
in first f <$> qOutput
(Note that, since f and s' are no longer in Maybe, much of the applicative and monadic plumbing which was required before is now unneeded. One <$> still remains, since runParser q s' can still fail).
Let’s rewrite this slightly, by inlining qResult:
p <*> q = Parser $ \s ->
case runParser p s of
Nothing -> Nothing
Just (f, s') -> first f <$> runParser q s'
Now observe a pattern in this code. It does runParser p s, fails if this fails; otherwise it uses the value in another computation which can fail. This just sounds like monadic sequencing! So let’s rewrite it with >>=:
p <*> q = Parser $ \s -> runParser p s >>= \(f, s') -> first f <$> runParser q s'
And finally, the whole thing can be rewritten in do-notation for readability:
p <*> q = Parser $ \s -> do
(f, s') <- runParser p s
qResult <- runParser q s'
return $ first f qResult
Much easier to read! And what makes this version particularly nice is that it’s easy to see what’s going on — run the first parser, get its output and use it to run the second parser, then combine the results.

In my eye, there's no shame in keeping it simple using basic pattern matching only, without relying too much on <*>, <$>, first, and other library functions.
Parser pF <*> Parser pX = Parser $ \s -> do
(f, s' ) <- pF s
(x, s'') <- pX s'
return (f x, s'')
The above do block is in the Maybe monad.

Related

haskell parser combinator infinite loop

I'm trying to write a simple parser via Haskell, but stuck at an infinite loop.
the code is:
import Control.Applicative (Alternative, empty, many, (<|>))
data Parser a = Parser {runParser :: String -> [(a, String)]}
instance Functor Parser where
fmap f (Parser p) = Parser $ \s -> [(f x', s') | (x', s') <- p s]
instance Applicative Parser where
pure x = Parser $ \s -> [(x, s)]
(Parser pf) <*> (Parser p) = Parser $ \s -> [(f' x, ss') | (f', ss) <- pf s, (x, ss') <- p ss]
instance Alternative Parser where
empty = Parser $ \s -> []
(Parser p1) <|> (Parser p2) = Parser $ \s ->
case p1 s of
[] -> p2 s
xs -> xs
singleSpaceParser :: Parser Char
singleSpaceParser = Parser $ \s ->
( case s of
x : xs -> if x == ' ' then [(' ', xs)] else []
[] -> []
)
multiSpaceParser :: Parser [Char]
multiSpaceParser = many singleSpaceParser
I just load this file in ghci, and run:
runParser multiSpaceParser " 123"
I expect it to get [(" ", "123")], but actually it got an infinite loop
I used trace to debug, and it seems that many is wrong
How can I fix this bug?
Let's assume
many p = (:) <$> p <*> many p <|> pure []
and consider the call
many singleSpaceParser " 123"
(The string does not actually matter here, the many singleSpaceParser call will always loop.)
One reduction step yields
((:) <$> singleSpaceParser <*> many singleSpaceParser <|> pure []) " 123"
Now observe that, in order to reduce the call to (<|>), we have to evaluate both arguments of (<|>) to be of the shape Parser ....
Let's consider doing that for (:) <$> singleSpaceParser <*> many singleSpaceParser. As both (<$>) and (<*>) are infixl 4, this is an application of <*> at the outermost level.
But now observe that in order to reduce (<*>), we again have to evaluate both arguments of (<*>) to be of the shape Parser ..., so in particular the recursive call many singleSpaceParser.
This is where we get the infinite loop.
By switching data to newtype (or alternatively, at least avoiding aggressively pattern-matching on the Parser constructor in all the second arguments), these problems can be avoided.

Infinite loop when calling many on custom Parser [duplicate]

I'm trying to implement my own Applicative parser, here's the code I use:
{-# LANGUAGE ApplicativeDo, LambdaCase #-}
module Parser where
-- Implementation of an Applicative Parser
import Data.Char
import Control.Applicative (some, many, empty, (<*>), (<$>), (<|>), Alternative)
data Parser a = Parser { runParser :: String -> [(a, String)] }
instance Functor Parser where
-- fmap :: (a -> b) -> (Parser a -> Parser b)
fmap f (Parser p) = Parser (\s -> [(f a, s') | (a,s') <- p s])
instance Applicative Parser where
-- pure :: a -> Parser a
-- <*> :: Parser (a -> b) -> Parser a -> Parser b
pure x = Parser $ \s -> [(x, s)]
(Parser pf) <*> (Parser p) = Parser $ \s ->
[(f a, s'') | (f, s') <- pf s, (a, s'') <- p s']
instance Alternative Parser where
-- empty :: Parser a
-- <|> :: Parser a -> Parser a -> Parser a
empty = Parser $ \_s -> []
(Parser p1) <|> (Parser p2) = Parser $ \s ->
case p1 s of [] -> p2 s
xs -> xs
char :: Char -> Parser Char
char c = Parser $ \case (c':cs) | c == c' -> [(c,cs)] ; _ -> []
main = print $ runParser (some $ char 'A') "AAA"
When I run it, it gets stuck and never returns. After digging into the problem I pinpointed the root cause to be my implementation of the <|> method. If I use the following implementation then everything goes as expected:
instance Alternative Parser where
empty = Parser $ \_s -> []
p1 <|> p2 = Parser $ \s ->
case runParser p1 s of [] -> runParser p2 s
xs -> xs
These two implementations are, in my understanding, quite equivalent. What I guess is that this may have something to do with Haskell's lazy evaluation scheme. Can someone explain what's going on?
Fact "star": in your implementation of (<*>):
Parser p1 <*> Parser p2 = ...
...we must compute enough to know that both arguments are actually applications of the Parser constructor to something before we may proceed to the right-hand side of the equation.
Fact "pipe strict": in this implementation:
Parser p1 <|> Parser p2 = ...
...we must compute enough to know that both parsers are actually applications of the Parser constructor to something before we may proceed to the right-hand side of the equals sign.
Fact "pipe lazy": in this implementation:
p1 <|> p2 = Parser $ ...
...we may proceed to the right-hand side of the equals sign without doing any computation on p1 or p2.
This is important, because:
some v = some_v where
some_v = pure (:) <*> v <*> (some_v <|> pure [])
Let's take your first implementation, the one about which we know the "pipe strict" fact. We want to know if some_v is an application of Parser to something. Thanks to fact "star", we must therefore know whether pure (:), v, and some_v <|> pure [] are applications of Parser to something. To know this last one, by fact "pipe strict", we must know whether some_v and pure [] are applications of Parser to something. Whoops! We just showed that to know whether some_v is an application of Parser to something, we need to know whether some_v is an application of Parser to something -- an infinite loop!
On the other hand, with your second implementation, to check whether some_v is a Parser _, we still must check pure (:), v, and some_v <|> pure [], but thanks to fact "pipe lazy", that's all we need to check -- we can be confident that some_v <|> pure [] is a Parser _ without first checking recursively that some_v and pure [] are.
(And next, you will learn about newtype -- and be confused yet again when changing from data to newtype makes both implementation work!)

Why does this loops with 'data' but not 'newtype'?

Here is the code :
import Control.Applicative
-- newtype Parser a = Parser { runParser :: String -> [(a, String)] }
data Parser a = Parser { runParser :: String -> [(a, String)] }
instance Functor Parser where
fmap f (Parser p) = Parser (\s -> [(f x, s') | (x, s') <- p s ] )
instance Applicative Parser where
pure a = Parser (\s -> [(a, s)])
Parser q <*> Parser p = Parser (\s -> [(f x, s'') | (f, s') <- q s, (x, s'') <- p s'])
instance Alternative Parser where
empty = Parser (\s -> [])
Parser q <|> Parser p = Parser (\s -> q s ++ p s)
item = Parser (\s -> case s of
(x:xs) -> [(x, xs)]
_ -> []
)
With the current code, runParser (some item) "abcd" loops, but if Parser is declared as newtype, it works just fine.
This is a great way of getting at one of the difference between data and newtype. The heart of the problem here is actually in the pattern matching of the <|> definition.
instance Alternative Parser where
empty = Parser (\s -> [])
Parser q <|> Parser p = Parser (\s -> q s ++ p s)
Remember that at runtime, a newtype becomes the same thing as the type it is wrapping. Then, when a newtype is pattern matched, GHC doesn't do anything - there is no constructor to evaluate to WNHF.
On the contrary, when a data is matched, seeing the pattern Parser q tells GHC it needs to evaluate that parser to WNHF. That is a problem, because some is an infinite fold of <|>. There are two ways to solve the problem with data:
Don't have Parser patterns in <|>:
instance Alternative Parser where
empty = Parser (\s -> [])
q <|> p = Parser (\s -> runParser q s ++ runParser p s)
Use lazy patterns:
instance Alternative Parser where
empty = Parser (\s -> [])
~(Parser q) <|> ~(Parser p) = Parser (\s -> q s ++ p s)

Is there a point-free way to convert a conditional check into a Maybe type of the input?

I am just working through some simple exercises in haskell and was wondering if there was a point-free way of converting an if-then-else statement into a Maybe type: Nothing being returned if the condition is false, and Just the input if the condition is true.
In short, given some:
maybeIf :: (a -> Bool) -> a -> Maybe a
maybeIf cond a = if cond a then Just a else Nothing
Is there an implementation that is point-free with respect to a? I've also been looking at a more concrete version, a -> Maybe a, and feel like there may be an answer somewhere in Control.Arrow. However, since Maybe is a data type and if-else statements control data flow, I'm unsure if there is a clean way of doing it.
The main thing getting in the way of making that pointfree is the if/then/else. You can define an if' combinator, or you can use this generalized version that I define and use often:
ensure p x = x <$ guard (p x)
Standard tools give successive point-free versions as
ensure p = ap (<$) (guard . p)
ensure = ap (<$) . (guard .)
though I really don't think either are better than the pointful version.
You can import find from Data.Foldable and then it's quite simply:
import Data.Foldable(find)
maybeIf cond = find cond . Just
The function find is not complicated so you could quite easily define it yourself less generically, in terms of Maybe, but it isn't actually so different from your own implementation of maybeIf so you might not gain much, depending on why you wanted to do it.
If we choose a Church-encoding for Booleans…
truth :: Bool -> a -> a -> a
truth True t f = t
truth False t f = f
Then we can write a point-free maybeIf in Applicative-style.
maybeIf :: (a -> Bool) -> a -> Maybe a
maybeIf = liftA3 truth <*> pure Just <*> pure (pure Nothing)
Some intuitions…
f <$> m₁ <*> … <*> mₙ = \x -> f (m₁ x) … (mₙ x)
liftAₙ f <$> m₁ <*> … <*> mₙ = \x -> f <$> m₁ x <*> … <*> mₙ x
Here is a rendering in PNG format of the above "intuitions", in case your installed fonts do not support the needed unicode characters.
So therefore:
liftA3 truth <*> pure Just <*> pure (pure Nothing)
= liftA3 truth <$> id <*> pure Just <*> pure (pure Nothing)
= \p -> truth <$> id p <*> (pure Just) p <*> (pure (pure Nothing)) p
= \p -> truth <$> p <*> Just <*> pure Nothing
= \p -> \a -> truth (p a) (Just a) ((pure Nothing) a)
= \p -> \a -> truth (p a) (Just a) Nothing
Following dfeuer's lead (and using Daniel Wagner's new name for this function),
import Data.Bool (bool)
-- F T
-- bool :: a -> a -> Bool -> a
ensure :: (a -> Bool) -> a -> Maybe a
ensure p x = bool (const Nothing) Just (p x) x
ensure p = join (bool (const Nothing) Just . p)
= bool (const Nothing) Just =<< p
ensure = (bool (const Nothing) Just =<<)
join is a monadic function, join :: Monad m => m (m a) -> m a, but for functions it is simply
join k x = k x x
(k =<< f) x = k (f x) x
join is accepted as a replacement for W combinator in point-free code.
You only wanted it point-free with respect to the value argument, but it's easy to transform the equation with join further (readability of the result is another issue altogether), as
= join ((bool (const Nothing) Just .) p)
= (join . (bool (const Nothing) Just .)) p
Indeed,
#> (join . (bool (const Nothing) Just .)) even 3
Nothing
#> (bool (const Nothing) Just =<<) even 4
Just 4
But I'd much rather see \p x -> listToMaybe [x | p x] in an actual code.
Or just \p x -> [x | p x], with Monad Comprehensions. Which is the same as Daniel Wagner's x <$ guard (p x), only with different syntax.
This function is defined in Control.Monad.Plus and is called partial

How do you chain an arbitrarily long series of atomic parsers using applicatives?

Let's say I have this parser type:
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }
And this atomic parser unit:
satisfy :: ( Char -> Bool ) -> Parser Char
satisfy g = Parser $ \stream -> case stream of
(x:xs) | g x -> Just ( x, xs )
otherwise -> Nothing
The parser implements these three interfaces:
instance Functor Parser where
fmap g ( Parser p ) = Parser $ \xs0 -> p xs0 >>= \(x,xs) -> return ( g x, xs )
instance Applicative Parser where
pure a = Parser $ \xs0 -> Just ( a, xs0 )
(Parser p1) <*> (Parser p2) = Parser $ \xs0 -> do
(x1, xs1) <- p1 xs0
(x2, xs2) <- p2 xs1
return ( x1 x2, xs2 )
instance Alternative Parser where
empty = Parser $ const Nothing
(Parser p1) <|> (Parser p2) = Parser $ \ss -> let ss1 = p1 ss in case ss1 of
Nothing -> p2 ss
_ -> ss1
Now as I understand, I can now pop up to a higher level of abstraction and build more complex parsers by chaining satisfy using the applicative interface. In example:
-- | A parser that parses the first two chars in the stream if they are upper case
uParser = satisfy isUpper
parser1 = ( (:) <$> uParser ) <*> ( (\x -> [x]) <$> uParser )
runParser parser1 "HEllo" = Just ("HE","llo")
runParser parser1 "Hello" = Nothing
This is great, now what if I want to structure a computation such that the parser parses all cap letters in the stream until it encounters a lowercase letter? Use case:
runParser idealParser "hello" = Nothing
runParser idealParser "HEllo" = Just ("HE","llo")
runParser idealParser "HELLOIAMnotincaps" = Just ("HELLOIAM", "notincaps")
How do I express this notion of non-determined length?
Since you have an Alternative instance, you can simply use Control.Applicative.some to match a list of one or more occurrences.
> runParser (some uParser) "hello"
Nothing
> runParser (some uParser) "HEllo"
Just ("HE","llo")
> runParser (some uParser) "HELLOIAMnotincaps"
Just ("HELLOIAM","notincaps")
To implement it manually, you could use two mutually recursive parsers, e.g.
zeroOrMore = oneOrMore <|> pure []
oneOrMore = (:) <$> uParser <*> zeroOrMore

Resources