Making an 'optional' parser using optparse-applicative and constructing value for recursive data type - haskell

I have a data type called EntrySearchableInfo, written like this:
type EntryDate = UTCTime -- From Data.Time
type EntryTag = Tag -- String
type EntryName = Name -- String
type EntryDescription = Description -- String
type EntryId = Int
data EntrySearchableInfo
  = SearchableEntryDate EntryDate
  | SearchableEntryTag EntryTag
  | SearchableEntryName EntryName
  | SearchableEntryDescription EntryDescription
  | SearchableEntryId EntryId
It represents the things that make sense in a 'search' context.
I want to write a function with this type:
entrySearchableInfoParser :: Parser (Either String EntrySearchableInfo)
which (I think) will be a combination of several primitive Parser <Type> functions I have already written:
entryDateParser :: Parser (Either String UTCTime)
entryDateParser = parseStringToUTCTime <$> strOption
  (long "date" <> short 'd' <> metavar "DATE" <> help entryDateParserHelp)
searchableEntryDateParser :: Parser (Either String EntrySearchableInfo)
searchableEntryDateParser = SearchableEntryDate <$$> entryDateParser -- <$$> is just (fmap . fmap)
searchableEntryTagParser :: Parser (Either String EntrySearchableInfo)
searchableEntryTagParser = ...
...
So I have two questions:
1. How do I combine those parsers to make the entrySearchableInfoParser function?
2. The EntrySearchableInfo type is part of a larger Entry type defined like this:
data Entry
  = Add EntryDate EntryInfo EntryTag EntryNote EntryId
  | Replace EntrySearchableInfo Entry
  | ...
...
I already have a function with type
entryAdd :: Parser (Either String Entry)
which constructs an Entry using Add.
But I'm not sure how to build an Entry with Replace from entrySearchableInfoParser and entryAdd.

So combining those parsers was a lot simpler than I imagined.
I just had to use <|>:
entrySearchableInfoParser :: Parser (Either String EntrySearchableInfo)
entrySearchableInfoParser =
  searchableEntryDateParser
    <|> searchableEntryTagParser
    <|> searchableEntryNameParser
    <|> searchableEntryDescriptionParser
    <|> searchableEntryIdParser
Constructing an Entry using Replace with entrySearchableInfoParser and entryAdd was simple too:
entryAdd :: Parser (Either String Entry)
entryAdd = ...
entryReplace :: Parser (Either String Entry)
entryReplace = liftA2 Replace <$> entrySearchableInfoParser <*> entryAdd
Now it works perfectly!
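For anyone who wants to try this in isolation, here is a minimal sketch of the same two tricks, assuming the optparse-applicative package. The types Searchable and Entry, the flags --tag, --id and --note, and the helper runPure are simplified stand-ins invented for illustration, not the real definitions from the question.

```haskell
import Control.Applicative (liftA2, (<|>))
import Options.Applicative

-- Hypothetical, simplified stand-ins for the question's types.
data Searchable = ByTag String | ById Int
  deriving (Eq, Show)

data Entry = Add String | Replace Searchable Entry
  deriving (Eq, Show)

-- Each primitive parser yields Either String so validation errors can be reported.
byTag :: Parser (Either String Searchable)
byTag = Right . ByTag <$> strOption (long "tag" <> metavar "TAG")

byId :: Parser (Either String Searchable)
byId = check <$> option auto (long "id" <> metavar "ID")
  where
    check n
      | n > 0 = Right (ById n)
      | otherwise = Left "id must be positive"

-- (<|>) tries the alternatives: the user supplies --tag or --id.
searchable :: Parser (Either String Searchable)
searchable = byTag <|> byId

entryAdd :: Parser (Either String Entry)
entryAdd = Right . Add <$> strOption (long "note" <> metavar "NOTE")

-- liftA2 lifts Replace over Either; (<$>) and (<*>) then lift it over Parser.
entryReplace :: Parser (Either String Entry)
entryReplace = liftA2 Replace <$> searchable <*> entryAdd

-- Pure runner for experiments; Nothing means the command line did not parse.
runPure :: [String] -> Maybe (Either String Entry)
runPure = getParseResult . execParserPure defaultPrefs (info entryReplace mempty)
```

With this sketch, runPure ["--tag", "x", "--note", "n"] should produce a Replace value, while a non-positive --id parses fine at the command-line level but surfaces as a Left from the validation step.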

Related

How to use the latest version of the Parsec.Indent library?

This may look like a duplicate of an earlier question; however, either Parsec or the Indent library has changed since 2012, and none of the old examples I have found for the Indent library compile with the latest versions.
I want to make a parser for a programming language where indentation is part of the syntax (used to indicate scopes). To achieve this I want to use the Text.Parsec.Indent library, but I am at a loss as to how to use it. It is clear to me that some modification or custom parser type is needed, but my limited knowledge of the State monad and surface-level understanding of Parsec do not seem to be enough.
Let's say you wanted to make a parser for a simple indented list like the one below. How would one achieve this?
mylist
    fstitem
    snditem
My attempts to create a simple parser based on some of the old examples floating around on the internet looked like this, but it obviously produces some type errors:
import Control.Monad.State
import Text.Parsec hiding (State)
import Text.Parsec.Indent
import Text.Parsec.Pos
type IParser a = ParsecT String () (State SourcePos) a
parseInt :: IParser Integer
parseInt = read <$> many1 digit
parseIndentedInt :: IParser Integer
parseIndentedInt = indented *> parseInt
specifically these:
Frontend/Parser.hs:14:20: error:
• Couldn't match type ‘Control.Monad.Trans.Reader.ReaderT
Text.Parsec.Indent.Internal.Indentation m0’
with ‘StateT SourcePos Data.Functor.Identity.Identity’
Expected type: IParser Integer
Actual type: ParsecT String () (IndentT m0) Integer
• In the expression: indented *> parseInt
In an equation for ‘parseIndentedInt’:
parseIndentedInt = indented *> parseInt
|
14 | parseIndentedInt = indented *> parseInt
| ^^^^^^^^^^^^^^^^^^^^
Frontend/Parser.hs:14:32: error:
• Couldn't match type ‘StateT
SourcePos Data.Functor.Identity.Identity’
with ‘Control.Monad.Trans.Reader.ReaderT
Text.Parsec.Indent.Internal.Indentation m0’
Expected type: ParsecT String () (IndentT m0) Integer
Actual type: IParser Integer
• In the second argument of ‘(*>)’, namely ‘parseInt’
In the expression: indented *> parseInt
In an equation for ‘parseIndentedInt’:
parseIndentedInt = indented *> parseInt
|
14 | parseIndentedInt = indented *> parseInt
| ^^^^^^^^
Failed, no modules loaded.
Okay, after some deep diving into the source code and looking at the tests in the indents GitHub repository, I managed to create a working example.
The following code can parse a simple indented list:
import Text.Parsec as Parsec
import Text.Parsec.Indent as Indent
data ExampleList = ExampleList String [ExampleList]
  deriving (Eq, Show)
plistItem :: Indent.IndentParser String () String
plistItem = Parsec.many1 Parsec.lower <* Parsec.spaces
pList :: Indent.IndentParser String () ExampleList
pList = Indent.withPos (ExampleList <$> plistItem <*> Parsec.many (Indent.indented *> pList))
-- Run a parser to the end of the input, raising an error on parse failure.
useParser :: Indent.IndentParser String () a -> String -> a
useParser p src = helper res
  where
    res = Indent.runIndent $ Parsec.runParserT (p <* Parsec.eof) () "<test>" src
    helper (Left err) = error ("Parse error: " ++ show err)
    helper (Right ok) = ok
Example usage:
*Main> useParser pList "mylist\n\tfstitem\n\tsnditem"
ExampleList "mylist" [ExampleList "fstitem" [],ExampleList "snditem" []]
Note that the useParser function also unwraps the result from Either and appends an end-of-file parser to the supplied parser. Depending on your application you might want to change this.
Additionally, the type signatures could be shortened with something like this:
type IParser a = Indent.IndentParser String () a
plistItem :: IParser String
pList :: IParser ExampleList
useParser :: IParser a -> String -> a

Parsing user options into custom data types with OptParse-Applicative

I'm trying to build a CLI food journal app.
This is the data type I want the user input to be parsed into:
data JournalCommand =
    JournalSearch Query DataTypes Ingridents BrandOwnder PageNumber
  | JournalReport Query DataTypes Ingridents BrandOwnder PageNumber ResultNumber
  | JournalDisplay FromDate ToDate ResultNumber
  | JournalStoreSearch Query DataTypes Ingridents BrandOwnder PageNumber ResultNumber StoreFlag
  | JournalStoreCustom CustomEntry OnDate StoreFlag
  | JournalDelete FromDate ToDate ResultNumber
  | JournalEdit CustomEntry ResultNumber
  deriving (Show, Eq)
Because there's a lot of overlap, I have a total of 8 functions of type Parser a,
functions like this one:
-- | Search Query
aQueryParser :: Parser String
aQueryParser = strOption
  ( long "search"
  <> short 's'
  <> help "Search for a term in the database"
  )
The idea is to ultimately have a function like this:
runJournal :: JournalCommand -> MT SomeError IO ()
runJournal = \case
  JournalSearch q d i b p
    -> runSearch q d i b p
  JournalReport q d i b p r
    -> runSearchAndReport q d i b p r
  ...
...
where MT is some monad transformer that can handle errors and IO. I'm not sure which one yet.
The question is: how do I set up the parseArgs function
parseArgs :: IO JournalCommand
parseArgs = execParser ...
and parser function
parser :: Parser JournalCommand
parser = ...
so that I can parse user input into a JournalCommand and then pass the data to the relevant functions.
I know I can build a record type like this
data JournalDisplay = JournalDisplay
  { jdFromDate :: UTCTime
  , jdToDate :: UTCTime
  , jdResultNumber :: Maybe Int
  }
as
JournalDisplay
  <$> fromDateParser
  <*> toDateParser
  <*> optional resultNumberParser
But I'm not sure how to go about doing that with my original data structure.
I think I need a list like [Mod CommandFields JournalCommand], which I may be able to pass to the subparser function after concatenating it. I'm not completely sure.
In optparse-applicative there's the Parser type, but also the ParserInfo type which represents a "completed" parser holding extra information like header, footer, description, etc... and which is ready to be run with execParser.
We go from Parser to ParserInfo by way of the info function which adds the extra information as modifiers.
Now, when writing a parser with subcommands, each subcommand must have its own ParserInfo value (implying that it can have its own local help and description).
We pass each of these ParserInfo values to the command function (along with the name we want the subcommand to have) and then we combine the [Mod CommandFields JournalCommand] list using mconcat and pass the result to subparser. This will give us the top-level Parser. We need to use info again to provide the top-level description and get the final ParserInfo.
An example that uses a simplified version of your type:
import Options.Applicative (command, help, metavar, strArgument)
import qualified Options.Applicative as O
data JournalCommand =
    JournalSearch String String
  | JournalReport String
  deriving (Show, Eq)
journalParserInfo :: O.ParserInfo JournalCommand
journalParserInfo =
  let -- one ParserInfo per subcommand, each with its own description
      searchParserInfo :: O.ParserInfo JournalCommand
      searchParserInfo =
        O.info
          (JournalSearch
            <$> strArgument (metavar "ARG1" <> help "This is arg 1")
            <*> strArgument (metavar "ARG2" <> help "This is arg 2"))
          (O.fullDesc <> O.progDesc "desc 1")
      reportParserInfo :: O.ParserInfo JournalCommand
      reportParserInfo =
        O.info
          (JournalReport
            <$> strArgument (metavar "ARG3" <> help "This is arg 3"))
          (O.fullDesc <> O.progDesc "desc 2")
      -- combine the subcommands into the top-level Parser
      toplevel :: O.Parser JournalCommand
      toplevel = O.subparser (mconcat [
        command "search" searchParserInfo,
        command "journal" reportParserInfo
        ])
  in O.info toplevel (O.fullDesc <> O.progDesc "toplevel desc")
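To connect this back to the parseArgs function from the question, here is a rough sketch of how the pieces might fit together, reusing option-style parsers like aQueryParser. The trimmed-down JournalCommand, the --page, --from and --to flags, and the parsePure helper are assumptions made for illustration, and dates are kept as plain strings to avoid inventing a date format.

```haskell
import Control.Applicative (optional)
import Options.Applicative

-- Hypothetical, trimmed-down command type; dates are plain strings here.
data JournalCommand
  = JournalSearch String (Maybe Int)   -- query, optional page number
  | JournalDisplay String String       -- from date, to date
  deriving (Show, Eq)

queryParser :: Parser String
queryParser = strOption
  (long "search" <> short 's' <> help "Search for a term in the database")

pageParser :: Parser Int
pageParser = option auto (long "page" <> metavar "N" <> help "Page number")

dateParser :: String -> Parser String
dateParser flagName = strOption (long flagName <> metavar "DATE")

-- Each subcommand wraps its own Parser in a ParserInfo via info.
commands :: Parser JournalCommand
commands = subparser
  (  command "search"
       (info (JournalSearch <$> queryParser <*> optional pageParser)
             (progDesc "Search the database"))
  <> command "display"
       (info (JournalDisplay <$> dateParser "from" <*> dateParser "to")
             (progDesc "Display stored entries"))
  )

-- Top-level ParserInfo; helper adds the usual --help flag.
parserInfo :: ParserInfo JournalCommand
parserInfo = info (commands <**> helper) (fullDesc <> progDesc "Food journal")

-- Reads the real command line; prints usage and exits on failure.
parseArgs :: IO JournalCommand
parseArgs = execParser parserInfo

-- Pure variant that is convenient for tests; Nothing means a parse failure.
parsePure :: [String] -> Maybe JournalCommand
parsePure = getParseResult . execParserPure defaultPrefs parserInfo
```

The result of parseArgs can then be handed straight to a function like runJournal, which pattern matches on the constructor to pick the right action.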

Avoid repeating code when reusing for multiple value constructor pattern matches

Suppose you have a data structure with multiple value constructors, for example a LogMessage data structure like so:
data LogMessage = Unknown String
                | LogMessage MessageType TimeStamp String
If the message can be parsed properly, it has some extra data and then a String. If it can't be parsed, then it's just a catch-all Unknown String.
Or suppose you are working with something like an Either String String, so that you might be dealing with a Left String or a Right String.
Now let's say that you want to apply the same processing steps to the underlying data, regardless of which value constructor it resides in.
For example, I might want to detect a certain word within the LogMessage strings, so I could have a function like this:
detectWord :: String -> LogMessage -> Bool
detectWord s (Unknown m) = isInfixOf s (map toLower m)
detectWord s (LogMessage _ _ m) = isInfixOf s (map toLower m)
or it could just as easily be written to handle Either String String as input instead of LogMessage.
In both cases, I have to repeat the exact same code (the isInfixOf ... part), because I have to extract the underlying data it will operate on differently due to pattern matching on different value constructors.
It is bad that one must repeat / "copy-paste" the code for each different value constructor match.
How does one write these kinds of Haskell functions without copy/paste code? How can I write the underlying logic just once, but then explain how it should be used across many different value constructor patterns?
Simply moving it to a secondary helper function would reduce the character count, but doesn't really solve the issue. For example, the idea below is not substantively any better about "don't repeat yourself" than the first case:
helper :: String -> String -> Bool
helper s m = isInfixOf s (map toLower m)
detectWord :: String -> LogMessage -> Bool
detectWord s (Unknown m) = helper s m
detectWord s (LogMessage _ _ m) = helper s m
There again we have to say the same thing for every different pattern.
Write a function that gets the message in either case. Then, you won't need to write separate cases for uses that don't care:
getMsg (Unknown m) = m
getMsg (LogMessage _ _ m) = m
detectWord s log = isInfixOf s (map toLower (getMsg log))
Note that something is going to have to check the cases of your type, and getMsg is about as minimal as it gets along those lines.
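Here is a self-contained sketch of that approach, including the Either String String case from the question. The Int fields are placeholders for MessageType and TimeStamp, and the names containsWord and detectWordEither are made up for this example.

```haskell
import Data.Char (toLower)
import Data.List (isInfixOf)

-- Int fields are placeholders for the question's MessageType and TimeStamp.
data LogMessage
  = Unknown String
  | LogMessage Int Int String
  deriving (Show)

-- The only place that has to know about the constructors.
getMsg :: LogMessage -> String
getMsg (Unknown m) = m
getMsg (LogMessage _ _ m) = m

-- The shared logic, written once against a plain String.
containsWord :: String -> String -> Bool
containsWord needle hay = needle `isInfixOf` map toLower hay

detectWord :: String -> LogMessage -> Bool
detectWord needle = containsWord needle . getMsg

-- For Either String String, `either id id` plays the role of getMsg.
detectWordEither :: String -> Either String String -> Bool
detectWordEither needle = containsWord needle . either id id
```

As in the question's version, only the message is lower-cased, so the search word is expected to be lower case already.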
Simple, won't make people hate you
Try using view patterns.
{-# LANGUAGE ViewPatterns #-}
data LogMessage = Unknown String
                | LogMessage MessageType TimeStamp String
stringOfLogMessage :: LogMessage -> String
stringOfLogMessage (Unknown s) = s
stringOfLogMessage (LogMessage _ _ s) = s
detectWord :: String -> LogMessage -> Bool
detectWord needle (stringOfLogMessage -> hay) =
  needle `isInfixOf` map toLower hay
Complicated, might make people hate you
Use Generics and Generics.Deriving.Lens.
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE NoImplicitPrelude #-}
module Lib where
import BasePrelude
import Control.Lens
import Generics.Deriving.Lens
data LogMessage = Unknown String
                | LogMessage () () String
  deriving (Generic)
detectWord :: String -> LogMessage -> Bool
detectWord needle =
  allOf tinplate (isInfixOf needle . map toLower)

Attoparsec optional parser with Maybe result

I have an Attoparsec parser like this:
myParser :: Parser Text
myParser = char '"' *> takeWhile (not . isSpace) <* char '"'
I want to make this parser optional so that I get a function that returns Just txt if the parser matches and Nothing otherwise, i.e. a function with the signature:
myMaybeParser :: Parser (Maybe Text)
How can I do this?
You can use option and the Applicative instance of Parser for this:
-- Make a parser optional, return Nothing if there is no match
maybeOption :: Parser a -> Parser (Maybe a)
maybeOption p = option Nothing (Just <$> p)
You can then use it like this:
myMaybeParser = maybeOption myParser
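A small runnable sketch, assuming the attoparsec and text packages. Since Parser is an Alternative, optional from Control.Applicative should give the same behaviour as maybeOption. Note that the predicate in this sketch also stops at the closing quote, because takeWhile (not . isSpace) on its own would consume that quote as well; the names quoted, viaOption and viaOptional are invented for the example.

```haskell
import Control.Applicative (optional)
import Data.Attoparsec.Text (Parser, char, option, parseOnly)
import qualified Data.Attoparsec.Text as A
import Data.Char (isSpace)
import Data.Text (Text)
import qualified Data.Text as T

-- A quoted word: no whitespace and no embedded double quote.
quoted :: Parser Text
quoted = char '"' *> A.takeWhile (\c -> c /= '"' && not (isSpace c)) <* char '"'

-- The helper from the answer above.
maybeOption :: Parser a -> Parser (Maybe a)
maybeOption p = option Nothing (Just <$> p)

-- String-based wrappers so the sketch is easy to try in GHCi.
viaOption :: String -> Either String (Maybe String)
viaOption = fmap (fmap T.unpack) . parseOnly (maybeOption quoted) . T.pack

-- Same behaviour using the generic Alternative combinator.
viaOptional :: String -> Either String (Maybe String)
viaOptional = fmap (fmap T.unpack) . parseOnly (optional quoted) . T.pack
```

Both wrappers return Right Nothing rather than a parse error when the input does not start with a quoted word, because attoparsec backtracks when the inner parser fails.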

Yet Another Haskell Rigid Type Variable Error

I've investigated many answers to other rigid type variable error questions; but, alas, none of them, to my knowledge, apply to my case. So I'll ask yet another question.
Here's the relevant code:
module MultipartMIMEParser where
import Control.Applicative ((<$>), (<*>), (<*))
import Text.ParserCombinators.Parsec hiding (Line)
data Header = Header { hName :: String
                     , hValue :: String
                     , hAddl :: [(String,String)] } deriving (Eq, Show)
data Content a = Content a | Posts [Post a] deriving (Eq, Show)
data Post a = Post { pHeaders :: [Header]
                   , pContent :: [Content a] } deriving (Eq, Show)
post :: Parser (Post a)
post = do
  hs <- headers
  c <- case boundary hs of
         "" -> content >>= \s->return [s]
         b -> newline >> (string b) >> newline >>
              manyTill content (string b)
  return $ Post { pHeaders=hs, pContent=c }
boundary hs = case lookup "boundary" $ concatMap hAddl hs of
                Just b -> "--" ++ b
                Nothing -> ""
-- TODO: lookup "boundary" needs to be case-insensitive.
content :: Parser (Content a)
content = do
  xs <- manyTill line blankField
  return $ Content $ unlines xs -- N.b. This is the line the error message refers to.
  where line = manyTill anyChar newline
headers :: Parser [Header]
headers = manyTill header blankField
blankField = newline
header :: Parser Header
header =
  Header <$> fieldName <* string ":"
         <*> fieldValue <* optional (try newline)
         <*> nameValuePairs
  where fieldName = many $ noneOf ":"
        fieldValue = spaces >> many (noneOf "\r\n;")
        nameValuePairs = option [] $ many nameValuePair
nameValuePair :: Parser (String,String)
nameValuePair = do
  try $ do n <- name
           v <- value
           return $ (n,v)
name :: Parser String
name = string ";" >> spaces >> many (noneOf "=")
value :: Parser String
value = string "=" >> between quote quote (many (noneOf "\r\n;\""))
  where quote = string "\""
And the error message:
Couldn't match type `a' with `String'
`a' is a rigid type variable bound by
the type signature for content :: Parser (Content a)
at MultipartMIMEParser.hs:(See comment in code.)
Expected type: Text.Parsec.Prim.ParsecT
String () Data.Functor.Identity.Identity (Content a)
Actual type: Text.Parsec.Prim.ParsecT
String () Data.Functor.Identity.Identity (Content String)
Relevant bindings include
content :: Parser (Content a)
(bound at MultipartMIMEParser.hs:72:1)
In a stmt of a 'do' block: return $ Content $ unlines xs
In the expression:
do { xs <- manyTill line blankField;
return $ Content $ unlines xs }
In an equation for `content':
content
= do { xs <- manyTill line blankField;
return $ Content $ unlines xs }
where
line = manyTill anyChar newline
From what I've seen, the problem is that I'm explicitly returning a String using unlines xs, and that breaks the generic nature of a in the type signature. Am I close to understanding?
I've declared Content to be generic because, presumably, this parser might eventually be used on types other than String. Perhaps I'm abstracting prematurely. I did try removing all my as, but I started getting many more compile errors. I think I'd like to stick with the generic approach, if that's reasonable at this point.
Is it clear from the code what I'm trying to do? If so, any suggestions on how to do it best?
You're telling the compiler that content has type Parser (Content a), but the line causing the error is
return $ Content $ unlines xs
Since unlines returns a String, and the Content constructor has type a -> Content a, here you would have String ~ a, so the value Content $ unlines xs has type Content String. If you change the type signature of content to Parser (Content String) then it should compile.
Regarding this part of your question: "I've declared Content to be generic because, presumably, this parser might eventually be used on types other than String. Perhaps I'm abstracting prematurely. I did try removing all my as, but I started getting many more compile errors. I think I'd like to stick with the generic approach, if that's reasonable at this point."
It's fine to declare Content to be generic, and in many cases it is exactly the right way to solve the problem. The issue is that while your container is generic, whenever you fill it with something concrete, the type variables also become concrete. In particular, for a generic type such as data Container a = Container a:
> :t Container (1 :: Int)
Container 1 :: Container Int
> :t Container "test"
Container "test" :: Container String
> :t Container (Container "test")
Container (Container "test") :: Container (Container String)
Notice how all of these have their types inferred without any type variables left. You can use the container to hold whatever you want, you just have to make sure that you're accurately telling the compiler what it is.
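To make the difference concrete, here is a small sketch without any parser machinery. A signature ending in Content a promises a result for whatever a the caller picks, so a body that always builds a String cannot satisfy it. If you do want to stay generic, one option (a suggestion, not something from your code) is to let the caller supply the conversion. Content is cut down to a single constructor here, and contentString and contentWith are hypothetical names.

```haskell
-- Simplified to one constructor for illustration.
newtype Content a = Content a
  deriving (Eq, Show)

-- Concrete: the body produces a String, so the signature says Content String.
contentString :: [String] -> Content String
contentString = Content . unlines

-- Generic: the caller decides what `a` is by passing a conversion function.
contentWith :: (String -> a) -> [String] -> Content a
contentWith convert = Content . convert . unlines

-- Writing `[String] -> Content a` with the body `Content . unlines` is what
-- triggers the rigid type variable error: `a` would have to equal String.
```

The same idea would carry over to the parser: either content :: Parser (Content String), or a parser that takes a String -> a argument and applies it before wrapping the result in Content.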

Resources