Creating a simple cli argument parser via pattern matching - haskell

I am working on a little project in Haskell and currently I am trying to make it more extensible by adding the possibility of using optional cli flags.
The desired interface would look like this:
program: [--command1 {n} ] [--help]
with command1 and help being optional arguments, and furthermore, command1 can also take an integer argument n.
Using pattern matching, I came up with the following result:
main :: IO ()
main = do
    args <- getArgs
    case args of
        [] -> command1'
        ["--command1", a] -> if isJust (readMaybe a :: Maybe Int)
                                 then command1 a -- handle proper integer
                                 else -- call command1' but also show a warning message
        ["--command1"] -> command1'
        ["--help"] -> putStrLn showWarningMessage
        _ -> putStrLn (showErrorMessage args) -- show an error message
As the program does not rely on a certain parameter to be present, it should be able to run even if the cli arguments aren't valid.
I am unsure how to handle the following situations:
1. How do I handle the case where a is not an integer, so that we warn the user but still call command1'?
2. What should happen when wrong arguments are added? The pattern match is exhaustive, so an argument such as --command2 is caught by the fallback case, but what if args has the form args=[--command2, --command1, n] or args=[--command1, --command2]? In both of these cases a warning would ideally be printed saying that command2 is unknown, but the cli should still accept command1.
Note: I saw that there are libraries specifically designed to handle these issues but I am interested in solving these simple cases in order to learn Haskell better. Of course, there could be more edge cases but for my usage, the two mentioned are my only concern.

Not an expert, but here are my answers:
1. What would work for you is a simple do block in the branch that handles the invalid value, together with a default value for the call to command1, since I assume your function needs an integer to do something. That would end up looking something like this:
main :: IO ()
main = do
    args <- getArgs
    case args of
        [] -> command1'
        ["--command1", a] -> case readMaybe a :: Maybe Int of
            Just n -> command1 n
            -- the "else" case: here what I think you want to do
            Nothing -> do
                -- here your warning message
                putStrLn (showWarningMessage args)
                -- here let's say 10 is the default value
                command1 10
        ["--command1"] -> command1'
        ["--help"] -> putStrLn showWarningMessage
        _ -> putStrLn (showErrorMessage args) -- show an error message
I found that in http://www.happylearnhaskelltutorial.com/1/times_table_train_of_terror.html#s18.4; the tutorial is worth reading from the start, as it explains the concepts of Haskell well.
2. I understand what you are trying to do, but isn't the purpose of a command-line tool to fail fast? Keep in mind that you should focus on what the default values are and, from there, on what the default behaviour of your tool should be. Maybe you can then think of a different design for your params. This is what I learned from a few months of Haskell, coming from imperative languages myself.
Anyway, I have some hints about how to do this. What if you add a command3? Describing all the possible use cases and managing everything by hand can easily lead to oversights and bugs. This can be avoided without an external library by using System.Console.GetOpt, the option parser that ships with base. Even if the documentation is a bit old, it explains some concepts about flags and parsing.
I also found this post on Stack Overflow, which shows a working example with GetOpt: https://stackoverflow.com/a/10816382/6270743. Based on that, I was not able to show an error message AND execute the command, which is what you are trying to do.
But the code, even if not perfect, handles every combination of flags/params and catches all the errors with 4 different params. Here it is with some comments:
import Control.Monad (foldM, when)
import System.Console.GetOpt
import System.Environment (getArgs, getProgName)

data Options = Options
    { optVerbose        :: Bool
    , optShowVersion    :: Bool
    , optNbOfIterations :: Int
    , optSomeString     :: String
    } deriving Show

defaultOptions :: Options
defaultOptions = Options
    { optVerbose        = False
    , optShowVersion    = False
    , optNbOfIterations = 1
    , optSomeString     = "a random string"
    }

options :: [OptDescr (Options -> Either String Options)]
options =
    [ Option ['v'] ["verbose"]
        (NoArg (\opts -> Right opts { optVerbose = True }))
        "chatty output on stderr"
    , Option ['V','?'] ["version"]
        (NoArg (\opts -> Right opts { optShowVersion = True })) -- here no need for a value for this option because of `NoArg`
        "show version number"
    , Option ['i'] ["iterations"]
        (ReqArg (\i opts -> -- here the value of the param is required `ReqArg`, but you can also have `OptArg` which makes the value optional
            case reads i of
                [(iterations, "")] | iterations >= 1 && iterations <= 20 -> Right opts { optNbOfIterations = iterations } -- here your conditions
                _ -> Left "--iterations must be a number between 1 and 20" -- error message in case previous conditions are not met
            ) "ITERATIONS") -- here the name of the param in the cli help
        "Number of times you want to repeat the reversed string" -- here the description of param's usage in the cli help
    , Option ['s'] ["string"]
        (ReqArg (\arg opts -> Right opts { optSomeString = arg }) "STRING") -- the argument is already a String, so it can be stored directly
        "Outputs the string reversed"
    ]

parseArgs :: IO Options
parseArgs = do
    argv <- getArgs
    progName <- getProgName
    let header = "Usage: " ++ progName ++ " [OPTION...]"
    let helpMessage = usageInfo header options
    case getOpt RequireOrder options argv of
        (opts, [], []) ->
            case foldM (flip id) defaultOptions opts of
                Right opts' -> return opts'
                Left errorMessage -> ioError (userError (errorMessage ++ "\n" ++ helpMessage))
        (_, _, errs) -> ioError (userError (concat errs ++ helpMessage))

main :: IO ()
main = do
    opts <- parseArgs
    when (isVerbose opts) $
        mapM_ putStrLn ["verbose mode activated !","You will have a lot of logs","With a lot of messages","And so on ..."]
    mapM_ print (repeatReversedString opts)
    when (getVersion opts) $
        print "version : 0.1.0-prealpha"

-- With those functions we extract the option setting from the data type Options (and maybe we play with it)
isVerbose :: Options -> Bool
isVerbose (Options optVerbose _ _ _) = optVerbose

getVersion :: Options -> Bool
getVersion (Options _ optShowVersion _ _) = optShowVersion

repeatReversedString :: Options -> [String]
repeatReversedString (Options _ _ optNbOfIterations optSomeString) =
    replicate optNbOfIterations (reverse optSomeString)

getString :: Options -> String
getString (Options _ _ _ optSomeString) = optSomeString
I think you can achieve your goal by using OptArg (instead of the NoArg and ReqArg I used), which has this definition:
OptArg (Maybe String -> a) String
Here is the link on Hackage:
https://hackage.haskell.org/package/base-4.14.1.0/docs/System-Console-GetOpt.html
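A minimal sketch of the OptArg idea, assuming a hypothetical Flag type and a --command1 option whose argument is optional (these names are not from the original code). getOpt returns the recognised flags and the error messages side by side, so the caller could print the errors as warnings and still act on the flags. Note that with OptArg the value has to be attached as --command1=5; a separate word after --command1 is treated as a non-option.

```haskell
import System.Console.GetOpt

-- Hypothetical flag type for this sketch.
data Flag
  = Command1 (Maybe String) -- Nothing when --command1 is given without a value
  | Help
  deriving (Eq, Show)

flagDescrs :: [OptDescr Flag]
flagDescrs =
  [ Option []    ["command1"] (OptArg Command1 "N") "run command1, optionally with an integer N"
  , Option ['h'] ["help"]     (NoArg Help)          "show this help"
  ]

-- Returns the recognised flags together with the error messages that
-- getOpt produced (for example for unrecognised options).
parseFlags :: [String] -> ([Flag], [String])
parseFlags argv =
  let (flags, _nonOptions, errs) = getOpt Permute flagDescrs argv
  in (flags, errs)
```

Loaded in GHCi, parseFlags ["--command2", "--command1=5"] keeps the Command1 flag and reports --command2 in the second component, which can be printed before the command is run.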
Remember that Haskell is very different from other common languages: for example, you have no for loops!
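For the second situation in the question (warn about an unknown flag but still accept --command1), a library-free sketch is to walk the argument list recursively and collect warnings next to the parsed settings. The Settings record, its field names, and the warning texts are assumptions made for this illustration.

```haskell
import Data.List (isPrefixOf)
import Text.Read (readMaybe)

-- Hypothetical settings record; the field names are assumptions.
data Settings = Settings
  { command1Arg :: Maybe Int -- Just n when "--command1 n" carried a valid integer
  , wantHelp    :: Bool
  } deriving (Eq, Show)

defaultSettings :: Settings
defaultSettings = Settings { command1Arg = Nothing, wantHelp = False }

-- Walk the arguments once, returning the settings and any warnings.
parseArgs :: [String] -> (Settings, [String])
parseArgs = go defaultSettings
  where
    go s [] = (s, [])
    go s ("--help" : rest) = go s { wantHelp = True } rest
    go s ("--command1" : a : rest)
      | Just n <- readMaybe a = go s { command1Arg = Just n } rest
      | not (isFlag a) =
          warn ("--command1: " ++ show a ++ " is not an integer, using the default") (go s rest)
    -- Reached when --command1 is last or is followed by another flag.
    go s ("--command1" : rest) = go s rest
    go s (other : rest) = warn ("unknown argument " ++ show other) (go s rest)

    isFlag a = "--" `isPrefixOf` a
    warn w (s, ws) = (s, w : ws)
```

A main built on this could print every warning with putStrLn and then run command1 with the parsed value, or command1' when command1Arg is Nothing.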

Related

How to correctly parse arguments with Haskell?

I'm trying to learn how to work with IO in Haskell by writing a function that, if there is a flag, will take a list of points from a file, and if there is no flag, it asks the user to enter them.
dispatch :: [String] -> IO ()
dispatch argList = do
if "file" `elem` argList
then do
let (path : otherArgs) = argList
points <- getPointsFile path
else
print "Enter a point in the format: x;y"
input <- getLine
if (input == "exit")
then do
print "The user inputted list:"
print $ reverse xs
else (inputStrings (input:xs))
if "help" `elem` argList
then help
else return ()
dispatch [] = return ()
dispatch _ = error "Error: invalid args"
getPointsFile :: String -> IO ([(Double, Double)])
getPointsFile path = do
handle <- openFile path ReadMode
contents <- hGetContents handle
let points_str = lines contents
let points = foldl (\l d -> l ++ [tuplify2 $ splitOn ";" d]) [] points_str
hClose handle
return points
I get this error: "do-notation in pattern. Possibly caused by a missing 'do'?" after if "file" `elem` argList.
I'm also worried about the binding issue: assume I have another flag that says which method will be used to process the points. Obviously it expects points, but I don't know how to make points visible outside the if then else construct. In imperative languages I would write something like:
init points
if ... { points = a}
else points = b
some actions with points
How can I do something similar in Haskell?
Here's a fairly minimal example that I've done half a dozen times when I'm writing something quick and dirty, don't have a complicated argument structure, and so can't be bothered to do a proper job of setting up one of the usual command-line parsing libraries. It doesn't explain what went wrong with your approach -- there's an existing good answer there -- it's just an attempt to show what this kind of thing looks like when done idiomatically.
import System.Environment
import System.Exit
import System.IO

main :: IO ()
main = do
    args <- getArgs
    pts <- case args of
        ["--help"] -> usage stdout ExitSuccess
        ["--file", f] -> getPointsFile f
        [] -> getPointsNoFile
        _ -> usage stderr (ExitFailure 1)
    print (frobnicate pts)

usage :: Handle -> ExitCode -> IO a
usage h c = do
    nm <- getProgName
    hPutStrLn h $ "Usage: " ++ nm ++ " [--file FILE]"
    hPutStrLn h $ "Frobnicate the points in FILE, or from stdin if no file is supplied."
    exitWith c

getPointsFile :: FilePath -> IO [(Double, Double)]
getPointsFile = {- ... -}

getPointsNoFile :: IO [(Double, Double)]
getPointsNoFile = {- ... -}

frobnicate :: [(Double, Double)] -> Double
frobnicate = {- ... -}
if in Haskell doesn't inherently have anything to do with control flow, it just switches between expressions. Which, in Haskell, happen to include do blocks of statements (if we want to call them that), but you still always need to make that explicit, i.e. you need to say both then do and else do if there are multiple statements in each branch.
Also, all the statements in a do block need to be indented to the same level. So in your case
if "file" `elem` argList
...
if "help" `elem` argList
Or alternatively, if the help check should only happen in the else branch, it needs to be indented to the level of the statements in that do block.
Independent of all that, I would recommend avoiding parsing anything in an IO context. It is usually much less hassle, and easier to test, to first parse the strings into a pure data structure, which can then easily be processed by the part of the code that does IO. There are libraries like cmdargs and optparse-applicative that help with the parsing part.
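As a rough sketch of that separation, the arguments can first be turned into a plain value by a pure function, and only the resulting value is handed to the IO code. The Command type and its constructor names are hypothetical and only mirror the flags discussed in this thread.

```haskell
-- Hypothetical description of what the program was asked to do.
data Command
  = ShowHelp
  | FromFile FilePath
  | FromStdin
  deriving (Eq, Show)

-- Pure parsing step: no IO involved, so it can be tested directly.
parseCommand :: [String] -> Either String Command
parseCommand ["--help"]       = Right ShowHelp
parseCommand ["--file", path] = Right (FromFile path)
parseCommand []               = Right FromStdin
parseCommand other            = Left ("invalid arguments: " ++ unwords other)
```

The IO part then reduces to something like getArgs, a call to parseCommand, and one case expression over the result.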

How Haskell's System.Console.GetOpt ReqArg Takes a 2 arity function as its constructors first argument?

I am still quite new to Haskell so forgive me if this is completely obvious and I am just not understanding correctly.
On Hackage the documentation says that System.Console.GetOpt ReqArg takes a function of arity 1 e.g (String -> a) as the first argument to its constructor.
ReqArg (String -> a) String
In many of the examples that I have seen a 2 arity function is passed to this constructor.
Example from (https://wiki.haskell.org/High-level_option_handling_with_GetOpt):
data Options = Options { optVerbose :: Bool
                       , optInput   :: IO String
                       , optOutput  :: String -> IO ()
                       }

startOptions :: Options
startOptions = Options { optVerbose = False
                       , optInput   = getContents
                       , optOutput  = putStr
                       }

options :: [ OptDescr (Options -> IO Options) ]
options =
    [ Option "i" ["input"]
        (ReqArg
            (\arg opt -> return opt { optInput = readFile arg })
            "FILE")
        "Input file"
    , Option "o" ["output"]
        (ReqArg
            (\arg opt -> return opt { optOutput = writeFile arg })
            "FILE")
        "Output file"
    , Option "s" ["string"]
        (ReqArg
            (\arg opt -> return opt { optInput = return arg })
            "FILE")
        "Input string"
    , Option "v" ["verbose"]
        (NoArg
            (\opt -> return opt { optVerbose = True }))
        "Enable verbose messages"
    , Option "V" ["version"]
        (NoArg
            (\_ -> do
                hPutStrLn stderr "Version 0.01"
                exitWith ExitSuccess))
        "Print version"
    , Option "h" ["help"]
        (NoArg
            (\_ -> do
                prg <- getProgName
                hPutStrLn stderr (usageInfo prg options)
                exitWith ExitSuccess))
        "Show help"
    ]
So my question is do value constructors not really enforce the type when a function is used in its arguments or is there something else I am missing?
Update:
This is making more sense to me now. I believe there were a couple of factors that I was overlooking. First, as @CommuSoft mentioned, all functions really have arity one in Haskell due to currying. Second, I didn't look closely enough at options, which is not a function but a value of type:
[ OptDescr (Options -> IO Options) ]
This type signature of options determines what the type variable of ReqArg is, as well as that of the other data constructors NoArg and OptArg (the latter not used in the example).
The single arity anonymous function passed to the NoArg ArgDescr constructor will essentially just be:
(Options -> IO Options)
E.g. it will receive the Options record.
Whereas the 2 arity anonymous function passed to the ReqArg constructor will be:
(String -> Options -> IO Options)
And it will receive a string (the value someone entered at the command line) and the Options instance record.
Thanks to all for helping me think this through!
The -> you see in type signatures is, actually, a type too. And because of this, type variable a can be a function b -> c. In your example it is Options -> IO Options.
ReqArg is not a function: it is a constructor. Now constructors are evidently functions as well. The signature of ReqArg is:
ReqArg :: (String -> a) -> String -> ArgDescr a
So your constructor returns an ArgDescr a.
Now a second aspect you have to notice is that a is in this case instantiated to Options -> IO Options, so the signature of your ReqArg constructor specializes to:
ReqArg :: (String -> (Options -> IO Options)) -> String -> ArgDescr (Options -> IO Options)
or less noisy:
ReqArg :: (String -> Options -> IO Options) -> String -> ArgDescr (Options -> IO Options)
(brackets removed)
So it is a function with "arity" 2 (note that strictly speaking in functional programming every function has, at least conceptually, arity 1). The point is that you generate a new function out of the first argument. But Haskell's syntactic sugar lets you "define two arguments" at once.
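To make the currying point concrete, here is a small sketch. It uses a cut-down, hypothetical Options record with a single field and the simpler result type Options -> Options (rather than Options -> IO Options) so that the two spellings can be compared directly.

```haskell
import System.Console.GetOpt (ArgDescr (ReqArg))

-- A stand-in for the Options record from the question, reduced to one field.
newtype Options = Options { optInputName :: String }
  deriving (Eq, Show)

-- Written as if it took two arguments...
setInput :: String -> Options -> Options
setInput arg opt = opt { optInputName = arg }

-- ...and the same function as a one-argument function returning a function.
setInput' :: String -> (Options -> Options)
setInput' arg = \opt -> opt { optInputName = arg }

-- Either spelling fits ReqArg, whose type variable a is Options -> Options here.
inputDescr :: ArgDescr (Options -> Options)
inputDescr = ReqArg setInput "FILE"
```

Both definitions have exactly the same type, because String -> Options -> Options is just String -> (Options -> Options) with the brackets dropped.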
Explaining the documentation
That's the reason why, in the documentation example, you need to use foldl:
return (foldl (flip id) defaultOptions o, n)
Note that this does not match your (Options -> IO Options): the documentation example uses Options -> Options.
The point is that, in the documentation, each command-line option is processed individually. You start with defaultOptions, and applying each option function from o (n holds the non-option arguments) yields a new Options value that is used as input for processing the next one. After the whole chain has been processed, you return the final Options data.
For your program
You make things a bit harder by using the IO monad: it would perhaps be better to store a Boolean recording whether the version has to be printed and, if so, to print it in main or somewhere else. Nevertheless, you can achieve the same using foldlM instead of foldl.
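A minimal sketch of the monadic fold, using foldM from Control.Monad (for lists it behaves like foldlM from Data.Foldable). The record is a reduced, hypothetical version of the one in the question: the optName field and the --name option are assumptions, and errors reported by getOpt are ignored here for brevity.

```haskell
import Control.Monad (foldM)
import System.Console.GetOpt

data Options = Options
  { optVerbose :: Bool
  , optName    :: String -- hypothetical field for this sketch
  } deriving (Eq, Show)

startOptions :: Options
startOptions = Options { optVerbose = False, optName = "" }

optionDescrs :: [OptDescr (Options -> IO Options)]
optionDescrs =
  [ Option "v" ["verbose"]
      (NoArg (\opt -> return opt { optVerbose = True }))
      "Enable verbose messages"
  , Option "n" ["name"]
      (ReqArg (\arg opt -> return opt { optName = arg }) "NAME")
      "Name to use"
  ]

-- Thread the Options value through every option action, left to right.
applyOptions :: [String] -> IO Options
applyOptions argv = do
  let (actions, _nonOptions, _errors) = getOpt RequireOrder optionDescrs argv
  foldM (flip id) startOptions actions
```

Here flip id takes the accumulated Options and the next action and simply applies the action to it, which is the IO counterpart of foldl (flip id) in the documentation example.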

Using content of a string to call function with same name

I have a main like the following:
main :: IO ()
main = do
    args <- getArgs
    putStrLn $ functionName args
  where
    functionName args = "problem" ++ (filter (/= '"') $ show (args!!0))
Instead of printing the name to stdout like I do right now, I want to call the function.
I am aware that I could use hint (as mentioned in Haskell: how to evaluate a String like "1+2"), but I think that would be overkill for just getting that simple function name.
At the current stage it does not matter if the program crashes when the function does not exist!
Without taking special measures to preserve them, the names of functions will likely be gone completely in a compiled Haskell program.
I would suggest just making a big top-level map:
import Data.Map ( Map )
import qualified Data.Map as Map
import System.Environment ( getArgs )

functions :: Map String (IO ())
functions = Map.fromList [("problem1", problem1), ...]

call :: String -> IO ()
call name =
    case Map.lookup name functions of
        Nothing -> fail $ name ++ " not found"
        Just m -> m

main :: IO ()
main = do
    args <- getArgs
    call $ functionName args
  where
    functionName args = "problem" ++ (filter (/= '"') $ show (args!!0))
If you're going to do this, you have a few approaches, but the easiest by far is to just pattern match on the name.
This method requires that all of the functions you want to call have the same type signature:
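problem1 :: Int
problem1 = 1

problem2 :: Int
problem2 = 2

runFunc :: String -> Maybe Int
runFunc "problem1" = Just problem1
runFunc "problem2" = Just problem2
runFunc _ = Nothing

main :: IO ()
main = do
    args <- getArgs
    -- runFunc returns a Maybe Int, so print it rather than putStrLn it
    print $ runFunc $ functionName args
  where
    functionName args = "problem" ++ (filter (/= '"') $ show (args!!0))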
This requires you to add a line to runFunc each time you add a new problemN, but that's pretty manageable.
You can't get a string representation of an identifier, not without fancy non-standard features, because that information isn't retained after compilation. As such, you're going to have to write down those function names as string constants somewhere.
If the function definitions are all in one file anyway, what I would suggest is to use data types and lambdas to avoid having to duplicate those function names altogether:
import Data.List (find)

data Problem = Problem {
    problemName :: String,
    evalProblem :: IO () -- Or whatever your problem function signatures are
}

problems :: [Problem]
problems = [
    Problem {
        problemName = "problem1",
        evalProblem = do ... -- Insert code here
    },
    Problem {
        problemName = "problem2",
        evalProblem = do ... -- Insert code here
    }
    ]

main :: IO ()
main = do
    args <- getArgs
    case find (\x -> problemName x == (args!!0)) problems of
        Just x -> evalProblem x
        Nothing -> -- Handle error
Edit: Just to clarify, I'd say the important takeaway here is that you have an XY Problem.

Dispatching to correct function with command line arguments in Haskell

I'm writing a little command-line program in Haskell. I need it to dispatch to the correct encryption function based on the command line arguments. I've gotten that far, but then I need the remaining arguments to get passed to the function as parameters. I've read:
http://learnyouahaskell.com/input-and-output
That's gotten me this far:
import qualified CaesarCiphers
import qualified ExptCiphers
dispatch :: [(String, String -> IO ())]
dispatch = [("EEncipher", ExptCiphers.exptEncipherString),
            ("EDecipher", ExptCiphers.exptDecipherString),
            ("CEncipher", CaesarCiphers.caesarEncipherString),
            ("CDecipher", CaesarCiphers.caesarDecipherString),
            ("CBruteForce", CaesarCiphers.bruteForceCaesar)]

main = do
    (command:args) <- getArgs
Each of the functions takes some arguments that I won't know until run-time. How do I pass those into a function, seeing as they'll be bound up in a list? Do I just grab them manually? Like:
exampleFunction (args !! 1) (args !! 2)
That seems kind of ugly. Is there some sort of idiomatic way to do this? And what about error checking? My functions aren't equipped to gracefully handle errors like getting passed parameters in an idiotic order.
Also, and importantly, each function in dispatch takes a different number of arguments, so I can't do this statically anyway (as above). It's too bad unCurry command args isn't valid Haskell.
One way is to wrap your functions inside functions that do further command line processing. e.g.
dispatch :: [(String, [String] -> IO ())]
dispatch = [("EEncipher", takesSingleArg ExptCiphers.exptEncipherString),
            ("EDecipher", takesSingleArg ExptCiphers.exptDecipherString),
            ("CEncipher", takesTwoArgs CaesarCiphers.caesarEncipherString),
            ("CDecipher", takesTwoArgs CaesarCiphers.caesarDecipherString),
            ("CBruteForce", takesSingleArg CaesarCiphers.bruteForceCaesar)]

-- a couple of wrapper functions:
takesSingleArg :: (String -> IO ()) -> [String] -> IO ()
takesSingleArg act [arg] = act arg
takesSingleArg _ _ = showUsageMessage

takesTwoArgs :: (String -> String -> IO ()) -> [String] -> IO ()
takesTwoArgs act [arg1, arg2] = act arg1 arg2
takesTwoArgs _ _ = showUsageMessage

-- put it all together
main = do
    (command:args) <- getArgs
    case lookup command dispatch of
        Just act -> act args
        Nothing -> showUsageMessage
You can extend this by having variants of the wrapper functions that perform error checking and convert (some of) their arguments into Ints, custom datatypes, etc. as necessary.
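One such variant might look like the sketch below. The names parseIntArg and takesIntArg are hypothetical, and the error text is simply printed with putStrLn here instead of calling the showUsageMessage used above.

```haskell
import Text.Read (readMaybe)

-- Hypothetical pure helper: exactly one argument, which must parse as an Int.
parseIntArg :: [String] -> Either String Int
parseIntArg [arg] = maybe (Left ("not an integer: " ++ arg)) Right (readMaybe arg)
parseIntArg args  = Left ("expected exactly one argument, got " ++ show (length args))

-- Wrapper in the style of takesSingleArg: run the action on the parsed Int,
-- or print the error message when the arguments are not acceptable.
takesIntArg :: (Int -> IO ()) -> [String] -> IO ()
takesIntArg act args = either putStrLn act (parseIntArg args)
```

Keeping the check in a pure helper means the validation can be tested without running any IO, and the wrapper stays a one-liner.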
As dbaupp notes, the way we pattern match on getArgs above isn't safe. A better way is
run :: [String] -> IO ()
run [] = showUsageMessage
run (command : args)
    = case lookup command dispatch of
        Just act -> act args
        Nothing -> showUsageMessage

main = run =<< getArgs

Can't seem to implement Either correctly

Alright so here's my current code:
import System.IO
import System.Environment
import System.Directory
main = do
    unfiltered <- getArgs ; home <- getHomeDirectory ; let db = home ++ "/.grindstone"
    case unfiltered of
        (x:xs) -> return ()
        _ -> error "No command given. See --help for more info."
    command:args <- getArgs
    createDirectoryIfMissing True db
    let check = case args of
            [] -> error "No arguments given. See --help for more info."
            _ -> do let (params@(param:_),rest) = span (\(c:_) -> c=='-') args
                    if length params > 1 then error ("No arguments given for " ++ param)
                        else do
                            let (pArgs,_) = span (\(c:_) -> c/='-') rest
                            return (param, pArgs) :: Either (IO ()) (String, [String])
    let add = print "sup"
    let cmds = [("add", add)]
    let action = lookup command cmds
    case action of
        Nothing -> error "Unknown command."
        (Just action) -> action
The main problem is with check. I tried implementing the Either type since I want it to either error out, or return something for another function to use, but, it's currently erroring out with:
grindstone.hs:21:23:
No instance for (Monad (Either (IO ())))
arising from a use of `return' at grindstone.hs:21:23-43
Possible fix:
add an instance declaration for (Monad (Either (IO ())))
In the expression:
return (param, pArgs) :: Either (IO ()) (String, [String])
In the expression:
do { let (pArgs, _) = span (\ (c : _) -> ...) rest;
return (param, pArgs) :: Either (IO ()) (String, [String]) }
In the expression:
if length params > 1 then
error ("No arguments given for " ++ param)
else
do { let (pArgs, _) = ...;
return (param, pArgs) :: Either (IO ()) (String, [String]) }
I'm only starting out in Haskell and haven't dealt too much with monads yet, so I just thought I'd ask on here. Anyone have any ideas?
The error that is causing your compile problems is that you are annotating an expression with the type Either (IO ()) (String, [String]) when it is not built as an Either value. (The compiler is not outputting a very helpful error message.)
To create an Either value [1], we use the data constructors Left and Right. Convention (from the library page) is that errors are a Left value, while correct values are a Right value.
I did a quick rewrite of your arg checking function as
checkArgs :: [String] -> Either String (String, [String])
checkArgs args =
    case args of
        [] -> Left "No arguments given. See --help for more info."
        _ -> let (params@(param:_),rest) = span (\(c:_) -> c=='-') args in
             if length params > 1 then
                 Left ("No arguments given for " ++ param)
             else
                 let (pArgs,_) = span (\(c:_) -> c/='-') rest in
                 Right (param, pArgs)
Note that the arg checking function does not interact with any external IO () library functions and so has a purely functional type. In general if your code does not have monadic elements (IO ()), it can be clearer to write it in purely functional style. (When starting out in Haskell this is definitely something I would recommend rather than trying to get your head around monads/monad transformers/etc immediately.)
When you are a little more comfortable with monads, you may want to check out Control.Monad.Error [2], which wraps functionality similar to Either as a monad and encapsulates some details, like Left always being a computation error.
[1] http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Data-Either.html
[2] http://hackage.haskell.org/packages/archive/mtl/1.1.0.2/doc/html/Control-Monad-Error.html
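As a small sketch of what the monadic style buys you: in current versions of base, Either String itself has a Monad instance, so several checks can be chained with do notation and the first Left stops the chain. The helper names (splitFirst, parseCount, commandAndCount) and the "command followed by a count" argument shape are assumptions for illustration only.

```haskell
import Text.Read (readMaybe)

-- Hypothetical step: split off the first argument, or fail with a message.
splitFirst :: [String] -> Either String (String, [String])
splitFirst []       = Left "No arguments given. See --help for more info."
splitFirst (x : xs) = Right (x, xs)

-- Hypothetical step: interpret a string as a number.
parseCount :: String -> Either String Int
parseCount s = maybe (Left ("Not a number: " ++ s)) Right (readMaybe s)

-- do notation in Either String: evaluation stops at the first Left.
commandAndCount :: [String] -> Either String (String, Int)
commandAndCount args = do
  (cmd, rest) <- splitFirst args
  (n, _)      <- splitFirst rest
  count       <- parseCount n
  return (cmd, count)
```

The caller in main can then pattern match once on the result, printing the message for a Left and running the command for a Right.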
Either (IO ()) (String, [String]) is a type that contains either an IO action or a (String, [String]), so values of this type are either Left applied to an IO () action or Right applied to a (String, [String]). Left values usually represent an error in Haskell. This error can be represented with any type you want, for example an error code (Int) or a String that says what happened.
If you use IO () as the type that represents an error, you won't be able to extract any information about the error. You will only be able to perform an IO action later on.
The type you are looking for isn't Either (IO ()) (String, [String]) but Either String (String, [String]). With this type you can get information about the error (the String). Now you don't need any IO action inside the Either type, so you can remove all do expressions:
let check = case args of
        [] -> Left "No arguments given. See --help for more info."
        _ -> let (params@(param:_),rest) = span (\(c:_) -> c=='-') args
             in if length params > 1
                then Left ("No arguments given for " ++ param)
                else let (pArgs,_) = span (\(c:_) -> c/='-') rest
                     in Right (param, pArgs)

Resources