How do I use the Language.Haskell.Interpreter to read the given config file and assign the values given in it to initialize variables in my program?
My config file is like:
numRecords = 10
numFields = 3
inputFile = /home/user1/project/indata.file
outputFile = /home/user1/project/outdata.file
datefmt = ddmmyyyy
I want to initialize the variables corresponding to the identifiers given in the config file with the values given in the config file.
How do I use the Language.Haskell.Interpreter to accomplish this thing?
I am confused because of the IO Monad and the Interpreter Monad. A small example of this kind will also be useful.
Why not?
data Config = Config {size :: Int, path :: String} deriving (Read, Show)
readConfig :: String -> Config
readConfig = read
main = do
  config <- readFile "my.config" >>= return . readConfig
  putStrLn $ "Configured size := " ++ show (size config)
  putStrLn $ "Configured path := " ++ show (path config)
Using my.config file
Config {
size = 1024,
path = "/root/passwords.db"
}
And testing using ghci
*Main> main
Configured size := 1024
Configured path := "/root/passwords.db"
*Main>
(Sorry about the earlier bugs; I was in a hurry.)
First of all, you cannot (in any obvious way, at least) use Language.Haskell.Interpreter from the hint package to do this. The functions in that module are used to read in and run Haskell code, not arbitrary structured data.
For reading in structured data, you will need a parser of some sort. Here are some of the options that you have:
1. Use a parser generator such as Happy.
2. Use a parser-combinator library such as uu-parsinglib or parsec.
3. Directly implement your own parser.
4. Make use of automatically derived instances of the Read class.
Ad 1. and 2.
If the format of the data you need to read in is nontrivial, or if you need helpful error messages when parsing fails, I'd recommend going for 1. or 2. and consulting the documentation of the respective tools and libraries. Be aware that you will need some time to get accustomed to their main concepts and interfaces.
Ad 3.
If the format of your data is simple enough (as it is in your example) and if extensive error reporting is not high on your wish list, you can easily roll your own parser.
In your example, a config file is essentially a listing of keys and values, separated by newlines. In Haskell, we can represent such a listing by a list of pairs of strings:
type Config = [(String, String)]
"Parsing" a configuration then reduces to: (1) splitting the input string into lines, (2) splitting each line into words, (3) selecting from each line the first and the third word:
readConfig :: String -> Config
readConfig s =
  [(key, val) | line <- lines s, let (key : _ : val : _) = words line]
To retrieve an entry from a parsed configuration file, we can then use a function get:
get :: String -> (String -> a) -> Config -> a
get key f config = case lookup key config of
  Nothing -> error ("get: not found: " ++ key)
  Just x  -> f x
This function takes as its first argument the key of the entry and as its second argument a function that converts the raw value string into something of the appropriate type. For purely textual configuration values we can simply pass the identity function to get:
inputFile, outputFile, datefmt :: Config -> String
inputFile = get "inputFile" id
outputFile = get "outputFile" id
datefmt = get "datefmt" id
For integer entries we can use read:
numRecords, numFields :: Config -> Int
numRecords = get "numRecords" read
numFields = get "numFields" read
Perhaps these patterns are common enough to be factored out into their own dedicated versions of get:
getS :: String -> Config -> String
getS key = get key id
getR :: Read a => String -> Config -> a
getR key = get key read
inputFile', outputFile', datefmt' :: Config -> String
inputFile' = getS "inputFile"
outputFile' = getS "outputFile"
datefmt' = getS "datefmt"
numRecords', numFields' :: Config -> Int
numRecords' = getR "numRecords"
numFields' = getR "numFields"
As an example, here is a program that reads in a configuration file and prints the value for "outputFile":
main :: IO ()
main = do
  s <- readFile "config.txt"
  let config = readConfig s
  putStrLn (outputFile config)
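If a missing key or a malformed number should not crash the program, the same idea can be made a little more defensive. The following is only a sketch under a few assumptions of mine: lines are split at the first '=' (so values may contain spaces), lines without '=' are silently skipped, and the names trim, parseLine, readConfigSafe and lookupWith are hypothetical helpers, not part of any library.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

type Config = [(String, String)]

-- Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- Parse one "key = value" line, splitting at the first '='.
-- Lines without '=' or with an empty key yield Nothing.
parseLine :: String -> Maybe (String, String)
parseLine line =
  case break (== '=') line of
    (rawKey, '=' : rawVal)
      | not (null (trim rawKey)) -> Just (trim rawKey, trim rawVal)
    _ -> Nothing

-- Keep only the lines that look like "key = value".
readConfigSafe :: String -> Config
readConfigSafe = mapMaybe parseLine . lines

-- Look up a key and convert its raw value, reporting failures as Left.
lookupWith :: (String -> Maybe a) -> String -> Config -> Either String a
lookupWith conv key config =
  case lookup key config of
    Nothing  -> Left ("missing key: " ++ key)
    Just raw -> maybe (Left ("bad value for " ++ key ++ ": " ++ raw)) Right (conv raw)

numRecords :: Config -> Either String Int
numRecords = lookupWith readMaybe "numRecords"

inputFile :: Config -> Either String String
inputFile = lookupWith Just "inputFile"
```

The accessors now return Either String a, so the caller decides whether a bad configuration is fatal.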
Ad 4.
If you can control the format of the configuration file, you can introduce a new datatype for holding configuration data and have Haskell automatically derive an instance of the class Read for it. For instance:
data Config = Config
  { numRecords :: Int
  , numFields  :: Int
  , inputFile  :: String
  , outputFile :: String
  , datefmt    :: String
  } deriving Read
Now you will need to make sure that your configuration files match the expected format. For instance:
Config
{ numRecords = 10
, numFields = 3
, inputFile = "/home/user1/project/indata.file"
, outputFile = "/home/user1/project/outdata.file"
, datefmt = "ddmmyyyy"
}
As an example, here is then the program that prints the value for "outputFile":
main :: IO ()
main = do
  s <- readFile "config.txt"
  let config = read s
  putStrLn (outputFile config)
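Note that read throws an exception if the file does not match the derived format. A possible variation, sketched here with a hypothetical helper name (parseConfig) and an error message of my own choosing, is to go through readMaybe from Text.Read instead:

```haskell
import Text.Read (readMaybe)

data Config = Config
  { numRecords :: Int
  , numFields  :: Int
  , inputFile  :: String
  , outputFile :: String
  , datefmt    :: String
  } deriving (Read, Show, Eq)

-- Parse the whole file contents; Left if they do not match the derived format.
parseConfig :: String -> Either String Config
parseConfig s =
  maybe (Left "config file does not match the expected format") Right (readMaybe s)
```

In main you would then pattern match on the result of parseConfig and report the Left case yourself.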
I currently have a working parser in megaparsec, where I build an AST for my program. I now want to do some weeding operations on my AST, while being able to use the same kind of pretty errors as the parser. While this stage is after parsing, I'm wondering if there are general practices for megaparsec in doing so. Is there a way for me to extract every line and comment (used in the bundle) and add it to each item in my AST? Is there any other way that people tackle this problem?
Apologies in advance if this sounds open ended, but I'm mainly wondering if there are some better ideas than getting the line numbers and creating bundles myself. I'm still new to Haskell, so I haven't been able to navigate properly through all the source code.
This was answered by the megaparsec developer here.
To summarize, parsers have a getOffset function that returns the current char index. You can use that along with an initial PosState to create an error bundle which you can later pretty print.
I have a sample project within the github thread, and pasted again here:
module TestParser where

-- Qualified, so that NonEmpty's list functions do not clash with the Prelude
import qualified Data.List.NonEmpty as NonEmpty
import qualified Data.Maybe as Maybe
import qualified Data.Set as Set
import Data.Void
import Parser
import Text.Megaparsec

data Sample
  = Test Int String
  | TestBlock [Sample]
  | TestBlank
  deriving (Show, Eq)

sampleParser :: Parser Sample
sampleParser = do
  l <- many testParser
  return $ f l
  where
    f [] = TestBlank
    f [s] = s
    f p = TestBlock p

-- Record the offset at which each "test" keyword starts
testParser :: Parser Sample
testParser = do
  offset <- getOffset
  test <- symbol "test"
  return $ Test offset test

fullTestParser :: Parser Sample
fullTestParser = baseParser testParser

-- Returns Just a bundle if parsing or the later verification fails
testParse :: String -> Maybe (ParseErrorBundle String Void)
testParse input =
  case parse (baseParser sampleParser) "" input of
    Left e -> Just e
    Right x -> do
      (offset, msg) <- testVerify x
      let initialState =
            PosState
              { pstateInput = input
              , pstateOffset = 0
              , pstateSourcePos = initialPos ""
              , pstateTabWidth = defaultTabWidth
              , pstateLinePrefix = ""
              }
      let errorBundle =
            ParseErrorBundle
              { bundleErrors = NonEmpty.fromList [TrivialError offset Nothing Set.empty]
                -- A collection of 'ParseError's that is sorted by parse error offsets
              , bundlePosState = initialState
                -- State that is used for line/column calculation
              }
      return errorBundle

-- Sample verify; throw an error on the second test key
testVerify :: Sample -> Maybe (Int, String)
testVerify tree =
  case tree of
    TestBlock [_, Test a _, _] -> Just (a, "Bad")
    _ -> Nothing

testMain :: IO ()
testMain = do
  testExample "test test test"
  putStrLn "Done"

testExample :: String -> IO ()
testExample input =
  case testParse input of
    Just err -> putStrLn (errorBundlePretty err)
    Nothing -> putStrLn "pass"
Some parts are from other files, but the important parts are in the code.
Why do I get type Either Text.Parsec.Error.ParseError CSV for the parseCSV function although in the documentation it says that output is Either ParseError CSV? I want to import a CSV file into Haskell and then export specific column from it and then compute statistics for that column.
I import a CSV file like:
data = parseCSV "/home/user/Haskell/data/data.csv"
noEmpRows = either (const []) (filter (\row -> 2 <= length row))
readIndex :: Read cell => Either a CSV -> Int -> [cell]
readIndex csv index = map (read . (!!index)) (noEmpRows csv)
and then I get an error when I want to readIndex data 9 :: [Integer].
I've tried also a function parseCSVFromFile.
https://hackage.haskell.org/package/csv-0.1.2/docs/Text-CSV.html#t:CSV
Thanks in advance for your help.
The question you really seem to be asking is How do I use Text.CSV?
Given the file test.csv:
1,Banana,17
2,Apple,14
3,Pear,21
and this line in GHCi:
Prelude> Text.CSV.parseCSVFromFile "test.csv"
Right [["1","Banana","17"],["2","Apple","14"],["3","Pear","21"],[""]]
If you want to extract a column, then build a function for that:
import Text.CSV

main :: IO ()
main = do
  test_csv <- parseCSVFromFile "test.csv"
  case test_csv of
    Right csv -> print (extractColumn csv 2 :: [Int])
    Left err -> print err

extractColumn :: Read t => CSV -> Int -> [t]
extractColumn csv n =
  [ read (record !! n) | record <- csv
                       , length record > n
                       , record /= [""] ]
This should produce the output [17,14,21].
Since there is ample room for failure here (a line could contain fewer fields than n, or the string in field n on a given line could fail to read as type t), you may want to handle or report if errors occur. The code above just throws away the line if it contains too few fields and throws a Prelude.read: no parse if the field isn't an Int. Consider readEither or readMaybe.
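To make that last suggestion concrete, here is a rough sketch of a variant that reports problems instead of dropping rows or crashing. The name extractColumnSafe and the error messages are my own, and the CSV type is redefined locally (as a list of records, each a list of string fields, which is how Text.CSV defines it) only so that the snippet compiles without the csv package. It assumes n is not negative.

```haskell
import Text.Read (readMaybe)

-- Local stand-in for Text.CSV's CSV type: a list of records of string fields.
type CSV = [[String]]

-- One result per non-blank row: Right the parsed cell, or Left a description
-- of what went wrong on that row (rows are numbered from 1).
extractColumnSafe :: Read t => CSV -> Int -> [Either String t]
extractColumnSafe csv n =
  [ cell rowNo record
  | (rowNo, record) <- zip [1 :: Int ..] csv
  , record /= [""]            -- skip the empty trailing record
  ]
  where
    cell rowNo record =
      case drop n record of
        [] ->
          Left ("row " ++ show rowNo ++ ": fewer than " ++ show (n + 1) ++ " fields")
        (field : _) ->
          maybe (Left ("row " ++ show rowNo ++ ": cannot read " ++ show field))
                Right
                (readMaybe field)
```

If you only want the good values, Data.Either's rights or partitionEithers will separate them from the error messages.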
I need to back up some data to access it later.
At the interface level, I have two functions:
put: backs up data and returns a backup_Id.
get: retrieves data given a backup_Id.
My current code requires me to supply these two functions with the backup parameter.
import Data.Maybe

data Data = Data String deriving Show
type Backup = [(String,Data)]

put :: Backup -> String -> IO Backup
put boilerPlate a =
  do let id = "id" ++ show(length (boilerPlate))
     putStrLn $ id ++": " ++ a
     return ((id,(Data a)):boilerPlate)

get :: Backup -> String -> Maybe Data
get boilerPlate id = lookup id (boilerPlate)
It works OK.
In the following sample, two values are backed up. The second one is retrieved.
main :: IO ()
main = do
  let bp0 = []
  bp1 <- put bp0 "a"
  bp2 <- put bp1 "b"
  let result = get bp2 "id1"
  putStrLn $ "Looking for id1: " ++ show (fromJust(result))
But I need to simplify the signatures of put and get by getting rid of all the backup parameters.
I need something that looks like this:
main = do
  put "a"
  put "b"
  let result = get "id1"
What is the simplest way to achieve this?
Here's an example using StateT. Note that the function names are changed because State and StateT already have get and put functions.
module Main where

import Control.Monad.State

data Data = Data String deriving Show
type Backup = [(String,Data)]

save :: String -> StateT Backup IO ()
save a = do
  backup <- get
  let id = "id" ++ ((show . length) backup)
  liftIO $ putStrLn $ id ++ ": " ++ a
  put ((id, Data a):backup)

retrieve :: String -> StateT Backup IO (Maybe Data)
retrieve id = do
  backup <- get
  return $ lookup id backup

run :: IO (Maybe Data)
run = flip evalStateT [] $ do
  save "a"
  save "b"
  retrieve "id1"

main :: IO ()
main = do
  result <- run
  print result
The State monad threads a 'mutable' value through a computation. StateT combines State with other monads; in this case, allowing the use of IO.
As dfeuer mentioned, it is possible to make save and retrieve a bit more general with these types:
save :: (MonadState Backup m, MonadIO m) => String -> m ()
retrieve :: (MonadState Backup m, MonadIO m) => String -> m (Maybe Data)
(This also requires {-# LANGUAGE FlexibleContexts #-}.) The advantage of this approach is that it allows our functions to work with any monad that provides the Backup state and IO. In particular, we can add effects to the monad and the functions will still work.
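For completeness, this is roughly what the generalized versions could look like in full. It is only a sketch: the bodies are the same as in the StateT version, the local variable is called key rather than id to avoid shadowing the Prelude function, MonadIO and liftIO are imported explicitly from Control.Monad.IO.Class (my choice, so the snippet does not depend on what Control.Monad.State re-exports), and demo is a hypothetical name for the small driver.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.State (MonadState, evalStateT, get, put)

data Data = Data String deriving (Show, Eq)
type Backup = [(String, Data)]

-- Works in any monad that carries a Backup state and can perform IO.
save :: (MonadState Backup m, MonadIO m) => String -> m ()
save a = do
  backup <- get
  let key = "id" ++ show (length backup)
  liftIO $ putStrLn (key ++ ": " ++ a)
  put ((key, Data a) : backup)

-- Same constraints as in the signature above, although only the state is used.
retrieve :: (MonadState Backup m, MonadIO m) => String -> m (Maybe Data)
retrieve key = do
  backup <- get
  return (lookup key backup)

-- Here the constraints are satisfied by StateT Backup IO.
demo :: IO (Maybe Data)
demo = flip evalStateT [] $ do
  save "a"
  save "b"
  retrieve "id1"
```

The do block in demo never mentions StateT; the concrete monad is chosen only by evalStateT, which is what makes it easy to swap in a larger transformer stack later.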
All this monad / monad transformer stuff can be pretty confusing at first, but it's actually pretty elegant once you get used to it. The advantage is that you can easily see what kind of effects are required in each function. That being said, I don't want you to think that there are things that Haskell can't do, so here's another way to achieve your goal which does away with the state monad in favor of a mutable reference.
module Main where

import Data.IORef

data Data = Data String deriving Show
type Backup = [(String,Data)]

mkSave :: IORef Backup -> String -> IO ()
mkSave r a = do
  backup <- readIORef r
  let id = "id" ++ ((show . length) backup)
  putStrLn $ id ++ ": " ++ a
  writeIORef r ((id, Data a):backup)

mkRetrieve :: IORef Backup -> String -> IO (Maybe Data)
mkRetrieve r id = do
  backup <- readIORef r
  return $ lookup id backup

main :: IO ()
main = do
  ref <- newIORef []
  let save = mkSave ref
      retrieve = mkRetrieve ref
  save "a"
  save "b"
  result <- retrieve "id0"
  print result
Just be warned that this isn't usually the recommended approach.
I have a main like the following:
main :: IO ()
main = do
  args <- getArgs
  putStrLn $ functionName args
  where
    functionName args = "problem" ++ (filter (/= '"') $ show (args!!0))
Instead of putting the name to stdout like I do it right now, I want to call the function.
I am aware of the fact, that I could use hint (as mentioned in Haskell: how to evaluate a String like "1+2") but I think that would be pretty overkill for just getting that simple function name.
At the current stage it does not matter if the program crashes if the function does not exist!
Without taking special measures to preserve them, the names of functions will likely be gone completely in a compiled Haskell program.
I would suggest just making a big top-level map:
import Data.Map ( Map )
import qualified Data.Map as Map
import System.Environment ( getArgs )

functions :: Map String (IO ())
functions = Map.fromList [("problem1", problem1), ...]

-- Look the name up in the map and run the action if it exists
call :: String -> IO ()
call name =
  case Map.lookup name functions of
    Nothing -> fail $ name ++ " not found"
    Just m -> m

main :: IO ()
main = do
  args <- getArgs
  call $ functionName args
  where
    functionName args = "problem" ++ (filter (/= '"') $ show (args!!0))
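If at some point it does start to matter that the program crashes, the lookup can be separated from the running. The sketch below uses a plain association list and Prelude's lookup so that it needs nothing outside base; problem1 and problem2 are hypothetical stand-ins for your real functions, and resolve is a name I made up.

```haskell
-- Hypothetical stand-ins for the real problem functions.
problem1, problem2 :: IO ()
problem1 = putStrLn "running problem1"
problem2 = putStrLn "running problem2"

-- The table of callable functions, keyed by name.
functions :: [(String, IO ())]
functions = [("problem1", problem1), ("problem2", problem2)]

-- Turn the command-line arguments into an action, or explain why not.
-- Matching on the list avoids the partial (args !! 0).
resolve :: [String] -> Either String (IO ())
resolve [] = Left "expected a problem number as the first argument"
resolve (n : _) =
  case lookup name functions of
    Nothing     -> Left (name ++ " not found")
    Just action -> Right action
  where
    name = "problem" ++ n
```

A main built on this would call getArgs, pass the result to resolve, run the Right action and print the Left message.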
If you're going to do this, you have a few approaches, but the easiest by far is to just pattern match on the name.
This method requires that all of your functions you want to call have the same type signature:
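import System.Environment (getArgs)

problem1 :: Int
problem1 = 1

problem2 :: Int
problem2 = 2

runFunc :: String -> Maybe Int
runFunc "problem1" = Just problem1
runFunc "problem2" = Just problem2
runFunc _ = Nothing

main :: IO ()
main = do
  args <- getArgs
  -- runFunc returns a Maybe Int, so use print rather than putStrLn
  print $ runFunc $ functionName args
  where
    functionName args = "problem" ++ (filter (/= '"') $ show (args!!0))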
This requires you to add a line to runFunc each time you add a new problemN, but that's pretty manageable.
You can't get a string representation of an identifier, not without fancy non-standard features, because that information isn't retained after compilation. As such, you're going to have to write down those function names as string constants somewhere.
If the function definitions are all in one file anyway, what I would suggest is to use data types and lambdas to avoid having to duplicate those function names altogether:
import Data.List (find)
import System.Environment (getArgs)

data Problem = Problem
  { problemName :: String
  , evalProblem :: IO () -- Or whatever your problem function signatures are
  }

problems :: [Problem]
problems =
  [ Problem
      { problemName = "problem1"
      , evalProblem = do ... -- Insert code here
      }
  , Problem
      { problemName = "problem2"
      , evalProblem = do ... -- Insert code here
      }
  ]

main :: IO ()
main = do
  args <- getArgs
  case find (\x -> problemName x == (args!!0)) problems of
    Just x -> evalProblem x
    Nothing -> ... -- Handle error
Edit: Just to clarify, I'd say the important takeaway here is that you have an XY Problem.
I have text file containing data like that:
13.
13.
[(1,2),(2,3),(4,5)].
I want to read this into 3 variables in Haskell, but the standard functions read everything as strings. Assuming I get rid of the dot at the end myself, is there any built-in parser function that will make an Integer out of "13" and a [(Integer,Integer)] list out of [(1,2),(2,3),(4,5)]?
Yes, it's called read:
let i = read "13" :: Integer
let ts = read "[(1,2),(2,3),(4,5)]" :: [(Integer, Integer)]
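If you would rather not strip the dot by hand and do not want a crash on bad input, something along these lines may help. It is a sketch only: readEntry is a hypothetical helper that drops trailing whitespace and a single trailing full stop and then uses readMaybe from Text.Read, which returns Nothing instead of throwing.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)
import Text.Read (readMaybe)

-- Drop trailing whitespace and one trailing '.', then try to read the rest.
readEntry :: Read a => String -> Maybe a
readEntry = readMaybe . dropDot . dropWhileEnd isSpace
  where
    dropDot s = case reverse s of
      '.' : rest -> reverse rest
      _          -> s
```

For example, readEntry "13." :: Maybe Integer gives Just 13, and the same function at type Maybe [(Integer, Integer)] handles the list line.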
The example text file you gave has trailing spaces as well as the full stop, so merely cutting the last character doesn't work. Let's take just the digits, using:
import Data.Char (isDigit)
Why not have a data type to store the stuff from the file:
data MyStuff = MyStuff {firstNum :: Int,
                        secondNum:: Int,
                        intPairList :: [(Integer, Integer)]}
  deriving (Show,Read)
Now we need to read the file, and then turn it into individual lines:
getMyStuff :: FilePath -> IO MyStuff
getMyStuff filename = do
  rawdata <- readFile filename
  let [i1,i2,list] = lines rawdata
  return $ MyStuff (read $ takeWhile isDigit i1) (read $ takeWhile isDigit i2) (read $ init list)
The read function works with any data type that has a Read instance, and automatically produces data of the right type.
> getMyStuff "data.txt" >>= print
MyStuff {firstNum = 13, secondNum = 13, intPairList = [(1,2),(2,3),(4,5)]}
A better way
I'd be inclined to save myself a fair bit of work, and just write that data directly, so
writeMyStuff :: FilePath -> MyStuff -> IO ()
writeMyStuff filename somedata = writeFile filename (show somedata)
readMyStuff :: FilePath -> IO MyStuff
readMyStuff filename = fmap read (readFile filename)
(The fmap just applies the pure function read to the output of the readFile.)
> writeMyStuff "test.txt" MyStuff {firstNum=12,secondNum=42, intPairList=[(1,2),(3,4)]}
> readMyStuff "test.txt" >>= print
MyStuff {firstNum = 12, secondNum = 42, intPairList = [(1,2),(3,4)]}
You're far less likely to make little parsing or printing errors if you let the compiler sort it all out for you, it's less code, and simpler.
Haskell's strong types require you to know what you're getting. If we forgo all error checking and optimization and assume that the file is always in the right format, you can do something like this:
data Entry = Number Integer
           | List [(Integer, Integer)]

parseYourFile :: FilePath -> IO [Entry]
parseYourFile p = do
  content <- readFile p
  return $ parseYourFormat content

-- The argument cannot be called "data", which is a reserved word
parseYourFormat :: String -> [Entry]
parseYourFormat contents = map parseEntry $ lines contents

-- A line starting with '[' is a list, anything else a number;
-- init drops the trailing full stop
parseEntry :: String -> Entry
parseEntry line = if head line == '['
                    then List $ read core
                    else Number $ read core
  where core = init line
Or you could write a proper parser for it using one of the many combinator frameworks.
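As a rough idea of what that could look like, here is a sketch using Text.ParserCombinators.ReadP, chosen only because it ships with base; parsec or megaparsec would be structured similarly. The helper names (token, symbol, integer, pair, entry, parseEntries) are mine, and the grammar is an assumption based on the sample file: non-negative integers and lists of pairs, each terminated by a full stop, with arbitrary whitespace between tokens.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

data Entry = Number Integer
           | List [(Integer, Integer)]
  deriving (Show, Eq)

-- Run a parser, then skip any whitespace that follows it.
token :: ReadP a -> ReadP a
token p = p <* skipSpaces

symbol :: Char -> ReadP Char
symbol = token . char

-- One or more digits (no sign handling in this sketch).
integer :: ReadP Integer
integer = token (read <$> munch1 isDigit)

pair :: ReadP (Integer, Integer)
pair = between (symbol '(') (symbol ')')
               ((,) <$> integer <* symbol ',' <*> integer)

-- Either a number or a bracketed list of pairs, followed by a full stop.
entry :: ReadP Entry
entry = ((Number <$> integer) +++ (List <$> list)) <* symbol '.'
  where
    list = between (symbol '[') (symbol ']') (pair `sepBy` symbol ',')

-- Parse a whole file; Nothing if any part of the input does not fit.
parseEntries :: String -> Maybe [Entry]
parseEntries s =
  case readP_to_S (skipSpaces *> many entry <* eof) s of
    [(es, "")] -> Just es
    _          -> Nothing
```

Unlike the line-based version, this does not care about trailing spaces or where the line breaks fall, and malformed input gives Nothing rather than an exception.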