Why do I get the type Either Text.Parsec.Error.ParseError CSV for the parseCSV function although the documentation says that the output is Either ParseError CSV? I want to import a CSV file into Haskell, extract a specific column from it, and then compute statistics for that column.
I import a CSV file like:
data = parseCSV "/home/user/Haskell/data/data.csv"
noEmpRows = either (const []) (filter (\row -> 2 <= length row))
readIndex :: Read cell => Either a CSV -> Int -> [cell]
readIndex csv index = map (read . (!!index)) (noEmpRows csv)
and then I get an error when I call readIndex data 9 :: [Integer].
I've also tried the function parseCSVFromFile.
https://hackage.haskell.org/package/csv-0.1.2/docs/Text-CSV.html#t:CSV
Thanks in advance for your help.
The question you really seem to be asking is How do I use Text.CSV?
Given the file test.csv:
1,Banana,17
2,Apple,14
3,Pear,21
and this line in GHCi:
Prelude> Text.CSV.parseCSVFromFile "test.csv"
Right [["1","Banana","17"],["2","Apple","14"],["3","Pear","21"],[""]]
If you want to extract a column, then build a function for that:
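import Text.CSV (CSV, parseCSVFromFile)

main :: IO ()
main = do
  test_csv <- parseCSVFromFile "test.csv"
  case test_csv of
    Right csv -> print (extractColumn csv 2 :: [Int])
    Left err -> print err

-- Read field n of every record that has at least n+1 fields,
-- skipping the empty record produced by a trailing newline.
extractColumn :: Read t => CSV -> Int -> [t]
extractColumn csv n =
  [ read (record !! n) | record <- csv
                       , length record > n
                       , record /= [""] ]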
This should produce the output [17,14,21].
Since there is ample room for failure here (a line could contain fewer fields than n, or the string in field n on a given line could fail to read as type t), you may want to handle or report if errors occur. The code above just throws away the line if it contains too few fields and throws a Prelude.read: no parse if the field isn't an Int. Consider readEither or readMaybe.
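As a rough sketch of the readEither suggestion, the column extraction could report the first problem instead of dropping rows or crashing. The name extractColumnSafe, the error messages, and the local CSV alias (standing in for the type exported by Text.CSV) are assumptions made for this illustration. Unlike the code above, it treats a short row as an error rather than skipping it.

```haskell
import Text.Read (readEither)

-- Local stand-in for Text.CSV's CSV type: a list of records,
-- each record being a list of String fields (assumption for this sketch).
type CSV = [[String]]

-- Hypothetical helper: read column n of every non-empty record,
-- stopping at the first row that is too short or fails to parse.
extractColumnSafe :: Read t => CSV -> Int -> Either String [t]
extractColumnSafe csv n = traverse cell (zip [1 :: Int ..] rows)
  where
    -- Drop the empty record that a trailing newline produces.
    rows = filter (/= [""]) csv
    cell (i, record) =
      case drop n record of
        (field : _) -> either (Left . annotate i) Right (readEither field)
        [] -> Left ("row " ++ show i ++ ": fewer than " ++ show (n + 1) ++ " fields")
    annotate i msg = "row " ++ show i ++ ": " ++ msg
```

With the test.csv shown earlier, extractColumnSafe csv 2 :: Either String [Int] would yield Right [17,14,21], while asking for column 1 as Int yields a Left naming the first offending row.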
First of all, sorry for my bad English. I'm not a native speaker and I try my best :)
Now to the problem: I have a list of Strings and want to convert them to a list of integers. The problem is that they are not just numbers; basically each String is a list too.
["[1,2,3,4,5,6,7,8]","[8,7,6,5,4,3,2,1]","[1,2,3,4,5,6,7,8]"]
This is the result i get from my code i'll post further down.
Any idea how I can turn the inner lists of numbers into lists of integers?
I tried for about three hours and didn't find a solution.
Any help is appreciated.
Kind regards
get "/authors/:author" $ do
  authorName <- param "author"
  directories <- liftIO (listDirectory ("data/" ++ authorName))
  liftIO (readFiles directories authorName)
  html (T.pack (HtmlModule.h1 ("Author: " ++ authorName)))

readFiles :: [String] -> String -> IO ()
readFiles x authorName = do
  let y = addPrefix x authorName
  content <- mapM readFile y
  putStrLn (show content)
Result: ["[1,2,3,4,5,6,7,8]","[8,7,6,5,4,3,2,1]","[1,2,3,4,5,6,7,8]"]
You can read the string into a list of ints:
let nums = map read content :: [[Int]]
You can use read :: Read a => String -> a to convert a string to a type that is a member of the Read typeclass.
Since Int is a member of the Read typeclass, and [a] is a member of the Read typeclass if a is a member of the Read typeclass, we thus can read a list of Ints:
Prelude> read "[1,2,3,4,5,6,7,8]" :: [Int]
[1,2,3,4,5,6,7,8]
We thus can convert a list of Strings with:
content <- mapM (fmap (read :: String -> [Int]) . readFile) y
read will raise an error in case the String can not be converted. You can make use of readMaybe :: Read a => String -> Maybe a to wrap the result in a Just in case parsing was successful, and Nothing in case parsing failed.
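A minimal sketch of the readMaybe variant, assuming the file contents are already in a list of Strings as in the question. The name parseLists is made up for this example; traverse makes the whole result Nothing as soon as one string fails to parse.

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: parse every string as a list of Ints.
-- Returns Nothing if any single string is not a valid list literal.
parseLists :: [String] -> Maybe [[Int]]
parseLists = traverse readMaybe
```

If you would rather keep the strings that do parse and drop the rest, Data.Maybe.mapMaybe readMaybe is the usual alternative.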
I currently have a working parser in megaparsec, where I build an AST for my program. I now want to do some weeding operations on my AST, while being able to use the same kind of pretty errors as the parser. While this stage is after parsing, I'm wondering if there are general practices for megaparsec in doing so. Is there a way for me to extract every line and comment (used in the bundle) and add it to each item in my AST? Is there any other way that people tackle this problem?
Apologies in advance if this sounds open-ended, but I'm mainly wondering if there are any better ideas than getting the line numbers and creating bundles myself. I'm still new to Haskell, so I haven't been able to navigate properly through all the source code.
This was answered by the megaparsec developer here.
To summarize, parsers have a getOffset function that returns the current char index. You can use that along with an initial PosState to create an error bundle which you can later pretty print.
I have a sample project within the github thread, and pasted again here:
module TestParser where

import Data.List.NonEmpty as NonEmpty
import qualified Data.Maybe as Maybe
import qualified Data.Set as Set
import Data.Void
import Parser
import Text.Megaparsec

data Sample
  = Test Int
         String
  | TestBlock [Sample]
  | TestBlank
  deriving (Show, Eq)

sampleParser :: Parser Sample
sampleParser = do
  l <- many testParser
  return $ f l
  where
    f [] = TestBlank
    f [s] = s
    f p = TestBlock p

testParser :: Parser Sample
testParser = do
  offset <- getOffset
  test <- symbol "test"
  return $ Test offset test

fullTestParser :: Parser Sample
fullTestParser = baseParser testParser

testParse :: String -> Maybe (ParseErrorBundle String Void)
testParse input =
  case parse (baseParser sampleParser) "" input of
    Left e -> Just e
    Right x -> do
      (offset, msg) <- testVerify x
      let initialState =
            PosState
              { pstateInput = input
              , pstateOffset = 0
              , pstateSourcePos = initialPos ""
              , pstateTabWidth = defaultTabWidth
              , pstateLinePrefix = ""
              }
      let errorBundle =
            ParseErrorBundle
              { bundleErrors = NonEmpty.fromList [TrivialError offset Nothing Set.empty]
                -- ^ A collection of 'ParseError's that is sorted by parse error offsets
              , bundlePosState = initialState
                -- ^ State that is used for line\/column calculation
              }
      return errorBundle

-- Sample verify; throw an error on the second test key
testVerify :: Sample -> Maybe (Int, String)
testVerify tree =
  case tree of
    TestBlock [_, Test a _, _] -> Just (a, "Bad")
    _ -> Nothing

testMain :: IO ()
testMain = do
  testExample "test test test"
  putStrLn "Done"

testExample :: String -> IO ()
testExample input =
  case testParse input of
    Just error -> putStrLn (errorBundlePretty error)
    Nothing -> putStrLn "pass"
Some parts (such as symbol and baseParser from the Parser module) live in other files, but the important parts are in the code above.
The problem is that I need to read a decimal number, such as a float, in the right format.
However, I don't know how to parse the input to make sure it really is a float. If it is not, I need to putStrLn "ERR". Assume the input keeps coming, one line after another.
In the example below, what condition can I add after the if to reject badly formatted input such as 1.2.e!##$? For such input I should print "ERR" and loop back to main rather than get an error and exit the program immediately.
input <- getLine
if (read input :: Float) > 1.0
  then do
    let result1 = upperbound (read input :: Float)
    let result2 = lowerbound (read input :: Float)
    print result1
    print result2
    main
  else do
    putStrLn "ERR"
    main
read is a partial function - it works only on a subset of the input domain. A better example for a partial function is head: it works well on non-empty lists, but will throw an error on an empty list - and you can only handle errors when in the IO monad. Partial functions are useful in some cases, but you should generally avoid using them. So like head, read is an unsafe function - it may fail when the input cannot be parsed.
read has a safe alternative: readMaybe from Text.Read.
readMaybe :: Read a => String -> Maybe a
readMaybe will never fail - if it can't parse a string, it will return Nothing. Handling a Maybe value is a simple task and can be done in several ways (case expressions, Data.Maybe functions, do notation and so on). Here's an example using a case expression:
import Text.Read (readMaybe)
...
case (readMaybe input :: Maybe Float) of
  Just f | f > 1.0 -> ...
         | otherwise -> ...
  Nothing -> ...
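To make the pattern concrete, here is a small sketch of the read-and-loop structure from the question built on readMaybe. The names respond and loop, and the "OK" output, are placeholders invented for this example; the question's upperbound and lowerbound are not shown, so they are left out.

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: decide what to print for one line of input.
-- Anything that is not a Float greater than 1.0 is answered with "ERR".
respond :: String -> String
respond input =
  case readMaybe input :: Maybe Float of
    Just f | f > 1.0 -> "OK " ++ show f
    _ -> "ERR"

-- Keep reading lines; malformed input no longer terminates the program.
loop :: IO ()
loop = do
  input <- getLine
  putStrLn (respond input)
  loop
```

Keeping the decision in a pure function such as respond also makes it easy to try in GHCi without typing input interactively.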
This article can be helpful in understanding the different ways of error handling in Haskell.
Prelude> let s1 = "1.223"
Prelude> let s2 = "1"
Prelude> let s3 = "1.2.e!##$"
Prelude> read s1 :: Float
1.223
Prelude> read s2 :: Float
1.0
Prelude> read s3 :: Float
*** Exception: Prelude.read: no parse
read throws an exception when it can't parse the string. You need to handle that exception.
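One way to handle it, sketched here with Control.Exception from base: force the result with evaluate inside try, so that the exception surfaces in IO where it can be caught. The name safeReadFloat is made up for this example, and catching SomeException is a deliberately broad choice.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical helper: run read inside IO and turn a parse failure
-- (an exception raised when the value is forced) into Nothing.
safeReadFloat :: String -> IO (Maybe Float)
safeReadFloat s = do
  result <- try (evaluate (read s :: Float)) :: IO (Either SomeException Float)
  pure (either (const Nothing) Just result)
```

In practice Text.Read.readMaybe gives the same result without exceptions and without needing IO.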
I have text file containing data like that:
13.
13.
[(1,2),(2,3),(4,5)].
I want to read this into three variables in Haskell. The standard functions read it as strings, though. Assuming I strip the trailing dot myself, is there a built-in parser function that will make an Integer out of "13" and an [(Integer,Integer)] list out of [(1,2),(2,3),(4,5)]?
Yes, it's called read:
let i = read "13" :: Integer
let ts = read "[(1,2),(2,3),(4,5)]" :: [(Integer, Integer)]
The example text file you gave has trailing spaces as well as the full stop, so merely cutting the last character doesn't work. Let's take just the digits, using:
import Data.Char (isDigit)
Why not have a data type to store the stuff from the file:
data MyStuff = MyStuff { firstNum :: Int
                       , secondNum :: Int
                       , intPairList :: [(Integer, Integer)] }
  deriving (Show, Read)
Now we need to read the file, and then turn it into individual lines:
getMyStuff :: FilePath -> IO MyStuff
getMyStuff filename = do
  rawdata <- readFile filename
  let [i1,i2,list] = lines rawdata
  return $ MyStuff (read $ takeWhile isDigit i1) (read $ takeWhile isDigit i2) (read $ init list)
The read function works with any data type that has a Read instance, and automatically produces data of the right type.
> getMyStuff "data.txt" >>= print
MyStuff {firstNum = 13, secondNum = 13, intPairList = [(1,2),(2,3),(4,5)]}
A better way
I'd be inclined to save myself a fair bit of work, and just write that data directly, so
writeMyStuff :: FilePath -> MyStuff -> IO ()
writeMyStuff filename somedata = writeFile filename (show somedata)
readMyStuff :: FilePath -> IO MyStuff
readMyStuff filename = fmap read (readFile filename)
(The fmap just applies the pure function read to the output of the readFile.)
> writeMyStuff "test.txt" MyStuff {firstNum=12,secondNum=42, intPairList=[(1,2),(3,4)]}
> readMyStuff "test.txt" >>= print
MyStuff {firstNum = 12, secondNum = 42, intPairList = [(1,2),(3,4)]}
You're far less likely to make little parsing or printing errors if you let the compiler sort it all out for you, it's less code, and simpler.
Haskell's strong types require you to know what you're getting. If we forgo all error checking and optimization and assume that the file is always in the right format, you can do something like this:
data Entry = Number Integer
           | List [(Integer, Integer)]

parseYourFile :: FilePath -> IO [Entry]
parseYourFile p = do
  content <- readFile p
  return $ parseYourFormat content

-- The argument cannot be called "data", which is a reserved word.
parseYourFormat :: String -> [Entry]
parseYourFormat input = map parseEntry $ lines input

-- Strip the trailing full stop, then read a list or a number.
parseEntry :: String -> Entry
parseEntry line = if head line == '['
                    then List $ read core
                    else Number $ read core
  where core = init line
Or you could write a proper parser for it using one of the many combinator frameworks.
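Short of a full parser, a middle ground is to keep read's grammar but drop the partial functions. The sketch below assumes each line ends in a full stop, possibly followed by whitespace, and reuses the Entry idea from above; stripTerminator and parseEntries are names invented for this example.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)
import Text.Read (readMaybe)

data Entry = Number Integer
           | List [(Integer, Integer)]
  deriving (Show, Eq)

-- Drop trailing whitespace, then a single trailing full stop if present.
stripTerminator :: String -> String
stripTerminator line =
  case dropWhileEnd isSpace line of
    s | not (null s), last s == '.' -> init s
      | otherwise -> s

-- Try the line as a number first, then as a list of pairs.
parseEntry :: String -> Maybe Entry
parseEntry line =
  case (readMaybe core, readMaybe core) of
    (Just n, _) -> Just (Number n)
    (_, Just ps) -> Just (List ps)
    _ -> Nothing
  where core = stripTerminator line

-- Nothing if any line is malformed.
parseEntries :: String -> Maybe [Entry]
parseEntries = traverse parseEntry . lines
```

Because nothing here calls head, init on an empty string, or read, a malformed file yields Nothing instead of a runtime exception.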
How do I use the Language.Haskell.Interpreter to read the given config file and assign the values given in it to initialize variables in my program?
My config file is like:
numRecords = 10
numFields = 3
inputFile = /home/user1/project/indata.file
outputFile = /home/user1/project/outdata.file
datefmt = ddmmyyyy
I want to initialize the variables corresponding to the identifiers given in the config file with the values given in the config file.
How do I use the Language.Haskell.Interpreter to accomplish this thing?
I am confused because of the IO Monad and the Interpreter Monad. A small example of this kind will also be useful.
Why not?
data Config = Config {size :: Int, path :: String} deriving (Read, Show)

readConfig :: String -> Config
readConfig = read

main :: IO ()
main = do
  config <- readConfig <$> readFile "my.config"
  putStrLn $ "Configured size := " ++ show (size config)
  putStrLn $ "Configured path := " ++ show (path config)
Using my.config file
Config {
  size = 1024,
  path = "/root/passwords.db"
}
And testing using ghci
*Main> main
Configured size := 1024
Configured path := "/root/passwords.db"
*Main>
(Sorry for the earlier bugs; I was in a hurry.)
First of all, you cannot (in any obvious way, at least) use Language.Haskell.Interpreter from the hint package to do this. The functions in that module are used to read in and run Haskell code, not arbitrary structured data.
For reading in structured data, you will need a parser of some sort. Here are some of the options that you have:
1. Use a parser generator such as Happy.
2. Use a parser-combinator library such as uu-parsinglib or parsec.
3. Directly implement your own parser.
4. Make use of automatically derived instances of the Read class.
Ad 1. and 2.
If the format of the data you need to read in is nontrivial or if you need helpful error messages in case parsing fails, I'd recommend to go for 1. or 2. and to consult the documentation of the respective tools and libraries. Be aware that you will need some time to get accustomed to their main concepts and interfaces.
Ad 3.
If the format of your data is simple enough (as it is in your example) and if extensive error reporting is not high on your wish list, you can easily roll your own parser.
In your example, a config file is essentially a listing of keys and values, separated by newlines. In Haskell, we can represent such a listing by a list of pairs of strings:
type Config = [(String, String)]
"Parsing" a configuration then reduces to: (1) splitting the input string in lines, (2) splitting each line in words, (3) selecting from each line the first and the third word:
readConfig :: String -> Config
readConfig s =
  [(key, val) | line <- lines s, let (key : _ : val : _) = words line]
To retrieve an entry from a parsed configuration file, we can then use a function get:
get :: String -> (String -> a) -> Config -> a
get key f config = case lookup key config of
  Nothing -> error ("get: not found: " ++ key)
  Just x -> f x
This function takes as its first argument the key of the entry and as its second argument a function that converts the raw value string into something of the appropriate type. For purely textual configuration values we can simply pass the identity function to get:
inputFile, outputFile, datefmt :: Config -> String
inputFile = get "inputFile" id
outputFile = get "outputFile" id
datefmt = get "datefmt" id
For integer entries we can use read:
numRecords, numFields :: Config -> Int
numRecords = get "numRecords" read
numFields = get "numFields" read
Perhaps these patterns are common enough to be factored out into their own dedicated versions of get:
getS :: String -> Config -> String
getS key = get key id
getR :: Read a => String -> Config -> a
getR key = get key read
inputFile', outputFile', datefmt' :: Config -> String
inputFile' = getS "inputFile"
outputFile' = getS "outputFile"
datefmt' = getS "datefmt"
numRecords', numFields' :: Config -> Int
numRecords' = getR "numRecords"
numFields' = getR "numFields"
As an example, here is a program that reads in a configuration file and prints the value for "outputFile":
main :: IO ()
main = do
  s <- readFile "config.txt"
  let config = readConfig s
  putStrLn (outputFile config)
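The hand-rolled parser above fails at run time on blank lines, missing keys, or non-numeric values. As a hedged variation on the same idea, the sketch below skips lines that are not exactly of the form key = value and returns Maybe from the lookups. It redefines readConfig, getS and getR with different types than above, and it assumes values contain no spaces.

```haskell
import Text.Read (readMaybe)

type Config = [(String, String)]

-- Keep only lines made of exactly three words: key, "=", value.
-- Blank or malformed lines are skipped instead of causing a pattern-match failure.
readConfig :: String -> Config
readConfig s = [(key, val) | line <- lines s, [key, "=", val] <- [words line]]

-- Textual entry: Nothing if the key is absent.
getS :: String -> Config -> Maybe String
getS = lookup

-- Parsed entry: Nothing if the key is absent or the value does not parse.
getR :: Read a => String -> Config -> Maybe a
getR key config = lookup key config >>= readMaybe
```

For example, getR "numRecords" config :: Maybe Int gives Just 10 for the file in the question, and Nothing when the key is missing or the value is not a number.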
Ad 4.
If you can control the format of the configuration file, you can introduce a new datatype for holding configuration data and have Haskell automatically derive an instance of the class Read for it. For instance:
data Config = Config
  { numRecords :: Int
  , numFields :: Int
  , inputFile :: String
  , outputFile :: String
  , datefmt :: String
  } deriving Read
Now you will need to make sure that your configuration files match the expected format. For instance:
Config
  { numRecords = 10
  , numFields = 3
  , inputFile = "/home/user1/project/indata.file"
  , outputFile = "/home/user1/project/outdata.file"
  , datefmt = "ddmmyyyy"
  }
As an example, here is the program that prints the value for "outputFile":
main :: IO ()
main = do
  s <- readFile "config.txt"
  let config = read s
  putStrLn (outputFile config)
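Since read aborts the program on a malformed file, a slightly more defensive sketch of the same approach uses readMaybe and returns an Either. The function parseConfig and its error message are assumptions for this example; the Config type is the one from above with Show and Eq added for convenience.

```haskell
import Text.Read (readMaybe)

data Config = Config
  { numRecords :: Int
  , numFields :: Int
  , inputFile :: String
  , outputFile :: String
  , datefmt :: String
  } deriving (Read, Show, Eq)

-- Hypothetical helper: parse the file contents with the derived Read
-- instance, reporting failure as a value instead of an exception.
parseConfig :: String -> Either String Config
parseConfig s = maybe (Left "malformed configuration") Right (readMaybe s)
```

The caller can then pattern match on the result of parseConfig applied to the file contents and print the Left message instead of crashing.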