I have the following code, which works fine unless the file contains UTF-8 characters:
module Main where
import Ref
main = do
  text <- getLine
  theInput <- readFile text
  writeFile ("a"++text) (unlist . proc . lines $ theInput)
With UTF-8 characters I get this:
hGetContents: invalid argument (invalid byte sequence)
Since the file I'm working with has UTF-8 characters, I would like to handle this exception in order to reuse the functions imported from Ref if possible.
Is there a way to read a UTF-8 file as IO String so I can reuse the functions from my Ref module? What modifications should I make to my code? Thanks in advance.
Here are the function declarations from my Ref module:
unlist :: [String] -> String
proc :: [String] -> [String]
From the Prelude:
lines :: String -> [String]
This can be done with just GHC's basic (but extended from the standard) System.IO module, although you'll then have to use more functions:
module Main where
import Ref
import System.IO
main = do
  text <- getLine
  inputHandle <- openFile text ReadMode
  hSetEncoding inputHandle utf8
  theInput <- hGetContents inputHandle
  outputHandle <- openFile ("a"++text) WriteMode
  hSetEncoding outputHandle utf8
  hPutStr outputHandle (unlist . proc . lines $ theInput)
  hClose outputHandle -- Closes the handle and flushes buffered output; better not to rely on this happening implicitly.
Thanks for the answers, but I found the solution by myself.
Actually, the file I was working with had this encoding:
ISO-8859 text, with CR line terminators
To work with that file in my Haskell code, it should have this encoding instead:
UTF-8 Unicode text, with CR line terminators
You can check a file's encoding with the file utility like this:
$ file filename
To change the file's encoding, follow the instructions from this link!
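If converting the file on disk is inconvenient, the System.IO functions from the answer above can also decode the file directly. The following is only a sketch: it assumes the "ISO-8859 text" reported by file is really ISO-8859-1 (Latin-1), and the helper names readLatin1File and writeUtf8File are made up for illustration.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: read a whole file, decoding it as ISO-8859-1.
-- Assumption: the "ISO-8859 text" reported by `file` is really Latin-1.
readLatin1File :: FilePath -> IO String
readLatin1File path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h latin1
    contents <- hGetContents h
    -- Force the lazy string before withFile closes the handle.
    _ <- evaluate (length contents)
    return contents

-- Hypothetical helper: write a String to a file, encoded as UTF-8.
writeUtf8File :: FilePath -> String -> IO ()
writeUtf8File path str =
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf8
    hPutStr h str
```

With these, the original pipeline could become readLatin1File text >>= writeUtf8File ("a" ++ text) . unlist . proc . lines, leaving the Ref functions untouched.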
Use System.IO.Encoding (from the encoding package).
The lack of Unicode support is a well-known problem with the standard Haskell IO library.
{-# LANGUAGE ImplicitParams #-}
module Main where
import Prelude hiding (readFile, getLine, writeFile)
import Ref
import System.IO.Encoding
import Data.Encoding.UTF8
main = do
  let ?enc = UTF8
  text <- getLine
  theInput <- readFile text
  writeFile ("a" ++ text) (unlist . proc . lines $ theInput)
Haskell programming
how to use readFile
I use getLine, but that requires the user to interact for each command.
Instead, I need to read lines from a text file and process each input line:
text <- readFile "input.txt"
let linii = lines text
interact (unlines . (map calculate) . linii)
interact does IO using stdin and stdout. If you want to use a file for input instead of stdin, that's fine: you can use the readFile function, as you are already doing:
applyOnFileLines :: FilePath -> (String -> String) -> IO ()
applyOnFileLines filePath func = do
  file <- readFile filePath
  putStr . unlines
         . map func
         . lines
         $ file
Let's say I have a file
mary had a little lamb
It's fleece was white as snow
Everywhere
the child went
The lamb, the lamb was sure to go, yeah
How would I read the file as a string, and remove the trailing and leading whitespace? It could be spaces or tabs. It would print like this after removing whitespace:
mary had a little lamb
It's fleece was white as snow
Everywhere
the child went
The lamb, the lamb was sure to go, yeah
Here's what I have currently:
import Data.Text as T
readTheFile = do
  handle <- openFile "mary.txt" ReadMode
  contents <- hGetContents handle
  putStrLn contents
  hClose handle
  return(contents)
main :: IO ()
main = do
  file <- readTheFile
  file2 <- (T.strip file)
  return()
Your code suggests a few misunderstandings about Haskell so let's go through your code before getting to the solution.
import Data.Text as T
You're using Text, great! I suggest you also use the IO operations that read and write Text values instead of the ones provided by the Prelude, which work on Strings (linked lists of characters). That is, import Data.Text.IO as T.
readTheFile = do
  handle <- openFile "mary.txt" ReadMode
  contents <- hGetContents handle
  putStrLn contents
  hClose handle
  return(contents)
Oh, hey, the use of hGetContents and manually opening and closing a file can be error prone. Consider readTheFile = T.readFile "mary.txt".
main :: IO ()
main = do
  file <- readTheFile
  file2 <- (T.strip file)
  return()
Two issues here.
Issue one: notice that you have used strip as though it were an IO action, but it isn't. I suggest you learn more about IO and binding (do notation) vs let-bound variables. strip computes a new value of type Text, and presumably you want to do something useful with that value, like write it.
Issue two: stripping the whole file is different from stripping each line one at a time. I suggest you read mathk's answer.
So in the end I think you want:
-- Qualified imports are accessed via `T.someSymbol`
import qualified Data.Text.IO as T
import qualified Data.Text as T
-- Not really needed as a separate function unless you want to also
-- put the stripping here too.
readTheFile :: IO T.Text
readTheFile = T.readFile "mary.txt"
-- First read, then strip each line, then write a new file.
main :: IO ()
main =
  do file <- readTheFile
     let strippedFile = T.unlines $ map T.strip $ T.lines file
     T.writeFile "newfile.txt" (T.strip strippedFile)
Here is a possible solution for what you are looking for:
import qualified Data.Text as T
import qualified Data.Text.IO as T
main = do
  trimmedFile <- (T.unlines . map T.strip . T.lines) <$> T.readFile "mary.txt"
  T.putStr trimmedFile
strip from Data.Text is doing the job.
Read the whole file, or process it one line at a time, then split each line into words and rejoin them:
> intercalate " " . words $ " The lamb, the lamb was sure to go, yeah "
"The lamb, the lamb was sure to go, yeah"
But unwords is better than intercalate " ": it needs no separator argument, and it does not have to be imported.
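One difference between the two approaches is worth spelling out: words followed by unwords also collapses whitespace inside the line, whereas stripping only touches the ends. A small String-based sketch of both, where the names trim and squeeze are chosen here purely for illustration:

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Remove leading and trailing whitespace, keeping interior spacing intact.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- Split into words and rejoin them with single spaces.
-- This also collapses any run of whitespace inside the line.
squeeze :: String -> String
squeeze = unwords . words
```

Applied per line, for example with unlines . map trim . lines, trim behaves like T.strip on each line, while squeeze is only appropriate if interior spacing does not matter.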
How do I create a program that reads a line from a file, parses it as an Int and prints it (ignoring exceptions, of course)? Is there anything like "read" but for IO String?
I've got this so far but I couldn't get around the IO types:
readFromFile = do
  inputFile <- openFile "catalogue.txt" ReadMode
  isbn <- read( hGetLine inputFile)
  hClose inputFile
You can specify the type explicitly, change the read line to
isbn <- fmap read (hGetLine inputFile) :: IO Int
As hGetLine inputFile is of type IO String, you should use fmap to apply read "inside" the IO action and get an Int.
You can use the readFile function to convert your file to a string.
main = do
  contents <- readFile "theFile"
  let value = read $ head $ lines contents :: Int
  print value
You should add better error detection, or this program will fail if there isn't a first line, or if the value is malformed, but this is the basic flow....
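As a rough sketch of what that error detection could look like, readMaybe from Text.Read and listToMaybe from Data.Maybe cover both failure cases without exceptions. The function names parseFirstInt and printFirstInt and the error message are illustrative choices, not part of the original answer.

```haskell
import Data.Maybe (listToMaybe)
import Text.Read (readMaybe)

-- Parse the first line of some file contents as an Int.
-- Nothing means there was no first line or it was not a valid Int.
parseFirstInt :: String -> Maybe Int
parseFirstInt contents = listToMaybe (lines contents) >>= readMaybe

-- Hypothetical wrapper that reads the file and reports the outcome.
printFirstInt :: FilePath -> IO ()
printFirstInt path = do
  contents <- readFile path
  case parseFirstInt contents of
    Just value -> print value
    Nothing    -> putStrLn "no valid integer on the first line"
```

Keeping the parsing in a pure function makes it easy to try in GHCi, for example parseFirstInt "42\nrest" or parseFirstInt "".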
First, observe that reading stuff and then immediately printing it can result in mysterious errors:
GHCi, version 8.0.0.20160421: http://www.haskell.org/ghc/ :? for help
Prelude λ read "123"
*** Exception: Prelude.read: no parse
The reason is that you don't specify what type you want to read. You can counter this by using type annotations:
Prelude λ read "123" :: Integer
123
but it is sometimes easier to introduce a little helper function:
Prelude λ let readInteger = read :: String -> Integer
Prelude λ readInteger "123"
123
Now to the main problem. read( hGetLine inputFile) doesn't work because hGetLine inputFile returns an IO String and read needs a String. This can be solved in two steps:
line <- hGetLine inputFile
let isbn = readInteger line
Note the two different constructs, <- and let .. =; they do different things. Can you figure out exactly what?
As shown in another answer, you can do it in a less verbose manner:
isbn <- fmap readInteger (hGetLine inputFile)
which is great if you do a simple thing like read. But it is often desirable to explicitly name intermediate results. You can use <- and let .. = constructs in such cases.
When I try to run this code...
module Main where
import qualified Data.Text.Lazy.IO as LTIO
import qualified Data.Text.Lazy as LT
import System.IO (IOMode(..), withFile)
getFirstLine :: FilePath -> IO String
getFirstLine path =
  withFile path ReadMode (\f -> do
    contents <- LTIO.hGetContents f
    return ("-- "++(LT.unpack . head $ LT.lines contents)++" --"))
main::IO()
main = do
  firstLine <- getFirstLine "/tmp/foo.csv"
  print firstLine
I get
"-- *** Exception: Prelude.head: empty list
... where I would expect it to print the first line of "/tmp/foo.csv". Could you please explain why? Ultimately, I'm trying to figure out how to create a lazy list of Texts from file input.
As Daniel Lyons mentions in a comment, this is due to IO and laziness interacting.
Imagine, if you will:
1. withFile opens the file, giving file handle f.
2. A thunk that uses the contents of f is returned.
3. withFile closes the file.
4. The thunk is evaluated. There are no contents in a closed file.
This trap is mentioned on the HaskellWiki / Maintaining laziness page.
To fix, you can either read the whole file contents within withFile (possibly by forcing it with seq) or lazily close the file instead of using withFile.
I think it's like this: withFile closes the file after executing the function. hGetContents reads the contents lazily (lazy IO), and by the time it needs to read the data, the file is closed.
Instead of using withFile, try just using openFile and not closing the handle. hGetContents puts the handle into a semi-closed state, and it is closed automatically once all the contents have been read. Or better, just read the contents directly using readFile.
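For the specific goal of getting only the first line, one way to stay with withFile is to do the read strictly while the handle is still open. The sketch below swaps the lazy Text API from the question for strict Data.Text.IO.hGetLine, so it does not produce a lazy list of Texts; the Maybe result for an empty file is a design choice made here, not something from the question.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import System.IO (IOMode (..), hIsEOF, withFile)

-- Read only the first line, strictly, while the handle is still open.
-- Returns Nothing for an empty file instead of failing.
getFirstLine :: FilePath -> IO (Maybe T.Text)
getFirstLine path =
  withFile path ReadMode $ \h -> do
    eof <- hIsEOF h
    if eof
      then return Nothing
      else Just <$> TIO.hGetLine h
```

Because hGetLine returns a fully read strict Text, nothing is left to evaluate after withFile has closed the handle.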
So I'm attempting to make a program that reads from a handle and writes to stdout, like so:
import IO
import System
cat w = do
  fromHandle <- getAndOpenFile w ReadMode
  contents <- hGetContents fromHandle
  putStr contents
  putStr "Done."
getAndOpenFile :: String -> IOMode -> IO Handle
getAndOpenFile name mode =
  do
    catch (openFile name mode)
          (\_ -> do
            putStrLn ("Cannot open "++ name ++"\n")
            return())
I'm fairly new to Haskell and it seems like this should be far simpler than I'm making it for myself. Any suggestions to help me move further?
The usage would be ./cat "foo.txt", which would print the text in the foo.txt file to stdout.
The following function does what you want:
readFile :: FilePath -> IO String
Bind its result and pass it to putStr to print the file's contents.
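Putting that together with the error message from the question, a minimal sketch could look like the following. It uses try from Control.Exception in place of the old Haskell 98 catch; readFileSafe and catFile are names invented for this example.

```haskell
import Control.Exception (IOException, try)
import System.IO (hPutStrLn, stderr)

-- Hypothetical helper: Left holds the IOException if the file cannot be opened.
-- Note: readFile is lazy, so only failures to open the file are caught here.
readFileSafe :: FilePath -> IO (Either IOException String)
readFileSafe = try . readFile

-- Print a file to stdout, or report on stderr why it could not be opened.
catFile :: FilePath -> IO ()
catFile name = readFileSafe name >>= either report putStr
  where
    report err = hPutStrLn stderr ("Cannot open " ++ name ++ ": " ++ show err)
```

To get the ./cat "foo.txt" usage, main could be written as getArgs >>= mapM_ catFile, with getArgs imported from System.Environment.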