How can I get unicode output in Haskell - haskell

I have a main function that outputs unicode which looks like this:
main = do
  hSetEncoding stdout utf8
  input <- getContents
  mapM_ putStr $ myfunc input
How can I write this function without do notation?
I get <stdout>: commitBuffer: invalid argument (invalid character) when I run this version of main:
main = getContents >>= mapM_ putStr . myfunc

Just sequence the actions with (>>):
main = do
  hSetEncoding stdout utf8
  input <- getContents
  mapM_ putStr $ myfunc input
~~>
main = hSetEncoding stdout utf8 >> getContents >>= \input -> mapM_ putStr $ myfunc input
~~>
main = hSetEncoding stdout utf8 >> getContents >>= mapM_ putStr . myfunc
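For reference, here is a minimal complete sketch of the point-free version. The body of myfunc is a hypothetical stand-in (the question never shows it); the only point is that hSetEncoding is sequenced before any output happens.

```haskell
import System.IO (hSetEncoding, stdout, utf8)

-- Hypothetical stand-in for the asker's myfunc: one output chunk per input line.
myfunc :: String -> [String]
myfunc = map (++ "\n") . lines

main :: IO ()
main =
  -- (>>) runs hSetEncoding first and discards its () result;
  -- (>>=) then feeds the contents of stdin to the printing pipeline.
  hSetEncoding stdout utf8 >> getContents >>= mapM_ putStr . myfunc
```

Since >> and >>= have the same precedence and associate to the left, this parses as (hSetEncoding stdout utf8 >> getContents) >>= (mapM_ putStr . myfunc).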

Related

Strange monadic behaviour

I tried to write a program which takes filepaths as (command line) arguments and returns the first line of each file
main = getArgs >>= ( mapM_ ( \file -> ( openFile file ReadMode >>= ( (\handle -> hGetLine handle >>= print) >> hClose ) ) ) )
I know that this doesn't look very beautiful, but I am just a beginner in Haskell. I also avoided do notation on purpose because I just don't feel very comfortable with it (yet).
The code above compiles and reports an error for invalid file paths, but prints nothing (in particular, not the first line of the file) for valid paths.
I must confess that I have pretty much no idea what I did wrong, but I made the following observation:
If I add the following to check which parts still get executed
main = getArgs >>= ( mapM_ ( \file -> ( openFile file ReadMode >>= ( (\handle -> hGetLine handle >>= print) >> (const $ putStr "Hello1") >> hClose >> (const $ putStr "Hello2") ) ) ) )
the program prints only the second "Hello". This reminded me of the type signature of (>>):
(>>) :: Monad m => m a -> m b -> m b
so taking into perspective that only something of the type of the second argument gets returned, maybe the first argument is just ignored?
But the first argument against this theory is that such a function would not seem to be very useful (at least not in the context of the IO Monad), and the second is that the program
main = (putStr "Hello" >> putStr "World" >> putStr "!")
returns 'HelloWorld!' as expected. Hence I must be completely on the wrong track, which is why I came here.
Thanks for your help!
I think your main error is in how you pass the handle around:
main = getArgs >>= (mapM_ (\file -> (openFile file ReadMode >>= (\handle -> (hGetLine handle >>= print) >> hClose handle) ) ) )
The way you wrote it, >> was used in the monad of functions taking the handle, (->) Handle (the reader monad; there is a Monad instance for (->) r for any fixed r), not in IO!
So the handle was indeed passed to both \handle -> hGetLine handle >>= print and hClose, but >> discarded the IO action produced by the first and returned the hClose one as its result.
Here the "effect" being sequenced was passing the handle!
So yes, in the end the only IO effect executed was closing the file.
It's subtle and not obvious, as you seldom see or think about the reader monad instance like this.
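To see the function-monad behaviour in isolation, here is a small sketch (the names readerSeq and action are made up for illustration). At type r -> a, f >> g is just \r -> g r, so whatever f would have produced is thrown away without being run:

```haskell
-- (>>) used at the function ("reader") monad instead of IO.
readerSeq :: Int -> Int
readerSeq = (+ 1) >> (* 2)   -- the result of (+ 1) is discarded

main :: IO ()
main = do
  print (readerSeq 5)
  -- Both lambdas receive the argument, but only the IO action built by the
  -- second one is returned, and therefore only that one is executed.
  let action = (\s -> putStrLn ("first: " ++ s)) >> (\s -> putStrLn ("second: " ++ s))
  action "x"
```

This prints 10 and then only "second: x", which mirrors why only hClose (and only "Hello2") had any effect in the question.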
Here is the same program with do notation:
main = do
  args <- getArgs
  mapM_ (\file -> do
    handle <- openFile file ReadMode
    line <- hGetLine handle
    print line
    hClose handle) args
I'd also suggest switching to forM_ (from Control.Monad), which takes the args list first:
main = do
  args <- getArgs
  forM_ args (\file -> do
    handle <- openFile file ReadMode
    line <- hGetLine handle
    print line
    hClose handle)
Now you should make sure the handle is closed even if an exception is thrown. You can use bracket from Control.Exception for this:
main = do
  args <- getArgs
  forM_ args $ \file ->
    bracket
      (openFile file ReadMode)
      hClose
      (\h -> do
        line <- hGetLine h
        print line)
Or (as this is very common) just use withFile from System.IO, which does the opening and closing for you:
main = do
  args <- getArgs
  forM_ args $ \file ->
    withFile file ReadMode $ \h -> do
      line <- hGetLine h
      print line
Finally, you don't really need the handle at all: you can use the (lazy) readFile instead, and be a bit safer with empty files too:
main = do
  args <- getArgs
  forM_ args $ \file -> do
    content <- readFile file
    let ls = lines content
    case ls of
      [] -> putStrLn "no line in file"
      (firstLine:_) -> putStrLn firstLine

Couldn't match expected type ‘IO [String]’ with actual type ‘[String]’

I have these two code snippets, which I'd guess do the same thing, but they don't. Why is that?
This one works fine:
fdup :: String -> IO ()
fdup filename = do
  h <- openFile filename ReadMode
  c <- hGetContents h
  putStr $ unlines $ parse $ lines c
  hClose h
This one returns an error Couldn't match expected type ‘IO [String]’ with actual type ‘[String]’:
fdup' :: String -> IO ()
fdup' filename = do
  h <- openFile filename ReadMode
  c <- hGetContents h
  ls <- lines c
  putStr $ unlines $ parse $ ls
  hClose h
parse :: [String] -> [String]
What is the difference between them?
As Willem Van Onsem explained, you don't need <- in that specific place because lines c is just a list of strings, and not an IO computation. If you want to give it a name, you can use a let-binding instead:
fdup' :: String -> IO ()
fdup' filename = do
  h <- openFile filename ReadMode
  c <- hGetContents h
  let ls = lines c
  putStr $ unlines $ parse $ ls
  hClose h

hGetContents hangs when getting contents utf-8 file

I'm parsing files from a git repository and, while planning to use the gitlib module for that, I'm getting the file contents using the git executable for now - until I find some tutorial or have time to dive into gitlib's code.
I have a function that essentially runs "git show" for a specific file at a specific commit and returns its contents. Here is a full working example.
import System.IO
import System.Process
import System.Exit
main = do
  let commit = Commit { hash = "811e22679008298176d8be24eedc65f9e8c4900b", time = ""}
  fileIO <- showFileIO "/path/to/the/repo" (commit, "/path/to/the/file")
  putStr (show fileIO)
showFileIO :: String -> (Commit, String) -> IO (Commit, String, String)
showFileIO directory (commit, filepath) = do
  (_, Just hout, Just herr, procHandle) <- createProcess $ createCommand command directory
  hSetEncoding hout utf8
  hSetEncoding herr utf8
  exitCode <- waitForProcess procHandle
  stdOut <- hGetContents hout
  stdErr <- hGetContents herr
  if exitCode == ExitSuccess
    then return (commit, filepath, stdOut)
    -- Continue in the case of an error.
    else return (commit, filepath, "")
  where command = "git show " ++ (hash commit) ++ ":" ++ filepath
createCommand :: String -> FilePath -> CreateProcess
createCommand command directory = (shell command){std_out = CreatePipe, std_err = CreatePipe, cwd = Just directory}
-- Where Commit is defined as:
data Commit = Commit { hash :: String
                     , time :: String
                     } deriving (Show)
I was initially getting some errors ("invalid byte sequence") when getting the contents of a php file with mime-type "text/x-php" and charset "utf-8", and that was resolved when I set the encoding of the Handles to utf8. There is another file with mime-type "text/html" that is actually a html.twig file (Twig templating engine) with charset "utf-8". Now the function hangs indefinitely when trying to get the contents of this file. It works fine for other files.
Any ideas what could be wrong? How do I even get to debug in Haskell something that does not give me an error or any info? Are there any debugging tools that could help with that?
I would try something like this: (untested)
-- needs: import Control.Exception (evaluate)
showFileIO directory (commit, filepath) = do
  (_, Just hout, Just herr, procHandle) <- createProcess $ createCommand command directory
  hSetEncoding hout utf8
  hSetEncoding herr utf8
  stdOut <- hGetContents hout
  evaluate (length stdOut) -- strictify the above lazy IO
  stdErr <- hGetContents herr
  evaluate (length stdErr)
  exitCode <- waitForProcess procHandle
  if exitCode == ExitSuccess
  ...
Alternatively, use some strict-IO variant of hGetContents.
Note that there still is, as far as I can see, some window for deadlock. If the command produces a vast amount of data on stderr, then the command & OS buffers will become full and writes to stderr will block. Since the Haskell consumer now first waits for stdout to be consumed completely, we have a deadlock. Note that this will not be an issue for "short" error messages.
If we want to make it more robust, we need to read from both stdout and stderr at the same time. E.g.
-- needs: import Control.Concurrent (forkIO, newEmptyMVar, putMVar, takeMVar)
--        import Control.Exception (evaluate)
showFileIO directory (commit, filepath) = do
  (_, Just hout, Just herr, procHandle) <- createProcess $ createCommand command directory
  hSetEncoding hout utf8
  hSetEncoding herr utf8
  stdOutV <- newEmptyMVar
  stdErrV <- newEmptyMVar
  forkIO $ do
    stdOut <- hGetContents hout
    evaluate (length stdOut)
    putMVar stdOutV stdOut
  forkIO $ do
    stdErr <- hGetContents herr
    evaluate (length stdErr)
    putMVar stdErrV stdErr
  stdOut <- takeMVar stdOutV
  stdErr <- takeMVar stdErrV
  exitCode <- waitForProcess procHandle
  if exitCode == ExitSuccess
  ...
Update. This should also work, and is much simpler.
showFileIO directory (commit, filepath) = do
  (_, Just hout, Just herr, procHandle) <- createProcess $ createCommand command directory
  hSetEncoding hout utf8
  hSetEncoding herr utf8
  stdOut <- hGetContents hout
  stdErr <- hGetContents herr
  -- forkIO expects an IO (), so discard the Int that evaluate returns
  forkIO $ evaluate (length stdOut) >> return ()
  evaluate (length stdErr)
  exitCode <- waitForProcess procHandle
  if exitCode == ExitSuccess
  ...
I wouldn't be surprised if there were some library function doing all of this for you, but I can't remember anything at the moment.
Unrelated: I prefer proc to shell for constructing the CreateProcess options. The latter requires careful escaping of filenames (spaces, special characters), while the former simply takes a list of string arguments.
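As a possible candidate for such a library function: System.Process has readCreateProcessWithExitCode (in process 1.2.3.0 and later, as far as I know), which drains stdout and stderr concurrently and then waits for the process. Below is a rough sketch combining it with proc. The names gitShowArgs and gitShow are made up for illustration, and note one assumption: the pipes here use the default locale encoding, so this only matches the hSetEncoding utf8 version if your locale is UTF-8.

```haskell
import System.Exit (ExitCode (..))
import System.Process (CreateProcess (..), proc, readCreateProcessWithExitCode)

-- Hypothetical helper: the argument list for "git show <hash>:<path>".
-- With proc, no shell quoting or escaping is needed.
gitShowArgs :: String -> FilePath -> [String]
gitShowArgs commitHash filepath = ["show", commitHash ++ ":" ++ filepath]

-- Hypothetical helper: file contents at the given commit, or "" on failure.
gitShow :: FilePath -> String -> FilePath -> IO String
gitShow directory commitHash filepath = do
  let cmd = (proc "git" (gitShowArgs commitHash filepath)) { cwd = Just directory }
  -- The second argument is the text to feed to the process's stdin.
  (exitCode, out, _err) <- readCreateProcessWithExitCode cmd ""
  return (if exitCode == ExitSuccess then out else "")

main :: IO ()
main = gitShow "/path/to/the/repo" "811e22679008298176d8be24eedc65f9e8c4900b" "/path/to/the/file" >>= putStr
```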

How to use a value produced in another do block?

I just watched a video on Haskell so I tried to play a little bit with it but I can't get to understand this (In short I want to print one random value):
import System.Random
import System.IO
randomNum = do
  gen <- newStdGen
  let ns = randoms gen :: [Int]
  let val = take 10 ns
  print $ head val
writeToFile = do
  theFile <- openFile "test.txt" WriteMode
  let val = randomNum;
  hPutStrLn theFile ("Random number " ++ randomNum)
  hClose theFile
readFromFile = do
  theFile2 <- openFile "test.txt" ReadMode
  contents <- hGetContents theFile2
  putStr contents
  hClose theFile2
The randomNum seems to work fine but when I try to put that on writeToFile it triggers an error. What can I do?
Thanks in advance!
EDIT: The error I get in the beginning is:
Prelude> :r
[1 of 1] Compiling Main ( haskell.hs, interpreted )
haskell.hs:207:48:
No instance for (Show (IO ())) arising from a use of `show'
In the second argument of `(++)', namely `show randomNum'
In the second argument of `hPutStrLn', namely
`("Random number " ++ show randomNum)'
In a stmt of a 'do' block:
hPutStrLn theFile ("Random number " ++ show randomNum)
Failed, modules loaded: none.
It looks like what you need is
randomNum = do
  gen <- newStdGen
  return (head (randoms gen :: [Int]))
writeToFile = do
  theFile <- openFile "test.txt" WriteMode
  val <- randomNum
  hPutStrLn theFile ("Random number " ++ show val)
  hClose theFile
You could try this instead:
import System.Random
import System.IO
writeToFile = do
  gen <- newStdGen
  let ns = randoms gen :: [Int]
  let val = head ns
  theFile <- openFile "test.txt" WriteMode
  hPutStrLn theFile ("Random number " ++ show val)
  hClose theFile
readFromFile = do
  theFile2 <- openFile "test.txt" ReadMode
  contents <- hGetContents theFile2
  putStr contents
  hClose theFile2
One problem was that the do block in your randomNum did not return a value; rather, it performed the action you told it to do: print a random number. As an alternative, see Louis Wasserman's answer for a way to make randomNum actually return a value. In this answer's code, the random number generation was just moved into writeToFile.
Also notice that I shortened the code to get one random value: ns is already a list, so you can take its head right away. The take 10 was redundant.
Finally, val is an Int, which cannot be concatenated directly onto a string. Using show val converts it to a String, which can then be concatenated with "Random number ".

I try for lazy I/O, but entire file is consumed

I am a Haskell newbie. I want to read only N characters of a text file into memory. So I wrote this code:
main :: IO ()
main = do
  inh <- openFile "input.txt" ReadMode
  transformedList <- Control.Monad.liftM (take 4) $ transformFileToList inh
  putStrLn "transformedList became available"
  putStrLn transformedList
  hClose inh
transformFileToList :: Handle -> IO [Char]
transformFileToList h = transformFileToListAcc h []
transformFileToListAcc :: Handle -> [Char] -> IO [Char]
transformFileToListAcc h acc = do
  readResult <- tryIOError (hGetChar h)
  case readResult of
    Left e -> if isEOFError e then return acc else ioError e
    Right c -> do
      let acc' = acc ++ [transformChar c]
      putStrLn "got char"
      unsafeInterleaveIO $ transformFileToListAcc h acc'
My input file has several lines, with the first one being "hello world", and when I run this program, I get this output:
got char
transformedList became available
got char
["got char" a bunch of times]
hell
My expectation is that "got char" happens only 4 times. Instead, the entire file is read, one character at a time, and only THEN the first 4 characters are taken.
What am I doing wrong?
I admit I don't understand how unsafeInterleaveIO works, but I suspect the problem here is somehow related to it. Maybe you wrote this example in order to understand unsafeInterleaveIO, but if I were you I'd avoid using it directly. Here is how I'd do it in your particular case.
main :: IO ()
main = do
  inh <- openFile "input.txt" ReadMode
  charList <- replicateM 4 $ hGetChar inh
  let transformedList = map transformChar charList
  putStrLn "transformedList became available"
  putStrLn transformedList
  hClose inh
This should just read the first 4 characters of the file.
If you are looking for a truly effectful streaming solution, I'd look into pipes or conduit instead of unsafeInterleaveIO.
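For what it's worth, my reading of why the original version consumes the whole file: the accumulator only becomes the result at EOF, so no list cell exists until every character has been read, and forcing the result for take 4 runs the entire chain of deferred reads. Below is a sketch of a variant that returns each character in front of a deferred tail instead. The name lazyChars is made up, transformChar is left out, and this is an illustration under those assumptions rather than a recommendation to use unsafeInterleaveIO.

```haskell
import System.IO
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical helper: lazily read characters from a handle.
-- Each cons cell is produced as soon as its character is read,
-- and the rest of the list is another deferred read.
lazyChars :: Handle -> IO String
lazyChars h = unsafeInterleaveIO $ do
  eof <- hIsEOF h
  if eof
    then return []
    else do
      c <- hGetChar h
      hPutStrLn stderr "got char"   -- trace on stderr, to keep stdout clean
      rest <- lazyChars h           -- returns immediately with a deferred thunk
      return (c : rest)

main :: IO ()
main = do
  inh <- openFile "input.txt" ReadMode
  firstFour <- fmap (take 4) (lazyChars inh)
  putStrLn firstFour                -- forcing the list here triggers the reads
  hClose inh
```

With this shape, take 4 never forces the tail after the fourth character, so "got char" should be traced only four times. The handle must stay open until the list has been consumed, which is why hClose comes after putStrLn.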
