I have the following problem:
I want to read from a file line by line and write the lines to another file. However, I want to return the number of lines.
Therefore, inside a pure function I would use an accumulator like this:
function parameters=method 0 ......
method accu {end case scenario} =accu
method accu {not end case} = accu+1 //and other stuff
How can I achieve the same in a do-block without using another function?
Concrete example
module Main where
import System.IO
import Data.Char(toUpper)

main::IO()
main=do
    let inHandle=openFile "in.txt" ReadMode
    let outHandle=openFile "out.txt" WriteMode
    inHandle>>= \a ->outHandle>>= \b ->loop a b 0>>=print . show

loop::Handle->Handle->Int->IO Int
loop inh outh cnt=hIsEOF inh>>= \l ->if l then return cnt
                                     else
                                       do
                                         hGetLine inh>>=hPutStrLn outh
                                         loop inh outh (cnt+1)
Edit
Refactored the way loop gets its parameters
P.S 2 (after K. A. Buhr's thorough response)
I. What I really wanted to achieve, was the last expression of the main method. I wanted to take the multiple IO Actions and bind their results to a method. Specifically:
inHandle>>= \a ->outHandle>>= \b ->loop a b 0>>=print . show
What I do not understand in this case is:
If inHandle>>= is supplied to \a -> and then the result is passed to ...>>=\b, do the variables inside the outer scope get closured in \b?
If not, shouldn't it be >>=\a->..>>= \a b? Shouldn't the inner scope hold a parameter corresponding to the result of the outer scope?
Eliminating the do inside the helper method
What I wanted to know, is if there is a way to glue together multiple actions without them being in a do block.
In my case:
loop::Handle->Handle->Int->IO Int
loop inh outh cnt=hIsEOF inh>>= \l ->if l then return cnt
                                     else
                                       do
                                         hGetLine inh>>=hPutStrLn outh
                                         loop inh outh (cnt+1)
Can't I say something like:
if ... then ... else hPutStrLn=<<action1 [something] v2=<<action2 [something] loop inh outh (cnt+1)
where something could be an operator? I do not know, that is why I am asking.
It looks like the answer to your last question still left you confused.
tl;dr: stop using >>= and =<< until you master the do-block notation which you can do by Googling "understanding haskell io" and working through lots of examples from tutorials.
Long answer...
First, I would suggest avoiding the >>= and =<< operators for now. Even though they are sometimes named "bind", they don't bind variables or bind parameters to methods or anything else like that, and they seem to be tripping you up. You may also find the section about IO from "A Gentle Introduction to Haskell" helpful as a quick introduction to how IO works.
Here's a very short explanation of IO that may help you, and it'll provide a basis for answering your question. Google for "understanding haskell io" to get a more in-depth explanation:
Super short IO explanation in three paragraphs:
(1) In Haskell, any value of type IO a is an IO action. An IO action is like a recipe that can be used (by executing the action) to perform some actual input/output and then produce a value of type a. So, a value of type IO String is an action that, if executed, will perform some input/output and produce a value of type String, while an IO () is an action that, if executed, will perform some input/output and produce a value of type (). In the latter case, because values of type () are useless, actions of type IO () are normally executed for their I/O side effects, such as printing a line of output.
(2) The only way to execute an IO action in a Haskell program is to give it the special name main. (The interactive interpreter GHCi provides more ways to execute IO actions, but let's ignore that.)
(3) IO actions can be combined using do-notation into a larger IO action. A do block consists of lines of the following form:
act            -- action to be executed, with the result
               -- thrown away (unless it's the last line)
x <- act       -- action to be executed, with the result
               -- named `x` for later lines
let y = expr   -- add a name `y` for the value of `expr`
               -- for later lines, but this has nothing to
               -- do with executing actions
In the above templates, act can be any expression that evaluates to an IO action (i.e., a value of type IO a for some a). It's important to understand that the do-block does not itself execute any IO actions. Instead, it builds a new IO action that -- when executed -- will execute the given set of IO actions in the order they appear in the do-block, either throwing away or naming the values produced by executing these actions. The value produced by executing the whole do-block will be the value produced by the last line of the do-block (which has to be a line of the first form above).
A simple example
Therefore, if a Haskell program includes:
myAction :: IO ()
myAction = do
  putStrLn "Your name?"
  x <- getLine
  let stars = "***"
  putStrLn (stars ++ x ++ stars)
then this defines a value myAction of type IO (), an IO action. By itself, it does nothing, but if it is ever executed, then it will execute each of the IO actions (values of type IO a for various types a) in the do-block in the order they appear. The value produced by executing myAction will be the value produced by the last line (in this case, the value () of type ()).
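To make the "recipe" idea a little more concrete, here is a minimal sketch (the names greetTwice and greet are made up for illustration). Giving an action a name with let does not run it, and the same action value can be executed more than once:

```haskell
-- Hypothetical names, for illustration only.
greetTwice :: IO ()
greetTwice = do
  let greet = putStrLn "hello"  -- only names an action; nothing is printed here
  greet                         -- first execution of the action
  greet                         -- second execution of the same action value
```

With main = greetTwice, running the program should print hello twice, even though putStrLn "hello" is written only once.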
Applied to the problem of copying lines
Armed with this explanation, let's tackle your question. First, how do we write a Haskell program to copy lines from one file to another using a loop, ignoring the problem of counting lines? Here's one way that's fairly similar to your code example:
import System.IO

myAction :: IO ()
myAction = do
  inHandle <- openFile "in.txt" ReadMode
  outHandle <- openFile "out.txt" WriteMode
  loop inHandle outHandle
  hClose outHandle
  hClose inHandle
Here, if we check the type of one of these openFile calls in GHCi:
> :t openFile "in.txt" ReadMode
openFile "in.txt" ReadMode :: IO Handle
>
we see that it has type IO Handle. That is, this is an IO action that, when executed, performs some actual I/O (namely an operating system call to open a file) and then produces a value of type Handle, which is the Haskell value representing the open file handle. In your original version, when you wrote:
let inHandle = openFile "in.txt" ReadMode
all this did was assign a name inHandle to an IO action -- it didn't actually execute the IO action and so didn't actually open the file. In particular, the value of inHandle of type IO Handle was not itself a file handle, just an IO action (or "recipe") for producing a file handle.
In the version of myAction above, we've used the notation:
inHandle <- openFile "in.txt" ReadMode
to indicate that, if and when the IO action named by myAction is ever executed, it will start by executing the IO action openFile "in.txt" ReadMode (that is, the value of that expression, which has type IO Handle), and that execution will produce a Handle which will be named inHandle. Ditto for the next line to produce and name an open outHandle. We will then pass these open handles to loop in the expression loop inHandle outHandle.
Now, loop can be defined like so:
loop :: Handle -> Handle -> IO ()
loop inHandle outHandle = do
  end <- hIsEOF inHandle
  if end
    then return ()
    else do
      line <- hGetLine inHandle
      hPutStrLn outHandle line
      loop inHandle outHandle
It's worth taking a moment to explain this. loop is a function that takes two arguments, each a Handle. When it's applied to two handles, as in the expression loop inHandle outHandle, the resulting value is of type IO (). That means it's an IO action, specifically, the IO action created by the outer do-block in the definition of loop. This do-block creates an IO action that -- when it is executed -- executes two IO actions in order, as given by the lines of the outer do-block. The first line is:
end <- hIsEOF inHandle
which takes the IO action hIsEOF inHandle (a value of type IO Bool), executes it (which consists of asking the operating system if we've reached the end of file for the file represented by handle inHandle), and names the result end -- note that end will be a value of type Bool.
The second line of the do-block is the entire if statement. It produces a value of type IO (), so a second IO action. The IO action depends on the value of end. If end is true, the IO action will be the value of return () which, if executed, will perform no actual I/O and will produce a value () of type (). If end is false, the IO action will be the value of the inner do-block. This inner do-block is an IO action (a value of type IO ()) which, if executed, will execute three IO actions in order:
(1) The IO action hGetLine inHandle, a value of type IO String that, when executed, will read a line from inHandle and produce the resulting String. As per the do-block, this result will be given the name line.
(2) The IO action hPutStrLn outHandle line, a value of type IO () that, when executed, will write line to outHandle.
(3) The IO action loop inHandle outHandle, a recursive use of the IO action produced by the outer do-block, which -- when executed -- starts the whole process over again, starting with the EOF check.
If you put these two definitions (for myAction and loop) in a program, they won't do anything, because they're just definitions of IO actions. The only way to have them execute is to name one of them main, like so:
main :: IO ()
main = myAction
Of course, we could have just used the name main in place of myAction to get the same effect, as in the whole program:
import System.IO

main :: IO ()
main = do
  inHandle <- openFile "in.txt" ReadMode
  outHandle <- openFile "out.txt" WriteMode
  loop inHandle outHandle
  hClose inHandle
  hClose outHandle

loop :: Handle -> Handle -> IO ()
loop inHandle outHandle = do
  end <- hIsEOF inHandle
  if end
    then return ()
    else do
      line <- hGetLine inHandle
      hPutStrLn outHandle line
      loop inHandle outHandle
Take some time to compare this to your "concrete example" above, and see where it's different and where it's similar. In particular, can you figure out why I wrote:
end <- hIsEOF inHandle
if end
  then ...
instead of:
if hIsEOF inHandle
  then ...
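As a hint for this exercise, here is a small sketch: if needs a plain Bool as its condition, while hIsEOF inHandle is a value of type IO Bool (an action), so its result has to be named with <- first. The helper describeEOF is a made-up name used only for illustration.

```haskell
import System.IO

-- Hypothetical helper: reports whether a handle is at end of file.
describeEOF :: Handle -> IO String
describeEOF h = do
  end <- hIsEOF h   -- hIsEOF h :: IO Bool, so end :: Bool
  return (if end then "at end of file" else "more input available")
  -- Writing `if hIsEOF h then ...` would be rejected by the type checker:
  -- the condition would be an IO Bool (an action), not a Bool.
```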
Copying lines with a line count
To modify this program to count lines, a fairly standard way to do it would be to make the count a parameter to the loop function, and have loop produce the final value of the count. Since the expression loop inHandle outHandle is an IO action (above, it's of type IO ()), to have it produce a count we need to give it type IO Int, as you've done in your example. It will still be an IO action but now -- when it is executed -- it'll produce a useful Int value instead of a useless () value.
To make this change, main will have to invoke loop with a starting counter, name the value it produces, and output that value to the user.
To make it absolutely clear: main's value is still an IO action created by a do-block. We're just modifying one of the lines of the do-block. It used to be:
loop inHandle outHandle
which evaluated to a value of type IO () representing an IO action that -- when the whole do-block was executed -- would be executed when its turn came to copy the lines from one file to the other before producing a () value to be thrown away. Now, it's going to be:
count <- loop inHandle outHandle 0
where the right-hand side will evaluate to a value of type IO Int representing an IO action that -- when the whole do-block is executed -- will be executed when its turn comes to copy the lines from one file to the other before producing a count value of type Int to be named count for later do-block steps.
Anyway, the modified main looks like this:
main :: IO ()
main = do
  inHandle <- openFile "in.txt" ReadMode
  outHandle <- openFile "out.txt" WriteMode
  count <- loop inHandle outHandle 0
  hClose inHandle
  hClose outHandle
  putStrLn (show count)   -- could just write `print count`
Now, we rewrite loop to maintain a count (taking the running count as a parameter through recursive calls and producing the final value when the IO action is executed):
loop :: Handle -> Handle -> Int -> IO Int
loop inHandle outHandle count = do
  end <- hIsEOF inHandle
  if end
    then return count
    else do
      line <- hGetLine inHandle
      hPutStrLn outHandle line
      loop inHandle outHandle (count + 1)
The whole program is:
import System.IO

main :: IO ()
main = do
  inHandle <- openFile "in.txt" ReadMode
  outHandle <- openFile "out.txt" WriteMode
  count <- loop inHandle outHandle 0
  hClose inHandle
  hClose outHandle
  putStrLn (show count)   -- could just write `print count`

loop :: Handle -> Handle -> Int -> IO Int
loop inHandle outHandle count = do
  end <- hIsEOF inHandle
  if end
    then return count
    else do
      line <- hGetLine inHandle
      hPutStrLn outHandle line
      loop inHandle outHandle (count + 1)
The rest of your question
Now, you asked about how to use an accumulator within a do-block without using another function. I don't know if you meant without using another function besides loop (in which case the answer above satisfies the requirement) or if you meant without using any explicit loop at all.
If the latter, there are a couple of approaches. First, there are monadic loop combinators available in the monad-loops package that can allow you to do the following (to copy without counting). I've also switched to using withFile in place of explicit open/close calls:
import Control.Monad.Loops
import System.IO

main :: IO ()
main =
  withFile "in.txt" ReadMode $ \inHandle ->
    withFile "out.txt" WriteMode $ \outHandle ->
      whileM_ (not <$> hIsEOF inHandle) $ do
        line <- hGetLine inHandle
        hPutStrLn outHandle line
and you can count lines with a state monad:
import Control.Monad.State
import Control.Monad.Loops
import System.IO

main :: IO ()
main = do
  n <- withFile "in.txt" ReadMode $ \inHandle ->
    withFile "out.txt" WriteMode $ \outHandle ->
      flip execStateT 0 $
        whileM_ (not <$> liftIO (hIsEOF inHandle)) $ do
          line <- liftIO (hGetLine inHandle)
          liftIO (hPutStrLn outHandle line)
          modify succ
  print n
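If pulling in monad-loops and a state monad feels heavy, a sketch that uses only the Prelude is shown below. It is a different technique from the handle-based loops above: it relies on lazy readFile and writeFile, assumes the input and output are different files, and normalises the final line ending (unlines always appends a newline). The name copyCounting is made up for illustration.

```haskell
-- Hypothetical helper: copies src to dst line by line and produces the line count.
-- Assumes src and dst are different files.
copyCounting :: FilePath -> FilePath -> IO Int
copyCounting src dst = do
  contents <- readFile src      -- read lazily
  let ls = lines contents       -- pure: split the text into lines
  writeFile dst (unlines ls)    -- writing the output forces the input to be read
  return (length ls)            -- the count is an ordinary pure value
```

Here the "accumulator" disappears entirely: the count is just the length of a pure list, and the do-block only sequences the read and the write.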
With respect to removing the last do block from the definition of loop above, there's no good reason to do this. It's not like do blocks have overhead or introduce some extra processing pipeline or something. They're just ways of constructing IO action values. So, you could replace:
else do
  line <- hGetLine inHandle
  hPutStrLn outHandle line
  loop inHandle outHandle (count + 1)
with
else hGetLine inHandle >>= hPutStrLn outHandle >> loop inHandle outHandle (count + 1)
but this is a purely syntactic change. The two are otherwise identical (and will almost certainly compile to equivalent code).
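On the scoping question about nested lambdas: a do-block is shorthand for nested >>= calls, and the body of each lambda extends as far to the right as possible, so a name bound by an earlier lambda stays in scope inside the later ones. The sketch below writes the same action both ways. copyLoop is a stand-in for the counting loop above, and copyWithDo and copyWithBind are made-up names so that the snippet is self-contained.

```haskell
import System.IO

-- Stand-in for the counting loop above.
copyLoop :: Handle -> Handle -> Int -> IO Int
copyLoop inh outh n = do
  end <- hIsEOF inh
  if end
    then return n
    else do
      line <- hGetLine inh
      hPutStrLn outh line
      copyLoop inh outh (n + 1)

-- Version written with do-notation.
copyWithDo :: FilePath -> FilePath -> IO Int
copyWithDo src dst = do
  a <- openFile src ReadMode
  b <- openFile dst WriteMode
  n <- copyLoop a b 0
  hClose a
  hClose b
  return n

-- The same action with explicit >>= and lambdas. The body of `\a -> ...`
-- runs to the end of the expression, so `a` is still visible inside `\b -> ...`.
copyWithBind :: FilePath -> FilePath -> IO Int
copyWithBind src dst =
  openFile src ReadMode >>= \a ->
  openFile dst WriteMode >>= \b ->
  copyLoop a b 0 >>= \n ->
  hClose a >>
  hClose b >>
  return n
```

So there is no need for something like \a b: the inner lambda \b -> ... is written inside the body of \a -> ..., and it sees a the same way any nested function sees the variables of its enclosing function.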
Related
Below is my haskell code.
readTableFile :: String -> (Handle -> IO a) -> IO [a]
readTableFile file func = do
    fileHandle <- withFile file ReadMode (\handle -> do
        contents <- readDataFrom handle
        putStr contents)
  where readDataFrom fileHandle = do
      isFileEnd <- hIsEOF fileHandle
      if isFileEnd
          then
              return ("")
          else
              do
                  info <- hGetLine fileHandle
                  putStrLn $ func info
                  readDataFrom fileHandle
But I get an error:
error: parse error on input ‘isFileEnd’
|
270 | isFileEnd <- hIsEOF fileHandle
| ^^^^^^^^^
I don't know why. Please help me
There are a couple of things going on here. As the commenter above pointed out, when a parse error looks surprising, spacing is always the first thing to check. Beyond that, a few other issues are contributing:
Your readTableFile is really just one statement long. You've got a do block in which the only thing you do is bind to fileHandle the value produced by the withFile action. Aside from the fact that withFile produces whatever your handler produces (and not the file handle that your naming might imply), a do block can't end with a <- binding: its last statement has to be an expression, so as written the function isn't actually returning an IO action. Let's clean up some:
readTableFile file func = do
    withFile file ReadMode (\handle -> do
        contents <- readDataFrom handle
        putStr contents)
  where readDataFrom fileHandle = do
      isFileEnd <- hIsEOF fileHandle
      [...]
Now we're returning the right type, but you're still going to get a parse error from the isFileEnd <- assignment. Now that we've cleaned up, you can get your code to compile by moving that (and subsequent lines) to the right of the first character of the readDataFrom declaration:
  where readDataFrom fileHandle = do
          isFileEnd <- hIsEOF fileHandle
          [...]
Your top level do is still redundant, but you'll be past your immediate problems.
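Putting the pieces together, a version that should compile might look like the sketch below. It makes a few assumptions that differ from the question: func is taken to be a String -> String transformation (that is how putStrLn $ func info uses it), the result type is IO (), the redundant top-level do is dropped, and the function is renamed to printTable (a made-up name) to avoid suggesting it matches the original signature.

```haskell
import System.IO

-- Hypothetical rewrite: prints every line of the file after applying func to it.
printTable :: FilePath -> (String -> String) -> IO ()
printTable file func =
  withFile file ReadMode readDataFrom
  where
    readDataFrom h = do
      -- These statements sit to the right of the start of `readDataFrom`,
      -- so they are parsed as part of its do block.
      isFileEnd <- hIsEOF h
      if isFileEnd
        then return ()
        else do
          info <- hGetLine h
          putStrLn (func info)
          readDataFrom h
```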
Hello, I was wondering how you can unwrap a value at a later time in the IO monad.
If a<-expression binds the result to a, then can't I use (<-expression) as a parameter for a given method, e.g.:
method (<-expression) where method accepts the result of the evaluation?
Code
let inh=openFile "myfile" WriteMode
let outh=openFile "out.txt" WriteMode
hPutStrLn (<-outh) ((<-inh)>>=getLine)
I have not reached the Monad chapter yet, just basic <- and do blocks, but I suppose it has to do with monads.
Then if I want to pass the result of the evaluation to hGetLine, can't I use something like:
(<-expression)=>>hGetLine
You already understand that <- more or less unwraps an IO value, but it is really part of the do-notation syntax rather than a standalone operator, so it can only be used as a statement like this. (I'm not sure which result you're trying to achieve, but the following example reads the first line from one file and writes it to another file.)
import System.IO

main = do
    inh <- openFile "myfile" ReadMode
    outh <- openFile "out.txt" WriteMode
    inContent <- hGetLine inh
    hPutStrLn outh inContent
    hClose outh
According to the documentation, hGetLine, hPutStrLn and hClose accept a value of type Handle as an argument, but openFile returns an IO Handle, so we need to unwrap it using <-.
But if you want to use >>= function instead, then this is one of the options of doing it:
import System.IO

writeContentOfMyFile :: Handle -> IO ()
writeContentOfMyFile handler =
    openFile "myfile" ReadMode >>= hGetLine >>= hPutStrLn handler

main =
    withFile "out.txt" WriteMode writeContentOfMyFile
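Note that the >>= version above never explicitly closes the handle it opens for myfile. A sketch that nests two withFile calls, so that both handles are closed when the action finishes (or fails), could look like this. The name copyFirstLine and its two path parameters are made up for illustration.

```haskell
import System.IO

-- Hypothetical helper: copies the first line of src into dst.
copyFirstLine :: FilePath -> FilePath -> IO ()
copyFirstLine src dst =
  withFile src ReadMode $ \inh ->
    withFile dst WriteMode $ \outh ->
      hGetLine inh >>= hPutStrLn outh   -- result of hGetLine is fed to hPutStrLn outh
```

With this definition, main = copyFirstLine "myfile" "out.txt" would play the role of the main above.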
I implemented withFile in Haskell:
withFile' :: FilePath -> IOMode -> (Handle -> IO a) -> IO a
withFile' path iomode f = do
    handle <- openFile path iomode
    result <- f handle
    hClose handle
    return result
When I ran the main provided by Learn You a Haskell, it printed out the content of "girlfriend.txt," as expected:
import System.IO

main = do
    withFile' "girlfriend.txt" ReadMode (\handle -> do
        contents <- hGetContents handle
        putStr contents)
I wasn't sure if my withFile' would've worked with the last 2 lines: (1) closing the handle and (2) returning the result as an IO a.
Why didn't the following happen?
result gets lazily bound to f handle
hClose handle closes the file handle
result gets return'd, which results in the actual evaluation of f handle. Since handle was closed, an error gets thrown.
Lazy IO is popularly known as confusing.
It depends on whether putStr executes before hClose or not.
Notice the difference between the first and second uses (the brackets are unnecessary but clarifying in the second example).
ghci> withFile' "temp.hs" ReadMode (hGetContents >=> putStr) -- putStr
import System.IO
import Control.Monad

withFile' :: FilePath -> IOMode -> (Handle -> IO a) -> IO a
withFile' path iomode f = do
    handle <- openFile path iomode
    result <- f handle
    hClose handle
    return result
ghci> (withFile' "temp.hs" ReadMode hGetContents) >>= putStr
ghci>
In both cases, the f passed in gets a chance to run before the handle is closed. Because of lazy evaluation, hGetContents only reads the file if it needs to, i.e. is forced to in order to produce output for some other function.
In the first example, since f is (hGetContents >=> putStr), the full contents of the file must be read in order to execute putStr.
In the second example, nothing needs to be evaluated after hGetContents in order to return result, which is a lazy list. (I can quite happily return (show [1..]) which will only fail to terminate if I choose to use the entire output.) This is seen as a problem for lazy IO, which is fixed by alternatives such as strict IO, pipes or conduit.
Maybe returning the empty string for a file when the handle was closed prematurely is a bug, but certainly running the entirety of f before closing it is not.
Equational reasoning means that you can reason about Haskell code by just inlining and substituting things (with certain caveats, but they don't apply here).
This means that all I need to do to understand your code is to take the withFile' here:
import System.IO

main = do
    withFile' "girlfriend.txt" ReadMode (\handle -> do
        contents <- hGetContents handle
        putStr contents)
... and inline its definition:
main = do
    handle <- openFile "girlfriend.txt" ReadMode
    contents <- hGetContents handle
    result <- putStr contents
    hClose handle
    return result
Once you inline its definition, it's easier to see what is going on. putStr evaluates the entire contents of the file before you close the handle, so there is no error. Also, result is not what you think it is: it's the return value of putStr, which is just (), not the contents of the file.
Most IO actions are not lazily executed.
IO action execution is different from normal Haskell evaluation of values. IO execution is only ever carried out by the outer driver that is trying to execute all the effects of main; it does so in the correct order implied by the monadic sequencing of IO actions.
The driver's need to know what the next IO action is ultimately triggers all evaluation of lazy values in Haskell; if it were happy with an unevaluated lazy value and moved on to the next thing without fully evaluating and executing it, then it would just leave main unevaluated and no Haskell program could ever do anything.
The Haskell value resulting from executing an IO action may of course be an unevaluated lazy value, but each IO action itself is evaluated and executed by the driver (including all sub-actions sequenced with do blocks or binds).
So result doesn't get lazily bound to f handle completely unevaluated; f handle is evaluated to come up with the sub actions hGetContents handle and putStr contents. These are both fully executed before the outer driver moves on to hClose handle, so everything's okay.
Note however that hGetContents is special. Quoting from the documentation:
Computation hGetContents hdl returns the list of characters corresponding to the unread portion of the channel or file managed by hdl, which is put into an intermediate state, semi-closed. In this state, hdl is effectively closed, but items are read from hdl on demand and accumulated in a special list returned by hGetContents hdl.
Any operation that fails because a handle is closed, also fails if a handle is semi-closed. The only exception is hClose. A semi-closed handle becomes closed:
if hClose is applied to it;
if an I/O error occurs when reading an item from the handle;
or once the entire contents of the handle has been read.
Once a semi-closed handle becomes closed, the contents of the associated list becomes fixed. The contents of this final list is only partially specified: it will contain at least all the items of the stream that were evaluated prior to the handle becoming closed.
So executing hGetContents handle actually results in a partially evaluated list, whose lazy evaluation is tied to further IO operations under the hood. This is impossible to do yourself without using the Unsafe family of operations, since it is essentially bypassing the type system and can result in exactly the sort of problem you were concerned about; if you had attempted the following code:
main = do
    text <- withFile' "girlfriend.txt" ReadMode (\handle -> do
        contents <- hGetContents handle
        return contents)
    putStr text
(where the function passed to withFile' tries to return the file contents, and they are passed to putStr after the withFile' call), then the putStr would be executed after hClose, and the file may well not have been fully read before it was closed.
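If the contents really do need to leave the callback, one common workaround is to force the whole string while the handle is still open. Below is a sketch using evaluate from Control.Exception. It uses the standard withFile, but the same idea should apply to withFile'. The name readAll is made up, and forcing length is just one simple way to demand every character.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: reads the whole file before the handle is closed.
readAll :: FilePath -> IO String
readAll path =
  withFile path ReadMode $ \handle -> do
    contents <- hGetContents handle
    _ <- evaluate (length contents)  -- walks the list, forcing all reads now
    return contents
```

Because evaluate is itself an IO action sequenced before the callback returns, the reads happen before withFile closes the handle.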
I'm trying to learn Haskell and want to write a small program which prints the content of a file to the screen. When I load it into GHCi I get the following error:
The last statement in a 'do' construct must be an expression
I know this question has been asked already here: Haskell — “The last statement in a 'do' construct must be an expression”.
Even though my code is very similar I still can't figure out the problem. If anyone could point out the problem to me I'd be very thankful.
module Main (main) where

import System.IO
import System(getArgs)

main :: IO()
main = do
    args <- getArgs
    inh <- openFile $ ReadMode head args
    printFile inh
    hClose inh

printFile :: Handle -> IO ()
printFile handle = do
    end <- hIsEOF handle
        if end
            then return ()
            else do line <- hGetLine handle
                putStrLn line
                printFile handle
Your indentation is broken. These are better:
printFile :: Handle -> IO ()
printFile handle = do
    end <- hIsEOF handle
    if end
        then return ()
        else do line <- hGetLine handle
                putStrLn line
                printFile handle

printFile :: Handle -> IO ()
printFile handle = do
    end <- hIsEOF handle
    if end
        then return ()
        else do
            line <- hGetLine handle
            putStrLn line
            printFile handle
By having if further indented than end <- hIsEOF handle, it was actually a line continuation, not a subsequent action in the do. Similarly, the fact that you had putStrLn line less indented than line <- hGetLine handle means that the do (inside the else) ended there.
There are several issues. First, the if is indented too far - end <- ... is assumed to be the last line of the do. Unindent...
next issue comes up. Same error message, only at line 18. This time, line 19 and 20 are not indented deeply enough (they aren't parsed as part of the do). Indent (looks nicer anyway, since it all lines up now)... next error message. The good news is, it's not an indentation error this time and the fix is again trivial.
test.hs:9:22:
Couldn't match expected type `([a] -> a) -> [String] -> FilePath'
against inferred type `IOMode'
In the second argument of `($)', namely `ReadMode head args'
In a stmt of a 'do' expression:
inh <- openFile $ ReadMode head args
In the expression:
do { args <- getArgs;
inh <- openFile $ ReadMode head args;
printFile inh;
hClose inh }
The fix is inh <- openFile (head args) ReadMode. If you want a more detailed explanation of why/how your version is incorrect, or what the error means, let me know and I'll edit.
You wrote this:
main :: IO()
main = do
    args <- getArgs
    inh <- openFile $ ReadMode head args
    printFile inh
    hClose inh
But it is probably nicer like this:
main :: IO()
main = do
    args <- getArgs
    withFile (head args) ReadMode printFile
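One caveat: head args crashes if the program is started without an argument. A sketch that pattern-matches on the argument list instead is shown below. It assumes getArgs is imported from System.Environment (where current GHC versions provide it, rather than the older System module), and the names printFileArg and runPrintFile are made up for illustration. printFile is the corrected function from above, repeated to keep the snippet self-contained.

```haskell
import System.Environment (getArgs)
import System.IO

printFile :: Handle -> IO ()
printFile handle = do
  end <- hIsEOF handle
  if end
    then return ()
    else do
      line <- hGetLine handle
      putStrLn line
      printFile handle

-- Hypothetical helper: takes the argument list explicitly.
printFileArg :: [String] -> IO ()
printFileArg [path] = withFile path ReadMode printFile
printFileArg _      = hPutStrLn stderr "usage: printfile FILE"

-- Intended entry point; in a real program this would be named main.
runPrintFile :: IO ()
runPrintFile = getArgs >>= printFileArg
```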
You can always use explicit bracketing with { ; } to never have to worry about this whitespace foolishness.
printFile :: Handle -> IO ()
printFile handle = do {
    end <- hIsEOF handle ;
    if end
      then return ()
      else do { line <- hGetLine handle ;
                putStrLn line ;
                printFile handle }}
would have been totally fine (as in, not cause the error).
I/O is dealt with through the special "do" language in Haskell. It should be embraced. That it is actually implemented via monads is an implementation detail.
To clarify: I don't think braces are better, I think they should go together with a nice and consistent indentation. Braces give us nice and immediate visual clues as to the code's structure. Wild indentation will of course be a pointless distraction most of the time. But also, braces give us a guarantee for the working code, and relieve us from the pointless worries of whitespace accidents. They remove this brittleness.
Still quite new to Haskell..
I want to read the contents of a file, do something with it possibly involving IO (using putStrLn for now) and then write new contents to the same file.
I came up with:
doit :: String -> IO ()
doit file = do
    contents <- withFile file ReadMode $ \h -> hGetContents h
    putStrLn contents
    withFile file WriteMode $ \h -> hPutStrLn h "new content"
However this doesn't work due to laziness. The file contents are not printed. I found this post which explains it well.
The solution proposed there is to include putStrLn within the withFile:
doit :: String -> IO ()
doit file = do
    withFile file ReadMode $ \h -> do
        contents <- hGetContents h
        putStrLn contents
    withFile file WriteMode $ \h -> hPutStrLn h "new content"
This works, but it's not what I want to do. The operation that will eventually replace putStrLn might be long, and I don't want to keep the file open the whole time. In general I just want to be able to get the file content out and then close the file before working with that content.
The solution I came up with is the following:
doit :: String -> IO ()
doit file = do
    c <- newIORef ""
    withFile file ReadMode $ \h -> do
        a <- hGetContents h
        writeIORef c $! a
    d <- readIORef c
    putStrLn d
    withFile file WriteMode $ \h -> hPutStrLn h "Test"
However, I find this long and a bit obfuscated. I don't think I should need an IORef just to get a value out, but I needed a "place" to put the file contents. Also, it still didn't work without the strictness annotation $! for writeIORef. I guess IORefs are not strict by nature?
Can anyone recommend a better, shorter way to do this while keeping my desired semantics?
Thanks!
The reason your first program does not work is that withFile closes the file after executing the IO action passed to it. In your case, the IO action is hGetContents which does not read the file right away, but only as its contents are demanded. By the time you try to print the file's contents, withFile has already closed the file, so the read fails (silently).
You can fix this issue by not reinventing the wheel and simply using readFile and writeFile:
doit file = do
    contents <- readFile file
    putStrLn contents
    writeFile file "new content"
But suppose you want the new content to depend on the old content. Then you cannot, generally, simply do
doit file = do
    contents <- readFile file
    writeFile file $ process contents
because the writeFile may affect what the readFile returns (remember, it has not actually read the file yet). Or, depending on your operating system, you might not be able to open the same file for reading and writing on two separate handles. The simple but ugly workaround is
doit file = do
    contents <- readFile file
    length contents `seq` (writeFile file $ process contents)
which will force readFile to read the entire file and close it before the writeFile action can begin.
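The same idea can be written with evaluate from Control.Exception, which turns the forcing into an explicit step of the do-block. This is only a sketch: rewriteFile is a made-up name, and process stands for whatever pure transformation is applied to the old contents.

```haskell
import Control.Exception (evaluate)

-- Hypothetical helper: replaces the contents of a file with a function of its old contents.
rewriteFile :: FilePath -> (String -> String) -> IO ()
rewriteFile file process = do
  contents <- readFile file
  _ <- evaluate (length contents)   -- forces the full read; the file is closed once EOF is reached
  writeFile file (process contents)
```

As far as I know, recent versions of base (4.15 and later) also export a strict readFile' from System.IO, which would make the evaluate line unnecessary where it is available.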
I think the easiest way to solve this problem is using strict IO:
import qualified System.IO.Strict as S

main = do
    file <- S.readFile "filename"
    writeFile "filename" file
You can duplicate the file Handle, then write with the original handle (at the end of the file) and read lazily with the duplicate. No strictness annotation is needed when you are only appending to the file.
import System.IO
import GHC.IO.Handle

main :: IO ()
main = do
    h <- openFile "filename" ReadWriteMode
    h2 <- hDuplicate h
    hSeek h2 AbsoluteSeek 0
    originalFileContents <- hGetContents h2
    putStrLn originalFileContents
    hSeek h SeekFromEnd 0
    hPutStrLn h $ concatMap ("{new_contents}" ++) (lines originalFileContents)
    hClose h2
    hClose h
The hDuplicate function is provided by GHC.IO.Handle module.
Returns a duplicate of the original handle, with its own buffer. The two Handles will share a file pointer, however. The original handle's buffer is flushed, including discarding any input data, before the handle is duplicated.
With hSeek you can set position of the handle before reading or writing.
But I'm not sure how reliable it would be to use "AbsoluteSeek 0" instead of "SeekFromEnd 0" for writing, i.e. for overwriting the contents. Generally I would suggest writing to a temporary file first, for example using openTempFile (from System.IO), and then replacing the original.
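A sketch of the temporary-file approach follows. It rests on a few assumptions: renameFile comes from System.Directory in the directory package (which ships with GHC), the target file lives in the current directory (the temporary file is created in "." and a rename across file systems may fail), and the names rewriteViaTemp and process are made up for illustration.

```haskell
import Control.Exception (evaluate)
import System.Directory (renameFile)
import System.IO

-- Hypothetical helper: writes the new contents to a temporary file,
-- then renames it over the original.
rewriteViaTemp :: FilePath -> (String -> String) -> IO ()
rewriteViaTemp path process = do
  old <- readFile path
  _ <- evaluate (length old)                       -- force the full read first
  (tmpPath, tmpHandle) <- openTempFile "." "rewrite.tmp"
  hPutStr tmpHandle (process old)                  -- write the new contents
  hClose tmpHandle
  renameFile tmpPath path                          -- replace the original file
```

The original file is only replaced once the new contents have been written completely, so a failure partway through should leave the original untouched.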
It's ugly but you can force the contents to be read by asking for the length of the input and seq'ing it with the next statement in your do-block. But really the solution is to use a strict version of hGetContents. I'm not sure what it's called.