"resource busy (file is locked)" error in Haskell - haskell

I'm very new to Haskell. In fact, I'm working through this section of this tutorial.
I came across this piece of code:
import System.IO
import Data.Char
main = do
    contents <- readFile "girlfriend.txt"
    writeFile "girlfriendcaps.txt" (map toUpper contents)
This reads the contents of the file called "girlfriend.txt" and writes an upper-cased version of it to a new file called "girlfriendcaps.txt".
So, I wanted to modify the code a bit to take the name of the file to act on. I changed the code to this:
import System.IO
import Data.Char
main = do
    path <- getLine
    contents <- readFile path
    writeFile path (map toUpper contents)
Now, obviously the major difference here is that I'm reading from and writing to the same file. As I'm thinking about it now, this must be a lazy-evaluation thing, but I'm getting the "resource busy" error message. Correct me if I'm wrong, but I guess that readFile doesn't start reading the file until writeFile asks for its contents. And then writeFile tries to write to the file while it is still open for reading, because the contents are still being requested. Am I close there?
So, the real question is: how do I read from and write to the same file in Haskell? It makes sense that it's more difficult, because you will write to a different file from the file you read from more often than not, but for my own edification, how would you read and write to the same file?

Indeed, this is a "lazy evaluation thing".
import System.IO
import Data.Char
main = do
    path <- getLine
    contents <- readFile path
    writeFile path (map toUpper contents)
Remember that Haskell is lazy by default, and so is much of the IO subsystem. When you call readFile, the file is opened but its contents are only streamed in as they are demanded. When you then immediately call writeFile on the same path, you would start writing bytes back to a file that is still being read.
This would be an error (i.e. it would destroy your data), so the runtime keeps the file locked until the contents have been fully read, and you get a nice error message instead.
There are two solutions:
Don't destructively overwrite the file; instead, write to a new file.
Or use strict IO, so the whole file has been read before you write to it.
To use strict IO, the 'text' or 'strict' packages are recommended.
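As a minimal sketch of the strict-IO route, assuming the text package is available (it ships with GHC): Data.Text.IO.readFile reads the whole file and closes it before returning, so the write to the same path no longer collides with a pending lazy read. The name upperCaseInPlace is made up for illustration.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Hypothetical helper: upper-case a file in place using strict Text IO.
upperCaseInPlace :: FilePath -> IO ()
upperCaseInPlace path = do
  contents <- TIO.readFile path            -- strict: the file is fully read here
  TIO.writeFile path (T.toUpper contents)  -- safe to reopen the same path now
```

In the program from the question, main would then become getLine >>= upperCaseInPlace.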

What you're looking for is how to open a file in ReadWriteMode.
fileHandle <- openFile "fileName.txt" ReadWriteMode
contents <- hGetContents fileHandle
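Note that hGetContents semi-closes the handle, so if you want to write through the same handle you need to read with other functions (hGetLine, for example). There's also trickier stuff, such as hSeek, for navigating forwards and backwards through the file.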
See Working with files and handles from RWH, and Operations on Handles at the System.IO docs.

It depends on exactly what you are trying to do. As a rule, in any language, this is probably a bad design, because if anything goes wrong either inside the program or outside (e.g. user error) then you have destroyed your original data and cannot try again. It also requires that the entire file be held in memory, which is fine if it's just a few bytes, but not so good when someone decides to run this on a really big file.
If you really want to do this, then generate a temporary filename for the output, and once you know that you have written to it successfully you can delete the original and rename the new one.
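The temporary-file approach can be sketched as follows, assuming the directory and filepath packages (both ship with GHC). The helper name rewriteViaTemp and the template "rewrite.tmp" are made up for illustration. The sketch also assumes the transformation consumes the whole input, so the lazily read original is closed before the rename.

```haskell
import Control.Exception (bracketOnError)
import System.Directory (removeFile, renameFile)
import System.FilePath (takeDirectory)
import System.IO (hClose, hPutStr, openTempFile)

-- Hypothetical helper: apply f to a file's contents via a temporary file.
rewriteViaTemp :: (String -> String) -> FilePath -> IO ()
rewriteViaTemp f path = do
  contents <- readFile path  -- lazy; consumed while the temp file is written
  bracketOnError
    -- Create the temp file next to the original so the rename stays on one filesystem.
    (openTempFile (takeDirectory path) "rewrite.tmp")
    -- On failure, remove the temp file; the original is left untouched.
    (\(tmpPath, h) -> hClose h >> removeFile tmpPath)
    (\(tmpPath, h) -> do
        hPutStr h (f contents)
        hClose h
        renameFile tmpPath path)  -- replaces the original with the new file
```

Because renameFile replaces an existing target, no separate delete step is needed here. The original only disappears once the new contents are completely written.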

Related

Can I have separate functions for reading and writing to a txt file in Haskell, without using a 'main' function?

I'm making a program using Haskell that requires simple save and load functions. When I call the save function, I need to put a string into a text file. When I call load, I need to pull the string out of the text file.
I'm aware of the complexities surrounding IO in Haskell. From some reading around online I have discovered that it is possible through a 'main' function. However, I seem to only be able to implement either save, or load... not both.
For example, I have the following function at the moment for reading from the file.
main = do
    contents <- readFile "Test.txt"
    putStrLn contents
How can I also implement a write function? Does it have to be within the same function? Or can I separate it? Also, is there a way of me being able to name the functions load/save? Having to call 'main' when I actually want to call 'load' or 'save' is rather annoying.
I can't find any examples online of someone implementing both, and any implementations I've found of either always go through a main function.
Any advice will be greatly appreciated.
"I'm aware of the complexities surrounding IO in Haskell."
It's actually not that complex. It might seem a little intimidating at first but you'll quickly get the hang of it.
"How can I also implement a write function?"
The same way.
"Or can I separate it?"
Yes.
"Also, is there a way of me being able to name the functions load/save?"
Yes, for example you could do your loading like this:
load :: IO String
load = readFile "Test.txt"
All Haskell programs start inside main, but they don't have to stay there, so you can use it like this:
main :: IO ()
main = do
    contents <- load -- notice we're using the thing we just defined above
    putStrLn contents
Note that main is always what your program does, but main doesn't have to do only a single thing. It could just as well do many things, for instance reading a value and then deciding what to do. Here's a more complicated (complete) example. I expect you'll not understand all parts of it right off the bat, but it should at least give you something to play around with:
import System.IO (hFlush, stdout)
data Choice = Save | Load
pickSaveOrLoad :: IO Choice
pickSaveOrLoad = do
    putStr "Do you want to save or load? "
    hFlush stdout -- make sure the prompt is shown before waiting for input
    answer <- getLine
    case answer of
        "save" -> return Save
        "load" -> return Load
        _ -> do
            putStrLn "Invalid choice (must pick 'save' or 'load')"
            pickSaveOrLoad
save :: IO ()
save = do
    putStrLn "You picked save"
    putStrLn "<put your saving stuff here>"
load :: IO ()
load = do
    putStrLn "You picked load"
    putStrLn "<put your loading stuff here>"
main :: IO ()
main = do
    choice <- pickSaveOrLoad
    case choice of
        Save -> save
        Load -> load
Of course it's a bit odd to want to do either save or load. Most programs that can do these things want to do both, but I don't know what exactly you're going for, so I kept it generic.
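One way the placeholders might be filled in, as a sketch rather than the only design: save takes the string to store and load returns it, so these have different types from the placeholder versions above. The file name is the one from the question. The evaluate call is an addition that forces the whole read, so a later save to the same file does not run into a still-open lazy read.

```haskell
import Control.Exception (evaluate)

-- File name reused from the question; change as needed.
saveFile :: FilePath
saveFile = "Test.txt"

-- Write the given string to the save file, replacing what was there.
save :: String -> IO ()
save = writeFile saveFile

-- Read the save file back, forcing the whole read so the file is closed
-- again before load returns.
load :: IO String
load = do
  contents <- readFile saveFile
  _ <- evaluate (length contents)
  return contents
```

Both can be called from main or directly at the GHCi prompt, for example save "hello" followed by load.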

How can I ensure the destination file does not exist when renaming a file with Haskell, to avoid overwriting it?

I want to rename a file in Haskell without overwriting an already existing one. In case the target file exists I want to deal with that in my code (by appending something to the file name).
The description of renameFile from System.Directory says:
renameFile old new changes the name of an existing file system object from old to new. If the new object already exists, it is atomically replaced by the old object. Neither path may refer to an existing directory.
Is there any existing module or command that would let me rename without overwriting?
I know I can do the checks myself. I'd just feel much better if there was a function written by someone experienced. Overwritten files are gone for good.
Update
I want to rename photos, videos, and live photos by creation date, taken from either EXIF (similar to jhead) or the file system timestamp normalized to the timezone the photo was taken in. It might be that two photos were taken at exactly the same time and would end up with the same name: 2017-01-12 – 11-12-11.jpg. This must not happen. The second photo should be called something like 2017-01-12 – 11-12-11a.jpg.
POSIX can atomically check whether a file exists and create it only if it does not, via the O_EXCL flag to open() (used together with O_CREAT). This lets you avoid the race condition in the more obvious implementation, in which two processes may both check that a file doesn't exist before either of them creates it, causing one process to overwrite the other's file. This can help here: the idea is to exclusively create an empty file at the target, and then overwrite it with a rename only if the exclusive creation succeeded. If the exclusive creation failed, then the file already exists (possibly created by another process). This is exposed in Haskell's unix package via the openFd function, which either succeeds or throws an IOException. It can be used like this:
module RenameNoOverwrite where
import Control.Exception
import Control.Monad
import Data.Bits
import System.Directory
import System.Posix.Files
import System.Posix.IO
renameFileNoOverwrite :: FilePath -> FilePath -> IO Bool
renameFileNoOverwrite old new = do
    -- Exclusively create an empty placeholder at the target and close it again.
    -- Any IOException (such as "already exists") is reported as False.
    created <- handle handleIOException $ bracket createNewFile closeFd $ pure $ pure True
    when created $ renameFile old new
    return created
  where
    -- Written against the unix API before 2.8; in unix >= 2.8 the mode is
    -- passed as the 'creat' field of the flags instead of as an argument.
    createNewFile = openFd new WriteOnly (Just defaultMode) defaultFileFlags {exclusive = True}
    defaultMode = ownerReadMode .|. ownerWriteMode .|. groupReadMode .|. otherReadMode
    handleIOException :: IOException -> IO Bool
    handleIOException _ = return False
The key part is the {exclusive = True} option, which sets the O_EXCL flag on the resulting call to open().
Windows has a similar ability, exposed via the CREATE_NEW flag to CreateFile. There's also a MOVEFILE_REPLACE_EXISTING flag to MoveFileEx that looks like it might be useful, but I've never used it and the documentation is not 100% clear to me. These are exposed in Haskell's Win32 package.
Unfortunately there doesn't currently seem to be a portable way of doing this.
Here is one potential solution:
import System.Directory (doesFileExist, renameFile)
-- | Rename a src file as tgt file, safely. If the tgt file exists, don't
-- rename and return False. Otherwise, rename src to tgt and return True.
renameSafely :: FilePath -> FilePath -> IO Bool
renameSafely src tgt = do
    exists <- doesFileExist tgt
    if not exists
        then (renameFile src tgt >> return True)
        else return False
(Disclaimer: I didn't run this through GHC to ensure that it compiles; the ">>" in the then clause might be an issue.)
As noted in the comments, there is a potential race condition in the file system with two processes trying to create or rename a file with the same name at the same time. However, as you pointed out, that is unlikely to be an issue for you.
If renameSafely returns False, then simply try another name. :-)
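The "try another name" step could be sketched like this, following the naming scheme from the question (11-12-11.jpg, then 11-12-11a.jpg, and so on). The names candidateNames and renameToFirstFree are made up for illustration. The rename action is passed in as a parameter, so either renameFileNoOverwrite or renameSafely from above can be plugged in. It assumes the filepath package, which ships with GHC.

```haskell
import System.FilePath (splitExtension)

-- Candidate targets: the name itself, then the name with 'a' .. 'z'
-- inserted before the extension.
candidateNames :: FilePath -> [FilePath]
candidateNames target = target : [base ++ [c] ++ ext | c <- ['a' .. 'z']]
  where
    (base, ext) = splitExtension target

-- Try each candidate with the given rename action until one succeeds.
-- Returns the name that was used, or Nothing if all candidates were taken.
renameToFirstFree
  :: (FilePath -> FilePath -> IO Bool) -> FilePath -> FilePath -> IO (Maybe FilePath)
renameToFirstFree tryRename src target = go (candidateNames target)
  where
    go [] = return Nothing
    go (name : rest) = do
      ok <- tryRename src name
      if ok then return (Just name) else go rest
```

For example, renameToFirstFree renameSafely "IMG_0001.jpg" "2017-01-12.jpg" would try 2017-01-12.jpg first and 2017-01-12a.jpg next. The race condition discussed above only goes away when the rename action itself is atomic.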

Error reading and writing same file simultaneously in Haskell

I need to modify a file in-place. So I planned to read file contents, process them, then write the output to the same file:
main = do
    input <- readFile "file.txt"
    let output = (map toUpper input)
    -- putStrLn $ show $ length output
    writeFile "file.txt" output
But the problem is, it works as expected only if I uncomment the 4th line - where I just output number of characters to console. If I don't uncomment it, I get
openFile: resource busy (file is locked)
Is there a way to force reading of that file?
The simplest thing might be strict ByteString IO:
import qualified Data.ByteString.Char8 as B
import Data.Char (toUpper)
main = do
    input <- B.readFile "file.txt"
    B.writeFile "file.txt" $ B.map toUpper input
As you can see, it's the same code -- but with some functions replaced with ByteString versions.
Lazy IO
The problem that you're running into is that some of Haskell's IO functions use "Lazy IO", which has surprising semantics. In almost every program I would avoid lazy IO.
These days, people are looking for replacements to Lazy IO like Conduit and the like, and lazy IO is seen as an ugly hack which unfortunately is stuck in the standard library.

Sum the filesizes in a directory

I'm having trouble wrapping my head around how to accomplish this. I'm relatively new to working with monads/IO, so excuse me if I'm missing something obvious. I searched google for a while and came up with nothing, and nothing I've read is making me figure out how to do this.
Here is what I have now:
import System.Path.Glob (glob)
import System.Posix.Files (fileSize, getFileStatus)
dir = "/usr/bin/"
lof = do files <- (glob (dir++"*"))
         (mapM_ fileS files)
fileS file = do fs <- getFileStatus file
                print (fileSize fs)
As you can see, this gets the sizes and prints them out, but I'm stuck on how to actually sum them.
You're almost there. You can have fileS return the file size instead of printing it:
return (fileSize fs)
Then, instead of mapM_ing (which throws away the result), do a mapM (which returns the list of results):
sizes <- mapM fileS files
Now sizes is a list of numbers corresponding to the file sizes. Summing them should be easy :-)
To deepen your understanding of this example (and practice good habits), try to write type signatures for your functions. Consult ghci with :t for help.
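Put together, the mapM-then-sum idea might look like the sketch below. Note that it swaps the libraries from the question: it uses listDirectory and getFileSize from the directory package in place of glob and getFileStatus, so it needs nothing beyond what ships with GHC. The function names fileSizes and totalSize are made up for illustration.

```haskell
import Control.Monad (filterM)
import System.Directory (doesFileExist, getFileSize, listDirectory)
import System.FilePath ((</>))

-- Sizes in bytes of the regular files directly inside dir (not recursive).
fileSizes :: FilePath -> IO [Integer]
fileSizes dir = do
  names <- listDirectory dir
  files <- filterM doesFileExist (map (dir </>) names)  -- skip subdirectories
  mapM getFileSize files                                -- mapM keeps the results

-- Total size in bytes of the regular files directly inside dir.
totalSize :: FilePath -> IO Integer
totalSize dir = sum <$> fileSizes dir
```

With this in place, totalSize "/usr/bin" >>= print prints the sum for the directory from the question.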

Read the contents of a file, then append to the file

I cannot figure out how to read the contents of a file and then append more data to a file. hGetContents, which I am using at the moment seems to close the file after reading, thus I cannot write to it.
How can I work around this?
Maybe something like:
import System.IO
modifyFile :: FilePath -> (String -> String) -> IO ()
modifyFile fn func = do
    str <- readFile fn
    length str `seq` return ()
    appendFile fn (func str)
The seq call forces the file to be fully read, and therefore closed, before we reopen it to append to it (otherwise the append fails).
This is quick and dirty. You might look into System.IO.hSeek and related functions if you want to do something more elaborate. E.g. open it, read it, seek to the end, append.
You shouldn't get the contents of the file with hGetContents.
The appropriate thing to do is to open the file, edit it, and close it. hGetContents only gets the content for you and does nothing else.
Pseudo Code ::
Open File
Read/Append File (as the case may be)
Close the file
Here's the doc
http://haskell.org/ghc/docs/latest/html/libraries/base/System-IO.html
You are correct that hGetContents closes (or rather, semi-closes) a file handle so you cannot use it for other operations anymore. One option is to acquire a new handle for the file after reading it and then using that for appending, but this might work in unexpected ways if, for some reason, you don't process the file contents completely before re-opening it.
Another way is to open the file in ReadWriteMode and to read the contents in some other way, for example, using hGetLine (if your data is line based) until you reach the end-of-file and then appending using the same handle.
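The ReadWriteMode approach might look like the sketch below, assuming line-based data. The names readLines and appendFromContents are made up for illustration. The lines are read with hGetLine, so the handle is never semi-closed, and the same handle is then used to append.

```haskell
import System.IO

-- Read all remaining lines from a handle without semi-closing it.
readLines :: Handle -> IO [String]
readLines h = do
  eof <- hIsEOF h
  if eof
    then return []
    else do
      l <- hGetLine h
      ls <- readLines h
      return (l : ls)

-- Hypothetical helper: read a file's lines, then append text computed
-- from them, all through one handle.
appendFromContents :: FilePath -> ([String] -> String) -> IO ()
appendFromContents path f =
  withFile path ReadWriteMode $ \h -> do
    ls <- readLines h
    hSeek h SeekFromEnd 0  -- make the write position explicit: end of file
    hPutStr h (f ls)
```

For example, appendFromContents "log.txt" (\ls -> show (length ls) ++ "\n") appends the current line count to the file. withFile closes the handle afterwards, even if an exception is thrown.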
