Error reading and writing same file simultaneously in Haskell - haskell

I need to modify a file in-place. So I planned to read file contents, process them, then write the output to the same file:
import Data.Char (toUpper)

main = do
    input <- readFile "file.txt"
    let output = map toUpper input
    -- putStrLn $ show $ length output
    writeFile "file.txt" output
But the problem is that it works as expected only if I uncomment the putStrLn line, where I just print the number of characters to the console. If I don't uncomment it, I get
openFile: resource busy (file is locked)
Is there a way to force reading of that file?

The simplest thing might be strict ByteString IO:
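import qualified Data.ByteString.Char8 as B
import Data.Char (toUpper)

main = do
    -- B.readFile is strict: the whole file is read and the handle is closed here
    input <- B.readFile "file.txt"
    B.writeFile "file.txt" $ B.map toUpper input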
As you can see, it's the same code -- but with some functions replaced with ByteString versions.
Lazy IO
The problem that you're running into is that some of Haskell's IO functions use "Lazy IO", which has surprising semantics. In almost every program I would avoid lazy IO.
These days, people are looking for replacements for lazy IO, such as Conduit and similar libraries, and lazy IO is seen as an ugly hack which is unfortunately stuck in the standard library.
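To answer the literal question of forcing the read while staying with String: the reason the commented-out length line makes the program work is that computing the length consumes the whole lazy string, after which the read handle is closed. A minimal sketch of doing that explicitly, using evaluate from Control.Exception (the function name upperInPlace is made up for this illustration):

```haskell
import Control.Exception (evaluate)
import Data.Char (toUpper)

-- Hypothetical helper: upper-case a file in place using String IO.
upperInPlace :: FilePath -> IO ()
upperInPlace path = do
    input <- readFile path
    -- Forcing the length reads the file to the end, which lets the
    -- lazily read handle be closed before writeFile opens the file again.
    _ <- evaluate (length input)
    writeFile path (map toUpper input)
```

Calling upperInPlace "file.txt" from main should then behave like the ByteString version, at the cost of holding the whole file in memory as a String.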

Related

Can I have separate functions for reading and writing to a txt file in Haskell, without using a 'main' function?

I'm making a program using Haskell that requires simple save and load functions. When I call the save function, I need to put a string into a text file. When I call load, I need to pull the string out of the text file.
I'm aware of the complexities surrounding IO in Haskell. From some reading around online I have discovered that it is possible through a 'main' function. However, I seem to only be able to implement either save, or load... not both.
For example, I have the following function at the moment for reading from the file.
main = do
    contents <- readFile "Test.txt"
    putStrLn contents
How can I also implement a write function? Does it have to be within the same function? Or can I separate it? Also, is there a way of me being able to name the functions load/save? Having to call 'main' when I actually want to call 'load' or 'save' is rather annoying.
I can't find any examples online of someone implementing both, and any implementations I've found of either always go through a main function.
Any advice will be greatly appreciated.
"I'm aware of the complexities surrounding IO in Haskell."
It's actually not that complex. It might seem a little intimidating at first, but you'll quickly get the hang of it.
"How can I also implement a write function?"
The same way.
"Or can I separate it?"
Yes.
"Also, is there a way of me being able to name the functions load/save?"
Yes, for example you could do your loading like this:
load :: IO String
load = readFile "Test.txt"
All Haskell programs start inside main, but they don't have to stay there, so you can use it like this:
main :: IO ()
main = do
    contents <- load -- notice we're using the thing we just defined above
    putStrLn contents
Note that main is always what your program does, but main doesn't have to do only a single thing. It could just as well do many things, for instance reading a value and then deciding what to do. Here's a more complicated (complete) example. I expect you won't understand all parts of it right off the bat, but it should at least give you something to play around with:
import System.IO (hFlush, stdout)

data Choice = Save | Load

-- Keep asking until the user types "save" or "load".
pickSaveOrLoad :: IO Choice
pickSaveOrLoad = do
    putStr "Do you want to save or load? "
    hFlush stdout -- make sure the prompt is shown before waiting for input
    answer <- getLine
    case answer of
        "save" -> return Save
        "load" -> return Load
        _ -> do
            putStrLn "Invalid choice (must pick 'save' or 'load')"
            pickSaveOrLoad

save :: IO ()
save = do
    putStrLn "You picked save"
    putStrLn "<put your saving stuff here>"

load :: IO ()
load = do
    putStrLn "You picked load"
    putStrLn "<put your loading stuff here>"

main :: IO ()
main = do
    choice <- pickSaveOrLoad
    case choice of
        Save -> save
        Load -> load
Of course it's a bit odd to want to do either save or load. Most programs that can do these things want to do both, but I don't know what exactly you're going for, so I kept it generic.
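As a rough idea of what the two placeholders could turn into, here is a minimal sketch that assumes the saved state is a single String kept in Test.txt. The name saveFile and the choice to pass the string as an argument are assumptions made for the illustration, not something from the question. The load side forces the whole string so that a later save to the same file does not run into the "file is locked" error that lazy readFile can cause.

```haskell
import Control.Exception (evaluate)

-- Assumed location of the saved state (hypothetical name).
saveFile :: FilePath
saveFile = "Test.txt"

-- Write the given string to the save file, replacing what was there.
save :: String -> IO ()
save = writeFile saveFile

-- Read the save file completely before returning its contents.
load :: IO String
load = do
    contents <- readFile saveFile
    _ <- evaluate (length contents) -- force the read so the handle is closed
    return contents
```

Both are ordinary IO actions, so they can be called from main or from any other IO function, for example save "some state" followed later by contents <- load.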

Using "printf" on a list in Haskell

How can I use printf over a list?
I have a list of numbers and I want to print them all using a format (e.g. %.3f). I tried to use map over printf, but it does not work, so I have no idea. Can somebody help me with this? Any ideas are acceptable. Is there a way to create a string from a list using a custom format?
printf can produce strings instead of just printing them to stdout. This is because it is overloaded on its result type (this is also part of the machinery that makes it variadic).
import Text.Printf
main :: IO ()
main = putStrLn . unwords $ printf "%.3f" <$> ([1..10] :: [Double])
That should do the trick.
BTW, printf is not type safe and can blow up at run time. I recommend you use something like formatting.
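The result-type overloading mentioned above can be made explicit with type signatures. The following sketch shows both uses: building one String from the list, and printing each number on its own line. The function names are invented for the example.

```haskell
import Data.List (intercalate)
import Text.Printf (printf)

-- Here printf is used at type Double -> String, so it returns text.
formatNumbers :: [Double] -> String
formatNumbers = intercalate ", " . map (printf "%.3f")

-- Here printf is used at type Double -> IO (), so it prints directly.
printEach :: [Double] -> IO ()
printEach = mapM_ (printf "%.3f\n")
```

For instance, formatNumbers [1, 2.5] gives "1.000, 2.500". A plain map printf over the list tends to fail because nothing tells the compiler which result type is wanted; the signatures (or a consumer such as unwords) settle that.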

Haskell: Conditionally execute external process with Maybe FilePath

I am struggling to understand a block of code which is extremely easy in the imperative world.
This is what I need to do: given the full path of an executable, which has type Maybe FilePath, I need to execute it conditionally.
If the path is Nothing, print an error; if the path is Just path, execute it and print a message that the file has been executed. Only "Hello, World" can be easier, right?
But in Haskell I dug myself into numerous layers of Maybes and IOs and got stuck.
Two concrete questions arise from here:
How do I feed a Maybe FilePath into a system or rawSystem? liftM does not work for me here.
What is the correct way of doing this kind of conditional branching?
Thanks.
Simple pattern matching will do the job nicely.
case command of
    Just path -> system path >> putStrLn "Done"
    Nothing -> putStrLn "None specified"
Or, if you'd rather not pattern-match, use the maybe function:
maybe (putStrLn "None specified") ((>> putStrLn "Done") . system) command
That may occasionally be nicer than matching with a case, but not here, I think. The composition with the printing of the success message is clunky. It fares better if you don't print messages but return the ExitCode in both branches:
maybe (return $ ExitFailure 1) system command
This is exactly what the Traversable type class was made for!
Prelude Data.Traversable System.Cmd> traverse system Nothing
Nothing
Prelude Data.Traversable System.Cmd> traverse system (Just "echo OMG BEES")
OMG BEES
Just ExitSuccess
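For reference, the pattern-matching approach can be put together into a small compilable unit. This is only a sketch: it imports system from System.Process (the GHCi session above uses the older System.Cmd module), the name runOptional is made up, and returning ExitFailure 1 for a missing path follows the maybe example earlier in the answers.

```haskell
import System.Exit (ExitCode (..))
import System.Process (system)

-- Hypothetical helper: run the command if one was given, report either way.
runOptional :: Maybe String -> IO ExitCode
runOptional command =
    case command of
        Nothing -> do
            putStrLn "None specified"
            return (ExitFailure 1)
        Just path -> do
            code <- system path -- runs the command through the shell
            putStrLn "Done"
            return code
```

Note that "Done" is printed whether or not the command itself succeeded; inspecting the returned ExitCode is the way to tell the two cases apart.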

System.Directory.getDirectoryContents unicode support

The following code prints something like °Ð½Ð´Ð¸Ñ-ÐÑпаниÑ
getDirectoryContents "path/to/directory/that/contains/files/with/nonASCII/names"
>>= mapM_ putStrLn
Looks like it is a GHC bug and it is already fixed in the repository. But what to do until everybody upgrades GHC?
The last time I encountered such a problem (it was a few years ago, btw), I used the utf8-string package to convert strings, but I don't remember how I did it, and GHC's Unicode support has changed noticeably in recent years.
So, what is the best (or at least working) way to get directory contents with full unicode support?
ghc version 7.0.4
locale en_US.UTF-8
Here's a simple workaround using decodeString and encodeString from utf8-string.
import System.Directory
import qualified Codec.Binary.UTF8.String as UTF8
main = do
    getDirectoryContents "." >>= mapM_ (putStrLn . UTF8.decodeString)
    putStrLn "------------"
    readFile (UTF8.encodeString "brøken-file-nåme.txt") >>= putStrLn
Output:
.
..
brøken-file-nåme.txt
Broken.hs
------------
hello
I would recommend looking at system-filepath, which provides an abstract datatype for representing filepaths. I've used it extensively for some internal code and it works wonderfully.

"resource busy (file is locked)" error in Haskell

I'm very new to Haskell. In fact, I'm working through this section of this tutorial.
I came across this piece of code:
import System.IO
import Data.Char
main = do
    contents <- readFile "girlfriend.txt"
    writeFile "girlfriendcaps.txt" (map toUpper contents)
Which reads the contents of the file called "girlfriend.txt" and writes the upper-cased version of the file to a new file called "girlfriendcaps.txt".
So, I wanted to modify the code a bit to take the name of the file to act on. I changed the code to this:
import System.IO
import Data.Char
main = do
    path <- getLine
    contents <- readFile path
    writeFile path (map toUpper contents)
Now, obviously the major difference here is that I'm reading from and writing to the same file. As I'm thinking about it now, this must be a lazy-evaluation thing, but I'm getting the "resource busy" error message. Correct me if I'm wrong, but I guess that readFile doesn't start reading the file until writeFile asks for the contents of it. And then writeFile tries to write to the file, but it must still have the file open because it's also asking for the contents. Am I close there?
So, the real question is: how do I read from and write to the same file in Haskell? It makes sense that it's more difficult, because you will write to a different file from the file you read from more often than not, but for my own edification, how would you read and write to the same file?
Indeed, this is a "lazy evaluation thing".
import System.IO
import Data.Char

main = do
    path <- getLine
    contents <- readFile path
    writeFile path (map toUpper contents)
Remember that Haskell is primarily lazy in evaluation, and so is much of the IO subsystem. So when you call readFile you begin streaming data in from the file. When you then immediately call writeFile you start streaming bytes back to the same file.
This would be an error (i.e. it would destroy your data), so Haskell locks the resource until it is fully evaluated, and you get a nice error message.
There are two solutions:
Don't destructively overwrite the file, instead, copy to a new file
Or, use strict IO
To use strict IO, the 'text' or 'strict' packages are recommended.
What you're looking for is how to open a file in ReadWriteMode.
fileHandle <- openFile "fileName.txt" ReadWriteMode
contents <- hGetContents fileHandle
There's trickier stuff for navigating forwards and backwards through the file.
See Working with files and handles from RWH, and Operations on Handles at the System.IO docs.
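One caveat with the snippet above: hGetContents leaves the handle semi-closed, so it cannot be written to afterwards. A possible way to do the whole round trip on a single ReadWriteMode handle is sketched below. It assumes the bytestring package and reads with B.hGet, which leaves the handle open; upperViaHandle is a made-up name. Because upper-casing with the Char8 functions keeps the byte count unchanged, the new contents overwrite the old ones exactly; a transformation that changes the length would additionally need hSetFileSize.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Char (toUpper)
import System.IO

-- Hypothetical helper: upper-case a file through one read/write handle.
upperViaHandle :: FilePath -> IO ()
upperViaHandle path =
    withFile path ReadWriteMode $ \h -> do
        size <- hFileSize h
        bytes <- B.hGet h (fromIntegral size) -- strict read, handle stays open
        hSeek h AbsoluteSeek 0                -- go back to the start of the file
        B.hPut h (B.map toUpper bytes)        -- overwrite with same-length output
```

The Char8 functions treat each byte as one character, so this is only meaningful for ASCII or Latin-1 text.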
It depends on exactly what you are trying to do. As a rule, in any language, this is probably a bad design, because if anything goes wrong either inside the program or outside it (e.g. user error) then you have destroyed your original data and cannot try again. It also requires that the entire file be held in memory, which is fine if it's just a few bytes, but not so good when someone decides to run this on a really big file.
If you really want to do this then generate a temporary filename for the output, and then once you know that you have written to it successfully you can delete the original and rename the new one.
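The temporary-file approach could look roughly like the sketch below. It assumes the directory and filepath packages that ship with GHC; the name rewriteViaTemp and the "rewrite.tmp" template are invented for the example. The temporary file is created next to the original so the final rename stays on the same file system, and renameFile replaces the existing file, so no separate delete step is used here.

```haskell
import System.Directory (renameFile)
import System.FilePath (takeDirectory)
import System.IO (hClose, hPutStr, openTempFile)

-- Hypothetical helper: apply f to a file's contents via a temporary file.
rewriteViaTemp :: (String -> String) -> FilePath -> IO ()
rewriteViaTemp f path = do
    contents <- readFile path
    -- Create the temporary file in the same directory as the original.
    (tmpPath, tmpHandle) <- openTempFile (takeDirectory path) "rewrite.tmp"
    hPutStr tmpHandle (f contents) -- consumes the whole lazy input
    hClose tmpHandle
    -- Only replace the original once the new contents are fully written.
    renameFile tmpPath path
```

If writing fails part way through, the original file is left untouched, though this sketch does not clean up the leftover temporary file in that case.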

Resources