How withFile is implemented in haskell

How withFile is implemented in haskell - haskell

Following a haskell tutorial, the author provides the following implementation of the withFile method:
withFile' :: FilePath -> IOMode -> (Handle -> IO a) -> IO a
withFile' path mode f = do
handle <- openFile path mode
result <- f handle
hClose handle
return result
But why do we need to wrap the result in a return? Doesn't the supplied function f already return an IO as can be seen by it's type Handle -> IO a?

You're right: f already returns an IO, so if the function were written like this:
withFile' path mode f = do
handle <- openFile path mode
f handle
there would be no need for a return. The problem is hClose handle comes in between, so we have to store the result first:
result <- f handle
and doing <- gets rid of the IO. So return puts it back.

This is one of the tricky little things that confused me when I first tried Haskell. You're misunderstanding the meaning of the <- construct in do-notation. result <- f handle doesn't mean "assign the value of f handle to result"; it means "bind result to a value 'extracted' from the monadic value of f handle" (where the 'extraction' happens in some way that's defined by the particular Monad instance that you're using, in this case the IO monad).
I.e., for some Monad typeclass m, the <- statement takes an expression of type m a in the right hand side and a variable of type a on the left hand side, and binds the variable to a value. Thus in your particular example, with result <- f handle, we have the types f result :: IO a, result :: a and return result :: IO a.
PS do-notation has also a special form of let (without the in keyword in this case!) that works as a pure counterpart to <-. So you could rewrite your example as:
withFile' :: FilePath -> IOMode -> (Handle -> IO a) -> IO a
withFile' path mode f = do
handle <- openFile path mode
let result = f handle
hClose handle
result
In this case, because the let is a straightforward assignment, the type of result is IO a.

Related

How to use readFile

I am having trouble reading in a level file in Haskell. The goal is to read in a simple txt file with two numbers seperated by a space and then commas. The problem I keep getting is this: Couldn't match type `IO' with `[]'
If I understand correctly the do statement is supposed to pull the String out of the Monad.
readLevelFile :: FilePath -> [FallingRegion]
readLevelFile f = do
fileContent <- readFile f
(map lineToFallingRegion (lines fileContent))
lineToFallingRegion :: String -> FallingRegion
lineToFallingRegion s = map textShapeToFallingShape (splitOn' (==',') s)
textShapeToFallingShape :: String -> FallingShape
textShapeToFallingShape s = FallingShape (read $ head numbers) (read $ head
$ tail numbers)
where numbers = splitOn' (==' ') s

You can't pull things out of IO. You can think of IO as a container (in fact, some interpretations of IO liken it to the box containing Schrödinger's cat). You can't see what's in the container, but if you step into the container, values become visible.
So this should work:
readLevelFile f = do
fileContent <- readFile f
return (map lineToFallingRegion (lines fileContent))
It does not, however, have the type given in the OP. Inside the do block, fileContent is a String value, but the entire block is still inside the IO container.
This means that the return type of the function isn't [FallingRegion], but IO [FallingRegion]. So if you change the type annotation for readLevelFile to
readLevelFile :: FilePath -> IO [FallingRegion]
you should be able to get past the first hurdle.

Let's look at your first function with explicit types:
readLevelFile f = do
(fileContent :: String) <-
(readFile :: String -> IO String) (f :: String) :: IO String
fileContent is indeed of type String but is only available within the execution of the IO Monad under which we are evaluating. Now what?
(map lineToFallingRegion (lines fileContent)) :: [String]
Now you are suddenly using an expression that is not an IO monad but instead is a list value - since lists are also a type of monad the type check tries to unify IO with []. What you actually wanted is to return this value:
return (map lineToFallingRegion (lines fileContent)) :: IO [String]
Now recalling that we can't ever "exit" the IO monad your readLevelFile type must be IO - an honest admission that it interacts with the outside world:
readLevelFile :: FilePath -> IO [FallingRegion]

After reading a file I have IO [Char], but I need [IO Char]

I have a file number.txt which contains a large number and I read it into an IO String like this:
readNumber = readFile "number.txt" >>= return
In another function I want to create a list of Ints, one Int for each digit…
Lets assume the content of number.txt is:
1234567890
Then I want my function to return [1,2,3,4,5,6,7,8,9,0].
I tried severall versions with map, mapM(_), liftM, and, and, and, but I got several error messages everytime, which I was able to reduce to
Couldn't match expected type `[m0 Char]'
with actual type `IO String'
The last version I have on disk is the following:
module Main where
import Control.Monad
import Data.Char (digitToInt)
main = intify >>= putStrLn . show
readNumber = readFile "number.txt" >>= return
intify = mapM (liftM digitToInt) readNumber
So, as far as I understand the error, I need some function that takes IO [a] and returns [IO a], but I was not able to find such thing with hoogle… Only the other way round seemes to exist

In addition to the other great answers here, it's nice to talk about how to read [IO Char] versus IO [Char]. In particular, you'd call [IO Char] "an (immediate) list of (deferred) IO actions which produce Chars" and IO [Char] "a (deferred) IO action producing a list of Chars".
The important part is the location of "deferred" above---the major difference between a type IO a and a type a is that the former is best thought of as a set of instructions to be executed at runtime which eventually produce an a... while the latter is just that very a.
This phase distinction is key to understanding how IO values work. It's also worth noting that it can be very fluid within a program---functions like fmap or (>>=) allow us to peek behind the phase distinction. As an example, consider the following function
foo :: IO Int -- <-- our final result is an `IO` action
foo = fmap f getChar where -- <-- up here getChar is an `IO Char`, not a real one
f :: Char -> Int
f = Data.Char.ord -- <-- inside here we have a "real" `Char`
Here we build a deferred action (foo) by modifying a deferred action (getChar) by using a function which views a world that only comes into existence after our deferred IO action has run.
So let's tie this knot and get back to the question at hand. Why can't you turn an IO [Char] into an [IO Char] (in any meaningful way)? Well, if you're looking at a piece of code which has access to IO [Char] then the first thing you're going to want to do is sneak inside of that IO action
floob = do chars <- (getChars :: IO [Char])
...
where in the part left as ... we have access to chars :: [Char] because we've "stepped into" the IO action getChars. This means that by this point we've must have already run whatever runtime actions are required to generate that list of characters. We've let the cat out of the monad and we can't get it back in (in any meaningful way) since we can't go back and "unread" each individual character.
(Note: I keep saying "in any meaningful way" because we absolutely can put cats back into monads using return, but this won't let us go back in time and have never let them out in the first place. That ship has sailed.)
So how do we get a type [IO Char]? Well, we have to know (without running any IO) what kind of IO operations we'd like to do. For instance, we could write the following
replicate 10 getChar :: [IO Char]
and immediately do something like
take 5 (replicate 10 getChar)
without ever running an IO action---our list structure is immediately available and not deferred until the runtime has a chance to get to it. But note that we must know exactly the structure of the IO actions we'd like to perform in order to create a type [IO Char]. That said, we could use yet another level of IO to peek at the real world in order to determine the parameters of our action
do len <- (figureOutLengthOfReadWithoutActuallyReading :: IO Int)
return $ replicate len getChar
and this fragment has type IO [IO Char]. To run it we have to step through IO twice, we have to let the runtime perform two IO actions, first to determine the length and then second to actually act on our list of IO Char actions.
sequence :: [IO a] -> IO [a]
The above function, sequence, is a common way to execute some structure containing a, well, sequence of IO actions. We can use that to do our two-phase read
twoPhase :: IO [Char]
twoPhase = do len <- (figureOutLengthOfReadWithoutActuallyReading :: IO Int)
putStrLn ("About to read " ++ show len ++ " characters")
sequence (replicate len getChar)
>>> twoPhase
Determining length of read
About to read 22 characters
let me write 22 charac"let me write 22 charac"

You got some things mixed up:
readNumber = readFile "number.txt" >>= return
the return is unecessary, just leave it out.
Here is a working version:
module Main where
import Data.Char (digitToInt)
main :: IO ()
main = intify >>= print
readNumber :: IO String
readNumber = readFile "number.txt"
intify :: IO [Int]
intify = fmap (map digitToInt) readNumber

Such a function can't exists, because you would be able to evaluate the length of the list without ever invoking any IO.
What is possible is this:
imbue' :: IO [a] -> IO [IO a]
imbue' = fmap $ map return
Which of course generalises to
imbue :: (Functor f, Monad m) => m (f a) -> m (f (m a))
imbue = liftM $ fmap return
You can then do, say,
quun :: IO [Char]
bar :: [IO Char] -> IO Y
main = do
actsList <- imbue quun
y <- bar actsLists
...
Only, the whole thing about using [IO Char] is pointless: it's completely equivalent to the much more straightforward way of working only with lists of "pure values", only using the IO monad "outside"; how to do that is shown in Markus's answer.

Do you really need many different helper functions? Because you may write just
main = do
file <- readFile "number.txt"
let digits = map digitToInt file
print digits
or, if you really need to separate them, try to minimize the amount of IO signatures:
readNumber = readFile "number.txt" --Will be IO String
intify = map digitToInt --Will be String -> [Int], not IO
main = readNumber >>= print . intify

Store Handle in custom data field?

One more stupid question =) I have a custom data type with Handle field:
import System.IO
data CustomType = CustomType {
file::Handle
}
How can I set a file field? I'm trying to use this obvious code:
let someFile = openFile fileName AppendMode
let object = CustomType {
file=someFile
}
but openFile has a type openFile :: FilePath -> IOMode -> IO Handle, so I've got an error
Couldn't match expected type `Handle' with actual type `IO Handle'
So how can I store Handle object in this field?
UPD
I'm trying also this
data CustomType = CustomType {
file::IO Handle
}
but this results to an error, when I'm using the hPutStrLn function
let object = CustomType {
file=someFile
}
hPutStrLn (file object)
Error message is:
Couldn't match expected type `Handle' with actual type `IO Handle'
In the return type of a call of `file'
In the first argument of `TO.hPutStrLn', namely `(file object)'
In a stmt of a 'do' block:
TO.hPutStrLn (file object) text

I'm not completely sure what you want. If you don't understand type errors involving IO type mismatches, you should probably read an introduction to IO in Haskell first. Anyway, this is code that works:
import System.IO
data CustomType = CustomType {
file :: Handle
}
fileName :: FilePath
fileName = "foo"
process :: IO ()
process = do
someFile <- openFile fileName AppendMode
let object = CustomType { file = someFile }
hPutStrLn (file object) "abc"
hClose (file object)
If you want to type the commands in GHCi instead, you can enter every line of the do-block in the process action individually, like this:
GHCi> someFile <- openFile fileName AppendMode
GHCi> let object = CustomType { file = someFile }
GHCi> hPutStrLn (file object) "abc"
GHCi> hClose (file object)

So you have created your type like this:
data CustomType = CustomType {
file::Handle
}
Now try this in ghci:
ghci> let a = openFile "someFile.txt" AppendMode
ghci> :t a
a :: IO Handle
So Handle is wrapped with IO type. You can use the Monad bind operator
to extract the Handle out of it.
ghci> let b = a >>= \handle -> return $ CustomType handle
The return function will wrap the CustomType again in the IO monad. You can verify this again in the ghci:
ghci> :t b
b :: IO CustomType
The type of bind or >>= is like this:
ghic> :t (>>=)
(>>=) :: Monad m => m a -> (a -> m b) -> m b
Try to replace m with IO and you get:
(>>=) :: IO a -> (a -> IO b) -> IO b
And that's why you have to use return function with the bind operator, so that it will typecheck.

So your confusion seems to be with how to use values inside a IO monad. The thing about the IO monad is that its infectious, so whenever you want to do something with a IO value you will get a IO result. This might sound like a pain in the ass, but its actually pretty nice in reality as it keeps the parts of your program pure and gives you complete control over when actions are performed. Instead of trying to get out of the IO monad you have to learn to embrace the haskell way of applying functions to monadic values. Every monad is a functor so you can apply a pure function to a monadic value with fmap. The extra power a monad gives is that it allows you to join contexts together. So if you have a IO a and a IO b you can join the IO's together to get IO (a, b). We can use this knowledge to solve your problem:
openFile has the signature:
openFile :: FilePath -> IOMode -> IO Handle
As mentioned above there is no way* to remove the IO from Handle, so the only thing you can do is to put your type inside the IO monad too. You can for example make this function that uses fmap to apply your object to the IO Handle:
createMyObject :: FilePath -> IO CustomType
createMyObject fp = CustomType `fmap` openFile fp AppendMode
Now you have your object but its in a IO monad so how do you use it? The outermost layer of your application is always in the IO monad. So your main function should have a signature like IO (). Inside the main function you can use other IO values like they are pure by using do notation. The (<-) keyword is kind of like join we talked about above. It draws the value from another IO into the current IO:
main :: IO ()
main = do
myObjectPure <- createMyObject "someFilePath.txt"
let myHandle :: Handle -- No IO!
myHandle = file myObjectPure
-- Now you can use the functions that takes a pure handle:
hPutStrLn myHandler "Yay"
By the way you probably shouldn't use a Handle directly this way because it will be really easy for you to forget to close it. Its better to use something like withFile which will close the handle for you when its done.
*Actually there is a way, but you don't need to know about it yet because its very unlikely that you are solving a problem that actually needs it, and its too easy to abuse for someone new.

Is there a way to unwrap a type from an IO monad?

I have this very simple function
import qualified Data.ByteString.Lazy as B
getJson :: IO B.ByteString
getJson = B.readFile jsonFile
readJFile :: IO (Maybe Response)
readJFile = parsing >>= (\d ->
case d of
Left err -> return Nothing
Right ps -> return (Just ps))
where parsing = fmap eitherDecode getJson :: IO (Either String Response)
where jsonFile is a path to a file on my harddrive (pardon the lack of do-notation, but I found this more clear to work with)
my question is; is there a way for me to ditch the IO part so I can work with the bytestring alone?
I know that you can pattern match on certain monads like Either and Maybe to get their values out, but can you do something similar with IO?
Or voiced differently: is there a way for me to make readJFile return Maybe Response without the IO?

To expand on my comments, here's how you can do it:
getJson :: IO B.ByteString
getJson = B.readFile jsonFile -- as before
readJFile :: B.ByteString -> Maybe Response -- look, no IO
readJFile b = case eitherDecode b of
Left err -> Nothing
Right ps -> Just ps
In the end, you combine everything in one IO action again:
getAndProcess :: IO (Maybe Response)
getAndProcess = do
b <- getJson
return (readJFile b)

You never need to "drag a monad" through any functions, unless they all need to actually do IO. Just lift the entire chain into the monad with fmap (or liftM / liftM2 / ...).
For instance,
f1 :: B.ByteString -> K
f2 :: K -> I
f3 :: K -> J
f4 :: I -> J -> M
and your entire thing is supposed to be like
m :: M
m = let k = "f1 getJson"
in f4 (f2 k) (f3 k)
The you can simply do
m = fmap (\b -> let k = f1 b
in f4 (f2 k) (f3 k) )
getJson
Incidentally, this might look nicer with do notation:
m = do
b <- getJson
return $ let k = f1 b
in f4 (f2 k) (f3 k)
Concerning you edit and the question
is there a way for me to make readJFile return Maybe Response without the IO?
No, that can't possibly work, because readJFile does need to do IO. There's no way escaping from the IO monad then, that's the whole point of it! (Well, there is unsafePerformIO as Ricardo says, but this is definitely not a valid application for it.)
If it's the clunkiness of unpacking Maybe values in the IO monad, and the signatures with parens in them, you may want to looks at the MaybeT transformer.
readJFile' :: MaybeT IO Response
readJFile' = do
b <- liftIO getJson
case eitherDecode b of
Left err -> mzero
Right ps -> return ps

No, there is no safe way to get a value out of the IO monad. Instead you should do the work inside the IO monad by applying functions with fmap or bind (>>=). Also you should use decode instead of eitherDecode when you want your result to be in Maybe.
getJson :: IO B.ByteString
getJson = B.readFile jsonFile
parseResponse :: B.ByteString -> Maybe Response
parseResponse = decode
readJFile :: IO (Maybe Response)
readJFile = fmap parseResponse getJSON
You could also use do notation if that is clearer to you:
readJFile :: IO (Maybe Response)
readJFile = do
bytestring <- getJson
return $ decode bytestring
Note that you dont even need the parseResponse function since readJFile specifies the type.

In general, yes, there is a way. Accompanied by a lot of "but", but there is. You're asking for what it's called an unsafe IO operation: System.IO.Unsafe. It's used to write wrappers when calling to external libraries usually, it's not something to resort to in regular Haskell code.
Basically, you can call unsafePerformIO :: IO a -> a which does exactly what you want, it strips out the IO part and gives you back wrapped value of type a. But, if you look at the documentation, there are a number of requirements which you should guarantee yourself to the system, which all end up in the same idea: even though you performed the operation via IO, the answer should be the result of a function, as expected from any other haskell function which does not operate in IO: it should always have the same result without side effects, only based on the input values.
Here, given your code, this is obviously NOT the case, since you're reading from a file. You should just continue working within the IO monad, by calling your readJFile from within another function with result type IO something. Then, you'll be able to read the value within the IO wrapper (being in IO yourself), work on it, and then re-wrap the result in another IO when returning.

How do I combine IOError exceptions with locally relevant exceptions?

I am building a Haskell application and trying to figure out how I am going to build the error handling mechanism. In the real application, I'm doing a bunch of work with Mongo. But, for this, I'm going to simplify by working with basic IO operations on a file.
So, for this test application, I want to read in a file and verify that it contains a proper fibonnacci sequence, with each value separated by a space:
1 1 2 3 5 8 13 21
Now, when reading the file, any number of things could actually be wrong, and I am going to call all of those exceptions in the Haskell usage of the word.
data FibException = FileUnreadable IOError
| FormatError String String
| InvalidValue Integer
| Unknown String
instance Error FibException where
noMsg = Unknown "No error message"
strMsg = Unknown
Writing a pure function that verifies the sequence and throws an error in the case that the sequence is invalid is easy (though I could probably do better):
verifySequence :: String -> (Integer, Integer) -> Either FibException ()
verifySequence "" (prev1, prev2) = return ()
verifySequence s (prev1, prev2) =
let readInt = reads :: ReadS Integer
res = readInt s in
case res of
[] -> throwError $ FormatError s
(val, rest):[] -> case (prev1, prev2, val) of
(0, 0, 1) -> verifySequence rest (0, 1)
(p1, p2, val') -> (if p1 + p2 /= val'
then throwError $ InvalidValue val'
else verifySequence rest (p2, val))
_ -> throwError $ InvalidValue val
After that, I want the function that reads the file and verifies the sequence:
type FibIOMonad = ErrorT FibException IO
verifyFibFile :: FilePath -> FibIOMonad ()
verifyFibFile path = do
sequenceStr <- liftIO $ readFile path
case (verifySequence sequenceStr (0, 0)) of
Right res -> return res
Left err -> throwError err
This function does exactly what I want if the file is in the invalid format (it returns Left (FormatError "something")) or if the file has a number out of sequence (Left (InvalidValue 15)). But it throws an error if the file specified does not exist.
How do I catch the IO errors that readFile may produce so that I can transform them into the FileUnreadable error?
As a side question, is this even the best way to do it? I see the advantage that the caller of verifyFibFile does not have to set up two different exception handling mechanisms and can instead catch just one exception type.

You might consider EitherT and the errors package in general. http://hackage.haskell.org/packages/archive/errors/1.3.1/doc/html/Control-Error-Util.html has a utility tryIO for catching IOError in EitherT and you could use fmapLT to map error values to your custom type.
Specifically:
type FibIOMonad = EitherT FibException IO
verifyFibFile :: FilePath -> FibIOMonad ()
verifyFibFile path = do
sequenceStr <- fmapLT FileUnreadable (tryIO $ readFile path)
hoistEither $ verifySequence sequenceStr (0, 0)

#Savanni D'Gerinel: you are on the right track. Let's extract your error-catching code from verifyFibFile to make it more generic, and modify it slightly so that it works directly in ErrorT:
catchError' :: ErrorT e IO a -> (IOError -> ErrorT e IO a) -> ErrorT e IO a
catchError' m f =
ErrorT $ catchError (runErrorT m) (fmap runErrorT f)
verifyFibFile can now be written as:
verifyFibFile' :: FilePath -> FibIOMonad ()
verifyFibFile' path = do
sequenceStr <- catchError' (liftIO $ readFile path) (throwError . FileUnReadable)
ErrorT . return $ verifySequence sequenceStr' (0, 0)
Notice what we have done in catchError'. We have stripped the ErrorT constructor from the ErrorT e IO a action, and also from the return value of the error-handling function, knowing than we can reconstruct them afterwards by wrapping the result of the control operation in ErrorT again.
Turns out that this is a common pattern, and it can be done with monad transformers other than ErrorT. It can get tricky though (how to do this with ReaderT for example?). Luckily, the monad-control packgage already provides this functionality for many common transformers.
The type signatures in monad-control can seem scary at first. Start by looking at just one function: control. It has the type:
control :: MonadBaseControl b m => (RunInBase m b -> b (StM m a)) -> m a
Let's make it more specific by making b be IO:
control :: MonadBaseControl IO m => (RunInBase m IO -> IO (StM m a)) -> m a
m is a monad stack built on top of IO. In your case, it would be ErrorT IO.
RunInBase m IO is a type alias for a magical function, that takes a value of type m a and returns a value of type IO *something*, something being some complex magic that encodes the state of the whole monad stack inside IO and lets you reconstruct the m a value afterwards, once you have "fooled" the control operation that only accepts IO values. control provides you with that function, and also handles the reconstruction for you.
Applying this to your problem, we rewrite verifyFibFile once more as:
import Control.Monad.Trans.Control (control)
import Control.Exception (catch)
verifyFibFile'' :: FilePath -> FibIOMonad ()
verifyFibFile'' path = do
sequenceStr <- control $ \run -> catch (run . liftIO $ readFile path)
(run . throwError . FileUnreadable)
ErrorT . return $ verifySequence sequenceStr' (0, 0)
Keep in mind that this only works when the proper instance of MonadBaseControl b m exists.
Here is a nice introduction to monad-control.

So, here's an answer that I have developed. It centers around getting readFile wrapped into the proper catchError statement, and then lifted.
verifyFibFile :: FilePath -> FibIOMonad ()
verifyFibFile path = do
contents <- liftIO $ catchError (readFile path >>= return . Right) (return . Left . FileUnreadable)
case contents of
Right sequenceStr' -> case (verifySequence sequenceStr' (0, 0)) of
Right res -> return res
Left err -> throwError err
Left err -> throwError err
So, verifyFibFile gets a little more nested in this solution.
readFile path has type IO String, obviously. In this context, the type for catchError will be:
catchError :: IO String -> (IOError -> IO String) -> IO String
So, my strategy was to catch the error and turn it into the left side of an Either, and turn the successful value into the right side, changing my data type to this:
catchError :: IO (Either FibException String) -> (IOError -> IO (Either FibException String)) -> IO (Either FibException String)
I do this by, in the first parameter, simply wrapping the result into Right. I figure that I won't actually execute the return . Right branch of the code unless readFile path was successful. In the other parameter to catch, I start with an IOError, wrap it in Left, and then return it back into the IO context. After that, no matter what the result is, I lift the IO value up into the FibIOMonad context.
I'm bothered by the fact that the code gets even more nested. I have Left values, and all of those Left values get thrown. I'm basically in an Either context, and I had thought that one of the benefits Either's implementation of the Monad class was that Left values would simply be passed along through the binding operations and that no further code in that context would be executed. I would love some elucidation on this, or to see how the nesting can be removed from this function.
Maybe it can't. It does seem that the caller, however, can call verifyFibFile repeatedly and execution basically stops the first time verifyFibFile returns an error. This works:
runTest = do
res <- verifyFibFile "goodfib.txt"
liftIO $ putStrLn "goodfib.txt"
--liftIO $ printResult "goodfib.txt" res
res <- verifyFibFile "invalidValue.txt"
liftIO $ putStrLn "invalidValue.txt"
res <- verifyFibFile "formatError.txt"
liftIO $ putStrLn "formatError.txt"
Main> runErrorT $ runTest
goodfib.txt
Left (InvalidValue 17)
Given the files that I have created, both invalidValue.txt and formatError.txt cause errors, but this function returns Left (InvalidValue ...) for me.
That's okay, but I still feel like I've missed something with my solution. And I have no idea whether I'll be able to translate this into something that makes MongoDB access more robust.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

How withFile is implemented in haskell - haskell

Related

How to use readFile

After reading a file I have IO [Char], but I need [IO Char]

Store Handle in custom data field?

Is there a way to unwrap a type from an IO monad?

How do I combine IOError exceptions with locally relevant exceptions?

Categories

Resources