Why is there a difference between throw and throwIO?

I am trying to get a firm grasp of exceptions, so that I can improve my conditional loop implementation. To this end, I am staging various experiments, throwing stuff and seeing what gets caught.
This one surprises me to no end:
% cat X.hs
module Main where
import Control.Exception
import Control.Applicative
main = do
    throw (userError "I am an IO error.") <|> print "Odd error ignored."
% ghc X.hs && ./X
...
X: user error (I am an IO error.)
% cat Y.hs
module Main where
import Control.Exception
import Control.Applicative
main = do
    throwIO (userError "I am an IO error.") <|> print "Odd error ignored."
% ghc Y.hs && ./Y
...
"Odd error ignored."
I thought that the Alternative should ignore exactly IO errors. (Not sure where I got this idea from, but I certainly could not offer a non-IO exception that would be ignored in an Alternative chain.) So I figured I can hand craft and deliver an IO error. Turns out, whether it gets ignored depends on the packaging as much as the contents: if I throw an IO error, it is somehow not anymore an IO error.
I am completely lost. Why does it work this way? Is it intended? The definitions lead deep into the GHC internal modules; while I can more or less understand the meaning of disparate fragments of code by themselves, I am having a hard time seeing the whole picture.
Should one even use this Alternative instance if it is so difficult to predict? Would it not be better if it silenced any synchronous exception, not just some small subset of exceptions that are defined in a specific way and thrown in a specific way?

throw is a generalization of undefined and error; it is meant to throw an exception in pure code. When the value of the exception does not matter (which is most of the time), it is denoted by the symbol ⟘ for an "undefined value".
throwIO is an IO action which throws an exception, but is not itself an undefined value.
The documentation of throwIO thus illustrates the difference:
throw e `seq` x ===> throw e
throwIO e `seq` x ===> x
The catch is that (<|>) is defined as mplusIO which uses catchException which is a strict variant of catch. That strictness is summarized as follows:
⟘ <|> x = ⟘
hence you get an exception (and x is never run) in the throw variant.
Note that, without strictness, an "undefined action" (i.e., throw ... :: IO a) actually behaves like an action that throws from the point of view of catch:
catch (throw (userError "oops")) (\(e :: SomeException) -> putStrLn "caught") -- caught
catch (throwIO (userError "oops")) (\(e :: SomeException) -> putStrLn "caught") -- caught
catch (pure (error "oops")) (\(e :: SomeException) -> putStrLn "caught") -- not caught
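The same difference can be observed from inside IO by wrapping each expression in try. This is only a minimal sketch, assuming a GHC whose (<|>) for IO is strict in its left operand as described above; the names viaThrow, viaThrowIO and demo are made up for illustration.

```haskell
import Control.Applicative ((<|>))
import Control.Exception (IOException, throw, throwIO, try)

-- The left operand is a bottom of type IO String: there is no action to run.
viaThrow :: IO String
viaThrow = throw (userError "oops") <|> pure "fallback"

-- The left operand is a real action that fails when it is executed.
viaThrowIO :: IO String
viaThrowIO = throwIO (userError "oops") <|> pure "fallback"

-- Run both and report whether the exception escaped the (<|>).
demo :: IO ()
demo = do
    r1 <- try viaThrow :: IO (Either IOException String)
    putStrLn ("throw:   " ++ either (const "exception escaped") id r1)
    r2 <- try viaThrowIO :: IO (Either IOException String)
    putStrLn ("throwIO: " ++ either (const "exception escaped") id r2)
```

Loading this in GHCi and running demo should report an escaped exception for the throw variant and the fallback value for the throwIO variant, matching the X.hs and Y.hs outputs in the question.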

Say you have
x :: Integer
That means that x should be an integer, of course.
x = throw _whatever
What does that mean? It means that there was supposed to be an Integer, but instead there’s just a mistake.
Now consider
x :: IO ()
That means x should be an I/O-performing program that returns no useful value. Remember, IO values are just values. They are values that just happen to represent imperative programs. So now consider
x = throw _whatever
That means that there was supposed to be an I/O-performing program there, but there is instead just a mistake. x is not a program that throws an error—there is no program. Regardless of whether you’ve used an IOError, x isn’t a valid IO program. When you try to execute the program
x <|> _whatever
You have to execute x to see whether it throws an error. But, you can’t execute x, because it’s not a program—it’s a mistake. Instead, everything explodes.
This differs significantly from
x = throwIO _whatever
Now x is a valid program. It is a valid program that always happens to throw an error, but it’s still a valid program that can actually be executed. When you try to execute
x <|> _whatever
now, x is executed, the error produced is discarded, and _whatever is executed in its place. You can also think of there being a difference between computing a program/figuring out what to execute and actually executing it. throw throws the error while computing the program to execute (it is a "pure exception"), while throwIO throws it during execution (it is an "impure exception"). This also explains their types: throw returns any type because all types can be "computed", but throwIO is restricted to IO because only programs can be executed.
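One way to see the "computing versus executing" distinction is to force an IO value with seq without ever running it. The sketch below uses only Control.Exception from base; forcedOnly and demo are illustrative names, not library functions.

```haskell
import Control.Exception (ArithException (..), evaluate, throw, throwIO, try)

-- Evaluate an IO value to weak head normal form without executing it,
-- and capture an ArithException if forcing alone raises one.
forcedOnly :: IO () -> IO (Either ArithException ())
forcedOnly action = try (evaluate (action `seq` ()))

demo :: IO ()
demo = do
    -- throwIO builds a real action; forcing it without running it is harmless.
    r1 <- forcedOnly (throwIO Underflow)
    print r1
    -- throw is a bottom; merely forcing it raises the exception.
    r2 <- forcedOnly (throw Underflow)
    print r2
```

Here r1 comes back as a Right and r2 as a Left, because only the throw version fails while the program is still being computed.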
This is further complicated by the fact that you can catch the pure exceptions that occur while executing IO programs. I believe this is a design compromise. From a theoretical perspective, you shouldn't be able to catch pure exceptions, because their presence should always be taken to indicate programmer error, but that can be rather embarrassing, because then you can only handle external errors, while programmer errors cause everything to blow up. If we were perfect programmers, that would be fine, but we aren't. Therefore, you are allowed to catch pure exceptions.
is :: [Int]
is = []
-- fails, because the print causes a pure exception
-- it was a programmer error to call head on is without checking that it,
-- in fact, had a head in the first place
-- (the program on the left is not valid, so main is invalid)
main1 = print (head is) <|> putStrLn "Oops"
-- throws exception
-- catch creates a program that computes and executes the program print (head is)
-- and catches both impure and pure exceptions
-- the program on the left is invalid, but wrapping it with catch
-- makes it valid again
-- really, that shouldn't happen, but this behavior is useful
main2 = print (head is) `catch` (\(_ :: SomeException) -> putStrLn "Oops")
-- prints "Oops"

The rest of this answer may not be entirely correct. But fundamentally, the difference is this: throwIO terminates and returns an IO action, while throw does not terminate.
As soon as you try to evaluate throw (userError "..."), your program aborts. <|> never gets a chance to look at its first argument to decide if the second argument should be evaluated; in fact, it never gets the first argument, because throw didn't return a value.
With throwIO, <|> isn't evaluating anything; it's creating a new IO action which, when it does get executed, will first look at its first argument. The runtime can "safely" execute the IO action and see that it does not, in fact, provide a value, at which point it can stop and try the other "half" of the <|> expression.

Related

Attempting to return a default value if converting strings to ints fails [duplicate]

In my Haskell program, I want to read in a value given by the user using the getLine function. I then want to use the read function to convert this value from a string to the appropriate Haskell type. How can I catch parse errors thrown by the read function and ask the user to reenter the value?
Am I right in thinking that this is not an "IO Error" because it is not an error caused by the IO system not functioning correctly? It is a semantic error, so I can't use IO error handling mechanisms?

You don't want to. You want to use reads instead, possibly like that:
maybeRead :: Read a => String -> Maybe a
maybeRead = fmap fst . listToMaybe . reads
(though you might want to error out if the second element of the tuple is not "", that is, if there's a remaining string, too)
The reason why you want to use reads instead of catching error exceptions is that exceptions in pure code are evil, because it's very easy to attempt to catch them in the wrong place: Note that they only fly when they are forced, not before. Locating where that is can be a non-trivial exercise. That's (one of the reasons) why Haskell programmers like to keep their code total, that is, terminating and exception-free.
You might want to have a look at a proper parsing framework (e.g. parsec) and haskeline, too.

There are readMaybe and readEither, which do what you expect. You can find these functions in the Text.Read module.
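A re-prompting loop built on readMaybe could look like the sketch below. parseInt and askInt are hypothetical helper names, and the prompt texts are arbitrary.

```haskell
import Text.Read (readMaybe)

-- Parse an Int, returning Nothing instead of throwing on bad input.
parseInt :: String -> Maybe Int
parseInt = readMaybe

-- Keep asking until the user enters something that parses.
askInt :: IO Int
askInt = do
    putStrLn "Enter a number:"
    line <- getLine
    case parseInt line of
        Just n  -> pure n
        Nothing -> putStrLn "Not a number, try again." >> askInt
```

No exception handling is involved: the parse failure is an ordinary Nothing value, so it cannot be missed because of laziness.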

This is an addendum to @barsoap's answer more than anything else.
Haskell exceptions may be thrown anywhere, including in pure code, but they may only be caught from within the IO monad. In order to catch exceptions thrown by pure code, you need to use a catch or try on the IO statement that would force the pure code to be evaluated.
import Control.Exception (SomeException, try)

str2Int :: String -> Int -- shortcut so I don't need to add type annotations everywhere
str2Int = read

main :: IO ()
main = do
    print (str2Int "3") -- ok
    -- print (str2Int "a") -- raises exception
    eVal <- try (print (str2Int "a")) :: IO (Either SomeException ())
    case eVal of
        Left _   -> putStrLn "couldn't parse input, try again"
        Right () -> putStrLn "could parse the number, go ahead"
You should use something more specific than SomeException because that will catch anything. In the above code, the try will return a Left exception if read can't parse the string, but it will also return a Left exception if there's an IO error when trying to print the value, or any number of other things that could possibly go wrong (out of memory, etc.).
Now, here's why exceptions from pure code are evil. What if the IO code doesn't actually force the result to be evaluated?
main2 :: IO ()
main2 = do
    inputStr <- getLine
    -- "data" is a reserved word in Haskell, so the list is called "values"
    let values = [0, 1, read inputStr] :: [Int]
    eVal <- try (print (head values)) :: IO (Either SomeException ())
    case eVal of
        Right () -> putStrLn "no exception thrown, so the user entered a number?!"
        Left _   -> putStrLn "got an exception, probably couldn't read user input"
If you run this, you'll find that you always end up in the Right branch of the case statement, no matter what the user entered. This is because the IO action passed to try never forces read on the entered string. It prints the first element of the list values, which is a constant, and never touches the rest of the list. So in the Right branch, the coder thinks the input has been parsed, but it hasn't, and read may still throw an exception later.
read is meant for unserializing data, not parsing user-entered input. Use reads, or switch to a real parser combinator library. I like uu-parsinglib, but parsec, polyparse, and many others are good too. You'll very likely need the extra power before long anyway.
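If you do end up catching the exception from read anyway, evaluate at least lets you choose where the value is forced. This is a sketch only, assuming read reports a failed parse as an ErrorCall (which is how base implements it); readIntStrict and firstOnly are illustrative names.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- Force the parsed value inside try, so a failed read surfaces right here.
readIntStrict :: String -> IO (Either ErrorCall Int)
readIntStrict s = try (evaluate (read s))

-- Only the first list element is forced; the read in the tail is never touched,
-- so no exception is seen even for unparsable input.
firstOnly :: String -> IO (Either ErrorCall Int)
firstOnly s = try (evaluate firstElem)
  where
    values = [0, 1, read s] :: [Int]
    firstElem = case values of
        (x : _) -> x
        []      -> 0
```

With unparsable input such as "a", readIntStrict yields a Left while firstOnly still yields Right 0, which is the trap described above.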

Here's an improved maybeRead which allows trailing whitespace, but nothing else:
import Data.Maybe
import Data.Char
maybeRead2 :: Read a => String -> Maybe a
maybeRead2 = fmap fst . listToMaybe . filter (null . dropWhile isSpace . snd) . reads

What's the difference between throw and throwIO

The documentation says:
The throwIO variant should be used in preference to throw to raise an exception within the IO monad because it guarantees ordering with respect to other IO operations, whereas throw does not.
I'm still confused after reading it. Is there an example to show that throw will cause a problem whereas throwIO will not?
Additional question:
Is the following statement correct?
If throw is being used to throw an exception in an IO, then the order of the exception is not guaranteed.
If throw is being used to throw an exception in a non-IO value, then the order of the exception is guaranteed.
If I need to throw an exception in a monad transformer, where I have to use throw instead of throwIO, is the order of the exception guaranteed?

I think the docs could be improved. The issue you need to keep in mind with throw and the like is that throw returns a bottom value that "explodes" (raises an exception) when evaluated; but whether and when evaluation happens is difficult to control due to laziness.
For example:
Prelude Control.Exception> let f n = if odd n then throw Underflow else True
Prelude Control.Exception> snd (f 1, putStrLn "this is fine")
this is fine
This could arguably be what you want to happen, but usually not. e.g. rather than the tuple above you might end up with a big data structure with a single exploding element that causes an exception to be raised after your web server returns 200 to the user, or something.
throwIO allows you to sequence raising an exception just as if it was another IO action, so it can be tightly controlled:
Prelude Control.Exception> throwIO Underflow >> putStrLn "this is fine"
*** Exception: arithmetic underflow
...just like doing print 1 >> print 2.
But note that you can actually replace throwIO with throw, for instance:
Prelude Control.Exception> throw Underflow >> putStrLn "this is fine"
*** Exception: arithmetic underflow
This works because the exploding value now has type IO a. It's actually not clear to me why throwIO exists other than to document an idiom. Maybe someone else can answer that.
As a final example, this has the same issue as my first example:
Prelude Control.Exception> return (throw Underflow) >> putStrLn "this is fine"
this is fine
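The three GHCi sessions above can be put side by side in one small program. This is a sketch with made-up names (observe, sequenced, bottomAction, hiddenInResult), using try so that the outcome is a value rather than a crash.

```haskell
import Control.Exception (ArithException (..), throw, throwIO, try)

-- Run an action and capture an ArithException if one escapes.
observe :: IO String -> IO (Either ArithException String)
observe = try

sequenced, bottomAction, hiddenInResult :: IO String
-- throwIO is an ordinary step in the sequence; the second step never runs.
sequenced = throwIO Underflow >> pure "this is fine"
-- throw at type IO a: running the bottom action raises the exception.
bottomAction = throw Underflow >> pure "this is fine"
-- The bottom is only the result of return, and (>>) discards it unforced.
hiddenInResult = return (throw Underflow) >> pure "this is fine"
```

Passing each action to observe gives a Left for the first two and a Right for the last one, where the exploding value is never evaluated.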

throw is a generalization of undefined, while throwIO is an actual IO action. A key difference is that many laws don't quite hold when strictness is considered (i.e., when you have undefined (or throw) and seq).
> (throw Underflow :: IO ()) `seq` ()
*** Exception: arithmetic underflow
> (throw Underflow >>= pure) `seq` ()
()
This contradicts the law m >>= pure = m. throwIO doesn't have that issue, so it is the more principled way of throwing exceptions.

What is the purpose of instance Alternative IO?

This instance doesn't seem to behave properly:
> guard True <|> guard False
> guard False <|> guard False
*** Exception: user error (mzero)
One might argue that this cannot result in anything else. But why define such an instance in the first place? Is there any good reason to result in _|_ whenever evaluation does not make sense?

The purpose of the Alternative instance for IO is to combine IO actions that might fail (by causing an IO error or otherwise throwing an exception) into a single IO action that "tries" multiple actions in turn, accepting the first successful one, or -- if all actions fail -- fails itself.
So, something like this would work to read one or more lines (using some) from standard input or else (using <|>) complain if no lines are available:
main = (print =<< some getLine) <|> putStrLn "No input!"
or you could write something like:
readConfig :: IO Config
readConfig = readConfigFile "~/.local/myapp/config"
         <|> readConfigFile "/etc/myapp/config"
         <|> return defaultConfig
Given this, it makes perfect sense that:
guard False <|> guard False
represents an action that, when executed, must fail by generating an exception. If it didn't, as @danidaz has pointed out, then executing the action:
guard False <|> guard False <|> putStrLn "success!"
wouldn't work to execute the third action. Since <|> is left associative and tries its left action before its right, executing the value of this expression would just execute whatever successful action guard False <|> guard False represented (e.g., return () or whatever) and never try putStrLn "success!".
There's a subtlety here that may be throwing you off. Contrary to first appearances, the value of:
guard False <|> guard False
isn't _|_ in the usual sense. Rather it's a perfectly well defined IO action that, if executed will fail to terminate in the sense of throwing an exception. That type of non-termination is still useful, though, because we can catch it (by adding another <|> alternative, for example!).
Also note, because you haven't supplied a better exception, a default exception of userError "mzero" is thrown. If you had instead caused failure via:
ioError (userError "one") <|> ioError (userError "two")
you'd see that if all actions fail, the last exception thrown is the one that gets thrown by the composite action.
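Both behaviours, falling back to a later alternative and rethrowing the last failure, can be checked with a short sketch. failWith is a hypothetical stand-in for an action such as readConfigFile that fails with an IO error; the other names are illustrative too.

```haskell
import Control.Applicative ((<|>))
import Control.Exception (IOException, try)
import System.IO.Error (ioeGetErrorString)

-- Hypothetical stand-in for a loader that always fails with the given message.
failWith :: String -> IO String
failWith = ioError . userError

-- The first success wins; later alternatives are not run.
withDefault :: IO String
withDefault = failWith "one" <|> pure "default" <|> failWith "never run"

-- Every alternative fails, so the composite fails too; capture the message.
allFail :: IO (Either String String)
allFail = do
    r <- try (failWith "one" <|> failWith "two")
    pure (either (Left . message) Right r)
  where
    message :: IOException -> String
    message = ioeGetErrorString
```

withDefault produces "default", and allFail produces Left "two", the message of the last alternative that was tried.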

asum from Data.Foldable can be useful to repeat an IOException-throwing action a number of times, until it succeeds or fails altogether:
import Data.Foldable (asum)
import Control.Monad
import Control.Exception
import System.Random -- from the "random" package
diceRoll :: IO Int
diceRoll = do
    putStrLn "hi"
    r <- randomRIO (0,20)
    if r < 18
        then throwIO (userError (show r))
        else return r
main :: IO ()
main = do
    r <- asum $ take 7 $ repeat diceRoll
    print r
Given the "return the result of the first action that doesn't throw" semantics, empty must be an action that throws an exception. Otherwise it wouldn't work as a neutral element, for example in empty <|> return 4.
This is not that different from how the Alternative instance for Maybe behaves. There, asum returns the first non-Nothing value in a sequence of Maybes.
(Another "strange" empty is the one for the Alternative instance of Concurrently, which just waits forever. The <|> races two actions against each other.)
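For a version of the retry idea that needs no extra package and behaves deterministically, the random dice roll can be swapped for a counter. This is only a sketch: flaky and retry are hypothetical helpers, not library functions.

```haskell
import Control.Exception (throwIO)
import Data.Foldable (asum)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical flaky action: fails until it has been called `needed` times.
flaky :: IORef Int -> Int -> IO Int
flaky counter needed = do
    modifyIORef' counter (+ 1)
    n <- readIORef counter
    if n < needed
        then throwIO (userError ("attempt " ++ show n ++ " failed"))
        else pure n

-- Try the action up to `limit` times, stopping at the first success.
retry :: Int -> IO a -> IO a
retry limit action = asum (replicate limit action)

demo :: IO ()
demo = do
    counter <- newIORef 0
    r <- retry 7 (flaky counter 3) -- succeeds on the third attempt
    print r
```

If the limit is reached without a success, the final alternative is empty, so the composite action fails with the default mzero user error.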

While not explicitly documented with Alternative, instances should essentially obey the following laws:
pure x <|> y = pure x
empty <|> x = x
You can intuit this as implementing some notion of “truthiness” and “falsiness”, where pure x is always truthy and empty is always falsy.
For this to make any sense for IO, we need some notion of truthiness. There aren’t many good ones, but IO has the ability to handle exceptions, so we can define truthy IO actions as actions that produce a value and falsy IO actions as actions that throw exceptions (in base, the instance only recovers from IOException; other exceptions propagate). Therefore, (<|>) for IO runs its first argument, and if it produces a value without throwing an exception, it returns the value; otherwise, it runs its second argument.
We now have a definition of (<|>) for IO, but what should empty be? Well, empty must be falsy, and we have defined falsiness on IO as “throwing an exception”. Therefore, empty must be an action that throws an exception.
The guard function is very simple, since it is just pure () when given True and empty when given False. This means your examples are really equivalent to the following:
empty <|> pure ()
empty <|> empty
In the first example, empty throws, so (<|>) catches it and returns pure (), which obviously produces (). In the second example, the same thing happens, except that the second argument is also empty, so the expression’s result also throws an exception.
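These cases can be checked mechanically by reducing each action to "did it succeed?". A small sketch, where succeeds and demo are made-up names:

```haskell
import Control.Applicative (empty, (<|>))
import Control.Exception (IOException, try)
import Control.Monad (guard)

-- Run an action and report only whether it succeeded or threw an IOException.
succeeds :: IO a -> IO Bool
succeeds action = either onError (const True) <$> try action
  where
    onError :: IOException -> Bool
    onError _ = False

demo :: IO ()
demo = do
    succeeds (guard True <|> guard False) >>= print  -- left side succeeds
    succeeds (guard False <|> guard True) >>= print  -- falls through to the right
    succeeds (guard False <|> guard False) >>= print -- both sides are empty
    succeeds (empty <|> pure 'x') >>= print          -- empty <|> x behaves like x
```

Only the third line reports a failure, which is the "falsy" outcome in the terminology above.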

How to catch (and ignore) a call to the error function?

I'm surprised I couldn't find an answer to this anywhere.
I'm writing a roguelike and I'm using the ncurses library from Hackage, which is a pretty good wrapper around the ncurses library. Now ncurses has this quirk where if you try to write the bottom right character, it does so, then it tries to move the cursor to the next character, then it fails because there's nowhere to move it to. It returns an error value that you can only ignore.
My problem is that the Haskell ncurses library writer dutifully checks for any errors on all calls, and when there is one, he calls: error "drawText: etc etc.".
In other languages, like C or Python, to get around this you are forced to ignore the error or catch and ignore the exception, but for the life of me I can't figure out how to do it in Haskell. Is the error function unrecoverable?
I will modify the library locally to not check for errors on that function if I have to, but I hate to do that. I'm also open to any workaround that would allow me to draw that last character without moving the cursor, but I don't think that is possible.

You can do this using catch from Control.Exception. Note, however, that you need to be in the IO monad to do this.
import qualified Control.Exception as Exc
divide :: Float -> Float -> Float
divide x 0 = error "Division by 0."
divide x y = x / y
main :: IO ()
main = Exc.catch (print $ divide 5 0) handler
  where
    handler :: Exc.ErrorCall -> IO ()
    handler _ = putStrLn "You divided by 0!"
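If the goal is simply to ignore the error and carry on, the handler can do nothing. The sketch below assumes the failing call can be run as a plain IO action, which may not hold directly for code inside a library's own monad; ignoreErrorCall is a hypothetical helper name.

```haskell
import Control.Exception (ErrorCall, evaluate, handle)

-- Run an action and swallow any ErrorCall it raises.
-- Other exception types still propagate.
ignoreErrorCall :: IO () -> IO ()
ignoreErrorCall = handle swallow
  where
    swallow :: ErrorCall -> IO ()
    swallow _ = pure ()

demo :: IO ()
demo = do
    -- Stand-in for a library call that ends in `error "..."`.
    ignoreErrorCall (evaluate (error "drawText: failed") >> pure ())
    putStrLn "still running"
```

Restricting the handler to ErrorCall keeps unrelated failures, such as IO errors, visible.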

error is supposed to be as observable as an infinite loop. You can only catch error in IO, which is like saying "yeah you can if you know magic". But from the really nice part of Haskell, pure code, it is unrecoverable, and thus it is strongly advised not to use it in your code, any more than you would use an infinite loop as an error code.
ncurses is being rude and making you do magic to correct it. I'd say unsafePerformIO would be warranted to clean it up. Other than that, this is largely the same as Paul's answer.
import qualified Control.Exception as Exc
import System.IO.Unsafe (unsafePerformIO)

{-# NOINLINE unsafeCleanup #-}
unsafeCleanup :: a -> Maybe a
unsafeCleanup x = unsafePerformIO $ Exc.catch (x `seq` return (Just x)) handler
  where
    handler exc = return Nothing `const` (exc :: Exc.ErrorCall)
Then wrap unsafeCleanup around any value that would evaluate to an error to turn it into a Maybe.
This is available in the spoon package if you don't want to write it yourself (and you shouldn't -- exception code can be really tricky, especially in the presence of threads).
