What is the purpose of instance Alternative IO?

This instance doesn't seem to behave properly:
> guard True <|> guard False
> guard False <|> guard False
*** Exception: user error (mzero)
One might argue that this cannot result in anything else. But why define such an instance in the first place? Is there any good reason to produce _|_ whenever evaluation does not make sense?

The purpose of the Alternative instance for IO is to combine IO actions that might fail (by raising an IO error, that is, an IOException; other exception types are not caught by <|>) into a single IO action that "tries" multiple actions in turn, accepting the first successful one, or -- if all actions fail -- fails itself.
So, something like this would work to read one or more lines (using some) from standard input or else (using <|>) complain if no lines are available:
main = (print =<< some getLine) <|> putStrLn "No input!"
or you could write something like:
readConfig :: IO Config
readConfig = readConfigFile "~/.local/myapp/config"
  <|> readConfigFile "/etc/myapp/config"
  <|> return defaultConfig
Given this, it makes perfect sense that:
guard False <|> guard False
represents an action that, when executed, must fail by generating an exception. If it didn't, as @danidaz has pointed out, then executing the action:
guard False <|> guard False <|> putStrLn "success!"
wouldn't work to execute the third action. Since <|> is left associative and tries its left action before its right, executing the value of this expression would just execute whatever successful action guard False <|> guard False represented (e.g., return () or whatever) and never try putStrLn "success!".
There's a subtlety here that may be throwing you off. Contrary to first appearances, the value of:
guard False <|> guard False
isn't _|_ in the usual sense. Rather, it's a perfectly well-defined IO action that, if executed, will fail to terminate normally, in the sense that it throws an exception. That kind of failure is still useful, though, because we can catch it (by adding another <|> alternative, for example!).
Also note, because you haven't supplied a better exception, a default exception of userError "mzero" is thrown. If you had instead caused failure via:
ioError (userError "one") <|> ioError (userError "two")
you'd see that if all actions fail, the last exception thrown is the one that gets thrown by the composite action.
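As a rough check of that last point, the sketch below wraps a composite action in try and reports which IOError message, if any, escaped. The helper name escapedMessage is made up for illustration; try comes from Control.Exception and ioeGetErrorString from System.IO.Error.

```haskell
import Control.Applicative ((<|>))
import Control.Exception (try)
import System.IO.Error (ioeGetErrorString)

-- Hypothetical helper: run an action and report the message of any
-- IOError that escapes from it (Nothing means the action succeeded).
escapedMessage :: IO () -> IO (Maybe String)
escapedMessage act = do
  r <- try act :: IO (Either IOError ())
  pure (either (Just . ioeGetErrorString) (const Nothing) r)

main :: IO ()
main = do
  -- Both alternatives fail: the error from the last one is what escapes.
  escapedMessage (ioError (userError "one") <|> ioError (userError "two")) >>= print
  -- The second alternative succeeds: nothing escapes.
  escapedMessage (ioError (userError "one") <|> pure ()) >>= print
```

Run as written, this should print Just "two" followed by Nothing.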

asum from Data.Foldable can be useful to repeat an IOException-throwing action a number of times, until it succeeds or fails altogether:
import Data.Foldable (asum)
import Control.Monad
import Control.Exception
import System.Random -- from the "random" package
diceRoll :: IO Int
diceRoll = do
  putStrLn "hi"
  r <- randomRIO (0,20)
  if r < 18
    then throwIO (userError (show r))
    else return r
main :: IO ()
main = do
  r <- asum $ take 7 $ repeat diceRoll
  print r
Given the "return the result of the first action that doesn't throw" semantics, empty must be an action that throws an exception. Otherwise it wouldn't work as a neutral element, for example in empty <|> return 4.
This is not that different from how the Alternative instance for Maybe behaves. There, asum returns the first non-Nothing value in a sequence of Maybes.
(Another "strange" empty is the one for the Alternative instance of Concurrently, which just waits forever. The <|> races two actions against each other.)
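The parallel between Maybe and IO described in this answer can be seen in a small sketch. The names firstJust and neutralIO are invented for the example; everything else comes from base.

```haskell
import Control.Applicative (empty, (<|>))
import Data.Foldable (asum)

-- For Maybe, asum picks the first non-Nothing value.
firstJust :: Maybe Int
firstJust = asum [Nothing, Just 1, Just 2]

-- For IO, empty throws, (<|>) recovers, and the second action supplies the result.
neutralIO :: IO Int
neutralIO = empty <|> return 4

main :: IO ()
main = do
  print firstJust      -- prints Just 1
  neutralIO >>= print  -- prints 4
```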

While not explicitly documented with Alternative, instances should essentially obey the following laws:
pure x <|> y = pure x
empty <|> x = x
You can intuit this as implementing some notion of “truthiness” and “falsiness”, where pure x is always truthy and empty is always falsy.
For this to make any sense for IO, we need some notion of truthiness. There aren’t many good ones, but IO has the ability to handle exceptions, so we can define truthy IO actions as actions that produce a value and falsy IO actions as actions that throw exceptions. Therefore, (<|>) for IO runs its first argument, and if it produces a value without throwing an exception, it returns the value; otherwise, it returns its second argument.
We now have a definition of (<|>) for IO, but what should empty be? Well, empty must be falsy, and we have defined falsiness on IO as “throwing an exception”. Therefore, empty must be an action that throws an exception.
The guard function is very simple, since it is just pure () when given True and empty when given False. This means your examples are really equivalent to the following:
empty <|> pure ()
empty <|> empty
In the first example, empty throws, so (<|>) catches it and returns pure (), which obviously produces (). In the second example, the same thing happens, except that the second argument is also empty, so the expression’s result also throws an exception.
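The definitions described above can be sketched with ordinary exception handling. This is an illustration only, not the code used in base: orElse, emptyIO and guardIO are hypothetical stand-ins for (<|>), empty and guard, and the sketch assumes that "throwing" means throwing an IOException.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception (IOException, catch, throwIO, try)

-- Hypothetical stand-in for (<|>): run the first action and fall back to
-- the second if the first throws an IOException.
orElse :: IO a -> IO a -> IO a
orElse m n = m `catch` \(_ :: IOException) -> n

-- Hypothetical stand-in for empty: the "falsy" action that always throws.
emptyIO :: IO a
emptyIO = throwIO (userError "mzero")

-- Hypothetical stand-in for guard.
guardIO :: Bool -> IO ()
guardIO True = pure ()
guardIO False = emptyIO

main :: IO ()
main = do
  -- Falsy then truthy: the second action supplies the result.
  (guardIO False `orElse` guardIO True) >>= print
  -- Falsy then falsy: the composite action throws as well.
  r <- try (guardIO False `orElse` guardIO False) :: IO (Either IOException ())
  putStrLn (either (const "both alternatives failed") (const "succeeded") r)
```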

Related

Attempting to return a default value if converting strings to ints fails [duplicate]

In my Haskell program, I want to read in a value given by the user using the getLine function. I then want to use the read function to convert this value from a string to the appropriate Haskell type. How can I catch parse errors thrown by the read function and ask the user to reenter the value?
Am I right in thinking that this is not an "IO Error" because it is not an error caused by the IO system not functioning correctly? It is a semantic error, so I can't use IO error handling mechanisms?

You don't want to. You want to use reads instead, possibly like this:
maybeRead = fmap fst . listToMaybe . reads
(though you might want to error out if the second element of the tuple is not "", that is, if there's a remaining string, too)
The reason why you want to use reads instead of catching error exceptions is that exceptions in pure code are evil, because it's very easy to attempt to catch them in the wrong place: Note that they only fly when they are forced, not before. Locating where that is can be a non-trivial exercise. That's (one of the reasons) why Haskell programmers like to keep their code total, that is, terminating and exception-free.
You might want to have a look at a proper parsing framework (e.g. parsec) and haskeline, too.

There are readMaybe and readEither, which do what you expect. You can find these functions in the Text.Read module.
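A minimal sketch of how readMaybe might be used to fall back to a default value. The helper name parseOrDefault is invented for this example; readMaybe and fromMaybe are from base.

```haskell
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

-- Hypothetical helper: parse an Int, falling back to a default on failure.
parseOrDefault :: Int -> String -> Int
parseOrDefault def = fromMaybe def . readMaybe

main :: IO ()
main = do
  print (readMaybe "42" :: Maybe Int)  -- prints Just 42
  print (readMaybe "4x" :: Maybe Int)  -- prints Nothing
  print (parseOrDefault 0 "oops")      -- prints 0
```

Because the failure is an ordinary Nothing rather than an exception, a retry loop can simply pattern match on the result and ask again.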

This is an addendum to @barsoap's answer more than anything else.
Haskell exceptions may be thrown anywhere, including in pure code, but they may only be caught from within the IO monad. In order to catch exceptions thrown by pure code, you need to use a catch or try on the IO statement that would force the pure code to be evaluated.
import Control.Exception (SomeException, try)
str2Int :: String -> Int -- shortcut so I don't need to add type annotations everywhere
str2Int = read
main = do
  print (str2Int "3") -- ok
  -- print (str2Int "a") -- raises exception
  eVal <- try (print (str2Int "a")) :: IO (Either SomeException ())
  case eVal of
    Left e -> putStrLn "couldn't parse input, try again"
    Right () -> putStrLn "could parse the number, go ahead"
You should use something more specific than SomeException because that will catch anything. In the above code, the try will return a Left exception if read can't parse the string, but it will also return a Left exception if there's an IO error when trying to print the value, or any number of other things that could possibly go wrong (out of memory, etc.).
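One way to narrow the handler, sketched under the assumption that a failed read surfaces as an ErrorCall (which is how read reports "no parse" in the base versions I am aware of). The helper name readIntIO is invented; evaluate forces the parsed value inside try, so the exception cannot leak out later.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- Hypothetical helper: parse with read, but force the result inside try
-- so that a failed parse is caught here and nowhere else.
readIntIO :: String -> IO (Either ErrorCall Int)
readIntIO s = try (evaluate (read s))

main :: IO ()
main = do
  readIntIO "3" >>= print . either (const Nothing) Just  -- prints Just 3
  readIntIO "a" >>= print . either (const Nothing) Just  -- prints Nothing
```

An IO error raised while printing, or an out-of-memory condition, would no longer be mistaken for a parse failure, since neither is an ErrorCall.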
Now, here's why exceptions from pure code are evil. What if the IO code doesn't actually force the result to be evaluated?
main2 = do
  inputStr <- getLine
  let values = [0,1,read inputStr] :: [Int]
  eVal <- try (print (head values)) :: IO (Either SomeException ())
  case eVal of
    Right () -> pure () -- No exception thrown, so the user entered a number ?!
    Left e -> pure () -- got an exception, probably couldn't read user input
If you run this, you'll find that you always end up in the Right branch of the case statement, no matter what the user entered. This is because the IO action passed to try doesn't ever try to read the entered string. It prints the first value of the list values, which is constant, and never touches the rest of the list. So in the first branch of the case statement, the coder thinks the data is evaluated but it isn't, and read may still throw an exception.
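The same pitfall can be reproduced without user input. In this sketch the list and the helper names (values, checkHead, checkAll) are invented for illustration; the third element stands in for a failed parse.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- Hypothetical input: the third element stands in for a failed parse.
values :: [Int]
values = [0, 1, read "not a number"]

-- Forces only the first element, so the bad third element is never touched.
checkHead :: IO (Either ErrorCall Int)
checkHead = try (evaluate (head values))

-- Forces every element (sum must inspect them all), so the failed read
-- surfaces inside this try.
checkAll :: IO (Either ErrorCall Int)
checkAll = try (evaluate (sum values))

main :: IO ()
main = do
  checkHead >>= putStrLn . either (const "head: exception") (const "head: fine")
  checkAll >>= putStrLn . either (const "sum: exception") (const "sum: fine")
```

This should print "head: fine" and then "sum: exception": whether the exception is seen depends entirely on how much of the value the guarded action forces.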
read is meant for unserializing data, not parsing user-entered input. Use reads, or switch to a real parser combinator library. I like uu-parsinglib, but parsec, polyparse, and many others are good too. You'll very likely need the extra power before long anyway.

Here's an improved maybeRead which allows only trailing whitespace, but nothing else:
import Data.Maybe
import Data.Char
maybeRead2 :: Read a => String -> Maybe a
maybeRead2 = fmap fst . listToMaybe . filter (null . dropWhile isSpace . snd) . reads

Why is there difference between throw and throwIO?

I am trying to get a firm grasp of exceptions, so that I can improve my conditional loop implementation. To this end, I am staging various experiments, throwing stuff and seeing what gets caught.
This one surprises me to no end:
% cat X.hs
module Main where
import Control.Exception
import Control.Applicative
main = do
  throw (userError "I am an IO error.") <|> print "Odd error ignored."
% ghc X.hs && ./X
...
X: user error (I am an IO error.)
% cat Y.hs
module Main where
import Control.Exception
import Control.Applicative
main = do
  throwIO (userError "I am an IO error.") <|> print "Odd error ignored."
% ghc Y.hs && ./Y
...
"Odd error ignored."
I thought that the Alternative should ignore exactly IO errors. (Not sure where I got this idea from, but I certainly could not offer a non-IO exception that would be ignored in an Alternative chain.) So I figured I can hand craft and deliver an IO error. Turns out, whether it gets ignored depends on the packaging as much as the contents: if I throw an IO error, it is somehow not anymore an IO error.
I am completely lost. Why does it work this way? Is it intended? The definitions lead deep into the GHC internal modules; while I can more or less understand the meaning of disparate fragments of code by themselves, I am having a hard time seeing the whole picture.
Should one even use this Alternative instance if it is so difficult to predict? Would it not be better if it silenced any synchronous exception, not just some small subset of exceptions that are defined in a specific way and thrown in a specific way?

throw is a generalization of undefined and error; it's meant to throw an exception in pure code. When the value of the exception does not matter (which is most of the time), it is denoted by the symbol ⟘ for an "undefined value".
throwIO is an IO action which throws an exception, but is not itself an undefined value.
The documentation of throwIO thus illustrates the difference:
throw e `seq` x ===> throw e
throwIO e `seq` x ===> x
The catch is that (<|>) is defined as mplusIO which uses catchException which is a strict variant of catch. That strictness is summarized as follows:
⟘ <|> x = ⟘
hence you get an exception (and x is never run) in the throw variant.
Note that, without strictness, an "undefined action" (i.e., throw ... :: IO a) actually behaves like an action that throws from the point of view of catch:
catch (throw (userError "oops")) (\(e :: SomeException) -> putStrLn "caught") -- caught
catch (throwIO (userError "oops")) (\(e :: SomeException) -> putStrLn "caught") -- caught
catch (pure (error "oops")) (\(e :: SomeException) -> putStrLn "caught") -- not caught
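The seq behaviour quoted from the throwIO documentation can be observed directly. In this sketch, pureThrow, ioThrow and forcedOk are names made up for the example; forcedOk forces an IO action to weak head normal form without running it and reports whether that alone raised an exception.

```haskell
import Control.Exception (IOException, evaluate, throw, throwIO, try)

-- An "undefined action": merely evaluating it raises the exception.
pureThrow :: IO ()
pureThrow = throw (userError "oops")

-- A well-defined action that raises the exception only when run.
ioThrow :: IO ()
ioThrow = throwIO (userError "oops")

-- Hypothetical helper: force the action value (without executing it)
-- and report whether forcing succeeded.
forcedOk :: IO () -> IO Bool
forcedOk act = do
  r <- try (evaluate (act `seq` ())) :: IO (Either IOException ())
  pure (either (const False) (const True) r)

main :: IO ()
main = do
  forcedOk pureThrow >>= print  -- prints False
  forcedOk ioThrow >>= print    -- prints True
```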

Say you have
x :: Integer
That means that x should be an integer, of course.
x = throw _whatever
What does that mean? It means that there was supposed to be an Integer, but instead there’s just a mistake.
Now consider
x :: IO ()
That means x should be an I/O-performing program that returns no useful value. Remember, IO values are just values. They are values that just happen to represent imperative programs. So now consider
x = throw _whatever
That means that there was supposed to be an I/O-performing program there, but there is instead just a mistake. x is not a program that throws an error—there is no program. Regardless of whether you’ve used an IOError, x isn’t a valid IO program. When you try to execute the program
x <|> _whatever
you have to execute x to see whether it throws an error. But you can't execute x, because it's not a program; it's a mistake. Instead, everything explodes.
This differs significantly from
x = throwIO _whatever
Now x is a valid program. It is a valid program that always happens to throw an error, but it’s still a valid program that can actually be executed. When you try to execute
x <|> _whatever
now, x is executed, the error produced is discarded, and _whatever is executed in its place. You can also think of there being a difference between computing a program/figuring out what to execute and actually executing it. throw throws the error while computing the program to execute (it is a "pure exception"), while throwIO throws it during execution (it is an "impure exception"). This also explains their types: throw returns any type because all types can be "computed", but throwIO is restricted to IO because only programs can be executed.
This is further complicated by the fact that you can catch the pure exceptions that occur while executing IO programs. I believe this is a design compromise. From a theoretical perspective, you shouldn't be able to catch pure exceptions, because their presence should always be taken to indicate programmer error, but that can be rather embarrassing, because then you can only handle external errors, while programmer errors cause everything to blow up. If we were perfect programmers, that would be fine, but we aren't. Therefore, you are allowed to catch pure exceptions.
is :: [Int]
is = []
-- fails, because the print causes a pure exception
-- it was a programmer error to call head on is without checking that it,
-- in fact, had a head in the first place
-- (the program on the left is not valid, so main is invalid)
main1 = print (head is) <|> putStrLn "Oops"
-- throws exception
-- catch creates a program that computes and executes the program print (head is)
-- and catches both impure and pure exceptions
-- the program on the left is invalid, but wrapping it with catch
-- makes it valid again
-- really, that shouldn't happen, but this behavior is useful
main2 = print (head is) `catch` (\(_ :: SomeException) -> putStrLn "Oops")
-- prints "Oops"
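A separate factor worth checking is the exception type. As far as I can tell from the mplusIO definition mentioned earlier, the handler inside IO's <|> is for IOError only, so an exception of any other type passes straight through even when it is thrown with throwIO. The sketch below probes this; Outcome and attempt are invented names, and the outer try is only there so the program can report the result instead of crashing.

```haskell
import Control.Applicative ((<|>))
import Control.Exception (ErrorCall (..), SomeException, throw, throwIO, try)

data Outcome = Recovered | Escaped deriving (Eq, Show)

-- Hypothetical probe: did (<|>) fall back to the second action, or did
-- an exception escape from the composite action?
attempt :: IO () -> IO Outcome
attempt act = do
  r <- try (act <|> pure ()) :: IO (Either SomeException ())
  pure (either (const Escaped) (const Recovered) r)

main :: IO ()
main = do
  attempt (throwIO (userError "io")) >>= print       -- Recovered
  attempt (throwIO (ErrorCall "not io")) >>= print   -- Escaped
  -- Expected to report Escaped, matching X.hs in the question, assuming
  -- (<|>) is strict in its first argument as described above.
  attempt (throw (userError "io")) >>= print
```

This would also explain main1: the exception from head is not an IOError, so <|> does not handle it, whereas the catch in main2 asks for SomeException.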

The rest of this answer may not be entirely correct. But fundamentally, the difference is this: throwIO terminates and returns an IO action, while throw does not terminate.
As soon as you try to evaluate throw (userError "..."), your program aborts. <|> never gets a chance to look at its first argument to decide if the second argument should be evaluated; in fact, it never gets the first argument, because throw didn't return a value.
With throwIO, <|> isn't evaluating anything; it's creating a new IO action which, when it does get executed, will first look at its first argument. The runtime can "safely" execute the IO action and see that it does not, in fact, provide a value, at which point it can stop and try the other "half" of the <|> expression.

Why does extracting an IO (Maybe Bool) using fromMaybe do both IO actions

In this code:
fromMaybe <$> (print "A" >> return True) <*> (print "B" >> (return $ Just False))
fromMaybe <$> (print "A" >> return True) <*> (print "B" >> (return $ Nothing))
I expected that due to laziness, either "A" or "B" would be printed depending on whether I supply Just something or Nothing but instead both are printed no matter what. Can someone explain a) what is going on exactly here? and b) how can I achieve the effect I want?

Focusing on (print "B" >> (return $ Just False)) you're sequencing a print command with a return in the IO Monad. Since it's a Monad, it needs to evaluate print "B" exactly enough to get the "value" (even though it's just ignored) before being able to evaluate the return statement. For print in the IO monad that means performing the side effect.
This sequencing occurs in both of the IO arguments and so before they're able to be passed to the pure computation, fromMaybe, all of the effects have already been executed. Applicatives always execute all of the effects first and then compute the pure computation on the pure values.
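To get the effect you want, branch on the value before choosing which effect to run, for example (with thing :: Maybe Bool):
fromMaybe True <$> case thing of
  Just _ -> print "A" >> return thing
  Nothing -> print "B" >> return thing
or maybe fromMaybe True <$> (when (isJust thing) (print "A") >> print "B" >> return thing) if that's better behavior.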
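If the Maybe value itself comes out of an IO action, a monadic bind lets the result decide whether the default action runs at all. This is a sketch, and fromMaybeM is a name invented here, not a function from base. Note that the action producing the Maybe always has to run, since there is no other way to learn whether it yields Nothing; only the default action is conditional.

```haskell
-- Hypothetical helper: run the Maybe-producing action first, and run the
-- default action only when the result is Nothing.
fromMaybeM :: Monad m => m a -> m (Maybe a) -> m a
fromMaybeM def act = act >>= maybe def pure

main :: IO ()
main = do
  -- Prints "B" only: the default action is skipped.
  a <- fromMaybeM (print "A" >> return True) (print "B" >> return (Just False))
  print a
  -- Prints "B" and then "A": the default action is needed.
  b <- fromMaybeM (print "A" >> return True) (print "B" >> return Nothing)
  print b
```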

The following happens:
You map fromMaybe over an IO value. Hence the left part
fromMaybe <$> (print "A" >> return True)
is an IO action that could be rewritten thus
print "A" >> return (fromMaybe True) :: IO (Maybe Bool -> Bool)
This means that the "A" will be printed no matter what.
Note that the IO monad is all about sequencing actions, hence an action later in the >>= chain can never affect whether earlier actions are executed.
Consider
fromMaybe <$> (getChar >>= return)
It is clear that the Char the fromMaybe is applied to must come from actually reading a character. It is not the case that the character will only be read when it is needed.
If this were so, the following code would not make sense:
do
  a <- getChar
  b <- getChar
  -- at this point, a and b have been actually read from stdin already
  return (a < b)
For it is not known whether (<) evaluates its left or right argument first.
Rather, in any case, a gets the value of the first character read and b that of the second. And the meaning of the code snippet is, accordingly, to read two characters and to check whether the first character read is lower than the second.
Indeed, if an IO action were executed only when its value was actually needed, many programs as they stand wouldn't print anything. This is because code like
print "A" >> print "B"
deliberately ignores the result of the first print.
For the same reason, the "B" will always be printed.

Why is that not lazy

I'm still starting to explore Haskell. I know this code "runs" in the IO monad. When it goes from the l <- ... line to the next one, the IO bind is called.
One could think that because Haskell is lazy, the l is never evaluated. But "bind" always evaluates the previous command, is that right? Because the program produces the "file-not-found" error.
main = do
  l <- mapM readFile [ "/tmp/notfound" ]
  return ()

"One could think that because Haskell is lazy, the l is never evaluated."
Yes, and it never is evaluated. However, due to the definition of (>>=) in IO, the action readFile "/tmp/notfound" is executed, and that means the runtime tries to open the file. If there is no such file, a "File not found" error is raised. If there were such a file, it would be opened, but its contents would not be read until demanded. In the above, they are not demanded, so the contents will not be read.
What is evaluated here (and even executed) is the action producing l. Since the file doesn't exist, that raises an error.

If you expand the do notation in your code, you get:
main = (mapM readFile ["/tmp/notfound"]) >>= (\l -> return ())
So yes, l is never evaluated, but that doesn't mean that the call to mapM is never evaluated. >>= always needs to evaluate its left operand in order to produce a value at least to some degree (at least in the IO monad and in any other monad that comes to mind).
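The distinction between executing the actions and evaluating their results can be made visible with a small sketch. The name runAll is invented; each action prints a message and returns undefined as its result, which would crash the program if anything ever inspected it.

```haskell
-- Each action is executed (its message is printed), but its result is
-- undefined and is never evaluated.
runAll :: IO Int
runAll = do
  l <- mapM (\n -> putStrLn ("running action " ++ show n) >> pure (undefined :: String)) [1, 2 :: Int]
  -- length walks the list spine only; it never looks at the elements.
  pure (length l)

main :: IO ()
main = do
  n <- runAll
  putStrLn ("ran " ++ show n ++ " actions without inspecting any result")
```

Both messages are printed because bind runs the actions in order, yet the program finishes normally because the undefined results stay unevaluated.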
