Signature of IO in Haskell (is this class or data?) - haskell

The question is not what IO does, but how is it defined, its signature. Specifically, is this data or class, is "a" its type parameter then? I didn't find it anywhere. Also, I don't understand the syntactic meaning of this:
f :: IO a

You asked whether IO a is a data type: it is. And you asked whether the a is its type parameter: it is. You said you couldn't find its definition. Let me show you how to find it:
localhost:~ gareth.rowlands$ ghci
GHCi, version 7.6.3: http://www.haskell.org/ghc/ :? for help
Prelude> :i IO
newtype IO a
= GHC.Types.IO (GHC.Prim.State# GHC.Prim.RealWorld
-> (# GHC.Prim.State# GHC.Prim.RealWorld, a #))
-- Defined in `GHC.Types'
instance Monad IO -- Defined in `GHC.Base'
instance Functor IO -- Defined in `GHC.Base'
Prelude>
In ghci, :i or :info tells you about a type. It shows the type declaration and where it's defined. You can see that IO is a Monad and a Functor too.
This technique is more useful on normal Haskell types - as others have noted, IO is magic in Haskell. In a typical Haskell type, the type signature is very revealing but the important thing to know about IO is not its type declaration, rather that IO actions actually perform IO. They do this in a pretty conventional way, typically by calling the underlying C or OS routine. For example, Haskell's putChar action might call C's putchar function.

IO is a polymorphic type (which happens to be an instance of Monad, irrelevant here).
Consider the humble list. If we were to write our own list of Ints, we might do this:
data IntList = Nil | Cons { listHead :: Int, listRest :: IntList }
If you then abstract over what element type it is, you get this:
data List a = Nil | Cons { listHead :: a, listRest :: List a }
As you can see, the return value of listRest is List a. List is a polymorphic type of kind * -> *, which is to say that it takes one type argument to create a concrete type.
In a similar way, IO is a polymorphic type with kind * -> *, which again means it takes one type argument. If you were to define it yourself, it might look like this:
data IO a = IO (RealWorld -> (a, RealWorld))
(definition courtesy of this answer)

The amount of magic in IO is grossly overestimated: it has some support from the compiler and runtime system, but much less than newcomers usually expect.
Here is the source file where it is defined:
http://www.haskell.org/ghc/docs/latest/html/libraries/ghc-prim-0.3.0.0/src/GHC-Types.html
newtype IO a
= IO (State# RealWorld -> (# State# RealWorld, a #))
It is just an optimized version of the state monad. If we remove the optimization annotations we get:
data IO a = IO (RealWorld -> (RealWorld, a))
So basically IO a is a data structure storing a function that takes the old real world and returns the new real world, with the IO operation performed, together with a value of type a.
Some compiler tricks are necessary, mostly to remove the RealWorld dummy value efficiently.
IO type is an abstract newtype - constructors are not exported, so you cannot bypass library functions, work with it directly and perform nasty things: duplicate RealWorld, create RealWorld out of nothing or escape the monad (write a function of IO a -> a type).
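To make the "state monad over the world" idea concrete, here is a toy sketch. It is not GHC's implementation: World, Action, runAction, say and demoLog are made-up names, and the "world" is just a list of output lines rather than State# RealWorld.

```haskell
-- Toy model only: GHC's real IO uses State# RealWorld and unboxed tuples.
-- World, Action, runAction, say and demoLog are hypothetical names for this sketch.
newtype World = World [String]   -- pretend the "world" is just an output log

newtype Action a = Action (World -> (World, a))

runAction :: Action a -> World -> (World, a)
runAction (Action f) = f

instance Functor Action where
  fmap f (Action g) = Action $ \w -> let (w', a) = g w in (w', f a)

instance Applicative Action where
  pure a = Action $ \w -> (w, a)
  Action ff <*> Action fa = Action $ \w ->
    let (w1, f) = ff w
        (w2, a) = fa w1
    in (w2, f a)

instance Monad Action where
  -- Thread the world through the first action, then through the continuation.
  Action g >>= k = Action $ \w ->
    let (w', a) = g w
    in runAction (k a) w'

-- A hypothetical primitive: "printing" appends a line to the log.
say :: String -> Action ()
say s = Action $ \(World out) -> (World (out ++ [s]), ())

-- Run a small program against an empty world and return the resulting log.
demoLog :: [String]
demoLog =
  let (World out, ()) = runAction (say "hello" >> say "world") (World [])
  in out
```

Because each step receives the world produced by the previous one, the two say calls are forced into sequence, which is the ordering guarantee the real IO type provides.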

Since IO can be applied to objects of any type a, as it is a polymorphic monad, a is not specified.
If you have some object with type a, then it can be 'wrapped' as an object of type IO a, which you can think of as being an action that gives an object of type a. For example, getChar is of type IO Char, and so when it is run, it has the side effect (from the program's perspective) of generating a character, which comes from stdin.
As another example, putChar has type Char -> IO (), meaning that it takes a char, and then performs some action that gives no output (in the context of the program, though it will print the char given to stdout).
Edit: More explanation of monads:
A monad can be thought of as a 'wrapper type' M, and has two associated functions:
return and >>=.
Given a type a, it is possible to create objects of type M a (IO a in the case of the IO monad), using the return function.
return, therefore, has type a -> M a. Moreover, return attempts not to change the element that it is passed -- if you call return x, you will get a wrapped version of x that contains all of the information of x (Theoretically, at least. This doesn't happen with, for example, the empty monad.)
For example, return 'x' will yield an M Char. getChar likewise has type IO Char, and the Char it produces is pulled out of its wrapper with <-.
>>=, read as 'bind', is more complicated. It has type M a -> (a -> M b) -> M b, and its role is to take a 'wrapped' object, and a function from the underlying type of that object to another 'wrapped' object, and apply that function to the underlying value in the first input.
For example, (return 5) >>= (return . (+ 3)) will yield an M Int, which will be the same M Int that would be given by return 8. In this way, any function that can be applied outside of a monad can also be applied inside of it.
To do this, one could take an arbitrary function f :: a -> b, and give the new function g :: M a -> M b as follows:
g x = x >>= (return . f)
Now, for something to be a monad, these operations must also have certain relations -- their definitions as above aren't quite enough.
First: (return x) >>= f must be equivalent to f x. That is, it must be equivalent to perform an operation on x whether it is 'wrapped' in the monad or not.
Second: x >>= return must be equivalent to x. That is, if an object is unwrapped by bind, and then rewrapped by return, it must return to its same state, unchanged.
Third, and finally (x >>= f) >>= g must be equivalent to x >>= (\y -> (f y >>= g) ). That is, function binding is associative (sort of). More accurately, if two functions are bound successively, this must be equivalent to binding the combination thereof.
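The three laws above can be spot-checked for a concrete monad. The sketch below does this for Maybe; halve and decrement are hypothetical helper functions invented for the check, and testing a handful of values is of course not a proof.

```haskell
-- Hypothetical helpers used only to exercise the laws in the Maybe monad.
halve :: Int -> Maybe Int
halve n = if even n then Just (n `div` 2) else Nothing

decrement :: Int -> Maybe Int
decrement n = if n > 0 then Just (n - 1) else Nothing

-- First law: return x >>= f  is the same as  f x
leftIdentity :: Int -> Bool
leftIdentity x = (return x >>= halve) == halve x

-- Second law: m >>= return  is the same as  m
rightIdentity :: Maybe Int -> Bool
rightIdentity m = (m >>= return) == m

-- Third law: (m >>= f) >>= g  is the same as  m >>= (\y -> f y >>= g)
associativity :: Maybe Int -> Bool
associativity m =
  ((m >>= halve) >>= decrement) == (m >>= (\y -> halve y >>= decrement))

-- Check all three laws on a small sample of inputs.
lawsHold :: Bool
lawsHold = and
  [ all leftIdentity [0 .. 10]
  , all rightIdentity (Nothing : map Just [0 .. 10])
  , all associativity (Nothing : map Just [0 .. 10])
  ]
```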
Now, while this is how monads work, it's not how they're most commonly used, because of the syntactic sugar of do and <-.
Essentially, do begins a long chain of binds, and each <- sort of creates a lambda function that gets bound.
For example,
a = do x <- something
       y <- function x
       return y
is equivalent to
a = something >>= (\x -> (function x) >>= (\y -> return y))
In both cases, something is bound to x, function x is bound to y, and then y is returned to a in the wrapper of the relevant monad.
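If you want to convince yourself the two forms agree, here is a small sketch in the Maybe monad. The definitions of something and function are placeholders I made up so the example compiles; any monadic value and function would do.

```haskell
-- Hypothetical stand-ins for `something` and `function`, in the Maybe monad.
something :: Maybe Int
something = Just 20

function :: Int -> Maybe Int
function x = if x > 0 then Just (x + 1) else Nothing

-- The do-notation version.
withDo :: Maybe Int
withDo = do
  x <- something
  y <- function x
  return y

-- The same computation written with explicit binds and lambdas.
withBind :: Maybe Int
withBind = something >>= (\x -> function x >>= (\y -> return y))
```

Both withDo and withBind evaluate to Just 21.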
Sorry for the wall of text, and I hope it explains something. If there's more you need cleared up about this, or something in this explanation is confusing, just ask.

This is a very good question, if you ask me. I remember being very confused about this too, maybe this will help...
'IO' is a type constructor, 'IO a' is a type, the 'a' (in 'IO a') is a type variable. The letter 'a' carries no significance, the letter 'b' or 't1' could have been used just as well.
If you look at the definition of the IO type constructor you will see that it is a newtype defined as: GHC.Types.IO (GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #))
'f :: IO a' is the type of a function called 'f' of apparently no arguments that returns a result of some unconstrained type in the IO monad. 'in the IO monad' means that f can do some IO (i.e. change the 'RealWorld', where 'change' means replace the provided RealWorld with a new one) while computing its result. The result of f is polymorphic (that's a type variable 'a' not a type constant like 'Int'). A polymorphic result means that in your program it's the caller that determines the type of the result, so used in one place f could return an Int, used in another place it could return a String. 'Unconstrained' means that there's no type class restricting what type can be returned and so any type can be returned.
Why is 'f' a function and not a constant since there are no parameters and Haskell is pure? Because the definition of IO means that 'f :: IO a' could have been written 'f :: GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #)' and so in fact has a parameter -- the 'state of the real world'.

In data IO a, the a has mainly the same meaning as in Maybe a.
But we can't get rid of the constructor with a function like:
fromIO :: IO a -> a
fromIO (IO a) = a
Fortunately we can use this data inside a monad, like:
{-# LANGUAGE ScopedTypeVariables #-}
foo = do
  (fromIO :: a) <- (dataIO :: IO a)
  ...

Related

Struggling with basic experiment printing to stdio

Brand new to Haskell and failing to print the date in a simple program. What I'm trying to do:
Get the current time from getCurrentTime
Call a pure function on the date, returning a string
Print the string to stdout.
I've learned that getCurrentTime returns an IO monad. I must raise my pure function into the monad using extra sauce like fmap. Still no luck.
What am I doing wrong?
--- EDIT ---
Forgot to mention that this compiles and runs but produces no output.
module Main where
import System.IO
import Data.Time.Clock
import Data.Time.Calendar
date :: IO (Integer,Int,Int)
date = fmap (toGregorian . utctDay) getCurrentTime
getDateStr :: (Integer,Int,Int) -> String
getDateStr (year,month,day) = "Date is " ++ show year ++ "/" ++ show month ++ "/" ++ show day ++ "\n"
main = do
  let printabledate = fmap getDateStr date
  fmap print printabledate
It works like this:
fmap :: (a -> b) -> f a -> f b; note that the function passed to fmap is a normal function.
fmap getDateStr date :: IO String.
print :: a -> IO (), a will be String.
So fmap print (fmap getDateStr date) will have type IO (IO ()). The point is that print is not a "normal" function but a monadic one. If you fmap a monadic function over a monadic value, you get a monadic value wrapped inside another monadic value.
Then, when main runs, only the outer action is executed; its result, the inner action of type IO (), is discarded without ever being run, which is why nothing is printed. To get the desired result, just bind print to printabledate as @Ryan suggested in the comment:
(>>=) :: m a -> (a -> m b) -> m b
printabledate >>= print :: IO ()
That's all.
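Here is a small sketch that makes the difference observable. To keep it checkable without looking at the terminal, print is replaced by a hypothetical record function that appends to an IORef, and printableDate is a fixed stand-in for fmap getDateStr date; those names and the date string are my own inventions.

```haskell
import Control.Monad (join)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical stand-in for `print`: records its argument instead of writing to stdout.
record :: IORef [String] -> String -> IO ()
record ref s = modifyIORef ref (++ [s])

-- Fixed stand-in for `fmap getDateStr date`.
printableDate :: IO String
printableDate = return "Date is 2020/1/2"

-- fmap builds the inner action but does not run it.
viaFmap :: IORef [String] -> IO (IO ())
viaFmap ref = fmap (record ref) printableDate

-- (>>=) runs the inner action as part of the outer one.
viaBind :: IORef [String] -> IO ()
viaBind ref = printableDate >>= record ref

-- Returns the log after the fmap version, the bind version, and join applied to the fmap version.
experiment :: IO ([String], [String], [String])
experiment = do
  ref <- newIORef []
  _ <- viaFmap ref            -- the inner IO () is discarded here
  afterFmap <- readIORef ref
  viaBind ref
  afterBind <- readIORef ref
  join (viaFmap ref)          -- join flattens IO (IO ()) into IO ()
  afterJoin <- readIORef ref
  return (afterFmap, afterBind, afterJoin)
```

The log stays empty after the fmap version, gains one entry after the bind version, and gains another once join is applied to the nested action.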
Let's forget the concept of monads for a second to clear stuff a bit:
In OOP, we have classes. Most of the time, these bind some behaviour to some data. In Haskell we do not do this, and rather we create data types that are just that, data.
Also, in OOP, we have the concept of interface, which allows us to define an API for some common functions that some classes can share. In some way, we could group those classes by properties they share, like for example creating an interface Mappable which has a map method that applies a function to the contents of the class that implements that interface.
Now, for example we could create a class List<T> that implements Mappable where map applies the function to each of the elements of the list.
In Haskell, we have type classes which are like interfaces, but much better, because they allow you to implement the API for any existing type. For example our Mappable from before, is called Functor in Haskell. Don't get scared, as it is just a name like AbstractEnterpriseJavaBeanFactory. It is defined like this:
class Functor f where
  fmap :: (a -> b) -> f a -> f b
  ...
And now we can extend anything that has some contents, like our List<T> from before, with it, which in Haskell would be:
data List a = ... -- List implementation
instance Functor List where
  fmap f lst = ... -- Implementation of fmap
This is great, because the Functor concept gives us the assurance that the function f will be applied to the contents of the data types that implements this, always returning another copy of the data type.
And you now might be wondering, what has this to do with my question?
Haskell comes with a lot of predefined type classes: Functor, Foldable, Applicative, ...
Type classes only assure you that some constraints will be met when applying the functions defined in their API:
Functor's fmap takes a function a -> b and an f a as argument
Applicative takes a "container" with a function inside f (a -> b) and an f a
etc...
Within all these many typeclasses there is one that has the following API
class X f where
  bind :: (a -> f b) -> f a -> f b
It's like our functor from before, but instead of returning a plain element, the function passed as a parameter returns another "container".
This typeclass is called Monad, and in Haskell it is defined like:
class Monad m where
  (>>=) :: m a -> (a -> m b) -> m b
  return :: a -> m a -- This puts the value a inside of the "container" m
  ...
(Note that the arguments of >>= are flipped compared to bind above.)
So basically IO is not a monad, just a container type that happens to implement the API defined in the type class Monad, but also Functor and many others (you can check them in the instances section of the IO API documentation).
Now let's reason a bit:
First of all, we use date to get an IO (Integer, Int, Int), which basically is a container that contains the triple.
Then, we apply the getDateStr function to its contents using fmap, so we get back an IO that has a String inside of it.
Now we bind this value to printabledate. So printabledate is now an IO String.
Now we apply print to the contents of printabledate using fmap, but wait, print returns an "empty" IO container, so what we get back is an IO containing an IO (), that is IO (IO ()), whereas main is expected to be an IO ().
What do we do now? We have two options:
1) Unwrap the value inside our printabledate using the <- operator, allowing us to get the String itself:
main = do
  let printabledate = fmap getDateStr date
  unwrappedPrintableDate <- printabledate
  print unwrappedPrintableDate
Or, directly:
main = do
  printabledate <- fmap getDateStr date
  print printabledate
2) Use the >>= operator defined in the Monad type class:
main = do
  let printabledate = fmap getDateStr date
  printabledate >>= print
Remember? The >>= operator expects that the function passed to it returns another "container", which is what print does.
Use whichever one feels more natural to you, as both are accepted and they mean the same thing.

When are type signatures necessary in Haskell?

Many introductory texts will tell you that in Haskell type signatures are "almost always" optional. Can anybody quantify the "almost" part?
As far as I can tell, the only time you need an explicit signature is to disambiguate type classes. (The canonical example being read . show.) Are there other cases I haven't thought of, or is this it?
(I'm aware that if you go beyond Haskell 2010 there are plenty of exceptions. For example, GHC will never infer rank-N types. But rank-N types are a language extension, not part of the official standard [yet].)
Polymorphic recursion needs type annotations, in general.
f :: (a -> a) -> (a -> b) -> Int -> a -> b
f f1 g n x =
  if n == (0 :: Int)
    then g x
    else f f1 (\z h -> g (h z)) (n-1) x f1
(Credit: Patrick Cousot)
Note how the recursive call looks badly typed (!): it calls itself with five arguments, despite f having only four! Then remember that b can be instantiated with c -> d, which causes an extra argument to appear.
The above contrived example computes
f f1 g n x = g (f1 (f1 (f1 ... (f1 x))))
where f1 is applied n times. Of course, there is a much simpler way to write an equivalent program.
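A simpler and more common place where polymorphic recursion shows up is a nested (non-regular) data type. The sketch below is my own illustration, not part of the example above: Nested, depth and example are made-up names. If you remove the signature on depth, inference fails, because the recursive call is made at type Nested [a] rather than Nested a.

```haskell
-- A nested data type: each Nest wraps the element type in one more list.
data Nested a = Flat a | Nest (Nested [a])

-- This signature is required: the recursive call uses depth at a different type
-- (Nested [a]) from the one being defined (Nested a).
depth :: Nested a -> Int
depth (Flat _) = 0
depth (Nest n) = 1 + depth n

-- Two Nest layers around a list of lists of Int.
example :: Nested Int
example = Nest (Nest (Flat [[1, 2], [3]]))
```

Here depth example evaluates to 2.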
Monomorphism restriction
If you have MonomorphismRestriction enabled, then sometimes you will need to add a type signature to get the most general type:
{-# LANGUAGE MonomorphismRestriction #-}
-- myPrint :: Show a => a -> IO ()
myPrint = print
main = do
  myPrint ()
  myPrint "hello"
This will fail because myPrint is monomorphic. You would need to uncomment the type signature to make it work, or disable MonomorphismRestriction.
Phantom constraints
When you put a polymorphic value with a constraint into a tuple, the tuple itself becomes polymorphic and has the same constraint:
myValue :: Read a => a
myValue = read "0"
myTuple :: Read a => (a, String)
myTuple = (myValue, "hello")
We know that the constraint affects the first part of the tuple but does not affect the second part. The type system doesn't know that, unfortunately, and will complain if you try to do this:
myString = snd myTuple
Even though intuitively one would expect myString to be just a String, the type checker needs to specialize the type variable a and figure out whether the constraint is actually satisfied. In order to make this expression work, one would need to annotate the type of either snd or myTuple:
myString = snd (myTuple :: ((), String))
In Haskell, as I'm sure you know, types are inferred. In other words, the compiler works out what type you want.
However, in Haskell, there are also polymorphic typeclasses, with functions that act in different ways depending on the return type. Here's an example of the Monad class, though I haven't defined everything:
class Monad m where
  return :: a -> m a
  (>>=) :: m a -> (a -> m b) -> m b
  fail :: String -> m a
We're given a lot of functions with just type signatures. Our job is to make instance declarations for different types that can be treated as Monads, like Maybe t or [t].
Have a look at this code - it won't work in the way we might expect:
return 7
That's a function from the Monad class, but because there's more than one Monad, we have to specify what return type we want; in GHCi, an unannotated return 7 is otherwise run as an IO action. So:
return 7 :: Maybe Int
-- Will return...
Just 7
return 6 :: [Int]
-- Will return...
[6]
This is because [] and Maybe both have instances of the Monad type class.
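As a compilable sketch of the same idea, the type annotation can live on a binding instead of on the expression. The names below (asMaybe, asList, wrap, asEither) are just illustrative.

```haskell
-- The signature on each binding selects which Monad instance return uses.
asMaybe :: Maybe Int
asMaybe = return 7

asList :: [Int]
asList = return 6

-- The choice can also be left to the caller by staying polymorphic in the monad.
wrap :: Monad m => a -> m a
wrap = return

-- Here the caller picks Either String.
asEither :: Either String Int
asEither = wrap 5
```

With these definitions asMaybe is Just 7, asList is [6], and asEither is Right 5.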
Here's another example, this time from the Random typeclass. This code throws an error:
random (mkStdGen 100)
Because random returns something in the Random class, we have to say what type we want back: a tuple of whatever value type we want together with a StdGen:
random (mkStdGen 100) :: (Int, StdGen)
-- Returns...
(-3650871090684229393,693699796 2103410263)
random (mkStdGen 100) :: (Bool, StdGen)
-- Returns...
(True,4041414 40692)
This can all be found in Learn You a Haskell online, though you'll have to do some long reading. This, I'm pretty much 100% certain, is the only time when types are necessary.

STRef and phantom types

Does s in STRef s a get instantiated with a concrete type? One could easily imagine some code where STRef is used in a context where the a takes on Int. But there doesn't seem to be anything for the type inference to give s a concrete type.
Imagine something in pseudo Java like MyList<S, A>. Even if S never appeared in the implementation of MyList instantiating a concrete type like MyList<S, Integer> where a concrete type is not used in place of S would not make sense. So how can STRef s a work?
tl;dr - in practice it seems it always gets initialised to RealWorld in the end
The source notes that s can be instantiated to RealWorld inside invocations of stToIO, but is otherwise uninstantiated:
-- The s parameter is either
-- an uninstantiated type variable (inside invocations of 'runST'), or
-- 'RealWorld' (inside invocations of 'Control.Monad.ST.stToIO').
Looking at the actual code for ST however it seems runST uses a specific value realWorld#:
newtype ST s a = ST (STRep s a)
type STRep s a = State# s -> (# State# s, a #)
runST :: (forall s. ST s a) -> a
runST st = runSTRep (case st of { ST st_rep -> st_rep })
runSTRep :: (forall s. STRep s a) -> a
runSTRep st_rep = case st_rep realWorld# of
  (# _, r #) -> r
realWorld# is defined as a magic primitive inside the GHC source code:
realWorldName = mkWiredInIdName gHC_PRIM (fsLit "realWorld#")
    realWorldPrimIdKey realWorldPrimId
realWorldPrimId :: Id -- :: State# RealWorld
realWorldPrimId = pcMiscPrelId realWorldName realWorldStatePrimTy
    (noCafIdInfo `setUnfoldingInfo` evaldUnfolding
                 `setOneShotInfo` stateHackOneShot)
You can also confirm this in ghci:
Prelude> :set -XMagicHash
Prelude> :m +GHC.Prim
Prelude GHC.Prim> :t realWorld#
realWorld# :: State# RealWorld
From your question I can not see if you understand why the phantom s type is there at all. Even if you did not ask for this explicitly, let me elaborate on that.
The role of the phantom type
The main use of the phantom type is to constrain references (aka pointers) to stay "inside" the ST monad. Roughly, the dynamically allocated data must end its life when runST returns.
To see the issue, let's pretend that the type of runST were
runST :: ST s a -> a
Then, consider this:
data Dummy
let var :: STRef Dummy Int
    var = runST (newSTRef 0)
    change :: () -> ()
    change () = runST (modifySTRef var succ)
    access :: () -> Int
    access () = runST (readSTRef var)
    result :: (Int, ())
    result = (access (), change ())
in result
(Above I added a few useless () arguments to make it similar to imperative code)
Now what should be the result of the code above? It could be either (0,()) or (1,()) depending on the evaluation order. This is a big no-no in the pure Haskell world.
The issue here is that var is a reference which "escaped" from its runST. When you escape the ST monad, you are no longer forced to use the monad operator >>= (or equivalently, the do notation) to sequentialize the order of side effects. If references are still around, then we can still have side effects around when there should be none.
To avoid the issue, we restrict runST to work on ST s a where a does not depend on s. Why this? Because newSTRef returns a STRef s a, a reference to a tagged with the phantom type s, hence the return type depends on s and can not be extracted from the ST monad through runST.
Technically, this restriction is done by using a rank-2 type:
runST :: (forall s. ST s a) -> a
the "forall" here is used to implement the restriction. The type is saying: choose any a you wish, then provide a value of type ST s a for any s I wish, then I will return an a. Mind that s is chosen by runST, not by the caller, so it could be absolutely anything. So, the type system will accept an application runST action only if action :: forall s. ST s a where s is unconstrained, and a does not involve s (recall that the caller has to choose a before runST chooses s).
It is indeed a slightly hackish trick to implement the independence constraint, but it does work.
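For contrast with the hypothetical escaping reference above, here is a sketch of the intended use with the real runST: the STRef is created, mutated and read entirely inside one runST call, and only a plain Int comes out. sumST is a name I picked for the example.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Sum a list using a mutable accumulator that never leaves runST.
sumST :: [Int] -> Int
sumST xs = runST $ do
  ref <- newSTRef 0
  mapM_ (\x -> modifySTRef' ref (+ x)) xs
  readSTRef ref

-- Rejected by the type checker, because the result type STRef s Int mentions s:
-- leaked = runST (newSTRef (0 :: Int))
```

From the outside sumST is an ordinary pure function; for instance sumST [1 .. 10] is 55.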
On the actual question
To connect this to your actual question: in the implementation of runST, s can be chosen to be any concrete type. Note that, even if s were simply chosen to be Int inside runST it would not matter much, because the type system has already constrained a to be independent from s, hence to be reference-free. As @Ganesh pointed out, RealWorld is the type used by GHC.
You also mentioned Java. One could attempt to play a similar trick in Java as follows: (warning, overly simplified code follows)
interface ST<S,A> { A call(); }
interface STAction<A> { <S> ST<S,A> call(S dummy); }
...
<A> A runST(STAction<A> action) {
    RealWorld dummy = new RealWorld();
    return action.call(dummy).call();
}
Above in STAction parameter A can not depend on S.

What monadic type does the return function return?

This function is strange. I'm confused.
return :: (Monad m) => a -> m a
If I write return 5, I will get a monad with 5 inside. But of what type? Type classes are only named constraints, not types. List, IO, ... are monads, but here the monad type is left undefined.
return is polymorphic so it can stand for more than one type. Just like + in C is overloaded to work both at summing ints and at summing floats, return is overloaded to work with any monad.
Of course, when it's time to run the code you need to know what type the m corresponds to in order to know what concrete implementation of return to use. Sometimes you have explicit type annotations or type inference that lets you know what implementation of return to use:
(return 5) :: [Int]
Other times, you can "push up" the decision higher up. If you write a larger polymorphic function, the inner returns use the same type from the outer function.
my_func :: Monad m => a -> m a
my_func x = return x
(my_func 10) :: [Int]
I told my_func that I was working in the list monad and, in turn, this made my_func use the list monad implementation of return inside.
Finally, if you don't leave enough information for the compiler to figure out what type to use, you will get an ambiguous type variable compilation error. This is especially common with the Read typeclass. (try typing x <- readLn in ghci to see what happens...)
It's polymorphic. It returns whatever monad instance's return implementation was called. What specific data type it returns depends on the function.
[1,2,3] >>= \n -> return $ n + 1 -- Gives [2,3,4]
getLine >>= \str -> return $ reverse str -- Gets input and reverses it

Why discarded values are () instead of ⊥ in Haskell?

How come in Haskell, when there is a value that would be discarded, () is used instead of ⊥?
Examples (can't really think of anything other than IO actions at the moment):
mapM_ :: (Monad m) => (a -> m b) -> [a] -> m ()
foldM_ :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m ()
writeFile :: FilePath -> String -> IO ()
Under strict evaluation, this makes perfect sense, but in Haskell, it only makes the domain bigger.
Perhaps there are "unused parameter" functions d -> a which are strict on d (where d is an unconstrained type parameter and does not appear free in a)? Ex: seq, const' x y = y `seq` x.
I think this is because you need to specify the type of the value to be discarded. In Haskell-98, () is the obvious choice. And as long as you know the type is (), you may as well make the value () as well (presuming evaluation proceeds that far), just in case somebody tries to pattern-match on it or something. I think most programmers don't like introducing extra ⊥'s into code because it's just an extra trap to fall into. I certainly avoid it.
Instead of (), it is possible to create an uninhabited type (except by ⊥ of course).
{-# LANGUAGE EmptyDataDecls #-}
data Void
mapM_ :: (Monad m) => (a -> m b) -> [a] -> m Void
Now it's not even possible to pattern-match, because there's no Void constructor. I suspect the reason this isn't done more often is because it's not Haskell-98 compatible, as it requires the EmptyDataDecls extension.
Edit: you can't pattern-match on Void, but seq will ruin your day. Thanks to @sacundim for pointing this out.
Well, the bottom type literally means a non-terminating computation, and the unit type is just what it is: a type inhabited by a single value. Clearly, monadic computations are usually meant to finish, so it simply doesn't make sense to make them return undefined. And, of course, it is simply a safety measure: just like John L said, what if someone pattern matches on the monadic result? So monadic computations return the 'lowest' possible (in Haskell 98) type, unit.
So, maybe we could have the following signatures:
mapM_ :: (Monad m) => (a -> m b) -> [a] -> m z
foldM_ :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m z
writeFile :: FilePath -> String -> IO z
We'd reimplement the functions in question so that any attempt to bind the z in m z or IO z would bind the variable to undefined or any other bottom.
What do we gain? Now people can write programs that force the undefined result of these computations. How is that a good thing? All it means is that people can now write programs that fail to terminate for no good reason, that were impossible to write before.
You're getting confused between types and values.
In writeFile :: FilePath -> String -> IO (), the () is the unit type. The value you get for x by doing x <- writeFile foo bar in a do block is (normally) the value (), which is the sole non-bottom inhabitant of the type ().
⊥ OTOH is a value. Since ⊥ is a member of every type, it's also usable as a value for the type (). If you're discarding that x above without using it (we normally don't even extract it into a variable), it may very well be ⊥ and you'd never know. In that sense you already have what you want; if you're ever writing a function whose result you expect to be always ignored, you could use ⊥. But since ⊥ is a value of every type, there is no type ⊥, and so there is no type IO ⊥.
But really, they represent different conceptual things. The type () is the type of values that contain zero information (which is why there is only one value; if there were two or more values then () values would contain at least as much information as values of Bool). IO () is the type of IO actions that generate a value with no information, but that may have effects which happen as a result of generating that non-informative value.
⊥ is in some sense a non-value. 1 `div` 0 gives ⊥ because there is no value that could be used as the result of that expression which satisfies the laws of integer division. Throwing an exception gives ⊥ because functions that contain exception throws do not give you a value of their type. Non-termination gives ⊥ because the expression never terminates with a value. ⊥ is a way of treating all of these non-values as if they were a value for some purposes. As far as I can tell it's mainly useful because Haskell's laziness means that ⊥ and a data structure containing ⊥ (e.g. [⊥]) are distinguishable.
The value () is not like the cases where we use ⊥. writeFile foo bar doesn't have an "impossible value" like return $ 1 `div` 0, it just has no information in its value (other than that contained in the monadic structure). There are perfectly sensible things I could do with the () I get from doing x <- writeFile foo bar; they're just not very interesting and so nobody ever does them. This is distinctly different from x <- return $ 1 `div` 0, where doing anything with that value has to give me another ill-defined value.
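A few of these distinctions can be observed directly. The sketch below uses undefined as a convenient ⊥; the function names are mine, and catching the failure with try/evaluate is just one way to watch ⊥ being forced without crashing the program.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- A list containing ⊥ is not itself ⊥: length never forces the element.
lengthOfBottomList :: Int
lengthOfBottomList = length [undefined :: Int]

-- The () produced by an action is an ordinary value that can be matched on.
matchUnit :: IO String
matchUnit = do
  x <- return ()
  case x of
    () -> return "matched ()"

-- Forcing ⊥ at type () fails at run time; here the failure is caught as an exception.
forceBottom :: IO (Either ErrorCall ())
forceBottom = try (evaluate (undefined :: ()))
```

lengthOfBottomList is 1, matchUnit succeeds, and forceBottom yields a Left carrying the error raised by undefined.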
I would like to point out one severe downside to writing one particular form of returning ⊥: if you write types like this, you get bad programs:
mapM_ :: (Monad m) => (a -> m b) -> [a] -> m z
This is way too polymorphic. As an example, consider forever :: Monad m => m a -> m b. I encountered this gotcha a long time ago and I'm still bitter:
main :: IO ()
main = forever putStrLn "This is going to get printed a lot!"
The error is obvious and simple: missing parentheses.
It typechecks. This is exactly the sort of error that the type system is supposed to catch easily.
It silently infinite loops at runtime (without printing anything). It is a pain to debug.
Why? Well, because r -> is a monad. So m b matches virtually anything. For example:
forever :: m a -> m b
forever putStrLn :: String -> b
forever putStrLn "hello!" :: b -- eep!
forever putStrLn "hello" readFile id flip (Nothing,[17,0]) :: t -- no type error.
This sort of thing inclines me to the view that forever should be typed m a -> m Void.
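If the "functions are a monad" part is unfamiliar, here is a small sketch of the ((->) r) instance at work, plus the accidental typing from above written out as a definition. The names constFive, pairUp and accidental are made up for illustration, and accidental is deliberately never called.

```haskell
import Control.Monad (forever)

-- In the ((->) r) monad, return ignores the argument it is eventually given.
constFive :: String -> Int
constFive = return 5

-- (>>=) feeds the same argument to every function in the chain.
pairUp :: Int -> (Int, Int)
pairUp = (+ 1) >>= \a -> (* 2) >>= \b -> return (a, b)

-- This type checks because m is instantiated to ((->) String).
-- It is never applied here, since applying it would loop without printing anything.
accidental :: String -> b
accidental = forever putStrLn
```

constFive "ignored" is 5 and pairUp 10 is (11, 20). The fact that accidental compiles at all is the gotcha: the type checker sees nothing wrong with it.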
() is ⊤, i.e. the unit type, not ⊥ (the bottom type). The big difference is that the unit type is inhabited, so it has a value (() in Haskell), whereas the bottom type is uninhabited, so you can't write definitions like this:
absurd : ⊥
absurd = -- no way
Of course you can do this in Haskell since the "bottom type" (there is no such thing, of course) is inhabited here with undefined. This makes Haskell inconsistent.
Functions like this:
disprove : a → ⊥
disprove x = -- ...
can be written, it is the same as
disprove : ¬ a
disprove x = -- ...
i.e. it disproves the type a, so that a is an absurdity.
In any case, you can see how the unit type is used in different languages, as () :: () in Haskell, () : unit in ML, () : Unit in Scala and tt : ⊤ in Agda. In languages like Haskell and Agda (with the IO monad) functions like putStrLn should have a type String → IO ⊤, not String → IO ⊥, since the latter is absurd (logically it states that there are no strings that can be printed, which is just not right).
DISCLAIMER: the previous text uses Agda notation and is more about Agda than Haskell.
In Haskell if we have
data Void
It doesn't mean that Void is uninhabited. It is inhabited with undefined, non-terminating programs, errors and exceptions. For example:
data Void
instance Show Void where
  show _ = "Void"
data Identity a = Identity { runIdentity :: a }
mapM__ :: (a -> Identity b) -> [a] -> Identity Void
mapM__ _ _ = Identity undefined
then
print $ runIdentity $ mapM__ (const $ Identity 0) [1, 2, 3]
-- ^ will print "Void".
case runIdentity $ mapM__ (const $ Identity 0) [1, 2, 3] of _ -> print "1"
-- ^ will print "1".
let x = runIdentity $ mapM__ (const $ Identity 0) [1, 2, 3]
x `seq` print x
-- ^ will throw an exception.
But it also doesn't mean that Void is ⊥. So
mapM_ :: Monad m => (a -> m b) -> [a] -> m Void
where Void is declared as an empty data type, is ok. But
mapM_ :: Monad m => (a -> m b) -> [a] -> m ⊥
is nonsense, because there is no such type as ⊥ in Haskell.

Resources