Haskell rank two polymorphism compile error

Given the following definitions:
import Control.Monad.ST
import Data.STRef
fourty_two = do
  x <- newSTRef (42::Int)
  readSTRef x
The following compiles under GHC:
main = (print . runST) fourty_two -- (1)
But this does not:
main = (print . runST) $ fourty_two -- (2)
But then as bdonlan points out in a comment, this does compile:
main = ((print . runST) $) fourty_two -- (3)
But this does not compile:
main = (($) (print . runST)) fourty_two -- (4)
This seems to indicate that (3) only compiles because of special treatment of infix $. However, it still doesn't explain why (1) compiles.
Questions:
1) I've read the following two questions (first, second), and I've been led to believe $ can only be instantiated with monomorphic types. But I would similarly assume . can only be instantiated with monomorphic types, and as a result would similarly fail.
Why does the first code succeed but the second code does not? (e.g. is there a special rule GHC has for the first case that it can't apply in the second?)
2) Is there a current GHC extension that compiles the second code? (perhaps ImpredicativePolymorphism did this at some point, but it seems deprecated, has anything replaced it?)
3) Is there any way to define say `my_dollar` using GHC extensions to do what $ does, but is also able to handle polymorphic types, so (print . runST) `my_dollar` fourty_two compiles?
Edit: Proposed Answer:
Also, the following fails to compile:
main = ((.) print runST) fourty_two -- (5)
This is the same as (1), except that it does not use the infix version of (.).
As a result, it seems GHC has special rules for both $ and ., but only their infix versions.

I'm not sure I understand why the second doesn't work. We can look at the type of print . runST and observe that it is sufficiently polymorphic, so the blame doesn't lie with (.). I suspect that the special rule that GHC has for infix ($) just isn't quite sufficient. SPJ and friends might be open to re-examining it if you propose this fragment as a bug on their tracker.
As for why the third example works, well, that's just because again the type of ((print . runST) $) is sufficiently polymorphic; in fact, it's equal to the type of print . runST.
Nothing has replaced ImpredicativePolymorphism, because the GHC folks haven't seen any use cases where the extra programmer convenience outweighed the extra potential for compiler bugs. (I don't think they'd see this as compelling, either, though of course I'm not the authority.)
We can define a slightly less polymorphic ($$):
{-# LANGUAGE RankNTypes #-}
infixl 0 $$
($$) :: ((forall s. f s a) -> b) -> ((forall s. f s a) -> b)
f $$ x = f x
Then your example typechecks okay with this new operator:
*Main> (print . runST) $$ fourty_two
42

I can't speak with too much authority on this subject, but here's what I think may be happening:
Consider what the typechecker has to do in each of these cases. (print . runST) has type Show t => (forall s. ST s t) -> IO (). fourty_two has type ST x Int.
The forall here is a universal quantifier in argument position, which makes this a rank-2 type: the argument passed in has to be universal in s. That is, you must pass in a polymorphic value that supports any choice of s whatsoever. If you don't explicitly state forall, Haskell puts it at the outermost level of the type. This means that fourty_two :: forall x. ST x Int and (print . runST) :: forall t. Show t => (forall s. ST s t) -> IO ()
Now, we can match forall x. ST x Int with forall s. ST s t by letting t = Int, x = s. So the direct call case works. What happens if we use $, though?
$ has type ($) :: forall a b. (a -> b) -> a -> b. When we resolve a and b, since the type for $ doesn't have any explicit type scoping like this, the x argument of fourty_two gets lifted out to the outermost scope in the type for ($) - so ($) :: forall x t. (a = forall s. ST s t -> b = IO ()) -> (a = ST x t) -> IO (). At this point, it tries to match a and b, and fails.
If you instead write ((print . runST) $) fourty_two, then the compiler first resolves the type of ((print . runST) $). It resolves the type for ($) to be forall t. (a = forall s. ST s t -> b = IO ()) -> a -> b; note that since the second occurrence of a is unconstrained, we don't have that pesky type variable leaking out to the outermost scope! And so the match succeeds, the function is partially applied, and the overall type of the expression is forall t. (forall s. ST s t) -> IO (), which is right back where we started, and so it succeeds.
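Whatever the exact rules for $ and . are in a given GHC version, giving the rank-2 type an explicit home sidesteps them. The following is a minimal sketch rather than anything from the answers above: runAndShow is a hypothetical helper whose signature mentions forall s directly, so neither ($) nor (.) has to be instantiated at a polymorphic type.

```haskell
{-# LANGUAGE RankNTypes #-}
import Control.Monad.ST (ST, runST)
import Data.STRef (newSTRef, readSTRef)

-- Same computation as in the question, with its polymorphic type spelled out.
fourty_two :: ST s Int
fourty_two = do
  x <- newSTRef (42 :: Int)
  readSTRef x

-- Hypothetical helper (not from the question): the argument's rank-2 type
-- is written explicitly, so runST is applied directly to a polymorphic value.
runAndShow :: Show a => (forall s. ST s a) -> String
runAndShow st = show (runST st)
```

With these definitions in scope, main = putStrLn (runAndShow fourty_two) uses plain function application only, which is the form that all the variants in the question agree on.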

Related

'Referencing' typeclass functions

I'm a beginner and I'm trying to use Hoed to trace Haskell evaluations, because maybe it will further help my learning process.
I saw in their examples code like this
isEven :: Int -> Bool
isEven = observe "isEven" isEven'
isEven' n = mod2 n == 0
I was wondering how I could use observe to trace a function defined in an instance, such as >>=.
I wrote something like
bind' = observe "bind'" (>>=)
and of course I got an error
* Ambiguous type variable 'm0' arising from a use of '>>='
prevents the constraint '(Monad m0)' from being solved.
Relevant bindings include
bind' :: m0 a0 -> (a0 -> m0 b0) -> m0 b0 (bound at my.hs:46:1)
Probable fix: use a type annotation to specify what 'm0' should be.
These potential instances exist:
...
Should I, and how could I, use a type annotation to specify which Monad instance's (e.g. Reader, State, etc.) >>= function I mean?
It looks like you have found the infamous MonomorphismRestriction. The linked resources do a great job of explaining what the MonomorphismRestriction is and how it works.
You're not wrong to expect that writing bind' with no signature should "just work". However, sometimes the compiler needs a bit of help. In short, due to the MonomorphismRestriction, GHC tries to take the nominally polymorphic signature of bind' :: Monad m => m a -> (a -> m b) -> m b, and make it less polymorphic by instantiating some of the type variables.
In your case, it looks like the compiler wants to make bind' only work for one specific Monad m. Without your real code, I can't say for sure, but consider this example:
import Debug.Trace
main :: IO ()
main = (>>=) (return "hello") print
bind' = trace "bind" (>>=)
The compiler produces an error similar to yours: Ambiguous type variable m0
However, if you use bind':
import Debug.Trace
main :: IO ()
main = bind' (return "hello") print
bind' = trace "bind" (>>=)
no error! That's because GHC is inferring that m should be IO since bind' is used in the IO monad.
Alternatively, you can tell GHC to turn off the MonomorphismRestriction:
{-# LANGUAGE NoMonomorphismRestriction #-}
import Debug.Trace
main :: IO ()
main = (>>=) (return "hello") print
bind' = trace "bind" (>>=)
and it compiles just fine!
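A type annotation, as the error message suggests, is a third option. The sketch below uses trace from Debug.Trace, as in the examples above, rather than Hoed's observe; bindAny and bindMaybe are hypothetical names chosen for illustration. The monomorphism restriction only applies to bindings without a signature, so either form should compile with the restriction left on.

```haskell
import Debug.Trace (trace)

-- Fully polymorphic: with an explicit signature the binding keeps
-- its general type and works for every Monad.
bindAny :: Monad m => m a -> (a -> m b) -> m b
bindAny = trace "bindAny" (>>=)

-- Specialised: the signature picks the Maybe instance of >>=.
bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe = trace "bindMaybe" (>>=)
```

The same pattern applies to any other instance: replace Maybe in the signature with the monad whose >>= you want to trace.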

When are type signatures necessary in Haskell?

Many introductory texts will tell you that in Haskell type signatures are "almost always" optional. Can anybody quantify the "almost" part?
As far as I can tell, the only time you need an explicit signature is to disambiguate type classes. (The canonical example being read . show.) Are there other cases I haven't thought of, or is this it?
(I'm aware that if you go beyond Haskell 2010 there are plenty of exceptions. For example, GHC will never infer rank-N types. But rank-N types are a language extension, not part of the official standard [yet].)
Polymorphic recursion needs type annotations, in general.
f :: (a -> a) -> (a -> b) -> Int -> a -> b
f f1 g n x =
  if n == (0 :: Int)
    then g x
    else f f1 (\z h -> g (h z)) (n-1) x f1
(Credit: Patrick Cousot)
Note how the recursive call looks badly typed (!): it calls itself with five arguments, despite f having only four! Then remember that b can be instantiated with c -> d, which causes an extra argument to appear.
The above contrived example computes
f f1 g n x = g (f1 (f1 (f1 ... (f1 x))))
where f1 is applied n times. Of course, there is a much simpler way to write an equivalent program.
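As a rough sketch of both versions side by side: f is the function from above, and simple is a hypothetical name for one possible "much simpler" equivalent built on iterate. In the recursive call, f is used with b replaced by (a -> a) -> b, which is why inference alone cannot find its type and the signature is required.

```haskell
-- Polymorphic recursion: the recursive call uses f at a different type
-- (b becomes (a -> a) -> b), so the signature cannot be omitted.
f :: (a -> a) -> (a -> b) -> Int -> a -> b
f f1 g n x =
  if n == 0
    then g x
    else f f1 (\z h -> g (h z)) (n - 1) x f1

-- Hypothetical simpler equivalent (assumes n >= 0): apply f1 n times, then g.
simple :: (a -> a) -> (a -> b) -> Int -> a -> b
simple f1 g n x = g (iterate f1 x !! n)
```

For instance, both f (+ 1) show 3 0 and simple (+ 1) show 3 0 evaluate to "3".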
Monomorphism restriction
If you have MonomorphismRestriction enabled, then sometimes you will need to add a type signature to get the most general type:
{-# LANGUAGE MonomorphismRestriction #-}
-- myPrint :: Show a => a -> IO ()
myPrint = print
main = do
  myPrint ()
  myPrint "hello"
This will fail because myPrint is monomorphic. You would need to uncomment the type signature to make it work, or disable MonomorphismRestriction.
Phantom constraints
When you put a polymorphic value with a constraint into a tuple, the tuple itself becomes polymorphic and has the same constraint:
myValue :: Read a => a
myValue = read "0"
myTuple :: Read a => (a, String)
myTuple = (myValue, "hello")
We know that the constraint affects the first part of the tuple but does not affect the second part. The type system doesn't know that, unfortunately, and will complain if you try to do this:
myString = snd myTuple
Even though intuitively one would expect myString to be just a String, the type checker needs to specialize the type variable a and figure out whether the constraint is actually satisfied. In order to make this expression work, one would need to annotate the type of either snd or myTuple:
myString = snd (myTuple :: ((), String))
In Haskell, as I'm sure you know, types are inferred. In other words, the compiler works out what type you want.
However, in Haskell, there are also polymorphic typeclasses, with functions that act in different ways depending on the return type. Here's an example of the Monad class, though I haven't defined everything:
class Monad m where
  return :: a -> m a
  (>>=) :: m a -> (a -> m b) -> m b
  fail :: String -> m a
We're given a lot of functions with just type signatures. Our job is to make instance declarations for different types that can be treated as Monads, like Maybe t or [t].
Have a look at this code - it won't work in the way we might expect:
return 7
That's a function from the Monad class, but because there's more than one Monad, we have to specify which return type we want; otherwise GHCi defaults it to the IO monad. So:
return 7 :: Maybe Int
-- Will return...
Just 7
return 6 :: [Int]
-- Will return...
[6]
This is because both [] and Maybe have instances of the Monad type class.
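In a source file the same disambiguation is usually done with a signature on the binding rather than an inline annotation. A minimal sketch, with asMaybe and asList as hypothetical names:

```haskell
-- The signature alone selects which Monad instance 'return' uses.
asMaybe :: Maybe Int
asMaybe = return 7   -- Just 7

asList :: [Int]
asList = return 6    -- [6]
```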
Here's another example, this time from the Random typeclass. This code throws an error:
random (mkStdGen 100)
Because random returns something in the Random class, we have to specify the type we want returned: a tuple of whatever value type we want and a StdGen:
random (mkStdGen 100) :: (Int, StdGen)
-- Returns...
(-3650871090684229393,693699796 2103410263)
random (mkStdGen 100) :: (Bool, StdGen)
-- Returns...
(True,4041414 40692)
This can all be found in Learn You a Haskell online, though you'll have to do some long reading. This, I'm pretty much 100% certain, is the only time when types are necessary.

Signature of IO in Haskell (is this class or data?)

The question is not what IO does, but how is it defined, its signature. Specifically, is this data or class, is "a" its type parameter then? I didn't find it anywhere. Also, I don't understand the syntactic meaning of this:
f :: IO a
You asked whether IO a is a data type: it is. And you asked whether the a is its type parameter: it is. You said you couldn't find its definition. Let me show you how to find it:
localhost:~ gareth.rowlands$ ghci
GHCi, version 7.6.3: http://www.haskell.org/ghc/ :? for help
Prelude> :i IO
newtype IO a
= GHC.Types.IO (GHC.Prim.State# GHC.Prim.RealWorld
-> (# GHC.Prim.State# GHC.Prim.RealWorld, a #))
-- Defined in `GHC.Types'
instance Monad IO -- Defined in `GHC.Base'
instance Functor IO -- Defined in `GHC.Base'
Prelude>
In ghci, :i or :info tells you about a type. It shows the type declaration and where it's defined. You can see that IO is a Monad and a Functor too.
This technique is more useful on normal Haskell types - as others have noted, IO is magic in Haskell. In a typical Haskell type, the type signature is very revealing but the important thing to know about IO is not its type declaration, rather that IO actions actually perform IO. They do this in a pretty conventional way, typically by calling the underlying C or OS routine. For example, Haskell's putChar action might call C's putchar function.
IO is a polymorphic type (which happens to be an instance of Monad, irrelevant here).
Consider the humble list. If we were to write our own list of Ints, we might do this:
data IntList = Nil | Cons { listHead :: Int, listRest :: IntList }
If you then abstract over what element type it is, you get this:
data List a = Nil | Cons { listHead :: a, listRest :: List a }
As you can see, the return value of listRest is List a. List is a polymorphic type of kind * -> *, which is to say that it takes one type argument to create a concrete type.
In a similar way, IO is a polymorphic type with kind * -> *, which again means it takes one type argument. If you were to define it yourself, it might look like this:
data IO a = IO (RealWorld -> (a, RealWorld))
(definition courtesy of this answer)
The amount of magic in IO is grossly overestimated: it has some support from compiler and runtime system, but much less than newbies usually expect.
Here is the source file where it is defined:
http://www.haskell.org/ghc/docs/latest/html/libraries/ghc-prim-0.3.0.0/src/GHC-Types.html
newtype IO a
= IO (State# RealWorld -> (# State# RealWorld, a #))
It is just an optimized version of the state monad. If we remove the optimization annotations we will see:
data IO a = IO (RealWorld -> (RealWorld, a))
So basically IO a is a data structure storing a function that takes the old real world and returns the new real world, with the IO operation performed, together with a value of type a.
Some compiler tricks are necessary, mostly to remove the RealWorld dummy value efficiently.
The IO type is an abstract newtype: its constructors are not exported, so you cannot bypass library functions, work with it directly and perform nasty things such as duplicating RealWorld, creating RealWorld out of nothing or escaping the monad (writing a function of type IO a -> a).
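To make the state-passing picture concrete, here is a toy model. It is only a sketch under stated assumptions: World is a hypothetical stand-in for RealWorld (just a counter), and Action and tick are invented names. It is not how GHC implements IO, only the same shape as the simplified definition above.

```haskell
-- Toy model only: 'World' is a hypothetical stand-in for RealWorld.
newtype World = World Int

-- Same shape as the simplified IO: a function from the old world to a
-- result paired with the new world.
newtype Action a = Action { runAction :: World -> (a, World) }

instance Functor Action where
  fmap f (Action g) = Action $ \w ->
    let (a, w') = g w in (f a, w')

instance Applicative Action where
  pure a = Action $ \w -> (a, w)
  Action f <*> Action g = Action $ \w ->
    let (h, w1) = f w
        (a, w2) = g w1
    in (h a, w2)

instance Monad Action where
  Action g >>= k = Action $ \w ->
    let (a, w') = g w in runAction (k a) w'

-- Hypothetical primitive: reads the counter and "changes the world".
tick :: Action Int
tick = Action $ \(World n) -> (n, World (n + 1))

-- The world is threaded through each step in order.
twoTicks :: Action (Int, Int)
twoTicks = do
  a <- tick
  b <- tick
  return (a, b)
```

Running fst (runAction twoTicks (World 0)) gives (0, 1): each tick sees the world left behind by the previous one. Unlike this toy, the real IO constructor is hidden, so user code cannot call the equivalent of runAction.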
Since IO can be applied to objects of any type a, as it is a polymorphic monad, a is not specified.
If you have some object with type a, then it can be 'wrapped' as an object of type IO a, which you can think of as being an action that gives an object of type a. For example, getChar is of type IO Char, and so when it is called, it has the side effect of (from the program's perspective) generating a character, which comes from stdin.
As another example, putChar has type Char -> IO (), meaning that it takes a char, and then performs some action that gives no output (in the context of the program, though it will print the char given to stdout).
Edit: More explanation of monads:
A monad can be thought of as a 'wrapper type' M, and has two associated functions:
return and >>=.
Given a type a, it is possible to create objects of type M a (IO a in the case of the IO monad), using the return function.
return, therefore, has type a -> M a. Moreover, return attempts not to change the element that it is passed: if you call return x, you will get a wrapped version of x that contains all of the information of x (theoretically, at least; this doesn't happen with, for example, the empty monad).
For example, return 'x' will yield an M Char. getChar likewise yields an IO Char, which is then pulled out of its wrapper with <-.
>>=, read as 'bind', is more complicated. It has type M a -> (a -> M b) -> M b, and its role is to take a 'wrapped' object, and a function from the underlying type of that object to another 'wrapped' object, and apply that function to the underlying variable in the first input.
For example, (return 5) >>= (return . (+ 3)) will yield an M Int, which will be the same M Int that would be given by return 8. In this way, any function that can be applied outside of a monad can also be applied inside of it.
To do this, one could take an arbitrary function f :: a -> b, and give the new function g :: M a -> M b as follows:
g x = x >>= (return . f)
Now, for something to be a monad, these operations must also have certain relations -- their definitions as above aren't quite enough.
First: (return x) >>= f must be equivalent to f x. That is, it must be equivalent to perform an operation on x whether it is 'wrapped' in the monad or not.
Second: x >>= return must be equivalent to x. That is, if an object is unwrapped by bind, and then rewrapped by return, it must return to its same state, unchanged.
Third, and finally (x >>= f) >>= g must be equivalent to x >>= (\y -> (f y >>= g) ). That is, function binding is associative (sort of). More accurately, if two functions are bound successively, this must be equivalent to binding the combination thereof.
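The three laws can be spot-checked for a concrete monad. The sketch below does so for Maybe at a single sample value; addThree and double are hypothetical functions chosen for illustration, and a check at one value is of course not a proof.

```haskell
-- Hypothetical sample functions of type a -> M b, with M = Maybe.
addThree :: Int -> Maybe Int
addThree x = if x > 0 then Just (x + 3) else Nothing

double :: Int -> Maybe Int
double y = Just (y * 2)

leftIdentity, rightIdentity, associativity :: Bool
-- First law: (return x) >>= f is equivalent to f x.
leftIdentity = (return 5 >>= addThree) == addThree 5
-- Second law: x >>= return is equivalent to x.
rightIdentity = (Just 5 >>= return) == Just 5
-- Third law: (x >>= f) >>= g is equivalent to x >>= (\y -> f y >>= g).
associativity =
  ((Just 5 >>= addThree) >>= double)
    == (Just 5 >>= (\y -> addThree y >>= double))
```

All three values evaluate to True for these inputs.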
Now, while this is how monads work, it's not how they're most commonly used, because of the syntactic sugar of do and <-.
Essentially, do begins a long chain of binds, and each <- sort of creates a lambda function that gets bound.
For example,
a = do x <- something
       y <- function x
       return y
is equivalent to
a = something >>= (\x -> (function x) >>= (\y -> return y))
In both cases, something is bound to x, function x is bound to y, and then y is returned to a in the wrapper of the relevant monad.
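The equivalence can be tried out in a concrete monad. In this sketch something and function are given hypothetical Maybe-based definitions so that both spellings can be compared directly.

```haskell
-- Hypothetical stand-ins for 'something' and 'function' from the text.
something :: Maybe Int
something = Just 1

function :: Int -> Maybe Int
function x = Just (x + 1)

-- do-notation version.
viaDo :: Maybe Int
viaDo = do
  x <- something
  y <- function x
  return y

-- Hand-desugared version using >>= and lambdas.
viaBind :: Maybe Int
viaBind = something >>= (\x -> function x >>= (\y -> return y))
```

Both viaDo and viaBind evaluate to Just 2.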
Sorry for the wall of text, and I hope it explains something. If there's more you need cleared up about this, or something in this explanation is confusing, just ask.
This is a very good question, if you ask me. I remember being very confused about this too, maybe this will help...
'IO' is a type constructor, 'IO a' is a type, the 'a' (in 'IO a') is a type variable. The letter 'a' carries no significance, the letter 'b' or 't1' could have been used just as well.
If you look at the definition of the IO type constructor you will see that it is a newtype defined as: GHC.Types.IO (GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #))
'f :: IO a' is the type of a function called 'f' of apparently no arguments that returns a result of some unconstrained type in the IO monad. 'in the IO monad' means that f can do some IO (i.e. change the 'RealWorld', where 'change' means replace the provided RealWorld with a new one) while computing its result. The result of f is polymorphic (that's a type variable 'a' not a type constant like 'Int'). A polymorphic result means that in your program it's the caller that determines the type of the result, so used in one place f could return an Int, used in another place it could return a String. 'Unconstrained' means that there's no type class restricting what type can be returned and so any type can be returned.
Why is 'f' a function and not a constant since there are no parameters and Haskell is pure? Because the definition of IO means that 'f :: IO a' could have been written 'f :: GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #)' and so in fact has a parameter -- the 'state of the real world'.
In IO a, the a has essentially the same meaning as in Maybe a.
But we can't strip off the constructor, as in:
fromIO :: IO a -> a
fromIO (IO a) = a
Fortunately we can get at the value inside a do block, like:
{-# LANGUAGE ScopedTypeVariables #-}
foo = do
  (fromIO :: a) <- (dataIO :: IO a)
  ...

Simple example for ImpredicativeTypes

The GHC user's guide describes the impredicative polymorphism extension with reference to the following example:
f :: Maybe (forall a. [a] -> [a]) -> Maybe ([Int], [Char])
f (Just g) = Just (g [3], g "hello")
f Nothing = Nothing
However, when I define this example in a file and try to call it, I get a type error:
ghci> f (Just reverse)
<interactive>:8:9:
Couldn't match expected type `forall a. [a] -> [a]'
with actual type `[a0] -> [a0]'
In the first argument of `Just', namely `reverse'
In the first argument of `f', namely `(Just reverse)'
In the expression: f (Just reverse)
ghci> f (Just id)
<interactive>:9:9:
Couldn't match expected type `forall a. [a] -> [a]'
with actual type `a0 -> a0'
In the first argument of `Just', namely `id'
In the first argument of `f', namely `(Just id)'
In the expression: f (Just id)
Seemingly only undefined, Nothing, or Just undefined satisfies the type-checker.
I have two questions, therefore:
Can the above function be called with Just f for any non-bottom f?
Can someone provide an example of a value only definable with impredicative polymorphism, and usable in a non-trivial way?
The latter is particularly with the HaskellWiki page on Impredicative Polymorphism in mind, which currently makes a decidedly unconvincing case for the existence of the extension.
Here's an example of how one project, const-math-ghc-plugin, uses ImpredicativeTypes to specify a list of matching rules.
The idea is that when we have an expression of the form App (PrimOp nameStr) (Lit litVal), we want to look up the appropriate rule based upon the primop name. A litVal will be either a MachFloat d or MachDouble d (d is a Rational). If we find a rule, we want to apply the function for that rule to d converted to the correct type.
The function mkUnaryCollapseIEEE does this for unary functions.
mkUnaryCollapseIEEE :: (forall a. RealFloat a => (a -> a))
                    -> Opts
                    -> CoreExpr
                    -> CoreM CoreExpr
mkUnaryCollapseIEEE fnE opts expr@(App f1 (App f2 (Lit lit)))
    | isDHash f2, MachDouble d <- lit = e d mkDoubleLitDouble
    | isFHash f2, MachFloat d <- lit = e d mkFloatLitFloat
    where
        e d = evalUnaryIEEE opts fnE f1 f2 d expr
The first argument needs to have a Rank-2 type, because it will be instantiated at either Float or Double depending on the literal constructor. The list of rules looks like this:
unarySubIEEE :: String -> (forall a. RealFloat a => a -> a) -> CMSub
unarySubIEEE nm fn = CMSub nm (mkUnaryCollapseIEEE fn)
subs =
    [ unarySubIEEE "GHC.Float.exp" exp
    , unarySubIEEE "GHC.Float.log" log
    , unarySubIEEE "GHC.Float.sqrt" sqrt
    -- lines omitted
    , unarySubIEEE "GHC.Float.atanh" atanh
    ]
This is ok, if a bit too much boilerplate for my taste.
However, there's a similar function mkUnaryCollapsePrimIEEE. In this case, the rules are different for different GHC versions. If we want to support multiple GHCs, it gets a bit tricky. If we took the same approach, the subs definition would require a lot of CPP, which can be unmaintainable. Instead, we defined the rules in a separate file for each GHC version. However, mkUnaryCollapsePrimIEEE isn't available in those modules due to circular import issues. We could probably re-structure the modules to make it work, but instead we defined the rulesets as:
unaryPrimRules :: [(String, (forall a. RealFloat a => a -> a))]
unaryPrimRules =
    [ ("GHC.Prim.expDouble#" , exp)
    , ("GHC.Prim.logDouble#" , log)
    -- lines omitted
    , ("GHC.Prim.expFloat#" , exp)
    , ("GHC.Prim.logFloat#" , log)
    ]
By using ImpredicativeTypes, we can keep a list of Rank-2 functions, ready to use for the first argument to mkUnaryCollapsePrimIEEE. The alternatives would be much more CPP/boilerplate, changing the module structure (or circular imports), or a lot of code duplication. None of which I would like.
I do seem to recall GHC HQ indicating that they would like to drop support for the extension, but perhaps they've reconsidered. It is quite useful at times.
Isn't it just that ImpredicativeTypes has been quietly dropped with the new typechecker in ghc-7+? Note that ideone.com still uses ghc-6.8, and indeed your program used to run fine:
{-# OPTIONS -fglasgow-exts #-}
f :: Maybe (forall a. [a] -> [a]) -> Maybe ([Int], [Char])
f (Just g) = Just (g [3], g "hello")
f Nothing = Nothing
main = print $ f (Just reverse)
prints Just ([3],"olleh") as expected; see http://ideone.com/KMASZy
augustss gives a handsome use case -- some sort of imitation Python dsl -- and a defense of the extension here: http://augustss.blogspot.com/2011/07/impredicative-polymorphism-use-case-in.html referred to in the ticket here http://hackage.haskell.org/trac/ghc/ticket/4295
Note this workaround:
justForF :: (forall a. [a] -> [a]) -> Maybe (forall a. [a] -> [a])
justForF = Just
ghci> f (justForF reverse)
Just ([3],"olleh")
Or this one (which is basically the same thing inlined):
ghci> f $ (Just :: (forall a. [a] -> [a]) -> Maybe (forall a. [a] -> [a])) reverse
Just ([3],"olleh")
It seems that type inference has problems inferring the type of Just in your case, so we have to tell it the type.
I have no clue whether it's a bug or whether there is a good reason for it. :)
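A different workaround, not taken from the answers above, avoids ImpredicativeTypes entirely by wrapping the polymorphic function in a newtype, which needs only RankNTypes. This is a sketch: ListFn is a hypothetical name, and f here takes Maybe ListFn rather than the Maybe (forall a. [a] -> [a]) of the user's guide example.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Hypothetical wrapper: the forall lives inside the constructor's field,
-- so Maybe is only ever instantiated at the ordinary type ListFn.
newtype ListFn = ListFn (forall a. [a] -> [a])

f :: Maybe ListFn -> Maybe ([Int], [Char])
f (Just (ListFn g)) = Just (g [3], g "hello")
f Nothing = Nothing
```

With this definition, f (Just (ListFn reverse)) evaluates to Just ([3],"olleh"), at the cost of wrapping every function at the call site.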

Why discarded values are () instead of ⊥ in Haskell?

How come in Haskell, when there is a value that would be discarded, () is used instead of ⊥?
Examples (can't really think of anything other than IO actions at the moment):
mapM_ :: (Monad m) => (a -> m b) -> [a] -> m ()
foldM_ :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m ()
writeFile :: FilePath -> String -> IO ()
Under strict evaluation, this makes perfect sense, but in Haskell, it only makes the domain bigger.
Perhaps there are "unused parameter" functions d -> a which are strict on d (where d is an unconstrained type parameter and does not appear free in a)? Ex: seq, const' x y = y `seq` x.
I think this is because you need to specify the type of the value to be discarded. In Haskell-98, () is the obvious choice. And as long as you know the type is (), you may as well make the value () as well (presuming evaluation proceeds that far), just in case somebody tries to pattern-match on it or something. I think most programmers don't like introducing extra ⊥'s into code because it's just an extra trap to fall into. I certainly avoid it.
Instead of (), it is possible to create an uninhabited type (except by ⊥ of course).
{-# LANGUAGE EmptyDataDecls #-}
data Void
mapM_ :: (Monad m) => (a -> m b) -> [a] -> m Void
Now it's not even possible to pattern-match, because there's no Void constructor. I suspect the reason this isn't done more often is because it's not Haskell-98 compatible, as it requires the EmptyDataDecls extension.
Edit: you can't pattern-match on Void, but seq will ruin your day. Thanks to @sacundim for pointing this out.
Well, the bottom type literally means a non-terminating computation, and the unit type is just what it is - a type inhabited by a single value. Clearly, monadic computations are usually meant to finish, so it simply doesn't make sense to make them return undefined. And, of course, it is simply a safety measure - just like John L said, what if someone pattern matches on the monadic result? So monadic computations return the 'lowest' possible (in Haskell 98) type - unit.
So, maybe we could have the following signatures:
mapM_ :: (Monad m) => (a -> m b) -> [a] -> m z
foldM_ :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m z
writeFile :: FilePath -> String -> IO z
We'd reimplement the functions in question so that any attempt to bind the z in m z or IO z would bind the variable to undefined or any other bottom.
What do we gain? Now people can write programs that force the undefined result of these computations. How is that a good thing? All it means is that people can now write programs that fail to terminate for no good reason, that were impossible to write before.
You're getting confused between types and values.
In writeFile :: FilePath -> String -> IO (), the () is the unit type. The value you get for x by doing x <- writeFile foo bar in a do block is (normally) the value (), which is the sole non-bottom inhabitant of the type ().
⊥ OTOH is a value. Since ⊥ is a member of every type, it's also usable as a value for the type (). If you're discarding that x above without using it (we normally don't even extract it into a variable), it may very well be ⊥ and you'd never know. In that sense you already have what you want; if you're ever writing a function whose result you expect to be always ignored, you could use ⊥. But since ⊥ is a value of every type, there is no type ⊥, and so there is no type IO ⊥.
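The point that a discarded result "may very well be ⊥ and you'd never know" can be sketched as follows. bottomResult, discardIsFine and forcingFails are hypothetical names; the sketch relies on return in IO not forcing its argument, and on evaluate and try from Control.Exception to observe what happens when the value is finally forced.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- An IO () action whose result value is ⊥ rather than ().
bottomResult :: IO ()
bottomResult = return undefined

-- Discarding the result never forces it, so this completes normally.
discardIsFine :: IO Bool
discardIsFine = bottomResult >> return True

-- Forcing the result hits ⊥; the exception is caught and reported as True.
forcingFails :: IO Bool
forcingFails = do
  x <- bottomResult
  r <- try (evaluate x) :: IO (Either SomeException ())
  return (either (const True) (const False) r)
```

Both actions yield True: the ⊥ is invisible until something inspects the value, which is the extra trap the other answers warn about.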
But really, they represent different conceptual things. The type () is the type of values that contain zero information (which is why there is only one value; if there were two or more values then () values would contain at least as much information as values of Bool). IO () is the type of IO actions that generate a value with no information, but may have effects that happen as a result of generating that non-informative value.
⊥ is in some sense a non-value. 1 `div` 0 gives ⊥ because there is no value that could be used as the result of that expression which satisfies the laws of integer division. Throwing an exception gives ⊥ because functions that contain exception throws do not give you a value of their type. Non-termination gives ⊥ because the expression never terminates with a value. ⊥ is a way of treating all of these non-values as if they were a value for some purposes. As far as I can tell it's mainly useful because Haskell's laziness means that ⊥ and a data structure containing ⊥ (i.e. [⊥]) are distinguishable.
The value () is not like the cases where we use ⊥. writeFile foo bar doesn't have an "impossible value" like return $ 1 `div` 0, it just has no information in its value (other than that contained in the monadic structure). There are perfectly sensible things I could do with the () I get from doing x <- writeFile foo bar; they're just not very interesting and so nobody ever does them. This is distinctly different from x <- return $ 1 `div` 0, where doing anything with that value has to give me another ill-defined value.
I would like to point out one severe downside to writing one particular form of returning ⊥: if you write types like this, you get bad programs:
mapM_ :: (Monad m) => (a -> m b) -> [a] -> m z
This is way too polymorphic. As an example, consider forever :: Monad m => m a -> m b. I encountered this gotcha a long time ago and I'm still bitter:
main :: IO ()
main = forever putStrLn "This is going to get printed a lot!"
The error is obvious and simple: missing parentheses.
It typechecks. This is exactly the sort of error that the type system is supposed to catch easily.
It silently infinite loops at runtime (without printing anything). It is a pain to debug.
Why? Well, because r -> is a monad. So m b matches virtually anything. For example:
forever :: m a -> m b
forever putStrLn :: String -> b
forever putStrLn "hello!" :: b -- eep!
forever putStrLn "hello" readFile id flip (Nothing,[17,0]) :: t -- no type error.
This sort of thing inclines me to the view that forever should be typed m a -> m Void.
() is ⊤, i.e. the unit type, not the ⊥ (the bottom type). The big difference is that the unit type is inhabited, so that it has a value (() in Haskell), on the other hand, the bottom type is uninhabited, so that you can't write functions like that:
absurd : ⊥
absurd = -- no way
Of course you can do this in Haskell since the "bottom type" (there is no such thing, of course) is inhabited here with undefined. This makes Haskell inconsistent.
Functions like this:
disprove : a → ⊥
disprove x = -- ...
can be written, it is the same as
disprove : ¬ a
disprove x = -- ...
i.e. it disproves the type a, so that a is absurd.
In any case, you can see how the unit type is used in different languages: () :: () in Haskell, () : unit in ML, () : Unit in Scala and tt : ⊤ in Agda. In languages like Haskell and Agda (with the IO monad), functions like putStrLn should have the type String → IO ⊤, not String → IO ⊥, since the latter is absurd (logically it states that there are no strings that can be printed, which is just not right).
DISCLAIMER: the previous text uses Agda notation and is more about Agda than Haskell.
In Haskell if we have
data Void
It doesn't mean that Void is uninhabited. It is inhabited with undefined, non-terminating programs, errors and exceptions. For example:
data Void
instance Show Void where
  show _ = "Void"
data Identity a = Identity { runIdentity :: a }
mapM__ :: (a -> Identity b) -> [a] -> Identity Void
mapM__ _ _ = Identity undefined
then
print $ runIdentity $ mapM__ (const $ Identity 0) [1, 2, 3]
-- ^ will print "Void".
case runIdentity $ mapM__ (const $ Identity 0) [1, 2, 3] of _ -> print "1"
-- ^ will print "1".
let x = runIdentity $ mapM__ (const $ Identity 0) [1, 2, 3]
x `seq` print x
-- ^ will throw an exception.
But it also doesn't mean that Void is ⊥. So
mapM_ :: Monad m => (a -> m b) -> [a] -> m Void
where Void is declared as an empty data type, is OK. But
mapM_ :: Monad m => (a -> m b) -> [a] -> m ⊥
is nonsense, since there is no such type as ⊥ in Haskell.

Resources