StateMonad instance for TeletypeIO - haskell

So, I have this datatype (it's from here: https://wiki.haskell.org/IO_Semantics):
data IO a = Done a
| PutChar Char (IO a)
| GetChar (Char -> IO a)
and I thought of writing a StateMonad instance for it. I have already written Monad and Applicative instances for it.
instance MonadState (IO s) where
get = GetChar (\c -> Done c)
put = PutChar c (Done ())
state f = Done (f s)
I don't think I fully understand what state (it was named 'modify' before) is supposed to do here.
state :: (s -> (a, s)) -> m a
I also have messed up with declarations. I don't really understand what is wrong, let alone how to fix it. Would appreciate your help.
Expecting one more argument to ‘MonadState (IO s)’
Expected a constraint,
but ‘MonadState (IO s)’ has kind ‘(* -> *) -> Constraint’
In the instance declaration for ‘MonadState (IO s)’

As I mentioned in the comments, your type doesn't really hold any state, so a StateMonad instance would be nonsensical for it.
However, since this is just an exercise (also based on the comments), I guess it's ok to implement the instance technically, even if it doesn't do what you'd expect it to do.
First, the compiler error you're getting tells you that the MonadState class actually takes two arguments - the type of the state and the type of the monad, where monad has to have kind * -> *, that is, have a type parameter, like Maybe, or list, or Identity.
In your case, the monad in question (not really a monad, but ok) is IO, and the type of your "state" is Char, since that's what you're getting and putting. So the declaration has to look like this:
instance MonadState Char IO where
Second, the state method doesn't have the signature (s -> s) -> m s as you claim, but rather (s -> (a, s)) -> m a. See its definition. What it's supposed to do is create a computation in monad m out of a function that takes a state and returns "result" plus new (updated) state.
Note also that this is the most general operation on the State monad, and both get and put can be expressed in terms of state:
get = state $ \s -> (s, s)
put s = state $ \_ -> ((), s)
This means that you do not have to implement get and put yourself. You only need to implement the state function, and get/put will come from default implementations.
Incidentally, this works the other way around as well: if you define get and put, the definition of state will come from the default:
state f = do
s <- get
let (a, s') = f s
put s'
return a
And now, let's see how this can actually be implemented.
The semantics of the state's parameter is this: it's a function that takes some state as input, then performs some computation based on that state, and this computation has some result a, and it also may modify the state in some way; so the function returns both the result and the new, modified state.
In your case, the way to "get" the state from your "monad" is via GetChar, and the way in which is "returns" the Char is by calling a function that you pass to it (such function is usually referred to as "continuation").
The way to "put" the state back into your "monad" is via PutChar, which takes the Char you want to "put" as a parameter, plus some IO a that represents the computation "result".
So, the way to implement state would be to (1) first "get" the Char, then (2) apply the function to it, then (3) "put" the resulting new Char, and then (3) return the "result" of the function. Putting it all together:
state f = GetChar $ \c -> let (a, c') = f c in PutChar c' (Done a)
As further exercise, I encourage you to see how get and put would unfold, starting from the definitions I gave above, and performing step-by-step substitution. Here, I'll give you a couple first steps:
get = state $ \s -> (s, s)
-- Substituting definition of `state`
= GetChar $ \c -> let (a, c') = (\s -> (s, s)) c in PutChar c' (Done a)
-- Substituting (\s -> (s, s)) c == (c, c)
= GetChar $ \c -> let (a, c') = (c, c) in PutChar c' (Done a)
= <and so on...>

MonadState takes two arguments, in this order:
the type of the state
the monad
In this case the monad is IO:
instance MonadState _ IO where
...
And you need to figure out what goes in place of the underscore

Related

confusion over the passing of State monad in Haskell

In Haskell the State is monad is passed around to extract and store state. And in the two following examples, both pass the State monad using >>, and a close verification (by function inlining and reduction) confirms that the state is indeed passed to the next step.
Yet this seems not very intuitive. So does this mean when I want to pass the State monad I just need >> (or the >>= and lambda expression \s -> a where s is not free in a)? Can anyone provide an intuitive explanation for this fact without bothering to reduce the function?
-- the first example
tick :: State Int Int
tick = get >>= \n ->
put (n+1) >>
return n
-- the second example
type GameValue = Int
type GameState = (Bool, Int)
playGame' :: String -> State GameState GameValue
playGame' [] = get >>= \(on, score) -> return score
playGame' (x: xs) = get >>= \(on, score) ->
case x of
'a' | on -> put (on, score+1)
'b' | on -> put (on, score-1)
'c' -> put (not on, score)
_ -> put (on, score)
>> playGame xs
Thanks a lot!
It really boils down to understanding that state is isomorphic to s -> (a, s). So any value "wrapped" in a monadic action is a result of applying a transformation to some state s (a stateful computation producing a).
Passing a state between two stateful computations
f :: a -> State s b
g :: b -> State s c
corresponds to composing them with >=>
f >=> g
or using >>=
\a -> f a >>= g
the result here is
a -> State s c
it is a stateful action that transforms some underlying state s in some way, it is allowed access to some a and it produces some c. So the entire transformation is allowed to depend on a and the value c is allowed to depend on some state s. This is exactly what you would want to express a stateful computation. The neat thing (and the sole purpose of expressing this machinery as a monad) is that you do not have to bother with passing the state around. But to understand how it is done, please refer to the definition of >>= on hackage), just ignore for a moment that it is a transformer rather than a final monad).
m >>= k = StateT $ \ s -> do
~(a, s') <- runStateT m s
runStateT (k a) s'
you can disregard the wrapping and unwrapping using StateT and runStateT, here m is in form s -> (a, s), k is of form a -> (s -> (b, s)), and you wish to produce a stateful transformation s -> (b, s). So the result is going to be a function of s, to produce b you can use k but you need a first, how do you produce a? you can take m and apply it to the state s, you get a modified state s' from the first monadic action m, and you pass that state into (k a) (which is of type s -> (b, s)). It is here that the state s has passed through m to become s' and be passed to k to become some final s''.
For you as a user of this mechanism, this remains hidden, and that is the neat thing about monads. If you want a state to evolve along some computation, you build your computation from small steps that you express as State-actions and you let do-notation or bind (>>=) to do the chaining/passing.
The sole difference between >>= and >> is that you either care or don't care about the non-state result.
a >> b
is in fact equivalent to
a >>= \_ -> b
so what ever value gets output by the action a, you throw it away (keeping only the modified state) and continue (pass the state along) with the other action b.
Regarding you examples
tick :: State Int Int
tick = get >>= \n ->
put (n+1) >>
return n
you can rewrite it in do-notation as
tick = do
n <- get
put (n + 1)
return n
while the first way of writing it makes it maybe more explicit what is passed how, the second way nicely shows how you do not have to care about it.
First get the current state and expose it (get :: s -> (s, s) in a simplified setting), the <- says that you do care about the value and you do not want to throw it away, the underlying state is also passed in the background without a change (that is how get works).
Then put :: s -> (s -> ((), s)), which is equivalent after dropping unnecessary parens to put :: s -> s -> ((), s), takes a value to replace the current state with (the first argument), and produces a stateful action whose result is the uninteresting value () which you drop (because you do not use <- or because you use >> instead of >>=). Due to put the underlying state has changed to n + 1 and as such it is passed on.
return does nothing to the underlying state, it only returns its argument.
To summarise, tick starts with some initial value s it updates it to s+1 internally and outputs s on the side.
The other example works exactly the same way, >> is only used there to throw away the () produced by put. But state gets passed around all the time.

What is the intuitive meaning of "join"?

What is the intuitive meaning of join for a Monad?
The monads-as-containers analogies make sense to me, and inside these analogies join makes sense. A value is double-wrapped and we unwrap one layer. But as we all know, a monad is not a container.
How might one write sensible, understandable code using join in normal circumstances, say when in IO?
An action :: IO (IO a) is a way of producing a way of producing an a. join action, then, is a way of producing an a by running the outermost producer of action, taking the producer it produced and then running that as well, to finally get to that juicy a.
join collapses consecutive layers of the type constructor.
A valid join must satisfy the property that, for any number of consecutive applications of the type constructor, it shouldn't matter the order in which we collapse the layers.
For example
ghci> let lolol = [[['a'],['b','c']],[['d'],['e']]]
ghci> lolol :: [[[Char]]]
ghci> lolol :: [] ([] ([] Char)) -- the type can also be expressed like this
ghci> join (fmap join lolol) -- collapse inner layers first
"abcde"
ghci> join (join lolol) -- collapse outer layers first
"abcde"
(We used fmap to "get inside" the outer monadic layer so that we could collapse the inner layers first.)
A small non container example where join is useful: for the function monad (->) a, join is equivalent to \f x -> f x x, a function of type (a -> a -> b) -> a -> b that applies two times the same argument to another function.
For the List monad, join is simply concat, and concatMap is join . fmap.
So join implicitly appears in any list expression which uses concat
or concatMap.
Suppose you were asked to find all of the numbers which are divisors of any
number in an input list. If you have a divisors function:
divisors :: Int -> [Int]
divisors n = [ d | d <- [1..n], mod n d == 0 ]
you might solve the problem like this:
foo xs = concat $ (map divisors xs)
Here we are thinking of solving the problem by first mapping the
divisors function over all of the input elements and then concatenating
all of the resulting lists. You might even think that this is a very
"functional" way of solving the problem.
Another approch would be to write a list comprehension:
bar xs = [ d | x <- xs, d <- divisors x ]
or using do-notation:
bar xs = do x <- xs
d <- divisors
return d
Here it might be said we're thinking a little more
imperatively - first draw a number from the list xs; then draw
a divisors from the divisors of the number and yield it.
It turns out, though, that foo and bar are exactly the same function.
Morever, these two approaches are exactly the same in any monad.
That is, for any monad, and appropriate monadic functions f and g:
do x <- f
y <- g x is the same as: (join . fmap g) f
return y
For instance, in the IO monad if we set f = getLine and g = readFile,
we have:
do x <- getLine
y <- readFile x is the same as: (join . fmap readFile) getLine
return y
The do-block is a more imperative way of expressing the action: first read a
line of input; then treat returned string as a file name, read the contents
of the file and finally return the result.
The equivalent join expression seems a little unnatural in the IO-monad.
However it shouldn't be as we are using it in exactly the same way as we
used concatMap in the first example.
Given an action that produces another action, run the action and then run the action that it produces.
If you imagine some kind of Parser x monad that parses an x, then Parser (Parser x) is a parser that does some parsing, and then returns another parser. So join would flatten this into a Parser x that just runs both actions and returns the final x.
Why would you even have a Parser (Parser x) in the first place? Basically, because fmap. If you have a parser, you can fmap a function that changes the result over it. But if you fmap a function that itself returns a parser, you end up with a Parser (Parser x), where you probably want to just run both actions. join implements "just run both actions".
I like the parsing example because a parser typically has a runParser function. And it's clear that a Parser Int is not an integer. It's something that can parse an integer, after you give it some input to parse from. I think a lot of people end up thinking of an IO Int as being just a normal integer but with this annoying IO bit that you can't get rid of. It isn't. It's an unexecuted I/O operation. There's no integer "inside" it; the integer doesn't exist until you actually perform the I/O.
I find these things easier to interpret by writing out the types and refactoring them a bit to reveal what the functions do.
Reader monad
The Reader type is defined thus, and its join function has the type shown:
newtype Reader r a = Reader { runReader :: r -> a }
join :: Reader r (Reader r a) -> Reader r a
Since this is a newtype, this means that the type Reader r a is isomorphic to r -> a. So we can refactor the type definition to give us this type that, albeit it's not the same, it's really "the same" with scare quotes:
In the (->) r monad, which is isomorphic to Reader r, join is the function:
join :: (r -> r -> a) -> r -> a
So the Reader join is the function that takes a two-place function (r -> r -> a) and applies to the same value at both its argument positions.
Writer monad
Since the Writer type has this definition:
newtype Writer w a = Writer { runWriter :: (a, w) }
...then when we remove the newtype, its join function has a type isomorphic to:
join :: Monoid w => ((a, w), w) -> (a, w)
The Monoid constraint needs to be there because the Monad instance for Writer requires it, and it lets us guess right away what the function does:
join ((a, w0), w1) = (a, w0 <> w1)
State monad
Similarly, since State has this definition:
newtype State s a = State { runState :: s -> (a, s) }
...then its join is like this:
join :: (s -> (s -> (a, s), s)) -> s -> (a, s)
...and you can also venture just writing it directly:
join f s0 = (a, s2)
where
(g, s1) = f s0
(a, s2) = g s1
{- Here's the "map" to the variable names in the function:
f g s2 s1 s0 s2
join :: (s -> (s -> (a, s ), s )) -> s -> (a, s )
-}
If you stare at this type a bit, you might think that it bears some resemblance to both the Reader and Writer's types for their join operations. And you'd be right! The Reader, Writer and State monads are all instances of a more general pattern called update monads.
List monad
join :: [[a]] -> [a]
As other people have pointed out, this is the type of the concat function.
Parsing monads
Here comes a really neat thing to realize. Very often, "fancy" monads turn out to be combinations or variants of "basic" ones like Reader, Writer, State or lists. So often what I do when confronted with a novel monad is ask: which of the basic monads does it resemble, and how?
Take for example parsing monads, which have been brought up in other answers here. A simplistic parser monad (with no support for important things like error reporting) looks like this:
newtype Parser a = Parser { runParser :: String -> [(a, String)] }
A Parser is a function that takes a string as input, and returns a list of candidate parses, where each candidate parse is a pair of:
A parse result of type a;
The leftovers (the suffix of the input string that was not consumed in that parse).
But notice that this type looks very much like the state monad:
newtype Parser a = Parser { runParser :: String -> [(a, String)] }
newtype State s a = State { runState :: s -> (a, s) }
And this is no accident! Parser monads are nondeterministic state monads, where the state is the unconsumed portion of the input string, and parse steps generate alternatives that may be later rejected in light of further input. List monads are often called "nondeterminism" monads, so it's no surprise that a parser resembles a mix of the state and list monads.
And this intuition can be systematized by using monad transfomers. The state monad transformer is defined like this:
newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }
Which means that the Parser type from above can be written like this as well:
type Parser a = StateT String [] a
...and its Monad instance follows mechanically from those of StateT and [].
The IO monad
Imagine we could enumerate all of the possible primitive IO actions, somewhat like this:
{-# LANGUAGE GADTs #-}
data Command a where
-- An action that writes a char to stdout
putChar :: Char -> Command ()
-- An action that reads a char from stdin
getChar :: Command Char
-- ...
Then we could think of the IO type as this (which I've adapted from the highly-recommended Operational monad tutorial):
data IO a where
-- An `IO` action that just returns a constant value.
Return :: a -> IO a
-- An action that binds the result of a `Command` to
-- a function that computes the next step after it.
Bind :: Command x -> (x -> IO a) -> IO a
instance Monad IO where ...
Then join action would then look like this:
join :: IO (IO a) -> IO a
-- If the action is just `Return`, then its payload already
-- is what we need to return.
join (Return ioa) = ioa
-- If the action is a `Bind`, then its "next step" function
-- `f` produces `IO (IO a)`, so we can just recursively stick
-- a `join` to its result end.
join (Bind cmd f) = Bind cmd (join . f)
So all that the join does here is "chase down" the IO action until it sees a result that fits the pattern Return (ma :: IO a), and strip out the outer Return.
So what did I do here? Just like for parser monads, I just defined (or rather copied) a toy model of the IO type that has the virtue of being transparent. Then I work out the behavior of join from the toy model.

Why does bind (>>=) exist? What are typical cases where a solution without bind is ugly?

This is a type declaration of a bind method:
(>>=) :: (Monad m) => m a -> (a -> m b) -> m b
I read this as follows: apply a function that returns a wrapped value, to a wrapped value.
This method was included to Prelude as part of Monad typeclass. That means there are a lot of cases where it's needed.
OK, but I don't understand why it's a typical solution of a typical case at all.
If you already created a function which returns a wrapped value, why that function doesn't already take a wrapped value?
In other words, what are typical cases where there are many functions which take a normal value, but return a wrapped value? (instead of taking a wrapped value and return a wrapped value)
The 'unwrapping' of values is exactly what you want to keep hidden when dealing with monads, since it is this that causes a lot of boilerplate.
For example, if you have a sequence of operations which return Maybe values that you want to combine, you have to manually propagate Nothing if you receive one:
nested :: a -> Maybe b
nested x = case f x of
Nothing -> Nothing
Just r ->
case g r of
Nothing -> Nothing
Just r' ->
case h r' of
Nothing -> Nothing
r'' -> i r''
This is what bind does for you:
Nothing >>= _ = Nothing
Just a >>= f = f a
so you can just write:
nested x = f x >>= g >>= h >>= i
Some monads don't allow you to manually unpack the values at all - the most common example is IO. The only way to get the value from an IO is to map or >>= and both of these require you to propagate IO in the output.
Everyone focuses on IO monad and inability to "unwrap".
But a Monad is not always a container, so you can't unwrap.
Reader r a == r->a such that (Reader r) is a Monad
to my mind is the simplest best example of a Monad that is not a container.
You can easily write a function that can produce m b given a: a->(r->b). But you can't easily "unwrap" the value from m a, because a is not wrapped in it. Monad is a type-level concept.
Also, notice that if you have m a->m b, you don't have a Monad. What Monad gives you, is a way to build a function m a->m b from a->m b (compare: Functor gives you a way to build a function m a->m b from a->b; ApplicativeFunctor gives you a way to build a function m a->m b from m (a->b))
If you already created a function which returns a wrapped value, why that function doesn't already take a wrapped value?
Because that function would have to unwrap its argument in order to do something with it.
But for many choices of m, you can only unwrap a value if you will eventually rewrap your own result. This idea of "unwrap, do something, then rewrap" is embodied in the (>>=) function which unwraps for you, let's you do something, and forces you to rewrap by the type a -> m b.
To understand why you cannot unwrap without eventually rewrapping, we can look at some examples:
If m a = Maybe a, unwrapping for Just x would be easy: just return x. But how can we unwrap Nothing? We cannot. But if we know that we will eventually rewrap, we can skip the "do something" step and return Nothing for the overall operation.
If m a = [a], unwrapping for [x] would be easy: just return x. But for unwrapping [], we need the same trick as for Maybe a. And what about unwrapping [x, y, z]? If we know that we will eventually rewrap, we can execute the "do something" three times, for x, y and z and concat the results into a single list.
If m a = IO a, no unwrapping is easy because we only know the result sometimes in the future, when we actually run the IO action. But if we know that we will eventually rewrap, we can store the "do something" inside the IO action and perform it later, when we execute the IO action.
I hope these examples make it clear that for many interesting choices of m, we can only implement unwrapping if we know that we are going to rewrap. The type of (>>=) allows precisely this assumption, so it is cleverly chosen to make things work.
While (>>=) can sometimes be useful when used directly, its main purpose is to implement the <- bind syntax in do notation. It has the type m a -> (a -> m b) -> m b mainly because, when used in a do notation block, the right hand side of the <- is of type m a, the left hand side "binds" an a to the given identifier and, when combined with remainder of the do block, is of type a -> m b, the resulting monadic action is of type m b, and this is the only type it possibly could have to make this work.
For example:
echo = do
input <- getLine
putStrLn input
The right hand side of the <- is of type IO String
The left hands side of the <- with the remainder of the do block are of type String -> IO (). Compare with the desugared version using >>=:
echo = getLine >>= (\input -> putStrLn input)
The left hand side of the >>= is of type IO String. The right hand side is of type String -> IO (). Now, by applying an eta reduction to the lambda we can instead get:
echo = getLine >>= putStrLn
which shows why >>= is sometimes used directly rather than as the "engine" that powers do notation along with >>.
I'd also like to provide what I think is an important correction to the concept of "unwrapping" a monadic value, which is that it doesn't happen. The Monad class does not provide a generic function of type Monad m => m a -> a. Some particular instances do but this is not a feature of monads in general. Monads, generally speaking, cannot be "unwrapped".
Remember that m >>= k = join (fmap k m) is a law that must be true for any monad. Any particular implementation of >>= must satisfy this law and so must be equivalent to this general implementation.
What this means is that what really happens is that the monadic "computation" a -> m b is "lifted" to become an m a -> m (m b) using fmap and then applied the m a, giving an m (m b); and then join :: m (m a) -> m a is used to squish the two ms together to yield a m b. So the a never gets "out" of the monad. The monad is never "unwrapped". This is an incorrect way to think about monads and I would strongly recommend that you not get in the habit.
I will focus on your point
If you already created a function which returns a wrapped value, why
that function doesn't already take a wrapped value?
and the IO monad. Suppose you had
getLine :: IO String
putStrLn :: IO String -> IO () -- "already takes a wrapped value"
how one could write a program which reads a line and print it twice? An attempt would be
let line = getLine
in putStrLn line >> putStrLn line
but equational reasoning dictates that this is equivalent to
putStrLn getLine >> putStrLn getLine
which reads two lines instead.
What we lack is a way to "unwrap" the getLine once, and use it twice. The same issue would apply to reading a line, printing "hello", and then printing a line:
let line = getLine in putStrLn "hello" >> putStrLn line
-- equivalent to
putStrLn "hello" >> putStrLn getLine
So, we also lack a way to specify "when to unwrap" the getLine. The bind >>= operator provides a way to do this.
A more advanced theoretical note
If you swap the arguments around the (>>=) bind operator becomes (=<<)
(=<<) :: (a -> m b) -> (m a -> m b)
which turns any function f taking an unwrapped value into a function g taking a wrapped
value. Such g is known as the Kleisli extension of f. The bind operator guarantees
such an extension always exists, and provides a convenient way to use it.
Because we like to be able to apply functions like a -> b to our m as. Lifting such a function to m a -> m b is trivial (liftM, liftA, >>= return ., fmap) but the opposite is not necessarily possible.
You want some typical examples? How about putStrLn :: String -> IO ()? It would make no sense for this function to have the type IO String -> IO () because the origin of the string doesn't matter.
Anyway: You might have the wrong idea because of your "wrapped value" metaphor; I use it myself quite often, but it has its limitations. There isn't necessarily a pure way to get an a out of an m a - for example, if you have a getLine :: IO String, there's not a great deal of interesting things you can do with it - you can put it in a list, chain it in a row and other neat things, but you can't get any useful information out of it because you can't look inside an IO action. What you can do is use >>= which gives you a way to use the result of the action.
Similar things apply to monads where the "wrapping" metaphor applies too; For example the point Maybe monad is to avoid manually wrapping and unwrapping values with and from Just all the time.
My two most common examples:
1) I have a series of functions that generate a list of lists, but I finally need a flat list:
f :: a -> [a]
fAppliedThrice :: [a] -> [a]
fAppliedThrice aList = concat (map f (concat (map f (concat (map f a)))))
fAppliedThrice' :: [a] -> [a]
fAppliedThrice' aList = aList >>= f >>= f >>= f
A practical example of using this was when my functions fetched attributes of a foreign key relationship. I could just chain them together to finally obtain a flat list of attributes. Eg: Product hasMany Review hasMany Tag type relationship, and I finally want a list of all the tag names for a product. (I added some template-haskell and got a very good generic attribute fetcher for my purposes).
2) Say you have a series of filter-like functions to apply to some data. And they return Maybe values.
case (val >>= filter >>= filter2 >>= filter3) of
Nothing -> putStrLn "Bad data"
Just x -> putStrLn "Good data"

Why discarded values are () instead of ⊥ in Haskell?

Howcome in Haskell, when there is a value that would be discarded, () is used instead of ⊥?
Examples (can't really think of anything other than IO actions at the moment):
mapM_ :: (Monad m) => (a -> m b) -> [a] -> m ()
foldM_ :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m ()
writeFile :: FilePath -> String -> IO ()
Under strict evaluation, this makes perfect sense, but in Haskell, it only makes the domain bigger.
Perhaps there are "unused parameter" functions d -> a which are strict on d (where d is an unconstrained type parameter and does not appear free in a)? Ex: seq, const' x y = yseqx.
I think this is because you need to specify the type of the value to be discarded. In Haskell-98, () is the obvious choice. And as long as you know the type is (), you may as well make the value () as well (presuming evaluation proceeds that far), just in case somebody tries to pattern-match on it or something. I think most programmers don't like introducing extra ⊥'s into code because it's just an extra trap to fall into. I certainly avoid it.
Instead of (), it is possible to create an uninhabited type (except by ⊥ of course).
{-# LANGUAGE EmptyDataDecls #-}
data Void
mapM_ :: (Monad m) => (a -> m b) -> [a] -> m Void
Now it's not even possible to pattern-match, because there's no Void constructor. I suspect the reason this isn't done more often is because it's not Haskell-98 compatible, as it requires the EmptyDataDecls extension.
Edit: you can't pattern-match on Void, but seq will ruin your day. Thanks to #sacundim for pointing this out.
Well, bottom type literally means an unterminating computation, and unit type is just what it is - a type inhabited with single value. Clearly, monadic computations usually meant to be finished, so it simply doesn't make sense to make them return undefined. And, of course, it is simply a safety measure - just like John L said, what if someone pattern matches on monadic result? So monadic computations return the 'lowest' possible (in Haskell 98) type - unit.
So, maybe we could have the following signatures:
mapM_ :: (Monad m) => (a -> m b) -> [a] -> m z
foldM_ :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m z
writeFile :: FilePath -> String -> IO z
We'd reimplement the functions in question so that any attempt to bind the z in m z or IO z would bind the variable to undefined or any other bottom.
What do we gain? Now people can write programs that force the undefined result of these computations. How is that a good thing? All it means is that people can now write programs that fail to terminate for no good reason, that were impossible to write before.
You're getting confused between types and values.
In writeFile :: FilePath -> String -> IO (), the () is the unit type. The value you get for x by doing x <- writeFile foo bar in a do block is (normally) the value (), which is the sole non-bottom inhabitant of the type ().
⊥ OTOH is a value. Since ⊥ is a member of every type, it's also usable as a value for the type (). If you're discarding that x above without using it (we normally don't even extract it into a variable), it may very well be ⊥ and you'd never know. In that sense you already have what you want; if you're ever writing a function whose result you expect to be always ignored, you could use ⊥. But since ⊥ is a value of every type, there is no type ⊥, and so there is no type IO ⊥.
But really, they represent different conceptual things. The type () is the type of values that contain zero information (which is why there is only one value; if there were two or more values then () values would contain at least as much information as values of Bool). IO () is the type of IO actions that generate a value with no information, but may effects that will happen as a result of generating that non-informative value.
⊥ is in some sense a non-value. 1 `div` 0 gives ⊥ because there is no value that could be used as the result of that expression which satisfies the laws of integer division. Throwing an exception gives ⊥ because functions that contain exception throws do not give you a value of their type. Non-termination gives ⊥ because the expression never terminates with a value. ⊥ is a way of treating all of these non-values as if they were a value for some purposes. As far as I can tell it's mainly useful because Haskell's laziness means that ⊥ and a data structure containing ⊥ (i.e. [⊥]) are distinguishable.
The value () is not like the cases where we use ⊥. writeFile foo bar doesn't have an "impossible value" like return $ 1 `div` 0, it just has no information in its value (other than that contained in the monadic structure). There are perfectly sensible things I could do with the () I get from doing x <- writeFile foo bar; they're just not very interesting and so nobody ever does them. This is distinctly different from x <- return $ 1 `div` 0, where doing anything with that value has to give me another ill-defined value.
I would like to point out one severe downside to writing one particular form of returning ⊥: if you write types like this, you get bad programs:
mapM_ :: (Monad m) => (a -> m b) -> [a] -> m z
This is way too polymorphic. As an example, consider forever :: Monad m => m a -> m b. I encountered this gotcha a long time ago and I'm still bitter:
main :: IO ()
main = forever putStrLn "This is going to get printed a lot!"
The error is obvious and simple: missing parentheses.
It typechecks. This is exactly the sort of error that the type system is supposed to catch easily.
It silently infinite loops at runtime (without printing anything). It is a pain to debug.
Why? Well, because r -> is a monad. So m b matches virtually anything. For example:
forever :: m a -> m b
forever putStrLn :: String -> b
forever putStrLn "hello!" :: b -- eep!
forever putStrLn "hello" readFile id flip (Nothing,[17,0]) :: t -- no type error.
This sort of thing inclines me to the view that forever should be typed m a -> m Void.
() is ⊤, i.e. the unit type, not the ⊥ (the bottom type). The big difference is that the unit type is inhabited, so that it has a value (() in Haskell), on the other hand, the bottom type is uninhabited, so that you can't write functions like that:
absurd : ⊥
absurd = -- no way
Of course you can do this in Haskell since the "bottom type" (there is no such thing, of course) is inhabited here with undefined. This makes Haskell inconsistent.
Functions like this:
disprove : a → ⊥
disprove x = -- ...
can be written, it is the same as
disprove : ¬ a
disprove x = -- ...
i.e. it disproving the type a, so that a is an absurd.
In any case, you can see how the unit type is used in different languages, as () :: () in Haskell, () : unit in ML, () : Unit in Scala and tt : ⊤ in Agda. In languages like Haskell and Agda (with the IO monad) functions like putStrLn should have a type String → IO ⊤, not the String → IO ⊥ since this is an absurd (logically it states that there is no strings that can be printed, this is just not right).
DISCLAIMER: previous text use Agda notation and it is more about Agda than Haskell.
In Haskell if we have
data Void
It doesn't mean that Void is uninhabited. It is inhabited with undefined, non-terminating programs, errors and exceptions. For example:
data Void
instance Show Void where
show _ = "Void"
data Identity a = Identity { runIdentity :: a }
mapM__ :: (a -> Identity b) -> [a] -> Identity Void
mapM__ _ _ = Identity undefined
then
print $ runIdentity $ mapM__ (const $ Identity 0) [1, 2, 3]
-- ^ will print "Void".
case runIdentity $ mapM__ (const $ Identity 0) [1, 2, 3] of _ -> print "1"
-- ^ will print "1".
let x = runIdentity $ mapM__ (const $ Identity 0) [1, 2, 3]
x `seq` print x
-- ^ will thrown an exception.
But it also doesn't mean that Void is ⊥. So
mapM_ :: Monad m => (a -> m b) -> [a] -> m Void
where Void is decalred as empty data type, is ok. But
mapM_ :: Monad m => (a -> m b) -> [a] -> m ⊥
is nonsence, but there is no such type as ⊥ in Haskell.

Haskell confusion with ContT, callCC, when

Continuing quest to make sense of ContT and friends. Please consider the (absurd but illustrative) code below:
v :: IO (Either String [String])
v = return $ Left "Error message"
doit :: IO (Either String ())
doit = (flip runContT return) $ callCC $ \k -> do
x <- liftIO $ v
x2 <- either (k . Left) return x
when True $ k (Left "Error message 2")
-- k (Left "Error message 3")
return $ Right () -- success
This code does not compile. However, if the replace the when with the commented k call below it, it compiles. What's going on?
Alternatively, if I comment out the x2 line, it also compiles. ???
Obviously, this is a distilled version of the original code and so all of the elements serve a purpose. Appreciate explanatory help on what's going on and how to fix it. Thanks.
The problem here has to do with the types of when and either, not anything particular to ContT:
when :: forall (m :: * -> *). (Monad m) => Bool -> m () -> m ()
either :: forall a c b. (a -> c) -> (b -> c) -> Either a b -> c
The second argument needs to be of type m () for some monad m. The when line of your code could thus be amended like so:
when True $ k (Left "Error message 2") >> return ()
to make the code compile. This is probably not what you want to do, but it gives us a hint as to what might be wrong: k's type has been inferred to be something unpalatable to when.
Now for the either signature: notice that the two arguments to either must be functions which produce results of the same type. The type of return here is determined by the type of x, which is in turn fixed by the explicit signature on v. Thus the (k . Left) bit must have the same type; this in turn fixes the type of k at (GHC-determined)
k :: Either String () -> ContT (Either String ()) IO [String]
This is incompatible with when's expectations.
When you comment out the x2 line, however, its effect on the type checker's view of the code is removed, so k is no longer forced into an inconvenient type and is free to assume the type
k :: Either [Char] () -> ContT (Either [Char] ()) IO ()
which is fine in when's book. Thus, the code compiles.
As a final note, I used GHCi's breakpoints facility to obtain the exact types of k under the two scenarios -- I'm nowhere near expert enough to write them out by hand and be in any way assured of their correctness. :-) Use :break ModuleName line-number column-number to try it out.

Resources