How to pattern match on a universally quantified free monad? - haskell

I'm wondering if I can write a function isPure :: Free f () -> Bool, which tells you if the given free monad equals Pure () or not. This is easy to do for a simple case, but I can't figure it out for a more complex case where there are constraints on the functor f.
import Control.Monad.Free
-- * This one compiles
isPure :: Free Maybe () -> Bool
isPure (Pure ()) = True
isPure _ = False
-- * This one fails to compile with "Ambiguous type variable ‘context0’ arising from a pattern
-- prevents the constraint ‘(Functor context0)’ from being solved."
{-# LANGUAGE RankNTypes #-}
type ComplexFree = forall context. (Functor context) => Free context ()
isPure' :: ComplexFree -> Bool
isPure' (Pure ()) = True
isPure' _ = False
I can see why specifying the exact type of context0 would be necessary in general, but all I want is to look at the coarse-grained structure of the free monad (i.e. is it Pure or not Pure). I don't want to pin down the type because my program relies on passing around some constrained universally quantified free monads and I want this to work with any of them. Is there some way to do this? Thanks!
EDITED to change "existentially quantified" -> "universally quantified"
EDIT: since my ComplexFree type might have been too general, here's a version that more exactly mimics what I'm trying to do.
--* This one actually triggers GHC's warning about impredicative polymorphism...
{-# LANGUAGE GADTs #-}
data MyFunctor context next where
  MyFunctor :: Int -> MyFunctor context next -- arguments not important
type RealisticFree context a = Free (MyFunctor context) a
class HasFoo context where
  getFoo :: context -> Foo
class HasBar context where
  getBar :: context -> Bar
type ConstrainedRealisticFree = forall context. (HasFoo context, HasBar context) => RealisticFree context ()
processRealisticFree :: ConstrainedRealisticFree -> IO ()
processRealisticFree crf = case isPure'' crf of
  True -> putStrLn "It's pure!"
  False -> putStrLn "Not pure!"
isPure'' :: ConstrainedRealisticFree -> Bool
isPure'' = undefined -- ???
(For some more context, this free monad is meant to model an interpreter for a simple language, where a "context" is present. You can think of the context as describing a reader monad that the language is evaluated within, so HasFoo context and HasBar context enforce that a Foo and Bar are available. I use universal quantification so that the exact type of the context can vary. My goal is to be able to identify an "empty program" in this free monad interpreter.)

First of all, this is not existential quantification. That would look like this:
data ComplexFree = forall context. (Functor context) => ComplexFree (Free context ())
(a syntax I find rather confusing, so I prefer the GADT form
data ComplexFree where
  ComplexFree :: (Functor context) => Free context () -> ComplexFree
, which means the same thing)
You have a universally quantified type here, that is, if you have a value of type ComplexFree (the way you have written it), it can turn into a free monad over any functor you choose. So you can just instantiate it at Identity, for example
isPure' :: ComplexFree -> Bool
isPure' m = case m :: Free Identity () of
  Pure () -> True
  _ -> False
It must be instantiated at some type in order to inspect it, and the error you see is because the compiler couldn't decide which functor to use by itself.
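For reference, here is a minimal self-contained sketch of the instantiation approach. It assumes a local Free type with the same shape as the one in Control.Monad.Free (so it runs without the free package), and emptyProgram is a made-up example value.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Identity (Identity (..))

-- Local stand-in for Control.Monad.Free's Free (assumption: same shape).
data Free f a = Pure a | Free (f (Free f a))

type ComplexFree = forall context. Functor context => Free context ()

-- The annotation picks Identity as the functor, which resolves the ambiguity.
isPure' :: ComplexFree -> Bool
isPure' m = case (m :: Free Identity ()) of
  Pure () -> True
  _       -> False

-- Hypothetical example value: usable at every functor.
emptyProgram :: ComplexFree
emptyProgram = Pure ()
```

Any other Functor would do in place of Identity; the annotation only has to commit to one.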
However, instantiating is not necessary for defining isPure'. Ignoring bottoms (see footnote 1), one of the functors you could instantiate ComplexFree with is Const Void, which means that the recursive case of Free reduces to
f (m a)
= Const Void (m a)
~= Void
that is, it is impossible. By some naturality arguments, we can show that which branch a ComplexFree takes cannot depend on the choice of functor, which means that any fully-defined ComplexFree must be a Pure one. So we can "simplify" to
isPure' :: ComplexFree -> Bool
isPure' _ = True
How boring.
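The Const Void argument can be sketched in code. This again assumes a local Free type shaped like the one in Control.Monad.Free; fromVoidFree and extract are names invented for the example.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Const (Const (..))
import Data.Void (Void, absurd)

-- Local stand-in for Control.Monad.Free's Free (assumption: same shape).
data Free f a = Pure a | Free (f (Free f a))

-- At f = Const Void the Free constructor would have to hold a Void,
-- so (ignoring bottoms) only the Pure case can occur.
fromVoidFree :: Free (Const Void) a -> a
fromVoidFree (Pure a)         = a
fromVoidFree (Free (Const v)) = absurd v

-- A value that works at every functor can be instantiated at Const Void,
-- which lets us pull the pure result out of it.
extract :: (forall f. Functor f => Free f a) -> a
extract m = fromVoidFree m
```

Since extract is total on fully defined inputs, every such input must have been a Pure, which is why isPure' degenerates to a constant.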
However, I suspect you may have made a mistake defining ComplexFree, and you really do want the existential version?
data ComplexFree where
  ComplexFree :: (Functor context) => Free context () -> ComplexFree
In this case, a ComplexFree "carries" the functor with it. It only works for one functor, and it (and only it) knows what functor that is. This problem is better-formed, and implemented just as you would expect
isPure' :: ComplexFree -> Bool
isPure' (ComplexFree (Pure _)) = True
isPure' _ = False
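A runnable sketch of the existential version, with two example values. The local Free type is a stand-in for the one in Control.Monad.Free, and pureExample and impureExample are made up for illustration.

```haskell
{-# LANGUAGE GADTs #-}

-- Local stand-in for Control.Monad.Free's Free (assumption: same shape).
data Free f a = Pure a | Free (f (Free f a))

-- The functor is chosen when the value is built and hidden afterwards.
data ComplexFree where
  ComplexFree :: Functor context => Free context () -> ComplexFree

isPure' :: ComplexFree -> Bool
isPure' (ComplexFree (Pure _)) = True
isPure' _                      = False

pureExample, impureExample :: ComplexFree
-- The annotation is needed: Pure () alone does not determine the functor.
pureExample   = ComplexFree (Pure () :: Free Maybe ())
impureExample = ComplexFree (Free (Just (Pure ())))
```

Note that the producer now has to commit to a functor, which is the opposite trade-off from the universally quantified type.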
1 Ignoring bottom is a common practice, and usually not problematic. This reduction strictly increases the information content of the program -- that is, whenever the original version gave a defined answer, the new version will give the same answer. But the new one might "fail to go into an infinite loop" and accidentally give an answer instead. In any case, this reduction can be modified to be completely correct, and the resulting isPure' is just as useless.

I'll answer your revamped question here. It turns out that the answer is still basically the same as luqui's: you need to instantiate the polymorphic argument before you can pattern match on it. Thanks to the constraint, you need to use a context type that's an instance of the relevant classes. If it would be inconvenient to use a "real" one, you can easily make a throw-away:
data Gump = Gump Foo Bar
instance HasFoo Gump where
  getFoo (Gump f _) = f
instance HasBar Gump where
  getBar (Gump _ b) = b
That should be fine for this particular case. However, in most similar practical situations, you'll want to instantiate to your real type to get its specialized behavior.
Now you can instantiate the context to Gump and you're good to go:
isPure'' :: ConstrainedRealisticFree -> Bool
isPure'' q = case q :: RealisticFree Gump () of
  Pure _ -> True
  Free _ -> False
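Putting the pieces together, a self-contained sketch might look like the following. Foo and Bar are not defined in the question, so the newtypes here are placeholders, and the local Free type stands in for the one from Control.Monad.Free.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Local stand-in for Control.Monad.Free's Free (assumption: same shape).
data Free f a = Pure a | Free (f (Free f a))

-- Placeholder types: the question does not say what Foo and Bar are.
newtype Foo = Foo Int
newtype Bar = Bar String

data MyFunctor context next = MyFunctor Int

instance Functor (MyFunctor context) where
  fmap _ (MyFunctor n) = MyFunctor n

class HasFoo context where getFoo :: context -> Foo
class HasBar context where getBar :: context -> Bar

type ConstrainedRealisticFree =
  forall context. (HasFoo context, HasBar context) => Free (MyFunctor context) ()

-- Throw-away context that satisfies both constraints.
data Gump = Gump Foo Bar
instance HasFoo Gump where getFoo (Gump f _) = f
instance HasBar Gump where getBar (Gump _ b) = b

-- q is a syntactic parameter, and the annotation instantiates context to Gump.
isPure'' :: ConstrainedRealisticFree -> Bool
isPure'' q = case (q :: Free (MyFunctor Gump) ()) of
  Pure _ -> True
  Free _ -> False
```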
The reason you got that error about impredicative types is that you wrote
isPure'' = ...
Higher-rank polymorphic parameters are generally required to be syntactically parameters:
isPure'' q = ...

Related

What are freer monads?

I heard this term a few times, but I still don't know what exactly is a so-called "Freer Monad". The name makes me think about Free Monads, but I don't see how they are actually related. There is some library I found on hackage: http://hackage.haskell.org/package/freer, but the example out there didn't help me a lot.
I don't understand the idea at all, and therefore I don't see any good usecases for them. I also wonder what advantages they provide over free monads and classic mtl stacks.
I know this is an old thread, but I thought I'd answer it anyway, just in case.
what [...] is a so-called "Freer Monad"
According to the original paper, Freer Monads, More Extensible Effects, a "Freer Monad" is essentially a Free Monad without the Functor constraint that a Free Monad requires.
A free monad is basically the essence of the monadic structure; the "smallest" thing that is still a monad. A very nice practical explanation can be found in this article. This article also shows that the "normal" free monad needs a Functor constraint.
However, it is often quite tedious to add the Functor constraint to every function (and sometimes awkward to implement the instance). As it turns out, the constraint can be dropped by moving the functor functionality into an argument of the Impure constructor: the constructor stores the continuation explicitly, so the implementing side can alter the type of the output itself without a general fmap. This is done using GADTs (example from the Freer Monads paper):
data Free f a = Pure a
              | Impure (f (Free f a))
instance Functor f => Monad (Free f) where
becomes
data FFree f a where
  Pure :: a -> FFree f a
  Impure :: f x -> (x -> FFree f a) -> FFree f a
instance Monad (FFree f) where
  [...]
  Impure fx k' >>= k = Impure fx (k' >>> k)
This basically lets the later implementation choose how to perform the fmap operation fixed [pun not intended] to the appropriate "output/wrapped type".
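To see that no Functor constraint is needed, here is a compilable sketch of FFree with all three instances. It uses >=> from Control.Monad for Kleisli composition, which is the role >>> plays in the paper's snippet. The Log effect, say and runLog are invented for this example and are not part of any library.

```haskell
{-# LANGUAGE GADTs #-}
import Control.Monad ((>=>))

data FFree f a where
  Pure   :: a -> FFree f a
  Impure :: f x -> (x -> FFree f a) -> FFree f a

-- None of the instances below requires Functor f.
instance Functor (FFree f) where
  fmap g (Pure a)      = Pure (g a)
  fmap g (Impure fx k) = Impure fx (fmap g . k)

instance Applicative (FFree f) where
  pure = Pure
  Pure g      <*> m = fmap g m
  Impure fx k <*> m = Impure fx (\x -> k x <*> m)

instance Monad (FFree f) where
  Pure a       >>= k = k a
  Impure fx k' >>= k = Impure fx (k' >=> k)

-- Hypothetical one-operation effect; note that Log is not a Functor.
data Log x where
  Say :: String -> Log ()

say :: String -> FFree Log ()
say s = Impure (Say s) Pure

-- A pure interpreter that collects the messages in order.
runLog :: FFree Log a -> ([String], a)
runLog (Pure a)           = ([], a)
runLog (Impure (Say s) k) = case runLog (k ()) of
  (out, a) -> (s : out, a)
```

The interpreter, not the effect type, decides what to feed to the continuation k, which is exactly the job fmap does in the ordinary free monad.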
So the fundamental difference is essentially usability and generality.
As there was some confusion: FFree is the Freer monad and corresponds to Eff in the package freer-simple.
good usecases for them
Freer monads, just like free monads, lend themselves to constructing DSLs.
Consider for example a type
data Lang r where
  LReturn :: Var -> Lang Int
  LPrint :: IntExpr -> Lang ()
  LAssign :: Var -> IntExpr -> Lang ()
  LRead :: Var -> Lang Int
This tells me that there are a couple of operations that can be performed in Lang: return x, print x, assign x y, and read y.
We use GADTs here so that we can also specify what output the individual actions are going to have. This comes in quite handy if we write functions in our DSL, because their output can be typechecked.
Adding some convenience functions (that can actually be derived):
lReturn :: Member Lang effs
        => Var -> Eff effs Int
lReturn = send . LReturn
lPrint :: Member Lang effs
       => IntExpr -> Eff effs ()
lPrint = send . LPrint
lAssign :: Member Lang effs
        => Var -> IntExpr -> Eff effs ()
lAssign v i = send $ LAssign v i
lRead :: Member Lang effs
      => Var -> Eff effs Int
lRead = send . LRead
(This is already written using freer.)
Now we can use them like this (assuming that IntExpr contains variables and Ints):
someFunctionPrintingAnInt = do
  lAssign (Var "a") (IE_Int 12)
  lPrint (IE_Var $ Var "a")
These functions now enable you to have a DSL that can be interpreted in different ways. All that is needed for this is an interpreter with a specific type for effs (which is roughly a type-level list of freer monad "instances").
So freer takes the idea of freer monads and packs it into an effect system.
This interpreter could look something like this (a rough sketch rather than compilable code; helpers such as evalM are not shown):
runLangPure :: Eff '[Lang] Int -> Either () Int -- [StateMap]
runLangPure program = fst . fst $
  run (runWriter (runState empty (runError (reinterpret3 go program))))
  where
    go :: Lang v -> Eff '[Error (), State StateMap, Writer [String]] v
    go (LReturn var) = get >>= go (Eval stmt) >>= tell . []
    go (LPrint expr) = do
      store <- get
      value <- evalM expr
      tell [show value]
    go (LAssign var expr) = do
      value <- evalM expr
      --modify state (change var)
    go (LRead var) = do
      strValue <- getLine
      get >>= insert var (stringToInt strValue)
The run... part specifies the initial "state" of the monads. The go part is the interpreter itself, interpreting the different possible actions.
Note that one can use the functions get and tell in the same do block even though they are part of different monads, which brings us to
I also wonder what advantages do they provide over free monads and classic mtl stacks.
The implementation allows you to use monadic actions from different parts of the "monad stack" without lifting.
About the implementation:
To understand this, we look at it from a high level of abstraction:
The auxiliary functions of our DSL send their operation into Eff effs, where it is required that Member Lang effs holds.
So the Member constraint is just a way of declaring that Lang is in the type-level list effs (basically a type-level elem).
The Eff monad has the functionality to "ask" the members of the type-level list of monads whether they can handle the current value (remember, the operations are just values that are interpreted subsequently). If so, their interpretation is executed; if not, the question is handed off to the next monad in the list.
This becomes more intuitive and understandable when spending some time in the freer-simple code base.

Does Higher order polymorphism require strict order of arguments?

Reading LYAH, I stumbled upon this piece of code:
newtype Writer w a = Writer { runWriter :: (a, w) }
instance (Monoid w) => Monad (Writer w) where
  return x = Writer (x, mempty)
  (Writer (x,v)) >>= f = let (Writer (y, v')) = f x in Writer (y, v `mappend` v')
While trying to understand what the heck Writer w is in the first line, I discovered that it is not a full type, but a sort of type constructor with 1 argument, like Maybe for Maybe String.
Looks great, but what if the type is defined with its type arguments swapped, like this:
newtype Writer' a w = Writer' { runWriter :: (a, w) }
Is it possible to implement a Monad instance now? Something like this, but something that would actually compile:
instance (Monoid w) => Monad (\* -> Writer' * monoid) where
The idea of \* -> Writer' * monoid is the same as Writer w :
A type constructor with one type argument missing -- this time first one.
This is not possible in Haskell. What you'd need is a type-level lambda function, which does not exist.
There are type synonyms which you can use to define reorderings of type variables:
type Writer'' a w = Writer' a w
but you can not give class instances for partially applied type synonyms (even with the TypeSynonymInstances extension).
I wrote my MSc thesis about the subject of how type-level lambdas can be added to GHC: https://xnyhps.nl/~thijs/share/paper.pdf to be used in type-class instances without sacrificing type inference.
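A common workaround, sketched here on the assumption that an extra wrapper is acceptable, is a newtype that swaps the last two type arguments. Flip and unFlip are names chosen for this example, and the record field is called runWriter' here only to keep the snippet self-contained.

```haskell
{-# LANGUAGE FlexibleInstances #-}

newtype Writer' a w = Writer' { runWriter' :: (a, w) }

-- Flip f w a wraps f a w, so the value type a becomes the last parameter.
newtype Flip f w a = Flip { unFlip :: f a w }

instance Functor (Flip Writer' w) where
  fmap g (Flip (Writer' (a, w))) = Flip (Writer' (g a, w))

instance Monoid w => Applicative (Flip Writer' w) where
  pure a = Flip (Writer' (a, mempty))
  Flip (Writer' (g, w1)) <*> Flip (Writer' (a, w2)) =
    Flip (Writer' (g a, w1 <> w2))

instance Monoid w => Monad (Flip Writer' w) where
  Flip (Writer' (a, w1)) >>= k =
    let Flip (Writer' (b, w2)) = k a
    in Flip (Writer' (b, w1 <> w2))
```

The price is the wrapping and unwrapping at every use site, which is the usual argument for simply declaring the parameters in the convenient order.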
What you are seeing here is a parochial design choice of Haskell. It makes perfect sense, conceptually speaking, to say that your Writer' type is a functor if you "leave out" its first parameter. And a programming language syntax could be invented to allow such declarations.
The Haskell community hasn't done so, because what they have is relatively simple and it works well enough. This isn't to say that alternative designs aren't possible, but to be adopted such a design would have to:
Be no more complex to use in practice than what we already have;
Offer functionality or advantage that would be worth the switch.
This generalizes to many other ways that the Haskell community uses types; often the choice to represent something as a type distinction is tied to some artifact of the language's design. Many monad transformers are good examples, like MaybeT:
newtype MaybeT m a = MaybeT { runMaybeT :: m (Maybe a) }
instance Functor m => Functor (MaybeT m) where ...
instance Applicative m => Applicative (MaybeT m) where ...
instance Monad m => Monad (MaybeT m) where ...
instance MonadTrans MaybeT where ...
Since it's a newtype, this means that MaybeT IO String is isomorphic to IO (Maybe String); you can think of the two types as being two "perspectives" on the same set of values:
IO (Maybe String) is an IO action that produces values of type Maybe String;
MaybeT IO String is a MaybeT IO action that produces values of type String.
The difference between the perspectives is that they imply different implementations of the Monad operations. In Haskell then this is also tied to the following parochial technical facts:
In one String is the last type parameter (the "values") and in the other Maybe String is;
IO and MaybeT IO have different instances for the Monad class.
But maybe there is a language design where you could say that the type IO (Maybe a) can have a monad specific to it, and distinct from the monad for the more general IO a type. That language would incur some complexity to make that distinction consistently (e.g., rules to determine which Monad instance to use by default for IO (Maybe String) and rules to allow the programmer to override the default choice). And I'd wager modestly that the end result would be no less complex than what we do have. TL;DR: Meh.

STRef and phantom types

Does s in STRef s a get instantiated with a concrete type? One could easily imagine some code where STRef is used in a context where the a takes on Int. But there doesn't seem to be anything for the type inference to give s a concrete type.
Imagine something in pseudo Java like MyList<S, A>. Even if S never appeared in the implementation of MyList instantiating a concrete type like MyList<S, Integer> where a concrete type is not used in place of S would not make sense. So how can STRef s a work?
tl;dr - in practice it seems it always gets initialised to RealWorld in the end
The source notes that s can be instantiated to RealWorld inside invocations of stToIO, but is otherwise uninstantiated:
-- The s parameter is either
-- an uninstantiated type variable (inside invocations of 'runST'), or
-- 'RealWorld' (inside invocations of 'Control.Monad.ST.stToIO').
Looking at the actual code for ST however it seems runST uses a specific value realWorld#:
newtype ST s a = ST (STRep s a)
type STRep s a = State# s -> (# State# s, a #)
runST :: (forall s. ST s a) -> a
runST st = runSTRep (case st of { ST st_rep -> st_rep })
runSTRep :: (forall s. STRep s a) -> a
runSTRep st_rep = case st_rep realWorld# of
  (# _, r #) -> r
realWorld# is defined as a magic primitive inside the GHC source code:
realWorldName = mkWiredInIdName gHC_PRIM (fsLit "realWorld#")
realWorldPrimIdKey realWorldPrimId
realWorldPrimId :: Id -- :: State# RealWorld
realWorldPrimId = pcMiscPrelId realWorldName realWorldStatePrimTy
(noCafIdInfo `setUnfoldingInfo` evaldUnfolding
`setOneShotInfo` stateHackOneShot)
You can also confirm this in ghci:
Prelude> :set -XMagicHash
Prelude> :m +GHC.Prim
Prelude GHC.Prim> :t realWorld#
realWorld# :: State# RealWorld
From your question I can not see if you understand why the phantom s type is there at all. Even if you did not ask for this explicitly, let me elaborate on that.
The role of the phantom type
The main use of the phantom type is to constrain references (aka pointers) to stay "inside" the ST monad. Roughly, the dynamically allocated data must end its life when runST returns.
To see the issue, let's pretend that the type of runST were
runST :: ST s a -> a
Then, consider this:
data Dummy
let var :: STRef Dummy Int
    var = runST (newSTRef 0)
    change :: () -> ()
    change () = runST (modifySTRef var succ)
    access :: () -> Int
    access () = runST (readSTRef var)
    result :: (Int, ())
    result = (access (), change ())
in result
(Above I added a few useless () arguments to make it similar to imperative code)
Now what should be the result of the code above? It could be either (0,()) or (1,()) depending on the evaluation order. This is a big no-no in the pure Haskell world.
The issue here is that var is a reference which "escaped" from its runST. When you escape the ST monad, you are no longer forced to use the monad operator >>= (or equivalently, the do notation) to sequence the order of side effects. If references are still around, then we can still have side effects around when there should be none.
To avoid the issue, we restrict runST to work on ST s a where a does not depend on s. Why this? Because newSTRef returns a STRef s a, a reference to a tagged with the phantom type s, hence the return type depends on s and can not be extracted from the ST monad through runST.
Technically, this restriction is done by using a rank-2 type:
runST :: (forall s. ST s a) -> a
the "forall" here is used to implement the restriction. The type is saying: choose any a you wish, then provide a value of type ST s a for any s I wish, then I will return an a. Mind that s is chosen by runST, not by the caller, so it could be absolutely anything. So, the type system will accept an application runST action only if action :: forall s. ST s a where s is unconstrained, and a does not involve s (recall that the caller has to choose a before runST chooses s).
It is indeed a slightly hackish trick to implement the independence constraint, but it does work.
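A small sketch of how this plays out in practice, using only Control.Monad.ST and Data.STRef from base. The name sumST is illustrative, and the rejected definition is left in a comment so that the snippet still compiles.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Accepted: the result type Int does not mention s, so runST can pick any s.
sumST :: [Int] -> Int
sumST xs = runST (do
  ref <- newSTRef 0
  mapM_ (\x -> modifySTRef' ref (+ x)) xs
  readSTRef ref)

-- Rejected by the type checker: the result would have type STRef s Int,
-- which mentions s, so the reference cannot escape.
-- leak = runST (newSTRef (0 :: Int))
```

The mutation inside sumST is invisible from outside, so callers see an ordinary pure function.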
On the actual question
To connect this to your actual question: in the implementation of runST, s can be chosen to be any concrete type. Note that, even if s were simply chosen to be Int inside runST it would not matter much, because the type system has already constrained a to be independent from s, hence to be reference-free. As @Ganesh pointed out, RealWorld is the type used by GHC.
You also mentioned Java. One could attempt to play a similar trick in Java as follows: (warning, overly simplified code follows)
interface ST<S,A> { A call(); }
interface STAction<A> { <S> ST<S,A> call(S dummy); }
...
<A> A runST(STAction<A> action) {
    RealWorld dummy = new RealWorld();
    return action.call(dummy).call();
}
Above in STAction parameter A can not depend on S.

Class constraints for monads and monad functions

I am trying to write a new monad that only can contain a Num. When it fails, it returns 0 much like the Maybe monad returns Nothing when it fails.
Here is what I have so far:
data (Num a) => IDnum a = IDnum a
instance Monad IDnum where
  return x = IDnum x
  IDnum x >>= f = f x
  fail :: (Num a) => String -> IDnum a
  fail _ = return 0
Haskell is complaining that there is
No instance for (Num a) arising from a use of `IDnum'
It suggests that I add (Num a) to the context of the type signature for each of my monad functions, but I tried that and then it complains that they need to work "forall" a.
Ex:
Method signature does not match class; it should be
return :: forall a. a -> IDnum a
In the instance declaration for `Monad IDnum'
Does anyone know how to fix this?
The existing Monad typeclass expects your type to work for every possible type argument. Consider Maybe: in Maybe a, a is not constrained at all. Basically you can't have a Monad with constraints.
This is a fundamental limitation of how the Monad class is defined—I don't know of any way to get around it without modifying that.
This is also a problem for defining Monad instances for other common types, like Set.
In practice, this restriction is actually pretty important. Consider that (normally) functions are not instances of Num. This means that we could not use your monad to contain a function! This really limits important operations like ap (<*> from Applicative), since that depends on a monad containing a function:
ap :: Monad m => m (a -> b) -> m a -> m b
Your monad would not support many common uses and idioms we've grown to expect from normal monads! This would rather limit its utility.
Also, as a side-note, you should generally avoid using fail. It doesn't really fit in with the Monad typeclass: it's more of a historic accident. Most people agree that you should avoid it in general: it was just a hack to deal with failed pattern matches in do-notation.
That said, looking at how to define a restricted monad class is a great exercise for understanding a few Haskell extensions and learning some intermediate/advanced Haskell.
Alternatives
With the downsides in mind, here are a couple of alternatives—replacements for the standard Monad class that do support restricted monads.
Constraint Kinds
I can think of a couple of possible alternatives. The most modern one would be taking advantage of the ConstraintKinds extension in GHC, which lets you reify typeclass constraints as kinds. This blog post details how to implement a restricted monad using constraint kinds.
The basic idea is simple: with ConstraintKinds, we can turn our constraint (Num a) into a type. We can then have a new Monad class which contains this type as a member (just like return and fail are members) and allows us to overload the constraint with Num a. This is what the code looks like:
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE TypeFamilies #-}
module Main where
import Prelude hiding (Monad (..))
import GHC.Exts
class Monad m where
  type Restriction m a :: Constraint
  type Restriction m a = ()
  return :: Restriction m a => a -> m a
  (>>=) :: Restriction m a => m a -> (a -> m b) -> m b
  fail :: Restriction m a => String -> m a
data IDnum a = IDnum a
instance Monad IDnum where
  type Restriction IDnum a = Num a
  return = IDnum
  IDnum x >>= f = f x
  fail _ = return 0
RMonad
There is an existing library on hackage called rmonad (for "restricted monad") which provides a more general typeclass. You could probably use this to write your desired monad instance. (I haven't used it myself, so it's a bit hard to say.)
It doesn't use the ConstraintKinds extension and (I believe) supports older versions of GHC. However, I think it's a bit ugly; I'm not sure that it's the best option any more.
Here's the code I came up with:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeFamilies #-}
import Prelude hiding (Monad (..))
import Control.RMonad
import Data.Suitable
data IDnum a = IDnum a
data instance Constraints IDnum a = Num a => IDnumConstraints
instance Num a => Suitable IDnum a where
  constraints = IDnumConstraints
instance RMonad IDnum where
  return = IDnum
  IDnum x >>= f = f x
  fail _ = withResConstraints $ \ IDnumConstraints -> return 0
Further Reading
For more details, take a look at this SO question.
Oleg has an article about this pertaining specifically to the Set monad, which might be interesting: "How to restrict a monad without breaking it".
Finally, there are a couple of papers you could also read:
The Constrained-Monad Problem
Generic Monadic Constructs for Embedded Languages
This answer will be brief, but here's another alternative to go along with Tikhon's. You can apply a codensity transformation to your type to basically get a free monad for it. Just use it (in the below code it's IDnumM) instead of your base type, then convert the final value to your base type at the end (in the below code, you would use runIDnumM). You can also inject your base type into the transformed type (in the below code, that would be toIDnumM).
A benefit of this approach is that it works with the standard Monad class.
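{-# LANGUAGE DatatypeContexts, RankNTypes #-}
data Num a => IDnum a = IDnum a
newtype IDnumM a = IDnumM { unIDnumM :: forall r. (a -> IDnum r) -> IDnum r }
runIDnumM :: Num a => IDnumM a -> IDnum a
runIDnumM (IDnumM n) = n IDnum
toIDnumM :: Num a => IDnum a -> IDnumM a
toIDnumM (IDnum x) = IDnumM $ \k -> k x
-- Functor and Applicative are superclasses of Monad in current GHC
instance Functor IDnumM where
  fmap f (IDnumM m) = IDnumM (\k -> m (k . f))
instance Applicative IDnumM where
  pure x = IDnumM (\k -> k x)
  IDnumM mf <*> IDnumM mx = IDnumM (\k -> mf (\f -> mx (k . f)))
instance Monad IDnumM where
  return x = IDnumM $ \k -> k x
  IDnumM m >>= f = IDnumM $ \k -> m $ \x -> f x `unIDnumM` k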
There is an easier way to do this. One can use multiple functions. First, write one in the Maybe monad. The Maybe monad returns Nothing upon failure. Second, write a function that returns the Just value if not Nothing or some safe value if Nothing. Third, write a function that composes those two functions.
This produces the desired result while being much easier to write and understand.
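The three steps above can be sketched as follows. The function names and the division example are made up for illustration; only fromMaybe comes from the standard library.

```haskell
import Data.Maybe (fromMaybe)

-- Step 1: the computation lives in the ordinary Maybe monad.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

compute :: Int -> Int -> Int -> Maybe Int
compute a b c = do
  q <- safeDiv a b
  safeDiv q c

-- Step 2: collapse a failure to the safe value 0.
orZero :: Num a => Maybe a -> a
orZero = fromMaybe 0

-- Step 3: compose the two.
computeOrZero :: Int -> Int -> Int -> Int
computeOrZero a b c = orZero (compute a b c)
```

The Num constraint ends up on orZero only, where it is actually needed, rather than on the monad itself.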

Signature of IO in Haskell (is this class or data?)

The question is not what IO does, but how is it defined, its signature. Specifically, is this data or class, is "a" its type parameter then? I didn't find it anywhere. Also, I don't understand the syntactic meaning of this:
f :: IO a
You asked whether IO a is a data type: it is. And you asked whether the a is its type parameter: it is. You said you couldn't find its definition. Let me show you how to find it:
localhost:~ gareth.rowlands$ ghci
GHCi, version 7.6.3: http://www.haskell.org/ghc/ :? for help
Prelude> :i IO
newtype IO a
= GHC.Types.IO (GHC.Prim.State# GHC.Prim.RealWorld
-> (# GHC.Prim.State# GHC.Prim.RealWorld, a #))
-- Defined in `GHC.Types'
instance Monad IO -- Defined in `GHC.Base'
instance Functor IO -- Defined in `GHC.Base'
Prelude>
In ghci, :i or :info tells you about a type. It shows the type declaration and where it's defined. You can see that IO is a Monad and a Functor too.
This technique is more useful on normal Haskell types - as others have noted, IO is magic in Haskell. In a typical Haskell type, the type signature is very revealing but the important thing to know about IO is not its type declaration, rather that IO actions actually perform IO. They do this in a pretty conventional way, typically by calling the underlying C or OS routine. For example, Haskell's putChar action might call C's putchar function.
IO is a polymorphic type (which happens to be an instance of Monad, irrelevant here).
Consider the humble list. If we were to write our own list of Ints, we might do this:
data IntList = Nil | Cons { listHead :: Int, listRest :: IntList }
If you then abstract over what element type it is, you get this:
data List a = Nil | Cons { listHead :: a, listRest :: List a }
As you can see, the return value of listRest is List a. List is a polymorphic type of kind * -> *, which is to say that it takes one type argument to create a concrete type.
In a similar way, IO is a polymorphic type with kind * -> *, which again means it takes one type argument. If you were to define it yourself, it might look like this:
data IO a = IO (RealWorld -> (a, RealWorld))
(definition courtesy of this answer)
The amount of magic in IO is grossly overestimated: it has some support from compiler and runtime system, but much less than newbies usually expect.
Here is the source file where it is defined:
http://www.haskell.org/ghc/docs/latest/html/libraries/ghc-prim-0.3.0.0/src/GHC-Types.html
newtype IO a
= IO (State# RealWorld -> (# State# RealWorld, a #))
It is just an optimized version of the state monad. If we remove the optimization annotations we will see:
data IO a = IO (RealWorld -> (RealWorld, a))
So basically IO a is a data structure storing a function that takes the old real world and returns the new real world, with the IO operation performed, together with a value of type a.
Some compiler tricks are necessary, mostly to remove the RealWorld dummy value efficiently.
The IO type is an abstract newtype: its constructors are not exported, so you cannot bypass library functions, work with it directly and perform nasty things such as duplicating RealWorld, creating RealWorld out of nothing or escaping the monad (writing a function of type IO a -> a).
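To make the state-passing picture concrete, here is a toy model. It is not how GHC implements IO: World is a made-up stand-in that merely counts actions, whereas real IO threads the State# RealWorld token through unboxed tuples. ToyIO and tick are likewise names invented for the sketch.

```haskell
-- Made-up stand-in for the real world: just a counter.
newtype World = World Int deriving (Eq, Show)

newtype ToyIO a = ToyIO { runToyIO :: World -> (World, a) }

instance Functor ToyIO where
  fmap f (ToyIO g) = ToyIO (\w -> let (w', a) = g w in (w', f a))

instance Applicative ToyIO where
  pure a = ToyIO (\w -> (w, a))
  ToyIO f <*> ToyIO g = ToyIO (\w ->
    let (w1, h) = f w
        (w2, a) = g w1
    in (w2, h a))

instance Monad ToyIO where
  -- The world produced by the first action is fed into the second,
  -- which is what fixes the order of effects.
  ToyIO g >>= k = ToyIO (\w -> let (w', a) = g w in runToyIO (k a) w')

-- An "effect": return the current counter and advance the world.
tick :: ToyIO Int
tick = ToyIO (\(World n) -> (World (n + 1), n))
```

Here runToyIO is exposed, so the "monad" can be escaped; hiding the constructor, as the real IO type does, is what prevents that.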
Since IO can be applied to objects of any type a, as it is a polymorphic monad, a is not specified.
If you have some object with type a, then it can be 'wrappered' as an object of type IO a, which you can think of as being an action that gives an object of type a. For example, getChar is of type IO Char, and so when it is called, it has the side effect of (From the program's perspective) generating a character, which comes from stdin.
As another example, putChar has type Char -> IO (), meaning that it takes a char, and then performs some action that gives no output (in the context of the program, though it will print the char given to stdout).
Edit: More explanation of monads:
A monad can be thought of as a 'wrapper type' M, and has two associated functions:
return and >>=.
Given a type a, it is possible to create objects of type M a (IO a in the case of the IO monad), using the return function.
return, therefore, has type a -> M a. Moreover, return attempts not to change the element that it is passed -- if you call return x, you will get a wrappered version of x that contains all of the information of x (Theoretically, at least. This doesn't happen with, for example, the empty monad.)
For example, return 'x' will yield an M Char. This is how getChar works -- it yields an IO Char using a return statement, which is then pulled out of its wrapper with <-.
>>=, read as 'bind', is more complicated. It has type M a -> (a -> M b) -> M b, and its role is to take a 'wrappered' object, and a function from the underlying type of that object to another 'wrappered' object, and apply that function to the underlying variable in the first input.
For example, (return 5) >>= (return . (+ 3)) will yield an M Int, which will be the same M Int that would be given by return 8. In this way, any function that can be applied outside of a monad can also be applied inside of it.
To do this, one could take an arbitrary function f :: a -> b, and give the new function g :: M a -> M b as follows:
g x = x >>= (return . f)
Now, for something to be a monad, these operations must also have certain relations -- their definitions as above aren't quite enough.
First: (return x) >>= f must be equivalent to f x. That is, it must be equivalent to perform an operation on x whether it is 'wrapped' in the monad or not.
Second: x >>= return must be equivalent to x. That is, if an object is unwrapped by bind, and then rewrapped by return, it must return to its same state, unchanged.
Third, and finally (x >>= f) >>= g must be equivalent to x >>= (\y -> (f y >>= g) ). That is, function binding is associative (sort of). More accurately, if two functions are bound successively, this must be equivalent to binding the combination thereof.
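The three laws can be spot-checked for a concrete monad such as Maybe. This is only a sanity check on sample values, not a proof, and f and g are arbitrary functions picked for the example.

```haskell
f :: Int -> Maybe Int
f x = if x > 0 then Just (x * 2) else Nothing

g :: Int -> Maybe Int
g x = Just (x + 1)

-- First law: return x >>= f is the same as f x.
leftIdentity :: Int -> Bool
leftIdentity x = (return x >>= f) == f x

-- Second law: m >>= return is the same as m.
rightIdentity :: Maybe Int -> Bool
rightIdentity m = (m >>= return) == m

-- Third law: the grouping of binds does not matter.
associativity :: Maybe Int -> Bool
associativity m = ((m >>= f) >>= g) == (m >>= (\y -> f y >>= g))
```

Each function returns True for any input when the laws hold, which they do for Maybe.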
Now, while this is how monads work, it's not how they're most commonly used, because of the syntactic sugar of do and <-.
Essentially, do begins a long chain of binds, and each <- sort of creates a lambda function that gets bound.
For example,
a = do x <- something
       y <- function x
       return y
is equivalent to
a = something >>= (\x -> (function x) >>= (\y -> return y))
In both cases, something is bound to x, function x is bound to y, and then y is returned to a in the wrapper of the relevant monad.
Sorry for the wall of text, and I hope it explains something. If there's more you need cleared up about this, or something in this explanation is confusing, just ask.
This is a very good question, if you ask me. I remember being very confused about this too, maybe this will help...
'IO' is a type constructor, 'IO a' is a type, the 'a' (in 'IO a') is a type variable. The letter 'a' carries no significance, the letter 'b' or 't1' could have been used just as well.
If you look at the definition of the IO type constructor you will see that it is a newtype defined as: GHC.Types.IO (GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #))
'f :: IO a' is the type of a function called 'f' of apparently no arguments that returns a result of some unconstrained type in the IO monad. 'in the IO monad' means that f can do some IO (i.e. change the 'RealWorld', where 'change' means replace the provided RealWorld with a new one) while computing its result. The result of f is polymorphic (that's a type variable 'a' not a type constant like 'Int'). A polymorphic result means that in your program it's the caller that determines the type of the result, so used in one place f could return an Int, used in another place it could return a String. 'Unconstrained' means that there's no type class restricting what type can be returned and so any type can be returned.
Why is 'f' a function and not a constant since there are no parameters and Haskell is pure? Because the definition of IO means that 'f :: IO a' could have been written 'f :: GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #)' and so in fact has a parameter -- the 'state of the real world'.
In data IO a, the a has mainly the same meaning as in Maybe a.
But we can't get rid of the constructor, like:
fromIO :: IO a -> a
fromIO (IO a) = a
Fortunately, we can use this data in monads, like:
{-# LANGUAGE ScopedTypeVariables #-}
foo = do
(fromIO :: a) <- (dataIO :: IO a)
...
