Struggling with basic experiment printing to stdio - haskell

Brand new to Haskell and failing to print the date in a simple program. What I'm trying to do:
Get the current time from getCurrentTime
Call a pure function on the date, returning a string
Print the string to stdio.
I've learned that getCurrentTime returns an IO monad. I must raise my pure function into the monad using extra sauce like fmap. Still no luck.
What am I doing wrong?
--- EDIT ---
Forgot to mention that this compiles and runs but produces no output.
module Main where
import System.IO
import Data.Time.Clock
import Data.Time.Calendar
date :: IO (Integer,Int,Int)
date = fmap (toGregorian . utctDay) getCurrentTime
getDateStr :: (Integer,Int,Int) -> String
getDateStr (year,month,day) = "Date is " ++ show year ++ "/" ++ show month ++ "/" ++ show day ++ "\n"
main = do
let printabledate = fmap getDateStr date
fmap print printabledate

It works like this:
fmap :: (a -> b) -> f a -> f b, noted the function to fmap is a normal function.
fmap getDateStr date :: IO String.
print :: a -> IO (), a will be String.
So: fmap print (fmap getDateStr date) will has type of IO (IO ()). The point is print is not a "normal" function, but it is a monadic function. You fmap a monadic function to a monadic value, you will get a monadic value wrapped inside another monadic value.
Then, when you evaluate main, you get back the inner monadic value of type IO (). That's not what you want. To get the desired result, just bind print to printabledate as #Ryan suggested in the comment:
(>>=) :: m a -> (a -> m b) -> m b
printabledate >>= print :: IO ()
That's all.

Let's forget the concept of monads for a second to clear stuff a bit:
In OOP, we have classes. Most of the time, these bind some behaviour to some data. In Haskell we do not do this, and rather we create data types that are just that, data.
Also, in OOP, we have the concept of interface, which allows us to define an API for some common functions that some classes can share. In some way, we could group those classes by properties they share, like for example creating an interface Mappable which has a map method that applies a function to the contents of the class that implements that interface.
Now, for example we could create a class List<T> that implements Mappable where map applies the function to each of the elements of the list.
In Haskell, we have type classes which are like interfaces, but much better, because they allow you to implement the API for any existing type. For example our Mappable from before, is called Functor in Haskell. Don't get scared, as it is just a name like AbstractEnterpriseJavaBeanFactory. It is defined like this:
class Functor f where
fmap :: (a -> b) -> f a -> f b
...
And now we can extend anything that has some contents, like our List<T> from before, with it, which in Haskell would be:
data List a = ... -- List implementation
instance Functor List where
fmap f lst = ... -- Implementation of fmap
This is great, because the Functor concept gives us the assurance that the function f will be applied to the contents of the data types that implements this, always returning another copy of the data type.
And you now might be wondering, what has this to do with my question?
Haskell comes with a lot of predefined type classes: Functor, Foldable, Applicative, ...
Type classes only assure you that some constraints will be met when applying the functions defined in their API:
Functor's fmap takes a function a -> b and an f a as argument
Applicative takes a "container" with a function inside f (a -> b) and an f a
etc...
Within all these many typeclasses there is one that has the following API
class X f where
bind :: (a -> f b) -> f a -> f b
It's like our functor from before, but instead of returning an element, the function passed as a parameter, returns another "container".
This typeclass is called Monad, and in Haskell it is defined like:
class Monad m where
(>>=) :: m a -> (a -> m b) -> m b
return :: a -> m a -- This puts the value a inside of the "container" m
...
(Note that arguments are flipped)
So basically IO is not a monad, just a container type that happens to implement the API defined in the type class Monad, but also Functor and many others (you can check them in the instances section of the IO API documentation).
Now let's reason a bit:
First of all, we use date to get an IO (Integer, Int, Int), which basically is a container that contains the triple.
Then, we apply the getDateStr function to it's contents using fmap, so we get back an IO that has String inside of it.
Now we bind this value to printableDate. So printableDate is now an IO String.
Now we apply print to the contents of printableDate using fmap, but wait, print returns an "empty" IO container, so now what we get back is an IO containing an IO () ( IO (IO ()) ), but the main function's return type must be always IO ().
What do we do now? We have two options:
1) Unwrap the value inside our printableDate using the <- operator, allowing us to get the String itself:
main = do
let printabledate = fmap getDateStr date
unwrappedPrintableDate <- printabledate
print unwrappedPrintabledate
Or, directly:
main = do
printabledate <- fmap getDateStr date
print printabledate
2) Use the >>= operator defined in the Monad type class:
main = do
let printabledate = fmap getDateStr date
printabledate >>= print
Remember? The >>= operator expects that the function passed to it returns another "container", which is what print does.
Use which one feels more natural to you, as they are both accepted and the same thing

Related

Calling a custom monad in haskell using the bind

I am currently 'getting my feet wet with monads' (http://learnyouahaskell.com/a-fistful-of-monads#getting-our-feet-wet-with-maybe) and still struggling with part of this concept. I understand the maybe monad mentioned there and can see how to call it, as mentioned here:
ghci> return "WHAT" :: Maybe String
Just "WHAT"
ghci> Just 9 >>= \x -> return (x*10)
Just 90
ghci> Nothing >>= \x -> return (x*10)
Nothing
How do I however call my own monad instance rather than the maybe one if I declare my own instance with its own type, like so:
newtype CustomM a = CustomM {runCustomM:: a -> String}
instance Monad CustomM where
return a = CustomM (\a -> mempty)
m >>= f = CustomM (\a -> mempty)
instance Functor CustomM where
fmap = liftM
instance Applicative CustomM where
pure = return; (<*>) = ap
using the return will just call the Maybe Monad I suspect. And what about the >>= it would require me to give it a certain type.
NB. I have only made the bind the same way as the return, because I still struggle with understanding what it is supposed to return, and how I can give it a monad m and call the function f on it's value a, when I don't specify the packed type more thoroughly in the left side. But I shall make another post about that.
You can create a value of your CustomM type just like you did with Maybe, using return:
cm = return "WHAT" :: CustomM String
You can also run it, after a fashion:
Prelude Control.Monad> runCustomM cm "foo"
""
Clearly, though, since you've hard-coded it to return mempty, it returns the mempty value for String, which is "".
All that said, though, a -> String is contravariant in a, so it can't be a useful Functor instance. And since it can't be a useful Functor instance, it also can't be a useful Monad instance.
While you can define a degenerate implementation as in the OP, you're not going to have much success making CustomM do anything for real.
Try, instead, swapping the types like e.g. String -> a.

What is the intuitive meaning of "join"?

What is the intuitive meaning of join for a Monad?
The monads-as-containers analogies make sense to me, and inside these analogies join makes sense. A value is double-wrapped and we unwrap one layer. But as we all know, a monad is not a container.
How might one write sensible, understandable code using join in normal circumstances, say when in IO?
An action :: IO (IO a) is a way of producing a way of producing an a. join action, then, is a way of producing an a by running the outermost producer of action, taking the producer it produced and then running that as well, to finally get to that juicy a.
join collapses consecutive layers of the type constructor.
A valid join must satisfy the property that, for any number of consecutive applications of the type constructor, it shouldn't matter the order in which we collapse the layers.
For example
ghci> let lolol = [[['a'],['b','c']],[['d'],['e']]]
ghci> lolol :: [[[Char]]]
ghci> lolol :: [] ([] ([] Char)) -- the type can also be expressed like this
ghci> join (fmap join lolol) -- collapse inner layers first
"abcde"
ghci> join (join lolol) -- collapse outer layers first
"abcde"
(We used fmap to "get inside" the outer monadic layer so that we could collapse the inner layers first.)
A small non container example where join is useful: for the function monad (->) a, join is equivalent to \f x -> f x x, a function of type (a -> a -> b) -> a -> b that applies two times the same argument to another function.
For the List monad, join is simply concat, and concatMap is join . fmap.
So join implicitly appears in any list expression which uses concat
or concatMap.
Suppose you were asked to find all of the numbers which are divisors of any
number in an input list. If you have a divisors function:
divisors :: Int -> [Int]
divisors n = [ d | d <- [1..n], mod n d == 0 ]
you might solve the problem like this:
foo xs = concat $ (map divisors xs)
Here we are thinking of solving the problem by first mapping the
divisors function over all of the input elements and then concatenating
all of the resulting lists. You might even think that this is a very
"functional" way of solving the problem.
Another approch would be to write a list comprehension:
bar xs = [ d | x <- xs, d <- divisors x ]
or using do-notation:
bar xs = do x <- xs
d <- divisors
return d
Here it might be said we're thinking a little more
imperatively - first draw a number from the list xs; then draw
a divisors from the divisors of the number and yield it.
It turns out, though, that foo and bar are exactly the same function.
Morever, these two approaches are exactly the same in any monad.
That is, for any monad, and appropriate monadic functions f and g:
do x <- f
y <- g x is the same as: (join . fmap g) f
return y
For instance, in the IO monad if we set f = getLine and g = readFile,
we have:
do x <- getLine
y <- readFile x is the same as: (join . fmap readFile) getLine
return y
The do-block is a more imperative way of expressing the action: first read a
line of input; then treat returned string as a file name, read the contents
of the file and finally return the result.
The equivalent join expression seems a little unnatural in the IO-monad.
However it shouldn't be as we are using it in exactly the same way as we
used concatMap in the first example.
Given an action that produces another action, run the action and then run the action that it produces.
If you imagine some kind of Parser x monad that parses an x, then Parser (Parser x) is a parser that does some parsing, and then returns another parser. So join would flatten this into a Parser x that just runs both actions and returns the final x.
Why would you even have a Parser (Parser x) in the first place? Basically, because fmap. If you have a parser, you can fmap a function that changes the result over it. But if you fmap a function that itself returns a parser, you end up with a Parser (Parser x), where you probably want to just run both actions. join implements "just run both actions".
I like the parsing example because a parser typically has a runParser function. And it's clear that a Parser Int is not an integer. It's something that can parse an integer, after you give it some input to parse from. I think a lot of people end up thinking of an IO Int as being just a normal integer but with this annoying IO bit that you can't get rid of. It isn't. It's an unexecuted I/O operation. There's no integer "inside" it; the integer doesn't exist until you actually perform the I/O.
I find these things easier to interpret by writing out the types and refactoring them a bit to reveal what the functions do.
Reader monad
The Reader type is defined thus, and its join function has the type shown:
newtype Reader r a = Reader { runReader :: r -> a }
join :: Reader r (Reader r a) -> Reader r a
Since this is a newtype, this means that the type Reader r a is isomorphic to r -> a. So we can refactor the type definition to give us this type that, albeit it's not the same, it's really "the same" with scare quotes:
In the (->) r monad, which is isomorphic to Reader r, join is the function:
join :: (r -> r -> a) -> r -> a
So the Reader join is the function that takes a two-place function (r -> r -> a) and applies to the same value at both its argument positions.
Writer monad
Since the Writer type has this definition:
newtype Writer w a = Writer { runWriter :: (a, w) }
...then when we remove the newtype, its join function has a type isomorphic to:
join :: Monoid w => ((a, w), w) -> (a, w)
The Monoid constraint needs to be there because the Monad instance for Writer requires it, and it lets us guess right away what the function does:
join ((a, w0), w1) = (a, w0 <> w1)
State monad
Similarly, since State has this definition:
newtype State s a = State { runState :: s -> (a, s) }
...then its join is like this:
join :: (s -> (s -> (a, s), s)) -> s -> (a, s)
...and you can also venture just writing it directly:
join f s0 = (a, s2)
where
(g, s1) = f s0
(a, s2) = g s1
{- Here's the "map" to the variable names in the function:
f g s2 s1 s0 s2
join :: (s -> (s -> (a, s ), s )) -> s -> (a, s )
-}
If you stare at this type a bit, you might think that it bears some resemblance to both the Reader and Writer's types for their join operations. And you'd be right! The Reader, Writer and State monads are all instances of a more general pattern called update monads.
List monad
join :: [[a]] -> [a]
As other people have pointed out, this is the type of the concat function.
Parsing monads
Here comes a really neat thing to realize. Very often, "fancy" monads turn out to be combinations or variants of "basic" ones like Reader, Writer, State or lists. So often what I do when confronted with a novel monad is ask: which of the basic monads does it resemble, and how?
Take for example parsing monads, which have been brought up in other answers here. A simplistic parser monad (with no support for important things like error reporting) looks like this:
newtype Parser a = Parser { runParser :: String -> [(a, String)] }
A Parser is a function that takes a string as input, and returns a list of candidate parses, where each candidate parse is a pair of:
A parse result of type a;
The leftovers (the suffix of the input string that was not consumed in that parse).
But notice that this type looks very much like the state monad:
newtype Parser a = Parser { runParser :: String -> [(a, String)] }
newtype State s a = State { runState :: s -> (a, s) }
And this is no accident! Parser monads are nondeterministic state monads, where the state is the unconsumed portion of the input string, and parse steps generate alternatives that may be later rejected in light of further input. List monads are often called "nondeterminism" monads, so it's no surprise that a parser resembles a mix of the state and list monads.
And this intuition can be systematized by using monad transfomers. The state monad transformer is defined like this:
newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }
Which means that the Parser type from above can be written like this as well:
type Parser a = StateT String [] a
...and its Monad instance follows mechanically from those of StateT and [].
The IO monad
Imagine we could enumerate all of the possible primitive IO actions, somewhat like this:
{-# LANGUAGE GADTs #-}
data Command a where
-- An action that writes a char to stdout
putChar :: Char -> Command ()
-- An action that reads a char from stdin
getChar :: Command Char
-- ...
Then we could think of the IO type as this (which I've adapted from the highly-recommended Operational monad tutorial):
data IO a where
-- An `IO` action that just returns a constant value.
Return :: a -> IO a
-- An action that binds the result of a `Command` to
-- a function that computes the next step after it.
Bind :: Command x -> (x -> IO a) -> IO a
instance Monad IO where ...
Then join action would then look like this:
join :: IO (IO a) -> IO a
-- If the action is just `Return`, then its payload already
-- is what we need to return.
join (Return ioa) = ioa
-- If the action is a `Bind`, then its "next step" function
-- `f` produces `IO (IO a)`, so we can just recursively stick
-- a `join` to its result end.
join (Bind cmd f) = Bind cmd (join . f)
So all that the join does here is "chase down" the IO action until it sees a result that fits the pattern Return (ma :: IO a), and strip out the outer Return.
So what did I do here? Just like for parser monads, I just defined (or rather copied) a toy model of the IO type that has the virtue of being transparent. Then I work out the behavior of join from the toy model.

Why does bind (>>=) exist? What are typical cases where a solution without bind is ugly?

This is a type declaration of a bind method:
(>>=) :: (Monad m) => m a -> (a -> m b) -> m b
I read this as follows: apply a function that returns a wrapped value, to a wrapped value.
This method was included to Prelude as part of Monad typeclass. That means there are a lot of cases where it's needed.
OK, but I don't understand why it's a typical solution of a typical case at all.
If you already created a function which returns a wrapped value, why that function doesn't already take a wrapped value?
In other words, what are typical cases where there are many functions which take a normal value, but return a wrapped value? (instead of taking a wrapped value and return a wrapped value)
The 'unwrapping' of values is exactly what you want to keep hidden when dealing with monads, since it is this that causes a lot of boilerplate.
For example, if you have a sequence of operations which return Maybe values that you want to combine, you have to manually propagate Nothing if you receive one:
nested :: a -> Maybe b
nested x = case f x of
Nothing -> Nothing
Just r ->
case g r of
Nothing -> Nothing
Just r' ->
case h r' of
Nothing -> Nothing
r'' -> i r''
This is what bind does for you:
Nothing >>= _ = Nothing
Just a >>= f = f a
so you can just write:
nested x = f x >>= g >>= h >>= i
Some monads don't allow you to manually unpack the values at all - the most common example is IO. The only way to get the value from an IO is to map or >>= and both of these require you to propagate IO in the output.
Everyone focuses on IO monad and inability to "unwrap".
But a Monad is not always a container, so you can't unwrap.
Reader r a == r->a such that (Reader r) is a Monad
to my mind is the simplest best example of a Monad that is not a container.
You can easily write a function that can produce m b given a: a->(r->b). But you can't easily "unwrap" the value from m a, because a is not wrapped in it. Monad is a type-level concept.
Also, notice that if you have m a->m b, you don't have a Monad. What Monad gives you, is a way to build a function m a->m b from a->m b (compare: Functor gives you a way to build a function m a->m b from a->b; ApplicativeFunctor gives you a way to build a function m a->m b from m (a->b))
If you already created a function which returns a wrapped value, why that function doesn't already take a wrapped value?
Because that function would have to unwrap its argument in order to do something with it.
But for many choices of m, you can only unwrap a value if you will eventually rewrap your own result. This idea of "unwrap, do something, then rewrap" is embodied in the (>>=) function which unwraps for you, let's you do something, and forces you to rewrap by the type a -> m b.
To understand why you cannot unwrap without eventually rewrapping, we can look at some examples:
If m a = Maybe a, unwrapping for Just x would be easy: just return x. But how can we unwrap Nothing? We cannot. But if we know that we will eventually rewrap, we can skip the "do something" step and return Nothing for the overall operation.
If m a = [a], unwrapping for [x] would be easy: just return x. But for unwrapping [], we need the same trick as for Maybe a. And what about unwrapping [x, y, z]? If we know that we will eventually rewrap, we can execute the "do something" three times, for x, y and z and concat the results into a single list.
If m a = IO a, no unwrapping is easy because we only know the result sometimes in the future, when we actually run the IO action. But if we know that we will eventually rewrap, we can store the "do something" inside the IO action and perform it later, when we execute the IO action.
I hope these examples make it clear that for many interesting choices of m, we can only implement unwrapping if we know that we are going to rewrap. The type of (>>=) allows precisely this assumption, so it is cleverly chosen to make things work.
While (>>=) can sometimes be useful when used directly, its main purpose is to implement the <- bind syntax in do notation. It has the type m a -> (a -> m b) -> m b mainly because, when used in a do notation block, the right hand side of the <- is of type m a, the left hand side "binds" an a to the given identifier and, when combined with remainder of the do block, is of type a -> m b, the resulting monadic action is of type m b, and this is the only type it possibly could have to make this work.
For example:
echo = do
input <- getLine
putStrLn input
The right hand side of the <- is of type IO String
The left hands side of the <- with the remainder of the do block are of type String -> IO (). Compare with the desugared version using >>=:
echo = getLine >>= (\input -> putStrLn input)
The left hand side of the >>= is of type IO String. The right hand side is of type String -> IO (). Now, by applying an eta reduction to the lambda we can instead get:
echo = getLine >>= putStrLn
which shows why >>= is sometimes used directly rather than as the "engine" that powers do notation along with >>.
I'd also like to provide what I think is an important correction to the concept of "unwrapping" a monadic value, which is that it doesn't happen. The Monad class does not provide a generic function of type Monad m => m a -> a. Some particular instances do but this is not a feature of monads in general. Monads, generally speaking, cannot be "unwrapped".
Remember that m >>= k = join (fmap k m) is a law that must be true for any monad. Any particular implementation of >>= must satisfy this law and so must be equivalent to this general implementation.
What this means is that what really happens is that the monadic "computation" a -> m b is "lifted" to become an m a -> m (m b) using fmap and then applied the m a, giving an m (m b); and then join :: m (m a) -> m a is used to squish the two ms together to yield a m b. So the a never gets "out" of the monad. The monad is never "unwrapped". This is an incorrect way to think about monads and I would strongly recommend that you not get in the habit.
I will focus on your point
If you already created a function which returns a wrapped value, why
that function doesn't already take a wrapped value?
and the IO monad. Suppose you had
getLine :: IO String
putStrLn :: IO String -> IO () -- "already takes a wrapped value"
how one could write a program which reads a line and print it twice? An attempt would be
let line = getLine
in putStrLn line >> putStrLn line
but equational reasoning dictates that this is equivalent to
putStrLn getLine >> putStrLn getLine
which reads two lines instead.
What we lack is a way to "unwrap" the getLine once, and use it twice. The same issue would apply to reading a line, printing "hello", and then printing a line:
let line = getLine in putStrLn "hello" >> putStrLn line
-- equivalent to
putStrLn "hello" >> putStrLn getLine
So, we also lack a way to specify "when to unwrap" the getLine. The bind >>= operator provides a way to do this.
A more advanced theoretical note
If you swap the arguments around the (>>=) bind operator becomes (=<<)
(=<<) :: (a -> m b) -> (m a -> m b)
which turns any function f taking an unwrapped value into a function g taking a wrapped
value. Such g is known as the Kleisli extension of f. The bind operator guarantees
such an extension always exists, and provides a convenient way to use it.
Because we like to be able to apply functions like a -> b to our m as. Lifting such a function to m a -> m b is trivial (liftM, liftA, >>= return ., fmap) but the opposite is not necessarily possible.
You want some typical examples? How about putStrLn :: String -> IO ()? It would make no sense for this function to have the type IO String -> IO () because the origin of the string doesn't matter.
Anyway: You might have the wrong idea because of your "wrapped value" metaphor; I use it myself quite often, but it has its limitations. There isn't necessarily a pure way to get an a out of an m a - for example, if you have a getLine :: IO String, there's not a great deal of interesting things you can do with it - you can put it in a list, chain it in a row and other neat things, but you can't get any useful information out of it because you can't look inside an IO action. What you can do is use >>= which gives you a way to use the result of the action.
Similar things apply to monads where the "wrapping" metaphor applies too; For example the point Maybe monad is to avoid manually wrapping and unwrapping values with and from Just all the time.
My two most common examples:
1) I have a series of functions that generate a list of lists, but I finally need a flat list:
f :: a -> [a]
fAppliedThrice :: [a] -> [a]
fAppliedThrice aList = concat (map f (concat (map f (concat (map f a)))))
fAppliedThrice' :: [a] -> [a]
fAppliedThrice' aList = aList >>= f >>= f >>= f
A practical example of using this was when my functions fetched attributes of a foreign key relationship. I could just chain them together to finally obtain a flat list of attributes. Eg: Product hasMany Review hasMany Tag type relationship, and I finally want a list of all the tag names for a product. (I added some template-haskell and got a very good generic attribute fetcher for my purposes).
2) Say you have a series of filter-like functions to apply to some data. And they return Maybe values.
case (val >>= filter >>= filter2 >>= filter3) of
Nothing -> putStrLn "Bad data"
Just x -> putStrLn "Good data"

Signature of IO in Haskell (is this class or data?)

The question is not what IO does, but how is it defined, its signature. Specifically, is this data or class, is "a" its type parameter then? I didn't find it anywhere. Also, I don't understand the syntactic meaning of this:
f :: IO a
You asked whether IO a is a data type: it is. And you asked whether the a is its type parameter: it is. You said you couldn't find its definition. Let me show you how to find it:
localhost:~ gareth.rowlands$ ghci
GHCi, version 7.6.3: http://www.haskell.org/ghc/ :? for help
Prelude> :i IO
newtype IO a
= GHC.Types.IO (GHC.Prim.State# GHC.Prim.RealWorld
-> (# GHC.Prim.State# GHC.Prim.RealWorld, a #))
-- Defined in `GHC.Types'
instance Monad IO -- Defined in `GHC.Base'
instance Functor IO -- Defined in `GHC.Base'
Prelude>
In ghci, :i or :info tells you about a type. It shows the type declaration and where it's defined. You can see that IO is a Monad and a Functor too.
This technique is more useful on normal Haskell types - as others have noted, IO is magic in Haskell. In a typical Haskell type, the type signature is very revealing but the important thing to know about IO is not its type declaration, rather that IO actions actually perform IO. They do this in a pretty conventional way, typically by calling the underlying C or OS routine. For example, Haskell's putChar action might call C's putchar function.
IO is a polymorphic type (which happens to be an instance of Monad, irrelevant here).
Consider the humble list. If we were to write our own list of Ints, we might do this:
data IntList = Nil | Cons { listHead :: Int, listRest :: IntList }
If you then abstract over what element type it is, you get this:
data List a = Nil | Cons { listHead :: a, listRest :: List a }
As you can see, the return value of listRest is List a. List is a polymorphic type of kind * -> *, which is to say that it takes one type argument to create a concrete type.
In a similar way, IO is a polymorphic type with kind * -> *, which again means it takes one type argument. If you were to define it yourself, it might look like this:
data IO a = IO (RealWorld -> (a, RealWorld))
(definition courtesy of this answer)
The amount of magic in IO is grossly overestimated: it has some support from compiler and runtime system, but much less than newbies usually expect.
Here is the source file where it is defined:
http://www.haskell.org/ghc/docs/latest/html/libraries/ghc-prim-0.3.0.0/src/GHC-Types.html
newtype IO a
= IO (State# RealWorld -> (# State# RealWorld, a #))
It is just an optimized version of state monad. If we remove optimization annotations we will see:
data IO a = IO (Realworld -> (Realworld, a))
So basically IO a is a data structure storing a function that takes old real world and returns new real world with io operation performed and a.
Some compiler tricks are necessary mostly to remove Realworld dummy value efficiently.
IO type is an abstract newtype - constructors are not exported, so you cannot bypass library functions, work with it directly and perform nasty things: duplicate RealWorld, create RealWorld out of nothing or escape the monad (write a function of IO a -> a type).
Since IO can be applied to objects of any type a, as it is a polymorphic monad, a is not specified.
If you have some object with type a, then it can be 'wrappered' as an object of type IO a, which you can think of as being an action that gives an object of type a. For example, getChar is of type IO Char, and so when it is called, it has the side effect of (From the program's perspective) generating a character, which comes from stdin.
As another example, putChar has type Char -> IO (), meaning that it takes a char, and then performs some action that gives no output (in the context of the program, though it will print the char given to stdout).
Edit: More explanation of monads:
A monad can be thought of as a 'wrapper type' M, and has two associated functions:
return and >>=.
Given a type a, it is possible to create objects of type M a (IO a in the case of the IO monad), using the return function.
return, therefore, has type a -> M a. Moreover, return attempts not to change the element that it is passed -- if you call return x, you will get a wrappered version of x that contains all of the information of x (Theoretically, at least. This doesn't happen with, for example, the empty monad.)
For example, return "x" will yield an M Char. This is how getChar works -- it yields an IO Char using a return statement, which is then pulled out of its wrapper with <-.
>>=, read as 'bind', is more complicated. It has type M a -> (a -> M b) -> M b, and its role is to take a 'wrappered' object, and a function from the underlying type of that object to another 'wrappered' object, and apply that function to the underlying variable in the first input.
For example, (return 5) >>= (return . (+ 3)) will yield an M Int, which will be the same M Int that would be given by return 8. In this way, any function that can be applied outside of a monad can also be applied inside of it.
To do this, one could take an arbitrary function f :: a -> b, and give the new function g :: M a -> M b as follows:
g x = x >>= (return . f)
Now, for something to be a monad, these operations must also have certain relations -- their definitions as above aren't quite enough.
First: (return x) >>= f must be equivalent to f x. That is, it must be equivalent to perform an operation on x whether it is 'wrapped' in the monad or not.
Second: x >>= return must be equivalent to m. That is, if an object is unwrapped by bind, and then rewrapped by return, it must return to its same state, unchanged.
Third, and finally (x >>= f) >>= g must be equivalent to x >>= (\y -> (f y >>= g) ). That is, function binding is associative (sort of). More accurately, if two functions are bound successively, this must be equivalent to binding the combination thereof.
Now, while this is how monads work, it's not how it's most commonly used, because of the syntactic sugar of do and <-.
Essentially, do begins a long chain of binds, and each <- sort of creates a lambda function that gets bound.
For example,
a = do x <- something
y <- function x
return y
is equivalent to
a = something >>= (\x -> (function x) >>= (\y -> return y))
In both cases, something is bound to x, function x is bound to y, and then y is returned to a in the wrapper of the relevant monad.
Sorry for the wall of text, and I hope it explains something. If there's more you need cleared up about this, or something in this explanation is confusing, just ask.
This is a very good question, if you ask me. I remember being very confused about this too, maybe this will help...
'IO' is a type constructor, 'IO a' is a type, the 'a' (in 'IO a') is an type variable. The letter 'a' carries no significance, the letter 'b' or 't1' could have been used just as well.
If you look at the definition of the IO type constructor you will see that it is a newtype defined as: GHC.Types.IO (GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #))
'f :: IO a' is the type of a function called 'f' of apparently no arguments that returns a result of some unconstrained type in the IO monad. 'in the IO monad' means that f can do some IO (i.e. change the 'RealWorld', where 'change' means replace the provided RealWorld with a new one) while computing its result. The result of f is polymorphic (that's a type variable 'a' not a type constant like 'Int'). A polymorphic result means that in your program it's the caller that determines the type of the result, so used in one place f could return an Int, used in another place it could return a String. 'Unconstrained' means that there's no type class restricting what type can be returned and so any type can be returned.
Why is 'f' a function and not a constant since there are no parameters and Haskell is pure? Because the definition of IO means that 'f :: IO a' could have been written 'f :: GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #)' and so in fact has a parameter -- the 'state of the real world'.
In the data IO a a have mainly the same meaning as in Maybe a.
But we can't rid of a constructor, like:
fromIO :: IO a -> a
fromIO (IO a) = a
Fortunately we could use this data in Monads, like:
{-# LANGUAGE ScopedTypeVariables #-}
foo = do
(fromIO :: a) <- (dataIO :: IO a)
...

What general structure does this type have?

While hacking something up earlier, I created the following code:
newtype Callback a = Callback { unCallback :: a -> IO (Callback a) }
liftCallback :: (a -> IO ()) -> Callback a
liftCallback f = let cb = Callback $ \x -> (f x >> return cb) in cb
runCallback :: Callback a -> IO (a -> IO ())
runCallback cb =
do ref <- newIORef cb
return $ \x -> readIORef ref >>= ($ x) . unCallback >>= writeIORef ref
Callback a represents a function that handles some data and returns a new callback that should be used for the next notification. A callback which can basically replace itself, so to speak. liftCallback just lifts a normal function to my type, while runCallback uses an IORef to convert a Callback to a simple function.
The general structure of the type is:
data T m a = T (a -> m (T m a))
It looks much like this could be isomorphic to some well-known mathematical structure from category theory.
But what is it? Is it a monad or something? An applicative functor? A transformed monad? An arrow, even? Is there a search engine similar Hoogle that lets me search for general patterns like this?
The term you are looking for is free monad transformer. The best place to learn how these work is to read the "Coroutine Pipelines" article in issue 19 of The Monad Reader. Mario Blazevic gives a very lucid description of how this type works, except he calls it the "Coroutine" type.
I wrote up his type in the transformers-free package and then it got merged into the free package, which is its new official home.
Your Callback type is isomorphic to:
type Callback a = forall r . FreeT ((->) a) IO r
To understand free monad transformers, you need to first understand free monads, which are just abstract syntax trees. You give the free monad a functor which defines a single step in the syntax tree, and then it creates a Monad from that Functor that is basically a list of those types of steps. So if you had:
Free ((->) a) r
That would be a syntax tree that accepts zero or more as as input and then returns a value r.
However, usually we want to embed effects or make the next step of the syntax tree dependent on some effect. To do that, we simply promote our free monad to a free monad transformer, which interleaves the base monad between syntax tree steps. In the case of your Callback type, you are interleaving IO in between each input step, so your base monad is IO:
FreeT ((->) a) IO r
The nice thing about free monads is that they are automatically monads for any functor, so we can take advantage of this to use do notation to assemble our syntax tree. For example, I can define an await command that will bind the input within the monad:
import Control.Monad.Trans.Free
await :: (Monad m) => FreeT ((->) a) m a
await = liftF id
Now I have a DSL for writing Callbacks:
import Control.Monad
import Control.Monad.Trans.Free
printer :: (Show a) => FreeT ((->) a) IO r
printer = forever $ do
a <- await
lift $ print a
Notice that I never had to define the necessary Monad instance. Both FreeT f and Free f are automatically Monads for any functor f, and in this case ((->) a) is our functor, so it automatically does the right thing. That's the magic of category theory!
Also, we never had to define a MonadTrans instance in order to use lift. FreeT f is automatically a monad transformer, given any functor f, so it took care of that for us, too.
Our printer is a suitable Callback, so we can feed it values just by deconstructing the free monad transformer:
feed :: [a] -> FreeT ((->) a) IO r -> IO ()
feed as callback = do
x <- runFreeT callback
case x of
Pure _ -> return ()
Free k -> case as of
[] -> return ()
b:bs -> feed bs (k b)
The actual printing occurs when we bind runFreeT callback, which then gives us the next step in the syntax tree, which we feed the next element of the list.
Let's try it:
>>> feed [1..5] printer
1
2
3
4
5
However, you don't even need to write all this up yourself. As Petr pointed out, my pipes library abstracts common streaming patterns like this for you. Your callback is just:
forall r . Consumer a IO r
The way we'd define printer using pipes is:
printer = forever $ do
a <- await
lift $ print a
... and we can feed it a list of values like so:
>>> runEffect $ each [1..5] >-> printer
1
2
3
4
5
I designed pipes to encompass a very large range of streaming abstractions like these in such a way that you can always use do notation to build each streaming component. pipes also comes with a wide variety of elegant solutions for things like state and error handling, and bidirectional flow of information, so if you formulate your Callback abstraction in terms of pipes, you tap into a ton of useful machinery for free.
If you want to learn more about pipes, I recommend you read the tutorial.
The general structure of the type looks to me like
data T (~>) a = T (a ~> T (~>) a)
where (~>) = Kleisli m in your terms (an arrow).
Callback itself doesn't look like an instance of any standard Haskell typeclass I can think of, but it is a Contravariant Functor (also known as Cofunctor, misleadingly as it turns out). As it is not included in any of the libraries that come with GHC, there exist several definitions of it on Hackage (use this one), but they all look something like this:
class Contravariant f where
contramap :: (b -> a) -> f a -> f b
-- c.f. fmap :: (a -> b) -> f a -> f b
Then
instance Contravariant Callback where
contramap f (Callback k) = Callback ((fmap . liftM . contramap) f (f . k))
Is there some more exotic structure from category theory that Callback possesses? I don't know.
I think that this type is very close to what I have heard called a 'Circuit', which is a type of arrow. Ignoring for a moment the IO part (as we can have this just by transforming a Kliesli arrow) the circuit transformer is:
newtype CircuitT a b c = CircuitT { unCircuitT :: a b (c, CircuitT a b c) }
This is basicall an arrow that returns a new arrow to use for the next input each time. All of the common arrow classes (including loop) can be implemented for this arrow transformer as long as the base arrow supports them. Now, all we have to do to make it notionally the same as the type you mention is to get rid of that extra output. This is easily done, and so we find:
Callback a ~=~ CircuitT (Kleisli IO) a ()
As if we look at the right hand side:
CircuitT (Kleisli IO) a () ~=~
(Kliesli IO) a ((), CircuitT (Kleisli IO) a ()) ~=~
a -> IO ((), CircuitT (Kliesli IO) a ())
And from here, you can see how this is similar to Callback a, except we also output a unit value. As the unit value is in a tuple with something else anyway, this really doesn't tell us much, so I would say they're basically the same.
N.B. I used ~=~ for similar but not entirely equivalent to, for some reason. They are very closely similar though, in particular note that we could convert a Callback a into a CircuitT (Kleisli IO) a () and vice-versa.
EDIT: I would also fully agree with the ideas that this is A) a monadic costream (monadic operation expecitng an infinite number of values, I think this means) and B) a consume-only pipe (which is in many ways very similar to the circuit type with no output, or rather output set to (), as such a pipe could also have had output).
Just an observation, your type seems quite related to Consumer p a m appearing in the pipes library (and probably other similar librarties as well):
type Consumer p a = p () a () C
-- A Pipe that consumes values
-- Consumers never respond.
where C is an empty data type and p is an instance of Proxy type class. It consumes values of type a and never produces any (because its output type is empty).
For example, we could convert a Callback into a Consumer:
import Control.Proxy
import Control.Proxy.Synonym
newtype Callback m a = Callback { unCallback :: a -> m (Callback m a) }
-- No values produced, hence the polymorphic return type `r`.
-- We could replace `r` with `C` as well.
consumer :: (Proxy p, Monad m) => Callback m a -> () -> Consumer p a m r
consumer c () = runIdentityP (run c)
where
run (Callback c) = request () >>= lift . c >>= run
See the tutorial.
(This should have been rather a comment, but it's a bit too long.)

Resources