When would I want to use a Free Monad + Interpreter pattern? - haskell

I'm working on a project that, amongst other things, involves a database access layer. Pretty normal, really. In a previous project, a collaborator encouraged me to use the Free Monads concept for a database layer and so I did. Now I'm trying to decide in my new project what I gain.
In the previous project, I had an API that looked rather like this.
saveDocument :: RawDocument -> DBAction ()
getDocuments :: DocumentFilter -> DBAction [RawDocument]
getDocumentStats :: DBAction [(DocId, DocumentStats)]
etc. About twenty such public functions. To support them, I had the DBAction data structure:
data DBAction a =
SaveDocument RawDocument (DBAction a)
| GetDocuments DocumentFilter ([RawDocument] -> DBAction a)
| GetDocumentStats ([(DocId, DocumentStats)] -> DBAction a)
| Return a
And then a monad implementation:
instance Monad DBAction where
return = Return
SaveDocument doc k >>= f = SaveDocument doc (k >>= f)
GetDocuments df k >>= f = GetDocuments df (k >=> f)
And then the interpreter. And then the primitive functions that implement each of the different queries. Basically, I'm feeling that I had a huge amount of glue code.
In my current project (in a totally different field), I have instead gone with a pretty ordinary monad for my database:
newtype DBM err a = DBM (ReaderT DB (EitherT err IO) a)
deriving (Monad, MonadIO, MonadReader DB)
indexImage :: (ImageId, UTCTime) -> Exif -> Thumbnail -> DBM SaveError ()
removeImage :: DB -> ImageId -> DBM DeleteError ()
And so on. I figure that, ultimately, I'll have the "public" functions that represent high level concepts all running in the DBM context, and then I'll have the whole slew of functions that do the SQL/Haskell glue. This is, overall, feeling much better than the free monad system because I'm not writing a huge amount of boilerplate code to gains me nothing but the ability to swap out my interpreter.
Or...
Do I actually gain something else with the Free Monad + Interpreter pattern? If so, what?

As mentioned in the comments, it is frequently desirable to have some abstraction between code and database implementation. You can get much of the same abstraction as a free monad by defining a class for your DB Monad (I've taken a couple liberties here):
class (Monad m) => MonadImageDB m where
indexImage :: (ImageId, UTCTime) -> Exif -> Thumbnail -> m SaveResult
removeImage :: ImageId -> m DeleteResult
If your code is written against MonadImageDB m => instead of tightly coupled to DBM, you will be able to swap out the database and error handling without modifying your code.
Why would you use free instead? Because it "frees the interpreter as much as possible", meaning the intepreter is only committed to providing a monad, and nothing else. This means you are as unconstrained as possible writing monad instances to go with your code. Note that, for the free monad, you don't write your own instance for Monad, you get it for free. You'd write something like
data DBActionF next =
SaveDocument RawDocument ( next)
| GetDocuments DocumentFilter ([RawDocument] -> next)
| GetDocumentStats ([(DocId, DocumentStats)] -> next)
derive Functor DBActionF, and get the monad instance for Free DBActionF from the existing instance for Functor f => Monad (Free f).
For your example, it'd instead be:
data ImageActionF next =
IndexImage (ImageId, UTCTime) Exif Thumbnail (SaveResult -> next)
| RemoveImage ImageId (DeleteResult -> next)
You can also get the property "frees the interpreter as much as possible" for the type class. If you have no other constraints on m than the type class, MonadImageDB, and all of MonadImageDB's methods could be constructors for a Functor, then you get the same property. You can see this by implementing instance MonadImageDB (Free ImageActionF).
If you are going to mix your code with interactions with some other monad, you can get a monad transformer from free instead of a monad.
Choosing
You don't have to choose. You can convert back and forth between the representations. This example shows how to do so for actions with zero, one, or two arguments returning zero, one, or two results. First, a bit of boilerplate
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE FlexibleInstances #-}
import Control.Monad.Free
We have a type class
class Monad m => MonadAddDel m where
add :: String -> m Int
del :: Int -> m ()
set :: Int -> String -> m ()
add2 :: String -> String -> m (Int, Int)
nop :: m ()
and an equivalent functor representation
data AddDelF next
= Add String ( Int -> next)
| Del Int ( next)
| Set Int String ( next)
| Add2 String String (Int -> Int -> next)
| Nop ( next)
deriving (Functor)
Converting from the free representation to the type class replaces Pure with return, Free with >>=, Add with add, etc.
run :: MonadAddDel m => Free AddDelF a -> m a
run (Pure a) = return a
run (Free (Add x next)) = add x >>= run . next
run (Free (Del id next)) = del id >> run next
run (Free (Set id x next)) = set id x >> run next
run (Free (Add2 x y next)) = add2 x y >>= \ids -> run (next (fst ids) (snd ids))
run (Free (Nop next)) = nop >> run next
A MonadAddDel instance for the representation builds functions for the next arguments of the constructors using Pure.
instance MonadAddDel (Free AddDelF) where
add x = Free . (Add x ) $ Pure
del id = Free . (Del id ) $ Pure ()
set id x = Free . (Set id x) $ Pure ()
add2 x y = Free . (Add2 x y) $ \id1 id2 -> Pure (id1, id2)
nop = Free . Nop $ Pure ()
(Both of these have patterns we could extract for production code, the hard part to writing these generically would be dealing with the varying number of input and result arguments)
Coding against the type class uses only the MonadAddDel m => constraint, for example:
example1 :: MonadAddDel m => m ()
example1 = do
id <- add "Hi"
del id
nop
(id3, id4) <- add2 "Hello" "World"
set id4 "Again"
I was too lazy to write another instance for MonadAddDel besides the one I got from free, and too lazy to make an example besides by using the MonadAddDel type class.
If you like running example code, here's enough to see the example interpreted once (converting the type class representation to the free representation), and again after converting the free representation back to the type class representation again. Again, I'm too lazy to write the code twice.
debugInterpreter :: Free AddDelF a -> IO a
debugInterpreter = go 0
where
go n (Pure a) = return a
go n (Free (Add x next)) =
do
print $ "Adding " ++ x ++ " with id " ++ show n
go (n+1) (next n)
go n (Free (Del id next)) =
do
print $ "Deleting " ++ show id
go n next
go n (Free (Set id x next)) =
do
print $ "Setting " ++ show id ++ " to " ++ show x
go n next
go n (Free (Add2 x y next)) =
do
print $ "Adding " ++ x ++ " with id " ++ show n ++ " and " ++ y ++ " with id " ++ show (n+1)
go (n+2) (next n (n+1))
go n (Free (Nop next)) =
do
print "Nop"
go n next
main =
do
debugInterpreter example1
debugInterpreter . run $ example1

Related

Interpreting the Teletype free monad in the RWS monad

I'm currently learning about free monads and I was toying with probably the simplest and most common example out there – Teletype:
{-# LANGUAGE DeriveFunctor #-}
import Control.Monad.Free
data TeletypeF a = Put String a
| Get (String -> a)
deriving Functor
type Teletype = Free TeletypeF
Many tutorials interpret Teletype programs in the IO monad. For example:
-- Utilities
get = liftF $ Get id
put s = liftF $ Put s ()
-- Sample programs
echo :: Teletype ()
echo = do word <- get
if word == "\04" -- Ctrl-D
then return ()
else put word >> echo
hello :: Teletype ()
hello = do put "What is your name?"
name <- get
put "What is your age?"
age <- get
put ("Hello, " ++ name ++ "!")
put ("You are " ++ age ++ " years old!")
-- Interpret to IO
interpIO :: Teletype a -> IO a
interpIO = foldFree lift
where
lift (Put s a) = putStrLn s >> return a
lift (Get f) = getLine >>= return . f
I was trying to interpret it in a different monad, namely the RWS monad.
This idea was motivated by the last exercise from this assignment.
I'm using the RWS datatype to fetch input from the Reader part and accumulate output in the State part.
But, unfortunately, I'm not able to get it working. Here is my attempt so far:
import Control.Monad.Trans.RWS.Lazy hiding (get, put)
type TeletypeRWS = RWS [String] () [String]
-- Interpret to TeletypeRWS
interpRWS :: Teletype a -> TeletypeRWS a
interpRWS = foldFree lift
where
lift (Put s a) = state (\t -> ((), t ++ [s])) >> return a
lift (Get f) = reader head >>= local tail . return . f -- This is wrong
mockConsole :: Teletype a -> [String] -> (a, [String])
mockConsole p inp = (a, s)
where
(a, s, _) = runRWS (interpRWS p) inp []
When running the TeletypeRWS "programs", the first value in the environment is not removed:
*Main> mockConsole hello ["john", "18"]
((),["What is your name?","What is your age?","Hello, john!","You are john years old!"])
I am a bit uneasy about updating the Reader, but I don't know how else I can access the next value in the list. The type of TeletypeRWS was chosen based on the exercise mentioned above – so I assume it should be possible to implement interpRWS.
We can't use foldFree: it needs to be parametric in the continuation, so we can't apply local there. In contrast, iterM explicitly gives us the actual continuation without generalization, so this will work.
interpRWS = iterM lift where
lift (Put s a) = modify (\t -> t ++ [s]) >> a
lift (Get f) = reader head >>= local tail . f

Refactoring Haskell when adding IO

I have a concern regarding how far the introduction of IO trickles through a program. Say a function deep within my program is altered to include some IO; how do I isolate this change to not have to also change every function in the path to IO as well?
For instance, in a simplified example:
a :: String -> String
a s = (b s) ++ "!"
b :: String -> String
b s = '!':(fetch s)
fetch :: String -> String
fetch s = reverse s
main = putStrLn $ a "hello"
(fetch here could more realistically be reading a value from a static Map to give as its result)
But say if due to some business logic change, I needed to lookup the value returned by fetch in some database (which I can exemplify here with a call to getLine):
fetch :: String -> IO String
fetch s = do
x <- getLine
return $ s ++ x
So my question is, how to prevent having to rewrite every function call in this chain?
a :: String -> IO String
a s = fmap (\x -> x ++ "!") (b s)
b :: String -> IO String
b s = fmap (\x -> '!':x) (fetch s)
fetch :: String -> IO String
fetch s = do
x <- getLine
return $ s ++ x
main = a "hello" >>= putStrLn
I can see that refactoring this would be much simpler if the functions themselves did not depend on each other. That is fine for a simple example:
a :: String -> String
a s = s ++ "!"
b :: String -> String
b s = '!':s
fetch :: String -> IO String
fetch s = do
x <- getLine
return $ s ++ x
doit :: String -> IO String
doit s = fmap (a . b) (fetch s)
main = doit "hello" >>= putStrLn
but I don't know if that is necessarily practical in more complicated programs.
The only way I've found thus far to really isolate an IO addition like this is to use unsafePerformIO, but, by its very name, I don't want to do that if I can help it. Is there some other way to isolate this change? If the refactoring is substantial, I would start to feel inclined to avoid having to do it (especially under deadlines, etc).
Thanks for any advice!
Here are a few methods I use.
Reduce dependencies on effects by inverting control. (One of the methods you described in your question.) That is, execute the effects outside and pass the results (or functions with those results partially applied) into pure code. Instead of having main → a → b → fetch, have main → fetch and then main → a → b:
a :: String -> String
a f = b f ++ "!"
b :: String -> String
b f = '!' : f
fetch :: String -> IO String
fetch s = do
x <- getLine
return $ s ++ x
main = do
f <- fetch "hello"
putStrLn $ a f
For more complex cases of this, where you need to thread an argument to do this sort of “dependency injection” through many levels, Reader/ReaderT lets you abstract over the boilerplate.
Write pure code that you expect might need effects in monadic style from the start. (Polymorphic over the choice of monad.) Then if you do eventually need effects in that code, you don’t need to change the implementation, only the signature.
a :: (Monad m) => String -> m String
a s = (++ "!") <$> b s
b :: (Monad m) => String -> m String
b s = ('!' :) <$> fetch s
fetch :: (Monad m) => String -> m String
fetch s = pure (reverse s)
Since this code works for any m with a Monad instance (or in fact just Applicative), you can run it directly in IO, or purely with the “dummy” monad Identity:
main = putStrLn =<< a "hello"
main = putStrLn $ runIdentity $ a "hello"
Then as you need more effects, you can use “mtl style” (as #dfeuer’s answer describes) to enable effects on an as-needed basis, or if you’re using the same monad stack everywhere, just replace m with that concrete type, e.g.:
newtype Fetch a = Fetch { unFetch :: IO a }
deriving (Applicative, Functor, Monad, MonadIO)
a :: String -> Fetch String
a s = pure (b s ++ "!")
b :: String -> Fetch String
b s = ('!' :) <$> fetch s
fetch :: String -> Fetch String
fetch s = do
x <- liftIO getLine
return $ s ++ x
main = putStrLn =<< unFetch (a "hello")
The advantage of mtl style is that you can have multiple different implementations of your effects. That makes things like testing & mocking easy, since you can reuse the logic but run it with different “handlers” for production & testing. In fact, you can get even more flexibility (at the cost of some runtime performance) using an algebraic effects library such as freer-effects, which not only lets the caller change how each effect is handled, but also the order in which they’re handled.
Roll up your sleeves and do the refactoring. The compiler will tell you everywhere that needs to be updated anyway. After enough times doing this, you’ll naturally end up recognising when you’re writing code that will require this refactoring later, so you’ll consider effects from the beginning and not run into the problem.
You’re quite right to doubt unsafePerformIO! It’s not just unsafe because it breaks referential transparency, it’s unsafe because it can break type, memory, and concurrency safety as well—you can use it to coerce any type to any other, cause a segfault, or cause deadlocks and concurrency errors that would ordinarily be impossible. You’re telling the compiler that some code is pure, so it’s going to assume it can do all the transformations it does with pure code—such as duplicating, reordering, or even dropping it, which may completely change the correctness and performance of your code.
The main legitimate use cases for unsafePerformIO are things like using the FFI to wrap foreign code (that you know is pure), or doing GHC-specific performance hacks; stay away from it otherwise, since it’s not meant as an “escape hatch” for ordinary code.
First off, the refactoring doesn't tend to be as bad as you might imagine. Once you make the first change, the type checker will point you to the next few, and so on. But suppose you have a reason to suspect from the start that you might need some extra capability to make a function go. A common way to do this (called mtl-style, after the monad transformer library) is to express your needs in a constraint.
class Monad m => MonadFetch m where
fetch :: String -> m String
a :: MonadFetch m => String -> m String
a s = fmap (\x -> x ++ "!") (b s)
b :: MonadFetch m => String -> m String
b s = fmap (\x -> '!':x) (fetch s)
instance MonadFetch IO where
-- fetch :: String -> IO String
fetch s = do
x <- getLine
return $ s ++ x
instance MonadFetch Identity where
-- fetch :: String -> Identity String
fetch = Identity . reverse
You're no longer tied to a particular monad: you just need one that can fetch. Code operating on an arbitrary MonadFetch instance is pure, except that it can fetch.

Using a Monadic eDSL from the REPL

Say I have created myself an embedded domain specific language in Haskell using a monad. For example a simple language that lets you push and pop values on a stack, implemented using the state monad:
type DSL a = State [Int] a
push :: Int -> DSL ()
pop :: DSL Int
Now I can write small stack manipulation programs using do notation:
program = do
push 10
push 20
a <- pop
push (5*a)
return a
However, I would really like to use my DSL interactively from a REPL (GHCi in particular, willing to use other if it would help).
Unfortunately having a session like:
>push 10
>pop
10
>push 100
Does not immediately work, which is probably rather reasonable. However I really think being able to do something with a similar feel to that would be cool. The way the state monad work does not lend itself easily to this. You need to build up your DSL a type and then evaluate it.
Is there a way to do something like this. Incrementally using a monad in the REPL?
I have been looking at things like operational, MonadPrompt, and MonadCont which I sort of get the feeling maybe could be used to do something like this. Unfortunately none of the examples I have seen addresses this particular problem.
Another possibility is to re-simulate the whole history each time you do anything. This will work for any pure monad. Here's an extemporaneous library for it:
{-# LANGUAGE RankNTypes #-}
import Data.IORef
import Data.Proxy
newtype REPL m f = REPL { run :: forall a. m a -> IO (f a) }
newREPL :: (Monad m) => Proxy m -> (forall a. m a -> f a) -> IO (REPL m f)
newREPL _ runM = do
accum <- newIORef (return ())
return $ REPL (\nextAction -> do
actions <- readIORef accum
writeIORef accum (actions >> nextAction >> return ())
return (runM (actions >> nextAction)))
Basically, it stores all the actions run thus far in an IORef, and each time you do something it adds to the list of actions and runs it from the top.
To create a repl, use newREPL, passing it a Proxy for the monad and a "run" function that gets you out of the monad. The reason the run function has type m a -> f a instead of m a -> a is so that you can include extra information in the output -- for example, you might want to view the current state, too, in which case you could use an f like:
data StateOutput a = StateOutput a [Int]
deriving (Show)
But I have just used it with Identity which does nothing special.
The Proxy argument is so that ghci's defaulting doesn't bite us when we create a new repl instance.
Here's how you use it:
>>> repl <- newREPL (Proxy :: Proxy DSL) (\m -> Identity (evalState m []))
>>> run repl $ push 1
Identity ()
>>> run repl $ push 2
Identity ()
>>> run repl $ pop
Identity 2
>>> run repl $ pop
Identity 1
If the extra Identity line noise bothers you, you could use your own functor:
newtype LineOutput a = LineOutput a
instance (Show a) => Show (LineOutput a) where
show (LineOutput x) = show x
There was one small change I had to make -- I had to change
type DSL a = State [Int] a
to
type DSL = State [Int]
because you can't use type synonyms that are not fully applied, like when I said Proxy :: DSL. The latter, I think, is more idiomatic anyway.
To an extent.
I don't believe it can be done for arbitrary Monads/instruction sets, but here's something that would work for your example. I'm using operational with an IORef to back the REPL state.
data DSLInstruction a where
Push :: Int -> DSLInstruction ()
Pop :: DSLInstruction Int
type DSL a = Program DSLInstruction a
push :: Int -> DSL ()
push n = singleton (Push n)
pop :: DSL Int
pop = singleton Pop
-- runDslState :: DSL a -> State [Int] a
-- runDslState = ...
runDslIO :: IORef [Int] -> DSL a -> IO a
runDslIO ref m = case view m of
Return a -> return a
Push n :>>= k -> do
modifyIORef ref (n :)
runDslIO ref (k ())
Pop :>>= k -> do
n <- atomicModifyIORef ref (\(n : ns) -> (ns, n))
runDslIO ref (k n)
replSession :: [Int] -> IO (Int -> IO (), IO Int)
replSession initial = do
ref <- newIORef initial
let pushIO n = runDslIO ref (push n)
popIO = runDslIO ref pop
(pushIO, popIO)
Then you can use it like:
> (push, pop) <- replSession [] -- this shadows the DSL push/pop definitions
> push 10
> pop
10
> push 100
It should be straightforward to use this technique for State/Reader/Writer/IO-based DSLs. I don't expect it to work for everything though.

Understanding Purescript Eff Monad and do blocks

I'm trying to understand why the following does not work in Purescript. I have a feeling it can also be answered by the Haskell community, thus I've cross listed it.
The general gist is:
If I have a do block, can I not throw in a disposable value? In this instance, I'm trying to log something (similar to Haskell's print) in the middle of a series of monadic computations.
main = do
a <- someAction1
b <- someAction2
_ <- log "here is a statement I want printed"
someAction3 a b
Specifically, I have a function which takes the following (from the Halogen example template project)
data Query a = ToggleState a
eval :: Query ~> H.ComponentDSL State Query g
eval (Toggle next) = do
H.modify (\state -> state { isOn = not state.isOn })
_ <- log "updating the state!"
pure next
In my mind, this should work like in Haskell
barf :: IO Int
barf = do
_ <- print "Here I am!"
return 5
main :: IO ()
main = do
a <- barf
_ <- print $ "the value is: " ++ (show a)
print "done"
Specifically, the error that I get is type mismatch of the monads
Could not match type Eff with type Free while trying to match type Eff ( "console" :: CONSOLE | t6 ) with type Free (HalogenFP t0 { "isOn" :: t1 | t2 } t3 t4) ... etc...
I know purescript makes me declare the "things I'm touching in the monad" (i.e. forall e. Eff ( a :: SOMEVAR, b :: SOMEOTHERVAR | eff ) Unit, but I'm not sure how to do that in this case...
If you're working with version 0.12.0 of halogen you should be able to use fromEff from https://pursuit.purescript.org/packages/purescript-aff-free/3.0.0/docs/Control.Monad.Aff.Free#v:fromEff like so:
data Query a = ToggleState a
eval :: Query ~> H.ComponentDSL State Query g
eval (Toggle next) = do
H.modify (\state -> state { isOn = not state.isOn })
_ <- H.fromEff (log "updating the state!")
pure next
This is going to get a lot nicer in upcoming versions of halogen (>= 0.13) in which liftEff should be enough.
The reason for why you can't just use log right away, is that H.ComponentDSL is not a type synonym for Eff, but for Free and so you can't simply mix Eff and ComponentDSL actions.

How can monads determine ordering if their information is lost upon normalization?

If I understood correctly, a monad is just the implementation of a bind >>= and a return operator following certain rules which basically compose 2 functions of different return types together. So, for example, those are equivalent:
putStrLn "What is your name?"
>>= (\_ -> getLine)
>>= (\name -> putStrLn ("Welcome, " ++ name ++ "!"))
(bind (putStrLn "What is your name?")
(bind
(\_ -> getLine)
(\name -> putStrLn ("Welcome, " ++ name ++ "!"))))
But if we strongly normalize this expression, the final result will be just:
(putStrLn ("Welcome, " ++ getline ++ "!"))
The first statement (putStrLn "What is your name?") is completely lost. Also, getLine looks like a function with no arguments, which is nonsense. So how does this work, and what is the actual definition of the >>= and return functions?
Your logical misstep is that you assume certain reduction rules hold which do not. In particular, you appear to be using
f >>= (\x -> g x) ==== g f
If that held then, yes, monads would be pretty silly: (>>=) would just be flip ($). But it doesn't, in general, hold at all. In fact, the very reason it doesn't hold is what provides monads an opportunity to be interesting.
For a little bit of further exploration, here's the one monad where (>>=) == flip ($) (basically) holds.
newtype Identity a = Identity { unIdentity :: a }
To make our equations work out, we'll have to use that Identity a ~ a. This isn't strictly true, obviously, but let's pretend. In particular, Identity . unIdentity and unIdentity . Identity are both identities, no-ops, and we can freely apply Identity or unIdentity however we like to make types match
instance Functor Identity where
fmap f (Identity a) = Identity (f a)
instance Monad Identity where
return a = Identity a -- this is a no-op
ida >>= f = f (unIdentity ida)
Now, in particular, we want to examine
ida :: Identity a
f :: a -> b
ida >>= Identity . f :: Identity b
===
Identity (f (unIdentity ida)) :: Identity b
and if we throw away the Identity/unIdentity noise and thus produce the knowledge that ida = Identity a for some a
Identity (f (unIdentity ida)) :: Identity b
===
Identity (f a) :: Identity b
=== ~
f a :: b
So, while (>>=) == flip ($) forms a certain basis of intuition about (>>=)... in any circumstance more interesting than the Identity monad (and all other monads are) it doesn't hold exactly.
Seems to be a misunderstanding of how evaluation in IO proceeds in Haskell. If you look at the type signature for (>>=):
λ: :t (>>=)
(>>=) :: Monad m => m a -> (a -> m b) -> m b
It takes a monadic value parameterized by a type a, and a function which accepts a type of the same type and applies it inside the function body yielding a monadic value of type b.
The IO monad itself is a rather degenerate monad since it has special status in Haskell's implementation. A type of IO a stands for a potentially impure computation which, when performed, does some IO before returning a value of type a.
The first statement (putStrLn "What is your name?") is completely
lost.
The misunderstanding about this statement is that the value of putStrLn :: String -> IO () does in fact lose it's value in some sense, or more precisely it just yields the unit type () to the bound function after performing the IO action of printing a string to the outside world.
But if we strongly normalize this expression, the final result will be
just: (putStrLn ("Welcome, " ++ getline ++ "!"))
It's best to think of getLine :: IO String as being a computation yielding a value instead of a value itself. In this case as well the function getLine is not itself substituted in but the result of the computation it performs is, which behaves like you expect it to: getting a value from stdin and printing it back out.
It has been so long til I asked that question! The simple answer is that, no, the term I posted does not reduce to putStrLn ("Welcome, " ++ getline ++ "!"). Instead, its normal form will have the shape bind foo (\ _ -> bind bar (\ _ -> ...)), i.e., a chain of lambdas, which holds the ordering information I was worried about.
[...] what are the actual definitions for the (>>=) and return functions?
From section 6.1.7 (page 75) of the Haskell 2010 report:
The IO type serves as a tag for operations (actions) that interact with the outside world. The IO type is abstract: no constructors are visible to the user. IO is an instance of the Monad and Functor classes.
the crucial point being:
The IO type is abstract: no constructors are visible to the user.
There are no actual (written in idiomatic Haskell) definitions - it's the implementors' choice as to which model to use: state-threading, continuations, direct effects, etc. (This wasn't always the case - I provide more details here :-) We also benefit, as we're able to choose the most convenient model for the investigation being made.
So how does this work [...]?
I will choose the direct-effect model, based on examples from Philip Wadler's How to Declare an Imperative:
(* page 26, modified *)
type 'a io = oi -> 'a
infix >>=
val >>= : 'a io * ('a -> 'b io) -> 'b io
fun m >>= k = fn Oblige => let
val x = m Oblige
val y = k x Oblige
in
y
end
val return : 'a -> 'a io
fun return x = fn Oblige => x
val putc : char -> unit io
fun putc c = fn Oblige => putcML c
val getc : char io
val getc = fn Oblige => getcML ()
I'm using a new type:
datatype oi = Oblige
to reserve the unit type and its value () for the usual purpose of vacuous
results, for clarity.
(Yes - that's Standard ML: just imagine it's 1997, and you're writing a
prototype Haskell implementation ;-)
With the help of some extra definitions:
val gets : (char list) io
val putsl : char list -> unit io
that Haskell code sample, modified slightly:
putStrLn "What is your name?" >>=
(\_ -> getLine >>=
(\name -> putStrLn (greet name)))
greet :: String -> String
greet name = "Welcome, " ++ name ++ "!"
translates to:
putsl "What is your name?"
>>= (fn _ => gets
>>= (fn name => putsl (greet name))
where:
val greet : char list -> char list
fun greet name = List.concat (String.explode "Welcome, "::name::[#"!"])
All going well, the sample should simplify down to:
fun Oblige => let
val x = putsl "What is your name?" Oblige
val name = gets Oblige
val y = putsl (greet name) Oblige
in
y
end
Even though x isn't used it's still evaluated in Standard ML, which causes the prompt "What is your name?" to be displayed.
Now for a guess at the next question...Standard ML and Haskell are both functional languages - could all that oi stuff be transferred across to Haskell?
I was wrong? Meh; I'll answer it anyway - sort of; you can read about what I devised over here. If that was just too abominable to contemplate...well, here are those extra Standard ML definitions:
(* from pages 25-26, verbatim *)
val putcML : char -> unit
fun putcML c = TextIO.output1(TextIO.stdOut,c);
val getcML : unit -> char
fun getcML () = valOf(TextIO.input1(TextIO.stdIn));
(* Caution: work of SML novice... *)
val gets = fn Oblige => let
val c = getc Oblige
in
if c = #"\n" then
[]
else
let
val cs = gets Oblige
in
(c::cs)
end
end
fun putsl cs = fn Oblige => let
val _ = putsl cs Oblige
val _ = putc #"\n" Oblige
in
()
end
val puts : char list -> unit io
fun puts cs = fn Oblige => case cs of
[] => ()
| (c::cs) => let val _ = putc c Oblige in
puts cs Oblige

Resources