Refactoring Haskell when adding IO - haskell

I have a concern regarding how far the introduction of IO trickles through a program. Say a function deep within my program is altered to include some IO; how do I isolate this change to not have to also change every function in the path to IO as well?
For instance, in a simplified example:
a :: String -> String
a s = (b s) ++ "!"
b :: String -> String
b s = '!':(fetch s)
fetch :: String -> String
fetch s = reverse s
main = putStrLn $ a "hello"
(fetch here could more realistically be reading a value from a static Map to give as its result)
But say if due to some business logic change, I needed to lookup the value returned by fetch in some database (which I can exemplify here with a call to getLine):
fetch :: String -> IO String
fetch s = do
x <- getLine
return $ s ++ x
So my question is, how to prevent having to rewrite every function call in this chain?
a :: String -> IO String
a s = fmap (\x -> x ++ "!") (b s)
b :: String -> IO String
b s = fmap (\x -> '!':x) (fetch s)
fetch :: String -> IO String
fetch s = do
x <- getLine
return $ s ++ x
main = a "hello" >>= putStrLn
I can see that refactoring this would be much simpler if the functions themselves did not depend on each other. That is fine for a simple example:
a :: String -> String
a s = s ++ "!"
b :: String -> String
b s = '!':s
fetch :: String -> IO String
fetch s = do
x <- getLine
return $ s ++ x
doit :: String -> IO String
doit s = fmap (a . b) (fetch s)
main = doit "hello" >>= putStrLn
but I don't know if that is necessarily practical in more complicated programs.
The only way I've found thus far to really isolate an IO addition like this is to use unsafePerformIO, but, by its very name, I don't want to do that if I can help it. Is there some other way to isolate this change? If the refactoring is substantial, I would start to feel inclined to avoid having to do it (especially under deadlines, etc).
Thanks for any advice!

Here are a few methods I use.
Reduce dependencies on effects by inverting control. (One of the methods you described in your question.) That is, execute the effects outside and pass the results (or functions with those results partially applied) into pure code. Instead of having main → a → b → fetch, have main → fetch and then main → a → b:
a :: String -> String
a f = b f ++ "!"
b :: String -> String
b f = '!' : f
fetch :: String -> IO String
fetch s = do
x <- getLine
return $ s ++ x
main = do
f <- fetch "hello"
putStrLn $ a f
For more complex cases of this, where you need to thread an argument to do this sort of “dependency injection” through many levels, Reader/ReaderT lets you abstract over the boilerplate.
Write pure code that you expect might need effects in monadic style from the start. (Polymorphic over the choice of monad.) Then if you do eventually need effects in that code, you don’t need to change the implementation, only the signature.
a :: (Monad m) => String -> m String
a s = (++ "!") <$> b s
b :: (Monad m) => String -> m String
b s = ('!' :) <$> fetch s
fetch :: (Monad m) => String -> m String
fetch s = pure (reverse s)
Since this code works for any m with a Monad instance (or in fact just Applicative), you can run it directly in IO, or purely with the “dummy” monad Identity:
main = putStrLn =<< a "hello"
main = putStrLn $ runIdentity $ a "hello"
Then as you need more effects, you can use “mtl style” (as #dfeuer’s answer describes) to enable effects on an as-needed basis, or if you’re using the same monad stack everywhere, just replace m with that concrete type, e.g.:
newtype Fetch a = Fetch { unFetch :: IO a }
deriving (Applicative, Functor, Monad, MonadIO)
a :: String -> Fetch String
a s = pure (b s ++ "!")
b :: String -> Fetch String
b s = ('!' :) <$> fetch s
fetch :: String -> Fetch String
fetch s = do
x <- liftIO getLine
return $ s ++ x
main = putStrLn =<< unFetch (a "hello")
The advantage of mtl style is that you can have multiple different implementations of your effects. That makes things like testing & mocking easy, since you can reuse the logic but run it with different “handlers” for production & testing. In fact, you can get even more flexibility (at the cost of some runtime performance) using an algebraic effects library such as freer-effects, which not only lets the caller change how each effect is handled, but also the order in which they’re handled.
Roll up your sleeves and do the refactoring. The compiler will tell you everywhere that needs to be updated anyway. After enough times doing this, you’ll naturally end up recognising when you’re writing code that will require this refactoring later, so you’ll consider effects from the beginning and not run into the problem.
You’re quite right to doubt unsafePerformIO! It’s not just unsafe because it breaks referential transparency, it’s unsafe because it can break type, memory, and concurrency safety as well—you can use it to coerce any type to any other, cause a segfault, or cause deadlocks and concurrency errors that would ordinarily be impossible. You’re telling the compiler that some code is pure, so it’s going to assume it can do all the transformations it does with pure code—such as duplicating, reordering, or even dropping it, which may completely change the correctness and performance of your code.
The main legitimate use cases for unsafePerformIO are things like using the FFI to wrap foreign code (that you know is pure), or doing GHC-specific performance hacks; stay away from it otherwise, since it’s not meant as an “escape hatch” for ordinary code.

First off, the refactoring doesn't tend to be as bad as you might imagine. Once you make the first change, the type checker will point you to the next few, and so on. But suppose you have a reason to suspect from the start that you might need some extra capability to make a function go. A common way to do this (called mtl-style, after the monad transformer library) is to express your needs in a constraint.
class Monad m => MonadFetch m where
fetch :: String -> m String
a :: MonadFetch m => String -> m String
a s = fmap (\x -> x ++ "!") (b s)
b :: MonadFetch m => String -> m String
b s = fmap (\x -> '!':x) (fetch s)
instance MonadFetch IO where
-- fetch :: String -> IO String
fetch s = do
x <- getLine
return $ s ++ x
instance MonadFetch Identity where
-- fetch :: String -> Identity String
fetch = Identity . reverse
You're no longer tied to a particular monad: you just need one that can fetch. Code operating on an arbitrary MonadFetch instance is pure, except that it can fetch.

Related

How to avoid the IO monad when solving arithmetic problems in SBV

I am trying to solve arithmetic problems with SBV.
For example
solution :: SymbolicT IO ()
solution = do
[x, y] <- sFloats ["x", "y"]
constrain $ x + y .<= 2
Main> s1 = sat solution
Main> s2 = isSatisfiable solution
Main> s1
Satisfiable. Model:
x = -1.2030502e-17 :: Float
z = -2.2888208e-37 :: Float
Main> :t s1
s1 :: IO SatResult
Main> s2
True
Main> :t s2
s2 :: IO Bool
While I can do useful things, it is easier for me to work with the pure value (SatResult or Bool) and not with the IO monad.
According to the documentation
sat :: Provable a => a -> IO SatResult
constrain :: SolverContext m => SBool -> m ()
sFloats :: [String] -> Symbolic [SFloat]
type Symbolic = SymbolicT IO
Given the type of functions I use, I understand why I always get to the IO monad.
But looking in the generalized versions of the functions for example sFloats.
sFloats :: MonadSymbolic m => [String] -> m [SFloat]
Depending on type of the function, I can work with a different monad than IO. This gives me hope that we will reach a more useful monad, the Identity monad for example.
Unfortunately looking at the examples always solves the problems within the IO monad, so I couldn't find any examples that would work for me.Besides that I don't have much experience working with monads.
Finally My question is:
Is there any way to avoid the IO monad when solving such a problem with SBV?
Thanks in advance
SBV calls out to the SMT solver of your choice (most likely z3, but others are available too), and presents the results back to you. This means that it performs IO under the hood, and thus you cannot be outside the IO monad. You can create custom monads using MonadSymbolic, but that will not get you out of the IO monad: Since the call to the SMT solver does IO you'll always be in IO.
(And I'd strongly caution against uses of unsafePerformIO as suggested in one of the comments. This is really a bad idea; and you can find lots more information on this elsewhere why you shouldn't do so.)
Note that this is no different than any other IO based computation in Haskell: You perform the IO "in-the-wrapper," but once you get your results, you can do whatever you'd like to do with them in a "pure" environment.
Here's a simple example:
import Data.SBV
import Data.SBV.Control
example :: IO ()
example = runSMT $ do
[x, y] <- sFloats ["x", "y"]
constrain $ x + y .<= 2
query $ do cs <- checkSat
case cs of
Unsat -> io $ putStrLn "Unsatisfiable"
Sat -> do xv <- getValue x
yv <- getValue y
let result = use xv yv
io $ putStrLn $ "Result: " ++ show result
_ -> error $ "Solver said: " ++ show cs
-- Use the results from the solver, in a purely functional way
use :: Float -> Float -> Float
use x y = x + y
Now you can say:
*Main> example
Result: -Infinity
The function example has type IO (), because it does involve calling out to the solver and getting the results. However, once you extract those results (via calls to getValue), you can pass them to the function use which has a very simple purely functional type. So, you keep the "wrapper" in the monad, but actual processing, use-of-the values, etc., remain in the pure world.
Alternatively, you can also extract the values and continue from there:
import Data.SBV
import Data.SBV.Control
example :: IO (Maybe (Float, Float))
example = runSMT $ do
[x, y] <- sFloats ["x", "y"]
constrain $ x + y .<= 2
query $ do cs <- checkSat
case cs of
Unsat -> pure Nothing
Sat -> do xv <- getValue x
yv <- getValue y
pure $ Just (xv, yv)
_ -> error $ "Solver said: " ++ show cs
Now you can say:
*Main> Just (a, b) <- example
*Main> a
-Infinity
*Main> b
4.0302105e-21
Long story short: Don't avoid the IO monad. It's there for a very good reason. Get into it, get your results out, and then the rest of your program can remain purely functional, or whatever other monad you might find yourself in.
Note that none of this is really SBV specific. This is the usual Haskell paradigm of how to use functions with side-effects. (For instance, anytime you use readFile to read the contents of a file to process it further.) Do not try to "get rid of the IO." Instead, simply work with it.
Depending on type of the function, I can work with a different monad than IO.
Not meaningfully different, in the sense you'd hope. Every instance of this class is going to be some transformed version of IO. Sorry!
Time to make a plan that involves understanding and working with IO.

Reader Monad - explanation of trivial case

I have been trying to get to grips with the reader monad and came across this tutorial. In it, the author presents this example:
example2 :: String -> String
example2 context = runReader (greet "James" >>= end) context
where
greet :: String -> Reader String String
greet name = do
greeting <- ask
return $ greeting ++ ", " ++ name
end :: String -> Reader String String
end input = do
isHello <- asks (== "Hello")
return $ input ++ if isHello then "!" else "."
I know that this is a trivial example that shows the mechanics, but I am trying to figure out why it would be better than doing something like:
example3 :: String -> String
example3 = end <*> (greet "James")
where
greet name input = input ++ ", " ++ name
end input = if input == "Hello" then (++ "!") else (++ ".")
Reader isn't often used by itself in real code. As you have observed, it's not really better than just passing an extra argument to your functions. However, as part of a monad transformer it is an excellent way to pass configuration parameters through your application. Usually this is done by adding a MonadReader constraint to any function that needs access to configuration.
Here's an attempt at a more real-world example:
data Config = Config
{ databaseConnection :: Connection
, ... other configuration stuff
}
getUser :: (MonadReader Config m, MonadIO m) => UserKey -> m User
getUser x = do
db <- asks databaseConnection
.... fetch user from database using the connection
then your main would look something like:
main :: IO ()
main = do
config <- .... create the configuration
user <- runReaderT (getUser (UserKey 42)) config
print user
dfeuer, chi and user2297560 are right in that "Reader isn't often used by itself in real code". It is worth noting, though, that there is next to no essential difference between what you do in the second snippet in the question and actually using Reader as a monad: the function functor is just Reader without the wrappers, and the Monad and Applicative instances for both of them are equivalent. By the way, outside of highly polymorphic code1, the typical motivation for using the function Applicative is making code more pointfree. In that case, moderation is highly advisable. For instance, as far as my own taste goes, this...
(&&) <$> isFoo <*> isBar
... is fine (and sometimes it might even read nicer than the pointful spelling), while this...
end <*> greet "James"
... is just confusing.
Footnotes
For instance, as Carl points out in a comment, it and the related instances can be useful in...
[...] places where you have code that's polymorphic in a type constructor and your use case is passing an argument in. This can come up when using the polymorphic types offered by lenses, for instance.

bind a monadic value (m2 a) inside some other monad m1

Working in a Coding Dojo today I tried the following
example :: IO ()
example = do input <- getLine
parsed <- parseOnly parser input
...
where parseOnly :: Parser a -> Either String a (from attoparsec) of course the compiler complained that Either .. is not IO .. essentially telling me I am mixing monads.
Of course this can be solved by
case parseOnly parser input of .. -> ..
which is kind of unelegant, I think. Also my guess is that somebody else had this problem earlier and the solution I think is related to monad transformers, but the last bits I cannot piece together.
It also reminded me of liftIO - but this is the other way around I think which solves the problem of lifting an IO action happening inside some surrounding monad (more precisely MonadIO - say for example inside Snap when one wants to print something to stdout while getting some http).
More general this problem seems to be for a Monad m1 and a (different) Monad m2 how can I do something like
example = do a <- m1Action
b <- m2Action
..
You can't, in general. The whole do block has to be one specific monad (because example needs to have some specific type). If you could bind an arbitrary other monad inside that do block, you'd have unsafePerformIO.
Monad transformers allow you to produce one monad combining the kinds of things multiple other monads can do. But you have to decide that all the actions in your do block use the same monad transformer stack to use those, they're not a way to arbitrarily switch monads mid do-block.
Your solution with case only works because you've got a particular known monad (Either) that has a way of extracting values from inside it. Not all monads provide this, so it's impossible to build a general solution without knowing the particular monads involved. That's why the do block syntax doesn't provide such a shortcut.
In general, monad transformers are for this kind of interleaving. You can use ExceptT
example :: IO (Either String ())
example = runExceptT $ do
input <- liftIO getLine
parsed <- parseOnly parser input
...
Note that parseOnly must return ExceptT String IO a for some a. Or better ExceptT String m a for any m. Or if you want parseOnly to return Either String a it's
example :: IO (Either String ())
example = runExceptT $ do
input <- lift getLine
parsed <- ExceptT $ return $ parseOnly parser input
...
But I think all you need is just
eitherToIO :: Either String a -> IO a
eitherToIO (Left s) = error s
eitherToIO (Right x) = return x
parseOnly :: ... -> String -> Either String Int
example :: IO ()
example = do
input <- getLine
parsed <- eitherToIO $ parseOnly parser input
...
You need to make that expression type check; just as in pure code. Here,
... = do a <- act1 -- m1 monad
b <- act2 -- m2 monad
...
de-sugars to:
... = act1 >>= (\a -> act2 >>= \b -> ...)
>>= is of signature:
(>>=) :: Monad m => m a -> (a -> m b) -> m b
the outer bind is specialized with m1, so it expects that the expression inside parenthesis be of type: a -> m1 b, whereas the inner bind is specialized with m2, so the expression inside parenthesis will be of type a -> m2 b:
-- outer bind expects ( \a -> m1 b )
act1 >>= (\a -> act2 >>= \b -> ...)
-- inner bind results ( \a -> m2 b )
for this to type check, you need a function of signature m2 b -> m1 b in between the two; that is what lift does for a certain class of m2 and m1 monads: namely m1 ~ t m2 where t is an instance of MonadTrans:
lift :: (Monad m, MonadTrans t) => m a -> t m a

Get value from IO rather than the computation itself

Being quite new to Haskell, I'm currently trying to improve my skills by writing an interpreter for a simple imperative toy language.
One of the expressions in this language is input, which reads a single integer from standard input. However, when I assign the value of this expression to a variable and then use this variable later, it seems ot me that I actually stored the computation of reading a value rather the read value itself. This means that e.g. the statements
x = input;
y = x + x;
will cause the interpreter to invoke the input procedure three times rather than one.
Internally in the evaluator module, I use a Map to store the values of variables. Because I need to deal with IO, this gets wrapped in an IO monad, as immortalized in the following minimal example:
import qualified Data.Map as Map
type State = Map.Map String Int
type Op = Int -> Int -> Int
input :: String -> IO State -> IO State
input x state = do line <- getLine
st <- state
return $ Map.insert x (read line) st
get :: String -> IO State -> IO Int
get x state = do st <- state
return $ case Map.lookup x st of
Just i -> i
eval :: String -> Op -> String -> IO State -> IO Int
eval l op r state = do i <- get l state
j <- get r state
return $ op i j
main :: IO ()
main = do let state = return Map.empty
let state' = input "x" state
val <- eval "x" (+) "x" state'
putStrLn . show $ val
The second line in the main function simulates the assignment of x, while the third line simulates the evaluation of the binary + operator.
My question is: How do I get around this, such that the code above only inputs once? I suspect that it is the IO-wrapping that causes the problem, but as we're dealing with IO I see no way out of that..?
Remember that IO State is not an actual state, but instead the specification for an IO machine which eventually produces a State. Let's consider input as an IO-machine transformer
input :: String -> IO State -> IO State
input x state = do line <- getLine
st <- state
return $ Map.insert x (read line) st
Here, provided a machine for producing a state, we create a bigger machine which takes that passed state and adding a read from an input line. Again, to be clear, input name st is an IO-machine which is a slight modification of the IO-machine st.
Let's now examine get
get :: String -> IO State -> IO Int
get x state = do st <- state
return $ case Map.lookup x st of
Just i -> i
Here we have another IO-machine transformer. Given a name and an IO-machine which produces a State, get will produce an IO-machine which returns a number. Note again that get name st is fixed to always use the state produced by the (fixed, input) IO-machine st.
Let's combine these pieces in eval
eval :: String -> Op -> String -> IO State -> IO Int
eval l op r state = do i <- get l state
j <- get r state
return $ op i j
Here we call get l and get r each on the same IO-machine state and thus produce two (completely independent) IO-machines get l state and get r state. We then evaluate their IO effects one after another and return the op-combination of their results.
Let's examine the kinds of IO-machines built in main. In the first line we produce a trivial IO-machine, called state, written return Map.empty. This IO-machine, each time it's run, performs no side effects in order to return a fresh, blank Map.Map.
In the second line, we produce a new kind of IO-machine called state'. This IO-machine is based off of the state IO-machine, but it also requests an input line. Thus, to be clear, each time state' runs, a fresh Map.Map is generated and then an input line is read to read some Int, stored at "x".
It should be clear where this is going, but now when we examine the third line we see that we pass state', the IO-machine, into eval. Previously we stated that eval runs its input IO-machine twice, once for each name, and then combines the results. By this point it should be clear what's happening.
All together, we build a certain kind of machine which draws input and reads it as an integer, assigning it to a name in a blank Map.Map. We then build this IO-machine into a larger one which uses the first IO-machine twice, in two separate invocations, in order to collect data and combine it with an Op.
Finally, we run this eval machine using do notation (the (<-) arrow indicates running the machine). Clearly it should collect two separate lines.
So what do we really want to do? Well, we need to simulate ambient state in the IO monad, not just pass around Map.Maps. This is easy to do by using an IORef.
import Data.IORef
input :: IORef State -> String -> IO ()
input ref name = do
line <- getLine
modifyIORef ref (Map.insert name (read line))
eval :: IORef State -> Op -> String -> String -> IO Int
eval ref op l r = do
stateSnapshot <- readIORef ref
let Just i = Map.lookup l stateSnapshot
Just j = Map.lookup l stateSnapshot
return (op i j)
main = do
st <- newIORef Map.empty -- create a blank state, embedded into IO, not a value
input st "x" -- request input *once*
val <- eval st (+) "x" "x" -- compute the op
putStrLn . show $ val
It's fine to wrap your actions such as getLine in IO, but to me it looks like your problem is that you're trying to pass your state in the IO monad. Instead, I think this is probably time you get introduced to monad transformers and how they'll let you layer the IO and State monads to get the functionality of both in one.
Monad transformers are a pretty complex topic and it'll take a while to get to where you're comfortable with them (I'm still learning new things all the time about them), but they're a very useful tool when you need to layer multiple monads. You'll need the mtl library to follow this example.
First, imports
import qualified Data.Map as Map
import Control.Monad.State
Then types
type Op = Int -> Int -> Int
-- Renamed to not conflict with Control.Monad.State.State
type AppState = Map.Map String Int
type Interpreter a = StateT AppState IO a
Here Interpreter is the Monad in which we'll build our interpreter. We also need a way to run the interpreter
-- A utility function for kicking off an interpreter
runInterpreter :: Interpreter a -> IO a
runInterpreter interp = evalStateT interp Map.empty
I figured defaulting to Map.empty was sufficient.
Now, we can build our interpreter actions in our new monad. First we start with input. Instead of returning our new state, we just modify what is current in our map:
input :: String -> Interpreter ()
input x = do
-- IO actions have to be passed to liftIO
line <- liftIO getLine
-- modify is a member of the MonadState typeclass, which StateT implements
modify (Map.insert x (read line))
I had to rename get so that it didn't conflict with get from Control.Monad.State, but it does basically the same thing as before, it just takes our map and looks up that variable in it.
-- Had to rename to not conflict with Control.Monad.State.get
-- Also returns Maybe Int because it's safer
getVar :: String -> Interpreter (Maybe Int)
getVar x = do
-- get is a member of MonadState
vars <- get
return $ Map.lookup x vars
-- or
-- get x = fmap (Map.lookup x) get
Next, eval now just looks up each variable in our map, then uses liftM2 to keep the return value as Maybe Int. I prefer the safety of Maybe, but you can rewrite it if you prefer
eval :: String -> Op -> String -> Interpreter (Maybe Int)
eval l op r = do
i <- getVar l
j <- getVar r
-- liftM2 op :: Maybe Int -> Maybe Int -> Maybe Int
return $ liftM2 op i j
Finally, we write our sample program. It stores user input to the variable "x", adds it to itself, and prints out the result.
-- Now we can write our actions in our own monad
program :: Interpreter ()
program = do
input "x"
y <- eval "x" (+) "x"
case y of
Just y' -> liftIO $ putStrLn $ "y = " ++ show y'
Nothing -> liftIO $ putStrLn "Error!"
-- main is kept very simple
main :: IO ()
main = runInterpreter program
The basic idea is that there is a "base" monad, here IO, and these actions are "lifted" up to the "parent" monad, here StateT AppState. There is a typeclass implementation for the different state operations get, put, and modify in the MonadState typeclass, which StateT implements, and in order to lift IO actions there's a pre-made liftIO function that "lifts" IO actions to the parent monad. Now we don't have to worry about passing around our state explicitly, we can still perform IO, and it has even simplified the code!
I would recommend reading the Real World Haskell chapter on monad transformers to get a better feel for them. There are other useful ones as well, such as ErrorT for handling errors, ReaderT for static configuration, WriterT for aggregating results (usually used for logging), and many others. These can be layered into what is called a transformer stack, and it's not too difficult to make your own either.
Instead of passing an IO State, you can pass State and then use higher-level functions to deal with IO. You can go further and make get and eval free from side-effects:
input :: String -> State -> IO State
input x state = do
line <- getLine
return $ Map.insert x (read line) state
get :: String -> State -> Int
get x state = case Map.lookup x state of
Just i -> i
eval :: String -> Op -> String -> State -> Int
eval l op r state = let i = get l state
j = get r state
in op i j
main :: IO ()
main = do
let state = Map.empty
state' <- input "x" state
let val = eval "x" (+) "x" state'
putStrLn . show $ val
If you're actually building an interpreter, you'll presumably have a list of instructions to execute at some point.
This is my rough translation of your code (although I'm only a beginner myself)
import Data.Map (Map, empty, insert, (!))
import Control.Monad (foldM)
type ValMap = Map String Int
instrRead :: String -> ValMap -> IO ValMap
instrRead varname mem = do
putStr "Enter an int: "
line <- getLine
let intval = (read line)::Int
return $ insert varname intval mem
instrAdd :: String -> String -> String -> ValMap -> IO ValMap
instrAdd varname l r mem = do
return $ insert varname result mem
where result = (mem ! l) + (mem ! r)
apply :: ValMap -> (ValMap -> IO ValMap) -> IO ValMap
apply mem instr = instr mem
main = do
let mem0 = empty
let instructions = [ instrRead "x", instrAdd "y" "x" "x" ]
final <- foldM apply mem0 instructions
print (final ! "y")
putStrLn "done"
The foldM applies a function (apply) to a start value (mem0) and a list (instructions) but does so within a monad.

How can monads determine ordering if their information is lost upon normalization?

If I understood correctly, a monad is just the implementation of a bind >>= and a return operator following certain rules which basically compose 2 functions of different return types together. So, for example, those are equivalent:
putStrLn "What is your name?"
>>= (\_ -> getLine)
>>= (\name -> putStrLn ("Welcome, " ++ name ++ "!"))
(bind (putStrLn "What is your name?")
(bind
(\_ -> getLine)
(\name -> putStrLn ("Welcome, " ++ name ++ "!"))))
But if we strongly normalize this expression, the final result will be just:
(putStrLn ("Welcome, " ++ getline ++ "!"))
The first statement (putStrLn "What is your name?") is completely lost. Also, getLine looks like a function with no arguments, which is nonsense. So how does this work, and what is the actual definition of the >>= and return functions?
Your logical misstep is that you assume certain reduction rules hold which do not. In particular, you appear to be using
f >>= (\x -> g x) ==== g f
If that held then, yes, monads would be pretty silly: (>>=) would just be flip ($). But it doesn't, in general, hold at all. In fact, the very reason it doesn't hold is what provides monads an opportunity to be interesting.
For a little bit of further exploration, here's the one monad where (>>=) == flip ($) (basically) holds.
newtype Identity a = Identity { unIdentity :: a }
To make our equations work out, we'll have to use that Identity a ~ a. This isn't strictly true, obviously, but let's pretend. In particular, Identity . unIdentity and unIdentity . Identity are both identities, no-ops, and we can freely apply Identity or unIdentity however we like to make types match
instance Functor Identity where
fmap f (Identity a) = Identity (f a)
instance Monad Identity where
return a = Identity a -- this is a no-op
ida >>= f = f (unIdentity ida)
Now, in particular, we want to examine
ida :: Identity a
f :: a -> b
ida >>= Identity . f :: Identity b
===
Identity (f (unIdentity ida)) :: Identity b
and if we throw away the Identity/unIdentity noise and thus produce the knowledge that ida = Identity a for some a
Identity (f (unIdentity ida)) :: Identity b
===
Identity (f a) :: Identity b
=== ~
f a :: b
So, while (>>=) == flip ($) forms a certain basis of intuition about (>>=)... in any circumstance more interesting than the Identity monad (and all other monads are) it doesn't hold exactly.
Seems to be a misunderstanding of how evaluation in IO proceeds in Haskell. If you look at the type signature for (>>=):
λ: :t (>>=)
(>>=) :: Monad m => m a -> (a -> m b) -> m b
It takes a monadic value parameterized by a type a, and a function which accepts a type of the same type and applies it inside the function body yielding a monadic value of type b.
The IO monad itself is a rather degenerate monad since it has special status in Haskell's implementation. A type of IO a stands for a potentially impure computation which, when performed, does some IO before returning a value of type a.
The first statement (putStrLn "What is your name?") is completely
lost.
The misunderstanding about this statement is that the value of putStrLn :: String -> IO () does in fact lose it's value in some sense, or more precisely it just yields the unit type () to the bound function after performing the IO action of printing a string to the outside world.
But if we strongly normalize this expression, the final result will be
just: (putStrLn ("Welcome, " ++ getline ++ "!"))
It's best to think of getLine :: IO String as being a computation yielding a value instead of a value itself. In this case as well the function getLine is not itself substituted in but the result of the computation it performs is, which behaves like you expect it to: getting a value from stdin and printing it back out.
It has been so long til I asked that question! The simple answer is that, no, the term I posted does not reduce to putStrLn ("Welcome, " ++ getline ++ "!"). Instead, its normal form will have the shape bind foo (\ _ -> bind bar (\ _ -> ...)), i.e., a chain of lambdas, which holds the ordering information I was worried about.
[...] what are the actual definitions for the (>>=) and return functions?
From section 6.1.7 (page 75) of the Haskell 2010 report:
The IO type serves as a tag for operations (actions) that interact with the outside world. The IO type is abstract: no constructors are visible to the user. IO is an instance of the Monad and Functor classes.
the crucial point being:
The IO type is abstract: no constructors are visible to the user.
There are no actual (written in idiomatic Haskell) definitions - it's the implementors' choice as to which model to use: state-threading, continuations, direct effects, etc. (This wasn't always the case - I provide more details here :-) We also benefit, as we're able to choose the most convenient model for the investigation being made.
So how does this work [...]?
I will choose the direct-effect model, based on examples from Philip Wadler's How to Declare an Imperative:
(* page 26, modified *)
type 'a io = oi -> 'a
infix >>=
val >>= : 'a io * ('a -> 'b io) -> 'b io
fun m >>= k = fn Oblige => let
val x = m Oblige
val y = k x Oblige
in
y
end
val return : 'a -> 'a io
fun return x = fn Oblige => x
val putc : char -> unit io
fun putc c = fn Oblige => putcML c
val getc : char io
val getc = fn Oblige => getcML ()
I'm using a new type:
datatype oi = Oblige
to reserve the unit type and its value () for the usual purpose of vacuous
results, for clarity.
(Yes - that's Standard ML: just imagine it's 1997, and you're writing a
prototype Haskell implementation ;-)
With the help of some extra definitions:
val gets : (char list) io
val putsl : char list -> unit io
that Haskell code sample, modified slightly:
putStrLn "What is your name?" >>=
(\_ -> getLine >>=
(\name -> putStrLn (greet name)))
greet :: String -> String
greet name = "Welcome, " ++ name ++ "!"
translates to:
putsl "What is your name?"
>>= (fn _ => gets
>>= (fn name => putsl (greet name))
where:
val greet : char list -> char list
fun greet name = List.concat (String.explode "Welcome, "::name::[#"!"])
All going well, the sample should simplify down to:
fun Oblige => let
val x = putsl "What is your name?" Oblige
val name = gets Oblige
val y = putsl (greet name) Oblige
in
y
end
Even though x isn't used it's still evaluated in Standard ML, which causes the prompt "What is your name?" to be displayed.
Now for a guess at the next question...Standard ML and Haskell are both functional languages - could all that oi stuff be transferred across to Haskell?
I was wrong? Meh; I'll answer it anyway - sort of; you can read about what I devised over here. If that was just too abominable to contemplate...well, here are those extra Standard ML definitions:
(* from pages 25-26, verbatim *)
val putcML : char -> unit
fun putcML c = TextIO.output1(TextIO.stdOut,c);
val getcML : unit -> char
fun getcML () = valOf(TextIO.input1(TextIO.stdIn));
(* Caution: work of SML novice... *)
val gets = fn Oblige => let
val c = getc Oblige
in
if c = #"\n" then
[]
else
let
val cs = gets Oblige
in
(c::cs)
end
end
fun putsl cs = fn Oblige => let
val _ = putsl cs Oblige
val _ = putc #"\n" Oblige
in
()
end
val puts : char list -> unit io
fun puts cs = fn Oblige => case cs of
[] => ()
| (c::cs) => let val _ = putc c Oblige in
puts cs Oblige

Resources