What's the meaning of IO actions within pure functions? - haskell

I thought that in principle Haskell's type system would forbid calls to impure functions (i.e. f :: a -> IO b) from pure ones, but today I realized that by calling them with return they compile just fine. In this example:
h :: Maybe ()
h = do
return $ putStrLn "???"
return ()
h works in the Maybe monad, but it's a pure function nevertheless. Compiling and running it simply returns Just () as one would expect, without actually doing any I/O. I think Haskell's laziness puts the things together (i.e. putStrLn's return value is not used - and can't since its value constructors are hidden and I can't pattern match against it), but why is this code legal? Are there any other reasons that makes this allowed?
As a bonus, related question: in general, is it possible to forbid at all the execution of actions of a monad from within other ones, and how?

IO actions are first-class values like any other; that's what makes Haskell's IO so expressive, allowing you to build higher-order control structures (like mapM_) from scratch. Laziness isn't relevant here,1 it's just that you're not actually executing the action. You're just constructing the value Just (putStrLn "???"), then throwing it away.
putStrLn "???" existing doesn't cause a line to be printed to the screen. By itself, putStrLn "???" is just a description of some IO that could be done to cause a line to be printed to the screen. The only execution that happens is executing main, which you constructed from other IO actions, or whatever actions you type into GHCi. For more information, see the introduction to IO.
Indeed, it's perfectly conceivable that you might want to juggle about IO actions inside Maybe; imagine a function String -> Maybe (IO ()), which checks the string for validity, and if it's valid, returns an IO action to print some information derived from the string. This is possible precisely because of Haskell's first-class IO actions.
But a monad has no ability to execute the actions of another monad unless you give it that ability.
1 Indeed, h = putStrLn "???" `seq` return () doesn't cause any IO to be performed either, even though it forces the evaluation of putStrLn "???".

Let's desugar!
h = do return (putStrLn "???"); return ()
-- rewrite (do foo; bar) as (foo >> do bar)
h = return (putStrLn "???") >> do return ()
-- redundant do
h = return (putStrLn "???") >> return ()
-- return for Maybe = Just
h = Just (putStrLn "???") >> Just ()
-- replace (foo >> bar) with its definition, (foo >>= (\_ -> bar))
h = Just (putStrLn "???") >>= (\_ -> Just ())
Now, what happens when you evaluate h?* Well, for Maybe,
(Just x) >>= f = f x
Nothing >>= f = Nothing
So we pattern match the first case
f x
-- x = (putStrLn "???"), f = (\_ -> Just ())
(\_ -> Just ()) (putStrLn "???")
-- apply the argument and ignore it
Just ()
Notice how we never had to perform putStrLn "???" in order to evaluate this expression.
*n.b. It is somewhat unclear at which point "desugaring" stops and "evaluation" begins. It depends on your compiler's inlining decisions. Pure computations could be evaluated entirely at compile time.

Related

How can I get this currying experiment to behave as expected?

I am writing a currying experiment to get a feel for how multiple statements in haskell are chained together to work one after another.
Here is what I got so far
testCurry :: IO ()
testCurry =
(\b ->
(\_ -> putStrLn b)
((\a ->
putStrLn a
) (show 2))
) (show 3)
testCurryExpected :: IO ()
testCurryExpected = do {
a <- return (show 2);
putStrLn a;
b <- return (show 3);
putStrLn b;
}
main :: IO ()
main =
putStrLn "expected: " >>
testCurryExpected >>
putStrLn "given: " >>
testCurry
I know it works if I do it this way:
testCurry :: IO ()
testCurry =
(\b ->
(\next -> next >> putStrLn b)
((\a ->
putStrLn a
) (show 2))
) (show 3)
testCurryExpected :: IO ()
testCurryExpected = do {
a <- return (show 2);
putStrLn a;
b <- return (show 3);
putStrLn b;
}
main :: IO ()
main =
putStrLn "expected: " >>
testCurryExpected >>
putStrLn "given: " >>
testCurry
But I don't know how to simulate the ">>"(then) behavior only using functions.
I know a >> b is defined in terms of a >>= \_ -> b, but I am not sure how >>= is defined in terms of IO a >>= IO b but don't know how to translate this into raw function composition.
Can somebody please help me get this experiment to work?
In short, I want to know if there is a way to do this without >> or >>= operators nor wrapping these operators.
(\a -> \b -> a >> b)(putStrLn "one")(putStrLn "two")
Note: For the sake of concept, I restrict myself to using anonymous functions of at most one argument.
Edit: I found a good-enough solution by creating my own Free representation of putStrLn called Free_PutStrLn that is free of interpretation; using Lists to construct the operation chain, then evaluate it myself later.
data Free_PutStrLn = Free_PutStrLn String deriving Show
eval :: [Free_PutStrLn] -> IO ()
eval a =
foldl (\a -> \b ->
let deconstruct (Free_PutStrLn str) = str in
a >> putStrLn (deconstruct b)
) (return ()) a
testCurry :: [Free_PutStrLn]
testCurry =
(\a ->
[Free_PutStrLn a] ++
((\b ->
[Free_PutStrLn b]
) (show 3))
)(show 2)
main =
putStrLn (show testCurry) >>
eval (testCurry)
JavaScript proof of concept:
// | suspends an object within a function context.
// | first argument is the object to suspend.
// | second argument is the function object into which to feed the suspended
// | argument(first).
// | third argument is where to feed the result of the first argument feeded into
// | second argument. use a => a as identity.
const pure = obj => fn => f => f(fn(obj));
// | the experiment
pure({'console': {
'log': str => new function log() {
this.str = str;
}
}})(free =>
pure(str => () => console.log(str))
(putStrLn =>
pure("hello")(a =>
[free.console.log(a)].concat (
pure("world")(b =>
[free.console.log(b)]
)(a => a))
)((result =>
pure(logObj => logObj.str)
(deconstruct =>
result.map(str => putStrLn(deconstruct(str)))
)(result =>
result.forEach(f => f())
)
)
)
)(a => a)
)(a => a)
But I don't know how to simulate the >> (then) behavior only using functions.
Well, you can't! >> is (in this case) about ordering side-effects. A Haskell function can never have a side effect†. Side effects can only happen in monadic actions, and can be thus ordered by monadic combinators including >>, but without a Monad‡ constraint the notion of “do this and also that” simply doesn't make any sense in Haskell. A Haskell function is not executed, it's merely a mathematical transformation whose result you may evaluate. That result may itself be an actual action with type e.g. IO (), and such an action can be executed and/or monadically chained with other actions. But this is actually somewhat orthogonal to the evaluation of the function that yielded this action.
So that's the answer: “How can I get this currying experiment to behave as expected?” You can't, you need to use one of the monadic combinators instead (or do notation, which is just syntactic sugar for the same).
To also tackle this question from a bit of a different angle: you do not “need monads” to express sequencing of side effects. I might for instance define a type that “specifies side-effects” by generating e.g. Python code which when executed has these effects:
newtype Effect = Effect { pythons :: [String] }
Here, you could then sequence effects by simply concatenating the instruction lists. Again though, this sequencing would not be accomplished by any kind of currying exercise but by boring list concatenation. The preferable interface for this is the monoid class:
import Data.Monoid
instance Monoid Effect where
mempty = Effect []
mappend (Effect e₀) (Effect e₁) = Effect $ e₀ ++ e₁
And then you could simply do:
hello :: Effect
hello = Effect ["print('hello')"] <> Effect ["print('world')"]
(<> is just a shorthand synonym for mappend. You could as well define a custom operator, say # instead to chain those actions, but if there's a standard class that supports some operation it's usually a good idea to employ that!)
Ok, perfectly fine sequencing, no monadic operators required.
But very clearly, just evaluating hello would not cause anything to be printed: it would merely give you some other source code. You'd actually need to feed these instructions to a Python interpreter to accomplish the side-effects.And in principle that's no different with the IO type: evaluating an IO action also never causes any side-effects, only linking it to main (or to the GHCi repl) does. How many lambdas you wrap the subexpressions in is completely irrelevant for this, because side-effect occurance has nothing to do with whether a function gets called anywhere! It only has to do with how the actions get linked to an actual “executor”, be that a Python interpreter or Haskell's own main.
If you now wonder why it has to be those whacky monads if the simpler Monoid also does the trick... the problem with Effect is that it has no such thing as a return value. You can perfectly well generate “pure output” actions this way that simply execute a predetermined Python program, but you can never get back any values from Python this way to use within Haskell to decide what should happen next. This is what monads allow you to do.
†Yes, never. Haskell does not include something called unsafePerformIO. Anybody who claims otherwise in the comments shall suffer nuclear retaliation.
‡To be precise, the weaker Applicative is sufficient.

Why is getLine being evaluated when its value is not required?

If Haskell is lazy, why does getLine is evaluated in both of the following cases? Being lazy I would expect that in the case of fa the getLine would not be evaluated because its result is not being used subsequently:
let fa = do {
x <- getLine;
return "hello"
}
let fb = do {
x <- getLine;
return $ x
}
(I tested both cases in GHCi)
Thanks
Its result is being used, just not in the way you necessarily expect. This de-sugars to
fa = getLine >>= (\x -> return "hello")
So the result of getLine is still passed to the function \x -> return "hello". Monads are inherently about sequencing actions together (among other things); that sequencing still occurs even if results are later not used. If that weren't the case, then
main = do
print "Hello"
print "World"
wouldn't do anything as a program, since the results of both calls to print aren't being used.
Congratulations, you've just discovered why Haskell is a pure functional language!
The result† of getLine is not a string. It's an IO action which happens to “produce” a string. That string is indeed not evaluated, but the action itself is (since it turns up in a do block bound to main), and this is all that matters as far as side-effects are concerned.
†Really just the value of getLine. This is not a function, so it doesn't actually have a result.
Be careful now... ;-)
The result of getLine isn't a string, it's an "I/O command object", if you will. The code actually desugars to
getLine >>= (\ x -> return "hello")
The >>= operator constructs a new I/O command object out of an existing one and a function... OK, that's a bit much to wrap your mind around. The important thing is, the I/O action gets executed (because of the implementation of >>= for IO), but its result doesn't necessarily get evaluated (because of laziness).
So let's look at the implementation of the IO monad... erm, actually you know what? Let's not. (It's deep magic, hard-wired into the compiler, and as such it's implementation-specific.) But this phenomenon isn't unique to IO by any means. Let's look at the Maybe monad:
instance Monad Maybe where
mx >>= f =
case mx of
Nothing -> Nothing
Just x -> f x
return x = Just x
So if I do something like
do
x <- foobar
return "hello"
Will x get evaluated? Let's look. It desugars to:
foobar >>= (\ x -> return "hello")
then this becomes
case foobar of
Nothing -> Nothing
Just x -> Just "hello"
As you can see, foobar is clearly going to be evaluated, because we need to know whether the result is Nothing or Just. But the actual x won't be evaluated, because nothing looks at it.
It's kind of the same way that length evaluates the list nodes, but not the list elements they point to.

Is there a lazy Session IO Monad?

You have a sequence of actions that prefer to be executed in chunks due to some high-fixed overhead like packet headers or making connections. The limit is that sometimes the next action depends on the result of a previous one in which case, all pending actions are executed at once.
Example:
mySession :: Session IO ()
a <- readit -- nothing happens yet
b <- readit -- nothing happens yet
c <- readit -- nothing happens yet
if a -- all three readits execute because we need a
then write "a"
else write "..."
if b || c -- b and c already available
...
This reminds me of so many Haskell concepts but I can't put my finger on it.
Of course, you could do something obvious like:
[a,b,c] <- batch([readit, readit, readit])
But I'd like to hide the fact of chunking from the user for slickness purposes.
Not sure if Session is the right word. Maybe you can suggest a better one? (Packet, Batch, Chunk and Deferred come to mind.)
Update
I think there was a really good answer last night that I read on my phone but when I came back to look for it today it was gone. Was I dreaming?
I don't think you can do exactly what you want, since what you describe exploits haskell's lazy evaluation to have the evaluation of a force the actions that compute b and c, and there's no way to seq on unspecified values.
What I could do was hack together a monad transformer that delayed actions sequenced via >> so that they could be executed all together:
data Session m a = Session { pending :: [ m () ], final :: m a }
runSession :: Monad m => Session m a -> m a
runSession (Session ms ma) = foldr (flip (>>)) (return ()) ms >> ma
instance Monad m => Monad (Session m) where
return = Session [] . return
s >>= f = Session [] $ runSession s >>= (runSession . f)
(Session ms ma) >> (Session ms' ma') =
Session (ms' ++ (ma >> return ()) : ms) ma'
This violates some monad laws, but lets you do something like:
liftIO :: IO a -> Session IO a
liftIO = Session []
exampleSession :: Session IO Int
exampleSession = do
liftIO $ putStrLn "one"
liftIO $ putStrLn "two"
liftIO $ putStrLn "three"
liftIO $ putStrLn "four"
trace "five" $ return 5
and get
ghci> runSession exampleSession
five
one
two
three
four
5
ghci> length (pending exampleSession)
4
This is very similar to what Haxl does.
For more info:
Open sourcing haxl - Facebook Code Blog
ICFP 2014 talk
You could use the unsafeInterleaveIO function. It is a dangerous function that can introduce bugs to your program if not used carefully, but it does what you're asking for.
You can insert it into your example code like this:
lazyReadits :: IO [a]
lazyReadits = unsafeInterleaveIO $ do
a <- readit
r <- lazyReadits
return (a:r)
unsafeInterleaveIO makes the action as a whole lazy, but once it starts evaluating it will evaluate as if it had been strict. This means in my above example: readit will run as soon as something tests whether the returned list is empty or not. If I'd used mapM unsafeInterleaveIO (replicate 3 readit) instead, then readit would only be run when the actual elements of the list are evaluated, which would make the contents of the list depend on the order in which its elements are inspected, which is one example of how unsafeInterleaveIO can introduce bugs.

Keep variable inside another function in Haskell

I am lost in this concept. This is my code and what it does is just asking what is your name.
askName = do
name <- getLine
return ()
main :: IO ()
main = do
putStrLn "Greetings once again. What is your name?"
askName
However, How can I access in my main the variable name taken in askName?
This is my second attempt:
askUserName = do
putStrLn "Hello, what's your name?"
name <- getLine
return name
sayHelloUser name = do
putStrLn ("Hey " ++ name ++ ", you rock!")
main = do
askUserName >>= sayHelloUser
I can now re-use the name in a callback way, however if in main I want to call name again, how can I do that? (avoiding to put name <- getLine in the main, obviously)
We can imagine that you ask the name in order to print it, then let's rewrite it.
In pseudo code, we have
main =
print "what is your name"
bind varname with userReponse
print varname
Your question then concern the second instruction.
Take a look about the semantic of this one.
userReponse is a function which return the user input (getLine)
varname is a var
bind var with fun : is a function which associate a var(varname) to the output of a function(getLine)
Or as you know in haskell everything is a function then our semantic is not well suited.
We need to revisit it in order to respect this idiom. According to the later reflexion the semantic of our bind function become bind fun with fun
As we cannot have variable, to pass argument to a function, we need, at a first glance, to call another function, in order to produce them. Thus we need a way to chain two functions, and it's exactly what's bind is supposed to do. Furthermore, as our example suggest, an evaluation order should be respected and this lead us to the following rewriting with fun bind fun
That's suggest that bind is more that a function it's an operator.
Then for all function f and g we have with f bind g.
In haskell we note this as follow
f >>= g
Furthermore, as we know that a function take 0, 1 or more argument and return 0, 1 or more argument.
We could refine our definition of our bind operator.
In fact when f doesn't return any result we note >>= as >>
Applying, theses reflexions to our pseudo code lead us to
main = print "what's your name" >> getLine >>= print
Wait a minute, How the bind operator differ from the dot operator use for the composition of two function ?
It's differ a lot, because we have omit an important information, bind doesn't chain two function but it's chain two computations unit. And that's the whole point to understand why we have define this operator.
Let's write down a global computation as a sequence of computation unit.
f0 >>= f1 >> f2 >> f3 ... >>= fn
As this stage a global computation could be define as a set of computation unit with two operator >>=, >>.
How do we represent set in computer science ?
Usually as container.
Then a global computation is a container which contain some computation unit. On this container we could define some operator allowing us to move from a computation unit to the next one, taking into account or not the result of the later, this is ours >>= and >> operator.
As it's a container we need a way to inject value into it, this is done by the return function. Which take an object and inject it into a computation, you could check it through is signature.
return :: a -> m a -- m symbolize the container, then the global computation
As it's a computation we need a way to manage a failure, this done by the fail function.
In fact the interface of a computation is define by a class
class Computation
return -- inject an objet into a computation
>>= -- chain two computation
>> -- chain two computation, omitting the result of the first one
fail -- manage a computation failure
Now we can refine our code as follow
main :: IO ()
main = return "What's your name" >>= print >> getLine >>= print
Here I have intentionally include the signature of the main function, to express the fact that we are in the global IO computation and the resulting output with be () (as an exercise enter $ :t print in ghci).
If we take more focus on the definition for >>=, we can emerge the following syntax
f >>= g <=> f >>= (\x -> g) and f >> g <=> f >>= (\_ -> g)
And then write
main :: IO ()
main =
return "what's your name" >>= \x ->
print x >>= \_ ->
getLine >>= \x ->
print x
As you should suspect, we certainly have a special syntax to deal with bind operator in computational environment. You're right this is the purpose of do syntax
Then our previous code become, with do syntax
main :: IO ()
main = do
x <- return "what's your name"
_ <- print x
x <- getLine
print x
If you want to know more take a look on monad
As mentioned by leftaroundabout, my initial conclusion was a bit too enthusiastic
You should be shocked, because we have break referential transparency law (x take two different value inside our sequence of instruction), but it doesn't matter anymore,because we are into a computation, and a computation as defined later is a container from which we can derive an interface and this interface is designed to manage, as explain, the impure world which correspond to the real world.
Return the name from askname. In Haskell its not idiomatic to access "global" variables:
askName :: IO String
askName = do
name <- getLine
return name
main :: IO ()
main = do
putStrLn "Greetings once again. What is your name?"
name <- askName
putStrLn name
Now the only problem is tht the askName function is kind of pointless, since its now just an alias to getLine. We could "fix" that by putting the question inside askName:
askName :: IO String
askName = do
putStrLn "Greetings once again. What is your name?"
name <- getLine
return name
main :: IO ()
main = do
name <- askName
putStrLn name
Finally, just two little points: its usually a good idea to put type declarations when you are learning, to make things explicit and help compiler error messages. Another thing is to remember that the "return" function is only used for monadic code (it is not analogous to a traditional return statement!) and sometimes we could have ommited some intermediary variables:
askName :: IO String
askName = do
putStrLn "Greetings once again. What is your name?"
getLine

How to initialize a monad and then use in a function many times in Haskell

Most of this is straight from the hint example. What I'd like to do is initialize the interpreter with modules and imports and such and keep it around somehow. Later on (user events, or whatever), I want to be able to call a function with that initialized state and interpret an expression many times. So at the --split here location in the code, I want to have the code above in init, and the code below that in a new function that takes an expression and interprets it.
module Main where
import Language.Haskell.Interpreter
import Test.SomeModule
main :: IO ()
main = do r <- runInterpreter testHint
case r of
Left err -> printInterpreterError err
Right () -> putStrLn "Done."
-- Right here I want to do something like the following
-- but how do I do testInterpret thing so it uses the
-- pre-initialized interpreter?
case (testInterpret "expression one")
Left err -> printInterpreterError err
Right () -> putStrLn "Done."
case (testInterpret "expression two")
Left err -> printInterpreterError err
Right () -> putStrLn "Done."
testHint :: Interpreter ()
testHint =
do
loadModules ["src/Test/SomeModule.hs"]
setImportsQ [("Prelude", Nothing), ("Test.SomeModule", Just "SM")]
say "loaded"
-- Split here, so what I want is something like this though I know
-- this doesn't make sense as is:
-- testExpr = Interpreter () -> String -> Interpreter ()
-- testExpr hintmonad expr = interpret expr
let expr1 = "let p1o1 = SM.exported undefined; p1o2 = SM.exported undefined; in p1o1"
say $ "e.g. typeOf " ++ expr1
say =<< typeOf expr1
say :: String -> Interpreter ()
say = liftIO . putStrLn
printInterpreterError :: InterpreterError -> IO ()
printInterpreterError e = putStrLn $ "Ups... " ++ (show e)
I'm having trouble understanding your question. Also I am not very familiar with hint. But I'll give it a go.
As far as I can tell, the Interpreter monad is just a simple state wrapper around IO -- it only exists so that you can say eg. setImportsQ [...] and have subsequent computations depend on the "settings" that were modified by that function. So basically you want to share the monadic context of multiple computations. The only way to do that is by staying within the monad -- by building a single computation in Interpreter and running it once. You can't have a "global variable" that escapes and reuses runInterpreter.
Fortunately, Interpreter is an instance of MonadIO, which means you can interleave IO computations and Interpreter computations using liftIO :: IO a -> Interpreter a. Basically you are thinking inside-out (an extremely common mistake for learners of Haskell). Instead of using a function in IO that runs code in your interpreter, use a function in Interpreter that runs code in IO (namely liftIO). So eg.
main = runInterpreter $ do
testHint
expr1 <- liftIO getLine
r1 <- interpret "" expr1
case r1 of
...
expr2 <- liftIO getLine
r2 <- interpret "" expr2
case r2 of
...
And you can easily pull that latter code out into a function if you need to, using the beauty of referential transparency! Just pull it straight out.
runSession :: Interpreter ()
runSession = do
expr1 <- liftIO getLine
r1 <- interpret "" expr1
case interpret expr1 of
...
main = runInterpreter $ do
testHint
runSession
Does that make sense? Your whole program is an Interpreter computation, and only at the last minute do you pull it out into IO.
(That does not mean that every function you write should be in the Interpreter monad. Far from it! As usual, use Interpreter around the edges of your program and keep the core purely functional. Interpreter is the new IO).
If I understand correctly, you want to initialize the compiler once, and run multiple queries, possibly interactively.
There are two main approaches:
lift IO actions into your Interpreter context (see luqui's answer).
use lazy IO to smuggle a stream of data in and out of your program.
I'll describe the second option.
By the magic of lazy IO, you can pass testHint a lazy stream of input, then loop in the body of testHint, interpreting many queries interactively:
main = do
ls <- getContents -- a stream of future input
r <- runInterpreter (testHint (lines input))
case r of
Left err -> printInterpreterError err
Right () -> putStrLn "Done."
testHint input = do
loadModules ["src/Test/SomeModule.hs"]
setImportsQ [("Prelude", Nothing), ("Test.SomeModule", Just "SM")]
say "loaded"
-- loop over the stream of input, interpreting commands
let go (":quit":es) = return ()
(e:es) = do say =<< typeOf e
go es
go
The go function has access to the closed-over environment of the initialized interpreter, so feeding it events will obviously run in the scope of that once-initialized interpreter.
An alternative method would be to extract the interpreter state from the monad, but I'm not sure that is possible in GHC (it would require GHC not to be in the IO monad fundamentally).

Resources