"Truly" lazy IO in Haskell

Consider the fragment -
getLine >>= \_ -> getLine >>= putStr
It does the reasonable thing, asking for a string twice, and then printing the last input. Because the compiler has no way of knowing what outside effects getLine has, it has to execute both of them, even though we throw away the result of the first one.
What I need is to wrap the IO Monad into another Monad (M) that allows IO computations to be effectively NOPs unless their return values are used. So that the program above could be rewritten as something like -
runM $ lift getLine >>= \_ -> lift getLine >>= lift putStr
Where
runM :: M a -> IO a
lift :: IO a -> M a
And the user is asked for input only once.
However, I cannot figure out how to write this Monad to achieve the effect I want. I'm not sure if it's even possible. Could someone please help?

Lazy IO is usually implemented using unsafeInterleaveIO :: IO a -> IO a, which delays the side effects of an IO action until its result is demanded, so we'll probably have to use that, but let's get some minor problems out of the way first.
First of all, lift putStr would not type check, as putStr has type String -> IO (), and lift has type IO a -> M a. We'll have to use something like lift . putStr instead.
Secondly, we're going to have to differentiate between IO actions that should be lazy and those that should not. Otherwise the putStr will never be executed, as we're not using its return value () anywhere.
Taking that into account, this seems to work for your simple example, at least.
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import System.IO.Unsafe
newtype M a = M { runM :: IO a }
  deriving (Functor, Applicative, Monad)
lazy :: IO a -> M a
lazy = M . unsafeInterleaveIO
lift :: IO a -> M a
lift = M
main = runM $ lazy getLine >> lazy getLine >>= lift . putStr
However, as C. A. McCann points out, you should probably not use this for anything serious. Lazy IO is frowned upon already, as it makes it difficult to reason about the actual order of the side effects. This would make it even harder.
Consider this example
main = runM $ do
  foo <- lazy readLn
  bar <- lazy readLn
  return $ foo / bar
The order in which the two numbers are read will be completely undefined, and may change depending on compiler version, optimizations or the alignment of the stars. The name unsafeInterleaveIO is long and ugly for a good reason: to remind you of the dangers of using it. It's a good idea to let people know when it's being used and not hide it in a monad.
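To watch the deferral happen without involving the terminal, here is a minimal sketch that records which actions actually ran. The names logged and demo are made up for this illustration, and an IORef log stands in for real input.

```haskell
import Control.Exception (evaluate)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical helper: records its name in the log when it really runs.
logged :: IORef [String] -> String -> a -> IO a
logged logRef name x = do
  modifyIORef logRef (name :)
  return x

-- Returns the log before and after the second result is demanded.
demo :: IO ([String], [String])
demo = do
  logRef  <- newIORef []
  _unused <- unsafeInterleaveIO (logged logRef "first" (1 :: Int))
  used    <- unsafeInterleaveIO (logged logRef "second" (2 :: Int))
  before  <- readIORef logRef   -- neither deferred action has run yet
  _       <- evaluate used      -- demanding the value runs "second"
  after   <- readIORef logRef
  return (before, after)
```

In this sketch the log is empty before anything is demanded, and afterwards it contains only "second": the action whose result is never used never runs at all.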

There's no sensible way to do this, because to be quite honest it's not really a sensible thing to do. The entire purpose for introducing monadic I/O was to give a well-defined ordering to effects in the presence of lazy evaluation. It is certainly possible to throw that out the window if you really must, but I'm not sure what actual problem this would solve other than making it easier to write confusingly buggy code.
That said, introducing this sort of thing in a controlled fashion is what "Lazy IO" already does. The "primitive" operation for that is unsafeInterleaveIO, which is implemented roughly as return . unsafePerformIO, plus some details to make things behave a bit nicer. Applying unsafeInterleaveIO to everything, by hiding it in the bind operation of your "lazy IO" monad, would probably accomplish the ill-advised notion you're after.

What you are looking for isn't really a monad, unless you want to work with unsafe stuff like unsafeInterleaveIO.
Instead, a much cleaner abstraction here is Arrow.
I think the following could work:
data Promise m a
  = Done a
  | Thunk (m a)
newtype Lazy m a b =
  Lazy { getLazy :: Promise m a -> m (Promise m b) }
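One way the Promise and Lazy types above might be fleshed out is sketched below. This is a guess at the intent, not the answer author's code: force, effect, ignoring, runLazy and demo are invented names, and only a Category instance is given (no full Arrow instance). Note that a Thunk is re-run every time it is forced, so nothing is memoized here.

```haskell
import Prelude hiding (id, (.))
import Control.Category (Category (..), (>>>))
import Data.IORef (modifyIORef, newIORef, readIORef)

data Promise m a
  = Done a
  | Thunk (m a)

newtype Lazy m a b =
  Lazy { getLazy :: Promise m a -> m (Promise m b) }

-- Run a promise, performing its deferred effects if it has any.
force :: Monad m => Promise m a -> m a
force (Done a)  = return a
force (Thunk m) = m

instance Monad m => Category (Lazy m) where
  id = Lazy return
  Lazy g . Lazy f = Lazy (\p -> f p >>= g)

-- An effectful step: the input is only forced if the output is forced.
effect :: Monad m => (a -> m b) -> Lazy m a b
effect f = Lazy (\p -> return (Thunk (force p >>= f)))

-- A step that discards its input promise without ever forcing it.
ignoring :: Monad m => m b -> Lazy m a b
ignoring mb = Lazy (\_ -> return (Thunk mb))

runLazy :: Monad m => Lazy m a b -> a -> m b
runLazy (Lazy k) a = k (Done a) >>= force

-- Logs which "reads" really happened; the log stands in for getLine.
demo :: IO (String, [String])
demo = do
  logRef <- newIORef []
  let input name = modifyIORef logRef (name :) >> return name
      pipeline   = effect (\() -> input "first")
               >>> ignoring (input "second")
               >>> effect (\s -> return (reverse s))
  result <- runLazy pipeline ()
  ran    <- readIORef logRef
  return (result, reverse ran)
```

Because the second step throws its input promise away, the "first" read is never performed; only "second" shows up in the log.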

Related

Write a function from IO a -> a?

Take the function getLine - it has the type:
getLine :: IO String
How do I extract the String from this IO action?
More generally, how do I convert this:
IO a
to this:
a
If this is not possible, then why can't I do it?

In Haskell, when you want to work with a value that is "trapped" in IO, you don't take the value out of IO. Instead, you put the operation you want to perform into IO, as well!
For example, suppose you want to check how many characters the getLine :: IO String will produce, using the length function from Prelude.
There exists a helper function called fmap which, when specialized to IO, has the type:
fmap :: (a -> b) -> IO a -> IO b
It takes a function that works on "pure" values not trapped in IO, and gives you a function that works with values that are trapped in IO. This means that the code
fmap length getLine :: IO Int
represents an IO action that reads a line from console and then gives you its length.
<$> is an infix synonym for fmap that can make things simpler. This is equivalent to the above code:
length <$> getLine
Now, sometimes the operation you want to perform with the IO-trapped value itself returns an IO-trapped value. Simple example: you want to write back the string you have just read using putStrLn :: String -> IO ().
In that case, fmap is not enough. You need to use the (>>=) operator, which, when specialized to IO, has the type IO a -> (a -> IO b) -> IO b. In our case:
getLine >>= putStrLn :: IO ()
Using (>>=) to chain IO actions has an imperative, sequential flavor. There is a kind of syntactic sugar called "do-notation" which helps to write sequential operations like these in a more natural way:
do line <- getLine
   putStrLn line
Notice that the <- here is not an operator, but part of the syntactic sugar provided by the do notation.
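The three styles above can be put side by side in a small sketch. To keep it runnable without a terminal, the read and write actions are passed in as parameters; lineLength, echoBind, echoDo and the fake actions in demo are hypothetical names for this illustration only.

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)

-- fmap (written <$>) applies a pure function to the eventual result.
lineLength :: IO String -> IO Int
lineLength readLine = length <$> readLine

-- (>>=) feeds the result of one action into a function producing the next.
echoBind :: IO String -> (String -> IO ()) -> IO ()
echoBind readLine writeLine = readLine >>= writeLine

-- The same thing in do-notation; it desugars to the (>>=) version.
echoDo :: IO String -> (String -> IO ()) -> IO ()
echoDo readLine writeLine = do
  line <- readLine
  writeLine line

demo :: IO (Int, [String])
demo = do
  out <- newIORef []
  let fakeGetLine    = return "hello"          -- stand-in for getLine
      fakePutStrLn s = modifyIORef out (s :)   -- stand-in for putStrLn
  n <- lineLength fakeGetLine
  echoBind fakeGetLine fakePutStrLn
  echoDo fakeGetLine fakePutStrLn
  written <- readIORef out
  return (n, reverse written)
```

With the real console, the same functions would be called as lineLength getLine and echoDo getLine putStrLn.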

Not going into any details, if you're in a do block, you can (informally/inaccurately) consider <- as getting the value out of the IO.
For example, the following function takes a line from getLine, and passes it to a pure function that just takes a String
main = do
  line <- getLine
  putStrLn (wrap line)
wrap :: String -> String
wrap line = "'" ++ line ++ "'"
If you compile this as wrap, and on the command line run
echo "Hello" | wrap
you should see
'Hello'

If you know C then consider the question "How can I get the string from gets?" An IO String is not some string that's made hard to get to, it's a procedure that can return a string - like reading from a network or stdin. You want to run the procedure to obtain a string.
A common way to run IO actions in a sequence is do notation:
main = do
  someString <- getLine
  -- someString :: String
  print someString
In the above you run the getLine operation to obtain a String value then use the value however you wish.

In general, it's unclear why you think you need a function of this type, and the reason you need it makes all the difference.
It should be noted for completeness that it is possible. There indeed exists a function of type IO a -> a in the base library called unsafePerformIO.
But the unsafe part is there for a reason. There are few situations where its usage would be considered justified. It's an escape hatch to be used with great caution - most of the time you will let monsters in instead of letting yourself out.
Why can't you normally go from IO a to a? Well at the very least it allows you to break the rules by having a seemingly pure function that is not pure at all - ouch! If it were a common practice to do this the type signatures and all the work done by the compiler to verify them would make no sense at all. All the correctness guarantees would go out of the window.
Haskell is, partly, interesting precisely because this is (normally) impossible.
For how to approach your getLine problem in particular see the other answers.

Breaking out of monad sequence

Is it possible to break out of a monad sequence?
For instance, if I want to break out of a sequence earlier based on some condition calculated in the middle of the sequence. Say, in a 'do' notation I bind a value and based on the value I want to either finish the sequence or stop it. Is there something like a 'pass' function?
Thanks.

Directly using if
You could do this directly as Ingo beautifully encapsulated, or equivalently for example
breakOut :: a -> m (Either MyErrorType MyGoodResultType)
breakOut x = do
  y <- dosomethingWith x
  z <- doSomethingElseWith x y
  if isNoGood z then return (Left (someerror z)) else do
    w <- process z
    v <- munge x y z
    u <- fiddleWith w v
    return (Right (greatResultsFrom u z))
This is good for simply doing something different based on what values you have.
Using Exceptions in the IO monad
You could use Control.Exception as Michael Litchard correctly pointed out. It has tons of error-handling, control-flow altering stuff in it, and is worth reading if you want to do something complex with this.
This is great if your error production could happen anywhere and your code is complex. You can handle the errors at the top level, or at any level you like. It's very flexible and doesn't mess with your return types. It only works in the IO monad.
import Control.Exception
Really I should roll my own custom type, but I can't be bothered deriving Typeable etc., so I'll hack it with the standard error function and a few strings. I feel quite guilty about that.
handleError :: ErrorCall -> IO Int
handleError (ErrorCall msg) = case msg of
  "TooBig" -> putStrLn "Error: argument was too big" >> return 10000
  "TooSmall" -> putStrLn "Error: argument was too small" >> return 1
  "Negative" -> putStrLn "Error: argument was negative" >> return (-1)
  "Weird" -> putStrLn "Error: erm, dunno what happened there, sorry." >> return 0
  _ -> throwIO (ErrorCall msg) -- rethrow any message we don't recognise
The error handler needs an explicit type to be used in catch. I've flipped the argument to make the do block come last.
exceptOut :: IO Int
exceptOut = flip catch handleError $ do
  x <- readLn
  if (x < 5) then error "TooSmall" else return ()
  y <- readLn
  return (50 + x + y)
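For comparison, here is a minimal sketch of the custom exception type the text says it should have used. RangeError, fallback and addChecked are hypothetical names, the thresholds are invented, and the two numbers are taken as arguments instead of being read with readLn so that the example runs without input. Typeable no longer needs to be derived by hand on current GHC versions.

```haskell
import Control.Exception (Exception, handle, throwIO)
import Control.Monad (when)

-- Hypothetical error type replacing the magic strings.
data RangeError = TooBig | TooSmall | Negative
  deriving (Show, Eq)

instance Exception RangeError

-- Map each error to a fallback result.
fallback :: RangeError -> Int
fallback TooBig   = 10000
fallback TooSmall = 1
fallback Negative = -1

-- The first failing check throws; the remaining steps are skipped.
addChecked :: Int -> Int -> IO Int
addChecked x y = handle (return . fallback) $ do
  when (x < 0)    $ throwIO Negative
  when (x < 5)    $ throwIO TooSmall
  when (x > 1000) $ throwIO TooBig
  return (50 + x + y)
```

The handler's argument type is what selects which exceptions it catches, so no explicit type annotation is needed once fallback has a signature.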
Monad transformers etc
These are designed to work with any monad, not just IO. They have the same benefits as IO's exceptions, so are officially great, but you need to learn about monad transformers. Use them if your monad is not IO, and you have complex requirements like I said for Control.Exception.
First, read Gabriel Gonzalez's Breaking from a loop for using EitherT to do two different things depending on some condition arising, or MaybeT for just stopping right there in the event of a problem.
If you don't know anything about Monad Transformers, you can start with Martin Grabmüller's Monad Transformers Step by Step. It covers ErrorT. After that read Breaking from a Loop again!
You might also want to read Real World Haskell chapter 19, Error handling.
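As a rough illustration of the transformer approach, the sketch below uses ExceptT from the transformers package, which plays the same role as the EitherT and ErrorT types mentioned above (that substitution is mine, not the linked articles'). The function divide and its error strings are hypothetical.

```haskell
import Control.Monad (when)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)

-- Each throwE abandons the rest of the block and yields a Left.
divide :: Int -> Int -> ExceptT String IO Int
divide x y = do
  when (x < 5) $ throwE "TooSmall"
  lift (putStrLn "x accepted, checking y")   -- ordinary IO in between
  when (y == 0) $ throwE "DivideByZero"
  return (x `div` y)

demo :: IO ()
demo = do
  runExceptT (divide 3 1)  >>= print
  runExceptT (divide 10 0) >>= print
  runExceptT (divide 10 2) >>= print
```

Unlike Control.Exception, the possibility of failure is visible in the type, and the same pattern works over base monads other than IO.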
Call/CC
Continuation Passing Style's callCC is remarkably powerful, but perhaps too powerful, and certainly doesn't produce terribly easy-to-follow code. See this for a fairly positive take, and this for a very negative one.

So what I think you're looking for is the equivalent of return in imperative languages, eg
def do_something
  foo
  bar
  return baz if quux
  ...
end
Now in Haskell this doesn't work, because a monadic chain is just one big function application. We have syntax that makes it look prettier, but it could be written as
bind foo (bind bar (bind baz ...))
and we can't just "stop" applying stuff in the middle. Luckily, if you really need it, there is an answer from the Cont monad: callCC. This is short for "call with current continuation" and generalizes the notion of returns. If you know Scheme, then this should be familiar.
import Control.Monad.Cont
foo = callCC $ \escape -> do
  foo
  bar
  when baz $ quux >>= escape
  ...
A runnable example shamelessly stolen from the documentation of Control.Monad.Cont
whatsYourName name =
  (`runCont` id) $ do
    response <- callCC $ \exit -> do
      validateName name exit
      return $ "Welcome, " ++ name ++ "!"
    return response
validateName name exit = do
  when (null name) (exit "You forgot to tell me your name!")
and of course, there is a Cont transformer, ContT (which is absolutely mind bending) that will let you layer this on IO or whatever.
As a side note, callCC is a plain old function and completely nonmagical; implementing it is a great challenge.
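To give a feel for layering this on IO, here is a small sketch using ContT and callCC from the transformers package (Control.Monad.Trans.Cont) to break out of a loop early. findFirstOver and the visited log are hypothetical names introduced for the example.

```haskell
import Control.Monad (forM_, when)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Cont (callCC, evalContT)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Visit the items in order, logging each one; escape from the whole
-- block as soon as an item exceeds the limit.
findFirstOver :: IORef [Int] -> Int -> [Int] -> IO (Maybe Int)
findFirstOver visited limit xs = evalContT $ callCC $ \exit -> do
  forM_ xs $ \x -> do
    lift (modifyIORef visited (x :))
    when (x > limit) $ exit (Just x)
  return Nothing

demo :: IO (Maybe Int, [Int])
demo = do
  visited <- newIORef []
  found   <- findFirstOver visited 5 [1, 7, 9]
  seen    <- readIORef visited
  return (found, reverse seen)
```

Here the loop stops at 7, so 9 is never visited; calling exit behaves much like return or break in an imperative language.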

So I suppose there is no way of doing it the way I imagined it originally, which is equivalent of a break function in an imperative loop.
But I still get the same effect below, based on Ingo's answer, which is pretty easy (silly me)
doStuff x = if x > 5
  then do
    t <- getTingFromOutside
    doHeavyCalculations t
  else return ()
I don't know though how it would work if I need to test 't' in the example above ...
I mean, if I need to test the bound value and make an if decision from there.
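One possible answer to that follow-up, as a hedged sketch: nest a second conditional inside the do block, after the value has been bound. Here when from Control.Monad replaces the if/else return () pattern, the fetch and the heavy step are passed in as parameters (getThing and heavy are placeholder names), and the test on t is an invented condition.

```haskell
import Control.Monad (when)
import Data.IORef (modifyIORef, newIORef, readIORef)

doStuff :: Int -> IO Int -> (Int -> IO ()) -> IO ()
doStuff x getThing heavy =
  when (x > 5) $ do
    t <- getThing
    when (t /= 0) $   -- hypothetical test on the bound value
      heavy t

demo :: IO [Int]
demo = do
  processed <- newIORef []
  let heavy t = modifyIORef processed (t :)
  doStuff 10 (return 3) heavy   -- both tests pass: heavy runs
  doStuff 10 (return 0) heavy   -- t fails its test: heavy is skipped
  doStuff 1  (return 9) heavy   -- x fails its test: nothing is fetched
  reverse <$> readIORef processed
```

Each level of nesting plays the role of one early exit; deeper chains of such checks are where the Either, exception, or callCC approaches start to pay off.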

You can never break out of a "monad sequence", by definition. Remember that a "monad sequence" is nothing other than one function applied to other values/functions. Even if a "monad sequence" gives you the illusion that you could program imperatively, this is not true (in Haskell)!
The only thing you can do is to return (). This solution of the practical problem has already been named in here. But remember: it gives you only the illusion of being able to break out of the monad!

Is there any way to use IO Bool in if-statement without binding to a name in haskell?

If I've got a function that returns IO Bool (specifically an atomically), is there any way to use the return value directly in the if statement, without binding?
So currently I've got
ok <- atomically $ do
  ...
if (ok) then do
  ...
else do
  ...
Is it at all possible to write this as something like
if (*some_operator_here* atomically $ do
  ...) then do
  ...
else do
  ...
I was hoping there'd be a way to use something like <- anonymously, i.e., if (<- atomically ...) but so far no such luck.
Similarly on getLine, is it possible to write something like
if ((*operator* getLine) == "1234") then do ...
Related addendum--what is the type of (<-)? I can't get it to show up in ghci. I'm assuming it's m a -> a, but then that would mean it could be used outside of a monad to escape that monad, which would be unsafe, right? Is (<-) not a function at all?

You can use ifM from Control.Conditional if that suits your purpose, and it's not even hard to write a similar function.
Just to give you an example
import Control.Conditional
import Control.Monad
(==:) :: (Eq a, Monad m) => m a -> m a -> m Bool
(==:) = liftM2 (==)
main = ifM (getLine ==: getLine) (print "hit") (print "miss")
I think there are ways using rebindable syntax extension that you can even use if c then e1 else e2 like syntax for ifM but it is not worth the effort to try that.
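Since Control.Conditional lives outside base, a hand-written equivalent may be handy. The following is a minimal sketch of such a function, not the library's actual definition; it reuses the (==:) helper from above, and demo uses canned values in place of getLine.

```haskell
import Control.Monad (liftM2)

-- Run the condition, then pick one of the two branches.
ifM :: Monad m => m Bool -> m a -> m a -> m a
ifM cond onTrue onFalse = do
  b <- cond
  if b then onTrue else onFalse

(==:) :: (Eq a, Monad m) => m a -> m a -> m Bool
(==:) = liftM2 (==)

demo :: IO String
demo = ifM (return "1234" ==: return "1234") (return "hit") (return "miss")
```

Only the chosen branch is executed, because the branches are passed as unexecuted actions and ifM sequences just one of them.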

With GHC 7.6 and the LambdaCase language extension, you can write
{-# LANGUAGE LambdaCase #-}
import System.Directory
main = do
  doesFileExist "/etc/passwd" >>= \case
    True -> putStrLn "Yes"
    False -> putStrLn "No"
It is not exactly if..then..else, but close enough. It does not require binding the result to a name, and some people (not me) say that if..then..else is bad style in Haskell anyway.

No, you cannot. Well, to be honest, there is a 'hack' that will allow you to at least write code like this and get it to compile, but the results will almost certainly not be what you wanted or expected.
Why is this not possible? Well, for one thing a value of type IO Bool does not in any sense contain a value of type Bool. Rather it is an 'action' that when performed will return a value of type Bool. For another thing, if this were possible, it would allow you to hide side-effects inside what appears to be pure code. This would violate a core principle of Haskell. And Haskell is very principled.

What's the meaning of IO actions within pure functions?

I thought that in principle Haskell's type system would forbid calls to impure functions (i.e. f :: a -> IO b) from pure ones, but today I realized that by calling them with return they compile just fine. In this example:
h :: Maybe ()
h = do
  return $ putStrLn "???"
  return ()
h works in the Maybe monad, but it's a pure function nevertheless. Compiling and running it simply returns Just () as one would expect, without actually doing any I/O. I think Haskell's laziness puts the things together (i.e. putStrLn's return value is not used - and can't since its value constructors are hidden and I can't pattern match against it), but why is this code legal? Are there any other reasons that makes this allowed?
As a bonus, related question: in general, is it possible to forbid at all the execution of actions of a monad from within other ones, and how?

IO actions are first-class values like any other; that's what makes Haskell's IO so expressive, allowing you to build higher-order control structures (like mapM_) from scratch. Laziness isn't relevant here (see footnote 1); it's just that you're not actually executing the action. You're just constructing the value Just (putStrLn "???"), then throwing it away.
putStrLn "???" existing doesn't cause a line to be printed to the screen. By itself, putStrLn "???" is just a description of some IO that could be done to cause a line to be printed to the screen. The only execution that happens is executing main, which you constructed from other IO actions, or whatever actions you type into GHCi. For more information, see the introduction to IO.
Indeed, it's perfectly conceivable that you might want to juggle about IO actions inside Maybe; imagine a function String -> Maybe (IO ()), which checks the string for validity, and if it's valid, returns an IO action to print some information derived from the string. This is possible precisely because of Haskell's first-class IO actions.
But a monad has no ability to execute the actions of another monad unless you give it that ability.
1 Indeed, h = putStrLn "???" `seq` return () doesn't cause any IO to be performed either, even though it forces the evaluation of putStrLn "???".
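The String -> Maybe (IO ()) idea mentioned above can be sketched as follows. The function report and the validity rule (non-empty strings) are invented for illustration, and an IORef sink stands in for printing so the effect can be observed.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Purely decide whether there is an action to run; building the action
-- does not run it.
report :: IORef [String] -> String -> Maybe (IO ())
report sink s
  | null s    = Nothing
  | otherwise = Just (modifyIORef sink (("valid: " ++ s) :))

demo :: IO ([String], [String])
demo = do
  sink <- newIORef []
  let candidates = map (report sink) ["abc", "", "xyz"]
  before <- readIORef sink                   -- nothing written so far
  sequence_ [act | Just act <- candidates]   -- now actually execute them
  after <- readIORef sink
  return (before, reverse after)
```

The actions only take effect once they are sequenced into the IO computation being run; until then they are ordinary values sitting inside Maybe.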

Let's desugar!
h = do return (putStrLn "???"); return ()
-- rewrite (do foo; bar) as (foo >> do bar)
h = return (putStrLn "???") >> do return ()
-- redundant do
h = return (putStrLn "???") >> return ()
-- return for Maybe = Just
h = Just (putStrLn "???") >> Just ()
-- replace (foo >> bar) with its definition, (foo >>= (\_ -> bar))
h = Just (putStrLn "???") >>= (\_ -> Just ())
Now, what happens when you evaluate h?* Well, for Maybe,
(Just x) >>= f = f x
Nothing >>= f = Nothing
So we pattern match the first case
f x
-- x = (putStrLn "???"), f = (\_ -> Just ())
(\_ -> Just ()) (putStrLn "???")
-- apply the argument and ignore it
Just ()
Notice how we never had to perform putStrLn "???" in order to evaluate this expression.
*n.b. It is somewhat unclear at which point "desugaring" stops and "evaluation" begins. It depends on your compiler's inlining decisions. Pure computations could be evaluated entirely at compile time.

How can I initialize state in a hidden way in Haskell (like the PRNG does)?

I went through some tutorials on the State monad and I think I got the idea.
For example, as in this nice tutorial:
import Control.Monad.State
import Data.Word
type LCGState = Word32
lcg :: LCGState -> (Integer, LCGState)
lcg s0 = (output, s1)
  where s1 = 1103515245 * s0 + 12345
        output = fromIntegral s1 * 2^16 `div` 2^32
getRandom :: State LCGState Integer
getRandom = get >>= \s0 -> let (x,s1) = lcg s0
                           in put s1 >> return x
OK, so I can use getRandom:
*Main> runState getRandom 0
(0,12345)
*Main> runState getRandom 0
(0,12345)
*Main> runState getRandom 1
(16838,1103527590)
But I still need to pass the seed to the PRNG every time I call it. I know that the
PRNG available in Haskell implementations does not need that:
Prelude> :module Random
Prelude Random> randomRIO (1,6 :: Int)
(...) -- GHC prints some stuff here
6
Prelude Random> randomRIO (1,6 :: Int)
1
So I probably misunderstood the State monad, because what I could see in most tutorials
doesn't seem to be "persistent" state, but just a convenient way to thread state.
So... How can I have state that is automatically initialized (possibly from some
function that uses time and other not-very-predictable data), like the Random module
does?
Thanks a lot!

randomRIO uses the IO monad. This seems to work nicely in the interpreter because the interpreter also works in the IO monad. That's what you are seeing in your example; you can't actually do that at the top-level in code -- you would have to put it in a do-expression like all monads anyway.
In general code you should avoid the IO monad, because once your code uses the IO monad, it is tied to external state forever -- you can't get out of it (i.e. if you have code that uses the IO monad, any code that calls it also has to use the IO monad; there is no safe way to "get out" of it). So the IO monad should only be used for things like accessing the external environment, things where it is absolutely required.
For things like local self-contained state, you should not use the IO monad. You can use the State monad as you have mentioned, or you can use the ST monad. The ST monad contains a lot of the same features as the IO monad; e.g. there are STRef mutable cells, analogous to IORef. And the nice thing about ST compared to IO is that when you are done, you can call runST on an ST computation to get its result out of the monad, which you can't do with IO.
As for "hiding" the state, that just comes as part of the syntax of do-expressions in Haskell for monads. If you think you need to explicitly pass the state, then you are not using the monad syntax correctly.
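To make the point about not passing the state explicitly concrete, here is a sketch that reuses the question's lcg with the State monad from the transformers package (Control.Monad.Trans.State; the mtl module Control.Monad.State should work the same way). threeRandoms and demo are names added for the example.

```haskell
import Control.Monad (replicateM)
import Control.Monad.Trans.State (State, evalState, state)
import Data.Word (Word32)

type LCGState = Word32

lcg :: LCGState -> (Integer, LCGState)
lcg s0 = (output, s1)
  where
    s1     = 1103515245 * s0 + 12345
    output = fromIntegral s1 * 2^16 `div` 2^32

-- state wraps a pure step function as a State action.
getRandom :: State LCGState Integer
getRandom = state lcg

-- The seed is threaded from one call to the next behind the scenes.
threeRandoms :: State LCGState [Integer]
threeRandoms = replicateM 3 getRandom

-- The seed is supplied exactly once, at the point where the computation is run.
demo :: [Integer]
demo = evalState threeRandoms 0
```

The seed appears only in the call to evalState; inside the State computation, consecutive getRandom calls see the updated generator state automatically.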
Here is code that uses IORef in the IO Monad:
import Data.IORef
foo :: IO Int -- this is stuck in the IO monad forever
foo = do x <- newIORef 1
         modifyIORef x (+ 2)
         readIORef x
-- foo is an IO computation that returns 3
Here is code that uses the ST monad:
import Control.Monad.ST
import Data.STRef
bar :: Int
bar = runST (do x <- newSTRef 1
                modifySTRef x (+ 2)
                readSTRef x)
-- bar == 3
The simplicity of the code is essentially the same; except that in the latter case we can get the value out of the monad, and in the former we can't without putting it inside another IO computation.

secretStateValue :: IORef SomeType
secretStateValue = unsafePerformIO $ newIORef initialState
{-# NOINLINE secretStateValue #-}
Now access your secretStateValue with normal readIORef and writeIORef, in the IO monad.
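A self-contained version of this idiom might look like the sketch below. The counter and nextId names are hypothetical, and atomicModifyIORef' is used instead of separate readIORef and writeIORef calls so that concurrent callers do not interleave between the read and the write.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)
import System.IO.Unsafe (unsafePerformIO)

-- NOINLINE keeps GHC from duplicating the definition, which would
-- otherwise risk creating more than one IORef.
{-# NOINLINE counter #-}
counter :: IORef Int
counter = unsafePerformIO (newIORef 0)

-- Hand out the current value and bump the hidden state in one atomic step.
nextId :: IO Int
nextId = atomicModifyIORef' counter (\n -> (n + 1, n))
```

Callers never see or pass the IORef, which is roughly the shape of a hidden global generator; all access still happens in IO.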

So I probably misunderstood the State monad, because what I could see in most tutorials doesn't seem to be "persistent" state, but just a convenient way to thread state.
The state monad is precisely about threading state through some scope.
If you want top-level state, that's outside the language (and you'll have to use a global mutable variable). Note how this will likely complicate the thread safety of your code: how is that state initialized? And when? And by which thread?
