When should we use do? - haskell

Sometimes the program shows an error if I don't use do. But it runs well without do sometimes.
ex:
countdown :: Int -> IO ()
countdown x = if x <= 0
              then putStrLn "The End."
              else putStrLn (show (x))
runs well
But
countdown :: Int -> IO ()
countdown x = if x <= 0
              then putStrLn "The End."
              else putStrLn (show (x))
                   countdown (x-1)
shows error

Short answer: line breaks don't mean "next statement" in Haskell, like they do in many mainstream languages, like Python, Ruby, and recently even JavaScript.
Long answer
First, let's get rid of the if. It only clouds the issue:
countdown1 :: Int -> IO ()
countdown1 x = putStrLn (show x)
countdown2 :: Int -> IO ()
countdown2 x = putStrLn (show x)
               countdown (x-1)
Notice that the type of your function is IO (). That's the type of what the function must ultimately calculate. Or "produce", if you will. This means that whatever is on the right of the equality sign must be of type IO ().
countdown x = .........
              ^       ^
              +-------+
                  \
                   the type of whatever is here must be `IO ()`
The first version, countdown1, does satisfy this, because the expression putStrLn (show x) is, indeed, of type IO ()
The second version, countdown2, on the other hand, looks very strange to the compiler. It looks like you're trying to call putStrLn, and you're trying to pass it three parameters: (show x), countdown, and (x-1).
Don't let the newline between (show x) and countdown confuse you: as mentioned above, newlines don't mean "next statement" in Haskell. This is because in Haskell there is no such thing as a "statement" in the first place. Everything is an expression.
But wait! If newlines don't count as "next statement", then how the hell do I tell Haskell to perform several actions in order? Like, for example, first do putStrLn, and then do countdown?
Well, this is where monads come in. Monads (of which IO is a prime example) were specifically brought into Haskell for this very reason: to express order of things. And the primary operation for that is "bind", which in Haskell exists in the form of an operator >>=. This operator takes a monadic value (such as IO Int or IO ()) on the left, and takes a function that returns a monadic value (such as Int -> IO String) on the right, and "glues" them together. The result is a new monadic action, which consists of the two input actions executed one after the other.
Applying this to your example with putStrLn and countdown, it would look like this:
putStrLn (show x) >>= (\y -> countdown (x-1))
^               ^     ^                     ^
+---------------+     +---------------------+
        \                        \
    first monadic value      a function that takes the result of the
                             first action as parameter and returns
                             the second action
But this is a bit inconvenient. Sure, you can glue together two actions, maybe even three. But after a while this becomes very messy. (at first I had an example here, but then decided to do without; just trust me: it does get messy)
So to relieve the mess, the language now offers syntactic sugar in the form of the do notation. Inside the do notation, newline does, in fact, mean "next statement", and these consecutive "statements" get desugared into a sequence of calls to the >>= operator, giving them the semantics of "executing in order". Something like this:
do
  y <- f x
  z <- h y        ====>    f x >>= (\y -> h y >>= (\z -> g x y z))
  g x y z
So to the naked eye the do notation does look roughly equivalent to a multi-line program in Python, Ruby, or JavaScript. But underneath it all, it's still a pure functional program, where everything is an expression, and the order of (non-pure, effectful) operations is explicitly controlled.
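To check the translation concretely, here is a minimal sketch that writes the same three-step chain both ways. The functions f, h and g are hypothetical stand-ins (they are not from the question), and Maybe is used instead of IO only so that the two results can be compared with ==.

```haskell
-- Hypothetical steps; any functions returning a monadic value would do.
f :: Int -> Maybe Int
f x = if x > 0 then Just (x * 2) else Nothing

h :: Int -> Maybe Int
h y = Just (y + 1)

g :: Int -> Int -> Int -> Maybe Int
g x y z = Just (x + y + z)

-- The do version.
withDo :: Int -> Maybe Int
withDo x = do
  y <- f x
  z <- h y
  g x y z

-- The hand-desugared version: each later step lives inside a lambda.
withBind :: Int -> Maybe Int
withBind x = f x >>= (\y -> h y >>= (\z -> g x y z))
```

Both withDo 3 and withBind 3 evaluate to Just 16, and both give Nothing for an argument of 0, because f fails there and the remaining steps are skipped.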
So, to summarize: you need to use do in your program to express the order - first putStrLn, and then countdown:
countdown :: Int -> IO ()
countdown x = if x <= 0
              then putStrLn "The End."
              else do
                putStrLn (show (x))
                countdown (x-1)
But you don't need to use do when there is just one operation, so there is no order to speak of.
And if you don't want to use do for whatever reason, you can desugar it manually into the equivalent >>= call:
countdown :: Int -> IO ()
countdown x = if x <= 0
              then putStrLn "The End."
              else putStrLn (show (x)) >>= \_ -> countdown (x-1)
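When the result of the first action is ignored, as with \_ here, the standard >> operator does the same job more briefly. The sketch below is one possible way to write it; countdownWith and recorded are hypothetical names, and passing the output action as a parameter is an assumption made only so that the order of the actions can be inspected without reading the terminal.

```haskell
import Data.IORef

-- Hypothetical variant of countdown: the output action is a parameter,
-- so the same code can print (putStrLn) or record lines for inspection.
countdownWith :: Monad m => (String -> m ()) -> Int -> m ()
countdownWith out x
  | x <= 0    = out "The End."
  | otherwise = out (show x) >> countdownWith out (x - 1)

-- Plain IO version; behaves like the do-based countdown.
countdown' :: Int -> IO ()
countdown' = countdownWith putStrLn

-- Runs the countdown against an in-memory log instead of stdout.
recorded :: Int -> IO [String]
recorded n = do
  ref <- newIORef []
  countdownWith (\s -> modifyIORef ref (++ [s])) n
  readIORef ref
```

Running recorded 3 yields ["3","2","1","The End."], which shows that >> sequences the actions in the written order.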

Related

Use of putStrLn to show result

I am using the Idone.com site and wanted to run this code, but I do not know how to use putStrLn on a value read from stdin. I tried the code below, but it fails with an error.
main = putStrLn (show (sumaCifras x))
sumaCifras:: Int -> Int
sumaCifras x = div x 1000 + mod (div x 100) 10 + mod (div x 10) 10 + mod x 10
Compiler is having a problem, because you use x in main function, which isn't bound in this scope. At first you must read a value from input and then pass it to your function. You can do it in 2 ways.
More natural for people used to imperative languages is "do" syntax, in which it will look like that:
main = do
  x <- getLine
  putStrLn (show (sumaCifras (read x :: Int)))
When you want to use x as Int, you must use "read" function with type signature, so compiler will know what to expect.
To write it in a more functional way, you can use the monadic bind operator directly:
main = getLine >>= (\x -> putStrLn (show (sumaCifras (read x :: Int))))
The ">>=" operator takes the result of the first monadic action (here, the IO action that reads the input) and passes it to the function on the right (here, a lambda that reads the input as an Int, applies your function, converts the result with show, and hands it to putStrLn, which prints it on the screen). The "do" syntax is essentially just syntactic sugar for these monadic operations, so it does not affect the execution or performance of the program.
You can go one step further in writing it functionally by writing it totally point-free
main = getLine >>= (putStrLn . show . sumaCifras . (read :: String -> Int))
Note that here the type signature is for the read function itself, not for its application to an argument, hence String -> Int. The getLine action is executed first. The input obtained from it is passed to read, which parses it into an Int; next comes sumaCifras, whose result is converted to a String by show and printed with putStrLn.
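One way to see that the three spellings differ only in syntax is to pull the pure part of the pipeline into its own function. This is a sketch, not part of the original answer: respond, mainDo, mainBind and mainPointFree are hypothetical names, and read still fails at run time if the input is not a number.

```haskell
sumaCifras :: Int -> Int
sumaCifras x = div x 1000 + mod (div x 100) 10 + mod (div x 10) 10 + mod x 10

-- Hypothetical helper: the pure part of the pipeline, with no IO in it.
-- The type of read is inferred as String -> Int from sumaCifras.
respond :: String -> String
respond = show . sumaCifras . read

-- Three equivalent ways of wiring it to stdin and stdout.
mainDo :: IO ()
mainDo = do
  x <- getLine
  putStrLn (respond x)

mainBind :: IO ()
mainBind = getLine >>= (\x -> putStrLn (respond x))

mainPointFree :: IO ()
mainPointFree = getLine >>= (putStrLn . respond)
```

For the input 1234, each of the three prints 10.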

Why is getLine being evaluated when its value is not required?

If Haskell is lazy, why is getLine evaluated in both of the following cases? Since Haskell is lazy, I would expect that in the case of fa the getLine would not be evaluated, because its result is not used subsequently:
let fa = do {
x <- getLine;
return "hello"
}
let fb = do {
x <- getLine;
return $ x
}
(I tested both cases in GHCi)
Thanks
Its result is being used, just not in the way you necessarily expect. This de-sugars to
fa = getLine >>= (\x -> return "hello")
So the result of getLine is still passed to the function \x -> return "hello". Monads are inherently about sequencing actions together (among other things); that sequencing still occurs even if results are later not used. If that weren't the case, then
main = do
  print "Hello"
  print "World"
wouldn't do anything as a program, since the results of both calls to print aren't being used.
Congratulations, you've just discovered why Haskell is a pure functional language!
The result† of getLine is not a string. It's an IO action which happens to “produce” a string. That string is indeed not evaluated, but the action itself is (since it turns up in a do block bound to main), and this is all that matters as far as side-effects are concerned.
†Really just the value of getLine. This is not a function, so it doesn't actually have a result.
Be careful now... ;-)
The result of getLine isn't a string, it's an "I/O command object", if you will. The code actually desugars to
getLine >>= (\ x -> return "hello")
The >>= operator constructs a new I/O command object out of an existing one and a function... OK, that's a bit much to wrap your mind around. The important thing is, the I/O action gets executed (because of the implementation of >>= for IO), but its result doesn't necessarily get evaluated (because of laziness).
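The point can be observed with an action whose effect is visible afterwards. In this sketch, tick, ignoring and demo are hypothetical names, and an IORef counter stands in for reading from stdin so the effect can be checked.

```haskell
import Data.IORef

-- Hypothetical action with a visible side effect:
-- bump a counter and return the new value.
tick :: IORef Int -> IO Int
tick ref = modifyIORef ref (+ 1) >> readIORef ref

-- Same shape as fa: the bound result is never used.
ignoring :: IORef Int -> IO String
ignoring ref = do
  _ <- tick ref
  return "hello"

-- Runs `ignoring` once and reports its result together with the counter.
demo :: IO (String, Int)
demo = do
  ref <- newIORef 0
  s <- ignoring ref
  n <- readIORef ref
  return (s, n)
```

Running demo gives ("hello", 1): the counter was incremented even though the value returned by tick was thrown away.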
So let's look at the implementation of the IO monad... erm, actually you know what? Let's not. (It's deep magic, hard-wired into the compiler, and as such it's implementation-specific.) But this phenomenon isn't unique to IO by any means. Let's look at the Maybe monad:
instance Monad Maybe where
  mx >>= f =
    case mx of
      Nothing -> Nothing
      Just x  -> f x
  return x = Just x
So if I do something like
do
  x <- foobar
  return "hello"
Will x get evaluated? Let's look. It desugars to:
foobar >>= (\ x -> return "hello")
then this becomes
case foobar of
  Nothing -> Nothing
  Just x  -> Just "hello"
As you can see, foobar is clearly going to be evaluated, because we need to know whether the result is Nothing or Just. But the actual x won't be evaluated, because nothing looks at it.
It's kind of the same way that length evaluates the list nodes, but not the list elements they point to.
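A small sketch makes this checkable, using undefined as a payload that would crash if anything evaluated it. The names ignoresPayload, shortCircuits and spineOnly are hypothetical.

```haskell
-- The Just constructor is inspected by >>=, but its payload is never forced.
ignoresPayload :: Maybe String
ignoresPayload = do
  _ <- Just (undefined :: Int)
  return "hello"

-- Here the monadic value itself is Nothing, so the whole block is Nothing.
shortCircuits :: Maybe String
shortCircuits = do
  _ <- (Nothing :: Maybe Int)
  return "hello"

-- length walks the list cells without touching the elements.
spineOnly :: Int
spineOnly = length [undefined, undefined, undefined :: Int]
```

ignoresPayload is Just "hello", shortCircuits is Nothing, and spineOnly is 3; none of them hits the undefined values.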

Haskell Input to create a String List

I would like to allow a user to build a list from a series of inputs in Haskell.
The getLine function would be called recursively until the stopping case ("Y") is input, at which point the list is returned.
I know the function needs to be in a similar format to below. I am having trouble assigning the correct type signatures - I think I need to include the IO type somewhere.
getList :: [String] -> [String]
getList list = do line <- getLine
                  if line == "Y"
                    then return list
                    else getList (line : list)
So there's a bunch of things that you need to understand. One of them is the IO x type. A value of this type is a computer program that, when later run, will do something and produce a value of type x. So getLine doesn't do anything by itself; it just is a certain sort of program. Same with let p = putStrLn "hello!". I can sequence p into my program multiple times and it will print hello! multiple times, because the IO () is a program, as a value which Haskell happens to be able to talk about and manipulate. If this were TypeScript I would say type IO<x> = { run: () => Promise<x> } and emphatically that type says that the side-effecting action has not been run yet.
So how do we manipulate these values when the value is a program, for example one that fetches the current system time?
The most fundamental way to chain such programs together is to take a program that produces an x (an IO x) and a Haskell function which takes an x and constructs a program which produces a y (an x -> IO y), and combine them into a resulting program producing a y (an IO y). This function is called >>= and pronounced "bind". In fact this way is universal, if we add a function which takes any Haskell value of type x and produces a program which does nothing and produces that value (return :: x -> IO x). This allows you to use, for example, the Prelude function fmap f = (>>= return . f), which takes an a -> b and applies it to an IO a to produce an IO b.
It is so common to say things like getLine >>= \line -> putStrLn (upcase line ++ "!") that we invented do-notation, writing this as
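The claim that fmap can be rebuilt from bind and return can be sketched as follows. The names fmapViaBind and lengthOfLine are hypothetical, chosen so they do not clash with the Prelude.

```haskell
-- fmap rebuilt from bind and return.
-- `m >>= return . f` parses as `m >>= (return . f)`.
fmapViaBind :: Monad m => (a -> b) -> m a -> m b
fmapViaBind f m = m >>= return . f

-- A small IO program built from it; nothing runs until it is executed.
lengthOfLine :: IO String -> IO Int
lengthOfLine source = fmapViaBind length source
```

For example, lengthOfLine getLine is a program that reads a line and produces its length, and fmapViaBind works unchanged for other monads such as Maybe and lists.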
do
  line <- getLine
  putStrLn (upcase line ++ "!")
Notice that it's the same basic deal; the last line needs to be an IO y for some y.
The last thing you need to know in Haskell is the convention which actually gets these things run. That is that, in your Haskell source code, you are supposed to create an IO () (a program whose value doesn't matter) called Main.main, and the Haskell compiler is supposed to take this program which you described, and give it to you as an executable which you can run whenever you want. As a very special case, the GHCi interpreter will notice if you produce an IO x expression at the top level and will immediately run it for you, but that is very different from how the rest of the language works. For the most part, Haskell says, describe the program and I will give it to you.
Now that you know that Haskell has no magic and the Haskell IO x type just is a static representation of a computer program as a value, rather than something which does side-effecting stuff when you "reduce" it (like it is in other languages), we can turn to your getList. Clearly getList :: IO [String] makes the most sense based on what you said: a program which allows a user to build a list from a series of inputs.
Now to build the internals, you've got the right guess: we've got to start with a getLine and either finish off the list or continue accepting inputs, prepending the line to the list:
getList = do
  line <- getLine
  if line == "exit"
    then return []
    else fmap (line:) getList
You've also identified another way to do it, which depends on taking a list of strings and producing a new list:
getList :: IO [String]
getList = fmap reverse (go []) where
  go xs = do
    x <- getLine
    if x == "exit"
      then return xs
      else go (x : xs)
There are probably several other ways to do it.
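One of those other ways, sketched here under assumptions, is to pass the line source in as a parameter, so the same loop can read from the terminal or from a prepared list. collectUntilExit and scripted are hypothetical names, and "exit" is kept as the stop word used in the answer above.

```haskell
import Data.IORef

-- The loop, generic in where the lines come from.
collectUntilExit :: Monad m => m String -> m [String]
collectUntilExit next = do
  line <- next
  if line == "exit"
    then return []
    else fmap (line :) (collectUntilExit next)

-- Reading from the terminal.
getList :: IO [String]
getList = collectUntilExit getLine

-- Hypothetical scripted source: hands out the given lines one by one,
-- then answers "exit" forever.
scripted :: [String] -> IO (IO String)
scripted inputs = do
  ref <- newIORef inputs
  return $ do
    xs <- readIORef ref
    case xs of
      []       -> return "exit"
      (y : ys) -> writeIORef ref ys >> return y
```

With a source built from scripted ["a", "b", "exit", "c"], the loop returns ["a","b"] and never looks at "c".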

What's the meaning of IO actions within pure functions?

I thought that in principle Haskell's type system would forbid calls to impure functions (i.e. f :: a -> IO b) from pure ones, but today I realized that by calling them with return they compile just fine. In this example:
h :: Maybe ()
h = do
  return $ putStrLn "???"
  return ()
h works in the Maybe monad, but it's a pure function nevertheless. Compiling and running it simply returns Just () as one would expect, without actually doing any I/O. I think Haskell's laziness puts the things together (i.e. putStrLn's return value is not used - and can't since its value constructors are hidden and I can't pattern match against it), but why is this code legal? Are there any other reasons that makes this allowed?
As a bonus, related question: in general, is it possible to forbid at all the execution of actions of a monad from within other ones, and how?
IO actions are first-class values like any other; that's what makes Haskell's IO so expressive, allowing you to build higher-order control structures (like mapM_) from scratch. Laziness isn't relevant here [1]; it's just that you're not actually executing the action. You're just constructing the value Just (putStrLn "???"), then throwing it away.
putStrLn "???" existing doesn't cause a line to be printed to the screen. By itself, putStrLn "???" is just a description of some IO that could be done to cause a line to be printed to the screen. The only execution that happens is executing main, which you constructed from other IO actions, or whatever actions you type into GHCi. For more information, see the introduction to IO.
Indeed, it's perfectly conceivable that you might want to juggle about IO actions inside Maybe; imagine a function String -> Maybe (IO ()), which checks the string for validity, and if it's valid, returns an IO action to print some information derived from the string. This is possible precisely because of Haskell's first-class IO actions.
But a monad has no ability to execute the actions of another monad unless you give it that ability.
[1] Indeed, h = putStrLn "???" `seq` return () doesn't cause any IO to be performed either, even though it forces the evaluation of putStrLn "???".
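The String -> Maybe (IO ()) idea can be sketched like this. The names validate, runValidated and forced are hypothetical, and "non-empty" is an arbitrary stand-in for whatever validity check is needed.

```haskell
import Data.Maybe (isJust, isNothing)

-- Hypothetical validator: returns an action for valid input, Nothing otherwise.
-- Building the action does not print anything.
validate :: String -> Maybe (IO ())
validate s
  | null s    = Nothing
  | otherwise = Just (putStrLn ("valid: " ++ s))

-- Only here, inside IO, is the stored action actually executed.
runValidated :: String -> IO ()
runValidated s = case validate s of
  Nothing     -> return ()
  Just action -> action

-- Forcing the action with seq evaluates it but still performs no IO.
forced :: Maybe ()
forced = putStrLn "???" `seq` return ()
```

Checking isJust (validate "abc") or isNothing (validate "") is pure and prints nothing; output appears only when runValidated is run as part of main or typed into GHCi.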
Let's desugar!
h = do return (putStrLn "???"); return ()
-- rewrite (do foo; bar) as (foo >> do bar)
h = return (putStrLn "???") >> do return ()
-- redundant do
h = return (putStrLn "???") >> return ()
-- return for Maybe = Just
h = Just (putStrLn "???") >> Just ()
-- replace (foo >> bar) with its definition, (foo >>= (\_ -> bar))
h = Just (putStrLn "???") >>= (\_ -> Just ())
Now, what happens when you evaluate h?* Well, for Maybe,
(Just x) >>= f = f x
Nothing >>= f = Nothing
So we pattern match the first case
f x
-- x = (putStrLn "???"), f = (\_ -> Just ())
(\_ -> Just ()) (putStrLn "???")
-- apply the argument and ignore it
Just ()
Notice how we never had to perform putStrLn "???" in order to evaluate this expression.
*n.b. It is somewhat unclear at which point "desugaring" stops and "evaluation" begins. It depends on your compiler's inlining decisions. Pure computations could be evaluated entirely at compile time.

"<-" bindings in do notation

I have a hard time grasping this. When writing in do notation, how are the following two lines different?
1. let x = expression
2. x <- expression
I can't see it. Sometimes one works, sometimes the other, but rarely both. "Learn You a Haskell" says that <- binds the right side to the symbol on the left. But how is that different from simply defining x with let?
The <- statement will extract the value from a monad, and the let statement will not.
import Data.Typeable
readInt :: String -> IO Int
readInt s = do
  putStrLn $ "Enter value for " ++ s ++ ": "
  readLn
main = do
  x <- readInt "x"
  let y = readInt "y"
  putStrLn $ "x :: " ++ show (typeOf x)
  putStrLn $ "y :: " ++ show (typeOf y)
When run, the program will ask for the value of x, because the monadic action readInt "x" is executed by the <- statement. It will not ask for the value of y, because readInt "y" is evaluated but the resulting monadic action is not executed.
Enter value for x:
123
x :: Int
y :: IO Int
Since x :: Int, you can do normal Int things with it.
putStrLn $ "x = " ++ show x
putStrLn $ "x * 2 = " ++ show (x * 2)
Since y :: IO Int, you can't pretend that it's a regular Int.
putStrLn $ "y = " ++ show y -- ERROR
putStrLn $ "y * 2 = " ++ show (y * 2) -- ERROR
In a let binding, the expression can have any type, and all you're doing is giving it a name (or pattern matching on its internal structure).
In the <- version, the expression must have type m a, where m is whatever monad the do block is in. So in the IO monad, for instance, bindings of this form must have some value of type IO a on the right-hand side. The a part (inside the monadic value) is what is bound to the pattern on the left-hand side. This lets you extract the "contents" of the monad within the limited scope of the do block.
The do notation is, as you may have read, just syntactic sugar over the monadic binding operators (>>= and >>). x <- expression de-sugars to expression >>= \x -> and expression (by itself, without the <-) de-sugars to expression >>. This just gives a more convenient syntax for defining long chains of monadic computations, which otherwise tend to build up a rather impressive mass of nested lambdas.
let bindings don't de-sugar at all, really. The only difference between let in a do block and let outside of a do block is that the do version doesn't require the in keyword to follow it; the names it binds are implicitly in scope for the rest of the do block.
In the let form, the expression is a non-monadic value, while the right side of a <- is a monadic expression. For example, you can only have an I/O operation (of type IO t) in the second kind of binding. In detail, the two forms can be roughly translated as (where ==> shows the translation):
do {let x = expression; rest} ==> let x = expression in do {rest}
and
do {x <- operation; rest} ==> operation >>= (\ x -> do {rest})
let just assigns a name to, or pattern matches on arbitrary values.
For <-, let us first step away from the (not really) mysterious IO monad, and consider monads that have a notion of a "container", like a list or Maybe. Then <- does nothing more than "unpack" the elements of that container. The opposite operation of "putting it back" is return. Consider this code:
add m1 m2 = do
  v1 <- m1
  v2 <- m2
  return (v1 + v2)
It "unpacks" the elements of two containers, adds the values together, and wraps the result again in the same monad. It works with lists, taking all possible combinations of elements:
main = print $ add [1, 2, 3] [40, 50]
--[41,51,42,52,43,53]
In fact in case of lists you could write as well add m1 m2 = [v1 + v2 | v1 <- m1, v2 <- m2]. But our version works with Maybes, too:
main = print $ add (Just 3) (Just 12)
--Just 15
main = print $ add (Just 3) Nothing
--Nothing
Now IO isn't that different at all. It's a container for a single value, but it's a "dangerous" impure value like a virus, that we must not touch directly. The do-Block is here our glass containment, and the <- are the built-in "gloves" to manipulate the things inside. With the return we deliver the full, intact container (and not just the dangerous content), when we are ready. By the way, the add function works with IO values (that we got from a file or the command line or a random generator...) as well.
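The remark that add also works with IO values can be sketched as follows. The explicit type signature is an assumption added for clarity (the original leaves it to inference), addInIO is a hypothetical name, and return stands in for values that would really come from a file or the command line.

```haskell
-- The same add, with an explicit (assumed) general type signature.
add :: (Monad m, Num a) => m a -> m a -> m a
add m1 m2 = do
  v1 <- m1
  v2 <- m2
  return (v1 + v2)

-- Two IO values; `return` stands in for values read from outside.
addInIO :: IO Int
addInIO = add (return 3) (return 12)
```

Executing addInIO produces 15, and in a real program the two arguments could be something like readLn instead.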
Haskell reconciles side-effectful imperative programming with pure functional programming by representing imperative actions with types of form IO a: the type of an imperative action that produces a result of type a.
One of the consequences of this is that binding a variable to the value of an expression and binding it to the result of executing an action are two different things:
x <- action -- execute action and bind x to the result; may cause effect
let x = expression -- bind x to the value of the expression; no side effects
So getLine :: IO String is an action, which means it must be used like this:
do line <- getLine -- side effect: read from stdin
   -- ...do stuff with line
Whereas line1 ++ line2 :: String is a pure expression, and must be used with let:
do line1 <- getLine -- executes an action
   line2 <- getLine -- executes an action
   let joined = line1 ++ line2 -- pure calculation; no action is executed
   return joined
Here is a simple example showing you the difference.
Consider the two following simple expressions:
letExpression = 2
bindExpression = Just 2
The information you are trying to retrieve is the number 2.
Here is how you do it:
let x = letExpression
x <- bindExpression
let directly puts the value 2 in x.
<- extracts the value 2 from the Just and puts it in x.
You can see with that example, why these two notations are not interchangeable:
let x = bindExpression would directly put the value Just 2 in x.
x <- letExpression would not have anything to extract and put in x.
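Putting both bindings inside complete do blocks shows the difference in the types. This is a minimal sketch in the Maybe monad; viaBind and viaLet are hypothetical names.

```haskell
-- With <-, x is the Int inside the Just.
viaBind :: Maybe Int
viaBind = do
  x <- Just 2           -- x :: Int
  return (x + 1)

-- With let, x is the whole Maybe Int, so it has to be handled as one.
viaLet :: Maybe (Maybe Int)
viaLet = do
  let x = Just 2        -- x :: Maybe Int
  return (fmap (+ 1) x)
```

viaBind is Just 3, while viaLet is Just (Just 3): the let version never unwrapped anything, so the result is nested one level deeper.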
