Newbie: understanding main and IO() - haskell

While learning Haskell I am wondering when an IO action will be performed. In several places I found descriptions like this:
"What’s special about I/O actions is that if they fall into the main function, they are performed."
But in the following example, 'greet' never returns and therefore nothing should be printed.
import Control.Monad
main = greet
greet = forever $ putStrLn "Hello World!"
Or maybe I should ask: what does it mean to "fall into the main function"?

First of all, main is not a function. It is indeed just a regular value and its type is IO (). The type can be read as: An action that, when performed, produces a value of type ().
Now the run-time system plays the role of an interpreter that performs the actions that you have described. Let's take your program as example:
main = forever (putStrLn "Hello world!")
Notice that I have performed a transformation. That one is valid, since Haskell is a referentially transparent language. The run-time system resolves the forever and finds this:
main = putStrLn "Hello world!" >> MORE1
It doesn't yet know what MORE1 is, but it now knows that it has a composition with one known action, which is executed. After executing it, it resolves the second action, MORE1 and finds:
MORE1 = putStrLn "Hello world!" >> MORE2
Again it executes the first action in that composition and then keeps on resolving.
Of course this is a high level description. The actual code is not an interpreter. But this is a way to picture how a Haskell program gets executed. Let's take another example:
main = forever (getLine >>= putStrLn)
The RTS sees this:
main = forever MORE1
<< resolving forever >>
MORE1 = getLine >>= MORE2
<< executing getLine >>
MORE2 result = putStrLn result >> MORE1
<< executing putStrLn result (where 'result' is the line read)
and starting over >>
When understanding this you understand how an IO String is not "a string with side effects" but rather the description of an action that would produce a string. You also understand why laziness is crucial for Haskell's I/O system to work.

In my opinion the point of the statement "What’s special about I/O actions is that if they fall into the main function, they are performed." is that IO actions are first class citizens. That is, IO-actions can occur at all places where values of other data types like Int can occur. For example, you can define a list that contains IO actions as follows.
actionList = [putStr "Hello", putStr "World"]
The list actionList has type [IO ()]. That is, the list contains actions that interact with the world, for example, print on the console or read in input from the user. But, in defining this list we do not execute the actions, we simply put them in a list for later use.
If an IO can occur somewhere in your program, the question arrises when these actions are performed and here main comes into play. Consider the following definition of main.
main = do
actionList !! 0
actionList !! 1
This main function projects to the first and the second component of the list and "executes" the corresponding actions by using them within its definition. Note that it does not necessarily have to be the main function itself that executes an IO action. Any function that is called from the main function can execute actions as well. For example, we can define a function that calls the actions from actionList and let main call this function as follows.
main = do
caller
putStr "!"
caller = do
actionList !! 0
actionList !! 1
To highlight that it does not have to be a simple renaming like in main = caller I have added an action that prints an exclamation mark after it has performed the actions from the list.

Simple IO actions can be combined into more advanced ones by using do notation.
main = do
printStrLn "Hello"
printStrLn "World"
combines the IO action printStrLn "Hello" with the IO action printStrLn "World". Main is now an IO action first printing a line that says "Hello" and then a line that says "World". Written without do-notation (which is just syntactic suger) it looks like this:
main = printStrLn "Hello" >> printStrLn "World"
Here you can see the >> function combining the two actions.
You can create an IO action that reads a line, passes it to a function(that does awesome stuff to it :)) and the prints the result like this:
main = do
input <- getLine
let result = doAwesomeStuff input
printStrLn result
or without binding the result to a variable:
main = do
input <- getLine
printStrLn (doAwesomeStuff input)
This can ofcourse also be written as IO actions and functions that combine them like this:
main = getLine >>= (\input -> printStrLn (doAwesomeStuff input))
When you run the program the main IO action is executed. This is the only time any IO actions are actually executed. (well technically you can also execute them within you program, but it is not safe. The function that does the is called unsafePerformIO.)
You can read more here: http://www.haskell.org/haskellwiki/Introduction_to_Haskell_IO/Actions
(This link is probably a better explaination than mine, but I only found it after I had written nearly everything. It is also quite a bit longer)

launchAMissile :: IO ()
launchAMissile = do
openASilo
loadCoordinates
launchAMissile
main = do
let launch3missiles = launchAMissile >> launchAMissile >> launchAMissile
putStrLn "Not actually launching any missiles"

forever isn't a loop like C's while (true). It is a function that produces an IO value (which contains an infinitely repeated sequence of actions), which is consumed by the caller. (In this case, the caller is main, which means that the actions get executed by the runtime system).

Related

Is there a way to "unwrap" IO monad in Haskell inside another io action?

In particular I want to use pure Haskell functions on the results on console input. I'm curious if exists something like ? operator from rust.
this snippet complains that words expect the type String, while its supplied wrapped into an io action. But i can't even use >>= operator since as far as i understand i cannot instantiate IO constructor directly. So, does it mean all the standard library functions can't work with io even in the scope of io action?
thing :: IO ()
thing = do
let mbwords = words $ getLine ;
Since Haskell is not designed to be totally useless, there must be a way. And there is more than one indeed.
What you looking for is something like this
main = do
line <- getLine
let mbwords = words line
or perhaps
main = do
mbwords <- fmap words getLine
Note you use <- to "open" the monadic value inside the do-block, not let.
Also note instantiating IO constructors is not necessary as long as you have functions that return IO actions, such as print. You can express the same idea with
main = fmap words getLine >>= print
(of course you can use an arbitrary action of the right type instead of print).

What's the use of main returning IO Something rather than IO()?

I'm reading http://learnyouahaskell.com/ ... And something surprised me:
Because of that, main always has a type signature of main :: IO something, where something is some concrete type.
? So main doesn't have to be of type IO(), but rather can be IO(String) or IO(Int)? But what's the use of this?
I did some playing...
m#m-X555LJ:~$ cat wtf.hs
main :: IO Int
main = fmap (read :: String -> Int) getLine
m#m-X555LJ:~$ runhaskell wtf.hs
1
m#m-X555LJ:~$ echo $?
0
m#m-X555LJ:~$
Hmm. So my first hypothesis is disproven. I thought this was a way for a Haskell program to return exit status to the shell, much like a C program starts from int main() and reports the exit status with return 0 or return 1.
But nope: the above program consumes the 1 from input and then does nothing, and in particular doesn't seem to return this 1 to the shell.
One more test:
m#m-X555LJ:~$ cat wtf.hs
main = getContents
m#m-X555LJ:~$ runhaskell wtf.hs
m#m-X555LJ:~$
Wow. This time I tried returning IO String. For reasons unknown to me, this time Haskell doesn't even wait for input, as it did when I was returning IO Int. The program seems to simply do nothing.
This hints that the value is really not returned anywhere: apparently, since the results of getContents are nowhere used, the whole instruction was skipped due to laziness. But if this was the case, why was returning IO Int not skipped? Well yes: I did fmap read on the IO action; but same stuff seems to apply, computing the read is only necessary if the result of the action is used, which - as the main = getContents example seems to hint - is not used, so laziness should also skip the read and hence also the getLine, right? Well, wrong - but I'm confused why.
What's the use of returning IO Something from main rather than only IO ()?
This is actually multiple questions, but in order:
The 'result' of main doesn't have a meaning, that is why it can be () or anything else. It's not used at all.
The reason types other than IO () are allowed for main is for convenience; Otherwise you'd always have to do something like main = void $ realMain to discard results (you may well want to have an action that could return a result which you don't care about as the last thing that happens) which is a bit tedious. IMHO silently discarding things is bad and so I'd prefer if main was forced to be :: IO (), but you can always get that effect by just supplying the type signature yourself so it's not really a problem in practice.
Side point: If you want to exit with a specific exit code, use System.Exit
The reason fmap read getLine consumes output and getContents doesn't is because getContents is lazy and getLine isn't - i.e. getLine does read a line of text where you'd think it does, whereas getContents only does any actual IO if the result is "needed" in the Haskell world; Since the result of IO isn't used for anything that means it doesn't do anything if getContents is your whole main.

Does the main-function in Haskell always start with main = do?

In java we always write:
public static void main(String[] args){...}
when we want to start writing a program.
My question is, is it the same for Haskell, IE: can I always be sure to declare:
main = do, when I want to write code for a program in Haskell?
for example:
main = do
putStrLn "What's your name?"
name <- getLine
putStrLn ("Hello " ++ name)
This program is going to ask the user "What's your name?"
the user input will then be stored inside of the name-variable, and
"Hello" ++ name will be displayed before the program terminates.
Short answer: No, we have to declare a main =, but not a do.
The main must be an IO monad type (so IO a) where a is arbitrary (since it is ignored), as is written here:
The use of the name main is important: main is defined to be the entry point of a Haskell program (similar to the main function in C), and must have an IO type, usually IO ().
But you do not necessary need do notation. Actually do is syntactical sugar. Your main is in fact:
main =
putStrLn "What's your name?" >> getLine >>= \n -> putStrLn ("Hello " ++ n)
Or more elegantly:
main = putStrLn "What's your name?" >> getLine >>= putStrLn . ("Hello " ++)
So here we have written a main without do notation. For more about desugaring do notation, see here.
Yes, if you have more than one line in your do block, and if you are even using the do notation.
The full do-notation syntax also includes explicit separators -- curly braces and semicolons:
main = do { putStrLn "What's your name?"
; name <- getLine
; putStrLn ("Hello " ++ name)
}
With them, indentation plays no role other than in coding style (good indentation improves readability; explicit separators ensure code robustness, remove white-space related brittleness). So when you have only one line of IO-code, like
main = do { print "Hello!" }
there are no semicolons, no indentation to pay attention to, and the curly braces and do keyword itself become redundant:
main = print "Hello!"
So, no, not always. But very often it does, and uniformity in code goes a long way towards readability.
do blocks translate into monadic code, but you can view this fact as implementational detail, at first. In fact, you should. You can treat the do notation axiomatically, as an embedded language, mentally. Besides, it is that, anyway.
The simplified do-syntax is:
do { pattern1 <- action1
; pattern2 <- action2
.....................
; return (.....)
}
Each actioni is a Haskell value of type M ai for some monad M and some result type ai. Each action produces its own result type ai while all actions must belong to the same monad type M.
Each patterni receives the previously "computed" result from the corresponding action.
Wildcards _ can be used to ignore it. If this is the case, the _ <- part can be omitted altogether.
"Monad" is a scary and non-informative word, but it is really nothing more than EDSL, conceptually. Embedded domain-specific language means that we have native Haskell values standing for (in this case) I/O computations. We write our I/O programs in this language, which become a native Haskell value(s), which we can operate upon as on any other native Haskell value -- collect them in lists, compose them into more complex computation descriptions (programs), etc.
The main value is one such value computed by our Haskell program. The compiler sees it, and performs the I/O program that it stands for, at run time.
The point to it is that we can now have a "function" getCurrentTime (impossible, on the face of it, in functional paradigm since it must return different results on separate invocations), because it is not returning the current time -- the action it describes will do so, when the I/O program it describes is run by the run-time system.
On the type level this is reflected by such values not having just some plain Haskell type a, but a parameterized type, IO a, "tagged" by IO as belonging to this special world of I/O programming.
See also: Why does haskell's bind function take a function from non-monadic to monadic.

Why/how does recursive IO work?

Haskell IO is often explained in terms of the entire program being a pure function (main) that returns an IO value (often described as an imperative IO program), which is then executed by the runtime.
This mental model works fine for simple examples, but fell over for me as soon as I saw a recursive main in Learn You A Haskell. For example:
main = do
line <- getLine
putStrLn line
main
Or, if you prefer:
main = getLine >>= putStrLn >> main
Since main never terminates, it never actually returns an IO value, yet the program endlessly reads and echoes back lines just fine - so the simple explanation above doesn't quite work. Am I missing something simple or is there a more complete explanation (or is it 'simply' compiler magic) ?
In this case, main is a value of type IO () rather than a function. You can think of it as a sequence of IO a values:
main = getLine >>= putStrLn >> main
This makes it a recursive value, not unlike infinite lists:
foo = 1 : 2 : foo
We can return a value like this without needing to evaluate the whole thing. In fact, it's a reasonably common idiom.
foo will loop forever if you try to use the whole thing. But that's true of main too: unless you use some external method to break out of it, it will never stop looping! But you can start getting elements out of foo, or executing parts of main, without evaluating all of it.
The value main denotes is an infinite program:
main = do
line <- getLine
putStrLn line
line <- getLine
putStrLn line
line <- getLine
putStrLn line
line <- getLine
putStrLn line
line <- getLine
putStrLn line
line <- getLine
putStrLn line
...
But it's represented in memory as a recursive structure that references itself. That representation is finite, unless someone tries to unfold the entire thing to get a non-recursive representation of the entire program - that would never finish.
But just as you can probably figure out how to start executing the infinite program I wrote above without waiting for me to tell you "all" of it, so can Haskell's runtime system figure out how to execute main without unfolding the recursion up-front.
Haskell's lazy evaluation is actually interleaved with the runtime system's execution of the main IO program, so this works even for a function that returns an IO action which recursively invokes the function, like:
main = foo 1
foo :: Integer -> IO ()
foo x = do
print x
foo (x + 1)
Here foo 1 is not a recursive value (it contains foo 2, not foo 1), but it's still an infinite program. However this works just fine, because the program denoted by foo 1 is only generated lazily on-demand; it can be produced as the runtime system's execution of main goes along.
By default Haskell's laziness means that nothing is evaluated until it's needed, and then only "just enough" to get past the current block. Ultimately the source of all the "need" in "until it's needed" comes from the runtime system needing to know what the next step in the main program is so it can execute it. But it's only ever the next step; the rest of the program after that can remain unevaluated until after the next step has been fully executed. So infininte programs can be executed and do useful work so long as it's always only a finite amount of work to generate "one more step".

Keep variable inside another function in Haskell

I am lost in this concept. This is my code and what it does is just asking what is your name.
askName = do
name <- getLine
return ()
main :: IO ()
main = do
putStrLn "Greetings once again. What is your name?"
askName
However, How can I access in my main the variable name taken in askName?
This is my second attempt:
askUserName = do
putStrLn "Hello, what's your name?"
name <- getLine
return name
sayHelloUser name = do
putStrLn ("Hey " ++ name ++ ", you rock!")
main = do
askUserName >>= sayHelloUser
I can now re-use the name in a callback way, however if in main I want to call name again, how can I do that? (avoiding to put name <- getLine in the main, obviously)
We can imagine that you ask the name in order to print it, then let's rewrite it.
In pseudo code, we have
main =
print "what is your name"
bind varname with userReponse
print varname
Your question then concern the second instruction.
Take a look about the semantic of this one.
userReponse is a function which return the user input (getLine)
varname is a var
bind var with fun : is a function which associate a var(varname) to the output of a function(getLine)
Or as you know in haskell everything is a function then our semantic is not well suited.
We need to revisit it in order to respect this idiom. According to the later reflexion the semantic of our bind function become bind fun with fun
As we cannot have variable, to pass argument to a function, we need, at a first glance, to call another function, in order to produce them. Thus we need a way to chain two functions, and it's exactly what's bind is supposed to do. Furthermore, as our example suggest, an evaluation order should be respected and this lead us to the following rewriting with fun bind fun
That's suggest that bind is more that a function it's an operator.
Then for all function f and g we have with f bind g.
In haskell we note this as follow
f >>= g
Furthermore, as we know that a function take 0, 1 or more argument and return 0, 1 or more argument.
We could refine our definition of our bind operator.
In fact when f doesn't return any result we note >>= as >>
Applying, theses reflexions to our pseudo code lead us to
main = print "what's your name" >> getLine >>= print
Wait a minute, How the bind operator differ from the dot operator use for the composition of two function ?
It's differ a lot, because we have omit an important information, bind doesn't chain two function but it's chain two computations unit. And that's the whole point to understand why we have define this operator.
Let's write down a global computation as a sequence of computation unit.
f0 >>= f1 >> f2 >> f3 ... >>= fn
As this stage a global computation could be define as a set of computation unit with two operator >>=, >>.
How do we represent set in computer science ?
Usually as container.
Then a global computation is a container which contain some computation unit. On this container we could define some operator allowing us to move from a computation unit to the next one, taking into account or not the result of the later, this is ours >>= and >> operator.
As it's a container we need a way to inject value into it, this is done by the return function. Which take an object and inject it into a computation, you could check it through is signature.
return :: a -> m a -- m symbolize the container, then the global computation
As it's a computation we need a way to manage a failure, this done by the fail function.
In fact the interface of a computation is define by a class
class Computation
return -- inject an objet into a computation
>>= -- chain two computation
>> -- chain two computation, omitting the result of the first one
fail -- manage a computation failure
Now we can refine our code as follow
main :: IO ()
main = return "What's your name" >>= print >> getLine >>= print
Here I have intentionally include the signature of the main function, to express the fact that we are in the global IO computation and the resulting output with be () (as an exercise enter $ :t print in ghci).
If we take more focus on the definition for >>=, we can emerge the following syntax
f >>= g <=> f >>= (\x -> g) and f >> g <=> f >>= (\_ -> g)
And then write
main :: IO ()
main =
return "what's your name" >>= \x ->
print x >>= \_ ->
getLine >>= \x ->
print x
As you should suspect, we certainly have a special syntax to deal with bind operator in computational environment. You're right this is the purpose of do syntax
Then our previous code become, with do syntax
main :: IO ()
main = do
x <- return "what's your name"
_ <- print x
x <- getLine
print x
If you want to know more take a look on monad
As mentioned by leftaroundabout, my initial conclusion was a bit too enthusiastic
You should be shocked, because we have break referential transparency law (x take two different value inside our sequence of instruction), but it doesn't matter anymore,because we are into a computation, and a computation as defined later is a container from which we can derive an interface and this interface is designed to manage, as explain, the impure world which correspond to the real world.
Return the name from askname. In Haskell its not idiomatic to access "global" variables:
askName :: IO String
askName = do
name <- getLine
return name
main :: IO ()
main = do
putStrLn "Greetings once again. What is your name?"
name <- askName
putStrLn name
Now the only problem is tht the askName function is kind of pointless, since its now just an alias to getLine. We could "fix" that by putting the question inside askName:
askName :: IO String
askName = do
putStrLn "Greetings once again. What is your name?"
name <- getLine
return name
main :: IO ()
main = do
name <- askName
putStrLn name
Finally, just two little points: its usually a good idea to put type declarations when you are learning, to make things explicit and help compiler error messages. Another thing is to remember that the "return" function is only used for monadic code (it is not analogous to a traditional return statement!) and sometimes we could have ommited some intermediary variables:
askName :: IO String
askName = do
putStrLn "Greetings once again. What is your name?"
getLine

Resources