In Java we always write:
public static void main(String[] args){...}
when we want to start writing a program.
My question is: is it the same in Haskell, i.e. can I always be sure to declare
main = do when I want to write code for a program in Haskell?
For example:
main = do
    putStrLn "What's your name?"
    name <- getLine
    putStrLn ("Hello " ++ name)
This program is going to ask the user "What's your name?"
the user input will then be stored inside of the name-variable, and
"Hello" ++ name will be displayed before the program terminates.
Short answer: No, we have to declare a main =, but not a do.
main must have an IO type (so IO a), where a is arbitrary (since the result is ignored), as is written here:
The use of the name main is important: main is defined to be the entry point of a Haskell program (similar to the main function in C), and must have an IO type, usually IO ().
But you do not necessarily need do notation. Actually, do is syntactic sugar. Your main is in fact:
main =
    putStrLn "What's your name?" >> getLine >>= \n -> putStrLn ("Hello " ++ n)
Or more elegantly:
main = putStrLn "What's your name?" >> getLine >>= putStrLn . ("Hello " ++)
So here we have written a main without do notation. For more about desugaring do notation, see here.
Yes, if you have more than one line in your do block, and if you are even using the do notation.
The full do-notation syntax also includes explicit separators -- curly braces and semicolons:
main = do { putStrLn "What's your name?"
          ; name <- getLine
          ; putStrLn ("Hello " ++ name)
          }
With them, indentation plays no role other than in coding style (good indentation improves readability; explicit separators ensure code robustness, remove white-space related brittleness). So when you have only one line of IO-code, like
main = do { print "Hello!" }
there are no semicolons, no indentation to pay attention to, and the curly braces and do keyword itself become redundant:
main = print "Hello!"
So, no, not always. But very often main does start with do, and uniformity in code goes a long way towards readability.
do blocks translate into monadic code, but at first you can view this fact as an implementation detail. In fact, you should. You can mentally treat the do notation axiomatically, as an embedded language. Besides, that is what it is anyway.
The simplified do-syntax is:
do { pattern1 <- action1
   ; pattern2 <- action2
   .....................
   ; return (.....)
   }
Each actioni is a Haskell value of type M ai for some monad M and some result type ai. Each action produces its own result type ai while all actions must belong to the same monad type M.
Each patterni receives the previously "computed" result from the corresponding action.
Wildcards _ can be used to ignore it. If this is the case, the _ <- part can be omitted altogether.
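As a rough illustration of the simplified syntax above, here is a small sketch that uses the Maybe monad instead of IO. The names safeDiv and ratioSum are made up for this example. It shows a pattern on the left of <-, a result ignored with a wildcard, and the same step again with the _ <- part left out.

```haskell
-- Hypothetical helper: division that fails on a zero divisor.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Every action in the block lives in the same monad (Maybe),
-- but each one may produce a different result type.
ratioSum :: Int -> Int -> Maybe (Int, Bool)
ratioSum x y = do
  q <- safeDiv x y           -- q :: Int, the result of the first action
  (a, b) <- Just (q, q + 1)  -- a tuple pattern on the left of <-
  _ <- safeDiv 1 q           -- result ignored with a wildcard
  safeDiv 1 q                -- the same action with the "_ <-" part omitted
  return (a + b, even q)

main :: IO ()
main = do
  print (ratioSum 10 2)  -- every step succeeds
  print (ratioSum 10 0)  -- the first action fails, so the whole block is Nothing
```

If any action in the block yields Nothing, the remaining steps are skipped; that behaviour belongs to Maybe, not to the do syntax itself.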
"Monad" is a scary and non-informative word, but it is really nothing more than EDSL, conceptually. Embedded domain-specific language means that we have native Haskell values standing for (in this case) I/O computations. We write our I/O programs in this language, which become a native Haskell value(s), which we can operate upon as on any other native Haskell value -- collect them in lists, compose them into more complex computation descriptions (programs), etc.
The main value is one such value computed by our Haskell program. The compiler sees it, and performs the I/O program that it stands for, at run time.
The point to it is that we can now have a "function" getCurrentTime (impossible, on the face of it, in functional paradigm since it must return different results on separate invocations), because it is not returning the current time -- the action it describes will do so, when the I/O program it describes is run by the run-time system.
On the type level this is reflected by such values not having just some plain Haskell type a, but a parameterized type, IO a, "tagged" by IO as belonging to this special world of I/O programming.
See also: Why does haskell's bind function take a function from non-monadic to monadic.
In particular I want to use pure Haskell functions on the results of console input. I'm curious whether something like the ? operator from Rust exists.
This snippet complains that words expects the type String, while it is supplied wrapped in an IO action. But I can't even use the >>= operator since, as far as I understand, I cannot instantiate the IO constructor directly. So, does it mean all the standard library functions can't work with IO even in the scope of an IO action?
thing :: IO ()
thing = do
let mbwords = words $ getLine ;
Since Haskell is not designed to be totally useless, there must be a way. And there is more than one indeed.
What you are looking for is something like this
main = do
    line <- getLine
    let mbwords = words line
    print mbwords
or perhaps
main = do
    mbwords <- fmap words getLine
    print mbwords
Note you use <- to "open" the monadic value inside the do-block, not let.
Also note instantiating IO constructors is not necessary as long as you have functions that return IO actions, such as print. You can express the same idea with
main = fmap words getLine >>= print
(of course you can use an arbitrary action of the right type instead of print).
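A minimal, self-contained sketch of the same idea with a pure function applied to console input. The name countWords is hypothetical, and the isEOF check is only there so that the program does not fail when it is run without any input.

```haskell
import System.IO (isEOF)

-- Pure function: it knows nothing about IO.
countWords :: String -> Int
countWords = length . words

main :: IO ()
main = do
  done <- isEOF                     -- True when there is no input to read
  if done
    then putStrLn "no input"
    else do
      n <- fmap countWords getLine  -- apply the pure function inside IO
      print n
```

The pure function never sees the IO type; fmap (or <- followed by let) is what connects the two.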
Take the function getLine - it has the type:
getLine :: IO String
How do I extract the String from this IO action?
More generally, how do I convert this:
IO a
to this:
a
If this is not possible, then why can't I do it?
In Haskell, when you want to work with a value that is "trapped" in IO, you don't take the value out of IO. Instead, you put the operation you want to perform into IO, as well!
For example, suppose you want to check how many characters the getLine :: IO String will produce, using the length function from Prelude.
There exists a helper function called fmap which, when specialized to IO, has the type:
fmap :: (a -> b) -> IO a -> IO b
It takes a function that works on "pure" values not trapped in IO, and gives you a function that works with values that are trapped in IO. This means that the code
fmap length getLine :: IO Int
represents an IO action that reads a line from console and then gives you its length.
<$> is an infix synonym for fmap that can make things simpler. This is equivalent to the above code:
length <$> getLine
Now, sometimes the operation you want to perform with the IO-trapped value itself returns an IO-trapped value. Simple example: you want to write back the string you have just read using putStrLn :: String -> IO ().
In that case, fmap is not enough. You need to use the (>>=) operator, which, when specialized to IO, has the type IO a -> (a -> IO b) -> IO b. In our case:
getLine >>= putStrLn :: IO ()
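To put fmap and (>>=) side by side, here is a small sketch. fakeLine is a stand-in for getLine (an assumption made so that the example runs without console input), and describe is a hypothetical pure function.

```haskell
-- Stand-in for getLine so the sketch runs without console input (an assumption).
fakeLine :: IO String
fakeLine = return "hello"

-- Hypothetical pure function.
describe :: String -> String
describe s = s ++ " has " ++ show (length s) ++ " characters"

-- fmap lifts a pure function (String -> Int) so it works inside IO.
lineLength :: IO Int
lineLength = fmap length fakeLine

-- (>>=) feeds the result to a function that itself returns an IO action.
report :: IO ()
report = fakeLine >>= putStrLn . describe

main :: IO ()
main = do
  n <- lineLength
  print n
  report
```

Replacing fakeLine with getLine gives the interactive version; the types stay the same.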
Using (>>=) to chain IO actions has an imperative, sequential flavor. There is a kind of syntactic sugar called "do-notation" which helps to write sequential operations like these in a more natural way:
do line <- getLine
   putStrLn line
Notice that the <- here is not an operator, but part of the syntactic sugar provided by the do notation.
Not going into any details, if you're in a do block, you can (informally/inaccurately) consider <- as getting the value out of the IO.
For example, the following function takes a line from getLine, and passes it to a pure function that just takes a String
main = do
    line <- getLine
    putStrLn (wrap line)

wrap :: String -> String
wrap line = "'" ++ line ++ "'"
If you compile this as wrap, and on the command line run
echo "Hello" | wrap
you should see
'Hello'
If you know C then consider the question "How can I get the string from gets?" An IO String is not some string that's made hard to get to, it's a procedure that can return a string - like reading from a network or stdin. You want to run the procedure to obtain a string.
A common way to run IO actions in a sequence is do notation:
main = do
    someString <- getLine
    -- someString :: String
    print someString
In the above you run the getLine operation to obtain a String value then use the value however you wish.
So "generally", it's unclear why you think you need a function of this type and in this case it makes all the difference.
It should be noted for completeness that it is possible. There indeed exists a function of type IO a -> a in the base library called unsafePerformIO.
But the unsafe part is there for a reason. There are few situations where its usage would be considered justified. It's an escape hatch to be used with great caution - most of the time you will let monsters in instead of letting yourself out.
Why can't you normally go from IO a to a? Well at the very least it allows you to break the rules by having a seemingly pure function that is not pure at all - ouch! If it were a common practice to do this the type signatures and all the work done by the compiler to verify them would make no sense at all. All the correctness guarantees would go out of the window.
Haskell is, partly, interesting precisely because this is (normally) impossible.
For how to approach your getLine problem in particular see the other answers.
Haskell IO is often explained in terms of the entire program being a pure function (main) that returns an IO value (often described as an imperative IO program), which is then executed by the runtime.
This mental model works fine for simple examples, but fell over for me as soon as I saw a recursive main in Learn You A Haskell. For example:
main = do
    line <- getLine
    putStrLn line
    main
Or, if you prefer:
main = getLine >>= putStrLn >> main
Since main never terminates, it never actually returns an IO value, yet the program endlessly reads and echoes back lines just fine - so the simple explanation above doesn't quite work. Am I missing something simple or is there a more complete explanation (or is it 'simply' compiler magic) ?
In this case, main is a value of type IO () rather than a function. You can think of it as a sequence of IO a values:
main = getLine >>= putStrLn >> main
This makes it a recursive value, not unlike infinite lists:
foo = 1 : 2 : foo
We can return a value like this without needing to evaluate the whole thing. In fact, it's a reasonably common idiom.
foo will loop forever if you try to use the whole thing. But that's true of main too: unless you use some external method to break out of it, it will never stop looping! But you can start getting elements out of foo, or executing parts of main, without evaluating all of it.
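A quick way to see this for the list case, as a small sketch:

```haskell
-- A recursive value: a finite description of an infinite list.
foo :: [Integer]
foo = 1 : 2 : foo

main :: IO ()
main = do
  print (take 5 foo)  -- only the first five elements are ever demanded
  print (foo !! 101)  -- indexing far into the list also terminates
```

Something like print (length foo) would never finish, which is the "use the whole thing" case.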
The value main denotes is an infinite program:
main = do
    line <- getLine
    putStrLn line
    line <- getLine
    putStrLn line
    line <- getLine
    putStrLn line
    line <- getLine
    putStrLn line
    line <- getLine
    putStrLn line
    line <- getLine
    putStrLn line
    ...
But it's represented in memory as a recursive structure that references itself. That representation is finite, unless someone tries to unfold the entire thing to get a non-recursive representation of the entire program - that would never finish.
But just as you can probably figure out how to start executing the infinite program I wrote above without waiting for me to tell you "all" of it, so can Haskell's runtime system figure out how to execute main without unfolding the recursion up-front.
Haskell's lazy evaluation is actually interleaved with the runtime system's execution of the main IO program, so this works even for a function that returns an IO action which recursively invokes the function, like:
main = foo 1

foo :: Integer -> IO ()
foo x = do
    print x
    foo (x + 1)
Here foo 1 is not a recursive value (it contains foo 2, not foo 1), but it's still an infinite program. However this works just fine, because the program denoted by foo 1 is only generated lazily on-demand; it can be produced as the runtime system's execution of main goes along.
By default Haskell's laziness means that nothing is evaluated until it's needed, and then only "just enough" to get past the current block. Ultimately the source of all the "need" in "until it's needed" comes from the runtime system needing to know what the next step in the main program is so it can execute it. But it's only ever the next step; the rest of the program after that can remain unevaluated until after the next step has been fully executed. So infinite programs can be executed and do useful work so long as it's always only a finite amount of work to generate "one more step".
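One way to make "only a finite amount of work per step" tangible is to model the infinite program as an infinite list of actions. This is a simplification for illustration, not how the runtime system actually represents IO, and runFirst is a hypothetical helper.

```haskell
-- An infinite list of actions: one print per positive integer.
steps :: [IO ()]
steps = map print [1 :: Integer ..]

-- Hypothetical helper: run only the first n steps of a possibly infinite program.
runFirst :: Int -> [IO ()] -> IO ()
runFirst n = sequence_ . take n

main :: IO ()
main = runFirst 3 steps
```

Each step is produced only when it is about to be run, so the list never has to exist in full.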
I am still struggling with Haskell and now I have encountered a problem with wrapping my mind around the Input/Output monad from this example:
main = do
    line <- getLine
    if null line
        then return ()
        else do
            putStrLn $ reverseWords line
            main

reverseWords :: String -> String
reverseWords = unwords . map reverse . words
I understand that because a functional language like Haskell cannot be based on side effects of functions, some solution had to be invented. In this case it seems that everything has to be wrapped in a do block. I get simple examples, but in this case I really need someone's explanation:
Why isn't it enough to use one, single do block for I/O actions?
Why do you have to open completely new one in if/else case?
Also, when does the -- I don't know what to call it -- "scope" of the do monad end, i.e. when can you just use standard Haskell terms/functions?
The do block concerns anything on the same indentation level as the first statement. So in your example it's really just linking two things together:
line <- getLine
and all the rest, which happens to be rather bigger:
    if null line
        then return ()
        else do
            putStrLn $ reverseWords line
            main
but no matter how complicated, the do syntax doesn't look into these expressions. So all this is exactly the same as
main :: IO ()
main = do
    line <- getLine
    recurseMain line
with the helper function
recurseMain :: String -> IO ()
recurseMain line
    | null line = return ()
    | otherwise = do
        putStrLn $ reverseWords line
        main
Now, obviously the stuff in recurseMain can't know that the function is called within a do block from main, so you need to use another do.
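A related point, sketched here under some assumptions: a branch needs its own do only when it sequences more than one action. The loop below is a hypothetical variant that takes its lines from a list rather than the console, so it runs without input, and it uses unless from Control.Monad in place of the if/then return () pattern.

```haskell
import Control.Monad (unless)

reverseWords :: String -> String
reverseWords = unwords . map reverse . words

-- Hypothetical variant of the loop: lines come from a list, not from getLine.
loop :: [String] -> IO ()
loop [] = return ()                -- a single action, so no do is needed
loop (line : rest) =
  unless (null line) $ do          -- two actions to sequence, so a do is needed
    putStrLn (reverseWords line)
    loop rest

main :: IO ()
main = loop ["hello world", "abc def", "", "never reached"]
```

The empty string plays the role of the empty input line that ends the original program.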
do doesn't actually do anything, it's just syntactic sugar for easily combining statements. A dubious analogy is to compare do to []:
If you have multiple expressions you can combine them into lists using the : operator:
(1 + 2) : (3 * 4) : (5 - 6) : ...
However, this is annoying, so we can instead use [] notation, which compiles to the same thing:
[1+2, 3*4, 5-6, ...]
Similarly, if you have multiple IO statements, you can combine them using >> and >>=:
(putStrLn "What's your name?") >> getLine >>= (\name -> putStrLn $ "Hi " ++ name)
However, this is annoying, so we can instead use do notation, which compiles to the same thing:
do
    putStrLn "What's your name?"
    name <- getLine
    putStrLn $ "Hi " ++ name
Now the answer to why you need multiple do blocks is simple:
If you have multiple lists of values, you need multiple []s (even if they're nested).
If you have multiple sequences of monadic statements, you need multiple dos (even if they're nested).
While learning Haskell I am wondering when an IO action will be performed. In several places I found descriptions like this:
"What’s special about I/O actions is that if they fall into the main function, they are performed."
But in the following example, 'greet' never returns and therefore nothing should be printed.
import Control.Monad
main = greet
greet = forever $ putStrLn "Hello World!"
Or maybe I should ask: what does it mean to "fall into the main function"?
First of all, main is not a function. It is indeed just a regular value and its type is IO (). The type can be read as: An action that, when performed, produces a value of type ().
Now the run-time system plays the role of an interpreter that performs the actions that you have described. Let's take your program as example:
main = forever (putStrLn "Hello world!")
Notice that I have performed a transformation. That one is valid, since Haskell is a referentially transparent language. The run-time system resolves the forever and finds this:
main = putStrLn "Hello world!" >> MORE1
It doesn't yet know what MORE1 is, but it now knows that it has a composition with one known action, which is executed. After executing it, it resolves the second action, MORE1 and finds:
MORE1 = putStrLn "Hello world!" >> MORE2
Again it executes the first action in that composition and then keeps on resolving.
Of course this is a high level description. The actual code is not an interpreter. But this is a way to picture how a Haskell program gets executed. Let's take another example:
main = forever (getLine >>= putStrLn)
The RTS sees this:
main = forever MORE1
<< resolving forever >>
MORE1 = getLine >>= MORE2
<< executing getLine >>
MORE2 result = putStrLn result >> MORE1
<< executing putStrLn result (where 'result' is the line read)
and starting over >>
When understanding this you understand how an IO String is not "a string with side effects" but rather the description of an action that would produce a string. You also understand why laziness is crucial for Haskell's I/O system to work.
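The unfolding described above can be imitated with a hand-written definition. foreverSketch below is a simplified stand-in and not the library's actual definition of forever; repeatN is a hypothetical bounded variant so that the example terminates.

```haskell
-- Simplified stand-in for forever; not the library's actual definition.
foreverSketch :: Monad m => m a -> m b
foreverSketch act = act >> foreverSketch act

-- Hypothetical bounded variant: perform the action n times, then stop.
repeatN :: Monad m => Int -> m () -> m ()
repeatN n act
  | n <= 0    = return ()
  | otherwise = act >> repeatN (n - 1) act

main :: IO ()
main = repeatN 3 (putStrLn "Hello world!")
```

In both definitions the right-hand side of >> is only looked at after the action on the left has been performed, which is why the infinite version can still make progress.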
In my opinion the point of the statement "What’s special about I/O actions is that if they fall into the main function, they are performed." is that IO actions are first class citizens. That is, IO-actions can occur at all places where values of other data types like Int can occur. For example, you can define a list that contains IO actions as follows.
actionList = [putStr "Hello", putStr "World"]
The list actionList has type [IO ()]. That is, the list contains actions that interact with the world, for example, print on the console or read in input from the user. But, in defining this list we do not execute the actions, we simply put them in a list for later use.
If an IO action can occur anywhere in your program, the question arises of when these actions are performed, and here main comes into play. Consider the following definition of main.
main = do
    actionList !! 0
    actionList !! 1
This main function projects to the first and the second component of the list and "executes" the corresponding actions by using them within its definition. Note that it does not necessarily have to be the main function itself that executes an IO action. Any function that is called from the main function can execute actions as well. For example, we can define a function that calls the actions from actionList and let main call this function as follows.
main = do
    caller
    putStr "!"

caller = do
    actionList !! 0
    actionList !! 1
To highlight that it does not have to be a simple renaming like in main = caller I have added an action that prints an exclamation mark after it has performed the actions from the list.
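As a small sketch of the same point, the list can be inspected without running anything, and the standard sequence_ function can then perform its actions in order:

```haskell
actionList :: [IO ()]
actionList = [putStr "Hello", putStr "World"]

main :: IO ()
main = do
  print (length actionList)  -- inspecting the list performs none of its actions
  sequence_ actionList       -- now the actions are performed, in list order
  putStrLn "!"
```

Only the actions that end up as part of main are performed; the call to length treats them as ordinary values.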
Simple IO actions can be combined into more advanced ones by using do notation.
main = do
    putStrLn "Hello"
    putStrLn "World"
combines the IO action putStrLn "Hello" with the IO action putStrLn "World". main is now an IO action that first prints a line that says "Hello" and then a line that says "World". Written without do notation (which is just syntactic sugar) it looks like this:
main = putStrLn "Hello" >> putStrLn "World"
Here you can see the >> function combining the two actions.
You can create an IO action that reads a line, passes it to a function (that does awesome stuff to it :)) and then prints the result like this:
main = do
    input <- getLine
    let result = doAwesomeStuff input
    putStrLn result
or without binding the result to a variable:
main = do
    input <- getLine
    putStrLn (doAwesomeStuff input)
This can of course also be written using IO actions and the functions that combine them, like this:
main = getLine >>= (\input -> putStrLn (doAwesomeStuff input))
When you run the program, the main IO action is executed. This is the only time any IO actions are actually executed. (Well, technically you can also execute them within your program, but it is not safe. The function that does this is called unsafePerformIO.)
You can read more here: http://www.haskell.org/haskellwiki/Introduction_to_Haskell_IO/Actions
(This link is probably a better explanation than mine, but I only found it after I had written nearly everything. It is also quite a bit longer.)
launchAMissile :: IO ()
launchAMissile = do
    openASilo
    loadCoordinates
    launchAMissile

main = do
    let launch3missiles = launchAMissile >> launchAMissile >> launchAMissile
    putStrLn "Not actually launching any missiles"
forever isn't a loop like C's while (true). It is a function that produces an IO value (which contains an infinitely repeated sequence of actions), which is consumed by the caller. (In this case, the caller is main, which means that the actions get executed by the runtime system).