Does Haskell have implicit pattern matching? - haskell

Take this example:
module Main where
main = print (reverseWords "lol")
reverseWords :: String -> [String]
reverseWords = words
The reverseWords function is not pattern matching against any argument here, yet the function runs and outputs ["lol"].
I have two questions here:
How does Haskell know whether or not I am invoking the words function against the input to reverseWords? In this syntax it looks like I am simply returning the function words.
Why is this able to run successfully even though I have not provided any input arguments in the pattern to reverseWords?

You're just saying that reverseWords is the function words. Anywhere that reverseWords appears, it can be replaced by the function words. So print (reverseWords "lol") is exactly equivalent to print (words "lol"). Basically, you have this new function reverseWords that takes a String argument just like words does and simply passes that argument to words, returning whatever words returns for that argument. Your definition of reverseWords is equivalent to:
reverseWords :: String -> [String]
reverseWords s = words s
Given all of that, reverseWords is a misleading name because it's not doing anything differently than words. So, you're not really doing anything useful at all, just renaming something. A better example would be this:
reverseWords :: String -> [String]
reverseWords = reverse . words
where reverse is some other function you compose with words using the composition operator (.) to make a new function that does something useful. That's called point-free style, where you define new functions by combining other functions without referring to any arguments. That definition is equivalent to:
reverseWords :: String -> [String]
reverseWords s = reverse (words s)
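As a small runnable sketch of the two equivalent definitions above (the primed name reverseWords' is only introduced here so both can live in one module), the point-free and the pointed versions can be compared side by side:

```haskell
module Main where

-- Point-free: built by composing existing functions, no argument named.
reverseWords :: String -> [String]
reverseWords = reverse . words

-- Pointed: the same function with its argument written out.
-- (The primed name is hypothetical, used only for this comparison.)
reverseWords' :: String -> [String]
reverseWords' s = reverse (words s)

main :: IO ()
main = do
  print (reverseWords "hello brave new world")   -- ["world","new","brave","hello"]
  print (reverseWords' "hello brave new world")  -- ["world","new","brave","hello"]
```

Both lines print the same list, which is all that "equivalent" means here.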

You're declaring a new function reverseWords. First you declare its type:
reverseWords :: String -> [String]
So it's going to be a function that takes a string and returns a list of strings.
Now there are two ways we can approach this. The first is to write a rule that says that when reverseWords receives some argument, the result is an expression (probably involving calling other functions and using the argument) for a list of strings. Like this:
reverseWords s = words s
This says "expressions of the form reverseWords s are defined to be equal to words s". So then the compiler knows that reverseWords "lol" is equal to words "lol". The function reverseWords is implicitly defined by the rules we write for it. [1]
But there's another way we can think about this. I assume you're perfectly comfortable with how this works:
myFavouriteNumber :: Integer
myFavouriteNumber = 28
We first declare that myFavouriteNumber will be of type Integer, and then define it by just writing down an integer.
Well, functions are first class values in Haskell, which means we don't only have to define them using dedicated special-purpose syntax. If we can make definitions of type Integer by just writing down an integer, we should be able to make definitions of type String -> [String] by just writing down something with that type, rather than writing down rules. That is what is going on in this form:
reverseWords = words
Rather than writing a rule for what the result of reverseWords will be when it's applied to something, we just write down what reverseWords is. In this case, we tell the compiler that reverseWords is defined to be equal to words. That still lets the compiler know that reverseWords "lol" is equal to words "lol", but it does it just by looking at the reverseWords part; it could work that out even without looking at the "lol".
Furthermore, we can also write definitions like this:
two :: Integer
two = 1 + 1
Here instead of defining two as equal to some pre-existing thing, we calculate its value (from other pre-existing things: 1 and +). And because functions are first class, we can do the same thing:
reversedWords :: String -> [String]
reversedWords = reverse . words
Here we don't say reversedWords is equal to an existing function; instead we calculate the function that reversedWords is, by applying the composition operator . to the pre-existing functions reverse and words. But we're still calculating the function (of type String -> [String]), not the function's result (of type [String]).
So to answer your questions:
How does haskell know whether or not I am invoking the "words" function against the input to reverseWords? In this syntax it looks like I am simply returning the function "words"
Yes, you are just returning the function words. But you're "returning" it as the function reverseWords itself (before it's applied to anything), not as the result of reverseWords when it's applied. And that's how Haskell knows that the words function is to receive the input to reverseWords; reverseWords is equal to words, so any time you pass some input to reverseWords you're really passing it to words.
Why is this able to run successfully even though I have not provided any input arguments in the pattern to reverseWords?
Because you defined the function reverseWords. You defined it by declaring it equal to some other existing function, so it does whatever that function does. Writing rules for function results (based on arguments) is not the only way to define a function.
The fact that you didn't provide a name for the argument of reverseWords in your definition is exactly how Haskell knows that's what you're doing. If you're defining a function of type A -> B and you give a name to the argument, then the right hand side must be something of type B. If you don't, then the right hand side must be something of type A -> B. [2]
But for your title question:
Does haskell have implicit pattern matching?
I'm not sure how to answer that, because none of this discussion has involved pattern matching at all. You can use pattern matching to define functions in Haskell, but that's not what is going on here.
[1] Okay, in this case reverseWords is pretty explicitly defined by the rule, but in general functions can be defined using multiple rules using pattern matching and guards, and with auxiliary where definitions; the actual value of the function is sort of an emergent property of all the rules (and knowledge of how they'll be tried in order top-down) and where clauses.
[2] This logic works regardless of what A and B are. In particular, B could be something with more arrows in it! That is exactly how functions with multiple arguments work in Haskell. A function like:
foo :: Int -> String -> (Int, String)
could be defined by either:
Writing a rule taking two arguments (an Int and a String), with a right hand side of type (Int, String)
Writing a rule taking one argument (an Int), with a right hand side of type String -> (Int, String)
Writing a direct definition with no arguments, with a right hand side of type Int -> String -> (Int, String)
The pattern is clear; each time you add an argument to your rule, the RHS has a type that strips off one more arrow (starting from the left).
All 3 options produce a function foo with the same type that you can call the same way. The internal definition of the function doesn't matter to the outside world.
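The three options can be sketched as follows. The three function names and the body (incrementing the Int and appending "!") are made up for illustration; only the type comes from the text above.

```haskell
module Main where

-- Two named arguments; the right-hand side has type (Int, String).
fooTwoArgs :: Int -> String -> (Int, String)
fooTwoArgs n s = (n + 1, s ++ "!")

-- One named argument; the right-hand side has type String -> (Int, String).
fooOneArg :: Int -> String -> (Int, String)
fooOneArg n = \s -> (n + 1, s ++ "!")

-- No named arguments; the right-hand side is the whole function.
fooNoArgs :: Int -> String -> (Int, String)
fooNoArgs = \n s -> (n + 1, s ++ "!")

main :: IO ()
main =
  -- All three have the same type, so they can sit in one list
  -- and are called in exactly the same way.
  mapM_ (\f -> print (f 1 "hi")) [fooTwoArgs, fooOneArg, fooNoArgs]
```

Each call prints (2,"hi!"); a caller cannot tell which style was used to write the definition.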

reverseWords indeed "returns" words without calling it, and so reverseWords s becomes words s -- since reverseWords had returned words, the call reverseWords s had become the call words s.
It's like having the definition foo() { bar } in more conventional syntax, then foo()(x) === bar(x).

Related

Haskell irrefutable pattern matching

In order to get a head start into the path of Haskell, I chose the book by one of its creators Hudak. So, I am going through the gentle introduction to Haskell.
I got stuck at trying to understand the following statement:
Technically speaking, formal parameters are also patterns but they never fail to match a value.
From my little but relatively greater habituation with languages like C (or let's broadly say as non-functional languages), I could form that formal parameters are the arguments in the definition of a function. So, suppose there were a function like the following in C:
int func_add(int a, int d)
then passing a value of some other type like string will be a failure in pattern matching if I am correct. So calling func_add as func_add("trs", 5) is a case of pattern mismatch.
It is quite possible that I have misunderstood something, but a similar situation could occur in Haskell as well, when a piece of code calls a function with arguments of the wrong types.
So, why is it said that in Haskell formal parameters are irrefutable pattern matching?
What you describe is not a pattern, it is a type. Haskell has types as well, and these are resolved at compile time. Each type can have several patterns. For example, a Color type can be defined as:
data Color = Red | Green | Blue | Other String
Now we can define a function foo:
foo :: Color -> String
foo Red = "Red"
foo Green = "Green"
foo Blue = "Blue"
foo (Other s) = s
The arguments written to the left of each = sign (Red, Green, Blue and (Other s)) are all patterns. But these are not irrefutable: the first one will check if we have given the function a Red, the second whether we have given it Green, the third if the value is Blue, and finally we have a pattern (Other s) that will match all Other values (regardless of what the value of s is), since s is a variable, and a variable is an irrefutable pattern: we do not perform any checks on the value of the string.
Mind that these checks are done at runtime: if we would however call foo "Red", we will get a type error at compile time since the Haskell compiler knows that foo has type Color -> String.
If we would have written:
foo :: Color -> String
foo c = "Some color"
foo Red = "Red"
then c is an irrefutable pattern: it will match any Color value, so the first clause always succeeds and the second clause, foo Red, will never be reached.
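One way to see that a variable pattern performs no check at all is to pass it a value that would crash if it were ever inspected. This is only a sketch: isRed and anyColor are hypothetical names, and undefined is used purely as such a "crash if inspected" value.

```haskell
module Main where

data Color = Red | Green | Blue | Other String

-- Constructor pattern: the argument must be evaluated to see whether it is Red.
isRed :: Color -> Bool
isRed Red = True
isRed _   = False

-- Variable pattern: matches without looking at the argument.
anyColor :: Color -> String
anyColor _c = "Some color"

main :: IO ()
main = do
  print (isRed Red)                -- True
  print (isRed (Other "teal"))     -- False
  putStrLn (anyColor undefined)    -- Some color (the argument is never evaluated)
```

Calling isRed undefined, by contrast, would raise an exception, because the Red pattern has to examine the value.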
No, passing a value of some other type is not a failure in the pattern matching. It's a type error, and the code won't even compile. Formal parameters are irrefutable patterns for a well-typed program, which is the only kind of program that the compiler allows you to run.
In Haskell, you can define types in various ways. One of these is to introduce a sum type, like this:
data FooBar = Foo Int | Bar Bool
You could attempt to write a function like this, using pattern matching:
myFunction (Foo x) = x
That would, however, be a partially matched function, and if you try to call it with myFunction (Bar False), you'd get an exception.
You can, on the other hand, also define single-case sum types, like this:
data MyInt = MyInt Int
Here, you can write a function like this:
myFunction' (MyInt x) = x
Here, you're still using pattern matching, but since there's only a single case, this is a complete match. If the calling code compiles, the match can't fail.
The above MyInt is really only a wrapper around Int, so you could say that if you write a function that takes an Int, like this, it's the same sort of pattern matching:
myFunction'' :: Int -> Int
myFunction'' x = x + 42
While Int doesn't have a value constructor that you can use to pattern-match on, x is a pattern that always matches an Int value. Therefore you can say that a function argument is a match that always succeeds.
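If the exception from the partial match on FooBar is unwanted, one common option (not something the answer above prescribes) is to cover every constructor and return a Maybe. The name fooValue is made up for this sketch.

```haskell
module Main where

data FooBar = Foo Int | Bar Bool

-- Total version of the partial match: every constructor is handled,
-- so no call that type checks can fail at runtime.
fooValue :: FooBar -> Maybe Int
fooValue (Foo x) = Just x
fooValue (Bar _) = Nothing

main :: IO ()
main = do
  print (fooValue (Foo 7))      -- Just 7
  print (fooValue (Bar False))  -- Nothing
```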

Operator as an argument in Haskell

I'm quite new to Haskell, so maybe it's a stupid question.
What I want is to pass any operator to my function as an argument.
For example:
myFunc :: a -> Int -> Int -> Boolean
myFunc operator a b = a operator b
*Project> myFunc (>) 5 2
True
*Project> myFunc (<=) 5 2
False
Help me in advice how to do that please!
You can do that with ordinary Haskell function arguments.
In your function above, you want myFunc to take a function that takes two Ints and returns a Bool (not a Boolean, you must have typed that wrong). The declaration for that function would be (Int -> Int -> Bool). Therefore, you can write:
myFunc :: (Int -> Int -> Bool) -> Int -> Int -> Bool
myFunc op a b = a `op` b
This defines a higher-order function that takes a function with two Int parameters that returns a Bool (and two Ints). You can now use it like any other function parameter!
Note that this is exactly the same as doing:
myFunc (#) a b = a # b
Or:
myFunc (%) a b = a % b
That works because symbolic operators such as * or /, or any operator composed only of special characters, are infix by default and need no backticks; backticks are only needed to use an alphanumeric name such as op infix (having to wrap every division in extra syntax would get annoying!).
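Putting the definition above into a complete program gives the behaviour the question asked for. The helper sameParity is a made-up example, included only to show that a named function can be passed the same way as an operator.

```haskell
module Main where

myFunc :: (Int -> Int -> Bool) -> Int -> Int -> Bool
myFunc op a b = a `op` b

-- An ordinary named function (hypothetical) with the same type as (>) on Int.
sameParity :: Int -> Int -> Bool
sameParity x y = even x == even y

main :: IO ()
main = do
  print (myFunc (>) 5 2)         -- True
  print (myFunc (<=) 5 2)        -- False
  print (myFunc sameParity 5 2)  -- False
```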
Under the hood, functions "exist" without names. Any function you define (or that is already defined in libraries), such as myFunc just is a function value, and the name just gives us a way to refer to it in other code that wants to use it. This is exactly the same as if you write x = 3: the value 3 "exists" independently of the name x, that name just gives us a way to refer to it.
Why is this relevant to your question about passing operators?
Well, as far as Haskell is concerned, operators like > and <= are also just nameless functions that happen to be bound to the names > and <=. The special treatment of them as operators (that you can write them infix between the arguments you're calling them on) is only about the names, and changes if you refer to them with different names.
There are two types of names in Haskell: alphanumeric names (consisting only of letters, digits, underscores and apostrophes), and symbolic names (consisting only of symbol characters). If you have an expression {1} {2} {3}, then if {2} is a symbolic name (and {1} and {3} aren't symbolic names; otherwise you have a syntax error), then the expression is interpreted as meaning "call {2} on the arguments {1} and {3}". But if none of them are symbolic names, then it's instead interpreted as "call {1} on the arguments {2} and {3}". [1]
But all of this happens only with reference to the name, not to the functions actually referred to by those names. So if you write your myFunc like so:
myFunc operator a b = operator a b
Then it doesn't actually matter whether myFunc was called like myFunc (+) 1 2 or like myFunc plus 1 2; inside the definition of myFunc the "operator" is referred to by the name operator, which is an alphanumeric name. So you put it first when you want to call it, with its arguments following.
Alternatively you could use a symbolic name inside myFunc, like so:
myFunc ($&^*) a b = a $&^* b
Again, this also works even when myFunc was called with a non-operator function like myFunc plus 1 2.
And of course, there are ways to convert either kind of name to work like the other; you can put an alphanumeric name in backticks to use it infix like an operator:
myFunc operator a b = a `operator` b
And you can put a symbolic name in parentheses to simply use it as reference to the function it's bound to (and this is in fact the only way to use an operator without providing arguments for it):
myFunc ($&^*) a b = ($&^*) a b
So basically, the only special thing you needed to know to pass an operator to your function is what you already knew: put the operator in parentheses when you call the function. Inside the definition of the function, you can write it exactly the same as any other function; the style of name you choose in that function definition will determine whether you call it like an operator or like an ordinary function. You don't need to know (and in fact cannot find out) whether it was an operator "outside" the function.
[1] Of course, when you have more complex expressions involving more than 3 things and multiple operators, then the rules of precedence and associativity come into play to determine exactly what's going on.
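The point that the naming style inside the function is independent of how the caller referred to the argument can be sketched like this. The names applyInfix, applyPrefix and the local operator <+> are all made up for illustration.

```haskell
module Main where

-- Symbolic parameter name, used infix inside the body.
applyInfix :: (a -> b -> c) -> a -> b -> c
applyInfix (<+>) x y = x <+> y

-- Alphanumeric parameter name, used prefix inside the body.
applyPrefix :: (a -> b -> c) -> a -> b -> c
applyPrefix f x y = f x y

main :: IO ()
main = do
  -- A named function passed to the "symbolic" version...
  print (applyInfix div 7 2)          -- 3
  -- ...and an operator passed to the "alphanumeric" version.
  print (applyPrefix (++) "ab" "cd")  -- "abcd"
```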

How to access nth element in a Haskell tuple

I have this:
get3th (_,_,a,_,_,_) = a
which works fine in GHCi, but when I compile it with GHC it gives an error. If I want to write a function to get the nth element of a tuple and be able to run it in GHC, what should I do?
My whole program is below; what should I do with it?
get3th (_,_,a,_,_,_) = a
main = do
  mytuple <- getLine
  print $ get3th mytuple
Your problem is that getLine gives you a String, but you want a tuple of some kind. You can fix your problem by converting the String to a tuple – for example by using the built-in read function. The third line here tries to parse the String into a six-tuple of Ints.
main = do
  mystring <- getLine
  let mytuple = read mystring :: (Int, Int, Int, Int, Int, Int)
  print $ get3th mytuple
Note however that while this is useful for learning about types and such, you should never write this kind of code in practice. There are at least two warning signs:
You have a tuple with more than three or so elements. Such a tuple is very rarely needed and can often be replaced by a list, a vector or a custom data type. Tuples are rarely used more than temporarily to bring two kinds of data into one value. If you start using tuples often, think about whether or not you can create your own data type instead.
Using read to read a structure is not a good idea. read will explode your program with a terrible error message at any tiny little mistake, and that's usually not what you want. If you need to parse structures, it's a good idea to use a real parser. read can be enough for simple integers and such, but no more than that.
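If read is kept for a quick experiment anyway, the crash on bad input can at least be avoided with readMaybe from Text.Read in base, which returns Nothing instead of throwing. This is a sketch, not the "real parser" recommended above; parseSix and thirdOf are hypothetical helper names, and the inputs are hard-coded rather than read with getLine so the result is predictable.

```haskell
module Main where

import Text.Read (readMaybe)

get3th :: (a, b, c, d, e, f) -> c
get3th (_, _, x, _, _, _) = x

-- Nothing on malformed input instead of a runtime exception.
parseSix :: String -> Maybe (Int, Int, Int, Int, Int, Int)
parseSix = readMaybe

-- Parse the line, then pick the third component if parsing succeeded.
thirdOf :: String -> Maybe Int
thirdOf input = get3th <$> parseSix input

main :: IO ()
main = do
  print (thirdOf "(1,2,3,4,5,6)")  -- Just 3
  print (thirdOf "(1,2,3)")        -- Nothing
```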
The type of getLine is IO String, so your program won't type check because you are supplying a String instead of a tuple.
Your program will work if a proper argument is supplied, e.g.:
main = do
  print $ get3th (1, 2, 3, 4, 5, 6)
It seems to me that your confusion is between tuples and lists. That is an understandable confusion when you first meet Haskell, as many other languages only have one similar construct. Tuples use round parens: (1,2). A tuple with n values in it is a type, and each value can be a different type, which results in a different tuple type. So (Int, Int) is a different type from (Int, Float), although both are 2-tuples. There are some functions in the Prelude which are polymorphic over 2-tuples, e.g. fst :: (a,b) -> a, which takes the first element. fst is easy to define using pattern matching like your own function:
fst (a,b) = a
Note that fst (1,2) evaluates to 1, but fst (1,2,3) is ill-typed and won't compile.
Now, lists on the other hand, can be of any length, including zero, and still be the same type; but each element must be of the same type. Lists use square brackets: [1,2,3]. The type for a list with elements of type a is written [a]. Lists are constructed by adding values onto the empty list [], so a list with one element x can be written [x], but this is syntactic sugar for x:[], where : is the cons operator, which prepends a value to the front of a list. Just as tuples can be pattern matched, you can use the empty list and the cons operator to pattern match:
head :: [a] -> a
head (x:xs) = x
The pattern match means x is of type a and xs is of type [a], and it is the former we want for head. (This is a prelude function and there is an analogous function tail.)
Note that head is a partial function as we cannot define what it does in the case of the empty list. Calling it on an empty list will result in a runtime error as you can check for yourself in GHCi. A safer option is to use the Maybe type.
safeHead :: [a] -> Maybe a
safeHead (x:xs) = Just x
safeHead [] = Nothing
String in Haskell is simply a synonym for [Char]. So all of these list functions can be used on strings, and getLine returns a String.
Now, in your case you want the 3rd element. There are a couple of ways you could do this, you could call tail a few times then call head, or you could pattern match like (a:b:c:xs). But there is another utility function in the prelude, (!!) which gets the nth element. (Writing this function is a very good beginner exercise). So your program can be written
main = do
  myString <- getLine
  print $ myString !! 2 --zero indexed
Testing gives
Prelude> main
test
's'
So remember, tuples use () and are strictly of a given length, but can have members of different types; whereas lists use [], can be any length, but each element must be the same type. And Strings are really lists of characters.
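For the beginner exercise mentioned above, here is one possible sketch of an indexing function. It deliberately differs from the Prelude's (!!): the name nth, the argument order and the Maybe result are choices made for this example, so that an out-of-range index gives Nothing rather than a runtime error.

```haskell
module Main where

-- Zero-indexed lookup by walking down the list with pattern matching.
nth :: Int -> [a] -> Maybe a
nth _ [] = Nothing
nth n (x:xs)
  | n < 0     = Nothing
  | n == 0    = Just x
  | otherwise = nth (n - 1) xs

main :: IO ()
main = do
  print (nth 2 "test")  -- Just 's'
  print (nth 9 "test")  -- Nothing
```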
EDIT
As an aside, I thought I'd mention that there is a neater way of writing this main function if you are interested.
main = getLine >>= print . (!! 2)

What is () in Haskell, exactly?

I'm reading Learn You a Haskell, and in the monad chapters, it seems to me that () is being treated as a sort of "null" for every type. When I check the type of () in GHCi, I get
>> :t ()
() :: ()
which is an extremely confusing statement. It seems that () is a type all to itself. I'm confused as to how it fits into the language, and how it seems to be able to stand for any type.
tl;dr () does not add a "null" value to every type, hell no; () is a "dull" value in a type of its own: ().
Let me step back from the question a moment and address a common source of confusion. A key thing to absorb when learning Haskell is the distinction between its expression language and its type language. You're probably aware that the two are kept separate. But that allows the same symbol to be used in both, and that is what is going on here. There are simple textual cues to tell you which language you're looking at. You don't need to parse the whole language to detect these cues.
The top level of a Haskell module lives, by default, in the expression language. You define functions by writing equations between expressions. But when you see foo :: bar in the expression language, it means that foo is an expression and bar is its type. So when you read () :: (), you're seeing a statement which relates the () in the expression language with the () in the type language. The two () symbols mean different things, because they are not in the same language. This replication often causes confusion for beginners, until the expression/type language separation installs itself in their subconscious, at which point it becomes helpfully mnemonic.
The keyword data introduces a new datatype declaration, involving a careful mixture of the expression and type languages, as it says first what the new type is, and secondly what its values are.
data TyCon tyvar ... tyvar = ValCon1 type ... type | ... | ValConn type ... type
In such a declaration, type constructor TyCon is being added to the type language and the ValCon value constructors are being added to the expression language (and its pattern sublanguage). In a data declaration, the things which stand in argument places for the ValCons tell you the types given to the arguments when that ValCon is used in expressions. For example,
data Tree a = Leaf | Node (Tree a) a (Tree a)
declares a type constructor Tree for binary tree types storing a elements at nodes, whose values are given by value constructors Leaf and Node. I like to colour type constructors (Tree) blue and value constructors (Leaf, Node) red. There should be no blue in expressions and (unless you're using advanced features) no red in types. The built-in type Bool could be declared,
data Bool = True | False
adding blue Bool to the type language, and red True and False to the expression language. Sadly, my markdown-fu is inadequate to the task of adding the colours to this post, so you'll just have to learn to add the colours in your head.
The "unit" type uses () as a special symbol, but it works as if declared
data () = () -- the left () is blue; the right () is red
meaning that a notionally blue () is a type constructor in the type language, but that a notionally red () is a value constructor in the expression language, and indeed () :: (). [ It is not the only example of such a pun. The types of larger tuples follow the same pattern: pair syntax is as if given by
data (a, b) = (a, b)
adding (,) to both type and expression languages. But I digress.]
So the type (), often pronounced "Unit", is a type containing one value worth speaking of: that value is also written () but in the expression language, and is sometimes pronounced "void". A type with only one value is not very interesting. A value of type () contributes zero bits of information: you already know what it must be. So, while there is nothing special about type () to indicate side effects, it often shows up as the value component in a monadic type. Monadic operations tend to have types which look like
val-in-type-1 -> ... -> val-in-type-n -> effect-monad val-out-type
where the return type is a type application: the (type) function tells you which effects are possible and the (type) argument tells you what sort of value is produced by the operation. For example
put :: s -> State s ()
which is read (because application associates to the left ["as we all did in the sixties", Roger Hindley]) as
put :: s -> (State s) ()
has one value input type s, the effect-monad State s, and the value output type (). When you see () as a value output type, that just means "this operation is used only for its effect; the value delivered is uninteresting". Similarly
putStr :: String -> IO ()
delivers a string to stdout but does not return anything exciting.
The () type is also useful as an element type for container-like structures, where it indicates that the data consists just of a shape, with no interesting payload. For example, if Tree is declared as above, then Tree () is the type of binary tree shapes, storing nothing of interest at nodes. Similarly [()] is the type of lists of dull elements, and if there is nothing of interest in a list's elements, then the only information it contributes is its length.
To sum up, () is a type. Its one value, (), happens to have the same name, but that's ok because the type and expression languages are separate. It's useful to have a type representing "no information" because, in context (e.g., of a monad or a container), it tells you that only the context is interesting.
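The idea of Tree () as "just a shape" can be sketched by erasing the payload of a tree. The function name shape and the derived Show and Eq instances are additions for this example, not part of the declaration discussed above.

```haskell
module Main where

data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Show, Eq)

-- Forget the payload at every node, keeping only the structure.
shape :: Tree a -> Tree ()
shape Leaf         = Leaf
shape (Node l _ r) = Node (shape l) () (shape r)

main :: IO ()
main = do
  let t1 = Node Leaf 'x' (Node Leaf 'y' Leaf)
      t2 = Node Leaf 'p' (Node Leaf 'q' Leaf)
  -- Different contents, same shape.
  print (shape t1 == shape t2)             -- True
  -- A list of () carries no information except its length.
  print (length (map (const ()) "hello"))  -- 5
```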
The () type can be thought of as a zero-element tuple. It's a type that can only have one value, and thus it's used where you need to have a type, but you don't actually need to convey any information. Here's a couple of uses for this.
Monadic things like IO and State have a return value, as well as performing side-effects. Sometimes the only point of the operation is to perform a side-effect, like writing to the screen or storing some state. For writing to the screen, putStrLn must have type String -> IO ? -- IO always has to have some return type, but here there's nothing useful to return. So what type should we return? We could say Int, and always return 0, but that's misleading. So we return (), the type that has only one value (and thus no useful information), to indicate that there's nothing useful coming back.
It's sometimes useful to have a type which can have no useful values. Consider if you'd implemented a type Map k v which maps keys of type k to values of type v. Then you want to implement a Set, which is really similar to a map except that you don't need the value part, just the keys. In a language like Java you might use booleans as the dummy value type, but really you just want a type that has no useful values. So you could say type Set k = Map k ()
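The Set-from-Map idea can be sketched with Data.Map from the containers package, which is assumed to be available (it ships with GHC). The helper names emptySet, insertKey and memberKey are made up here; a real program would normally just use Data.Set.

```haskell
module Main where

import qualified Data.Map as Map

-- A set is a map whose values carry no information.
type Set k = Map.Map k ()

emptySet :: Set k
emptySet = Map.empty

insertKey :: Ord k => k -> Set k -> Set k
insertKey k = Map.insert k ()

memberKey :: Ord k => k -> Set k -> Bool
memberKey = Map.member

main :: IO ()
main = do
  let s = insertKey "b" (insertKey "a" emptySet)
  print (memberKey "a" s)  -- True
  print (memberKey "z" s)  -- False
  print (Map.keys s)       -- ["a","b"]
```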
It should be noted that () is not particularly magic. If you want you can store it in a variable and do a pattern match on it (although there's not much point):
main = do
  x <- putStrLn "Hello"
  case x of
    () -> putStrLn "The only value..."
It is called the Unit type, usually used as the result type of actions that are run only for their side effects. You can think of it vaguely as void in Java. What can be confusing is that () syntactically represents both the type and its only value. Also note that it is not similar to null in Java, which means an undefined reference; () is effectively just a 0-sized tuple.
I really like to think of () by analogy with tuples.
(Int, Char) is the type of all pairs of an Int and a Char, so its values are all possible values of Int crossed with all possible values of Char. (Int, Char, String) is similarly the type of all triples of an Int, a Char, and a String.
It's easy to see how to keep extending this pattern upwards, but what about downwards?
(Int) would be the "1-tuple" type, consisting of all possible values of Int. But that would be parsed by Haskell as just putting parentheses around Int, and thus being just the type Int. And values in this type would be (1), (2), (3), etc, which also would just get parsed as ordinary Int values in parentheses. But if you think about it, a "1-tuple" is exactly the same as just a single value, so there's no need to actually have them exist.
Going down one step further to zero-tuples gives us (), which should be all possible combinations of values in an empty list of types. Well, there's exactly one way to do that, which is to contain no other values, so there should be only one value in the type (). And by analogy with tuple value syntax, we can write that value as (), which certainly looks like a tuple containing no values.
That's exactly how it works. There is no magic, and this type () and its value () are in no way treated specially by the language.
() is not in fact being treated as "a null value for any type" in the monads examples in the LYAH book. Whenever the type () is used the only value which can be returned is (). So it's used as a type to explicitly say that there cannot be any other return value. And likewise where another type is supposed to be returned, you cannot return ().
The thing to keep in mind is that when a bunch of monadic computations are composed together with do blocks or operators like >>=, >>, etc, they'll be building a value of type m a for some monad m. That choice of m has to stay the same throughout the component parts (there's no way to compose a Maybe Int with an IO Int in that way), but the a can be, and very often is, different at each stage.
So when someone sticks an IO () in the middle of an IO String computation, that's not using the () as a null in the String type, it's simply using an IO () on the way to building an IO String, the same way you could use an Int on the way to building a String.
Yet another angle:
() is the name of a set which contains a single element called (). It's indeed slightly confusing that the name of the set and the element in it happen to be the same in this case.
Remember: in Haskell a type is a set that has its possible values as elements in it.
The confusion comes from other programming languages:
"void" means in most imperative languages that there is no structure in memory storing a value. It seems inconsistent because "boolean" has 2 values instead of 2 bits, while "void" has no bits instead of no values, but there it is about what a function returns in a practical sense. To be exact: its single value consumes no bit of storage.
Let's ignore the value bottom (written _|_) for a moment...
() is called Unit, written like a null-tuple. It has only one value. And it is not called Void, because Void does not have any value at all, and thus could not be returned by any function.
Observe this: Bool has 2 values (True and False), () has one value (()), and Void has no value (it doesn't exist). They are like sets with two/one/no elements. The least memory they need to store their value is 1 bit / no bit / impossible, respectively. Which means that a function that returns a () may return with a result value (the obvious one) that may be useless to you. Void on the other hand would imply that that function will never return and never give you any result, because there would not exist any result.
If you want to give a name to "that value" which is returned by a function that never returns (yes, this sounds like crazytalk), then call it bottom ("_|_", written like an upside-down T). It could represent an exception or infinite loop or deadlock or "just wait longer". (Some functions will only then return bottom, iff one of their parameters is bottom.)
When you create the cartesian product / a tuple of these types, you will observe the same behaviour:
(Bool,Bool,Bool,(),()) has 2·2·2·1·1=8 different values. (Bool,Bool,Bool,(),Void) is like the set {t,f}×{t,f}×{t,f}×{u}×{} which has 2·2·2·1·0=0 elements, unless you count _|_ as a value.
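These counts can be checked by enumerating the values, ignoring bottom as the text does. The sketch uses Void from Data.Void in base; the list names are made up for the example.

```haskell
module Main where

import Data.Void (Void)

-- All (non-bottom) values of each component type.
bools :: [Bool]
bools = [False, True]

units :: [()]
units = [()]

voids :: [Void]
voids = []  -- Void has no constructors, so there is nothing to list

withUnits :: [(Bool, Bool, Bool, (), ())]
withUnits =
  [(a, b, c, d, e) | a <- bools, b <- bools, c <- bools, d <- units, e <- units]

withVoid :: [(Bool, Bool, Bool, (), Void)]
withVoid =
  [(a, b, c, d, e) | a <- bools, b <- bools, c <- bools, d <- units, e <- voids]

main :: IO ()
main = do
  print (length withUnits)  -- 8
  print (length withVoid)   -- 0
```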

How to convert data from IO(String) to String in haskell [duplicate]

Possible Duplicate:
A Haskell function of type: IO String-> String
I'm reading some data from a file using the readFile function available in Haskell. But this function returns the data as an IO String. Does anybody know how to convert this data into a String (or know any function that reads a String from a file, without the IO type)?
It is a very general question about extracting data from monadic values.
The general idea is to use >>= function:
main = readFile foo >>= \s -> print s
>>= takes 2 arguments. It extracts the value from its first argument and passes it to its second argument. The first argument is monadic value, in this case of type IO String, and the second argument is a function that accepts a plain, non-monadic value, in this case String.
There is a special syntax for this pattern:
main = do
  s <- readFile foo
  print s
But the meaning is the same as above. The do notation is more convenient for beginners and for certain complicated cases, but explicit application of >>= can lead to shorter code. For example, this code can be written as just
main = readFile foo >>= print
Also, there is a big family of library functions to convert between monadic and non-monadic values. The most important of them are return, fmap, liftM2 and >=>.
The concept of monad is very useful beyond representing IO in a referentially transparent way: these helpers are very useful for error handling, dealing with implicit state and other applications of monads.
The second most important monad is Maybe.
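To give a feel for the helpers named above outside of IO, here is a small sketch in the Maybe monad. The function halveEven is a made-up example of a computation that can fail; liftM2 and >=> come from Control.Monad in base.

```haskell
module Main where

import Control.Monad (liftM2, (>=>))

-- Succeeds only for even numbers (hypothetical example function).
halveEven :: Int -> Maybe Int
halveEven n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

main :: IO ()
main = do
  print (return 4 >>= halveEven)                  -- Just 2
  print (fmap (+ 1) (halveEven 4))                -- Just 3
  print (liftM2 (+) (halveEven 4) (halveEven 8))  -- Just 6
  print ((halveEven >=> halveEven) 12)            -- Just 3
  print ((halveEven >=> halveEven) 6)             -- Nothing
```

The last line fails at the second step (3 is odd), and the Nothing propagates without any explicit error check.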
I'd treat the IO type as a functor in this case, and instead of getting the value out of it, I'd send my function inside it and let the Functor instance deal with creating a new IO container with the result from my function.
> :m +Data.Functor
> length <$> readFile "file.txt"
525
<$> is an alias for fmap. I like <$> more, but it's just a personal preference.
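The same idea as a compiled program might look like the sketch below. To keep it self-contained, fakeReadFile is a hypothetical stand-in that yields a fixed String; in real code it would be replaced by readFile with an actual path. countLines is likewise a made-up pure function.

```haskell
module Main where

-- A pure function that knows nothing about IO.
countLines :: String -> Int
countLines = length . lines

-- Stand-in for readFile "some-file": an IO action that yields a String.
fakeReadFile :: IO String
fakeReadFile = pure "one\ntwo\nthree\n"

main :: IO ()
main = do
  -- Send the pure function inside IO with <$> ...
  n <- countLines <$> fakeReadFile
  print n                               -- 3
  -- ... or feed the result onward with >>=.
  fakeReadFile >>= print . countLines   -- 3
```

In both forms the String never "leaves" IO; the pure function is applied to it inside the action.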

Resources