Iteratively printing every integer in a List - haskell

Say I have a List of integers l = [1,2]
Which I want to print to stdout.
Doing print l produces [1,2]
Say I want to print the list without the braces
map print l produces
No instance for (Show (IO ())) arising from a use of `print'
Possible fix: add an instance declaration for (Show (IO ()))
In a stmt of an interactive GHCi command: print it
`:t print
print :: Show a => a -> IO ()
So while I thought this would work I went ahead and tried:
map putStr $ map show l
Since I suspected a type mismatch from Integer to String was to blame. This produced the same error message as above.
I realize that I could do something like concatenating the list into a string, but I would like to avoid that if possible.
What's going on? How can I do this without constructing a string from the elements of the List?

The problem is that
map :: (a -> b) -> [a] -> [b]
So we end up with [IO ()]. This is a pure value, a list of IO actions. It won't actually print anything. Instead we want
mapM_ :: (a -> IO ()) -> [a] -> IO ()
The naming convention *M means that it operates over monads and *_ means we throw away the value. This is like map except it sequences each action with >> to return an IO action.
As an example mapM_ print [1..10] will print each element on a new line.

Suppose you're given a list xs :: [a] and function f :: Monad m => a -> m b. You want to apply the function f to each element of xs, yielding a list of actions, then sequence these actions. Here is how I would go about constructing a function, call it mapM, that does this. In the base case, xs = [] is the empty list, and we simply return []. In the recursive case, xs has the form x : xs. First, we want to apply f to x, giving the action f x :: m b. Next, we want recursively call mapM on xs. The result of performing the first step is a value, say y; the result of performing the second step is a list of values, say ys. So we collect y and ys into a list, then return them in the monad:
mapM :: Monad m => (a -> m b) -> [a] -> m [b]
mapM f [] = return []
mapM f (x : xs) = f x >>= \y -> mapM f ys >>= \ys -> return (y : ys)
Now we can map a function like print, which returns an action in the IO monad, over a list of values to print: mapM print [1..10] does precisely this for the list of integers from one through ten. There is a problem, however: we aren't particularly concerned about collecting the results of printing operations; we're primarily concerned about their side effects. Instead of returning y : ys, we simply return ().
mapM_ :: Monad m => (a -> m b) ->[a] -> m ()
mapM_ f [] = return ()
mapM_ f (x : xs) = f x >> mapM_ f xs
Note that mapM and mapM_ can be defined without explicit recursion using the sequence and sequence_ functions from the standard library, which do precisely what their names imply. If you look at the source code for mapM and mapM_ in Control.Monad, you will see them implemented that way.

Everything in Haskell is very strongly typed, including code to perform IO!
When you write print [1, 2], this is just a convenience wrapper for putStrLn (show [1, 2]), where show is a function that turns a (Show'able) object into a string. print itself doesn't do anything (in the side effect sense of do), but it outputs an IO() action, which is sort of like a mini unrun "program" (if you excuse the sloppy language), which isn't "run" at its creation time, but which can be passed around for later execution. You can verify the type in ghci
> :t print [1, 2]
print [1, 2]::IO()
This is just an object of type IO ().... You could throw this away right now and nothing would ever happen. More likely, if you use this object in main, the IO code will run, side effects and all.
When you map multiple putStrLn (or print) functions onto a list, you still get an object whose type you can view in ghci
> :t map print [1, 2]
map print [1, 2]::[IO()]
Like before, this is just an object that you can pass around, and by itself it will not do anything. But unlike before, the type is incorrect for usage in main, which expects an IO() object. In order to use it, you need to convert it to this type.
There are many ways to do this conversion.... One way that I like is the sequence function.
sequence $ map print [1, 2]
which takes a list of IO actions (ie- mini "programs" with side effects, if you will forgive the sloppy language), and sequences them together as on IO action. This code alone will now do what you want.
As jozefg pointed out, although sequence works, sequence_ is a better choice here....
Sequence not only concatinates the stuff in the IO action, but also puts the return values in a list.... Since print's return value is IO(), the new return value becomes a useless list of ()'s (in IO). :)

Using the lens library:
[1,2,3] ^! each . act print

You might write your own function, too:
Prelude> let l = [1,2]
Prelude> let f [] = return (); f (x:xs) = do print x; f xs
Prelude> f l
1
2

Related

Are there ways to call two functions (one just after another) in purely functional language? (in non-io mode)

I'm trying to understand order of execution in purely functional language.
I know that in purely functional languages, there is no necessary execution order.
So my question is:
Suppose there are two functions.
I would like to know all ways in which I can call one function after another (except nested call of one function from another) (and except io-mode).
I would like to see examples in Haskell or pseudo-code.
There is no way to do what you describe, if the functions are totally independent and you don't use the result of one when you call the other.
This is because there is no reason to do this. In a side effect free setting, calling a function and then ignoring its result is exactly the same as doing nothing for the amount of time it takes to call that function (setting aside memory usage).
It is possible that seq x y will evaluate x and then y, and then give you y as its result, but this evaluation order isn't guaranteed.
Now, if we do have side effects, such as if we are working inside a Monad or Applicative, this could be useful, but we aren't truly ignoring the result since there is context being passed implicitly. For instance, you can do
main :: IO ()
main = putStrLn "Hello, " >> putStrLn "world"
in the IO Monad. Another example would be the list Monad (which could be thought of as representing a nondeterministic computation):
biggerThanTen :: Int -> Bool
biggerThanTen n = n > 10
example :: String
example = filter biggerThanTen [1..15] >> return 'a' -- This evaluates to "aaaaa"
Note that even here we aren't really ignoring the result. We ignore the specific values, but we use the structure of the result (in the second example, the structure would be the fact that the resulting list from filter biggerThanTen [1..15] has 5 elements).
I should point out, though, that things that are sequenced in this way aren't necessarily evaluated in the order that they are written. You can sort of see this with the list Monad example. This becomes more apparent with bigger examples though:
example2 :: [Int]
example2 =
[1,2,3] >>=
(\x -> [10,100,1000] >>=
(\y -> return (x * y))) -- ==> [10,100,1000,20,200,2000,30,300,3000]
The main takeaway here is that evaluation order (in the absence of side effects like IO and ignoring bottoms) doesn't affect the ultimate meaning of code in Haskell (other than possible differences in efficiency, but that is another topic). As a result, there is never a reason to call two functions "one after another" in the fashion described in the question (that is, where the calls are totally independent from each other).
Do notation
Do notation is actually exactly equivalent to using >>= and >> (there is actually one other thing involved that takes care of pattern match failures, but that is irrelevant to the discussion at hand). The compiler actually takes things written in do notation and converts them to >>= and >> through a process called "desugaring" (since it removes the syntactic sugar). Here are the three examples from above written with do notation:
IO Example
main :: IO ()
main = do
putStrLn "Hello, "
putStrLn "World"
First list example
biggerThanTen :: Int -> Bool
biggerThanTen n = n > 10
example :: String -- String is a synonym for [Char], by the way
example = do
filter biggerThanTen [1..15]
return 'a'
Second list example
example2 :: [Int]
example2 = do
x <- [1,2,3]
y <- [10,100,1000]
return (x * y)
Here is a side-by-side comparison of the conversions:
do --
m -- m >> n
n --
do --
x <- m -- m >>= (\x ->
... -- ...)
The best way to understand do notation is to first understand >>= and return since, as I said, that's what the compiler transforms do notation into.
As a side-note, >> is just the same as >>=, it just ignores the "result" of it's left argument (although it preserves the "context" or "structure"). So all definitions of >> must be equivalent to m >> n = m >>= (\_ -> n).
Expanding the >>= in the second list example
To help drive home the point that Monads are not usually impure, lets expand the >>= calls in the second list example, using the Monad definition for lists. The definition is:
instance Monad [] where
return x = [x]
xs >>= f = concatMap f xs
and we can convert example2 into:
Step 0 (what we already have)
example2 :: [Int]
example2 =
[1,2,3] >>=
(\x -> [10,100,1000] >>=
(\y -> return (x * y)))
Step 1 (converting the first >>=)
example2 =
concatMap
(\x -> [10,100,1000] >>=
(\y -> return (x * y)))
[1,2,3]
Step 2
example2 =
concatMap
(\x -> concatMap
(\y -> return (x * y))
[10,100,1000])
[1,2,3]
Step 3
example2 =
concatMap
(\x -> concatMap
(\y -> [x * y])
[10,100,1000])
[1,2,3]
So, there is no magic going on here, just normal function calls.
You can write a function whose arguments depend on the evaluation of another function:
-- Ads the first two elements of a list together
myFunc :: [Int] -> Int
myFunc xs = (head xs) + (head $ tail xs)
If that's what you mean. In this case, you can't get the output of myFunc xs without evaluating head xs, head $ tail xs and (+). There is an order here. However, the compiler can choose which order to execute head xs and head $ tail xs in since they are not dependent on each other, but it can't do the addition without having both of the other results. It could even choose to evaluate them in parallel, or on different machines. The point is that pure functions, because they have no side effects, don't have to be evaluated in a given order until their results are interdependent.
Another way to look at the above function is as a graph:
myFunc
|
(+)
/ \
/ \
head head
\ |
\ tail
\ /
xs
In order to evaluate a node, all nodes below it have to be evaluated first, but different branches can be evaluated in parallel. First xs must be evaluated, at least partially, but after that the two branches can be evaluated in parallel. There are some nuances due to lazy evaluation, but this is essentially how the compiler constructs evaluation trees.
If you really want to force one function call before the other, you can use the seq function. It takes two arguments, forces the first to be evaluated, then returns the second, e.g.
myFunc2 :: [Int] -> Int
myFunc2 xs = hxs + (hxs `seq` (head $ tail xs))
where hxs = head xs
This will force head xs to evaluate before head $ tail xs, but this is more dealing with strictness than sequencing functions.
Here is an easy way:
case f x of
result1 -> case g y of
result2 -> ....
Still, unless g y uses something from result1 and the subsequent calculations something from result2, or the pattern is such that the result must be evaluated, there is no guarantee that either of f or g are actually called, nor in what order.
Still, you wanted a way to call one function after another, and this is such a way.

Cont Monad breaks laziness in Haskell

I was trying the Cont monad, and discovers the following problem.
First construct a infinite list and lift all the elements to a Cont monad
Use sequence operation to get a Cont monad on the infinite list.
When we try to run the monad, with head, for example, it falls into infinite loop
while trying to expand the continuation and the head is never called.
The code looks like this:
let inff = map (return :: a -> Cont r a) [0..]
let seqf = sequence inff
runCont seqf head
So is this a limitation of the Cont monad implementation in Haskell?
If so, how do we improve this?
The reason is that even though the value of the head element of sequence someList depends only on the first elemenent of someList, the effect of sequence someList can generally depend on all the effects of someList (and it does for most monads). Therefore, if we want to evaluate the head element, we still need to evaluate all the effects.
For example, if we have a list of Maybe values, the result of sequence someList is Just only if all the elements of someList are Just. So if we try to sequence an infinite list, we'd need to examine its infinite number of elements if they're all Just.
The same applies for Cont.
In the continuation monad, we can escape any time from the computation and return a result that is different from what has been computed so far.
Consider the following example:
test :: (Num a, Enum a) => a
test = flip runCont head $
callCC $ \esc -> do
sequence (map return [0..100] ++ [esc [-1]])
or directly using cont instead of callCC:
test' :: (Num a, Enum a) => a
test' = flip runCont head $
sequence (map return [0..100] ++ [cont (const (-1))])
The result of test is just -1. After processing the first 100 elements, the final element can decide to escape all of this and return -1 instead. So in order to see what is the head element of sequence someList in Cont, we again need to compute them all.
This is not a flaw with the Cont monad so much as sequence. You can get similar results for Either, for example:
import Control.Monad.Instances ()
xs :: [Either a Int]
xs = map Right [0..] -- Note: return = Right, for Either
ys :: Either a [Int]
ys = sequence xs
You can't retrieve any elements of ys until it computes the entire list, which will never happen.
Also, note that: sequence (map f xs) = mapM f xs, so we can simplify this example to:
>>> import Control.Monad.Instances
>>> mapM Right [0..]
<Hangs forever>
There are a few monads where mapM will work on an infinite list of values, specifically the lazy StateT monad and Identity, but they are the exception to the rule.
Generally, mapM/sequence/replicateM (without trailing underscores) are anti-patterns and the correct solution is to use pipes, which allows you to build effectful streams that don't try to compute all the results up front. The beginning of the pipes tutorial describes how to solve this in more detail, but the general rule of thumb is that any time you write something like:
example1 = mapM f xs
example2 = sequence xs
You can transform it into a lazy Producer by just transforming it to:
example1' = each xs >-> Pipes.Prelude.mapM f
example2' = each xs >-> Pipes.Prelude.sequence
Using the above example with Either, you would write:
>>> import Pipes
>>> let xs = each [0..] >-> mapM Right :: Producer Int (Either a) ()
Then you can lazily process the stream without generating all elements:
>>> Pipes.Prelude.any (> 10) xs
Right True

How to generate a list of repeated applications of a function to the previous result of it in IO context

As a part of a solution for the problem I'm trying to solve I need to generate a list of repeated application of a function to it's previous result. Sounds very much like iterate function, with the exception, that iterate has signature of
iterate :: (a -> a) -> a -> [a]
and my function lives inside of IO (I need to generate random numbers), so I'd need something more of a:
iterate'::(a -> IO a) -> a -> [a]
I have looked at the hoogle, but without much success.
You can actually get a lazy iterate that works on infinite lists if you use the pipes library. The definition is really simple:
import Pipes
iterate' :: (a -> IO a) -> a -> Producer a IO r
iterate' f a = do
yield a
a2 <- lift (f a)
iterate' f a2
For example, let's say that our step function is:
step :: Int -> IO Int
step n = do
m <- readLn
return (n + m)
Then applying iterate to step generates a Producer that lazily prompts the user for input and generates the tally of values read so far:
iterate' step 0 :: Producer Int IO ()
The simplest way to read out the value is to loop over the Producer using for:
main = runEffect $
for (iterate' step 0) $ \n -> do
lift (print n)
The program then endlessly loops, requesting user input and displaying the current tally:
>>> main
0
10<Enter>
10
14<Enter>
24
5<Enter>
29
...
Notice how this gets two things correct which the other solutions do not:
It works on infinite lists (you don't need a termination condition)
It produces results immediately. It doesn't wait until you run the action on the entire list to start producing usable values.
However, we can easily filter results just like the other two solutions. For example, let's say I want to stop when the tally is greater than 100. I can just write:
import qualified Pipes.Prelude as P
main = runEffect $
for (iterate' step 0 >-> P.takeWhile (< 100)) $ \n -> do
lift (print n)
You can read that as saying: "Loop over the iterated values while they are less than 100. Print the output". Let's try it:
>>> main
0
10<Enter>
10
20<Enter>
30
75<Enter>
>>> -- Done!
In fact, pipes has another helper function for printing out values, so you can simplify the above to a pipeline:
main = runEffect $ iterate' step 0 >-> P.takeWhile (< 100) >-> P.print
This gives a clear view of the flow of information. iterate' produces a never-ending stream of Ints, P.takeWhile filters that stream, and P.print prints all values that reach the end.
If you want to learn more about the pipes library, I encourage you to read the pipes tutorial.
Your functions lives in IO, so the signature is rather:
iterate'::(a -> IO a) -> a -> IO [a]
The problem is that the original iterate function returns an infinite list, so if you try to do the same in IO you will get an action that never ends. Maybe you should add a condition to end the iteration.
iterate' action value = do
result <- action value
if condition result
then return []
else
rest <- iterate' action result
return $ result : rest
Firstly, your resulting list must be in the IO monad, so iterate' must have produce an IO [a], rather than '[a]'
Iterate can be defined as:
iterate (a -> a) -> a -> [a]
iterate f x = x : iterate f (f x)
so we could make an iterateM quite easily
iterateM :: (a -> m a) -> m a -> [m a]
iterateM f x = x : iterateM f (x >>= f)
This still needs your seed value to be in the monad to start though, and also gives you a list of monadic things, rather than a monad of listy things.
So, lets change it a bit.
iterateM :: (a -> m a) -> a -> m [a]
iterateM f x = sequence $ go f (return x)
where
go f x = x : go f (x >>= f)
However, this doesn't work. This is because sequence first runs every action, and then returns. (You can see this if you write some safeDivide :: Double -> Double -> Maybe Double, and then try something like fmap (take 10) $ iterateM (flip safeDivide 2) 1000. You'll find it doesn't terminate. I'm not sure how to fix that though.

Meaning of `<-` in do block in Haskell

I am trying to understand Haskell monads, reading "Monads for the Curious Programmer". I've run into the example of List Monad:
tossDie=[1,2,3,4,5,6]
toss2Dice = do
n <- tossDie
m <- tossDie
return (n+m)
main = print toss2Dice
The way do block produces m as 36-element list I sort of understand - it maps every element of n as a list of 6 elements and then concatenates those lists. What I do not understand is how n is changed by the presence of m <- tossDie, from 6 elements list to 36 elements. Obviously "we first bind n and then bind m" is wrong understanding here, but what is right?
I am also not fully clear on how two-argument function is applied in do block. I suspect it's a case of curying, but I am slightly hazy as to how exactly it works.
Can someone please explain the above two mysteries?
For lists (such as tossDie), the do notation acts like a list comprehension -- that is to say, as if each variable binding were a nested foreach loop.
The do-block expression:
toss2Dice = do { n <- tossDie; m <- tossDie; return (n+m) }
does the same thing as this list comprehension:
toss2Dice = [ n+m | n <- tossDie, m <- tossDie ]
The result is comparable to the following imperative pseudocode:
toss2Dice = []
foreach n in tossDie:
foreach m in tossDie:
toss2Dice.push_back(n+m)
except that the Haskell examples generate their results lazily, on demand, instead of eagerly and all at once.
If you look at the monad instance for lists, you can see how this works:
instance Monad [] where
xs >>= f = concat (map f xs)
return x = [x]
Starting at the beginning of the do block, each variable binding creates a loop over the remainder of the block:
do { n <- tossDie; m <- tossDie; return (n+m) }
===> tossDie >>= \n -> do { m <- tossDie; return (n+m) }
===> concat ( map (\n -> do { m <- tossDie; return (n+m) }) tossDie )
Note that map function iterates over the items in the list tossDie, and the results are concatenated. The mapping function is the remainder of the do block, so the first binding effectively creates an outer loop around it.
Additional bindings create successively nested loops; and finally, the return function creates a singleton list out of each computed value (n+m) so that the "bind" function >>= (which expects lists) can concatenate them properly.
The interesting bit I assume is this:
toss2Dice = do
n <- tossDie
m <- tossDie
return (n+m)
This is somewhat equivalent to the following Python:
def toss2dice():
for n in tossDie:
for m in tossDie:
yield (n+m)
When it comes to the list monad, you can view the binding arrows (<-) in do notation as traditional imperative "foreach" loops. Everything after
n <- tossDie
belongs to the "loop body" of that foreach loop, and so will be evaluated once for every value in tossDie assigned to n.
If you want the desugaring from do notation to actual bind operators >>=, it looks like this:
toss2Dice =
tossDie >>= (\n ->
tossDie >>= (\m ->
return (n+m)
)
)
Notice how the "inner loop body"
(\n ->
tossDie >>= (\m ->
return (n+m)
)
)
Will get executed once for every value in tossDie. This is pretty much the equivalent to the nested Python loops.
Technical mumbo-jumbo: The reason you get "foreach" loops from the binding arrows has to do with the particular monad you are working with. The arrows mean different things for different monads, and to know what they mean for a particular monad you have to do some sleuthing and figure out how that monad works in general.
The arrows get desugared into calls to the bind operator, >>=, which works differently for different monads as well – this is the reason the bind arrows <- also work differently for different monads!
In the case of the list monad, the bind operator >>= takes a list to the left and a function returning a list to the right, and sort of applies that function to every element of the list. If we want to double every element in a list in a cumbersome way, we could imagine doing it as
λ> [1, 2, 3, 4] >>= \n -> return (n*2)
[2,4,6,8]
(return is required to make the types work out. >>= expects a function that returns a list, and return will, for the list monad, wrap a value in a list.) To illustrate a perhaps more powerful example, we can start by imagining the function
λ> let posneg n = [n, -n]
λ> posneg 5
[5,-5]
Then we can write
λ> [1, 2, 3, 4] >>= posneg
[1,-1,2,-2,3,-3,4,-4]
to count the natural numbers between -4 and 4.
The reason the list monad works this way is that this particular behaviour of the bind operator >>= and return makes the monad laws hold. The monad laws are important for us (and perhaps the adventurous compiler) because they let us change code in ways that we know will not break anything.
A very cute side effect of this is that it makes lists very handy to represent uncertainty in values: Say you are building an OCR thingey that's supposed to look at an image and turn it into text. You might encounter a character that could be either 4 or A or H, but you are not sure. By letting the OCR thingey work in the list monad and return the list ['A', '4', 'H'] you have covered your bases. Actually working with the scanned text then become very easy and readable with do notation for the list monad. (It sorta looks like you're working with single values, when in fact you are just generating all possible combinations!)
Adding to #kqr answer:
>>= for list monad is actually concatMap, a function that maps elements to lists of elements and concatenates the lists, but with arguments flipped:
concatMap' x f = concat (map f x)
or alternatively
concatMap' = flip concatMap
return is just
singleElementList x = [x]
Now we can replace >>= with concatMap' and singleElementList:
toss2Dice =
concatMap' tossDie (\n ->
concatMap' tossDie (\m ->
singleElementList (n+m)
)
)
Now we can replace the 2 functions with their bodies:
toss2Dice =
concat (map (\n ->
concat (map (\m ->
[n+m]
) tossDice)
) tossDice)
Remove extra line breaks:
toss2Dice = concat (map (\n -> concat (map (\m -> [n+m]) tossDice)) tossDice)
Or shorter with concatMap:
toss2Dice = concatMap (\n -> concatMap (\m -> [n+m]) tossDice) tossDice
following nponeccop's advice, with
for = flip concatMap
your code becomes
toss2Dice =
for {- n in -} tossDie {- call -}
(\n-> for {- m in -} tossDie {- call -}
(\m-> [n+m]))
where it is plainly visible that we have nested functions, one inside another; so the inner function (\m-> [n+m]), being in the scope of outer function's argument n, has access to it (to the argument n that is). So it uses the value of the argument to the outer function, which is the same on each invocation of the inner function, however many times it is invoked during the same invocation of the outer function.
This can be re-written with named functions,
toss2Dice =
for {- each elem in -} tossDie {- call -} g
where g n = for {- each elem in -} tossDie {- call -} h
where h m = [n+m]
The function h is defined inside g, i.e. in g's argument's scope. And that's how h gets to use both m and n even though only m is its argument.
So in fact we do indeed "first bind n and then bind m" here. In nested fashion, that is.

Why don't Haskell list comprehensions cause an error when pattern match fails?

I'm trying to understand how Haskell list comprehensions work "under the hood" in regards to pattern matching. The following ghci output illustrates my point:
Prelude> let myList = [Just 1, Just 2, Nothing, Just 3]
Prelude> let xs = [x | Just x <- myList]
Prelude> xs
[1,2,3]
Prelude>
As you can see, it is able to skip the "Nothing" and select only the "Just" values. I understand that List is a monad, defined as (source from Real World Haskell, ch. 14):
instance Monad [] where
return x = [x]
xs >>= f = concat (map f xs)
xs >> f = concat (map (\_ -> f) xs)
fail _ = []
Therefore, a list comprehension basically builds a singleton list for every element selected in the list comprehension and concatenates them. If a pattern match fails at some step, the result of the "fail" function is used instead. In other words, the "Just x" pattern doesn't match so [] is used as a placeholder until 'concat' is called. That explains why the "Nothing" appears to be skipped.
What I don't understand is, how does Haskell know to call the "fail" function? Is it "compiler magic", or functionality that you can write yourself in Haskell? Is it possible to write the following "select" function to work the same way as a list comprehension?
select :: (a -> b) -> [a] -> [b]
select (Just x -> x) myList -- how to prevent the lambda from raising an error?
[1,2,3]
While implemenatations of Haskell might not do it directly like this internally, it is helpful to think about it this way :)
[x | Just x <- myList]
... becomes:
do
Just x <- myList
return x
... which is:
myList >>= \(Just x) -> return x
As to your question:
What I don't understand is, how does Haskell know to call the "fail" function?
In do-notation, if a pattern binding fails (i.e. the Just x), then the fail method is called. For the above example, it would look something like this:
myList >>= \temp -> case temp of
(Just x) -> return x
_ -> fail "..."
So, every time you have a pattern-match in a monadic context that may fail, Haskell inserts a call to fail. Try it out with IO:
main = do
(1,x) <- return (0,2)
print x -- x would be 2, but the pattern match fails
The rule for desugaring a list comprehension requires an expression of the form [ e | p <- l ] (where e is an expression, p a pattern, and l a list expression) behave like
let ok p = [e]
ok _ = []
in concatMap ok l
Previous versions of Haskell had monad comprehensions, which were removed from the language because they were hard to read and redundant with the do-notation. (List comprehensions are redundant, too, but they aren't so hard to read.) I think desugaring [ e | p <- l ] as a monad (or, to be precise, as a monad with zero) would yield something like
let ok p = return e
ok _ = mzero
in l >>= ok
where mzero is from the MonadPlus class. This is very close to
do { p <- l; return e }
which desugars to
let ok p = return e
ok _ = fail "..."
in l >>= ok
When we take the List Monad, we have
return e = [e]
mzero = fail _ = []
(>>=) = flip concatMap
I.e., the 3 approaches (list comprehensions, monad comprehensions, do expressions) are equivalent for lists.
I don't think the list comprehension syntax has much to do with the fact that List ([]), or Maybe for that matter, happens to be an instance of the Monad type class.
List comprehensions are indeed compiler magic or syntax sugar, but that's possible because the compiler knows the structure of the [] data type.
Here's what the list comprehension is compiled to: (Well, I think, I didn't actually check it against the GHC)
xs = let f = \xs -> case xs of
Just x -> [x]
_ -> []
in concatMap f myList
As you can see, the compiler doesn't have to call the fail function, it can simply inline a empty list, because it knows what a list is.
Interestingly, this fact that the list comprehensions syntax 'skips' pattern match failures is used in some libraries to do generic programming. See the example in the Uniplate library.
Edit: Oh, and to answer your question, you can't call your select function with the lambda you gave it. It will indeed fail on a pattern match failure if you call it with an Nothing value.
You could pass it the f function from the code above, but than select would have the type:
select :: (a -> [b]) -> [a] -> [b]
which is perfectly fine, you can use the concatMap function internally :-)
Also, that new select now has the type of the monadic bind operator for lists (with its arguments flipped):
(>>=) :: [a] -> (a -> [b]) -> [b]
xs >>= f = concatMap f xs -- 'or as you said: concat (map f xs)

Resources