Why should fail method exist in the monad type class?

So I have this line of code:
[Nothing] >>= \(Just x) -> [x]
which of course throws an exception, because the pattern doesn't match Nothing.
On the other hand, this code gives a different result, []:
do
  Just x <- [Nothing]
  return x
As I see it, they should produce the same result, because do-blocks are supposed to be desugared into uses of (>>=) and return. But this is not the case, making do-notation a feature rather than mere syntactic sugar.
I know that fail exists in the monad type class and I know that it is called when a pattern matching fails in a do-block, but I can't understand why it is a wanted behavior that should be different than using normal monad operations.
So my question is: why should the fail method exist at all?

Code such as
\(Just x) -> ...
denotes a function. There's only one way to use such a value: apply it to some argument. When said argument does not match the pattern (e.g. is Nothing), application is impossible, and the only general option is to raise a runtime error/exception.
Instead, when in a do-block, we have a type class around: a monad. Such class could, in theory, be extended to provide a behavior for such cases. Indeed, the designers of Haskell decided to add a fail method just for this case.
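For concreteness, here is roughly the desugaring that happens (in modern GHC the fail that gets called is the one from the MonadFail class, but the mechanism is the same):

-- do { Just x <- [Nothing]; return x }  desugars roughly to:
desugared :: [Int]
desugared =
  [Nothing] >>= \v -> case v of
    Just x -> return x
    _      -> fail "Pattern match failure in do expression"

-- For lists, fail _ = [], so the failed match contributes
-- no elements and the whole expression evaluates to [].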
Whether the choice was good or bad can be controversial. Just to present another design option, the Monad class could have been designed without fail, and blocks such as
do ...
   Just x <- ...
   ...
could have been forbidden, or made to require a special MonadFail subclass of Monad. Erroring out is also a choice, but we like to write e.g.
catMaybes xs = do Just x <- xs
                  return x
-- or
catMaybes xs = [ x | Just x <- xs ]
to discard Nothings from a list.

Related

Why are values of type () inspected?

In Haskell, the () type has two values, namely, () and bottom. If you have an expression e :: (), there's no point in actually inspecting it, since either it's e = () or by inspecting it you're crashing a program which could otherwise have not crashed.
Hence, I figured that perhaps operations on values of type () would not inspect the value and would not distinguish between () and bottom.
However, this is wildly untrue:
$ ghci
GHCi, version 9.0.2: https://www.haskell.org/ghc/ :? for help
ghci> u = (undefined :: ())
ghci> show u
"*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:75:14 in base:GHC.Err
undefined, called at <interactive>:1:6 in interactive:Ghci1
ghci> () == u
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:75:14 in base:GHC.Err
undefined, called at <interactive>:1:6 in interactive:Ghci1
ghci> f () = "ok"
ghci> f u
"*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:75:14 in base:GHC.Err
undefined, called at <interactive>:1:6 in interactive:Ghci1
What is the reason for this? Here are some conjectures:
For some reason that I can't think of, it's useful to be non-lazy on (). Sometimes we want that bottom to propagate.
Haskell semantics are written in such a way that destructuring any ADTs, even trivial ones, inspects them. This means that having case (undefined :: ()) of { () -> ... } not throw would be a violation of language semantics
() is an extremely special case and simply isn't worth the attention to eke out this tiny extra bit of safety in a massive language like Haskell
There's also a possible combination of 2 and 3: Haskell could have had semantics dictating that case e of ... inspects e unless e has type (), but that would pollute the language spec for relatively little benefit.
I will address this part:
For some reason that I can't think of, it's useful to be non-lazy on (). Sometimes we want that bottom to propagate.
Let's have a look at Control.Parallel.Strategies (version 1, an older version). This is a module for parallel evaluation. Let's focus on one of its functions for the sake of simplicity:
parMap :: Strategy b -> (a -> b) -> [a] -> [b]
The result of parMap strat f xs is the same as map f xs, except that the list is computed in parallel. What is the strat argument? Well,
strat :: Strategy b
means
strat :: b -> ()
There are only two things you can do with strat:
call it and ignore the result, which by laziness amounts to not calling it at all;
call it and force the result, even if you know it's () or a bottom.
parMap does the latter, in parallel. This allows the caller to specify a strat argument that evaluates the list values of type b as needed. For example
parMap (\(x,y) -> ()) f xs
parMap (\(x,y) -> x `seq` ()) f xs
parMap (\(x,y) -> x `seq` y `seq` ()) f xs
are valid calls, and will cause parMap to evaluate each element of the new list of pairs only as far as the pair constructor; as far as the pair constructor plus the first component; or as far as both components, respectively.
Hence, forcing the () result of strat in this case allows the user to control how much evaluation to perform during parMap, i.e. how much to force the result (in parallel), and consequently which parts of the result should be left unevaluated. (By comparison, map f xs would leave the result fully unevaluated -- it is completely lazy. parMap cannot do that, otherwise it would no longer be parallel.)
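To make this concrete, here is a minimal sketch of mine, not the real library code (which sparks the work in parallel with par), showing why forcing the () result runs the strategy:

type Strategy a = a -> ()

-- Illustrative only: apply f, then force the strategy's () result,
-- which performs however much evaluation of y the strategy encodes.
parMapSketch :: Strategy b -> (a -> b) -> [a] -> [b]
parMapSketch strat f xs = map apply xs
  where
    apply x = let y = f x
              in strat y `seq` y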
Minor digression: note that the GADT
data a :~: b where
  Refl :: t :~: t
has one constructor like (). Here, it is mandatory that such values are forced as in:
foo :: Int :~: String -> Int -> String
foo Refl x = x ++ " hello"
Here the first argument must be a bottom. By forcing that, we make the function error out with an exception. If we did not force that, we would get a very nasty undefined behavior like those in C and C++, completely breaking type safety. Haskell will correctly reject any attempt to circumvent that:
foo :: Int :~: String -> Int -> String
foo _ x = x ++ " hello"
triggers a type error at compile time.
I don't know for sure, but I suspect it's none of the things you said. Instead, this is so that the language is predictable and consistent.
There are, essentially, two things you observed, and I consider them to be separate things. The first is that checking whether a value x is indeed () with a case statement forces evaluation of x; the second is that the instances (of Show and Eq) are written to use a case statement.
Pattern matching: the predictable, consistent rule here is that if you write case <e0> of <pat> -> <e1>, then e0 is evaluated far enough to check whether the constructors in pat are in fact in the given places. Well, okay, there's some wrinkles here to do with irrefutable patterns; let's say instead that e0 is evaluated far enough to check whether pat actually does match! For the () type, that means that the pattern () causes full evaluation -- because you've specified the full value that you expect it to be -- while the pattern x or _ can match without further evaluation.
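A small runnable illustration of that rule (the function names are mine):

strict :: () -> String
strict () = "forced"          -- the pattern () forces the argument fully

lazyVar :: () -> String
lazyVar _ = "not forced"      -- the pattern _ matches without evaluating

lazyIrref :: () -> String
lazyIrref ~() = "not forced"  -- an irrefutable pattern also defers evaluation

main :: IO ()
main = do
  putStrLn (lazyVar undefined)    -- prints "not forced"
  putStrLn (lazyIrref undefined)  -- prints "not forced"
  putStrLn (strict undefined)     -- throws Prelude.undefined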
Class instances: the natural inductive way to specify what the various class instances do is to always have an outermost case that matches against each available constructor with simple variable patterns for the fields, then does something (presumably recursive calls) on each of the fields in turn. That is, simplifying a bit, the show implementation goes like:
show x = case x of
  <Con0> field00 field01 field02 <...> ->
    "<Con0>"
      ++ " " ++ show field00
      ++ " " ++ show field01
      ++ " " ++ show field02
      ++ <...>
  <Con1> field10 field11 field12 <...> ->
    "<Con1>"
      ++ " " ++ show field10
      ++ " " ++ show field11
      ++ " " ++ show field12
      ++ <...>
  <...>
It is very natural for the specialization of this scheme to the single-constructor, zero-field type () to go:
show x = case x of
  () -> "()"
(Additionally, the Report specifies that (==) is always strict in both arguments; but that property would also arise naturally from the obvious way of writing a generic Eq instance derivation algorithm.) Therefore the path of least surprise is for class instances to pattern match on their argument(s).
#2 is definitely true.
The () type is just an ordinary data type with one nullary constructor and special type/data constructor syntax:
data () = ()
As a result, the Haskell 2010 report, while only providing an informal semantics, makes it pretty clear in section 3.17.2 Informal Semantics of Pattern Matching that the expression:
case undefined of () -> "ack!"
will be evaluated as per rule #5:
Matching the pattern con pat1 … patn against a value, where con is a constructor defined by data, depends on the value:
If the value is of the form con v1 … vn, sub-patterns are matched left-to-right against the components of the data value; if all matches succeed, the overall match succeeds; the first to fail or diverge causes the overall match to fail or diverge, respectively.
If the value is of the form con′ v1 … vm, where con is a different constructor to con′, the match fails.
If the value is ⊥, the match diverges.
Here, the value of undefined is ⊥, so the third bullet point applies, and the match diverges. And if the match diverges, the program diverges, and if the program diverges it must terminate with an error or -- at worst -- loop forever. It cannot continue as if nothing has happened. Admittedly, this last part is not explicitly stated, but it is the only reasonable interpretation for the semantics of a divergent evaluation of an expression.

The real sense of list generators in haskell

As I understand, the code
l = [(a,b)|a<-[1,2],b<-[3,4]]
is equivalent to
l = do
  a <- [1,2]
  b <- [3,4]
  return (a,b)
or
[1,2] >>= (\a -> [3,4] >>= (\b -> return (a,b)))
The type of such an expression is [(t,t1)] where t and t1 are in Num.
If I write something like
getLine >>= (\a -> getLine >>= (\b -> return (a,b)))
the interpreter reads two lines and returns a tuple containing them.
But can I use getLine or something like that in list generators?
The expression
[x|x<-getLine]
returns an error: Couldn't match expected type `[t0]' with actual type `IO String'
But, of course, this works in do-notation or using (>>=).
What's the point of list generators, and what's the actual difference between them and do-notation?
Is there any type restriction when using list gens?
That's a sensible observation, and you're not the first one to stumble upon that. You're right that the translation of [x|x<-getLine] would lead to a perfectly valid monadic expression. The point is that list comprehensions were, I think, first only introduced as convenience syntax for lists, and (probably) no one thought that people might use them for other monads.
However, since the restriction to [] is not really a necessary one, there is a GHC extension called -XMonadComprehensions which removes the restriction and allows you to write exactly what you wanted:
Prelude> :set -XMonadComprehensions
Prelude> [x|x<-getLine]
sdf
"sdf"
My understanding is that list comprehensions can only be used to construct lists.
However, there's a language extension called "monad comprehensions" that allows you to use any arbitrary monad.
https://ghc.haskell.org/trac/ghc/wiki/MonadComprehensions
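As a further illustration of my own: with the extension enabled, the same comprehension syntax works in any monad, e.g. Maybe, where a Nothing generator short-circuits the whole comprehension:

{-# LANGUAGE MonadComprehensions #-}

-- Comprehension syntax in the Maybe monad.
pairSum :: Maybe Int
pairSum = [ x + y | x <- Just 10, y <- Just 20 ]  -- Just 30

noSum :: Maybe Int
noSum = [ x + y | x <- Just 10, y <- Nothing ]    -- Nothing

main :: IO ()
main = print pairSum >> print noSum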

Does a function in Haskell always evaluate its return value?

I'm trying to better understand Haskell's laziness, such as when it evaluates an argument to a function.
From this source:
But when a call to const is evaluated (that’s the situation we are interested in, here, after all), its return value is evaluated too ... This is a good general principle: a function obviously is strict in its return value, because when a function application needs to be evaluated, it needs to evaluate, in the body of the function, what gets returned. Starting from there, you can know what must be evaluated by looking at what the return value depends on invariably. Your function will be strict in these arguments, and lazy in the others.
So a function in Haskell always evaluates its own return value? If I have:
foo :: Num a => [a] -> [a]
foo [] = []
foo (_:xs) = map (* 2) xs
head (foo [1..]) -- = 4
According to the above paragraph, map (* 2) xs must be evaluated. Intuitively, I would think that means applying the map to the entire list, resulting in an infinite loop.
But, I can successfully take the head of the result. I know that : is lazy in Haskell, so does this mean that evaluating map (* 2) xs just means constructing something else that isn't fully evaluated yet?
What does it mean to evaluate a function applied to an infinite list? If the return value of a function is always evaluated when the function is evaluated, can a function ever actually return a thunk?
Edit:
bar x y = x
var = bar (product [1..]) 1
This code doesn't hang. When I create var, does it not evaluate its body? Or does it set bar to product [1..] and not evaluate that? If the latter, bar is not returning its body in WHNF, right, so did it really 'evaluate' x? How could bar be strict in x if it doesn't hang on computing product [1..]?
First of all, Haskell does not specify when evaluation happens so the question can only be given a definite answer for specific implementations.
The following is true for all non-parallel implementations that I know of, like ghc, hbc, nhc, hugs, etc (all G-machine based, btw).
BTW, something to remember is that when you hear "evaluate" for Haskell it normally means "evaluate to WHNF".
Unlike strict languages you have to distinguish between two "callers" of a function, the first is where the call occurs lexically, and the second is where the value is demanded. For a strict language these two always coincide, but not for a lazy language.
Let's take your example and complicate it a little:
foo [] = []
foo (_:xs) = map (* 2) xs
bar x = (foo [1..], x)
main = print (head (fst (bar 42)))
The foo function occurs in bar. Evaluating bar will return a pair, and the first component of the pair is a thunk corresponding to foo [1..]. So bar is what would be the caller in a strict language, but in the case of a lazy language it doesn't call foo at all, instead it just builds the closure.
Now, in the main function we actually need the value of head (fst (bar 42)) since we have to print it. So the head function will actually be called. The head function is defined by pattern matching, so it needs the value of the argument. So fst is called. It too is defined by pattern matching and needs its argument so bar is called, and bar will return a pair, and fst will evaluate and return its first component. And now finally foo is "called"; and by called I mean that the thunk is evaluated (entered as it's sometimes called in TIM terminology), because the value is needed. The only reason the actual code for foo is called is that we want a value. So foo had better return a value (i.e., a WHNF). The foo function will evaluate its argument and end up in the second branch. Here it will tail call into the code for map. The map function is defined by pattern match and it will evaluate its argument, which is a cons. So map will return the following {(*2) y} : {map (*2) ys}, where I have used {} to indicate a closure being built. So as you can see map just returns a cons cell with the head being a closure and the tail being a closure.
To understand the operational semantics of Haskell better I suggest you look at some paper describing how to translate Haskell to some abstract machine, like the G-machine.
I always found that the term "evaluate," which I had learned in other contexts (e.g., Scheme programming), got me all confused when I tried to apply it to Haskell, and that I made a breakthrough when I started to think of Haskell in terms of forcing expressions instead of "evaluating" them. Some key differences:
"Evaluation," as I learned the term before, strongly connotes mapping expressions to values that are themselves not expressions. (One common technical term here is "denotations.")
In Haskell, the process of forcing is IMHO most easily understood as expression rewriting. You start with an expression, and you repeatedly rewrite it according to certain rules until you get an equivalent expression that satisfies a certain property.
In Haskell the "certain property" has the unfriendly name weak head normal form ("WHNF"), which really just means that the expression is either a nullary data constructor or an application of a data constructor.
Let's translate that to a very rough set of informal rules. To force an expression expr:
If expr is a nullary constructor or a constructor application, the result of forcing it is expr itself. (It's already in WHNF.)
If expr is a function application f arg, then the result of forcing it is obtained this way:
1. Find the definition of f.
2. Can you pattern match this definition against the expression arg? If not, then force arg and try again with the result of that.
3. Substitute the pattern match variables in the body of f with the parts of (the possibly rewritten) arg that correspond to them, and force the resulting expression.
One way of thinking of this is that when you force an expression, you're trying to rewrite it minimally to reduce it to an equivalent expression in WHNF.
Let's apply this to your example:
foo :: Num a => [a] -> [a]
foo [] = []
foo (_:xs) = map (* 2) xs
-- We want to force this expression:
head (foo [1..])
We will need definitions for head and map:
head [] = undefined
head (x:_) = x
map _ [] = []
map f (x:xs) = f x : map f xs
-- Not real code, but a rule we'll be using for forcing infinite ranges.
[n..] ==> n : [(n+1)..]
So now:
head (foo [1..]) ==> head (foo (1 : [2..]))      -- using the forcing rule for [n..]
                 ==> head (map (*2) [2..])       -- using the definition of foo
                 ==> head (map (*2) (2 : [3..])) -- using the forcing rule for [n..]
                 ==> head (2*2 : map (*2) [3..]) -- using the definition of map
                 ==> 2*2                         -- using the definition of head
                 ==> 4                           -- using the definition of *
(Note that foo discards the first element of its argument, which is why the question's comment says head (foo [1..]) is 4, not 2.)
I believe the idea must be that in a lazy language if you're evaluating a function application, it must be because you need the result of the application for something. So whatever reason caused the function application to be reduced in the first place is going to continue to need to reduce the returned result. If we didn't need the function's result we wouldn't be evaluating the call in the first place, the whole application would be left as a thunk.
A key point is that the standard "lazy evaluation" order is demand-driven. You only evaluate what you need. Evaluating more risks violating the language spec's definition of "non-strict semantics" and looping or failing for some programs that should be able to terminate; lazy evaluation has the interesting property that if any evaluation order can cause a particular program to terminate, so can lazy evaluation.1
But if we only evaluate what we need, what does "need" mean? Generally it means either
a pattern match needs to know what constructor a particular value is (e.g. I can't know what branch to take in your definition of foo without knowing whether the argument is [] or _:xs)
a primitive operation needs to know the entire value (e.g. the arithmetic circuits in the CPU can't add or compare thunks; I need to fully evaluate two Int values to call such operations)
the outer driver that executes the main IO action needs to know what the next thing to execute is
So say we've got this program:
foo :: Num a => [a] -> [a]
foo [] = []
foo (_:xs) = map (* 2) xs
main :: IO ()
main = print (head (foo [1..]))
To execute main, the IO driver has to evaluate the thunk print (head (foo [1..])) to work out that it's print applied to the thunk head (foo [1..]). print needs to evaluate its argument in order to print it, so now we need to evaluate that thunk.
head starts by pattern matching its argument, so now we need to evaluate foo [1..], but only to WHNF - just enough to tell whether the outermost list constructor is [] or :.
foo starts by pattern matching on its argument. So we need to evaluate [1..], also only to WHNF. That's basically 1 : [2..], which is enough to see which branch to take in foo.2
The : case of foo (with xs bound to the thunk [2..]) evaluates to the thunk map (*2) [2..].
So the call to foo has been evaluated, but the map application it returned has not been. However, we only evaluated foo because head was pattern matching to see whether we had [] or x : _. We still don't know that, so we must immediately continue by evaluating the result of foo.
This is what the article means when it says functions are strict in their result. Given that a call to foo is evaluated at all, its result will also be evaluated (and so, anything needed to evaluate the result will also be evaluated).
But how far it needs to be evaluated depends on the calling context. head is only pattern matching on the result of foo, so it only needs a result to WHNF. We can get an infinite list to WHNF (we already did so, with 1 : [2..]), so we don't necessarily get in an infinite loop when evaluating a call to foo. But if head were some sort of primitive operation implemented outside of Haskell that needed to be passed a completely evaluated list, then we'd be evaluating foo [1..] completely, and thus would never finish in order to come back to head.
So, just to complete my example, we're evaluating map (2 *) [2..].
map pattern matches its second argument, so we need to evaluate [2..] as far as 2 : [3..]. That's enough for map to return the thunk (2 *) 2 : map (2 *) [3..], which is in WHNF. And so it's done, we can finally return to head.
head ((2 *) 2 : map (2 *) [3..]) doesn't need to inspect either side of the :, it just needs to know that there is one so it can return the left side. So it just returns the unevaluated thunk (2 *) 2.
Again though, we only evaluated the call to head this far because print needed to know what its result is, so although head doesn't evaluate its result, its result is always evaluated whenever the call to head is.
(2 *) 2 evaluates to 4, print converts that into the string "4" (via show), and the line gets printed to the output. That was the entire main IO action, so the program is done.
1 Implementations of Haskell, such as GHC, do not always use "standard lazy evaluation", and the language spec does not require it. If the compiler can prove that something will always be needed, or cannot loop/error, then it's safe to evaluate it even when lazy evaluation wouldn't (yet) do so. This can often be faster so GHC optimizations do actually do this.
2 I'm skipping over a few details here, like that print does have some non-primitive implementation we could step inside and lazily evaluate, and that [1..] could be further expanded to the functions that actually implement that syntax.
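One way to watch this demand-driven evaluation yourself is GHCi's :sprint command, which shows how much of a binding has been forced so far, printing _ for unevaluated thunks (a session sketch; output abbreviated):

ghci> ys = map (*2) [1..10] :: [Int]
ghci> :sprint ys
ys = _
ghci> head ys
2
ghci> :sprint ys
ys = 2 : _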
Not necessarily. Haskell is lazy, meaning that it only evaluates when it needs to. This has some interesting effects. If we take the below code, for example:
-- File: lazinessTest.hs
(>?) :: a -> b -> b
a >? b = b
main = (putStrLn "Something") >? (putStrLn "Something else")
This is the output of the program:
$ ./lazinessTest
Something else
This indicates that putStrLn "Something" is never evaluated. But it's still being passed to the function, in the form of a 'thunk'. These 'thunks' are unevaluated values that, rather than being concrete values, are like a breadcrumb-trail of how to compute the value. This is how Haskell laziness works.
In our case, two 'thunks' are passed to >?, but only one is passed out, meaning that only one is evaluated in the end. This also applies to const, where the second argument can be safely ignored, and therefore is never computed. As for map, laziness means the end of the list is never demanded, so only what is needed gets computed -- in your case, the second element of the original list.
However, it's best to leave the thinking about laziness to the compiler and keep coding, unless you're dealing with IO, in which case you really, really should think about laziness, because you can easily go wrong, as I've just demonstrated.
There are lots and lots of online articles on the Haskell wiki to look at, if you want more detail.
A function can evaluate to a return value:
head (x:_) = x
or to an exception/error:
head _ = error "Head: List is empty!"
or to bottom (⊥):
a = a
b = last [1 ..]
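Finally, to tie back to the edit in the question: defining var = bar (product [1..]) 1 only builds a thunk, so nothing hangs until something demands var's value. A small sketch of my own:

bar :: a -> b -> a
bar x y = x

var :: Integer
var = bar (product [1..]) 1   -- just a thunk; defining this cannot hang

main :: IO ()
main = do
  putStrLn "var defined, nothing evaluated yet"
  -- print var  -- uncommenting this demands product [1..] and never finishes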

What does Haskell's <|> operator do?

Going through Haskell's documentation is always a bit of a pain for me, because all the information you get about a function is often nothing more than just: f a -> f [a] which could mean any number of things.
As is the case of the <|> function.
All I'm given is this: (<|>) :: f a -> f a -> f a and that it's an "associative binary operation"...
Upon inspection of Control.Applicative I learn that it does seemingly unrelated things depending on implementation.
instance Alternative Maybe where
  empty = Nothing
  Nothing <|> r = r
  l       <|> _ = l
Ok, so it returns right if there is no left, otherwise it returns left, gotcha.. This leads me to believe it's a "left or right" operator, which kinda makes sense given its use of | and |'s historical use as "OR"
instance Alternative [] where
  empty = []
  (<|>) = (++)
Except here it just calls list's concatenation operator... Breaking my idea down...
So what exactly is that function? What's its use? Where does it fit in in the grand scheme of things?
Typically it means "choice" or "parallel" in that a <|> b is either a "choice" of a or b or a and b done in parallel. But let's back up.
Really, there is no practical meaning to operations in typeclasses like (<*>) or (<|>). These operations are given meaning in two ways: (1) via laws and (2) via instantiations. If we are not talking about a particular instance of Alternative then only (1) is available for intuiting meaning.
So "associative" means that a <|> (b <|> c) is the same as (a <|> b) <|> c. This is useful as it means that we only care about the sequence of things chained together with (<|>), not their "tree structure".
Other laws include identity with empty. In particular, a <|> empty = empty <|> a = a. In our intuition with "choice" or "parallel" these laws read as "a or (something impossible) must be a" or "a alongside (empty process) is just a". It indicates that empty is some kind of "failure mode" for an Alternative.
There are other laws with how (<|>)/empty interact with fmap (from Functor) or pure/(<*>) (from Applicative), but perhaps the best way to move forward in understanding the meaning of (<|>) is to examine a very common example of a type which instantiates Alternative: a Parser.
If x :: Parser A and y :: Parser B then (,) <$> x <*> y :: Parser (A, B) parses x and then y in sequence. In contrast, (fmap Left x) <|> (fmap Right y) parses either x or y, beginning with x, to try out both possible parses. In other words, it indicates a branch in your parse tree, a choice, or a parallel parsing universe.
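For a concrete parser you can try without extra packages, base's Text.ParserCombinators.ReadP has an Alternative instance (a small sketch of mine):

import Control.Applicative ((<|>))
import Text.ParserCombinators.ReadP (ReadP, readP_to_S, string)

-- Parse either the word "yes" or the word "no".
yesNo :: ReadP String
yesNo = string "yes" <|> string "no"

main :: IO ()
main = do
  print (readP_to_S yesNo "yes")  -- [("yes","")]
  print (readP_to_S yesNo "no")   -- [("no","")]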
(<|>) :: f a -> f a -> f a actually tells you quite a lot, even without considering the laws for Alternative.
It takes two f a values, and has to give one back. So it will have to combine or select from its inputs somehow. It's polymorphic in the type a, so it will be completely unable to inspect whatever values of type a might be inside an f a; this means it can't do the "combining" by combining a values, so it must do it purely in terms of whatever structure the type constructor f adds.
The name helps a bit too. Some sort of "OR" is indeed the vague concept the authors were trying to indicate with the name "Alternative" and the symbol "<|>".
Now if I've got two Maybe a values and I have to combine them, what can I do? If they're both Nothing I'll have to return Nothing, with no way to create an a. If at least one of them is a Just ... I can return one of my inputs as-is, or I can return Nothing. There are very few functions that are even possible with the type Maybe a -> Maybe a -> Maybe a, and for a class whose name is "Alternative" the one given is pretty reasonable and obvious.
How about combining two [a] values? There are more possible functions here, but really it's pretty obvious what this is likely to do. And the name "Alternative" does give you a good hint at what this is likely to be about provided you're familiar with the standard "nondeterminism" interpretation of the list monad/applicative; if you see a [a] as a "nondeterministic a" with a collection of possible values, then the obvious way for "combining two nondeterministic a values" in a way that might deserve the name "Alternative" is to produce a nondeterminstic a which could be any of the values from either of the inputs.
And for parsers, combining two parsers has two obvious broad interpretations that spring to mind: either you produce a parser that matches what the first does and then what the second does, or you produce a parser that matches either what the first does or what the second does (there are of course subtle details of each of these that leave room for variation). Given the name "Alternative", the "or" interpretation seems very natural for <|>.
So, seen from a sufficiently high level of abstraction, these operations do all "do the same thing". The type class is really for operating at that high level of abstraction where these things all "look the same". When I'm operating on a single known instance I just think of the <|> operation as exactly what it does for that specific type.
An interesting example of an Alternative that isn't a parser or a MonadPlus-like thing is Concurrently, a very useful type from the async package.
For Concurrently, empty is a computation that goes on forever. And (<|>) executes its arguments concurrently, returns the result of the first one that completes, and cancels the other one.
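A hedged sketch of that racing behavior, assuming the async package is installed:

import Control.Applicative ((<|>))
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (Concurrently (..))

-- Race two actions; the faster one wins and the slower is cancelled.
main :: IO ()
main = do
  r <- runConcurrently $
           Concurrently (threadDelay 100000 >> pure "fast")
       <|> Concurrently (threadDelay 500000 >> pure "slow")
  putStrLn r  -- expected: "fast"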
These seem very different, but consider:
Nothing <|> Nothing == Nothing
[] <|> [] == []
Just a <|> Nothing == Just a
[a] <|> [] == [a]
Nothing <|> Just b == Just b
[] <|> [b] == [b]
So... these are actually very, very similar, even if the implementation looks different. The only real difference is here:
Just a <|> Just b == Just a
[a] <|> [b] == [a, b]
A Maybe can only hold one value (or zero, but not any other amount). But hey, if they were both identical, why would you need two different types? The whole point of them being different is, you know, to be different.
In summary, the implementation may look totally different, but these are actually quite similar.

About value in context (applied in Monad)

I have a small question about value in context.
Take Just 'a', so the value in context of type Maybe in this case is 'a'
Take [3], so value in context of type [a] in this case is 3
And if you apply the monad for [3] like this: [3] >>= \x -> [x+3], it means you bind x to the value 3. That's OK.
But now, take [3,2]: what is the value in the context of type [a]? And it's strange that if you apply the monad for it like this:
[3,4] >>= \x -> x+3
It gets the correct answer [6,7], but we don't really understand what x is in this case. You can answer: ah, x is 3 and then 4, and x feeds the function 2 times, and the results are concatenated, as the Monad instance does with concat (map f xs), like this:
[3,4] >>= concat (map f x)
So in this case, [3,4] would be assigned to x. That seems wrong, because [3,4] is not a single value. So the monad seems wrong.
I think your problem is focusing too much on the values. A monad is a type constructor, and as such not concerned with how many and what kinds of values there are, but only the context.
A Maybe a can be an a, or nothing. Easy, and you correctly observed that.
An Either String a is either some a, or alternatively some information in form of a String (e.g. why the calculation of a failed).
Finally, [a] is an unknown number of as (or none at all), that may have resulted from an ambiguous computation, or one giving multiple results (like a quadratic equation).
Now, for the interpretation of (>>=), it is helpful to know that the essential property of a monad (how it is defined by category theorists) is
join :: m (m a) -> m a.
Together with fmap, (>>=) can be written in terms of join.
What join means is the following: A context, put in the same context again, still has the same resulting behavior (for this monad).
This is quite obvious for Maybe (Maybe a): Something can essentially be Just (Just x), or Nothing, or Just Nothing, which provides the same information as Nothing. So, instead of using Maybe (Maybe a), you could just have Maybe a and you wouldn't lose any information. That's what join does: it converts to the "easier" context.
[[a]] is somehow more difficult, but not much. You essentially have multiple/ambiguous results out of multiple/ambiguous results. A good example are the roots of a fourth-degree polynomial, found by solving a quadratic equation. You first get two solutions, and out of each you can find two others, resulting in four roots.
But the point is, it doesn't matter if you speak of an ambiguous ambiguous result, or just an ambiguous result. You could just always use the context "ambiguous", and transform multiple levels with join.
And here comes what (>>=) does for lists: it applies ambiguous functions to ambiguous values:
squareRoots :: Complex -> [Complex]
fourthRoots num = squareRoots num >>= squareRoots
can be rewritten as
fourthRoots num = join $ squareRoots `fmap` (squareRoots num)
-- [1,-1,i,-i] <- [[1,-1],[i,-i]] <- [1,-1] <- 1
since all you have to do is to find all possible results for each possible value.
This is why join is concat for lists, and in fact
m >>= f == join (fmap f m)
must hold in any monad.
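A quick runnable check of that law in the list monad (example of my own):

import Control.Monad (join)

f :: Int -> [Int]
f x = [x, x * 10]

main :: IO ()
main = print ([3,4] >>= f, join (fmap f [3,4]))
-- prints ([3,30,4,40],[3,30,4,40])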
A similar interpretation can be given to IO. A computation with side-effects, which can also have side-effects (IO (IO a)), is in essence just something with side-effects.
You have to take the word "context" quite broadly.
A common way of interpreting a list of values is that it represents an indeterminate value, so [3,4] represents a value which is three or four, but we don't know which (perhaps we just know it's a solution of x^2 - 7x + 12 = 0).
If we then bind that with \x -> [x+3], we know the result is 6 or 7, but we still don't know which.
Another example of an indeterminate value that you're more used to is 3. It could mean 3::Int or 3::Integer or even sometimes 3.0::Double. It feels easier because there's only one symbol representing the indeterminate value, whereas in a list, all the possibilities are listed (!).
If you write
asum = do
  x <- [10,20]
  y <- [1,2]
  return (x+y)
You'll get a list with four possible answers: [11,12,21,22]
That's one for each of the possible ways you could add x and y.
It is not the values that are in the context, it's the types.
Just 'a' :: Maybe Char --- Char is in a Maybe context.
[3, 2] :: [Int] --- Int is in a [] context.
Whether there is one, none or many of the a in the m a is beside the point.
Edit: Consider the type of (>>=) :: Monad m => m a -> (a -> m b) -> m b.
You give the example Just 3 >>= (\x->Just(4+x)). But consider Nothing >>= (\x->Just(4+x)). There is no value in the context. But the type is in the context all the same.
It doesn't make sense to think of x as necessarily being a single value. x has a single type. If we are dealing with the Identity monad, then x will be a single value, yes. If we are in the Maybe monad, x may be a single value, or it may never be a value at all. If we are in the list monad, x may be a single value, or not be a value at all, or be various different values... but what it is not is the list of all those different values.
Your other example --- [2, 3] >>= (\x -> x + 3) --- [2, 3] is not passed to the function. [2, 3] + 3 would have a type error. 2 is passed to the function. And so is 3. The function is invoked twice, gives results for both those inputs, and the results are combined by the >>= operator. [2, 3] is not passed to the function.
"context" is one of my favorite ways to think about monads. But you've got a slight misconception.
Take Just 'a', so the value in context of type Maybe in this case is 'a'
Not quite. You keep saying the value in context, but there is not always a value "inside" a context, or if there is, then it is not necessarily the only value. It all depends on which context we are talking about.
The Maybe context is the context of "nullability", or potential absence. There might be something there, or there might be Nothing. There is no value "inside" of Nothing. So the maybe context might have a value inside, or it might not. If I give you a Maybe Foo, then you cannot assume that there is a Foo. Rather, you must assume that it is a Foo inside the context where there might actually be Nothing instead. You might say that something of type Maybe Foo is a nullable Foo.
Take [3], so value in context of type [a] in this case is 3
Again, not quite right. A list represents a nondeterministic context. We're not quite sure what "the value" is supposed to be, or if there is one at all. In the case of a singleton list, such as [3], then yes, there is just one. But one way to think about the list [3,4] is as some unobservable value which we are not quite sure what it is, but we are certain that it is 3 or that it is 4. You might say that something of type [Foo] is a nondeterministic Foo.
[3,4] >>= \x -> x+3
This is a type error; not quite sure what you meant by this.
So in this case, [3,4] will be assigned to the x. It means wrong, because [3,4] is not a value. Monad is wrong.
You totally lost me here. Each instance of Monad has its own implementation of >>= which defines the context that it represents. For lists, the definition is
(xs >>= f) = (concat (map f xs))
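For example, applying that definition to the (corrected) expression from the question:

[3,4] >>= \x -> [x+3]
= concat (map (\x -> [x+3]) [3,4])
= concat [[6],[7]]
= [6,7]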
You may want to learn about Functor and Applicative operations, which are related to the idea of Monad, and might help clear some confusion.

Resources