Why Do We Need Maybe Monad Over Either Monad - haskell

I was playing around Maybe and Either monad types (Chaining, applying conditional functions according to returned value, also returning error message which chained function has failed etc.). So it seemes to me like we can achieve same and more things that Maybe does by using Either monad. So my question is where the practical or conceptual difference between those ?

You are of course right that Maybe a is isomorphic to Either Unit a. The thing is that they are often semantically used to denote different things, a bit like the difference between returning null and throwing a NoSuchElementException:
Nothing/None denotes the "expected" missing of something, while
Left e denotes an error in getting it, for whatever reason.
That said, we might even combine the two to something like:
query :: Either DBError (Maybe String)
where we express both the possibility of a missing value (a DB NULL) and an error in the connection, the DBMS, or whatever (not saying that there aren't better designs, but you get the point).
Sometimes, the border is fluid; for saveHead :: [a] -> Maybe a, we could say that the expected possibility of the error is encoded in the intent of the function, while something like saveDivide might be encoded as Float -> Float -> Either FPError Float or Float -> Float -> Maybe Float, depending on the use case (again, just some stupid examples...).
If in doubt, the best option is probably to use a custom result ADT with semantic encoding (like data QueryResult = Success String | Null | Failure DBError), and to prefer Maybe to simple cases where it is "traditionally expected" (a subjective point, which however will be mostly OK if you gain experience).

#phg's answer is great. I will chime in with something that helped clear it up for me when I was learning them:
Maybe is one (value) or none – ie, you have a value or you have nothing
Either is a logical disjunction, but you always have at least one (value) - ie, you have one or the other, but not both.
Maybe is great for things like where you may or may not have a value - for example looking for an item in a list. if the list contains it, we get (Just x) otherwise we get Nothing
Either is the perfect representation of a branch in your code - it's going to go one way or the other; Left or Right. We use a mnemonic to remember it: Right is the right (correct) way; Left is the wrong way (Error). This is not it's only use of course, but definitely the most common.
I know the differences might seem subtle at first, but really they're suitable for very different things.

Well, you see, we can put this to the extreme by saying that all product types can be represented by just 2-tuples and all non-recursive sum types by Either. To additionally represent recursive types, we need a fixpoint type.
For example, why have 4-tuples (a,b,c,d) when we could as well write (a, (b, (c,d))) or (((a,b), c), d) ?
Or why have lists, when the following works as well?
data Y f = Y (f (Y f))
type List a = Y ((,) (Either () a))
nil = Y (Left (), undefined)
cons a as = Y (Right a, as)
infixr 4 cons
numbers = 1 `cons` 2 `cons` 3 `cons` nil
-- this is like foldl
reduce f z (Y (Left (), _)) = z
reduce f z (Y (Right x, xs)) = reduce f (f z x) xs
total = reduce (+) 0 numbers

Related

Haskell taking in two list with a int and returning a tuple

I am trying to learn haskell and saw a exercise which says
Write two different Haskell functions having the same type:
[a] -> [b] -> Int -> (a,b)
So from my understanding the expressions should take in two lists, an int and return a tuple of the type of the lists.
What i tried so far was
together :: [a] -> [b] -> Int -> (a,b)
together [] [] 0 = (0,0)
together [b] [a] x = if x == a | b then (b,a) else (0,0)
I know I am way off but any help is appreciated!
First you need to make your mind up what the function should return. That is partly determined by the signature. But still you can come up with a lot of functions that return different things, but have the same signature.
Here one of the most straightforward functions is probably to return the elements that are placed on the index determined by the third parameter.
It makes no sense to return (0,0), since a and b are not per se numerical types. Furthermore if x == a | b is not semantically valid. You can write this as x == a || x == b, but this will not work, since a and b are not per se Ints.
We can implement a function that returns the heads of the two lists in case the index is 0. In case the index is negative, or at least one of the two lists is exhausted, then we can raise an error. I leave it as an exercise what to do in case the index is greater than 0:
together :: [a] -> [b] -> Int -> (a,b)
together [] _ = error "List exhausted"
together _ [] = error "List exhausted"
together (a:_) (b:_) 0 = (a, b)
together (a:_) (b:_) n | n < 0 = error "Negative index!"
| …
you thus still need to fill in the ….
I generally dislike those "write any function with this signature"-type excercises precisely because of how arbitrary they are. You're supposed to figure out a definition that would make sense for that particular signature and implement it. In a lot of cases, you can wing it by ignoring as many arguments as possible:
fa :: [a] -> [b] -> Int -> (a,b)
fa (a:_) (b:_) _ = (a,b)
fa _ _ _ = error "Unfortunately, this function can't be made total because lists can be empty"
The error here is the important bit to note. You attempted to go around that problem by returning 0s, but this will only work when 0 is a valid value for types of a and b. Your next idea could be some sort of a "Default" value, but not every type has such a concept. The key observation is that without any knowledge about a type, in order to produce a value from a function, you need to get this value from somewhere else first*.
If you actually wanted a more sensible definition, you'd need to think up a use for that Int parameter; maybe it's the nth element from each
list? With the help of take :: Int -> [a] -> [a] and head :: [a] -> a this should be doable as an excercise.
Again, your idea of comparing x with a won't work for all types; not every type is comparable with an Int. You might think that this would make generic functions awfully limited; that's the point where you typically learn about how to express certain expectations about the types you get, which will allow you to operate only on certain subsets of all possible types.
* That's also the reason why id :: a -> a has only one possible implementation.
Write two different Haskell functions having the same type:
[a] -> [b] -> Int -> (a,b)
As Willem and Bartek have pointed out, there's a lot of gibberish functions that have this type.
Bartek took the approach of picking two based on what the simplest functions with that type could look like. One was a function that did nothing but throw an error. And one was picking the first element of each list, hoping they were not empty and failing otherwise. This is a somewhat theoretical approach, since you probably don't ever want to use those functions in practice.
Willem took the approach of suggesting an actually useful function with that type and proceeded to explore how to exhaust the possible patterns of such a function: For lists, match the empty list [] and the non-empty list a:_, and for integers, match some stopping point, 0 and some categories n < 0 and ….
A question that arises to me is if there is any other equally useful function with this type signature, or if a second function would necessarily have to be hypothetically constructed. It would seem natural that the Int argument has some relation to the positions of elements in [a] and [b], since they are also integers, especially because a pair of single (a,b) is returned.
But the only remotely useful functions (in the sense of not being completely silly) that I can think of are small variations of this: For example, the Int could be the position from the end rather than from the beginning, or if there's not enough elements in one of the lists, it could default to the last element of a list rather than an error. Neither of these are very pleasing to make ("from the end" conflicts with the list being potentially infinite, and having a fall-back to the last element of a list conflicts with the fact that lists don't necessarily have a last element), so it is tempting to go with Bartek's approach of writing the simplest useless function as the second one.

"For all" statements in Haskell

I'm building comfort going through some Haskell toy problems and I've written the following speck of code
multipOf :: [a] -> (Int, a)
multipOf x = (length x, head x)
gmcompress x = (map multipOf).group $ x
which successfully preforms the following operation
gmcompress [1,1,1,1,2,2,2,3] = [(4,1),(3,2),(1,3)]
Now I want this function to instead of telling me that an element of the set had multiplicity 1, to just leave it alone. So to give the result [(4,1),(3,2),3] instead. It be great if there were a way to say (either during or after turning the list into one of pairs) for all elements of multiplicity 1, leave as just an element; else, pair. My initial, naive, thought was to do the following.
multipOf :: [a] -> (Int, a)
multipOf x = if length x = 1 then head x else (length x, head x)
gmcompress x = (map multipOf).group $ x
BUT this doesn't work. I think because the then and else clauses have different types, and unfortunately you can't piece-wise define the (co)domain of your functions. How might I go about getting past this issue?
BUT this doesn't work. I think because the then and else clauses have different types, and unfortunately you can't piece-wise define the (co)domain of your functions. How might I go about getting past this issue?
Your diagnosis is right; the then and else must have the same type. There's no "getting past this issue," strictly speaking. Whatever solution you adopt has to use same type in both branches of the conditional. One way would be to design a custom data type that encodes the possibilities that you want, and use that instead. Something like this would work:
-- | A 'Run' of #a# is either 'One' #a# or 'Many' of them (with the number
-- as an argument to the 'Many' constructor).
data Run a = One a | Many Int a
But to tell you the truth, I don't think this would really gain you anything. I'd stick to the (Int, a) encoding rather than going to this Run type.

What does Haskell's <|> operator do?

Going through Haskell's documentation is always a bit of a pain for me, because all the information you get about a function is often nothing more than just: f a -> f [a] which could mean any number of things.
As is the case of the <|> function.
All I'm given is this: (<|>) :: f a -> f a -> f a and that it's an "associative binary operation"...
Upon inspection of Control.Applicative I learn that it does seemingly unrelated things depending on implementation.
instance Alternative Maybe where
empty = Nothing
Nothing <|> r = r
l <|> _ = l
Ok, so it returns right if there is no left, otherwise it returns left, gotcha.. This leads me to believe it's a "left or right" operator, which kinda makes sense given its use of | and |'s historical use as "OR"
instance Alternative [] where
empty = []
(<|>) = (++)
Except here it just calls list's concatenation operator... Breaking my idea down...
So what exactly is that function? What's its use? Where does it fit in in the grand scheme of things?
Typically it means "choice" or "parallel" in that a <|> b is either a "choice" of a or b or a and b done in parallel. But let's back up.
Really, there is no practical meaning to operations in typeclasses like (<*>) or (<|>). These operations are given meaning in two ways: (1) via laws and (2) via instantiations. If we are not talking about a particular instance of Alternative then only (1) is available for intuiting meaning.
So "associative" means that a <|> (b <|> c) is the same as (a <|> b) <|> c. This is useful as it means that we only care about the sequence of things chained together with (<|>), not their "tree structure".
Other laws include identity with empty. In particular, a <|> empty = empty <|> a = a. In our intuition with "choice" or "parallel" these laws read as "a or (something impossible) must be a" or "a alongside (empty process) is just a". It indicates that empty is some kind of "failure mode" for an Alternative.
There are other laws with how (<|>)/empty interact with fmap (from Functor) or pure/(<*>) (from Applicative), but perhaps the best way to move forward in understanding the meaning of (<|>) is to examine a very common example of a type which instantiates Alternative: a Parser.
If x :: Parser A and y :: Parser B then (,) <$> x <*> y :: Parser (A, B) parses x and then y in sequence. In contrast, (fmap Left x) <|> (fmap Right y) parses either x or y, beginning with x, to try out both possible parses. In other words, it indicates a branch in your parse tree, a choice, or a parallel parsing universe.
(<|>) :: f a -> f a -> f a actually tells you quite a lot, even without considering the laws for Alternative.
It takes two f a values, and has to give one back. So it will have to combine or select from its inputs somehow. It's polymorphic in the type a, so it will be completely unable to inspect whatever values of type a might be inside an f a; this means it can't do the "combining" by combining a values, so it must to it purely in terms of whatever structure the type constructor f adds.
The name helps a bit too. Some sort of "OR" is indeed the vague concept the authors were trying to indicate with the name "Alternative" and the symbol "<|>".
Now if I've got two Maybe a values and I have to combine them, what can I do? If they're both Nothing I'll have to return Nothing, with no way to create an a. If at least one of them is a Just ... I can return one of my inputs as-is, or I can return Nothing. There are very few functions that are even possible with the type Maybe a -> Maybe a -> Maybe a, and for a class whose name is "Alternative" the one given is pretty reasonable and obvious.
How about combining two [a] values? There are more possible functions here, but really it's pretty obvious what this is likely to do. And the name "Alternative" does give you a good hint at what this is likely to be about provided you're familiar with the standard "nondeterminism" interpretation of the list monad/applicative; if you see a [a] as a "nondeterministic a" with a collection of possible values, then the obvious way for "combining two nondeterministic a values" in a way that might deserve the name "Alternative" is to produce a nondeterminstic a which could be any of the values from either of the inputs.
And for parsers; combining two parsers has two obvious broad interpretations that spring to mind; either you produce a parser that would match what the first does and then what the second does, or you produce a parser that matches either what the first does or what the second does (there are of course subtle details of each of these options that leave room for options). Given the name "Alternative", the "or" interpretation seems very natural for <|>.
So, seen from a sufficiently high level of abstraction, these operations do all "do the same thing". The type class is really for operating at that high level of abstraction where these things all "look the same". When I'm operating on a single known instance I just think of the <|> operation as exactly what it does for that specific type.
An interesting example of an Alternative that isn't a parser or a MonadPlus-like thing is Concurrently, a very useful type from the async package.
For Concurrently, empty is a computation that goes on forever. And (<|>) executes its arguments concurrently, returns the result of the first one that completes, and cancels the other one.
These seem very different, but consider:
Nothing <|> Nothing == Nothing
[] <|> [] == []
Just a <|> Nothing == Just a
[a] <|> [] == [a]
Nothing <|> Just b == Just b
[] <|> [b] == [b]
So... these are actually very, very similar, even if the implementation looks different. The only real difference is here:
Just a <|> Just b == Just a
[a] <|> [b] == [a, b]
A Maybe can only hold one value (or zero, but not any other amount). But hey, if they were both identical, why would you need two different types? The whole point of them being different is, you know, to be different.
In summary, the implementation may look totally different, but these are actually quite similar.

What are the benefits of currying?

I don't think I quite understand currying, since I'm unable to see any massive benefit it could provide. Perhaps someone could enlighten me with an example demonstrating why it is so useful. Does it truly have benefits and applications, or is it just an over-appreciated concept?
(There is a slight difference between currying and partial application, although they're closely related; since they're often mixed together, I'll deal with both terms.)
The place where I realized the benefits first was when I saw sliced operators:
incElems = map (+1)
--non-curried equivalent: incElems = (\elems -> map (\i -> (+) 1 i) elems)
IMO, this is totally easy to read. Now, if the type of (+) was (Int,Int) -> Int *, which is the uncurried version, it would (counter-intuitively) result in an error -- but curryied, it works as expected, and has type [Int] -> [Int].
You mentioned C# lambdas in a comment. In C#, you could have written incElems like so, given a function plus:
var incElems = xs => xs.Select(x => plus(1,x))
If you're used to point-free style, you'll see that the x here is redundant. Logically, that code could be reduced to
var incElems = xs => xs.Select(curry(plus)(1))
which is awful due to the lack of automatic partial application with C# lambdas. And that's the crucial point to decide where currying is actually useful: mostly when it happens implicitly. For me, map (+1) is the easiest to read, then comes .Select(x => plus(1,x)), and the version with curry should probably be avoided, if there is no really good reason.
Now, if readable, the benefits sum up to shorter, more readable and less cluttered code -- unless there is some abuse of point-free style done is with it (I do love (.).(.), but it is... special)
Also, lambda calculus would get impossible without using curried functions, since it has only one-valued (but therefor higher-order) functions.
* Of course it actually in Num, but it's more readable like this for the moment.
Update: how currying actually works.
Look at the type of plus in C#:
int plus(int a, int b) {..}
You have to give it a tuple of values -- not in C# terms, but mathematically spoken; you can't just leave out the second value. In haskell terms, that's
plus :: (Int,Int) -> Int,
which could be used like
incElem = map (\x -> plus (1, x)) -- equal to .Select (x => plus (1, x))
That's way too much characters to type. Suppose you'd want to do this more often in the future. Here's a little helper:
curry f = \x -> (\y -> f (x,y))
plus' = curry plus
which gives
incElem = map (plus' 1)
Let's apply this to a concrete value.
incElem [1]
= (map (plus' 1)) [1]
= [plus' 1 1]
= [(curry plus) 1 1]
= [(\x -> (\y -> plus (x,y))) 1 1]
= [plus (1,1)]
= [2]
Here you can see curry at work. It turns a standard haskell style function application (plus' 1 1) into a call to a "tupled" function -- or, viewed at a higher level, transforms the "tupled" into the "untupled" version.
Fortunately, most of the time, you don't have to worry about this, as there is automatic partial application.
It's not the best thing since sliced bread, but if you're using lambdas anyway, it's easier to use higher-order functions without using lambda syntax. Compare:
map (max 4) [0,6,9,3] --[4,6,9,4]
map (\i -> max 4 i) [0,6,9,3] --[4,6,9,4]
These kinds of constructs come up often enough when you're using functional programming, that it's a nice shortcut to have and lets you think about the problem from a slightly higher level--you're mapping against the "max 4" function, not some random function that happens to be defined as (\i -> max 4 i). It lets you start to think in higher levels of indirection more easily:
let numOr4 = map $ max 4
let numOr4' = (\xs -> map (\i -> max 4 i) xs)
numOr4 [0,6,9,3] --ends up being [4,6,9,4] either way;
--which do you think is easier to understand?
That said, it's not a panacea; sometimes your function's parameters will be the wrong order for what you're trying to do with currying, so you'll have to resort to a lambda anyway. However, once you get used to this style, you start to learn how to design your functions to work well with it, and once those neurons starts to connect inside your brain, previously complicated constructs can start to seem obvious in comparison.
One benefit of currying is that it allows partial application of functions without the need of any special syntax/operator. A simple example:
mapLength = map length
mapLength ["ab", "cde", "f"]
>>> [2, 3, 1]
mapLength ["x", "yz", "www"]
>>> [1, 2, 3]
map :: (a -> b) -> [a] -> [b]
length :: [a] -> Int
mapLength :: [[a]] -> [Int]
The map function can be considered to have type (a -> b) -> ([a] -> [b]) because of currying, so when length is applied as its first argument, it yields the function mapLength of type [[a]] -> [Int].
Currying has the convenience features mentioned in other answers, but it also often serves to simplify reasoning about the language or to implement some code much easier than it could be otherwise. For example, currying means that any function at all has a type that's compatible with a ->b. If you write some code whose type involves a -> b, that code can be made work with any function at all, no matter how many arguments it takes.
The best known example of this is the Applicative class:
class Functor f => Applicative f where
pure :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b
And an example use:
-- All possible products of numbers taken from [1..5] and [1..10]
example = pure (*) <*> [1..5] <*> [1..10]
In this context, pure and <*> adapt any function of type a -> b to work with lists of type [a]. Because of partial application, this means you can also adapt functions of type a -> b -> c to work with [a] and [b], or a -> b -> c -> d with [a], [b] and [c], and so on.
The reason this works is because a -> b -> c is the same thing as a -> (b -> c):
(+) :: Num a => a -> a -> a
pure (+) :: (Applicative f, Num a) => f (a -> a -> a)
[1..5], [1..10] :: Num a => [a]
pure (+) <*> [1..5] :: Num a => [a -> a]
pure (+) <*> [1..5] <*> [1..10] :: Num a => [a]
Another, different use of currying is that Haskell allows you to partially apply type constructors. E.g., if you have this type:
data Foo a b = Foo a b
...it actually makes sense to write Foo a in many contexts, for example:
instance Functor (Foo a) where
fmap f (Foo a b) = Foo a (f b)
I.e., Foo is a two-parameter type constructor with kind * -> * -> *; Foo a, the partial application of Foo to just one type, is a type constructor with kind * -> *. Functor is a type class that can only be instantiated for type constrcutors of kind * -> *. Since Foo a is of this kind, you can make a Functor instance for it.
The "no-currying" form of partial application works like this:
We have a function f : (A ✕ B) → C
We'd like to apply it partially to some a : A
To do this, we build a closure out of a and f (we don't evaluate f at all, for the time being)
Then some time later, we receive the second argument b : B
Now that we have both the A and B argument, we can evaluate f in its original form...
So we recall a from the closure, and evaluate f(a,b).
A bit complicated, isn't it?
When f is curried in the first place, it's rather simpler:
We have a function f : A → B → C
We'd like to apply it partially to some a : A – which we can just do: f a
Then some time later, we receive the second argument b : B
We apply the already evaluated f a to b.
So far so nice, but more important than being simple, this also gives us extra possibilities for implementing our function: we may be able to do some calculations as soon as the a argument is received, and these calculations won't need to be done later, even if the function is evaluated with multiple different b arguments!
To give an example, consider this audio filter, an infinite impulse response filter. It works like this: for each audio sample, you feed an "accumulator function" (f) with some state parameter (in this case, a simple number, 0 at the beginning) and the audio sample. The function then does some magic, and spits out the new internal state1 and the output sample.
Now here's the crucial bit – what kind of magic the function does depends on the coefficient2 λ, which is not quite a constant: it depends both on what cutoff frequency we'd like the filter to have (this governs "how the filter will sound") and on what sample rate we're processing in. Unfortunately, the calculation of λ is a bit more complicated (lp1stCoeff $ 2*pi * (νᵥ ~*% δs) than the rest of the magic, so we wouldn't like having to do this for every single sample, all over again. Quite annoying, because νᵥ and δs are almost constant: they change very seldom, certainly not at each audio sample.
But currying saves the day! We simply calculate λ as soon as we have the necessary parameters. Then, at each of the many many audio samples to come, we only need to perform the remaining, very easy magic: yⱼ = yⱼ₁ + λ ⋅ (xⱼ - yⱼ₁). So we're being efficient, and still keeping a nice safe referentially transparent purely-functional interface.
1 Note that this kind of state-passing can generally be done more nicely with the State or ST monad, that's just not particularly beneficial in this example
2 Yes, this is a lambda symbol. I hope I'm not confusing anybody – fortunately, in Haskell it's clear that lambda functions are written with \, not with λ.
It's somewhat dubious to ask what the benefits of currying are without specifying the context in which you're asking the question:
In some cases, like functional languages, currying will merely be seen as something that has a more local change, where you could replace things with explicit tupled domains. However, this isn't to say that currying is useless in these languages. In some sense, programming with curried functions make you "feel" like you're programming in a more functional style, because you more typically face situations where you're dealing with higher order functions. Certainly, most of the time, you will "fill in" all of the arguments to a function, but in the cases where you want to use the function in its partially applied form, this is a bit simpler to do in curried form. We typically tell our beginning programmers to use this when learning a functional language just because it feels like better style and reminds them they're programming in more than just C. Having things like curry and uncurry also help for certain conveniences within functional programming languages too, I can think of arrows within Haskell as a specific example of where you would use curry and uncurry a bit to apply things to different pieces of an arrow, etc...
In some cases, you want to think about more than functional programs, you can present currying / uncurrying as a way to state the elimination and introduction rules for and in constructive logic, which provides a connection to a more elegant motivation for why it exists.
In some cases, for example, in Coq, using curried functions versus tupled functions can produce different induction schemes, which may be easier or harder to work with, depending on your applications.
I used to think that currying was simple syntax sugar that saves you a bit of typing. For example, instead of writing
(\ x -> x + 1)
I can merely write
(+1)
The latter is instantly more readable, and less typing to boot.
So if it's just a convenient short cut, why all the fuss?
Well, it turns out that because function types are curried, you can write code which is polymorphic in the number of arguments a function has.
For example, the QuickCheck framework lets you test functions by feeding them randomly-generated test data. It works on any function who's input type can be auto-generated. But, because of currying, the authors were able to rig it so this works with any number of arguments. Were functions not curried, there would be a different testing function for each number of arguments - and that would just be tedious.

Why does the Applicative instance for Maybe give Nothing when function is Nothing in <*>

I am a beginner with haskell and am reading the Learn you a haskell book. I have been trying to digest functors and applicative functors for a while now.
In the applicative functors topic, the instance implementation for Maybe is given as
instance Applicative Maybe where
pure = Just
Nothing <*> _ = Nothing
(Just f) <*> something = fmap f something
So, as I understand it, we get Nothing if the left side functor (for <*>) is Nothing. To me, it seems to make more sense as
Nothing <*> something = something
So that this applicative functor has no effect. What is the usecase, if any for giving out Nothing?
Say, I have a Maybe String with me, whose value I don't know. I have to give this Maybe to a third party function, but want its result to go through a few Maybe (a -> b)'s first. If some of these functions are Nothing I'll want them to silently return their input, not give out a Nothing, which is loss of data.
So, what is the thinking behind returning Nothing in the above instance?
How would that work? Here's the type signature:
(<*>) :: Applicative f => f (a -> b) -> f a -> f b
So the second argument here would be of type Maybe a, while the result needs to be of type Maybe b. You need some way to turn a into b, which you can only do if the first argument isn't Nothing.
The only way something like this would work is if you have one or more values of type Maybe (a -> a) and want to apply any that aren't Nothing. But that's much too specific for the general definition of (<*>).
Edit: Since it seems to be the Maybe (a -> a) scenario you actually care about, here's a couple examples of what you can do with a bunch of values of that type:
Keeping all the functions and discard the Nothings, then apply them:
applyJust :: [Maybe (a -> a)] -> a -> a
applyJust = foldr (.) id . catMaybes
The catMaybes function gives you a list containing only the Just values, then the foldr composes them all together, starting from the identity function (which is what you'll get if there are no functions to apply).
Alternatively, you can take functions until finding a Nothing, then bail out:
applyWhileJust :: [Maybe (a -> a)] -> a -> a
applyWhileJust (Just f:fs) = f . applyWhileJust fs
applyWhileJust (Nothing:_) = id
This uses a similar idea as the above, except that when it finds Nothing it ignores the rest of the list. If you like, you can also write it as applyWhileJust = foldr (maybe (const id) (.)) id but that's a little harder to read...
Think of the <*> as the normal * operator. a * 0 == 0, right? It doesn't matter what a is. So using the same logic, Just (const a) <*> Nothing == Nothing. The Applicative laws dictate that a data type has to behave like this.
The reason why this is useful, is that Maybe is supposed to represent the presence of something, not the absence of something. If you pipeline a Maybe value through a chain of functions, if one function fails, it means that a failure happened, and that the process needs to be aborted.
The behavior you propose is impractical, because there are numerous problems with it:
If a failed function is to return its input, it has to have type a -> a, because the returned value and the input value have to have the same type for them to be interchangeable depending on the outcome of the function
According to your logic, what happens if you have Just (const 2) <*> Just 5? How can the behavior in this case be made consistent with the Nothing case?
See also the Applicative laws.
EDIT: fixed code typos, and again
Well what about this?
Just id <*> Just something
The usecase for Nothing comes when you start using <*> to plumb through functions with multiple inputs.
(-) <$> readInt "foo" <*> readInt "3"
Assuming you have a function readInt :: String -> Maybe Int, this will turn into:
(-) <$> Nothing <*> Just 3
<$> is just fmap, and fmap f Nothing is Nothing, so it reduces to:
Nothing <*> Just 3
Can you see now why this should produce Nothing? The original meaning of the expression was to subtract two numbers, but since we failed to produce a partially-applied function after the first input, we need to propagate that failure instead of just making up a nice function that has nothing to do with subtraction.
Additional to C. A. McCann's excellent answer I'd like to point out that this might be a case of a "theorem for free", see http://ttic.uchicago.edu/~dreyer/course/papers/wadler.pdf . The gist of this paper is that for some polymorphic functions there is only one possible implementiation for a given type signature, e.g. fst :: (a,b) -> a has no other choice than returning the first element of the pair (or be undefined), and this can be proven. This property may seem counter-intuitive but is rooted in the very limited information a function has about its polymorphic arguments (especially it can't create one out of thin air).

Resources