ApplicativeDo in Haskell

AFAIK one of the new additions to GHC 8 is the ApplicativeDo language extension, which desugars do-notation to the corresponding Applicative methods (<$>, <*>) where possible. I have the following questions.
How does it decide whether desugaring to Applicative methods is possible? From what I know, it does a dependency check (whether a later statement depends on the result of an earlier one) to decide eligibility. Are there any other criteria?
This addition presumably makes applicative code easier to read for types that don't have a Monad instance. But for structures that have both a Monad and an Applicative instance: is this a recommended practice (from a readability perspective)? Are there any other benefits?

How does it decide whether desugaring to Applicative methods is possible? From what I know, it does a dependency check (whether a later statement depends on the result of an earlier one) to decide eligibility. Are there any other criteria?
The paper and the GHC Trac page are the best sources of information. Some points:
ApplicativeDo uses applicatives wherever possible - including mixing them with >>= in some cases where only part of the do block is applicative
A sort of directed graph of dependencies is constructed to see which parts can be "parallelized"
Sometimes there is no obvious best translation of this graph! In this case, GHC chooses the translation that is (roughly speaking) locally smallest
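For instance, here is a minimal sketch of what the extension enables. Neither statement uses the result of the other, so GHC desugars the block to Applicative combinators, and only an Applicative constraint is needed:

{-# LANGUAGE ApplicativeDo #-}

-- Independent statements: GHC desugars this block to (+) <$> fx <*> fy
addBoth :: Applicative f => f Int -> f Int -> f Int
addBoth fx fy = do
  x <- fx
  y <- fy
  pure (x + y)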
This addition presumably makes applicative code easier to read for types that don't have a Monad instance. But for structures that have both a Monad and an Applicative instance: is this a recommended practice (from a readability perspective)? Are there any other benefits?
Here is an answer about the difference between Applicative and Monad. To quote directly:
To deploy <*>, you choose two computations, one of a function, the other of an argument, then their values are combined by application. To deploy >>=, you choose one computation, and you explain how you will make use of its resulting values to choose the next computation. It is the difference between "batch mode" and "interactive" operation.
A prime example of this sort of thing is the Haxl monad (developed at Facebook), which is all about fetching data from some external source. Using Applicative, these requests can happen in parallel, while Monad forces the requests to be sequential. In fact, this example is what motivated Simon Marlow at Facebook to create the ApplicativeDo extension in the first place and to write the paper mentioned above.
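Not Haxl itself, but the same idea can be seen with the async package's Concurrently type, whose <*> runs both sides in parallel (the file names below are hypothetical):

{-# LANGUAGE ApplicativeDo #-}
import Control.Concurrent.Async (Concurrently(..))

-- Concurrently has an Applicative instance but no Monad instance, so this
-- do block only compiles because ApplicativeDo desugars the two independent
-- statements to (,) <$> ... <*> ..., which runs the two reads in parallel.
fetchBoth :: Concurrently (String, String)
fetchBoth = do
  a <- Concurrently (readFile "a.txt")
  b <- Concurrently (readFile "b.txt")
  pure (a, b)

main :: IO ()
main = runConcurrently fetchBoth >>= print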
In general, most Monad instances don't necessarily benefit from Applicative. From the same answer I quoted above:
I appreciate that ApplicativeDo is a great way to make more applicative (and in some cases that means faster) programs that were written in monadic style that you haven't the time to refactor. But otherwise, I'd argue applicative-when-you-can-but-monadic-when-you-must is also the better way to see what's going on.
So: use Applicative over Monad when possible, and exploit ApplicativeDo when it really is nicer to write (like it must have been in some cases at Facebook) than the corresponding applicative expression.

Related

What purpose does the complexity of `Except` serve in Haskell?

I understand (I think) that there is a close relationship between Either and Except in Haskell, and that it is easy to convert from one to the other. But I'm a bit confused about best practices for handling errors in Haskell and under what circumstances and scenarios I would choose one over the other. For example, in the example provided in Control.Monad.Except, Either is used in the definition
type LengthMonad = Either LengthError
so that calculateLength "abc" is
Right 3
If instead one were to define
type LengthMonad = Except LengthError
then calculateLength "abc" would be
ExceptT (Identity (Right 3))
I'm confused about what purpose this would serve and when one would want it. Why does everything returned from calculateLength always have Identity; why not just SomeExceptionType (Right 3) or even SomeSuccessType 3?
I'm a Haskell beginner when it comes to concepts like this, so a concrete example of when I'd want the latter over the former would be much appreciated, especially why it's so (apparently to me) complex. For example, what can a caller of a function that uses the Except version of calculateLength do, that they can't (or at least can't as easily) do with the Either version?
Abstract
Use Either for normal success/error APIs. It's defined in the base library, so it doesn't push other dependencies on a consumer. Also, this is one of the most basic Haskell types, so 'everyone' understands how it works.
Only use ExceptT if you specifically need to combine Either with another monad (such as IO). This type is defined in the transformers library, so it pushes an extra dependency on consumers. Additionally, monad transformers are a more advanced feature of Haskell, so you can't expect everyone to understand how to use them.
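To make the ExceptT case concrete, here is a minimal sketch (the file name and error message are invented) of combining Either-style failure with IO in one do block:

import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)

-- ExceptT layered over IO: IO actions are lifted in, and throwE
-- short-circuits the rest of the block.
readPositive :: FilePath -> ExceptT String IO Int
readPositive path = do
  contents <- lift (readFile path)
  case reads contents of
    [(n, _)] | n > 0 -> pure n
    _                -> throwE "expected a positive number"

-- runExceptT (readPositive "n.txt") :: IO (Either String Int)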
Speculation on reasons
I wasn't around when those decisions were made, but it seems that there are various historical reasons for the confusion. Haskell is an old language (older than Java!), so even though efforts have been made to streamline it and rectify old mistakes, some still remain. As far as I can tell, the Either/ExceptT confusion is one of those situations.
I'm speculating that Either is older than the concept of monad transformers, so I imagine that the type Either was introduced to the base library early in the history of Haskell.
The same thing seems to be the case with Maybe.
Other monads, like e.g. Reader and State, seem to have been introduced (or at least 'retconned') together with their monad transformers. For example, Reader is just a special case of ReaderT, where the 'other' Monad is Identity:
type Reader r = ReaderT r Identity
The same goes for StateT:
type State s = StateT s Identity
That's the general pattern for many of the monads defined in the transformers library. ExceptT just follows the pattern by defining Except as the special case of ExceptT.
There are exceptions to that pattern. For example, MaybeT doesn't define Maybe as a special case. Again, I believe that this is for historical reasons; Maybe was probably around long before anyone started work on the transformers library.
The story about Either seems even more convoluted. As far as I can tell, there was, originally, an EitherT monad transformer, but apparently (I forget the details) there was something wrong with the way that it behaved (it probably broke some laws), so it was replaced with another transformer called ErrorT, which again turned out to be wrong. Third time's the charm, I suppose, so ExceptT was introduced.
The Control.Monad.Trans.Except module follows the pattern of most other monad transformers by defining the 'uneffectful' special case using a type alias:
type Except e = ExceptT e Identity
I suppose it does that because it can, but it may be unfortunate, because it's confusing. There's definitely prior art that suggests that a monad transformer doesn't have to follow that pattern (e.g. MaybeT), so I think it would have been better if the module hadn't done that, but it does, and that's where we are.
I would essentially ignore the Except type and use Either instead, but use ExceptT if a transformer is required.
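For completeness, here is a sketch of how the Identity layer disappears in practice, using the calculateLength example from the question (with String as the error type for simplicity): runExcept unwraps Except e a straight back to Either e a.

import Control.Monad.Trans.Except (Except, runExcept, throwE)

calculateLength :: String -> Except String Int
calculateLength s
  | null s    = throwE "empty string"
  | otherwise = pure (length s)

-- runExcept (calculateLength "abc") == Right 3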

What is a (composite) effect (possibly represented as a Monad + monad transformers)? Precise, clear, short answer/definition?

Time and again I read the term effectful, but I am still unable to give a clear definition of what it means. I assume the correct context is effectful computations, but I've also seen the term effectful values.
I used to think that effectful means having side effects. But in Haskell there are no side-effects (except to some extent IO). Still there are effectful computations all over the place.
Then I read that monads are used to create effectful computations. I can somewhat understand this in the context of the State monad. But I fail to see any side-effect in the Maybe monad. In general it seems to me that Monads which wrap a function-like thing are easier to see as producing side-effects than Monads which just wrap a value.
When it comes to Applicative functors I am even more lost. I always saw applicative functors as a way to map a function with more than one argument. I cannot see any side-effect here. Or is there a difference between effectful and with effects?
A side effect of a function is an observable interaction with its environment (apart from computing its result value). In Haskell, we try hard to avoid functions with such side effects. This even applies to IO actions: when an IO action is evaluated, no side effects are performed; they occur only when the actions prescribed in the IO value are executed within main.
However, when working with abstractions that are related to composing computations, such as applicative functors and monads, it's convenient to somewhat distinguish between the actual value and the "rest", which we often call an "effect". In particular, if we have a type f of kind * -> *, then in f a the a part is "the value" and whatever "remains" is "the effect".
I intentionally quoted the terms, as there is no precise definition (as far as I know), it's merely a colloquial definition. In some cases there are no values at all, or multiple values. For example for Maybe the "effect" is that there might be no value (and the computation is aborted), for [] the "effect" is that there are multiple (or zero) values. For more complex types this distinction can be even more difficult.
The distinction between "effects" and "values" doesn't really depend on the abstraction. Functor, Applicative and Monad just give us tools for working with them (Functors allow us to modify the values inside, Applicatives allow us to combine effects, and Monads allow effects to depend on previous values). But in the context of Monads, it's somewhat easier to create a mental picture of what is going on, because a monadic action can "see" the result value of the previous computation, as witnessed by the
(>>=) :: m a -> (a -> m b) -> m b
operator: The second function receives a value of type a, so we can imagine "the previous computation had some effect and now there is its result value with which we can do something".
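A small sketch of this for Maybe, where the "effect" is possible absence:

safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- The function passed to (>>=) only runs if a value was produced:
-- safeDiv 10 2 >>= safeDiv 100  ==  Just 20
-- safeDiv 10 0 >>= safeDiv 100  ==  Nothing (the second step never runs)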
In support of Petr Pudlák's answer, here is an argument concerning the origin of the broader notion of "effect" espoused there.
The phrase "effectful programming" shows up in the abstract of McBride and Patterson's Applicative Programming with Effects, the paper which introduced applicative functors:
In this paper, we introduce Applicative functors — an abstract characterisation of an applicative style of effectful programming, weaker than Monads and hence more widespread.
"Effect" and "effectful" appear in a handful of other passages of the paper; these ocurrences are deemed unremarkable enough not to require an explicit clarification. For instance, this remark is made just after the definition of Applicative is presented (p. 3):
In each example, there is a type constructor f that embeds the usual notion of value, but supports its own peculiar way of giving meaning to the usual applicative language [...] We correspondingly introduce the Applicative class:
[A Haskell definition of Applicative]
This class generalises S and K [i.e. the S and K combinators, which show up in the Reader/function Applicative instance] from threading an environment to threading an effect in general.
From these quotes, we can infer that, in this context:
Effects are the things that Applicative threads "in general".
Effects are associated with the type constructors that are given Applicative instances.
Monad also deals with effects.
Following these leads, we can trace this usage of "effect" back to at least Wadler's papers on monads. For instance, here is a quote from page 6 of Monads for functional programming:
In general, a function of type a → b is replaced by a function of type a → M b. This can be read as a function that accepts an argument of type a and returns a result of type b, with a possible additional effect captured by M. This effect may be to act on state, generate output, raise an exception, or what have you.
And from the same paper, page 21:
If monads encapsulate effects and lists form a monad, do lists correspond to some effect? Indeed they do, and the effect they correspond to is choice. One can think of a computation of type [a] as offering a choice of values, one for each element of the list. The monadic equivalent of a function of type a → b is a function of type a → [b].
The "correspond to some effect" turn of phrase here is key. It ties back to the more straightforward claim in the abstract:
Monads provide a convenient framework for simulating effects found in other languages, such as global state, exception handling, output, or non-determinism.
The pitch is that monads can be used to express things that, in "other languages", are typically encoded as side-effects -- that is, as Petr Pudlák puts it in his answer here, "an observable interaction with [a function's] environment (apart from computing its result value)". Through metonymy, that has readily led to "effect" acquiring a second meaning, broader than that of "side-effect" -- namely, whatever is introduced through a type constructor which is a Monad instance. Over time, this meaning was further generalised to cover other functor classes such as Applicative, as seen in McBride and Paterson's work.
In summary, I consider "effect" to have two reasonable meanings in Haskell parlance:
A "literal" or "absolute" one: an effect is a side-effect; and
A "generalised" or "relative" one: an effect is a functorial context.
On occasion, avoidable disagreements over terminology happen when each of the involved parties implicitly assumes a different meaning of "effect". Another possible point of contention involves whether it is legitimate to speak of effects when dealing with Functor alone, as opposed to subclasses such as Applicative or Monad (I believe it is okay to do so, in agreement with Petr Pudlák's answer to Why can applicative functors have side effects, but functors can't?).
To my mind, a "side effect" is anything that a normal function couldn't do. In other words, anything in addition to just returning a value.
Consider the following code block:
let y = foo x
    z = bar y
in  foobar z
This calls foo, and then calls bar, and then calls foobar, three ordinary functions. Simple enough, right? Now consider this:
do
  y <- foo x
  z <- bar y
  foobar z
This also calls three functions, but it also invisibly calls (>>=) between each pair of lines. And that means that some strange things happen, depending on what type of monad the functions are running in.
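Concretely, the do block above is shorthand for explicit (>>=) calls:

foo x >>= \y -> bar y >>= \z -> foobar z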
If this is the identity monad, nothing special happens. The monadic version does exactly the same thing as the pure version. There are no side-effects.
If each function returns a Maybe-something, then if (say) bar returns Nothing, the entire code block aborts. A normal function can't do that. (I.e., in the pure version, there is no way to prevent foobar being called.) So this version does something that the pure version cannot. Each function can return a value or abort the block. That's a side-effect.
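A minimal sketch of that Maybe behaviour (the function bodies are invented):

example :: Int -> Maybe Int
example x = do
  y <- Just (x + 1)                              -- plays the role of foo
  z <- if even y then Just (y * 2) else Nothing  -- bar: may abort the block
  Just (z - 1)                                   -- foobar: skipped on abort

-- example 1 == Just 3, but example 2 == Nothing: foobar never ran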
If each function returns a list-of-something, then the code executes for all possible combinations of results. Again, in the pure version, there is no way to make any of the functions execute multiple times with different arguments. So that's a side-effect.
If each function runs in a state monad, then (for example) foo can send some data directly to foobar, in addition to the value you can see being passed through bar. Again, you can't do that with pure functions, so that's a side-effect.
In the IO monad, you have all sorts of interesting effects. You can save files to disk (a file is basically a giant global variable), and you can even affect code running on other computers (we call this network I/O).
The ST monad is a cut-down version of the IO monad. It allows mutable state, but self-contained computations cannot influence each other.
The STM monad lets multiple threads talk to each other, and may cause the code to execute multiple times, and... well, you can't do any of this with normal functions.
The continuation monad allows you to break people's minds! Arguably that is possible with pure functions...
"Effect is a very vague term and that is ok because we are trying to talk about something that is outside the language. Effect and side effect are not the same thing. Effects are good. Side effects are bugs.
Their lexical similarity is really unfortunate because it leads to a lot of people conflating these ideas when they read about them and people using one instead of the other so it leads to a lot of confusion."
see here for more: https://www.slideshare.net/pjschwarz/rob-norrisfunctionalprogrammingwitheffects
Functional programming context
Effect generally means the stuff (behaviour, additional logic) that is implemented in Applicative/Monad instances.
Also, it can be said that a simple value is extended with additional behaviour.
For example,
Option models the effects of optionality
or
Option is a monad that models the effect of optionality (of being something optional)

Why do all Haskell typeclasses have laws?

All the typeclasses in Typeclassopedia have associated laws, such as associativity or commutativity for certain operators. The definition of a "law" seems to be a constraint that cannot be expressed in the type system. I certainly understand why you want to have, say, monad laws, but is there a fundamental reason why a typeclass that can be expressed fully within the type system is pointless?
You will notice that the laws are almost always algebraic laws. They could be expressed in the type system by using some extensions, but the proofs would be cumbersome to express. So you have unchecked laws, and implementations might potentially break them. Why is this good?
The reason is that the design patterns used in Haskell are motivated (and in most cases mirrored) by mathematical structures, usually from abstract algebra. While most other languages have an intuitive notion of certain features like safety, performance and semantics, we Haskell programmers prefer to establish a formal notion. The advantage of doing this is: Once your types and functions obey the safety laws, they are safe in the sense of the underlying algebraic structure. They are provably safe.
Take functors as an example. A Haskell functor has the following two laws:
fmap f . fmap g = fmap (f . g)
fmap id = id
Firstly this is very important: functions in Haskell are opaque. You cannot examine, compare or otherwise inspect them. While this sounds like a bad thing, it is actually a very good thing. The fmap function cannot examine the function you've passed it. In particular it can't check whether you've passed the identity function or a composition. In short: it can't cheat! The only way for it to obey these two laws is to not introduce any effects of its own. That means that in a proper functor fmap will never do anything unexpected. In fact it cannot do anything other than map the given function. This is a very simple example and I haven't explained all the subtleties of why fmap can't cheat, but it demonstrates the point.
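To illustrate with an invented type: the only way to break the laws is to smuggle in an effect, and the laws detect exactly that.

data Counted a = Counted Int a deriving Show

-- An unlawful instance: it sneaks in an effect of its own (counting).
instance Functor Counted where
  fmap f (Counted n x) = Counted (n + 1) (f x)

-- fmap id (Counted 0 'a') gives Counted 1 'a' instead of Counted 0 'a',
-- and fmap f . fmap g counts twice while fmap (f . g) counts once:
-- both laws are broken.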
Now extend this all over the language, the base libraries and most sensible third party libraries. This gives you a language that is as predictable as a language can get. When you write code, you know what it's going to do. That's one of the main reasons why Haskell code often works out of the box. I often write pages of Haskell code before compiling. Once my type errors are fixed, my program usually works.
The other reason why this is desirable is that it allows a more compositional style of programming. This is particularly useful when working as a team. First you map your application to algebraic structures and establish the necessary laws. For example: You express what it means for something to be a Valid Web Server. In particular you establish a formal notion of web server composition. If you compose two Valid Web Servers, the result is a Valid Web Server. Do you see where this is going? After establishing these laws the teammates go to work, and they work in isolation. Little to no communication is necessary to get their job done. When they meet again, everybody presents their Valid Web Servers and they just compose them to make the final product, a web site. Since the individual components were all Valid Web Servers, the final result must be a Valid Web Server. Provably.
Yes and no. For instance the Show class does not have any laws associated with it, and it is certainly useful.
However, typeclasses express interfaces. An interface needs to satisfy more than being just a bunch of functions - you want these functions to fulfill a specification. The specification is normally more complicated than what can be expressed in Haskell's type system. For example, take the Eq class. It only needs to provide us with a function, the type of which has to be a -> a -> Bool. That's the most that Haskell's type system will allow us to require from an instance of an Eq type. However, we would normally expect more from this function - you would probably want it to be an equivalence relation (reflexive, symmetric and transitive). So then you state these requirements as separate "laws".
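The laws can't be enforced by the compiler, but they can at least be tested. Here is a sketch of the Eq laws as QuickCheck properties (checked at Int for concreteness):

import Test.QuickCheck (Property, (==>))

prop_reflexive :: Int -> Bool
prop_reflexive x = x == x

prop_symmetric :: Int -> Int -> Bool
prop_symmetric x y = (x == y) == (y == x)

prop_transitive :: Int -> Int -> Int -> Property
prop_transitive x y z = (x == y && y == z) ==> (x == z)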
A typeclass doesn't need to have laws, but it often will be more useful if it has them. Many typeclasses are expected to function in a certain way, the laws codify user expectations. The laws let users make assumptions about the way that an instance of a typeclass will work. If you break the typeclass laws, you don't get arrested by the Haskell police, you just end up with confused users.

What functionality do you get for free with Functors or other type-classes?

I read an article which said:
Providing instances for the many standard type-classes [Functors] will immediately give you a lot of functionality for practically free
My question is: what is this functionality that you get for free (for functors or other type-classes)? I know what the definition of a functor is, but what do I get for free by defining something as a functor/other type-class. Something other than a prettier syntax. Ideally this would be general and useful functions that operate on functors/other type-classes.
My imagination (could be wrong) of what free means is functions of this sort: TypeClass x => useful x y = ..
== Edit/Addition ==
I guess I'm mainly asking about the more abstract (and brain-boggling) type-classes, like the ones in this image. For less abstract classes like Ord, my object-oriented intuition is enough.
Functors are simple and probably not the best example. Let's look at Monads instead:
liftM - if something is a Monad, it is also a Functor where liftM is fmap.
>=>, <=<: you can compose a -> m b functions for free where m is your monad.
foldM, mapM, filterM... you get a bunch of utility functions that generalize existing functions to use your monad.
when, guard* and unless -- you also get some control functions for free.
join -- this is actually fairly fundamental to the definition of a monad, but you don't need to define it in Haskell since you've defined >>=.
transformers -- ErrorT and stuff. You can bolt error handling onto your new type, for free (give or take)!
Basically, you get a wide variety of standard functions "lifted" to use your new type as soon as you make it a Monad instance. It also becomes trivial (but alas not automatic) to make it a Functor and Applicative as well.
However, these are all "symptoms" of a more general idea. You can write interesting, nontrivial code that applies to all monads. You might find some of the functions you wrote for your type--which are useful in your particular case, for whatever reason--can be generalized to all monads. Now you can suddenly take your function and use it on parsers, and lists, and maybes and...
* As Daniel Fischer helpfully pointed out, guard requires MonadPlus rather than Monad.
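To make "lifted for free" concrete, here is one of those utilities in action (a minimal sketch): the same foldM works for any Monad, and with Maybe it aborts the fold on the first failure.

import Control.Monad (foldM)

sumPositive :: [Int] -> Maybe Int
sumPositive = foldM step 0
  where step acc x | x > 0     = Just (acc + x)
                   | otherwise = Nothing   -- abort on a non-positive element

-- sumPositive [1,2,3] == Just 6; sumPositive [1,-2,3] == Nothing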
Functors are not very interesting by themselves, but they are a necessary stepping stone to get into applicative functors and Traversables.
The main property which makes applicative functors useful is that you can use fmap with the applicative operator <*> to "lift" any function of any arity to work with applicative values. I.e. you can turn any a -> b -> c -> d into Applicative f => f a -> f b -> f c -> f d. You can also take a look at Data.Traversable and Data.Foldable which contain several general purpose functions that involve applicative functors.
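For example, lifting a three-argument function takes one fmap and two <*>; Control.Applicative packages this pattern as liftA3:

import Control.Applicative (liftA3)

triple :: Maybe (Int, Int, Int)
triple = liftA3 (,,) (Just 1) (Just 2) (Just 3)
-- == Just (1, 2, 3); equivalently: (,,) <$> Just 1 <*> Just 2 <*> Just 3
-- If any argument is Nothing, the whole result is Nothing.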
Alternative is a specialized applicative functor which supports choice between alternatives that can "fail" (the exact meaning of "empty" depends on the Applicative instance). Applicative parsers are one practical example, where the definitions of some and many are very intuitive (e.g. match some pattern zero-or-more times or one-or-more times).
Monads are one of the most interesting and useful type-classes, but they are already well covered by the other answers.
Monoid is another type-class that is both simple and immediately useful. It basically defines a way to add two pieces of data together, which then gives you a generic concat as well as functionality in the aforementioned Foldable module and it also enables you to use the Writer monad with the data type.
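A small sketch of that: foldMap comes for free from the Monoid instance, mapping each element to a monoid and combining the results.

import Data.Monoid (Sum(..))

summary :: (String, Sum Int)
summary = foldMap (\n -> (show n, Sum n)) [1, 2, 3]
-- == ("123", Sum 6), via the tuple, String and Sum Monoid instances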
Many of the standard functions in Haskell require that their arguments implement one or more type-classes. Doing so in your code allows other developers (or yourself) to use your data in ways they are already familiar with, without having to write additional functions.
As an example, implementing the Ord type-class will allow you to use things like sort, min, max, etc. Otherwise, you would need sortBy and the like.
Yes, it means that implementing the type class Foo gives you all the other functions that have a Foo constraint "for free".
The Functor type class isn't too interesting in that regard, as it doesn't give you a lot.
A better example is monads and the functions in the Control.Monad module. Once you've defined the two Monad functions (>>=) and return for your type, you get another thirty or so functions that can then be used on your type.
Some of the more useful ones include: mapM, sequence, forever, join, foldM, filterM, replicateM, when, unless and liftM. These show up all the time in Haskell code.
As others have said, Functor itself doesn't actually get you much for free. Basically, the more high-level or general a typeclass is (meaning the more things fit that description), the less "free" functionality you are going to get. So for example, Functor and Monoid don't provide you with much, but Monad and Arrow provide you with a lot of useful functions for free.
In Haskell, it's still a good idea to write an instance for Functor and Monoid though (if your data type is indeed a functor or a monoid), because we almost always try to use the most general interface possible when writing functions. If you are writing a new function that can get away with only using fmap to operate on your data type, then there is no reason to artificially restrict that function to Monads or Applicatives, since it might be useful later for other things.
Your object-oriented intuition carries across, if you read "interface and implementation" for "typeclass and instance". If you make your new type C an instance of a standard typeclass B, then you get for free that your type will work with all existing code A that depends on B.
As others have said, when the typeclass is something like Monad, then the freebies are the many library functions like foldM and when.

Isn't Monad just syntactic sugar?

I have been through various papers/articles/blogs and what not about Monads. People talk about them in various contexts, like category theory (what in the world is that?) etc. After going through all this and trying to really understand and write monadic code, I came to the understanding that monads are just syntactic sugar (probably the most glorified of them all): whether it is do notation in Haskell, Computation Expressions in F#, or even LINQ's SelectMany operator (remember, LINQ syntax is also syntactic sugar in C#/VB).
My question is: if anyone believes monads are more than syntactic sugar (over nested method calls), then please enlighten me with "practicality" rather than "theoretical concepts".
Thanks all.
UPDATE:
After going through all the answers I came to the conclusion that the implementation of the monad concept in a particular language is driven by syntactic sugar, BUT the monad concept in itself is not related to syntactic sugar and is a very general or abstract concept. Thanks everybody for the answers making the difference clear between the concept itself and the way it is implemented in various languages.
Monads aren't syntactic sugar; Haskell has some sugar for dealing with monads, but you can use them without the sugar, with plain functions and operators. So, Haskell doesn't really 'support' monads any more than loads of other languages, it just makes them easier to use and implement. A monad isn't a programming construct, or a language feature as such; it's an abstract way of thinking about certain types of objects, which, when intuited as Haskell types, provide a nice way of thinking about the transfer of state in types which lets Haskell (or indeed any language, when thought of functionally) do its thing.
do notation, computation expressions and similar language constructs are of course syntactic sugar. This is readily apparent as those constructs are usually defined in terms of what they desugar to. A monad is simply a type that supports certain operations. In Haskell Monad is a typeclass which defines those operations.
So to answer your question: Monad is not syntactic sugar; it's a type class. However, the do notation is syntactic sugar (and of course entirely optional - you can use monads just fine without do notation).
By definition, Monads aren't syntactic sugar. They are a triple of operations (return/unit, map, and join) over a universe of values (lists, sets, option types, stateful functions, continuations, etc.) that obey a small number of laws. As used in programming, these operations are expressed as functions. In some cases, such as Haskell, these functions can be expressed polymorphically over all monads, through the use of typeclasses. In other cases, these functions have to be given a different name or namespace for each monad. In some cases, such as Haskell, there is a layer of syntactic sugar to make programming with these functions more transparent.
So Monads aren't about nested function calls per se, and certainly aren't about sugar for them. They are about the three functions themselves, the types of values they operate over, and the laws these functions obey.
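The interchangeability of the two formulations is easy to show in Haskell: (>>=) can be derived from fmap and join, and join from (>>=).

import Control.Monad (join)

bind :: Monad m => m a -> (a -> m b) -> m b
bind m f = join (fmap f m)

join' :: Monad m => m (m a) -> m a
join' mm = mm >>= id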
Monads are syntactic sugar in the same sense that classes and method call syntax are syntactic sugar. It is useful and practical, if a bit verbose, to apply object-oriented principles to a language such as C. Like OO (and many other language features) monads are an idea, a way of thinking about organizing your programs.
Monadic code can let you write the shape of code while deferring certain decisions to later. A Log monad, which could be a variant of Writer, could be used to write code for a library that supports logging but lets the consuming application decide where the logging goes, if anywhere. You can do this without syntactic sugar at all, or you can leverage it if the language you're working in supports it.
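A sketch of that Log idea using Writer from the mtl package (the message format is invented): the library code records messages with tell, and the caller decides what to do with the accumulated log when running the computation.

import Control.Monad.Writer (Writer, runWriter, tell)

step :: Int -> Writer [String] Int
step x = do
  tell ["processing " ++ show x]
  pure (x * 2)

-- runWriter (step 3 >>= step) == (12, ["processing 3", "processing 6"])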
Of course there are other ways to get this feature but this is just one, hopefully "practical" example.
No,
you can think of a Monad (or any other type-class in Haskell) more in terms of a pattern.
You see a pattern and you handle it every time the same way, so that you can generalize over it.
In this case it's the pattern of values with added information (or, if you like, data inside some kind of bag - but this picture does not hold for every monad) and a way to chain those together nicely.
The syntactic sugar is just some nice little way to compose the binds ;)
It's an extension to the thing ;)
For the practical concepts: just look at async workflows, or the IO monad - should be practical enough ;)
I would first call it a pattern, in the sense that m a -> (a -> m b) -> m b (with a reasonable behavior) is convenient for many different problems / type constructors.
Actually so convenient that it deserves some syntactic sugar in the language. That's the do notation in Haskell, from in C#, for comprehensions in Scala. The syntactic sugar requires only adherence to a naming pattern when implementing (SelectMany in C#, flatMap in Scala). Those languages do that without Monad being a type in their libraries (in Scala, one may be written). Note that C# does that for the Iterator pattern too. While there is an interface IEnumerable, foreach is translated to calls to GetEnumerator/MoveNext/Current based on the names of the methods, irrespective of the types. Only when the translation is done is it checked that everything is defined and well typed.
But in Haskell (and this may be done in Scala or OCaml too, not in C#, and I believe it is not possible in F# either), Monad is more than a design pattern plus syntactic sugar based on a naming pattern. It's an actual API, a software component, whatever.
Consider the iterator pattern in (statically typed) imperative languages. You may just implement MoveNext/Current (or hasNext/next) in classes where this is appropriate. And if there is some syntactic sugar for it, as in C#, that's already quite useful. But if you make it an interface, you can immediately do much more. You can have computations that work on any iterator. You can have utility methods on iterators (find, filter, chain, nest...) making them more powerful.
When Monad is a type rather than just a pattern, you can do the same. You can have utility functions that make working with Monads more powerful (in Control.Monad), and you can have computations where the type of monad to use is a parameter (see this old article from Wadler showing how an interpreter can be parameterized by the monad type and what various instances do). To have a monad type (type class), you need some kind of higher-order type; that is, you need to be able to parametrize over a type constructor, rather than a simple data type.
