What is a monad in FP, in categorical terms? - haskell

Every time someone promises to "explain monads", my interest is piqued, only to be replaced by frustration when the alleged "explanation" is a long list of examples terminated by some off-hand remark that the "mathematical theory" behind the "esoteric ideas" is "too complicated to explain at this point".
Now I'm asking for the opposite. I have a solid grasp of category theory and I'm not afraid of diagram chasing, Yoneda's lemma or derived functors (and indeed of monads and adjunctions in the categorical sense).
Could someone give me a clear and concise definition of what a monad is in functional programming? The fewer examples the better: sometimes one clear concept says more than a hundred timid examples. Haskell would do nicely as a language for demonstration though I'm not picky.

This question has some good answers: Monads as adjunctions
More to the point, Derek Elkins' "Calculating Monads with Category Theory" article in TMR #13 should have the sort of constructions you're looking for: http://www.haskell.org/wikiupload/8/85/TMR-Issue13.pdf
Finally, and perhaps this is really the closest to what you're looking for, you can go straight to the source and look at Moggi's seminal papers on the topic from 1988-91: http://www.disi.unige.it/person/MoggiE/publications.html
See in particular "Notions of computation and monads".
My own I'm sure too condensed/imprecise take:
Begin with a category Hask whose objects are Haskell types, and whose morphisms are functions. Functions are also objects in Hask, as are products, so Hask is Cartesian closed. Now introduce an arrow mapping every object in Hask to an object of MHask, a subcollection of the objects of Hask. Unit!
Next introduce an arrow mapping every arrow on Hask to an arrow on MHask. This gives us map, and makes MHask a covariant endofunctor. Now introduce an arrow mapping every object in MHask which is generated from an object in MHask (via unit) to the object in MHask which generates it. Join! And from that, MHask is a monad (more precisely, a monoid in the category of endofunctors).
I'm sure there is a reason why the above is deficient, which is why I'd really direct you, if you're looking for formalism, to the Moggi papers in particular.

As a complement to Carl's answer, a Monad in Haskell is (theoretically) this:
class Monad m where
  join :: m (m a) -> m a
  return :: a -> m a
  fmap :: (a -> b) -> m a -> m b
Note that "bind" (>>=) can be defined as
x >>= f = join (fmap f x)
According to the Haskell Wiki
A monad in a category C is a triple (F : C → C, η : Id → F, μ : F ∘ F → F)
...with some axioms. For Haskell, fmap, return, and join line up with F, η, and μ, respectively. (fmap in Haskell defines a Functor). If I'm not mistaken, Scala calls these map, pure, and join respectively. (Scala calls bind "flatMap")
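As a small concrete sketch (hand-written, not the library definitions), here is that triple for Maybe, so that F, η and μ can be read off directly:
fmapMaybe :: (a -> b) -> Maybe a -> Maybe b   -- F on morphisms
fmapMaybe _ Nothing  = Nothing
fmapMaybe f (Just x) = Just (f x)

returnMaybe :: a -> Maybe a                   -- η : Id → F
returnMaybe = Just

joinMaybe :: Maybe (Maybe a) -> Maybe a       -- μ : F ∘ F → F
joinMaybe Nothing   = Nothing
joinMaybe (Just mx) = mx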

Ok, using Haskell terminology and examples...
A monad, in functional programming, is a composition pattern for data types with the kind * -> *.
class Monad (m :: * -> *) where
  return :: a -> m a
  (>>=) :: m a -> (a -> m b) -> m b
(There's more to the class than that in Haskell, but those are the important parts.)
A data type is a monad if it can implement that interface while satisfying three conditions in the implementation. These are the "monad laws", and I'll leave their full statement to those long-winded explanations. I summarize the laws as "(>>= return) is an identity function, and (>>=) is associative." It's really not more than that, even if it can be expressed more precisely.
And that's all a monad is. If you can implement that interface while preserving those behavioral properties, you have a monad.
That explanation is probably shorter than you expected. That's because the monad interface really is very abstract. The incredible level of abstraction is part of why so many different things can be modeled as monads.
What's less obvious is that as abstract as the interface is, it allows generically modeling any control-flow pattern, regardless of the actual monad implementation. This is why the Control.Monad package in GHC's base library has combinators like when, forever, etc. And this is why the ability to explicitly abstract over any monad implementation is powerful, especially with support from a type system.
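For instance, here is a sketch of two such combinators written out by hand (Control.Monad provides the real when and forever):
when' :: Monad m => Bool -> m () -> m ()
when' True  action = action
when' False _      = return ()

forever' :: Monad m => m a -> m b
forever' action = action >> forever' action

-- These definitions work unchanged for IO, Maybe, State, parsers, ...
-- because they only use the Monad interface.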

You should read Eugenio Moggi's paper "Notions of computations and monads", which explains the role monads were originally proposed to play in structuring the denotational semantics of effectful languages.
Also there is a related question:
References for learning the theory behind pure functional languages such as Haskell?
As you don't want hand-waving, you have to read scientific papers, not forum answers or tutorials.

A monad is a monoid in the category of endofunctors, what's the problem?
Humor aside, I personally believe that monads, as they are used in Haskell and functional programming, are better understood from the monads-as-an-interface point of view (as in Carl's and Dan's answers) instead of from the monads-as-the-term-from-category-theory point of view. I have to confess that I only really internalized the whole monad thing when I had to use a monadic library from another language in a real project.
You mention that you didn't like all the "lots of examples" tutorials. Has anyone ever pointed you to the Awkward Squad paper? It focuses mainly on the IO monad, but the introduction gives a good technical and historical explanation of why the monad concept was embraced by Haskell in the first place.

I don't really know what I'm talking about, but here's my take:
Monads are used to represent computations. You can think of a normal procedural program, which is basically a list of statements, as a bunch of composed computations. Monads are a generalization of this concept, allowing you to define how the statements get composed. Each computation has a value (it could just be ()); the monad just determines how the value strung through a series of computations behaves.
Do notation is really what makes this clear: it's basically a special sort of statement-based language that lets you define what happens between statements. It's as if you could define how ";" worked in C-like languages.
In this light all of the monads I've used so far make sense: State doesn't affect the value but updates a second value which is passed along from computation to computation in the background; Maybe short-circuits the computation if it ever encounters a Nothing; List lets you have a variable number of values passed through; IO lets you have impure values passed through in a safe way. The more specialized monads I've used, like Gen and Parsec parsers, are also similar.
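For instance, here is a tiny sketch (lookupBoth is my own illustrative name) where Maybe plays the role of a ";" that short-circuits:
lookupBoth :: [(String, Int)] -> Maybe Int
lookupBoth env = do
  x <- lookup "x" env   -- if this is Nothing, nothing after it runs
  y <- lookup "y" env
  return (x + y)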
Hopefully this is a clear explanation which isn't completely off-base.

Since you understand monads in the category-theoretic sense I am interpreting your question as being about the presentation of monads in functional programming.
Thus my answer avoids any explanation of what a monad is, or any intuition about its meaning or use.
Answer: In Haskell a monad is presented, in an internal language for some category, as the (internalised) maps of a Kleisli triple.
Explanation:
It is hard to be precise about the properties of the "Hask category", and these properties are largely irrelevant for understanding Haskell's presentation of monads.
Instead, for this discussion, it is more useful to understand Haskell as an internal language for some category C. Haskell functions define morphisms in C and Haskell types are objects in C, but the particular category in which these definitions are made is unimportant.
Parametric data types, e.g. data F a = ..., are object mappings, e.g. F : |C| -> |C|.
The usual description of a monad in Haskell is in Kleisli triple (or Kleisli extension) form:
class Monad m where
  return :: a -> m a
  (>>=) :: m a -> (a -> m b) -> m b
where:
m is the object mapping m : |C| -> |C|
return is the unit operation on objects
>>= (pronounced "bind" by Haskellers) is the extension operation on morphisms but with its first two parameters swapped (cf. usual signature of extension (-)* : (a -> m b) -> m a -> m b)
(These maps are themselves internalised as families of morphisms in C, which is possible since m : |C| -> |C|.)
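As a quick sketch of the correspondence (extend and composeK are my own names; Control.Monad exports the latter as (>=>)):
extend :: Monad m => (a -> m b) -> (m a -> m b)   -- (-)* : just (>>=) with its arguments swapped
extend f x = x >>= f

composeK :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)   -- Kleisli composition
composeK f g a = f a >>= g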
Haskell's do-notation (if you have come across this) is therefore an internal language for Kleisli categories.

The Haskell wikibook page has a good basic explanation.

Related

Categorical structure in Haskell

Hask is usually thought to be the category whose objects are types and morphisms are functions.
However, I've seen Conor McBride (#pigworker) warn against the use of Hask multiple times (1, 2, 3):
I would discourage talk of "the Hask Category" because it subconsciously conditions you against looking for other categorical structure in Haskell programming.
Note, I dislike the use of "Hask" as the name of the "category of Haskell types and functions": I fear that labelling one category as the Haskell category has the unfortunate side-effect of blinding us to the wealth of other categorical structure in Haskell programming. It's a trap.
I wish people wouldn't call it "Hask", though: it threatens to limit the imagination.
What other categories can we see in Haskell?
In one of his answers, he touches upon some of these ideas, but I wonder if someone could expand upon it; and I wonder if there are even more examples.
[...] there's a ton of categorical structure lurking everywhere, there's certainly a ton of categorical structure available (possibly but not necessarily) at higher kinds. I'm particularly fond of functors between indexed families of sets.
Constraints in Haskell also form a category. The objects are the constraints, and the arrows mean "this constraint implies this other constraint". So every constraint implies itself, and there's an arrow between Monad f and Applicative f, between Ord a and Eq a and between Ord a and Ord [a].
It is a thin category, so there is at most one arrow between two objects.
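A hedged sketch of how this is made concrete by Edward Kmett's constraints package, assuming its Data.Constraint API (Dict and the entailment type (:-), which has a Category instance):
{-# LANGUAGE ConstraintKinds, TypeOperators #-}
import Data.Constraint (Dict (..), (:-) (..))

ordImpliesEq :: Ord a :- Eq a
ordImpliesEq = Sub Dict        -- holds because Eq is a superclass of Ord

ordImpliesOrdList :: Ord a :- Ord [a]
ordImpliesOrdList = Sub Dict   -- holds via the instance Ord a => Ord [a]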
Gabriel Gonzalez has blogged about this. Here's one such post:
http://www.haskellforall.com/2012/08/the-category-design-pattern.html
In it, he calls Hask "the function category", and also discusses "the Kleisli category" and "the pipes category." These are all instances of the Category typeclass in Haskell. The instances of the Category typeclass are only a subset of the categories you can find in Haskell.
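For instance, here is a sketch of the Kleisli category as a Category instance (Control.Arrow already ships a Kleisli type with exactly this instance; the primed names are mine to avoid a clash):
import Prelude hiding (id, (.))
import Control.Category (Category (..))

newtype Kleisli' m a b = Kleisli' { runKleisli' :: a -> m b }

instance Monad m => Category (Kleisli' m) where
  id = Kleisli' return                               -- identity arrow is return
  Kleisli' g . Kleisli' f = Kleisli' (\a -> f a >>= g)  -- composition via (>>=)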
I once uploaded an educational package that demonstrates one example of this. I called it MHask.
http://hackage.haskell.org/package/MHask
Copied from the hackage page:
MHask is the category where
The objects are Haskell types of kind (* → *) that have an instance of Prelude.Monad
An arrow from object m to object n is a Haskell function of the form (forall x. m x → n x)
Arrow composition is merely a specialization of Haskell function composition
The identity arrow for the object m is the Prelude.id function in Haskell, specialized to (forall x. m x → m x)
Caveat emptor; I have not looked at this in a long time. There may be mistakes.

What does a nontrivial comonoid look like?

Comonoids are mentioned, for example, in Haskell's distributive library docs:
Due to the lack of non-trivial comonoids in Haskell, we can restrict ourselves to requiring a Functor rather than some Coapplicative class.
After a little searching I found a StackOverflow answer that explains this a bit more with the laws that comonoids would have to satisfy. So I think I understand why there's only one possible instance for a hypothetical Comonoid typeclass in Haskell.
Thus, to find a nontrivial comonoid, I suppose we'd have to look in some other category. Surely, if category theorists have a name for comonoids, then there are some interesting ones. The other answers on that page seem to hint at an example involving Supply, but I couldn't figure one out that still satisfies the laws.
I also turned to Wikipedia: there's a page for monoids that doesn't reference category theory, which seems to me as an adequate description of Haskell's Monoid typeclass, but "comonoid" redirects to a category-theoretic description of monoids and comonoids together that I can't understand, and there still don't seem to be any interesting examples.
So my questions are:
Can comonoids be explained in non-category-theoretic terms like monoids?
What is a simple example of an interesting comonoid, even if it's not a Haskell type? (Could one be found in a Kleisli category over a familiar Haskell monad?)
edit: I am not sure if this is actually category-theoretically correct, but what I was imagining in the parenthetical of question 2 was nontrivial definitions of delete :: a -> m () and split :: a -> m (a, a) for some specific Haskell type a and Haskell monad m that satisfy Kleisli-arrow versions of the comonoid laws in the linked answer. Other examples of comonoids are still welcome.
As Phillip JF mentioned, comonoids are interesting to talk about in substructural logics. Let's talk about linear lambda calculus. This is much like your normal typed lambda calculus except that every variable must be used exactly once.
To get a feel, let's count linear functions of given types, i.e.
a -> a
has exactly one inhabitant, id. While
(a,a) -> (a,a)
has exactly two, id and swap. Note that in regular lambda calculus (a,a) -> (a,a) has four inhabitants
(a, b) ↦ (a, a)
(a, b) ↦ (b, b)
(a, b) ↦ (a, b)
(a, b) ↦ (b, a)
but the first two require that we use one of the arguments twice while discarding the other. This is exactly the essence of linear lambda calculus—disallowing those kinds of functions.
As a quick aside, what's the point of linear LC? Well, we can use it to model linear effects or resource usage. If, for instance, we have a file type and a few transformers it might look like
data File
open :: String -> File
close :: File -> () -- consumes a file, but we're ignoring purity right now
t1 :: File -> File
t2 :: File -> File
and then the following are valid pipelines:
close . t1 . t2 . open
close . t2 . t1 . open
close . t1 . open
close . t2 . open
but this "branching" computation isn't
let f1 = open "foo"
    f2 = t1 f1
    f3 = t2 f1
in close f3
since we used f1 twice.
At this point you might be wondering which things actually have to follow the linear rules. For instance, I decided that some pipelines don't have to include both t1 and t2 (compare the enumeration exercise from before). Further, I introduced the open and close functions, which happily create and destroy the File type despite that being a violation of linearity.
Indeed, we might posit the existence of a few primitive functions which violate linearity, while still forbidding ordinary clients from doing so. It's much like the IO monad: all of the secrets live inside the implementation of IO so that users work in a "pure" world.
And this is where Comonoid comes in.
class Comonoid m where
  destroy :: m -> ()
  split :: m -> (m, m)
A type that instantiates Comonoid in a linear lambda calculus is a type which has carry-along destruction and duplication rules. In other words, it's a type which isn't very much bound by linear lambda calculus at all.
Since Haskell doesn't implement the linear lambda calculus rules at all, we can always instantiate Comonoid
instance Comonoid a where
  destroy a = ()
  split a = (a, a)
Or, perhaps the other way to think of it is that Haskell is a linear LC system that just happens to instantiate Comonoid for every type and applies destroy and split for you automatically.
A monoid in the usual sense is the same as a categorical monoid in the category of sets. One would expect that a comonoid in the usual sense is the same as a categorical comonoid in the category of sets. But every set in the category of sets is a comonoid in a trivial way, so apparently there is no non-categorical description of comonoids which would be parallel to that of monoids.
Just like a monad is a monoid in the category of endofunctors (what's the problem?), a comonad is a comonoid in the category of endofunctors (what's the coproblem?) So yes, any comonad in Haskell would be an example of a comonoid.
Well one way we can think of a monoid is as hooked to any particular product construction that we're using, so in Set we'd take this signature:
mul : A * A -> A
one : A
to this one:
dup : A -> A * A
one : A
But the idea of duality is that every statement you can make has a dual that applies to the dual objects. There is another way of stating what a monoid is that is agnostic to the choice of product construction, and when we take the costructure under that reading we can use the coproduct in the output, like:
div : A -> A + A
one : A
where + is a tagged sum. Here, every term of this type is always ready to produce a new bit of information, implicitly derived from the tag used to denote the left or the right instance of A. I personally think this is really damn cool. And the cool version of what people were talking about above appears, I think, when you do this construction not for monoids but for monoid actions.
A monoid M is said to act on a set A if there's a function
act : M * A -> A
where we have the following rules
act identity a = a
act f (act g a) = act (f * g) a
If we want a co-action, what exactly do we want?
act : A -> M * A
this generates a stream of values of our comonoid's type! I'm having a lot of trouble coming up with the laws for these systems, but I think they must be written down somewhere, so I'm going to keep looking tonight. If somebody can tell me what they are, or that I'm wrong about these things in some way, I'd be interested in that too.
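Here is a small Haskell sketch of that last point (all names are mine): iterating the co-action really does give a stream of monoid values.
import Data.List (unfoldr)

type Act   m a = (m, a) -> a   -- a monoid action
type Coact m a = a -> (m, a)   -- its dual, with the product moved to the output

-- Iterating a co-action produces a stream of m's:
stream :: Coact m a -> a -> [m]
stream co = unfoldr (Just . co)

-- e.g. with m = String and a = Int:
ticker :: Coact String Int
ticker n = (show n, n + 1)
-- take 3 (stream ticker 0) == ["0", "1", "2"]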
As a physicist, the most common example I deal with is coalgebras, which are comonoid objects in the category of vector spaces, with the monoidal structure usually given by the tensor product.
In that case, there is a bijection between monoid and comonoid objects, since you can just take the adjoint or transpose of the product and unit maps to get a coproduct and a counit that satisfy the comonoid axioms.
In some branches of physics, it is very common to see objects that have both an algebra and a coalgebra structure with some compatibility axioms. The two most common cases are Hopf algebras and Frobenius algebras. They are very convenient for constructing states or solution that are entangled or correlated.
In programming, the simplest nontrivial example I can think of would be reference-counted pointers such as shared_ptr in C++ and Rc in Rust, along with their weak equivalents. You can copy them, which is a nontrivial operation that bumps up the refcount (so the two copies are distinct from the initial state). You can drop one (call the destructor), which is nontrivial because it decrements the refcount shared with every other pointer to the same piece of data.
Furthermore, weak pointers are a great example of a comonoid action. You can use the co-action to generate a weak pointer from a shared pointer. This can be easily checked by noting that creating one from a shared pointer and immediately dropping it is a unit operation, and creating one & cloning it is equivalent to creating two from the shared pointer.
This is a general thing you see with nontrivial coproducts and their co-actions: when they don't reduce to a copying operation, they intuitively imply some form of action at a distance between the two halves, while also adding an operation that erases one half to leave the other independent.

How are functors in Haskell and OCaml similar?

I've been toying around in Haskell for the past year or so and I'm actually starting to 'get' it, up until Monads, Lenses, Type Families, ... the lot.
I'm about to leave this comfort zone a little: I'm moving to an OCaml project as a day job. Going through the syntax a little, I was looking for similar higher-level concepts, like for example functors.
I've read OCaml code and the structure of a functor, but I can't work out whether they are actually similar concepts in Haskell and OCaml or not.
In a nutshell, a functor in Haskell is for me mainly a way to lift functions, and I use it (and like it) like that.
In OCaml it feels closer to programming against an interface (for example when making a set or a list, with that compare function), and I wouldn't really know how to, say, lift functions over the functor.
Can somebody explain to me whether the two concepts are similar, and if so, what am I missing or not seeing? I googled around a bit and there doesn't seem to be a clear answer to be found.
Kasper
From a practical standpoint, you can think of "functors" in OCaml and Haskell as unrelated. As you said, in Haskell a functor is any type that allows you to map a function over it. In OCaml, a functor is a module parametrized by another module.
In Functional Programming, what is a functor? has a good description of what functors in the two languages are and how they differ.
However, as the name implies, there is actually a connection between the two seemingly disparate concepts! Both language's functors are just realizations of a concept from category theory.
Category theory is the study of categories, which are just arbitrary collections of objects with "morphisms" between them. The idea of a category is very abstract, so "objects" and "morphisms" can really be anything with a few restrictions—there has to be an identity morphism for every object and morphisms have to compose.
The most obvious example of a category is the category of sets and functions: the sets are the objects and the functions between sets the morphisms. Clearly, every set has an identity function and functions can be composed. A very similar category can be formed by looking at a functional programming language like Haskell or OCaml: concrete types (e.g. types with kind *) are the objects and Haskell/OCaml functions are the morphisms between them.
In category theory, a functor is a transformation between categories. It is like a function between categories. When we're looking at the category of Haskell types, a functor is essentially a type-level function: it maps types to something else. The particular kind of functor we care about maps types to other types. A perfect example of this is Maybe: Maybe maps Int to Maybe Int, String to Maybe String and so on. It provides a mapping for every possible Haskell type.
Functors have one additional requirement—they have to map the category's morphisms as well as the objects. In particular, if we have a morphism A → B and our functor maps A to A' and B to B', it has to map the morphism A → B to some morphism A' → B'. As a concrete example, let's say we have the types Int and String. There are a whole bunch of Haskell functions Int → String. For Maybe to be a proper functor, it has to have a function Maybe Int → Maybe String for each of these.
Happily, this is exactly what the fmap function does—it maps functions. For Maybe, it has the type (a → b) → Maybe a → Maybe b; we can add some parentheses to get: (a → b) → (Maybe a → Maybe b). What this type signature tells us is that for any normal function we have, we also have a corresponding function over Maybes.
So a functor is a mapping between types that also preserves the functions between them. The fmap function is essentially just a proof of this second restriction on functors. This makes it easy to see how the Haskell Functor class is just a particular version of the mathematical concept.
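Here is a tiny sketch of both halves (Maybe' is my own copy of Maybe, written out so that the object part and the morphism part sit side by side):
data Maybe' a = Nothing' | Just' a        -- object part: a |-> Maybe' a

instance Functor Maybe' where             -- morphism part: (a -> b) |-> (Maybe' a -> Maybe' b)
  fmap _ Nothing'  = Nothing'
  fmap f (Just' x) = Just' (f x)

-- e.g. fmap show :: Maybe' Int -> Maybe' String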
So what about OCaml? In OCaml, a functor is not a type—it's a module. In particular, it's a parametrized module: a module that takes another module as an argument. Already, we can see some parallels: in Haskell, a Functor is like a type-level function; in OCaml, a functor is like a module-level function. So really, it's the same mathematical idea; however, instead of being used on types—like in Haskell—it's used on modules.
There's far more detail about how OCaml functors relate to category theory functors on the CS site: What is the relation between functors in SML and Category theory?. The question talks about SML rather than OCaml per se, but my understanding is that the module system of OCaml is very closely related to that of SML.
In summary: functors in Haskell and in OCaml are two fundamentally different structures which both happen to be reifications of the same very abstract mathematical idea. I think it's pretty neat :).

How do you identify monadic design patterns?

On my way to learning Haskell I'm starting to grasp the monad concept and starting to use the known monads in my code, but I'm still having difficulties approaching monads from a designer's point of view. In OO there are several rules of thumb, like "identify nouns" for objects, or watch for some kind of state and interface, but I'm not able to find equivalent resources for monads.
So how do you identify a problem as monadic in nature? What are good design patterns for monadic design? What's your approach when you realize that some code would be better refactored into a monad?
A helpful rule of thumb is when you see values in a context; monads can be seen as layering "effects" on:
Maybe: partiality (uses: computations that can fail)
Either: short-circuiting errors (uses: error/exception handling)
[] (the list monad): nondeterminism (uses: list generation, filtering, ...)
State: a single mutable reference (uses: state)
Reader: a shared environment (uses: variable bindings, common information, ...)
Writer: a "side-channel" output or accumulation (uses: logging, maintaining a write-only counter, ...)
Cont: non-local control-flow (uses: too numerous to list)
Usually, you should design your monad by layering on the monad transformers from the standard Monad Transformer Library, which let you combine the above effects into a single monad. Together, these handle the majority of monads you might want to use. There are some additional monads not included in the MTL, such as the probability and supply monads.
As far as developing an intuition for whether a newly-defined type is a monad, and how it behaves as one, you can think of it by going up from Functor to Monad:
Functor lets you transform values with pure functions.
Applicative lets you embed pure values and express application — (<*>) lets you go from an embedded function and its embedded argument to an embedded result.
Monad lets the structure of embedded computations depend on the values of previous computations.
The easiest way to understand this is to look at the type of join:
join :: (Monad m) => m (m a) -> m a
This means that if you have an embedded computation whose result is a new embedded computation, you can create a computation that executes the result of that computation. So you can use monadic effects to create a new computation based on values of previous computations, and transfer control flow to that computation.
Interestingly, this can be a weakness of structuring things monadically: with Applicative, the structure of the computation is static (i.e. a given Applicative computation has a certain structure of effects that cannot change based on intermediate values), whereas with Monad it is dynamic. This can restrict the optimisation you can do; for instance, applicative parsers are less powerful than monadic ones (well, this isn't strictly true, but it effectively is), but they can be optimised better.
Note that (>>=) can be defined as
m >>= f = join (fmap f m)
and so a monad can be defined simply with return and join (assuming it's a Functor; all monads are applicative functors, and since GHC 7.10 Applicative is in fact a superclass of Monad, although older versions of the hierarchy didn't require this).
As an additional note, you probably shouldn't focus too heavily on monads, no matter what kind of buzz they get from misguided non-Haskellers. There are many typeclasses that represent meaningful and powerful patterns, and not everything is best expressed as a monad. Applicative, Monoid, Foldable... which abstraction to use depends entirely on your situation. And, of course, just because something is a monad doesn't mean it can't be other things too; being a monad is just another property of a type.
So, you shouldn't think too much about "identifying monads"; the questions are more like:
Can this code be expressed in a simpler monadic form? With which monad?
Is this type I've just defined a monad? What generic patterns encoded by the standard functions on monads can I take advantage of?
Follow the types.
If you find you have written functions with all of these types
(a -> b) -> YourType a -> YourType b
a -> YourType a
YourType (YourType a) -> YourType a
or all of these types
a -> YourType a
YourType a -> (a -> YourType b) -> YourType b
then YourType may be a monad. (I say “may” because the functions must obey the monad laws as well.)
(Remember you can reorder arguments, so e.g. YourType a -> (a -> b) -> YourType b is just (a -> b) -> YourType a -> YourType b in disguise.)
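For instance, here is a hedged sketch (all names are mine) of a type that happens to admit all three shapes; it is essentially the Writer monad, and the laws do hold:
newtype Logged a = Logged ([String], a)

mapLogged :: (a -> b) -> Logged a -> Logged b
mapLogged f (Logged (w, a)) = Logged (w, f a)

unitLogged :: a -> Logged a
unitLogged a = Logged ([], a)

joinLogged :: Logged (Logged a) -> Logged a
joinLogged (Logged (w, Logged (w', a))) = Logged (w ++ w', a)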
Don't look out only for monads! If you have functions of all of these types
YourType
YourType -> YourType -> YourType
and they obey the monoid laws, you have a monoid! That can be valuable too. Similarly for other typeclasses, most importantly Functor.
There's the effect view of monads:
Maybe - partiality / failure short-circuiting
Either - error reporting / short-circuiting (like Maybe with more information)
Writer - write only "state", commonly logging
Reader - read-only state, commonly environment passing
State - read / write state
Resumption - pausable computation
List - multiple successes
Once you are familiar with these effects, it's easy to build monads combining them with monad transformers. Note that combining some monads needs special care (particularly Cont and any monads with backtracking).
One important thing to note is that there aren't many monads. There are some exotic ones that aren't in the standard libraries, e.g. the probability monad and variations of the Cont monad like Codensity. But unless you are doing something mathematical it's unlikely you will invent (or discover) a new monad; however, if you use Haskell long enough you'll build many monads that are different combinations of the standard ones.
Edit - Also note that the order you stack monad transformers results in different monads:
If you add ErrorT (the transformer) to a Writer monad, you get the monad (Either err a, log): the log is available even when there is an error.
If you add WriterT (the transformer) to an Error monad, you get the monad Either err (a, log): you can only access the log if there is no error.
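A hedged sketch of the difference, using the modern transformers names (ExceptT rather than the deprecated ErrorT; demo1 and demo2 are my own names):
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import Control.Monad.Trans.Writer (Writer, WriterT, runWriter, runWriterT, tell)

-- ExceptT on the outside of Writer: the log survives an error.
demo1 :: (Either String Int, [String])
demo1 = runWriter . runExceptT $ do
  lift (tell ["about to divide"])
  throwE "division by zero"
-- demo1 == (Left "division by zero", ["about to divide"])

-- WriterT on the outside of Either: an error discards the log.
demo2 :: Either String (Int, [String])
demo2 = runWriterT $ do
  tell ["about to divide"]
  lift (Left "division by zero")
-- demo2 == Left "division by zero"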
This is sort of a non-answer, but I feel it is important to say anyway. Just ask! StackOverflow, /r/haskell, and the #haskell IRC channel are all great places to get quick feedback from smart people. If you are working on a problem, and you suspect that there's some monadic magic that could make it easier, just ask! The Haskell community loves to solve problems, and is ridiculously friendly.
Don't misunderstand, I'm not encouraging you to never learn for yourself. Quite the contrary, interacting with the Haskell community is one of the best ways to learn. LYAH and RWH, 2 Haskell books that are freely available online, come highly recommended as well.
Oh, and don't forget to play, play, play! As you play around with monadic code, you'll start to get the feel of what "shape" monads have, and when monadic combinators can be useful. If you're rolling your own monad, then usually the type system will guide you to an obvious, simple solution. But to be honest, you should rarely need to roll your own instance of Monad, since Haskell libraries provide tons of useful things as mentioned by other answerers.
There's a common notion that one sees in many programming languages of an "infectious function tag" -- some special behavior for a function that must extend to its callers as well.
Rust functions can be unsafe, meaning they perform operations that can potentially violate memory unsafety. unsafe functions can call normal functions, but any function that calls an unsafe function must be unsafe as well.
Python functions can be async, meaning they return a promise rather than an actual value. async functions can call normal functions, but invocation of an async function (via await) can only be done by another async function.
Haskell functions can be impure, meaning they return an IO a rather than an a. Impure functions can call pure functions, but impure functions can only be called by other impure functions.
Mathematical functions can be partial, meaning they don't map every value in their domain to an output. The definitions of partial functions can reference total functions, but if a total function sends part of its domain through a partial function, it becomes partial as well.
While there may be ways to invoke a tagged function from an untagged function, there is no general way, and doing so can often be dangerous and threatens to break the abstraction the language tries to provide.
The benefit, then, of having tags is that you can expose a set of special primitives that are given this tag and have any function that uses these primitives make that clear in its signature.
Say you're a language designer and you recognize this pattern, and you decide that you want to allow user-defined tags. Let's say the user defined a tag Err, representing computations that may throw an error. A function using Err might look like this:
function div <Err> (n: Int, d: Int): Int
  if d == 0
    throwError("division by 0")
  else
    return (n / d)
If we wanted to simplify things, we might observe that there's nothing erroneous about taking arguments - it's computing the return value where problems might arise. So we can restrict tags to functions that take no arguments, and have div return a closure rather than the actual value:
function div(n: Int, d: Int): <Err> () -> Int
  () =>
    if d == 0
      throwError("division by 0")
    else
      return (n / d)
In a lazy language such as Haskell, we don't need the closure, and can just return a lazy value directly:
div :: Int -> Int -> Err Int
div _ 0 = throwError "division by 0"
div n d = return (n `quot` d)  -- quot, since (/) is not defined for Int and this definition shadows Prelude.div
It is now apparent that, in Haskell, tags need no special language support - they are ordinary type constructors. Let's make a typeclass for them!
class Tag m where
We want to be able to call an untagged function from a tagged function, which is equivalent to turning an untagged value (a) into a tagged value (m a).
  addTag :: a -> m a
We also want to be able to take a tagged value (m a) and apply a tagged function (a -> m b) to get a tagged result (m b):
  embed :: m a -> (a -> m b) -> m b
This, of course, is precisely the definition of a monad! addTag corresponds to return, and embed corresponds to (>>=).
It is now clear that "tagged functions" are merely a type of monad. As such, whenever you spot a place where a "function tag" could apply, chances are you've got a place suitable for a monad.
P.S. Regarding the tags I've mentioned in this answer: Haskell models impurity with the IO monad and partiality with the Maybe monad. Most languages implement async/promises fairly transparently, and there seems to be a Haskell package called promise that mimics this functionality. The Err monad is equivalent to the Either String monad. I'm not aware of any language that models memory unsafety monadically, but it could be done.
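A minimal sketch of the hypothetical Err tag as an ordinary Haskell monad (assumption: Err is just a synonym for Either String with throwError = Left; safeDiv and meanRatio are my own illustrative names):
type Err = Either String

throwError :: String -> Err a
throwError = Left

safeDiv :: Int -> Int -> Err Int
safeDiv _ 0 = throwError "division by 0"
safeDiv n d = return (n `quot` d)

-- "Tagged" functions chain with (>>=), which is exactly embed:
meanRatio :: Int -> Int -> Int -> Err Int
meanRatio a b c = safeDiv a c >>= \x -> safeDiv b c >>= \y -> return ((x + y) `quot` 2)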

Why should I use applicative functors in functional programming?

I'm new to Haskell, and I'm reading about functors and applicative functors. Ok, I understand functors and how I can use them, but I don't understand why applicative functors are useful and how I can use them in Haskell. Can you explain to me with a simple example why I need applicative functors?
Applicative functors are a construction that provides the midpoint between functors and monads, and are therefore more widespread than monads, while still being more useful than plain functors. Normally you can just map a function over a functor. Applicative functors allow you to take a "normal" function (taking non-functorial arguments) and use it to operate on several values that are in functor contexts. As a corollary, this gives you effectful programming without monads.
A nice, self-contained explanation fraught with examples can be found here. You can also read a practical parsing example developed by Bryan O'Sullivan, which requires no prior knowledge.
Applicative functors are useful when you need sequencing of actions, but don't need to name any intermediate results. They are thus weaker than monads, but stronger than functors (they do not have an explicit bind operator, but they do allow running arbitrary functions inside the functor).
When are they useful? A common example is parsing, where you need to run a number of actions that read parts of a data structure in order, then glue all the results together. This is like a general form of function composition:
f a b c d
where you can think of a, b and so on as the arbitrary actions to run, and f as the pure function to apply to their results; in applicative style this becomes:
f <$> a <*> b <*> c <*> d
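A concrete sketch with Maybe as the context (User and mkUser are my own illustrative names):
data User = User String String deriving Show

mkUser :: Maybe String -> Maybe String -> Maybe User
mkUser mName mAddr = User <$> mName <*> mAddr
-- mkUser (Just "Ada") (Just "Cambridge") == Just (User "Ada" "Cambridge")
-- mkUser Nothing     (Just "Cambridge") == Nothing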
I like to think of them as overloaded 'whitespace'. Or, that regular Haskell functions are in the identity applicative functor.
See "Applicative Programming with Effects"
Conor McBride and Ross Paterson's Functional Pearl on the style has several good examples. It's also responsible for popularizing the style in the first place. They use the term "idiom" for "applicative functor", but other than that it's pretty understandable.
It is hard to come up with examples where you need applicative functors. I can understand why an intermediate Haskell programmer would ask themselves that question, since most introductory texts present Applicative instances that are derived from Monads, treating Applicative Functors only as a convenient interface.
The key insight, as mentioned both here and in most introductions to the subject, is that Applicative Functors are between Functors and Monads (even between Functors and Arrows). All Monads are Applicative Functors, but not all Applicative Functors are Monads.
So necessarily, sometimes we can use applicative combinators for something that we can't use monadic combinators for. One such thing is ZipList (see also this SO question for some details), which is just a wrapper around lists in order to have a different Applicative instance than the one derived from the Monad instance of list. The Applicative documentation uses the following line to give an intuitive notion of what ZipList is for:
f <$> ZipList xs1 <*> ... <*> ZipList xsn = ZipList (zipWithn f xs1 ... xsn)
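A concrete sketch of the difference from the default list Applicative (only assumes Control.Applicative's ZipList):
import Control.Applicative (ZipList (..))

zipped :: [Int]
zipped = getZipList ((+) <$> ZipList [1, 2, 3] <*> ZipList [10, 20, 30])
-- zipped == [11, 22, 33]
-- whereas (+) <$> [1,2,3] <*> [10,20,30] uses the list Monad's Applicative
-- and has 3 * 3 = 9 elements.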
As pointed out here, it is possible to make quirky Monad instances that almost work for ZipList.
There are other Applicative Functors that are not Monads (see this SO question) and they are easy to come up with. Having an alternative Interface for Monads is nice and all, but sometimes making a Monad is inefficient, complicated, or even impossible, and that is when you need Applicative Functors.
Disclaimer: making Applicative Functors might also be inefficient, complicated, or even impossible; when in doubt, consult your local category theorist for correct usage of Applicative Functors.
In my experience, Applicative functors are great for the following reasons:
Certain kinds of data structures admit powerful types of compositions, but cannot really be made monads. In fact, most of the abstractions in functional reactive programming fall into this category. While we might technically be able to make e.g. Behavior (aka Signal) a monad, it typically cannot be done efficiently. Applicative functors allow us to still have powerful compositions without sacrificing efficiency (admittedly, it is a bit trickier to use an applicative than a monad sometimes, just because you don't have quite as much structure to work with).
The lack of data-dependence in an applicative functor allows you to e.g. traverse an action looking for all the effects it might produce without having the data available. So you could imagine a "web form" applicative, used like so:
userData = User <$> field "Name" <*> field "Address"
and you could write an engine which would traverse to find all the fields used and display them in a form, then when you get the data back run it again to get the constructed User. This cannot be done with a plain functor (because it combines two forms into one), nor a monad, because with a monad you could express:
userData = do
  name <- field "Name"
  address <- field $ name ++ "'s address"
  return (User name address)
which cannot be rendered, because the name of the second field cannot be known without already having the response from the first. I'm pretty sure there's a library that implements this forms idea -- I've rolled my own a few times for this and that project.
The other nice thing about applicative functors is that they compose. More precisely, the composition functor:
newtype Compose f g x = Compose (f (g x))
is applicative whenever f and g are. The same cannot be said for monads, which is what creates the whole monad transformer story, which is complicated in some unpleasant ways. Applicatives are super clean this way, and it means you can build up the structure of a type you need by focusing on small composable components.
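For concreteness, here is a sketch of those instances for the Compose newtype defined above (Data.Functor.Compose in base provides essentially the same thing):
instance (Functor f, Functor g) => Functor (Compose f g) where
  fmap h (Compose x) = Compose (fmap (fmap h) x)

instance (Applicative f, Applicative g) => Applicative (Compose f g) where
  pure = Compose . pure . pure                      -- embed a value into both layers
  Compose h <*> Compose x = Compose ((<*>) <$> h <*> x)  -- apply through both layers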
Recently the ApplicativeDo extension has appeared in GHC, which allows you to use do notation with applicatives, easing some of the notational complexity, as long as you don't do any monady things.
One good example: applicative parsing.
See [real world haskell] ch16 http://book.realworldhaskell.org/read/using-parsec.html#id652517
This is the parser code with do-notation:
-- file: ch16/FormApp.hs
p_hex :: CharParser () Char
p_hex = do
  char '%'
  a <- hexDigit
  b <- hexDigit
  let ((d, _):_) = readHex [a,b]
  return . toEnum $ d
Using applicative style makes it much shorter:
-- file: ch16/FormApp.hs
a_hex = hexify <$> (char '%' *> hexDigit) <*> hexDigit
  where hexify a b = toEnum . fst . head . readHex $ [a,b]
'Lifting' can hide the underlying details of some repetitive code. Then you can use fewer words to tell the same precise story.
I would also suggest taking a look at this.
At the end of the article there's an example:
import Control.Applicative

hasCommentA blogComments =
  BlogComment <$> lookup "title" blogComments
              <*> lookup "user" blogComments
              <*> lookup "comment" blogComments
This illustrates several features of the applicative programming style.

Resources