What does it mean to "run" inside a monad?

Working through Haskell textbook chapters on different monads, I repeatedly get lost when the authors jump from explaining the details of bind and the monad laws to actually using monads. Suddenly, expressions like "running a function in a monadic context", or "to run a monad" pop up. Similarly, in library documentation and in discussions about monad transformer stacks, I read statements that some function "can be run in any choice of monad". What does this "running inside a monad" exactly mean?
There are two things I don't seem to get straight:
A monad is a type class with functions (return, >>=) and laws. To "run" something inside a monad could thus either mean (a) to provide it as argument to return, or (b) to sequence it using >>=. If the monad is of type m a, then in case a) that something must be of type a, to match the type of the return function. In case b) that something must be a function of type a -> m b, to match the type of the >>= function. From this, I do not understand how I can "run" some function inside an arbitrary monad, because the functions I sequence using >>= must all have the same type signature, and the values I lift using return must be of the specific monad type parameter.
In my understanding, there is no notion of execution or running a computation in a functional language - there is only function application to some argument, and evaluating the function (replacing it with its value). Yet, many specific monads come with a run function such as runReader, runState, etc. These functions are not part of the definition of a monad, and they are plain functions, not in any way special imperative statements outside the functional core of the language. So, what do they "run"?
I feel that having a clear understanding of these concepts is key to understanding monad transformer stacks or similar constructs that seem to be necessary to understand any substantial libraries and any non-trivial programs in Haskell. Thanks very much for helping me to make the leap from simply writing functional code to actually understanding what it means.

Authors who write books and articles often use metaphors and less precise language when they try to explain concepts. The purpose is to give the reader a conceptual intuition for what's going on.
I believe that the concept of 'running' a function falls into this category. Apart from IO, you're right that the functions you use to compose, say, [], Maybe, and so on are no different from other functions.
The notion of running something inside of a monad comes, I think, from the observation that functors are containers. This observation applies to monads as well, since all monads are functors. [Bool] is a container of Boolean values, Maybe Int is a container of (zero or one) numbers. You can even think of the reader functor r -> a as a container of a values, because you can imagine that it's just a very big lookup table.
Being able to 'run a function inside a container' is useful because not all containers enable access to their contents. Again, IO is the prime example, since it's an opaque container.
A frequently asked question is: How do I return a pure value from an impure method? Likewise, many beginners ask: How do I get the value of a Maybe? You could even ask: How do I get the value out of a list? Generalised, the question becomes: How do I get the value out of the monad?
The answer is that you don't. You 'run the function inside the container', or, as I like to put it, you inject the behaviour into the monad. You never leave the container, but rather let your functions execute within the context of it. Particularly when it comes to IO, this is the only way you can interact with that container, because it's otherwise opaque (I'm here pretending that unsafePerformIO doesn't exist).
Keep in mind when it comes to the bind method (>>=) that while the function 'running inside of it' has the type a -> m b, you can also 'run' a 'normal' function a -> b inside the monad with fmap, because all Monad instances are also Functor instances.
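For instance, here is a minimal sketch of both ways of "running" a function inside Maybe (half is a made-up helper):
half :: Int -> Maybe Int
half n = if even n then Just (n `div` 2) else Nothing

viaBind :: Maybe Int
viaBind = Just 10 >>= half  -- "runs" an Int -> Maybe Int inside the monad; Just 5

viaFmap :: Maybe Int
viaFmap = fmap (+ 1) (Just 10)  -- "runs" a plain Int -> Int via the Functor instance; Just 11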

the functions I sequence using >>= must all have the same type signature
This is only sorta true. In some monadic context, we may have the expression
x >>= f >>= g
where
x :: Maybe Int
f :: Int -> Maybe String
g :: String -> Maybe Char
All of these must involve the same monad (Maybe), but note that they do not all have the same type signature. Just as with ordinary function composition, you don't need all the return types to be the same, merely that the input of one function match up with the output of its predecessor.
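To make that concrete, here is one possible (made-up) instantiation of the chain above:
x :: Maybe Int
x = Just 65

f :: Int -> Maybe String
f n = if n > 0 then Just (show n) else Nothing

g :: String -> Maybe Char
g ""      = Nothing
g (c : _) = Just c

-- x >>= f >>= g  ==> Just '6'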

Here is a simple analogy of "run the function inside the container", with pseudocode:
Let's say you have some type Future[String] which represents a container that will have a string "at some time in the future":
val tweet: Future[String] = getTweet()
Now you want to access the string -- but you do not take the string out of the context -- the "future" -- you simply use the string "inside the container":
tweet.map { str =>
  println(str)
}
Inside these curly braces, you are "in the future". For example:
val tweet: Future[String] = getTweet()
tweet.map { str =>
  println(str)
}
println("Length of tweet string is " + tweet.length) // <== WRONG -- you are not yet in the future
The tweet.length is trying to access the tweet outside of the container. So "being inside the container" is analogous to, when reading the source code, "being inside the curly braces of a map (flatMap, etc.)". You are dipping inside of the container.
tweet.map { str =>
  println("Length of tweet string is " + str.length) // <== RIGHT
}
Albeit a very simple analogy, I find this useful when thinking of all monads in general. In the source code, where is one "inside the container" and where is one outside? In this case, the length function is running in the future, or "inside the container".
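The same idea carries over to Haskell directly. Here is a minimal sketch; fetchTweet is a stub standing in for a real network call (an assumption for illustration):
fetchTweet :: IO String
fetchTweet = return "hello world"  -- stand-in for a real HTTP request

main :: IO ()
main = do
  tweet <- fetchTweet  -- from here on we are "inside" IO
  putStrLn ("Length of tweet string is " ++ show (length tweet))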

Related

Why is there no runConst function in Haskell?

Is there a convention so that I know when to expect runX versus getX typeclass functions?
It's purely a matter of how the author preferred to think about what they're representing. And it's often more about the "abstract concept" being represented, not the actual data structure being used to represent it.
If you have some type X and think of an X value as a computation that could be run to get a value, then you'd have a runX function. If you think of it as more like a container, then you'd have a getX function (There are other possible interpretations that could lead to runX or getX or something else, these are just 2 commonly recurring ways of thinking about values).
Of course when we're using first class Haskell values to represent things (and functions are perfectly good values), a lot of the time you could interpret something as either a computation or a container reasonably well. Consider State for representing stateful computations; surely that has to be interpreted as a computation, right? We say runState :: State s a -> s -> (a, s) because we think of it as "running" the State s a, needing an s as additional input. But we could just as easily think of it as "getting" an s -> (a, s) out of the State s a - treating State more like a container.
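Both readings are visible in code. Here tick is a small made-up example:
import Control.Monad.State

tick :: State Int Int
tick = do
  n <- get
  put (n + 1)
  return n

ran :: (Int, Int)
ran = runState tick 5  -- the "running" view: supply a state, get (5, 6)

gotten :: Int -> (Int, Int)
gotten = runState tick  -- the "container" view: "get" the s -> (a, s) out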
So the choice between runX and getX isn't really meaningful in any profound sense, but it tells you how the author was thinking about X (and perhaps how they think you should think about it).
Const is so-named in analogy to the function const (which takes an argument to produce the "constant function" that takes another input, ignores it, and returns whatever the first input to const was). But it's thought of as operating at the type level; Const takes a type and generates a "type-level function" that ignores whatever type it is applied to and then is isomorphic to the first type Const was applied to. Isomorphic rather than equal because to create a new type that could have different instances, it needs to have a constructor. At the value level, in order to be an isomorphism you need to be able to get a Const a b from an a (that's the Const constructor), and get the a back out of a Const a b. Since "being isomorphic to a" is all the properties we need it to have there's no real need to think of it as doing anything other than being a simple container of a, so we have getConst.
Identity seems similarly obvious as "just a container" and we have runIdentity. But one of the main motivations for having Identity is to think of Identity a as being a "monadic computation" in the same way that State s a, Reader e a, etc values are. So to continue the analogy we think of Identity as a "do-nothing" computation we run, rather than a simple wrapper container that we get a value out of. It would be perfectly valid to think of Identity as a container (the simplest possible one), but that wasn't the interpretation the authors chose to focus on.
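For reference, both types really are one-line newtypes, as defined in base:
newtype Const a b = Const { getConst :: a }
newtype Identity a = Identity { runIdentity :: a }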

Monad "unboxing"

My question came up while following the tutorial Functors, Applicatives, And Monads In Pictures and its JavaScript version.
When the text says that a functor unwraps a value from its context, I understand that a Just 5 -> 5 transformation is happening. As per What does the "Just" syntax mean in Haskell?, Just is "defined in scope" of the Maybe monad.
My question is: what is so magical about the whole unwrapping thing? I mean, what is the problem with having some language rule which automatically unwraps the "scoped" variables? It looks to me that this action is merely a lookup in some kind of table where the symbol Just 5 corresponds to the integer 5.
My question is inspired by the JavaScript version, where Just 5 is a prototype array instance. So unwrapping is, indeed, not rocket science at all.
Is this a "for-computation" type of reason or a "for-programmer" one? Why do we distinguish Just 5 from 5 on the programming language level?
First of all, I don't think you can understand Monads and the like without understanding a Haskell-like type system (i.e. without learning a language like Haskell). Yes, there are many tutorials that claim otherwise, but I've read a lot of them before learning Haskell and I didn't get it. So my advice: if you want to understand Monads, learn at least some Haskell.
To your question "Why do we distinguish Just 5 from 5 on the programming language level?": for type safety. In most languages that happen not to be Haskell, null, nil, or whatever it's called, is often used to represent the absence of a value. This, however, often results in things like NullPointerExceptions, because you didn't anticipate that a value may not be there.
In Haskell there is no null. So if you have a value of type Int, or anything else, that value cannot be null. You are guaranteed that there is a value. Great! But sometimes you actually want/need to encode the absence of a value. In Haskell we use Maybe for that. So something of type Maybe Int can either be something like Just 5 or Nothing. This way it is explicit that the value may not be there, and you cannot accidentally forget that it might be Nothing, because you have to explicitly unwrap the value.
This has nothing really to do with Monads, except that Maybe happens to implement the Monad type class (a type class is a bit like a Java interface, if you are familiar with Java). That is, Maybe is not primarily a Monad; it just happens to also be one.
I think you're looking at this from the wrong direction. Monad is explicitly not about unwrapping. Monad is about composition.
It lets you combine (not necessarily apply) a function of type a -> m b with a value of type m a to get a value of type m b. I can understand where you might think the obvious way to do that is unwrapping the value of type m a into a value of type a. But very few Monad instances work that way. In fact, the only ones that can work that way are the ones that are equivalent to the Identity type. For nearly all instances of Monad, it's just not possible to unwrap a value.
Consider Maybe. Unwrapping a value of type Maybe a into a value of type a is impossible when the starting value is Nothing. Monadic composition has to do something more interesting than just unwrapping.
Consider []. Unwrapping a value of type [a] into a value of type a is impossible unless the input just happens to be a list of length 1. In every other case, monadic composition is doing something more interesting than unwrapping.
Consider IO. A value like getLine :: IO String doesn't contain a String value. It's plain impossible to unwrap, because it isn't wrapping something. Monadic composition of IO values doesn't unwrap anything. It combines IO values into more complex IO values.
I think it's worthwhile to adjust your perspective on what Monad means. If it were only an unwrapping interface, it would be pretty useless. It's more subtle, though. It's a composition interface.
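Here is a small sketch of that composition-not-unwrapping view (safeHead is a made-up helper):
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

firstLength :: [[a]] -> Maybe Int
firstLength xss = safeHead xss >>= \xs -> Just (length xs)

-- firstLength [[1,2],[3]]  ==> Just 2
-- firstLength []           ==> Nothing; no value was ever "unwrapped"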
A possible example is this: consider the Haskell type Maybe (Maybe Int). Its values can be of the following form
Nothing
Just Nothing
Just (Just n) for some integer n
Without the Just wrapper we couldn't distinguish between the first two.
Indeed, the whole point of the optional type Maybe a is to add a new value (Nothing) to an existing type a. To ensure such Nothing is indeed a fresh value, we wrap the other values inside Just.
It also helps during type inference. When we see the function call f 'a' we can see that f is called at the type Char, and not at type Maybe Char or Maybe (Maybe Char). The typeclass system would allow f to have a different implementation in each of these cases (this is similar to "overloading" in some OOP languages).
My question is, what is so magical about the whole unwrapping thing?
There is nothing magical about it. You can use garden-variety pattern matching (here in the shape of a case expression) to define...
mapMaybe :: (a -> b) -> Maybe a -> Maybe b
mapMaybe f mx = case mx of
    Just x  -> Just (f x)
    Nothing -> Nothing
... which is exactly the same as fmap for Maybe. The only thing the Functor class adds -- and it is a very useful thing, make no mistake -- is an extra level of abstraction that covers various structures that can be mapped over.
Why do we distinguish Just 5 from 5 on programming language level?
More meaningful than the distinction between Just 5 and 5 is the one between their types -- e.g. between Maybe Int and Int. If you have x :: Int, you can be certain x is an Int value you can work with. If you have mx :: Maybe Int, however, you have no such certainty, as the Int might be missing (i.e. mx might be Nothing), and the type system forces you to acknowledge and deal with this possibility.
See also: jpath's answer for further comments on the usefulness of Maybe (which isn't necessarily tied to classes such as Functor and Monad); Carl's answer for further comments on the usefulness of classes like Functor and Monad (beyond the Maybe example).
What "unwrap" means depends on the container. Maybe is just one example. "Unwrapping" means something completely different when the container is [] instead of Maybe.
The magic in the whole unwrapping thing is the abstraction: in a Monad we have a notion of "unwrapping" which abstracts over the nature of the container; and then it starts to get "magical"...
You ask what Just means: Just is nothing but a data constructor in Haskell, defined via a data declaration like:
data Maybe a = Just a | Nothing
Just takes a value of type a and creates a value of type Maybe a. It's Haskell's way to distinguish values of type a from values of type Maybe a.
First of all, you need to separate monads from your question; they have nothing to do with this. Treat the article as just one point of view on monads; if it doesn't suit you, it may be that you first need a better grasp of the type system to understand monads in Haskell.
So your question can be rephrased as: why is there no implicit conversion Just 5 => 5? The answer is simple: the value Just 5 has type Maybe Integer, so it might just as well be Nothing, and what should the compiler do in that case? Only the programmer can resolve this situation.
But there is a more uncomfortable question. There are types, for example newtype Identity a = Identity a, that are just wrappers around some value. So why is there no implicit conversion Identity a => a?
The simple answer is that an attempt to provide one would lead to a different type system, which would lack many of the fine qualities the current one has. Given that, implicit unwrapping can be sacrificed for the benefit of other possibilities.

One more time...can I have an example of state monad that does what I want?

I'm trying to understand the actual need for the reader and/or state monads. All the examples I've seen (including many on stackoverflow as I've hunted for suitable examples I can use and in various books and blog articles) are of the form (pseudo code)
f = do
    foo <- ask
    do something with foo

g = do
    foo <- ask
    do something else using foo

h = runReader (do
    f
    g)
In other words, call two functions and (presumably) maintain some state from one call to the next. However, I don't find this example particularly convincing as (I think) I could just have f return some state and then pass that state on to g.
I'd love to see an example, using (say) a single integer as the state to be preserved, where, rather than two sequential calls to f and then g from a central place, there's a call to f which then internally calls g, with the changed state (if using the state monad) available back in the main routine.
Most (well, actually all) of the examples I have seen spend a tremendous amount of time focusing on the definition of the monad and then show how to set up a single function call. To me, it's the ability to do nested calls and have the state carried along for the ride that would demonstrate why it's useful.
Here's a non-trivial example of a stateful subroutine calling another stateful subroutine.
import Control.Monad.Trans.State

f :: State Int ()
f = do
    r <- g
    modify (+ r)

g :: State Int Int
g = do
    modify (+ 1)
    get

main = print (execState f 4)
In this example, the initial state begins at 4 and the stateful computation begins at f. f internally calls g, which increments the state to 5 and then returns the current state (still 5). This restores control to f, which binds the value 5 to r and then increments the current state by r, giving a final state of 10:
>>> main
10
Almost everything you can do with monads you can do without them. (Well, some are special like ST, STM, IO, etc., but that's a different story.) But:
they allow you to encapsulate many common patterns, like in this case stateful computations, and hide details or boilerplate code that would otherwise be needed; and
there is a plethora of functions that work on any (or many) monads, which you can just specialize to the particular monad you're using.
To give an example: Often one needs to have some kind of a generator that supplies unique names, like when generating code etc. This can be easily accomplished using the state monad: Each time newName is called, it outputs a new name and increments the internal state:
import Control.Monad.State
import Data.Tree
import qualified Data.Traversable as T
type NameGen = State Int
newName :: NameGen String
newName = state $ \i -> ("x" ++ show i, i + 1)
Now let's say we have a tree that has some missing values. We'd like to supply them with such generated names. Fortunately, there is a generic function mapM that allows us to traverse any traversable structure with any monad (without the monad abstraction, we wouldn't have this function). Now fixing the tree is easy. For each value we check if it's filled (then we just use return to lift it into the monad), and if not, we supply a new name:
fillTree :: Tree (Maybe String) -> NameGen (Tree String)
fillTree = T.mapM (maybe newName return)
Just imagine implementing this function without monads, with explicit state - going manually through the tree and carrying the state around. The original idea would be completely lost in boilerplate code. Moreover, the function would be very specific to Tree and NameGen.
But with monads, we can go even further. We could parametrize the name generator and construct an even more generic function:
fillTreeM :: (Monad m) => m String -> Tree (Maybe String) -> m (Tree String)
fillTreeM gen = T.mapM (maybe gen return)
Note the first parameter m String. It's not a constant String value, it's a recipe for generating a new String within m, whenever it's needed.
Then the original one can be rewritten just as
fillTree' :: Tree (Maybe String) -> NameGen (Tree String)
fillTree' = fillTreeM newName
But now we can use the same function for many very different purposes. For example, use the Rand monad and supply randomly generated names.
Or, at some point we might consider a tree without filled out nodes invalid. Then we just say that wherever we're asked for a new name, we instead abort the whole computation. This can be implemented just as
checkTree :: Tree (Maybe String) -> Maybe (Tree String)
checkTree = fillTreeM Nothing
where Nothing here is of type Maybe String, which, instead of trying to generate a new name, aborts the computation within the Maybe monad.
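For instance, with the definitions above and a small made-up tree (Node comes from Data.Tree):
sample :: Tree (Maybe String)
sample = Node (Just "root") [Node Nothing [], Node (Just "leaf") []]

-- evalState (fillTree' sample) 0  ==> Node "root" [Node "x0" [], Node "leaf" []]
-- checkTree sample                ==> Nothing, because one node has no name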
This level of abstraction would be hardly possible without having the concept of monads.
I'm trying to understand the actual need for the reader and/or state monads.
There are many ways to understand monads in general, and these monads in particular. In this answer, I focus on one understanding of these monads which I believe might be helpful for the OP.
The reader and state monads provide library support for very simple usage patterns:
The reader monad provides support for passing arguments to functions.
The state monad provides support for getting results out of functions and passing them to other functions.
As the OP correctly figured out, there is no great need for library support for these things, which are already very easy in Haskell. So many Haskell programs could use a reader or state monad, but there's no point in doing so, so they don't.
So why would anyone ever use a reader or state monad? I know three important reasons:
Realistic programs contain many functions that call each other and pass information back and forth. Sometimes, many functions take arguments that are just passed on to other functions. The reader monad is a library for this "accept arguments and pass them on" pattern. The state monad is a library for the similar "accept arguments, pass them on, and pass the results back as my result" pattern.
In this situation, a benefit of using the reader or state monad is that the arguments get passed on automatically, and we can focus on the more interesting jobs of these functions. A cost is that we have to use monadic style (do notation etc.).
Realistic programs can use multiple monads at once. They need arguments that are passed on, arguments that are returned, error handling, nondeterminism, ...
In this situation, a benefit of using the reader or state monad transformer is that we can package all of these monads into a single monad transformer stack. We still need monadic style, but now we pay the cost once (use do notation everywhere) and get the benefit often (multiple monads in the transformer stack).
Some library functions work for arbitrary monads. For example, sequence :: [m a] -> m [a] takes a list of monadic actions, runs all of them in sequence, and returns the collected results.
A benefit of using the reader or state (or whatever) monad is that we can use these very generic library functions that work for any monad.
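For example, sequence specialises to two very different monads without any change to its definition (counts is a made-up demo):
import Control.Monad.State

collect :: Maybe [Int]
collect = sequence [Just 1, Just 2, Just 3]  -- Just [1,2,3]; a single Nothing anywhere gives Nothing

counts :: State Int [Int]
counts = sequence [modify (+ 1) >> get, modify (* 2) >> get]
-- runState counts 0  ==> ([1,2], 2)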
Note that points 1 and 2 only show up in realistic, somewhat large programs. So it is hard to give a small example for this benefit of using monads. Point 3 shows up in small library functions, but is harder to understand, because these library functions are often very abstract.

How do you identify monadic design patterns?

On my way to learning Haskell I'm starting to grasp the monad concept and to use the well-known monads in my code, but I'm still having difficulties approaching monads from a designer's point of view. In OO there are several rules, like "identify nouns" for objects, or watch for some kind of state and interface... but I'm not able to find equivalent resources for monads.
So how do you identify a problem as monadic in nature? What are good design patterns for monadic design? What's your approach when you realize that some code would be better refactored into a monad?
A helpful rule of thumb is to look for values in a context; monads can be seen as layering "effects" on:
Maybe: partiality (uses: computations that can fail)
Either: short-circuiting errors (uses: error/exception handling)
[] (the list monad): nondeterminism (uses: list generation, filtering, ...)
State: a single mutable reference (uses: state)
Reader: a shared environment (uses: variable bindings, common information, ...)
Writer: a "side-channel" output or accumulation (uses: logging, maintaining a write-only counter, ...)
Cont: non-local control-flow (uses: too numerous to list)
You should generally design your monad by layering on monad transformers from the standard Monad Transformer Library, which let you combine the above effects into a single monad. Together, these handle the majority of monads you might want to use. There are some additional monads not included in the MTL, such as the probability and supply monads.
As far as developing an intuition for whether a newly-defined type is a monad, and how it behaves as one, you can think of it by going up from Functor to Monad:
Functor lets you transform values with pure functions.
Applicative lets you embed pure values and express application — (<*>) lets you go from an embedded function and its embedded argument to an embedded result.
Monad lets the structure of embedded computations depend on the values of previous computations.
The easiest way to understand this is to look at the type of join:
join :: (Monad m) => m (m a) -> m a
This means that if you have an embedded computation whose result is a new embedded computation, you can create a computation that executes the result of that computation. So you can use monadic effects to create a new computation based on values of previous computations, and transfer control flow to that computation.
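Concretely (join lives in Control.Monad):
import Control.Monad (join)

flatMaybe :: Maybe Int
flatMaybe = join (Just (Just 3))  -- Just 3

flatList :: [Int]
flatList = join [[1, 2], [3]]  -- [1,2,3]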
Interestingly, this can be a weakness of structuring things monadically: with Applicative, the structure of the computation is static (i.e. a given Applicative computation has a certain structure of effects that cannot change based on intermediate values), whereas with Monad it is dynamic. This can restrict the optimisation you can do; for instance, applicative parsers are less powerful than monadic ones (well, this isn't strictly true, but it effectively is), but they can be optimised better.
Note that (>>=) can be defined as
m >>= f = join (fmap f m)
and so a monad can be defined simply with return and join (assuming it's a Functor; all monads are applicative functors, but Haskell's typeclass hierarchy unfortunately doesn't require this for historical reasons).
As an additional note, you probably shouldn't focus too heavily on monads, no matter what kind of buzz they get from misguided non-Haskellers. There are many typeclasses that represent meaningful and powerful patterns, and not everything is best expressed as a monad. Applicative, Monoid, Foldable... which abstraction to use depends entirely on your situation. And, of course, just because something is a monad doesn't mean it can't be other things too; being a monad is just another property of a type.
So, you shouldn't think too much about "identifying monads"; the questions are more like:
Can this code be expressed in a simpler monadic form? With which monad?
Is this type I've just defined a monad? What generic patterns encoded by the standard functions on monads can I take advantage of?
Follow the types.
If you find you have written functions with all of these types
(a -> b) -> YourType a -> YourType b
a -> YourType a
YourType (YourType a) -> YourType a
or all of these types
a -> YourType a
YourType a -> (a -> YourType b) -> YourType b
then YourType may be a monad. (I say “may” because the functions must obey the monad laws as well.)
(Remember you can reorder arguments, so e.g. YourType a -> (a -> b) -> YourType b is just (a -> b) -> YourType a -> YourType b in disguise.)
Don't look out only for monads! If you have functions of all of these types
YourType
YourType -> YourType -> YourType
and they obey the monoid laws, you have a monoid! That can be valuable too. Similarly for other typeclasses, most importantly Functor.
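For example, lists fit both sets of shapes at once; all of these are standard functions:
map    :: (a -> b) -> [a] -> [b]
pure   :: a -> [a]
concat :: [[a]] -> [a]       -- join for lists
[]     :: [a]                -- the monoid identity
(++)   :: [a] -> [a] -> [a]  -- the monoid operation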
There's the effect view of monads:
Maybe - partiality / failure short-circuiting
Either - error reporting / short-circuiting (like Maybe with more information)
Writer - write only "state", commonly logging
Reader - read-only state, commonly environment passing
State - read / write state
Resumption - pausable computation
List - multiple successes
Once you are familiar with these effects, it's easy to build monads by combining them with monad transformers. Note that combining some monads needs special care (particularly Cont and any monads with backtracking).
One important thing to note is that there aren't many monads. There are some exotic ones that aren't in the standard libraries, e.g. the probability monad and variations of the Cont monad like Codensity. But unless you are doing something mathematical, it's unlikely you will invent (or discover) a new monad; however, if you use Haskell long enough you'll build many monads that are different combinations of the standard ones.
Edit - Also note that the order in which you stack monad transformers results in different monads:
If you add ErrorT (transformer) to a Writer monad, you get the monad (Either err a, log) - the log is always accessible, whether or not there was an error.
If you add WriterT (transformer) to an Error monad, you get the monad Either err (a, log) - you can only access the log if you have no error.
This is sort of a non-answer, but I feel it is important to say anyways. Just ask! StackOverflow, /r/haskell, and the #haskell irc channel are all great places to get quick feedback from smart people. If you are working on a problem, and you suspect that there's some monadic magic that could make it easier, just ask! The Haskell community loves to solve problems, and is ridiculously friendly.
Don't misunderstand, I'm not encouraging you to never learn for yourself. Quite the contrary, interacting with the Haskell community is one of the best ways to learn. LYAH and RWH, 2 Haskell books that are freely available online, come highly recommended as well.
Oh, and don't forget to play, play, play! As you play around with monadic code, you'll start to get the feel of what "shape" monads have, and when monadic combinators can be useful. If you're rolling your own monad, then usually the type system will guide you to an obvious, simple solution. But to be honest, you should rarely need to roll your own instance of Monad, since Haskell libraries provide tons of useful things as mentioned by other answerers.
There's a common notion that one sees in many programming languages of an "infectious function tag" -- some special behavior for a function that must extend to its callers as well.
Rust functions can be unsafe, meaning they perform operations that can potentially violate memory unsafety. unsafe functions can call normal functions, but any function that calls an unsafe function must be unsafe as well.
Python functions can be async, meaning they return a promise rather than an actual value. async functions can call normal functions, but invocation of an async function (via await) can only be done by another async function.
Haskell functions can be impure, meaning they return an IO a rather than an a. Impure functions can call pure functions, but impure functions can only be called by other impure functions.
Mathematical functions can be partial, meaning they don't map every value in their domain to an output. The definitions of partial functions can reference total functions, but if a total function maps some of its domain to a partial function, it becomes partial as well.
While there may be ways to invoke a tagged function from an untagged function, there is no general way, and doing so can often be dangerous and threatens to break the abstraction the language tries to provide.
The benefit, then, of having tags is that you can expose a set of special primitives that are given this tag and have any function that uses these primitives make that clear in its signature.
Say you're a language designer and you recognize this pattern, and you decide that you want to allow user-defined tags. Let's say the user defined a tag Err, representing computations that may throw an error. A function using Err might look like this:
function div <Err> (n: Int, d: Int): Int
    if d == 0
        throwError("division by 0")
    else
        return (n / d)
If we wanted to simplify things, we might observe that there's nothing erroneous about taking arguments - it's computing the return value where problems might arise. So we can restrict tags to functions that take no arguments, and have div return a closure rather than the actual value:
function div(n: Int, d: Int): <Err> () -> Int
    () =>
        if d == 0
            throwError("division by 0")
        else
            return (n / d)
In a lazy language such as Haskell, we don't need the closure, and can just return a lazy value directly:
div :: Int -> Int -> Err Int
div _ 0 = throwError "division by 0"
div n d = return (n `quot` d)
It is now apparent that, in Haskell, tags need no special language support - they are ordinary type constructors. Let's make a typeclass for them!
class Tag m where
We want to be able to call an untagged function from a tagged function, which is equivalent to turning an untagged value (a) into a tagged value (m a).
addTag :: a -> m a
We also want to be able to take a tagged value (m a) and apply a tagged function (a -> m b) to get a tagged result (m b):
embed :: m a -> (a -> m b) -> m b
This, of course, is precisely the definition of a monad! addTag corresponds to return, and embed corresponds to (>>=).
It is now clear that "tagged functions" are merely a type of monad. As such, whenever you spot a place where a "function tag" could apply, chances are you've got a place suitable for a monad.
P.S. Regarding the tags I've mentioned in this answer: Haskell models impurity with the IO monad and partiality with the Maybe monad. Most languages implement async/promises fairly transparently, and there seems to be a Haskell package called promise that mimics this functionality. The Err monad is equivalent to the Either String monad. I'm not aware of any language that models memory unsafety monadically, though it could be done.

Can Haskell's monads be thought of as using and returning a hidden state parameter?

I don't understand the exact algebra and theory behind Haskell's monads. However, when I think about functional programming in general I get the impression that state would be modelled by taking an initial state and generating a copy of it to represent the next state. This is like when one list is appended to another; neither list gets modified, but a third list is created and returned.
Is it therefore valid to think of monadic operations as implicitly taking an initial state object as a parameter and implicitly returning a final state object? These state objects would be hidden so that the programmer doesn't have to worry about them, and to control how they get accessed. So, the programmer would not try to copy the object representing the IO stream as it was ten minutes ago.
In other words, if we have this code:
main = do
    putStrLn "Enter your name:"
    name <- getLine
    putStrLn ("Hello " ++ name)
...is it OK to think of the IO monad and the "do" syntax as representing this style of code?
putStrLn :: IOState -> String -> IOState
getLine :: IOState -> (IOState, String)
main :: IOState -> IOState

-- main returns an IOState we can call "state3"
main state0 = putStrLn state2 ("Hello " ++ name)
  where
    (state2, name) = getLine state1
    state1 = putStrLn state0 "Enter your name:"
No, that's not what monads in general do. However, your analogy is in fact exactly correct with regards to the data type State s a, which happens to be a monad. State is defined like this:
newtype State s a = State { runState :: s -> (a, s) }
...where the type variable s is the state value and a is the "regular" value that you use. So a value in "the State monad" is just a function from an initial state, to a return value and final state. The monadic style, as applied to State, does nothing more than automatically thread a state value through a sequence of functions.
The ST monad is superficially similar, but uses magic to allow computations with real side-effects, but only such that the side effects can't be observed from outside particular uses of ST.
The IO monad is essentially an ST monad set to "more magic", with side effects that touch the outside world and only a single point where IO computations are run, namely the entry point for the entire program. Still, on some conceptual level, you can still think of it as threading a "state" value through functions the way regular State does.
However, other monads don't necessarily have anything whatsoever to do with threading state, or sequencing functions, or whatnot. The operations needed for something to be a monad are incredibly general and abstract. For instance, using Maybe or Either as monads lets you use functions that might return errors, with the monadic style handling escaping from the computation when an error occurs the same way that State threads a state value. Using lists as a monad gives you nondeterminism, letting you simultaneously apply functions to multiple inputs and see all possible outputs, with the monadic style automatically applying the function to each argument and collecting all the outputs.
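The list case in miniature, with a made-up pairs example:
pairs :: [(Int, Char)]
pairs = do
  n <- [1, 2]
  c <- "ab"
  return (n, c)

-- pairs  ==> [(1,'a'),(1,'b'),(2,'a'),(2,'b')]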
Is it therefore valid to think of monadic operations as implicitly taking an initial state object as a parameter and implicitly returning a final state object?
This seems to be a common sticking point for learning monads, i.e., trying to figure out how a single magical monad primordial soup is simultaneously useful for representing stateful computations, computations that can fail, nondeterministic computations, exceptions, continuations, sequencing side effects, and so on.
Threading state through a sequence of stateful computations is one single example of an operation that satisfies the monad laws.
You're correct in observing that the State and IO monads are close cousins, but your analogy will fall apart if you try inserting, say, the list monad.
Not monads in general, but for the IO monad, yes — in fact, the type IO a is often defined as the function type RealWorld -> (RealWorld, a). So in this view, the desugared type of putStrLn is String -> RealWorld -> (RealWorld, ()) and getChar is RealWorld -> (RealWorld, Char) — and we only partially apply it, with the monadic bind taking care of fully evaluating it and passing the RealWorld around. (GHC's ST library actually includes a very real RealWorld type, though it's described as "deeply magical" and not for actual use.)
There are many other monads that don't have this property, though. There's no RealWorld being passed around with, for example, the monads [1,2,3,4] or Just "Hello".
Absolutely not. This is not what monads are in general. You can use monads to implicitly pass around data, but that is just one application. If you use this model of monads then you are going to miss out on a lot of the really cool things monads can do.
Instead, think about monads as being pieces of data that represent a computation. For example, there is a sense in which implicitly passing around data isn't pure because pure languages insist that you be explicit about all of your arguments and return types. So if you want to pass around data implicitly you can do this: define a new data type that is a representation of doing something impure, and then write a piece of code to work with that.
An extreme example (just a theoretical example, you're unlikely to want to do this) would be this: C allows impure computations, so you could define a type that represents a piece of C code. You can then write an interpreter that takes one of these C structures and interprets it. All monads are like this, though usually much simpler than a C interpreter. But this view is more powerful because it also explains the monads that aren't about passing around hidden state.
You should probably also try to resist the temptation to see IO as passing around a hidden world state. The internal implementation of IO has nothing to do with how you should think about IO. The IO monad is about building representations of the I/O you want to perform. You return one of these representations to the Haskell run-time system and it's up to that system to implement your representation however it wants.
Any time you want to do something, and you can't see how to directly implement it in a pure way, but you can see how to build a pure data structure to describe what you want, you may have an application for a monad, especially if your structure is naturally a tree of some sort.
I prefer to think about monads as objects that represent delayed actions (runXXX, main) whose results can be combined according to those results.
return "something"
actionA >>= \x -> makeActionB x
And those actions don't have to be stateful. For instance, you can think of the function monad, ((->) a), which base defines essentially like this:
instance Monad ((->) a) where
    m >>= fm = \x -> fm (m x) x
    return = const

sqr = \x -> x * x
cube = \x -> x * x * x

weird = do
    a <- sqr
    b <- cube
    return (a + b)
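For example, weird 3 evaluates to sqr 3 + cube 3 = 9 + 27 = 36; the shared argument 3 is passed to both sqr and cube behind the scenes.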
