Left recursion of >>= in Haskell - haskell

I've just read this very interesting article about an alternative implementation for the Prompt Monad : http://joeysmandatory.blogspot.com/2012/06/explaining-prompt-monad-with-simpler.html
Here is a simplified code that can be run :
data Request a where
GetLine :: Request String
PutStrLn :: String -> Request ()
data Prompt p r
= forall a. Ask (p a) (a -> Prompt p r)
| Answer r
instance Monad (Prompt p) where
return = Answer
Ask req cont >>= k = Ask req (\ans -> cont ans >>= k)
Answer x >>= k = k x
prompt :: p a -> Prompt p a
prompt req = Ask req Answer
runPromptM :: Monad m => (forall a. p a -> m a) -> Prompt p r -> m r
runPromptM perform (Ask req cont) = perform req
>>= runPromptM perform . cont
runPromptM _ (Answer r) = return r
handleIO :: Request a -> IO a
handleIO GetLine = return ""
handleIO (PutStrLn s) = putStrLn s
req :: Prompt Request ()
req = do
answers <- sequence $ replicate 20000 (prompt GetLine)
prompt $ PutStrLn (concat answers)
main = runPromptM handleIO req
A comment in the article mentions that :
it has a problem that left recursion of >>= takes quadratic time to evaluate (it's the same as the left-recursion of ++ problem!)
I don't understand where the quadratic time (which I checked experimentally) come from. Is it related to lazy evaluation ?
Can someone explain me why ?

I feel this is a little easier to explain using the Free monad over Prompt, though they are very similar.
data Free f a = Pure a | Free (f (Free f a)) deriving Functor
The Free monad is either a completed operation marked by Pure or an f-indexed effect marked by Free. If f is a Functor then Free f is a Monad
instance Functor f => Monad (Free f) where
return = Pure
Pure a >>= f = f a
Free m >>= f = Free (fmap (>>= f) m)
This Monad instance work by "pushing" binds down through the layers of the Functor to reach the Pure nodes at the bottom and then applying f. What's important about this is that the number of Functor layers does not change. To wit, here's an example bind occurring in Free Maybe ().
Free (Just (Free (Just (Free Nothing)))) >>= (const (return ()))
Free (fmap (>>= (const (return ()))) (Just (Free (Just (Free Nothing))))
Free (Just (Free (Just (Free Nothing)) >>= (const (return ())))
Free (Just (Free (fmap (>>= (const (return ()))) (Just (Free Nothing))))
Free (Just (Free (Just (Free Nothing >>= (const (return ()))))))
Free (Just (Free (Just (Free Nothing))))
We see here a hint of what's to come—we had to traverse the entire tree down to the root just to do nothing at all.
The "Tree-Grafting" Substitution Monad
One way to see the Free monad is to think of it as the "substitution monad". Its bind looks like
(=<<) :: (a -> Free f b) -> Free f a -> Free f b
and if you think of a -> Free f b as converting Pure leaf values into new trees of effects then (=<<) just descends through the tree of effects Free f a and performs substitution over the a values.
Now if we have a chain of right associating binds (written using Kleisli composition (>=>) to prove that reassociation is valid—Kleisli arrows are a category)
f >=> (g >=> h)
then we only must descend into the tree once—the substitution function is computed at all of the nodes of the tree. However, if we're associated the other direction
(f >=> g) >=> h
we get the same result, but must compute the entire result (f >=> g) before we're able to apply the substitution function in h. What this means is that if we have a deeply nested left-associated bind sequence
((((f >=> g) >=> h) >=> i) >=> j) >=> k
then we're continually recomputing the left results so that we can recurse on them. This is where the quadratic slowdown appears.
Speedup with Codensity
There's a really strange type called Codensity which is related to continuation passing style and the ContT monad.
data Codensity m a = Codensity { runCodensity :: forall b . (a -> m b) -> m b }
-- Codensity m a = forall b . ContT b m a
Codensity has the interesting property that it's a Functor even when m isn't:
instance Functor (Codensity m) where
fmap f m = Codensity (\amb -> runCodensity m (amb . f))
and the unique property that it's a Monad even when m isn't:
instance Monad (Codensity m) where
return a = Codensity (\amb -> amb a)
m >>= f = Codensity (\bmc -> runCodensity m (\a -> runCodensity (f a) bmc))
We can also round-trip Monads through Codensity
toCodensity :: Monad m => m a -> Codensity m a
toCodensity ma = Codensity (\amb -> ma >>= amb)
fromCodensity :: Monad m => Codensity m a -> m a
fromCodensity m = runCodensity m return
roundtrip :: Monad m => m a -> m a
roundtrip = fromCodensity . toCodensity
but when we do this something very, very interesting happens: all of the binds become right-associated!.

Consider the classic left associated append problem. You are probably aware that whenever you have a series of left associated append-like functions (I'll use (++) here), you run into O(n^2) issues:
(((as ++ bs) ++ cs) ++ ds) ++ ...
It is easy to see that each additional append will have to traverse the entire length of the previously appended list, resulting in horrendously slow O(n^2) algorithm.
The solution is to right associate:
as ++ (bs ++ (cs ++ (ds ++ ...)))
This is O(a + b + c + d + ...) where a is the length of as, etc. or simply O(n) in the length of the total list.
Now, what does this have to do with Free? Let's suggestively compare the definition of Free with []:
data Free f r = Pure r | Free (f (Free f r))
data [] a = [] | Cons a [a]
While [] has values at each Cons node, Free only has a value at the Pure tip. Apart from that the definitions are very similar. A good intuition for Free is that it is a list-like data structure of Functors.
Now the Monad instance for Free, we only care about (>>=):
Pure r >>= f = f r
Free x >>= f = Free (fmap (>>= f) x)
Notice that (>>=) traverses the structure of Free until it reaches the Pure value, and then grafts (appends) additional Free structure onto the end. Remarkably similar to (++)!
With this intuition of Free as a list of Functor, and (>>=) behaving as (++), it should be clear why left associated (>>=) causes problems.

Related

Cases in which we shall not use monadic bind to write mfix down using loop

I have been trying to write mfix down using Control.Arrow.loop. I came up with different definitions and would like to see which one is mfix's actual workalike.
So, the solution I reckon to be the right one is the following:
mfix' :: MonadFix m => (a -> m a) -> m a
mfix' k = let f ~(_, d) = sequenceA (d, k d)
in (flip runKleisli () . loop . Kleisli) f
As one can see, the loop . Kleisli's argument works for Applicative instances. I find it to be a good sign as we mostly have our knot-tying ruined by (>>=)'s strictness in the right argument.
Here is another function. I can tell that it is not mfix's total workalike, but the only case I found is not very natural. Take a look:
mfix'' k = let f ~(_, d) = fmap ((,) d) (return d >>= k)
in (flip runKleisli () . loop . Kleisli) f
As far as I understand, not every strict on the right-hand bind forces its argument entirely. For example, in case of IO:
GHCi> mfix'' ((return :: a -> IO a) . (1:))
[1,1,1,1,1,Interrupted.
So, I decided to fix this. I just took Maybe and forced x in Just x >>= k:
data Maybe' a = Just' a | Nothing' deriving Show
instance Functor Maybe' where
fmap = liftM
instance Applicative Maybe' where
pure = return
(<*>) = ap
instance Monad Maybe' where
return = Just'
Nothing' >>= k = Nothing'
Just' x >>= k = x `seq` k x
instance MonadFix Maybe' where
mfix f = let a = f (unJust' a) in a
where unJust' (Just' x) = x
unJust' Nothing' = errorWithoutStackTrace "mfix Maybe': Nothing'."
Having this on our hands:
GHCi> mfix ((return :: a -> Maybe' a) . (1:))
[1,1,1,1,1,Interrupted.
GHCi> mfix' ((return :: a -> Maybe' a) . (1:))
[1,1,1,1,1,Interrupted.
GHCi> mfix'' ((return :: a -> Maybe' a) . (1:))
Interrupted.
So, here are my questions:
Is there any other example which could show that mfix'' is not
totally mfix?
Are monads with such a strict bind, like Maybe',
interesting in practice?
Are there any examples which show that mfix' is not totally mfix that I have not found?
A small side note on IO:
mfix3 k' =
let
k = return . k'
f ~(_, d) = fmap ((,) d) (d >>= k)
in (join . flip runKleisli () . loop . Kleisli) f
Do not worry about all the returns and joins - they are here just to have mfix3's and mfix's types match. The idea is that we pass d itself instead of return d to the (>>=) on the right-hand. It gives us the following:
GHCi> mfix3 ((return :: a -> IO a) . (1:))
Interrupted.
Yet, for example (thanks to Li-yao Xia for their comment):
GHCi> mfix3 ((return :: a -> e -> a) . (1:)) ()
[1,1,1,1,1,Interrupted.
Edit: thanks to HTNW for an important note on pattern-matching in the comments: it is better to use \ ~(_, d) -> ..., not \ (_, d) -> ....
Here's a partial answer, which I hope is better than no answer.
Is there any other example which could show that mfix'' is not totally mfix?
We can distinguish mfix'' from mfix also by making return strict instead of (>>=).
Are monads with such a strict bind, like Maybe', interesting in practice?
Probably not. (Questions about the existence of "practical" examples are not easy to answer negatively.)
Containers that are strict in their elements might be an example of this. (In case you're wondering about the official containers package, it does not actually define Monad instances for Map and IntMap, and the Monad instance of Seq is lazy in the elements of the sequence).
Note also that it is unclear whether the monad laws take strictness into account. If you do, then such things are not lawfully monads because they break the left identity law: (return x >>= k) = k x for x = undefined.
Are there any examples which show that mfix' is not totally mfix that I have not found?
If you take the definition of loop in the standard library, in terms of mfix, then I think that mfix' = mfix, though I couldn't complete a proof (I could either be missing a good trick, or there's a missing MonadFix law).
The main point of contention, as was hinted at in the comments, is strictness. Both your definition of mfix' and the standard library's definition of loop are careful to expand the argument function to be lazier (using lazy patterns (~(_, d)) and snd respectively; the two techniques are equivalent). mfix and mfix' are still equal if exactly one of those precautions is dropped. There is a mismatch (mfix /= mfix') if both are dropped.

Is the Yoneda Lemma only useful from a theoretical point of view?

For instance, loop fusion can be obtained with Yoneda:
newtype Yoneda f a =
Yoneda (forall b. (a -> b) -> f b)
liftYo :: (Functor f) => f a -> Yoneda f a
liftYo x = Yoneda $ \f -> fmap f x
lowerYo :: (Functor f) => Yoneda f a -> f a
lowerYo (Yoneda y) = y id
instance Functor (Yoneda f) where
fmap f (Yoneda y) = Yoneda $ \g -> y (g . f)
loopFusion = lowerYo . fmap f . fmap g . liftYo
But I could have just written loopFusion = fmap (f . g). Why would I use Yoneda at all? Are there other use cases?
Well, in this case you could have done fusion by hand, because the two fmaps are "visible" in the source code, but the point is that Yoneda does the transformation at runtime. It's a dynamic thing, most useful when you don't know how many times you will need to fmap over a structure. E.g. consider lambda terms:
data Term v = Var v | App (Term v) (Term v) | Lam (Term (Maybe v))
The Maybe under Lam represents the variable bound by the abstraction; in the body of a Lam, the variable Nothing refers to the bound variable, and all variables Just v represent the ones bound in the environment. (>>=) :: Term v -> (v -> Term v') -> Term v' represents substitution—each variable can be replaced with a Term. However, when replacing a variable inside a Lam, all the variables in the produced Term need to be wrapped in Just. E.g.
Lam $ Lam $ Var $ Just $ Just $ ()
>>= \() -> App (Var "f") (Var "x")
=
Lam $ Lam $ App (Var $ Just $ Just "f") (Var $ Just $ Just "x")
The naive implementation of (>>=) goes like this:
(>>=) :: Term v -> (v -> Term v') -> Term v'
Var x >>= f = f x
App l r >>= f = App (l >>= f) (r >>= f)
Lam b >>= f = Lam (b >>= maybe (Var Nothing) (fmap Just . f))
But, written like this, every Lam that (>>=) goes under adds an fmap Just to f. If I had a Var v buried under 1000 Lams, then I would end up calling fmap Just and iterating over the new f v term 1000 times! I can't just pull your trick and fuse multiple fmaps into one, by hand, because there's only one fmap in the source code being called multiple times.
Yoneda can ease the pain:
bindTerm :: Term v -> (v -> Yoneda Term v') -> Term v'
bindTerm (Var x) f = lowerYoneda (f x)
bindTerm (App l r) f = App (bindTerm l f) (bindTerm r f)
bindTerm (Lam b) f =
Lam (bindTerm b (maybe (liftYoneda $ Var Nothing) (fmap Just . f)))
(>>=) :: Term v -> (v -> Term v') -> Term v'
t >>= f = bindTerm t (liftYoneda . f)
Now, the fmap Just is free; it's just a wonky function composition. The actual iteration over the produced Term is in the lowerYoneda, which is only called once for each Var. To reiterate: the source code nowhere contains anything of the form fmap f (fmap g x). Such forms only arise at runtime, dynamically, depending on the argument to (>>=). Yoneda can rewrite that, at runtime, to fmap (f . g) x, even though you can't rewrite it like that in the source code. Further, you can add Yoneda to existing code with minimal changes to it. (There is, however, a drawback: lowerYoneda is always called exactly once for each Var, which means e.g. Var v >>= f = fmap id (f v) where it was just f v, before.)
There is an example similar in spirit to the one described by HTNW in lens. A look at (a paraphrased version of) the lens function illustrates what a typical van Laarhoven lens looks like:
-- type Lens s t a b = forall f. Functor f => (a -> f b) -> s -> f t
lens :: (s -> a) -> (s -> b -> t) -> Lens s t a b
lens getter setter = \f -> \s -> fmap (setter s) (f (getter s))
The occurrence of fmap in there means that composing lenses will, in principle, lead to consecutive uses of fmap. Now, most of the time this doesn't actually matter: the implementations in lens use a lot of inlining and newtype coercion so that, when using the lens combinators (view, over, and so forth), the involved functor (typically Const or Identity) is generally optimised away. In a few situations, however, it is not possible to do that (for instance, if the lens is used in such a way that a concrete choice of functor isn't done in compile time). As compensation, lens offers a helper function called fusing, which makes it possible to have fmap fusion in those special cases. Its implementation is:
-- type LensLike f s t a b = (a -> f b) -> s -> f t
fusing :: Functor f => LensLike (Yoneda f) s t a b -> LensLike f s t a b
fusing t = \f -> lowerYoneda . t (liftYoneda . f)
That being so, if you write fusing (foo . bar), Yoneda f is picked as the functor to be used by foo . bar, which guarantees fmap fusion.
(This answer was inspired by a comment by Edward Kmett, which I happened to stumble upon right before seeing this question.)

Is the streaming package's Stream data type equivalent to FreeT?

The streaming package defines a Stream type that looks like the following:
data Stream f m r
= Step !(f (Stream f m r))
| Effect (m (Stream f m r))
| Return r
There is a comment on the Stream type that says the following:
The Stream data type is equivalent to FreeT and can represent any effectful succession of steps, where the form of the steps or 'commands' is specified by the first (functor) parameter.
I'm wondering how the Stream type is equivalent to FreeT?
Here is the definition of FreeT:
data FreeF f a b = Pure a | Free (f b)
newtype FreeT f m a = FreeT { runFreeT :: m (FreeF f a (FreeT f m a)) }
It looks like it is not possible to create an isomorphism between these two types.
To be specific, I don't see a way to write the following two functions that makes them an isomorphism:
freeTToStream :: FreeT f m a -> Stream f m a
streamToFreeT :: Stream f m a -> FreeT f m a
For instance, I'm not sure how to express a value like Return "hello" :: Stream f m String as a FreeT.
I guess it could be done like the following, but the Pure "hello" is necessarily going to be wrapped in an m, while in Return "hello" :: Stream f m String it is not:
FreeT $ pure $ Pure "hello" :: Applicative m => FreeT f m a
Can Stream be considered equivalent to FreeT even though it doesn't appear possible to create an isomorphism between them?
There are some small differences that make them not literally equivalent. In particular, FreeT enforces an alternation of f and m,
FreeT f m a = m (Either a (f (FreeT f m a) = m (Either a (f (m (...))))
-- m f m -- alternating
whereas Stream allows stuttering, e.g., we can construct the following with no Step between two Effect:
Effect (return (Effect (return (Return r))))
which should be equivalent in some sense to
Return r
Thus we shall take a quotient of Stream by the following equations that flatten the layers of Effect:
Effect (m >>= \a -> return (Effect (k a))) = Effect (m >>= k)
Effect (return x) = x
Under that quotient, the following are isomorphisms
freeT_stream :: (Functor f, Monad m) => FreeT f m a -> Stream f m a
freeT_stream (FreeT m) = Effect (m >>= \case
Pure r -> return (Return r)
Free f -> return (Step (fmap freeT_stream f))
stream_freeT :: (Functor f, Monad m) => Stream f m a -> FreeT f m a
stream_freeT = FreeT . go where
go = \case
Step f -> return (Free (fmap stream_freeT f))
Effect m -> m >>= go
Return r -> return (Pure r)
Note the go loop to flatten multiple Effect constructors.
Pseudoproof: (freeT_stream . stream_freeT) = id
We proceed by induction on a stream x. To be honest, I'm pulling the induction hypotheses out of thin air. There are certainly cases where induction is not applicable. It depends on what m and f are, and there might also be some nontrivial setup to ensure this approach makes sense for a quotient type. But there should still be many concrete m and f where this scheme is applicable. I hope there is some categorical interpretation that translates this pseudoproof to something meaningful.
(freeT_stream . stream_freeT) x
= freeT_stream (FreeT (go x))
= Effect (go x >>= \case
Pure r -> return (Return r)
Free f -> return (Step (fmap freeT_stream f)))
Case x = Step f, induction hypothesis (IH) fmap (freeT_stream . stream_freeT) f = f:
= Effect (return (Step (fmap freeT_stream (fmap stream_freeT f))))
= Effect (return (Step f)) -- by IH
= Step f -- by quotient
Case x = Return r
= Effect (return (Return r))
= Return r -- by quotient
Case x = Effect m, induction hypothesis m >>= (return . freeT_stream . stream_freeT)) = m
= Effect ((m >>= go) >>= \case ...)
= Effect (m >>= \x' -> go x' >>= \case ...) -- monad law
= Effect (m >>= \x' -> return (Effect (go x' >>= \case ...))) -- by quotient
= Effect (m >>= \x' -> (return . freeT_stream . stream_freeT) x') -- by the first two equations above in reverse
= Effect m -- by IH
Converse left as an exercise.
Both your example with Return and my example with nested Effect constructors cannot be represented by FreeT with the same parameters f and m. There are more counterexamples, too. The underlying difference in the data types can best be seen in a hand-wavey space where the data constructors are stripped out and infinite types are allowed.
Both Stream f m a and FreeT f m a are for nesting an a type inside a bunch of f and m type constructors. Stream allows arbitrary nesting of f and m, while FreeT is more rigid. It always has an outer m. That contains either an f and another m and repeats, or an a and terminates.
But that doesn't mean there isn't an equivalence of some sort between the types. You can show some equivalence by showing that each type can be embedded inside the other faithfully.
Embedding a Stream inside a FreeT can be done on the back of one observation: if you choose an f' and m' such that the f and m type constructors are optional at each level, you can model arbitrary nesting of f and m. One quick way to do that is use Data.Functor.Sum, then write a function:
streamToFreeT :: Stream f m a -> FreeT (Sum Identity f) (Sum Identity m) a
streamToFreeT = undefined -- don't have a compiler nearby, not going to even try
Note that the type won't have the necessary instances to function. That could be corrected by switching Sum Identity to a more direct type that actually has an appropriate Monad instance.
The transformation back the other direction doesn't need any type-changing trickery. The more restricted shape of FreeT is already directly embeddable inside Stream.
I'd say this makes the documentation correct, though possibly it should use a more precise term than "equivalent". Anything you can construct with one type, you can construct with the other - but there might be some extra interpretation of the embedding and a change of variables involved.

Is there a "chain" monad function in Haskell?

Explain about a "duplicate"
Someone point to Is this a case for foldM? as a possible duplicate. Now, I have a strong opinion that, two questions that can be answered with identical answers are not necessarily duplicates! "What is 1 - 2" and "What is i^2" both yields "-1", but no, they are not duplicate questions. My question (which is already answered, kind of) was about "whether the function iterateM exists in Haskell standard library", not "How to implement a chained monad action".
The question
When I write some projects, I found myself writing this combinator:
repeatM :: Monad m => Int -> (a -> m a) -> a -> m a
repeatM 0 _ a = return a
repeatM n f a = (repeatM (n-1) f) =<< f a
It just performs a monadic action n times, feeding the previous result into the next action. I tried some hoogle search and some Google search, and did not find anything that comes with the "standard" Haskell. Is there such a formal function that is predefined?
You can use foldM, e.g.:
import Control.Monad
f a = do print a; return (a+2)
repeatM n f a0 = foldM (\a _ -> f a) a0 [1..n]
test = repeatM 5 f 3
-- output: 3 5 7 9 11
Carsten mentioned replicate, and that's not a bad thought.
import Control.Monad
repeatM n f = foldr (>=>) pure (replicate n f)
The idea behind this is that for any monad m, the functions of type a -> m b form the Kleisli category of m, with identity arrows
pure :: a -> m a
(also called return)
and composition operator
(<=<) :: (b -> m c) -> (a -> m b) -> a -> m c
f <=< g = \a -> f =<< g a
Since were actually dealing with a function of type a -> m a, we're really looking at one monoid of the Kleisli category, so we can think about folding lists of these arrows.
What the code above does is fold the composition operator, flipped, into a list of n copies of f, finishing off with an identity as usual. Flipping the composition operator actually puts us into the dual category; for many common monads, x >=> y >=> z >=> w is more efficient than w <=< z <=< y <=< x; since all the arrows are the same in this case, it seems we might as well. Note that for the lazy state monad and likely also the reader monad, it may be better to use the unflipped <=< operator; >=> will generally be better for IO, ST s, and the usual strict state.
Notice: I am no category theorist, so there may be errors in the explanation above.
I find myself wanting this function often, I wish it had a standard name. That name however would not be repeatM - that would be for an infinite repeat, like forever if it existed, just for consistency with other libraries (and repeatM is defined in some libraries that way).
Just as another perspective from the answers already given, I point out that (s -> m s) looks a bit like an action in a State monad with state type s.
In fact, it is isomorphic to StateT s m () - an action which returns no value, because all the work it does is encapsulated in the way it changes the state. In this monad, the function you wanted really is replicateM. You can write it this way in haskell although it probably looks uglier than just writing it directly.
First convert s -> m s to the equivalent form which StateT uses, adding the information-free (), using liftM to map a function over the return type.
> :t \f -> liftM (\x -> ((),x)) . f
\f -> liftM (\x -> ((),x)) . f :: Monad m => (a -> m t) -> a -> m ((), t)
(could have used fmap but the Monad constraint seems clearer here; could have used TupleSections if you like; if you find do notation easier to read it is simply \f s -> do x <- f s; return ((),s) ).
Now this has the right type to wrap up with StateT:
> :t StateT . \f -> liftM (\x -> ((),x)) . f
StateT . \f -> liftM (\x -> ((),x)) . f :: Monad m => (s -> m s) -> StateT s m ()
and then you can replicate it n times, using the replicateM_ version because the returned list [()] from replicateM would not be interesting:
> :t \n -> replicateM_ n . StateT . \f -> liftM (\x -> ((),x)) . f
\n -> replicateM_ n . StateT . \f -> liftM (\x -> ((),x)) . f :: Monad m => Int -> (s -> m s) -> StateT s m ()
and finally you can use execStateT to go back to the Monad you were originally working in:
runNTimes :: Monad m => Int -> (s -> m s) -> s -> m s
runNTimes n act =
execStateT . replicateM_ n . StateT . (\f -> liftM (\x -> ((),x)) . f) $ act

how to achieve "product of two monads" effect?

Suppose we have two monads, m and m'. Now, suppose we have variables,
-- in real problems, the restriction is some subclass MyMonad, so don't worry
-- if it's the case here that mx and f must essentially be pure.
mx :: Monad m'' => m'' a
f :: Monad m'' => a -> m'' b
Is there a way to create anything similar to the product m x m'? I know this is possible with Arrows, but it seems more complicated (impossible?) for monads, especially when trying to write what mx >>= f should do.
To see this, define
data ProdM a = ProdM (m a) (m' a)
instance Monad ProdM where
return x = ProdM (return x) (return x)
but now, when we define mx >>= f, it's not clear which value from mx to pass to f,
(ProdM mx mx') >>= f
{- result 1 -} = mx >>= f
{- result 2 -} = mx' >>= f
I want (mx >>= f) :: ProdM to be isomorphic to ((mx >>= f) :: m) x ((mx >>= f) :: m').
Yes, this type is a monad. The key is simply to pass both results to f, and only keep the matching field from the result. That is, we keep the first element from the result of passing mx's result, and the second element from the result of passing mx''s result. The instance looks like this:
instance (Monad m, Monad m') => Monad (ProdM m m') where
return a = ProdM (return a) (return a)
ProdM mx mx' >>= f = ProdM (mx >>= fstProd . f) (mx' >>= sndProd . f)
where fstProd (ProdM my _) = my
sndProd (ProdM _ my') = my'
ProdM is available in the monad-products package under the name Product.

Resources