How to hoist Conduit of STT

I've been trying to write an implementation of the function:
foo :: Monad m => ConduitM i o (forall s. STT s m) r -> ConduitM i o m r
But I've been failing at every turn with the error:
Couldn't match type because variable `s` would escape its scope.
I'm now suspicious that implementing this function is impossible.
threadSTT :: Monad m
          => (forall a. (forall s. STT s m a) -> m a)
          -> ConduitM i o (forall s. STT s m) r
          -> ConduitM i o m r
threadSTT runM (ConduitM c0) =
    ConduitM $ \rest ->
        let go (Done r) = rest r
            go (PipeM mp) = PipeM $ do
                r <- runM mp -- ERROR
                return $ go r
            go (Leftover p i) = Leftover (go p) i
            go (NeedInput x y) = NeedInput (go . x) (go . y)
            go (HaveOutput p f o) = HaveOutput (go p) (runM f) o -- ERROR
        in go (c0 Done)
foo :: Monad m => ConduitM i o (forall s. STT s m) r -> ConduitM i o m r
foo = threadSTT STT.runST
Can anyone speak to this? I'd really love it to work, but if it can't, then I need to abandon the use of Data.Array.ST for writing my conduits.

It seems that you have reinvented the MFunctor instance of ConduitM. You may want to check the source code.
According to the author of the conduit package, hoisting a monad in this style gives surprising results when you try to unwrap a monad with side effects. In your case runST will be called multiple times, so the state is thrown away every time the conduit produces an item.
You'd be better off lifting every other conduit in the pipeline from Conduit i o m r to Conduit i o (STT s m) r and calling runST on the result. It's as easy as transPipe lift.
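A minimal sketch of that advice, assuming the conduit (1.3 or later, for ConduitT, runConduit and .|) and STMonadTrans packages. The names runningSum and pipeline are made up for illustration, and runSTT is assumed to be the current name of the runner (older STMonadTrans releases call it runST). The stateful stage lives in STT s m, the other stages are lifted into it with transPipe lift, and the ST layer is run exactly once around the whole pipeline:

```haskell
{-# LANGUAGE RankNTypes #-}
import Control.Monad.ST.Trans (STT, runSTT, newSTRef, readSTRef, writeSTRef)
import Control.Monad.Trans.Class (lift)
import Data.Conduit (ConduitT, awaitForever, runConduit, transPipe, yield, (.|))
import qualified Data.Conduit.List as CL

-- Hypothetical stateful stage: emits the running sum of its inputs,
-- keeping the accumulator in an STRef that lives in the STT layer.
runningSum :: Monad m => ConduitT Int Int (STT s m) ()
runningSum = do
  ref <- lift (newSTRef 0)
  awaitForever $ \x -> do
    total <- lift $ do
      t <- readSTRef ref
      let t' = t + x
      writeSTRef ref t'
      return t'
    yield total

-- Lift the stages that know nothing about STT, then run the ST layer
-- once around the complete pipeline instead of once per step.
pipeline :: Monad m => [Int] -> m [Int]
pipeline xs =
  runSTT
    (runConduit
      (  transPipe lift (CL.sourceList xs)
      .| runningSum
      .| transPipe lift CL.consume
      ))
```

Because lift is a monad morphism, transPipe lift does not suffer from the restart problem that transPipe runST has.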

Related

Converting this FreeT (explicitly recursive data type) function to work on FT (church encoding)

I'm using the FreeT type from the free library to write this function which "runs" an underlying StateT:
runStateFree
    :: (Functor f, Monad m)
    => s
    -> FreeT f (StateT s m) a
    -> FreeT f m (a, s)
runStateFree s0 (FreeT x) = FreeT $ do
    flip fmap (runStateT x s0) $ \(r, s1) -> case r of
      Pure y -> Pure (y, s1)
      Free z -> Free (runStateFree s1 <$> z)
However, I'm trying to convert it to work on FT, the church-encoded version, instead:
runStateF
:: (Functor f, Monad m)
=> s
-> FT f (StateT s m) a
-> FT f m (a, s)
runStateF s0 (FT x) = FT $ \ka kf -> ...
but I'm not quite having the same luck. Every sort of combination of things I get seems to not quite work out. The closest I've gotten is
runStateF s0 (FT x) = FT $ \ka kf ->
    ka =<< runStateT (x pure (\n -> _ . kf (_ . n))) s0
But the type of the first hole is m r -> StateT s m r and the type of the second hole is StateT s m r -> m r, which means we necessarily lose the state in the process.
I know that all FreeT functions are possible to write with FT. Is there a nice way to write this that doesn't involve round-tripping through FreeT (that is, a way that doesn't require explicitly matching on Pure and Free)? (I've tried manually inlining things, but I don't know how to deal with the recursion using different values of s in the definition of runStateFree.) Or maybe this is one of those cases where the explicit recursive data type is necessarily more performant than the church (mu) encoding?

Here's the definition. There are no tricks in the implementation itself: don't think, just make it type check. Yes, at least one of these fmaps is morally questionable, but the real difficulty is convincing ourselves that it does the Right thing.
runStateF
    :: (Functor f, Monad m)
    => s
    -> FT f (StateT s m) a
    -> FT f m (a, s)
runStateF s0 (FT run) = FT $ \return0 handle0 ->
  let returnS a = StateT (\s -> fmap (\r -> (r, s)) (return0 (a, s)))
      handleS k e = StateT (\s -> fmap (\r -> (r, s)) (handle0 (\x -> evalStateT (k x) s) e))
  in evalStateT (run returnS handleS) s0
We have two stateless functions (i.e., plain m)
return0 :: a -> m r
handle0 :: forall x. (x -> m r) -> f x -> m r
and we must wrap them in two stateful (StateT s m) variants with the signatures below. The comments that follow give some details about what is going on in the definition of handleS.
returnS :: a -> StateT s m r
handleS :: forall x. (x -> StateT s m r) -> f x -> StateT s m r
-- 1. In the result type (the final StateT s m r): grab the current state 's' here.
-- 2. For the 'm' inside that result: call handle0 to produce that 'm'.
-- 3. In the continuation (x -> StateT s m r): here we will have to provide some state 's';
--    pass the current state we just grabbed.
-- The idea is that 'handle0' is stateless in handling 'f x',
-- so it is fine for this continuation (x -> StateT s m r) to get the state from before the call to 'handle0'
There is an apparently dubious use of fmap in handleS, but it is valid as long as run never looks at the states produced by handleS. Those states are almost immediately thrown away by one of the evalStateT calls.
In theory, there exist terms of type FT f (StateT s m) a which break that invariant. In practice, that almost certainly doesn't occur; you would really have to go out of your way to do something morally wrong with those continuations.
In the following complete gist, I also show how to test with QuickCheck that it is indeed equivalent to your initial version using FreeT, with concrete evidence that the above invariant holds:
https://gist.github.com/Lysxia/a0afa3ca2ea9e39b400cde25b5012d18

I'd say that no, as even something as simple as cutoff converts to FreeT:
cutoff :: (Functor f, Monad m) => Integer -> FT f m a -> FT f m (Maybe a)
cutoff n = toFT . FreeT.cutoff n . fromFT
In general, you're probably looking at:
improve :: Functor f => (forall m. MonadFree f m => m a) -> Free f a
Improve the asymptotic performance of code that builds a free monad with only binds and returns by using F behind the scenes.
I.e. you'll construct Free efficiently, but then do whatever you need to do with Free (maybe again, by improveing).

Can one compose types in a Haskell instance declaration?

I've written a Haskell typeclass and it would be convenient to declare instances of it using types of the form (a -> m _), where m is of kind (* -> *), such as a monad, and _ is a slot to be left unsaturated. I know how to write newtype X a m b = X (a -> m b), and declaring an instance for X a m. But what I'm looking for is to instead use the bare, unwrapped -> type, if that's possible.
If one wants to declare instances for types of the form (a -> _), then you can just write:
instance Foo a ((->) a) where ...
but I don't know how/whether one can do it with types of the form (a -> m _). I guess I'm looking to compose the type constructor (->) a _ and the type constructor m _ in my instance declaration.
I'd like to write something like this:
instance Foo a ((->) a (m :: *->*)) where ...
or:
instance Foo a ((->) a (m *)) where ...
but of course these don't work. Is it possible to do this?
Concretely, here's what I'm trying to achieve. I wrote a typeclass for
MonadReaders that are embedded inside (one level) of other MonadReaders,
like this:
{-# LANGUAGE FunctionalDependencies, FlexibleInstances, UndecidableInstances #-}
class MonadReader w m => DeepMonadReader w r m | m -> r where
{ deepask :: m r
; deepask = deepreader id
; deeplocal :: (r -> r) -> m a -> m a
; deepreader :: (r -> a) -> m a
; deepreader f = do { r <- deepask; return (f r) }
}
instance MonadReader r m => DeepMonadReader w r (ReaderT w m) where
{ deepask = lift ask
; deeplocal = mapReaderT . local
; deepreader = lift . reader
}
It'd be nice to also provide an instance something like this:
instance MonadReader r m => DeepMonadReader w r ((->) w (m :: * -> *)) where
{ deepask = \w -> ask
; deeplocal f xx = \w -> local f (xx w)
; deepreader xx = \w -> reader xx
}

I think you're on the wrong track and are making things a lot more
complicated than they need to be.
Some observations:
... ((->) w (m :: * -> *)) ...
Let's explore what you mean by this. You are using it for the type parameter m in your DeepMonadReader class, and therefore it needs to be a monad. Can you give a concrete example of a monad which
has this type? Why not just use ((->) w) ?
class MonadReader w m => DeepMonadReader w r m | m -> r where ...
The fact that w never appears in any member signature is an indication that something is amiss.
... I wrote a typeclass for MonadReaders that are embedded inside (one level) of other MonadReaders ...
I would take the reverse perspective. It makes sense to talk of monad stacks which are a transformed
version of another monad stack. E.g.:
StateT s (WriterT w IO) "contains" IO
WriterT w (Maybe a) "contains" Maybe a
And what does it mean for a monad stack m1 to "contain" another monad m2?
It just means that there is a way to convert computations in m2 to computations in m1:
convert :: m2 a -> m1 a
Of course, this is just lift when using monad transformers.
To express your concept of a monad reader embedded in another monad, I would use this
type class:
class HasReader m m' r where
    deepAsk :: m r
    deepLocal :: (r -> r) -> m' a -> m a
The idea here is that an instance HasReader m m' r expresses the fact that
monad m "contains" a monad m' which itself is a reader with
environment r.
deepAsk returns the environment of m' but as a computation in m.
deepLocal runs a computation in m' with an environment modification function
but returns it as a computation in m. Note how this type signature is different from yours:
my deepLocal uses different monads, m' and m whereas yours just goes from m to m.
The next step is to decide which triples (m, m', r) we want to write instances
of HasReader for. Clearly it seems you had instances like this in mind:
m                                    m'                       r
-----------------------------------  -----------------------  --
ReaderT s (ReaderT r m)              ReaderT r m              r
ReaderT t (ReaderT s (ReaderT r m))  ReaderT s (ReaderT r m)  s
...
but it also seems reasonable to want to have these instances:
StateT s (ReaderT r m)               ReaderT r m              r
WriterT w (ReaderT r m)              ReaderT r m              r
MaybeT (ReaderT r m)                 ReaderT r m              r
...
It turns out, though, that we don't need the HasReader class for any of these cases.
We can just write the expression as a computation in m' and lift it up to m.
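As a minimal sketch of that last point, using only the transformers package: here m' is Reader Int and m is a StateT layer on top of it. The names Inner, Outer, example and runExample are invented for this illustration; deepAsk and deepLocal are written as plain functions for this one stack rather than as class methods.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (Reader, ask, local, runReader)
import Control.Monad.Trans.State (StateT, evalStateT, get, put)

-- The inner monad m': a plain reader with an Int environment.
type Inner = Reader Int

-- The outer monad m: a state layer stacked on top of the reader.
type Outer = StateT [String] Inner

-- Read the inner environment, as a computation in the outer monad.
deepAsk :: Outer Int
deepAsk = lift ask

-- Run an inner computation under a modified environment and lift the
-- result into the outer monad.
deepLocal :: (Int -> Int) -> Inner a -> Outer a
deepLocal f inner = lift (local f inner)

example :: Outer (Int, Int, [String])
example = do
  before <- deepAsk               -- the unmodified environment
  inside <- deepLocal (* 2) ask   -- the environment as seen inside 'local'
  put ["visited"]
  st <- get
  return (before, inside, st)

runExample :: (Int, Int, [String])
runExample = runReader (evalStateT example []) 10
```

No extra type class is involved: each operation is written in the inner monad and moved outward with a single lift.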

How do `pass` and `listen` work in WriterT?

The code below probably isn't a good way to do this, but it's what I've managed to cobble together. Basically, I run a series of complex tasks, during which several things get logged. At the end of each one I dump the log into a .txt file and move on to the next batch in a loop.
To achieve this I make use of listen and pass in WriterT (as part of RWST). The code is below:
-- Miscellaneous stuff
newtype Log = Log [String]
type ConnectT a = EitherT String (RWST ConnectReader Log ConnectState IO) a
timeStampLog :: String -> ConnectT ()
timeStampLog msg = do
    theTime <- liftIO $ fmap zonedTimeToLocalTime getZonedTime
    let msgStart = show theTime ++ ": "
    tell $ Log [msgStart ++ msg]
logToFileIO :: Log -> IO ()
logToFileIO (Log xs) = appendFile "Log.txt" $ "\r\n" ++ intercalate "\r\n" (reverse xs)
---------------------
logToFile :: ConnectT a -> ConnectT ()
logToFile cta = let ctaw = listen cta
                in pass $ do
                    (_,w) <- ctaw
                    liftIO $ logToFileIO w
                    return ((),const mempty)
mapFunction :: (Show a) => a -> ConnectT ()
mapFunction a = logToFile $ do
    timeStampLog $ "Starting sequence for " ++ show a
    lotsOfLogging a
    timeStampLog $ "Finishing sequence for " ++ show a
loopFunction :: ConnectT ()
loopFunction = logToFile $ do
    timeStampLog "Starting Loop"
    mapM_ mapFunction someList
    timeStampLog "Finishing Loop"
What I end up with is something like this:
2015-03-17 20:21:40.8198823: Starting sequence for a
2015-03-17 20:21:41.8198823: (logs for a)
2015-03-17 20:21:41.8198823: Finishing sequence for a
2015-03-17 20:21:41.8198823: Starting sequence for b
2015-03-17 20:21:42.8198823: (logs for b)
2015-03-17 20:21:42.8198823: Finishing sequence for b
2015-03-17 20:21:39.8198823: Starting Loop
2015-03-17 20:21:42.8198823: Finishing Loop
The log entries for starting and finishing the loop end up together at the end.
I'm not entirely surprised that the call to logToFile in mapFunction doesn't include the log information from the loopFunction as the information hasn't passed to it via a bind.
But I'm still having trouble understanding how pass and listen work. And also how I would go about fixing this (admittedly minor) issue.

We can determine how listen and pass work almost entirely from their types. We'll start with listen.
listen
listen :: (Monoid w, Monad m) => RWST r w s m a -> RWST r w s m (a, w)
Unwrapping the RWST we have
listen :: (Monoid w, Monad m) => (r -> s -> m (a, s, w)) -> r -> s -> m ((a, w), s, w)
It needs to return an m .... The only way we have to make an m is to return something or apply the input function to an r and an s (we can't use >>= since it requires we already have an m). We don't have an a to return, so we have to apply the function to an r and an s. There's only one r and one s we can use: those passed into the result.
listen k r s = ... (k r s)
Now we have an m (a, s, w) but need an m ((a, w), s, w). We can run the action again to get another m (nonsense for "listening") or do something with the (a, s, w) inside the m with >>=.
listen k r s = k r s >>= \(a, s', w) -> ...
To use bind we need an m. We can either return something or apply the input function to an r and s and repeat the action again, which is nonsense for "listening". We return something.
listen k r s = k r s >>= \(a, s', w) -> return ...
We need an a, a w, an s, and another w. We only have one a and no way to get any others.
listen k r s = k r s >>= \(a, s', w) -> return ((a,...),...,...)
There are 3 ways we can get a w: mempty, the w from the result of the action, or combining two ws together with <>. Returning mempty is pointless; the user could have just used mempty themselves. Duplicating what was logged with <> is as much nonsense as running an action twice, so we return what was logged by the first action.
listen k r s = k r s >>= \(a, s', w) -> return ((a,w),...,...)
We have two values of type s: s and s'. Reverting the state changes of the action is nonsense for "listening", so we return the changed state s'.
listen k r s = k r s >>= \(a, s', w) -> return ((a,w),s',...)
Now we are faced with the only interesting choice: what w should we keep for what was logged? The user has "listened" for what was logged; we could say that it's their problem now and reset the log to mempty. But "listening" doesn't suggest that it should change what something does, it should only observe it. Therefore, we keep the resulting log w intact.
listen k r s = k r s >>= \(a, s', w) -> return ((a,w),s',w)
If we wrap this in its RWSTs again we have
listen m = RWST $ \r s -> runRWST m r s >>= \(a, s', w) -> return ((a,w),s',w)
All we did was run the input action and include what it logged along with its resulting a in the result as a tuple. This matches the documentation for listen:
listen m is an action that executes the action m and adds its output to the value of the computation.
runRWST (listen m) r s = liftM (\ (a, w) -> ((a, w), w)) (runRWST m r s)
pass
pass :: (Monoid w, Monad m) => RWST r w s m (a, w -> w) -> RWST r w s m a
We begin as before, unwrapping the RWST
pass :: (Monoid w, Monad m) => (r -> s -> m ((a, w->w), s, w)) -> r -> s -> m (a, s, w)
We follow the same argument for how to get a resulting m as we used for listen
pass k r s = ... (k r s)
Now we have an m ((a, w->w), s, w) but need an m (a, s, w). We can run the action again to get another m (nonsense for "passing") or do something with the ((a, w->w), s, w) inside the m with >>=.
pass k r s = k r s >>= \((a, f), s', w) -> ...
To use bind we need an m. We can either return something or apply the input function to an r and s and repeat the action again, which is nonsense for "passing". We return something.
pass k r s = k r s >>= \((a, f), s', w) -> return ...
We need an a, an s, and a w. We only have one a and no way to get any others.
pass k r s = k r s >>= \((a, f), s', w) -> return (a,...,...)
We have two values of type s: s and s'. Reverting the state changes of the action is nonsense for "passing", so we return the changed state s'.
pass k r s = k r s >>= \((a, f), s', w) -> return (a,s',...)
There are 4 ways we can get a w: mempty, the w from the result of the action, combining two ws together with <>, or applying the function f to another w. Setting the result to mempty leaves us wondering why the user provided a function f :: w -> w. Duplicating what was logged with <> is as much nonsense as running an action twice. We should be applying the function f to something.
pass k r s = k r s >>= \((a, f), s', w) -> return (a,s',f ...)
We could apply f to something built from memptys and <>, but if that were the case all of the fs would be equivalent to const ...; the type for it might as well have been a w. We could apply f to some elaborate structure built from w, mempty, <>, and f, but all of those structures could have been defined in f itself if we simply pass it w.
pass k r s = k r s >>= \((a, f), s', w) -> return (a,s',f w)
If we wrap this in its RWSTs again we have
pass m = RWST $ \r s -> runRWST m r s >>= \((a, f), s', w) -> return (a,s',f w)
We ran the input action and changed what was logged by the function that was a result of the action. This matches the documentation for pass:
pass m is an action that executes the action m, which returns a value and a function, and returns the value, applying the function to the output.
runRWST (pass m) r s = liftM (\ ((a, f), w) -> (a, f w)) (runRWST m r s)
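The two derivations can be checked on a toy type. The following is only a sketch: MiniRWST, listenMini, passMini and the sample values are names invented here, standing in for the real RWST from transformers, and the underlying monad is Maybe so the results can be compared directly.

```haskell
-- A stripped-down stand-in for RWST: environment r, log w, state s.
newtype MiniRWST r w s m a = MiniRWST { runMini :: r -> s -> m (a, s, w) }

-- Run the action and pair its result with what it logged; the log itself
-- is kept unchanged.
listenMini :: Monad m => MiniRWST r w s m a -> MiniRWST r w s m (a, w)
listenMini m = MiniRWST $ \r s ->
  runMini m r s >>= \(a, s', w) -> return ((a, w), s', w)

-- Run the action and apply the function it returned to what it logged.
passMini :: Monad m => MiniRWST r w s m (a, w -> w) -> MiniRWST r w s m a
passMini m = MiniRWST $ \r s ->
  runMini m r s >>= \((a, f), s', w) -> return (a, s', f w)

-- A sample action: ignores the environment, increments the state, logs "foo".
sample :: MiniRWST () String Int Maybe Int
sample = MiniRWST $ \_ s -> Just (42, s + 1, "foo")

-- The same action, additionally returning a function that reverses the log.
sampleWithReverse :: MiniRWST () String Int Maybe (Int, String -> String)
sampleWithReverse = MiniRWST $ \_ s -> Just ((42, reverse), s + 1, "foo")

listened :: Maybe ((Int, String), Int, String)
listened = runMini (listenMini sample) () 0

passed :: Maybe (Int, Int, String)
passed = runMini (passMini sampleWithReverse) () 0
```

Here listened evaluates to Just ((42,"foo"),1,"foo") and passed to Just (42,1,"oof"): listen only observes the log, while pass rewrites it.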

The existing WriterT w m can't perform any action in the underlying m to perform logging until after the action has been run and the w has been assembled. As your question illustrates, this is confusing. The log for the do block loopFunction isn't written by logToFile until after the do block itself finishes running.
LoggerT
Let's invent a new WriterT called LoggerT. Our new LoggerT is going to provide a new function
logTells :: (Monoid w, Monoid w', Monad m) =>
(w -> LoggerT w' m ()) -> LoggerT w m a -> LoggerT w' m a
The intuition behind this is: we'll be able to provide an action (with type w -> LoggerT w' m ()) to log every tell, replacing the logged result with the result of the action. If we smash two things the user tells us together with <> we'll no longer be able to log both of them; we'll only ever be able to log the result of <>. Since our LoggerT will never be able to use <> it will never need the Monoid instances. We must drop the Monoid constraint from everything in LoggerT.
logTells :: (Monad m) =>
(w -> LoggerT w' m ()) -> LoggerT w m a -> LoggerT w' m a
We need to remember every tell so that we can replace it later. But when we replace it "later", the logging should happen at the point the tell appeared in the code. For example, if we make
processX :: LoggerT String m ()
processX = do
    tell "Starting process X"
    lotsOfProcessing
    tell "Finishing process X"
And then "later" write logTells logToFile processX we want the resulting computation to look like the following.
logTells logToFile processX = do
    logToFile "Starting process X"
    lotsOfProcessing
    logToFile "Finishing process X"
None of lotsOfProcessing should happen until the logToFile for tell "Starting process X" has already happened. This means that when the user tells us something we need to remember not only what we were told, but everything that happens after that. We "remember" things in the constructors of a data type.
data LoggerT w m a
    = Tell w (LoggerT w m a)
    | ...
tell :: w -> LoggerT w m ()
tell w = Tell w (return ())
We also need to be able to perform actions in the underlying Monad. It would be tempting to add another constructor Lift (m a), but then we couldn't decide what to log as a result of the underlying computation. Instead, we'll let it decide the entire future LoggerT w m a to run.
data LoggerT w m a
    = Tell w (LoggerT w m a)
    | M (m (LoggerT w m a))
    ...
If we try to lift an underlying computation m a into LoggerT we now have a problem; we don't have a way to turn the a into a LoggerT w m a to put it in the M constructor.
instance MonadTrans (LoggerT w) where
    lift ma = M (??? ma)
We could try lifting return from the underlying Monad, but that's just a circular definition. We'll add another constructor for Returning.
data LoggerT w m a
    = Tell w (LoggerT w m a)
    | M (m (LoggerT w m a))
    | Return a
instance MonadTrans (LoggerT w) where
    lift = M . liftM Return
To finish our monad transformer, we'll write a Monad instance.
instance Monad m => Monad (LoggerT w m) where
    return = Return
    la0 >>= k = go la0
      where
        go (Tell w la) = Tell w (go la)
        go (M mla)     = M (liftM go mla)
        go (Return a)  = k a
We can now define logTells. It replaces every Tell with the action to perform to log it.
logTells :: Monad m => (w -> LoggerT w' m ()) -> LoggerT w m a -> LoggerT w' m a
logTells k = go
  where
    go (Tell w la) = k w >> go la
    go (M mla)     = M (liftM go mla)
    go (Return a)  = return a
Finally, we'll provide a way to get out of LoggerT by replacing all of the Tells with an action, very similar to logTells but dropping the LoggerT from the result.
Since it will get rid of the LoggerT we'll call it runLoggerT and swap the arguments to match the convention of other transformers.
runLoggerT :: Monad m => LoggerT w m a -> (w -> m ()) -> m a
runLoggerT la0 k = go la0
  where
    go (Tell w la) = k w >> go la
    go (M mla)     = mla >>= go
    go (Return a)  = return a
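Assembled into one module, the pieces above might look like the following sketch. It uses only base, so lift is replaced by a stand-alone liftL; the Functor and Applicative instances (which current GHC requires for any Monad) and the demo value are additions for illustration. In this version the Return case of >>= continues with k a, and the M case of runLoggerT uses >>= rather than liftM so that the result has type m a. The demo runs over the pair monad ((,) [String]) so the order of effects is visible in the result.

```haskell
import Control.Monad (ap, liftM)

data LoggerT w m a
  = Tell w (LoggerT w m a)   -- something was logged; remember what follows
  | M (m (LoggerT w m a))    -- an action in the underlying monad
  | Return a                 -- finished with a result

instance Monad m => Functor (LoggerT w m) where
  fmap = liftM

instance Monad m => Applicative (LoggerT w m) where
  pure = Return
  (<*>) = ap

instance Monad m => Monad (LoggerT w m) where
  la0 >>= k = go la0
    where
      go (Tell w la) = Tell w (go la)
      go (M mla)     = M (liftM go mla)
      go (Return a)  = k a   -- continue with the rest of the computation

tell :: w -> LoggerT w m ()
tell w = Tell w (Return ())

-- Plays the role of 'lift' without depending on MonadTrans.
liftL :: Monad m => m a -> LoggerT w m a
liftL = M . liftM Return

-- Replace every Tell by an action in the underlying monad, at the point
-- where the tell occurred.
runLoggerT :: Monad m => LoggerT w m a -> (w -> m ()) -> m a
runLoggerT la0 k = go la0
  where
    go (Tell w la) = k w >> go la
    go (M mla)     = mla >>= go
    go (Return a)  = return a

-- Hypothetical demo: the underlying "effect" is appending to a list.
demo :: LoggerT String ((,) [String]) Int
demo = do
  tell "start"
  x <- liftL (["work"], 20)
  tell "end"
  return (x + 1)

demoResult :: ([String], Int)
demoResult = runLoggerT demo (\w -> (["log: " ++ w], ()))
```

demoResult is (["log: start","work","log: end"],21): the first log action runs before the underlying work, which is the ordering WriterT cannot provide.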
LoggerT already exists; we don't need to write it ourselves. It's the Producer from the very mature pipes library.
pipes
The Producer from the pipes library is the correct logging transformer.
type Producer b = Proxy X () () b
Every Proxy has a MonadTrans (Proxy a' a b' b) instance and a Monad m => Monad (Proxy a' a b' b m) instance.
We tell it what to log with yield.
yield :: Monad m => a -> Producer' a m ()
tell = yield
When we know what we want to do with the yields, we replace them with what we want to do using for.
for :: Monad m =>
       Proxy x' x b' b m a' ->
       (b -> Proxy x' x c' c m b')
       -> Proxy x' x c' c m a'
Specialized for Producer and (), for has the type
for :: Monad m =>
       Producer b m a ->
       (b -> Producer c m ()) ->
       Producer c m a
logTells = flip for
If we replace each of the yields with an action in the underlying monad, we won't have anything produced anymore and can run the Proxy with runEffect.
runEffect :: Monad m => Effect m r -> m r
runEffect :: Monad m => Proxy X () () X m r -> m r
runEffect :: Monad m => Producer X m r -> m r
runLoggerT la0 k = runEffect $ for la0 (lift . k)
We can even recover the WriterT with hoist which replaces the underlying monad (every Proxy a' a b' b has an MFunctor instance).
hoist :: (Monad m, MFunctor t) => (forall a. m a -> n a) -> t m b -> t n b
We use hoist to replace the underlying monad with WriterT w m by lifting each m a into WriterT w m a. Then we replace each yield with lift . tell, and run the result.
toWriterT :: (Monad m, Monoid w) => Producer w m r -> WriterT w m r
toWriterT p0 = runEffect $ for (hoist lift p0) (lift . tell)
toWriterT p0 = runLoggerT (hoist lift p0) tell
Producer is essentially the free WriterT that doesn't require a Monoid for the items being written.

Here's a simplified, but definitely real-life example that uses censor (which is defined in terms of pass as
censor :: (MonadWriter w m) => (w -> w) -> m a -> m a
censor f m = pass $ (,f) <$> m
) to collect free variables of a lambda term:
import Control.Monad.Writer
import Data.Set (Set)
import qualified Data.Set as Set
type VarId = String
data Term = Var VarId
          | Lam VarId Term
          | App Term Term
freeVars :: Term -> Set VarId
freeVars = execWriter . go
  where
    go :: Term -> Writer (Set VarId) ()
    go (Var x) = tell $ Set.singleton x
    go (App f e) = go f >> go e
    go (Lam x e) = censor (Set.delete x) $ go e
Now, of course you can implement this without all the Writer machinery, but remember this is just a simplified example standing in for some more involved compilation/analysis function, where tracking free variables is just one of the things going on.

The documentation is clear enough? http://hackage.haskell.org/package/mtl-2.2.1/docs/Control-Monad-Writer-Lazy.html#g:1
Examples (run the following in ghci)
import Control.Monad.Writer
runWriterT ( do (a,w) <- listen $ do { tell "foo" ; return 42 } ; tell $ reverse w ; return a )
==> (42,"foooof")
runWriterT ( pass $ do { tell "foo" ; return (42,reverse) } )
==> (42,"oof")

How best to type "Any monad transformer stack containing m"

I'd like to write the function
fixProxy :: (Monad m, Proxy p) => (b -> m b) -> b -> () -> p a' a () b m r
fixProxy f a () = runIdentityP $ do
    v <- respond a
    a' <- lift (f a)
    fixProxy f a' v
which works just like you'd think until I try to run the proxy
>>> :t \g -> runRVarT . runWriterT . runProxy $ fixProxy g 0 >-> toListD
(Num a, RandomSource m s, MonadRandom (WriterT [a] (RVarT n)),
Data.Random.Lift.Lift n m) =>
(a -> WriterT [a] (RVarT n) a) -> s -> m (a, [a])
where I use RVarT intentionally to highlight the existence of the Lift class in RVar. Lift represents the existence of a natural transformation n :~> m which ought to encapsulate what I'm looking for, a function like:
fixProxy :: (Monad m, Monad n, Lift m n, Proxy p)
=> (b -> m b) -> b -> () -> p a' a () b n r
Is Lift the right answer (which would require many orphan instances) or is there a more standard natural transformation MPTC to use?
Note the practical solution, as described in comments below, is something like
runRVarT . runWriterT . runProxy
$ hoistK lift (fixProxy (const $ sample StdUniform) 0) >-> toListD

Fusing conduits with multiple inputs

I am trying to create a conduit that can consume multiple input streams. I need to be able to await on one or the other of the input streams in no particular order (e.g., not alternating) making zip useless. There is nothing parallel or non-deterministic going on here: I await on one stream or the other. I want to be able to write code similar to the following (where awaitA and awaitB await on the first or second input stream respectively):
do
    _ <- awaitA
    x <- awaitA
    y <- awaitB
    yield (x,y)
    _ <- awaitB
    _ <- awaitB
    y' <- awaitB
    yield (x,y')
The best solution I have is to make the inner monad another conduit, e.g.
foo :: Sink i1 (ConduitM i2 o m) ()
Which then allows
awaitA = await
awaitB = lift await
And this mostly works. Unfortunately, this seems to make it very difficult to fuse to the inner conduit before the outer conduit is fully connected. The first thing I tried was:
fuseInner :: Monad m =>
             Conduit i2' m i2 ->
             Sink i1 (ConduitM i2 o m) () ->
             Sink i1 (ConduitM i2' o m) ()
fuseInner x = transPipe (x =$=)
But this doesn't work, at least when x is stateful since (x =$=) is run multiple times, effectively restarting x each time.
Is there any way to write fuseInner, short of breaking into the internals of conduit (which looks like it would be pretty messy)? Is there some better way to handle multiple input streams? Am I just way too far beyond what conduit was designed for?
Thanks!

If you want to combine two IO-generated streams, then Gabriel's comment is the solution.
Otherwise, you can't wait on both streams to see which one produces a value first. Conduits are single-threaded and deterministic: they process only one pipe at a time. But you could create a function that interleaves two streams, letting them decide when to switch:
{-# OPTIONS_GHC -fwarn-incomplete-patterns #-}
import Control.Monad (liftM)
import Data.Conduit.Internal (
    Pipe (..), Source, Sink,
    injectLeftovers, ConduitM (..),
    mapOutput, mapOutputMaybe
  )
-- | Alternate two given sources, running one until it yields `Nothing`,
-- then switching to the other one.
merge :: Monad m
      => Source m (Maybe a)
      -> Source m (Maybe b)
      -> Source m (Either a b)
merge (ConduitM l) (ConduitM r) = ConduitM $ goL l r
  where
    goL :: Monad m => Pipe () () (Maybe a) () m ()
                   -> Pipe () () (Maybe b) () m ()
                   -> Pipe () () (Either a b) () m ()
    goL (Leftover l ()) r = goL l r
    goL (NeedInput _ c) r = goL (c ()) r
    goL (PipeM mx) r = PipeM $ liftM (`goL` r) mx
    goL (Done _) r = mapOutputMaybe (liftM Right) r
    goL (HaveOutput c f (Just o)) r = HaveOutput (goL c r) f (Left o)
    goL (HaveOutput c f Nothing) r = goR c r
    -- This is just a mirror copy of goL. We should combine them together to
    -- avoid code repetition.
    goR :: Monad m => Pipe () () (Maybe a) () m ()
                   -> Pipe () () (Maybe b) () m ()
                   -> Pipe () () (Either a b) () m ()
    goR l (Leftover r ()) = goR l r
    goR l (NeedInput _ c) = goR l (c ())
    goR l (PipeM mx) = PipeM $ liftM (goR l) mx
    goR l (Done _) = mapOutputMaybe (liftM Left) l
    goR l (HaveOutput c f (Just o)) = HaveOutput (goR l c) f (Right o)
    goR l (HaveOutput c f Nothing) = goL l c
It processes one source until it returns Nothing, then switches to another, etc. If one source finishes, the other one is processed to the end.
As an example, we can combine and interleave two lists:
import Control.Monad.Trans
import Data.Conduit (($$), awaitForever)
import Data.Conduit.List (sourceList)
main = (merge (sourceList $ concatMap (\x -> [Just x, Just x, Nothing]) [ 1..10])
              (sourceList $ concatMap (\x -> [Just x, Nothing]) [101..110]) )
       $$ awaitForever (\x -> lift $ print x)
If you need multiple sources, merge could be adapted to something like
mergeList :: Monad m => [Source m (Maybe a)] -> Source m a
which would cycle through the given list of sources until all of them are finished.

This can be done by diving into the internals of conduit. I wanted to avoid this because it looked extremely messy. Based on the responses here, it sounds like there is no way around it (but I would really appreciate a cleaner solution).
The key difficulty is that (x =$=) is a pure function, but to make transPipe give the correct answer, it needs a kind of stateful, function-like thing:
data StatefulMorph m n = StatefulMorph
    { stepStatefulMorph :: forall a. m a -> n (StatefulMorph m n, a)
    , finalizeStatefulMorph :: n () }
Stepping a StatefulMorph m n takes a value in m and returns, in n, both that value and the next StatefulMorph, which should be used to transform the next m value. The last StatefulMorph should be finalized (which, in the case of the "stateful (x =$=)", finalizes the x conduit).
Conduit fusion can be implemented as a StatefulMorph, using the code for pipeL with minor changes. The signature is:
fuseStateful :: Monad m
             => Conduit a m b
             -> StatefulMorph (ConduitM b c m) (ConduitM a c m)
I also need a replacement for transPipe (a special case of hoist) that uses StatefulMorph values instead of functions.
class StatefulHoist t where
    statefulHoist :: (Monad m, Monad n)
                  => StatefulMorph m n
                  -> t m r -> t n r
A StatefulHoist instance for ConduitM i o can be written using the code for transPipe with some minor changes.
fuseInner is then easy to implement.
fuseInner :: Monad m
          => Conduit a m b
          -> ConduitM i o (ConduitM b c m) r
          -> ConduitM i o (ConduitM a c m) r
fuseInner left = statefulHoist (fuseStateful left)
I've written a more detailed explanation here and posted the full code here. If someone can come up with a cleaner solution, or one that uses the conduit public API, please post it.
Thanks for all the suggestions and input!
