Single-stepping a conduit

Single-stepping a conduit - haskell

I want to do something along the lines of ArrowChoice, but with conduits. I want to await an Either value and then pass Left values to one conduit and Right values to another, and then merge the results back into an Either stream.
Presumably this can be done by making the inner conduits like automata: turn a conduit into a function that takes an argument and returns a monadic list of outputs yielded:
newtype AutomataM i m o = Automata (i -> m (o, Automata i o))
conduitStep :: Conduit i m o -> AutomataM i m [o]
The reason for the list of outputs is that a Conduit may yield 0 or more outputs for each input.
I've looked at ResumableConduit and its relatives, and presumably the answer is in there somewhere. But I can't quite see how its done.

It's not exactly the same type signature you provided, but:
import Data.Conduit
import Data.Conduit.Internal (Pipe (..), ConduitM (..))
newtype Automata i o m r = Automata (m ([o], Either r (i -> Automata i o m r)))
conduitStep :: Monad m => ConduitM i o m r -> Automata i o m r
conduitStep (ConduitM con0) =
Automata $ go [] id con0
where
go _ front (Done r) = return (front [], Left r)
go ls front (HaveOutput p _ o) = go ls (front . (o:)) p
go ls front (NeedInput p _) =
case ls of
[] -> return (front [], Right $ conduitStep . ConduitM . p)
l:ls' -> go ls' front (p l)
go ls front (PipeM mp) = mp >>= go ls front
go ls front (Leftover p l) = go (l:ls) front p
But just be careful with this approach:
By keeping the output as a list, it's not constant memory.
We're throwing away finalizers.
There's probably a way to provide a ZipConduit abstraction, similar to ZipSource and ZipSink, that would handle this kind of problem more elegantly, but I haven't thought about it too much.
EDIT I ended up implementing ZipConduit in conduit-extra 0.1.5. Here's a demonstration of using it which sounds a bit like your case:
import Control.Applicative
import Data.Conduit
import Data.Conduit.Extra
import qualified Data.Conduit.List as CL
conduit1 :: Monad m => Conduit Int m String
conduit1 = CL.map $ \i -> "conduit1: " ++ show i
conduit2 :: Monad m => Conduit Double m String
conduit2 = CL.map $ \d -> "conduit2: " ++ show d
conduit :: Monad m => Conduit (Either Int Double) m String
conduit = getZipConduit $
ZipConduit (lefts =$= conduit1) *>
ZipConduit (rights =$= conduit2)
where
lefts = CL.mapMaybe (either Just (const Nothing))
rights = CL.mapMaybe (either (const Nothing) Just)
main :: IO ()
main = do
let src = do
yield $ Left 1
yield $ Right 2
yield $ Left 3
yield $ Right 4
sink = CL.mapM_ putStrLn
src $$ conduit =$ sink

There's a folk method of doing this using pipes by using "push-category" Pipes. The complete implementation comes from both this mailing list post and this Stack Overflow answer. I think it hasn't been released yet due to both an effort to simplify the Pipes interface, a focus on using the "sequencing" monad instance which is hidden via this method, and no proof yet that this implementation truly implements the Arrow class properly.
The idea is to implement a newtype Edge (demonstrated below) which is a push-based pipe with the type arguments in the right order for Category, Arrow, ArrowChoice and both Functor and Applicative over their output values. This lets you compose them into directed acyclic graphs using arrow notation. I'll run over the implementation below, but it's safe to just ignore it and use the Arrow/ArrowChoice/Applicative instances of Edge without too much concern.
(Edit: This code is best made available at https://github.com/Gabriel439/Haskell-RCPL-Library)
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE TypeSynonymInstances #-}
import Prelude hiding ((.), id)
import Pipes.Core
import Pipes.Lift
import Control.Monad.Morph
import Control.Category
import Control.Monad.State.Strict
import Control.Arrow
This is an atypical mode of using pipes and isn't exposed in the Pipes module; you must import Pipes.Core to use push. Push-based pipes look like
-- push :: a -> Proxy a' a a' a m r
and thus they demand at least one upstream value before the Proxy is allowed to run. This means the whole process needs to be "kickstarted" by passing the first value as a function call and that the leftmost push-Proxy will control the entire stream.
Given a push-based pipe we can implement Category, Arrow and ArrowChoice. The standard solution also involves the Edge typeclass so that we have the type arguments in the right order for Category and Arrow
newtype Edge m r a b = Edge { unEdge :: a -> Pipe a b m r }
For the Category instance, we use the "push" Category which has push as id and (<~<) as composition:
instance Monad m => Category (Edge m r) where
id = Edge push
Edge a . Edge b = Edge (a <~< b)
We embed functions into Edge with arr by augmenting id (i.e. push) on the downward edge. To do this we use the respond category which has the law p />/ respond == p, but jam our f into the process.
instance Monad m => Arrow (Edge m r) where
arr f = Edge (push />/ respond . f)
We also use a local state transformer to store the snd half of our pairs and pass it "around" the input pipe in first
first (Edge p) = Edge $ \(b, d) ->
evalStateP d $ (up \>\ hoist lift . p />/ dn) b
where
up () = do
(b, d) <- request ()
lift (put d)
return b
dn c = do
d <- lift get
respond (c, d)
Finally, we get an ArrowChoice instance by implementing left. To do so we split the burden of passing the Left and Right sides using either the return or the pipe to pass values.
instance (Monad m) => ArrowChoice (Edge m r) where
left (Edge k) = Edge (bef >=> (up \>\ (k />/ dn)))
where
bef x = case x of
Left b -> return b
Right d -> do
_ <- respond (Right d)
x2 <- request ()
bef x2
up () = do
x <- request ()
bef x
dn c = respond (Left c)
We can use Edge to create "push-based" producers and consumers
type PProducer m r b = Edge m r () b
type PConsumer m r a = forall b . Edge m r a b
and then we'll provide Functor and Applicative instances for PProducer. This goes by case analysis on the underlying Pipe, so it's a bit verbose. Essentially, however, all that happens is that we insert f into the yield slot of the Pipe.
instance Functor (PProducer m r) where
fmap f (Edge k) = $ Edge $ \() -> go (k ()) where
go p = case p of
Request () ku -> Request () (\() -> go (ku ()))
-- This is the only interesting line
Respond b ku -> Respond (f b) (\() -> go (ku ()))
M m -> M (m >>= \p' -> return (go p'))
Pure r -> Pure r
Finally, Applicative is much the same except that we have to switch between running the upstream pipe to produce functions and running the downstream pipe to produce arguments.
instance (Monad m) => Applicative (Edge m r ()) where
pure b = Edge $ \() -> forever $ respond b
(Edge k1) <*> (Edge k2) = Edge (\() -> goL (k1 ()) (k2 ()))
where
goL p1 p2 = case p1 of
Request () ku -> Request () (\() -> goL (ku ()) p2)
Respond f ku -> goR f (ku ()) p2
M m -> M (m >>= \p1' -> return (goL p1' p2))
Pure r -> Pure r
goR f p1 p2 = case p2 of
Request () ku -> Request () (\() -> goR f p1 (ku ()))
Respond x ku -> Respond (f x) (\() -> goL p1 (ku ()))
M m -> M (m >>= \p2' -> return (goR f p1 p2'))
Pure r -> Pure r

Related

Kleisli Arrow in Netwire 5?

I am trying to create a game using Haskell + Netwire 5 (+ SDL). Now I am working on the output part, where I would like to create wires that read in some game state and output the SDL surfaces to be blitted on screen.
However, the problem is that SDL surfaces are contained in IO monad, so any function that creates such surfaces must have type a -> IO b. Of course, arr does not construct a Wire from a -> m b. However, since the type signature of a wire is (Monad m, Monoid e) => Wire s e m a b, it looks quite like a Kleisi Arrow, but I cannot find a suitable constructor for making such a wire.
I am new to FRP and Arrows, and have not programmed a lot in Haskell, so this may not be the best way to implement the graphics output. If I am wrong from the beginning, please let me know.
Some SDL functions related:
createRGBSurfaceEndian :: [SurfaceFlag] -> Int -> Int -> Int -> IO Surface
fillRect :: Surface -> Maybe Rect -> Pixel -> IO Bool
blitSurface :: Surface -> Maybe Rect -> Surface -> Maybe Rect -> IO Bool
flip :: Surface -> IO ()
Update 1
This code type checks, but now I am trying to interface it with SDL for testing
wTestOutput :: (Monoid e) => Wire s e IO () SDL.Surface
wTestOutput = mkGen_ $ \a -> (makeSurf a >>= return . Right)
where
makeSurf :: a -> IO SDL.Surface
makeSurf _ = do
s <- SDL.createRGBSurfaceEndian [SDL.SWSurface] 800 600 32
SDL.fillRect s (Just testRect) (SDL.Pixel 0xFF000000)
return s
testRect = SDL.Rect 100 100 0 0

Now, after playing around with Arrows, I will answer my own question
using the function putStrLn. It has type String -> IO (), which is
a -> m b, so the method should generalize to all Kleisli wires. I also illustrate how to drive the wire, and the result is amazingly simple.
The entire code is written in Literate Haskell, so just copy it and run.
First, there are some imports for the Netwire 5 library
import Control.Wire
import Control.Arrow
import Prelude hiding ((.), id)
Now, this is the core of making a Kleisli Wire. Assume you have a
function with type a -> m b that needs to be lifted into a wire. Now,
notice that mkGen_ has type
mkGen_ :: Monad m => (a -> m (Either e b)) -> Wire s e m a b
So, to make a wire out of a -> m b, we first need to get a function
with type a -> m (Either () b). Notice that Left inhibits the wire,
while Right activates it, so the inner part is Either () b instead of
Either b (). Actually, if you try the latter, an obscure compile error
will tell you get this in the wrong way.
To get a -> m (Either () b), first consider how to get
m (Either () b) from m b, we extract the value from the monad (m
b), lift it to Right, then return to the monad m. In short:
mB >>= return . Right. Since we don't have the value "mB" here, we
make a lambda expression to get a -> m (Either () b):
liftToEither :: (Monad m) => (a -> m b) -> (a -> m (Either () b))
liftToEither f = \a -> (f a >>= return . Right)
Now, we can make a Kleisli wire:
mkKleisli :: (Monad m, Monoid e) => (a -> m b) -> Wire s e m a b
mkKleisli f = mkGen_ $ \a -> (f a >>= return . Right)
So, let's try the canonical "hello, world" wire!
helloWire :: Wire s () IO () ()
helloWire = pure "hello, world" >>> mkKleisli putStrLn
Now comes the main function to illustrate how to drive the wire. Note
that comparing to the source of testWire in the Control.Wire.Run
from the Netwire library, there is no use of liftIO: the outer program
knows nothing about how the wires work internally. It merely steps the
wires ignoring what is in it. Maybe this Just means better
composition than using Nothing about Kleisli Wires? (No pun intended!)
main = go clockSession_ helloWire
where
go s w = do
(ds, s') <- stepSession s
(mx, w') <- stepWire w ds (Right ())
go s' w'
Now here comes the code. Unfortunately StackOverflow does not work quite well with Literate Haskell...
{-# LANGUAGE Arrows #-}
module Main where
import Control.Wire
import Control.Monad
import Control.Arrow
import Prelude hiding ((.), id)
mkKleisli :: (Monad m, Monoid e) => (a -> m b) -> Wire s e m a b
mkKleisli f = mkGen_ $ \a -> liftM Right $ f a
helloWire :: Wire s () IO () ()
helloWire = pure "hello, world" >>> mkKleisli putStrLn
main = go clockSession_ helloWire
where
go s w = do
(ds, s') <- stepSession s
(mx, w') <- stepWire w ds (Right ())
go s' w'
Update
Thanks to Cubic's inspiration. liftToEither can actually be written in, you guess it, liftM:
liftToEither f = \a -> liftM Right $ f a
mkKleisli f = mkGen_ $ \a -> liftM Right $ f a

How do `pass` and `listen` work in WriterT?

The code below probably isn't a good way to do this, but it's what I've managed to cobble together. Basically, I run a series of complex tasks, during which several things get logged. At the end of each one I dump the log into a .txt file and move on to the next batch in a loop.
To achieve this I make use of listen and pass in WriterT (as part of RWST). The code is below:
-- Miscelaneous stuff
newtype Log = Log [String]
type ConnectT a = EitherT String (RWST ConnectReader Log ConnectState IO) a
timeStampLog :: String -> Log
timeStampLog msg = do
theTime <- liftIO $ fmap zonedTimeToLocalTime getZonedTime
let msgStart = show theTime ++ ": "
tell $ Log [msgStart ++ msg]
logToFileIO :: Log -> IO ()
logToFileIO (Log xs) = appendFile "Log.txt" $ "\r\n" ++ intercalate "\r\n" (reverse xs)
---------------------
logToFile :: ConnectT a -> ConnectT ()
logToFile cta = let ctaw = listen cta
in pass $ do
(_,w) <- ctaw
liftIO $ logToFileIO w
return ((),const mempty)
mapFunction :: (Show a) => a -> ConnectT ()
mapFunction a = logToFile $ do
timeStampLog $ "Starting sequence for " ++ show a
lotsOfLogging a
timeStampLog $ "Finishing sequence for " ++ show a
loopFunction :: ConnectT ()
loopFunction = logToFile $ do
timeStampLog "Starting Loop"
mapM_ mapFunction someList
timeStampLog "Finishing Loop"
What I end up with is something like this:
2015-03-17 20:21:40.8198823: Starting sequence for a
2015-03-17 20:21:41.8198823: (logs for a)
2015-03-17 20:21:41.8198823: Finishing sequence for a
2015-03-17 20:21:41.8198823: Starting sequence for b
2015-03-17 20:21:42.8198823: (logs for b)
2015-03-17 20:21:42.8198823: Finishing sequence for b
2015-03-17 20:21:39.8198823: Starting Loop
2015-03-17 20:21:42.8198823: Finishing Loop
Where the log entry for starting/finishing the loop end up together at the end.
I'm not entirely surprised that the call to logToFile in mapFunction doesn't include the log information from the loopFunction as the information hasn't passed to it via a bind.
But I'm still having trouble understanding how pass and listen work. And also how I would go about fixing this (admittedly minor) issue.

We can determine how listen and pass work almost entirely from their types. We'll start with listen.
listen
listen :: (Monoid w, Monad m) => RWST r w s m a -> RWST r w s m (a, w)
Unwrapping the RWST we have
listen :: (Monoid w, Monad m) => (r -> s -> m (a, s, w)) -> r -> s -> m ((a, w), s, w)
It needs to return an m .... The only way we have to make ms is to return something or apply the input function to an rand an s (we can't use >>= since it requires we already have an m). We don't have an a to return so we have to apply the function to an r and s. There's only one r and s we can use, those passed into the result.
listen k r s = ... (k r s)
Now we have an m (a, s, w) but need an m ((a, w), s, w). We can run the action again to get another m (nonsense for "listening") or do something with the (a, s, w) inside the m with >>=.
listen k r s = k r s >>= \(a, s' w) -> ...
To use bind we need an m. We can either return something or apply the input function to an r and s and repeat the action again, which is nonsense for "listening". We return something.
listen k r s = k r s >>= \(a, s', w) -> return ...
We need an a, a w, an s, and another w. We only have one a and no way to get any others.
listen k r s = k r s >>= \(a, s', w) -> return ((a,...),...,...)
There are 3 ways we can get a w: mempty, the w from the result of the action, or combining two ws together with <>. Returning mempty is pointless; the user could have just used mempty themselves. Duplicating what was logged with <> is as much nonsense as running an action twice, so we return what was logged by the first action.
listen k r s = k r s >>= \(a, s', w) -> return ((a,w),...,...)
We have two s es: s and s'. Reverting the state changes of the action is nonsense for "listening", so we return the changed state s'.
listen k r s = k r s >>= \(a, s', w) -> return ((a,w),s',...)
Now we are faced with the only interesting choice: what w should we keep for what was logged? The user has "listened" for what was logged; we could say that it's their problem now and reset the log to mempty. But "listening" doesn't suggest that it should change what something does, it should only observe it. Therefore, we keep the resulting log w intact.
listen k r s = k r s >>= \(a, s', w) -> return ((a,w),s',w)
If we wrap this in its RWSTs again we have
listen m = RWST \r s -> (runRWST m) r s >>= \(a, s', w) -> return ((a,w),s',w)
All we did was run the input action and include what it logged along with its resulting a in the result as a tuple. This matches the documentation for listen:
listen m is an action that executes the action m and adds its output to the value of the computation.
runRWST (listen m) r s = liftM (\ (a, w) -> ((a, w), w)) (runRWST m r s)
tell
pass :: (Monoid w, Monad m) => RWST r w s m (a, w -> w) -> RWST r w s m a
We begin as before, unwrapping the RWST
pass :: (Monoid w, Monad m) => (r -> s -> m ((a, w->w), s, w)) -> r -> s -> m (a, s, w)
We follow the same argument for how to get a resulting m as we used for listen
pass k r s = ... (k r s)
Now we have an m ((a, w->w), s, w)) but need an m (a, s, w). We can run the action again to get another m (nonsense for "passing") or do something with the ((a, w->w), s, w) inside the m with >>=.
pass k r s = k r s >>= \((a, f), s', w) -> ...
To use bind we need an m. We can either return something or apply the input function to an r and s and repeat the action again, which is nonsense for "passing". We return something.
pass k r s = k r s >>= \((a, f), s', w) -> return ...
We need an a, an s, and a w. We only have one a and no way to get any others.
pass k r s = k r s >>= \((a, f), s', w) -> return (a,...,...)
We have two s es: s and s'. Reverting the state changes of the action is nonsense for "passing", so we return the changed state s'.
pass k r s = k r s >>= \((a, f), s', w) -> return (a,s',...)
There are 4 ways we can get a w: mempty, the w from the result of the action, combining two ws together with <>, or applying the function f to another w. Setting the result to mempty leaves us wondering why the user provided a function f :: w -> w. themselves. Duplicating what was logged with <> is as much nonsense as running an action twice. We should be applying the function f to something.
pass k r s = k r s >>= \((a, f), s', w) -> return (a,s',f ...)
We could apply f to something built from memptys and <>, but if that were the case all of the fs would be equivalent to const ...; the type for it might as well have been a w. We could apply f to some elaborate structure built from w, mempty, <>, and f, but all of those structures could have been defined in f itself if we simply pass it w.
pass k r s = k r s >>= \((a, f), s', w) -> return (a,s',f w)
If we wrap this in its RWSTs again we have
pass m = RWST \r s -> (runRWST k) r s >>= \((a, f), s', w) -> return (a,s',f w)
We ran the input action and changed what was logged by the function that was a result of the action. This matches the documentation for pass:
pass m is an action that executes the action m, which returns a value and a function, and returns the value, applying the function to the output.
runRWST (pass m) r s = liftM (\ ((a, f), w) -> (a, f w)) (runRWST m r s)

The existing WriterT w m can't perform any action in the underlying m to perform logging until after the action has been run and the w has been assembled. As your question illustrates, this is confusing. The log for the do block loopFunction isn't written by logToFile until after the do block itself finishes running.
LoggerT
Let's invent a new WriterT called LoggerT. Our new LoggerT is going to provide a new function
logTells :: (Monoid w, Monoid w', Monad m) =>
(w -> LoggerT w' m ()) -> LoggerT w m a -> LoggerT w' m a
The intuition behind this is: we'll be able to provide an action (with type w -> LoggerT w' m ()) to log every tell, replacing the logged result with the result of the action. If we smash two things the user tells us together with <> we'll no longer be able to log both of them; we'll only ever be able to log the result of <>. Since our LoggerT will never be able to use <> it will never need the Monoid instances. We must drop the Monoid constraint from everything in LoggerT.
logTells :: (Monad m) =>
(w -> LoggerT w' m ()) -> LoggerT w m a -> LoggerT w' m a
We need to remember every tell so that we can replace it later. But when we replace it "later", the logging should happen at the point the tell appeared in the code. For example, if we make
processX :: LoggerT String m ()
processX = do
tell "Starting process X"
lotsOfProcessing
tell "Finishing process X"
And then "later" write logTells logToFile processX we want the resulting computation to look like the following.
logTells logToFile processX = do
logToFile "Starting process X"
lotsOfProcessing
logToFile "Finishing process X"
None of lotsOfProcessing should happen until the logToFile for tell "Starting process X" has already happened. This means that when the user tells us something we need to remember not only what we were told, but everything that happens after that. We "remember" things in the constructor for a data.
data LoggerT w m a
= Tell w (LoggerT w m a)
| ...
tell :: w -> LoggerT w m ()
tell w = Tell w (return ())
We also need to be able to perform actions in the underlying Monad. It would be tempting to add another constructor Lift (m a), but then we couldn't decide what to log as a result of the underlying computation. Instead, we'll let it decide the entire future LoggerT w m a to run.
data LoggerT w m a
= Tell w (LoggerT w m a)
| M (m (LoggerT w m a))
...
If we try to lift an underlying computation m a into LoggerT we now have a problem; we don't have a way to turn the a into a LoggerT w m a to put it in the M constructor.
instance MonadTrans (LoggerT w m) where
lift ma = M (??? ma)
We could try lifting return from the underlying Monad, but that's just a circular definition. We'll add another constructor for Returning.
data LoggerT w m a
= Tell w (LoggerT w m a)
| M (m (LoggerT w m a))
| Return a
instance MonadTrans (LoggerT w m) where
lift = M . liftM Return
To finish our monad transformer, we'll write a Monad instance.
instance Monad m => Monad (LoggerT w m) where
return = Return
la0 >>= k = go la0
where
go (Tell w la ) = Tell w (go la)
go (M mla) = M (liftM go mla)
go (Return a ) = Return a
We can now define logTells. It replaces every Tell with the action to perform to log it.
logTells :: (w -> LoggerT w' m ()) -> LoggerT w m a -> LoggerT w' m a
logTells k = go
where
go (Tell w la ) = k w >> go la
go (M mla) = M (liftM go mla)
go (Return a) = return a
Finally, we'll provide a way to get out of LoggerT by replacing all of the Tells with an action, very similar to logTells but dropping the LoggerT from the result.
Since it will get rid of the LoggerT we'll call it runLoggerT and swap the arguments to match the convention of other transformers.
runLoggerT :: LoggerT w m a -> (w -> m ()) -> m a
runLoggerT la0 k = go la0
where
go (Tell w la ) = k w >> go la
go (M mla) = liftM go mla
go (Return a) = return a
LoggerT already exists, we don't need to write it ourself. It's the Producer from the very mature pipes library.
pipes
The Producer from the pipes library is the correct logging transformer.
type Producer b = Proxy X () () b
Every Proxy has a MonadTrans (Proxy a' a b' b) instance and a Monad m => Monad (Proxy a' a b' b m) instance.
We tell it what to log with yield.
yield :: Monad m => a -> Producer' a m ()
tell = yield
When we know what we want to do with the yields, we replace them with what we want to do using for.
for :: Monad m =>
Proxy x' x b' b m a' ->
(b -> Proxy x' x c' c m b')
-> Proxy x' x c' c m a'
Specialized for Producer and (), for has the type
for :: Monad m =>
Producer b m a ->
(b -> Producer c m ()) ->
Producer c m a
logTells = flip for
If we replace each of the yields with an action in the underlying monad, we won't have anything produced anymore and can run the Proxy with runEffect.
runEffect :: Monad m => Effect m r -> m r
runEffect :: Monad m => Proxy X () () X m r -> m r
runEffect :: Monad m => Producer X m r -> m r
runLoggerT la0 k = runEffect $ for la0 (lift . k)
We can even recover the WriterT with hoist which replaces the underlying monad (every Proxy a' a b' b has an MFunctor instance).
hoist :: (Monad m, MFunctor t) => (forall a. m a -> n a) -> t m b -> t n b
We use hoist to replace the underlying monad with WriterT w m by lifting each m a into WriterT w m a. Then we replace each yield with lift . tell, and run the result.
toWriterT :: (Monad m, Monoid w) => Producer w m r -> WriterT w m r
toWriterT p0 = runEffect $ for (hoist lift p0) (lift . tell)
toWriterT p0 = runLoggerT (hoist lift p0) tell
Producer is essentially the free WriterT that doesn't require a Monoid for the items being written.

Here's a simplified, but definitely real-life example that uses censor (which is defined in terms of pass as
censor :: (MonadWriter w m) => (w -> w) -> m a -> m a
censor f m = pass $ (,f) <$> m
) to collect free variables of a lambda term:
import Control.Monad.Writer
import Data.Set (Set)
import qualified Data.Set as Set
type VarId = String
data Term = Var VarId
| Lam VarId Term
| App Term Term
freeVars :: Term -> Set VarId
freeVars = execWriter . go
where
go :: Term -> Writer (Set VarId) ()
go (Var x) = tell $ Set.singleton x
go (App f e) = go f >> go e
go (Lam x e) = censor (Set.delete x) $ go e
Now, of course you can implement this without all the Writer machinery, but remember this is just a simplified example standing in for some more involved compilation/analysis function, where tracking free variables is just one of the things going on.

The documentation is clear enough? http://hackage.haskell.org/package/mtl-2.2.1/docs/Control-Monad-Writer-Lazy.html#g:1
Examples (run the following in ghci)
import Control.Monad.Writer
runWriterT ( do (a,w) <- listen $ do { tell "foo" ; return 42 } ; tell $ reverse w ; return a )
==> (42,"foooof")
runWriterT ( pass $ do { tell "foo" ; return (42,reverse) } )
==> (42,"oof")

Construct a pipes Proxy inside-out

Is it possible to make a function so that a Proxy from pipes can be constructed inside-out? By inside-out, I mean create a proxy from a function that connects the upstream and downstream connections. The most desirable (but impossible) signature would be
makeProxy :: (Monad m) =>
(Server a' a m r -> Client b' b m r -> Effect m r) ->
Proxy a' a b' b m r
The first problem we encounter is the mechanical problem of constructing the Proxy. There's no way for us to know if the function looks at the Server or Client except by having each of them be M, in which case we'll only know which one it looked at, not what value it tried to send upstream or downstream. If we focus on the upstream end, the only thing we know is that something tried to figure out what the upstream proxy is, so we need to decide on either always resulting in a Request farther upstream or Responding. Either way we answer, the only value we can provide is (). This means we can make a Request () to an upstream producer or Respond () immediately. If we consider making this choice for both ends, there are only four possible functions. The following functions are named after whether their upstream and downstream connections send interesting data downstream (D) or upstream (U).
betweenDD :: (Monad m) =>
(Server () a m r -> Client () b m r -> Effect m r) ->
Proxy () a () b m r
betweenDD = undefined
betweenDU :: (Monad m) =>
(Server () a m r -> Client b' () m r -> Effect m r) ->
Proxy () a b' () m r
betweenDU = undefined
betweenUU :: (Monad m) =>
(Server a' () m r -> Client b' () m r -> Effect m r) ->
Proxy a' () b' () m r
betweenUU f = reflect (betweenDD g)
where g source sink = f (reflect sink) (reflect source)
betweenUD :: (Monad m) =>
(Server a' () m r -> Client () b m r -> Effect m r) ->
Proxy a' () () b m r
betweenUD = undefined
betweenDD is the most interesting, it would build a pipe between a Producer and a Consumer; betweenUU would do the same for a pipe running upstream. betweenDU would consume data requesting it from one of two sources. betweenUD would produce data, sending it to one of two destinations.
Can we provide a definition for betweenDD? If not, can we instead provide definitions for the following simpler functions?
belowD :: (Monad m) =>
(Producer a m r -> Producer b m r) ->
Proxy () a () b m r
aboveD :: (Monad m) =>
(Consumer b m r -> Consumer a m r) ->
Proxy () a () b m r
This question was motivated by trying to write belowD to use in answering a question about P.zipWith.
Example
This example happens to be essentially the question that inspired this question..
Let's say we want to create a Pipe that will number the values passing through it. The Pipe will have values a coming downstream from above and values (n, a) leaving downstream below; in other words it will be a Pipe a (n, a).
We'll solve this problem by zipping with the numbers. The result of ziping with the numbers is a function (->) from a Producer a to a Producer (n, a).
import qualified Pipes.Prelude as P
number' :: (Monad m, Num n, Enum n) => Producer a m () -> Producer (n, a) m ()
number' = P.zip (fromList [1..])
Even though the Pipe will consume as from upstream, from the point of view of the function it needs a Producer of as to provide those values. If we had a definition for belowD we could write
number :: (Monad m, Num n, Enum n) => Pipe a (n, a) m ()
number = belowD (P.zip (fromList [1..]))
given a suitable definition for fromList
fromList :: (Monad m) => [a] -> Producer a m ()
fromList [] = return ()
fromList (x:xs) = do
yield x
fromList xs

Actually, I think makeProxy is possible if you slightly change the type. I am on my phone so I cannot type check this just yet, but I believe this works:
{-# LANGUAGE RankNTypes #-}
import Control.Monad.Trans.Class (lift)
import Pipes.Core
makeProxy
:: Monad m
=> ( forall n. Monad n
=> (a' -> Server a' a n r)
-> (b -> Client b' b n r)
-> Effect n r
)
-> Proxy a' a b' b m r
makeProxy k = runEffect (k up dn)
where
up = lift . request \>\ pull
dn = push />/ lift . respond
This assumes that k is defined as:
k up dn = up ->> k >>~ dn
Edit: Yeah, it works if you add an import for lift
I'll walk through why this works.
First, let me set out some of the pipes definitions and laws:
-- Definition of `push` and `pull`
(1) pull = request >=> push
(2) push = respond >=> pull
-- Read this as: f * (g + h) = (f * g) + (f * h)
(3) f \>\ (g >=> h) = (f \>\ g) >=> (f \>\ h)
-- Read this as: (g + h) * f = (g * f) + (h * f)
(4) (g >=> h) />/ f = (g />/ f) >=> (h />/ f)
-- Right identity law for the request category
(5) f \>\ request = f
-- Left identity law for the respond category
(6) respond />/ f = f
-- Free theorems (equations you can prove from the types alone!)
(7) f \>\ respond = respond
(8) request />/ f = request
Now let's use those equations to expand out up and dn:
up = (lift . request) \>\ pull
= (lift . request) \>\ (request >=> push) -- Equation (1)
= (lift . request \>\ request) >=> (lift . request \>\ push) -- Equation (3)
= lift . request >=> (lift . request \>\ push) -- Equation (5)
= lift . request >=> (lift . request \>\ (respond >=> pull)) -- Equation (2)
= lift . request >=> (lift . request \>\ respond) >=> (lift . request \>\ pull) -- Equation (3)
= lift . request >=> respond >=> (lift . request \>\ pull) -- Equation (7)
up = lift . request >=> respond >=> up
-- Same steps, except symmetric
dn = lift . respond >=> request >=> dn
In other words, up converts all requests going out of k's upstream interface into lift . request and dn converts all responds going out of k's downstream interface into lift . respond. In fact, we can prove that:
(9) (f \>\ pull) ->> p = f \>\ p
(10) p >>~ (push />/ f) = p />/ f
... and if we apply those equations to k, we get:
(lift . request \>\ pull) ->> k >>~ (push />/ lift . respond)
= lift . request \>\ k />/ lift . respond
This says the same thing except more directly: we're replacing every request in k with lift . request and replacing every respond in k with lift . respond.
Once we lower all requests and responds to the base monad, we end up with this type:
lift . request \>\ k />/ lift . respond :: Effect' (Proxy a' a b' b m) r
Now we can delete the outer Effect using runEffect. This leaves behind the "inside-out" Proxy.
This is also the same trick that Pipes.Lift.distribute uses to swap the order of the Proxy monad with the monad underneath it:
http://hackage.haskell.org/package/pipes-4.1.4/docs/src/Pipes-Lift.html#distribute

(Sorry, I missed a couple brackets on a sleepy head, so the first answer was to a different question)
Producer' a m r -> Producer' b m r is the definition of a Pipe a b m r - it can consume a and produce b.
belowD ::Monad m => (Producer' a m () -> Producer' b m r) -> Pipe a b m ()
belowD g = sequence_ $ repeat $ do
x <- await -- wait for `a` as a Pipe
g $ yield x -- pass a trivial Producer to g, and forward output
This one will expect one or more b for each a. If g needs more than one a to produce one b, it won't produce anything.
But then since Proxy a b c d m is a Monad, we can lift await:
belowD :: Monad m => (forall m . Monad m => Producer a m () -> Producer b m r) ->
Pipe a b m r
belowD g = h . g $ sequence_ $ repeat ((lift $ await) >>= yield) where
h :: Monad m => Producer b (Pipe a b m) r -> Pipe a b m r
h p = do
x <- next p
case x of
Left r -> return r
Right (x,p) -> do
yield x
h p
h :: Monad m => Producer a m () -> Producer a m ()
h :: Monad m => Producer a m () -> Producer a m ()
h p = p >-> (sequence_ $ repeat $ await >>= yield >> await) -- skips even
main = runEffect $ (mapM_ yield [1..10]) >-> (for (belowD h) $ lift . print)
> 1
> 3
> 5
> 7
> 9

Haskell: Splitting pipes (broadcast) without using spawn

This question is a bit codegolf and a lot newb.
I'm using the awesome pipes library in Haskell, and I'd like to split a pipe to send the same data along multiple channels (do a broadcast).
The Pipes.Concurrent tutorial suggests using spawn to create mailboxes, taking advantage of Output's monoid status.
For example, we might do something like this:
main = do
(output1, input1) <- spawn Unbounded
(output2, input2) <- spawn Unbounded
let effect1 = fromInput input1 >-> pipe1
let effect2 = fromInput input2 >-> pipe2
let effect3 = P.stdinLn >-> toOutput (output1 <> output2)
...
Is this indirection through the mailboxes really necessary?
Could we instead write something like this?
main = do
let effect3 = P.stdinLn >-> (pipe1 <> pipe2)
...
The above doesn't compile, because Pipe doesn't have a Monoid instance.
Is there a good reason for this?
Is the first method really the cleanest way to split a pipe?

There are two ways to do this without using concurrency, both with caveats.
The first way is that if pipe1 and pipe2 are just simple Consumers that loop forever like:
p1 = for cat f -- i.e. p1 = forever $ await >>= f
p2 = for cat g -- i.e. p2 = forever $ await >>= g
... then the easy way to solve this is to just write:
for P.stdinLn $ \str -> do
f str
g str
For example, if p1 is just printing every value:
p1 = for cat (lift . print)
... and p2 is just writing that value to a handle:
p2 = for cat (lift . hPutStrLn h)
... then you would combine them like so:
for P.stdinLn $ \str -> do
lift $ print str
lift $ hPutStrLn h str
However, this simplification only works for Consumers that trivially loop. There's another solution that is more general, which is to define an ArrowChoice instance for pipes. I believe that pull-based Pipes do not permit a correct law-abiding instance, but push-based Pipes do:
newtype Edge m r a b = Edge { unEdge :: a -> Pipe a b m r }
instance (Monad m) => Category (Edge m r) where
id = Edge push
(Edge p2) . (Edge p1) = Edge (p1 >~> p2)
instance (Monad m) => Arrow (Edge m r) where
arr f = Edge (push />/ respond . f)
first (Edge p) = Edge $ \(b, d) ->
evalStateP d $ (up \>\ unsafeHoist lift . p />/ dn) b
where
up () = do
(b, d) <- request ()
lift $ put d
return b
dn c = do
d <- lift get
respond (c, d)
instance (Monad m) => ArrowChoice (Edge m r) where
left (Edge k) = Edge (bef >=> (up \>\ (k />/ dn)))
where
bef x = case x of
Left b -> return b
Right d -> do
_ <- respond (Right d)
x2 <- request ()
bef x2
up () = do
x <- request ()
bef x
dn c = respond (Left c)
This requires a newtype so that the type parameters are in the order that ArrowChoice expects.
If you're unfamiliar with the term push-based Pipe, it's basically a Pipe that begins from the most upstream pipe instead of the most downstream pipe, and they all have the following shape:
a -> Pipe a b m r
Think of it as a Pipe that cannot "go" until it receives at least one value from upstream.
These push-based Pipes are the "dual" to conventional pull-based Pipes, complete with their own composition operator and identity:
(>~>) :: (Monad m)
=> (a -> Pipe a b m r)
-> (b -> Pipe b c m r)
-> (a -> Pipe a c m r)
push :: (Monad m)
-> a -> Pipe a a m r
... but the unidirectional Pipes API does not export this by default. You can only get these operators from Pipes.Core (and you may want to study that module more closely to build an intuition for how they work). That module shows that both push-based Pipes and pull-based Pipes are both special cases of more general bidirectional versions, and understanding the bidirectional case is how you learn why they are duals of each other.
Once you have an Arrow instance for push-based pipes, you can write something like:
p >>> bifurcate >>> (p1 +++ p2)
where
bifurcate = Edge $ pull ~> \a -> do
yield (Left a) -- First give `p1` the value
yield (Right a) -- Then give `p2` the value
Then you would use runEdge to convert that to a pull-based pipe when you are done.
This approach has one major draw-back, which is that you can't automatically upgrade a pull-based pipe to a push-based pipe (but usually it's not hard to figure out how to do it manually). For example, to upgrade Pipes.Prelude.map to be a push-based Pipe, you would write:
mapPush :: (Monad m) => (a -> b) -> (a -> Pipe a b m r)
mapPush f a = do
yield (f a)
Pipes.Prelude.map f
Then that has the right type to be wrapped up in the Arrow:
mapEdge :: (Monad m) => (a -> b) -> Edge m r a b
mapEdge f = Edge (mapPush f)
Of course, an even simpler way would be just to write it from scratch:
mapEdge f = Edge $ push ~> yield . f
Use whichever approach suits you best.
In fact, I came up with the Arrow and ArrowChoice instances precisely because I was trying to answer the exact same question as you: how do you solve these kinds of problems without using concurrency? I wrote up a long answer about this more general subject in another Stack Overflow answer here, where I describe how you can use these Arrow and ArrowChoice instances to distill concurrent systems into equivalent pure ones.

Fusing conduits with multiple inputs

I am trying to create a conduit that can consume multiple input streams. I need to be able to await on one or the other of the input streams in no particular order (e.g., not alternating) making zip useless. There is nothing parallel or non-deterministic going on here: I await on one stream or the other. I want to be able to write code similar to the following (where awaitA and awaitB await on the first or second input stream respectively):
do
_ <- awaitA
x <- awaitA
y <- awaitB
yield (x,y)
_ <- awaitB
_ <- awaitB
y' <- awaitB
yield (x,y')
The best solution I have is to make the inner monad another conduit, e.g.
foo :: Sink i1 (ConduitM i2 o m) ()
Which then allows
awaitA = await
awaitB = lift await
And this mostly works. Unfortunately, this seems to make it very difficult to fuse to the inner conduit before the outer conduit is fully connected. The first thing I tried was:
fuseInner :: Monad m =>
Conduit i2' m i2 ->
Sink i1 (ConduitM i2 o m) () ->
Sink i1 (ConduitM i2' o m) ()
fuseInner x = transPipe (x =$=)
But this doesn't work, at least when x is stateful since (x =$=) is run multiple times, effectively restarting x each time.
Is there any way to write fuseInner, short of breaking into the internals of conduit (which looks like it would be pretty messy)? Is there some better way to handle multiple input streams? Am I just way to far beyond what conduit was designed for?
Thanks!

If you want to combine two IO-generated streams, then Gabriel's comment is the solution.
Otherwise, you can't wait for both streams, which one produces a value first. Conduits are single-threaded and deterministic - it processes only one pipe at a time. But you could create a function that interleaves two streams, letting them decide when to switch:
{-# OPTIONS_GHC -fwarn-incomplete-patterns #-}
import Control.Monad (liftM)
import Data.Conduit.Internal (
Pipe (..), Source, Sink,
injectLeftovers, ConduitM (..),
mapOutput, mapOutputMaybe
)
-- | Alternate two given sources, running one until it yields `Nothing`,
-- then switching to the other one.
merge :: Monad m
=> Source m (Maybe a)
-> Source m (Maybe b)
-> Source m (Either a b)
merge (ConduitM l) (ConduitM r) = ConduitM $ goL l r
where
goL :: Monad m => Pipe () () (Maybe a) () m ()
-> Pipe () () (Maybe b) () m ()
-> Pipe () () (Either a b) () m ()
goL (Leftover l ()) r = goL l r
goL (NeedInput _ c) r = goL (c ()) r
goL (PipeM mx) r = PipeM $ liftM (`goL` r) mx
goL (Done _) r = mapOutputMaybe (liftM Right) r
goL (HaveOutput c f (Just o)) r = HaveOutput (goL c r) f (Left o)
goL (HaveOutput c f Nothing) r = goR c r
-- This is just a mirror copy of goL. We should combine them together to
-- avoid code repetition.
goR :: Monad m => Pipe () () (Maybe a) () m ()
-> Pipe () () (Maybe b) () m ()
-> Pipe () () (Either a b) () m ()
goR l (Leftover r ()) = goR l r
goR l (NeedInput _ c) = goR l (c ())
goR l (PipeM mx) = PipeM $ liftM (goR l) mx
goR l (Done _) = mapOutputMaybe (liftM Left) l
goR l (HaveOutput c f (Just o)) = HaveOutput (goR l c) f (Right o)
goR l (HaveOutput c f Nothing) = goL l c
It processes one source until it returns Nothing, then switches to another, etc. If one source finishes, the other one is processed to the end.
As an example, we can combine and interleave two lists:
import Control.Monad.Trans
import Data.Conduit (($$), awaitForever)
import Data.Conduit.List (sourceList)
main = (merge (sourceList $ concatMap (\x -> [Just x, Just x, Nothing]) [ 1..10])
(sourceList $ concatMap (\x -> [Just x, Nothing]) [101..110]) )
$$ awaitForever (\x -> lift $ print x)
If you need multiple sources, merge could be adapted to something like
mergeList :: Monad m => [Source m (Maybe a)] -> Source m a
which would cycle through the given list of sources until all of them are finished.

This can be done by diving into the internals of conduit. I wanted to avoid this because it looked extremely messy. Based on the responses here, it sounds like there is no way around it (but I would really appreciate a cleaner solution).
The key difficulty is that (x =$=) is a pure function, but to make transPipe give the correct answer, it needs a kind of stateful, function-like thing:
data StatefulMorph m n = StatefulMorph
{ stepStatefulMorph :: forall a. m a -> n (StatefulMorph m n, a)
, finalizeStatefulMorph :: n () }
Stepping StatefulMorph m n takes a value in m and returns, in n, both that value and the next StatefulMorph, which should be used to transform the next m value. The last StatefulMorph should be finalized (which, in the case of the "stateful (x =$=)", finalizes the x conduit.
Conduit fusion can be implemented as a StatefulMorph, using the code for pipeL with minor changes. The signature is:
fuseStateful :: Monad m
=> Conduit a m b
-> StatefulMorph (ConduitM b c m) (ConduitM a c m)
I also need a replacement for transPipe (a special case of hoist) that uses StatefulMorph values instead of functions.
class StatefulHoist t where
statefulHoist :: (Monad m, Monad n)
=> StatefulMorph m n
-> t m r -> t n r
A StatefulHoist instance for ConduitM i o can be written using the code for transPipe with some minor changes.
fuseInner is then easy to implement.
fuseInner :: Monad m
=> Conduit a m b
-> ConduitM i o (ConduitM b c m) r
-> ConduitM i o (ConduitM a c m) r
fuseInner left = statefulHoist (fuseStateful left)
I've written a more detailed explanation here and posted the full code here. If someone can come up with a cleaner solution, or one that uses the conduit public API, please post it.
Thanks for all the suggestions and input!

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Single-stepping a conduit - haskell

Related

Kleisli Arrow in Netwire 5?

How do `pass` and `listen` work in WriterT?

Construct a pipes Proxy inside-out

Haskell: Splitting pipes (broadcast) without using spawn

Fusing conduits with multiple inputs

Categories

Resources