Haskell: Splitting pipes (broadcast) without using spawn - haskell

This question is a bit codegolf and a lot newb.
I'm using the awesome pipes library in Haskell, and I'd like to split a pipe to send the same data along multiple channels (do a broadcast).
The Pipes.Concurrent tutorial suggests using spawn to create mailboxes, taking advantage of Output's monoid status.
For example, we might do something like this:
main = do
(output1, input1) <- spawn Unbounded
(output2, input2) <- spawn Unbounded
let effect1 = fromInput input1 >-> pipe1
let effect2 = fromInput input2 >-> pipe2
let effect3 = P.stdinLn >-> toOutput (output1 <> output2)
...
Is this indirection through the mailboxes really necessary?
Could we instead write something like this?
main = do
let effect3 = P.stdinLn >-> (pipe1 <> pipe2)
...
The above doesn't compile, because Pipe doesn't have a Monoid instance.
Is there a good reason for this?
Is the first method really the cleanest way to split a pipe?

There are two ways to do this without using concurrency, both with caveats.
The first way is that if pipe1 and pipe2 are just simple Consumers that loop forever like:
p1 = for cat f -- i.e. p1 = forever $ await >>= f
p2 = for cat g -- i.e. p2 = forever $ await >>= g
... then the easy way to solve this is to just write:
for P.stdinLn $ \str -> do
f str
g str
For example, if p1 is just printing every value:
p1 = for cat (lift . print)
... and p2 is just writing that value to a handle:
p2 = for cat (lift . hPutStrLn h)
... then you would combine them like so:
for P.stdinLn $ \str -> do
lift $ print str
lift $ hPutStrLn h str
However, this simplification only works for Consumers that trivially loop. There's another solution that is more general, which is to define an ArrowChoice instance for pipes. I believe that pull-based Pipes do not permit a correct law-abiding instance, but push-based Pipes do:
newtype Edge m r a b = Edge { unEdge :: a -> Pipe a b m r }
instance (Monad m) => Category (Edge m r) where
id = Edge push
(Edge p2) . (Edge p1) = Edge (p1 >~> p2)
instance (Monad m) => Arrow (Edge m r) where
arr f = Edge (push />/ respond . f)
first (Edge p) = Edge $ \(b, d) ->
evalStateP d $ (up \>\ unsafeHoist lift . p />/ dn) b
where
up () = do
(b, d) <- request ()
lift $ put d
return b
dn c = do
d <- lift get
respond (c, d)
instance (Monad m) => ArrowChoice (Edge m r) where
left (Edge k) = Edge (bef >=> (up \>\ (k />/ dn)))
where
bef x = case x of
Left b -> return b
Right d -> do
_ <- respond (Right d)
x2 <- request ()
bef x2
up () = do
x <- request ()
bef x
dn c = respond (Left c)
This requires a newtype so that the type parameters are in the order that ArrowChoice expects.
If you're unfamiliar with the term push-based Pipe, it's basically a Pipe that begins from the most upstream pipe instead of the most downstream pipe, and they all have the following shape:
a -> Pipe a b m r
Think of it as a Pipe that cannot "go" until it receives at least one value from upstream.
These push-based Pipes are the "dual" to conventional pull-based Pipes, complete with their own composition operator and identity:
(>~>) :: (Monad m)
=> (a -> Pipe a b m r)
-> (b -> Pipe b c m r)
-> (a -> Pipe a c m r)
push :: (Monad m)
-> a -> Pipe a a m r
... but the unidirectional Pipes API does not export this by default. You can only get these operators from Pipes.Core (and you may want to study that module more closely to build an intuition for how they work). That module shows that both push-based Pipes and pull-based Pipes are both special cases of more general bidirectional versions, and understanding the bidirectional case is how you learn why they are duals of each other.
Once you have an Arrow instance for push-based pipes, you can write something like:
p >>> bifurcate >>> (p1 +++ p2)
where
bifurcate = Edge $ pull ~> \a -> do
yield (Left a) -- First give `p1` the value
yield (Right a) -- Then give `p2` the value
Then you would use runEdge to convert that to a pull-based pipe when you are done.
This approach has one major draw-back, which is that you can't automatically upgrade a pull-based pipe to a push-based pipe (but usually it's not hard to figure out how to do it manually). For example, to upgrade Pipes.Prelude.map to be a push-based Pipe, you would write:
mapPush :: (Monad m) => (a -> b) -> (a -> Pipe a b m r)
mapPush f a = do
yield (f a)
Pipes.Prelude.map f
Then that has the right type to be wrapped up in the Arrow:
mapEdge :: (Monad m) => (a -> b) -> Edge m r a b
mapEdge f = Edge (mapPush f)
Of course, an even simpler way would be just to write it from scratch:
mapEdge f = Edge $ push ~> yield . f
Use whichever approach suits you best.
In fact, I came up with the Arrow and ArrowChoice instances precisely because I was trying to answer the exact same question as you: how do you solve these kinds of problems without using concurrency? I wrote up a long answer about this more general subject in another Stack Overflow answer here, where I describe how you can use these Arrow and ArrowChoice instances to distill concurrent systems into equivalent pure ones.

Related

Why to define the constructor parameter of Reader as a function?

When learning the Reader Monad, I find that it is defined as:
newtype Reader r a = Reader { runReader :: r -> a }
instance Monad (Reader r) where
return a = Reader $ \_ -> a
m >>= k = Reader $ \r -> runReader (k (runReader m r)) r
I want to known why using function as constructor parameter instead of something else such as a tuple:
newtype Reader r a = Reader { runReader :: (r, a) }
instance Monad (Reader r) where
-- Here I cannot get r when defining return function,
-- so does that's the reason that must using a function whose input is an "r"?
return a = Reader (r_unknown, a)
m >>= k = Reader (fst $ runReader m) (f (snd $ runReader m))
According to the Reader definition, we need a "environment" which we can use to generate a "value". I think a Reader type should contain the information of "environment" and "value", so the tuple seems perfect.
You didn't mention it in the question, but I guess you thought specifically of using a pair for defining Reader because it also makes sense to think of that as a way of providing a fixed environment. Let's say we have an earlier result in the Reader monad:
return 2 :: Reader Integer Integer
We can use this result to do further calculations with the fixed environment (and the Monad methods guarantee it remains fixed throughout the chain of (>>=)):
GHCi> runReader (return 2 >>= \x -> Reader (\r -> x + r)) 3
5
(If you substitute the definitions of return, (>>=) and runReader in the expression above and simplify it, you will see exactly how it reduces to 2 + 3.)
Now, let's follow your suggestion and define:
newtype Env r a = Env { runEnv :: (r, a) }
If we have an environment of type r and a previous result of type a, we can make an Env r a out of them...
Env (3, 2) :: Env Integer Integer
... and we can also get a new result from that:
GHCi> (\(r, x) -> x + r) . runEnv $ Env (3, 2)
5
The question, then, is whether we can capture this pattern through the Monad interface. The answer is no. While there is a Monad instance for pairs, it does something quite different:
newtype Writer r a = Writer { Writer :: (r, a) }
instance Monoid r => Monad (Writer r) where
return x = (mempty, x)
m >>= f = Writer
. (\(r, x) -> (\(s, y) -> (mappend r s, y)) $ f x)
$ runWriter m
The Monoid constraint is needed so that we can use mempty (which solves the problem that you noticed of having to create a r_unknown out of nowhere) and mappend (which makes it possible to combine the first elements of the pair in a way that doesn't violate the monad laws). This Monad instance, however, does something very different than what the Reader one does. The first element of the pair isn't fixed (it is subject to change, as we mappend other generated values to it) and we don't use it to compute the second element of the pair (in the definition above, y does not depend neither on r nor on s). Writer is a logger; the r values here are output, not input.
There is one way, however, in which your intuition is justified: we can't make a reader-like monad using a pair, but we can make a reader-like comonad. To put it very loosely, Comonad is what you get when you turn the Monad interface upside down:
-- This is slightly different than what you'll find in Control.Comonad,
-- but it boils down to the same thing.
class Comonad w where
extract :: w a -> a -- compare with return
(=>>) :: w a -> (w a -> b) -> w b -- compare with (>>=)
We can give the Env we had abandoned a Comonad instance:
newtype Env r a = Env { runEnv :: (r, a) }
instance Comonad (Env r) where
extract (Env (_, x)) = x
w#(Env (r, _)) =>> f = Env (r, f w)
That allows us to write the 2 + 3 example from the beginning in terms of (=>>):
GHCi> runEnv $ Env (3, 2) =>> ((\(r, x) -> x + r) . runEnv)
(3,5)
One way to see why this works is noting that an a -> Reader r b function (i.e. what you give to Reader's (>>=)) is essentially the same thing that an Env r a -> b one (i.e. what you give to Env's (=>>)):
a -> Reader r b
a -> (r -> b) -- Unwrap the Reader result
r -> (a -> b) -- Flip the function
(r, a) -> b -- Uncurry the function
Env r a -> b -- Wrap the argument pair
As further evidence of that, here is a function that changes one into the other:
GHCi> :t \f -> \w -> (\(r, x) -> runReader (f x) r) $ runEnv w
\f -> \w -> (\(r, x) -> runReader (f x) r) $ runEnv w
:: (t -> Reader r a) -> Env r t -> a
GHCi> -- Or, equivalently:
GHCi> :t \f -> uncurry (flip (runReader . f)) . runEnv
\f -> uncurry (flip (runReader . f)) . runEnv
:: (a -> Reader r c) -> Env r a -> c
To wrap things up, here is a slightly longer example, with Reader and Env versions side-by-side:
GHCi> :{
GHCi| flip runReader 3 $
GHCi| return 2 >>= \x ->
GHCi| Reader (\r -> x ^ r) >>= \y ->
GHCi| Reader (\r -> y - r)
GHCi| :}
5
GHCi> :{
GHCi| extract $
GHCi| Env (3, 2) =>> (\w ->
GHCi| (\(r, x) -> x ^ r) $ runEnv w) =>> (\z ->
GHCi| (\(r, x) -> x - r) $ runEnv z)
GHCi| :}
5
First of all note that your bind function is wrong and would not compile.
If the Reader were defined as you describe with a tuple, there would be problems:
The monad laws would be violated, e.g. left identity, which states that:
return a >>= f == f a
or the right identity:
m >>= return == m
would be broken, depending on the implmentation of >>= because >>= would forget either the first tuple element of the first argument or the second, i.e. if the implmentation would be:
(Reader (mr, mv)) >>= f =
let (Reader (fr, fv)) = f mv
in Reader (mr, fv)
then we would always lose the reader value that comes out of f (aka fr) and otherwise if >>= would be
(Reader (mr, mv)) >>= f =
let (Reader (fr, fv)) = f mv
in Reader (fr, fv)
-- ^^^ tiny difference here ;)
we would always lose mr.
A Reader is some action, that may ask for a constant value, which cannot be changed by another monadic action, which is read-only.
But when defined with a tuple, we could super-easy overwrite the reader value, e.g. whith this function:
tell :: x -> BadReader x ()
tell x = BadReader (x, ())
If a reader is instead defined with a function, this is impossible (try it)
Also, that enviroment is actually not required before converting a Reader to a pure value (aka running the Reader), so from this alone it makes sense to use a function instead of a tuple.
When using a tuple, we would have to provide the Reader value even before we actually run an action.
You can see that in your return definition, you even point out the problem, where the r_unknown comes from ...
To get a btter intuition, let's assume a Reader action that returns the Persons with a certain age from the Addressbook:
data Person = MkPerson {name :: String, age :: Int}
type Addressbook = [Person]
personsWithThisAge :: Int -> Reader Addressbook [Person]
personsWithThisAge a = do
addressbook <- ask
return (filter (\p -> age p == a) addressbook)
This personsWithAge function returns a Reader action and since it only asks for the Addressbook, it is like a function that accepts an addressbook and gives back a [Person] list,
so it is natural to define a reader as just that, a function from some input to a result.
We could rewrite this Reader action to be a function of the Addressbook like this:
personsWithThisAgeFun :: Int -> Addressbook -> [Person]
personsWithThisAgeFun a addressbook =
filter (\p -> age p == a) addressbook
But why invent Reader??
The real value of Reader shows when combining several functions like e.g. personsWithThisAge, that all depend on (the same) one constant Addressbook.
Using a Reader we don't have to explicitly pass some Addressbook around, individual Reader actions don't even have any way at all to modify the Addressbook - Reader guarantees us, that every action uses the same, unmodified Addressbook, and all a Reader action can ever to with the environment is ask for it.
The only way to implement this, with these guarantees is with a function.
Also if you look at the monad instances that are included in the standard library, you will see that (r ->) is a monad; actually it is identical to the Reader monad apart from some technical differences.
Now the structure you describe with the tuple is actually pretty close to a Writer monad, what is write-only , but that's out of scope.

Single-stepping a conduit

I want to do something along the lines of ArrowChoice, but with conduits. I want to await an Either value and then pass Left values to one conduit and Right values to another, and then merge the results back into an Either stream.
Presumably this can be done by making the inner conduits like automata: turn a conduit into a function that takes an argument and returns a monadic list of outputs yielded:
newtype AutomataM i m o = Automata (i -> m (o, Automata i o))
conduitStep :: Conduit i m o -> AutomataM i m [o]
The reason for the list of outputs is that a Conduit may yield 0 or more outputs for each input.
I've looked at ResumableConduit and its relatives, and presumably the answer is in there somewhere. But I can't quite see how its done.
It's not exactly the same type signature you provided, but:
import Data.Conduit
import Data.Conduit.Internal (Pipe (..), ConduitM (..))
newtype Automata i o m r = Automata (m ([o], Either r (i -> Automata i o m r)))
conduitStep :: Monad m => ConduitM i o m r -> Automata i o m r
conduitStep (ConduitM con0) =
Automata $ go [] id con0
where
go _ front (Done r) = return (front [], Left r)
go ls front (HaveOutput p _ o) = go ls (front . (o:)) p
go ls front (NeedInput p _) =
case ls of
[] -> return (front [], Right $ conduitStep . ConduitM . p)
l:ls' -> go ls' front (p l)
go ls front (PipeM mp) = mp >>= go ls front
go ls front (Leftover p l) = go (l:ls) front p
But just be careful with this approach:
By keeping the output as a list, it's not constant memory.
We're throwing away finalizers.
There's probably a way to provide a ZipConduit abstraction, similar to ZipSource and ZipSink, that would handle this kind of problem more elegantly, but I haven't thought about it too much.
EDIT I ended up implementing ZipConduit in conduit-extra 0.1.5. Here's a demonstration of using it which sounds a bit like your case:
import Control.Applicative
import Data.Conduit
import Data.Conduit.Extra
import qualified Data.Conduit.List as CL
conduit1 :: Monad m => Conduit Int m String
conduit1 = CL.map $ \i -> "conduit1: " ++ show i
conduit2 :: Monad m => Conduit Double m String
conduit2 = CL.map $ \d -> "conduit2: " ++ show d
conduit :: Monad m => Conduit (Either Int Double) m String
conduit = getZipConduit $
ZipConduit (lefts =$= conduit1) *>
ZipConduit (rights =$= conduit2)
where
lefts = CL.mapMaybe (either Just (const Nothing))
rights = CL.mapMaybe (either (const Nothing) Just)
main :: IO ()
main = do
let src = do
yield $ Left 1
yield $ Right 2
yield $ Left 3
yield $ Right 4
sink = CL.mapM_ putStrLn
src $$ conduit =$ sink
There's a folk method of doing this using pipes by using "push-category" Pipes. The complete implementation comes from both this mailing list post and this Stack Overflow answer. I think it hasn't been released yet due to both an effort to simplify the Pipes interface, a focus on using the "sequencing" monad instance which is hidden via this method, and no proof yet that this implementation truly implements the Arrow class properly.
The idea is to implement a newtype Edge (demonstrated below) which is a push-based pipe with the type arguments in the right order for Category, Arrow, ArrowChoice and both Functor and Applicative over their output values. This lets you compose them into directed acyclic graphs using arrow notation. I'll run over the implementation below, but it's safe to just ignore it and use the Arrow/ArrowChoice/Applicative instances of Edge without too much concern.
(Edit: This code is best made available at https://github.com/Gabriel439/Haskell-RCPL-Library)
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE TypeSynonymInstances #-}
import Prelude hiding ((.), id)
import Pipes.Core
import Pipes.Lift
import Control.Monad.Morph
import Control.Category
import Control.Monad.State.Strict
import Control.Arrow
This is an atypical mode of using pipes and isn't exposed in the Pipes module; you must import Pipes.Core to use push. Push-based pipes look like
-- push :: a -> Proxy a' a a' a m r
and thus they demand at least one upstream value before the Proxy is allowed to run. This means the whole process needs to be "kickstarted" by passing the first value as a function call and that the leftmost push-Proxy will control the entire stream.
Given a push-based pipe we can implement Category, Arrow and ArrowChoice. The standard solution also involves the Edge typeclass so that we have the type arguments in the right order for Category and Arrow
newtype Edge m r a b = Edge { unEdge :: a -> Pipe a b m r }
For the Category instance, we use the "push" Category which has push as id and (<~<) as composition:
instance Monad m => Category (Edge m r) where
id = Edge push
Edge a . Edge b = Edge (a <~< b)
We embed functions into Edge with arr by augmenting id (i.e. push) on the downward edge. To do this we use the respond category which has the law p />/ respond == p, but jam our f into the process.
instance Monad m => Arrow (Edge m r) where
arr f = Edge (push />/ respond . f)
We also use a local state transformer to store the snd half of our pairs and pass it "around" the input pipe in first
first (Edge p) = Edge $ \(b, d) ->
evalStateP d $ (up \>\ hoist lift . p />/ dn) b
where
up () = do
(b, d) <- request ()
lift (put d)
return b
dn c = do
d <- lift get
respond (c, d)
Finally, we get an ArrowChoice instance by implementing left. To do so we split the burden of passing the Left and Right sides using either the return or the pipe to pass values.
instance (Monad m) => ArrowChoice (Edge m r) where
left (Edge k) = Edge (bef >=> (up \>\ (k />/ dn)))
where
bef x = case x of
Left b -> return b
Right d -> do
_ <- respond (Right d)
x2 <- request ()
bef x2
up () = do
x <- request ()
bef x
dn c = respond (Left c)
We can use Edge to create "push-based" producers and consumers
type PProducer m r b = Edge m r () b
type PConsumer m r a = forall b . Edge m r a b
and then we'll provide Functor and Applicative instances for PProducer. This goes by case analysis on the underlying Pipe, so it's a bit verbose. Essentially, however, all that happens is that we insert f into the yield slot of the Pipe.
instance Functor (PProducer m r) where
fmap f (Edge k) = $ Edge $ \() -> go (k ()) where
go p = case p of
Request () ku -> Request () (\() -> go (ku ()))
-- This is the only interesting line
Respond b ku -> Respond (f b) (\() -> go (ku ()))
M m -> M (m >>= \p' -> return (go p'))
Pure r -> Pure r
Finally, Applicative is much the same except that we have to switch between running the upstream pipe to produce functions and running the downstream pipe to produce arguments.
instance (Monad m) => Applicative (Edge m r ()) where
pure b = Edge $ \() -> forever $ respond b
(Edge k1) <*> (Edge k2) = Edge (\() -> goL (k1 ()) (k2 ()))
where
goL p1 p2 = case p1 of
Request () ku -> Request () (\() -> goL (ku ()) p2)
Respond f ku -> goR f (ku ()) p2
M m -> M (m >>= \p1' -> return (goL p1' p2))
Pure r -> Pure r
goR f p1 p2 = case p2 of
Request () ku -> Request () (\() -> goR f p1 (ku ()))
Respond x ku -> Respond (f x) (\() -> goL p1 (ku ()))
M m -> M (m >>= \p2' -> return (goR f p1 p2'))
Pure r -> Pure r

Is a one-yield-per-await restricted pipe possible?

I'm working with pipes-4.0.0. In that library, the number of yields to downstream a pipe makes is in general unrelated to the number of awaits from upstream.
But suppose I wanted to build a restricted pipe that enforced that one and only one yield is performed for each await, while still being able to sequence these kinds of pipes using monadic (>>=).
I have observed that, in the bidirectional case, each value requested from upstream by a Proxy is matched with a value sent back. So maybe what I'm searching for is a function of type Proxy a' a () b m r -> Pipe a (Either b a') m r that "reflects" the values going upstream, turning them into additional yields to downstream. Or, less generally, Client a' a -> Pipe a a'. Is such a function possible?
You definitely do not want to use pipes for this. But, what you can do is define a restricted type that does this, do all your connections and logic within that restricted type, then promote it to a Pipe when you are done.
The type in question that you want is this, which is similar to the netwire Wire:
{-# LANGUAGE DeriveFunctor #-}
import Control.Monad.Trans.Free -- from the 'free' package
data WireF a b x = Pass (a -> (b, x)) deriving (Functor)
type Wire a b = FreeT (WireF a b)
That's automatically a monad and a monad transformer since it is implemented in terms of FreeT. Then you can implement this convenient operation:
pass :: (Monad m) => (a -> b) -> Wire a b m ()
pass f = liftF $ Pass (\a -> (f a, ()))
... and assemble custom wires using monadic syntax:
example :: Wire Int Int IO ()
example = do
pass (+ 1)
lift $ putStrLn "Hi!"
pass (* 2)
Then when you're done connecting things with this restricted Wire type you can promote it to a Pipe:
promote :: (Monad m) => Wire a b m r -> Pipe a b m r
promote w = do
x <- lift $ runFreeT w
case x of
Pure r -> return r
Free (Pass f) -> do
a <- await
let (b, w') = f a
yield b
promote w'
Note that you can define an identity and wire and wire composition:
idWire :: (Monad m) => Wire a a m r
idWire = forever $ pass id
(>+>) :: (Monad m) => Wire a b m r -> Wire b c m r -> Wire a c m r
w1 >+> w2 = FreeT $ do
x <- runFreeT w2
case x of
Pure r -> return (Pure r)
Free (Pass f2) -> do
y <- runFreeT w1
case y of
Pure r -> return (Pure r)
Free (Pass f1) -> return $ Free $ Pass $ \a ->
let (b, w1') = f1 a
(c, w2') = f2 b
in (c, w1' >+> w2')
I'm pretty sure those form a Category:
idWire >+> w = w
w >+> idWire = w
(w1 >+> w2) >+> w3 = w1 >+> (w2 >+> w3)
Also, I'm pretty sure that promote obeys the following functor laws:
promote idWire = cat
promote (w1 >+> w2) = promote w1 >-> promote w2
My gut feeling is that this is going to be very hard to do, if not outright impossible. Not only can you write producers and consumers full of complex loops but the monadic interface mans that the control flow of the consumer can depend on the values it gets from the producer.
consumer = do
n <- await
for i in 1..n do
m <- await
print m
Its going to be very hard to encode on the types of the producer that "this produces N + 1 numbers where N is the value of the first number yielded".
Going back on topic, I think you might have a better chance if you use your own combinators instead of the base monadic interface for pipes. For example, the boomerang web routes library uses a set of combinators to simultaneously build the code that does the (Route -> URL) conversion and teh code that does the (URL -> Route) conversion, thus guaranteeing that they are compatible and inverses of each other.

Fusing conduits with multiple inputs

I am trying to create a conduit that can consume multiple input streams. I need to be able to await on one or the other of the input streams in no particular order (e.g., not alternating) making zip useless. There is nothing parallel or non-deterministic going on here: I await on one stream or the other. I want to be able to write code similar to the following (where awaitA and awaitB await on the first or second input stream respectively):
do
_ <- awaitA
x <- awaitA
y <- awaitB
yield (x,y)
_ <- awaitB
_ <- awaitB
y' <- awaitB
yield (x,y')
The best solution I have is to make the inner monad another conduit, e.g.
foo :: Sink i1 (ConduitM i2 o m) ()
Which then allows
awaitA = await
awaitB = lift await
And this mostly works. Unfortunately, this seems to make it very difficult to fuse to the inner conduit before the outer conduit is fully connected. The first thing I tried was:
fuseInner :: Monad m =>
Conduit i2' m i2 ->
Sink i1 (ConduitM i2 o m) () ->
Sink i1 (ConduitM i2' o m) ()
fuseInner x = transPipe (x =$=)
But this doesn't work, at least when x is stateful since (x =$=) is run multiple times, effectively restarting x each time.
Is there any way to write fuseInner, short of breaking into the internals of conduit (which looks like it would be pretty messy)? Is there some better way to handle multiple input streams? Am I just way to far beyond what conduit was designed for?
Thanks!
If you want to combine two IO-generated streams, then Gabriel's comment is the solution.
Otherwise, you can't wait for both streams, which one produces a value first. Conduits are single-threaded and deterministic - it processes only one pipe at a time. But you could create a function that interleaves two streams, letting them decide when to switch:
{-# OPTIONS_GHC -fwarn-incomplete-patterns #-}
import Control.Monad (liftM)
import Data.Conduit.Internal (
Pipe (..), Source, Sink,
injectLeftovers, ConduitM (..),
mapOutput, mapOutputMaybe
)
-- | Alternate two given sources, running one until it yields `Nothing`,
-- then switching to the other one.
merge :: Monad m
=> Source m (Maybe a)
-> Source m (Maybe b)
-> Source m (Either a b)
merge (ConduitM l) (ConduitM r) = ConduitM $ goL l r
where
goL :: Monad m => Pipe () () (Maybe a) () m ()
-> Pipe () () (Maybe b) () m ()
-> Pipe () () (Either a b) () m ()
goL (Leftover l ()) r = goL l r
goL (NeedInput _ c) r = goL (c ()) r
goL (PipeM mx) r = PipeM $ liftM (`goL` r) mx
goL (Done _) r = mapOutputMaybe (liftM Right) r
goL (HaveOutput c f (Just o)) r = HaveOutput (goL c r) f (Left o)
goL (HaveOutput c f Nothing) r = goR c r
-- This is just a mirror copy of goL. We should combine them together to
-- avoid code repetition.
goR :: Monad m => Pipe () () (Maybe a) () m ()
-> Pipe () () (Maybe b) () m ()
-> Pipe () () (Either a b) () m ()
goR l (Leftover r ()) = goR l r
goR l (NeedInput _ c) = goR l (c ())
goR l (PipeM mx) = PipeM $ liftM (goR l) mx
goR l (Done _) = mapOutputMaybe (liftM Left) l
goR l (HaveOutput c f (Just o)) = HaveOutput (goR l c) f (Right o)
goR l (HaveOutput c f Nothing) = goL l c
It processes one source until it returns Nothing, then switches to another, etc. If one source finishes, the other one is processed to the end.
As an example, we can combine and interleave two lists:
import Control.Monad.Trans
import Data.Conduit (($$), awaitForever)
import Data.Conduit.List (sourceList)
main = (merge (sourceList $ concatMap (\x -> [Just x, Just x, Nothing]) [ 1..10])
(sourceList $ concatMap (\x -> [Just x, Nothing]) [101..110]) )
$$ awaitForever (\x -> lift $ print x)
If you need multiple sources, merge could be adapted to something like
mergeList :: Monad m => [Source m (Maybe a)] -> Source m a
which would cycle through the given list of sources until all of them are finished.
This can be done by diving into the internals of conduit. I wanted to avoid this because it looked extremely messy. Based on the responses here, it sounds like there is no way around it (but I would really appreciate a cleaner solution).
The key difficulty is that (x =$=) is a pure function, but to make transPipe give the correct answer, it needs a kind of stateful, function-like thing:
data StatefulMorph m n = StatefulMorph
{ stepStatefulMorph :: forall a. m a -> n (StatefulMorph m n, a)
, finalizeStatefulMorph :: n () }
Stepping StatefulMorph m n takes a value in m and returns, in n, both that value and the next StatefulMorph, which should be used to transform the next m value. The last StatefulMorph should be finalized (which, in the case of the "stateful (x =$=)", finalizes the x conduit.
Conduit fusion can be implemented as a StatefulMorph, using the code for pipeL with minor changes. The signature is:
fuseStateful :: Monad m
=> Conduit a m b
-> StatefulMorph (ConduitM b c m) (ConduitM a c m)
I also need a replacement for transPipe (a special case of hoist) that uses StatefulMorph values instead of functions.
class StatefulHoist t where
statefulHoist :: (Monad m, Monad n)
=> StatefulMorph m n
-> t m r -> t n r
A StatefulHoist instance for ConduitM i o can be written using the code for transPipe with some minor changes.
fuseInner is then easy to implement.
fuseInner :: Monad m
=> Conduit a m b
-> ConduitM i o (ConduitM b c m) r
-> ConduitM i o (ConduitM a c m) r
fuseInner left = statefulHoist (fuseStateful left)
I've written a more detailed explanation here and posted the full code here. If someone can come up with a cleaner solution, or one that uses the conduit public API, please post it.
Thanks for all the suggestions and input!

Writing a function (a -> b -> ... -> t) -> (Monad m => m a -> m b -> ... -> m t)

Is there some way to write a function f :: (a -> b -> ... -> t) -> (Monad m => m a -> m b -> ... -> m t), basically liftMn for any n?
(EDIT: fixed nonsensical example.)
I'm writing an FRP library, and thought it'd be neat if I could have code vaguely like:
main = do
input1 <- signalFromTextBoxTheUserMayTypeANumberInto
input2 <- signalFromAnotherTextBox
divided <- signal div input1 input2
signal (\x -> showTextAlert ("input1 `div` input2 is " ++ show x)) divided
I've been fiddling with type families to get it working, but I'm starting to think that it's actually not doable. I'm currently doing something like this:
type Action a = IORef a -> IO ()
type Listener = IO ()
newtype Signal a = Signal (IORef (SigData a))
data SigData a = SigData {
trigger :: Action a,
output :: IORef a,
listeners :: [Listener]
}
class Sig a where
type S a
mkSig :: [AnySignal] -> a -> S a
instance Sig b => Sig (a -> b) where
type S (a -> b) = Signal a -> S b
mkSig dependencies f =
\s#(Signal sig) ->
let x = unsafePerformIO $ readIORef sig >>= readIORef . output
in mkSig (AnySignal s : dependencies) (f x)
instance Sig Int where
type S Int = IO (Signal Int)
out <- newIORef x
self <- Signal <$> (newIORef $ SigData {
trigger = \ref -> writeIORef ref $! x,
output = out,
listeners = []
})
mapM_ (self `listensTo`) deps
return self
This obviously doesn't work, as the unsafePerformIO gets evaluated once and then keeps that value, and if did work it'd still be ugly, hacky and generally evil. Is there a way to do this, or will I just have to let go of the idea?
I'm kind of new to all of this, so forgive me if this is a silly answer, but isn't this exactly what applicative functors are for?
Applicatives let you do something like:
f :: a -> b -> ... -> c
f2 :: Applicative p => p a -> p b ... -> p c
f2 x ... y = f <$> x <*> ... <*> y
if I'm not mistaken. (The ellipses are any number of types/arguments)
How about the Strathclyde Haskell Environment preprocessor, which lets you use idiom brackets, the original notation for applicative functors? This lets you use (| f a b c |) for f <$> a <*> b <*> c. Your example would be (| input1 `div` input2 |).
By the way, it's probably a bad idea for your Signal type to have a Monad instance; this causes the well-known (in the FRP community) problem of time leaks; see this blog post for more information. An Applicative interface is OK, but a Monad interface is not. There are several solutions that prevent time leaks while still allowing the same dynamic event switching behaviour, involving things like an additional type parameter or another monad (as seen in, e.g. the sodium library).

Resources