How to turn a pull based pipe into a push based one?

By default pipes are pull based. This is due to the operator >->, which is implemented via +>>, the pointful bind operator for the pull category. My understanding is that this means that if you have code like producer >-> consumer, the consumer's body will be called first, then once it awaits data, the producer will be called.
I've seen in the pipes documentation here that you can use the code (reflect .) from Pipes.Core to turn a pull based pipe into a push based pipe. That means instead (correct me if I'm wrong) that in the code above producer >-> consumer, the producer is run first, produces a value, then the consumer tries to consume. That seems really useful and I'd like to know how to do it.
I've also seen in discussions here that there is no push based counterpart to >-> because it is easy to turn any pipe around (I assume with reflect?), but I can't really figure out how to do it or find any examples.
Here's some code I've attempted:
stdin :: Producer String IO r
stdin = forever $ do
    lift $ putStrLn "stdin"
    str <- lift getLine
    yield str

countLetters :: Consumer String IO r
countLetters = forever $ do
    lift $ putStrLn "countLetters"
    str <- await
    lift . putStrLn . show . length $ str

-- this works in pull mode
runEffect (stdin >-> countLetters)
-- equivalent to above, works
runEffect ((\() -> stdin) +>> countLetters)
-- push based operator, doesn't do what I hoped
runEffect (stdin >>~ (\_ -> countLetters))
-- does not compile
runEffect (countLetters >>~ (\() -> stdin))

-- push based operator, doesn't do what I hoped
runEffect (stdin >>~ (\_ -> countLetters))
I gather the problem here is that, while the producer is run first as expected, the first produced value is dropped. Compare...
GHCi> runEffect (stdin >-> countLetters)
countLetters
stdin
foo
3
countLetters
stdin
glub
4
countLetters
stdin
... with:
GHCi> runEffect (stdin >>~ (\_ -> countLetters))
stdin
foo
countLetters
stdin
glub
4
countLetters
stdin
This issue is discussed in detail by Gabriella Gonzalez's answer to this question. It boils down to how the argument to the function you give to (>>~) is the "driving" input in the push-based flow, and so if you const it away you end up dropping the first input. The solution is to reshape countLetters accordingly:
countLettersPush :: String -> Consumer String IO r
countLettersPush str = do
    lift $ putStrLn "countLetters"
    lift . putStrLn . show . length $ str
    str' <- await
    countLettersPush str'

GHCi> runEffect (stdin >>~ countLettersPush)
stdin
foo
countLetters
3
stdin
glub
countLetters
4
stdin
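The dropped first value can also be observed without any console input. The following is a minimal sketch, assuming the pipes package is installed; record, pullStyle, pushStyle and the two seen... helpers are hypothetical names introduced only for this illustration. Each run feeds the list [1, 2, 3] through (>>~) and records what the consumer actually receives.

```haskell
import Control.Monad (forever)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import Pipes
import Pipes.Core ((>>~))

-- Hypothetical helper: remember every value the consumer sees.
record :: IORef [Int] -> Int -> IO ()
record ref x = modifyIORef' ref (x :)

-- Pull-style consumer: it only sees values it awaits.
pullStyle :: IORef [Int] -> Consumer Int IO r
pullStyle ref = forever $ do
    x <- await
    lift (record ref x)

-- Push-style consumer: the argument is the first ("driving") input.
pushStyle :: IORef [Int] -> Int -> Consumer Int IO r
pushStyle ref x = do
    lift (record ref x)
    x' <- await
    pushStyle ref x'

-- Ignoring the argument handed over by (>>~) loses the first element.
seenWhenIgnoringFirst :: IO [Int]
seenWhenIgnoringFirst = do
    ref <- newIORef []
    runEffect (each [1, 2, 3] >>~ \_ -> pullStyle ref)
    reverse <$> readIORef ref

-- Using the argument keeps every element.
seenWhenPushed :: IO [Int]
seenWhenPushed = do
    ref <- newIORef []
    runEffect (each [1, 2, 3] >>~ pushStyle ref)
    reverse <$> readIORef ref
```

In GHCi, seenWhenIgnoringFirst should yield [2,3] while seenWhenPushed yields [1,2,3], which mirrors the two transcripts above.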
I've also seen in discussions here that there is no push based counterpart to >-> because it is easy to turn any pipe around (I assume with reflect?)
I'm not fully sure of my ground, but it seems that doesn't quite apply to the solution above. What we can do, now that we have the push-based flow working correctly, is use reflect to turn it back into a pull-based flow:
-- Preliminary step: switching to '(>~>)'.
stdin >>~ countLettersPush
(const stdin >~> countLettersPush) ()
-- Applying 'reflect', as the documentation suggests.
reflect . (const stdin >~> countLettersPush)
reflect . const stdin <+< reflect . countLettersPush
const (reflect stdin) <+< reflect . countLettersPush
-- Rewriting in terms of '(+>>)'.
(reflect . countLettersPush >+> const (reflect stdin)) ()
reflect . countLettersPush +>> reflect stdin
This is indeed pull-based, as the flow is driven by reflect stdin, the downstream Client:
GHCi> :t reflect stdin
reflect stdin :: Proxy String () () X IO r
GHCi> :t reflect stdin :: Client String () IO r
reflect stdin :: Client String () IO r :: Client String () IO r
The flow, however, involves sending Strings upstream, and so it cannot be expressed in terms of (>->), which is, so to say, downstream-only:
GHCi> -- Compare the type of the second argument with that of 'reflect stdin'
GHCi> :t (>->)
(>->)
  :: Monad m =>
     Proxy a' a () b m r -> Proxy () b c' c m r -> Proxy a' a c' c m r

Related

Convert IO callback to infinite list

I am using a library that I can provide with a function a -> IO (), which it will call occasionally.
Because the output of my function depends not only on the a it receives as input, but also on the previous a's, it would be much easier for me to write a function [a] -> IO (), where [a] is infinite.
Can I write a function:
magical :: ([a] -> IO ()) -> (a -> IO ())
That collects the a's it receives from the callback and passes them to my function as a lazy infinite list?

The IORef solution is indeed the simplest one. If you'd like to explore a pure (but more complex) variant, have a look at conduit. There are other implementations of the same concept, see Iteratee I/O, but I myself found conduit to be very easy to use.
A conduit (AKA pipe) is an abstraction of a program that can accept input and/or produce output. As such, it can keep internal state, if needed. In your case, magical would be a sink, that is, a conduit that accepts input of some type, but produces no output. By wiring it into a source, a program that produces output, you complete the pipeline, and then every time the sink asks for an input, the source is run until it produces its output.
In your case you'd have roughly something like
magical :: Sink a IO () -- consumes a stream of `a`s, no result
magical = go (some initial state)
  where
    go state = do
        m'input <- await
        case m'input of
            Nothing -> return () -- finish
            Just input -> do
                -- do something with the input
                go (some updated state)

This is not exactly what you asked for, but it might be enough for your purposes, I think.
magical :: ([a] -> IO ()) -> IO (a -> IO ())
magical f = do
    list <- newIORef []
    let g x = do
            modifyIORef list (x:)
            xs <- readIORef list
            f xs -- or (reverse xs), if you need FIFO ordering
    return g
So if you have a function fooHistory :: [a] -> IO (), you can use
main = do
    ...
    foo <- magical fooHistory
    setHandler foo -- here we have foo :: a -> IO ()
    ...
As @danidaz wrote above, you probably do not need magical, but can play the same trick directly in your fooHistory, modifying a list reference (IORef [a]).
main = do
    ...
    list <- newIORef []
    let fooHistory x = do
            modifyIORef list (x:)
            xs <- readIORef list
            use xs -- or (reverse xs), if you need FIFO ordering
    setHandler fooHistory -- here we have fooHistory :: a -> IO ()
    ...

Control.Concurrent.Chan does almost exactly what I wanted!
import Control.Monad (forever, void)
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan

-- Stand-in for the library: calls the handler for every character read.
setHandler :: (Char -> IO ()) -> IO ()
setHandler f = void . forkIO . forever $ getChar >>= f

process :: String -> IO ()
process ('h':'i':xs) = putStrLn "hi" >> process xs
process ('a':xs) = putStrLn "a" >> process xs
process (x:xs) = process xs
process _ = error "Guaranteed to be infinite"

main :: IO ()
main = do
    c <- newChan
    setHandler $ writeChan c
    list <- getChanContents c
    process list
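To see the lazy channel contents in isolation, here is a small sketch that uses only base. The helper firstFromChan is a hypothetical name introduced for this illustration; it writes a finite list into a channel and then reads a prefix back through getChanContents.

```haskell
import Control.Concurrent.Chan (getChanContents, newChan, writeChan)

-- Hypothetical helper: push some values into a fresh channel, then
-- take the first n of them from the lazy list of channel contents.
-- Asking for more elements than were written would block, because the
-- lazy list waits for further writes to the channel.
firstFromChan :: Int -> [a] -> IO [a]
firstFromChan n xs = do
    c <- newChan
    mapM_ (writeChan c) xs
    contents <- getChanContents c
    pure (take n contents)
```

For example, firstFromChan 3 "hello" returns "hel". The list is conceptually infinite, so a consumer such as process above never sees its end.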

This seems like a flaw in the library design to me. You might consider an upstream patch so that you could provide something more versatile as input.

What's an idiomatic way of handling a lazy input channel in Haskell

I am implementing an IRC bot, and since I am connecting over SSL by using OpenSSL.Session, I use the lazyRead function to read data from the socket. During the initial phase of the connection I need to perform several things in order (nick negotiation, nickserv identification, joining channels, etc.), so there is some state involved. Right now I came up with the following:
data ConnectionState = Initial | NickIdentification | Connected

listen :: SSL.SSL -> IO ()
listen ssl = do
    lines <- BL.lines `fmap` SSL.lazyRead ssl
    evalStateT (mapM_ (processLine ssl) lines) Initial

processLine :: SSL.SSL -> BL.ByteString -> StateT ConnectionState IO ()
processLine ssl line =
    case message of
        Just a -> processMessage ssl a
        Nothing -> return ()
  where
    message = IRC.decode $ BL.toStrict line

processMessage :: SSL.SSL -> IRC.Message -> StateT ConnectionState IO ()
processMessage ssl m = do
    state <- S.get
    case state of
        Initial -> when (IRC.msg_command m == "376") $ do
            liftIO $ putStrLn "connected!"
            liftIO $ privmsg ssl "NickServ" ("identify " ++ nick_password)
            S.put NickIdentification
        NickIdentification -> do
            when (identified m) $ do
                liftIO $ putStrLn "identified!"
                liftIO $ joinChannel ssl chan
                S.put Connected
        Connected -> return ()
    liftIO $ print m
    when (IRC.msg_command m == "PING") $ (liftIO . pong . mconcat . map show) (IRC.msg_params m)
So when I get to the "Connected" state I still end up going through the case statement even though it's only really needed to initialize the connection. The other problem is that adding nested StateT's would be very painful.
Another way would be to replace mapM_ with something custom that only processes lines until we are connected and then starts another loop over the rest. This would require either keeping track of what's left in the list or invoking SSL.lazyRead once again (which is not too bad).
Another solution is to keep the remaining lines list in the state and draw lines when needed similar to getLine.
What's the better thing to do in this case? Would Haskell's laziness make it so that we go directly to Connected case after state stops updating or is case always strict?

You can use the Pipe type from pipes. The trick is that instead of creating a state machine and a transition function you can encode the state implicitly in the control flow of the Pipe.
Here is what the Pipe would look like:
stateful :: Pipe ByteString ByteString IO r
stateful = do
    msg <- await
    if (IRC.msg_command msg == "376")
        then do
            liftIO $ putStrLn "connected!"
            liftIO $ privmsg ssl "NickServ" ("identify " ++ nick_password)
            yield msg
            nick
        else stateful

nick :: Pipe ByteString ByteString IO r
nick = do
    msg <- await
    if identified msg
        then do
            liftIO $ putStrLn "identified!"
            liftIO $ joinChannel ssl chan
            yield msg
            cat -- Forward the remaining input to output indefinitely
        else nick
The stateful pipe corresponds to the stateful part of your processMessage function. It handles initialization and authentication, but defers further message processing to downstream stages by re-yielding the msg.
You can then loop over every message this Pipe yields by using for:
processMessage :: Consumer ByteString IO r
processMessage = for stateful $ \msg -> do
    liftIO $ print msg
    when (IRC.msg_command msg == "PING") $ (liftIO . pong . mconcat . map show) (IRC.msg_params msg)
Now all you need is a source of ByteString lines to feed to processMessage. You can use the following Producer:
lines :: Producer ByteString IO ()
lines = do
    bs <- liftIO (ByteString.getLine)
    if ByteString.null bs
        then return ()
        else do
            yield bs
            lines
Then you can connect lines to processMessage and run them:
runEffect (lines >-> processMessage) :: IO ()
Note that the lines Producer does not use lazy IO. It will work even if you use the strict ByteString module, but the behavior of the entire program will still be lazy.
If you want to learn more about how pipes works, you can read the pipes tutorial.
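The idea of keeping the state in the control flow of the Pipe can be tried in isolation, without any IRC machinery. The following is a minimal sketch, assuming the pipes package is installed; waitFor and afterMarker are hypothetical names introduced for this illustration. The pipe has two phases: it discards input until it sees a marker, then forwards everything with cat, and no explicit state value is ever inspected again.

```haskell
import Pipes
import qualified Pipes.Prelude as P

-- Phase 1: drop input until the marker shows up.
-- Phase 2: re-yield the marker and forward the rest unchanged.
waitFor :: (Monad m, Eq a) => a -> Pipe a a m r
waitFor marker = do
    x <- await
    if x == marker
        then do
            yield x
            cat
        else waitFor marker

-- Hypothetical pure wrapper so the behaviour can be checked in GHCi.
afterMarker :: Eq a => a -> [a] -> [a]
afterMarker marker xs = P.toList (each xs >-> waitFor marker)
```

For instance, afterMarker 3 [1, 2, 3, 4, 5] evaluates to [3,4,5], and afterMarker 9 [1, 2] evaluates to [] because the marker never arrives.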

How can I conditionally apply a conduit?

I have a Conduit of type Conduit a m a and a function of type (a -> Maybe a). I want to run the function, and then if it returns Nothing, use the Conduit. That is, I want a function of type
maybePipe :: Conduit a m b -> (a -> Maybe b) -> Conduit a m b
or, of the more restricted type
maybePipe :: Conduit a m a -> (a -> Maybe a) -> Conduit a m a
If it helps, my specific case is as follows:
I'm writing code that deals with IRC messages, and I have a function:
runClient :: Conduit IRC.Message IO IRC.Message -> ClientSettings -> IO ()
runClient pipe address = runTCPClient' pipe' address where
    pipe' = mapC IRC.decode $= concatMapC id $= pipe $= mapC IRC.encode $= mapC (++ "\r\n")
    handlePings (IRC.Message (Just (IRC.Server serverName)) "PING" []) = Just $ IRC.pong serverName
    handlePings (IRC.Message Nothing "PING" [server]) = Just $ IRC.pong server
    handlePings (IRC.Message Nothing "PING" []) = Just $ IRC.pong (getHost address)
    handlePings _ = Nothing

runTCPClient' :: Conduit ByteString IO ByteString -> ClientSettings -> IO ()
runTCPClient' pipe address = runTCPClient address runClient where
    runClient appdata = appSource appdata $= linesUnboundedAsciiC $= pipe $$ appSink appdata
I want to be able to do maybePipe handlePings pipe (or equivalent) in that function, so when the IRC message is a ping, we respond with a pong and don't call the user-specified Conduit.

Searching Hoogle reveals a function with almost exactly that type signature: mapOutputMaybe. But the more idiomatic way would be to fuse with Data.Conduit.List.mapMaybe.
EDIT
Scratch that, I understand what you're asking now. No, there's no built in combinator. But it's easy to build one up:
myHelper onNothing f = awaitForever $ maybe onNothing yield . f

Using Michael's combinator only calls the (a -> Maybe b) function on the first item it comes across, then lets the onNothing pipe take over. This was not what I was looking for.
Instead, using ZipConduit, in my specific example (using conduit-combinators):
pingHandlingPipe =
    getZipConduit $ ZipConduit (concatMapC handlePings)
                 *> ZipConduit (takeWhileC (not.isJust.handlePings) $= pipe)
or, generalized
pipeMaybe maybeF pipe =
    getZipConduit $ ZipConduit (concatMapC maybeF)
                 *> ZipConduit (takeWhileC (not.isJust.maybeF) $= pipe)
Unfortunately, this calls the (a -> Maybe b) function two times.

Generalizing a function to merge a set of Haskell pipes Producers

I am working with the Haskell pipes package.
I am trying to use pipes-concurrency to merge a list of Producers together.
What I want to arrive at is:
merge :: MonadIO m => [Producer a m ()] -> Producer a m ()
so given a producer s1 and another producer s2: r = merge [s1, s2]
which would give the behaviour:
s1 --1--1--1--|
s2 ---2---2---2|
r --12-1-21--2|
Following the code in the tutorial page I came up with:
mergeIO :: [Producer a IO ()] -> Producer a IO ()
mergeIO producers = do
    (output, input) <- liftIO $ spawn Unbounded
    _ <- liftIO $ mapM (fork output) producers
    fromInput input
  where
    fork :: Output a -> Producer a IO () -> IO ()
    fork output producer = void $ forkIO $ do
        runEffect $ producer >-> toOutput output
        performGC
which works as expected.
However I am having difficulty generalizing things.
My attempt:
merge :: (MonadIO m) => [Producer a m ()] -> Producer a m ()
merge producers = do
    (output, input) <- liftIO $ spawn Unbounded
    _ <- liftIO $ mapM (fork output) producers
    fromInput input
  where
    runEffectIO :: Monad m => Effect m r -> IO (m r)
    runEffectIO e = do
        x <- evaluate $ runEffect e
        return x
    fork output producer = forkIO $ do
        runEffectIO $ producer >-> toOutput output
        performGC
Unfortunately this compiles but does not do all too much else. I am guessing that I am making a mess of runEffectIO. Other approaches to my current runEffectIO have yielded no better results.
The program:
main = do
    let producer = merge [repeater 1 (100 * 1000), repeater 2 (150 * 1000)]
    runEffect $ producer >-> taker 20
  where
    repeater :: Int -> Int -> Producer Int IO r
    repeater val delay = forever $ do
        lift $ threadDelay delay
        yield val
    taker :: Int -> Consumer Int IO ()
    taker 0 = return ()
    taker n = do
        val <- await
        liftIO $ putStrLn $ "Taker " ++ show n ++ ": " ++ show val
        taker $ n - 1
hits val <- await but does not get to liftIO $ putStrLn thus it produces no output. However it exits fine without hanging.
When I substitute mergeIO for merge, the program runs as I would expect, outputting 20 lines.

While MonadIO is not sufficient for this operation, MonadBaseControl (from monad-control) is designed to allow embedding arbitrary transformer stacks inside the base monad. The companion package lifted-base provides a version of fork which will work for transformer stacks. I've put together an example of using it to solve your problem in the following Gist, though the main magic is:
import qualified Control.Concurrent.Lifted as L

fork :: (MonadBaseControl IO m, MonadIO m) => Output a -> Producer a m () -> m ThreadId
fork output producer = L.fork $ do
    runEffect $ producer >-> toOutput output
    liftIO performGC
Note that you should understand what happens to monadic states when treated this way: modifications to any mutable state performed in the child threads will be isolated to just those child threads. In other words, if you were using a StateT, each child thread would start off with the same state value that was in context when it was forked, but then you would have many different states that do not update each other.
There's an appendix in the Yesod book on monad-control, though frankly it's a bit dated. I'm just not aware of any more recent tutorials.

The problem seems to be your use of evaluate, which I assume is the evaluate from Control.Exception.
You seem to be using it to "convert" a value inside the generic monad m into IO, but it doesn't really work that way. You are just obtaining the m value out of the Effect and then returning it inside IO without actually executing it. The following code doesn't print "foo":
evaluate (putStrLn "foo") >> return ""
Maybe your merge function could take as an additional parameter a function m a -> IO a so that merge knows how to bring the result of runEffect into IO.
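The point about evaluate can be checked with a small base-only sketch. The names evaluatedOnly and evaluatedThenRun are hypothetical, introduced just for this illustration. evaluate only forces the action value to weak head normal form; the action's effects happen only if the returned action is itself run.

```haskell
import Control.Exception (evaluate)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- The increment is evaluated as a value but never executed,
-- so the counter keeps its initial value.
evaluatedOnly :: IO Int
evaluatedOnly = do
    counter <- newIORef (0 :: Int)
    _ <- evaluate (modifyIORef' counter (+ 1))
    readIORef counter

-- Running the action that evaluate hands back does perform the increment.
evaluatedThenRun :: IO Int
evaluatedThenRun = do
    counter <- newIORef (0 :: Int)
    action <- evaluate (modifyIORef' counter (+ 1))
    action
    readIORef counter
```

Here evaluatedOnly returns 0 and evaluatedThenRun returns 1, which is the same reason the forked thread in merge never actually runs the Effect.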

Unfortunately, you can't fork a Producer with a MonadIO base monad (or any MonadIO computation for that matter). You need to specifically include the logic necessary to run all other monad transformers to get back an IO action before you can fork the computation.

How do I break out of a loop in Haskell?

The current version of the Pipes tutorial uses the following two functions in one of the examples:
stdout :: () -> Consumer String IO r
stdout () = forever $ do
    str <- request ()
    lift $ putStrLn str

stdin :: () -> Producer String IO ()
stdin () = loop
  where
    loop = do
        eof <- lift $ IO.hIsEOF IO.stdin
        unless eof $ do
            str <- lift getLine
            respond str
            loop
As is mentioned in the tutorial itself, P.stdin is a bit more complicated due to the need to check for the end of input.
Are there any nice ways to rewrite P.stdin to not need a manual tail recursive loop and use higher order control flow combinators like P.stdout does? In an imperative language I would use a structured while loop or a break statement to do the same thing:
while(not IO.isEOF(IO.stdin) ){
    str <- getLine()
    respond(str)
}

forever(){
    if(IO.isEOF(IO.stdin) ){ break }
    str <- getLine()
    respond(str)
}

I prefer the following:
import Control.Monad
import Control.Monad.Trans.Either
loop :: (Monad m) => EitherT e m a -> m e
loop = liftM (either id id) . runEitherT . forever
-- I'd prefer 'break', but that's in the Prelude
quit :: (Monad m) => e -> EitherT e m r
quit = left
You use it like this:
import Pipes
import qualified System.IO as IO
stdin :: () -> Producer String IO ()
stdin () = loop $ do
    eof <- lift $ lift $ IO.hIsEOF IO.stdin
    if eof
        then quit ()
        else do
            str <- lift $ lift getLine
            lift $ respond str
See this blog post where I explain this technique.
The only reason I don't use that in the tutorial is that I consider it less beginner-friendly.
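The same loop/quit technique can be sketched without pipes. This version assumes ExceptT from the transformers package as a stand-in for EitherT (throwE plays the role of left); that substitution is mine, not part of the answer above. sumUntilNegative is a hypothetical example that adds up list elements and breaks out at the first negative one or at the end of the list.

```haskell
import Control.Monad (forever)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import Data.IORef (newIORef, readIORef, writeIORef)

-- Repeat the body until it calls quit; the quit value is the result.
loop :: Monad m => ExceptT e m a -> m e
loop = fmap (either id id) . runExceptT . forever

-- Break out of the enclosing loop with a result.
quit :: Monad m => e -> ExceptT e m r
quit = throwE

-- Hypothetical demo: sum elements until a negative one (or the end).
sumUntilNegative :: [Int] -> IO Int
sumUntilNegative xs0 = do
    ref <- newIORef (xs0, 0)
    loop $ do
        (xs, acc) <- lift (readIORef ref)
        case xs of
            (x : rest) | x >= 0 -> lift (writeIORef ref (rest, acc + x))
            _ -> quit acc
```

For example, sumUntilNegative [1, 2, 3, -1, 5] returns 6: the loop stops at -1 and never looks at the 5.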

Looks like a job for whileM_:
stdin () = whileM_ (lift . fmap not $ IO.hIsEOF IO.stdin) (lift getLine >>= respond)
or, using do-notation similarly to the original example:
stdin () =
    whileM_ (lift . fmap not $ IO.hIsEOF IO.stdin) $ do
        str <- lift getLine
        respond str
The monad-loops package offers also whileM which returns a list of intermediate results instead of ignoring the results of the repeated action, and other useful combinators.
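For readers who want to see what such a combinator does without installing monad-loops, here is a hand-rolled stand-in with the same type as whileM_. It is a sketch written for this illustration, not the library's source, and whileM_' and countTo are hypothetical names.

```haskell
import Control.Monad (when)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Run the body for as long as the monadic condition yields True,
-- discarding the body's results.
whileM_' :: Monad m => m Bool -> m a -> m ()
whileM_' cond body = go
  where
    go = do
        continue <- cond
        when continue (body >> go)

-- Hypothetical demo: increment a counter until it reaches the limit.
countTo :: Int -> IO Int
countTo limit = do
    ref <- newIORef 0
    whileM_' ((< limit) <$> readIORef ref) (modifyIORef' ref (+ 1))
    readIORef ref
```

countTo 5 returns 5, and countTo 0 returns 0 because the condition is checked before the body ever runs, just as the EOF check comes before getLine in stdin.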

Since there is no implicit control flow here, there is no such thing as "break". Moreover, your sample is already a small block that will be used inside more complicated code.
If you want to stop "producing strings", that should be supported by your abstraction, i.e. some "management" of the pipes, using a special monad in the Consumer and/or other monads related to it.

You can simply import System.Exit and use exitWith ExitSuccess, e.g.:
if input == 'q'
    then exitWith ExitSuccess
    else print 5 -- or anything else
