Finalization in Pipes-2.1.0 package

I am using the Pipes-2.1.0 package and the zeromq3-haskell package to construct a little message pipeline. Everything seems to be going well except that I am having trouble understanding finalization of Frames.
In the following Frame I acquire two resources: a zeromq context and a zeromq socket. Then I continuously wait for messages (in the form of ByteStrings) to publish on the zeromq socket.
{-# LANGUAGE RebindableSyntax #-}
{-# LANGUAGE ScopedTypeVariables #-}
module PipesZeroMQ where
import Control.Frame
import Control.IMonad.Do
import Control.IMonad.Trans
import qualified Control.Monad as M
import Data.ByteString (ByteString)
import Data.String
import Prelude hiding (Monad(..))
import qualified System.ZMQ3 as ZMQ
type Address = String
fromList :: (M.Monad m) => [b] -> Frame b m (M a) (M a) ()
fromList xs = mapMR_ yield xs
publisher :: Address -> Frame Void IO (M ByteString) C ()
publisher addr = do
    c <- liftU $ ZMQ.init 1
    s <- liftU $ ZMQ.socket c ZMQ.Pub
    liftU $ ZMQ.bind s addr
    liftU $ print "Socket open for business!!!"
    foreverR $ do
        bs <- await
        finallyF (ZMQ.close s M.>> ZMQ.term c M.>> print "ZMQ socket closed") $ do
            (liftU $ ZMQ.send s [] bs)
            (liftU (print "Sending message"))
Now if I try this:
λ> runFrame $ (publisher localAddress) <-< (fromList ["This", "that", "that"] >> close)
I get this:
"Socket open for business"
"Sending message"
"ZMQ socket closed"
*** Exception: ZMQError { errno = 88, source = "send", message = "Socket operation on non-socket" }
publisher finalizes after receiving just one ByteString.
Why is this happening?
What am I misunderstanding about finalization using Frames in Pipes-2.1.0?
Does the tree outside stand a chance if I start attacking it?

You made a mistake when writing the publisher function:
foreverR $ do
    bs <- await
    finallyF (ZMQ.close s M.>> ZMQ.term c M.>> print "ZMQ socket closed") $ do
        (liftU $ ZMQ.send s [] bs)
        (liftU (print "Sending message"))
You probably wanted to place the finallyF OUTSIDE the foreverR loop:
finallyF (...) $ foreverR $ do
    bs <- await
    liftU $ ZMQ.send s [] bs
    liftU (print "Sending message")
The way you wrote it, finallyF wraps only the body of the loop, so the finalizer runs after every send, which is exactly what you told it to do. finallyF calls the finalizer once the action it wraps completes, whether it terminates successfully or unsuccessfully. You could also use catchF in that case, since the loop never terminates anyway:
catchF (...) $ foreverR $ do
    bs <- await
    liftU $ ZMQ.send s [] bs
    liftU (print "Sending message")
Alternatively, you could have kept it inside the loop but switched to catchF so that the finalizer doesn't get run after each send:
foreverR $ do
    bs <- await
    catchF (ZMQ.close s M.>> ZMQ.term c M.>> print "ZMQ socket closed") $ do
        (liftU $ ZMQ.send s [] bs)
        (liftU (print "Sending message"))
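The effect of where the finalizer sits can be illustrated outside of Frames. The following is a minimal sketch using `finally` and `onException` from `Control.Exception` in plain IO, on the assumption that finallyF and catchF behave roughly like an always-run and a failure-only finalizer, as described above. The `send` function and the counter are hypothetical stand-ins, not part of pipes or zeromq3-haskell.

```haskell
import Control.Exception (finally, onException)
import Control.Monad (forM_)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for sending one message on a socket.
send :: String -> IO ()
send msg = putStrLn ("Sending " ++ msg)

-- Finalizer inside the loop: runs once per message (the bug in the question).
finalizeInside :: IORef Int -> [String] -> IO ()
finalizeInside closes msgs =
    forM_ msgs $ \m -> send m `finally` modifyIORef' closes (+ 1)

-- Finalizer around the loop: runs once, after the whole loop ends.
finalizeOutside :: IORef Int -> [String] -> IO ()
finalizeOutside closes msgs =
    forM_ msgs send `finally` modifyIORef' closes (+ 1)

-- Failure-only finalizer inside the loop: never runs if every send succeeds.
finalizeOnError :: IORef Int -> [String] -> IO ()
finalizeOnError closes msgs =
    forM_ msgs $ \m -> send m `onException` modifyIORef' closes (+ 1)

-- Run one variant and report how many times its finalizer fired.
countCloses :: (IORef Int -> [String] -> IO ()) -> IO Int
countCloses variant = do
    closes <- newIORef 0
    variant closes ["This", "that", "that"]
    readIORef closes
```

With three messages and no failures, `countCloses` reports three finalizer runs for `finalizeInside`, one for `finalizeOutside`, and none for `finalizeOnError`.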
Also, if you are planning on writing a zeroMQ library based on pipes, keep in touch with me because I'm planning to return frames back to an ordinary monad in the next release with a lot of new enhancements to functionality, too, such as the ability to close and reinitialize resources. To reach me use my gmail.com address with username Gabriel439.

Related

how to publish in Hedis haskell pubSub

I'm currently learning Haskell, and I'm not yet that good at functional programming.
I want to write a piece of code that gets data by subscribing to a topic in Redis, does some calculation on it, and publishes the result on another topic. I'm having trouble with what I guess are some language-specific features.
My current code:
{-# LANGUAGE OverloadedStrings #-}
module Main where
import Database.Redis
import System.IO
main = do
    conn <- connect defaultConnectInfo
    runRedis conn $ do
        pubSub (subscribe ["commands"]) $ \msg -> do
            putStrLn $ "Message from " ++ show (msgChannel msg)
            publish "results" "Result of a very interesting calculation"
            return mempty
Now I receive the error: • No instance for (RedisCtx IO f0) arising from a use of ‘publish’
Putting the publish outside the pubsub will make it work. But I want to publish a result! I can't get any wiser from the documentation. What is it that I'm missing?

Your use of putStrLn is making the typechecker (correctly!) infer that your do block is intended to be in an IO context, then the call to publish requires the context to be an instance of RedisCtx, which IO is not.
Normally in a Redis context, the solution would be to lift the IO action into the Redis context using liftIO :: IO a -> m a from the MonadIO class, as in the example in the documentation:
runRedis conn $ do
    set "hello" "hello"
    set "world" "world"
    helloworld <- multiExec $ do
        hello <- get "hello"
        world <- get "world"
        return $ (,) <$> hello <*> world
    liftIO (print helloworld)
MonadIO is the set of types in whose context you can execute IO actions.
However, in this case, it’s the other way around: the functional parameter to pubSub returns an IO action, but publish expects a RedisCtx monad.
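The two directions can be sketched without hedis. In the following minimal example, `ReaderT Conn IO` is a hypothetical stand-in for the Redis monad, and `Conn`, `App`, `publishMsg`, and `callback` are illustrative names, not hedis APIs. It shows that `liftIO` takes an IO action into the richer monad, while an IO callback has to re-enter that monad through an explicit runner, which is the role runRedis plays.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Reader (ReaderT, ask, runReaderT)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for a connection: a log of published messages.
newtype Conn = Conn (IORef [String])

-- Hypothetical stand-in for the Redis monad.
type App = ReaderT Conn IO

-- Stand-in for `publish`: usable only inside App, not in plain IO.
publishMsg :: String -> App ()
publishMsg msg = do
    Conn ref <- ask
    liftIO (modifyIORef' ref (++ [msg]))

-- Inside App, plain IO actions need liftIO.
inApp :: App ()
inApp = do
    liftIO (putStrLn "running inside App")
    publishMsg "from App"

-- An IO callback cannot call publishMsg directly; it re-enters App
-- through the runner.
callback :: Conn -> String -> IO ()
callback conn msg = do
    putStrLn ("Message: " ++ msg)
    runReaderT (publishMsg ("result for " ++ msg)) conn

demo :: IO [String]
demo = do
    ref <- newIORef []
    let conn = Conn ref
    runReaderT inApp conn
    callback conn "commands"
    readIORef ref
```

Whether re-entering is safe for a real Redis connection depends on how the library manages connections; this sketch only illustrates the types.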
It’s not clear to me whether pubSub allows you to make a runRedis call inside the callback, like this, although I think it should typecheck:
{-# LANGUAGE OverloadedStrings #-}
module Main where
import Database.Redis
import System.IO
main = do
    conn <- connect defaultConnectInfo
    runRedis conn $ do
        pubSub (subscribe ["commands"]) $ \msg -> do
            putStrLn $ "Message from " ++ show (msgChannel msg)
            runRedis conn $ publish "results" "Result of a very interesting calculation"
            return mempty
Based on skimming the docs, each runRedis call takes a connection from the connection pool, whose default size is 50; however, if there is no available connection, it blocks, so my concern is that because the docs for pubSub say it’s “single-threaded”, this could deadlock waiting for a connection that won’t be released since you’re in a “nested” runRedis call.
I think the thing I would try next is to use the more flexible pubSubForever API; in the hedis test suite there’s an example of using pubSubForever with separate threads for publishing and handling subscription events.
main = do
    ctrl <- newPubSubController [("foo", msgHandler)] []
    conn <- connect defaultConnectInfo
    withAsync (publishThread conn) $ \_pubT -> do
        withAsync (handlerThread conn ctrl) $ \_handlerT -> do
            void $ hPutStrLn stderr "Press enter to subscribe to bar" >> getLine
            void $ addChannels ctrl [("bar", msgHandler)] []
            -- …
            -- (Add/remove various subscriptions.)
            -- …
publishThread uses runRedis and calls publish:
publishThread :: Connection -> IO ()
publishThread c = runRedis c $ loop (0 :: Int)
  where
    loop i = do
        let msg = encodeUtf8 $ pack $ "Publish iteration " ++ show i
        void $ publish "foo" ("foo" <> msg)
        -- …
        liftIO $ threadDelay $ 2*1000*1000
        loop (i+1)
handlerThread uses pubSubForever:
handlerThread :: Connection -> PubSubController -> IO ()
handlerThread conn ctrl = forever $
    pubSubForever conn ctrl onInitialComplete
      `catch` (\(e :: SomeException) -> do
        hPutStrLn stderr $ "Got error: " ++ show e
        threadDelay $ 50*1000)
This is wrapped in a call to forever to resubscribe if the connection is lost, per the docs for pubSubForever:
[…] if the network connection to Redis dies, pubSubForever will throw a ConnectionLost. When such an exception is thrown, you can recall pubSubForever with the same PubSubController which will open a new connection and resubscribe to all the channels which are tracked in the PubSubController.
This test uses Control.Concurrent.Async from the async package for managing tasks, which is a good idea imo. If you want to avoid that dependency, you could use forkIO instead (with e.g. a Chan or STM TChan to send events from the handler). The only issue is that this won’t automatically notify the other threads if the forked thread terminates due to an exception, whereas Async makes some nice exception safety guarantees.
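The forkIO alternative could look roughly like the following sketch, which uses only base. The `Event` type, `handler`, and `forkHandler` are hypothetical names, and the handler simply forwards a fixed list of messages instead of being driven by pubSubForever. Because forkIO does not propagate failures, the forked thread catches its own exceptions and reports them as an event.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Exception (SomeException, try)
import Control.Monad (forM_, replicateM)

-- Hypothetical event type forwarded from the handler thread.
data Event = Received String | HandlerDied String
    deriving (Eq, Show)

-- Stand-in for the subscription handler: forwards each message to the
-- channel. A real handler would be driven by the subscription instead.
handler :: Chan Event -> [String] -> IO ()
handler events msgs = forM_ msgs (writeChan events . Received)

-- Fork the handler and turn a crash into an event, since forkIO alone
-- would not notify the parent thread.
forkHandler :: Chan Event -> [String] -> IO ()
forkHandler events msgs = do
    _ <- forkIO $ do
        r <- try (handler events msgs)
        case r of
            Left e   -> writeChan events (HandlerDied (show (e :: SomeException)))
            Right () -> pure ()
    pure ()

demo :: IO [Event]
demo = do
    events <- newChan
    forkHandler events ["a", "b"]
    replicateM 2 (readChan events)
```

The main thread can then publish in response to each `Received` event from its own Redis context, keeping publishing and subscription handling on separate threads.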

Collecting the Async results as they become available

How would you collect the results of a list of Async a in Haskell as they become available? The idea is to start processing the results of asynchronous tasks as soon as they are available.
The best I could come up with is the following function:
collect :: [Async a] -> IO [a]
collect [] = return []
collect asyncs = do
    (a, r) <- waitAny asyncs
    rs <- collect (filter (/= a) asyncs)
    return (r:rs)
However, this function does not exhibit the desired behavior since, as pointed out in the comment below, it doesn't return until all the asynchronous tasks have completed. Furthermore, collect runs in O(n^2) time since I'm filtering the list at each recursive step. This could be improved by using a more efficient structure (and maybe indexing the position of the Async values in the list).
Maybe there are library functions that take care of this, but I could not find them in the Control.Concurrent.Async module and I wonder why.
EDIT: after thinking about the problem a bit more carefully, I'm wondering whether such a function is a good idea. I could just use fmap on the asynchronous tasks. Maybe it is better practice to wait for the results only when there is no other choice.

As I mentioned in my other answer, streaming results out of a list of Asyncs as they become available is best achieved using a stream processing library. Here's an example using pipes.
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async
import Control.Concurrent.STM
import Data.Functor (($>))
import Pipes
import Pipes.Concurrent -- from the pipes-concurrency package
import qualified Pipes.Prelude as P
asCompleted :: MonadIO m => [Async a] -> Producer a m ()
asCompleted asyncs = do
    (o, i, seal) <- liftIO $ spawn' unbounded
    liftIO $ forkIO $ do
        forConcurrently asyncs (\async -> atomically $ waitSTM async >>= send o)
        atomically seal
    fromInput i
main = do
    actions <- traverse async [threadDelay 2000000 $> "bar", threadDelay 1000000 $> "foo"]
    runEffect $ asCompleted actions >-> P.print
    -- after one second, prints "foo", then "bar" a second later
Using pipes-concurrency, we spawn' an Output-Input pair and immediately convert the Input to a Producer using fromInput. Asynchronously, we send items as they become available. When all the Asyncs have completed we seal the inbox to close down the Producer.

Here is an implementation using TChan. I also wrote a version that reacts to each result immediately, but it is more complex and might have problems with exceptions (if you want to receive exceptions, use SlaveThread.fork instead of forkIO), so I have left that code commented out in case you're not interested in it:
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.Async
import Control.Concurrent.STM
import Control.Monad
collect :: [Async a] -> IO [a]
collect = atomically . collectSTM
collectSTM :: [Async a] -> STM [a]
collectSTM as = do
    c <- newTChan
    collectSTMChan c as
collectSTMChan :: TChan a -> [Async a] -> STM [a]
collectSTMChan chan as = do
    mapM_ (waitSTM >=> writeTChan chan) as
    replicateM (length as) (readTChan chan)
main :: IO ()
main = do
    a1 <- async (threadDelay 2000000 >> putStrLn "slept 2 secs" >> return 2)
    a2 <- async (threadDelay 3000000 >> putStrLn "slept 3 secs" >> return 3)
    a3 <- async (threadDelay 1000000 >> putStrLn "slept 1 sec" >> return 1)
    res <- collect [a1,a2,a3]
    putStrLn (show res)
    -- -- reacting immediately
    -- a1 <- async (threadDelay 2000000 >> putStrLn "slept 2 secs" >> return 2)
    -- a2 <- async (threadDelay 3000000 >> putStrLn "slept 3 secs" >> return 3)
    -- a3 <- async (threadDelay 1000000 >> putStrLn "slept 1 sec" >> return 1)
    -- c <- collectChan [a1,a2,a3]
    -- replicateM_ 3 (atomically (readTChan c) >>= \v -> putStrLn ("Received: " ++ show v))
-- collectChan :: [Async a] -> IO (TChan a)
-- collectChan as = do
--     c <- newTChanIO
--     forM_ as $ \a -> forkIO ((atomically . (waitSTM >=> writeTChan c)) a)
--     return c

I'm reading your question as "is it possible to sort a list of Asyncs by their completion time?". If that's what you meant, the answer is yes.
import Control.Applicative (liftA2)
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async
import Data.Functor (($>))
import Data.List (sortBy)
import Data.Ord (comparing)
import Data.Time (getCurrentTime)
sortByCompletion :: [Async a] -> IO [a]
sortByCompletion = fmap (fmap fst . sortBy (comparing snd)) . mapConcurrently withCompletionTime
  where withCompletionTime async = liftA2 (,) (wait async) getCurrentTime
main = do
    asyncs <- traverse async [threadDelay 2000000 $> "bar", threadDelay 1000000 $> "foo"]
    sortByCompletion asyncs
    -- ["foo", "bar"], after two seconds
Using mapConcurrently we wait for each Async on a separate thread. Upon completion we get the current time - the time at which the Async completed - and use it to sort the results. This is O(n log n) complexity because we are sorting the list. (Your original algorithm was effectively a selection sort.)
Like your collect, sortByCompletion doesn't return until all the Asyncs in the list have completed. If you wanted to stream results onto the main thread as they become available, well, lists aren't a very good tool for that. I'd use a streaming abstraction like conduit or pipes, or, working at a lower level, a TQueue. See my other answer for an example.
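The lower-level TQueue option can be sketched with only base and stm. For simplicity this example forks plain IO actions rather than taking a list of Async values, and `spawnAll` is a hypothetical helper name. Each worker pushes its result onto the queue when it finishes, so the consumer sees results in completion order and can process each one as soon as it is read.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.STM (TQueue, atomically, newTQueueIO, readTQueue, writeTQueue)
import Control.Monad (forM_, replicateM)

-- Fork each task; every worker pushes its result onto the queue as soon
-- as it finishes, so the queue fills in completion order.
-- Assumption: tasks do not throw; a failing task would leave the
-- consumer waiting forever.
spawnAll :: [IO a] -> IO (TQueue a)
spawnAll tasks = do
    q <- newTQueueIO
    forM_ tasks $ \task -> forkIO (task >>= atomically . writeTQueue q)
    pure q

demo :: IO [String]
demo = do
    let tasks = [ threadDelay 300000 >> pure "bar"
                , threadDelay 50000 >> pure "foo" ]
    q <- spawnAll tasks
    -- Each result could be processed here the moment it is read.
    replicateM (length tasks) (atomically (readTQueue q))
```

To adapt this to existing Async values, each forked worker would call `wait` on one Async instead of running a task directly.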

Extending the IRC bot from wiki.haskell.org with state

Problem
I'm attempting to extend the IRC bot from https://wiki.haskell.org/Roll_your_own_IRC_bot with some state that's updated every time the bot posts a message in the channel it's connected to.
The feature is: every time the command !last said is issued in the IRC channel, the bot should respond with a time stamp. To support this, the privmsg function needs to update the bot's state -- specifically the lastPosted record -- with a new timestamp every time it is called.
Work so far
I took the code from the bottom of the Haskell wiki page (which used a ReaderT to access information about the bot's environment) and tried to change out the ReaderT for a State Transformer (StateT). The results are below and as you can see, I didn't get very far.
import Data.List
import Network
import System.IO
import System.Exit
import System.Time
import Control.Arrow
import Control.Monad.State
import Control.Exception
import Text.Printf
server = "irc.freenode.org"
port = 6667
chan = "#testbot-test"
nick = "testbottest"
-- The 'Net' monad, a wrapper over IO, carrying the bot's state.
type Net = StateT Bot IO
data Bot = Bot { socket :: Handle, lastPosted :: ClockTime }
-- Set up actions to run on start and end, and run the main loop
main :: IO ()
main = bracket connect disconnect loop
  where
    disconnect = hClose . socket
    loop st = runStateT run st
-- Connect to the server and return the initial bot state
connect :: IO Bot
connect = notify $ do
    h <- connectTo server (PortNumber (fromIntegral port))
    t <- getClockTime
    hSetBuffering h NoBuffering
    return (Bot h t)
  where
    notify a = bracket_
        (printf "Connecting to %s ... " server >> hFlush stdout)
        (putStrLn "done.")
        a
-- We're in the Net monad now, so we've connected successfully
-- Join a channel, and start processing commands
run :: Net ()
run = do
    write "NICK" nick
    write "USER" (nick ++ " 0 * :test bot")
    write "JOIN" chan
    gets socket >>= listen
-- Process each line from the server
listen :: Handle -> Net ()
listen h = forever $ do
    s <- init `fmap` liftIO (hGetLine h)
    liftIO (putStrLn s)
    if ping s then pong s else eval (clean s)
  where
    forever a = a >> forever a
    clean = drop 1 . dropWhile (/= ':') . drop 1
    ping x = "PING :" `isPrefixOf` x
    pong x = write "PONG" (':' : drop 6 x)
-- Dispatch a command
eval :: String -> Net ()
eval "!quit" = write "QUIT" ":Exiting" >> liftIO (exitWith ExitSuccess)
-- Posting when something was last posted shouldn't count as last posted.
eval "!last said" = getLastPosted >>= (\t -> write "PRIVMSG" (chan ++ " :" ++ t))
eval x | "!id " `isPrefixOf` x = privmsg (drop 4 x)
eval _ = return () -- ignore everything else
getLastPosted :: Net String
getLastPosted = do
    t <- gets lastPosted
    return $ show t
-- Send a privmsg to the current chan + server
privmsg :: String -> Net ()
privmsg s = write "PRIVMSG" (chan ++ " :" ++ s)
-- Send a message out to the server we're currently connected to
write :: String -> String -> Net ()
write s t = do
    h <- gets socket
    liftIO $ hPrintf h "%s %s\r\n" s t
    liftIO $ printf "> %s %s\n" s t
Other support avenues explored
Spent a couple of days reading up on ReaderT, StateT and their non-transformer friends Reader and State.
Checked Stack Overflow for anyone with a similar problem, but the only other IRC bot question threaded the socket as an argument to every function that needed it (instead of using a ReaderT).
Tweeted Don S., the original author of the wiki page.
Asked in the Haskell IRC channel.
Question
How can the Haskell wiki IRC bot be extended to post a message, containing the date and time stamp of the last message posted? Preferably using an abstraction like ReaderT (only allowing mutable state) rather than passing state around in function arguments.

I got your code to compile by simply adding a >> return () to the definition of loop in your main:
main :: IO ()
main = bracket connect disconnect loop
  where
    disconnect = hClose . socket
    loop st = (runStateT run st) >> return ()
This effectively ignores the return value of runStateT. Here are all of the variants of runState/runStateT:
runStateT - return both the final state and returned value
evalStateT - return only the final value
execStateT - return only the final state
Your original definition of loop was returning a pair (from runStateT), and this didn't type check since main wants a computation which returns just ().
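The difference between the three runners is easiest to see on a tiny action. This is a minimal sketch using mtl's `Control.Monad.State`; the `tick` action is a hypothetical example, not part of the bot.

```haskell
import Control.Monad.State (StateT, evalStateT, execStateT, get, put, runStateT)

-- A tiny action that returns the old counter and stores the incremented one.
tick :: StateT Int IO Int
tick = do
    n <- get
    put (n + 1)
    pure n

demo :: IO ((Int, Int), Int, Int)
demo = do
    both   <- runStateT tick 10   -- (returned value, final state)
    result <- evalStateT tick 10  -- returned value only
    final  <- execStateT tick 10  -- final state only
    pure (both, result, final)
```

Here `demo` yields `((10, 11), 10, 11)`. Note that runStateT returns the value first and the state second.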
To update the lastPosted field, consider this addition to the eval function which is triggered when the bot is sent the message !update time:
eval "!update time"
    = do t <- liftIO getClockTime
         bot <- get
         put (bot { lastPosted = t })
We need to liftIO getClockTime since we are operating in the Net monad.
Then we get the old state and put the updated state. You can add this logic wherever you want to update the lastPosted time in the Net monad.
Full code is available at: http://lpaste.net/142931
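To get the behaviour asked for in the question, the same get/put update can live inside privmsg itself. The following is a self-contained sketch under some assumptions: it uses `Data.Time.Clock` instead of the question's System.Time, drops the socket handle, and records sent lines in a list so it runs without a network connection. The `sent` field and `lastSaid` are hypothetical names.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.State (StateT, gets, modify, runStateT)
import Data.Time.Clock (UTCTime, getCurrentTime)

-- Simplified bot state: no socket; sent lines are kept in a list.
data Bot = Bot { lastPosted :: Maybe UTCTime, sent :: [String] }

type Net = StateT Bot IO

-- Every call records the message and refreshes the timestamp.
privmsg :: String -> Net ()
privmsg s = do
    t <- liftIO getCurrentTime
    modify $ \bot -> bot { lastPosted = Just t, sent = sent bot ++ [s] }

-- Reporting the timestamp bypasses privmsg, so the report itself does
-- not count as a post.
lastSaid :: Net String
lastSaid = gets (maybe "nothing posted yet" show . lastPosted)

demo :: IO (String, [String], Bool)
demo = do
    (before, bot) <- runStateT (lastSaid <* privmsg "hello") (Bot Nothing [])
    pure (before, sent bot, lastPosted bot /= Nothing)
```

In the real bot, privmsg would also call write, and the `!last said` case would send the result of `lastSaid` with write directly, as the question's eval already does.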

Limiting pipes based on time?

Is it possible to create pipes that get all values that have been sent downstream in a certain time period? I'm implementing a server where the protocol allows me to concatenate outgoing packets and compress them together, so I'd like to effectively "empty out" the queue of downstream ByteStrings every 100ms and mappend them together to then yield on to the next pipe which does the compression.

Here's a solution using pipes-concurrency. You give it any Input and it will periodically drain the input of all values:
import Control.Applicative ((<|>))
import Control.Concurrent (threadDelay)
import Data.Foldable (forM_)
import Pipes
import Pipes.Concurrent
drainAll :: Input a -> STM (Maybe [a])
drainAll i = do
    ma <- recv i
    case ma of
        Nothing -> return Nothing
        Just a  -> loop (a:)
  where
    loop diffAs = do
        ma <- recv i <|> return Nothing
        case ma of
            Nothing -> return (Just (diffAs []))
            Just a  -> loop (diffAs . (a:))
bucketsEvery :: Int -> Input a -> Producer [a] IO ()
bucketsEvery microseconds i = loop
  where
    loop = do
        lift $ threadDelay microseconds
        ma <- lift $ atomically $ drainAll i
        forM_ ma $ \a -> do
            yield a
            loop
This gives you much greater control over how you consume elements from upstream, by selecting the type of Buffer you use to build the Input.
If you're new to pipes-concurrency, you can read the tutorial which explains how to use spawn, Buffer and Input.
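The periodic-drain idea itself does not depend on pipes. As a rough illustration, here is the same technique with a plain `TQueue` from the stm package, where `flushTQueue` empties the queue in one atomic step. This is a swapped-in simplification, not the pipes-concurrency API: `drainEvery` is a hypothetical name, and unlike drainAll it does not block until at least one element is available, so it can return an empty bucket.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.STM (TQueue, atomically, flushTQueue, newTQueueIO, writeTQueue)

-- Wait for the interval, then take everything queued so far in one
-- atomic step. An empty list means nothing arrived during the interval.
drainEvery :: Int -> TQueue a -> IO [a]
drainEvery microseconds q = do
    threadDelay microseconds
    atomically (flushTQueue q)

demo :: IO ([String], [String])
demo = do
    q <- newTQueueIO
    mapM_ (atomically . writeTQueue q) ["a", "b", "c"]
    first  <- drainEvery 100000 q   -- everything written so far, in order
    second <- drainEvery 100000 q   -- nothing new arrived
    pure (first, second)
```

For the use case in the question, each non-empty bucket would be concatenated with mconcat before being handed to the compression step.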

Here is a possible solution. It is based on a Pipe that tags ByteStrings going downstream with a Bool, in order to identify ByteStrings belonging to the same "time bucket".
First, some imports:
{-# LANGUAGE OverloadedStrings #-}
import Data.AdditiveGroup
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Builder as BB
import Data.Thyme.Clock
import Data.Thyme.Clock.POSIX
import Control.Monad.State.Strict
import Control.Lens (view)
import Control.Concurrent (threadDelay)
import Pipes
import Pipes.Lift
import qualified Pipes.Prelude as P
import qualified Pipes.Group as PG
Here is the tagging Pipe. It uses StateT internally:
tagger :: Pipe B.ByteString (B.ByteString,Bool) IO ()
tagger = do
    startTime <- liftIO getPOSIXTime
    evalStateP (startTime,False) $ forever $ do
        b <- await
        currentTime <- liftIO getPOSIXTime
        -- (POSIXTime,Bool) inner state
        (baseTime,tag) <- get
        if (currentTime ^-^ baseTime > timeLimit)
            then let tag' = not tag in
                 yield (b,tag') >> put (currentTime, tag')
            else yield $ (b,tag)
  where
    timeLimit = fromSeconds 0.1
Then we can use functions from the pipes-group package to group ByteStrings belonging to the same "time bucket" into lazy ByteStrings:
batch :: Producer B.ByteString IO () -> Producer BL.ByteString IO ()
batch producer = PG.folds (<>) mempty BB.toLazyByteString
               . PG.maps (flip for $ yield . BB.byteString . fst)
               . view (PG.groupsBy $ \t1 t2-> snd t1 == snd t2)
               $ producer >-> tagger
It seems to batch correctly. This program:
main :: IO ()
main = do
    count <- P.length $ batch (yield "boo" >> yield "baa")
    putStrLn $ show count
    count <- P.length $ batch (yield "boo" >> yield "baa"
                 >> liftIO (threadDelay 200000) >> yield "ddd")
    putStrLn $ show count
Has the output:
1
2
Notice that the contents of a "time bucket" are only yielded when the first element of the next bucket arrives. They are not yielded automatically every 100ms. This may or may not be a problem for you. If you want to yield automatically every 100ms, you would need a different solution, possibly based on pipes-concurrency.
Also, you could consider working directly with the FreeT-based "effectful lists" provided by pipes-group. That way you could start compressing the data in a "time bucket" before the bucket is full.

Unlike Daniel's answer, mine does not tag the data as it is produced. It just takes at least one element from upstream and then continues to aggregate more values in the monoid until the time interval has passed.
This code uses a list to aggregate, but there are better monoids to aggregate with.
import Pipes
import qualified Pipes.Prelude as P
import Data.Time.Clock
import Data.Time.Calendar
import Data.Time.Format
import Data.Monoid
import Control.Monad
-- taken from pipes-rt
doubleToNomDiffTime :: Double -> NominalDiffTime
doubleToNomDiffTime x =
    let d0 = ModifiedJulianDay 0
        t0 = UTCTime d0 (picosecondsToDiffTime 0)
        t1 = UTCTime d0 (picosecondsToDiffTime $ floor (x/1e-12))
    in diffUTCTime t1 t0
-- Adapted from pipes-parse-1.0
wrap
    :: Monad m =>
       Producer a m r -> Producer (Maybe a) m r
wrap p = do
    p >-> P.map Just
    forever $ yield Nothing
yieldAggregateOverTime
    :: (Monoid y,  -- monoid dependence so we can do aggregation
        MonadIO m  -- to be able to get the current time the
                   -- base monad must have access to IO
       ) =>
       (t -> y)    -- Change element from upstream to monoid
    -> Double      -- Time in seconds to aggregate over
    -> Pipe (Maybe t) y m ()
yieldAggregateOverTime wrap period = do
    t0 <- liftIO getCurrentTime
    loop mempty (dtUTC `addUTCTime` t0)
  where
    dtUTC = doubleToNomDiffTime period
    loop m ts = do
        t <- liftIO getCurrentTime
        v0 <- await -- await at least one element
        case v0 of
            Nothing -> yield m
            Just v -> do
                if t > ts
                    then do
                        yield (m <> wrap v)
                        loop mempty (dtUTC `addUTCTime` ts)
                    else do
                        loop (m <> wrap v) ts
main = do
    runEffect $ wrap (each [1..]) >-> yieldAggregateOverTime (\x -> [x]) (0.0001)
        >-> P.take 10 >-> P.print
Depending on CPU load, the output data will be aggregated differently, with at least one element in each chunk.
$ ghc Main.hs -O2
$ ./Main
[1,2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
$ ./Main
[1,2]
[3]
[4]
[5]
[6,7,8,9,10]
[11,12,13,14,15,16,17,18]
[19,20,21,22,23,24,25,26]
[27,28,29,30,31,32,33,34]
[35,36,37,38,39,40,41,42]
[43,44,45,46,47,48,49,50]
$ ./Main
[1,2,3,4,5,6]
[7]
[8]
[9,10,11,12,13,14,15,16,17,18,19,20]
[21,22,23,24,25,26,27,28,29,30,31,32,33]
[34,35,36,37,38,39,40,41,42,43,44]
[45,46,47,48,49,50,51,52,53,54,55]
[56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72]
[73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88]
[89,90,91,92,93,94,95,96,97,98,99,100,101,102,103]
$ ./Main
[1,2,3,4,5,6,7]
[8]
[9]
[10,11,12,13,14,15,16,17,18]
[19,20,21,22,23,24,25,26,27]
[28,29,30,31,32,33,34,35,36,37]
[38,39,40,41,42,43,44,45,46]
[47,48,49,50]
[51,52,53,54,55,56,57]
[58,59,60,61,62,63,64,65,66]
You might want to look at the source code of pipes-rt; it shows one approach to dealing with time in pipes.
edit: Thanks to Daniel Díaz Carrete, adapted pipes-parse-1.0 technique to handle upstream termination. A pipes-group solution should be possible using the same technique as well.

Pipes and callbacks in Haskell

I'm processing some audio using portaudio. The haskell FFI bindings call a user defined callback whenever there's audio data to be processed. This callback should be handled very quickly and ideally with no I/O. I wanted to save the audio input and return quickly since my application doesn't need to react to the audio in realtime (right now I'm just saving the audio data to a file; later I'll construct a simple speech recognition system).
I like the idea of pipes and thought I could use that library. The problem is that I don't know how to create a Producer that returns data that came in through a callback.
How do I handle my use case?
Here's what I'm working with right now, in case that helps (the datum mvar isn't working right now but I don't like storing all the data in a seq... I'd rather process it as it came instead of just at the end):
{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}
module Main where
import Codec.Wav
import Sound.PortAudio
import Sound.PortAudio.Base
import Sound.PortAudio.Buffer
import Foreign.Ptr
import Foreign.ForeignPtr
import Foreign.C.Types
import Foreign.Storable
import qualified Data.StorableVector as SV
import qualified Data.StorableVector.Base as SVB
import Control.Exception.Base (evaluate)
import Data.Int
import Data.Sequence as Seq
import Control.Concurrent
instance Buffer SV.Vector a where
    fromForeignPtr fp = return . SVB.fromForeignPtr fp
    toForeignPtr = return . (\(a, b, c) -> (a, c)) . SVB.toForeignPtr
-- | Wrap a buffer callback into the generic stream callback type.
buffCBtoRawCB' :: (StreamFormat input, StreamFormat output, Buffer a input, Buffer b output) =>
    BuffStreamCallback input output a b -> StreamCallback input output
buffCBtoRawCB' func = \a b c d e -> do
    fpA <- newForeignPtr_ d -- We will not free, as callback system will do that for us
    fpB <- newForeignPtr_ e -- We will not free, as callback system will do that for us
    storeInp <- fromForeignPtr fpA (fromIntegral $ 1 * c)
    storeOut <- fromForeignPtr fpB (fromIntegral $ 0 * c)
    func a b c storeInp storeOut
callback :: MVar (Seq.Seq [Int32]) -> PaStreamCallbackTimeInfo -> [StreamCallbackFlag] -> CULong
    -> SV.Vector Int32 -> SV.Vector Int32 -> IO StreamResult
callback seqmvar = \timeinfo flags numsamples input output -> do
    putStrLn $ "timeinfo: " ++ show timeinfo ++ "; flags are " ++ show flags ++ " in callback with " ++ show numsamples ++ " samples."
    print input
    -- write data to output
    --mapM_ (uncurry $ pokeElemOff output) $ zip (map fromIntegral [0..(numsamples-1)]) datum
    --print "wrote data"
    input' <- evaluate $ SV.unpack input
    modifyMVar_ seqmvar (\s -> return $ s Seq.|> input')
    case flags of
        [] -> return $ if unPaTime (outputBufferDacTime timeinfo) > 0.2 then Complete else Continue
        _ -> return Complete
done doneMVar = do
    putStrLn "total done dood!"
    putMVar doneMVar True
    return ()
main = do
    let samplerate = 16000
    Nothing <- initialize
    print "initialized"
    m <- newEmptyMVar
    datum <- newMVar Seq.empty
    Right s <- openDefaultStream 1 0 samplerate Nothing (Just $ buffCBtoRawCB' (callback datum)) (Just $ done m)
    startStream s
    _ <- takeMVar m -- wait until our callbacks decide they are done!
    Nothing <- terminate
    print "let's see what we've recorded..."
    stuff <- takeMVar datum
    print stuff
    -- write out wav file
    -- let datum =
    --     audio = Audio { sampleRate = samplerate
    --                   , channelNumber = 1
    --                   , sampleData = datum
    --                   }
    -- exportFile "foo.wav" audio
    print "main done"

The simplest solution is to use MVars to communicate between the callback and Producer. Here's how:
import Control.Proxy
import Control.Concurrent.MVar
fromMVar :: (Proxy p) => MVar (Maybe a) -> () -> Producer p a IO ()
fromMVar mvar () = runIdentityP loop where
    loop = do
        ma <- lift $ takeMVar mvar
        case ma of
            Nothing -> return ()
            Just a  -> do
                respond a
                loop
Your stream callback will write Just input to the MVar and your finalization callback will write Nothing to terminate the Producer.
Here's a ghci example demonstrating how it works:
>>> mvar <- newEmptyMVar :: IO (MVar (Maybe Int))
>>> forkIO $ runProxy $ fromMVar mvar >-> printD
>>> putMVar mvar (Just 1)
1
>>> putMVar mvar (Just 2)
2
>>> putMVar mvar Nothing
>>> putMVar mvar (Just 3)
>>>
Edit: The pipes-concurrency library now provides this feature, and it even has a section in the tutorial explaining specifically how to use it to get data out of callbacks.
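The MVar handoff can also be tried without any pipes dependency. The sketch below uses only base; `consume`, `onData`, and `onDone` are hypothetical names standing in for the Producer loop, the stream callback, and the finalization callback respectively.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Consumer loop: handle each value until the Nothing sentinel arrives.
consume :: MVar (Maybe a) -> (a -> IO ()) -> IO ()
consume mvar handle = loop
  where
    loop = do
        ma <- takeMVar mvar
        case ma of
            Nothing -> pure ()
            Just a  -> handle a >> loop

-- Stand-in for the stream callback: hand the data over and return.
onData :: MVar (Maybe a) -> a -> IO ()
onData mvar = putMVar mvar . Just

-- Stand-in for the finalization callback: signal end of stream.
onDone :: MVar (Maybe a) -> IO ()
onDone mvar = putMVar mvar Nothing

demo :: IO [Int]
demo = do
    mvar <- newEmptyMVar
    seen <- newIORef []
    done <- newEmptyMVar
    _ <- forkIO $ do
        consume mvar (\x -> modifyIORef' seen (x :))
        putMVar done ()
    mapM_ (onData mvar) [1, 2, 3]
    onDone mvar
    takeMVar done          -- wait for the consumer to finish
    reverse <$> readIORef seen
```

One caveat of this design: putMVar blocks while the previous value has not been taken yet, so a slow consumer can stall the callback. For a callback that must return quickly, an unbounded queue may be a better fit.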
