Buffered-mode MPI does not work as I understand it - python-3.x

I have multiple producer processes and one consumer process. Each process is launched by the MPIPoolExecutor class. First I launch the consumer process, and then I start launching producer processes using the starmap method. The consumer receives the data and saves it to the hard drive. Each producer process creates a buffer the same size as the data to be sent and sends the data with the buffered-mode method bsend. I am expecting each producer process to dump its data into the buffer and exit. However, I am noticing a delay where it looks like each producer process waits for the data to be consumed by the consumer process. What am I missing? My code goes like this:
def consumer(args...):
    comm = MPI.COMM_WORLD
    file = tb.open_file(file_name, 'w')
    filters = tb.Filters(complevel=5, complib='blosc')
    array = file.create_carray(file.root, 'data', tb.Float32Atom(),
                               shape=(n_, n_), filters=filters)
    for i in range(num_tasks):
        t = time.time()
        idxs, data = comm.recv(source=MPI.ANY_SOURCE)
        print("time for waiting --consumer ", time.time() - t)
        array[idxs, :] = data
def producer(args...):
    comm = MPI.COMM_WORLD
    # adding 1000 just to be on the safe side.
    mem = MPI.Alloc_mem(data.nbytes + idxs.nbytes + 1000)
    MPI.Attach_buffer(mem)
    # Since the consumer is launched first, it is guaranteed to get a rank of 1.
    comm.bsend([idxs, data], dest=1)
    MPI.Detach_buffer()
    ....
with MPIPoolExecutor() as executor:
    executor.starmap(consumer, [(args)])
    executor.starmap(producer, list_of_args)

If the consumer is launched first, it gets rank zero, not 1. You are also misunderstanding buffered communication: bsend only guarantees that the message has been copied into the attached buffer, while the buffer detach call blocks until every message in the buffer has been completed, so that is where your producers end up waiting. If you want the producer to return immediately, use MPI_Isend.
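For illustration, a minimal mpi4py sketch of the Isend route. The idxs/data names come from the question; dest=0 follows the answer's point about the consumer's rank, but the right value depends on how the pool maps tasks to ranks.
from mpi4py import MPI

def producer(idxs, data):
    comm = MPI.COMM_WORLD
    # isend returns immediately; no user-managed buffer needed
    req = comm.isend((idxs, data), dest=0)
    # ... other work can proceed while the message is in flight ...
    req.wait()  # complete the send before the local objects go away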

Related

Separating AioRTC datachannel into multiple threads

I have a two-way datachannel setup that takes a heartbeat from a browser client and keeps the session alive as long as the heartbeat continues. The heartbeat is the 'main' communication for WebRTC, but I have other bits of info (such as coordinates) I need to send constantly.
To do this, when a WebRTC offer is given, it takes that HTTP request:
Creates a new event loop 'rtcloop'
Sets that as the main event loop.
Then runs 'rtcloop' until complete, calling my webRtcStart function and passing through the session info.
Then runs a new thread with the target being 'rtcloop', run it forever, and starts it.
Inside the new thread I set the loop with 'get_event_loop' and later register the '@webRtcPeer.on("datachannel")' handler, so when we get a datachannel message, we run code around that. Depending on the situation, I attempt to do the following:
ptzcoords = 'Supported' #PTZ Coords will be part of WebRTC Communication, send every 0.5 seconds.
ptzloop = asyncio.new_event_loop()
ptzloop.run_until_complete(updatePTZReadOut(webRtcPeer, cameraName, loop))
ptzUpdateThread = Thread(target=ptzloop.run_forever)
ptzUpdateThread.start()
The constant error I get no matter how I structure things is "coroutine 'updatePTZReadOut' was never awaited"
With updatePTZReadOut being:
async def updatePTZReadOut(rtcPeer, cameraName, eventLoop):
    # Get Camera Info
    # THE CURRENT ISSUE I am having is with the event loops, because this gets called to run in another thread, but it still needs
    # to be awaitable,
    # Current Warning Is: /usr/lib/python3.10/threading.py:953: RuntimeWarning: coroutine 'updatePTZReadOut' was never awaited
    # Ref Article: https://xinhuang.github.io/posts/2017-07-31-common-mistakes-using-python3-asyncio.html
    # https://lucumr.pocoo.org/2016/10/30/i-dont-understand-asyncio/
    # Get current loop
    # try:
    loop = asyncio.set_event_loop(eventLoop)
    # loop.run_until_complete()
    # except RuntimeError:
    #     loop = asyncio.new_event_loop()
    #     asyncio.set_event_loop(loop)
    # Getting Current COORDS from camera
    myCursor.execute("Select * from localcameras where name = '{0}' ".format(cameraName))
    camtuple = myCursor.fetchall()
    camdata = camtuple[0]
    # Create channel object
    channel_local = rtcPeer.createDataChannel("chat")
    while True:
        ptzcoords = readPTZCoords(camdata[1], camdata[3], cryptocode.decrypt(str(camdata[4]), passwordRandomKey))
        print("Updating Coords to {0}".format(ptzcoords))
        # Publish Here
        await channel_local.send("TTTT")
        asyncio.sleep(0.5)
Any help here?
updatePTZReadOut is an async function, so calling it only creates a coroutine object; you need to await it (or otherwise schedule it on a running event loop) whenever you call it.
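As a minimal sketch of one way to do that in a dedicated thread (reusing the question's names; ptz_worker is a hypothetical helper): give the thread a target that creates its own loop and runs the coroutine to completion, so the coroutine is actually awaited.
import asyncio
from threading import Thread

def ptz_worker(webRtcPeer, cameraName):
    # each thread gets its own event loop
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    # run_until_complete awaits the coroutine, so the
    # "never awaited" warning disappears
    loop.run_until_complete(updatePTZReadOut(webRtcPeer, cameraName, loop))

ptzUpdateThread = Thread(target=ptz_worker, args=(webRtcPeer, cameraName))
ptzUpdateThread.start()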

Difference between threading.Condition and threading.Event

All the examples of threading.Condition and threading.Event I have found on the internet solve the producer-consumer problem, but that can be done with either of them. The only advantage of Condition I have found is that it supports two kinds of notification (notify and notify_all).
Here I have easily translated an existing Condition example to an Event one; the Condition example is taken from here.
Example using Condition:
import logging
import threading
import time

logging.basicConfig(level=logging.DEBUG,
                    format='(%(threadName)-10s) %(message)s')

def consumer(cv):
    logging.debug('Consumer thread started ...')
    with cv:
        logging.debug('Consumer waiting ...')
        cv.wait()
        logging.debug('Consumer consumed the resource')

def producer(cv):
    logging.debug('Producer thread started ...')
    with cv:
        logging.debug('Making resource available')
        logging.debug('Notifying to all consumers')
        cv.notify_all()

if __name__ == '__main__':
    condition = threading.Condition()
    cs1 = threading.Thread(name='consumer1', target=consumer, args=(condition,))
    cs2 = threading.Thread(name='consumer2', target=consumer, args=(condition,))
    pd = threading.Thread(name='producer', target=producer, args=(condition,))
    cs1.start()
    time.sleep(2)
    cs2.start()
    time.sleep(2)
    pd.start()
Example using Event:
def consumer(ev):
    logging.debug('Consumer thread started ...')
    logging.debug('Consumer waiting ...')
    ev.wait()
    logging.debug('Consumer consumed the resource')

def producer(ev):
    logging.debug('Producer thread started ...')
    logging.debug('Making resource available')
    logging.debug('Notifying to all consumers')
    ev.set()

if __name__ == '__main__':
    event = threading.Event()
    cs1 = threading.Thread(name='consumer1', target=consumer, args=(event,))
    cs2 = threading.Thread(name='consumer2', target=consumer, args=(event,))
    pd = threading.Thread(name='producer', target=producer, args=(event,))
    cs1.start()
    time.sleep(2)
    cs2.start()
    time.sleep(2)
    pd.start()
Output:
(consumer1) Consumer thread started ...
(consumer1) Consumer waiting ...
(consumer2) Consumer thread started ...
(consumer2) Consumer waiting ...
(producer ) Producer thread started ...
(producer ) Making resource available
(producer ) Notifying to all consumers
(consumer1) Consumer consumed the resource
(consumer2) Consumer consumed the resource
Can someone give an example which is possible by using Condition but not by Event?
Event has state: you can call e.is_set() to find out whether e has been set() or clear()ed.
A Condition has no state. If some thread happens to call c.notify() or c.notify_all() while other threads are waiting in c.wait(), then one or more of those other threads will be awakened. But if the notification comes at a moment when no other thread is waiting, then c.notify() and c.notify_all() do absolutely nothing at all.
In short: an Event is a complete communication mechanism that can be used to transmit information from one thread to another. But a Condition cannot, by itself, be reliably used for communication; it has to be incorporated into some larger scheme, as sketched below.
IMO, both classes are poorly named. A better name for Event would be Bit, because it stores one bit of information.
A better name for Condition would be Event: notifying a condition is an "event" in the true sense of the word. If you happen to be watching for it at the moment it happens, you learn something. But if you don't happen to be watching, you miss it altogether.
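A minimal sketch of such a larger scheme (the names are illustrative): the Condition guards shared state, and wait() sits in a predicate loop, so a notification that fires before the consumer starts waiting is not lost.
import threading

items = []                        # the state lives outside the Condition
cond = threading.Condition()

def consumer():
    with cond:
        while not items:          # re-check the predicate, so an early
            cond.wait()           # notify() is never missed
        print('consumed', items.pop())

def producer():
    with cond:
        items.append('resource')
        cond.notify()             # wake one waiting consumer, if any

threading.Thread(target=consumer).start()
threading.Thread(target=producer).start()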

Multiprocess : Persistent Pool?

I have code like the one below:
def expensive(self, c, v):
    .....

def inner_loop(self, c, collector):
    self.db.query('SELECT ...', (c,))
    for v in self.db.cursor.fetchall():
        collector.append(self.expensive(c, v))

def method(self):
    # create a Pool
    # join the Pool ??
    self.db.query('SELECT ...')
    for c in self.db.cursor.fetchall():
        collector = []
        # RUN the whole cycle in parallel in separate processes
        self.inner_loop(c, collector)
        # do stuff with the collector
    #! close the pool ?
Both the outer and the inner loop are thousands of steps.
I think I understand how to run a Pool of a couple of processes; all the examples I found show more or less that.
But in my case I need to launch a persistent Pool and then feed it the data (c-values). Once an inner-loop process has finished, I have to supply the next available c-value,
keeping the processes running and collecting the results.
How do I do that?
A clunky idea I have is:
def method(self):
    ws = 4
    with Pool(processes=ws) as pool:
        cs = []
        for i, c in enumerate(..):
            cs.append(c)
            if i % ws == 0:
                res = [pool.apply(self.inner_loop, (c,)) for i in range(ws)]
                cs = []
                collector.append(res)
Will this keep the same pool running, i.e. not launch a new process every time?
Do I need the 'if i % ws == 0' part, or can I just use imap() or map_async(), and the Pool object will block the loop when the available workers are exhausted and continue when some are freed?
Yes, the way that multiprocessing.Pool works is:
Worker processes within a Pool typically live for the complete duration of the Pool’s work queue.
So simply submitting all your work to the pool via imap should be sufficient:
with Pool(processes=4) as pool:
    initial_results = db.fetchall("SELECT c FROM outer")
    results = list(pool.imap(self.inner_loop, initial_results))
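A slightly fuller hedged sketch (assuming inner_loop is refactored to take a single c and return its collected results instead of appending to a passed-in list): imap_unordered lazily pulls c-values as workers free up, so the same pool stays alive for the whole loop and no manual 'i % ws' batching is needed.
from multiprocessing import Pool

def method(self):
    # inside the same class as the question's method
    self.db.query('SELECT ...')
    cs = [row[0] for row in self.db.cursor.fetchall()]
    with Pool(processes=4) as pool:
        # results arrive as workers finish; the pool blocks the loop
        # only while every worker is busy
        for res in pool.imap_unordered(self.inner_loop, cs, chunksize=8):
            ...  # do stuff with each result as it arrives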
That said, if you really are doing this to fetch things from the DB, it may make more sense to move more processing down into that layer (bring the computation to the data rather than bringing the data to the computation).

Torch - Multithreading to load tensors into a queue for training purposes

I would like to use the threads library (or perhaps parallel) for loading/preprocessing data into a queue, but I am not entirely sure how it works. In summary:
Load data (tensors), preprocess the tensors (this takes time, hence why I am here) and put them in a queue. I would like to have as many threads as possible doing this so that the model is never, or only briefly, left waiting.
For the tensor at the top of the queue, extract it, forward it through the model, and remove it from the queue.
I don't really understand the example in https://github.com/torch/threads enough. A hint or example as to where I would load data into the queue and train would be great.
EDIT 14/03/2016
In this example "https://github.com/torch/threads/blob/master/test/test-low-level.lua" using low-level threads, does anyone know how I can extract data from these threads into the main thread?
Look at this multi-threaded data provider:
https://github.com/soumith/dcgan.torch/blob/master/data/data.lua
It runs this file in the thread:
https://github.com/soumith/dcgan.torch/blob/master/data/data.lua#L18
by calling it here:
https://github.com/soumith/dcgan.torch/blob/master/data/data.lua#L30-L43
And afterwards, if you want to queue a job into the thread, you provide two functions:
https://github.com/soumith/dcgan.torch/blob/master/data/data.lua#L84
The first one runs inside the thread, and the second one runs in the main thread after the first one completes.
Hopefully that makes it a bit more clear.
If Soumith's examples in the previous answer are not very easy to use, I suggest you build your own pipeline from scratch. Here is an example of two synchronized threads: one for writing data and one for reading it:
local t = require 'threads'
t.Threads.serialization('threads.sharedserialize')
local tds = require 'tds'
local dict = tds.Hash() -- only local variables work here, and only tables or tds.Hash()
dict[1] = torch.zeros(4)

local m1 = t.Mutex()
local m2 = t.Mutex()
local m1id = m1:id()
local m2id = m2:id()
m1:lock()

local pool = t.Threads(
   1,
   function(threadIdx)
   end
)

pool:addjob(
   function()
      local t = require 'threads'
      local m1 = t.Mutex(m1id)
      local m2 = t.Mutex(m2id)
      while true do
         m2:lock()
         dict[1] = torch.randn(4)
         m1:unlock()
         print('W ===> ')
         print(dict[1])
         collectgarbage()
         collectgarbage()
      end
      return __threadid
   end,
   function(id)
   end
)

-- Code executing on master:
local a = 1
while true do
   m1:lock()
   a = dict[1]
   m2:unlock()
   print('R --> ')
   print(a)
end
The two mutexes form a handshake: the writer can only write after the reader has released m2, and the reader can only read after the writer has released m1 (which starts out locked), so writes and reads strictly alternate.

In Scala/Playframework, how does "Future" and "Future.map" work under the hood?

The demo code is from here
object ProxyController extends Controller {
  def proxy = Action {
    val responseFuture: Future[Response] = WS.url("http://example.com").get()
    val resultFuture: Future[Result] = responseFuture.map { resp =>
      // Create a Result that uses the http status, body, and content-type
      // from the example.com Response
      Status(resp.status)(resp.body).as(resp.ahcResponse.getContentType)
    }
    Async(resultFuture)
  }
}
As I understand it, the workflow looks like this:
One of the listener threads (the threads that process HTTP requests), T1, executes the proxy action, running through the code from top to bottom. When it reaches WS.url("http://example.com").get(), T1 just delegates the web request to another (worker) thread W1 and goes to the next line. T1 then skips the body of the function passed to map, since that depends on a non-blocking I/O call that has not yet completed. Once T1 returns the AsyncResult, it moves on to process other requests.
Later on, the worker thread W1 finishes the web request to "http://example.com" and signals a listener thread T2, which may or may not be the same as T1. T2 starts to execute the responseFuture.map line and delegates the task to another worker thread W2. Once the task is delegated to W2, T2 moves on to process other requests.
Later on, the worker thread W2 finishes creating the result (the responseFuture.map line) and signals a listener thread T3. T3 gets the result from W2 and sends it to the client in some magic way (it looks magic because I have no idea how T3 knows the original client... is it done by closures?)
Is this the real workflow under the hood? If so, isn't all that thread communication too complex and inefficient? If not, what actually happens under the hood for the code above? Does anyone have ideas about this?
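A rough Python analogue of the mechanism (map_future and the other names are hypothetical; this is a sketch of the general callback technique, not of Play's internals): map does not park a thread waiting for the result; it registers a completion callback, and the callback schedules the mapped function on an executor.
from concurrent.futures import Future, ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=4)  # plays the ExecutionContext role

def map_future(fut, fn):
    """Return a Future that completes with fn applied to fut's result."""
    out = Future()
    def on_done(f):
        # runs when `fut` completes; no thread was blocked in the meantime
        executor.submit(lambda: out.set_result(fn(f.result())))
    fut.add_done_callback(on_done)
    return out

response_future = executor.submit(lambda: "response body")
result_future = map_future(response_future, str.upper)
print(result_future.result())  # RESPONSE BODY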
