How to reuse a multiprocessing pool? - python-3.x

At the bottom is the code I have now. It seems to work fine. However, I don't completely understand it. I thought that without .join(), I'd risk the code moving on to the next iteration of the for loop before the pool finishes executing. Wouldn't we need those three commented-out lines?
On the other hand, if I were to go with the .close() and .join() approach, is there any way to 'reopen' that closed pool instead of creating a new Pool(6) every time?
import multiprocessing as mp
import random as rdm
from statistics import stdev, mean
import time


def mesh_subset(population, n_chosen=5):
    chosen = rdm.choices(population, k=n_chosen)
    return mean(chosen)


if __name__ == '__main__':
    population = [x for x in range(20)]
    N_iteration = 10
    start_time = time.time()
    pool = mp.Pool(6)
    for i in range(N_iteration):
        print([round(x, 2) for x in population])
        print(stdev(population))
        # pool = mp.Pool(6)
        population = pool.map(mesh_subset, [population] * len(population))
        # pool.close()
        # pool.join()
    print('run time:', time.time() - start_time)

A pool of workers is a relatively costly thing to set up, so it should be done (if possible) only once, usually at the beginning of the script.
The pool.map command blocks until all the tasks are completed. After all, it returns a list of the results, which it couldn't do unless mesh_subset had been called on all the inputs and had returned a result for each. In contrast, methods like pool.apply_async do not block. apply_async returns an ApplyResult object with a get method that blocks until it obtains a result from a worker process.
pool.close sets the worker handler's state to CLOSE. This causes the handler to signal the workers to terminate.
The pool.join blocks until all the worker processes have been terminated.
So you don't need to call -- in fact you shouldn't call -- pool.close and pool.join until you are finished with the pool. Once the workers have been sent the signal to terminate (by pool.close), there is no way to "reopen" them. You would need to start a new pool instead.
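As an aside, mp.Pool also works as a context manager (exiting the with-block calls pool.terminate()), so a minimal sketch of creating the pool once and reusing it for every iteration, assuming the same mesh_subset as above, could look like this:
import multiprocessing as mp

if __name__ == '__main__':
    population = list(range(20))
    with mp.Pool(6) as pool:                  # the pool is created once
        for _ in range(10):
            population = pool.map(mesh_subset, [population] * len(population))
    # after the with-block the workers are terminated; reusing the pool
    # here would require creating a new one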
In your situation, since you do want the loop to wait until all the tasks are completed, there would be no advantage to using pool.apply_async instead of pool.map. But if you were to use pool.apply_async, you could obtain the same result as before by calling get instead of resorting to closing and restarting the pool:
# you could do this, but using pool.map is simpler
for i in range(N_iteration):
    apply_results = [pool.apply_async(mesh_subset, [population])
                     for _ in range(len(population))]
    # the call to result.get() blocks until its worker process (running
    # mesh_subset) returns a value
    population = [result.get() for result in apply_results]
When the loops complete, len(population) is unchanged.
If you did NOT want each loop to block until all the tasks are completed, you could use apply_async's callback feature:
N_pop = len(population)
result = []
for i in range(N_iteration):
    for j in range(N_pop):
        pool.apply_async(mesh_subset, [population],
                         callback=result.append)
pool.close()
pool.join()
print(result)
Now, whenever a mesh_subset call returns a return_value, result.append(return_value) is called. The calls to apply_async do not block, so all N_iteration * N_pop tasks are pushed into the pool's task queue at once. But since the pool has 6 workers, at most 6 calls to mesh_subset are running at any given time. As the workers complete the tasks, whichever worker finishes first calls result.append(return_value), so the values in result are unordered. This is different from pool.map, which returns a list whose return values are in the same order as its corresponding list of arguments.
Barring an exception, result will eventually contain N_iteration * N_pop return values once all the tasks complete. Above, pool.close() and pool.join() were used to wait for all the tasks to complete.

Related

Best way to keep creating threads on variable list argument

I have an event that I am listening to every minute that returns a list; it could be empty, have 1 element, or more. With the elements of that list, I'd like to run a function that monitors an event on each element every minute for 10 minutes.
For that I wrote this script:
from concurrent.futures import ThreadPoolExecutor
from time import sleep
import asyncio
import Client

client = Client()

def handle_event(event):
    for i in range(10):
        client.get_info(event)
        sleep(60)

async def main():
    while True:
        entires = client.get_new_entry()
        if len(entires) > 0:
            with ThreadPoolExecutor(max_workers=len(entires)) as executor:
                executor.map(handle_event, entires)
        await asyncio.sleep(60)

if __name__ == "__main__":
    loop = asyncio.new_event_loop()
    loop.run_until_complete(main())
However, instead of continuing to monitor for new entries, it blocks while the previous entries are still being monitored.
Any idea how I could do that please?
First let me explain why your program doesn't work the way you want it to: It's because you use the ThreadPoolExecutor as a context manager, which will not close until all the threads started by the call to map are finished. So main() waits there, and the next iteration of the loop can't happen until all the work is finished.
There are ways around this. Since you are using asyncio already, one approach is to move the creation of the Executor to a separate task. Each iteration of the main loop starts one copy of this task, which runs as long as it takes to finish. It's an async def function, so many copies of this task can run concurrently.
I changed a few things in your code. Instead of Client I just used some simple print statements. I pass a list of integers, of random length, to handle_event. I increment a counter each time through the while True: loop, and add 10 times the counter to every integer in the list. This makes it easy to see how old calls continue for a time, mixing with new calls. I also shortened your time delays. All of these changes were for convenience and are not important.
The important change is to move ThreadPoolExecutor creation into a task. To make it cooperate with other tasks, it must contain an await expression, and for that reason I use executor.submit rather than executor.map. submit returns a concurrent.futures.Future, which provides a convenient way to await the completion of all the calls. executor.map, on the other hand, returns an iterator; I couldn't think of any good way to convert it to an awaitable object.
To convert a concurrent.futures.Future into an awaitable asyncio.Future, there is the function asyncio.wrap_future. When all the futures are complete, I exit from the ThreadPoolExecutor context manager. That exit will be very fast since all of the Executor's work is finished, so it does not block other tasks.
import random
from concurrent.futures import ThreadPoolExecutor
from time import sleep
import asyncio

def handle_event(event):
    for i in range(10):
        print("Still here", event)
        sleep(2)

async def process_entires(counter, entires):
    print("Counter", counter, "Entires", entires)
    x = [counter * 10 + a for a in entires]
    with ThreadPoolExecutor(max_workers=len(entires)) as executor:
        futs = []
        for z in x:
            futs.append(executor.submit(handle_event, z))
        await asyncio.gather(*(asyncio.wrap_future(f) for f in futs))

async def main():
    counter = 0
    while True:
        entires = [0, 1, 2, 3, 4][:random.randrange(5)]
        if len(entires) > 0:
            counter += 1
            asyncio.create_task(process_entires(counter, entires))
        await asyncio.sleep(3)

if __name__ == "__main__":
    asyncio.run(main())

Is it possible for two coroutines running in different threads to communicate with each other via asyncio.Queue?

The two coroutines in the code below, running in different threads, cannot communicate with each other via asyncio.Queue. After the producer inserts a new item into the asyncio.Queue, the consumer cannot get this item from that asyncio.Queue; it gets blocked in the method await self.n_queue.get().
I tried printing the id of the asyncio.Queue in both the consumer and the producer, and I found that they are the same.
import asyncio
import threading
import time

class Consumer:
    def __init__(self):
        self.n_queue = None
        self._event = None

    def run(self, loop):
        loop.run_until_complete(asyncio.run(self.main()))

    async def consume(self):
        while True:
            print("id of n_queue in consumer:", id(self.n_queue))
            data = await self.n_queue.get()
            print("get data ", data)
            self.n_queue.task_done()

    async def main(self):
        loop = asyncio.get_running_loop()
        self.n_queue = asyncio.Queue(loop=loop)
        task = asyncio.create_task(self.consume())
        await asyncio.gather(task)

    async def produce(self):
        print("id of queue in producer ", id(self.n_queue))
        await self.n_queue.put("This is a notification from server")

class Producer:
    def __init__(self, consumer, loop):
        self._consumer = consumer
        self._loop = loop

    def start(self):
        while True:
            time.sleep(2)
            self._loop.run_until_complete(self._consumer.produce())

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    print(id(loop))
    consumer = Consumer()
    threading.Thread(target=consumer.run, args=(loop,)).start()
    producer = Producer(consumer, loop)
    producer.start()
id of n_queue in consumer: 2255377743176
id of queue in producer 2255377743176
id of queue in producer 2255377743176
id of queue in producer 2255377743176
I tried debugging step by step into asyncio.Queue, and I found that after the method self._getters.append(getter) is invoked in asyncio.Queue, the getter is inserted into the queue self._getters. The following snippets are all from asyncio.Queue.
async def get(self):
    """Remove and return an item from the queue.

    If queue is empty, wait until an item is available.
    """
    while self.empty():
        getter = self._loop.create_future()
        self._getters.append(getter)
        try:
            await getter
        except:
            # ...
            raise
    return self.get_nowait()
When a new item is inserted into the asyncio.Queue by the producer, the methods below are invoked. The variable self._getters has no items, even though it has the same id in the put() and get() methods.
def put_nowait(self, item):
    """Put an item into the queue without blocking.

    If no free slot is immediately available, raise QueueFull.
    """
    if self.full():
        raise QueueFull
    self._put(item)
    self._unfinished_tasks += 1
    self._finished.clear()
    self._wakeup_next(self._getters)

def _wakeup_next(self, waiters):
    # Wake up the next waiter (if any) that isn't cancelled.
    while waiters:
        waiter = waiters.popleft()
        if not waiter.done():
            waiter.set_result(None)
            break
Does anyone know what's wrong with the demo code above? If the two coroutines are running in different threads, how could they communicate with each other by asyncio.Queue?
Short answer: no!
Because the asyncio.Queue needs to share the same event loop, but
An event loop runs in a thread (typically the main thread) and executes all callbacks and Tasks in its thread. While a Task is running in the event loop, no other Tasks can run in the same thread. When a Task executes an await expression, the running Task gets suspended, and the event loop executes the next Task.
see
https://docs.python.org/3/library/asyncio-dev.html#asyncio-multithreading
Even though you can pass the event loop to threads, it might be dangerous to mix the different concurrency concepts. Still, note that passing the loop just means that you can add tasks to it from different threads, but they will still be executed in the main thread. However, adding tasks from threads can lead to race conditions in the event loop, because
Almost all asyncio objects are not thread safe, which is typically not a problem unless there is code that works with them from outside of a Task or a callback. If there’s a need for such code to call a low-level asyncio API, the loop.call_soon_threadsafe() method should be used
see
https://docs.python.org/3/library/asyncio-dev.html#asyncio-multithreading
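To make the pattern concrete, here is a minimal sketch (my own, not from the linked docs) of the usual workaround: the asyncio.Queue lives in the event-loop thread, and the producer thread only touches it through loop.call_soon_threadsafe:
import asyncio
import threading
import time

async def consumer(queue):
    while True:
        item = await queue.get()        # runs inside the event-loop thread
        print("got", item)
        queue.task_done()

def producer(loop, queue):
    # plain thread: never calls queue methods directly, only schedules
    # them on the loop in a thread-safe way
    for i in range(3):
        time.sleep(1)
        loop.call_soon_threadsafe(queue.put_nowait, f"item {i}")

async def main():
    queue = asyncio.Queue()
    loop = asyncio.get_running_loop()
    asyncio.create_task(consumer(queue))
    threading.Thread(target=producer, args=(loop, queue), daemon=True).start()
    await asyncio.sleep(5)              # sketch only: let the consumer drain a few items

asyncio.run(main())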
Typically, you should not need to run async functions in different threads, because they should be IO bound and therefore a single thread should be sufficient to handle the work load. If you still have some CPU bound tasks, you are able to dispatch them to different threads and make the result awaitable using asyncio.to_thread, see https://docs.python.org/3/library/asyncio-task.html#running-in-threads.
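For example, a short sketch of asyncio.to_thread (available since Python 3.9); the blocking_io function below is just a stand-in:
import asyncio
import time

def blocking_io(name):
    time.sleep(1)                       # stands in for a blocking call
    return f"processed {name}"

async def main():
    # each call runs in the default thread pool; the event loop stays free
    results = await asyncio.gather(
        asyncio.to_thread(blocking_io, "a"),
        asyncio.to_thread(blocking_io, "b"),
    )
    print(results)

asyncio.run(main())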
There are many questions already about this topic, see e.g. Send asyncio tasks to loop running in other thread or How to combine python asyncio with threads?
If you want to learn more about the concurrency concepts, I recommend reading https://medium.com/analytics-vidhya/asyncio-threading-and-multiprocessing-in-python-4f5ff6ca75e8

Multiprocessing and lists in Python

I have a list of jobs, but due to certain conditions not all of the jobs should run in parallel at the same time, because sometimes it is important that a finishes before I start b, or vice versa (actually it's not important which one runs first, just that they don't both run at the same time). So I thought I'd keep a list of the currently running threads, and whenever a new one starts, it checks this list of currently running threads to see whether it can proceed or not. I wrote some sample code for that:
from time import sleep
from multiprocessing import Pool

def square_and_test(x):
    print(running_list)
    if not x in running_list:
        running_list = running_list.append(x)
        sleep(1)
        result_list = result_list.append(x**2)
        running_list = running_list.remove(x)
    else:
        print(f'{x} is currently worked on')

task_list = [1,2,3,4,1,1,4,4,2,2]
running_list = []
result_list = []
pool = Pool(2)
pool.map(square_and_test, task_list)
print(result_list)
This code fails with UnboundLocalError: local variable 'running_list' referenced before assignment, so I guess my threads don't have access to global variables. Is there a way around this? If not, is there another way to solve this problem?
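For illustration only, a minimal sketch of one common workaround: a multiprocessing.Manager provides list proxies that every worker process can see (all names below are placeholders, not from the post, and the membership check is still not atomic, so a manager lock would be needed for real mutual exclusion):
from multiprocessing import Pool, Manager

def square_and_test(args):
    x, running_list, result_list = args
    # the Manager proxies are shared across processes, unlike plain globals
    if x not in running_list:
        running_list.append(x)          # proxy method call, not reassignment
        result_list.append(x ** 2)
        running_list.remove(x)
    else:
        print(f'{x} is currently worked on')

if __name__ == '__main__':
    task_list = [1, 2, 3, 4, 1, 1, 4, 4, 2, 2]
    with Manager() as manager:
        running_list = manager.list()
        result_list = manager.list()
        with Pool(2) as pool:
            pool.map(square_and_test,
                     [(x, running_list, result_list) for x in task_list])
        print(list(result_list))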

Why is asyncio blocking with the process pool?

I have the following code:
import time
import asyncio
from concurrent.futures import ProcessPoolExecutor

def blocking_func(x):
    print("In blocking waiting")
    time.sleep(x)  # Pretend this is expensive calculations
    print("after blocking waiting")
    return x * 5

@asyncio.coroutine
def main():
    executor = ProcessPoolExecutor()
    out = yield from loop.run_in_executor(executor, blocking_func, 2)  # This does not
    print("after process pool")
    print(out)

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
Output:
In blocking waiting
after blocking waiting
after process pool
10
But I was expecting the process pool to run the code in a different process, so I was expecting the output to be:
Expecting output:
In blocking waiting
after process pool
after blocking waiting
10
I thought that if we ran the code in a process pool it would not block the main loop. But in the output, it came back to the main event loop only after it was done with the blocking function.
What is blocking the event loop? Is it blocking_func? If it is blocking_func, what is the use of having the process pool?
yield from here means "wait for the coroutine to complete and return its result". Compared to the Python threading API, it is like calling join().
To get the desired result, use something like this:
@asyncio.coroutine
def main():
    executor = ProcessPoolExecutor()
    task = loop.run_in_executor(executor, blocking_func, 2)
    # at this point your blocking func is already running
    # in the executor process
    print("after process pool")
    out = yield from task
    print(out)
Coroutines aren't separate processes. The difference is that coroutines need to give up control to the loop by themselves. This means that if you have a blocking coroutine, it will block the whole loop.
The reason you use coroutines is mainly to handle I/O activities. If you are waiting for a message, you can simply check a socket, and if nothing happens you return to the main loop. Then other coroutines can be handled before control finally comes back to the I/O function.
In your case it makes sense to use await asyncio.sleep(x) instead of time.sleep(x). This way control is suspended from blocking_func() for the sleep time. Afterwards control goes back there, and the result should be as you expected.
More infos: https://docs.python.org/3/library/asyncio.html
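For reference, a minimal sketch of the same idea using the modern async def / await syntax instead of the generator-based coroutines shown above (my adaptation, not part of either answer):
import asyncio
import time
from concurrent.futures import ProcessPoolExecutor

def blocking_func(x):
    print("In blocking waiting")
    time.sleep(x)                        # stands in for an expensive calculation
    print("after blocking waiting")
    return x * 5

async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as executor:
        task = loop.run_in_executor(executor, blocking_func, 2)
        # the worker process is already running; the event loop is free here
        print("after process pool")
        out = await task                 # now wait for the result
        print(out)

if __name__ == "__main__":
    asyncio.run(main())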

Python Multiprocessing Queue Slow

I have a problem with Python multiprocessing queues.
I'm doing some heavy computation on some data. I created a few processes to lower the calculation time, and the data is split evenly before being sent to the processes. This decreases the time of the calculations nicely, but when I want to return data from a process via a multiprocessing.Queue, it takes ages and the whole thing ends up slower than calculating in the main thread.
processes = []
proc = 8
for i in range(proc):
    processes.append(multiprocessing.Process(
        target=self.calculateTriangles, args=(inData[i], outData, timer)))
for p in processes:
    p.start()

results = []
for i in range(proc):
    results.append(outData.get())
print("killing threads")
print(datetime.datetime.now() - timer)

for p in processes:
    p.join()
print("Finish Threads")
print(datetime.datetime.now() - timer)
All of the processes print their finish time when they are done. Here is example output of this code:
0:00:00.017873 CalcDone
0:00:01.692940 CalcDone
0:00:01.777674 CalcDone
0:00:01.780019 CalcDone
0:00:01.796739 CalcDone
0:00:01.831723 CalcDone
0:00:01.842356 CalcDone
0:00:01.868633 CalcDone
0:00:05.497160 killing threads
60968 calculated triangles
As you can see, everything is quite simple until this code:
for i in range(proc):
    results.append(outData.get())
print("killing threads")
print(datetime.datetime.now() - timer)
Here are some observations I have made on my computer and a slower one:
https://docs.google.com/spreadsheets/d/1_8LovX0eSgvNW63-xh8L9-uylAVlzY4VSPUQ1yP2F9A/edit?usp=sharing . On the slower one there isn't any improvement, as you can see.
Why does it take so much time to get items from the queue when the processes are finished? Is there a way to speed this up?
So I solved it myself. The calculations are fast, but copying objects from one process to another takes ages. I just made a method that clears all the unnecessary fields in the objects; also, using pipes is faster than multiprocessing queues. It took the time on my slower computer down from 29 seconds to 15 seconds.
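For illustration, a minimal sketch of the pipe approach described above (the worker and data here are placeholders, not the original calculateTriangles code): each worker sends its trimmed-down result through one end of a multiprocessing.Pipe, and the parent receives before joining.
import multiprocessing

def worker(conn, chunk):
    # placeholder for the real calculation; send back only what is needed
    result = [x * x for x in chunk]
    conn.send(result)                    # pickled once and pushed through the pipe
    conn.close()

if __name__ == '__main__':
    data = list(range(8000))
    chunks = [data[i::4] for i in range(4)]          # split evenly across 4 workers
    jobs = []
    for chunk in chunks:
        parent_conn, child_conn = multiprocessing.Pipe(duplex=False)
        p = multiprocessing.Process(target=worker, args=(child_conn, chunk))
        p.start()
        jobs.append((p, parent_conn))
    results = []
    for p, conn in jobs:
        results.extend(conn.recv())                  # receive before join to avoid a deadlock
        p.join()
    print(len(results))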
This time is mainly spent on putting objects into the Queue one at a time and bumping up the semaphore count. If you are able to bulk insert all the data into the Queue at once, then you cut the time down to 1/10 of the previous time.
I dynamically assigned a new method to Queue, based on the old one. Go to the multiprocessing module for your Python version:
/usr/lib/pythonx.x/multiprocessing/queues.py
Copy the "put" method of the class to your project e.g. for Python 3.7:
def put(self, obj, block=True, timeout=None):
    assert not self._closed, "Queue {0!r} has been closed".format(self)
    if not self._sem.acquire(block, timeout):
        raise Full

    with self._notempty:
        if self._thread is None:
            self._start_thread()
        self._buffer.append(obj)
        self._notempty.notify()
modify it:
def put_bla(self, obj, block=True, timeout=None):
    assert not self._closed, "Queue {0!r} has been closed".format(self)
    for el in obj:
        if not self._sem.acquire(block, timeout):  # spike the semaphore count
            raise Full

    with self._notempty:
        if self._thread is None:
            self._start_thread()
        self._buffer += obj  # adding a collections.deque object
        self._notempty.notify()
The last step is to add the new method to the class. multiprocessing.Queue is a DefaultContext method which returns a Queue object, so it is easier to inject the method directly into the class of the created object. So:
from collections import deque
from multiprocessing import Queue

queue = Queue()
queue.__class__.put_bulk = put_bla  # injecting the new method
items = (500, 400, 450, 350) * count  # (500, 400, 450, 350, 500, 400...)
queue.put_bulk(deque(items))
Unfortunately, multiprocessing.Pool was always faster by about 10%, so just stick with that if you don't require everlasting workers to process your tasks. It is based on multiprocessing.SimpleQueue, which is based on multiprocessing.Pipe, and I have no idea why it is faster, because my SimpleQueue solution wasn't and it is not bulk-injectable :) Break that and you'll have the fastest worker ever :)
