Celery worker with multithreading - how to update results concurrently

I created a Flask API with a Celery worker. The user presses a "start tests" button, which makes a POST request that returns a URL the user can poll every 5 seconds for the test results (needed to update a frontend progress bar). The Celery task uses threading. My goal is to update the Celery task state concurrently, based on the results of the threads: I don't want to wait until all my threads finish to return their results. My Celery task looks like this:
@celery.task(bind=True)  # the bind argument instructs Celery to pass a "self" argument, used to record status updates
def run_tests(self, dialog_cases):
    """
    Testing running as a background task
    """
    results = []
    test_case_no = 1
    test_controller = TestController(dialog_cases)
    bot_config = [test_controller.url, test_controller.headers, test_controller.db_name]
    threads = []
    queue = Queue()
    start = time.perf_counter()
    threads_list = list()
    for test_case in test_controller.test_cases:
        t = Thread(target=queue.put({randint(0, 1000): TestCase(test_case, bot_config)}))
        t.start()
        threads_list.append(t)
    for t in threads_list:
        t.join()
    results_dict_list = [queue.get() for _ in range(len(test_controller.test_cases))]
    for result in results_dict_list:
        for key, value in result.items():
            cprint.info(f"{key}, {value.test_failed}")
Now: the TestCase is an object that, on creation, runs a function that makes a few iterations and then returns whether the test failed or passed. I have another Flask endpoint which returns the status of the task. The question is how to get the values returned by the threads as they finish, without having to wait until they are all done. I tried Queue, but that only gives me the results once everything is over.

You can simply use update_state to modify the state of the task from each of those threads, if that is what you want. Furthermore, you can create your own custom states. Since you want to know the result of each test the moment it finishes, it seems like a good idea to have a custom state for each test that you update from each thread during runtime.
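For example, a minimal sketch of that idea applied to the task from the question (TestController, TestCase and bot_config are taken from the question; the 'PROGRESS' state name and the meta layout are arbitrary choices, not something Celery mandates):

import threading
from random import randint

@celery.task(bind=True)
def run_tests(self, dialog_cases):
    test_controller = TestController(dialog_cases)
    bot_config = [test_controller.url, test_controller.headers, test_controller.db_name]
    finished = {}
    lock = threading.Lock()

    def run_one(test_case):
        case = TestCase(test_case, bot_config)  # runs the test on construction, as in the question
        with lock:
            finished[randint(0, 1000)] = case.test_failed
            # Report partial progress as a custom state that the status endpoint can read.
            self.update_state(state='PROGRESS',
                              meta={'done': len(finished),
                                    'total': len(test_controller.test_cases),
                                    'results': dict(finished)})

    threads = [threading.Thread(target=run_one, args=(tc,)) for tc in test_controller.test_cases]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return finished

The status endpoint can then return the AsyncResult's info dictionary whenever its state is 'PROGRESS'.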
An alternative is to refactor your code so that each test is actually a Celery task. Then you use the Chord or Group primitives to build your workflow. Since you want to know the state during runtime, Group is probably the better fit, because you can then monitor the state of the GroupResult object...
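A rough sketch of that refactoring, assuming one test maps to one task (the task names are illustrative, and GroupResult.save()/restore() requires a result backend that supports it):

from celery import group
from celery.result import GroupResult

@celery.task
def run_single_test(test_case, bot_config):
    return TestCase(test_case, bot_config).test_failed

@celery.task
def run_tests(dialog_cases):
    test_controller = TestController(dialog_cases)
    bot_config = [test_controller.url, test_controller.headers, test_controller.db_name]
    job = group(run_single_test.s(tc, bot_config) for tc in test_controller.test_cases)
    group_result = job.apply_async()
    group_result.save()  # so the status endpoint can restore it by id later
    return group_result.id

# In the status endpoint, progress can be read back while the tests are still running:
#   restored = GroupResult.restore(group_id)
#   restored.completed_count(), restored.ready()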

Related

How do I retrieve data from a Django DB before sending off Celery task to remote worker?

I have a celery shared_task that is scheduled to run at certain intervals. Every time this task runs, it needs to first retrieve data from the Django DB in order to complete the calculation. This task may or may not be sent to a celery worker that is on a separate machine, so inside the celery task I can't make any queries to the local Django database.
So far I have tried using signals to accomplish this, since I know that functions decorated with @before_task_publish.connect are executed before the task is even published to the message queue. However, I don't know how I can actually get the data to the task.
@shared_task
def run_computation(data):
    perform_computation(data)

@before_task_publish.connect
def receiver_before_task_publish(sender=None, headers=None, body=None, **kwargs):
    data = create_data()
    # How do I get the data to the task from here?
Is this the right way to approach this in the first place? Or would I be better off making an API route that the celery task can call to retrieve the data?
I'm posting the solution that worked for me, thanks for the help @solarissmoke.
What works best for me is utilizing Celery "chain" callback functions and separate RabbitMQ queues to designate what is computed locally and what is computed on the remote worker.
My solution looks something like this:
from celery import signature

@app.task
def create_data_task():
    # This creates the data to be passed to the analysis function
    return create_data()

@app.task
def perform_computation_task(data):
    # This performs the computation with the given data
    return perform_computation(data)

@app.task
def store_results(result):
    # This would store the result in the DB; for now we just print it
    print(result)

@app.task
def run_all_computation():
    task = (
        signature("path.to.tasks.create_data_task", queue="default")
        | signature("path.to.tasks.perform_computation_task", queue="remote_computation")
        | signature("path.to.tasks.store_results", queue="default")
    )
    task()
It's important to note here that these tasks are not run serially; they are in fact separate tasks run by the workers and therefore do not block a single thread. Each subsequent task is only activated by a callback from the previous one. I declared two celery queues in RabbitMQ: a default one called "default", and one specifically for remote computation called "remote_computation". This is described explicitly here, including how to point celery workers at created queues.
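As a rough sketch (the proj module name and broker URL are placeholders of my own), the routing can also be configured centrally instead of repeating queue= on every signature; each worker is then started against its own queue, e.g. celery -A proj worker -Q default and celery -A proj worker -Q remote_computation:

from celery import Celery

app = Celery("proj", broker="amqp://localhost")

# Everything goes to "default" unless routed elsewhere.
app.conf.task_default_queue = "default"
# Send the heavy task to the queue consumed by the remote worker.
app.conf.task_routes = {
    "path.to.tasks.perform_computation_task": {"queue": "remote_computation"},
}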
It is possible to modify the task data in place, from the before_task_publish handler, so that it gets passed to the task. I will say upfront that there are many reasons why this is not a good idea:
@before_task_publish.connect
def receiver_before_task_publish(sender=None, headers=None, body=None, **kwargs):
    data = create_data()
    # Modify the body of the task data.
    # body is a tuple, the first entry of which is a tuple of arguments to the task,
    # so we replace the first argument (data) with our own.
    body[0][0] = data
    # Alternatively modify the kwargs, which is a bit more explicit:
    body[1]['data'] = data
This works, but it should be obvious why it's risky and prone to breakage. Assuming you have control over the task call sites I think it would be better to drop the signal altogether and just have a simple function that does the work for you, i.e.,:
def create_task():
    data = create_data()
    run_computation.delay(data)
And then in your calling code, just call create_task() instead of calling the task directly (which is presumably what you do right now).

Is it a good idea to use ThreadPoolExecutor with one worker?

I have a simple REST service which allows you to create tasks. When a client requests a task, it returns a unique task number and starts executing it in a separate thread. The easiest way to implement it:
class Executor:
    def __init__(self, max_workers=1):
        self.executor = ThreadPoolExecutor(max_workers)

    def execute(self, body, task_number):
        # some logic
        pass


def some_rest_method(request):
    body = json.loads(request.body)
    task_id = generate_task_id()
    Executor(max_workers=1).execute(body, task_id)
    return Response({'taskId': task_id})
Is it a good idea to create a new ThreadPoolExecutor with one (!) worker every time, if I know that one request means one new task (one new thread)? Perhaps it is worth putting them in a queue somehow? Or maybe the best option is to just create a plain Thread every time?
Is it a good idea to create each time ThreadPoolExecutor...
No. That completely defeats the purpose of a thread pool. The reason for using a thread pool is so that you don't create and destroy a new thread for every request. Creating and destroying threads is expensive. The idea of a thread pool is that it keeps the "worker thread(s)" alive and re-uses it/them for each next request.
...with just one thread
There's a good use-case for a single-threaded executor, though it probably does not apply to your problem. The use-case is, you need a sequence of tasks to be performed "in the background," but you also need them to be performed sequentially. A single-thread executor will perform the tasks, one after another, in the same order that they were submitted.
Perhaps it is worth putting them in the queue somehow?
You already are putting them in a queue. Every thread pool has a queue of pending tasks. When you submit a task (i.e., executor.execute(...)) that puts the task into the queue.
what's the best way...in my case?
The bones of a simplistic server look something like this (pseudo-code):
POOL = ThreadPoolExecutor(...with however many threads seem appropriate...)

def service():
    socket = create_a_socket_that_listens_on_whatever_port()
    while True:
        client_connection = socket.accept()
        POOL.submit(request_handler, connection=client_connection)

def request_handler(connection):
    request = receive_request_from(connection)
    reply = generate_reply_based_on(request)
    send_reply_to(reply, connection)
    connection.close()

def main():
    initialize_stuff()
    service()
Of course, there are many details that I have left out. I can't design it for you. Especially not in Python. I've written servers like this in other languages, but I'm pretty new to Python.
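For the specific case in the question, a minimal sketch of the same idea would create one pool at import time and reuse it for every request (generate_task_id and Response are taken from the question; the worker count is a free choice):

import json
from concurrent.futures import ThreadPoolExecutor

# Created once and reused; its internal queue holds the pending tasks.
EXECUTOR = ThreadPoolExecutor(max_workers=4)

def execute(body, task_number):
    # some logic, as in the question
    pass

def some_rest_method(request):
    body = json.loads(request.body)
    task_id = generate_task_id()
    EXECUTOR.submit(execute, body, task_id)  # enqueues the task and returns immediately
    return Response({'taskId': task_id})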

Using Multiple Asyncio Queues Effectively

I am currently building a project that requires multiple requests made to various endpoints. I am wrapping these requests in Aiohttp to allow for async.
The problem:
I have three Queues: queue1, queue2, and queue3. Additionally, I have three worker functions (worker1, worker2, worker3), each associated with its respective Queue. The first queue is populated immediately with a list of IDs that is known prior to running. When a request is finished and the data is committed to a database, the ID is passed to queue2. A worker2 will take this ID and request more data. From this data it will begin to generate a list of IDs (different from the IDs in queue1/queue2). worker2 will put these IDs in queue3. Finally worker3 will grab an ID from queue3 and request more data before committing it to a database.
The issue arises from the fact that queue.join() is a blocking call. Each worker is tied to a separate Queue, so the join for queue1 will block until that queue is finished. This is fine, but it also defeats the purpose of using async. Without using join(), the program is unable to detect when the Queues are completely empty. The other issue is that there may be silent errors when one of the Queues is empty but there is still data that hasn't been added yet.
The basic code outline is as follows:
queue1 = asyncio.Queue()
queue2 = asyncio.Queue()
queue3 = asyncio.Queue()

async with aiohttp.ClientSession() as session:
    for i in range(3):
        tasks.append(asyncio.create_task(worker1(queue1)))
    for i in range(3):
        tasks.append(asyncio.create_task(worker2(queue2)))
    for i in range(10):
        tasks.append(asyncio.create_task(worker3(queue3)))
    for i in IDs:
        queue1.put_nowait(i)
    await asyncio.gather(*tasks)
The worker functions sit in an infinite loop waiting for items to enter the queue.
When the data has all been processed there will be no exit and the program will hang.
Is there a way to manage the workers effectively and shut them down properly?
As nicely explained in this answer, Queue.join serves to inform the producer when all the work injected into the queue has been completed. Since your first queue doesn't know when a particular item is done (it's multiplied and distributed to other queues), join is not the right tool for you.
Judging from your code, it seems that your workers need to run for only as long as it takes to process the queue's initial items. If that is the case, then you can use a shutdown sentinel to signal the workers to exit. For example:
async with aiohttp.ClientSession() as session:
    # ... create tasks as above ...
    for i in IDs:
        queue1.put_nowait(i)
    queue1.put_nowait(None)  # no more work
    await asyncio.gather(*tasks)
This is like your original code, but with an explicit shutdown request. Workers must detect the sentinel and react accordingly: propagate it to the next queue/worker and exit. For example, in worker1:
while True:
    item = await queue1.get()
    if item is None:
        # done with processing, propagate sentinel to worker2 and exit
        await queue2.put(None)
        break
    # ... process item as usual ...
Doing the same in the other two workers (except for worker3, which won't propagate the sentinel because there is no next queue) will result in all three tasks completing once the work is done. Since queues are FIFO, the workers can safely exit after encountering the sentinel, knowing that no items have been dropped. The explicit shutdown also distinguishes a shut-down queue from one that happens to be empty, thus preventing workers from exiting prematurely due to a temporarily empty queue.
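For instance, a sketch of the remaining two workers, assuming one worker task per queue and with queue3 visible in the enclosing scope as in the snippet above (with several worker tasks per queue, as in the original outline, one sentinel per worker task would be needed):

async def worker2(queue2):
    while True:
        item = await queue2.get()
        if item is None:
            await queue3.put(None)  # pass the shutdown signal downstream
            break
        # ... request more data and put the derived IDs on queue3 ...

async def worker3(queue3):
    while True:
        item = await queue3.get()
        if item is None:
            break  # last stage: nothing left to propagate
        # ... request data and commit it to the database ...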
Up to Python 3.7, this technique was actually demonstrated in the documentation of Queue, but that example somewhat confusingly shows both the use of Queue.join and the use of a shutdown sentinel. The two are separate and can be used independently of one another. (And it might also make sense to use them together, e.g. to use Queue.join to wait for a "milestone", and then put other stuff in the queue, while reserving the sentinel for stopping the workers.)

Wait till all tasks are run in Celery python

I am using celery in Python for asynchronous tasks. I want to capture the results once all the tasks assigned to all workers are done.
For that, I am using the .get() method, but the problem with get() is that all the tasks are being assigned to a single worker which is synchronous but I want the tasks to be distributed to all the workers available.
Below is my snippet.
for url in urls:
    res = good_bad_urls.delay(url[1])
    res.get()
return JsonResponse(some_data)
Is there another method in Celery to wait for all the tasks to finish while still running them asynchronously?
but the problem with get() is that all the tasks are being assigned to a single worker which is synchronous
Well, not exactly. The task distribution works just the same (even if it can seem otherwise), and the tasks themselves are still async. The difference is that result.get() IS a blocking call, so in your case it waits for the current task to finish before launching the next one.
But anyway: the solution here is to use a Group. In your case, it should look something like
from celery import group

jobs = group([good_bad_urls.s(url[1]) for url in urls])
async_res = jobs.apply_async()
result = async_res.get()
The get() call will now wait for all tasks to be finished, but they will be launched in parallel.

Process finishes but cannot be joined?

To accelerate a certain task, I'm subclassing Process to create a worker that will process data coming in samples. Some managing class will feed it data and read the outputs (using two Queue instances). For asynchronous operation I'm using put_nowait and get_nowait. At the end I'm sending a special exit code to my process, upon which it breaks its internal loop. However... it never happens. Here's a minimal reproducible example:
import multiprocessing as mp

class Worker(mp.Process):
    def __init__(self, in_queue, out_queue):
        super(Worker, self).__init__()
        self.input_queue = in_queue
        self.output_queue = out_queue

    def run(self):
        while True:
            received = self.input_queue.get(block=True)
            if received is None:
                break
            self.output_queue.put_nowait(received)
        print("\tWORKER DEAD")

class Processor():
    def __init__(self):
        # prepare
        in_queue = mp.Queue()
        out_queue = mp.Queue()
        worker = Worker(in_queue, out_queue)
        # get to work
        worker.start()
        in_queue.put_nowait(list(range(10**5)))  # XXX
        # clean up
        print("NOTIFYING")
        in_queue.put_nowait(None)
        # out_queue.get()  # XXX
        print("JOINING")
        worker.join()

Processor()
This code never completes, hanging permanently like this:
NOTIFYING
JOINING
WORKER DEAD
Why?
I've marked two lines with XXX. For the first one: if I send less data (say, 10**4), everything finishes normally (the process joins as expected). Similarly for the second: if I get() after notifying the worker to finish, it also joins. I know I'm missing something, but nothing in the documentation seems relevant.
Documentation mentions that
When an object is put on a queue, the object is pickled and a background thread later flushes the pickled data to an underlying pipe. This has some consequences [...] After putting an object on an empty queue there may be an infinitesimal delay before the queue’s empty() method returns False and get_nowait() can return without raising queue.Empty.
https://docs.python.org/3.7/library/multiprocessing.html#pipes-and-queues
and additionally that
whenever you use a queue you need to make sure that all items which have been put on the queue will eventually be removed before the process is joined. Otherwise you cannot be sure that processes which have put items on the queue will terminate.
https://docs.python.org/3.7/library/multiprocessing.html#multiprocessing-programming
This means that the behaviour you describe is probably caused by a race condition between self.output_queue.put_nowait(received) in the worker and joining the worker with worker.join() in the Processor's __init__. If joining is faster than feeding the item into the queue, everything finishes fine. If it is too slow, there is an item left in the queue, and the worker will not join.
Uncommenting the out_queue.get() in the main process empties the queue, which allows joining. But since it is important that the call returns even if the queue happens to be empty already, using a timeout might be an option to wait out the race condition, e.g. out_queue.get(timeout=10).
It might also be important to protect the main routine, especially on Windows (see: Python multiprocessing on Windows, if __name__ == "__main__").
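Putting both points together, a sketch of a cleanup order that avoids the hang could look like this (Worker is the class from the question; the timeout value is arbitrary):

import multiprocessing as mp

def main():
    in_queue = mp.Queue()
    out_queue = mp.Queue()
    worker = Worker(in_queue, out_queue)
    worker.start()
    in_queue.put_nowait(list(range(10**5)))
    print("NOTIFYING")
    in_queue.put_nowait(None)
    # Drain the output queue *before* joining, so the worker's feeder
    # thread can flush everything it has put on the queue.
    result = out_queue.get(timeout=10)
    print("JOINING")
    worker.join()

if __name__ == "__main__":  # protects the entry point, required on Windows
    main()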
