Could you tell me if this is a correct approach to building several independent async loops, each inside its own thread?
import asyncio
import threading
def init():
    print("Initializing Async...")
    global loop_heavy
    loop_heavy = asyncio.new_event_loop()
    start_loop(loop_heavy)
def start_loop(loop):
    thread = threading.Thread(target=loop.run_forever)
    thread.start()
def submit_heavy(task):
    future = asyncio.run_coroutine_threadsafe(task, loop_heavy)
    try:
        future.result()
    except Exception as e:
        print(e)
def stop():
    loop_heavy.call_soon_threadsafe(loop_heavy.stop)
async def heavy():
    print("3. heavy start %s" % threading.current_thread().name)
    await asyncio.sleep(3) # or await asyncio.sleep(3, loop=loop_heavy)
    print("4. heavy done")
Then I am testing it with:
if __name__ == "__main__":
    init()
    print("1. submit heavy: %s" % threading.current_thread().name)
    submit_heavy(heavy())
    print("2. submit is done")
    stop()
I am expecting to see 1->3->2->4 but in fact it is 1->3->4->2:
Initializing Async...
1. submit heavy: MainThread
3. heavy start Thread-1
4. heavy done
2. submit is done
I think I am missing something in my understanding of async and threads.
Threads are different. Why am I waiting inside MainThread until the job inside Thread-1 is finished?
Why am I waiting inside MainThread until the job inside Thread-1 is finished?
Good question, why are you?
One possible answer is, because you actually want to block the current thread until the job is finished. This is one of the reasons to put the event loop in another thread and use run_coroutine_threadsafe.
The other possible answer is that you don't have to if you don't want. You can simply return from submit_heavy() the concurrent.futures.Future object returned by run_coroutine_threadsafe, and leave it to the caller to wait for the result (or check if one is ready) at their own leisure.
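As a rough illustration of that second option, here is a minimal sketch in which the submit function hands the future back instead of waiting on it. The helper names (start_loop, submit, stop_loop, demo) and the short delay are assumptions made for this example, not part of the original code.

```python
import asyncio
import threading


def start_loop():
    # Run a dedicated event loop in a background thread.
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    return loop, thread


async def heavy(delay):
    await asyncio.sleep(delay)
    return "heavy done"


def submit(coro, loop):
    # Return the concurrent.futures.Future instead of calling .result() here,
    # so the caller decides when (or whether) to wait.
    return asyncio.run_coroutine_threadsafe(coro, loop)


def stop_loop(loop, thread):
    loop.call_soon_threadsafe(loop.stop)
    thread.join()
    loop.close()


def demo():
    loop, thread = start_loop()
    future = submit(heavy(0.2), loop)
    done_immediately = future.done()   # the caller is not blocked here
    result = future.result(timeout=5)  # wait only when the result is needed
    stop_loop(loop, thread)
    return done_immediately, result


if __name__ == "__main__":
    print(demo())
```

The caller can also poll future.done() or attach a callback with future.add_done_callback() rather than blocking at all.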
Finally, if your goal is just to run a regular function "in the background" (without blocking the current thread), perhaps you don't need asyncio at all. Take a look at the concurrent.futures module, whose ThreadPoolExecutor allows you to easily submit a function to a thread pool and leave it to execute unassisted.
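A minimal sketch of that last suggestion, assuming a made-up blocking_task function and a pool of two workers (both are illustrative choices, not taken from the question):

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_task(msg, delay=0.2):
    # Stand-in for a regular blocking function.
    time.sleep(delay)
    return "done: " + msg


def demo():
    with ThreadPoolExecutor(max_workers=2) as pool:
        future = pool.submit(blocking_task, "a")  # returns immediately
        pending = not future.done()               # the task is still running
        result = future.result()                  # block only when needed
    return pending, result


if __name__ == "__main__":
    print(demo())
```

No event loop is involved here; submit() returns a concurrent.futures.Future right away and the worker thread runs the function on its own.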
I will add one of the possible solutions that I found from the asyncio documentation.
I'm not sure that it is the correct way, but it works as expected (MainThread is not blocked by the execution of the child thread)
Running Blocking Code
Blocking (CPU-bound) code should not be called directly. For example, if a function performs a CPU-intensive calculation for 1 second, all concurrent asyncio Tasks and IO operations would be delayed by 1 second.
An executor can be used to run a task in a different thread or even in a different process to avoid blocking the OS thread with the event loop. See the loop.run_in_executor() method for more details.
Applying to my code:
import asyncio
import threading
import concurrent.futures
import multiprocessing
import time
def init():
    print("Initializing Async...")
    global loop, thread_executor_pool
    thread_executor_pool = concurrent.futures.ThreadPoolExecutor(max_workers=multiprocessing.cpu_count())
    loop = asyncio.get_event_loop()
    thread = threading.Thread(target=loop.run_forever)
    thread.start()
def submit_task(task, *args):
    loop.run_in_executor(thread_executor_pool, task, *args)
def stop():
    loop.call_soon_threadsafe(loop.stop)
    thread_executor_pool.shutdown()
def blocked_task(msg1, msg2):
    print("3. task start msg: %s, %s, thread: %s" % (msg1, msg2, threading.current_thread().name))
    time.sleep(3)
    print("4. task is done -->")
if __name__ == "__main__":
    init()
    print("1. --> submit task: %s" % threading.current_thread().name)
    submit_task(blocked_task, "a", "b")
    print("2. --> submit is done")
    stop()
Output:
Initializing Async...
1. --> submit task: MainThread
3. task start msg: a, b, thread: ThreadPoolExecutor-0_0
2. --> submit is done
4. task is done -->
Correct me if there are still any mistakes or it can be done in the other way.
Related
Two coroutines in the code below, running in different threads, cannot communicate with each other through asyncio.Queue. After the producer inserts a new item into the asyncio.Queue, the consumer cannot get this item from that asyncio.Queue; it stays blocked in await self.n_queue.get().
I tried printing the ids of the asyncio.Queue in both the consumer and the producer, and they are the same.
import asyncio
import threading
import time
class Consumer:
    def __init__(self):
        self.n_queue = None
        self._event = None
    def run(self, loop):
        loop.run_until_complete(asyncio.run(self.main()))
    async def consume(self):
        while True:
            print("id of n_queue in consumer:", id(self.n_queue))
            data = await self.n_queue.get()
            print("get data ", data)
            self.n_queue.task_done()
    async def main(self):
        loop = asyncio.get_running_loop()
        self.n_queue = asyncio.Queue(loop=loop)
        task = asyncio.create_task(self.consume())
        await asyncio.gather(task)
    async def produce(self):
        print("id of queue in producer ", id(self.n_queue))
        await self.n_queue.put("This is a notification from server")
class Producer:
    def __init__(self, consumer, loop):
        self._consumer = consumer
        self._loop = loop
    def start(self):
        while True:
            time.sleep(2)
            self._loop.run_until_complete(self._consumer.produce())
if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    print(id(loop))
    consumer = Consumer()
    threading.Thread(target=consumer.run, args=(loop,)).start()
    producer = Producer(consumer, loop)
    producer.start()
id of n_queue in consumer: 2255377743176
id of queue in producer 2255377743176
id of queue in producer 2255377743176
id of queue in producer 2255377743176
I stepped through asyncio.Queue in the debugger and found that, after self._getters.append(getter) is invoked in asyncio.Queue, the getter is inserted into self._getters. The following snippets are all from asyncio.Queue.
async def get(self):
    """Remove and return an item from the queue.
    If queue is empty, wait until an item is available.
    """
    while self.empty():
        getter = self._loop.create_future()
        self._getters.append(getter)
        try:
            await getter
        except:
            # ...
            raise
    return self.get_nowait()
When a new item is inserted into asyncio.Queue in the producer, the methods below are invoked. The variable self._getters has no items, although it has the same id in methods put() and set().
def put_nowait(self, item):
    """Put an item into the queue without blocking.
    If no free slot is immediately available, raise QueueFull.
    """
    if self.full():
        raise QueueFull
    self._put(item)
    self._unfinished_tasks += 1
    self._finished.clear()
    self._wakeup_next(self._getters)
def _wakeup_next(self, waiters):
    # Wake up the next waiter (if any) that isn't cancelled.
    while waiters:
        waiter = waiters.popleft()
        if not waiter.done():
            waiter.set_result(None)
            break
Does anyone know what's wrong with the demo code above? If the two coroutines are running in different threads, how could they communicate with each other by asyncio.Queue?
Short answer: no!
Because the asyncio.Queue needs to share the same event loop, but
An event loop runs in a thread (typically the main thread) and executes all callbacks and Tasks in its thread. While a Task is running in the event loop, no other Tasks can run in the same thread. When a Task executes an await expression, the running Task gets suspended, and the event loop executes the next Task.
see
https://docs.python.org/3/library/asyncio-dev.html#asyncio-multithreading
Even though you can pass the event loop to threads, it can be dangerous to mix the different concurrency concepts. Note, too, that passing the loop only means that you can add tasks to the loop from different threads; they will still be executed in the thread that runs the loop (here, the main thread). However, adding tasks from other threads can lead to race conditions in the event loop, because
Almost all asyncio objects are not thread safe, which is typically not a problem unless there is code that works with them from outside of a Task or a callback. If there’s a need for such code to call a low-level asyncio API, the loop.call_soon_threadsafe() method should be used
see
https://docs.python.org/3/library/asyncio-dev.html#asyncio-multithreading
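To make the quoted rule concrete, here is a small sketch in which a plain thread feeds an asyncio.Queue only through loop.call_soon_threadsafe(), while a single event loop owns the queue and runs the consumer. The function names (produce, consume, main) and the sample items are assumptions chosen for this illustration.

```python
import asyncio
import threading


async def consume(queue, count):
    # Runs inside the event loop that owns the queue.
    received = []
    for _ in range(count):
        item = await queue.get()
        received.append(item)
        queue.task_done()
    return received


def produce(loop, queue, items):
    # Runs in a plain thread: never call queue methods directly from here.
    # call_soon_threadsafe schedules put_nowait in the loop's own thread
    # and wakes the loop up so the waiting getter is served.
    for item in items:
        loop.call_soon_threadsafe(queue.put_nowait, item)


async def main():
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    items = ["n1", "n2", "n3"]
    producer = threading.Thread(target=produce, args=(loop, queue, items))
    producer.start()
    received = await consume(queue, len(items))
    producer.join()  # the producer has already finished scheduling by now
    return received


if __name__ == "__main__":
    print(asyncio.run(main()))
```

For a bounded queue, where put() may have to wait, asyncio.run_coroutine_threadsafe(queue.put(item), loop) would be the analogous call from the producer thread.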
Typically, you should not need to run async functions in different threads, because they should be IO bound and therefore a single thread should be sufficient to handle the work load. If you still have some CPU bound tasks, you are able to dispatch them to different threads and make the result awaitable using asyncio.to_thread, see https://docs.python.org/3/library/asyncio-task.html#running-in-threads.
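A minimal sketch of asyncio.to_thread (available from Python 3.9), assuming a made-up blocking_work function and a ticker coroutine that only exists to show the loop staying responsive. Because of the GIL in standard CPython, a thread mainly helps with blocking calls; this example simulates one with time.sleep.

```python
import asyncio
import time


def blocking_work(x):
    # Stand-in for a blocking call that would otherwise stall the loop.
    time.sleep(0.2)
    return x * 5


async def ticker(ticks):
    # Keeps running while blocking_work is busy in a worker thread.
    for i in range(3):
        ticks.append(i)
        await asyncio.sleep(0.05)


async def main():
    ticks = []
    result, _ = await asyncio.gather(
        asyncio.to_thread(blocking_work, 2),
        ticker(ticks),
    )
    return result, ticks


if __name__ == "__main__":
    print(asyncio.run(main()))
```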
There are many questions already about this topic, see e.g. Send asyncio tasks to loop running in other thread or How to combine python asyncio with threads?
If you want to learn more about the concurrency concepts, I recommend to read https://medium.com/analytics-vidhya/asyncio-threading-and-multiprocessing-in-python-4f5ff6ca75e8
I'm playing about with a personal project in python3.6 and I've run into the following issue which results in the my_queue.join() call blocking indefinitely. Note this isn't my actual code but a minimal example demonstrating the issue.
import threading
import queue
def foo(stop_event, my_queue):
    while not stop_event.is_set():
        try:
            item = my_queue.get(timeout=0.1)
            print(item) #Actual logic goes here
        except queue.Empty:
            pass
    print('DONE')
stop_event = threading.Event()
my_queue = queue.Queue()
thread = threading.Thread(target=foo, args=(stop_event, my_queue))
thread.start()
my_queue.put(1)
my_queue.put(2)
my_queue.put(3)
print('ALL PUT')
my_queue.join()
print('ALL PROCESSED')
stop_event.set()
print('ALL COMPLETE')
I get the following output (it's actually been consistent, but I understand that the output order may differ due to threading):
ALL PUT
1
2
3
No matter how long I wait I never see ALL PROCESSED output to the console, so why is my_queue.join() blocking indefinitely when all the items have been processed?
From the docs:
The count of unfinished tasks goes up whenever an item is added to the
queue. The count goes down whenever a consumer thread calls
task_done() to indicate that the item was retrieved and all work on it
is complete. When the count of unfinished tasks drops to zero, join()
unblocks.
You're never calling q.task_done() inside your foo function. The foo function should be something like the example:
def worker():
    while True:
        item = q.get()
        if item is None:
            break
        do_work(item)
        q.task_done()
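Applied to the code from the question, the fix could look roughly like the sketch below. The processed list and the demo wrapper are additions made for this example so the result can be inspected; the rest follows the question's structure.

```python
import queue
import threading


def foo(stop_event, my_queue, processed):
    while not stop_event.is_set():
        try:
            item = my_queue.get(timeout=0.1)
        except queue.Empty:
            continue
        try:
            processed.append(item)  # actual logic goes here
        finally:
            # Mark the item as handled so that my_queue.join() can return.
            my_queue.task_done()


def demo():
    stop_event = threading.Event()
    my_queue = queue.Queue()
    processed = []
    thread = threading.Thread(target=foo, args=(stop_event, my_queue, processed))
    thread.start()
    for item in (1, 2, 3):
        my_queue.put(item)
    my_queue.join()   # returns once task_done() was called for every item
    stop_event.set()
    thread.join()
    return processed


if __name__ == "__main__":
    print(demo())
```

Calling task_done() in a finally block keeps join() from hanging if the processing step raises.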
I have the following code:
import time
import asyncio
from concurrent.futures import ProcessPoolExecutor
def blocking_func(x):
    print("In blocking waiting")
    time.sleep(x) # Pretend this is expensive calculations
    print("after blocking waiting")
    return x * 5
@asyncio.coroutine
def main():
    executor = ProcessPoolExecutor()
    out = yield from loop.run_in_executor(executor, blocking_func, 2) # This does not
    print("after process pool")
    print(out)
if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
Output:
In blocking waiting
after blocking waiting
after process pool
10
But I was expecting the process pool to run the code in a different process, so I expected the following.
Expected output:
In blocking waiting
after process pool
after blocking waiting
10
I thought that if we run the code in a process pool, it would not block the main loop. But in the output, control came back to the main event loop only after the blocking function was done.
What is blocking the event loop? Is it blocking_func? If it is, what is the use of having the process pool?
yield from here means "wait for the coroutine to complete and return its result". Compared to the Python threading API, it is like calling join().
To get desired result, use something like this:
@asyncio.coroutine
def main():
    executor = ProcessPoolExecutor()
    task = loop.run_in_executor(executor, blocking_func, 2)
    # at this point your blocking func is already running
    # in the executor process
    print("after process pool")
    out = yield from task
    print(out)
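The same idea written with async/await, which replaced the @asyncio.coroutine / yield from style, might look like the sketch below. To keep it self-contained it passes None as the executor, which selects the loop's default thread pool; that is an assumption of this example, and a ProcessPoolExecutor instance can be passed in the same position. The events list and the delay parameter are also illustrative additions.

```python
import asyncio
import time


def blocking_func(x, delay=0.2):
    time.sleep(delay)  # pretend this is an expensive calculation
    return x * 5


async def main():
    loop = asyncio.get_running_loop()
    events = []
    # Scheduling returns a future immediately; the work starts in the executor.
    future = loop.run_in_executor(None, blocking_func, 2)
    events.append("after scheduling, done=%s" % future.done())
    # Only this await suspends main() until the result is available.
    out = await future
    events.append("result=%s" % out)
    return events


if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```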
Coroutines aren't separate processes. The difference is that coroutines need to give up control to the loop by themselves. This means that if you have a blocking coroutine, it will block the whole loop.
The reason you use coroutines is mainly to handle I/O activities. If you are waiting for a message you can simply check a socket and if nothing happens you will return to the main loop. Then other coroutines can be handled before finally the control comes back to the IO function.
In your case it makes sense to use await asyncio.sleep(x) instead of time.sleep(x). This way control is suspended from blocking_func() for the sleep time. Afterwards control goes back there and the result should be as you expected it.
More info: https://docs.python.org/3/library/asyncio.html
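To illustrate the point about giving control back to the loop, here is a small sketch that runs two coroutines together, once with await asyncio.sleep() and once with time.sleep(). The coroutine names and the timing helper are assumptions for this example, and the measured durations will vary between machines.

```python
import asyncio
import time


async def cooperative(delay):
    # Suspends this coroutine and lets the loop run something else.
    await asyncio.sleep(delay)


async def blocking(delay):
    # Never yields to the loop, so nothing else can run meanwhile.
    time.sleep(delay)


async def timed(coro_func, delay):
    start = time.perf_counter()
    await asyncio.gather(coro_func(delay), coro_func(delay))
    return time.perf_counter() - start


def demo(delay=0.2):
    cooperative_time = asyncio.run(timed(cooperative, delay))
    blocking_time = asyncio.run(timed(blocking, delay))
    return cooperative_time, blocking_time


if __name__ == "__main__":
    print("cooperative: %.2fs, blocking: %.2fs" % demo())
```

The two cooperative coroutines overlap their waits, while the two blocking ones run strictly one after the other.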
I am writing an application using python3 and am trying out asyncio for the first time. One issue I have encountered is that some of my coroutines block the event loop for longer than I like. I am trying to find something along the lines of top for the event loop that will show how much wall/cpu time is being spent running each of my coroutines. If there isn't anything already existing does anyone know of a way to add hooks to the event loop so that I can take measurements?
I have tried using cProfile which gives some helpful output, but I am more interested in time spent blocking the event loop, rather than total execution time.
The event loop can already track whether coroutines take too long to execute. To see this, enable debug mode with the set_debug method:
import asyncio
import time
async def main():
    time.sleep(1) # Block event loop
if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.set_debug(True) # Enable debug
    loop.run_until_complete(main())
In output you'll see:
Executing <Task finished coro=<main() [...]> took 1.016 seconds
By default it shows warnings for coroutines that block for more than 0.1 seconds. You can change the loop's slow_callback_duration attribute (described in the debug mode section of the asyncio documentation) to modify this threshold.
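A short sketch of adjusting that threshold, assuming a 0.05 second limit and a small list-collecting log handler (ListHandler is a helper invented for this example so the warnings can be inspected; normally they simply go to the "asyncio" logger):

```python
import asyncio
import logging
import time


class ListHandler(logging.Handler):
    # Hypothetical helper: collects formatted log messages in a list.
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


async def main():
    loop = asyncio.get_running_loop()
    loop.slow_callback_duration = 0.05  # warn above 50 ms instead of 100 ms
    time.sleep(0.2)                     # blocks the event loop


def demo():
    handler = ListHandler()
    logger = logging.getLogger("asyncio")
    logger.addHandler(handler)
    try:
        asyncio.run(main(), debug=True)  # debug mode enables the slow-callback check
    finally:
        logger.removeHandler(handler)
    return handler.messages


if __name__ == "__main__":
    for message in demo():
        print(message)
```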
You can use call_later: periodically run a callback that logs the difference between the time that actually elapsed on the loop's clock and the scheduled interval.
import asyncio
import logging
class EventLoopDelayMonitor:
    def __init__(self, loop=None, start=True, interval=1, logger=None):
        self._interval = interval
        self._log = logger or logging.getLogger(__name__)
        self._loop = loop or asyncio.get_event_loop()
        if start:
            self.start()
    def run(self):
        self._loop.call_later(self._interval, self._handler, self._loop.time())
    def _handler(self, start_time):
        latency = (self._loop.time() - start_time) - self._interval
        self._log.error('EventLoop delay %.4f', latency)
        if not self.is_stopped():
            self.run()
    def is_stopped(self):
        return self._stopped
    def start(self):
        self._stopped = False
        self.run()
    def stop(self):
        self._stopped = True
example
import time
async def main():
    EventLoopDelayMonitor(interval=1)
    await asyncio.sleep(1)
    time.sleep(2)
    await asyncio.sleep(1)
    await asyncio.sleep(1)
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
output
EventLoop delay 0.0013
EventLoop delay 1.0026
EventLoop delay 0.0014
EventLoop delay 0.0015
For anyone reading this in 2019, this might be a better answer: yappi. With Yappi version 1.2.1 or later, you can natively profile coroutines and see exactly how much wall or CPU time is spent inside a coroutine.
See here for details on this coroutine profiling.
To expand a bit on one of the answers, if you want to monitor your loop and detect hangs, here's a snippet to do just that. It launches a separate thread that checks whether the loop's tasks yielded execution recently enough.
def monitor_loop(loop, delay_handler):
    loop = loop
    last_call = loop.time()
    INTERVAL = .5 # How often to poll the loop and check the current delay.
    def run_last_call_updater():
        loop.call_later(INTERVAL, last_call_updater)
    def last_call_updater():
        nonlocal last_call
        last_call = loop.time()
        run_last_call_updater()
    run_last_call_updater()
    def last_call_checker():
        threading.Timer(INTERVAL / 2, last_call_checker).start()
        if loop.time() - last_call > INTERVAL:
            delay_handler(loop.time() - last_call)
    threading.Thread(target=last_call_checker).start()
So I want to learn using moveToThread and see the effect of calling onTimeout() of class GenericWorker from a different thread (main thread in this case). The weird thing is that the finish_sig in GenericWorker never gets emitted (should happen at the last line of onTimeout() ). Since it connects to terminate_thread() in Sender class, it should at least print out a terminate_thread in the console, but nothing happens at all.
My original purpose for using it is to emit a signal to quit the thread after onTimeout() is done. But now I can only do t.quit() from main to quit the thread.
Thank you all for spending time taking care of my question!
from PyQt4.QtCore import *
from PyQt4.QtGui import *
import threading
from time import sleep
import sys
class GenericWorker(QObject):
    finish_sig = pyqtSignal() # this one never gets emitted!
    @pyqtSlot(str, str)
    def onTimeout(self, cmd1, cmd2):
        print 'onTimeout get called from thread ID: '
        print QThread.currentThreadId()
        print 'received cmd 1: ' + cmd1
        print 'received cmd 2: ' + cmd2
        self.finish_sig.emit() # supposed to emit here!
class Sender(QObject):
    send_sig = pyqtSignal(str, str)
    terminate_sig = pyqtSignal()
    def emit_sig(self, cmd):
        print 'emit_sig thread ID: '
        print QThread.currentThreadId()
        sleep(1)
        self.send_sig.emit(cmd, '2nd_cmd')
    def terminate_thread(self):
        print 'terminate_thread'
        self.terminate_sig.emit()
if __name__ == "__main__":
    app = QApplication(sys.argv)
    print 'Main thread ID: '
    print QThread.currentThreadId()
    t = QThread()
    my_worker = GenericWorker()
    my_worker.moveToThread(t)
    t.start()
    my_sender = Sender()
    my_sender.send_sig.connect(my_worker.onTimeout)
    my_sender.terminate_sig.connect(t.quit)
    my_worker.finish_sig.connect(my_sender.terminate_thread)
    # my_worker.finish_sig.connect(t.quit)
    my_sender.emit_sig('hello')
    sleep(1)
    # my_sender.terminate_thread()
    # t.quit() # this one works
    # t.wait()
    exit(1)
    sys.exit(app.exec_())
The output:
Main thread ID:
46965006517856
emit_sig thread ID:
46965006517856
onTimeout get called from thread ID:
1111861568
received cmd 1: hello
received cmd 2: 2nd_cmd
QThread: Destroyed while thread is still running
UPDATE:
After referring to @tmoreau's and @ekhumoro's answers, there are two key problems with this code:
exit(1) is not a proper way to exit; I need to remove this line.
I don't have a way to exit the QApplication. What I need to do is add t.finished.connect(app.quit) to exit the application. (By the way, the last line, sys.exit(app.exec_()), does not seem to take care of exiting the QApplication.)
In sum, there are basically three things that I need to exit: QThread, QApplication and sys, what I missed is to exit QApplication. Let me know if my understanding is right or not...
Your issue is that you exit the program before it's complete.
my_sender.emit_sig('hello')
sleep(1)
exit(1)
sys.exit(app.exec_())
exit() ends your program, even if the thread has not finished running, hence the error:
QThread: Destroyed while thread is still running
If you remove sleep(1), you'll see the program stops even earlier:
Main thread ID:
46965006517856
emit_sig thread ID:
46965006517856
QThread: Destroyed while thread is still running
Here's more or less what's happening in parallel:
# main thread                        # worker thread
my_sender.emit_sig('hello')          # slot onTimeout is called
sleep(1)                             # print "onTimeout get called..."
exit(1)                              # emit finish_sig
sys.exit(app.exec_())
# slot terminate_thread is called    # thread ends (t.quit)
If you remove exit(1), your program will work, because you create an event loop with app.exec_(). The event loop means your program is always waiting to catch signals, and will not stop even if there's nothing left to do. So the thread has plenty of time to end :)
In Qt, you usually stop the event loop by closing your main window. Therefore, a cleaner way to implement your thread is:
class window(QWidget):
    def __init__(self,parent=None):
        super(window,self).__init__(parent)
        t=QThread(self)
        self.my_worker = GenericWorker()
        self.my_worker.moveToThread(t)
        t.start()
        self.my_sender = Sender()
        self.my_sender.send_sig.connect(self.my_worker.onTimeout)
        self.my_sender.terminate_sig.connect(t.quit)
        self.my_worker.finish_sig.connect(self.my_sender.terminate_thread)
        self.my_sender.emit_sig('hello')
if __name__ == "__main__":
    app = QApplication(sys.argv)
    win=window()
    win.show()
    sys.exit(app.exec_())
You need self to keep a reference to the thread and classes. Otherwise they are destroyed when __init__ ends.