Why does the aiohttp request get stuck if a lock is used? - python-3.x

Why does this code:
import asyncio
import time
from multiprocessing import Pool, Manager
from threading import Thread, Lock

from aiohttp import ClientSession

async def test(s: ClientSession, lock: Lock, identifier):
    print(f'before acquiring {identifier}')
    lock.acquire()
    print(f'before request {identifier}')
    async with s.get('http://icanhazip.com') as r:
        print(f'after request {identifier}')
    lock.release()
    print(f'after releasing {identifier}')

async def main(lock: Lock):
    async with ClientSession() as s:
        await asyncio.gather(test(s, lock, 1), test(s, lock, 2))

def run(lock: Lock):
    asyncio.run(main(lock))

if __name__ == '__main__':
    # Thread(target=run, args=[Lock()]).start()
    with Pool(processes=1) as pool:
        pool.map(run, [Manager().Lock()])
prints:
before acquiring 1
before request 1
before acquiring 2
and then gets stuck. Why is the request with identifier 1 never executed? The same happens with Thread (the commented-out line). I tried the same thing with requests and it works.

This is happening because you are mixing synchronous locks, which block an entire thread of execution, with asyncio, which requires all operations to be non-blocking. Both of your coroutines (the two calls to test) run in the same thread, so when the second coroutine blocks on acquiring the lock, it also blocks the first coroutine (which holds the lock) from making any further progress.
You can fix this by using an asyncio.Lock instead. It will only block the coroutine waiting on the lock, rather than blocking the entire thread. Note that this lock can't be passed between processes, though, so it will not work unless you stop using multiprocessing, which is not actually necessary in your example code above. You just create a single lock that you only use in a single child process, so you could simply create the asyncio.Lock in the child process without any loss of functionality.
However, if your actual use-case requires an asyncio-friendly lock that can also be shared between processes you can use aioprocessing for that (full disclosure: I am the author of aioprocessing).
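For reference, here is a minimal sketch of that fix, under the assumption that multiprocessing is dropped and the asyncio.Lock is created inside the process running the event loop:

import asyncio

from aiohttp import ClientSession

async def test(s: ClientSession, lock: asyncio.Lock, identifier):
    print(f'before acquiring {identifier}')
    # Waiting on an asyncio.Lock suspends only this coroutine;
    # the other coroutine keeps running on the same thread.
    async with lock:
        print(f'before request {identifier}')
        async with s.get('http://icanhazip.com') as r:
            print(f'after request {identifier}')
    print(f'after releasing {identifier}')

async def main():
    lock = asyncio.Lock()  # created where the event loop runs
    async with ClientSession() as s:
        await asyncio.gather(test(s, lock, 1), test(s, lock, 2))

if __name__ == '__main__':
    asyncio.run(main())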

Related

Does a loop.run_in_executor function need asyncio.Lock() or threading.Lock()?

I copied the following code for my project and it has worked quite well for me, but I don't really understand how it runs my blocking_function:
@client.event
async def on_message(message):
    loop = asyncio.get_event_loop()
    block_response = await loop.run_in_executor(ThreadPoolExecutor(), blocking_function)
where on_message is called every time I receive a message. If I receive multiple messages, they are processed asynchronously.
blocking_function is a synchronous function that I don't want to run while another blocking_function is running. Within blocking_function, should I use threading.Lock() or asyncio.Lock()?
As pointed out by dirn in the comment, in blocking_function you cannot use an asyncio.Lock because it's just not async. (The opposite also applies: you cannot lock a threading.Lock from an async function because attempting to do so would block the event loop.) If you need to guard data accessed by other instances of blocking_function, you should use a threading.Lock.
but I don't really understand how the following code runs my blocking_function
It hands blocking_function off to the thread pool you created. The thread pool queues and runs the function (which happens "in the background" from your perspective), and run_in_executor arranges for the event loop to be notified when the function is done, handing off its return value as the result of the await expression.
Note that you should use None as the first argument of run_in_executor. If you use ThreadPoolExecutor(), you create a whole new thread pool for each message, and you never dispose of it. A thread pool is normally meant to be created once, and reuse a fixed number ("pool") of threads for subsequent work. None tells asyncio to use the thread pool it creates for this purpose.
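A hedged sketch combining both points (client and the body of blocking_function are assumed from the question, not defined here):

import asyncio
import threading

# threading.Lock is the right primitive: blocking_function runs in a
# worker thread, not in the event loop.
lock = threading.Lock()

def blocking_function():
    with lock:
        ...  # the work that must not overlap with other runs

@client.event
async def on_message(message):
    loop = asyncio.get_event_loop()
    # None reuses asyncio's default thread pool instead of creating a
    # fresh ThreadPoolExecutor for every message.
    block_response = await loop.run_in_executor(None, blocking_function)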
A simple alternative is to ensure that all calls to blocking_function run on a single thread. You can do this by creating a ThreadPoolExecutor with one worker outside of the async function; every call to the blocking function will then be queued and run on that single thread:
thread_pool = ThreadPoolExecutor(max_workers=1)

@client.event
async def on_message(message):
    loop = asyncio.get_event_loop()
    block_response = await loop.run_in_executor(thread_pool, blocking_function)
Don't forget to shut down the thread pool afterwards.
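For instance, at a clean exit point of the program:

# Release the single worker thread once no more messages will arrive:
thread_pool.shutdown(wait=True)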

asyncio and threading: why is the thread id always the same?

With the simplest example of a pure TCP asyncio server I could write, I want to get the thread id of the current thread. Because I'm in an async coroutine, I thought this would be in a different thread (especially with the asyncio library). But the result always prints the same id value. What am I missing? Is it the wrong function call? Does asyncio not create a new thread?
import asyncio
import threading
from asyncio import StreamWriter, StreamReader

HOST = '127.0.0.1'
PORT = 7070

async def handle(reader: StreamReader, writer: StreamWriter):
    print(f"{threading.get_native_id()=} / {threading.get_ident()=}")
    writer.close()

async def main():
    server = await asyncio.start_server(handle, HOST, PORT)
    async with server:
        await server.serve_forever()

asyncio.run(main())
The asyncio library works in a single OS thread. Basically it's all about the event loop and coroutines being run by that event loop. asyncio applies the concept of cooperative multitasking: a coroutine itself decides when to hand control back to the event loop.
As for multithreading, I suggest you read this article about the GIL. Because of the GIL, running multiple threads will not give you any performance gain for CPU-bound work. That's why the key to a performance gain (mostly with I/O-bound tasks) is to use things like gevent/asyncio, which manage the switching between tasks themselves instead of relying on the OS scheduler.
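A standalone sketch (not part of the original server) that makes the single-threaded, cooperative model visible:

import asyncio
import threading

async def worker(name):
    # Both workers print the same thread id: they share one OS thread
    # and take turns, yielding control at each await point.
    print(name, 'running in thread', threading.get_ident())
    await asyncio.sleep(0)  # hand control back to the event loop
    print(name, 'resumed in thread', threading.get_ident())

async def main():
    await asyncio.gather(worker('a'), worker('b'))

asyncio.run(main())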

Synchronization issue in Python

I am trying to understand synchronization and have the following code using a reentrant lock:
import threading
from time import sleep, ctime, time

class show:
    lock = threading.RLock()

    def __init__(self):
        self.x = 0

    def increment(self):
        show.lock.acquire()
        print("x=", self.x)
        # show.lock.acquire()
        self.x += 1
        show.lock.release()

class mythread(threading.Thread):
    def __init__(self, aa):
        super().__init__(group=None)
        self.obj = aa

    def run(self):
        for i in range(0, 100):
            self.obj.increment()

ss = show()
ss1 = show()
one = mythread(ss)
two = mythread(ss)
one.start()
two.start()
Now if I run the code as above, things work fine and I get output from 0 to 199. But if I uncomment the line where we reacquire the lock, the output only goes from 0 to 99. Why does this change? How does reacquiring the lock change the output?
After uncommenting, one of the threads is blocked by the other, which still holds a hundred unreleased acquisitions of the lock on class show even after terminating. You should always match the number of acquires and releases, even when using recursive (aka reentrant) locks.
Check Wikipedia or the docs for the RLock definition. The latter says:
To unlock the lock, a thread calls its release() method. acquire()/release() call pairs may be nested; only the final release() (the release() of the outermost pair) resets the lock to unlocked and allows another thread blocked in acquire() to proceed.
To avoid issues with missing lock releases, I recommend using a context manager:
def increment(self):
    with show.lock:
        print("x=", self.x)
        self.x += 1
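For completeness, a small sketch of what legitimate nesting looks like with an RLock, matching the docs quote above (the function names are made up for illustration):

import threading

lock = threading.RLock()

def inner():
    with lock:  # nested acquisition by the same thread: fine for an RLock
        pass

def outer():
    with lock:   # outermost acquisition
        inner()  # re-enters the same lock without deadlocking
    # only after this outermost release can another thread acquire the lock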

how to use asyncio with boost.python?

Is it possible to use the Python 3 asyncio package with the Boost.Python library?
I have a CPython C++ extension built with Boost.Python, and the functions written in C++ can run for a really long time. I want to use asyncio to call these functions, but res = await cpp_function() code doesn't work.
What happens when cpp_function is called inside a coroutine?
How do I avoid being blocked by a C++ function that runs for a very long time?
NOTE: the C++ code doesn't do any I/O operations, just calculations.
What happens when cpp_function is called inside a coroutine?
If you call a long-running Python/C function inside any of your coroutines, it freezes your event loop (freezing all coroutines everywhere).
You should avoid this situation.
How do I avoid being blocked by a C++ function that runs for a very long time?
You should use run_in_executor to run your function in a separate thread or process. run_in_executor returns an awaitable that you can await.
You'll probably need ProcessPoolExecutor because of the GIL (I'm not sure whether ThreadPoolExecutor is an option in your situation, but I advise you to check it).
Here's an example of awaiting long-running code:
import asyncio
from concurrent.futures import ProcessPoolExecutor
import time

def blocking_function():
    # Function with long-running C/Python code.
    time.sleep(3)
    return True

async def main():
    # Awaiting execution in another process
    # doesn't block your event loop:
    loop = asyncio.get_event_loop()
    res = await loop.run_in_executor(executor, blocking_function)

if __name__ == '__main__':
    executor = ProcessPoolExecutor(max_workers=1)  # Prepare your executor somewhere.
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
        loop.close()
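As a side note (my assumption about the Python version, not part of the original answer): on Python 3.7+ the event-loop boilerplate at the bottom can be condensed with asyncio.run, reusing blocking_function from above:

import asyncio
from concurrent.futures import ProcessPoolExecutor

async def main():
    loop = asyncio.get_running_loop()
    # The executor is shut down automatically when the 'with' block exits.
    with ProcessPoolExecutor(max_workers=1) as executor:
        res = await loop.run_in_executor(executor, blocking_function)
        print(res)

asyncio.run(main())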

Python thread never starts if run() contains yield from

Python 3.4: I'm trying to make a server using the websockets module (I was previously using regular sockets but wanted to make a JavaScript client) when I ran into an issue (it expects async code, at least if the examples are to be trusted, which I hadn't used before). Threading simply does not work: if I run the following code, bar is never printed, whereas if I comment out the line with yield from, it works as expected. So yield is probably doing something I don't quite understand, but why is it never even executed? Should I install Python 3.5?
import threading

class SampleThread(threading.Thread):
    def __init__(self):
        super(SampleThread, self).__init__()
        print("foo")

    def run(self):
        print("bar")
        yield from var2

thread = SampleThread()
thread.start()
This is not the correct way to handle multithreading. Because run contains a yield from, Python treats it as a generator function: calling it merely creates a generator object without executing its body, which is why bar is never printed. Thread.run must be a plain method, neither a generator nor a coroutine. It should also be noted that the asyncio event loop is only defined for the main thread: any call to asyncio.get_event_loop() in a new thread (without first setting it with asyncio.set_event_loop()) will throw an exception.
Before looking at running the event loop in a new thread, you should first analyze whether you really need the event loop running in its own thread. The loop has a built-in bridge to thread pools: loop.run_in_executor(). It takes a pool from concurrent.futures (either a ThreadPoolExecutor or a ProcessPoolExecutor) and provides a non-blocking way of running functions in threads or processes directly from the loop object. The results can then be awaited (with Python 3.5 syntax).
That being said, if you want to run your event loop from another thread, you can do it like this:
import asyncio
import threading

class LoopThread(threading.Thread):
    def __init__(self):
        super().__init__()
        self.loop = asyncio.new_event_loop()

    def run(self):
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()

    def stop(self):
        self.loop.call_soon_threadsafe(self.loop.stop)
From here, you still need to devise a thread-safe way of creating tasks, etc. Some of the code in this thread is usable, although I did not have a lot of success with it: python asyncio, how to create and cancel tasks from another thread
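For the task-creation part, one standard primitive is asyncio.run_coroutine_threadsafe. A minimal sketch using the LoopThread class above (some_coro is a hypothetical example coroutine):

import asyncio

async def some_coro():
    return 42

t = LoopThread()
t.start()

# Submit a coroutine to the loop running in the other thread; this
# returns a concurrent.futures.Future usable from the calling thread.
future = asyncio.run_coroutine_threadsafe(some_coro(), t.loop)
print(future.result())  # blocks until the coroutine finishes

t.stop()
t.join()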
