asyncio and threading: why is the thread id always the same? - python-3.x

With the simplest example of a pure TCP asyncio server I could write, I want to get the thread id of the current thread. Because I'm in an async coroutine, I thought this would run in a different thread (especially with the asyncio library). But the result always prints the same id value. What am I missing? Is it the wrong function call? Does asyncio not create a new thread?
import asyncio
import threading
from asyncio import StreamWriter, StreamReader

HOST = '127.0.0.1'
PORT = 7070

async def handle(reader: StreamReader, writer: StreamWriter):
    print(f"{threading.get_native_id()=} / {threading.get_ident()=}")
    writer.close()

async def main():
    server = await asyncio.start_server(handle, HOST, PORT)
    async with server:
        await server.serve_forever()

asyncio.run(main())

asyncio works in a single OS thread. Basically it's all about the event loop and coroutines being run by that event loop. asyncio applies the concept of cooperative multitasking: a coroutine itself decides when to give control back to the event loop.
As for multithreading, I suggest you read this article about the GIL. Because of the GIL, running multiple CPU-bound threads will not give you any performance gain. That's why the key to performance (mostly with I/O-bound tasks) is to use things like gevent/asyncio, which manage the "switching between tasks" themselves (i.e. the OS scheduler is not involved).
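For a quick check of the single-thread claim, here is a minimal sketch (the names are illustrative, and asyncio.to_thread needs Python 3.9+): the two coroutines report the same thread id, while the offloaded blocking call reports a different one.

import asyncio
import threading

async def coro(name):
    # Every coroutine runs on the event loop's single thread.
    print(name, threading.get_ident())

def blocking():
    # Offloaded work runs on a worker thread from the default executor.
    print("worker", threading.get_ident())

async def main():
    await asyncio.gather(coro("a"), coro("b"))  # same thread id twice
    await asyncio.to_thread(blocking)           # a different thread id

asyncio.run(main())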

Related

Why does the aiohttp request get stuck if a lock is used?

Why does this code:
import asyncio
import time
from multiprocessing import Pool, Manager
from threading import Thread, Lock

from aiohttp import ClientSession

async def test(s: ClientSession, lock: Lock, identifier):
    print(f'before acquiring {identifier}')
    lock.acquire()
    print(f'before request {identifier}')
    async with s.get('http://icanhazip.com') as r:
        print(f'after request {identifier}')
    lock.release()
    print(f'after releasing {identifier}')

async def main(lock: Lock):
    async with ClientSession() as s:
        await asyncio.gather(test(s, lock, 1), test(s, lock, 2))

def run(lock: Lock):
    asyncio.run(main(lock))

if __name__ == '__main__':
    # Thread(target=run, args=[Lock()]).start()
    with Pool(processes=1) as pool:
        pool.map(run, [Manager().Lock()])
prints:
before acquiring 1
before request 1
before acquiring 2
and then gets stuck? Why is the request with identifier 1 not being executed? The same happens with Thread (commented out). I tried it with requests instead of aiohttp and it worked.
This is happening because you are mixing synchronous locks, which block an entire thread of execution, with asyncio, which requires all operations to be non-blocking. Both of your coroutines (the two calls to test) are running in the same thread, so when the second coroutine attempts to take the lock and blocks, it also blocks the first coroutine (which holds the lock) from making any further progress.
You can fix this by using an asyncio.Lock instead. It will only suspend the coroutine waiting on the lock, rather than blocking the entire thread. Note that this lock can't be passed between processes, though, so it will not work unless you stop using multiprocessing, which is not actually necessary in your example code above. You create a single lock that you only use in a single child process, so you could simply create the asyncio.Lock in the child process without any loss of functionality.
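Here is a minimal sketch of that fix, reusing the names and URL from the question, with the asyncio.Lock created inside the running event loop:

import asyncio

from aiohttp import ClientSession

async def test(s: ClientSession, lock: asyncio.Lock, identifier):
    print(f'before acquiring {identifier}')
    async with lock:  # suspends only this coroutine, not the whole thread
        print(f'before request {identifier}')
        async with s.get('http://icanhazip.com') as r:
            print(f'after request {identifier}')
    print(f'after releasing {identifier}')

async def main():
    lock = asyncio.Lock()  # created while the event loop is running
    async with ClientSession() as s:
        await asyncio.gather(test(s, lock, 1), test(s, lock, 2))

asyncio.run(main())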
However, if your actual use-case requires an asyncio-friendly lock that can also be shared between processes you can use aioprocessing for that (full disclosure: I am the author of aioprocessing).

Blocking and non-blocking calls on the server side: why does it matter to an asynchronous client?

Experimenting with some asynchronous code in Python 3.8.0, I stumbled on the following situation. I have client.py, which can handle connections asynchronously with a server in server.py. This server pretends to do some work, but actually sleeps for some seconds and then returns. My question is: since the server is running in a completely different process, why does it matter whether its sleep method is blocking or not? And if the server side may be non-blocking anyway, what is the benefit of making asynchronous calls like these in the first place?
# client.py
import time
import asyncio

import aiohttp

async def request_coro(url, session):
    async with session.get(url) as response:
        return await response.read()

async def concurrent_requests(number, url='http://localhost:8080'):
    tasks = []
    async with aiohttp.ClientSession() as session:
        for n in range(number):
            # Schedule the tasks
            task = asyncio.create_task(request_coro(url, session))
            tasks.append(task)
        # returns when all tasks are completed
        return await asyncio.gather(*tasks)

t0 = time.time()
responses = asyncio.run(concurrent_requests(10))
elapsed_concurrent = time.time() - t0
sum_sleeps = sum((int(i) for i in responses))
print(f'{elapsed_concurrent=:.2f} and {sum_sleeps=:.2f}')
# server.py
import time
import random
import logging
import asyncio

from aiohttp import web

random.seed(10)

async def index(requests):
    # Introduce some latency at the server side
    sleeps = random.randint(1, 3)
    # NON-BLOCKING
    # await asyncio.sleep(sleeps)
    # BLOCKING
    time.sleep(sleeps)
    return web.Response(text=str(sleeps))

app = web.Application()
app.add_routes([web.get('/', index),
                web.get('/index', index)])

logging.basicConfig(level=logging.DEBUG)
web.run_app(app, host='localhost', port=8080)
These are the results from 10 asynchronous calls by the client using either the blocking or the non-blocking sleep methods:
asyncio.sleep (non-blocking)
elapsed_concurrent=3.02 and sum_sleeps=19.00
time.sleep (blocking)
elapsed_concurrent=19.04 and sum_sleeps=19.00
Although the server is running in a completely different process, it cannot take multiple active connections at the same time the way a multi-threaded server can. The client and the server are both working asynchronously, each with its own event loop.
The server can only accept new connections from the client while its event loop is suspended in a non-blocking sleep. That makes it appear multi-threaded, when it is actually rapidly alternating between the available connections. A blocking sleep makes the requests sequential, because it blocks the entire event loop, which then cannot handle new connections in the meantime.
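If the server really does have to run blocking work, one way to keep its loop responsive (a sketch, not part of the question's code) is to push the blocking call into an executor:

import asyncio
import time

from aiohttp import web

async def index(request):
    loop = asyncio.get_running_loop()
    # The blocking sleep runs in the default ThreadPoolExecutor, so the
    # event loop stays free to accept other connections meanwhile.
    await loop.run_in_executor(None, time.sleep, 2)
    return web.Response(text='2')

app = web.Application()
app.add_routes([web.get('/', index)])
web.run_app(app, host='localhost', port=8080)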

ValueError when asyncio.run() is called in separate thread

I have a network application which is listening on multiple sockets.
To handle each socket individually, I use Python's threading.Thread module.
These sockets must be able to run tasks on packet reception without delaying any further packet reception from the socket handling thread.
To do so, I've declared the method(s) that are running the previously mentioned tasks with the keyword async so I can run them asynchronously with asyncio.run(my_async_task(my_parameters)).
I have tested this approach on a single socket (running on the main thread) with great success.
But when I use multiple sockets (each one with its own handler thread), the following exception is raised:
ValueError: set_wakeup_fd only works in main thread
My question is the following: is asyncio the appropriate tool for what I need? If it is, how do I run an async method from a thread that is not the main thread?
Most of my search results mention "event loops" and "awaiting" async results, which (if I understand them correctly) is not what I am looking for.
I am talking about sockets in this question to provide context but my problem is mostly about the behaviour of asyncio in child threads.
I can, if needed, write a short code sample to reproduce the error.
Thank you for the help!
Edit1, here is a minimal reproducible code example:
import asyncio
import threading
import time

# Handle a specific packet from any socket without interrupting the listening thread
async def handle_it(val):
    print("handled: {}".format(val))

# A class to simulate a threaded socket listener
class MyFakeSocket(threading.Thread):
    def __init__(self, val):
        threading.Thread.__init__(self)
        self.val = val  # Value for a fake received packet

    def run(self):
        for i in range(10):
            # The (fake) socket will sequentially receive [val, val+1, ... val+9]
            asyncio.run(handle_it(self.val + i))
            time.sleep(0.5)

# Entry point
sockets = MyFakeSocket(0), MyFakeSocket(10)
for socket in sockets:
    socket.start()
This is possibly related to the bug discussed here: https://bugs.python.org/issue34679
If so, this is a problem with Python 3.8 on Windows. To work around it, you could try either downgrading to Python 3.7, or skipping asyncio.run() and driving the event loop manually in each thread, like:
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
loop.run_until_complete(<your tasks>)
loop.close()
Otherwise, would you be able to run the code in a Docker container? This might work for you and would then be detached from the OS behaviour, but it is a lot more work!
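For reference, here is that workaround applied to the reproducible example (a sketch: one loop per thread, created and closed inside the thread itself):

import asyncio
import threading
import time

async def handle_it(val):
    print("handled: {}".format(val))

class MyFakeSocket(threading.Thread):
    def __init__(self, val):
        threading.Thread.__init__(self)
        self.val = val

    def run(self):
        # Each thread owns exactly one event loop for its whole lifetime.
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        try:
            for i in range(10):
                loop.run_until_complete(handle_it(self.val + i))
                time.sleep(0.5)
        finally:
            loop.close()

for socket in (MyFakeSocket(0), MyFakeSocket(10)):
    socket.start()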

how to use asyncio with boost.python?

Is it possible to use the Python 3 asyncio package with the Boost.Python library?
I have a CPython C++ extension built with Boost.Python, and the functions written in C++ can run for a really long time. I want to use asyncio to call these functions, but res = await cpp_function() doesn't work.
What happens when cpp_function is called inside a coroutine?
How do I avoid being blocked by a C++ function that runs for a very long time?
NOTE: the C++ code doesn't do any I/O, just calculations.
What happens when cpp_function is called inside a coroutine?
If you call a long-running Python/C function inside any of your coroutines, it freezes your event loop (and thereby freezes all coroutines everywhere). You should avoid this situation.
How do I avoid being blocked by a C++ function that runs for a very long time?
You should use run_in_executor to run your function in a separate thread or process. run_in_executor returns an awaitable that you can await.
You'll probably need a ProcessPoolExecutor because of the GIL (I'm not sure if a ThreadPoolExecutor is an option in your situation, but I advise you to check it).
Here's an example of awaiting long-running code:
import asyncio
import time
from concurrent.futures import ProcessPoolExecutor

def blocking_function():
    # Function with long-running C/Python code.
    time.sleep(3)
    return True

async def main():
    # Awaiting execution in another process
    # doesn't block your event loop:
    loop = asyncio.get_event_loop()
    res = await loop.run_in_executor(executor, blocking_function)

if __name__ == '__main__':
    executor = ProcessPoolExecutor(max_workers=1)  # Prepare your executor somewhere.

    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
        loop.close()

Python thread never starts if run() contains yield from

Python 3.4: I'm trying to make a server using the websockets module (I was previously using regular sockets but wanted to add a JavaScript client) when I ran into an issue: it expects async code (at least if the examples are to be trusted), which I hadn't used before. Threading simply does not work. If I run the following code, bar will never be printed, whereas if I comment out the line with yield from, it works as expected. So yield is probably doing something I don't quite understand, but why is it never even executed? Should I install Python 3.5?
import threading

class SampleThread(threading.Thread):
    def __init__(self):
        super(SampleThread, self).__init__()
        print("foo")

    def run(self):
        print("bar")
        yield from var2

thread = SampleThread()
thread.start()
This is not the correct way to handle multithreading. Because of the yield from, run has become a generator function: calling it merely creates a generator object without executing its body, which is why "bar" is never printed. run must be a plain method, neither a generator nor a coroutine. It should also be noted that the asyncio event loop is only defined for the main thread. Any call to asyncio.get_event_loop() in a new thread (without first setting one with asyncio.set_event_loop()) will throw an exception.
Before looking at running the event loop in a new thread, you should first analyze whether you really need the event loop running in its own thread. The loop has a built-in thread pool executor at loop.run_in_executor(). This takes a pool from concurrent.futures (either a ThreadPoolExecutor or a ProcessPoolExecutor) and provides a non-blocking way of running blocking functions directly from the loop object. As such, the results can be await-ed (with Python 3.5 syntax).
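For completeness, a minimal run_in_executor sketch (passing None selects the loop's default executor; blocking() is illustrative):

import asyncio
import time

def blocking():
    time.sleep(1)
    return 42

async def main():
    loop = asyncio.get_event_loop()
    # blocking() runs in the default ThreadPoolExecutor, so awaiting it
    # does not block the event loop.
    result = await loop.run_in_executor(None, blocking)
    print(result)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())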
That being said, if you want to run your event loop from another thread, you can do it like this:
import asyncio
import threading

class LoopThread(threading.Thread):
    def __init__(self):
        super().__init__()
        self.loop = asyncio.new_event_loop()

    def run(self):
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()

    def stop(self):
        self.loop.call_soon_threadsafe(self.loop.stop)
From here, you still need to devise a thread-safe way of creating tasks, etc. Some of the code in this thread is usable, although I did not have a lot of success with it: python asyncio, how to create and cancel tasks from another thread
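Assuming the LoopThread above, a usage sketch: asyncio.run_coroutine_threadsafe is the standard thread-safe way to submit a coroutine to a loop running in another thread, and it returns a concurrent.futures.Future you can block on from the calling thread.

import asyncio

async def work():
    await asyncio.sleep(0.1)
    return "done"

t = LoopThread()  # the class defined above
t.start()

# Schedule the coroutine on the other thread's loop, thread-safely.
future = asyncio.run_coroutine_threadsafe(work(), t.loop)
print(future.result())  # blocks until the coroutine finishes

t.stop()
t.join()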
