Python: concurrently pending on async coroutine and synchronous function

I'd like to establish an SSH SOCKS tunnel (using asyncssh) during the execution of a synchronous function. When the function is done, I want to tear down the tunnel and exit.
Apparently some async function has to be awaited to keep the tunnel working, so the important thing is that conn.wait_closed() and the synchronous function are executed concurrently. So I am quite sure that I actually need a second thread.
I first tried some saner things using a ThreadPoolExecutor with run_in_executor, but then ended up with the abysmal multithreaded variant below.
#! /usr/bin/env python3
import traceback
from threading import Thread
from concurrent.futures import ThreadPoolExecutor
import asyncio, asyncssh, sys
_server="127.0.0.1"
_port=22
_proxy_port=8080
async def run_client():
    conn = await asyncio.wait_for(
        asyncssh.connect(
            _server,
            port=_port,
            options=asyncssh.SSHClientConnectionOptions(client_host_keysign=True),
        ),
        10,
    )
    listener = await conn.forward_socks('127.0.0.1', _proxy_port)
    return conn
async def do_stuff(func):
    try:
        conn = await run_client()
        print("SSH tunnel active")
        def start_loop(loop):
            asyncio.set_event_loop(loop)
            try:
                loop.run_forever()
            except Exception as e:
                print(f"worker loop: {e}")
        async def thread_func():
            ret=await func()
            print("Func done - tearing done worker thread and SSH connection")
            conn.close()
            # asyncio.get_event_loop().stop()
            return ret
        func_loop = asyncio.new_event_loop()
        func_thread = Thread(target=start_loop, args=(func_loop,))
        func_thread.start()
        print("thread started")
        fut = asyncio.run_coroutine_threadsafe(thread_func(), func_loop)
        print(f"fut scheduled: {fut}")
        done = await asyncio.gather(asyncio.wrap_future(fut), conn.wait_closed())
        print("wait done")
        for ret in done:
            print(f"ret={ret}")
        # Canceling pending tasks and stopping the loop
        # asyncio.gather(*asyncio.Task.all_tasks()).cancel()
        print("stopping func_loop")
        func_loop.call_soon_threadsafe(func_loop.stop())
        print("joining func_thread")
        func_thread.join()
        print("joined func_thread")
    except (OSError, asyncssh.Error) as exc:
        sys.exit('SSH connection failed: ' + str(exc))
    except (Exception) as exc:
        sys.exit('Unhandled exception: ' + str(exc))
        traceback.print_exc()
async def just_wait():
    print("starting just_wait")
    input()
    print("ending just_wait")
    return 42
asyncio.get_event_loop().run_until_complete(do_stuff(just_wait))
It actually "works" "correctly" until the end, where I get an exception while joining the worker thread. I presume this is because something I do is not thread-safe.
Exception in callback None()
handle: <Handle>
Traceback (most recent call last):
  File "/usr/lib/python3.7/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
TypeError: 'NoneType' object is not callable
To test the code you must have a local SSH server running with key files set up for your user. You may want to change the _port variable.
I am looking for the reason for the exception and/or a version of the program that requires less manual intervention in the threading and possibly uses just a single event loop. I don't know how to achieve the latter when I want to await the two things (as in the asyncio.gather call).

The immediate cause of your error is this line:
# incorrect
func_loop.call_soon_threadsafe(func_loop.stop())
The intention is to call func_loop.stop() in the thread that runs the func_loop event loop. But as written, it invokes func_loop.stop() in the current thread and passes its return value (None) to call_soon_threadsafe as the function to invoke. When the loop later tries to run that callback, it fails with TypeError: 'NoneType' object is not callable, which is the error in your traceback. To fix the immediate problem, drop the extra parentheses and pass the method itself:
# correct
func_loop.call_soon_threadsafe(func_loop.stop)
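As a minimal sketch of the difference (the helper names run_loop and stop_worker_loop are made up for illustration and are not part of the question's code), the following starts an event loop in a worker thread and stops it by handing call_soon_threadsafe the bound method loop.stop rather than the result of calling it:

```python
import asyncio
import threading


def run_loop(loop):
    # Runs in the worker thread until loop.stop() is executed inside the loop.
    asyncio.set_event_loop(loop)
    loop.run_forever()


def stop_worker_loop():
    loop = asyncio.new_event_loop()
    worker = threading.Thread(target=run_loop, args=(loop,))
    worker.start()
    # Pass the callable itself (no parentheses); the loop calls it in its own thread.
    loop.call_soon_threadsafe(loop.stop)
    worker.join(timeout=5)
    stopped = not worker.is_alive()
    loop.close()
    return stopped


stopped = stop_worker_loop()
print(f"worker loop stopped: {stopped}")
```

With the parentheses restored, loop.stop() would run in the calling thread and None would be queued as the callback, which is what produced the traceback in the question.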
However, the code is definitely over-complicated as written:
It doesn't make sense to create a new event loop when you are already inside an event loop.
just_wait shouldn't be async def, since it doesn't await anything and its input() call blocks whichever event loop runs it.
sys.exit accepts a string (it prints the message to stderr and exits with status 1), but nothing after it runs, so the traceback.print_exc() call placed after sys.exit is unreachable. Print the traceback before exiting.
To run a non-async function from asyncio, pass it as-is to run_in_executor. You need neither an extra thread nor an extra event loop: run_in_executor takes care of the thread and connects it with your current event loop, effectively making the sync function awaitable. For example (untested):
async def do_stuff(func):
    conn = await run_client()
    print("SSH tunnel active")
    loop = asyncio.get_event_loop()
    ret = await loop.run_in_executor(None, func)
    print(f"ret={ret}")
    conn.close()
    await conn.wait_closed()
    print("wait done")
def just_wait():
    # just_wait is a regular function; it can call blocking code,
    # but it cannot await
    print("starting just_wait")
    input()
    print("ending just_wait")
    return 42
asyncio.get_event_loop().run_until_complete(do_stuff(just_wait))
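To see that the event loop really keeps running while the blocking function executes, here is a small self-contained sketch that needs no SSH server. The names blocking_work and heartbeat are illustrative stand-ins: blocking_work plays the role of the synchronous function, and heartbeat plays the role of whatever asyncssh needs to do in the background to keep the tunnel alive.

```python
import asyncio
import time


def blocking_work():
    # Stand-in for the synchronous function; blocks its thread for 0.2 s.
    time.sleep(0.2)
    return 42


async def heartbeat(ticks):
    # Stand-in for background work that must keep running on the event loop.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.05)


async def main():
    ticks = []
    beat = asyncio.ensure_future(heartbeat(ticks))
    loop = asyncio.get_running_loop()
    # The blocking call runs in the default thread pool; the loop stays free.
    ret = await loop.run_in_executor(None, blocking_work)
    beat.cancel()
    try:
        await beat
    except asyncio.CancelledError:
        pass
    return ret, len(ticks)


ret, tick_count = asyncio.run(main())
print(f"ret={ret}, heartbeat ran {tick_count} times while blocked")
```

The heartbeat ticking several times during the 0.2 s sleep is the property the question was after: the awaitable side and the synchronous side make progress concurrently, with a single event loop and no hand-managed thread.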
If you need to await things in just_wait, you can make it async and use run_in_executor for the actual blocking code inside it:
async def do_stuff():
    conn = await run_client()
    print("SSH tunnel active")
    ret = await just_wait()
    print(f"ret={ret}")
    conn.close()
    await conn.wait_closed()
    print("wait done")
async def just_wait():
    # just_wait is an async function, it can await, but
    # must invoke blocking code through run_in_executor
    print("starting just_wait")
    loop = asyncio.get_event_loop()
    await loop.run_in_executor(None, input)
    print("ending just_wait")
    return 42
asyncio.run(do_stuff())

Related

Python async: Waiting for stdin input while doing other stuff

I'm trying to create a WebSocket command line client that waits for messages from a WebSocket server but waits for user input at the same time.
Regularly polling multiple online sources every second works fine on the server, (the one running at localhost:6789 in this example), but instead of using Python's normal sleep() method, it uses asyncio.sleep(), which makes sense because sleeping and asynchronously sleeping aren't the same thing, at least not under the hood.
Similarly, waiting for user input and asynchronously waiting for user input aren't the same thing, but I can't figure out how to asynchronously wait for user input in the same way that I can asynchronously wait for an arbitrary amount of seconds, so that the client can deal with incoming messages from the WebSocket server while simultaneously waiting for user input.
The comment below in the else-clause of monitor_cmd() hopefully explains what I'm getting at:
import asyncio
import json
import websockets
async def monitor_ws():
    uri = 'ws://localhost:6789'
    async with websockets.connect(uri) as websocket:
        async for message in websocket:
            print(json.dumps(json.loads(message), indent=2, sort_keys=True))
async def monitor_cmd():
    while True:
        sleep_instead = False
        if sleep_instead:
            await asyncio.sleep(1)
            print('Sleeping works fine.')
        else:
            # Seems like I need the equivalent of:
            # line = await asyncio.input('Is this your line? ')
            line = input('Is this your line? ')
            print(line)
try:
    asyncio.get_event_loop().run_until_complete(asyncio.wait([
        monitor_ws(),
        monitor_cmd()
    ]))
except KeyboardInterrupt:
    quit()
This code just waits for input indefinitely and does nothing else in the meantime, and I understand why. What I don't understand, is how to fix it. :)
Of course, if I'm thinking about this problem in the wrong way, I'd be very happy to learn how to remedy that as well.
You can use the aioconsole third-party package to interact with stdin in an asyncio-friendly manner:
line = await aioconsole.ainput('Is this your line? ')
Borrowing heavily from aioconsole, if you would rather avoid using an external library you could define your own async input function:
import asyncio
import sys
async def ainput(string: str) -> str:
    await asyncio.get_event_loop().run_in_executor(
        None, lambda s=string: sys.stdout.write(s+' '))
    return await asyncio.get_event_loop().run_in_executor(
        None, sys.stdin.readline)
There are two ways to handle this, both borrowing heavily from aioconsole.
Start a new daemon thread:
import sys
import asyncio
import threading
from concurrent.futures import Future
async def run_as_daemon(func, *args):
    future = Future()
    future.set_running_or_notify_cancel()
    def daemon():
        try:
            result = func(*args)
        except Exception as e:
            future.set_exception(e)
        else:
            future.set_result(result)
    threading.Thread(target=daemon, daemon=True).start()
    return await asyncio.wrap_future(future)
async def main():
    data = await run_as_daemon(sys.stdin.readline)
    print(data)
if __name__ == "__main__":
    asyncio.run(main())
Use a stream reader:
import sys
import asyncio
async def get_stream_reader(pipe) -> asyncio.StreamReader:
    loop = asyncio.get_event_loop()
    reader = asyncio.StreamReader(loop=loop)
    protocol = asyncio.StreamReaderProtocol(reader)
    await loop.connect_read_pipe(lambda: protocol, pipe)
    return reader
async def main():
    reader = await get_stream_reader(sys.stdin)
    data = await reader.readline()
    print(data)
if __name__ == "__main__":
    asyncio.run(main())

Run and wait for asynchronous function from a synchronous one using Python asyncio

In my code I have a class with properties that occasionally need to run asynchronous code. Sometimes I need to access a property from an asynchronous function, sometimes from a synchronous one, which is why I don't want my properties to be asynchronous. Besides, I have the impression that asynchronous properties in general are a code smell. Correct me if I'm wrong.
I have a problem with executing the asynchronous method from the synchronous property and blocking further execution until the asynchronous method finishes.
Here is some sample code:
import asyncio
async def main():
    print('entering main')
    synchronous_property()
    print('exiting main')
def synchronous_property():
    print('entering synchronous_property')
    loop = asyncio.get_event_loop()
    try:
        # this will raise an exception, so I catch it and ignore
        loop.run_until_complete(asynchronous())
    except RuntimeError:
        pass
    print('exiting synchronous_property')
async def asynchronous():
    print('entering asynchronous')
    print('exiting asynchronous')
asyncio.run(main())
Its output:
entering main
entering synchronous_property
exiting synchronous_property
exiting main
entering asynchronous
exiting asynchronous
First, catching the RuntimeError seems wrong, but if I don't do that, I get a RuntimeError: This event loop is already running exception.
Second, the asynchronous() function is executed last, after the synchronous one finishes. I want to do some processing on the data set by the asynchronous method, so I need to wait for it to finish.
If I add await asyncio.sleep(0) after calling synchronous_property(), asynchronous() is called before main() finishes, but that doesn't help me. I need asynchronous() to run before synchronous_property() finishes.
What am I missing? I'm running Python 3.7.
Asyncio is really insistent on not allowing nested loops, by design. However, you can always run another event loop in a different thread. Here is a variant that uses a thread pool to avoid having to create a new thread each time around:
import asyncio, concurrent.futures
async def main():
    print('entering main')
    synchronous_property()
    print('exiting main')
pool = concurrent.futures.ThreadPoolExecutor()
def synchronous_property():
    print('entering synchronous_property')
    result = pool.submit(asyncio.run, asynchronous()).result()
    print('exiting synchronous_property', result)
async def asynchronous():
    print('entering asynchronous')
    await asyncio.sleep(1)
    print('exiting asynchronous')
    return 42
asyncio.run(main())
This code creates a new event loop on each sync->async boundary, so don't expect high performance if you're doing that a lot. It could be improved by creating only one event loop per thread using asyncio.new_event_loop, and caching it in a thread-local variable.
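The thread-local caching suggested in the previous paragraph could look roughly like the sketch below. The helper names _run_in_thread_loop and run_sync are hypothetical, and the cached loops are never closed here, which a long-running program would probably want to handle at shutdown.

```python
import asyncio
import concurrent.futures
import threading

pool = concurrent.futures.ThreadPoolExecutor()
_local = threading.local()


def _run_in_thread_loop(coro):
    # Executed in a pool thread: reuse that thread's loop, creating it on first use.
    loop = getattr(_local, "loop", None)
    if loop is None:
        loop = asyncio.new_event_loop()
        _local.loop = loop
    return loop.run_until_complete(coro)


def run_sync(coro):
    # Blocks the caller until the coroutine has finished in a worker thread.
    return pool.submit(_run_in_thread_loop, coro).result()


async def asynchronous():
    await asyncio.sleep(0.01)
    return 42


def synchronous_property():
    return run_sync(asynchronous())


async def main():
    return synchronous_property()


result = asyncio.run(main())
print(result)
```

As in the original variant, calling synchronous_property() from inside main() still blocks the main event loop for the duration of the call; the cache only avoids the cost of building a fresh loop each time.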
The easiest way is to use an existing "wheel", such as asgiref.async_to_sync:
from asgiref.sync import async_to_sync
then:
async_to_sync(main)()
or, in general:
async_to_sync(<your_async_func>)(<.. arguments for async function ..>)
This is a caller class which turns an awaitable that only works on the thread with
the event loop into a synchronous callable that works in a subthread.
If the call stack contains an async loop, the code runs there.
Otherwise, the code runs in a new loop in a new thread.
Either way, this thread then pauses and waits to run any thread_sensitive
code called from further down the call stack using SyncToAsync, before
finally exiting once the async task returns.
There appears to be a problem with the question as stated. Restating the question:
How can a thread (containing no async processes and hence considered sync) communicate with an async process (running in some event loop)? One approach is to use two sync queues. The sync process puts its request/parameters into QtoAsync and waits on QtoSync. The async process reads QtoAsync WITHOUT waiting and, if it finds a request/parameters, executes the request and places the result in QtoSync.
import asyncio
import queue
QtoAsync = queue.Queue()
QtoSync = queue.Queue()
...
async def asyncProc():
    while True:
        try:
            data=QtoAsync.get_nowait()
            result = await <the async that you wish to execute>
            QtoSync.put(result) #This can block if queue is full. you can use put_nowait and handle the exception.
        except queue.Empty:
            await asyncio.sleep(0.001) #put a nominal delay forcing this to wait in event loop
....
#start the sync process in a different thread here..
asyncio.run(main()) #main invokes the async tasks including the asyncProc
The sync thread sends its request to the async side using:
req = <the async that you wish to execute>
QtoAsync.put(req)
result = QtoSync.get()
This should work.
Problems with the question as stated:
1. When the async processes are started with asyncio.run (or similar), execution blocks until the async processes are completed. A separate sync thread has to be started explicitly before calling asyncio.run.
2. In general, asyncio processes depend on other asyncio processes in that loop, so calling an async process from another thread is not permitted directly. The interaction should be with the event loop, and using two queues is one approach.
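For reference, a runnable version of the two-queue idea might look like the following sketch. The coroutine double, the None shutdown sentinel, and the function names are assumptions added for the demo; they are not part of the answer above.

```python
import asyncio
import queue
import threading

to_async = queue.Queue()  # requests travelling from the sync thread to the loop
to_sync = queue.Queue()   # results travelling back to the sync thread


async def double(x):
    # Stand-in for the async operation the sync side wants executed.
    await asyncio.sleep(0.01)
    return x * 2


async def async_proc():
    while True:
        try:
            request = to_async.get_nowait()
        except queue.Empty:
            # Nothing queued: yield to the event loop briefly, then poll again.
            await asyncio.sleep(0.001)
            continue
        if request is None:  # sentinel: the sync side has finished
            return
        to_sync.put(await double(request))


def sync_worker(results):
    for value in (1, 2, 3):
        to_async.put(value)
        results.append(to_sync.get(timeout=5))
    to_async.put(None)


def run_demo():
    results = []
    # The sync thread must be started before asyncio.run, which blocks.
    worker = threading.Thread(target=sync_worker, args=(results,))
    worker.start()
    asyncio.run(async_proc())
    worker.join()
    return results


results = run_demo()
print(results)
```

The polling interval trades latency against idle CPU use; asyncio.run_coroutine_threadsafe is the standard-library way to get the same effect without polling, at the cost of the sync thread needing a reference to the running loop.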
I want to make the async call execute from sync and block its execution
Just make the sync func async and await the asynchronous function. Async functions are just like normal functions and you can put whatever code you want in them. If you still have a problem, modify your question using the actual code you are trying to run.
import asyncio
async def main():
    print('entering main')
    await synchronous_property()
    print('exiting main')
async def synchronous_property():
    print('entering synchronous_property')
    await asynchronous()
    # Do whatever sync stuff you want who cares
    print('exiting synchronous_property')
async def asynchronous():
    print('entering asynchronous')
    print('exiting asynchronous')
asyncio.run(main())

Handling ensure_future and its missing tasks

I have a streaming application that almost continuously takes the data given as input and sends an HTTP request using that value and does something with the returned value.
Obviously to speed things up I've used asyncio and aiohttp libraries in Python 3.7 to get the best performance, but it becomes hard to debug given how fast the data moves.
This is what my code looks like
'''
Gets the final requests
'''
async def apiRequest(info, url, session, reqType, post_data=''):
    if reqType:
        async with session.post(url, data = post_data) as response:
            info['response'] = await response.text()
    else:
        async with session.get(url+post_data) as response:
            info['response'] = await response.text()
    logger.debug(info)
    return info
'''
Loops through the batches and sends it for request
'''
async def main(data, listOfData):
    tasks = []
    async with ClientSession() as session:
        for reqData in listOfData:
            try:
                task = asyncio.ensure_future(apiRequest(**reqData))
                tasks.append(task)
            except Exception as e:
                print(e)
                exc_type, exc_obj, exc_tb = sys.exc_info()
                fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
                print(exc_type, fname, exc_tb.tb_lineno)
        responses = await asyncio.gather(*tasks)
    return responses #list of APIResponses
'''
Streams data in and prepares batches to send for requests
'''
async def Kconsumer(data, loop, batchsize=100):
    consumer = AIOKafkaConsumer(**KafkaConfigs)
    await consumer.start()
    dataPoints = []
    async for msg in consumer:
        try:
            sys.stdout.flush()
            consumedMsg = loads(msg.value.decode('utf-8'))
            if consumedMsg['tid']:
                dataPoints.append(loads(msg.value.decode('utf-8')))
            if len(dataPoints)==batchsize or time.time() - startTime>5:
                '''
                #1: The task below goes and sends HTTP GET requests in bulk using aiohttp
                '''
                task = asyncio.ensure_future(getRequests(data, dataPoints))
                res = await asyncio.gather(*[task])
                if task.done():
                    outputs = []
                    '''
                    #2: Does some ETL on the returned values
                    '''
                    ids = await asyncio.gather(*[doSomething(**{'tid':x['tid'],
                        'cid':x['cid'], 'tn':x['tn'],
                        'id':x['id'], 'ix':x['ix'],
                        'ac':x['ac'], 'output':to_dict(xmltodict.parse(x['response'],encoding='utf-8')),
                        'loop':loop, 'option':1}) for x in res[0]])
                    simplySaveDataIntoDataBase(id) # This is where I see some missing data in the database
                    dataPoints = []
        except Exception as e:
            logger.error(e)
            logger.error(traceback.format_exc())
            exc_type, exc_obj, exc_tb = sys.exc_info()
            fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
            logger.error(str(exc_type) +' '+ str(fname) +' '+ str(exc_tb.tb_lineno))
if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    asyncio.ensure_future(Kconsumer(data, loop, batchsize=100))
    loop.run_forever()
Does the ensure_future need to be awaited ?
How does aiohttp handle requests that come a little later than the others? Shouldn't it hold the whole batch back instead of forgetting about it altogether?
Does the ensure_future need to be awaited ?
Yes, and your code is doing that already. await asyncio.gather(*tasks) awaits the provided tasks and returns their results in the same order.
Note that await asyncio.gather(*[task]) doesn't make sense, because it is equivalent to await asyncio.gather(task), which is again equivalent to await task. In other words, when you need the result of getRequests(data, dataPoints), you can write res = await getRequests(data, dataPoints) without the ceremony of first calling ensure_future() and then calling gather().
In fact, you almost never need to call ensure_future yourself:
if you need to await multiple tasks, you can pass coroutine objects directly to gather, e.g. gather(coroutine1(), coroutine2()).
if you need to spawn a background task, you can call asyncio.create_task(coroutine(...))
How does aiohttp handle requests that come a little later than the others? Shouldn't it hold the whole batch back instead of forgetting about it altogether?
If you use gather, all requests must finish before any of them return. (That is not aiohttp policy, it's how gather works.) If you need to implement a timeout, you can use asyncio.wait_for or similar.
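As a rough sketch of both points, the example below passes coroutine objects straight to gather and wraps each one in asyncio.wait_for so that a slow request cannot hold up the batch indefinitely. fake_request is a made-up stand-in for an aiohttp call, and return_exceptions=True is one possible choice for keeping the successful results when some requests time out.

```python
import asyncio


async def fake_request(delay):
    # Stand-in for an HTTP request that takes `delay` seconds.
    await asyncio.sleep(delay)
    return f"done after {delay}"


async def fetch_batch(delays, timeout):
    # No ensure_future needed: gather accepts the awaitables directly.
    requests = [asyncio.wait_for(fake_request(d), timeout) for d in delays]
    # Results come back in input order; timeouts appear as exception objects.
    return await asyncio.gather(*requests, return_exceptions=True)


results = asyncio.run(fetch_batch([0.01, 0.5], timeout=0.1))
for item in results:
    if isinstance(item, asyncio.TimeoutError):
        print("request timed out")
    else:
        print(item)
```

Without return_exceptions=True, the first timeout would propagate out of gather and the results of the requests that did finish would be lost to the caller.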

Wait for db future to complete?

I have written code for a Sanic application that uses RethinkDB as its backend database. I want to wait for the RethinkDB connection function to initialise before the other functions run, as they depend on the RethinkDB connection.
My rethinkdb connection initialization function is:
async def open_connections(app):
    logger.warning('opening database connection')
    r.set_loop_type('asyncio')
    connection= await r.connect(
        port=app.config.DATABASE["port"],
        host=app.config.DATABASE["ip"],
        db=app.config.DATABASE["dbname"],
        user=app.config.DATABASE["user"],
        password=app.config.DATABASE["password"])
    print (f"connection established {connection}")
    return connection
The callback function that is executed after the future is resolved is:
def db_callback(future):
    exc = future.exception()
    if exc:
        # Handle wonderful empty TimeoutError exception
        logger.error(f"From mnemonic api isnt working with error {exc}")
        sys.exit(1)
    result = future.result()
    return result
sanic app:
def main():
    app = Sanic(__name__)
    load_config(app)
    zmq = ZMQEventLoop()
    asyncio.set_event_loop(zmq)
    server = app.create_server(
        host=app.config.HOST, port=app.config.PORT, debug=app.config.DEBUG, access_log=True)
    loop = asyncio.get_event_loop()
    ##not wait for the server to start, this will return a future object
    asyncio.ensure_future(server)
    ##not wait for the rethinkdb connection to initialize, this will return
    ##a future object
    future = asyncio.ensure_future(open_connections(app))
    result = future.add_done_callback(db_callback)
    logger.debug(result)
    future = asyncio.ensure_future(insert_mstr_account(app))
    future.add_done_callback(insert_mstr_acc_callback)
    future = asyncio.ensure_future(check_master_accounts(app))
    future.add_done_callback(callbk_check_master_accounts)
    signal(SIGINT, lambda s, f: loop.close())
    try:
        loop.run_forever()
    except KeyboardInterrupt:
        close_connections(app)
        loop.stop()
When I start this app, the print statement in the open_connections function executes last.
future = asyncio.ensure_future(open_connections(app))
result = future.add_done_callback(db_callback)
ensure_future schedules coroutines to run concurrently; it does not wait for them.
add_done_callback does not wait for the completion of the future either. It only registers a function to be called once the future is done, and it returns None, so result in the snippet above is always None.
So you should explicitly await the open_connections future before performing other functions:
future = asyncio.ensure_future(open_connections(app))
future.add_done_callback(db_callback)
result = await future
EDITED: the answer above applies only inside a coroutine.
In this case we want to wait for the completion of the future in the body of a regular function. To do that, we should use loop.run_until_complete:
def main():
    ...
    future = asyncio.ensure_future(open_connections(app))
    future.add_done_callback(db_callback)
    result = loop.run_until_complete(future)
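The ordering guarantee can be checked with a small stand-alone sketch that needs neither Sanic nor RethinkDB. Here open_connection and check_accounts are placeholder coroutines invented for the demo, with asyncio.sleep standing in for the real connection setup.

```python
import asyncio


async def open_connection(log):
    await asyncio.sleep(0.01)  # stands in for the real database connect call
    log.append("connection opened")
    return "fake-connection"


async def check_accounts(conn, log):
    # Depends on the connection, so it must only start once that exists.
    log.append(f"checking accounts with {conn}")


def main():
    log = []
    loop = asyncio.new_event_loop()
    try:
        # Blocks this regular function until the connection coroutine finishes.
        conn = loop.run_until_complete(open_connection(log))
        loop.run_until_complete(check_accounts(conn, log))
    finally:
        loop.close()
    return log


log = main()
print(log)
```

In the real application the loop would not be closed at this point; the remaining tasks would be scheduled and loop.run_forever() called, as in the question's main().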

Asyncio, await and infinite loops

async def start(channel):
    while True:
        m = await client.send_message(channel, "Generating... ")
        generator.makeFile()
        with open('tmp.png', 'rb') as f:
            await client.send_file(channel, f)
        await client.delete_message(m)
        await asyncio.sleep(2)
I have a discord bot that runs a task every 2 seconds. I tried using an infinite loop for this, but the script crashes with "Task was destroyed but it is still pending!". I have read about asyncio's coroutines, but none of the examples that I found use await in them. Is it possible to avoid this error, for example by running a coroutine with await?
"Task was destroyed but it is still pending!" is a warning that you receive when you call loop.close() while some of the tasks in your script aren't finished. Usually you should avoid this situation, because an unfinished task may not release some resources. You need either to await the task until it is done or to cancel it before the event loop is closed.
Since you have an infinite loop, you will probably need to cancel the task. For example:
import asyncio
from contextlib import suppress
async def start():
    # your infinite loop here, for example:
    while True:
        print('echo')
        await asyncio.sleep(1)
async def main():
    task = asyncio.ensure_future(start())
    # give the script some time to work:
    await asyncio.sleep(3)
    # cancel task to avoid warning:
    task.cancel()
    with suppress(asyncio.CancelledError):
        await task  # wait for the cancellation to complete
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
    loop.run_until_complete(main())
finally:
    loop.run_until_complete(loop.shutdown_asyncgens())
    loop.close()
See also this answer for more information about tasks.

Resources