I'm trying to create a WebSocket command-line client that waits for messages from a WebSocket server while simultaneously waiting for user input.
Regularly polling multiple online sources every second works fine on the server (the one running at localhost:6789 in this example), but instead of using Python's normal sleep() function it uses asyncio.sleep(), which makes sense because sleeping and asynchronously sleeping aren't the same thing, at least not under the hood.
Similarly, waiting for user input and asynchronously waiting for user input aren't the same thing, but I can't figure out how to asynchronously wait for user input the same way I can asynchronously wait for an arbitrary number of seconds, so that the client can deal with incoming messages from the WebSocket server while waiting for user input.
The comment below in the else-clause of monitor_cmd() hopefully explains what I'm getting at:
import asyncio
import json
import websockets

async def monitor_ws():
    uri = 'ws://localhost:6789'
    async with websockets.connect(uri) as websocket:
        async for message in websocket:
            print(json.dumps(json.loads(message), indent=2, sort_keys=True))

async def monitor_cmd():
    while True:
        sleep_instead = False
        if sleep_instead:
            await asyncio.sleep(1)
            print('Sleeping works fine.')
        else:
            # Seems like I need the equivalent of:
            # line = await asyncio.input('Is this your line? ')
            line = input('Is this your line? ')
            print(line)

try:
    asyncio.get_event_loop().run_until_complete(asyncio.wait([
        monitor_ws(),
        monitor_cmd()
    ]))
except KeyboardInterrupt:
    quit()
This code just waits for input indefinitely and does nothing else in the meantime, and I understand why. What I don't understand is how to fix it. :)
Of course, if I'm thinking about this problem in the wrong way, I'd be very happy to learn how to remedy that as well.
You can use the aioconsole third-party package to interact with stdin in an asyncio-friendly manner:
line = await aioconsole.ainput('Is this your line? ')
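For example, the question's monitor_cmd() could be rewritten like this (a minimal sketch, assuming aioconsole has been installed, e.g. with pip install aioconsole):

import aioconsole

async def monitor_cmd():
    while True:
        # ainput suspends only this coroutine while waiting for input,
        # leaving the event loop free to keep running monitor_ws().
        line = await aioconsole.ainput('Is this your line? ')
        print(line)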
Borrowing heavily from aioconsole, if you would rather avoid using an external library, you could define your own async input function:
import asyncio
import sys

async def ainput(string: str) -> str:
    await asyncio.get_event_loop().run_in_executor(
        None, lambda s=string: sys.stdout.write(s + ' '))
    return await asyncio.get_event_loop().run_in_executor(
        None, sys.stdin.readline)
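One difference from the built-in input(): sys.stdin.readline keeps the trailing newline, so callers will typically want to strip it, e.g. line = (await ainput('Is this your line? ')).rstrip('\n').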
Again borrowing heavily from aioconsole, there are two ways to handle this:
start a new daemon thread:
import sys
import asyncio
import threading
from concurrent.futures import Future

async def run_as_daemon(func, *args):
    future = Future()
    future.set_running_or_notify_cancel()

    def daemon():
        try:
            result = func(*args)
        except Exception as e:
            future.set_exception(e)
        else:
            future.set_result(result)

    threading.Thread(target=daemon, daemon=True).start()
    return await asyncio.wrap_future(future)

async def main():
    data = await run_as_daemon(sys.stdin.readline)
    print(data)

if __name__ == "__main__":
    asyncio.run(main())
use a stream reader:
import sys
import asyncio

async def get_stream_reader(pipe) -> asyncio.StreamReader:
    loop = asyncio.get_event_loop()
    reader = asyncio.StreamReader(loop=loop)
    protocol = asyncio.StreamReaderProtocol(reader)
    await loop.connect_read_pipe(lambda: protocol, pipe)
    return reader

async def main():
    reader = await get_stream_reader(sys.stdin)
    data = await reader.readline()
    print(data)

if __name__ == "__main__":
    asyncio.run(main())
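Tying this back to the question, a usage sketch (assuming a Unix-like platform, since connect_read_pipe on sys.stdin is generally not supported on Windows):

async def monitor_cmd():
    reader = await get_stream_reader(sys.stdin)
    while True:
        print('Is this your line? ', end='', flush=True)
        # readline() returns bytes, so decode before use
        line = (await reader.readline()).decode()
        print(line, end='')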
When I use aiologger, I have to write await logger many times.
For example,
import asyncio
from aiologger import Logger

async def main():
    logger = Logger.with_default_handlers(name='my-logger')
    await logger.debug("debug at stdout")
    await logger.info("info at stdout")
    await logger.warning("warning at stderr")
    await logger.error("error at stderr")
    await logger.critical("critical at stderr")
    await logger.shutdown()

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
loop.close()
It would be great if I could write something like al instead of await logger.
Disclaimer: I've written about this -- https://coxley.org/logging/#logging-over-the-network
Please don't accept a logging interface like this.
You can't avoid using await to yield control to the event loop. You just can't. But you can leverage existing features to do I/O outside of the main thread and still use asyncio: you just start a second event loop in that thread.
Example
I don't like to recommend third-party libs in answers, but janus.Queue is important here. It makes it easier to bridge between non-asyncio writers (e.g., the log handler) and asyncio readers (the flusher).
Note 1: If you don't actually need asyncio-compatible I/O from the flusher, use stdlib queue.Queue, remove the async-closure, and get rid of the second loop.
Note 2: This example has both an unbounded queue and does I/O for every message. Add an interval and/or message threshold for flushing to be production-ready. Depending on your system, decide whether you accept memory growth for log bursts, drop logs, or block the main code-path.
import asyncio
import logging
import time
import threading
import typing as t

# pip install --user janus
import janus

LOG = logging.getLogger(__name__)

# Queue must be created within the event loop it will be used from. Start as
# None since this will not be the main thread.
_QUEUE: t.Optional[janus.Queue] = None

class IOHandler(logging.Handler):
    def __init__(self, *args, **kwargs):
        # This is set from the flusher thread
        global _QUEUE
        while _QUEUE is None:
            time.sleep(0.01)
        self.q = _QUEUE.sync_q
        super().__init__(*args, **kwargs)

    def emit(self, record: logging.LogRecord):
        self.q.put(record)

def flusher():
    async def run():
        global _QUEUE
        if _QUEUE is None:
            _QUEUE = janus.Queue()

        # Upload record instead of print.
        # Perhaps flush every n seconds w/ buffer for an upper bound on inserts.
        q = _QUEUE.async_q
        while True:
            record = await q.get()
            print("woohoo, doing i/o:", record.msg)

    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    loop.run_until_complete(run())

def foo():
    print("foo")

def bar():
    print("bar")

async def baz():
    await asyncio.sleep(1)
    print("baz")

async def main():
    threading.Thread(target=flusher, daemon=True).start()
    LOG.setLevel(logging.INFO)
    LOG.addHandler(IOHandler())

    foo()
    LOG.info("starting program")
    LOG.info("doing some stuff")
    LOG.info("mighty cool")
    bar()
    await baz()

if __name__ == "__main__":
    asyncio.run(main())
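The key design point: IOHandler.emit uses the blocking sync_q view of the janus queue from whatever thread the logging call happens on, while the flusher awaits the async_q view of the same underlying queue, so the logging call sites need no await at all.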
I am trying to figure out how to have a websocket-based server listen for incoming requests, place them in a queue for another process to work on, and then place the results in another queue where the websocket-based server can await each result and send the response back to the client.
This is just me trying to learn and gain more experience with both asyncio and sharing data between processes. I am using Python 3.9.2 64bit.
Right now I am stuck with a deadlock as commented in the "producer_handler" function in the server code. Here is the code I am playing with:
Server:
import asyncio
import logging
import time
from multiprocessing import Manager, Process

import websockets

logging.root.setLevel(0)

def server(recievequeue, sendqueue):

    async def consumer_handler(websocket, path):
        while True:
            logging.info('Waiting for request')
            try:
                request = await websocket.recv()
            except Exception as exception:
                logging.warning(f'consumer_handler Error: {exception}')
                break
            logging.info(f'Request: {request}')
            recievequeue.put(request)
            logging.info('Request placed in recievequeue')

    async def producer_handler(websocket, path):
        while True:
            logging.info('Waiting for response')
            response = sendqueue.get()  # Deadlock is here.
            try:
                await websocket.send(response)
            except Exception as exception:
                logging.warning(f'producer_handler Error: {exception}')
                break
            logging.info('Response sent')

    async def handler(websocket, path):
        consumer_task = asyncio.ensure_future(consumer_handler(websocket, path))
        producer_task = asyncio.ensure_future(producer_handler(websocket, path))
        done, pending = await asyncio.wait([producer_task, consumer_task], return_when=asyncio.FIRST_COMPLETED)
        for task in done:
            logging.info(f'Canceling: {task}')
            task.cancel()
        for task in pending:
            logging.info(f'Canceling: {task}')
            task.cancel()

    eventloop = asyncio.get_event_loop()
    eventloop.run_until_complete(websockets.serve(handler, 'localhost', 8081, ssl=None))
    eventloop.run_forever()

def message_handler(recievequeue, sendqueue):
    while True:
        # I just want to test getting a message from the recievequeue, and placing it in the sendqueue
        request = recievequeue.get()
        logging.info(f'Request: {request}')
        time.sleep(3)
        data = str(time.time())
        logging.info(f'Work completed # {data}')
        sendqueue.put(data)

def main():
    logging.info('Starting Application')
    manager = Manager()
    sendqueue = manager.Queue()
    recievequeue = manager.Queue()
    test_process_1 = Process(target=server, args=(recievequeue, sendqueue), name='Server')
    test_process_1.start()
    test_process_2 = Process(target=message_handler, args=(recievequeue, sendqueue), name='Message Handler')
    test_process_2.start()
    test_process_1.join()

if __name__ == '__main__':
    main()
And the client:
import asyncio
import logging

import websockets

logging.root.setLevel(0)

URI = "wss://localhost:8081"

async def test():

    async def consumer_handler(connection):
        while True:
            try:
                request = await connection.recv()
            except Exception as exception:
                logging.warning(f'Error: {exception}')
                break
            logging.info(request)

    async def producer_handler(connection):
        while True:
            await asyncio.sleep(5)
            try:
                await connection.send('Hello World')
            except Exception as exception:
                logging.warning(f'Error: {exception}')
                break

    async with websockets.connect(URI, ssl=None) as connection:
        consumer_task = asyncio.ensure_future(consumer_handler(connection))
        producer_task = asyncio.ensure_future(producer_handler(connection))
        while True:
            await asyncio.wait([consumer_task, producer_task], return_when=asyncio.FIRST_COMPLETED)

def main():
    logging.info('Starting Application')
    eventloop = asyncio.get_event_loop()
    try:
        eventloop.run_until_complete(test())
        eventloop.run_forever()
    except Exception as exception:
        logging.warning(f'Error: {exception}')

if __name__ == '__main__':
    main()
If I remove the queues the server and multiple client can talk back and forth with no issues. I just can't figure out how to get() and put() the requests and responses. Any help would be appreciated!
So after looking through other posts I noticed others talking about deadlocks and using run_in_executor. After some more testing I found that replacing the line causing the deadlock with the following code resolved the issue:
response = await eventloop.run_in_executor(None, sendqueue.get)
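In context, the producer loop from the server sketch would then look roughly like this (reusing the question's names):

async def producer_handler(websocket, path):
    eventloop = asyncio.get_event_loop()
    while True:
        logging.info('Waiting for response')
        # The blocking Manager.Queue.get() now runs in a worker thread,
        # so the event loop (and consumer_handler) keeps running.
        response = await eventloop.run_in_executor(None, sendqueue.get)
        try:
            await websocket.send(response)
        except Exception as exception:
            logging.warning(f'producer_handler Error: {exception}')
            break
        logging.info('Response sent')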
I'd like to do a non-blocking HTTP request in Python 3.7. What I'm trying to do is described well in this SO post, but it doesn't yet have an accepted answer.
Here's my code so far:
import asyncio
from aiohttp import ClientSession

[.....]

async def call_endpoint_async(endpoint, data):
    async with ClientSession() as session, session.post(url=endpoint, data=data) as result:
        response = await result.read()
        print(response)
        return response

class CreateTestScores(APIView):
    permission_classes = (IsAuthenticated,)

    def post(self, request):
        [.....]
        asyncio.run(call_endpoint_async(url, data))
        print('cp #1')  # <== `asyncio.run` BLOCKS -- PRINT STATEMENT DOESN'T RUN UNTIL `asyncio.run` RETURNS
What is the correct way to do an Ajax-style non-blocking HTTP request in Python?
Asyncio makes it easy to make a non-blocking request if your program runs in asyncio. For example:
async def doit():
    task = asyncio.create_task(call_endpoint_async(url, data))
    print('cp #1')
    await asyncio.sleep(1)
    print('is it done?', task.done())
    await task
    print('now it is done')
But this requires that the "caller" be async as well. In your case you want the whole asyncio event loop to run in the background, so that the synchronous code can proceed in the meantime. This can be achieved by running it in a separate thread, e.g.:
import concurrent.futures

pool = concurrent.futures.ThreadPoolExecutor()

# ...

def post(self, request):
    fut = pool.submit(asyncio.run, call_endpoint_async(url, data))
    print('cp #1')
However, in that case you're not getting anything by using asyncio. Since you're using threads anyway, you might as well call a sync function such as requests.get() to begin with.
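For instance, a sketch of the same fire-and-forget idea with plain blocking calls (assuming the requests library, and url/data as in the question):

import concurrent.futures
import requests

pool = concurrent.futures.ThreadPoolExecutor()

def post(self, request):
    # The blocking HTTP call runs in a pool thread; the handler returns
    # immediately, just like the asyncio.run variant above.
    pool.submit(requests.post, url, data=data)
    print('cp #1')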
I have a streaming application that almost continuously takes the data given as input, sends an HTTP request for each value, and does something with the returned result.
To speed things up I've used the asyncio and aiohttp libraries in Python 3.7, but it becomes hard to debug given how fast the data moves.
This is what my code looks like
'''
Gets the final requests
'''
async def apiRequest(info, url, session, reqType, post_data=''):
    if reqType:
        async with session.post(url, data=post_data) as response:
            info['response'] = await response.text()
    else:
        async with session.get(url + post_data) as response:
            info['response'] = await response.text()
    logger.debug(info)
    return info

'''
Loops through the batches and sends it for request
'''
async def main(data, listOfData):
    tasks = []
    async with ClientSession() as session:
        for reqData in listOfData:
            try:
                task = asyncio.ensure_future(apiRequest(**reqData))
                tasks.append(task)
            except Exception as e:
                print(e)
                exc_type, exc_obj, exc_tb = sys.exc_info()
                fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
                print(exc_type, fname, exc_tb.tb_lineno)
        responses = await asyncio.gather(*tasks)
    return responses  # list of APIResponses

'''
Streams data in and prepares batches to send for requests
'''
async def Kconsumer(data, loop, batchsize=100):
    consumer = AIOKafkaConsumer(**KafkaConfigs)
    await consumer.start()
    dataPoints = []
    async for msg in consumer:
        try:
            sys.stdout.flush()
            consumedMsg = loads(msg.value.decode('utf-8'))
            if consumedMsg['tid']:
                dataPoints.append(loads(msg.value.decode('utf-8')))
            if len(dataPoints) == batchsize or time.time() - startTime > 5:
                '''
                #1: The task below goes and sends HTTP GET requests in bulk using aiohttp
                '''
                task = asyncio.ensure_future(getRequests(data, dataPoints))
                res = await asyncio.gather(*[task])
                if task.done():
                    outputs = []
                    '''
                    #2: Does some ETL on the returned values
                    '''
                    ids = await asyncio.gather(*[doSomething(**{'tid': x['tid'],
                                                                'cid': x['cid'], 'tn': x['tn'],
                                                                'id': x['id'], 'ix': x['ix'],
                                                                'ac': x['ac'], 'output': to_dict(xmltodict.parse(x['response'], encoding='utf-8')),
                                                                'loop': loop, 'option': 1}) for x in res[0]])
                    simplySaveDataIntoDataBase(id)  # This is where I see some missing data in the database
                dataPoints = []
        except Exception as e:
            logger.error(e)
            logger.error(traceback.format_exc())
            exc_type, exc_obj, exc_tb = sys.exc_info()
            fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
            logger.error(str(exc_type) + ' ' + str(fname) + ' ' + str(exc_tb.tb_lineno))

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    asyncio.ensure_future(Kconsumer(data, loop, batchsize=100))
    loop.run_forever()
Does the ensure_future need to be awaited?
How does aiohttp handle requests that come a little later than the others? Shouldn't it hold the whole batch back instead of forgetting about it altogether?
Does the ensure_future need to be awaited?
Yes, and your code is doing that already. await asyncio.gather(*tasks) awaits the provided tasks and returns their results in the same order.
Note that await asyncio.gather(*[task]) doesn't make sense, because it is equivalent to await asyncio.gather(task), which is again equivalent to await task. In other words, when you need the result of getRequests(data, dataPoints), you can write res = await getRequests(data, dataPoints) without the ceremony of first calling ensure_future() and then calling gather().
In fact, you almost never need to call ensure_future yourself:
if you need to await multiple tasks, you can pass coroutine objects directly to gather, e.g. gather(coroutine1(), coroutine2()).
if you need to spawn a background task, you can call asyncio.create_task(coroutine(...)), as shown in the sketch below.
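For example (a fragment that would run inside a coroutine; reqData1 and reqData2 are hypothetical stand-ins for entries of listOfData):

# await several coroutines at once, no ensure_future() needed
responses = await asyncio.gather(apiRequest(**reqData1), apiRequest(**reqData2))

# or spawn a true background task (Python 3.7+) and collect it later
task = asyncio.create_task(apiRequest(**reqData))
# ... do other work while the request runs ...
result = await task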
How does aiohttp handle requests that come a little later than the others? Shouldn't it hold the whole batch back instead of forgetting about it altogether?
If you use gather, all requests must finish before any of them return. (That is not aiohttp policy, it's how gather works.) If you need to implement a timeout, you can use asyncio.wait_for or similar.
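For example, a sketch that caps the whole batch with a hypothetical 10-second limit:

try:
    responses = await asyncio.wait_for(asyncio.gather(*tasks), timeout=10)
except asyncio.TimeoutError:
    # wait_for cancels the gather on timeout, which in turn cancels any
    # still-pending requests in the batch
    logger.error('batch timed out')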
I wrote a script for a socket server that simply listens for incoming connections and processes the incoming data. The chosen architecture is asyncio.start_server for the socket management and asyncio.Queues for passing the data between the producer and consumer coroutines. The problem is that the consume(q1) function is executed only once (at the first script startup); after that it is never executed again. Is the line run_until_complete(asyncio.gather()) wrong?
import asyncio
import functools
import logging

async def handle_readnwrite(reader, writer, q1):  # Producer coroutine
    data = await reader.read(1024)
    message = data.decode()
    await writer.drain()
    await q1.put(message[3:20])
    await q1.put(None)
    writer.close()  # Close the client socket

async def consume(q1):  # Consumer coroutine
    while True:
        # wait for an item from the producer
        item = await q1.get()
        if item is None:
            # the producer emits None to indicate that it is done
            logging.debug('None items')
            break
        do_something(item)

loop = asyncio.get_event_loop()
q1 = asyncio.Queue(loop=loop)
producer_coro = asyncio.start_server(functools.partial(handle_readnwrite, q1=q1), '0.0.0.0', 3000, loop=loop)
consumer_coro = consume(q1)
loop.run_until_complete(asyncio.gather(consumer_coro, producer_coro))
try:
    loop.run_forever()
except KeyboardInterrupt:
    pass
loop.close()
handle_readnwrite always enqueues the None terminator, which causes consume to break (and therefore finish the coroutine). If consume should continue running and process other messages, the None terminator must not be sent after each message.
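A minimal sketch of the fix, dropping the per-message sentinel so consume() keeps servicing later connections (the loop is then stopped externally, e.g. by KeyboardInterrupt):

async def handle_readnwrite(reader, writer, q1):  # Producer coroutine
    data = await reader.read(1024)
    message = data.decode()
    await writer.drain()
    await q1.put(message[3:20])
    # no q1.put(None) here: the sentinel made consume() break after the
    # very first message, which is exactly the reported symptom
    writer.close()  # Close the client socket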