asyncio wait on multiple tasks with timeout and cancellation - python-3.x

I have some code that runs multiple tasks in a loop like this:
done, running = await asyncio.wait(running, timeout=timeout_seconds,
return_when=asyncio.FIRST_COMPLETED)
I need to be able to determine which of these timed out. According to the documentation:
Note that this function does not raise asyncio.TimeoutError. Futures or Tasks that aren’t done when the timeout occurs are simply returned in the second set.
I could use wait_for() instead, but that function only accepts a single awaitable, whereas I need to specify multiple. Is there any way to determine which one from the set of awaitables I passed to wait() was responsible for the timeout?
Alternatively, is there a way to use wait_for() with multiple awaitables?

Your can try that tricks, probably it is not good solution:
import asyncio
async def foo():
return 42
async def need_some_sleep():
await asyncio.sleep(1000)
return 42
async def coro_wrapper(coro):
result = await asyncio.wait_for(coro(), timeout=10)
return result
loop = asyncio.get_event_loop()
done, running = loop.run_until_complete(asyncio.wait(
[coro_wrapper(foo), coro_wrapper(need_some_sleep)],
return_when=asyncio.FIRST_COMPLETED
)
)
for item in done:
print(item.result())
print(done, running)

Here is how I do it:
done, pending = await asyncio.wait({
asyncio.create_task(task, name=index)
for index, task in enumerate([
my_coroutine(),
my_coroutine(),
my_coroutine(),
])
},
return_when=asyncio.FIRST_COMPLETED
)
num = next(t.get_name() for t in done)
if num == 2:
pass
Use enumerate to name the tasks as they are created.

Related

how to run two async methods simultaneously?

i'm ashamed to admit i've been using python's asyncio for a long time without really understanding how it works and now i'm in a pickle. in pseudo code, my current program is like this:
async def api_function1(parameters):
result = await asyncio.gather(*[some_other_thing(p) for p in parameters])
async def api_function2(parameters):
result = await asyncio.gather(*[some_other_thing2(p) for p in parameters])
def a(initial_parameters):
output = []
data = asyncio.run(api_function1(initial_parameters))
output.append(data)
while True:
data = asyncio.run(api_function1(get_parameters_from_data(data)))
output.append(data)
if some _condition is True:
break
return output
def b(initial_parameters):
output = []
data = asyncio.run(api_function2(initial_parameters))
output.append(data)
while True:
data = asyncio.run(api_function2(get_parameters_from_data(data)))
output.append(data)
if some condition is True:
break
return output
a() and b() get data from two different rest api endpoints, each with its own rate limits and nuances. i want to run a() and b() simultaneously.
what's the best/easiest way of structuring the program so a() and b() can run simultaneously?
I tried making a() and b() both async methods and tried to await them simultaneously, i.e. something like
async a(initial_parameters):
...
async b(initial_parameters):
...
A = await a(initial_parameters)
B = await b(initial_parameters)
but it didn't work, so based on the docs, I'm guessing maybe i need to manually get the event_loop and pass it as an argument to a() and b() which would pass them to api_function2() and api_function2(), and then close it manually when both tasks are donw but not really sure if i'm on the right track or how to do it.
Also open to better design pattern for this if you have one in mind
There is no reason why you can't nest calls to asyncio.gather. If you want to run a() and b() simultaneously, you must make both of them coroutines. And you can't use asyncio.run() inside either one of them, since that is a blocking call - it doesn't return until its argument has completed. You need to replace all the calls to asyncio.run() in a() and b() with await expressions. You will end up with something that looks like this:
async def api_function1(parameters):
return await asyncio.gather(*[some_other_thing(p) for p in parameters])
async def api_function2(parameters):
return await asyncio.gather(*[some_other_thing2(p) for p in parameters])
async def a(initial_parameters):
output = []
data = await api_function1(initial_parameters)
output.append(data)
while True:
data = await api_function1(get_parameters_from_data(data))
output.append(data)
if some _condition is True:
break
return output
async def b(initial_parameters):
output = []
data = await api_function2(initial_parameters)
output.append(data)
while True:
data = await api_function2(get_parameters_from_data(data))
output.append(data)
if some condition is True:
break
return output
async def main():
a_data, b_data = asyncio.gather(a(initial_parameters), b(initial_parameters))
async def main():
task_a = asyncio.create_task(a(initial_parameters))
task_b = asyncio.create_task(b(initial_parameters))
a_data = await task_a
b_data = await task_b
asyncio.run(main())
This is still pseudocode.
I have given two possible ways of writing main(), one using asyncio.gather and the other using two calls to asyncio.create_task. Both versions create two tasks that run simultaneously, but the latter version doesn't require you to collect all the tasks in one place and start them all at the same time, as gather does. If gather works for your requirements, as it does here, it is more convenient.
Finally, a call to asyncio.run starts the program. The docs recommend having only one call to asyncio.run per program.
The two api functions should return something instead of setting a local variable.
In asyncio the crucial concept is the Task. It is Tasks that cooperate to provide simultaneous execution. Asyncio.gather actually creates Tasks under the hood, even though you typically pass it a list of coroutines. That's how it runs things in parallel.

Asyncio wait_for one function to end, or for another

My code has 2 functions:
async def blabla():
sleep(5)
And
async def blublu():
sleep(2)
asyncio.wait_for as I know can wait for one function like this:
asyncio.wait_for(blublu(), timeout=6) or asyncio.wait_for(blublu(), timeout=6)
What I wan't to do, is to make asyncio wait for both of them, and if one of them ends faster, proceed without waiting for the second one.
Is it possible to make so?
Edit: timeout is needed
Use asyncio.wait with the return_when kwarg:
# directly passing coroutine objects in `asyncio.wait`
# is deprecated since py 3.8+, wrapping into a task
blabla_task = asyncio.create_task(blabla())
blublu_task = asyncio.create_task(blublu())
done, pending = await asyncio.wait(
{blabla_task, blublu_task},
return_when=asyncio.FIRST_COMPLETED
)
# do something with the `done` set

how to run gather and a coroutine task concurrently

I have approximately the following code
import asyncio
.
.
.
async def query_loop()
while connected:
result = await asyncio.gather(get_value1, get_value2, get_value3)
if True in result:
connected = False
async def main():
await query_loop()
asyncio.run(main())
The get_value - functions query a device, receive values, and publish them to a server. If no problems occur they return False, else True.
Now I need to implement, that the get_value2-function checks if it received the value 7. In this case I need the program to wait for 3 min before sending a special command to the device. But in the mean time, and also afterwards the query_loop should continue.
Has anybody an idea how to do that ?
thanks in advance!
If I understand you correctly, you want to modify get_value2 so that it reacts to a value received from device by spawning additional work in the background, i.e. do something without the loop in query_loop having to wait for that new work to finish.
You can use asyncio.create_task() to spawn a background task. In fact, you can always combine create_task() and await to runs things in the background; asyncio.gather is just a utility function that does it for you. In this case query_loop remains unchanged, and get_value2 gets modified like this:
async def get_value2():
...
value = await receive_value_from_device()
if value == 7:
# schedule send_command() to run, but don't wait for it
asyncio.create_task(special_command())
...
return False
async def special_command():
await asyncio.sleep(180)
await send_command_to_device(...)
Note that if get_value1 and others are async functions, the correct invocation of gather must call them, so it should be await asyncio.gather(get_value1(), get_value2(), get_value3()) (note the extra parentheses).

Return an asyncio callback result from task creating function

I'm trying to wrap an async function up so that I can use it without importing asyncio in certain files. The ultimate goal is to use asynchronous functions but being able to call them normally and get back the result.
How can I access the result from the callback function printing(task) and use it as the return of my make_task(x) function?
MWE:
#!/usr/bin/env python3.7
import asyncio
loop = asyncio.get_event_loop()
def make_task(x): # Can be used without asyncio
task = loop.create_task(my_async(x))
task.add_done_callback(printing)
# return to get the
def printing(task):
print('sleep done: %s' % task.done())
print('results: %s' % task.result())
return task.result() # How can i access this return?
async def my_async(x): # Handeling the actual async running
print('Starting my async')
res = await my_sleep(x)
return res # The value I want to ultimately use in the real callback
async def my_sleep(x):
print('starting sleep for %d' % x)
await asyncio.sleep(x)
return x**2
async def my_coro(*coro):
return await asyncio.gather(*coro)
val1 = make_task(4)
val2 = make_task(5)
loop.run_until_complete(my_coro(asyncio.sleep(6)))
print(val1)
print(val2)
If I understand correctly you want to use asynchronous functions but don't want to write async/await in top-level code.
If that's the case, I'm afraid it's not possible to achieve with asyncio. asyncio wants you to write async/await everywhere asynchronous stuff happens and this is intentional: forcing to explicitly mark places of possible context switch is a asyncio's way to fight concurrency-related problems (which is very hard to fight otherwise). Read this answer for more info.
If you still want to have asynchronous stuff and use it "as usual code" take a look at alternative solutions like gevent.
Instead of using a callback, you can make printing a coroutine and await the original coroutine, such as my_async. make_task can then create a task out of printing(my_async(...)), which will make the return value of printing available as the task result. In other words, to return a value out of printing, just - return it.
For example, if you define make_task and printing like this and leave the rest of the program unchanged:
def make_task(x):
task = loop.create_task(printing(my_async(x)))
return task
async def printing(coro):
coro_result = await coro
print('sleep done')
print('results: %s' % coro_result)
return coro_result
The resulting output is:
Starting my async
starting sleep for 4
Starting my async
starting sleep for 5
sleep done
results: 16
sleep done
results: 25
<Task finished coro=<printing() done, defined at result1.py:11> result=16>
<Task finished coro=<printing() done, defined at result1.py:11> result=25>

Capture the return value from nursery objects

When using trio and nursery objects, how do you capture any value that was returned from a method?
Take this example from the trio website:
async def append_fruits():
fruits = []
fruits.append("Apple")
fruits.append("Orange")
return fruits
async def numbers():
numbers = []
numbers.append(1)
numbers.append(2)
return numbers
async def parent():
async with trio.open_nursery() as nursery:
nursery.start_soon(append_fruits)
nursery.start_soon(numbers)
I modified it so that each method returns a list. How would you capture the return value so that I could print them?
Currently, there is no built-in mechanism for this. Mostly because we haven't figured out how we would even want it to work, so if you have some suggestions that would be helpful :-).
The thing is, with regular functions, there's exactly one obvious place to access the return value – the caller is waiting, so you hand them the return value, done. With concurrent functions, the caller isn't waiting, so you also need some way to specify where to return it to, when to return it, if there are multiple functions you have to keep track of which one is returning a value, and so on. It's not as simple a concept.
What do you want to do with the return values? Do you want to, say, print them immediately when each function returns? In that case the simplest thing is to do it directly from the tasks:
async def print_fruits():
print(await fruits())
async def print_numbers():
print(await numbers())
async with trio.open_nursery() as nursery:
nursery.start_soon(print_fruits)
nursery.start_soon(print_numbers)
You could even factor this into a helper function:
async def call_then_print(fn):
print(await fn())
async with trio.open_nursery() as nursery:
nursery.start_soon(call_then_print, fruits)
nursery.start_soon(call_then_print, numbers)
Or maybe you want to put them in a data structure to look at later?
results = {}
async def store_fruits_in_results_dict():
results["fruits"] = await fruits()
async def store_numbers_in_results_dict():
results["numbers"] = await numbers()
async with trio.open_nursery() as nursery:
nursery.start_soon(store_fruits_in_results_dict)
nursery.start_soon(store_numbers_in_results_dict)
# This is after the nursery block, so we know that the dict is fully filled in:
print(results["fruits"])
print(results["numbers"])
You can imagine fancier versions of those too – for example, sometimes when you run a lot of tasks in parallel you want to capture exceptions, not just return values, so that some tasks can still succeed even if some of them fail. For that you can use a try/except around each individual function, or the outcome library. Or when each operation finishes you could put its return value into a trio.Queue, so that another task can process the results as they're finished. But hopefully this gives you a good starting point :-)
In this case, simply create the arrays in the parent and pass each to the child that needs it.
More generally, pass an object to the tasks; they can set an attribute on it. You might also add an Event so that the parent can wait for the results to be available.

Resources