I am new to asyncio (used with Python 3.4) and I am not sure I use it as one should. I have seen in this thread that it can be used to execute a function every n seconds (in my case ms) without having to dive into threading.
I use it to get data from laser sensors through a basic serial protocol every n ms until I get m samples.
Here are the definitions of my functions:
def countDown( self,
               loop,
               funcToDo,
               *args,
               counter = [ 1 ],
               **kwargs ):
    """ At every call, executes funcToDo (passing it args and kwargs)
    and counts down from counter to 0. Then it stops loop. """
    if counter[ 0 ] == 0:
        loop.stop()
    else:
        funcToDo( *args, **kwargs )
        counter[ 0 ] -= 1

def _frangeGen( self, start = 0, stop = None, step = 1 ):
    """ Used to generate a time frange from start to stop by step step """
    while stop is None or start < stop:
        yield start
        start += step

def callEvery( self,
               loop,
               interval,
               funcToCall,
               *args,
               now = True,
               **kwargs ):
    """ Repeats funcToCall every interval sec in loop object """
    nb = kwargs.get( 'counter', [ 1000 ] )
    def repeat( now = True,
                times = self._frangeGen( start = loop.time(),
                                         stop = loop.time() + nb[0] * interval,
                                         step = interval ) ):
        if now:
            funcToCall( *args, **kwargs )
        loop.call_at( next( times ), repeat )
    repeat( now = now )
And this is how I use it (getAllData is the function that manages the serial communication):
ts = 0.01
nbOfSamples = 1000
loop = asyncio.get_event_loop()
callEvery( loop, ts, countDown, loop, getAllData, counter = [nbOfSamples] )
loop.run_forever()
I want to put that block into a function and call it as often as I want, something like this:
for i in range( nbOfMeasures ):
    myFunction()
    processData()
But the second test does not call getAllData 1000 times, only two or three times. The interesting thing is that about one time in two I do get as much data as I want. I don't really understand this, and I can't find anything in the docs, so I am asking for your help. Any explanation, or an easier way to do it, is gladly welcome :)
You are complicating things too much and, generally speaking, using recursion when you have an event loop is bad design.
asyncio is fun only when you make use of coroutines. Here's one way of doing it:
import asyncio as aio

def get_laser_data():
    """
    get data from the laser using blocking IO
    """
    ...

@aio.coroutine
def get_samples(loop, m, n):
    """
    loop = asyncio event loop
    m = number of samples
    n = time between samples
    """
    samples = []
    while len(samples) < m:
        sample = yield from loop.run_in_executor(None, get_laser_data)
        samples.append(sample)
        yield from aio.sleep(n)
    return samples

@aio.coroutine
def main(loop):
    for i in range(nbOfMeasures):
        samples = yield from get_samples(loop, 1000, 0.01)
        ...

loop = aio.get_event_loop()
loop.run_until_complete(main(loop))
loop.close()
If you are completely confused by this, consider reading some tutorials/documentation about asyncio.
But I would like to point out that you must use a thread to get the data from the laser sensor. Doing any blocking IO in the same thread that the event loop runs in will block the loop and throw off aio.sleep. This is what yield from loop.run_in_executor(None, get_laser_data) is for: it runs the get_laser_data function in a separate thread.
In Python 3.5, you can make use of the async for syntax and create an asynchronous iterator to control your time frames. It has to implement the __aiter__ and __anext__ methods:
import asyncio
import collections.abc

class timeframes(collections.abc.AsyncIterator):

    def __init__(self, steps, delay=1.0, *, loop=None):
        self.loop = asyncio.get_event_loop() if loop is None else loop
        self.ref = self.loop.time()
        self.delay = delay
        self.steps = steps
        self.iter = iter(range(steps))

    async def __anext__(self):
        try:
            when = self.ref + next(self.iter) * self.delay
        except StopIteration:
            raise StopAsyncIteration
        else:
            future = asyncio.Future()
            self.loop.call_at(when, future.set_result, None)
            await future
            return self.loop.time()

    def __aiter__(self):
        # a plain method since Python 3.5.2+ (very early 3.5 releases expected a coroutine here)
        return self
Here's a coroutine that simulates an execution:
async def simulate(steps, delay, execution):
    # Prepare timing
    start = loop.time()
    expected = steps * delay - delay + execution
    # Run simulation
    async for t in timeframes(steps, delay):
        await loop.run_in_executor(None, time.sleep, execution)
    # Return error
    result = loop.time() - start
    return result - expected
And this is the kind of result you'll get on a Linux OS:
>>> loop = asyncio.get_event_loop()
>>> simulation = simulate(steps=1000, delay=0.020, execution=0.014)
>>> error = loop.run_until_complete(simulation)
>>> print("Overall error = {:.3f} ms".format(error * 1000))
Overall error = 1.199 ms
It is different on a Windows OS (see this answer), but the event loop will catch up and the overall error should never exceed 15 ms.
Related
Please help me find out where the mistake is; I want to run two functions in parallel. Thanks!
import asyncio
import time

async def test1():
    for i in range(10):
        print(f"First function {i}")

async def test2():
    for i in range(10):
        print(f"Second function {i}")

async def main():
    print(f"started at {time.strftime('%X')}")
    F = asyncio.create_task(test1())
    S = asyncio.create_task(test2())
    tasks = (F, S)
    await asyncio.gather(*tasks)
    print(f"finished at {time.strftime('%X')}")

asyncio.run(main())
Instead I get the following output, which shows nothing resembling parallel execution:
started at 10:42:53
First function 0
First function 1
First function 2
First function 3
First function 4
First function 5
First function 6
First function 7
First function 8
First function 9
Second function 0
Second function 1
Second function 2
Second function 3
Second function 4
Second function 5
Second function 6
Second function 7
Second function 8
Second function 9
finished at 10:42:53
Ok, finally I've rewritten my code to work with threads, and it works perfectly, but I couldn't figure out how to catch return values from the threaded functions, so at first my code looked like this:
import threading
import time

def test1():
    for i in range(10):
        print(f"First function {i}")
    return 11

def test2():
    for i in range(10,20):
        print(f"Second function {i}")

def main():
    t = time.time()
    t1 = threading.Thread(target=test1)
    t2 = threading.Thread(target=test2)
    t1.start()
    t2.start()
    print(t1.join())
    t2.join()
    print("Woaahh!! My work is finished..")
    print("I took " + str(time.time() - t))

main()
And with the help of this thread I've changed it a bit to handle return values (see here).
So now my code looks like this:
import time
import concurrent.futures

def test1():
    for i in range(10):
        print(f"First function {i}")
    return 11

def test2():
    for i in range(10,20):
        print(f"Second function {i}")
    return 12

def main():
    t = time.time()
    with concurrent.futures.ThreadPoolExecutor() as executor:
        future = executor.submit(test1)
        future2 = executor.submit(test2)
        return_value = future.result()
        rv = future2.result()
        print(return_value)
        print(rv)
    print("Woaahh!! My work is finished..")
    print("I took " + str(time.time() - t))

main()
I do not know what the problem in your code is, but if you want to run the two functions in parallel, I think you can try something like this, using threads:
threading.Thread(target=test1).start()
threading.Thread(target=test2).start()
https://docs.python.org/fr/3/library/threading.html
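For example, a slightly fuller sketch along the same lines (my own elaboration, reusing test1 and test2 from the question, with joins so the main program waits for both threads to finish):

import threading

# test1 and test2 are the functions defined in the question above
t1 = threading.Thread(target=test1)
t2 = threading.Thread(target=test2)
t1.start()   # both threads run concurrently from here
t2.start()
t1.join()    # wait for both to finish before continuing
t2.join()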
Here is another example, handling the functions' output using queues:
import threading
import time
import queue

def test1(q):
    for i in range(10):
        print(f"First function {i}")
    q.put(1)

def test2(q):
    for i in range(10,20):
        print(f"Second function {i}")
    q.put(2)

def main():
    q = queue.Queue()
    t = time.time()
    t1 = threading.Thread(target=test1, args=(q, ))
    t2 = threading.Thread(target=test2, args=(q, ))
    t1.start()
    t2.start()
    print(t1.join())
    t2.join()
    print("Woaahh!! My work is finished..")
    print("I took " + str(time.time() - t))
    while not q.empty():
        print(q.get())

main()
I'm trying to get one or several return values from a thread in a multithreaded process. The code I show gets stuck in a loop, with no way to interrupt it with Ctrl+C or Ctrl+D.
import queue as Queue
import threading

class myThread (threading.Thread):
    def __init__(self, threadID, name, region):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.region = region

    def run(self):
        GetSales(self.region)

def GetSales(strReg):
    print("Thread-" + strReg)
    return "Returning-" + strReg

def Main():
    RegionList = []
    RegionList.append("EMEA")
    RegionList.append("AP")
    RegionList.append("AM")

    # Create threads
    threads = []
    x = 0
    for region in RegionList:
        x += 1
        rthread = myThread(x, "Thread-" + region, region)  # Create new thread
        rthread.start()                                    # Start new thread
        threads.append(rthread)                            # Add new thread to threads list

    que = Queue.Queue()

    # Wait for all threads to complete
    for t in threads:
        t.join()
        result = que.get()
        print(t.name + " -> Done")

Main()
If I comment out the line result = que.get(), the program runs with no issues.
What you are looking for is futures and async task management.
Firstly, your program loops indefinitely because of the line que.get(): since nothing is ever put into the queue, it waits for something that will never happen. You never actually use the queue.
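(For reference, a minimal sketch of what the thread-based version would need for que.get() to ever return: the queue has to be shared with the threads and each thread has to put its result on it.)

import queue
import threading

def GetSales(strReg):
    print("Thread-" + strReg)
    return "Returning-" + strReg

def worker(region, que):
    # put the return value on the shared queue so Main() can read it later
    que.put(GetSales(region))

def Main():
    que = queue.Queue()
    threads = [threading.Thread(target=worker, args=(region, que))
               for region in ("EMEA", "AP", "AM")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
        print(que.get())  # each joined thread has already put one item, so this never blocks

Main()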
What you want to do is run async tasks and get their results:
import asyncio

async def yourExpensiveTask():
    # some long calculation
    return 42

async def main():
    tasks = []
    tasks += [asyncio.create_task(yourExpensiveTask())]
    tasks += [asyncio.create_task(yourExpensiveTask())]
    for task in tasks:
        result = await task
        print(result)

asyncio.run(main())
See also https://docs.python.org/3/library/asyncio-task.html
I'm trying to make nested asynchronous API calls with asyncio, using ensure_future() and gather().
I have tried two methods of getting this to work.
First of all, the API is written with aiohttp and works fine.
I have tried it with two methods (here named get_rows_working() and get_rows_not_working()). One is working and one is not.
A single row always does its API calls in parallel to increase speed.
Now what I'm trying to do is get all rows pulled in parallel as well.
async def get_single_row(api):
    tasks = []
    tasks.append(asyncio.ensure_future(api.get_some_data()))
    tasks.append(asyncio.ensure_future(api.get_some_data2()))
    resp = await asyncio.gather(*tasks)
    data = resp[0]
    data2 = resp[1]
    extra_data = data['some_key']
    extra_data2 = data2['some_key2']
    return (extra_data, extra_data2)

async def get_rows_working(rows):
    data = []
    for r in rows:
        api = API(r)
        data.append(await get_single_row(api))
    return data

async def get_rows_not_working(rows):
    tasks = []
    for r in rows:
        api = API(r)
        tasks.append(asyncio.ensure_future(get_single_row(api)))
    data = await asyncio.gather(*tasks)
    return data

loop = asyncio.get_event_loop()
loop.run_until_complete(get_rows_working(rows))
loop.run_until_complete(get_rows_not_working(rows))
What happens if you start nesting these?
Because I'm starting to get KeyErrors on these lines (which I don't get with get_rows_working()):
extra_data = data['some_key']
extra_data2 = data2['some_key2']
Which makes me believe the internal order of operations gets all wonky because of nesting these.
I'm not sure how to describe it better, sorry.
Is this even to correct way to achieve this?
Thanks for any answers.
I don't think the KeyError exceptions are related to the way your code is structured.
In order to reproduce your results, I mocked your API calls using asyncio.sleep():
import asyncio

class API:
    def __init__(self, r):
        self.r = r

    async def get_some_data(self, i):
        await asyncio.sleep(3)
        return {'key_{}'.format(i) : 'Data_{}___Row_{}'.format(i, self.r)}

async def get_single_row(api):
    tasks = []
    tasks.append(asyncio.ensure_future(api.get_some_data(0)))
    tasks.append(asyncio.ensure_future(api.get_some_data(1)))
    resp = await asyncio.gather(*tasks)
    data_0 = resp[0]
    data_1 = resp[1]
    extra_data_0 = data_0['key_0']
    extra_data_1 = data_1['key_1']
    return (extra_data_0, extra_data_1)

async def get_rows_working(rows):
    data = []
    for r in rows:
        api = API(r)
        data.append(await get_single_row(api))
    return data

async def get_rows_not_working(rows):
    tasks = []
    for r in rows:
        api = API(r)
        tasks.append(asyncio.ensure_future(get_single_row(api)))
    data = await asyncio.gather(*tasks)
    return data
Then I added a timer and ran both functions to understand what happens:
import time

class Timer:
    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, *args):
        self.end = time.perf_counter()
        self.interval = self.end - self.start

loop = asyncio.get_event_loop()
rows = range(10)

with Timer() as t:
    res = loop.run_until_complete(get_rows_working(rows))
print("get_rows_working() result : {}".format(res))
print('get_rows_working() API call took %.03f sec.\n' % t.interval)

with Timer() as t:
    res = loop.run_until_complete(get_rows_not_working(rows))
print("get_rows_not_working() result : {}".format(res))
print('get_rows_not_working() API call took %.03f sec.' % t.interval)
Output:
get_rows_working() result : [('Data_0___Row_0', 'Data_1___Row_0'), ('Data_0___Row_1', 'Data_1___Row_1'), ('Data_0___Row_2', 'Data_1___Row_2'), ('Data_0___Row_3', 'Data_1___Row_3'), ('Data_0___Row_4', 'Data_1___Row_4'), ('Data_0___Row_5', 'Data_1___Row_5'), ('Data_0___Row_6', 'Data_1___Row_6'), ('Data_0___Row_7', 'Data_1___Row_7'), ('Data_0___Row_8', 'Data_1___Row_8'), ('Data_0___Row_9', 'Data_1___Row_9')]
get_rows_working() API call took 30.034 sec.
get_rows_not_working() result : [('Data_0___Row_0', 'Data_1___Row_0'), ('Data_0___Row_1', 'Data_1___Row_1'), ('Data_0___Row_2', 'Data_1___Row_2'), ('Data_0___Row_3', 'Data_1___Row_3'), ('Data_0___Row_4', 'Data_1___Row_4'), ('Data_0___Row_5', 'Data_1___Row_5'), ('Data_0___Row_6', 'Data_1___Row_6'), ('Data_0___Row_7', 'Data_1___Row_7'), ('Data_0___Row_8', 'Data_1___Row_8'), ('Data_0___Row_9', 'Data_1___Row_9')]
get_rows_not_working() API call took 3.008 sec.
Which means that the second function get_rows_not_working() actually works as expected and calls the API concurrently.
Is it possible that you are getting KeyError exceptions because the API returns empty data when you exceed the request rate limit? For example, if the API is implemented as:
MAX_CONCUR_ROWS = 5

class API:
    connections = 0

    def __init__(self, r):
        self.r = r

    async def get_some_data(self, i):
        API.connections += 1
        await asyncio.sleep(3)
        if API.connections > MAX_CONCUR_ROWS * 2:
            res = {}
        else:
            res = {'key_{}'.format(i) : 'Data_{}___Row_{}'.format(i, self.r)}
        API.connections -= 1
        return res
Then get_rows_not_working() would return KeyError: 'key_0', while get_rows_working() works fine.
If that's the case, then you would want to throttle your requests, by batching them or by using asyncio.Semaphore:
async def get_single_row(api, semaphore):
    # using tasks instead of coroutines won't work here, because
    # asyncio.ensure_future() starts the coroutine immediately,
    # so the semaphore would have no effect
    coros = []
    coros.append(api.get_some_data(0))
    coros.append(api.get_some_data(1))
    async with semaphore:
        resp = await asyncio.gather(*coros)
    data_0 = resp[0]
    data_1 = resp[1]
    extra_data_0 = data_0['key_0']
    extra_data_1 = data_1['key_1']
    return (extra_data_0, extra_data_1)

async def get_rows_not_working(rows):
    semaphore = asyncio.Semaphore(MAX_CONCUR_ROWS)
    tasks = []
    for r in rows:
        api = API(r)
        tasks.append(asyncio.ensure_future(get_single_row(api, semaphore)))
    data = await asyncio.gather(*tasks)
    return data
The above code doesn't process more than 5 rows concurrently and returns the expected output (notice that it now takes 6 sec. instead of 3):
get_rows_not_working() result : [('Data_0___Row_0', 'Data_1___Row_0'), ('Data_0___Row_1', 'Data_1___Row_1'), ('Data_0___Row_2', 'Data_1___Row_2'), ('Data_0___Row_3', 'Data_1___Row_3'), ('Data_0___Row_4', 'Data_1___Row_4'), ('Data_0___Row_5', 'Data_1___Row_5'), ('Data_0___Row_6', 'Data_1___Row_6'), ('Data_0___Row_7', 'Data_1___Row_7'), ('Data_0___Row_8', 'Data_1___Row_8'), ('Data_0___Row_9', 'Data_1___Row_9')]
get_rows_not_working() API call took 6.013 sec.
I've recently converted my old template matching program to asyncio and I have a situation where one of my coroutines relies on a blocking method (processing_frame).
I want to run that method in a separate thread or process whenever the coroutine that calls that method (analyze_frame) gets an item from the shared asyncio.Queue().
I'm not sure if that's possible or worth it performance-wise, since I have very little experience with threading and multiprocessing.
import cv2
import datetime
import argparse
import os
import asyncio

# Making CLI
if not os.path.exists("frames"):
    os.makedirs("frames")

t0 = datetime.datetime.now()
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", required=True,
                help="path to our file")
args = vars(ap.parse_args())

threshold = .2
death_count = 0
was_found = False
template = cv2.imread('youdied.png')
vidcap = cv2.VideoCapture(args["video"])

loop = asyncio.get_event_loop()
frames_to_analyze = asyncio.Queue()

def main():
    length = int(vidcap.get(cv2.CAP_PROP_FRAME_COUNT))
    tasks = []
    for _ in range(int(length / 50)):
        tasks.append(loop.create_task(read_frame(50, frames_to_analyze)))
        tasks.append(loop.create_task(analyze_frame(threshold, template, frames_to_analyze)))
    final_task = asyncio.gather(*tasks)
    loop.run_until_complete(final_task)
    dt = datetime.datetime.now() - t0
    print("App exiting, total time: {:,.2f} sec.".format(dt.total_seconds()))
    print(f"Deaths registered: {death_count}")

async def read_frame(frames, frames_to_analyze):
    global vidcap
    for _ in range(frames-1):
        vidcap.grab()
    else:
        current_frame = vidcap.read()[1]
    print("Read 50 frames")
    await frames_to_analyze.put(current_frame)

async def analyze_frame(threshold, template, frames_to_analyze):
    global vidcap
    global was_found
    global death_count
    frame = await frames_to_analyze.get()
    is_found = processing_frame(frame)
    if was_found and not is_found:
        death_count += 1
        await writing_to_file(death_count, frame)
    was_found = is_found

def processing_frame(frame):
    res = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
    max_val = cv2.minMaxLoc(res)[1]
    is_found = max_val >= threshold
    print(is_found)
    return is_found

async def writing_to_file(death_count, frame):
    cv2.imwrite(f"frames/frame{death_count}.jpg", frame)

if __name__ == '__main__':
    main()
I've tried using unsync but without much success
I would get something along the lines of
with self._rlock:
PermissionError: [WinError 5] Access is denied
If processing_frame is a blocking function, you should call it with await loop.run_in_executor(None, processing_frame, frame). That will submit the function to a thread pool and allow the event loop to proceed with doing other things until the call function completes.
The same goes for calls such as cv2.imwrite. As written, writing_to_file is not truly asynchronous, despite being defined with async def. This is because it doesn't await anything, so once its execution starts, it will proceed to the end without ever suspending. In that case one could as well make it a normal function in the first place, to make it obvious what's going on.
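For illustration, here is a rough sketch of what that could look like in the question's analyze_frame (my own adaptation, assuming the module-level loop from the question; not a tested drop-in):

async def analyze_frame(threshold, template, frames_to_analyze):
    global was_found, death_count
    frame = await frames_to_analyze.get()
    # run the blocking template matching in the default thread pool
    is_found = await loop.run_in_executor(None, processing_frame, frame)
    if was_found and not is_found:
        death_count += 1
        # cv2.imwrite is blocking as well, so hand it off to the executor too
        await loop.run_in_executor(None, cv2.imwrite,
                                   f"frames/frame{death_count}.jpg", frame)
    was_found = is_found

With both blocking calls pushed into the executor, the event loop stays free to keep running the other read_frame/analyze_frame tasks while a frame is being processed or written.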
I want to achieve the same effect as
# Code 1
from multiprocessing.pool import ThreadPool as Pool
from time import sleep, time

def square(a):
    print('start', a)
    sleep(a)
    print('end', a)
    return a * a

def main():
    p = Pool(2)
    queue = list(range(4))
    start = time()
    results = p.map(square, queue)
    print(results)
    print(time() - start)

if __name__ == "__main__":
    main()
with async functions like
# Code 2
from multiprocessing.pool import ThreadPool as Pool
from time import sleep, time
import asyncio

async def square(a):
    print('start', a)
    sleep(a)  # await asyncio.sleep same effect
    print('end', a)
    return a * a

async def main():
    p = Pool(2)
    queue = list(range(4))
    start = time()
    results = p.map_async(square, queue)
    results = results.get()
    results = [await result for result in results]
    print(results)
    print(time() - start)

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
    loop.close()
Currently Code 1 takes 4 seconds and Code 2 takes 6 seconds, which means it is not running in parallel. What is the correct and cleanest way to run multiple async functions in parallel?
Ideally it should be Python 3.6 compatible. Thank you!
map_async() is not the same "async" as in async def: if it is fed an async def function, it won't actually run it but will return a coroutine object immediately (try calling such a function without await). You then awaited the 4 coroutines one by one, which amounts to sequential execution, and ended up with 6 seconds.
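A minimal way to see that for yourself:

import asyncio

async def square(a):
    await asyncio.sleep(a)
    return a * a

c = square(2)
print(c)  # <coroutine object square at 0x...>: nothing has run yet
loop = asyncio.get_event_loop()
print(loop.run_until_complete(c))  # 4: the coroutine only runs once the loop drives it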
Please see the following example:
from time import time
import asyncio
from asyncio.locks import Semaphore

semaphore = Semaphore(2)

async def square(a):
    async with semaphore:
        print('start', a)
        await asyncio.sleep(a)
        print('end', a)
        return a * a

async def main():
    start = time()
    tasks = []
    for a in range(4):
        tasks.append(asyncio.ensure_future(square(a)))
    await asyncio.wait(tasks)
    print([t.result() for t in tasks])
    print(time() - start)

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
    loop.close()
The Semaphore acts similarly to the ThreadPool: it allows only 2 coroutines at a time to enter the async with semaphore: block.
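If you would rather collect the results in input order without keeping the Task objects around, asyncio.gather (also available in Python 3.6) does the same job. Reusing square and the semaphore from the example above, main() could be written as:

async def main():
    start = time()
    # gather schedules all coroutines concurrently and returns their results in order
    results = await asyncio.gather(*(square(a) for a in range(4)))
    print(results)  # [0, 1, 4, 9]
    print(time() - start)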