I am doing a project where I need to measure vibration levels caused by passing vehicles (using an accelerometer sampling at a high rate) and capture video of the event "SIMULTANEOUSLY" whenever the vibration exceeds a certain threshold value. I want both processes, i.e. data acquisition and video recording, to run simultaneously so that they are synchronized in time.
I am doing both things on a Raspberry Pi. To achieve my goal I have 2 strategies in mind.
1) Write functions for
a) Data acquisition
b) Recording video
and run them in the same script.
2) Or, while running the script for data acquisition, use a call to os.system to run a second script that captures video of the event.
Now I am curious: in what order are functions executed in Python? So if I write
while 1:
    if v_threshold > some_value:
        Acquire_data(); Capture_Video();
Are they executed simultaneously, or does Capture_Video() only begin once Acquire_data() has finished execution? Or, in general, how are functions executed in Python (or any other programming language)?
What would be your advice to achieve the goal? Writing functions or running 2 parallel scripts? Or should I use the multiprocessing module?
Thanks
Nischal
I would go for option 1, as you have more control over the whole process (e.g. you can easily wait until both subprocesses are done).
Your main setup would look like this:
from multiprocessing import Process
import time

def acquire_data(arg):
    for i in range(5):
        print('acquiring data: {}'.format(arg))
        time.sleep(1.1)

def capture_video():
    for i in range(5):
        print('capturing video')
        time.sleep(1)

if __name__ == '__main__':
    p_data = Process(target=acquire_data, args=('foo',))
    p_video = Process(target=capture_video)
    p_data.start()
    p_video.start()
    p_data.join()   # wait until acquire_data is done
    p_video.join()  # wait also until capture_video is done
Also: which Raspberry Pi model is it? If both data acquisition and video capture take 100% CPU, you will run into an issue on a single-core Raspberry Pi. The Model 3 has 4 cores, so this is fine.
I am measuring the metrics of an encryption algorithm that I designed. I have declared 2 functions and a brief sample is as follows:
import sys, random, timeit, psutil, os, time
from multiprocessing import Process
from subprocess import check_output
pid = 0

def cpuUsage():
    global running
    while pid == 0:
        time.sleep(1)
    running = True
    p = psutil.Process(pid)
    while running:
        print(f'PID: {pid}\t|\tCPU Usage: {p.memory_info().rss/(1024*1024)} MB')
        time.sleep(1)

def Encryption():
    global pid, running
    pid = os.getpid()
    myList = []
    for i in range(1000):
        myList.append(random.randint(-sys.maxsize, sys.maxsize) + random.random())
    print('Now running timeit function for speed metrics.')
    p1 = Process(target=metric_collector())
    p1.start()
    p1.join()
    number = 1000
    unit = 'msec'
    setup = '''
import homomorphic,random,sys,time,os,timeit
myList={myList}
'''
    enc_code = '''
for x in range(len(myList)):
    myList[x] = encryptMethod(a, b, myList[x], d)
'''
    dec_code = '''
\nfor x in range(len(myList)):
    myList[x] = decryptMethod(myList[x])
'''
    time = timeit.timeit(setup=setup,
                         stmt=(enc_code + dec_code),
                         number=number)
    running = False
    print(f'''Average Time:\t\t\t {time/number*.0001} seconds
Total time for {number} Iters:\t\t\t {time} {unit}s
Total Encrypted/Decrypted Values:\t {number*len(myList)}''')
    sys.exit()

if __name__ == '__main__':
    print('Beginning Metric Evaluation\n...\n')
    p2 = Process(target=Encryption())
    p2.start()
    p2.join()
I am sure there's an implementation error in my code, I'm just having trouble grabbing the PID for the encryption method and I am trying to make the overhead from other calls as minimal as possible so I can get an accurate reading of just the functionality of the methods being called by timeit. If you know a simpler implementation, please let me know. Trying to figure out how to measure all of the metrics has been killing me softly.
I've tried acquiring the pid a few different ways, but I only want to measure performance when timeit is run. Good chance I'll have to break this out separately and run it that way (instead of multiprocessing) to evaluate the function properly, I'm guessing.
There are at least three major problems with your code. The net result is that you are not actually doing any multiprocessing.
The first problem is here, and in a couple of other similar places:
p2 = Process(target=Encryption())
What this code passes to Process is not the function Encryption but the returned value from Encryption(). It is exactly the same as if you had written:
x = Encryption()
p2 = Process(target=x)
What you want is this:
p2 = Process(target=Encryption)
This code tells Python to create a new Process and execute the function Encryption() in that Process.
The second problem has to do with the way Python handles memory for Processes. Each Process lives in its own memory space. Each Process has its own local copy of global variables, so you cannot set a global variable in one Process and have another Process be aware of this change. There are mechanisms to handle this important situation, documented in the multiprocessing module. See the section titled "Sharing state between processes." The bottom line here is that you cannot simply set a global variable inside a Process and expect other Processes to see the change, as you are trying to do with pid. You have to use one of the approaches described in the documentation.
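A minimal sketch of one such mechanism, assuming all you need to share is a single integer such as the PID: a multiprocessing.Value lives in shared memory, so both the parent and the child see the same object (the worker function here is a hypothetical stand-in for your own code).

from multiprocessing import Process, Value
import os, time

def worker(shared_pid):
    # Publish this process's PID through shared memory.
    shared_pid.value = os.getpid()
    time.sleep(2)  # stand-in for the real work

if __name__ == '__main__':
    shared_pid = Value('i', 0)  # 'i' = signed int, initially 0
    p = Process(target=worker, args=(shared_pid,))
    p.start()
    while shared_pid.value == 0:  # wait until the child has written its PID
        time.sleep(0.1)
    print('child PID as seen by the parent:', shared_pid.value)
    p.join()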
The third problem is this code pattern, which occurs for both p1 and p2.
p2 = Process(target=Encryption)
p2.start()
p2.join()
This tells Python to create a Process and to start it. Then you immediately wait for it to finish, which means that your current Process must stop at that point until the new Process is finished. You never allow two Processes to run at once, so there is no performance benefit. The only reason to use multiprocessing is to run two things at the same time, which you never do. You might as well not bother with multiprocessing at all since it is only making your life more difficult.
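For comparison, here is a sketch of the pattern that actually runs two Processes at the same time (with stand-in functions, since the real ones are yours): start both Processes first, and only join them afterwards.

from multiprocessing import Process
import time

def cpu_usage_monitor():       # stand-in for cpuUsage()
    for _ in range(3):
        print('monitoring...')
        time.sleep(1)

def encryption_benchmark():    # stand-in for Encryption()
    print('benchmarking...')
    time.sleep(2)

if __name__ == '__main__':
    p1 = Process(target=cpu_usage_monitor)
    p2 = Process(target=encryption_benchmark)
    p1.start()
    p2.start()   # both Processes are now running concurrently
    p2.join()    # wait for the benchmark to finish
    p1.join()    # then wait for the monitor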
Finally I am not sure why you have decided to try to use multiprocessing in the first place. The functions that measure memory usage and execution time are almost certainly very fast, and I would expect them to be much faster than any method of synchronizing one Process to another. If you're worried about errors due to the time used by the diagnostic functions themselves, I doubt that you can make things better by multiprocessing. Why not just start with a simple program and see what results you get?
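For instance, a minimal single-process sketch along these lines (encrypt_many() here is a hypothetical placeholder for your real encryption loop) may already give you usable numbers:

import os
import timeit
import psutil

def encrypt_many(values):
    # hypothetical placeholder for the real encryption loop
    return [v * 2 for v in values]

values = list(range(1000))
proc = psutil.Process(os.getpid())

elapsed = timeit.timeit(lambda: encrypt_many(values), number=1000)
rss_mb = proc.memory_info().rss / (1024 * 1024)

print(f'average time per call: {elapsed / 1000:.6f} s')
print(f'resident memory:       {rss_mb:.1f} MB')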
I have 3 classes that represent nearly isolated processes that can be run concurrently (meant to be persistent, like 3 main() loops).
class DataProcess:
    ...
    def runOnce(self):
        ...

class ComputeProcess:
    ...
    def runOnce(self):
        ...

class OtherProcess:
    ...
    def runOnce(self):
        ...
Here's the pattern I'm trying to achieve:
start various streams
start each process
allow each process to publish to any stream
allow each process to listen to any stream (at various points in its loop) and behave accordingly (allowing for interruption of its current task or not, etc.)
For example, one 'process' listens for external data. Another process does computation on some of that data. The computation process might be busy for a while, so by the time it comes back and checks the stream, many values may have piled up. I don't want to just use a queue, because I don't want to be forced to process each item in order; I'd rather be able to implement logic like, "if there is one or more things waiting, just run your process one more time; otherwise go do this interruptible task while you wait for something to show up."
That's like a lot, right? So I was thinking of using an actor model until I discovered RxPy. I saw that a stream is like a subject
from reactivex.subject import BehaviorSubject
newData = BehaviorSubject()
newModel = BehaviorSubject()
then I thought I'd start 3 threads, one for each of my high-level processes:
import threading

thread = threading.Thread(target=data)
threads = {'data': thread}

thread = threading.Thread(target=compute)
threads['compute'] = thread

thread = threading.Thread(target=other)
threads['other'] = thread

for thread in threads.values():
    thread.start()
and I thought the functions of those threads should listen to the streams:
def data():
    while True:
        DataProcess().runOnce()  # publishes to stream inside process

def compute():
    def run():
        ComputeProcess().runOnce()
    newData.events.subscribe(run())
    newModel.events.subscribe(run())

def other():
    ''' not done '''
    ComputeProcess().runOnce()
Ok, so that's what I have so far. Is this pattern going to give me what I'm looking for?
Should I use threading in conjunction with rxpy or just use rxpy scheduler stuff to achieve concurrency? If so how?
I hope this question isn't too vague, I suppose I'm looking for the simplest framework where I can have a small number of computational-memory units (like objects because they have internal state) that communicate with each other and work in parallel (or concurrently). At the highest level I want to be able to treat these computational-memory units (which I've called processes above) as like individuals who mostly work on their own stuff but occasionally broadcast or send a message to a specific other individual, requesting information or providing information.
Am I perhaps actually looking for an actor-model framework? Or is this RxPy setup versatile enough to achieve that without extreme complexity?
Thanks so much!
I used to have a relatively simple Python 3 app that read a video streaming source and did continuous OpenCV and I/O-heavy operations (with files and databases):
cap_vid = cv2.VideoCapture(stream_url)
while True:
    # ...
    # openCV operations
    # database I/O operations
    # file I/O operations
    # ...
The app ran smoothly. However, there arose a need to do this not just with 1 channel, but with many, potentially 10 000 channels. Now, let's say I have a list of these channels (stream_urls). If I wrap my usual code inside for stream_url in stream_urls:, it will of course not work, because the iteration will never proceed further than the 0th index. So, the first thing that comes to mind is concurrency.
Now, as far as I know, there are 3 ways of doing concurrent programming in Python 3: threading, asyncio, and multiprocessing:
I understand that in the case of multiprocessing the OS creates new instances of the Python interpreter, so there can be at most as many instances as the machine has cores, which seldom exceeds 16; however, the number of processes needed could potentially be up to 10 000. Also, the overhead from using multiprocessing exceeds the performance gains if the number of processes is more than a certain amount, so this one appears to be useless.
The case of threading seems the easiest in terms of the machinery it uses. I'd just wrap my code in a function and create threads like the following:
from threading import Thread

def work_channel(ch_src):
    cap_vid = cv2.VideoCapture(ch_src)
    while True:
        # ...
        # openCV operations
        # database I/O operations
        # file I/O operations
        # ...

for stream_url in stream_urls:
    Thread(target=work_channel, args=(stream_url,), daemon=True).start()
But there are a few problems with threading: first, using more than 11-17 threads nullifies any of its favourable effects because of the overhead costs. Also, it's not safe to work with file I/O in threading, which is a very important concern for me.
I don't know how to use asyncio and couldn't find how to do what I want with it.
Given the above scenario, which one of the 3 concurrency methods (or more, if there are other methods I am unaware of) should I use for the fastest, most accurate (or at least predictable) performance? And how do I use that method correctly?
Any help is appreciated.
Spawning more threads than your computer has cores is very much possible.
Could be as simple as:
import threading

all_urls = [your url list]

def threadfunction(url):
    cap_vid = cv2.VideoCapture(url)
    while True:
        # ...
        # openCV operations
        # file I/O operations
        # ...

for stream_url in all_urls:
    threading.Thread(target=threadfunction, args=(stream_url,)).start()
I have a program that randomly selects 13 cards from a full pack and analyses the hands for shape, point count and some other features important to the game of bridge. The program will select and analyse 10**7 hands in about 5 minutes. Checking the Activity Monitor shows that during execution the CPU (which is a 6-core processor) devotes about 9% of its time to the program and is idle about 90% of the time. So it looks like a prime candidate for multiprocessing, and I created a multiprocessing version using a Queue to pass information from each process back to the main program. Having navigated the problems of IDLE not working with multiprocessing (I now run it using PyCharm) and of the program freezing when a join is done on a process before it has finished, I got it to work.
However, it doesn't matter how many processes I use (5, 10, 25 or 50), the result is always the same. The CPU devotes about 18% of its time to the program, is idle about 75% of the time, and the execution time is slightly more than double, at a bit over 10 minutes.
Can anyone explain how I can get the processes to take up more of the CPU time and how I can get the execution time to reflect this? Below are the relevant sections of the program:
import random
import collections
import datetime
import time
from math import log10
from multiprocessing import Process, Queue
NUM_OF_HANDS = 10**6
NUM_OF_PROCESSES = 25
def analyse_hands(numofhands, q):
    #code removed as not relevant to the problem
    q.put((distribution, points, notrumps))

if __name__ == '__main__':
    processlist = []
    q = Queue()
    handsperprocess = NUM_OF_HANDS // NUM_OF_PROCESSES
    print(handsperprocess)
    # Set up the processes and get them to do their stuff
    start_time = time.time()
    for _ in range(NUM_OF_PROCESSES):
        p = Process(target=analyse_hands, args=((handsperprocess, q)))
        processlist.append(p)
        p.start()
    # Allow q to get a few items
    time.sleep(.05)
    while not q.empty():
        while not q.empty():
            #code removed as not relevant to the problem
        # Allow q to be refreshed so allowing all processes to finish before
        # doing a join. It seems that doing a join before a process is
        # finished will cause the program to lock
        time.sleep(.05)
        counter['empty'] += 1
    for p in processlist:
        p.join()
    while not q.empty():
        # This is never executed as all the processes have finished and q
        # emptied before the join command above.
        #code removed as not relevant to the problem
    finish_time = time.time()
I have no answer as to why IDLE will not run a multiprocessing start instruction correctly, but I believe the answer to the doubling of the execution time lies in the type of problem I am dealing with. Perhaps others can comment, but it seems to me that the overhead involved in adding and removing items to and from the Queue is quite high, so performance improvements will be best achieved when the amount of data being passed via the Queue is small compared with the amount of processing required to obtain that data.
In my program I am creating and passing 10**7 items of data, and I suppose it is the overhead of passing this number of items via the Queue that kills any performance improvement from getting the data via separate Processes. Using a map would seem to require all 10**7 items of data to be stored in the map before any further processing can be done. This might improve performance, depending on the overhead of using the map and dealing with that amount of data, but for the time being I will stick with my original vanilla, single-process code.
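One way to test that theory is to make each worker aggregate its own results locally and put a single summary object on the Queue instead of 10**7 individual items. A rough sketch, with the real hand analysis reduced to a hypothetical count_hand() call:

from collections import Counter
from multiprocessing import Process, Queue
import random

def count_hand():
    # hypothetical stand-in for the real hand analysis
    return random.randint(0, 40)

def analyse_hands(numofhands, q):
    points = Counter()
    for _ in range(numofhands):
        points[count_hand()] += 1
    q.put(points)           # one Queue transfer per process, not one per hand

if __name__ == '__main__':
    q = Queue()
    procs = [Process(target=analyse_hands, args=(250000, q)) for _ in range(4)]
    for p in procs:
        p.start()
    totals = Counter()
    for _ in procs:
        totals += q.get()   # drain the queue before joining to avoid blocking
    for p in procs:
        p.join()
    print(totals.most_common(5))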
I'm trying to figure out how the async functionality works in Python. I have watched countless videos but I guess I'm not 'getting it'. My code looks as follows:
def run_watchers():
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    loop.run_until_complete(watcher_helper())
    loop.close()

async def watcher_helper():
    watchers = Watcher.objects.all()
    for watcher in watchers:
        print("Running watcher : " + str(watcher.pk))
        await watcher_helper2(watcher)

async def watcher_helper2(watcher):
    for i in range(1, 1000000):
        x = i * 1000 / 2000
What makes sense to me is to have three functions: one to start the loop, a second to iterate through the different options to execute, and a third to do the work.
I am expecting the following output:
Running watcher : 1
Running watcher : 2
...
...
Calculation done
Calculation done
...
...
however I am getting:
Running watcher : 1
Calculation done
Running watcher : 2
Calculation done
...
...
which obviously shows the calculations are not done in parallel. Any idea what I am doing wrong?
asyncio can only speed up multiple network-I/O-related functions (sending/receiving data over the internet). While you wait for data from the network (which may take a long time) you are usually idle. Using asyncio allows you to use this idle time for another useful job: for example, to start another parallel network request.
asyncio can't somehow speed up a CPU-bound job (which is what watcher_helper2 does in your example). While you are multiplying numbers there is simply no idle time that can be used to do something else and gain a benefit that way.
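If you want the watchers' calculations to actually run in parallel, one option (a sketch, assuming the work stays CPU-bound as in your example) is to hand the heavy function to a ProcessPoolExecutor through run_in_executor, so the event loop only coordinates while separate processes do the math:

import asyncio
from concurrent.futures import ProcessPoolExecutor

def heavy_calculation(watcher_id):
    # CPU-bound work runs in a worker process, not in the event loop
    for i in range(1, 1000000):
        x = i * 1000 / 2000
    return watcher_id

async def watcher_helper(watcher_ids):
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        tasks = [loop.run_in_executor(pool, heavy_calculation, w) for w in watcher_ids]
        for finished in await asyncio.gather(*tasks):
            print('Calculation done for watcher', finished)

if __name__ == '__main__':
    asyncio.run(watcher_helper([1, 2, 3]))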
Read also this answer for a more detailed explanation.