Run two functions in parallel in Python - python-3.x

Currently I have a program which scans for all available BLE devices and stores their addresses in a list bt_addrs. This list is then passed to a function named data() which does the actual work. To go over all the elements in the list I have used ProcessPoolExecutor.
One issue I am facing is that the scanning happens only once and then stops. I want the scanning function to run continuously, so that if a new device comes into the vicinity it gets added to the list, the list gets updated, and the data() function works on that updated list in parallel.
I have intermediate knowledge of Python, and multiprocessing is an advanced topic for me, so any help would improve my understanding and help me get the desired result for the whole program.
If you have any other approach, please tell.
I am running this code on a Raspberry Pi 4 (quad-core Arm v8 processor), Python version 3.7.3.
Below is the code I currently have:
import sys
import time
from bluepy import btle
from bluepy.btle import Scanner
import concurrent.futures
import signal

bt_addrs = []

def data(mac_adrs3):
    while True:
        # main work happens here
        pass

def main():
    scanner = Scanner()           # scanner object
    devices = scanner.scan(30.0)  # scans for 30 sec
    available_devices = []
    for dev in devices:
        available_devices.append(dev.addr)  # all the MAC addresses are stored
    bt_addrs = available_devices
    with concurrent.futures.ProcessPoolExecutor() as executor:
        executor.map(data, bt_addrs)  # data() runs on all the elements in list bt_addrs

if __name__ == "__main__":
    main()
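What I was thinking of trying is something along these lines: keep scanning in a loop and hand every newly seen address to the pool with submit() instead of map(). This is only a rough, untested sketch (the 10-second scan interval is just a placeholder), and I realise the default pool size means only a few data() loops can run at once on the Pi's four cores:

import concurrent.futures
from bluepy.btle import Scanner

def data(mac_addr):
    while True:
        # main work happens here
        pass

def main():
    seen = set()  # addresses already handed to the pool
    with concurrent.futures.ProcessPoolExecutor() as executor:
        scanner = Scanner()
        while True:
            devices = scanner.scan(10.0)  # re-scan every 10 s (placeholder value)
            for dev in devices:
                if dev.addr not in seen:
                    seen.add(dev.addr)
                    executor.submit(data, dev.addr)  # start data() for the new device

if __name__ == "__main__":
    main()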

Related

Pyftdi issue with multiprocessing

I'm currently using an FT232H breakout board (USB GPIO) from Adafruit that I am controlling with Pyftdi on Windows, using the libusbK 3.0.7 driver installed through Zadig. It's working just fine in every aspect, but for this particular project I need to use the multiprocessing module.
However, I can't get Pyftdi to work with it. To replicate my issue, you can just run this bit of code.
import multiprocessing as mp
import board

def func():
    print('This will crash')

if __name__ == '__main__':
    p1 = mp.Process(target=func)
    p1.start()
    p1.join()
    p1.terminate()
From what I can gather, the problem is that when instantiating a new process, Python will once again import the board module, required to run the FT232H, and will attempt to claim its USB interface, which is already claimed, throwing this error:
pyftdi.ftdi.FtdiError: UsbError: [Errno None] b'libusb0-dll:err [claim_interface] could not claim interface 0, win error: Cannot create a file when that file already exists.
However, if I work around this so that the board module is not imported a second time for the new process, any FT232H commands run in the new process will not work.
Does anyone have any ideas on how I can tackle this?
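For reference, the workaround I mentioned, keeping board out of the child by importing it only under the main guard so the spawned process does not re-import it, looks roughly like this (untested sketch):

import multiprocessing as mp

def func():
    # board is deliberately not imported here, so this process never tries to
    # claim the FT232H interface, but FT232H commands from here then fail
    print('child process without board')

if __name__ == '__main__':
    import board  # only the parent claims the USB interface
    p1 = mp.Process(target=func)
    p1.start()
    p1.join()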

Need to detect any usb block device with specific partition label

I'm surprised this is proving to be difficult to find.
I need to detect when a USB block device with a specific partition label is added (plugged in) using python3.
Is there a way to use pyudev to provide a list of USB block devices? How can I specify a filter with subsystem="block" AND subsystem="usb"? They seem to be mutually exclusive filters.
When a USB device having a partition named "XYZ" is plugged in, I need to run a script to mount it and run a program that uses the data on that partition.
I have tried too many variations to count, from various udev rules, systemd units, many scripts and combinations thereof, but had no success until I used the following code. It worked, but caused 100% CPU load. When I added sleep time in the while loop at the end, it no longer worked at all, and even prevented PCManFM from automounting as well.
The issue was in the usbEvent.py process. I could run it from the command line and it worked just fine. The first thing it does is use Popen in a loop to call "grep devName /proc/mounts", waiting for the automounter to mount the partition. Adding some time.sleep to that loop eliminated the CPU burden, though that was surprising, since the mount point appears within a few seconds.
There seems to be some interplay between the code below, which systemd runs, and the usbEvent.py process it spawns, that I don't fully understand. They are separate processes, so I would think they should be quite independent of each other.
The usbEvent.py handler works, but it takes much longer to recognize the mount and continue. While it's running it consumes around 5% of the CPU, and only 0.3% when it finishes. Why it doesn't end when the timeout is over must be due to p.communicate, but if p.poll does NOT return None the process should be complete and should not block... but it does! Why?
The platform is a Raspberry Pi 4 with 8 GB RAM and the January 2021 Raspberry Pi OS release.
#!/usr/bin/env python3
import os
import time
import subprocess as sp
import pyudev

# This code is run on boot via systemd to detect when
# my custom USB storage device (USB stick, SSD etc)
# is inserted or removed. It spawns a new process to
# handle the event.

context = pyudev.Context()
monitor = pyudev.Monitor.from_netlink(context)
monitor.filter_by('block', device_type="partition")

def log_event(action, device):
    devName = device.get('DEVNAME')
    devLabel = device.get('ID_FS_LABEL')
    if devLabel == "MY_CUSTOM_USB":
        sp.Popen(["/home/user/bin/customUSB/usbEvent.py",
                  action, devName, devLabel],
                 stdin=sp.DEVNULL, stdout=sp.DEVNULL, stderr=sp.DEVNULL)

observer = pyudev.MonitorObserver(monitor, log_event)
observer.start()

while True:
    # pass
    time.sleep(0.1)
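On the question of combining subsystem="block" with subsystem="usb": the closest I have come up with so far (untested sketch) is to filter on block partitions and then check whether the device has a USB ancestor via find_parent():

import pyudev

context = pyudev.Context()

# list block partitions and keep only those that hang off a USB device
for device in context.list_devices(subsystem='block', DEVTYPE='partition'):
    if device.find_parent('usb') is not None:
        print(device.get('DEVNAME'), device.get('ID_FS_LABEL'))

The same find_parent() check could presumably also be added inside log_event() above to ignore non-USB partitions.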
Here is the portion of the usbEvent.py handler that I changed to get it working:
# Waits for the mount point of "dev" to appear and returns it.
def getMountPoint(dev):
    out = ""
    interval = 0.1
    timeout = 5 / interval
    while timeout > 0:
        p = sp.Popen(["grep", dev, "/proc/mounts"],
                     text=True, stdout=sp.PIPE, stderr=sp.PIPE)
        retCode = p.poll()
        if retCode is None:
            time.sleep(interval)
            timeout -= 1  # count down the 5 s timeout
        else:
            out, err = p.communicate()  # This should not block but does!
            if retCode == 0 and len(out) > 0:
                out = out.split()[1]
                break
            else:
                lg.info(f"exit code: {retCode} Error: {err}")
                exit(1)
    if timeout == 0:
        p.terminate()
    return out
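One alternative I am considering, to avoid spawning grep entirely, is to read /proc/mounts directly in Python; a rough, untested sketch:

import time

def getMountPoint(dev, timeout=5.0, interval=0.1):
    # poll /proc/mounts until "dev" shows up, instead of running grep in a subprocess
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with open("/proc/mounts") as mounts:
            for line in mounts:
                fields = line.split()
                if fields and fields[0] == dev:
                    return fields[1]  # the mount point is the second field
        time.sleep(interval)
    return ""  # not mounted within the timeout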

Python 3.8 RAM overflow and loading issues

First, I want to mention that this is our first project on a bigger scale, and therefore we don't know everything yet, but we learn fast.
We developed code for image recognition. We tried it with a Raspberry Pi 4B but quickly found that it is way too slow overall. Currently we are using an NVIDIA Jetson Nano. The first recognition was OK (around 30 sec) and the second try was even better (around 6-7 sec). The first took so long because the model is loaded for the first time. The image recognition can be triggered via an API, and the metadata from the AI model is the response. We use FastAPI for this.
But there is a problem right now: if I load my CNN as a global variable at the beginning of my classification file (loaded on import) and use it within a separate process, I need to use mp.set_start_method('spawn'), because otherwise I get the following error:
"RuntimeError: Cannot re-initialize CUDA in forked subprocess.
To use CUDA with multiprocessing, you must use the 'spawn' start method"
Now that is of course an easy fix: just add the call above before starting the process. Indeed this works, but another challenge occurs at the same time. After setting the start method to 'spawn', the error disappears, but the Jetson starts to allocate way too much memory.
Because of the overhead and the preloaded CNN model, RAM usage is around 2.5 GB before the process starts. After the start it doesn't stop allocating RAM; it consumes all 4 GB of RAM and also the whole 6 GB of swap. Right after this, the whole API process is killed with the error "cannot allocate memory", which is obvious.
I managed to fix that as well, just by loading the CNN model inside the classification function (not preloading it onto the GPU as in the two cases before). However, here I have a problem as well: loading the model onto the GPU takes around 15-20 s, and this happens every time the recognition starts. This is not suitable for us, and we are wondering why we cannot preload the model without killing the whole thing after two image recognitions. Our goal is to be under 5 sec with this.
# classify
import torchvision.transforms as transforms
from skimage import io
import time
import torch
import multiprocessing as mp
from torch.utils.data import Dataset
from .loader import *
from .ResNet import *

# if this part is inside the classify() function, no allocation problem occurs
net = ResNet152(num_classes=25)
net = net.to('cuda')
save_file = torch.load("./model.pt", map_location=torch.device('cuda'))
net.load_state_dict(save_file)

def classify(imgp="", return_dict=None):
    # do some classification with the net
    pass

if __name__ == '__main__':
    mp.set_start_method('spawn')  # if commented out, the first error occurs
    manager = mp.Manager()
    return_dict = manager.dict()
    p = mp.Process(target=classify, args=('./bild.jpg', return_dict))
    p.start()
    p.join()
    print(return_dict.values())
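For reference, the variant that avoids the memory problem, but pays the 15-20 s model load on every call, looks roughly like this (a sketch of what I described above, assuming the same imports as the snippet before; not the exact code):

def classify(imgp="", return_dict=None):
    # loading the model here, inside the spawned process, avoids the allocation
    # problem, but costs 15-20 s for every single recognition
    net = ResNet152(num_classes=25)
    net = net.to('cuda')
    save_file = torch.load("./model.pt", map_location=torch.device('cuda'))
    net.load_state_dict(save_file)
    # do some classification with the net
    pass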
Any help here will be much appreciated. Thank you.

How to monitor process creation and statistics using a kernel module

I wrote a kernel module to monitor CPU and memory time series. In addition to that, I would like to log all process creations (and their metadata like pid, cmdline, ...) and also exits, with their statistics like total I/O and CPU usage.
The main question is: can I create a kind of listener for process creation and exit? Especially on exit, I would also need the meta information for the process. How can this be done?
What you're describing sounds eerily like the Linux process accounting system, which already exists in the kernel. If it isn't an exact fit, your best bet will be to consider extending it, rather than building something entirely new.
Another existing system to look at will be the process events connector, which can be used to notify userspace processes when other processes are created and exit.
I know you are talking about monitoring processes using a Linux kernel module.
But I think it is worth mentioning the Python module psutil, even if it is a user-space solution.
It is a very complete tool for monitoring the resources processes are using: memory, disk, CPU.
Some examples from the documentation:
Getting CPU usage for some process
>>> import psutil
>>> p = psutil.Process()
>>> # blocking
>>> p.cpu_percent(interval=1)
2.0
>>> # non-blocking (percentage since last call)
>>> p.cpu_percent(interval=None)
2.9
Getting memory info
>>> import psutil
>>> p = psutil.Process()
>>> p.memory_info()
pmem(rss=15491072, vms=84025344, shared=5206016, text=2555904, lib=0, data=9891840, dirty=0)
And the very interesting open_files
>>> import psutil
>>> f = open('file.ext', 'w')
>>> p = psutil.Process()
>>> p.open_files()
[popenfile(path='/home/giampaolo/svn/psutil/file.ext', fd=3, position=0, mode='w', flags=32769)]
The process creation time
>>> import psutil, datetime
>>> p = psutil.Process()
>>> p.create_time()
1307289803.47
>>> datetime.datetime.fromtimestamp(p.create_time()).strftime("%Y-%m-%d %H:%M:%S")
'2011-03-05 18:03:52'
Of course, you can query info for any process running on your target system; just provide the pid to psutil.Process, like this: psutil.Process(pid).
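If you need to log all processes rather than a single one, psutil.process_iter can walk the whole process table; a small sketch (the printed output below is just illustrative):
>>> import psutil
>>> for proc in psutil.process_iter(['pid', 'name', 'cmdline']):
...     print(proc.info)
...
{'pid': 1, 'name': 'systemd', 'cmdline': ['/sbin/init']}
{'pid': 2, 'name': 'kthreadd', 'cmdline': []}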

How to process an audio stream in real time

I have a setup with a Raspberry Pi 3 running the latest Jessie with all updates installed, on which I provide an A2DP Bluetooth sink that I connect to with a phone to play some music.
Via PulseAudio, the source (phone) is routed to the ALSA output (sink). This works reasonably well.
I now want to analyze the audio stream using Python 3.4 with librosa, and I found a promising example using PyAudio, which I adjusted to use the PulseAudio input (which magically works because it's the default) instead of a wav file:
"""PyAudio Example: Play a wave file (callback version)."""
import pyaudio
import wave
import time
import sys
import numpy

# instantiate PyAudio (1)
p = pyaudio.PyAudio()

# define callback (2)
def callback(in_data, frame_count, time_info, status):
    # convert data to array
    data = numpy.fromstring(in_data, dtype=numpy.float32)
    # process data array using librosa
    # ...
    return (None, pyaudio.paContinue)

# open stream using callback (3)
stream = p.open(format=pyaudio.paFloat32,
                channels=1,
                rate=44100,
                input=True,
                output=False,
                frames_per_buffer=int(44100*10),
                stream_callback=callback)

# start the stream (4)
stream.start_stream()

# wait for stream to finish (5)
while stream.is_active():
    time.sleep(0.1)

# stop stream (6)
stream.stop_stream()
stream.close()

# close PyAudio (7)
p.terminate()
Now, while the data flow works in principle, there is a delay (the length of the buffer) before the stream_callback gets called. Since the docs state:
Note that PyAudio calls the callback function in a separate thread.
I would have assumed that while the callback is being worked on, the buffer keeps filling in the main thread. Of course there would be an initial delay to fill the buffer; afterwards I expected to get a synchronous flow.
I need a longer portion in the buffer (see frames_per_buffer) for librosa to be able to perform its analysis correctly.
How is something like this possible? Is it a limitation of the software ports for the Raspberry ARM?
I found other answers, but they use blocking I/O. How would I wrap this into a thread so that the librosa analysis (which might take some time) does not block the buffer filling?
This blog seems to fight performance issues with Cython, but I don't think the delay is a performance issue. Or might it be? Others seem to need some ALSA tweaks, but would that help while using PulseAudio?
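What I had in mind is roughly the following (an untested sketch; the buffer size is just a placeholder): the callback only pushes the raw bytes into a queue, and a separate worker thread pulls chunks from that queue and runs the potentially slow librosa analysis, so the audio buffer never waits for the analysis.

import queue
import threading
import time
import numpy
import pyaudio

audio_q = queue.Queue()

def callback(in_data, frame_count, time_info, status):
    # keep the callback cheap: just hand the raw bytes to the worker thread
    audio_q.put(in_data)
    return (None, pyaudio.paContinue)

def analysis_worker():
    while True:
        raw = audio_q.get()
        samples = numpy.frombuffer(raw, dtype=numpy.float32)
        # run the (possibly slow) librosa analysis on `samples` here
        audio_q.task_done()

threading.Thread(target=analysis_worker, daemon=True).start()

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32, channels=1, rate=44100,
                input=True, frames_per_buffer=4096,  # placeholder buffer size
                stream_callback=callback)
stream.start_stream()

while stream.is_active():
    time.sleep(0.1)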
Thanks, any input appreciated!
