Sending variables from one python thread to another - python-3.x

Let's say I have a function that will run in its own thread, since it's getting serial data through a port.
def serialDataIncoming():
    device = Radar()
    device.connect(port=1, baudrate=256000)
    serialdata = device.startscan()
    for count, scan in enumerate(serialdata):
        distance = device.distance
        sector = device.angle
Now I want to run this in its own thread
try:
    thread.start_new_thread(serialDataIncoming, ())
except:
    pass  # error handling here
Now I want to add to serialDataIncoming() a line that sends the distance and sector to another function to be processed and then sent somewhere else. Here is the issue: the data coming from "device" is sent continuously, so I can experience a delay, or even lose data, if I spend time inside the loop running another loop. So I want to create a new thread, and from that thread run a function that receives the data from the first thread, processes it, and does whatever else is needed.
def dataProcessing():
    pass  # random code here where I process the data
However, my issue is how to send both variables from one thread to the second thread. In my mind, the second thread would have to wait until it receives the variables and then start working. Since a lot of data will be sent at the same time, I might have to introduce a third thread that holds the data and then feeds it to the thread that processes it.
So the question is basically this: how would I write, in Python, sending two variables to another thread, and how would that be written in the function used by the second thread?

To pass arguments to the thread function you can do:
def thread_fn(a, b, c):
    print(a, b, c)

thread.start_new_thread(thread_fn, ("asdsd", 123, False))
The list of arguments must be a tuple or list. However, in Python only one thread actually executes bytecode at a time (because of the global interpreter lock), so it may be more reliable (and simpler) to work out a way to do this with one thread. From the sounds of it you are polling the data, so this is not like file access, where the OS will notify the thread when it can wake up again once the file operation has completed (hence you won't get the kind of gains you would from multithreaded file access).
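That said, if you do use two threads, the usual way to hand pairs of values between them is a queue.Queue, which makes the consumer block until something arrives. Here is a minimal sketch under that assumption; Radar and its connect/startscan/distance/angle members are placeholders for the asker's device API:

import queue
import threading

data_queue = queue.Queue()

def serial_data_incoming(q):
    # Radar and its members are placeholders for the device API
    device = Radar()
    device.connect(port=1, baudrate=256000)
    for count, scan in enumerate(device.startscan()):
        # hand both values to the other thread as a single tuple
        q.put((device.distance, device.angle))

def data_processing(q):
    while True:
        distance, sector = q.get()  # blocks until a pair is available
        # process distance and sector here

threading.Thread(target=serial_data_incoming, args=(data_queue,), daemon=True).start()
threading.Thread(target=data_processing, args=(data_queue,), daemon=True).start()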

Related

Asynchronous Communication between few 'loops'

I have 3 classes that represent nearly isolated processes that can be run concurrently (meant to be persistent, like 3 main() loops).
class DataProcess:
    ...
    def runOnce(self):
        ...

class ComputeProcess:
    ...
    def runOnce(self):
        ...

class OtherProcess:
    ...
    def runOnce(self):
        ...
Here's the pattern I'm trying to achieve:
start various streams
start each process
allow each process to publish to any stream
allow each process to listen to any stream (at various points in its loop) and behave accordingly (allow for interruption of its current task or not, etc.)
For example, one 'process' listens for external data. Another process does computation on some of that data. The computation process might be busy for a while, so by the time it comes back and checks the stream, many values may have piled up. I don't want to just use a queue, because I don't want to be forced to process each item in order; I'd rather be able to implement logic like: "if there are one or more things waiting, just run your process one more time; otherwise go do this interruptible task while you wait for something to show up."
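For what it's worth, that check-then-drain behaviour can be expressed even with a plain queue via its non-blocking get_nowait; a sketch, where do_interruptible_task and run_compute_once are placeholders:

import queue

work = queue.Queue()

def compute_loop():
    while True:
        try:
            item = work.get_nowait()      # anything waiting?
        except queue.Empty:
            do_interruptible_task()       # placeholder: fill-in work
            continue
        while True:                       # drain whatever piled up
            try:
                item = work.get_nowait()
            except queue.Empty:
                break
        run_compute_once(item)            # placeholder: one compute pass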
That's like a lot, right? So I was thinking of using an actor model until I discovered RxPy. I saw that a stream is like a subject
from reactivex.subject import BehaviorSubject

newData = BehaviorSubject(None)   # BehaviorSubject requires an initial value
newModel = BehaviorSubject(None)
then I thought I'd start 3 threads for each of my high level processes:
threads = {
    'data': threading.Thread(target=data),
    'compute': threading.Thread(target=compute),
    'other': threading.Thread(target=other),
}

for thread in threads.values():
    thread.start()
and I thought the functions of those threads should listen to the streams:
def data():
    while True:
        DataProcess().runOnce()  # publishes to stream inside process

def compute():
    def run(value):
        ComputeProcess().runOnce()
    newData.subscribe(run)
    newModel.subscribe(run)

def other():
    ''' not done '''
    OtherProcess().runOnce()
Ok, so that's what I have so far. Is this pattern going to give me what I'm looking for?
Should I use threading in conjunction with rxpy, or just use rxpy's scheduler machinery to achieve concurrency? If so, how?
I hope this question isn't too vague, I suppose I'm looking for the simplest framework where I can have a small number of computational-memory units (like objects because they have internal state) that communicate with each other and work in parallel (or concurrently). At the highest level I want to be able to treat these computational-memory units (which I've called processes above) as like individuals who mostly work on their own stuff but occasionally broadcast or send a message to a specific other individual, requesting information or providing information.
Am I perhaps actually looking for an actor model framework? Or is this RxPy setup versatile enough to achieve that without extreme complexity?
Thanks so much!
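For reference, a minimal sketch of the scheduler route asked about above, assuming RxPY v4's reactivex package: observe_on moves each subscriber's work onto a pool thread, so RxPY itself supplies the concurrency and no explicit threading.Thread objects are needed.

import multiprocessing
from reactivex import operators as ops
from reactivex.scheduler import ThreadPoolScheduler
from reactivex.subject import Subject

pool = ThreadPoolScheduler(multiprocessing.cpu_count())

newData = Subject()

# The handler runs on a pool thread, so a slow computation
# does not block whoever calls on_next.
newData.pipe(ops.observe_on(pool)).subscribe(
    on_next=lambda value: print("compute got", value)
)

newData.on_next(42)  # publish from any thread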

How can I pause a thread until another thread has stopped its action in Python?

I have two threads concurrently running, speechRecognition and speakBack. Both of these threads are run in while loops (while True: #do something).
Speech recognition is constantly waiting for microphone input. Then, once it is received, it saves the text version of the verbal input to a file, which is loaded by my second thread, speakBack, and spoken through the speakers.
My issue is that when the phrase is spoken through the speakers, it is picked up by the microphone and then translated and once again saved to this file to be processed, resulting in an endless loop.
How can I make the speechRecognition thread suspend itself, wait for the speakBack thread to stop outputting sound through the speakers, and then continue listening for the next verbal input?
I'm using the speechRecognition library and the pyttsx3 library for speech recognition and verbal output, respectively.
The way to do this is to have shared state between the threads (either with global variables that the threads can store into and read from to indicate their progress, or with a mutable reference that is passed into each thread). The solution I’ll give below involves a global variable that stores a mutable reference, but you could just as easily pass the queue into both threads instead of storing it globally.
Using queues is a very standard way to pass messages between threads in python, because queues are already written in a thread-safe way that makes it so you don’t have to think about synchronization and locking. Furthermore, the blocking call to queue.get is implemented in a way that doesn’t involve repeatedly and wastefully checking a condition variable in a while loop.
Here’s how some code might look:
import queue

START_SPEAK_BACK = 0
START_SPEECH_RECOGNITION = 1

messageQueue = queue.Queue()

# thread 1
def speechRecognition():
    while True:
        # wait for input like you were doing before
        # write to file as before
        # put message on the queue for other thread to get
        messageQueue.put(START_SPEAK_BACK)
        # Calling `get` with no arguments makes the call be
        # "blocking" in the sense that it won't return until
        # there is an element on the queue to get.
        messageFromOtherThread = messageQueue.get()
        # logically, messageFromOtherThread can only ever be
        # START_SPEECH_RECOGNITION, but you could still
        # check that this is true and raise an exception if not.

# thread 2
def speakBack():
    while True:
        messageFromOtherThread = messageQueue.get()
        # likewise, this message will only be START_SPEAK_BACK
        # but you could still check.
        # Here, fill in the code that speaks through the speakers.
        # When that's done:
        messageQueue.put(START_SPEECH_RECOGNITION)
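To wire this up (the answer leaves start-up implicit), something like:

import threading

threading.Thread(target=speakBack, daemon=True).start()
speechRecognition()  # run the recognizer in the main thread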
Some comments:
This solution uses a single queue. It could just as easily have used two queues, one for speakBack —> speechRecognition communication and the other for speechRecognition —> speakBack communication. This might make more sense if the two threads were generating messages concurrently.
This solution doesn't actually involve inspecting the contents of the messages. However, if you need to pass additional information between threads, you could very easily pass objects or data as messages (instead of just constant values).
Finally, it’s not clear to me why you don’t just run all code in the same thread. It seems like there’s a very clear (serial) series of steps you want your program to follow: get audio input, write it to file, speak it back, start over. It might make more sense to write everything as a normal, serial, threadless python program.
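A minimal sketch of that serial version, assuming the speech_recognition and pyttsx3 packages the asker mentions (recognize_google stands in for whichever recognizer backend is actually used):

import speech_recognition as sr
import pyttsx3

recognizer = sr.Recognizer()
engine = pyttsx3.init()

while True:
    with sr.Microphone() as source:
        audio = recognizer.listen(source)  # blocks until a phrase ends
    try:
        text = recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        continue  # nothing intelligible; go back to listening
    # runAndWait returns only after playback has finished, so the
    # microphone is not being read while the speakers are active
    engine.say(text)
    engine.runAndWait()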

Process finishes but cannot be joined?

To accelerate a certain task, I'm subclassing Process to create a worker that will process data coming in samples. Some managing class will feed it data and read the outputs (using two Queue instances). For asynchronous operation I'm using put_nowait and get_nowait. At the end I'm sending a special exit code to my process, upon which it breaks its internal loop. However... it never happens. Here's a minimal reproducible example:
import multiprocessing as mp

class Worker(mp.Process):
    def __init__(self, in_queue, out_queue):
        super(Worker, self).__init__()
        self.input_queue = in_queue
        self.output_queue = out_queue

    def run(self):
        while True:
            received = self.input_queue.get(block=True)
            if received is None:
                break
            self.output_queue.put_nowait(received)
        print("\tWORKER DEAD")

class Processor():
    def __init__(self):
        # prepare
        in_queue = mp.Queue()
        out_queue = mp.Queue()
        worker = Worker(in_queue, out_queue)
        # get to work
        worker.start()
        in_queue.put_nowait(list(range(10**5)))  # XXX
        # clean up
        print("NOTIFYING")
        in_queue.put_nowait(None)
        #out_queue.get()  # XXX
        print("JOINING")
        worker.join()

Processor()
This code never completes, hanging permanently like this:
NOTIFYING
JOINING
WORKER DEAD
Why?
I've marked two lines with XXX. On the first, if I send less data (say, 10**4), everything finishes normally (the process joins as expected). Similarly on the second: everything finishes if I get() after notifying the worker to finish. I know I'm missing something, but nothing in the documentation seems relevant.
The documentation mentions that
When an object is put on a queue, the object is pickled and a background thread later flushes the pickled data to an underlying pipe. This has some consequences [...] After putting an object on an empty queue there may be an infinitesimal delay before the queue’s empty() method returns False and get_nowait() can return without raising queue.Empty.
https://docs.python.org/3.7/library/multiprocessing.html#pipes-and-queues
and additionally that
whenever you use a queue you need to make sure that all items which have been put on the queue will eventually be removed before the process is joined. Otherwise you cannot be sure that processes which have put items on the queue will terminate.
https://docs.python.org/3.7/library/multiprocessing.html#multiprocessing-programming
This means that the behaviour you describe is probably caused by a race condition between self.output_queue.put_nowait(received) in the worker and joining the worker with worker.join() in Processor's __init__. If joining was faster than feeding the item into the queue, everything finishes fine. If it was too slow, there is an item left in the queue, and the worker will not join.
Uncommenting the out_queue.get() in the main process empties the queue, which allows joining. But since get would block forever if the queue happened to be empty already, using a timeout might be an option to wait out the race condition, e.g. out_queue.get(timeout=10).
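A sketch of that clean-up with the output queue drained before joining, using the names from the example above:

import queue  # only for the queue.Empty exception

# clean up
print("NOTIFYING")
in_queue.put_nowait(None)
# Drain the output queue so the worker's feeder thread can flush
# its buffered items; only then can the process be joined.
while True:
    try:
        out_queue.get(timeout=1)
    except queue.Empty:
        break
print("JOINING")
worker.join()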
It might also be important to protect the main routine with an if __name__ == "__main__" guard, especially on Windows (python multiprocessing on windows, if __name__ == "__main__").

Is there a way to use cherrypy's Monitor to perform a single task and then stop?

I have a web application that requests a report that takes more than 10 minutes to run. Apart from improving that performance, I would for now prefer to set up a thread to run the report and mail it to the user, immediately returning a message to the user saying that this is what will happen.
I have been looking at cherrypy.process.plugins.Monitor, but I'm not clear if it is the correct choice (what to do with the frequency parameter?)
Monitor is not the correct choice; it's for running the same task repeatedly on a schedule. You're probably better off just calling threading.Thread(target=run_report).start(). You can then return 202 Accepted to the user, along with a URL for the client to watch the status and/or retrieve the newly-created report resource when it's ready.
The one caveat to that is that you might want your new thread to shut down gracefully when the cherrypy.engine stops. Have a look at the various plugins for examples of how to hook into the 'stop' channel on the bus. The other option would be to make your thread daemonic, if you don't care if it terminates abnormally.
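A minimal sketch of that fire-and-forget handler (run_report is a placeholder for the actual reporting-and-mailing job):

import threading
import cherrypy

class ReportApp:
    @cherrypy.expose
    def report(self):
        # daemonic, so it won't block interpreter shutdown; see the
        # caveat above about shutting down gracefully instead
        threading.Thread(target=run_report, daemon=True).start()
        cherrypy.response.status = 202
        return "Report accepted; it will be mailed to you when ready."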
Besides agreeing with fumanchu's answer, I would like to add that the frequency parameter of cherrypy.process.plugins.Monitor is actually the period, expressed in seconds (the name is misleading).
Another possible solution could be having a monitor executed periodically, plus a set of pending computations that it checks for completion. The code would be something like:
class Scheduler:
    def __init__(self):
        self.lock = threading.Lock()
        self.computations = list()  # on which we append stuff
        self.mon = Monitor(cherrypy.engine, self.check_computations, frequency=whatever)
        self.mon.start()

    def check_computations(self):
        with self.lock:
            for i in self.computations:
                check(i)  # single check function
Caveats:
The computation time of check matters: you don't want a heavy workload in this periodic routine.
Beware of how you use the lock:
It is protecting the computations list;
If you acquire it (even indirectly) from within check, your program will deadlock. This could be the case if you want to unsubscribe something from the computations list.
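For instance, submitting new work would also need to take the lock (a sketch; some_computation is a placeholder):

scheduler = Scheduler()

# elsewhere, e.g. in a request handler:
with scheduler.lock:
    scheduler.computations.append(some_computation)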

Simultaneous Read/Write on a file by two threads (Mutex aren't helping)

I want to use one thread to get fields of packets using the tshark utility (invoked via the system() call), whose output is redirected to a file. This same file needs to be read by another thread simultaneously, so that it can make runtime decisions based on the fields observed in the file.
The problem I am having now is that even though the first thread is writing to the file, the second thread is unable to read it (it reads NULL from the file). I am not sure why it behaves this way. I thought it might be due to simultaneous access to the same file, and I considered mutex locks, but that would block the reading thread, since the first thread only ends when the program terminates.
Any ideas on how to go about it?
If you are using that file for interprocess communication, you could use named pipes or message queues instead. They are much easier to use and don't require synchronization, because one thread writes and the other reads when data is available.
Edit: For inter-thread communication you can simply use shared variables and a condition variable to signal when some data has been produced (a producer-consumer pattern). Something like:
// thread 1
while (1)
{
    // read packet
    // write packet to global variable
    // signal thread 2
    // wait for confirmation of reading
}

// thread 2
while (1)
{
    // wait for signal from thread 1
    // read from global variable
    // signal thread 1 to continue
}
The signal parts can be implemented with condition variables: pthread_cond_t.
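Translated to Python's threading module (the topic of this page), the same handshake could be sketched with a Condition; read_packet and process are placeholders:

import threading

cond = threading.Condition()
packet = None           # the shared 'global variable'
packet_ready = False    # producer has written a packet
packet_read = False     # consumer has finished reading it

def thread1():  # producer: reads packets
    global packet, packet_ready, packet_read
    while True:
        data = read_packet()        # placeholder for the tshark read
        with cond:
            packet = data
            packet_ready = True
            cond.notify_all()       # signal thread 2
            while not packet_read:  # wait for confirmation of reading
                cond.wait()
            packet_read = False

def thread2():  # consumer: acts on packet fields
    global packet_ready, packet_read
    while True:
        with cond:
            while not packet_ready:  # wait for signal from thread 1
                cond.wait()
            data = packet            # read from global variable
            packet_ready = False
            packet_read = True
            cond.notify_all()        # signal thread 1 to continue
        process(data)                # placeholder for the decision logic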
