I've written a PySide Windows application that uses libvlc to show a video, log keystrokes, and write aggregated information about those keystrokes to a file. I'm experiencing two bugs that are causing the application to crash (other question here -> https://stackoverflow.com/questions/18326943/pyside-qlistwidget-crash).
The application writes the keystroke file at every five minute interval on the video. Users can change the playback speed, so that five minute interval may take more or less than five minutes; it's not controlled by a timer.
The video continues playing while the file is written, so I've created an object inheriting from threading.Thread for the file creation - IntervalFile. Some information about the file to be written is passed in the constructor; IntervalFile doesn't access its parent (the main QWidget) at all. This is the only threading object I use in the app. There are no timer declared anywhere.
Intermittently, the application will crash and I'll get the following message: "QObject::killTimers: timers cannot be stopped from another thread".
The code that creates IntervalFile is (part of CustomWidget, inherited from QWidget):
def doIntervalChange(self):
...
ifile = IntervalFile(5, filepath, dbpath) # db is sqlite, with new connection created within IntervalFile
ifile.start()
#end of def
doIntervalChange is called from within QWidget using a signal.
IntervalFile is:
class IntervalFile(threading.Thread):
def __init__(self, interval, filepath, dbpath):
# declaration of variables
threading.Thread.__init__(self)
def run(self):
shutil.copy('db.local', self.dbPath) # because db is still being used in main QWidget
self.localDB = local(self.dbPath) # creates connection to sqlite db, with sql within the object to make db calls easier
# query db for keystroke data
# write file
self.localDB.close()
self.localDB = None
os.remove(self.dbPath) # don't need this copy anymore
When ifile.start() is commented out, I don't see the killTimers crash. Any suggestions? Note that the crash seems random; sometimes I can use the app (just continuely pressing the same keystroke over and over) for an hour without it crashing, sometimes it's within the first couple of intervals. Because of this difficulty reproducing the crashes, I think these lines of code are the issue, but I'm not 100% sure.
I'm pretty sure you need to hold a reference to your thread object. When your doIntervalChange() method finishes, nothing is holding a reference to the thread object (ifile) any more and so it can be garbage collected. Presumably this is why the crash happens randomly (if the thread finishes it's task before the object is garbage collected, then you don't have a problem).
Not exactly sure what is creating the QTimers, but I'm fairly certain that won't affect my proposed solution!
So in doIntervalChange() save a reference to ifile in a list, and periodically clean up the list when threads have finished execution. Have a look at this for an idea (and if a better way to clean up threads shows up in that post, implement that!): Is there a more elegant way to clean up thread references in python? Do I even have to worry about them?
Related
I'm trying to implement a GUI to assist some coworkers with image processing, but I am running into a problem with the Tkinter (ttk) progress bar, which keeps lagging while I execute work in another thread.
Basically, I want to have the progress bar run in indeterminate mode and bounce back and forth, as a kind of visual confirmation that things are still progressing (like the "working" circle in windows).
I set it up as follows:
# Frame/root set up (etc.) above here...
pbv = IntVar()
progressBar = ttk.Progressbar(consoleFrame, orient="horizontal", length=100, variable=pbv, mode="indeterminate")
progressBar.grid(column=1, row=1, sticky=N)
progressBar.start()
root.mainloop()
and it bounces along as intended, running nicely on the primary thread. Good so far.
Then, I initialize a separate thread and run a very computationally intensive function.
externalFunctionThread = threading.Thread(target=expensiveFunctionThread, args=[foo])
externalFunctionThread.deamon = True
externalFunctionThread.start()
With some function:
def expensiveFunctionThread(foo):
for i in range(a_few_iterations):
superDuperExpensiveFunction(i, foo)
# Note: Were I to comment this out and replace it with a time.sleep(n), the bar would progress normally
# Even a for loop that prints numbers up to some large integer would not cause lag
For context, superDuperExpensiveFunction is a call to a pyradiomics function, which is generating many texture features from some image.
What ends up happening is the bar will heavily lag, frequently locking up (not moving) while the superDuperExpensiveFunction is running, only moving after each iteration. Once the super duper expensive function thread finishes it will return to normal.
I have looked at other threads on this board (How to connect a progress bar to a function? , Tkinter: How to use threads to preventing main event loop from "freezing" , ttk progress bar freezing , etc.), but none of them help, as they are concerned with the progress bar freezing at the start, as opposed to lagging in its updates. The last one does not work (including progressBar.update()/progressBar.update_idletasks() does nothing). My fear is that it has to do with how Python handles threading (in that there is ultimately only 1 "real" thread that can do work at one time, and Python has to cycle which thread is allowed to work (the GIL or something?)).
Anyways, is there a workaround for this? Why might this be occurring?
Okay, so after doing some research I found out that pyradiomics has a flag that prohibits this kind of rapid task switching, and therefore threading is not a viable solution. To solve the problem, as furas recommended, I implemented multiprocessing (pooling, in fact!). This not only allowed me to have the progress bar run without any lag, but also enabled dividing up the computation across cores, speeding up my implementation greatly. The syntax is very similar to that of threading, with one main exception that caused me a lot of grief before I figured it out. Variables are not inherited by child processes! Therefore, should I want to get a value from an Entry or something, I would have to explicitly pass it in to the child process. Past that, there really aren't that many changes. Instead of threading just use the associated process thing (Process instead of Thread, multiprocessing.Queue instead of queue). No lag, and greater efficiency!
I have two threads concurrently running, speechRecognition and speakBack. Both of these threads are run in while loops (while True: #do something).
Speech recognition is constantly waiting for microphone input. Then, once it is received, it saves the text version of the verbal input to a file, which is loaded by my second thread, speakBack, and spoken through the speakers.
My issue is that when the phrase is spoken through the speakers, it is picked up by the microphone and then translated and once again saved to this file to be processed, resulting in an endless loop.
How can i make the speechRecognition thread suspend itself, wait for the speakBack thread to stop outputting sound through the speakers, and then continue listening for the next verbal input?
Im using the speechRecognition library and the pyttsx3 library for speech recognition and verbal ouput respectively.
The way to do this is to have shared state between the threads (either with global variables that the threads can store into and read from to indicate their progress, or with a mutable reference that is passed into each thread). The solution I’ll give below involves a global variable that stores a mutable reference, but you could just as easily pass the queue into both threads instead of storing it globally.
Using queues is a very standard way to pass messages between threads in python, because queues are already written in a thread-safe way that makes it so you don’t have to think about synchronization and locking. Furthermore, the blocking call to queue.get is implemented in a way that doesn’t involve repeatedly and wastefully checking a condition variable in a while loop.
Here’s how some code might look:
import queue
START_SPEAK_BACK = 0
START_SPEECH_RECOGNITION = 1
messageQueue = queue.Queue()
# thread 1
def speechRecognition():
while True:
# wait for input like you were doing before
# write to file as before
# put message on the queue for other thread to get
messageQueue.put(START_SPEAK_BACK)
# Calling `get` with no arguments makes the call be
# "blocking" in the sense that it won't return until
# there is an element on the queue to get.
messageFromOtherThread = messageQueue.get()
# logically, messageFromOtherThread can only ever be
# START_SPEECH_RECOGNITION, but you could still
# check that this is true and raise an exception if not.
# thread 2
def speakBack():
while True:
messageFromOtherThread = messageQueue.get()
# likewise, this message will only be START_SPEAK_BACK
# but you could still check.
# Here, fill in the code that speaks through the speakers.
# When that's done:
messageQueue.put(START_SPEECH_RECOGNITION)
Some comments:
This solution uses a single queue. It could just have easily used two queues, one for speakBack —> speechRecognition communication and the other for speechRecognition —> communication. This might make more sense if the two threads were generating messages concurrently.
This solution doesn’t actually involve inspecting the contents of the messages. However, if you need to pass additional information between threads, you could very easily pass objects or data as messages (instead of just constant values)
Finally, it’s not clear to me why you don’t just run all code in the same thread. It seems like there’s a very clear (serial) series of steps you want your program to follow: get audio input, write it to file, speak it back, start over. It might make more sense to write everything as a normal, serial, threadless python program.
I have a piece of code which has to get executed every 100ms and update the GUI. When I am updating the GUI - I am pressing a button, which calls a thread and in turn it calls a target function. The target function gives back the message to the GUI thread using pub sub as follows.
wx.CallAfter(pub.sendMessage, "READ EVENT", arg1=data, arg2=status_read) # This command line is in my target function
pub.subscribe(self.ReadEvent, "READ EVENT") # This is in my GUI file - whihc calls the following function
def ReadEvent(self, arg1, arg2):
if arg2 == 0:
self.MessageBox('The program did not properly read data from MCU \n Contact the Program Developer')
return
else:
self.data = arg1
self.firmware_version_text_control.Clear()
#fwversion = '0x' + ''.join('{:02X}'.format(j) for j in reversed(fwversion))
self.firmware_version_text_control.AppendText(str(SortAndDecode(self.data, 'FwVersion')))
# Pump Model
self.pump_model_text_control.Clear()
self.pump_model_text_control.AppendText(str(SortAndDecode(self.data, 'ModelName')))
# Pump Serial Number
self.pump_serial_number_text_control.Clear()
self.pump_serial_number_text_control.AppendText(str(SortAndDecode(self.data, 'SerialNum'))[:10]) # Personal Hack to not to display the AA , AB and A0
# Pressure GAIN
self.gain_text_control.Clear()
self.gain_text_control.AppendText(str(SortAndDecode(self.data, 'PresGain')))
# Pressure OFFSET Offset
self.offset_text_control.Clear()
self.offset_text_control.AppendText(str(SortAndDecode(self.data, 'PresOffset')))
#Wagner Message:
#self.status_text.SetLabel(str(SortAndDecode(self.data, 'WelcomeMsg')))
# PUMP RUNNING OR STOPPED
if PumpState(SortAndDecode(self.data, 'PumpState')) == 1:
self.led6.SetBackgroundColour('GREEN')
elif PumpState(SortAndDecode(self.data, 'PumpState')) == 0:
self.led6.SetBackgroundColour('RED')
else:
self.status_text.SetLabel(PumpState(SortAndDecode(self.data, 'PumpState')))
# PUMP RPM
self.pump_rpm_text_control.Clear()
if not self.new_model_value.GetValue():
self.pump_rpm_text_control.AppendText("000")
else:
self.pump_rpm_text_control.AppendText(str(self.sheet_num.cell_value(self.sel+1,10)*(SortAndDecode(self.data, 'FrqQ5'))/65536))
# PUMP PRESSURE
self.pressure_text_control.Clear()
self.pressure_text_control.AppendText(str(SortAndDecode(self.data, 'PresPsi')))
# ON TIME -- HOURS AND MINUTES --- EDITING IF YOU WANT
self.on_time_text_control.Clear()
self.on_time_text_control.AppendText(str(SortAndDecode(self.data, 'OnTime')))
# JOB ON TIME - HOURS AND MINUTES - EDITING IF YOU WANT
self.job_on_time_text_control.Clear()
self.job_on_time_text_control.AppendText(str(SortAndDecode(self.data, 'JobOnTime')))
# LAST ERROR ----- RECHECK THIS AGAIN
self.last_error_text_control.Clear()
self.last_error_text_control.AppendText(str(SortAndDecode(self.data, 'LastErr')))
# LAST ERROR COUNT --- RECHECK THIS AGAIN
self.error_count_text_control.Clear()
self.error_count_text_control.AppendText("CHECK THIS")
As you can see my READEVENT is very big and it takes a while for the GUI to take enough time to successfully do all these things. My problem here is, when my GUI is updating the values of TEXTCTRL it is going unresponsive - I cannot do anything else. I cant press a button or enter data or anything else. My question is if there is a better way for me to do this, so my GUI wont go unresponsive. I dont know how I can put this in a different thread as all widgets are in the main GUI. But that also requires keep creating threads every 100ms - which is horrible. Any suggestions would be greatly helpful.
Some suggestions:
How long does SortAndDecode take? What about the str() of the result? Those may be good candidates for keeping that processing in the worker thread instead of the UI thread, and passing the values to the UI thread pre-sorted-and-decoded.
You can save a little time in each iteration by calling ChangeValue instead of Clear and AppendText. Why do two function calls for each text widget instead of just one? Function calls are relatively expensive in Python compared to other Python code.
If it's possible that the same value will be sent that was sent on the last iteration then adding checks for the new value matching the old value and skipping the update of the widget could potentially save lots of time. Updating widget values is very expensive compared to leaving them alone.
Unless there is a hard requirement for 100ms updates you may want to try 150 or 200. Fewer updates per second may be fast enough for most people, especially since it's mostly textual. How much text can you read in 100ms?
If you are still having troubles with having more updates than the UI thread can keep up with, then you may want to use a different approach than pubsub and wx.CallAfter. For example you could have the worker thread receive and process the data and then add an object to a Queue.Queue and then call wx.WakeUpIdle(). In the UI thread you can have an EVT_IDLE event handler that checks the queue and pulls the first item out of the queue, if there are any, and then updates the widgets with that data. This will give the benefit of not flooding the pending events list with events from too many wx.CallAfter calls, and you can also do things like remove items from your data queue if there are too many items in it.
The problem I have is that in my IronPython application threads are being created but never cleaned up, even when the method they run has exited. In my application I start threads in two ways: a) by using Python-style threads (sub-classes of threading.Thread that do something in their run() method), and b) by using the .NET 'ThreadStart' approach. The Python-style threads behave as expected, and after their 'run()' exits they get cleaned up. The .NET style threads never get cleaned up, even after they have exited. You can call del, Abort, whatever you want, and it has no effect on them.
The following IronPython script demonstrates the issue:
import System
import threading
import time
import logging
def do_beeps():
logging.debug("starting do_beeps")
t_start = time.clock()
while time.clock() - t_start < 10:
System.Console.Beep()
System.Threading.Thread.CurrentThread.Join(1000)
logging.debug("exiting do_beeps")
class PythonStyleThread(threading.Thread):
def __init__(self, thread_name="PythonStyleThread"):
super(PythonStyleThread, self).__init__(name=thread_name)
def run(self):
do_beeps()
class ThreadStarter():
def start(self):
t = System.Threading.Thread(System.Threading.ThreadStart(do_beeps))
t.IsBackground = True
t.Name = "ThreadStartStyleThread"
t.Start()
if __name__ == '__main__':
logging.basicConfig(format='%(asctime)s %(levelname)s: %(message)s', level=logging.DEBUG, datefmt='%H:%M:%S')
# Start some ThreadStarter threads:
for _ in range(5):
ts = ThreadStarter()
ts.start()
System.Threading.Thread.CurrentThread.Join(200)
# Start some Python-style threads:
for _ in range(5):
pt = PythonStyleThread()
pt.start()
System.Threading.Thread.CurrentThread.Join(200)
# Do something on the main thread:
for _ in range(30):
print(".")
System.Threading.Thread.CurrentThread.Join(1000)
When this is debugged in PyDev, what I see is that all the threads appear as expected in the 'debug' view as they are created:
but whereas the Python-style ones disappear after they've finished, the .NET / ThreadStart ones stay until the main thread exits.
As can be seen in the image, in the debugger the problematic threads appear with names 'Dummy-4', 'Dummy-5' etc, whereas the Pythonic ones appear with the name I've given them ('PythonStyleThread'). Looking in the threading.py file in my IronPython installation I see there is a class called "_DummyThread", a subclass of Thread, that sets its 'name' as 'name=_newname("Dummy-%d")', so it looks like by using ThreadStart I'm ending up with _DummyThreads. The comment for the class also says:
# Dummy thread class to represent threads not started here.
# These aren't garbage collected when they die, nor can they be waited for.
which would explain why I can't get rid of them.
But I don't want 'DummyThread's. I just want normal ones, that behave nicely and get garbage-collected when they've finished doing their thing.
Now, a slightly odd thing about all of this is that unless I set up the logger I don't see the DummyThread entries in the debugger at all (although they still run). This may be a funny of the PyDev debugger, or it may relevant. Is there any sane reason why logging should have any bearing on this? Can I solve my problem just by not logging in my thread?
Here, it says:
"There is the possibility that "dummy thread objects" are created. These are thread objects corresponding to "alien threads", which are threads of control started outside the threading module, such as directly from C code. Dummy thread objects have limited functionality; they are always considered alive and daemonic, and cannot be joined. They are never deleted, since it is impossible to detect the termination of alien threads."
Which makes me wonder why I've had the misfortune of ending up with them?
While I have a workaround in that I can use Python-style threading.Thread subclasses everywhere I currently use .NET 'ThreadStart' threads, I am not keen to do this as the reason I was using .NET style threads in certain places was because they give me an Abort method (whereas the Python ones don't). I know Aborting threads is a Bad Thing, but the application is a unit-testing framework, and I a) need to run unit-tests in a thread, and b) have no control over their contents (they are written by third-parties), so I have no means of periodically checking for a 'please shut me down nicely' flag on these threads, and in extremis may need to kill them rudely.
So a) why am I getting DummyThreads, b) has this got anything to do with logging and c) what can I do about it?
Thanks.
I have a web application that requests a report that takes more than 10 minutes to run. Apart from improving that performance, I would for now prefer to set up a thread to run the report and mail it to the user, returning that decision message back to the user immediately.
I have been looking at cherrypy.process.plugins.Monitor, but I'm not clear if it is the correct choice (what to do with the frequency parameter?)
Monitor is not the correct choice; it's for running the same task repeatedly on a schedule. You're probably better off just calling threading.Thread(target=run_report).start(). You can then return 202 Accepted to the user, along with a URL for the client to watch the status and/or retrieve the newly-created report resource when it's ready.
The one caveat to that is that you might want your new thread to shut down gracefully when the cherrypy.engine stops. Have a look at the various plugins for examples of how to hook into the 'stop' channel on the bus. The other option would be to make your thread daemonic, if you don't care if it terminates abnormally.
Besides agreeing with fumanchu's answer, I would like to add that the frequency parameter is actually the period expressed in seconds.cherrypy.process.plugins.Monitor (the name is misleading).
Another possible solution could be having a monitor executed periodically, and a set of working computations which can be checked periodically for completion. The code would be something like
class Scheduler:
def __init__ (self):
self.lock = threading.Lock()
self.mon = Monitor(cherrypy.engine, check_computations, frequency=whatever)
self.mon.start()
self.computations = list() # on which we append stuff
def check_computations (self):
with self.lock:
for i in self.computations:
check(i) # Single check function
Caveats:
The computation time of check matters. You don't want to have workload on this perioic routine
Beware on how you use locks:
It is protecting the computations list;
If you access it (even indirectly) from with check your program gets into deadlock. This could be the case if you want to unsubscribe something from the computations list.