How to get cProfile results from a callback function in Python - multithreading

I am trying to determine which parts of my Python code are running the slowest, so that I have a better understanding of what I need to fix. I recently discovered cProfile and gprof2dot, which have been extremely helpful. My problem is that I'm not seeing any information about the functions I'm using as callbacks, which I believe might be running very slowly. From what I understand from this answer, cProfile by default only profiles the main thread, and I'm guessing the callbacks run in a separate thread. That answer showed a way to get things working if you are creating the threads yourself with the threading library, but I couldn't get it to work for my case.
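For reference, the approach from that answer looks roughly like the following sketch: subclass threading.Thread so that each thread runs under its own profiler and dumps per-thread stats. It only helps if you create the threads yourself, which rospy does internally on my behalf:

import cProfile
import threading

class ProfiledThread(threading.Thread):
    """Thread that profiles its own run() and dumps per-thread stats."""
    def run(self):
        profiler = cProfile.Profile()
        try:
            return profiler.runcall(threading.Thread.run, self)
        finally:
            # one stats file per thread, named by the thread id
            profiler.dump_stats('myprofile-%d.out' % self.ident)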
Here is roughly what my code looks like:
import rospy
import cv
import cProfile
from numpy import *
from collections import deque
from sensor_msgs.msg import Image  # needed for the Subscriber below

class Bla():
    def __init__( self ):
        self.image_data = deque()
        self.do_some_stuff()

    def vis_callback( self, data ):
        # self.bridge (a cv_bridge.CvBridge) is assumed to be set up elsewhere
        cv_im = self.bridge.imgmsg_to_cv( data, "mono8" )
        im = asarray( cv_im )
        self.play_with_data( im )
        self.image_data.append( im )

    def run( self ):
        rospy.init_node( 'bla', anonymous=True )
        sub_vis = rospy.Subscriber( 'navbot/camera/image', Image, self.vis_callback )
        while not rospy.is_shutdown():
            if len( self.image_data ) > 0:
                im = self.image_data.popleft()
                self.do_some_things( im )

def main():
    bla = Bla()
    bla.run()

if __name__ == "__main__":
    cProfile.run( 'main()', 'pstats.out' )  # this could go here, or just run python with -m cProfile
    # main()
Any ideas on how to get cProfile info on the vis_callback function? Either by modifying the script itself, or even better by using python -m cProfile or some other command-line method?
I'm guessing it is either the reading of the images or appending them to the queue that is slow. My gut feeling is that storing them on a queue is a horrible thing to do, but I need to display them with matplotlib, which refuses to work outside the main thread, so I want to see exactly how bad the damage is with this workaround.
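One workaround I can think of, since the callback thread is created by rospy rather than by my code: keep a cProfile.Profile instance on the object, switch it on and off inside the callback itself, and dump the stats on shutdown. A minimal sketch of that idea (the profiler placement is my assumption):

import cProfile

class Bla():
    def __init__( self ):
        self.image_data = deque()
        self.profiler = cProfile.Profile()

    def vis_callback( self, data ):
        self.profiler.enable()       # start sampling in the callback thread
        try:
            cv_im = self.bridge.imgmsg_to_cv( data, "mono8" )
            im = asarray( cv_im )
            self.play_with_data( im )
            self.image_data.append( im )
        finally:
            self.profiler.disable()  # stop sampling before returning to rospy

    def run( self ):
        rospy.init_node( 'bla', anonymous=True )
        sub_vis = rospy.Subscriber( 'navbot/camera/image', Image, self.vis_callback )
        while not rospy.is_shutdown():
            if len( self.image_data ) > 0:
                self.do_some_things( self.image_data.popleft() )
        self.profiler.dump_stats( 'vis_callback.out' )  # readable with pstats/gprof2dot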

Related

Assigned attribute is None in a Process

Consider the following piece of code.
from multiprocessing import Process

class A(Process):
    def __init__(self):
        self.a = None
        super(A, self).__init__()

    def run(self) -> None:
        self.a = 4

    def print_a(self):
        print(self.a)

if __name__ == '__main__':
    b = A()
    b.start()
    b.print_a()
I want the output to be:
4
But after running the code, the output is:
None
What am I doing wrong, and how do I fix it?
My final remarks:
I am a noob in Python. I used to use Thread in Java, and we never had such an issue there. Initially, I didn't know there is a godly class like threading in Python too.
Simply put, what I wanted to do is just the thing for which we use threading's Thread class. After using that, all my issues were fixed. The code I provided above is just a simple reenactment of the actual code, where the shared variable is a non-picklable class (Web3), and it was giving me tons of problems with multiprocessing. It's just that my understanding and knowledge of Python were shallow. I didn't realise that having separate memory for variables is exactly what Process is for...
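For completeness, a minimal sketch of the Thread version that behaves the way I expected; unlike a Process, a Thread shares the parent's memory, so the attribute set in run() is visible afterwards (the join() is there to make sure run() has finished before the attribute is read; without it there is a race):

from threading import Thread

class A(Thread):
    def __init__(self):
        self.a = None
        super(A, self).__init__()

    def run(self) -> None:
        self.a = 4   # same memory as the main thread

    def print_a(self):
        print(self.a)

if __name__ == '__main__':
    b = A()
    b.start()
    b.join()      # wait for run() to complete
    b.print_a()   # prints 4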

Threading/Async in Requests-html

I have a large number of links I need to scrape from a website: ~70 base links, and from them over 700 more links that need to be scraped. So, to speed up this process (it takes about 2-3 hours without threading/async), I decided to try threading/async.
My problem is that I need to render some JavaScript in order to get the links in the first place. I have been using requests-html to do this, as its html.render() method is very reliable. However, when I try to run this using threading or async I run into a host of problems. I tried AsyncHTMLSession due to this GitHub PR, but have been unable to get it to work. I was wondering if anyone had any ideas or links they could point me to that might help.
Here is some example code:
from multiprocessing.pool import ThreadPool
from requests_html import AsyncHTMLSession

links = (tuple of links)
n = 5
batch = [links[i:i+n] for i in range(0, len(links), n)]

def link_processor(batch_link):
    session = AsyncHTMLSession()
    results = []
    for l in batch_link:
        print(l)
        r = session.get(l)
        r.html.arender()
        tmp_next = r.html.xpath('//a[contains(@href, "/matches/")]')
    return tmp_next

pool = ThreadPool(processes=2)
output = pool.map(link_processor, batch)
pool.close()
pool.join()
print(output)
Output:
RuntimeError: There is no current event loop in thread 'Thread-1'.
I was able to fix this with some help from the learnpython subreddit. It turns out requests-html probably uses threads internally, so threading on top of its threads causes the issue; simply using a multiprocessing pool works instead.
FIXED CODE:
from multiprocessing import Pool
from requests_html import HTMLSession

.....

pool = Pool(processes=3)
output = pool.map(link_processor, batch[:2])
pool.close()
pool.join()
print(output)
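For reference, a minimal sketch of how AsyncHTMLSession appears to be intended to be used: the session owns the event loop itself, so the coroutines are created and awaited in a single thread instead of being spread over a thread pool (the links tuple and XPath are carried over from above):

from requests_html import AsyncHTMLSession

asession = AsyncHTMLSession()

async def fetch(url):
    r = await asession.get(url)
    await r.html.arender()    # async counterpart of render()
    return r.html.xpath('//a[contains(@href, "/matches/")]')

# asession.run() expects zero-argument callables that return coroutines
results = asession.run(*[lambda url=l: fetch(url) for l in links])
print(results)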

OpenMDAO 1.x: recording in parallel

When running an analysis under MPI with distributed components in a ParallelGroup, I get an error when adding a DumpRecorder to the analysis. Below is a small example that demonstrates this (this was run with the latest master branch commit aaa67a4d51f4081e9e41b250b0a76b077f6f0c21 from 28/10/2015):
import numpy as np

from openmdao.core.mpi_wrap import MPI
from openmdao.api import Component, Group, DumpRecorder, Problem, ParallelGroup

class Sliced(Component):
    def __init__(self):
        super(Sliced, self).__init__()
        self.add_param('x', 0.)
        self.add_output('y', 0.)

    def solve_nonlinear(self, params, unknowns, resids):
        unknowns['y'] = params['x'] * 2.

class VectorComp(Component):
    def __init__(self, size):
        super(VectorComp, self).__init__()
        self.add_param('xin', np.zeros(size))
        self.add_output('x', np.zeros(size))

    def solve_nonlinear(self, params, unknowns, resids):
        unknowns['x'] = params['xin'] * 2.

class Analysis(Group):
    def __init__(self, size):
        super(Analysis, self).__init__()
        self.add('v', VectorComp(size), promotes=['*'])
        par = self.add('par', ParallelGroup())
        for i in range(size):
            par.add('sec%02d' % i, Sliced())
            self.connect('x', 'par.sec%02d.x' % i, src_indices=[i])

if __name__ == '__main__':
    if MPI:
        from openmdao.core.petsc_impl import PetscImpl as impl
    else:
        from openmdao.core.basic_impl import BasicImpl as impl

    p = Problem(impl=impl, root=Analysis(4))

    recorder = DumpRecorder('optimization.log')
    # adding specific includes works, but leaving it out results in a crash
    # recorder.options['includes'] = ['x']
    p.driver.add_recorder(recorder)

    p.setup()
    p.run()
The error which is raised is:
RuntimeError: Cannot access remote Variable 'par.sec00.x' in this process.
I see that the recorder dumps a file per processor, so shouldn't the BaseRecorder._filter_vectors method filter out params not present on a specific processor? I'm not yet familiar enough with the code to propose a fix, so I hope the OpenMDAO devs can easily figure out what goes wrong.
Manually specifying the includes works, since the Sliced parameters are then excluded, but it would be nice if this were not necessary and were dealt with under the hood.
I also want to let you guys know how excited we are about the new framework. It is so much faster than the 0.x version, and the parallel FD feature is much appreciated and works like a charm!
There were some recent changes that broke the dump recorder in parallel. We put a story up for someone to fix it, but in the meantime you might want to try the SqliteRecorder. It's what I have been using for performance testing on CADRE. You set it up the same way, but then read the values back using an sqlitedict. There is a small example in the docs, but a more practical example is here in the CADRE code:
https://github.com/OpenMDAO/CADRE/blob/master/plot_progress.py
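A minimal sketch of that setup, assuming the Problem p from the example above; the 'openmdao' table name and the 'Unknowns' key follow the CADRE script linked above:

from openmdao.api import SqliteRecorder
from sqlitedict import SqliteDict

# recording: same setup as the DumpRecorder, different backend
recorder = SqliteRecorder('optimization.sqlite')
p.driver.add_recorder(recorder)
p.setup()
p.run()

# reading back: one entry per driver iteration
db = SqliteDict('optimization.sqlite', 'openmdao')
for iteration_coordinate, data in db.items():
    print(iteration_coordinate, data['Unknowns'])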

QSlider stuck though still emitting sliderMoved

In my PyQt4-based program, QSliders (with signals sliderMoved and sliderReleased connected to callables) sometimes "freeze", i.e. they don't move anymore when trying to drag them with the mouse, even though sliderMoved and sliderReleased are still emitted.
This behaviour happens seemingly randomly, sometimes after running the program for hours -- making it more or less impossible to reproduce and test.
Any help to solve this issue would be welcome.
EDIT: This is with PyQt 4.10.4 on Python 3.4 and Windows 7.
After some debugging I am pretty sure that this was due to calling a GUI slot from a separate thread, which (I knew) is forbidden. Fixing this to use a proper signal-slot approach seems to have fixed the issue.
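As a sketch of that signal-slot approach (the Bridge class and signal name are made up for illustration, and slider is assumed to be the QSlider): the worker thread emits a signal, and Qt delivers it to the slider's slot in the GUI thread via a queued connection:

from PyQt4.QtCore import QObject, pyqtSignal

class Bridge(QObject):
    value_changed = pyqtSignal(int)

# in the GUI thread:
bridge = Bridge()
bridge.value_changed.connect(slider.setValue)  # cross-thread => queued connection

# in the worker thread, instead of calling slider.setValue(42) directly:
bridge.value_changed.emit(42)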
After calling the patch function defined below, all slot calls are wrapped by a wrapper that checks that they are called only from the GUI thread -- a warning is printed otherwise. This is how I found the culprit.
import functools
import sys
import threading
import traceback

from PyQt4.QtCore import QMetaMethod
from PyQt4.QtGui import QWidget

SLOT_CACHE = {}

def patch():
    """Check for calls to widget slots outside of the main thread."""
    qwidget_getattribute = QWidget.__getattribute__

    def getattribute(obj, name):
        attr = qwidget_getattribute(obj, name)
        if type(obj) not in SLOT_CACHE:
            meta = qwidget_getattribute(obj, "metaObject")()
            SLOT_CACHE[type(obj)] = [
                method.signature().split("(", 1)[0]
                for method in map(meta.method, range(meta.methodCount()))
                if method.methodType() == QMetaMethod.Slot]

        if (isinstance(attr, type(print)) and  # Wrap builtin functions only.
                attr.__name__ in SLOT_CACHE[type(obj)]):
            @functools.wraps(
                attr, assigned=functools.WRAPPER_ASSIGNMENTS + ("__self__",))
            def wrapper(*args, **kwargs):
                if threading.current_thread() is not threading.main_thread():
                    print("{}.{} was called out of main thread:".format(
                        type(obj), name), file=sys.stderr)
                    traceback.print_stack()
                return attr(*args, **kwargs)
            return wrapper
        else:
            return attr

    QWidget.__getattribute__ = getattribute

How to run web.py server inside of another application

I have a little desktop game written in Python and would like to be able to access its internals while the game is running. I was thinking of doing this by having web.py run on a separate thread and serve pages, so that when I access http://localhost:8080/map it would display the map of the current level for debugging purposes.
I got web.py installed and running, but I don't really know where to go from here. I tried starting web.application in a separate thread, but for some reason I cannot share data between the threads (I think).
Below is a simple example that I was using to test this idea. I thought that http://localhost:8080/ would return a different number every time, but it keeps showing the same one (5). If I print common_value inside the while loop, it is being incremented, but it starts from 5.
What am I missing here, and is the approach anywhere close to sensible? I would really like to avoid using a database if possible.
import web
import thread

urls = (
    '/(.*)', 'hello'
)
app = web.application(urls, globals())

common_value = 5

class hello:
    def GET(self):
        return str(common_value)

if __name__ == "__main__":
    thread.start_new_thread(app.run, ())
    while 1:
        common_value = common_value + 1
After searching around a bit, I found a solution that works: if common_value is defined in a separate module and imported from there, the above code works. So, in essence (excuse the naming):
thingy.py:

common_value = 0

server.py:

import web
import thread
import thingy
import sys; sys.path.insert(0, ".")

urls = (
    '/(.*)', 'hello'
)
app = web.application(urls, globals())

thingy.common_value = 5

class hello:
    def GET(self):
        return str(thingy.common_value)

if __name__ == "__main__":
    thread.start_new_thread(app.run, ())
    while 1:
        thingy.common_value = thingy.common_value + 1
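A plausible explanation for the original behaviour, assuming web.py's autoreload machinery re-imports the __main__ module (which would create a second copy of common_value frozen at its initial value): moving the variable into its own module makes both copies share the single imported thingy module. If that is indeed the cause, turning off autoreload should also fix the original single-file version:

# assumption: the reloader is what re-imports the module;
# web.application accepts an autoreload flag to disable it
app = web.application(urls, globals(), autoreload=False)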
I found errors with the arguments, but changing:

def GET(self):

to:

def GET(self, *args):

makes it work now.
