PyQt performance issues on a dynamic widget array-like table

I have a GUI that uses a custom QStyledItemDelegate, a QTableView, and a QAbstractItemModel subclass to display records that I load at startup.
Each record is displayed as a row of widgets, and each column is associated with a specific widget type: a QDateEdit, a QTimeEdit, a couple of QLineEdits, and three QComboBoxes.
If I try to load ~2500 records, the GUI takes a long time to load and to adapt when some of the filters are applied. Scrolling is not excessively bad, though, so I suspect I am doing something inefficient when adding or removing widget rows.
Row addition is always an append; row deletion can happen anywhere.
These are the implementations in my model:
def row_append(self, count=1, parent=QtCore.QModelIndex()):
    rows = self.rowCount()
    last = rows + count - 1
    self.layoutAboutToBeChanged.emit()
    self.beginInsertRows(parent, rows, last)
    self.insertRows(rows, count, parent)  # Implemented in my model
    self.endInsertRows()
    self.layoutChanged.emit()
    for column in range(self.columnCount()):
        self.table_view.enable_cell(rows, column)
    return self.rowCount() - count

def row_delete(self, row, count=1, parent=QtCore.QModelIndex()):
    last = row + count - 1
    self.layoutAboutToBeChanged.emit()
    self.beginRemoveRows(parent, row, last)
    self.removeRows(row, count, parent)  # Implemented in my model
    self.endRemoveRows()
    self.layoutChanged.emit()
The implementation details of my insertRows and removeRows are not relevant here; I can deal with those.
If I receive 500 inserts or 351 deletes, these are the functions I call.
My question is whether this is how it is supposed to be done when a huge batch arrives that changes most of what is currently displayed.
I remember reading that if you have to make big changes, it might be easier to destroy everything and rebuild from scratch, but:
Am I using the wrong approach (i.e. the wrong functions, or the wrong way of implementing them) to destroy everything, and similarly to rebuild everything from scratch?
Could it be that I'm trying to build something too heavy for PyQt?
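For reference (not part of the question), here is a minimal sketch of the "destroy everything and rebuild from scratch" idea in model/view terms, assuming the model is a QAbstractTableModel backed by a plain Python list; the _records attribute and the method name are hypothetical. A single model reset notifies the attached views once, instead of emitting one signal pair per row for hundreds of inserts or deletes.

def replace_all_records(self, new_records):
    # One reset for the whole batch instead of one begin/end signal pair per row.
    self.beginResetModel()
    self._records = list(new_records)  # hypothetical internal storage
    self.endResetModel()

After endResetModel() the views re-query rowCount() and data() from scratch, so any persistent editors opened through openPersistentEditor() would need to be reopened afterwards.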

Related

How do I make a PyQt QLineEdit widget narrower than the default size of 17 "x" characters when using Qt Designer?

When I put Qt widgets in a QHBoxLayout using Qt Designer, there seems to be a minimum width for QLineEdit widgets of 17 "x" characters, at least if this is the current source code:
https://github.com/qt/qtbase/blob/dev/src/widgets/widgets/qlineedit.cpp
I cannot find a way to make PyQt5 lay those widgets out so they will be narrower than this default, but still change size if the font is changed.
As an example, a QComboBox will be automatically laid out so that it is just wide enough to display the longest text entered as a possible value: if the longest text entered for a combobox in Designer is 5 characters, the combobox will be laid out quite narrow compared to the minimum QLineEdit width. How can I configure a QLineEdit with a number of characters so that it is always wide enough for that many characters, whatever the font is set to, and no wider, using just Qt Designer? I know that in Designer I can set maxLength, which limits the maximum number of characters that can be entered/displayed, but that setting of course has no effect on the layout.
I have some text boxes that will never hold more than 5 characters, for example, and the Designer layout makes them at least 3 times wider than I will ever need. This is with the default "Expanding" horizontal policy, but I have tried many combinations of horizontal policy and values for minimum size or base size. I want to allow for people having different font sizes, so I cannot safely set a maximum pixel size or a fixed horizontal size. The handling for QComboBoxes is precisely what I want for QLineEdits.
This is all in Python, using the latest versions of PyQt5 available on pip: pyqt5 5.15.6 and pyqt5-qt5 5.15.2.
The size hint of QLineEdit is computed from various aspects, most of them using private functions that are not exposed in the API, and the "x" count is hardcoded. This means the goal cannot be achieved directly from Designer and can only be done through subclassing.
While we could try to mimic that behavior to implement a custom character size, I believe it is unnecessary for simple cases, so I simplified the concept: take the default size hint and adapt the width based on the difference between the custom character hint and the default 17 "x" count.
from PyQt5 import QtCore, QtWidgets

class CharHintLineEdit(QtWidgets.QLineEdit):
    _charHint = 17

    @QtCore.pyqtProperty(int)
    def charHint(self):
        return self._charHint

    @charHint.setter
    def charHint(self, chars):
        chars = max(1, chars)
        if self._charHint != chars:
            self._charHint = chars
            self.updateGeometry()

    def changeEvent(self, event):
        super().changeEvent(event)
        if event.type() in (event.FontChange, event.StyleChange):
            self.updateGeometry()

    def sizeHint(self):
        hint = super().sizeHint()
        if self._charHint != 17:
            # The 17-character width is hardcoded in Qt and there is no way to
            # retrieve it; it might change in the future, so, just to be safe,
            # we always enforce an arbitrary minimum based on half the height hint.
            charSize = self.fontMetrics().horizontalAdvance('x')
            hint.setWidth(max(hint.height() // 2, hint.width() +
                              charSize * (self._charHint - 17)))
        return hint


if __name__ == '__main__':
    import sys
    from random import randrange

    app = QtWidgets.QApplication(sys.argv)
    test = QtWidgets.QWidget()
    layout = QtWidgets.QHBoxLayout(test)
    for i in range(5):
        charHint = randrange(5, 15)
        le = CharHintLineEdit(charHint=charHint, placeholderText=str(charHint))
        layout.addWidget(le)
    test.show()
    sys.exit(app.exec())
With the above code, you can use the custom widget in Designer by adding a standard QLineEdit and promoting it, using the class name and the corresponding Python file (without the file extension) as the header. You can also set charHint as a dynamic property in Designer, and it will be properly applied to the widget when the UI is loaded.
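As a quick illustration (not part of the original answer), loading such a form with uic might look like the sketch below, assuming the class above is saved as charhintlineedit.py and the Designer file is mainwindow.ui; both file names are placeholders. PyQt's uic resolves a promoted widget by importing the module named as the promotion header.

import sys
from PyQt5 import QtWidgets, uic

app = QtWidgets.QApplication(sys.argv)
# The promoted QLineEdit is resolved by importing the module given as the
# promotion header in Designer (assumed here to be "charhintlineedit").
window = uic.loadUi('mainwindow.ui')
window.show()
sys.exit(app.exec())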

Attempting to use choropleth maps in folium for the first time, index error

This is my first time posting to Stack Overflow, so I hope I am doing this correctly. I am currently finishing up a 'jumpstart' introduction to data analytics. We are using Python with a few different packages, such as pandas, seaborn, folium, etc. For part of our final project/presentation, I am trying to make a zip-code choropleth map. I have successfully imported folium and have my map displayed; the choropleth concept is new to me and is completely extracurricular, so I'm trying to challenge myself to succeed.
I found an example of creating a choropleth map that I am trying to follow: https://medium.com/@saidakbarp/interactive-map-visualization-with-folium-in-python-2e95544d8d9b. I believe I correctly substituted the object names for the dataframe and map I am working with. For the GeoJSON data, I found this: https://github.com/OpenDataDE/State-zip-code-GeoJSON. I opened the GeoJSON file in Atom and found the feature property for what I believe to be the five-digit zip code, 'ZCTA5CE10'.
Here is my code:
folium.Choropleth(geo_data='../data/tn_tennessee_zip_codes_geo.min.json',
                  data=slow_to_resolve,
                  columns=['zipcode'],
                  key_on='feature.properties.ZCTA5CE10',
                  fill_color='BuPu', fill_opacity=0.7, line_opacity=0.2,
                  legend_name='Zipcode').add_to(nash_map)
folium.LayerControl().add_to(nash_map)
nash_map
When I try to run the code, I get this error:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-114-a2968de30f1b> in <module>
----> 1 folium.Choropleth(geo_data='../data/tn_tennessee_zip_codes_geo.min.json',
2 data=slow_to_resolve,
3 columns=['zipcode'],
4 key_on='feature.properties.ZCTA5CE10',
5 fill_color='BuPu', fill_opacity=0.7, line_opacity=0.2,
~\anaconda3\lib\site-packages\folium\features.py in __init__(self, geo_data, data, columns, key_on, bins, fill_color, nan_fill_color, fill_opacity, nan_fill_opacity, line_color, line_weight, line_opacity, name, legend_name, overlay, control, show, topojson, smooth_factor, highlight, **kwargs)
1198 if hasattr(data, 'set_index'):
1199 # This is a pd.DataFrame
-> 1200 color_data = data.set_index(columns[0])[columns[1]].to_dict()
1201 elif hasattr(data, 'to_dict'):
1202 # This is a pd.Series
IndexError: list index out of range
Prior to this error, I had two columns from my dataframe specified, but I got some 'isnan' error that I am pretty sure was caused by string-type data in the second column, so I removed it. Now I am trying to figure out the error posted above.
Can someone point me in the right direction? Please keep in mind that aside from this three week jumpstart program, I have zero programming knowledge or experience - so I am still learning terminology and concepts.
Thank you!
You got the IndexError: list index out of range because you passed the columns parameter only one column, columns=['zipcode']. It has to be two, like this: columns=['zipcode', 'columnName_to_color_map'].
The first column, 'zipcode', must match the property in the GeoJSON file (note that the format/type must also match: the string '11372' is not the integer 11372). The second column, 'columnName_to_color_map' or whatever it is called in your dataframe, should be the column that defines the choropleth colors.
Also note that key_on should match the first column, 'zipcode'.
So the code should look like this:
folium.Choropleth(geo_data='../data/tn_tennessee_zip_codes_geo.min.json',
                  data=slow_to_resolve,
                  columns=['zipcode', 'columnName_to_color_map'],
                  key_on='feature.properties.ZCTA5CE10',
                  fill_color='BuPu', fill_opacity=0.7, line_opacity=0.2,
                  legend_name='Zipcode').add_to(nash_map)
folium.LayerControl().add_to(nash_map)
nash_map
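As a side note (not part of the original answer), the type alignment mentioned above can be handled with a one-line conversion; the ZCTA5CE10 values in those GeoJSON files are typically stored as text, so a numeric zipcode column in the dataframe would never match:

# Convert the zip codes to strings so they match the GeoJSON 'ZCTA5CE10'
# property; the integer 37211 will not match the string '37211'.
slow_to_resolve['zipcode'] = slow_to_resolve['zipcode'].astype(str)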

Python: How to execute a variable in a string in a for loop in a function?

I have an Excel file (the output of a survey) and I am trying to write code that assigns marks based on the survey entries of different students.
Suppose the data at hand is given as follows:
import warnings, pickle
import numpy as np, pandas as pd
warnings.filterwarnings('ignore')
A='pickle.loads(b\'\\x80\\x03cpandas.core.frame\\nDataFrame\\nq\\x00)\\x81q\\x01}q\\x02(X\\x05\\x00\\x00\\x00_dataq\\x03cpandas.core.internals.managers\\nBlockManager\\nq\\x04)\\x81q\\x05(]q\\x06(cpandas.core.indexes.base\\n_new_Index\\nq\\x07cpandas.core.indexes.base\\nIndex\\nq\\x08}q\\t(X\\x04\\x00\\x00\\x00dataq\\ncnumpy.core.multiarray\\n_reconstruct\\nq\\x0bcnumpy\\nndarray\\nq\\x0cK\\x00\\x85q\\rC\\x01bq\\x0e\\x87q\\x0fRq\\x10(K\\x01K\\x04\\x85q\\x11cnumpy\\ndtype\\nq\\x12X\\x02\\x00\\x00\\x00O8q\\x13K\\x00K\\x01\\x87q\\x14Rq\\x15(K\\x03X\\x01\\x00\\x00\\x00|q\\x16NNNJ\\xff\\xff\\xff\\xffJ\\xff\\xff\\xff\\xffK?tq\\x17b\\x89]q\\x18(X\\x04\\x00\\x00\\x00Nameq\\x19X \\x00\\x00\\x00Which companies have you chosen?q\\x1aX\\n\\x00\\x00\\x00Question 1q\\x1bX\\n\\x00\\x00\\x00Question 2q\\x1cetq\\x1dbX\\x04\\x00\\x00\\x00nameq\\x1eNu\\x86q\\x1fRq h\\x07cpandas.core.indexes.range\\nRangeIndex\\nq!}q"(h\\x1eNX\\x05\\x00\\x00\\x00startq#K\\x00X\\x04\\x00\\x00\\x00stopq$K\\x04X\\x04\\x00\\x00\\x00stepq%K\\x01u\\x86q&Rq\\\'e]q(h\\x0bh\\x0cK\\x00\\x85q)h\\x0e\\x87q*Rq+(K\\x01K\\x04K\\x04\\x86q,h\\x15\\x89]q-(X\\x03\\x00\\x00\\x00Ayaq.X\\x04\\x00\\x00\\x00Ramiq/X\\x07\\x00\\x00\\x00Geniousq0X\\x05\\x00\\x00\\x00Samirq1G\\x7f\\xf8\\x00\\x00\\x00\\x00\\x00\\x00X?\\x00\\x00\\x00 Mobil, ConsolidatedEdisonq2X?\\x00\\x00\\x00 DataGeneral, GeneralPublicUtilitiesq3h3X\\x11\\x00\\x00\\x00Uploaded the dataq4X\\xfe\\x00\\x00\\x00Uploaded the data,Specified the reason behind the selected stocks,You were successful in cleaning your data-sets,You have had a justification why you selected this particular time period of your chosen stock. This should have been answered in Question 2.q5X{\\x01\\x00\\x00Uploaded the data,Specified the reason behind the selected stocks,You selected both stocks from different industries,You were successful in cleaning your data-sets,You have had a justification why you selected this particular time period of your chosen stock. This should have been answered in Question 2.,You did extra work for this question and you deserve a round of applause.q6X\\xec\\x00\\x00\\x00Specified the reason behind the selected stocks,You were successful in cleaning your data-sets,You have had a justification why you selected this particular time period of your chosen stock. 
This should have been answered in Question 2.q7G\\x7f\\xf8\\x00\\x00\\x00\\x00\\x00\\x00X(\\x01\\x00\\x00You plot a graph over the time axis that demonstrate how your selected returns behaved in exciting periods,You produced a scatter plot that shows that you may change your x-axis from time to market return.,You also were able to compare your selected stock to other stocks within the same industryq8X\\xde\\x01\\x00\\x00You plot a graph over the time axis that demonstrate how your selected returns behaved in exciting periods,You produced a scatter plot that shows that you may change your x-axis from time to market return.,You have shown some references to what has been going in your chosen period and you gave a good justification.,You also were able to compare your selected stock to other stocks within the same industry,You did extra work for this question and you deserve a very high mark.q9h8etq:ba]q;h\\x07h\\x08}q<(h\\nh\\x0bh\\x0cK\\x00\\x85q=h\\x0e\\x87q>Rq?(K\\x01K\\x04\\x85q#h\\x15\\x89]qA(h\\x19h\\x1ah\\x1bh\\x1cetqBbh\\x1eNu\\x86qCRqDa}qEX\\x06\\x00\\x00\\x000.14.1qF}qG(X\\x04\\x00\\x00\\x00axesqHh\\x06X\\x06\\x00\\x00\\x00blocksqI]qJ}qK(X\\x06\\x00\\x00\\x00valuesqLh+X\\x08\\x00\\x00\\x00mgr_locsqMcbuiltins\\nslice\\nqNK\\x00K\\x04K\\x01\\x87qORqPuaustqQbX\\x04\\x00\\x00\\x00_typqRX\\t\\x00\\x00\\x00dataframeqSX\\t\\x00\\x00\\x00_metadataqT]qUub.\')'
df1=eval(A)
I wrote these functions to help with applying marks and getting results:
# This function will give you the student's answers {Answer_Student_1_Q_1, Answer_Student_1_Q_2}
def Get_student_answers_per_question(i1, total_number_of_questions):
    g = df1.index[df1['Name'] != 'Genious'][i1]
    Gen1 = df1.ix[g, :]
    print(Gen1)
    for i in range(total_number_of_questions):
        if isinstance(Gen1[i + 2], float):
            foo = 'Answer_Student_' + str(i1 + 1) + '_Q_' + str(i + 1) + '= Gen1[' + str(i + 2) + ']'
        else:
            foo = 'Answer_Student_' + str(i1 + 1) + '_Q_' + str(i + 1) + '= Gen1[' + str(i + 2) + '].split(",")'
        exec(foo, locals(), globals())
    pass

# Get student score for each answer in question
def Get_score_per_question(student, question, dictionary_mark, total_number_of_questions):
    # Get_all_answers will get all the answers for student (student)
    Get_all_answers = Get_student_answers_per_question(student - 1, total_number_of_questions)
    foo = 'Answer_Student_' + str(student) + '_Q_' + str(question)
    print(foo)
    v = [dictionary_mark.get(i) for i in exec(foo)]
    return v
Now, in the last function, Get_score_per_question, I was trying to code
v=[dictionary_mark.get(i) for i in exec(foo)]
where v is the list of scores for the items that are present both in the answer and in the dictionary,
so that, depending on the entries referenced by the string variable foo, the result would be a list of numbers of the same length.
The example that I am trying to run is this
student = 1
question = 1
dictionary_mark = {'Uploaded the data': 1,
                   'Specified the reason behind the selected stocks': 1,
                   'You selected both stocks from different industries': 1,
                   'You were successful in cleaning your data-sets': 2,
                   'You have had a justification why you selected this particular time period of your chosen stock. This should have been answered in Question 2.': 1,
                   'You did extra work for this question and you deserve a round of applause.': 1}
total_number_of_questions = 2
Get_score_per_question(student, question, dictionary_mark, total_number_of_questions)
where, as you might foresee, I get the following error:
TypeError: 'NoneType' object is not iterable
Can somebody help me in this regard? And is there any tutorial or page someone could refer me to for writing better code for such surveys in Python, especially when handling splits and so on?
From the docs of the exec() function (https://docs.python.org/3/library/functions.html#exec):
Be aware that the return and yield statements may not be used outside of function definitions even within the context of code passed to the exec() function. The return value is None.
And you have used this piece of code: [dictionary_mark.get(i) for i in exec(foo)]
Since exec() always returns None, and None does not implement __iter__, you get a TypeError when you try to loop over it.
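Not part of the original answer, but one way to avoid exec() altogether is to return the answers from the first function instead of creating dynamically named variables. A rough sketch under that assumption (the function names below are placeholders, not from the question):

def get_student_answers(row, total_number_of_questions):
    # Collect the answers for one student row in a plain list instead of
    # exec()-ing dynamically named variables.
    answers = []
    for i in range(total_number_of_questions):
        cell = row.iloc[i + 2]  # question columns start at position 2
        # NaN cells are kept as-is; text cells are split into individual statements.
        answers.append(cell if isinstance(cell, float) else cell.split(','))
    return answers

def get_score_per_question(answers, question, dictionary_mark):
    answer = answers[question - 1]
    if isinstance(answer, float):  # missing answer (NaN)
        return []
    return [dictionary_mark.get(item.strip()) for item in answer]

# Usage, reusing the names from the question:
row = df1[df1['Name'] != 'Genious'].iloc[0]
answers = get_student_answers(row, total_number_of_questions)
print(get_score_per_question(answers, question, dictionary_mark))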

How to use a conditional statement while scraping?

I'm trying to scrape the MTA website and need a little help scraping the "Train Lines" row. (Website for reference: https://advisory.mtanyct.info/EEoutage/EEOutageReport.aspx?StationID=All)
The train line information is stored as image files (1 line subway, A line subway, etc.) describing each line that's accessible through a particular station. I've had success scraping info out of rows in which only one train passes through, but I'm having difficulty figuring out how to iterate through the cells that have multiple trains passing through, using a conditional statement to test whether a cell holds one line or multiple lines.
tableElements = table.find_elements_by_tag_name('tr')
That's the table I'm iterating through.
tableElements[2].find_elements_by_tag_name('td')[1].find_element_by_tag_name('h4').find_element_by_tag_name('img').get_attribute('alt')
This successfully gives me the value if only one value exists in the particular column.
tableElements[8].find_elements_by_tag_name('td')[1].find_element_by_tag_name('h4').find_elements_by_tag_name('img')
This successfully gives me a list of values I can iterate through to extract the values I need.
Now I try to combine these lines of code in a for loop to extract all the information without stopping.
for info in tableElements[1:]:
    if info.find_elements_by_tag_name('td')[1].find_element_by_tag_name('h4').find_elements_by_tag_name('img')[1] == True:
        for images in info.find_elements_by_tag_name('td')[1].find_element_by_tag_name('h4').find_elements_by_tag_name('img'):
            print(images.get_attribute('alt'))
    else:
        print(info.find_elements_by_tag_name('td')[1].find_element_by_tag_name('h4').find_element_by_tag_name('img').get_attribute('alt'))
I'm getting the error message "list index out of range". I don't know why, as every iteration done in isolation seems to work. My hunch is that I haven't used the boolean operation correctly here. My idea was that if find_elements_by_tag_name returned an element at index [1], that would mean there were multiple image alt texts for me to iterate through; hence the boolean check.
Hi all, thanks so much for your help. I've uploaded my full code to GitHub and attached a link for your reference: https://github.com/tsp2123/MTA-Scraping/blob/master/MTA.ElevatorData.ipynb
The end goal is to put this information into a dataframe, using something along these lines with a for loop that extracts the image information I want:
dataframe = []
for elements in tableElements:
    row = {}
    columnName1 = find_element_by_class_name('td')
    ..
Your logic isn't off here.
"My hunch is I haven't correctly used the boolean operation properly here. My idea was that if find_elements_by_tag_name had an index of [1] that would mean multiple image text for me to iterate through."
The problem is that it can't check whether the statement is True when there's nothing at index position [1]; the indexing itself raises the error at this point:
if info.find_elements_by_tag_name('td')[1].find_element_by_tag_name('h4').find_elements_by_tag_name('img')[1] == True:
What you want to do is use try/except. So something like:
for info in tableElements[1:]:
    try:
        if info.find_elements_by_tag_name('td')[1].find_element_by_tag_name('h4').find_elements_by_tag_name('img')[1] == True:
            for images in info.find_elements_by_tag_name('td')[1].find_element_by_tag_name('h4').find_elements_by_tag_name('img'):
                print(images.get_attribute('alt'))
        else:
            print(info.find_elements_by_tag_name('td')[1].find_element_by_tag_name('h4').find_element_by_tag_name('img').get_attribute('alt'))
    except:
        # do something else
        print('Nothing found in index position.')
Is it also possible to go back to your question and provide the full code? When I try this, I'm getting 11 table elements, so I want to test it against the specific table you're trying to scrape.
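Not from the original answer, but a hedged alternative sketch: instead of catching the IndexError, you can branch on the length of the list returned by find_elements_by_tag_name, which also sidesteps the == True comparison (a WebElement never compares equal to True). It feeds straight into the dataframe end goal mentioned in the question; treating the first td as the station name is an assumption, not something from the page:

import pandas as pd

rows = []
for info in tableElements[1:]:
    cells = info.find_elements_by_tag_name('td')
    images = cells[1].find_element_by_tag_name('h4').find_elements_by_tag_name('img')
    # An empty list means no line icons; otherwise this handles one or many alike.
    lines = [image.get_attribute('alt') for image in images]
    rows.append({'station': cells[0].text,  # assumed column position
                 'lines': lines})

dataframe = pd.DataFrame(rows)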

Pandas Drop and Replace functions won't work within a UDF

I looked around at other questions but couldn't find one that addresses the issue I'm having. I am cleaning a data set in an IPython notebook. When I run the cleaning tasks individually they work as expected, but I am having trouble with the replace() and drop() functions when they are included in a UDF (user-defined function). Specifically, these lines aren't doing anything within the UDF; however, the returned dataframe completes the other tasks as expected (i.e. it reads in the file, sets the index, and filters out selected dates).
Any help is much appreciated!
Note that in this problem the df.drop() and df.replace() commands both work as expected when executed outside of the UDF. The function is below for your reference. The issue is with the last two lines, "station.replace()" and "station.drop()".
def read_file(file_path):
    '''Function to read in daily x data'''
    if os.path.exists(os.getcwd() + '/' + file_path) == True:
        station = pd.read_csv(file_path)
    else:
        !unzip alldata.zip
        station = pd.read_csv(file_path)
    station.set_index('date', inplace=True)  # put date in the index
    station = station_data[station_data.index > '1984-09-29']  # removes days where there is no y-data
    station.replace('---', '0', inplace=True)
    station.drop(columns=['Unnamed: 0'], axis=1, inplace=True)  # drop non-station columns
There was a mistake here:
station = station_data[station_data.index > '1984-09-29']
I was referencing an old table name. I corrected it to:
station = station[station.index > '1984-09-29']
Note that I had to restart the notebook and re-run it from the top for it to work. I believe the issue was the table name used inside the UDF conflicting with what was already stored in memory.
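For reference, a minimal sketch of how the corrected function might look with the fix applied and an explicit return added (the return and the zipfile fallback are assumptions for the sketch, not the author's exact code; in the notebook the fallback was the !unzip shell magic):

import os
import zipfile
import pandas as pd

def read_file(file_path):
    '''Function to read in daily x data'''
    if not os.path.exists(os.path.join(os.getcwd(), file_path)):
        # Stand-in for the notebook's "!unzip alldata.zip" shell magic.
        with zipfile.ZipFile('alldata.zip') as zf:
            zf.extractall()
    station = pd.read_csv(file_path)
    station.set_index('date', inplace=True)  # put date in the index
    station = station[station.index > '1984-09-29']  # use the local name, not station_data
    station.replace('---', '0', inplace=True)
    station.drop(columns=['Unnamed: 0'], inplace=True)  # drop non-station columns
    return station  # assumed: hand the cleaned frame back to the caller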
