Up front: I am using wxPython 4.0.4, Python 3.7.2, wx.ProgressDialog, and Windows 10.
I encountered a problem while trying to create a progress bar. I was reading a file with over 1 million lines and added a progress bar that updates every 10,000 or so lines. The bar did not advance the way I expected. I looked at the values returned by GetValue() and noticed that they were much smaller than the values I had set with Update().
I deleted all unnecessary code, experimented a bit, and noticed that the effect depends on the maximum value of the progress bar. Everything seems to work fine for values smaller than 65,536, which happens to be 2^16. I haven't found a similar case during my search for a solution. You can find my example code below.
Is this supposed to happen? Is there a way to avoid this behaviour (besides reducing the maximum value)? Or am I missing something? I know that other types of progress bar do not act in this manner but I would like to use this type due to its simplicity.
Thank you in advance!
import wx

def pb_test(maxVal):
    print("Progress bar test with maxVal=%d" % maxVal)
    pb = wx.ProgressDialog(title="", message="")
    pb.SetRange(maxVal) # set maximum of progress bar
    val=20000 # arbitrary value to set progress bar to
    pb.Update(val) # set value
    print("Value/Range: %d/%d\tExpected Value: %d" %(pb.GetValue(), pb.GetRange(), val))
    pb.Update(maxVal) # set maximum value
    print("Value/Range: %d/%d\tExpected Value: %d" %(pb.GetValue(), pb.GetRange(), maxVal))

app = wx.App()
pb_test(65535) # <-- this one works as expected
print("------------------")
pb_test(65536) # <-- this one shows only half the expected values
EDIT: This is my output:
Progress bar test with maxVal=65535
Value/Range: 20000/65535 Expected Value: 20000
Value/Range: 65535/65535 Expected Value: 65535
------------------
Progress bar test with maxVal=65536
Value/Range: 10000/65536 Expected Value: 20000
Value/Range: 32768/65536 Expected Value: 65536
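One possible workaround, sketched under the assumption that the dialog behaves correctly whenever its range stays below 65,536 (as the output above suggests): keep the dialog range fixed at a small number of steps and map the real line count onto it. The helper name `scaled_progress` and the step count of 1000 are illustrative choices, not part of wxPython.

```python
def scaled_progress(current, total, steps=1000):
    """Map a counter in 0..total onto the integer range 0..steps."""
    if total <= 0:
        return 0
    # Clamp so that overshooting the total never exceeds the dialog range.
    current = max(0, min(current, total))
    # Integer arithmetic avoids floating-point rounding.
    return current * steps // total


if __name__ == "__main__":
    total_lines = 1000000
    for line_no in (0, 20000, 500000, 1000000):
        print(line_no, "->", scaled_progress(line_no, total_lines))
```

With a helper like this, the dialog would be given SetRange(1000) once and then updated with pb.Update(scaled_progress(line_no, total_lines)), so the true line count never reaches the dialog.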
Background:
My question should be relatively easy, however I am not able to figure it out.
I have written a function regarding queueing theory and it will be used for ambulance service planning. For example, how many calls for service can I expect in a given time frame.
The function takes two parameters; a starting value of the number of ambulances in my system starting at 0 and ending at 100 ambulances. This will show the probability of zero calls for service, one call for service, three calls for service….up to 100 calls for service. Second parameter is an arrival rate number which is the past historical arrival rate in my system.
The function runs and prints out the result to my screen. I have checked the math and it appears to be correct.
This is Python 3.7 with the Anaconda distribution.
My question is this:
I would like to process this data even further but I don’t know how to capture it and do more math. For example, I would like to take this list and accumulate the probability values. With an arrival rate of five, there is a cumulative probability of 61.56% of at least five calls for service, etc.
A second example of how I would like to process this data is to format it as percentages and write it out to a text file.
A third example would be to process the cumulative probabilities and exclude any values higher than the 99% cumulative value (because these vanish into extremely small numbers).
A fourth example would be to create a bar chart showing the probability of n calls for service.
These are some of the things I want to do with the queueing theory calculations. And there are a lot more. I am planning on writing a larger application. But I am stuck at this point. The function writes an output into my Python 3.7 console. How do I “capture” that output as an object or something and perform other processing on the data?
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import math
import csv

def probability_x(start_value = 0, arrival_rate = 0):
    probability_arrivals = []
    while start_value <= 100:
        probability_arrivals = [start_value, math.pow(arrival_rate, start_value) * math.pow(math.e, -arrival_rate) / math.factorial(start_value)]
        print(probability_arrivals)
        start_value = start_value + 1
    return probability_arrivals

#probability_x(arrival_rate = 5, x = 5)
#The code written above prints to the console, but my goal is to take the returned values and make other calculations.
#How do I 'capture' this data for further processing is where I need help (for example, bar plots, cumulative frequency, etc )

#failure. TypeError: writerows() argument must be iterable.
with open('ExpectedProbability.csv', 'w') as writeFile:
    writer = csv.writer(writeFile)
    for value in probability_x(arrival_rate = 5):
        writer.writerows(value)
writeFile.close()

#Failure. Why does it return 2. Yes there are two columns but I was expecting 101 as the length because that is the end of my loop.
print(len(probability_x(arrival_rate = 5)))
The problem is, when you write
probability_arrivals = [start_value, math.pow(arrival_rate, start_value) * math.pow(math.e, -arrival_rate) / math.factorial(start_value)]
You're overwriting the previous contents of probability_arrivals. Everything that it held previously is lost.
Instead of using = to reassign probability_arrivals, you want to append another entry to the list:
probability_arrivals.append([start_value, math.pow(arrival_rate, start_value) * math.pow(math.e, -arrival_rate) / math.factorial(start_value)])
I'll also note, your while loop can be improved. You're basically just looping over start_value until it reaches a certain value. A for loop would be more appropriate here:
for s in range(start_value, 101): # The end value is exclusive, so it's 101 not 100
    probability_arrivals.append([s, math.pow(arrival_rate, s) * math.pow(math.e, -arrival_rate) / math.factorial(s)])
    print(probability_arrivals[-1])
Now you don't need to manually worry about incrementing the counter.
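Putting both suggestions together, a minimal sketch of the function that returns all rows instead of only the last one could look like this. The `end_value` parameter and the cumulative-probability loop are additions for illustration; the loop shows one way to do the kind of follow-up processing the question asks about.

```python
import math
from itertools import accumulate


def probability_x(arrival_rate, start_value=0, end_value=100):
    """Return [n, P(n calls)] pairs for n = start_value..end_value."""
    probability_arrivals = []
    for n in range(start_value, end_value + 1):
        p = math.pow(arrival_rate, n) * math.exp(-arrival_rate) / math.factorial(n)
        probability_arrivals.append([n, p])  # append keeps the earlier rows
    return probability_arrivals


if __name__ == "__main__":
    rows = probability_x(arrival_rate=5)
    print(len(rows))  # one row per value of n

    # Running total of the probabilities, one entry per row.
    cumulative = list(accumulate(p for _, p in rows))

    # Print rows as percentages, stopping once 99% has been accumulated.
    for (n, p), c in zip(rows, cumulative):
        print("{:3d}  {:.2%}  {:.2%}".format(n, p, c))
        if c >= 0.99:
            break
```

Because the function now returns a list of rows, the same `rows` object can be passed to csv.writer(...).writerows(rows) in a single call or handed to a plotting library.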
I use tqdm to print a progress bar for a long running optimization process with hyperopt.
The process calls a function, say, 500 times, and each call takes around 10 to 20 minutes. I therefore made the progress display more fine-grained by adding some tqdm.update statements in the loop, advancing the progress bar by fractions. This avoids having two nested progress bars while still being able to see immediately how many function calls have been performed so far.
Now the ugly result looks like this:
15%|███▌ | 73.69999999999993/500 [7:40:31<102:54:08, 868.98s/it, evaluating fold 2 of 2 folds...]Iteration 1, loss = 2.50358388
As you can see above, it is the 73rd call of the function, and this 73rd function call is about 70% finished. In fact, I just estimated the number of substeps m in the function (which might vary from call to call) and used the fraction 1/m to update the progress bar. After the function call, I synchronize the progress bar back to a full integer to avoid accumulating rounding errors.
Of course accuracy is not an issue at all here. But I would like to display 73.70 rather than 73.69999999999993.
I already tried rounding my update value to two decimal places, but that doesn't fix the problem: a number that is not exactly representable as a float still gets printed in its long form.
According to the tqdm documentation, this part of the output belongs to the r_bar portion of the format string, but I couldn't find a way to set it. Can you help me with this?
According to the docs r_bar defaults to:
r_bar='| {n_fmt}/{total_fmt} [{elapsed}<{remaining}, '
Here is my code:
with tqdm(iterable=None, initial=num_trials, maxinterval=maxinterval, total=max_evals, ascii=False, disable=show_progressbar is False) as progress_bar:
    def fn_to_minimize(*args, **kwargs):
        return fn(*args, **kwargs, _progress_bar=progress_bar)
    for num_trials in range(num_trials, max_evals):
        progress_bar.n=float(num_trials)
        progress_bar.refresh()
        best = fmin(**kwargs, fn=fn_to_minimize, trials=trials, max_evals=num_trials+1)
        # do some other stuff here
In the called function (one of the entries in kwargs btw) I update the progress bar just like this:
_progress_bar.update(round(update_value, 2))
For rounding issues in tqdm, you can directly edit the formatting in the r_bar as one of the parameters in the bar_format. For example:
from tqdm import trange
for i in trange(int(7e7), bar_format = "{desc}: {percentage:.3f}%|{bar}| {n_fmt}/{total_fmt} [{elapsed}<{remaining}"):
    pass
This displays the percentage with three decimal places.
For 2 decimal places, you can simply edit the {n_fmt} to be {n:.2f} . You can also edit other parameters such as {desc} or add in additional decimal places to the percentage.
from tqdm import trange
for i in trange(int(7e7), bar_format = "{desc}: {percentage:.10f}%|{bar}| {n:.2f}/{total_fmt} [{elapsed}<{remaining}"):
    pass
This displays the counter with two decimal places and the percentage with ten.
Upon looking through the source code of tqdm, n_fmt is actually pointing to str(n), hence passing in the formatted version of n can bypass its intrinsic formatting.
if unit_scale:
    n_fmt = format_sizeof(n, divisor=unit_divisor)
    total_fmt = format_sizeof(total, divisor=unit_divisor) \
        if total is not None else '?'
else:
    n_fmt = str(n)
    total_fmt = str(total) if total is not None else '?'
try:
    postfix = ', ' + postfix if postfix else ''
except TypeError:
    pass
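The difference between `{n_fmt}` and `{n:.2f}` can be reproduced without tqdm. The sketch below is a plain-Python imitation of how such a template gets filled in, not tqdm's own code; `render_counter` is a hypothetical helper that supplies `n_fmt` as str(n), following the excerpt above.

```python
def render_counter(n, total, bar_format="{n_fmt}/{total_fmt}"):
    """Fill a bar_format-style template with n, str(n) and str(total)."""
    return bar_format.format(n=n, n_fmt=str(n), total_fmt=str(total))


if __name__ == "__main__":
    n = 73.0
    for _ in range(7):
        n += 0.1  # repeated float additions can accumulate rounding error

    # Default template: shows str(n), which may come out long.
    print(render_counter(n, 500))
    # Template with an explicit format spec: always two decimals.
    print(render_counter(n, 500, bar_format="{n:.2f}/{total_fmt}"))
```

The format spec is applied to the raw number, so the rounding happens at display time and the value stored in the progress bar is left untouched.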
I am writing a Python script that gives basic data for all the planets, the Sun and the Moon. My first function divides the planets between those that are above the horizon, and those that are not risen yet:
planets = {
    'mercury': ephem.Mercury(),
    'venus': ephem.Venus(),
    'mars': ephem.Mars(),
    'jupiter': ephem.Jupiter(),
    'saturn': ephem.Saturn(),
    'uranus': ephem.Uranus(),
    'neptune': ephem.Neptune()
}

def findVisiblePlanets(obs):
    visiblePlanets = dict()
    notVisiblePlanets = dict()
    for obj in planets:
        planets[obj].compute(obs)
        if planets[obj].alt > 0:
            visiblePlanets[obj] = planets[obj]
        else:
            notVisiblePlanets[obj] = planets[obj]
    return (visiblePlanets, notVisiblePlanets)
This works all right: the tuple I receive from findVisiblePlanets corresponds to the actual sky for the given 'obs'.
But in another function, I need to test the altitude of each planet. If it's above 0, the script displays 'setting at xxx', and if it's under 0, the script displays 'rising at xxx'. Here is the code:
if bodies[obj].alt > 0:
    print(' Sets at', setTime.strftime('%H:%M:%S'), deltaSet)
else:
    print(' Rises at', riseTime.strftime('%H:%M:%S'), deltaRise)
So I'm using the exact same condition, except that this time it doesn't work. I am sure I have the correct object behind bodies[obj], as the script displays name, magnitude, distance, etc. But for some reason, the altitude (.alt) is always below 0, so the script only displays the rising time.
I tried print(bodies[obj].alt), and I receive a negative figure in the form of '-0:00:07.8' (example). I tried using int(bodies[obj].alt) for the comparison but this ends up being a 0. How can I test if the altitude is negative? Am I missing something obvious here?
Thanks for your help.
I think I had a similar problem once. As I understand it, PyEphem advances the time of your observer when you call next_rising() or next_setting() for an object: it looks for the first point in time at which the object is above or below the horizon. If you then read the body's .alt, it will always be this little bit below or above the horizon.
You have to store your observer time somehow and set it again after calculating setting/rising times.
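On the int() attempt in the question: as far as I know, PyEphem angles are floating-point numbers measured in radians that merely print as degrees:minutes:seconds, so a value such as -0:00:07.8 is a tiny negative float. The sketch below uses plain Python rather than PyEphem, and `dms_to_radians` is a hypothetical helper, to show why int() yields 0 while a direct comparison keeps the sign.

```python
import math


def dms_to_radians(sign, degrees, minutes, seconds):
    """Convert a degrees:minutes:seconds angle to radians."""
    return sign * math.radians(degrees + minutes / 60 + seconds / 3600)


if __name__ == "__main__":
    alt = dms_to_radians(-1, 0, 0, 7.8)  # the -0:00:07.8 from the question
    print(alt)       # a tiny negative float
    print(int(alt))  # int() truncates toward zero
    print(alt < 0)   # comparing the float itself keeps the sign
```

So the comparison `alt > 0` is already the right test. A value this close to the horizon fits the explanation above that the body was last computed at its rise or set time.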
In a small portion of code for a retail auditing calculator, I'm attempting to allow the input of a retail value and multiply it by up to 2 entered quantities. The expected (intended) result is $X*Y=$Z.
I've attempted to modify the code in a couple of ways and seem to be stuck on why this math isn't working correctly.
I've attempted a number of different configurations in the code and the most I've achieved is the following:
#Retail value of item, whole number (i.e. $49.99 entered as 4999)
rtlVAL = input("Retail Value: ")
#Quantity of Items - can be multiplied for full stack items, default if no entry is '1'
qt1 = float(input("Quantity 1: ")) #ex. 4
qt2 = float(input("Quantity 2: ") or "1") #ex " "
#Convert the Retail Value to financial format (i.e. 4999 to $49.99)
rtl = float("{:.2}".format (rtlVAL))
# Screen Output
qtyVAL = int(qt1)*int(qt2)
print("$" + str(qtyVAL*rtl))
The entered values are:
Retail Value: 4999
Quantity 1: 4
Quantity 2: (blank)
The expected calculation is 4999 * 4 * 1 (because a blank entry defaults to a value of 1), and the expected result is $199.96.
The result of this code is $196.0, so not only is the result wrong, it is also missing the two decimal places.
I'm not entirely certain why the math comes out differently from what I expect.
What am I missing here?
On line 9, I've tried the following:
rtl = float("{:.2f}".format (rtlVAL))
rtl = int("{:.2f}".format (rtlVAL))
The return was
ValueError: Unknown format code 'f' for object of type 'str'
if I change line 13 to:
print("$" + float(qtyVAL*rtl))
I get
TypeError: must be str, not float
Using either of the prior alterations in conjunction with the latter returns the ValueError.
I am using Python 3.4 and 3.6.
I did search a few other SO questions regarding Python, math, floating point, and formatting, but they were asking about and presenting something far more advanced and entangled than this, so I wasn't able to glean an answer I could apply in this context. Others applied mainly to Python 2.7, where raw_input() is used instead of input(); in Python 3.x this becomes input(), wrapped in int() to step out of the str value (as far as I understand it for this purpose).
I did not see this as a duplicate, but if I missed something in that I do apologize - it isn't intentional.
No need to mess around with number formats:
rtl = float(rtlVAL)/100
Just divide the retail value by 100 to get the dollar value.
EDIT:
Incidentally, the reason it was coming up with 196 was because your number format was taking the first two digits of rtlVAL - 49 in your case - and then multiplying by that.
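A minimal sketch of the whole calculation with that fix applied, plus two-decimal output formatting. The function name `total_price` and its parameters are illustrative; it takes the three entries as strings, the way input() would return them.

```python
def total_price(retail_value, quantity_1, quantity_2=""):
    """Return the total as a dollar string, e.g. '4999', '4', '' -> '$199.96'."""
    dollars = float(retail_value) / 100              # 4999 -> 49.99
    qty_1 = int(float(quantity_1.strip() or "1"))    # blank entry defaults to 1
    qty_2 = int(float(quantity_2.strip() or "1"))
    # The format spec on the number, not str(), is what keeps two decimal places.
    return "${:.2f}".format(dollars * qty_1 * qty_2)


if __name__ == "__main__":
    # Precision applied to a *string* truncates it, which is where 49 came from.
    print("{:.2}".format("4999"))
    print(total_price("4999", "4", ""))
```

The `.2f` spec only works on numbers, which is why applying it to the raw input string raised the ValueError shown in the question.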
Background:
I'm working on a program to show a 2d cross section of 3d data. The data is stored in a simple text csv file in the format x, y, z1, z2, z3, etc. I take a start and end point and flick through the dataset (~110,000 lines) to create a line of points between these two locations, and dump them into an array. This works fine, and fairly quickly (takes about 0.3 seconds). To then display this line, I've been creating a matplotlib stacked bar chart. However, the total run time of the program is about 5.5 seconds. I've narrowed the bulk of it (3 seconds worth) down to the code below.
'values' is an array with the x, y and z values plus a leading identifier, which isn't used in this part of the code. The first plt.bar is plotting the bar sections, and the second is used to create an arbitrary floor of -2000. In order to generate a continuous looking section, I'm using an interval between each bar of zero.
import matplotlib.pyplot as plt

for values in crossSection:
    prevNum = None
    layerColour = None
    if values != None:
        for i in range(3, len(values)):
            if values[i] != 'n':
                num = float(values[i].strip())
                if prevNum != None:
                    plt.bar(spacing, prevNum-num, width=interval, \
                            bottom=num, color=layerColour, \
                            edgecolor=None, linewidth=0)
                prevNum = num
                layerColour = layerParams[i].strip()
        if prevNum != None:
            plt.bar(spacing, prevNum+2000, width=interval, bottom=-2000, \
                    color=layerColour, linewidth=0)
    spacing += interval
I'm sure there's a more efficient way to do this, but I'm new to Matplotlib and still unfamiliar with its capabilities. The other main use of time in the code is:
plt.savefig('output.png')
which takes about a second, but I figure this is to be expected to save the file and I can't do anything about it.
Question:
Is there a faster way of generating the same output (a stacked bar chart or something that looks like one) by using plt.bar() better, or a different Matplotlib function?
EDIT:
I forgot to mention in the original post that I'm using Python 3.2.3 and Matplotlib 1.2.0
Leaving this here in case someone runs into the same problem...
While not exactly the same as using bar(), with a sufficiently large dataset (large enough that using bar() takes a few seconds) the results are indistinguishable from stackplot(). If I sort the data into layers using the method given by tcaswell and feed it into stackplot() the chart is created in 0.2 seconds, rather than 3 seconds.
EDIT
Code provided by tcaswell to turn the data into layers:
import numpy as np

accum_values = []
for values in crossSection:
    accum_values.append([float(v.strip()) for v in values[3:]])
accum_values = np.vstack(accum_values).T
layer_params = [l.strip() for l in layerParams]
bottom = np.zeros(accum_values[0].shape)
It looks like you are drawing each bar individually. You can pass sequences to bar instead (see this example).
I think something like:
import numpy as np
import matplotlib.pyplot as plt

accum_values = []
for values in crossSection:
    accum_values.append([float(v.strip()) for v in values[3:]])
accum_values = np.vstack(accum_values).T
layer_params = [l.strip() for l in layerParams]
bottom = np.zeros(accum_values[0].shape)
ax = plt.gca()
spacing = interval*np.arange(len(accum_values[0]))
for data,color in zip(accum_values,layer_params):
    ax.bar(spacing,data,bottom=bottom,color=color,linewidth=0,width=interval)
    bottom += data
will be faster (because each call to bar creates one BarContainer and I suspect the source of your issues is you were creating one for each bar, instead of one for each layer).
I don't really understand what you are doing with the bars that have tops below their bottoms, so I didn't try to implement that, so you will have to adapt this a bit.
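The step that makes both approaches fast is turning the per-point rows into per-layer sequences, so that each layer needs only one plotting call. A dependency-free sketch of that transposition follows. The helper name `rows_to_layers` and the sample rows are made up for illustration, and it assumes every row has the same number of z values and no 'n' placeholders.

```python
def rows_to_layers(cross_section, first_z_column=3):
    """Transpose per-point rows into per-layer lists of floats."""
    z_rows = [[float(v.strip()) for v in row[first_z_column:]]
              for row in cross_section]
    # zip(*rows) turns a list of rows into a list of columns.
    return [list(layer) for layer in zip(*z_rows)]


if __name__ == "__main__":
    # identifier, x, y, then one z value per layer (hypothetical data)
    cross_section = [
        ["a", "0", "0", " 10", " 4", " 1"],
        ["b", "1", "0", " 12", " 5", " 2"],
    ]
    for layer in rows_to_layers(cross_section):
        print(layer)
```

Each inner list then holds one layer across all points, which is the shape a single bar() call per layer expects. Rows containing 'n' would need to be filtered or filled in first.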