How to put time difference value into summation equation using python - python-3.x

I wrote a summation equation to compute the Y value from my csv file.
I want the loop to run from 0 up to the time difference. When I ran it, it gave me this error: ("'str' object cannot be interpreted as an integer", 'occurred at index 0')
My summation equation:
n = time difference between two rows
My code:
def y_convert(X,time):
    Y=0
    if X == 10:
        for k in range(0,time):
            Y=np.sum(X*k)
    else:
        for k in range(0,time):
            Y=np.sum(X*k)
    return Y
Then I convert the time difference into minutes and apply this function to find Y:
df1['time_diff'] = pd.to_datetime(df1["time"])
df1['delta'] = (df1['time_diff']-df1['time_diff'].shift()).fillna(0)
df1['t'] = df1['delta'].apply(lambda x: x / np.timedelta64(1,'m')).astype('int64') % (24*60)
X = df1['X'].astype(int)
time=df1['t'].astype(int)
Y = df1.apply(lambda x: y_convert(x.X,x.time), axis=1)
Then I tried to plot the graph after getting the correct answer provided by jezrael
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(time, df1['Y'])
ax.set_xlabel
ax.set_ylabel
plt.show()
Plot graph:
my csv file:

I think you need to pass column t, not column time:
df1['t'] = df1['delta'].dt.total_seconds().div(60).astype(int)
Y = df1.apply(lambda x: y_convert(x.X,x.t), axis=1)
The reason is that the second argument of range must be an integer, not a time string.
Your solution effectively calls:
range(0,'6:15:00')
It also seems your solution can be simplified a lot:
Y = df1['X'] * (df1['t'] - 1)
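A small, pandas-free sketch of the same point: range needs an integer, so the time difference has to become a whole number of minutes first. The helper name minutes_between and the sample timestamps are made up for illustration. y_convert keeps the loop from the question, which overwrites Y on every pass rather than accumulating, so its result equals X * (t - 1) whenever t is at least 1.

```python
from datetime import datetime

def minutes_between(earlier, later, fmt="%H:%M:%S"):
    """Whole minutes between two time strings (hypothetical helper)."""
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return int(delta.total_seconds() // 60)

def y_convert(x, minutes):
    """Loop from the question: y is overwritten on each pass, not accumulated."""
    y = 0
    for k in range(0, minutes):  # minutes must be an int here
        y = x * k
    return y

# Passing a time string straight to range reproduces the reported error.
try:
    range(0, "6:15:00")
except TypeError as err:
    print(err)

t = minutes_between("06:15:00", "06:20:00")  # sample timestamps, 5 minutes apart
print(y_convert(10, t), 10 * (t - 1))        # the loop and the closed form agree
```

If a true running sum was intended, the loop would need Y += X*k instead, and the closed form would change accordingly.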

Related

How do I make a for loop run i over all values inside x, where x is a defined array?

My code right now is as follows:
from math import *
import matplotlib.pyplot as plt
import numpy as np
"""
TITLE
"""
def f(x,y):
    for i in range(len(x)):
        y.append(exp(-x[i]) - sin (pi*x[i]/2))
def ddxf(x,y2):
    for i in range(len(x)):
        y2.append(-exp(-x[i]) - (pi/2)*cos(pi*x[i]/2))
y = []
y2 = []
f(x, y)
x = np.linspace(0, 4, 100)
plt.title('Graph of function x')
plt.xlabel('x')
plt.ylabel('f(x)')
plt.plot(x, y, 'g')
plt.grid(True)
plt.show()
x0 = float(input("Insert the approximate value of the first positive root: "))
intMax = 100
i = 0
epsilon = 1.e-7
while abs(f(x0,y)) > epsilon and i < intMax:
    x1 = x0 - (f(x0))/(ddxf(x0))
    x0 = x1
    i += 1
print (i)
print (x1)
I get this error when running the program. It seems that len(x) cannot be used if x isn't a string, which doesn't make sense to me. If the array has a length and isn't infinite, why can't len(x) read its length? Anyway, please help me; I hope I made myself clear.
Regarding the error: You are using x before defining it. In your code, you first use f(x, y) and only after that you actually define what x is (namely x = np.linspace(0, 4, 100)). You probably want to swap these two lines to fix the issue.
Regarding the question in the title: len(x) should be fine to get the length of a list. However, in Python you don't need to go through a list like that. You can instead use for element_name in list_name: This will essentially go through list_name element by element and make it available to you with the name element_name.
There is also something called list comprehensions in Python - you might want to take a look at those and see whether you can apply it to your code.
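As a rough sketch of how these suggestions could fit together (not necessarily how the asker's final program should look), the version below makes f and its derivative take a single number and return a value, builds the sampled curve with a list comprehension, and runs the Newton iteration on plain floats. The names dfdx and newton, and the starting guess 0.5, are illustrative choices rather than anything from the question.

```python
from math import exp, sin, cos, pi

def f(x):
    """The function from the question, for a single number x."""
    return exp(-x) - sin(pi * x / 2)

def dfdx(x):
    """Its derivative with respect to x."""
    return -exp(-x) - (pi / 2) * cos(pi * x / 2)

# 100 evenly spaced points on [0, 4], like np.linspace(0, 4, 100)
xs = [4 * i / 99 for i in range(100)]
# List comprehension: one y value per element of xs, no index needed
ys = [f(x) for x in xs]

def newton(x0, epsilon=1.0e-7, max_iter=100):
    """Newton's method: returns the approximate root and the steps taken."""
    i = 0
    while abs(f(x0)) > epsilon and i < max_iter:
        x0 = x0 - f(x0) / dfdx(x0)
        i += 1
    return x0, i

root, steps = newton(0.5)
print(steps, root)
```

Because f now returns a value instead of appending to a list, the same function serves both the plot data and the root search.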

Divide by Zero in Mean()?

I'm trying to write some code to compute the mean, variance, standard deviation, and FWHM, and finally evaluate the Gaussian integral. I keep running into a division by zero error that I can't get past, and I would like to know how to solve it.
Where it throws the error, I've tried adding an exception handler as follows:
Average = (sum(yvalues)) / (len(yvalues))
try:
    return (sum(yvalues) / len(yvalues))
except ZeroDivisionError:
    return 0
xvalues = []
yvalues = []
def generate():
    for i in range(0,300):
        a = rand.uniform((float("-inf") , float("inf")))
        b = rand.uniform((float("-inf") , float("inf")))
        xvalues.append(i)
        ### Defining the variable 'y'
        y = a * (b + i)
        yvalues.append(y) + 1
def mean():
    Average = (sum(yvalues))/(len(yvalues))
    print("The average is", Average)
    return Average
def varience():
    # This calculates the SD and the varience
    s = []
    for i in yvalues:
        z = i - mean()
        z = (np.abs(i-z))**2
        s.append(y)**2
    t = mean()
    v = numpy.sqrt(t)
    print("Answer for Varience is:", v)
    return v
Traceback (most recent call last):
File "Tuesday.py", line 42, in <module>
def make_gauss(sigma=varience(), mu=mean(), x = random.uniform((float("inf"))*-1, float("inf"))):
File "Tuesday.py", line 35, in varience
t = mean()
File "Tuesday.py", line 25, in mean
Average = (sum(yvalues))/(len(yvalues))
ZeroDivisionError: division by zero
There are a few things that are not quite right as people noted above.
import random
import numpy as np
def generate():
    xvalues, yvalues = [], []
    for i in range(0,300):
        a = random.uniform(-1000, 1000)
        b = random.uniform(-1000, 1000)
        xvalues.append(i)
        ### Defining the variable 'y'
        y = a * (b + i)
        yvalues.append(y)
    return xvalues, yvalues
def mean(yvalues):
    return sum(yvalues)/len(yvalues)
def variance(yvalues):
    # This calculates the SD and the varience
    s = []
    yvalues_mean = mean(yvalues)
    for y in yvalues:
        z = (y - yvalues_mean)**2
        s.append(z)
    t = mean(s)
    return t
def variance2(yvalues):
    yvalues_mean = mean(yvalues)
    return sum( (y-yvalues_mean)**2 for y in yvalues) / len(yvalues)
# Generate the xvalues and yvalues
xvalues, yvalues = generate()
# Now do the calculation, based on the passed parameters
mean_yvalues = mean(yvalues)
variance_yvalues = variance(yvalues)
variance_yvalues2 = variance2(yvalues)
print('Mean {} variance {} {}'.format(mean_yvalues, variance_yvalues, variance_yvalues2))
# Using Numpy
np_mean = np.mean(yvalues)
np_var = np.var(yvalues)
print('Numpy: Mean {} variance {}'.format(np_mean, np_var))
The way variance was calculated isn't quite right, but given the comment of "SD and variance" you were probably going to calculate both.
The code above gives two (well, three) ways to do what I understand you were trying to do, with a few of the functions cleaned up a bit: generate() now returns two lists, mean() returns the mean, and so on. The function variance2() is an alternative way to calculate the variance using a generator expression.
The last couple of lines are an example using numpy, which has all of this built in and, if available, is a great way to go.
The one part that wasn't clear was the random.uniform(float("-inf"), float("inf")) call, which seems to be an error (?).
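On that last point, a quick check suggests why infinite bounds are not usable. As far as I know, CPython computes uniform(a, b) as a + (b - a) * random(), and with a = -inf and b = inf that expression works out to nan. The finite bounds of -1000 and 1000 are the arbitrary ones used in the code above.

```python
import math
import random

# Infinite bounds: the arithmetic inside uniform() produces nan, not a number.
value = random.uniform(float("-inf"), float("inf"))
print(value, math.isnan(value))

# Finite bounds behave as expected.
bounded = random.uniform(-1000, 1000)
print(-1000 <= bounded <= 1000)
```

A nan in yvalues would then propagate through sum(), the mean, and the variance.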
You are calling mean before you call generate.
This is obvious since yvalues.append(y) + 1 (in generate) would have caused another error (TypeError) since .append returns None and you can't add 1 to None.
Change yvalues.append(y) + 1 to yvalues.append(y + 1) and then make sure to call generate before you call mean.
Also notice that you have the same error in varience (which should be called variance, btw). s.append(y)**2 should be s.append(y ** 2).
Another error is visible in the stack trace, which shows def make_gauss(sigma=varience(), mu=mean(), x = random.uniform((float("inf"))*-1, float("inf"))). Default argument values are evaluated once, when the def statement runs, so varience() and mean() are called on this line, before generate() has filled yvalues.
I'm pretty sure you don't actually want to call varience and mean on this line, just reference them. So also change that line to make_gauss(sigma=varience, mu=mean, x = random.uniform((float("inf"))*-1, float("inf")))
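To illustrate when default arguments are evaluated, here is a minimal sketch. The names current_mean, make_gauss_eager, and make_gauss_lazy are hypothetical stand-ins, not the asker's functions. A call written in the def line runs once, when the function is defined; a None default with a check inside the body is a common alternative that defers the work until the function is actually called.

```python
calls = []

def current_mean():
    """Stand-in for mean(); records each call so they can be counted."""
    calls.append("called")
    return 0.0

def make_gauss_eager(mu=current_mean()):  # current_mean() runs right here, once
    return mu

def make_gauss_lazy(mu=None):             # nothing is computed at definition time
    if mu is None:
        mu = current_mean()               # computed on each call that needs it
    return mu

print(len(calls))  # counts the call made while defining make_gauss_eager
make_gauss_eager()
make_gauss_eager()
print(len(calls))  # unchanged: the default was not recomputed
make_gauss_lazy()
print(len(calls))  # one more: the lazy version computed it when called
```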

Weighted moving average in python with different width in different regions

I was trying to take an oscillation average of highly oscillating data. The oscillations are not uniform; there are fewer oscillations in the initial region.
x = np.linspace(0, 1000, 1000001)
y = some oscillating data say, sin(x^2)
(The original data file is huge, so I can't upload it)
I want to take a weighted moving average of the function and plot it. Initially the period of the function is larger, so I want to average over a large interval there, while a smaller interval will do later on.
I have found a possible elegant solution in the following post:
Weighted moving average in python
However, I want to have a different width in different regions of x. Say, when x is between (0,100) I want width=0.6, while when x is between (101, 300) I want width=0.2, and so on.
This is what I have tried to implement (with my limited knowledge of programming!):
def weighted_moving_average(x,y,step_size=0.05):#change the width to control average
    bin_centers = np.arange(np.min(x),np.max(x)-0.5*step_size,step_size)+0.5*step_size
    bin_avg = np.zeros(len(bin_centers))
    #We're going to weight with a Gaussian function
    def gaussian(x,amp=1,mean=0,sigma=1):
        return amp*np.exp(-(x-mean)**2/(2*sigma**2))
    if x.any() < 100:
        for index in range(0,len(bin_centers)):
            bin_center = bin_centers[index]
            weights = gaussian(x,mean=bin_center,sigma=0.6)
            bin_avg[index] = np.average(y,weights=weights)
    else:
        for index in range(0,len(bin_centers)):
            bin_center = bin_centers[index]
            weights = gaussian(x,mean=bin_center,sigma=0.1)
            bin_avg[index] = np.average(y,weights=weights)
    return (bin_centers,bin_avg)
Needless to say, this is not working! I am getting the plot with only the first value of sigma. Please help...
The following snippet should do more or less what you tried to do. The main problem in your code is a logical one: x.any() returns a single boolean, so x.any() < 100 is always True and the else branch never runs.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 1000)
y = np.sin(x**2)
def gaussian(x,amp=1,mean=0,sigma=1):
    return amp*np.exp(-(x-mean)**2/(2*sigma**2))
def weighted_average(x,y,step_size=0.3):
    weights = np.zeros_like(x)
    bin_centers = np.arange(np.min(x),np.max(x)-.5*step_size,step_size)+.5*step_size
    bin_avg = np.zeros_like(bin_centers)
    for i, center in enumerate(bin_centers):
        # Select the indices that should count to that bin
        idx = ((x >= center-.5*step_size) & (x <= center+.5*step_size))
        weights = gaussian(x[idx], mean=center, sigma=step_size)
        bin_avg[i] = np.average(y[idx], weights=weights)
    return (bin_centers,bin_avg)
idx = x <= 4
plt.plot(*weighted_average(x[idx],y[idx], step_size=0.6))
idx = x >= 3
plt.plot(*weighted_average(x[idx],y[idx], step_size=0.1))
plt.plot(x,y)
plt.legend(['0.6', '0.1', 'y'])
plt.show()
However, depending on the usage, you could also implement moving average directly:
x = np.linspace(0, 60, 1000)
y = np.sin(x**2)
z = np.zeros_like(x)
z[0] = y[0]  # start the average from the first data value
for i, t in enumerate(x[1:]):
    a=.2
    z[i+1] = a*y[i+1] + (1-a)*z[i]
plt.plot(x,y)
plt.plot(x,z)
plt.legend(['data', 'moving average'])
plt.show()
Of course, you could then change a adaptively, e.g. depending on the local variance. Also note that this has, a priori, a small bias that depends on a and on the step size in x.
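Another option, closer to the original request, is to choose the Gaussian width per bin center instead of testing the whole array at once. The sketch below is only one way to do that: sigma_for is a hypothetical helper, and its threshold of 100 and widths of 0.6 and 0.2 are taken from the question's description, not from any tested data.

```python
import numpy as np

def gaussian(x, mean=0.0, sigma=1.0):
    return np.exp(-(x - mean) ** 2 / (2 * sigma ** 2))

def sigma_for(center):
    """Hypothetical width schedule: wide below x = 100, narrower above."""
    return 0.6 if center < 100 else 0.2

def weighted_moving_average(x, y, step_size=0.05, sigma_for=sigma_for):
    """Gaussian-weighted average of y around each bin center.

    The width of the Gaussian is looked up per bin center, so different
    regions of x can be smoothed by different amounts.
    """
    bin_centers = np.arange(x.min(), x.max() - 0.5 * step_size, step_size) + 0.5 * step_size
    bin_avg = np.empty_like(bin_centers)
    for i, center in enumerate(bin_centers):
        weights = gaussian(x, mean=center, sigma=sigma_for(center))
        bin_avg[i] = np.average(y, weights=weights)
    return bin_centers, bin_avg

# Small demonstration on a short range, switching width at x = 5 instead of 100.
x = np.linspace(0, 10, 1001)
y = np.sin(x ** 2)
centers, avg = weighted_moving_average(
    x, y, step_size=0.5, sigma_for=lambda c: 0.6 if c < 5 else 0.2
)
print(centers.shape, avg.shape)
```

Every bin weights the full array, which is simple but slow on very large inputs; restricting each bin to points within a few sigma of its center would be a natural refinement.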

Matplotlib How to set the x axis to correspond to days of the week?

I have a graph which plots the scores across a week
At the minute I am using a simple incrementor to plot the x axis
percent = []
count = 0
time = []
for x, y in zip(counts100, counts):
    percent.append(float(x/y))
    time.append(count)
    count += 1
plt.xlabel('weekday')
plt.ylabel('% score over 100')
plt.plot( time, percent )
plt.gcf().autofmt_xdate()
plt.show()
This displays the data correctly, however the x axis only reads the numbers.
How can I convert this to weekdays? Thanks
EDIT:
I found this answer here and I am trying to fit it to my problem and it almost works but I keep getting weird errors
File "/usr/local/lib/python3.5/dist-packages/matplotlib/dates.py", line 401, in num2date
return _from_ordinalf(x, tz)
File "/usr/local/lib/python3.5/dist-packages/matplotlib/dates.py", line 254, in _from_ordinalf
dt = datetime.datetime.fromordinal(ix).replace(tzinfo=UTC)
ValueError: ordinal must be >= 1
I found the solution here. In my question's edit I was not changing the values to be plotted along the x axis.
percent = []
count = 0
time = []
for x, y, z in zip(counts100, counts, hour):
    percent.append(float(x/y))
    time.append(count)
    count += 1
    print("the percent is : {:.4f} for hour {}".format(float(x/y), z))
times = pd.date_range('2017-10-06 00:00', periods=count, freq='1H', tz = 'UTC')
fig, ax = plt.subplots(1)
fig.autofmt_xdate()
plt.xlabel('weekday')
plt.ylabel('% score over 100')
plt.plot( times, percent )
xfmt = mdates.DateFormatter('%d %H:%M')
ax.xaxis.set_major_formatter(xfmt)
plt.show()
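If the goal is weekday names rather than day-of-month numbers, one possible approach is to keep the simple integer x axis and compute where each day starts. The sketch below assumes one data point per hour beginning at the same start date used above; weekday_ticks and DAY_NAMES are made-up names for illustration.

```python
from datetime import datetime, timedelta

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def weekday_ticks(start, n_hours):
    """Tick positions (hour index) and weekday labels for each midnight.

    Assumes one data point per hour, starting at `start`.
    """
    positions, labels = [], []
    for i in range(n_hours):
        moment = start + timedelta(hours=i)
        if moment.hour == 0:
            positions.append(i)
            labels.append(DAY_NAMES[moment.weekday()])
    return positions, labels

positions, labels = weekday_ticks(datetime(2017, 10, 6), 72)
print(positions, labels)
```

The two lists can then be passed to plt.xticks(positions, labels) after plotting percent against the integer counter.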

matplotlib pcolormesh plot from x,y,z data

I have data in a textfile in tableform with three columns. I use np.genfromtxt to read all the columns into matplotlib as x, y, z.
I want to create a color mesh plot where x and y are the coordinates and z represents the color; I think people refer to such a plot as a heatmap.
My code is as follows:
x = np.genfromtxt('mesh.txt', dtype=float, delimiter=' ', usecols = (0))
y = np.genfromtxt('mesh.txt', dtype=float, delimiter=' ', usecols = (1))
z = np.genfromtxt('mesh.txt', dtype=float, delimiter=' ', usecols = (2))
xmesh, ymesh = np.meshgrid(x,y)
diagram1.pcolormesh(xmesh,ymesh,z)
But I get the following error message:
line 7154, in pcolormesh
C = ma.ravel(C[0:Ny-1, 0:Nx-1]) # data point in each cell is value at
IndexError: too many indices
The textfile is as follows:
1 1 5
2 1 4
3 1 2
4 1 6
1 2 6
2 2 2
3 2 1
4 2 9
1 3 7
2 3 4
3 3 3
4 3 5
1 4 3
2 4 4
3 4 7
4 4 6
How can I solve this?
In the example data provided above, x, y, and z can easily be reshaped into 2D arrays. The answer below is for someone who is looking for a more general answer with random x, y, and z arrays.
import matplotlib.pyplot as plt
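# Note: matplotlib.mlab.griddata was removed in Matplotlib 3.1;
# scipy.interpolate.griddata is a replacement with a different call signature.
from matplotlib.mlab import griddata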
import numpy
# use your x,y and z arrays here
x = numpy.random.randint(1,30, 50)
y = numpy.random.randint(1,30, 50)
z = numpy.random.randint(1,30, 50)
yy, xx = numpy.meshgrid(y,x)
zz = griddata(x,y,z,xx,yy, interp='linear')
plt.pcolor(zz)
#plt.contourf(xx,yy,zz) # if you want contour plot
#plt.imshow(zz)
plt.colorbar()
plt.show()
My guess is that x, y and z will be read as one-dimensional vectors of the same length, let's say N. The problem is that when you create your xmesh and ymesh, they are N x N, which your z values should be as well. It's only N, which is why you are getting an error.
What is the layout of your file? I'm guessing each row is a (x,y,z) that you want to create a mesh from. In order to do this, you need to know how the points are ordered as a mesh (either as row-major or column-major). Once you know this, instead of creating xmesh and ymesh, you can do something like this:
N = int(np.sqrt(len(x))) # Only if square, adjust accordingly
x = x.reshape((N, N))
y = y.reshape((N, N))
z = z.reshape((N, N))
pcolormesh(x, y, z)
Before doing this, I would start by doing this:
scatter(x, y, c=z)
which will give you the points of the mesh, which is a good starting point.
I had the same problem and agree with Gustav Larsson's suggestion to use
scatter(x, y, c=z)
In my particular case, I set the linewidths of the scatter points to zero:
scatter(x, y, c=z, linewidths=0)
Of course, you can play around with other decorations, color schemes, etc.; the matplotlib.pyplot.scatter reference will help you further.
It seems you are plotting X and Y as 2D arrays while Z is still a 1D array. Try something like:
Znew=np.reshape(z,(len(xmesh[:,0]),len(xmesh[0,:])))
diagram1.pcolormesh(xmesh,ymesh,Znew)
Update: You have an X/Y grid of size 4x4:
x = np.genfromtxt('mesh.txt', dtype=float, delimiter=' ', usecols = (0))
y = np.genfromtxt('mesh.txt', dtype=float, delimiter=' ', usecols = (1))
z = np.genfromtxt('mesh.txt', dtype=float, delimiter=' ', usecols = (2))
Reshape the arrays as suggested by @Gustav Larsson and myself like this:
Xnew=np.reshape(x,(4,4))
Ynew=np.reshape(y,(4,4))
Znew=np.reshape(z,(4,4))
Which gives you three 4x4 arrays to plot using pcolormesh:
diagram1.pcolormesh(Xnew,Ynew,Znew)
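As a quick check of the reshape approach on the sample data from the question, the sketch below parses the sixteen rows from an inline string (a stand-in for mesh.txt) and confirms that each reshaped array lines up with the grid. It relies on the file listing x fastest within each y, which holds for this sample but is an assumption for other files.

```python
import numpy as np

# The sixteen rows from the question, inlined instead of reading mesh.txt.
rows = """1 1 5
2 1 4
3 1 2
4 1 6
1 2 6
2 2 2
3 2 1
4 2 9
1 3 7
2 3 4
3 3 3
4 3 5
1 4 3
2 4 4
3 4 7
4 4 6"""

data = np.array([line.split() for line in rows.splitlines()], dtype=float)
x, y, z = data[:, 0], data[:, 1], data[:, 2]

# Side length of the square grid; reshape needs an int, not a float.
n = int(np.sqrt(len(x)))
X = x.reshape(n, n)
Y = y.reshape(n, n)
Z = z.reshape(n, n)

print(X[0])     # x values along the first row
print(Y[:, 0])  # y values down the first column
print(Z)
```

The three arrays have matching shapes and can then be passed to pcolormesh(X, Y, Z).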
