Weighted moving average in python with different width in different regions - python-3.x

I am trying to compute a moving average of highly oscillating data. The oscillations are not uniform: there are fewer of them in the initial region.
x = np.linspace(0, 1000, 1000001)
y = np.sin(x**2)  # stand-in for the real oscillating data
(The original data file is huge, so I can't upload it)
I want to take a weighted moving average of the function and plot it. Initially the period of the function is larger, so I want to average over a large interval there, while a smaller interval is enough later on.
I have found a possibly elegant solution in the following post:
Weighted moving average in python
However, I want a different width in different regions of x. Say, when x is between 0 and 100 I want width=0.6, when x is between 101 and 300 I want width=0.2, and so on.
This is what I have tried to implement (with my limited knowledge of programming!):
def weighted_moving_average(x,y,step_size=0.05):#change the width to control average
    bin_centers = np.arange(np.min(x),np.max(x)-0.5*step_size,step_size)+0.5*step_size
    bin_avg = np.zeros(len(bin_centers))
    #We're going to weight with a Gaussian function
    def gaussian(x,amp=1,mean=0,sigma=1):
        return amp*np.exp(-(x-mean)**2/(2*sigma**2))
    if x.any() < 100:
        for index in range(0,len(bin_centers)):
            bin_center = bin_centers[index]
            weights = gaussian(x,mean=bin_center,sigma=0.6)
            bin_avg[index] = np.average(y,weights=weights)
    else:
        for index in range(0,len(bin_centers)):
            bin_center = bin_centers[index]
            weights = gaussian(x,mean=bin_center,sigma=0.1)
            bin_avg[index] = np.average(y,weights=weights)
    return (bin_centers,bin_avg)
It is needless to say that this is not working! I am getting the plot with the first value of sigma. Please help...

The following snippet should do more or less what you tried to do. The main problem in your code is a logical one: x.any() returns a single boolean, so x.any() < 100 is always True and the else branch never runs.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 1000)
y = np.sin(x**2)
def gaussian(x,amp=1,mean=0,sigma=1):
    return amp*np.exp(-(x-mean)**2/(2*sigma**2))
def weighted_average(x,y,step_size=0.3):
    weights = np.zeros_like(x)
    bin_centers = np.arange(np.min(x),np.max(x)-.5*step_size,step_size)+.5*step_size
    bin_avg = np.zeros_like(bin_centers)
    for i, center in enumerate(bin_centers):
        # Select the indices that should count to that bin
        idx = ((x >= center-.5*step_size) & (x <= center+.5*step_size))
        weights = gaussian(x[idx], mean=center, sigma=step_size)
        bin_avg[i] = np.average(y[idx], weights=weights)
    return (bin_centers,bin_avg)
idx = x <= 4
plt.plot(*weighted_average(x[idx],y[idx], step_size=0.6))
idx = x >= 3
plt.plot(*weighted_average(x[idx],y[idx], step_size=0.1))
plt.plot(x,y)
plt.legend(['0.6', '0.1', 'y'])
plt.show()
However, depending on the usage, you could also implement moving average directly:
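x = np.linspace(0, 60, 1000)
y = np.sin(x**2)
z = np.zeros_like(x)
z[0] = y[0]  # seed the average with the first data value, not with x
a = .2  # smoothing factor: larger values follow the data more closely
for i, t in enumerate(x[1:]):
    z[i+1] = a*y[i+1] + (1-a)*z[i]
plt.plot(x,y)
plt.plot(x,z)
plt.legend(['data', 'moving average'])
plt.show()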
Of course, you could then change a adaptively, e.g. depending on the local variance. Also note that this has a priori a small bias that depends on a and the step size in x.
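One way to get a region-dependent width is to pick sigma per bin center instead of branching on the whole array. The sketch below is only an illustration under assumptions: sigma_for is a hypothetical helper, and its threshold (4) and widths (0.6 and 0.1) are example values chosen to match the plot above, not values tuned for the original data.

```python
import numpy as np

def gaussian(x, mean=0.0, sigma=1.0):
    # Unnormalised Gaussian weight; np.average normalises the weights itself.
    return np.exp(-(x - mean) ** 2 / (2 * sigma ** 2))

def sigma_for(center):
    # Hypothetical width rule: wide averaging early on, narrow later.
    return 0.6 if center < 4 else 0.1

def variable_width_average(x, y, step_size, sigma_of):
    # Same bin layout as in the snippets above.
    bin_centers = np.arange(np.min(x), np.max(x) - 0.5 * step_size, step_size) + 0.5 * step_size
    bin_avg = np.empty_like(bin_centers)
    for i, center in enumerate(bin_centers):
        # The width is looked up for each bin, so it can change along x.
        weights = gaussian(x, mean=center, sigma=sigma_of(center))
        bin_avg[i] = np.average(y, weights=weights)
    return bin_centers, bin_avg

x = np.linspace(0, 10, 1001)
y = np.sin(x ** 2)
centers, avg = variable_width_average(x, y, step_size=0.5, sigma_of=sigma_for)
```

Because the width comes from a function of the bin center, further regions (for example 0.2 between 101 and 300) only require more branches in sigma_for.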

Related

How to insert Gaussian/Normal distribution in multiple subplots?

So I have multiple plots, using subplot and I would like to add the Gaussian distribution on it. I have done it, in a for loop for each plot separately, but I am not sure how to do it using subplots. At the moment it does not show anything on the subplots.
def index_of(arrval, value):
    if value < min(arrval):
        return 0
    return max(np.where(arrval <= value)[0])
# load file using loadtxt
for file in filename:
    data = np.loadtxt(file,delimiter='\t', skiprows=2)
    for x,y in data:
        x = data[:,0]
        y = data[:,1]
    xs.append(x)
    ys.append(y)
# Make the subplots
for i, (x, y) in enumerate(zip(xs, ys)):
    ij = np.unravel_index(i, axs.shape)
    axs[ij].plot(x, y,label = lsnames[i])
    axs[ij].set_title(lsnames[i])
    axs[ij].legend()
# Using one of the lmfit functions to get the Gaussian plot.
# But it does not show anything
gauss1 = GaussianModel(prefix='g1_')
gauss2 = GaussianModel(prefix='g2_')
pars = gauss1.guess(y, x=x)
pars.update(gauss2.make_params())
ix1 = index_of(x, 20)
ix2 = index_of(x, 40)
ix3 = index_of(x, 75)
gauss1.guess(y[ix1:ix2], x=x[ix1:ix2])
gauss2.guess(y[ix2:ix3], x=x[ix2:ix3])
mod = gauss1 + gauss2
mod = GaussianModel()
pars = mod.guess(y, x=x)
out = mod.fit(y, pars, x=x)
print(out.fit_report(min_correl=0.25))
plt.show()
Maybe I'm not fully understanding, but this seems like it could just be a looping question or even an indentation problem.
I think what you're asking to do is something like:
# loop over datasets, putting each in a subplot
for i, (x, y) in enumerate(zip(xs, ys)):
    ij = np.unravel_index(i, axs.shape)
    axs[ij].plot(x, y,label = lsnames[i])
    axs[ij].set_title(lsnames[i])
    axs[ij].legend()
    # fit this dataset with 1 gaussian
    mod = GaussianModel()
    pars = mod.guess(y, x=x)
    out = mod.fit(y, pars, x=x)
    # plot best-fit
    axs[ij].plot(x, out.best_fit, label='fit')
    print("Data Set %d" % i)
    print(out.fit_report(min_correl=0.25))
plt.show()
Your code was sort of confusingly making a model with two Gaussians and then not using it. It would be fine to use a more complicated model in the loop.
Hope that helps.
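The loop above relies on np.unravel_index to turn the running dataset index i into a (row, column) position in the axs grid, which is why the fit has to happen inside the loop, where ij is known. A small sketch, assuming a hypothetical 2x3 grid in place of axs.shape:

```python
import numpy as np

grid_shape = (2, 3)  # assumed stand-in for axs.shape

# Map each flat dataset index to its (row, column) subplot position.
positions = []
for i in range(6):
    row, col = np.unravel_index(i, grid_shape)
    positions.append((int(row), int(col)))

print(positions)
```

The indices run row by row, so the fourth dataset (i = 3) lands in the first column of the second row.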

How to draw from two different gaussians conditionally on some Bernoulli distribution?

I have two gaussian distributions (I'm using multivariate_normal) and I'd like to draw from them with probability of p for the first gaussian and 1-p for the other one. I'd like to make n draws.
Is it possible to do that without a for loop? (for efficiency purposes)
Thanks
Yes, it is possible to perform this operation without a loop. Try:
import numpy as np
from scipy import stats
sample_size = 100
p = 0.25
# Flip a coin with P(HEADS) = p to determine which distribution to draw from
indicators = stats.bernoulli.rvs(p, size=sample_size)
# Draw from N(0, 1) w/ probability p and N(-1, 1) w/ probability (1-p)
draws = (indicators == 1) * np.random.normal(0, 1, size=sample_size) + \
        (indicators == 0) * np.random.normal(-1, 1, size=sample_size)
You can accomplish the same thing using np.vectorize (caveat emptor):
def draw(x):
    if x == 0:
        return np.random.normal(-1, 1)
    elif x == 1:
        return np.random.normal(0, 1)
draw_vec = np.vectorize(draw)
draws = draw_vec(indicators)
If you need to extend the solution to a mixture of more than 2 distributions, you can use np.random.multinomial to assign samples to distributions and add additional cases to the if/else in draw.
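For more than two components, one loop-free possibility is to draw a component label per sample and then index the parameter arrays with those labels. This is a sketch for the univariate case only; it uses Generator.choice rather than np.random.multinomial, and the weights, means and standard deviations are made-up example values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Example mixture parameters (assumptions, not from the question).
weights = np.array([0.25, 0.5, 0.25])  # must sum to 1
means = np.array([-1.0, 0.0, 2.0])
sds = np.array([1.0, 1.0, 0.5])

# Pick a component index for every sample according to the weights.
labels = rng.choice(len(weights), size=n, p=weights)

# Indexing broadcasts the per-component parameters to one entry per sample,
# so a single vectorised call draws from the right Gaussian each time.
draws = rng.normal(loc=means[labels], scale=sds[labels])
```

Unlike the multiplication trick above, this draws exactly n values instead of one full sample per component.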

simpson integration on python

I am trying to integrate f(x) = 2x from 0 to 1 numerically using Simpson's rule, but I keep getting a large error. The desired output is 1, but the output from Python is 1.334. Can someone help me find a solution to this problem?
thank you.
import numpy as np
def f(x):
    return 2*x
def simpson(f,a,b,n):
    x = np.linspace(a,b,n)
    dx = (b-a)/n
    for i in np.arange(1,n):
        if i % 2 != 0:
            y = 4*f(x)
        elif i % 2 == 0:
            y = 2*f(x)
    return (f(a)+sum(y)+f(x)[-1])*dx/3
a = 0
b = 1
n = 1000
ans = simpson(f,a,b,n)
print(ans)
Almost everything is wrong here. x is an array, so every time you call f(x) you evaluate the function over the whole array. The loop overwrites y on each pass instead of accumulating anything; since n is even and the last index n-1 is odd, the final y is 4*f(x), and the result is computed from its sum: (0 + 4000 + 2)*0.001/3 ≈ 1.334.
Also, n is the number of segments, so the number of points is n+1. A correct implementation is
def simpson(f,a,b,n):
    x = np.linspace(a,b,n+1)
    y = f(x)
    dx = x[1]-x[0]
    return (y[0]+4*sum(y[1::2])+2*sum(y[2:-1:2])+y[-1])*dx/3
simpson(lambda x:2*x, 0, 1, 1000)
which then correctly returns 1.000. You might want to add a test if n is even, and increase it by one if that is not the case.
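The even-n check mentioned above could look like the following sketch. Bumping an odd n up by one is just one possible policy (raising an error would be another), and the name simpson_even is made up for this example.

```python
import numpy as np

def simpson_even(f, a, b, n):
    # Composite Simpson's rule needs an even number of segments.
    if n % 2:
        n += 1
    x = np.linspace(a, b, n + 1)  # n segments -> n + 1 points
    y = f(x)
    dx = (b - a) / n
    # Odd interior points get weight 4, even interior points weight 2.
    return (y[0] + 4 * y[1:-1:2].sum() + 2 * y[2:-1:2].sum() + y[-1]) * dx / 3

result_linear = simpson_even(lambda x: 2 * x, 0, 1, 1000)
result_cubic = simpson_even(lambda x: x ** 3, 0, 2, 7)  # odd n is bumped to 8
```

Simpson's rule is exact for polynomials up to degree three, so the cubic case makes a convenient sanity check: the integral of x**3 from 0 to 2 is 4.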
If you really want to keep the loop, you need to actually accumulate the sum inside the loop.
def simpson(f,a,b,n):
    dx = (b-a)/n
    res = 0
    for i in range(1,n):
        res += f(a+i*dx)*(2 if i%2==0 else 4)
    return (f(a)+f(b) + res)*dx/3
simpson(lambda x:2*x, 0, 1, 1000)
But loops are generally slower than vectorized operations, so if you use numpy, use vectorized operations. Or just use scipy.integrate.simpson directly (named simps in older SciPy releases).

ValueError: operands could not be broadcast together with shapes (3,) (0,)

My aim is to make image1 move along the ring from its current position up to 180 degrees. I have tried different things but nothing seems to work. My final aim is to move both images along the ring in different directions and finally merge them and make them disappear. I keep getting the error above. Can you please help? Also, can you tell me how I can go about this problem?
from visual import *
import numpy as np
x = 3
y = 0
z = 0
i = pi/3
c = 0.120239 # A.U/minute
r = 1
for theta in arange(0, 2*pi, 0.1): #range of theta values; 0 to
    xunit = r * sin(theta)*cos(i) +x
    yunit = r * sin(theta)*sin(i) +y
    zunit = r*cos(theta) +z
ring = curve( color = color.white ) #creates a curve
for theta in arange(0, 2*pi, 0.01):
    ring.append( pos=(sin(theta)*cos(i) +x,sin(theta)*sin(i) +y,cos(theta) +z) )
image1=sphere(pos=(2.5,-0.866,0),radius=0.02, color=color.yellow)
image2=sphere(pos=(2.5,-0.866,0),radius=0.02, color=color.yellow)
earth=sphere(pos=(-3,0,-0.4),color=color.yellow, radius =0.3,material=materials.earth) #creates the observer
d_c_p = pow((x-xunit)**2 + (y-yunit)**2 + (z-zunit)**2,0.5) #calculates the distance between the center and points on ring
d_n_p = abs(yunit + 0.4998112152755791) #calculates the distance to the nearest point
t1 = ( d_c_p+d_n_p)/c
t0=d_c_p/c
t=t1-t0 #calculates the time it takes from one point to another
theta = []
t = []
dtheta = np.diff(theta) #calculates the difference in theta
dt = np.diff(t) #calculates the difference in t
speed = r*dtheta/dt #hence this calculates the speed
deltat = 0.005
t2=0
while True:
    rate(5)
    image2.pos = image2.pos + speed*deltat #increments the position of the image1
    t2 = t2 + deltat
Your problem is that image2.pos is a vector (that's the "3" in the error message) but speed*deltat is not: theta and t are reset to empty lists just before np.diff is called, so speed is an empty array (that's the "0" in the error message). You can't add a 3-component vector and an empty array. Instead of a scalar "speed" you need a vector velocity. There seem to be some indentation errors in the program you posted, so it is possible I've misinterpreted what you're trying to do.
For VPython questions it's better to post to the VPython forum, where there are many more VPython users who will see your question than if you post to stackoverflow:
https://groups.google.com/forum/?fromgroups&hl=en#!forum/vpython-users
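The shape mismatch can be reproduced with plain NumPy, without VPython. The sketch below first triggers the same error, then shows one possible way around it: advance the angle and recompute the point on the ring, so the position is always a 3-vector. The angular speed omega and the starting angle are assumed values for illustration, and ring_point is a hypothetical helper that reuses the ring parametrisation from the question.

```python
import numpy as np

# Reproduce the error: a 3-vector plus an empty array cannot be broadcast.
pos = np.array([2.5, -0.866, 0.0])
empty_speed = np.diff([])  # np.diff of an empty list has shape (0,)
try:
    pos + empty_speed * 0.005
    error_message = None
except ValueError as err:
    error_message = str(err)

# Ring from the question: centre (3, 0, 0), radius 1, inclination pi/3.
center = np.array([3.0, 0.0, 0.0])
r = 1.0
inc = np.pi / 3

def ring_point(theta):
    # Point on the ring at angle theta, as a 3-component vector.
    direction = np.array([np.sin(theta) * np.cos(inc),
                          np.sin(theta) * np.sin(inc),
                          np.cos(theta)])
    return center + r * direction

omega = 0.5     # assumed angular speed (radians per unit time)
deltat = 0.005
theta = np.pi   # assumed starting angle
path = []
for _ in range(100):
    theta += omega * deltat         # advance the angle
    path.append(ring_point(theta))  # the new position stays on the ring
path = np.array(path)
```

In the VPython loop, the point computed this way would be assigned to image2.pos in place of the image2.pos + speed*deltat update.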

Colormapping the Mandelbrot set by iterations in python

I am using np.ogrid to create the x and y grid from which I am drawing my values. I have tried a number of different ways to color the scheme according to the iterations required for |z| >= 2 but nothing seems to work. Even when iterating 10,000 times just to be sure that I have a clear picture when zooming, I cannot figure out how to color the set according to iteration ranges. Here is the code I am using, some of the structure was borrowed from a tutorial. Any suggestions?
#I found this function and searched in numpy for best usage for this type of density plot
x_val, y_val = np.ogrid[-2:2:2000j, -2:2:2000j]
#Creating the values to work with during the iterations
c = x_val + 1j*y_val
z = 0
iter_num = int(input("Please enter the number of iterations:"))
for n in range(iter_num):
    z = z**2 + c
    if n%10 == 0:
        print("Iterations left: ",iter_num - n)
#Creates the mask to filter out values of |z| > 2
z_mask = abs(z) < 2
proper_z_mask = z_mask - 255 #switches current black/white pallette
#Creating the figure and sizing for optimal viewing on a small laptop screen
plt.figure(1, figsize=(8,8))
plt.imshow(z_mask.T, extent=[-2, 2, -2, 2])
plt.gray()
plt.show()
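One possible approach to coloring by iteration, sketched under assumptions: instead of only testing |z| < 2 after the last iteration, record for each grid point the iteration at which |z| first reaches 2. The function name escape_counts and the smaller 400x400 grid are choices made for this example, not part of the code above.

```python
import numpy as np

def escape_counts(c, max_iter):
    # counts holds the first iteration at which |z| >= 2;
    # points that never escape keep the value max_iter.
    z = np.zeros_like(c)
    counts = np.full(c.shape, max_iter, dtype=int)
    alive = np.ones(c.shape, dtype=bool)
    for n in range(max_iter):
        # Only iterate points that have not escaped yet (avoids overflow).
        z[alive] = z[alive] ** 2 + c[alive]
        escaped = alive & (np.abs(z) >= 2)
        counts[escaped] = n
        alive &= ~escaped
    return counts

x_val, y_val = np.ogrid[-2:2:400j, -2:2:400j]
c = x_val + 1j * y_val
counts = escape_counts(c, max_iter=50)
```

Passing counts.T to plt.imshow with the same extent as above and a non-gray colormap then colors each point by its escape iteration; points inside the set all share the value max_iter.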
