Compute Errors on Fit Parameters In Scipy Curve fit - python-3.x

Using "scipy.optimize.curve_fit" we can determine the fit parameters for a curve fit on x and y using
popt, pcov = curve_fit(func, xdata, ydata)
In the documentation for this function, they state that: To compute one standard deviation errors on the parameters use
perr = np.sqrt(np.diag(pcov))
Here's a link to the documentation I was reading. https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html
What if I want to compute something more general than one standard deviation on the parameters? In particular, what if I'm looking for, say, 2 standard deviations (roughly a 95% confidence interval on the parameters)?
To be clear, I'm not looking for a 10 line+ solution. I already know how to compute these errors in a "hackish" way for a linear function:
import numpy as np
from scipy import stats
def get_slope_params(data1, data2):
    x_mean = np.mean(data1)
    y_mean = np.mean(data2)
    N = len(data1)
    sum_xy = 0
    for (x, y) in zip(data1, data2):
        sum_xy = sum_xy + x*y
    sum_xsq = 0
    for x in data1:
        sum_xsq = sum_xsq + x*x
    b = (sum_xy-N*x_mean*y_mean)/(sum_xsq-N*x_mean**2)
    a = y_mean - b*x_mean
    return (a,b)
# 95%
def get_slope_params_uncertainties(data1, data2):
    N = len(data1)
    a, b = get_slope_params(data1, data2)
    y_approx = a+b*np.asarray(data1)
    s_eps = 0
    for (y, y_app) in zip(data2, y_approx):
        s_eps = s_eps + (y-y_app)**2
    s_eps = np.sqrt(s_eps/(N-2))
    s_x = np.std(data1, ddof=1)  # sample standard deviation of x
    delta_b = (1/np.sqrt(N-1))*(s_eps/s_x)*stats.t.ppf(1-0.05/2, N-2)
    delta_a = np.mean(data1)*delta_b
    return delta_a, delta_b
What I'd like is a function already implemented entirely by scipy.
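A short sketch of one common way to widen the interval: scale the one standard deviation errors from pcov by a two-sided Student-t quantile. The model line, the helper confidence_half_widths and the synthetic data are hypothetical names made up for this illustration, and the approach assumes roughly Gaussian errors; it is not a single built-in scipy call.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import t


def line(x, a, b):
    """Hypothetical model, used only for this illustration."""
    return a + b * x


def confidence_half_widths(pcov, n_points, level=0.95):
    """Half-width of the two-sided confidence interval for each parameter."""
    perr = np.sqrt(np.diag(pcov))          # one standard deviation errors
    dof = n_points - len(perr)             # degrees of freedom of the fit
    tval = t.ppf(0.5 + level / 2.0, dof)   # two-sided Student-t multiplier
    return tval * perr


# Synthetic data (assumption): a straight line plus Gaussian noise.
rng = np.random.default_rng(0)
xdata = np.linspace(0.0, 10.0, 50)
ydata = line(xdata, 1.0, 2.0) + rng.normal(scale=0.5, size=xdata.size)

popt, pcov = curve_fit(line, xdata, ydata)
delta = confidence_half_widths(pcov, len(xdata))
for name, value, half in zip("ab", popt, delta):
    print(f"{name} = {value:.3f} +/- {half:.3f} (95%)")
```

For many data points the multiplier approaches 1.96, which is where the "2 standard deviations" rule of thumb comes from.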

Related

Numpy Vectorization for Nested 'for' loop

I was trying to write a program which plots level set for any given function.
rmin = -5.0
rmax = 5.0
c = 4.0
x = np.arange(rmin,rmax,0.1)
y = np.arange(rmin,rmax,0.1)
x,y = np.meshgrid(x,y)
f = lambda x,y: y**2.0 - 4*x
realplots = []
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        if abs(f(x[i,j],y[i,j])-c)< 1e-4:
            realplots.append([x[i,j],y[i,j]])
But since it is a nested for loop, it takes a lot of time. Any help in vectorizing the above code, or a new method of plotting the level set, is highly appreciated. (Note: the function 'f' will be changed at run time, so the vectorization must be done without relying on the function's properties.)
I tried vectorizing through
ans = np.where(abs(f(x,y)-c)<1e-4,np.array([x,y]),[0,0])
but it gave me the error: operands could not be broadcast together with shapes (100,100) (2,100,100) (2,)
I added [0,0] as the else value in np.where, which is indeed wrong.
Since you get the values rather than the indexes, you don't really need np.where.
You can directly use the mask to index x and y, look at the "Boolean array indexing" section of the documentation.
It is straightforward:
def vectorized(x, y, c, f, threshold):
    mask = np.abs(f(x, y) - c) < threshold
    x, y = x[mask], y[mask]
    return np.stack([x, y], axis=-1)
Your function for reference:
def op(x, y, c, f, threshold):
    res = []
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            if abs(f(x[i, j], y[i, j]) - c) < threshold:
                res.append([x[i, j], y[i, j]])
    return res
Tests:
rmin, rmax = -5.0, +5.0
c = 4.0
threshold = 1e-4
x = np.arange(rmin, rmax, 0.1)
y = np.arange(rmin, rmax, 0.1)
x, y = np.meshgrid(x, y)
f = lambda x, y: y**2 - 4 * x
res_op = op(x, y, c, f, threshold)
res_vec = vectorized(x, y, c, f, threshold)
assert np.allclose(res_op, res_vec)

Numpy.linalg.eig is giving different results than numpy.linalg.eigh for Hermitian matrices

I have a Hermitian matrix (specifically, a Hamiltonian). Though the phase of a single eigenvector can be arbitrary, the quantities I am calculating are physical (I reduced the code a bit, keeping just the reproducible part). eig and eigh give very different results.
import numpy as np
import numpy.linalg as nlg
import matplotlib.pyplot as plt
def Ham(Ny, Nx, t, phi):
    h = np.zeros((Ny,Ny), dtype=complex)
    for ii in range(Ny-1):
        h[ii+1,ii] = t
    h[Ny-1,0] = t
    h=h+np.transpose(np.conj(h))
    u = np.zeros((Ny,Ny), dtype=complex)
    for ii in range(Ny):
        u[ii,ii] = -t*np.exp(-2*np.pi*1j*phi*ii)
    u = u + 1e-10*np.eye(Ny)
    H = np.kron(np.eye(Nx,dtype=int),h) + np.kron(np.diag(np.ones(Nx-1), 1),u) + np.kron(np.diag(np.ones(Nx-1), -1),np.transpose(np.conj(u)))
    H[0:Ny,Ny*(Nx-1):Ny*Nx] = np.transpose(np.conj(u))
    H[Ny*(Nx-1):Ny*Nx,0:Ny] = u
    x=[]; y=[]
    for jj in range (1,Nx+1):
        for ii in range (1,Ny+1):
            x.append(jj); y.append(ii)
    x = np.asarray(x)
    y = np.asarray(y)
    return H, x, y
def C_num(Nx, Ny, E, t, phi):
    H, x, y = Ham(Ny, Nx, t, phi)
    ifhermitian = np.allclose(H, np.transpose(np.conj(H)), rtol=1e-5, atol=1e-8)
    assert ifhermitian == True
    Hp = H
    V,wf = nlg.eigh(Hp) ##Check. eig gives different result
    idx = np.argsort(np.real(V))
    wf = wf[:, idx]
    normmat = wf*np.conj(wf)
    norm = np.sqrt(np.sum(normmat, axis=0))
    wf = wf/(norm*np.sqrt(len(H)))
    wf = wf[:, V<=E] ##Chose a subset of eigenvectors
    V01 = wf*np.exp(1j*x)[:,None]; V12 = wf*np.exp(1j*y)[:,None]
    V23 = wf*np.exp(1j*x)[:,None]; V30 = wf*np.exp(1j*y)[:,None]
    wff = np.transpose(np.conj(wf))
    C01 = np.dot(wff,V01); C12 = np.dot(wff,V12); C23 = np.dot(wff,V23); C30 = np.dot(wff,V30)
    F = nlg.multi_dot([C01,C12,C23,C30])
    ifhermitian = np.allclose(F, np.transpose(np.conj(F)), rtol=1e-5, atol=1e-8)
    assert ifhermitian == True
    evals, efuns = nlg.eig(F) ##Check eig gives different result
    C = (1/(2*np.pi))*np.sum(np.angle(evals))
    return C
C = C_num(16, 16, 0, 1, 1/8)
print(C)
Changing both nlg.eigh calls to nlg.eig, or even changing only the last one, gives very different results.
As I mentioned elsewhere, the eigenvalue decomposition is not unique.
The only thing that is guaranteed is that each eigenpair satisfies $A v = \lambda v$; the two matrices returned by eig and eigh both describe those solutions, and it is natural that eig returns inexact but approximate results.
You can see that both solutions diagonalize your matrix, in different ways:
H, x, y = Ham(16, 16, 1, 1./8)
D, V = nlg.eig(H)
Dh, Vh = nlg.eigh(H)
Then
import matplotlib.pyplot as plt
plt.figure(figsize=(14, 7))
plt.subplot(121)
plt.imshow(abs(np.conj(Vh.T) @ H @ Vh))
plt.title('diagonalized with eigh')
plt.subplot(122)
plt.imshow(abs(np.conj(V.T) @ H @ V))
plt.title('diagonalized with eig')
This plots the two transformed matrices side by side.
Both diagonalizations were successful, but the eigenvalues are in a different order.
If you sort the eigenvalues you see that they match:
plt.plot(np.diag(np.real(np.conj(Vh.T) @ H @ Vh)))
plt.plot(np.diag(np.imag(np.conj(Vh.T) @ H @ Vh)))
plt.plot(np.sort(np.diag(np.real(np.conj(V.T) @ H @ V))))
plt.title('eigenvalues')
plt.legend(['real eigh', 'imag eigh', 'sorted real eig'], loc='upper left')
Since many eigenvalues are repeated, the eigenvector associated with a given eigenvalue is not unique either. The only thing we can guarantee is that the eigenvectors for a given eigenvalue span the same subspace.
The diagonalization test is the best in my opinion.
Is eigh always better than eig?
If you search for eigenvalue routines in LAPACK you will find many options, so I cannot discuss each possible implementation here. Common sense says that we can expect the symmetric/Hermitian routines to perform better; otherwise there would be no reason to add one more routine that is more limited. But I have never carefully tested the behavior of eig vs eigh.
To get an intuition, compare the equation for tridiagonalization of symmetric matrices with the equation for the reduction of a general matrix to its Hessenberg form, found here.
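The comparison described above can be sketched on a small random Hermitian matrix (an assumption made for illustration, not the Hamiltonian from the question). Because this matrix has no repeated eigenvalues, each eigenvector is fixed up to a phase, so the check uses the absolute overlap between matching eigenvectors rather than comparing entries directly.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
H = A + A.conj().T                   # Hermitian by construction

w_eig, v_eig = np.linalg.eig(H)      # general solver, no ordering guarantee
w_eigh, v_eigh = np.linalg.eigh(H)   # Hermitian solver, ascending real eigenvalues

# Sort the eig output so both results use the same ordering.
order = np.argsort(w_eig.real)
w_sorted = w_eig[order]
v_sorted = v_eig[:, order]

same_values = np.allclose(w_sorted.real, w_eigh)
imag_small = np.allclose(w_sorted.imag, 0.0, atol=1e-10)

# |<v_i, vh_i>| is 1 when the two vectors differ only by a phase factor.
overlaps = np.abs(np.sum(np.conj(v_sorted) * v_eigh, axis=0))

print(same_values, imag_small, np.allclose(overlaps, 1.0))
```

With repeated eigenvalues, as in the question, this per-vector overlap test no longer applies, and only the subspace spanned by each group of eigenvectors can be compared.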

Better way to solve simultaneous linear equations programmatically in Python

I have the following code that solves simultaneous linear equations by starting with the first equation and finding y when x=0, then putting that y into the second equation and finding x, then putting that x back into the first equation etc...
Obviously, this has the potential to reach infinity, so if it reaches +-inf then it swaps the order of the equations so the spiral/ladder goes the other way.
This seems to work, tho I'm not such a good mathematician that I can prove it will always work beyond a hunch, and of course some lines never meet (I know how to use matrices and linear algebra to check straight off whether they will never meet, but I'm not so interested in that atm).
Is there a better way to 'spiral' in on the answer? I'm not interested in using math functions or numpy for the whole solution - I want to be able to code the solution. I don't mind using libraries to improve the performance, for instance using some sort of statistical method.
This may be a very naive question from either a coding or maths point of view, but if so I'd like to know why!
My code is as follows:
# A python program to solve 2d simultaneous equations
# by iterating over coefficients in spirals
import numpy as np
def Input(coeff_or_constant, var, lower, upper):
    val = int(input("Let the {} {} be a number between {} and {}: ".format(coeff_or_constant, var, lower, upper)))
    if val >= lower and val <= upper :
        return val
    else:
        print("Invalid input")
        exit(0)
def Equation(equation_array):
    a = Input("coefficient", "a", 0, 10)
    b = Input("coefficient", "b", 0, 10)
    c = Input("constant", "c", 0, 10)
    equation_list = [a, b, c]
    equation_array.append(equation_list)
    return equation_array
def Stringify_Equations(equation_array):
    A = str(equation_array[0][0])
    B = str(equation_array[0][1])
    C = str(equation_array[0][2])
    D = str(equation_array[1][0])
    E = str(equation_array[1][1])
    F = str(equation_array[1][2])
    eq1 = str(A + "y = " + B + "x + " + C)
    eq2 = str(D + "y = " + E + "x + " + F)
    print(eq1)
    print(eq2)
def Spiral(equation_array):
    a = equation_array[0][0]
    b = equation_array[0][1]
    c = equation_array[0][2]
    d = equation_array[1][0]
    e = equation_array[1][1]
    f = equation_array[1][2]
    # start at y when x = 0
    x = 0
    infinity_flag = False
    count = 0
    coords = []
    coords.append([0, 0])
    coords.append([1, 1])
    # solve equation 2 for x when y = START
    while not (coords[0][0] == coords[1][0]):
        try:
            y = ( ( b * x ) + c ) / a
        except:
            y = 0
        print(y)
        try:
            x = ( ( d * y ) - f ) / e
        except:
            x = 0
        if x >= 100000 or x <= -100000:
            count = count + 1
            if count >= 100000:
                print("It\'s looking like these linear equations don\'t intersect!")
                break
        print(x)
        new_coords = [x, y]
        coords.append(new_coords)
        coords.pop(0)
        if not ((x == float("inf") or x == float("-inf")) and (y == float("inf") or y == float("-inf"))):
            pass
        else:
            infinity_flag if False else True
            if infinity_flag == False:
                # if the spiral is divergent this switches the equations around so it converges
                # the infinity_flag is to check if both spirals returned infinity meaning the lines do not intersect
                # I think this would mostly work for linear equations, but for other kinds of equations it might not
                x = 0
                a = equation_array[1][0]
                b = equation_array[1][1]
                c = equation_array[1][2]
                d = equation_array[0][0]
                e = equation_array[0][1]
                f = equation_array[0][2]
                infinity_flag = False
            else:
                print("These linear equations do not intersect")
                break
    y = round(y, 3)
    x = round(x, 3)
    print(x, y)
equation_array = []
print("Specify coefficients a and b, and a constant c for equation 1")
equations = Equation(equation_array)
print("Specify coefficients a and b, and a constant c for equation 2")
equations = Equation(equation_array)
print(equation_array)
Stringify_Equations(equation_array)
Spiral(equation_array)
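For comparison with the spiral, here is a minimal pure-Python sketch of the direct solution, assuming the equations have the form a*y = b*x + c and d*y = e*x + f as printed by Stringify_Equations. The function name solve_pair is hypothetical. Substituting one step of the spiral into the next gives x_new = (b*d)/(a*e) * x_old + constant, so the spiral contracts only when |b*d| < |a*e|; swapping the equations inverts that ratio, which is consistent with the observation that one of the two orders usually converges.

```python
def solve_pair(a, b, c, d, e, f):
    """Solve a*y = b*x + c and d*y = e*x + f directly.

    Returns (x, y), or None when the lines are parallel or identical.
    """
    # Rewrite as -b*x + a*y = c and -e*x + d*y = f, then apply Cramer's rule.
    det = a * e - b * d
    if det == 0:
        return None
    x = (c * d - a * f) / det
    y = (c * e - b * f) / det
    return x, y


# y = 2x + 1 and y = x + 3 intersect at x = 2, y = 5.
print(solve_pair(1, 2, 1, 1, 1, 3))
# y = 2x + 1 and 2y = 4x + 5 are parallel.
print(solve_pair(1, 2, 1, 2, 4, 5))
```

The determinant check also replaces the large-count test for non-intersecting lines, since det == 0 is exactly the parallel case.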

Not able to print the list, how to rectify the errors?

This program is to find the normalization of a vector but I am not able to print the list:
Def function:
def _unit_vector_sample_(vector):
    # calculate the magnitude
    x = vector[0]
    y = vector[1]
    z = vector[2]
    mag = ((x**2) + (y**2) + (z**2))**(1/2)
    # normalize the vector by dividing each component with the magnitude
    new_x = x/mag
    new_y = y/mag
    new_z = z/mag
    unit_vector = [new_x, new_y, new_z]
    #return unit_vector
Main program:
vector=[2,3,-4]
def _unit_vector_sample_(vector):
print(unit_vector)
How can I rectify the errors?
Try this:
def _unit_vector_sample_(vector):
    # calculate the magnitude
    x = vector[0]
    y = vector[1]
    z = vector[2]
    mag = ((x**2) + (y**2) + (z**2))**(1/2)
    # normalize the vector by dividing each component with the magnitude
    new_x = x/mag
    new_y = y/mag
    new_z = z/mag
    unit_vector = [new_x, new_y, new_z]
    return unit_vector
vector=[2,3,-4]
print(_unit_vector_sample_(vector))
prints this output:
[0.3713906763541037, 0.5570860145311556, -0.7427813527082074]
You need to add a return statement to your _unit_vector_sample_ function. Otherwise the function will run, but it cannot give its result back to the main program.
Alternatively you can do this:
def _unit_vector_sample_(vector):
    # calculate the magnitude
    x = vector[0]
    y = vector[1]
    z = vector[2]
    mag = ((x**2) + (y**2) + (z**2))**(1/2)
    # normalize the vector by dividing each component with the magnitude
    new_x = x/mag
    new_y = y/mag
    new_z = z/mag
    unit_vector = [new_x, new_y, new_z]
    print(unit_vector)
vector=[2,3,-4]
_unit_vector_sample_(vector)
resulting in the same output being printed:
[0.3713906763541037, 0.5570860145311556, -0.7427813527082074]
Here, by calling print inside the function, unit_vector gets printed every time the function is run.
Which one to use depends on what you want to do.
If you also want to assign the outcome of the function to a variable in the main program, use the first solution (and assign the outcome to a variable instead of printing it directly). If this is not required, you can use the second option.
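As a small sketch of the first option, the result can be returned, stored in a variable, and then reused or printed. The function name unit_vector and the zero-vector check are additions made for this illustration; the version here also works for vectors of any length.

```python
import math


def unit_vector(vector):
    """Return the vector scaled to length 1 (hypothetical helper)."""
    mag = math.sqrt(sum(component ** 2 for component in vector))
    if mag == 0:
        raise ValueError("the zero vector cannot be normalized")
    return [component / mag for component in vector]


vector = [2, 3, -4]
result = unit_vector(vector)   # keep the outcome in a variable
print(result)                  # print it only where it is needed
```

Because the function returns its result instead of printing it, the same value can be passed on to other calculations.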

Smoothing values (neighbors between 1-9)

Instructions: Compute and store R=1000 random values from 0-1 as x. moving_window_average(x, n_neighbors) is pre-loaded into memory from 3a. Compute the moving window average for x for the range of n_neighbors 1-9. Store x as well as each of these averages as consecutive lists in a list called Y.
My solution:
R = 1000
n_neighbors = 9
x = [random.uniform(0,1) for i in range(R)]
Y = [moving_window_average(x, n_neighbors) for n_neighbors in range(1,n_neighbors)]
where moving_window_average(x, n_neighbors) is a function as follows:
def moving_window_average(x, n_neighbors=1):
    n = len(x)
    width = n_neighbors*2 + 1
    x = [x[0]]*n_neighbors + x + [x[-1]]*n_neighbors
    # To complete the function,
    # return a list of the mean of values from i to i+width for all values i from 0 to n-1.
    mean_values=[]
    for i in range(1,n+1):
        mean_values.append((x[i-1] + x[i] + x[i+1])/width)
    return (mean_values)
This gives me an error: "Check your usage of Y again." Even though I've tested it for a few values, I still don't see what the problem with this exercise is. Did I just misunderstand something?
The instruction tells you to compute moving averages for all neighbors ranging from 1 to 9. So the below code should work:
import random
random.seed(1)
R = 1000
x = []
for i in range(R):
    num = random.uniform(0,1)
    x.append(num)
Y = []
Y.append(x)
for i in range(1,10):
    mov_avg = moving_window_average(x, n_neighbors=i)
    Y.append(mov_avg)
Actually, your moving_window_average(x, n_neighbors) function is not going to work with n_neighbors bigger than one. The interpreter won't complain, but the result is not what you have been asked for: the loop always averages the same three elements, x[i-1], x[i] and x[i+1], instead of the full window of width elements.
I suggest using something like:
def moving_window_average(x, n_neighbors=1):
    n = len(x)
    width = n_neighbors*2 + 1
    x = [x[0]]*n_neighbors + x + [x[-1]]*n_neighbors
    mean_values = []
    for i in range(n):
        temp = x[i: i+width]
        sum_= 0
        for elm in temp:
            sum_+= elm
        mean_values.append(sum_ / width)
    return mean_values
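A small worked example may make the padding and windowing easier to check by hand. The sketch below restates the same logic with sum() under a hypothetical name, moving_window_average_sketch, and applies it to a three-element list chosen for illustration.

```python
def moving_window_average_sketch(x, n_neighbors=1):
    """Mean over each window of 2*n_neighbors + 1 values, with edge padding."""
    n = len(x)
    width = n_neighbors * 2 + 1
    # Repeat the first and last values so every element has a full window.
    padded = [x[0]] * n_neighbors + x + [x[-1]] * n_neighbors
    return [sum(padded[i:i + width]) / width for i in range(n)]


values = [0.0, 3.0, 6.0]
# Padded list is [0, 0, 3, 6, 6]; the windows are [0, 0, 3], [0, 3, 6], [3, 6, 6].
averages = moving_window_average_sketch(values, n_neighbors=1)
print(averages)  # [1.0, 3.0, 5.0]

# With n_neighbors=0 the window has width 1, so the input comes back unchanged.
unchanged = moving_window_average_sketch(values, n_neighbors=0)
print(unchanged)  # [0.0, 3.0, 6.0]
```

The output always has the same length as the input, whatever value of n_neighbors is used.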
My solution for +100XP
import random
random.seed(1)
R=1000
Y = list()
x = [random.uniform(0, 1) for num in range(R)]
for n_neighbors in range(10):
    Y.append(moving_window_average(x, n_neighbors))

Resources