curve_fit with polynomials of variable length - python-3.x

I'm new to Python (and programming in general) and want to make a polynomial fit using curve_fit, where the order of the polynomial (i.e. the number of fit parameters) is variable.
I made this code, which works for a fixed set of 3 parameters a, b, c:
# fit function
def fit_func(x, a, b, c):
    p = np.polyval([a, b, c], x)
    return p

# do the fitting
popt, pcov = curve_fit(fit_func, x_data, y_data)
But now I'd like my fit function to depend on some number N of parameters instead of a, b, c, ....
I'm guessing that's not a very hard thing to do, but because of my limited knowledge I can't get it to work.
I've already looked at this question, but I wasn't able to apply it to my problem.

You can define the function to be fit to your data like this:
def fit_func(x, *coeffs):
    y = np.polyval(coeffs, x)
    return y
Then, when you call curve_fit, set the argument p0 to the initial guess of the polynomial coefficients. For example, the following script fits polynomials with 2 to 5 coefficients to sample data and plots the results.
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

# Generate a sample input dataset for the demonstration.
x = np.arange(12)
y = np.cos(0.4*x)

def fit_func(x, *coeffs):
    y = np.polyval(coeffs, x)
    return y

fit_results = []
for n in range(2, 6):
    # The initial guess of the parameters to be found by curve_fit.
    # Warning: in general, an array of ones might not be a good enough
    # guess for `curve_fit`, but in this example, it works.
    p0 = np.ones(n)
    popt, pcov = curve_fit(fit_func, x, y, p0=p0)
    # XXX Should check pcov here, but in this example, curve_fit converges.
    fit_results.append(popt)

plt.plot(x, y, 'k.', label='data')
xx = np.linspace(x.min(), x.max(), 100)
for p in fit_results:
    yy = fit_func(xx, *p)
    plt.plot(xx, yy, alpha=0.6, label='n = %d' % len(p))
plt.legend(framealpha=1, shadow=True)
plt.grid(True)
plt.xlabel('x')
plt.show()

The parameters of polyval are p, an array of coefficients ordered from the highest degree to the lowest, and x, a number or array of numbers at which to evaluate the polynomial. The documentation says the following:
If p is of length N, this function returns the value:
p[0]*x**(N-1) + p[1]*x**(N-2) + ... + p[N-2]*x + p[N-1]
def fit_func(p, x):
    z = np.polyval(p, x)
    return z
e.g.
t = np.array([3, 4, 5, 3])
y = fit_func(t, 5)
503
which, if you do the math (3*5**3 + 4*5**2 + 5*5 + 3 = 503), is right.
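Note that fit_func as written above takes the coefficient array first, which is handy for evaluating a polynomial directly, but curve_fit expects the independent variable as the first argument and the fit parameters as the remaining ones. A minimal sketch of how this might be adapted to the original question (the sample data here is made up purely for illustration):

import numpy as np
from scipy.optimize import curve_fit

def fit_func(x, *coeffs):
    # curve_fit passes x first, then one positional argument per coefficient
    return np.polyval(coeffs, x)

N = 4  # number of coefficients, i.e. polynomial degree + 1
x_data = np.linspace(0, 5, 50)
y_data = x_data**3 - 2*x_data + 1
popt, pcov = curve_fit(fit_func, x_data, y_data, p0=np.ones(N))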

Related

Python: Fitting a piecewise polynomial

I am trying to fit a piecewise polynomial function
Code:
import numpy as np
import scipy
from scipy.interpolate import UnivariateSpline, splrep
from scipy.optimize import curve_fit
from matplotlib import pyplot as plt

def piecewise_func(x, X, Y):
    """
    cond_l: condition list
    func_l: function list
    """
    spl = UnivariateSpline(X, Y, k=3, s=0.5)
    tck = (spl._data[8], spl._data[9], 3)  # tck = (knots, coefficients, degree)
    p = scipy.interpolate.PPoly.from_spline(tck)
    cond_l = []
    func_l = []
    for idx, i in enumerate(range(3, len(spl.get_knots()) + 3 - 1)):
        cond_l.append([(x >= p.x[i] & x < p.x[i + 1])])
        func_l.append([lambda x: p.c[3, i] + p.c[2, i] * x + p.c[1, i] * x ** 2 + p.c[0, i] * x ** 3])
    return np.piecewise(x, cond_l, func_l)

if __name__ == '__main__':
    xdata = [0.28190937, 0.63429607, 0.91620544, 1.68793236, 2.32350115, 2.95215219, 4.5,
             4.78103382, 7.2, 7.53430054, 8.03627018, 9., 9.86212529, 11.25951191, 11.62658532, 11.65598578, 13.90295926]
    ydata = [0.36273168, 0.81614628, 1.17887796, 1.4475374, 5.52692706, 2.17548169, 3.55313396, 3.80326533, 7.75556311, 8.30176616, 10.72117182, 11.2499386,
             11.72296513, 11.02146624, 14.51260631, 20.59365525, 21.77847853]
    spl = UnivariateSpline(xdata, ydata, k=3, s=1)
    plt.plot(xdata, ydata, '*')
    plt.plot(xdata, spl(xdata))
    plt.show()
    p, e = curve_fit(piecewise_func, xdata, ydata)
    # x_plot = np.linspace(0., 0.15, len(x))
    # plt.plot(x, y, "+")
    # plt.plot(x, (piecewise_func(x_plot, *p)), 'C3-', lw=3)
I tried the UnivariateSpline function to interpolate, and I see the following result.
However, I don't want the polynomial curve to pass through all data points. I tried varying the smoothing factor, but I am not able to obtain something like the one below.
Expected output:
I'm trying curve fitting (using UnivariateSpline to fit the data tightly) to get the expected output, and I have the following issues.
piecewise_func in the code posted returns the piecewise polynomial.
Passing this to curve_fit(piecewise_func, xdata, ydata) returns an error
Error:
res = leastsq(func, p0, Dfun=jac, full_output=1, **kwargs)
ValueError: diff requires input that is at least one dimensional
I am not sure what is wrong.
Suggestions on how to get the expected fit will be of great help.
I would recommend having a closer look at the parameter s in the UnivariateSpline documentation:
s : float or None, optional
Positive smoothing factor used to choose the number of knots. Number of knots will be increased until the smoothing condition is satisfied:
sum((w[i] * (y[i]-spl(x[i])))**2, axis=0) <= s
If s is None, s = len(w) which should be a good value if 1/w[i] is an estimate of the standard deviation of y[i]. If 0, spline will interpolate through all data points. Default is None.
Since you do not set w, this is just a complicated way of saying that s is the least squares error that you allow, i.e., squared errors summed over all the data points. Your value of 1 does not lead to interpolation but it is quite tight compared to what you want to achieve.
Taking
spl = UnivariateSpline(xdata, ydata, k=3, s=10)
you get the following:
Yet closer to your goal is s=100:
So my recommendation is to play around with s and if that proves insufficient, to ask a new question describing what you need more precisely. I haven't had a proper look at the problem with piecewise_func.
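To make that experimentation concrete, here is a minimal sketch (my own, reusing the xdata/ydata from the question) that overlays the spline for a few values of s so their tightness can be compared; with no weights given, the summed squared residual of each fit should stay at or below its s:

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import UnivariateSpline

xdata = np.array([0.28190937, 0.63429607, 0.91620544, 1.68793236, 2.32350115,
                  2.95215219, 4.5, 4.78103382, 7.2, 7.53430054, 8.03627018,
                  9., 9.86212529, 11.25951191, 11.62658532, 11.65598578,
                  13.90295926])
ydata = np.array([0.36273168, 0.81614628, 1.17887796, 1.4475374, 5.52692706,
                  2.17548169, 3.55313396, 3.80326533, 7.75556311, 8.30176616,
                  10.72117182, 11.2499386, 11.72296513, 11.02146624,
                  14.51260631, 20.59365525, 21.77847853])

xx = np.linspace(xdata.min(), xdata.max(), 300)
plt.plot(xdata, ydata, '*', label='data')
for s in (1, 10, 100):
    spl = UnivariateSpline(xdata, ydata, k=3, s=s)
    # the smoothing condition: sum of squared residuals <= s
    print(s, np.sum((ydata - spl(xdata))**2))
    plt.plot(xx, spl(xx), label='s = %g' % s)
plt.legend()
plt.show()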

Curve fitting with known coefficients in Python

I tried using NumPy, SciPy and scikit-learn, but couldn't find what I need in any of them. Basically, I need to fit a curve to a dataset while restricting some of the coefficients to known values. I found how to do it in MATLAB, using fittype, but couldn't do it in Python.
In my case I have a dataset of X and Y and I need to find the best-fitting curve. I know it's a polynomial of second degree (ax^2 + bx + c) and I know the values of b and c, so I just need to find the value of a.
The solution I found in MATLAB was https://www.mathworks.com/matlabcentral/answers/216688-constraining-polyfit-with-known-coefficients, which is the same problem as mine, except that their polynomial was of degree 5. How could I do something similar in Python?
To add some info: I need to fit a curve to a dataset, so things like scipy.optimize.curve_fit that expect a function won't work (at least as far as I tried).
The tools you have available usually expect functions that take only their parameters (a being the only unknown in your case), or that take their parameters plus some data (a, x, and y in your case).
SciPy's curve_fit handles that use case just fine, as long as we hand it a function it understands. It expects x first and all your parameters as the remaining arguments:
from scipy.optimize import curve_fit
import numpy as np

b = 0
c = 0

def f(x, a):
    return c + x*(b + x*a)

x = np.linspace(-5, 5)
y = x**2

# params == [1.]
params, _ = curve_fit(f, x, y)
Alternatively you can reach for your favorite minimization routine. The difference here is that you manually construct the error function so that it only inputs the parameters you care about, and then you don't need to provide that data to scipy.
from scipy.optimize import minimize
import numpy as np

b = 0
c = 0
x = np.linspace(-5, 5)
y = x**2

def error(a):
    prediction = c + x*(b + x*a)
    return np.linalg.norm(prediction - y) / len(prediction)**.5

result = minimize(error, np.array([42.]))
assert result.success
# params == [1.]
params = result.x
I don't think scipy has a partially applied polynomial fit function built-in, but you could use either of the above ideas to easily build one yourself if you do that kind of thing a lot.
from scipy.optimize import curve_fit
import numpy as np

def polyfit(coefs, x, y):
    # build a mapping from null coefficient locations to locations in the function
    # coefficients we're passing to curve_fit
    #
    # idx[j]==i means that unknown_coefs[i] belongs in coefs[j]
    _tmp = [i for i, c in enumerate(coefs) if c is None]
    idx = {j: i for i, j in enumerate(_tmp)}

    def f(x, *unknown_coefs):
        # create the entire polynomial's coefficients by filling in the unknown
        # values in the right places, using the aforementioned mapping
        p = [(unknown_coefs[idx[i]] if c is None else c) for i, c in enumerate(coefs)]
        return np.polyval(p, x)

    # we're passing an initial value just so that scipy knows how many parameters
    # to use
    params, _ = curve_fit(f, x, y, np.zeros((sum(c is None for c in coefs),)))
    # return all the polynomial's coefficients, not just the few we just discovered
    return np.array([(params[idx[i]] if c is None else c) for i, c in enumerate(coefs)])
x = np.linspace(-5, 5)
y = x**2

# (unknown)x^2 + 0x + 0
# params == [1., 0., 0.]
params = polyfit([None, 0, 0], x, y)
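For the situation in the question (a second-degree polynomial where b and c are already known and only a is unknown), the call might then look like this; a sketch where b_known, c_known and the synthetic data are my own placeholder values:

b_known, c_known = 2.0, -1.0            # hypothetical known coefficients
x = np.linspace(-5, 5)
y = 3*x**2 + b_known*x + c_known        # synthetic data generated with a = 3
# coefficients ordered highest degree first; None marks the one to be fitted
coeffs = polyfit([None, b_known, c_known], x, y)
# coeffs should come out close to [3., 2., -1.]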
Similar features exist in nearly every mainstream scientific library; you just might need to reshape your problem a bit to frame it in terms of the available primitives.

Python 3: Getting Value Error, sequence larger than 32

I'm trying to write a script that computes numerical derivatives using the forward, backward, and centered approximations, and plots the results. I've made a linspace from 0 to 2pi with 100 points. I've made many arrays and linspaces in the past, but I've never seen this error: "ValueError: sequence too large; cannot be greater than 32"
I don't understand what the problem is. Here is my script:
import numpy as np
import matplotlib.pyplot as plt

def f(x):
    return np.cos(x) + np.sin(x)

def f_diff(x):
    return np.cos(x) - np.sin(x)

def forward(x,h): #forward approximation
    return (f(x+h)-f(x))/h

def backward(x,h): #backward approximation
    return (f(x)-f(x-h))/h

def center(x,h): #center approximation
    return (f(x+h)-f(x-h))/(2*h)

x0 = 0
x = np.linspace(0,2*np.pi,100)
forward_result = np.zeros(x)
backward_result = np.zeros(x)
center_result = np.zeros(x)
true_result = np.zeros(x)

for i in range(x):
    forward_result[i] = forward[x0,i]
    true_result[i] = f_diff[x0]

print('Forward (x0={}) = {}'.format(x0,forward(x0,x)))
#print('Backward (x0={}) = {}'.format(x0,backward(x0,dx)))
#print('Center (x0={}) = {}'.format(x0,center(x0,dx)))

plt.figure()
plt.plot(x, f)
plt.plot(x,f_diff)
plt.plot(x, abs(forward_result-true_result),label='Forward difference')
I did try setting the linspace points to 32, but that gave me another error: "TypeError: 'numpy.float64' object cannot be interpreted as an integer"
I don't understand that one either. What am I doing wrong?
The issue starts at forward_result = np.zeros(x), because x is a numpy array, not a shape. Since x has 100 entries, np.zeros tries to create an array with one axis per entry of x (i.e. a 100-dimensional object), and the maximum number of dimensions is 32.
You need a flat np array.
UPDATE: On request, I add corrected lines from code above:
forward_result = np.zeros(x.size) creates a one-dimensional array with one entry per point in x.
The functions have to be called with round brackets (parentheses), not indexed with square brackets. Also fixed the loop:
for i, h in enumerate(x):
    forward_result[i] = forward(x0, h)
    true_result[i] = f_diff(x0)
Finally, in the figure, you are plotting a numpy array against a function object; you need the function's values instead. Fixed version:
plt.plot(x, [f(val) for val in x])
plt.plot(x, [f_diff(val) for val in x])
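Putting those corrections together, a minimal sketch of the whole script might look like the following (my own assembly; like the corrected loop above, it treats each value in x as a step size h for the derivative at x0, and it starts the step sizes slightly above zero to avoid dividing by zero):

import numpy as np
import matplotlib.pyplot as plt

def f(x):
    return np.cos(x) + np.sin(x)

def f_diff(x):
    return np.cos(x) - np.sin(x)

def forward(x, h):  # forward approximation
    return (f(x + h) - f(x)) / h

x0 = 0
h = np.linspace(0.01, 2*np.pi, 100)   # step sizes, avoiding h = 0
forward_result = forward(x0, h)       # vectorised, no explicit loop needed
true_result = f_diff(x0)

plt.plot(h, abs(forward_result - true_result), label='Forward difference error')
plt.xlabel('h')
plt.legend()
plt.show()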

Double Trapezoidal Integral in numpy

I have a two-dimensional function $f(x,y)=\exp(y-x)$. I would like to compute the double integral $\int_{0}^{10}\int_{0}^{10}f(x,y) dx dy$ using NumPy trapz. After some reading, they say I should just repeat the trapz twice but it's not working. I have tried the following
import numpy as np

def distFunc(x, y):
    f = np.exp(-x+y)
    return f

# Values in x to evaluate the integral.
x = np.linspace(.1, 10, 100)
y = np.linspace(.1, 10, 100)
list1 = distFunc(x, y)
int_exp2d = np.trapz(np.trapz(list1, y, axis=0), x, axis=0)
The code always gives the error
IndexError: list assignment index out of range
I don't know how to fix this so that the code works. I thought the inner trapz was supposed to integrate along y first, and then the outer one along x. Thank you.
You need to convert x and y to 2D arrays which can be done conveniently in numpy with np.meshgrid. This way, when you call distfunc it will return a 2D array which can be integrated along one axis first and then the other. As your code stands right now, you are passing a 1D list to the first integral (which is fine) and then the second integral receives a scalar value.
import numpy as np

def distFunc(x, y):
    f = np.exp(-x+y)
    return f

# Values in x to evaluate the integral.
x = np.linspace(.1, 10, 100)
y = np.linspace(.1, 10, 100)
X, Y = np.meshgrid(x, y)
list1 = distFunc(X, Y)
int_exp2d = np.trapz(np.trapz(list1, y, axis=0), x, axis=0)
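Since the integrand separates as exp(-x)*exp(y), the exact value of this integral is available in closed form, which makes a handy sanity check for the trapezoidal result (my addition, using the same 0.1 to 10 limits as the code above):

# exact value: (integral of e^-x from 0.1 to 10) * (integral of e^y from 0.1 to 10)
exact = (np.exp(-0.1) - np.exp(-10)) * (np.exp(10) - np.exp(0.1))
print(int_exp2d, exact)  # the two numbers should agree closely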

Plotting all of a trigonometric function (x^2 + y^2 == 1) with matplotlib and python

As an exercise in learning Matplotlib and improving my math/coding I decided to try and plot a trigonometric function (x squared plus y squared equals one).
Trigonometric functions are also called "circular" functions but I am only producing half the circle.
#Attempt to plot equation x^2 + y^2 == 1
import numpy as np
import matplotlib.pyplot as plt
import math
x = np.linspace(-1, 1, 21) #generate np.array of X values -1 to 1 in 0.1 increments
x_sq = [i**2 for i in x]
y = [math.sqrt(1-(math.pow(i, 2))) for i in x] #calculate y for each value in x
y_sq = [i**2 for i in y]
#Print for debugging / sanity check
for i, j in zip(x_sq, y_sq):
    print('x: {:1.4f} y: {:1.4f} x^2: {:1.4f} y^2: {:1.4f} x^2 + Y^2 = {:1.4f}'.format(math.sqrt(i), math.sqrt(j), i, j, i+j))
#Format how the chart displays
plt.figure(figsize=(6, 4))
plt.axhline(y=0, color='y')
plt.axvline(x=0, color='y')
plt.grid()
plt.plot(x, y, 'rx')
plt.show()
I want to plot the full circle, but my code only produces the positive y values.
Here is how the full plot should look. I used Wolfram Alpha to generate it.
Ideally I don't want solutions where the lifting is done for me such as using matplotlib.pyplot.contour. As a learning exercise, I want to "see the working" so to speak. Namely I ideally want to generate all the values and plot them "manually".
The only method I can think of is to re-arrange the equation and generate a set of negative y values for the calculated x values, then plot them separately. I am sure there is a better way to achieve the outcome, and I am sure one of the gurus on Stack Overflow will know what those options are.
Any help will be gratefully received. :-)
The equation x**2 + y**2 = 1 describes a circle with radius 1 around the origin.
But suppose you didn't know this already; you can still try to write this equation in polar coordinates,
x = r*cos(phi)
y = r*sin(phi)
(r*cos(phi))**2 + (r*sin(phi))**2 == 1
r**2*(cos(phi)**2 + sin(phi)**2) == 1
Due to the trigonometric identity cos(phi)**2 + sin(phi)**2 == 1 this reduces to
r**2 == 1
and since r should be real,
r == 1
(for any phi).
Plugging this into python:
import numpy as np
import matplotlib.pyplot as plt
phi = np.linspace(0, 2*np.pi, 200)
r = 1
x = r*np.cos(phi)
y = r*np.sin(phi)
plt.plot(x,y)
plt.axis("equal")
plt.show()
This happens because the square root returns only the positive value, so you need to take those values and plot their negatives as well.
You can do something like this:
import numpy as np
import matplotlib.pyplot as plt
r = 1 # radius
x = np.linspace(-r, r, 1000)
y = np.sqrt(r**2 - x**2)  # upper half of the circle of radius r
plt.figure(figsize=(5,5), dpi=100) # figsize=(n,n), n needs to be equal so the image doesn't flatten out
plt.grid(linestyle='-', linewidth=2)
plt.plot(x, y, color='g')
plt.plot(x, -y, color='r')
plt.legend(['Positive y', 'Negative y'], loc='lower right')
plt.axhline(y=0, color='b')
plt.axvline(x=0, color='b')
plt.show()
And that should return a plot of the full circle, with the positive and negative halves drawn in different colors.
