Using Polynomial Regression in Excel - excel-formula

Suppose I have a set of X's and Y's (both numeric) that is dynamic, i.e. it changes based on certain conditions.
From those two sets of data I build a scatter plot with a polynomial trendline of degree 3. I have displayed the trendline equation on the chart, and now I want to use that dynamic equation in further calculations.
How can I use the trendline formula in calculations?

From: https://www.ablebits.com/office-addins-blog/2019/01/16/excel-trendline-types-equations-formulas/
By Svetlana Cheusheva
Equation: y = b3*x^3 + b2*x^2 + b1*x + a
b3: =INDEX(LINEST(y, x^{1,2,3}), 1)
b2: =INDEX(LINEST(y, x^{1,2,3}), 1, 2)
b1: =INDEX(LINEST(y, x^{1,2,3}), 1, 3)
a: =INDEX(LINEST(y, x^{1,2,3}), 1, 4)
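You cannot reference the trendline equation on the chart directly, but the formulas above return the same coefficients, where y and x are the ranges holding your Y and X data. Because they are ordinary worksheet formulas, the coefficients update whenever the data changes, and you can use them in further calculations, e.g. b3*x^3 + b2*x^2 + b1*x + a for a new x.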
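To check the coefficients outside Excel, the least-squares problem that LINEST(y, x^{1,2,3}) solves can be sketched in plain Python. This is only an illustration: the helper names (solve, polyfit, polyval) and the sample data are made up for the example, and Excel's internal algorithm may differ from the normal-equations approach used here.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, highest power first (same order as LINEST)."""
    rows = [[x ** p for p in range(degree, -1, -1)] for x in xs]
    k = degree + 1
    # Normal equations: (A^T A) c = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(ata, aty)


def polyval(coeffs, x):
    """Evaluate a polynomial (highest power first) using Horner's rule."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


# Hypothetical sample data generated from y = 2x^3 - 3x^2 + x + 5
xs = [-2, -1, 0, 1, 2, 3]
ys = [2 * x**3 - 3 * x**2 + x + 5 for x in xs]

b3, b2, b1, a = polyfit(xs, ys, 3)
y_new = polyval([b3, b2, b1, a], 1.5)  # use the fitted equation for a new x
print(b3, b2, b1, a, y_new)
```

The coefficient order matches the INDEX(LINEST(...), 1, k) positions above: the highest power comes first and the intercept last.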

Related

Raising a number to the power of a range of numbers

The formula I would like to use looks something like this: SUMPRODUCT(x^(1:n),y^(n:1)). n=values in column A. 1:n is the exponents in forward progression from 1 to n in steps of 1. n:1 is the exponents in reverse progression from n to 1 in steps of 1. I would like the formula to be dynamic to fill in column B with the n values based on column A.
Try:
=SUMPRODUCT(5^ROW(1:100))
Or in Excel O365
=SUM(5^ROW(1:100))
As per @RonRosenfeld, a more robust solution could be =SUM(5^SEQUENCE(100)) in Excel 365.
EDIT: Based on the OP's comments, this should work (without O365):
=SUMPRODUCT(5^ROW(A1:INDEX(A:A,COUNTA(A:A))),7^LARGE(ROW(A1:INDEX(A:A,COUNTA(A:A))),ROW(A1:INDEX(A:A,COUNTA(A:A)))))
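As a cross-check, the two sums can be written out in Python. This is a minimal sketch, assuming n is the number of filled cells in column A (what COUNTA(A:A) returns); the function names are illustrative. LARGE(rows, k) returns the k-th largest row number, so the second factor runs through the exponents n, n-1, ..., 1.

```python
def power_sum(x, n):
    """x^1 + x^2 + ... + x^n, like =SUMPRODUCT(x^ROW(1:n))."""
    return sum(x ** k for k in range(1, n + 1))


def paired_power_sum(x, y, n):
    """Sum of x^k * y^(n+1-k) for k = 1..n: exponents of x rise while those of y fall."""
    return sum(x ** k * y ** (n + 1 - k) for k in range(1, n + 1))


print(power_sum(5, 5))            # 5 + 25 + 125 + 625 + 3125
print(paired_power_sum(5, 7, 2))  # 5^1 * 7^2 + 5^2 * 7^1
```

Python integers are exact, while Excel stores numbers with about 15 significant digits, so for large n (such as 100) the Excel result is an approximation.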
You can store the powers in a column and use an array formula:
=SUM(A1:A100^B1:B100), where column A contains 5 in each cell and column B contains the powers you want to use. Enter it as an array formula in a separate cell to get the answer.
Use the SERIESSUM function
The Excel SERIESSUM function returns the sum of a power series, based on the following power series expansion:
SERIESSUM(x, n, m, a) = a1*x^n + a2*x^(n+m) + a3*x^(n+2m) + ... + ai*x^(n+(i-1)m)
The syntax of the function is:
SERIESSUM( x, n, m, coefficients )
Where the function arguments are:
x - The input value to the power series.
n - The first power to which x is to be raised.
m - The step size that n is increased by, on each successive power of x.
coefficients - An array of coefficients that multiply each successive power of x.
The number of values in the supplied coefficients array defines the number of terms in the power series. This is illustrated in the following examples.
Example 1:
Here the Excel SERIESSUM function is used to calculate the power series:
5^1 + 5^2 + 5^3 + 5^4 + 5^5
formula: =SERIESSUM( 5, 1, 1, {1,1,1,1,1} )
output = 3905
Example 2:
1 * 2^1 + 2 * 2^3 + 3 * 2^5 + 4 * 2^7 + 5 * 2^9
formula: =SERIESSUM( 2, 1, 2, {1,2,3,4,5} )
output = 3186
I hope this is of help.
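The two examples can be reproduced with a short Python sketch of the same power series. The function below is an illustrative stand-in written for this note, not Excel's implementation; it follows the argument descriptions given above.

```python
def seriessum(x, n, m, coefficients):
    """Sum of coefficients[i] * x^(n + i*m) for i = 0, 1, 2, ..."""
    return sum(a * x ** (n + i * m) for i, a in enumerate(coefficients))


example_1 = seriessum(5, 1, 1, [1, 1, 1, 1, 1])  # 5^1 + 5^2 + 5^3 + 5^4 + 5^5
example_2 = seriessum(2, 1, 2, [1, 2, 3, 4, 5])  # 1*2^1 + 2*2^3 + 3*2^5 + 4*2^7 + 5*2^9
print(example_1, example_2)
```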
Another alternative answer, which I think is the right one for your case :-)
SERIESSUM takes an array of coefficients so that each term can have a different one. When all the coefficients are the same, though, the series is simply a geometric progression.
The following formula gives its sum:
=n*(1-n^c)/(1-n)
where "n" is the number (5) and "c" is the number of terms in the series (100).
This becomes:
=5*(1-5^100)/(1-5)
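The standard closed form for n^1 + n^2 + ... + n^c is n*(n^c - 1)/(n - 1). A quick way to sanity-check any such formula is to compare it with the direct sum, as in this small Python sketch (the function name is illustrative, and n is assumed to be an integer other than 1):

```python
def geometric_sum(n, c):
    """Closed form for n^1 + n^2 + ... + n^c with integer n != 1."""
    # n^c - 1 is always divisible by n - 1, so integer division is exact.
    return n * (n ** c - 1) // (n - 1)


direct = sum(5 ** k for k in range(1, 101))  # term-by-term sum of 100 terms
closed = geometric_sum(5, 100)
print(closed == direct)
```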
=SUMPRODUCT(5^ROW(A1:INDEX(A:A,COUNTA(A:A))),7^LARGE(ROW(A1:INDEX(A:A,COUNTA(A:A))),ROW(A1:INDEX(A:A,COUNTA(A:A)))))
This formula worked flawlessly!!!
Thank you @JvdV and everyone else for your efforts in helping me! GREATLY APPRECIATED!

Excel Equation of line not correct

I'm hoping someone can tell me why the equation that Excel generated does not give the correct results, even though the trendline is graphed correctly.
I have some X and Y points that I will list below. I plotted those points in Excel and then plotted the trend line, and had it show me the equation of the trendline. When I take the equation and then plug in the X values I get very different answers back.
X and Y Values
X Y
0 3
3 2
5 1.4
7 1
10 0.5
18 0.1
When I set the intercept to 3, the equation of the trendline is y = 0.0088x^5 - 0.1457x^4 + 0.8753x^3 - 2.224x^2 + 1.4798x + 3
Screenshot of Excel window with equation
Any help is greatly appreciated.
I suspect you didn't set up your graph correctly.
Select a single cell in your table
Insert/Scatter (and decide which you want with regard to markers, etc)
Select the line and add Trendline
Set your parameters for the trendline
If you want to get the formula for the trendline from the "show formula" option, be sure to format the trendline label to be numeric with 15 decimals. Otherwise the equation will certainly not work, even if it appears to be correct.
Note that you can obtain the formula directly using the LINEST worksheet function.
=LINEST(Y,X^{1,2,3,4,5}) returns the array:
{0.0000399230399230442,-0.00152188552188569,0.0192991822991846,-0.0840134680134806,-0.217128427128402,2.99999999999999}
The last value in the array is the y-intercept
The slight differences are due to the use of different algorithms for the two methods.
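The effect of a rounded trendline label can be illustrated in Python with the LINEST coefficients quoted above. This is a sketch under one stated assumption: that the label shows only four decimal places. The polyval helper is written for this example.

```python
# Coefficients returned by LINEST in the answer above, highest power first.
full = [0.0000399230399230442, -0.00152188552188569, 0.0192991822991846,
        -0.0840134680134806, -0.217128427128402, 2.99999999999999]

# Assumption for illustration: the chart label is rounded to four decimals.
rounded = [round(c, 4) for c in full]


def polyval(coeffs, x):
    """Evaluate a polynomial (highest power first) using Horner's rule."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


# X and Y values from the question
points = [(0, 3), (3, 2), (5, 1.4), (7, 1), (10, 0.5), (18, 0.1)]
for x, y in points:
    print(x, y, polyval(full, x), polyval(rounded, x))
```

With the full-precision coefficients the polynomial reproduces the Y values closely, while the rounded ones drift badly at larger X because the dropped digits are multiplied by x^4 and x^5.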

Excel gives weird R square calculations?

This is really weird. I calculate R^2 values with Excel in two different ways and the results differ hugely. Why?
1) First I use Excel to do a linear regression via a chart, using the right-click "Add Trendline..." option with the intercept set to 0. The R squared value shows -3.253. The regressed equation is Y = -0.1321 * X
2) Then I use Excel to do a linear regression via the LINEST function. I select a range of 5 rows by 2 columns and in the top-left cell I type =LINEST([Y vector], [X vector], FALSE, TRUE). FALSE forces the intercept to 0, and TRUE makes Excel return additional regression statistics. Then I press CTRL + SHIFT + Enter. The R^2 value appears in the third row of the first column and turns out to be 0.11166. The regressed equation is Y = -0.1321 * X
My question is: what am I doing wrong in calculating R^2 with the chart? Python's statsmodels.api confirms that R^2 is 0.11166 and that the regressed equation is Y = -0.1321 * X.
Y =
0.0291970802919708
0.141801551718973
0.145668034655723
0.0691229530946433
0.0431577486597426
0.133618351873374
X =
-0.35551988
-0.20577599
0.10780785
-0.25028796
-0.42762184
0.02442197
Your calculation is correct. The scatter plot does not return the correct R^2 when the intercept is 0. This is the formula for R^2:
R^2 = 1 - SSres / SStot
where
SSres = Σ (y_i - ŷ_i)^2 and SStot = Σ (y_i - y̅)^2
In a standard regression model, y̅ is the average value of y. But when you assume that the intercept equals 0, you need to set y̅ to zero. If you use the average value of y instead of zero, you get R^2 = -3.252767.
In my calculation, the "SStot wrong" column uses the average value of y as y̅, which gives R^2 = -3.252767. If you use 0 instead (as I did in the "SStot right" column), you get 0.111.
It is an old bug described by Microsoft here: https://support.microsoft.com/en-us/help/829249/you-will-receive-an-incorrect-r-squared-value-in-the-chart-tool-in-excel-2003
You need to use the LINEST function to get correct R^2 value.
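Both R^2 values can be reproduced from the data in the question with a few lines of Python. This is a minimal sketch of the calculation described above; the variable names are chosen for the example.

```python
ys = [0.0291970802919708, 0.141801551718973, 0.145668034655723,
      0.0691229530946433, 0.0431577486597426, 0.133618351873374]
xs = [-0.35551988, -0.20577599, 0.10780785,
      -0.25028796, -0.42762184, 0.02442197]

# Least-squares slope for a line forced through the origin: sum(xy) / sum(x^2)
slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Residual sum of squares for the fitted line y = slope * x
ss_res = sum((y - slope * x) ** 2 for x, y in zip(xs, ys))

mean_y = sum(ys) / len(ys)
ss_tot_centered = sum((y - mean_y) ** 2 for y in ys)  # uses the mean of y
ss_tot_origin = sum(y * y for y in ys)                # uses 0 in place of the mean

r2_chart = 1 - ss_res / ss_tot_centered   # what the chart trendline reports
r2_linest = 1 - ss_res / ss_tot_origin    # what LINEST reports with intercept 0
print(slope, r2_chart, r2_linest)
```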
My fellow engineers and I just got tangled up in this. Based on this discussion and what we observed, the chart's R^2 is wrong in every case except when Excel calculates the best y-intercept itself. With any other y-intercept (either forced through zero or user-defined), it is wrong.

how to obtain estimation from regression in excel?

I use data in Excel to produce a chart.
Then I run a regression and get an equation. I'd like to know what value the regression gives for a particular input (for example, x = 7.6 is the value for which I want an estimate of y).
It is an approximation with a polynomial of degree 6.
One simple method would be this: I have the equation, so I could use it directly.
However, I wondered if there is a faster method. Can I enter 7.6 somewhere and get the result quickly?
If you are looking at a linear regression line (straight line), you could try the FORECAST function:
=FORECAST(x, known_ys, known_xs)
You could also build your own equation automatically from
=LINEST(...)
I found the following on a site describing the capabilities of the linest function in excel:
In addition to using LOGEST to calculate statistics for other
regression types, you can use LINEST to calculate a range of other
regression types by entering functions of the x and y variables as the
x and y series for LINEST. For example, the following formula:
=LINEST(yvalues, xvalues^COLUMN($A:$C))
works when you have a single column of y-values and a single column of
x-values to calculate the cubic (polynomial of order 3) approximation
of the form:
y = m1*x + m2*x^2 + m3*x^3 + b
You can adjust this formula to calculate other types of regression,
but in some cases it requires the adjustment of the output values and
other statistics.
Or look at:
=TREND(...)
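For the straight-line case, what FORECAST computes can be sketched in Python as an ordinary least-squares line evaluated at a new x. The function below is an illustrative stand-in with made-up sample data. It covers only the linear case, so a degree-6 polynomial would still need the LINEST approach with powers of x.

```python
def forecast(x, known_ys, known_xs):
    """Predict y at x from the least-squares line through the known points."""
    n = len(known_xs)
    mean_x = sum(known_xs) / n
    mean_y = sum(known_ys) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(known_xs, known_ys))
    sxx = sum((xi - mean_x) ** 2 for xi in known_xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * x


# Hypothetical data lying exactly on y = 2x + 1
known_xs = [1, 2, 3, 4]
known_ys = [3, 5, 7, 9]
estimate = forecast(7.6, known_ys, known_xs)
print(estimate)
```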

How to curve fit data in Excel to a multi variable polynomial?

I have a simple set of data, 10 values that increase.
I want to fit them to a polynomial of the form:
Z = A1 + A2*X + A3*Y + A4*X^2 + A5*X*Y+ A6*Y^2
Where Z the output is the set of data above, A1 - A6 are the coefficients I am looking for,
X is the range of inputs (10 of course), and Y for the moment is a constant value.
How can I curve fit to this polynomial and not the standard 2nd order one that is created using 'trendline'?
Construct a Vandermonde matrix on your data points, find its inverse with MINVERSE, then apply this to the vector of Z values with MMULT. This works when the number of data points equals the number of coefficients (for a single-variable polynomial of degree n-1, that means n data points).
Otherwise you could try polynomial regression, which will again use the Vandermonde matrix.
More math than Excel really.
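The matrix approach can be sketched in Python for the six-coefficient form from the question. The example solves the square system directly instead of forming the inverse, which gives the same coefficients as MINVERSE followed by MMULT. The sample points and "true" coefficients are hypothetical. It also assumes that both X and Y vary: if Y is held constant, the columns 1, Y and Y^2 are multiples of one another, the matrix is singular and the coefficients are not unique.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def design_row(x, y):
    """One matrix row for Z = A1 + A2*X + A3*Y + A4*X^2 + A5*X*Y + A6*Y^2."""
    return [1.0, x, y, x * x, x * y, y * y]


# Hypothetical data: six (X, Y) points that do not all lie on one conic
points = [(0, 0), (1, 0), (0, 1), (2, 1), (1, 2), (3, 3)]
true_coeffs = [1.0, 2.0, -1.0, 0.5, 0.25, -0.75]  # made-up A1..A6
zs = [sum(c * t for c, t in zip(true_coeffs, design_row(x, y))) for x, y in points]

coeffs = solve([design_row(x, y) for x, y in points], zs)
print(coeffs)
```

With more data points than coefficients the system is no longer square, and a least-squares fit (polynomial regression) is the usual alternative.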
