How to obtain an estimate from a regression in Excel?

I use data in Excel to produce a chart.
Then I run a regression and get an equation. I'd like to know what value the regression gives for a particular input (for example, x = 7.6 is the value for which I want an estimate of y).
It is an approximation with a polynomial of degree 6.
One simple method would be this: I have the equation, so I could evaluate it myself.
However, I wondered whether there is a faster method. Can I enter 7.6 somewhere and get the result quickly?

If you are looking at a linear regression line (a straight line), you could try the FORECAST function:
=FORECAST(x, known_ys, known_xs)
You could also build your own equation automatically from
=LINEST(...)
I found the following on a site describing the capabilities of the LINEST function in Excel:
In addition to using LOGEST to calculate statistics for other
regression types, you can use LINEST to calculate a range of other
regression types by entering functions of the x and y variables as the
x and y series for LINEST. For example, the following formula:
=LINEST(yvalues, xvalues^COLUMN($A:$C))
works when you have a single column of y-values and a single column of
x-values to calculate the cubic (polynomial of order 3) approximation
of the form:
y = m1*x + m2*x^2 + m3*x^3 + b
You can adjust this formula to calculate other types of regression,
but in some cases it requires the adjustment of the output values and
other statistics.
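Or look at the TREND function, which returns the fitted value for a new x directly:
=TREND(known_ys, known_xs, new_xs)
For a polynomial fit, TREND accepts the same powers-of-x trick as LINEST, for example =TREND(yvalues, xvalues^{1,2,3,4,5,6}, 7.6^{1,2,3,4,5,6}) for degree 6 (the argument and array separators depend on your locale).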
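To make explicit what LINEST and TREND compute for a polynomial trendline, here is a minimal sketch in Python rather than Excel. It fits the coefficients by least squares using the normal equations and then evaluates the polynomial at a new x. The function names (solve_linear, polyfit, polyval) and the sample data are assumptions made for illustration, and for high degrees such as 6 the normal equations can be poorly conditioned, so treat this as a sketch and not a robust implementation.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def polyfit(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ..., c_degree] (normal equations)."""
    k = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve_linear(ata, aty)


def polyval(coeffs, x):
    """Evaluate c0 + c1*x + c2*x^2 + ... using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


# Hypothetical sample data lying exactly on y = 1 + 2x + 0.5x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x + 0.5 * x * x for x in xs]

coeffs = polyfit(xs, ys, 2)      # constant term first
estimate = polyval(coeffs, 7.6)  # fitted value at x = 7.6
print(coeffs, estimate)
```

Note that polyfit returns the constant term first, whereas LINEST returns the highest-order coefficient first and the intercept last.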

Related

How to calculate with the Poisson-Distribution in Matlab?

I've used Excel in the past, but the calculations involving the Poisson distribution took a while, so I switched to SQL. I soon realized that SQL might not be the right tool for statistical problems. Finally I decided to switch to Matlab, but I'm not used to it at all. My problem is the following:
I've imported a .csv table and have two columns of values, let's say A and B (each 110 x 1 double).
Both columns are the input values for my Poisson calculations. Since I want to calculate at least the first 20 events, I've created a variable z = 1:20.
When I now calculate, let's say,
New = poisspdf(z,A),
it says something like "non-scalar arguments must match in size".
z has only 20 elements, but A and B both have 110. So I've expanded it to Z = 1:110 and transposed it:
Znew = Z.'
When I now try to execute the actual calculation:
Results = poisspdf(Znew,A).*poisspdf(Znew,B)
I always get only a 110x1 vector, but what I want is a 20x20 matrix for each record of A and B (based on my actual choice of z = 1:20; I only changed to Z = 1:110 because Matlab told me the sizes need to match).
So each cell of this 20x20 matrix should hold the result of a slightly different calculation (poisspdf(Znew,A).*poisspdf(Znew,B)).
For example, in the first cell (1,1) I want the result of
poisspdf(0,value of A).*poisspdf(0,value of B),
in cell (1,2): poisspdf(0,value of A).*poisspdf(1,value of B),
in cell (2,1): poisspdf(1,value of A).*poisspdf(0,value of B),
and so on, assuming the format is cell(row, column).
Finally I want to sum up certain parts of each 20x20 matrix and show the summed results in new columns.
Can anybody help? Many thanks!
EDIT:
Poisson Matrix in Excel
In Excel there is Poisson-function: POISSON(x, μ, FALSE) = probability density function value f(x) at the value x for the Poisson distribution with mean μ.
For example, cell AD313 in the table above contains the following calculation:
=POISSON(0;first value of A;FALSE)*POISSON(0;first value of B;FALSE)
Cell AD314 contains:
=POISSON(1;first value of A;FALSE)*POISSON(0;first value of B;FALSE)
Cell AE313 contains:
=POISSON(0;first value of A;FALSE)*POISSON(1;first value of B;FALSE)
And so on.
I am not sure if I completely understand your question. I wrote this code that might help you:
clear; clc
% These are the lambda parameters of the two Poisson distributions
lambdaA = 100;
lambdaB = 200;
% Generating Poisson data here
A = poissrnd(lambdaA,110,1);
B = poissrnd(lambdaB,110,1);
% Get the first 20 samples
zA = A(1:20);
zB = B(1:20);
% Perform the calculation
results = repmat(poisspdf(zA,lambdaA),1,20) .* repmat(poisspdf(zB,lambdaB)',20,1);
% Sum
sumFinal = sum(results,2);
Let me know if this is what you were trying to do.
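For comparison, here is a minimal sketch of the 20x20 matrix as the question describes it, written in Python rather than Matlab. It treats one value of A and one value of B as the Poisson means, puts the count for A on the rows and the count for B on the columns (both starting at 0), and then sums selected regions. The function names and the example means 1.5 and 1.2 are assumptions made for illustration, not values from the question.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def joint_matrix(lam_a, lam_b, size=20):
    """size x size matrix with entry [i][j] = pmf(i, lam_a) * pmf(j, lam_b)."""
    pa = [poisson_pmf(k, lam_a) for k in range(size)]
    pb = [poisson_pmf(k, lam_b) for k in range(size)]
    return [[pa[i] * pb[j] for j in range(size)] for i in range(size)]


def region_sums(matrix):
    """Sum the diagonal, the part below it (row > col) and the part above it."""
    n = len(matrix)
    diag = sum(matrix[i][i] for i in range(n))
    lower = sum(matrix[i][j] for i in range(n) for j in range(i))
    upper = sum(matrix[i][j] for i in range(n) for j in range(i + 1, n))
    return diag, lower, upper


# Hypothetical means standing in for one record of columns A and B.
m = joint_matrix(1.5, 1.2)
diag, lower, upper = region_sums(m)
total = sum(sum(row) for row in m)
print(diag, lower, upper, total)
```

To process all 110 records, the same two calls can be made in a loop over the paired values of A and B, collecting the three sums as new columns.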

Using Polynomial Regression in Excel

Suppose I have a set of X's and Y's (both numeric data) which is dynamic, i.e. it changes based on certain conditions.
Using those two sets of data, I prepare a scatter plot with a polynomial trendline of degree 3. I have also displayed its equation, and now I want to use that dynamic equation in further calculations.
How can I use the formula of that trendline in calculations?
From: https://www.ablebits.com/office-addins-blog/2019/01/16/excel-trendline-types-equations-formulas/
By Svetlana Cheusheva
Equation: y = b3*x^3 + b2*x^2 + b1*x + a
b3: =INDEX(LINEST(y, x^{1,2,3}), 1)
b2: =INDEX(LINEST(y, x^{1,2,3}), 1, 2)
b1: =INDEX(LINEST(y, x^{1,2,3}), 1, 3)
a: =INDEX(LINEST(y, x^{1,2,3}), 1, 4)
One cannot reference the trendline formula on the graph directly, but one can use the formulas above to get the parts needed to use with the regression analysis.

Inverse CDF of Poisson dist in Excel

Is there a function to calculate the inverse CDF of the Poisson distribution? I want to use the inverse CDF to generate a set of Poisson-distributed random numbers.
A) Inverse CDF of Poisson distribution
The inverse CDF at q is also referred to as the q quantile of a distribution. For a discrete distribution, the inverse CDF at q is the smallest integer x such that CDF[dist,x] ≥ q. The Poisson distribution is a discrete distribution that models the number of events based on a constant rate of occurrence. The Poisson distribution can be used as an approximation to the binomial when the number of independent trials is large and the probability of success is small. A common application of the Poisson distribution is predicting the number of events over a specific time, such as the number of cars arriving at a toll plaza in 1 minute.
Formula
The probability mass function (PMF) is:
f(x) = e^(-λ) * λ^x / x!, for x = 0, 1, 2, ...
mean = λ
variance = λ
Notation
Term: Description
e: base of the natural logarithm
Reference: Methods and Formulas for Inverse Cumulative Distribution Functions
B) Excel Function: Excel provides the following function for the Poisson distribution:
POISSON(x, μ, cum)
where μ = the mean of the distribution and cum takes the values TRUE and FALSE
POISSON(x, μ, FALSE) = probability density function value f(x) at the value x for the Poisson distribution with mean μ.
POISSON(x, μ, TRUE)= cumulative probability distribution function F(x) at the value x for the Poisson distribution with mean μ.
Excel 2010/2013/2016 provide the additional function POISSON.DIST which is equivalent to POISSON.
Reference: Office Support POISSON.DIST Function
C) Excel doesn’t provide a worksheet function for the inverse of the Poisson distribution.
Instead you can use the following function provided by the Real Statistics Resource Pack, a free download for various versions of Excel.
POISSON_INV(p, μ) = smallest integer x such that POISSON(x, μ, TRUE) ≥ p
Note that the maximum value of x is 1,024,000,000. A value higher than this indicates an error.
Reference: Real Statistics Using Excel
D) The following query from the MrExcel.com forum, quoted below, seems to be related to your question.
Not sure if anyone can help with this. Basically I'm trying to find out how to apply the reverse of the Poisson function in excel. So as of now I have poisson(x value, mean, true-cumulative) and that lets me get the probability for that occurence. Basically I want to know how I can get the minimum/maximum x value based on a given probability.
So if I have a list of data (700 rows) and I want to find out what the minimum starting value should be given a desired average and the fact that I want the lowest value to be at the 0.05% probability. So 0.05% = (x, 35, True) solve for x. I know I can prob do this with solver, but I am trying to figure out a way to do this formulaicly without having to use the solver (as I may have to use this many times).
The code referred to here covers the inverse of the poisson formula when using True in the excel formula. It does not cover the inverse of the poisson formula when using False in the excel formula.
Re: Reverse Poisson?
Originally Posted by shg
A further mod to accommodate large means:
Code:
Function PoissonInv(Prob As Double, Mean As Double) As Variant
' shg 2011, 2012, 2014, 2015-0415
' For a Poisson process with mean Mean, returns a three-element array:
' o The smallest integer N such that POISSON(N, Mean, True) >= Prob
' o The CDF for N-1 (which is < Prob)
' o The CDF for N (which is >= Prob)
Reference: https://www.mrexcel.com/forum/excel-questions/507508-reverse-poisson-2.html
E) Why doesn't Excel have a POISSON.INV function?
The discussion on the referenced web page points to some formulas for calculating the information the OP wants.
You could use the following.
With the Poisson mean named lambda, enter the following in a newly inserted worksheet.
A1: =IF(ROWS(A$1:A1)<=4*lambda,POISSON(ROWS(A$1:A1)-1,lambda,1))
Fill A1 down into A2:A1000 (4 times as many rows as your most typical lambda value). Name the A1:A1000 range POISSON.CDF. Then use the formula
=MATCH(n,POISSON.CDF)-1
to give the results a POISSON.INV(n,lambda) function would.
If you want this for varying lambda, use the array formula
=MATCH(n,POISSON(ROW($A$1:INDEX($A:$A,4*lambda+1)),lambda,1))-1
Reference Shared Link
Hope That Helps.
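The definition in A), the smallest integer x whose cumulative probability reaches p, can also be sketched directly. The following is an illustration in Python rather than Excel or VBA. The function name poisson_inv, the iteration limit, and the sample parameters are assumptions made for this sketch, and it is only intended for moderate means, because exp(-lam) underflows to zero for very large lam.

```python
import math
import random


def poisson_inv(p, lam):
    """Smallest integer x such that the Poisson CDF with mean lam is >= p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if lam <= 0.0:
        raise ValueError("lam must be positive")
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    x = 0
    # Assumed safety limit so rounding error can never cause an endless loop.
    limit = int(lam + 40.0 * math.sqrt(lam) + 100.0)
    while cdf < p:
        x += 1
        if x > limit:
            raise ArithmeticError("CDF did not reach p; lam or p too extreme")
        term *= lam / x  # P(X = x) from P(X = x - 1)
        cdf += term
    return x


def poisson_sample(lam, rng):
    """Draw one Poisson variate by inverse transform sampling."""
    return poisson_inv(rng.random(), lam)


rng = random.Random(12345)  # fixed seed so the run is repeatable
samples = [poisson_sample(3.0, rng) for _ in range(10000)]
sample_mean = sum(samples) / len(samples)
print(poisson_inv(0.5, 2.0), sample_mean)
```

Feeding uniform random numbers through the inverse CDF, as poisson_sample does, is the same idea as using RAND() as the probability argument in the worksheet formulas.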
=MATCH(RAND(),MMULT((ROW(INDIRECT(ADDRESS(1,1)&":"&ADDRESS(MAX(lambda,5+lambda* 45/50)+6* SQRT(lambda)+3,1)))=COLUMN(INDIRECT(ADDRESS(1,1)&":"&ADDRESS(1,MAX(lambda,5+lambda* 45/50)+6* SQRT(lambda)+2))))+0,MMULT((ROW(INDIRECT(ADDRESS(1,1)&":"&ADDRESS(MAX(lambda,5+lambda* 45/50)+6* SQRT(lambda)+2,1)))=(COLUMN(INDIRECT(ADDRESS(1,1)&":"&ADDRESS(1,MAX(lambda,5+lambda* 45/50)+6* SQRT(lambda)+1)))+1))+0,POISSON(ROW($A$1:INDEX($A:$A,MAX(lambda,5+lambda* 45/50)+6* SQRT(lambda)+1))-1,lambda,1)))+(ROW(INDIRECT(ADDRESS(1,1)&":"&ADDRESS(MAX(lambda,5+lambda* 45/50)+6* SQRT(lambda)+3,1)))=(COLUMN(INDIRECT(ADDRESS(1,1)&":"&ADDRESS(1,1)))+FLOOR(MAX(lambda,5+lambda* 45/50)+6* SQRT(lambda)+2,1)))+0)-1
It is quite slow for lambda >1000.
This expands on the array formula
=MATCH(C4,POISSON(ROW($A$1:INDEX($A:$A,4*lambda+1)),lambda,1))-1
shared above by skkakkar, by prepending the array with 0 and appending 1, following the approach in "Is there a way to concatenate two arrays in Excel without VBA?".
The rest is mostly making the array shorter by replacing 4* lambda with 6* SQRT(lambda).

Excel gives weird R square calculations?

This is really weird. I calculate R^2 values with Excel in two different ways and the results differ hugely. Why?
1) First I use Excel to do a linear regression via a chart, and use the right-click "Add Trendline..." option to set the intercept to 0. The R-squared value shows -3.253. The fitted equation is Y = -0.1321 * X.
2) Then I use Excel to do a linear regression via the LINEST function. I select a range of 5 rows by 2 columns and, in the top-left cell, type =LINEST([Y vector], [X vector], FALSE, TRUE). FALSE forces the intercept to 0, and TRUE makes Excel return additional regression statistics. Then I press CTRL + SHIFT + ENTER. This shows the additional statistics, including the R^2 value in the third row of the left column, which turns out to be 0.11166. The fitted equation is Y = -0.1321 * X.
My question is: what am I doing wrong in calculating R^2 with the chart? Python with statsmodels.api confirms that R^2 is 0.11166 and that the fitted equation is Y = -0.1321 * X.
Y =
0.0291970802919708
0.141801551718973
0.145668034655723
0.0691229530946433
0.0431577486597426
0.133618351873374
X =
-0.35551988
-0.20577599
0.10780785
-0.25028796
-0.42762184
0.02442197
Your calculation is correct. The scatter plot does not return the correct R^2 when the intercept is 0. This is the formula for R^2:
R^2 = 1 - SSres / SStot
where
SSres = Σ (y_i - ŷ_i)^2 and SStot = Σ (y_i - y̅)^2
If you use the standard regression model, you use the average value of y as y̅. But when you assume that the intercept equals 0, you need to set y̅ to zero. If you use the average value of y instead of zero, you get R^2 = -3.252767.
You can see the calculation here. The "SStot wrong" column uses the average value of y as y̅, which gives R^2 = -3.252767. If you use 0 (as I did in the "SStot right" column), you get 0.111.
It is an old bug described by Microsoft here: https://support.microsoft.com/en-us/help/829249/you-will-receive-an-incorrect-r-squared-value-in-the-chart-tool-in-excel-2003
You need to use the LINEST function to get correct R^2 value.
My fellow engineers and I just got tangled up in this. Based on this discussion and what we observed, the chart's R^2 is wrong in every case except when Excel calculates the best-fit y-intercept itself. With any other y-intercept (either forced through zero or user-defined), it is wrong.
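The two R^2 values discussed above can be reproduced from the posted data. The sketch below, in plain Python, fits the zero-intercept slope and then computes R^2 twice: once with the total sum of squares taken around the mean of y, and once taken around zero. The variable names are assumptions for illustration. Which convention each Excel feature uses is taken from the answers above, not verified independently.

```python
# Data exactly as posted in the question.
y = [0.0291970802919708, 0.141801551718973, 0.145668034655723,
     0.0691229530946433, 0.0431577486597426, 0.133618351873374]
x = [-0.35551988, -0.20577599, 0.10780785,
     -0.25028796, -0.42762184, 0.02442197]

# Least-squares slope for the model y = slope * x (intercept forced to 0).
slope = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

# Residual sum of squares of the zero-intercept fit.
ss_res = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))

# Total sum of squares measured around the mean of y (standard model).
mean_y = sum(y) / len(y)
ss_tot_centered = sum((yi - mean_y) ** 2 for yi in y)

# Total sum of squares measured around zero (no-intercept model).
ss_tot_uncentered = sum(yi ** 2 for yi in y)

r2_centered = 1.0 - ss_res / ss_tot_centered      # the negative value from the chart
r2_uncentered = 1.0 - ss_res / ss_tot_uncentered  # the value LINEST reports

print(slope, r2_centered, r2_uncentered)
```

A negative R^2 simply means the zero-intercept line has a larger residual sum of squares than a horizontal line at the mean of y, which is possible once the intercept is no longer fitted.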

How to curve fit data in Excel to a multi variable polynomial?

I have a simple set of data, 10 values that increase.
I want to fit them to a polynomial of the form:
Z = A1 + A2*X + A3*Y + A4*X^2 + A5*X*Y+ A6*Y^2
Here Z, the output, is the set of data above, A1 to A6 are the coefficients I am looking for,
X is the range of inputs (10 values, of course), and Y is a constant value for the moment.
How can I curve fit to this polynomial and not the standard 2nd order one that is created using 'trendline'?
Construct a Vandermonde matrix from your data points, find its inverse with MINVERSE, then apply it to the vector of Z values with MMULT. This works only when the matrix is square, that is, when the number of data points equals the number of coefficients.
Otherwise you could try polynomial regression (least squares), which again uses the Vandermonde matrix.
This is more math than Excel, really.
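The least-squares route can be sketched as follows, in Python rather than Excel. Each data point becomes one row [1, X, Y, X^2, X*Y, Y^2] of a design matrix (the multi-variable analogue of the Vandermonde matrix), and the coefficients A1 to A6 come from the normal equations. The function names and the synthetic data are assumptions made for illustration. Note that if Y is the same constant for every point, the columns 1, Y and Y^2 are proportional to each other, the system is singular, and the six coefficients cannot be determined uniquely, so the example lets both X and Y vary.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if abs(m[col][col]) < 1e-12:
            # A (near-)zero pivot indicates linearly dependent columns.
            raise ValueError("singular system")
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def design_row(x, y):
    """Terms multiplying A1..A6 in Z = A1 + A2*X + A3*Y + A4*X^2 + A5*X*Y + A6*Y^2."""
    return [1.0, x, y, x * x, x * y, y * y]


def fit_surface(points, zs):
    """Least-squares estimate of [A1, ..., A6] via the normal equations."""
    rows = [design_row(x, y) for x, y in points]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(k)]
    return solve_linear(ata, atz)


# Hypothetical coefficients and a 4 x 3 grid of (X, Y) points.
true_coeffs = [1.0, 2.0, -1.0, 0.5, 0.25, -0.75]
points = [(float(x), float(y)) for x in range(4) for y in range(3)]
zs = [sum(c * t for c, t in zip(true_coeffs, design_row(x, y)))
      for x, y in points]

fitted = fit_surface(points, zs)
print(fitted)
```

Because the synthetic Z values lie exactly on the surface, the fitted coefficients should match true_coeffs up to rounding error.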
