This is an extension to the question asked in the forums a few years ago:
Excel produces scatter diagrams for sets of pair values. It also gives the option of producing a best fit trendline and formula for the trendline. It also produces bubble diagrams which take into consideration a weight provided with each value. However, the weight has no influence on the trendline or formula. Here is an example set of values, with their mappings and weights.
Value Map Weight
0 1 10
1 2 10
2 5 10
3 5 20
4 6 20
5 1 1
I have used the formula that brettDJ offered:
=INDEX(LINEST(B2:B7*C2:C7^0.5,IF({1,0},1,A2:A7)*C2:C7^0.5,TRUE,TRUE),3,1)
However, I do not understand why the weights are raised to the power 0.5 (that is, square-rooted) in this formula.
The original question is here
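One way to see where the square root might come from (a sketch, not part of the quoted answer): weighted least squares minimises sum(w * (y - a - b*x)^2), and each term equals (sqrt(w)*y - a*sqrt(w) - b*sqrt(w)*x)^2. So an ordinary, unweighted fit on columns scaled by sqrt(w) should give the same coefficients. The Python below checks this on the example data; the function names are made up for illustration, and the sketch uses no intercept other than the scaled column of ones.

```python
from math import sqrt

# Example data from the table above: Value (x), Map (y), Weight (w).
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 2, 5, 5, 6, 1]
ws = [10, 10, 10, 20, 20, 1]


def weighted_fit(xs, ys, ws):
    """Minimise sum(w * (y - a - b*x)**2) directly, using weighted means."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    return my - slope * mx, slope  # (intercept, slope)


def scaled_ols_fit(xs, ys, ws):
    """Unweighted least squares on sqrt(w)-scaled columns.

    Regressors: sqrt(w)*1 and sqrt(w)*x. Response: sqrt(w)*y.
    """
    r = [sqrt(w) for w in ws]
    ones = r                                  # scaled column of ones
    xcol = [ri * x for ri, x in zip(r, xs)]   # scaled x
    ycol = [ri * y for ri, y in zip(r, ys)]   # scaled y
    # 2x2 normal equations, solved with Cramer's rule.
    s11 = sum(u * u for u in ones)
    s12 = sum(u * v for u, v in zip(ones, xcol))
    s22 = sum(v * v for v in xcol)
    g1 = sum(u * t for u, t in zip(ones, ycol))
    g2 = sum(v * t for v, t in zip(xcol, ycol))
    det = s11 * s22 - s12 * s12
    intercept = (g1 * s22 - g2 * s12) / det
    slope = (s11 * g2 - s12 * g1) / det
    return intercept, slope


direct = weighted_fit(xs, ys, ws)
scaled = scaled_ols_fit(xs, ys, ws)
print(direct, scaled)
```

Multiplying by the weights themselves instead of their square roots would weight each squared residual by w^2, which is not what the weights are meant to express.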
I tried to do a regression analysis with some 91 data points. When I ran the regression initially, I got an R value of 0.366733. Later I sorted the data points from smallest to largest and ran the regression again. My new R value is 0.04323. Does the order in which the original data points are arranged influence the regression analysis?
The ordering of paired datapoints does not matter in regression
For example:
5 9
6 1
3 7
9 5
6 4
This gives a correlation (which is the same as the standardized regression coefficient) of -0.37
If I reorder the entire data based on column 1 values:
3 7
5 9
6 1
6 4
9 5
I get the same correlation of -0.37. Notice that the pairs are still aligned, i.e. both columns are being sorted together
But in Excel it's very easy to end up in a situation like the following, where you sort by only a single column. One column ends up ordered, but the pair alignment is broken because the second column doesn't change:
3 9
5 1
6 7
6 5
9 4
Now I get a correlation of -0.41. The pairs of data are no longer aligned, which effectively makes this a completely different dataset than before
Bottom line: when you're sorting in Excel, make sure you've selected all of your data for the sort and not just a single column
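The three cases above can be reproduced outside Excel. The following is a minimal Python sketch using the five example pairs; the helper name `pearson` is just an illustrative choice.

```python
from math import sqrt


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


original = [(5, 9), (6, 1), (3, 7), (9, 5), (6, 4)]

# Sorting whole rows keeps each x with its y.
sorted_together = sorted(original)

# Sorting only the first column breaks the pairing.
xs_only_sorted = sorted(x for x, _ in original)
ys_untouched = [y for _, y in original]
broken = list(zip(xs_only_sorted, ys_untouched))

print(round(pearson(original), 2))
print(round(pearson(sorted_together), 2))
print(round(pearson(broken), 2))
```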
I have been trying to create a windrose that displays the occurrence of multiple wind speeds and their respective wind direction. Using other very helpful posts on here I've gotten pretty close to what I want. There is just one thing I can't seem to fix.
As you can see in the figure below, the graph starts at 0 degrees, while I want the "North" wind direction sector to start at -11.25 (or +348.75) degrees.
Currently the radial axis labels are added using a pie chart while the rest of the data is plotted in a filled radar chart. It is easy to rotate the pie chart but I can't seem to find a similar function for rotating the radar chart. Any help would be much appreciated. The excel file is attached beneath the figure.
EDIT: Locked excel file against editing
Excel file
I haven't fully digested the netiquette of this website and not sure if it is a good idea to try giving you an answer 6+ months after you posted. Also hope that by this time you found an answer.
If not, this link should be of help:
https://superuser.com/questions/687036/how-to-make-a-pie-radar-chart
In the example the creator made one field for each degree and started the first series, which would be equivalent to your north at 0°. However nothing prevents you from starting at 348.
I have not tested it, but I also think nothing prevents you from adding even more "resolution", e.g. half-degree steps, or finer still at your discretion.
EDIT: following L.Guthardt's feedback.
In order to provide you an answer I opted to simplify your table and chart. Mostly for convenience, but also because I struggle to get a full understanding of the original "architecture". Still, the solution should work at any level and is based on two key elements:
first you will have to double the number of rows from 16 to 32 (thus each quadrant being repeated two times, e.g. ... nne - nne - ne - ne...)
second, you have to start and finish with N as showcased here
Direction Cat6
N 6
NNE 4 4
NNE 6
NE 4 4
NE 6
ENE 4 4
ENE 6
E 4 4
E 6
ESE 4 4
ESE 6
SE 4 4
SE 6
SSE 4 4
SSE 6
S 4 4
S 6
SSW 4 4
SSW 6
SW 4 4
SW 6
WSW 4 4
WSW 6
W 4 4
W 6
WNW 4 4
WNW 6
NW 4 4
NW 6
NNW 4 4
NNW 6
N 4 4
which will generate
for the pie chart I used a separate range with alternate gaps in the labels
Direction Dummy
N 1
1
NNE 1
1
NE 1
1
ENE 1
1
E 1
1
ESE 1
1
SE 1
1
SSE 1
1
S 1
1
SSW 1
1
SW 1
1
WSW 1
1
W 1
1
WNW 1
1
NW 1
1
NNW 1
1
Rotating radar charts in Excel can be achieved by building a separate table for plotting the chart. It would have three columns:
Column A: New categories
Column B: Original categories (calculated from A)
Column C: Original data using VLOOKUP() on B
The chart will be plotted using columns B and C. Column B category numbers are offset by the desired number of categories.
If the chart needs to be rotated by other than multiples of a category degree (e.g., 30 degrees for 12 categories), you would need to add rows in between (corresponding to the amount of rotation in relation to the category degree). For example, to rotate a 12-category radar chart by multiples of 15 degrees, one extra row is needed in-between each original category row (to create 24 new categories). In this case, you would need to calculate the intermediate values by linearly interpolating between actual data points.
The trick is that blank category values are not displayed on the chart and the values for these categories blend in smoothly with the real data (because they are interpolated).
I will post an example if the above is not clear enough.
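As a rough illustration of the idea described above (doubling the categories with interpolated in-between rows, leaving their labels blank, then offsetting the start), here is a small Python sketch. The function name, the four-direction labels and the values are made up for the example; the real table would be built with worksheet formulas.

```python
def rotate_with_midpoints(labels, values, shift):
    """Insert an interpolated row between neighbours, then rotate.

    labels, values: one entry per original category, in circular order.
    shift: rotation in half-category steps; the row at position `shift`
           of the doubled table becomes the first plotted row.
    """
    n = len(values)
    new_labels, new_values = [], []
    for i in range(n):
        nxt = values[(i + 1) % n]          # wrap around the circle
        new_labels += [labels[i], ""]      # blank label for the midpoint
        new_values += [values[i], (values[i] + nxt) / 2]
    k = shift % (2 * n)
    return new_labels[k:] + new_labels[:k], new_values[k:] + new_values[:k]


labels = ["N", "E", "S", "W"]   # hypothetical, shortened category list
values = [4, 8, 2, 6]           # hypothetical data
rot_labels, rot_values = rotate_with_midpoints(labels, values, -1)
print(rot_labels)
print(rot_values)
```

With `shift = -1` the table starts at the interpolated row just before "N", so the plot begins half a category ahead of north.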
P.S. I cannot look at your new Excel file (in Answers) because it exceeds 5 MB (see screenshot 1).
So I did keep working on this problem and the best solution I've come up with (while using Microsoft Excel) looks as follows:
Currently, the number of sectors in the plot is fixed at 16. If I want to make this number variable, the table required for the plot data requires a very large amount of lookup functions which make the spreadsheet too slow to work with.
I've uploaded the new Excel file here to take a look at:
Excel file
Suppose I had two time series consisting of weekly data points, and I want to compute the covariance of the time series for the last n weeks using the covariance function in Excel.
Would it be possible to set this scenario up in such a way that a certain cell contains the number of weeks of data I want to compute the covariance for?
That is, changing the value in that cell from n to k would make the computed covariance update from the last n weeks to the last k weeks of the data series.
You decided that sample data was not important so here is some.
date nmbr
03-30-2017 4
04-04-2017 4
04-07-2017 2
04-09-2017 2
04-12-2017 1
04-15-2017 4
04-18-2017 1
04-21-2017 2
04-24-2017 1
04-26-2017 3
04-30-2017 4
05-02-2017 5
05-07-2017 4
05-09-2017 2
05-10-2017 1
05-12-2017 5
05-14-2017 4
My crystal ball tells me that this question is not so much about Excel's COVARIANCE.P or COVARIANCE.S but about limiting date related data. To this end, I'll simply SUM 4 weeks of data.
The formulas needed in E2:H2 (see supplied image) are:
E2: =TODAY()
F2: 4
G2: =FLOOR(E2-(F2*7), 7)+1
H2: =SUM(INDEX(B:B, MATCH(G2, A:A)+ISNA(MATCH(G2, A:A, 0))):INDEX(B:B, MATCH(1E+99, A:A)))
Note that the dates are in ascending order.
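For the covariance case in the question, the same "window length lives in one cell" idea can be sketched in Python as a function of k. This is only an illustration: the two series are hypothetical, and the `sample` flag is an assumed stand-in for the choice between the population form (as in COVARIANCE.P) and the sample form (as in COVARIANCE.S).

```python
def covariance_last_k(xs, ys, k, sample=False):
    """Covariance of the last k paired observations of two series."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    if k < 2 or k > len(xs):
        raise ValueError("k must be between 2 and the series length")
    tail_x, tail_y = xs[-k:], ys[-k:]
    mean_x = sum(tail_x) / k
    mean_y = sum(tail_y) / k
    total = sum((a - mean_x) * (b - mean_y) for a, b in zip(tail_x, tail_y))
    return total / (k - 1 if sample else k)


# Hypothetical weekly series, oldest first.
series_a = [1, 2, 3, 4, 5]
series_b = [2, 4, 6, 8, 10]

print(covariance_last_k(series_a, series_b, 3))
print(covariance_last_k(series_a, series_b, 3, sample=True))
```

Changing k plays the role of editing the cell that holds the number of weeks.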
I am trying to find the equation of a plane of best fit to a set of x,y,z data using the LINEST function. Some of the z data is missing, meaning that there are #N/As in the z column. For example:
      A     B     C
     (x)   (y)   (z)
1     1     1    5.1
2     2     1    5.4
3     3     1    5.7
4     1     2    #N/A
5     2     2    5.2
6     3     2    5.5
7     1     3    4.7
8     2     3    5
9     3     3    5.3
I would like to do =LINEST(C1:C9,A1:B9), but the #N/A causes this to return a value error.
I found a solution for a single independent variable (one column of known_x's, i.e. fitting a line to x,y data), but I have not been able to extend it for two independent variables (two known_x's columns, i.e. fitting a plane to x,y,z data). The solution I found is here: http://www.excelforum.com/excel-general/647448-linest-question.html, and the formula (slightly modified for my application) is:
=LINEST(
N(OFFSET(C1:C9,SMALL(IF(ISNUMBER(C1:C9),ROW(C1:C9)-ROW(C1)),
ROW(INDIRECT("1:"&COUNT(C1:C9)))),0,1)),
N(OFFSET(A1:A9,SMALL(IF(ISNUMBER(C1:C9),ROW(C1:C9)-ROW(C1)),
ROW(INDIRECT("1:"&COUNT(C1:C9)))),0,1))
)
which is equivalent to =LINEST(C1:C9,A1:A9), ignoring the row containing the #N/A.
The formula from the posted link could probably be adapted but it is unwieldy. Least squares with missing data can be viewed as a regression with weight 1 for numeric values and weight 0 for non-numeric values. Based on this observation you could try this (with Ctrl+Shift+Enter in a 1x3 range):
=LINEST(IF(ISNUMBER(C1:C9),C1:C9,),IF(ISNUMBER(C1:C9),CHOOSE({1,2,3},1,A1:A9,B1:B9),),)
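This gives the equation of the plane as z = 0.3x - 0.2y + 5. Note that LINEST returns the coefficients in reverse order of the regressors, so the 1x3 result reads -0.2 (for y, column B), 0.3 (for x, column A), 5 (constant). The result can be checked against LINEST(C1:C8,A1:B8) with the error row removed.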
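The weight-0/weight-1 view can also be checked outside Excel. The Python sketch below is an illustration only: it zeroes out the row with the missing z (represented here as `None`), builds the normal equations for the regressors 1, x, y and solves them with a small hand-written elimination routine. The helper names are made up for the example.

```python
# Rows are (x, y, z); None stands in for the #N/A cell.
rows = [
    (1, 1, 5.1), (2, 1, 5.4), (3, 1, 5.7),
    (1, 2, None), (2, 2, 5.2), (3, 2, 5.5),
    (1, 3, 4.7), (2, 3, 5.0), (3, 3, 5.3),
]


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_plane(rows):
    """Least squares for z = c + bx*x + by*y, weight 0 where z is missing."""
    design, target = [], []
    for x, y, z in rows:
        w = 1.0 if z is not None else 0.0
        design.append([w, w * x, w * y])          # zeroed row if z missing
        target.append(z if z is not None else 0.0)
    ata = [[sum(r[i] * r[j] for r in design) for j in range(3)]
           for i in range(3)]
    atz = [sum(r[i] * t for r, t in zip(design, target)) for i in range(3)]
    return solve(ata, atz)  # [constant, coefficient of x, coefficient of y]


constant, coef_x, coef_y = fit_plane(rows)
print(constant, coef_x, coef_y)
```

A zeroed row contributes nothing to either side of the normal equations, which is why it behaves as if the row had been deleted.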
I have an excel spreadsheet with
x y
0 -1.5
100 1.6
200 0
300 -6.8
400 -19.8
500 -39.9
I want to find the values where x = 600 through 1500. I have tried making a graph and using the trend line and getting Polynomial 2, and it returns
y = -2.8857x² + 12.686x - 11.7
R² = 0.999
So I plug this into my calculation using
=-2.8857*A110*A110+12686*A110-11.7
where A110 is the value 600, but it answers
6572736.3
I'm no math major, but in a trend of -6.8,-19.8,-39.9, the next number is not 6572736.3
Can someone please tell me how to figure out the equation so I can complete the series of numbers?
I concur with @mkingston (see output below**).
I'd add two points:
1) I find it is always a good idea to plot the original data and the regression equation before doing anything with the equation. In this case, plotting @mkingston's result gives:
... which shows that @mkingston's fitted results (shown by the lines) are, in fact, a good fit to the original data.
2) Extrapolation is always hazardous. If you already have a very good reason to believe that the underlying function is a quadratic of the form that we've fitted here, then the fit results below indicate the uncertainty in the parameters and hence can be used to estimate the uncertainty in the prediction (which may be quite substantial once you extrapolate to x = 1500). If, on the other hand, the quadratic equation that we've fitted is just a convenient shape that fits the data range that is available to us, then there are many alternative functions that could fit the available data roughly as well as this quadratic does, but would predict wildly different values for the range x = 600 to 1500. In this latter case, I'd describe any prediction at x = 600 as very uncertain and any prediction beyond that point as highly speculative, at best.
**The output I get from the Data | Data Analysis | Regression function of Excel 2007 is (after I've edited to change "X Variable" to "X" and "X Variable 2" to "X^2" for clarity):
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.999516468
R Square            0.99903317
Adjusted R Square   0.998388617
Standard Error      0.647338875
Observations        6

ANOVA
            df  SS           MS           F          Significance F
Regression  2   1299.01619   649.5080952  1549.9625  3.00625E-05
Residual    3   1.257142857  0.419047619
Total       5   1300.273333

           Coefficients  Standard Error  t Stat        P-value      Lower 95%     Upper 95%     Lower 95.0%   Upper 95.0%
Intercept  -1.9          0.586700679     -3.238448611  0.047907326  -3.767143409  -0.032856591  -3.767143409  -0.032856591
X          0.069142857   0.005518676     12.52888554   0.00109613   0.051579968   0.086705746   0.051579968   0.086705746
X^2        -0.000288571  1.05946E-05     -27.23767444  0.000108607  -0.000322288  -0.000254855  -0.000322288  -0.000254855
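To complete the series from the question, the coefficients in the regression output can be evaluated at x = 600 through 1500. The Python sketch below does only that; it copies the three coefficients from the table and assumes the quadratic form y = a*x^2 + b*x + c. Everything beyond x = 500 is extrapolation, with the caveats given in point 2.

```python
# Coefficients copied from the regression output (X^2, X, Intercept).
A = -0.000288571
B = 0.069142857
C = -1.9


def predict(x):
    """Evaluate the fitted quadratic y = A*x^2 + B*x + C."""
    return A * x * x + B * x + C


# Original data from the question, to compare fitted and observed values.
xs = [0, 100, 200, 300, 400, 500]
ys = [-1.5, 1.6, 0, -6.8, -19.8, -39.9]
residuals = [y - predict(x) for x, y in zip(xs, ys)]

for x, y in zip(xs, ys):
    print(x, y, round(predict(x), 2))

# Extrapolated values for x = 600, 700, ..., 1500.
for x in range(600, 1600, 100):
    print(x, round(predict(x), 1))
```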