FORECAST / TREND with *really* simple data Excel - GSheets - excel

I'm tracking something extremely simple; a week number, and body weight.
I can't get Excel or Google Sheets to use either of those functions to predict what the weight will be for the next 4 weeks.
I have a chart like this
1 185
2 184.3
3 186
4 189
5 183
6 186
7 188
etc
I need a prediction of 8, 9, 10, 11.
I've tried =FORECAST(X57,Y53:Y56,X53:X56) where the x57 is the next week number, but what happens is Excel/sheets starts counting at a lower number than the last weight. I got a weird negative number with TREND. I know I'm not doing something right. I tried switching the ranges too.
I've inserted a screenshot from Sheets.
I feel really stupid because I should be able to figure this out but it's been over an hour of scratching my head and getting frustrated. I don't have Excel 2016 with the Forecast graph function.
Am I doing this right, but because the numbers go up and down this is Excel/Sheet's best guess?

I placed your data in G1 through H7:
2nd Order Polynomial Trendline
Equation: y = (A * x2) + (B * x ) + C
A =INDEX(LINEST(y,x^{1,2}),1)
B =INDEX(LINEST(y,x^{1,2}),1,2)
C =INDEX(LINEST(y,x^{1,2}),1,3)
So in I1 through I3, enter these equations for the coefficients A, B and C:
=INDEX(LINEST(H1:H7,G1:G7^{1,2}),1)
=INDEX(LINEST(H1:H7,G1:G7^{1,2}),1,2)
=INDEX(LINEST(H1:H7,G1:G7^{1,2}),1,3)
Then enter 8 through 11 in column G. Then H8 enter:
=$I$1*G8^2+$I$2*G8+$I$3
and copy down:

With the data in A1:B7 you could either chart that, set a trendline (linear seems reasonable) and pick up the formula from the chart:
=0.3357*(A1)+184.56
(in C1 and copied down for comparison) or apply this in B8 and copy down:
=FORECAST(A8,B$1:B$7,A$1:A$7)
The known points are used for the chart on the left and the known points plus FORECAST ones for the chart on the right:

I know you don't have Excel 2016, but here is how it would look if you insert a forecast from Excel's Data menu.
Select the cells and then click Data > Forecast Sheet. Change the Forecast End to 11. I left the Options at the defaults as displayed.
Next click Create and you will see the formulas for the forecast and the data in an Excel Table. You can inspect the FORECAST functions used, for example in column C:
=FORECAST.ETS(A9,$B$2:$B$8,$A$2:$A$8,1,1)

Related

Sum of the greatest value in one column, plus the sum of the other values in another column

Consider the following sheet/table:
A B
1 90 71
2 40 25
3 60 16
4 110 13
5 87 82
I want to have a general formula in cell C1 that sums the greatest value in column A (which is 110), plus the sum of the other values in column B (which are 71, 25, 16 and 82). I would appreciate if the formula wasn't an array formula (as in requiring Ctrl + Shift + Enter). I don’t have Office 365, I have Excel 2019.
My attempt
Getting the greatest value in column A is easy, we use MAX(A1:A5).
So the formula I want in cell C1 should be something like:
=MAX(A1:A5) + SUM(array_of_values_to_be_summed)
Obtaining the values of the other rows in column B (what I called array_of_values_to_be_summed in the previous formula) is the hard part. I've read about using INDEX, MATCH, their combination, and obtaining arrays by using parenthesis and equal signs, and I've tried that, without success so far.
For example, I noticed that NOT((A1:A5 = MAX(A1:A5))) yields an array/list containing ones (or TRUEs) for the relative position of the rows to be summed, and containing a zero (or FALSE) for the relative position of the row to be omitted. Maybe this is useful, I couldn't find how.
Any ideas? Thanks.
Edit 1 (solution)
I managed to obtain what I wanted. I simply multiplied the array obtained with the NOT formula, by the range B1:B5. The final formula is:
=MAX(A1:A5) + SUM(NOT((A1:A5 = MAX(A1:A5))) * B1:B5)
Edit 2 (duplicate values)
I forgot to explain what the formula should do if there are duplicates in column A. In that case, the first term of my final formula (the term that has the MAX function) would be the one whose corresponding value in column B is smallest, and the value in column B of the other duplicates would be used in the second term (the one containing the SUM function).
For example, consider the following sheet/table:
A B
1 90 71
2 110 25
3 60 16
4 110 13
5 110 82
Based on the above table, the formula should yield 110 + (71 + 25 + 16 + 82) = 304.
Just to give context, the reason I want such a formula is because I’m writing a spreadsheet that automatically calculates the electric current rating of the short-circuit protective device of the feeder of a group of electric motors in a house or building or mall, as required by the article 430.62(A) of the US National Electrical Code. Column A is the current rating of the short-circuit protective device of the branch-circuit of each motors, and column B is the full-load current of each motor.
You can use this formula
=MAX(A1:A5)
+SUM(B1:B5)
-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)
Based on #Anupam Chand's hint for max-value-duplicates there could also be min-value-duplicates in column B for corresponding max-value-duplicates in column A. :) This formula would account for that
=SUM(B1:B5)
+(MAX(A1:A5)-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1))
*SUMPRODUCT((A1:A5=MAX(A1:A5))*(B1:B5=AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)))
Or with #Anupam Chand's shorter and better readable and overall better style :)
=SUM(B1:B5)
+(MAX(A1:A5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
*COUNTIFS(A1:A5,MAX(A1:A5),B1:B5,MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
The explanation works for bot solutions:
The SUM-part just sums the whole list.
The second line gets the max-value for column A and the corresponding min-value of column B for the max-values in column A and adds or subtracts it respectively.
The third line counts, how many times the corresponding min-value for the max-value occurs and multiplies it with the second line.
Can you try this ?
=MAX(A1:A5)+SUM(B1:B5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5))
What we're doing is adding the max of A to all rows of B and then subtracting the min value of B where A is the max.
If you have Excel 365 you can use the following LET-Formula
=LET(A,A1:A5,
B,B1:B5,
MaxA,MAX(A),
MinBExclude, MINIFS(B,A,MaxA),
sumB1,SUMPRODUCT(B*(A=MaxA)*(B<>MinBExclude)),
sumB2,SUMPRODUCT(B*(A<>MaxA)),
MaxA +sumB1+sumB2
A and B are shortcuts for the two ranges
MaxA returns the max value for A (110)
MinBExclude filters the values of column B by the MaxA-value (25, 13, 82) and returns the min-value of the filtered result (13)
sumB1 returns the sum of the other MaxA values from column B (26 + 82)
sumB2 returns the sum of the values from B where value in A <> MaxA (71 + 60)
and finally the result is returned
If you don't have Excel 365 you can add helper columns for MaxA, MinBExclude, sumB1 and sumB2 and the final result

Excel Array formula to count moving average outliers

I've tried a few things on this and settled on a 'cheap' solution. Wanted to know if this can be done directly and more elegantly.
Problem Statement and Sample Data
Assume we have a table in excel with ~200 columns and a large number of rows (~10k).
Sample Data:
identifier
val1
val2
val3
...
val200
ID_1
100
102
34
...
89
We want to add a column at the end that shows us how many "moving average" outliers exist. A moving average outlier is defined as a point that is outside the range (mean - 2 * std deviations, mean + 2 * std deviations), where the mean and std dev is calculated using the previous 10 values (therefore its a moving average outlier).
We will not test the first 10 values. But from val11, the previous 10 values will be used to form the window and we want to test if the value is an outlier.
My Solution so far
I created another table of same dimensions as the original. In cells from val11 (to val200, for all columns), I put in the formula below in the new table. And then, I can simply sum the columns in each row in the new table.
Assume val11 is on X2 in the "shocks" worksheet (for first row):
=IF(OR(shocks!X2<AVERAGEA(shocks!D2:W2)-2STDEVA(shocks!D2:W2),shocks!X2>AVERAGEA(shocks!D2:W2)+2STDEVA(shocks!D2:W2)),1,"")
But if possible, I want to avoid having a second table since it bloats and slows down the file. Any help would be greaty appreciated

Google Sheets Arithmetic Search

I have two Google sheets tabs:
I.)
--A-- --B--
--1-- type lessThan10Apart
--2-- Car 1
--3-- Plane 0
II.)
--A-- --B-- --C--
--1-- type sourceA sourceB
--2-- Car 1 100
--3-- Plane 10 100
--4-- Car 2 4
My question is how to create the lessThan10Apart formula above. lessThan10Apart should match up the type from sheet I to sheet II and only count the rows that: Are less than 10 units between A and B. But you can also imagine wanting to do any kind of arithmetic between columns B and C and running a COUNT.
My first attempt is something along the lines of:
=COUNTIFS('sheetII'!A:A),$A2, //Match column A
ABS('sheetII'!C:C-'sheetII'!B:B)<10 //Doesn't work!
)
The problem is that you can't seem to be able to do range calculations like this in COUNTIFS.
For the count (per F4 in supplied image),
=SUMPRODUCT(--(ABS(B2:B4-C2:C4)<10))
For the validSum (sum of absolute difference between B & C; per G4 in supplied image),
=SUMPRODUCT(--(ABS(B2:B4-C2:C4)<10), ABS(B2:B4-C2:C4))
Do not use full column references. Minimize your referenced ranges.
Discard the Car text in E4 in the above image.

Excel - Multiply Until Total Reached

I want to multiply x*y until x>=20, then multiply z that value and have the results displayed as two values, the multiple and multiple*z
The question behind the formula is, how many boxes of x capacity do I need to have a total capacity of 20 liters and how much does that cost.
x = volume of bottle
y = number of bottles in a box
z = price per box
This could be done very easily by hand, but I've been playing (with little effect) in excel for a while and would like a solution.
I hope that makes sense
I rather think what you would like is the formula provided by #Jeeped but for:
I want to multiply x*y until x>=20, then multiply z that value and have the results displayed as two values, the multiple and multiple*z
label two arrays from 1 to 20 for columns and rows as shown, populate V1 with the price per box and in B2:
=IF(AND($A2*B$1>20,A2>=20),"",$A2*B$1)
and in X2:
=IF(B2="","",$V$1*B2)
with both formulae copied across 19 columns and those two sets of 20 formulae then copied down 19 rows. The result should be similar to:

Conditional Interpolation Excel

I have the following excel setup that is extremely massive but here is a simplified setup:
Site1 X-Given Y-Given Site2 X-New-Given Y-Interpolated
A 10 400 A 25 550
A 20 500 A 25 550
A 30 600 A 26 560
A 40 700 B 27 570
A 50 800 B 30 600
B 10 400 B 15 450
B 20 500 B 25 550
B 30 600 B 30 600
What I'm trying to accomplish is to have each Y-Interpolated only interpolate based upon its specific site and not have any cross over. So site A would only interpolate with site A, and same with site B... so on and so forth.
I'm using the interpolate excel addin which has the following syntax:
=interpolate(x_array,y_array,x_given)
Thanks for the help!
You could try this worksheet function alternative... with data in A1:E9, enter this in F2 and fill down:
=FORECAST(E2,IF(MMULT(ROW(B$2:B$9)-LOOKUP(0,(B$2:B$9>=E2)/(A$2:A$9=D2),ROW(B$2:B$9))-0.5,1)^2<1,C$2:C$9),B$2:B$9)
Update: Here's a slightly shorter alternative entered with CTRL+SHIFT+ENTER
=PERCENTILE(IF(A$2:A$9=D2,C$2:C$9),PERCENTRANK(IF(A$2:A$9=D2,B$2:B$9),E2,20))
This assumes a positive relationship between variables and returns values at both boundaries.
Background
If you're going to use worksheet functions for this, the obvious approach is to find the neighboring two points to X: (X1,Y1) and (X2,Y2). Then calculate Y using:
Y = Y1 + (X - X1) * (Y2 - Y1) / (X2 - X1)
The problem is that this leads to a lengthy formula involving six INDEX/MATCH combinations and six more conditions for restricting data to the specified site. This leads one to look for other options...
1. The first formula looks complicated but all it's doing is applying a straight line fit based on the two neighboring points for the same site. Evaluating the formula for the third row above - by highlighting each part of the formula and pressing F9 - gives:
=FORECAST(26,{FALSE;500;600;FALSE;...},{10;20;30;40;...})
FORECAST ignores non-numeric data so the result is the same as just using {500,600} and {20,30} for the 2nd and 3rd arguments. You can use F9 on other parts of the formula to break it down further - I'll leave details to you. (The MMULT(...,1) part just changes the argument to an array so you can enter the formula without array-entry.)
2. The second formula is easier to follow. First note that in Excel percentiles are calculated by linear interpolation and the IF part is just restricting the numeric data to the specified site. Assuming data is increasing it follows that we can find the k-value in the PERCENTILE formula that matches the lookup value in the x-range and return the y-range value with that k-value. For the example in question:
26 =PERCENTILE({10,20,30,40,50},0.4)
560 =PERCENTILE({400,500,600,700,800},0.4)
To calculate the value of 0.4 the PERCENTRANK can be used which is inverse to PERCENTILE:
0.4 =PERCENTRANK({10,20,30,40,50},26)
0.4 =PERCENTRANK({400,500,600,700,800},560)
The formula above follows by combining these two functions, the last argument is set to 20 for full precision (Excel stores values internally to around 15-17 digits of precision).
Because the tool that you're using is based on a .xll add in for excel, you(or we) can not modify the code or create a custom version of interpolate that allows adding conditions.
Instead, you'll have to filter your data apart and then run the custom-function on the filtered datasets.

Resources