Conditional Interpolation Excel - excel

My Excel workbook is very large, but here is a simplified version of the setup:
Site1 X-Given Y-Given Site2 X-New-Given Y-Interpolated
A 10 400 A 25 550
A 20 500 A 25 550
A 30 600 A 26 560
A 40 700 B 27 570
A 50 800 B 30 600
B 10 400 B 15 450
B 20 500 B 25 550
B 30 600 B 30 600
What I'm trying to accomplish is to have each Y-Interpolated only interpolate based upon its specific site and not have any cross over. So site A would only interpolate with site A, and same with site B... so on and so forth.
I'm using the interpolate excel addin which has the following syntax:
=interpolate(x_array,y_array,x_given)
Thanks for the help!

You could try this worksheet function alternative... with data in A1:E9, enter this in F2 and fill down:
=FORECAST(E2,IF(MMULT(ROW(B$2:B$9)-LOOKUP(0,(B$2:B$9>=E2)/(A$2:A$9=D2),ROW(B$2:B$9))-0.5,1)^2<1,C$2:C$9),B$2:B$9)
Update: Here's a slightly shorter alternative entered with CTRL+SHIFT+ENTER
=PERCENTILE(IF(A$2:A$9=D2,C$2:C$9),PERCENTRANK(IF(A$2:A$9=D2,B$2:B$9),E2,20))
This assumes a positive relationship between variables and returns values at both boundaries.
Background
If you're going to use worksheet functions for this, the obvious approach is to find the neighboring two points to X: (X1,Y1) and (X2,Y2). Then calculate Y using:
Y = Y1 + (X - X1) * (Y2 - Y1) / (X2 - X1)
The problem is that this leads to a lengthy formula involving six INDEX/MATCH combinations and six more conditions for restricting data to the specified site. This leads one to look for other options...
1. The first formula looks complicated but all it's doing is applying a straight line fit based on the two neighboring points for the same site. Evaluating the formula for the third row above - by highlighting each part of the formula and pressing F9 - gives:
=FORECAST(26,{FALSE;500;600;FALSE;...},{10;20;30;40;...})
FORECAST ignores non-numeric data so the result is the same as just using {500,600} and {20,30} for the 2nd and 3rd arguments. You can use F9 on other parts of the formula to break it down further - I'll leave details to you. (The MMULT(...,1) part just changes the argument to an array so you can enter the formula without array-entry.)
2. The second formula is easier to follow. First note that in Excel percentiles are calculated by linear interpolation and the IF part is just restricting the numeric data to the specified site. Assuming data is increasing it follows that we can find the k-value in the PERCENTILE formula that matches the lookup value in the x-range and return the y-range value with that k-value. For the example in question:
26 =PERCENTILE({10,20,30,40,50},0.4)
560 =PERCENTILE({400,500,600,700,800},0.4)
To calculate the value of 0.4 the PERCENTRANK can be used which is inverse to PERCENTILE:
0.4 =PERCENTRANK({10,20,30,40,50},26)
0.4 =PERCENTRANK({400,500,600,700,800},560)
The formula above follows by combining these two functions. The last argument is set to 20 for full precision (Excel stores values internally to around 15-17 digits of precision).
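As a rough illustration of why the PERCENTILE/PERCENTRANK combination interpolates, here is a minimal Python sketch. The function names `percentrank_inc`, `percentile_inc` and `interpolate_for_site` are hypothetical, and the sketch assumes the x values within a site are distinct and that y increases with x, as the answer states. It mimics the linear-interpolation behaviour described above rather than reproducing Excel's implementation.

```python
from bisect import bisect_right
import math

def percentrank_inc(values, x):
    """Fractional rank of x within the sorted values, between 0 and 1."""
    v = sorted(values)
    if len(v) < 2 or not v[0] <= x <= v[-1]:
        raise ValueError("need at least two points and x inside their range")
    i = bisect_right(v, x) - 1          # index of the largest value <= x
    if i == len(v) - 1:                 # x is the maximum
        return 1.0
    frac = (x - v[i]) / (v[i + 1] - v[i])
    return (i + frac) / (len(v) - 1)

def percentile_inc(values, k):
    """Value at fractional rank k, by linear interpolation."""
    v = sorted(values)
    pos = k * (len(v) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(v) - 1)
    return v[lo] + (pos - lo) * (v[hi] - v[lo])

def interpolate_for_site(rows, site, x):
    """rows are (site, x, y) tuples; only rows for the given site are used."""
    xs = [r[1] for r in rows if r[0] == site]
    ys = [r[2] for r in rows if r[0] == site]
    return percentile_inc(ys, percentrank_inc(xs, x))

# Sample data from the question.
rows = [("A", 10, 400), ("A", 20, 500), ("A", 30, 600), ("A", 40, 700),
        ("A", 50, 800), ("B", 10, 400), ("B", 20, 500), ("B", 30, 600)]
print(interpolate_for_site(rows, "A", 26))
```

For site A and x = 26 the rank comes out as 0.4 and the result is approximately 560, in line with the worked example above.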

Because the tool you're using is a .xll add-in for Excel, you (or we) cannot modify its code or create a custom version of interpolate that accepts conditions.
Instead, you'll have to filter your data apart and then run the custom-function on the filtered datasets.
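The filter-then-interpolate idea can be sketched outside Excel as follows. This is only an illustration in Python: `linear_interpolate` is a hypothetical stand-in for the add-in's interpolate function (whose internals are not known here), and the sketch assumes distinct x values per site and no extrapolation.

```python
from collections import defaultdict

def linear_interpolate(points, x):
    """Two-point linear interpolation over (x, y) pairs; no extrapolation."""
    pts = sorted(points)
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        if x1 <= x <= x2:
            return y1 + (x - x1) * (y2 - y1) / (x2 - x1)
    raise ValueError("x is outside the known range for this site")

# Known points (Site1, X-Given, Y-Given) from the question.
known = [("A", 10, 400), ("A", 20, 500), ("A", 30, 600), ("A", 40, 700),
         ("A", 50, 800), ("B", 10, 400), ("B", 20, 500), ("B", 30, 600)]

# Step 1: split the data apart by site.
by_site = defaultdict(list)
for site, x, y in known:
    by_site[site].append((x, y))

# Step 2: interpolate each query against its own site's data only.
queries = [("A", 25), ("A", 25), ("A", 26), ("B", 27),
           ("B", 30), ("B", 15), ("B", 25), ("B", 30)]
results = [linear_interpolate(by_site[site], x) for site, x in queries]
print(results)
```

With the sample data this reproduces the Y-Interpolated column shown in the question.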

Related

FORECAST / TREND with *really* simple data Excel - GSheets

I'm tracking something extremely simple; a week number, and body weight.
I can't get Excel or Google Sheets to use either of those functions to predict what the weight will be for the next 4 weeks.
I have a chart like this
1 185
2 184.3
3 186
4 189
5 183
6 186
7 188
etc
I need a prediction of 8, 9, 10, 11.
I've tried =FORECAST(X57,Y53:Y56,X53:X56) where the x57 is the next week number, but what happens is Excel/sheets starts counting at a lower number than the last weight. I got a weird negative number with TREND. I know I'm not doing something right. I tried switching the ranges too.
I've inserted a screenshot from Sheets.
I feel really stupid because I should be able to figure this out but it's been over an hour of scratching my head and getting frustrated. I don't have Excel 2016 with the Forecast graph function.
Am I doing this right, but because the numbers go up and down this is Excel/Sheet's best guess?
I placed your data in G1 through H7:
2nd Order Polynomial Trendline
Equation: y = (A * x^2) + (B * x) + C
A =INDEX(LINEST(y,x^{1,2}),1)
B =INDEX(LINEST(y,x^{1,2}),1,2)
C =INDEX(LINEST(y,x^{1,2}),1,3)
So in I1 through I3, enter these equations for the coefficients A, B and C:
=INDEX(LINEST(H1:H7,G1:G7^{1,2}),1)
=INDEX(LINEST(H1:H7,G1:G7^{1,2}),1,2)
=INDEX(LINEST(H1:H7,G1:G7^{1,2}),1,3)
Then enter 8 through 11 in column G. Then in H8 enter:
=$I$1*G8^2+$I$2*G8+$I$3
and copy down.
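For readers who want to check the coefficients outside a spreadsheet, here is a minimal Python sketch of a second-order least-squares fit. It assumes ordinary least squares (which is what a polynomial trendline is generally understood to use) and solves the normal equations directly. The name `quadratic_fit` is hypothetical.

```python
def quadratic_fit(xs, ys):
    """Least-squares coefficients (A, B, C) for y = A*x^2 + B*x + C."""
    def s(p):                     # sum of x^p
        return sum(x ** p for x in xs)
    def t(p):                     # sum of x^p * y
        return sum((x ** p) * y for x, y in zip(xs, ys))
    # Augmented matrix of the normal equations.
    m = [[s(4), s(3), s(2), t(2)],
         [s(3), s(2), s(1), t(1)],
         [s(2), s(1), s(0), t(0)]]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

weeks = [1, 2, 3, 4, 5, 6, 7]
weights = [185, 184.3, 186, 189, 183, 186, 188]
A, B, C = quadratic_fit(weeks, weights)
for week in (8, 9, 10, 11):
    print(week, A * week ** 2 + B * week + C)
```

The loop at the end plays the role of the copied-down formula in column H.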
With the data in A1:B7 you could either chart that, set a trendline (linear seems reasonable) and pick up the formula from the chart:
=0.3357*(A1)+184.56
(in C1 and copied down for comparison) or apply this in B8 and copy down:
=FORECAST(A8,B$1:B$7,A$1:A$7)
The known points are used for the chart on the left and the known points plus FORECAST ones for the chart on the right:
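The linear trendline and the FORECAST values can be cross-checked with a short Python sketch. It assumes FORECAST fits an ordinary least-squares straight line through the known points, and the helper name `linear_fit` is hypothetical.

```python
def linear_fit(xs, ys):
    """Least-squares slope and intercept for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

weeks = [1, 2, 3, 4, 5, 6, 7]
weights = [185, 184.3, 186, 189, 183, 186, 188]
slope, intercept = linear_fit(weeks, weights)

# Rounded, these agree with the chart trendline 0.3357*x + 184.56.
print(round(slope, 4), round(intercept, 2))

# Forecast for weeks 8 to 11, as FORECAST would give when copied down.
forecast = {week: slope * week + intercept for week in (8, 9, 10, 11)}
print(forecast)
```

Because the weights go up and down, the fitted line rises only slightly, so the forecast for week 8 can be lower than the last observed weight.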
I know you don't have Excel 2016, but here is how it would look if you insert a forecast from Excel's Data menu.
Select the cells and then click Data > Forecast Sheet. Change the Forecast End to 11. I left the Options at the defaults as displayed.
Next click Create and you will see the formulas for the forecast and the data in an Excel Table. You can inspect the FORECAST functions used, for example in column C:
=FORECAST.ETS(A9,$B$2:$B$8,$A$2:$A$8,1,1)

Excel IF OR Statement

I am having trouble determining the correct way to calculate a final rank order across four categories. Each of the four metrics makes up a higher group. A Top 10 for each category is applied to the respective product for risk analysis.
CURRENT LOGIC - Assignment of 25% max per category.
Columns - Y4
Parts
0.25
25
=IF(L9=1,$Y$4,IF(L9=2,$Y$4*0.9, IF(L9=3,$Y$4*0.8, IF(L9=4,$Y$4*0.7, IF(L9=5,$Y$4*0.6, IF(L9=6,$Y$4*0.5, IF(L9=7,$Y$4*0.4, IF(L9=8,$Y$4*0.3, IF(L9=9,$Y$4*0.2, IF(L9=10,$Y$4*0.1,0))))))))))
DESIRED...
I would like to use a statement that checks three criteria in order to apply a score (1=100, 2=90, 3=80, etc.).
SUM the rank positions of each of the four categories-apply product rank ascending (not including NULL since it's not in the Top 10)
IF a product is identified in more than one metric-apply a significant contribution weight of (*.75),
IF a product has the number 1 rank in any of the four metrics-apply a score of (100).
Data - UPDATED EXAMPLE
(Product) Parts Labor Overhead External Final Score
"XYZ" 3 1 7 7 100
"ABC" NULL 6 NULL 2 100
"LMN" 4 NULL NULL NULL 70
This is way beyond my capability. ANY assistance is appreciated greatly!!!
Jim
I figured this is a good start and I can alter the weight as needed to reflect the reality of the situation.
=AVERAGE(G28:I28)+SUM(G28:I28)*0.25
However, I couldn't figure out how to put a cap on the score of no more than 100 points.
I am still unclear on exactly what you are attempting and whether this will work, but how about this simple matrix using an array formula and some conditional formatting?
Array Formula in F2 (make sure to press Ctrl+Shift+Enter when exiting formula edit mode)
=MIN(100,SUM(IF(B2:E2<>"NULL",CHOOSE(B2:E2,100,90,80,70,60,50,40,30,20,10))))
Conditional Formatting defined as shown below.
Red = 100 value where it comes from a 1
Yellow = 100 value where it comes from more than 1 factor, but without a 1.
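To make the logic of the array formula easier to follow, here is a small Python sketch of the same calculation. It assumes ranks run from 1 to 10 and that a missing rank ("NULL") is represented as None. The name `final_score` is hypothetical.

```python
def final_score(ranks):
    """Sum the points for each rank (1 -> 100, 2 -> 90, ... 10 -> 10), capped at 100."""
    points = []
    for rank in ranks:
        if rank is None:            # "NULL": product is not in that Top 10
            continue
        if not 1 <= rank <= 10:
            raise ValueError("rank must be between 1 and 10")
        points.append(110 - 10 * rank)   # same mapping as CHOOSE(rank,100,90,...,10)
    return min(100, sum(points))

# Example rows from the question: Parts, Labor, Overhead, External.
products = {
    "XYZ": [3, 1, 7, 7],
    "ABC": [None, 6, None, 2],
    "LMN": [4, None, None, None],
}
scores = {name: final_score(ranks) for name, ranks in products.items()}
print(scores)
```

These match the Final Score column in the updated example: 100, 100 and 70.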

Excel - Multiply Until Total Reached

I want to multiply x*y until the total is >= 20, then multiply z by that multiple and have the results displayed as two values: the multiple and multiple*z.
The question behind the formula is, how many boxes of x capacity do I need to have a total capacity of 20 liters and how much does that cost.
x = volume of bottle
y = number of bottles in a box
z = price per box
This could be done very easily by hand, but I've been playing (with little effect) in excel for a while and would like a solution.
I hope that makes sense
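Read literally, the question amounts to a ceiling division. A minimal Python sketch follows, assuming the target is 20 liters and that x, y and z are as defined above. The function name `boxes_needed` and the sample numbers are made up for illustration.

```python
import math

def boxes_needed(bottle_volume, bottles_per_box, price_per_box, target=20):
    """Return (number of boxes, total cost) to reach at least `target` liters."""
    box_capacity = bottle_volume * bottles_per_box      # x * y
    boxes = math.ceil(target / box_capacity)            # smallest multiple reaching target
    return boxes, boxes * price_per_box                 # multiple, multiple * z

# Hypothetical example: 0.5 liter bottles, 12 per box, 30 per box.
print(boxes_needed(0.5, 12, 30))
```

In this hypothetical case each box holds 6 liters, so 4 boxes are needed at a total cost of 120.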
I rather think what you would like is the formula provided by @Jeeped but for:
I want to multiply x*y until the total is >= 20, then multiply z by that multiple and have the results displayed as two values: the multiple and multiple*z.
label two arrays from 1 to 20 for columns and rows as shown, populate V1 with the price per box and in B2:
=IF(AND($A2*B$1>20,A2>=20),"",$A2*B$1)
and in X2:
=IF(B2="","",$V$1*B2)
with both formulae copied across 19 columns and those two sets of 20 formulae then copied down 19 rows. The result should be similar to:

Need to make a calculation based on multiple lookup results on another Excel sheet

I need to get a result based on up to 3 lookups in Excel.
I have a price for a certain size hire vehicle and need to check to see if I want to add in a supplement or not based on entry into a cell in another sheet.
I have a sheet called Keys that holds the criteria I base my calculations on, and a second sheet where the rates are loaded for all the vehicle sizes available, from cars to coaches. I would like to calculate the supplement for customers who move to a larger vehicle, or a reduction, depending on what I choose.
Keys data is:
Vehicle Sizes
Range #   Seats   Rate Column   Supplement   Range to work on
1         4       R             N
2         7       S             Y            1
3         16      T             N
4         24      U             Y            5
5         29      V             N
6         35      W             N
7         45      X             N
So, for example, if I have chosen to calculate the supplement on the 7 seater, then I want to calculate the difference between the 7 seater and the 4 seater, and that is my supplement. I have also chosen to calculate the reduction between the 29 and 24 seater vehicles.
I am trying to figure out how to combine multiple IF and LOOKUP functions, and whether they are the right approach.
So basically IF I have a Y in the supplement column on Keys then calculate the difference in the rates based on the Rate Column based on the Range to work on.
Any suggestions or help appreciated
Sorry, I think I forgot about the actual rates. They are stored on another sheet as shown below. The charges are per service, like an airport transfer, and they are in VN Dong, which is why they are in the 100,000+ range.
R S T U V W X
Rate with Surcharge
4 7 16 24 29 35 45
340000 373000 394000 735000 780000 1050000 1210000
I have tried to tweak the answer from pnuts but am getting a bit lost; I'm not sure if I need the MATCH in the formula or not.
I doubt this will suit but it may help to clarify your requirements:
=IF($D2="N","",INDEX(Sheet2!$Q$2:$X$4,MATCH(F$1,Sheet2!$Q$2:$Q$4,0),CODE($C2)-80)-INDEX(Sheet2!$Q$2:$X$4,MATCH(F$1,Sheet2!$Q$2:$Q$4,0),CODE($C2)+$E2-$A1-81))
in F2 copied across and down to suit.
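To pin down the requirement, here is a Python sketch of the supplement rule as described in the question. It assumes the supplement is the row's own rate minus the rate of the range named in "Range to work on" (so a negative result is a reduction). That sign convention, and the names `keys`, `rates` and `supplement`, are assumptions for illustration only.

```python
# Keys sheet: range number -> (seats, rate column, supplement flag, range to work on)
keys = {
    1: (4, "R", "N", None),
    2: (7, "S", "Y", 1),
    3: (16, "T", "N", None),
    4: (24, "U", "Y", 5),
    5: (29, "V", "N", None),
    6: (35, "W", "N", None),
    7: (45, "X", "N", None),
}

# Rates sheet: rate column -> rate with surcharge (VN Dong).
rates = {"R": 340000, "S": 373000, "T": 394000, "U": 735000,
         "V": 780000, "W": 1050000, "X": 1210000}

def supplement(range_no):
    """Rate difference against the reference range, or None when the flag is N."""
    _, column, flag, reference = keys[range_no]
    if flag != "Y":
        return None
    reference_column = keys[reference][1]
    return rates[column] - rates[reference_column]

print(supplement(2))   # 7 seater against the 4 seater
print(supplement(4))   # 24 seater against the 29 seater
```

Under these assumptions the 7 seater carries a supplement of 33000 and the 24 seater a reduction of 45000.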

How to calculate average of a column of numbers linked to each frequency bin making up a histogram, Excel 2010?

I have three columns. Column A consists of numbers, column B consists of bin ranges, and column C consists of number data relevant to the individual data in column A.
Using columns A and B, I created a frequency histogram where all the data in column A have been grouped into the bins of column B. I would like to calculate the average value of each bin using the data from column C (i.e., calculate a mean value for each bin using data from column C that is associated to each value (from column A) that made up each bin).
Can anybody help?
Thanks for the replies. Here is an example of the data (Unfortunately I can not paste in images):
Below are three columns with headers Jar Type (volume in ml), Cookies (the number of chocolate chip cookies in the jar), and Interval for bins (bins to count the jar types):
Jar type-cookies-intervals for bins
500 3 100
500 1 150
500 0.5 200
250 3 250
150 1 300
500 1 350
150 2 400
250 2 450
### # 500
Making a histogram of the frequency of jar types gives this grouping:
Bin-Frequency
100 0
150 2
200 0
250 2
300 0
350 0
400 0
450 0
500 4
More 0
Now what I am trying to do is find the mean number of cookies in each type of jar. For example, there are 4 x 500ml jars, and together they hold 3+1+0.5+1 = 5.5 cookies, so the mean would be 1.375 cookies.
My issue is that I have 5000+ numbers that separate into 100 bins.
The question calls for a "wandering trace" of a scatterplot: the values of column A (plot them on the horizontal axis) are placed into bins, which therefore comprise vertical strips in the scatterplot. The values of column C (plotted on the vertical axis) are averaged within each strip. This technique smooths out and summarizes apparent trends in the scatterplot.
In this example with 100 records the original data are in black and computed values are in green. Here is the wandering trace of means:
The open circles plot column C (associated values) against column A (data) while the solid squares, connected with a dashed red trace, plot the bin means (column G) against the midpoints (column F).
Any statistical package will provide functions for grouping data and performing operations on those groups. Excel does this to a limited extent with its SUMIF and COUNTIF functions. To use them, create a column (D in the spreadsheet) showing the grouping factor. (That's a simple lookup in the sorted BINS vector using the VLOOKUP function with its "range" option set to true.) SUMIF computes sums by group factor and COUNTIF counts by group factor. Their ratios are the bin means.
Here is what the formulas look like:
Only three formulas were actually entered and then copied down as needed:
=VLOOKUP(A2, Bins, 1, TRUE) computes the group for the value in cell A2. Bins is a name for the array $(-2,-3, \ldots, 3)$ in column B.
=AVERAGE(B3:B4) computes the midpoint of the first bin. This was used as a horizontal plotting position in the scatterplot.
=SUMIF(Bin,"="&B3,NewValues)/COUNTIF(Bin, "="&B3) is where all the work is done. Bin refers to the group codes in column D and NewValues refers to the associated values in column C. The tricky parts are the constructs "="&B3: these form a text value instructing the data to be grouped by comparison to the number in cell B3, which is the first endpoint. Because this is a formula, copying it down automatically updates the B3 to B4, then B5, etc.
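The same group-then-average idea can be sketched in Python using the jar example from the question. The lookup below mimics an approximate-match VLOOKUP, that is, each value is assigned to the largest bin value that does not exceed it. This is an assumption about the binning convention, and Excel's histogram tool may assign values on bin boundaries differently. The name `bin_means` is hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

def bin_means(values, associated, bins):
    """Mean of the associated values for each bin that receives data."""
    edges = sorted(bins)
    totals = defaultdict(float)   # plays the role of SUMIF
    counts = defaultdict(int)     # plays the role of COUNTIF
    for value, extra in zip(values, associated):
        i = bisect_right(edges, value) - 1   # largest edge <= value
        if i < 0:
            continue                         # below the first bin (VLOOKUP would give #N/A)
        totals[edges[i]] += extra
        counts[edges[i]] += 1
    return {b: totals[b] / counts[b] for b in counts}

jar_type = [500, 500, 500, 250, 150, 500, 150, 250]
cookies = [3, 1, 0.5, 3, 1, 1, 2, 2]
bins = list(range(100, 501, 50))
means = bin_means(jar_type, cookies, bins)
print(means)
```

For the sample data this gives a mean of 1.375 cookies for the 500 ml jars, 2.5 for 250 ml and 1.5 for 150 ml. The same loop handles 5000+ values and 100 bins without change.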
