VLOOKUP exact value and closest date - excel

I have several water tanks and two employees who measure them. Sometimes they measure the volume of a tank on the same day and sometimes not. I want to see how much their measurements differ. Because the dates are not always the same, I would like to look up the volume of the exact same tank on the date closest to each of Bob's reading dates.
Bob's Readings
Water_Tank_Name  Date        Volume
Red              15/02/2021  300
Blue             15/02/2021  145
Red              21/02/2021  280
Red              04/03/2021  339
Blue             05/03/2021  170
Sarah's Readings
Water_Tank_Name  Date        Volume
Blue             15/02/2021  148
Blue             19/02/2021  190
Red              23/02/2021  294
Blue             01/03/2021  140
I used XLOOKUP, but that only returns a value when both the Water_Tank_Name and the Date match exactly. However, I would like to match the Water_Tank_Name exactly and match the closest Date.
=XLOOKUP(Bob!A2 & Bob!B2, Sarah!A:A & Sarah!B:B, Sarah!C:C)

You could use (with Excel 365):
=LET( tf, Bob!A2, df, Bob!B2,
tS, Sarah!A:A, dS,Sarah!B:B, dV, Sarah!C:C,
L, tS & dS,
S, SIGN(ABS(IFERROR(XLOOKUP(tf & df, L, dS,,-1)-df, 999)) - ABS(IFERROR(XLOOKUP(tf & df, L, dS,,1)-df, 999))),
XLOOKUP(tf & df, L, dV,,S) )
Where tf is the tank identity that you want to use for the search and df is the date value that you want to search for. The formula finds the nearest date, determines whether it is smaller or larger than df, and then tells the final XLOOKUP to search for the next smaller or next larger item (S is -1 or 1; SIGN returns 0 only when both neighbours are equally far away, which makes XLOOKUP look for an exact match). It might be possible to replace the two XLOOKUPs for S with FILTERs, but I am not sure if it would be faster. The whole-column references to Sarah should be replaced with Excel table columns; otherwise the formula will run slowly.
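To check what the formula should return, the same idea (exact tank, closest date) can be sketched outside Excel. This is only an illustrative Python sketch using Sarah's readings from the question; the names `SARAH` and `nearest_volume` are hypothetical, and breaking ties in favour of the earlier row is an assumption, not something the formula above does.

```python
from datetime import date

# Sarah's readings from the question: (tank, date, volume)
SARAH = [
    ("Blue", date(2021, 2, 15), 148),
    ("Blue", date(2021, 2, 19), 190),
    ("Red", date(2021, 2, 23), 294),
    ("Blue", date(2021, 3, 1), 140),
]


def nearest_volume(tank, day, readings=SARAH):
    """Return the volume for `tank` whose date is closest to `day`, or None."""
    # Exact match on the tank name first.
    same_tank = [r for r in readings if r[0] == tank]
    if not same_tank:
        return None
    # Then pick the smallest absolute distance in days.
    # min() keeps the first minimum, so a tie goes to the earlier row (assumption).
    best = min(same_tank, key=lambda r: abs((r[1] - day).days))
    return best[2]


# Bob's Red reading on 15/02/2021 pairs with Sarah's only Red reading (23/02/2021).
print(nearest_volume("Red", date(2021, 2, 15)))
# Bob's Blue reading on 05/03/2021 pairs with Sarah's Blue reading on 01/03/2021.
print(nearest_volume("Blue", date(2021, 3, 5)))
```

Comparing these values with the spreadsheet result for a few rows is a quick way to confirm the lookup is not crossing over to another tank.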

I am almost positive there is something that is less verbose than this monstrosity, but it works and makes sense if you take it apart piece by piece.
=IFERROR(FILTER($EG$12:$EG$15,($EE$12:$EE$15=$EE5)*($EF$12:$EF$15=$EF5+MIN(ABS(EF5-FILTER($EF$12:$EF$15,$EE$12:$EE$15=$EE5))))),FILTER($EG$12:$EG$15,($EE$12:$EE$15=$EE5)*($EF$12:$EF$15=$EF5-MIN(ABS(EF5-FILTER($EF$12:$EF$15,$EE$12:$EE$15=$EE5))))))

Related

Find value with the biggest age difference

I have a spreadsheet containing data from hospital patients. I'm running Excel 2021.
I need to create a function (or a macro) that tells me how many people live in the household that has the biggest age difference between the oldest and the youngest person. This is what my data looks like:
EDIT: I've changed the screenshot of the data for a table so it's easier to work with.
hserial  hhsize  age
101051   1       92
101151   1       63
101201   1       56
101271   2       38
101271   2       25
101351   3       37
101351   3       14
101351   3       10
101371   2       35
101371   2       29
where :
age: age of the patient
hserial: serial number of household. This is how we identify a household.
hhsize: household size
I was thinking of maybe using the FILTER function and finding the maximum of the differences between the oldest and youngest person in each household.
You can try the following in E2 cell for O365:
=LET(hs, A2:A10, hsize, B2:B10, age, C2:C10, ux, UNIQUE(hs),
diff, MAP(ux, LAMBDA(u, LET(f, FILTER(age, hs=u), MAX(f)-MIN(f)))),
x, XLOOKUP(MAX(diff), diff, ux), INDEX(hsize, XMATCH(x, hs)))
Instead of INDEX/XMATCH, you can use XLOOKUP(x, hs, hsize).
Excel 2021 does not have MAP, but you can use the following approach, which replaces the second line of the previous formula with MAXIFS/MINIFS. It uses XLOOKUP instead of INDEX/XMATCH, although either works:
=LET(hs, A2:A10, hsize, B2:B10, age, C2:C10, ux, UNIQUE(hs),
diff, MAXIFS(age,hs, ux) - MINIFS(age,hs, ux),
x, XLOOKUP(MAX(diff), diff, ux), XLOOKUP(x, hs, hsize))
Here is the output for the first formula, for the second you get the same result:
Excel 2021 does not have the LAMBDA functions, but you can still do this using the following:
=LET(rng, A2:C11,
hs, INDEX(rng,,1),
hh, INDEX(rng,,2),
age, INDEX(rng,,3),
mx, MAXIFS(age,hs,hs),
mn, MINIFS(age,hs,hs),
diff, mx-mn,
INDEX(hh,XMATCH(MAX(diff),diff)))
Or, if several different hhsize values could share the same maximum age difference:
=LET(rng, A2:C11,
hs, INDEX(rng,,1),
hh, INDEX(rng,,2),
age, INDEX(rng,,3),
mx, MAXIFS(age,hs,hs),
mn, MINIFS(age,hs,hs),
diff, mx-mn,
UNIQUE(FILTER(hh,diff=MAX(diff))))
It takes the full range A2:C11 and splits it into separate named ranges: hs for hserial, hh for hhsize, and age.
Then it calculates mx, the conditional maximum of age over the rows whose hs value equals the hs value of the current row.
mn is the same, but with the minimum.
Then diff is the array of differences between mx and mn.
Finally, either INDEX/XMATCH or FILTER returns the hh value in the row(s) where diff equals its maximum.
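The steps described above (group by hserial, take max minus min of age, return the hhsize of the widest gap) can be cross-checked with a short Python sketch. It is only an illustration on the ten rows from the question; `ROWS` and `hhsize_of_widest_age_gap` are hypothetical names, and returning the first household on a tie is an assumption.

```python
from collections import defaultdict

# Rows from the question: (hserial, hhsize, age)
ROWS = [
    (101051, 1, 92),
    (101151, 1, 63),
    (101201, 1, 56),
    (101271, 2, 38),
    (101271, 2, 25),
    (101351, 3, 37),
    (101351, 3, 14),
    (101351, 3, 10),
    (101371, 2, 35),
    (101371, 2, 29),
]


def hhsize_of_widest_age_gap(rows=ROWS):
    """Return hhsize of the household with the largest (max age - min age)."""
    ages = defaultdict(list)  # hserial -> list of ages
    size = {}                 # hserial -> hhsize
    for hserial, hhsize, age in rows:
        ages[hserial].append(age)
        size[hserial] = hhsize
    # max() returns the first household reaching the maximum gap (assumption on ties).
    widest = max(ages, key=lambda h: max(ages[h]) - min(ages[h]))
    return size[widest]


print(hhsize_of_widest_age_gap())
```

On the sample data the widest gap is in household 101351 (37 - 10 = 27), so the sketch returns its hhsize, 3.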

Find a temperature and work out how long it remained >= this temperature

I have an Excel sheet with times in one column and temperatures in another. I'm trying to work out a formula that will find a certain temperature and measure how long it remained at or above that temperature.
11:25:29 AM 69.3°C
11:26:29 AM 69.6°C
11:27:29 AM 69.8°C
11:28:29 AM 70.0°C
11:29:29 AM 70.2°C
11:35:29 AM 70.8°C
11:36:29 AM 70.3°C
11:37:29 AM 69.5°C
11:38:29 AM 68.5°C
11:39:29 AM 67.5°C
12:39:29 PM 66.3°C
1:39:29 PM 52.1°C
2:39:29 PM 12.1°C
3:39:29 PM 5.0°C
In this example, I would like to find when it hit 70.0°C and how long it stayed above 70.0°C.
This is a bit of a tough problem because you might have multiple occasions where you go above 70 degrees. In that case, do you want the total time spent above 70 in the entire dataset, or do you want the total time spent above 70 consecutively? And then, how are you determining which of these potential multiple nonconsecutive periods you are talking about?
That said, you can try this. If column A is your datetime, and column B is your temp reading, specify another cell as your temperature reference value ($D$1 here), and in column C starting in row 2 do this:
=(A2-A1)*IF(B2>=$D$1,1,0)
and then copy that all the way down. This calculates the time difference between consecutive measurements; if the temperature at that time is greater than or equal to your reference, it multiplies the difference by 1, otherwise by 0. Because a date/time in Excel is really just a number, each cell of column C holds the interval between measurements as a fraction of a day. In other words, 0.25 = 6 hours.
Now that you have that data in column C, you are free to further parse it. You can use a simple SUM(C:C) formula in a cell, or you can go back and sum up individual ranges. I hope this helps.
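As a sanity check on the column C approach, here is a minimal Python sketch of the same calculation on the sample data from the question. The names `READINGS` and `time_at_or_above` are hypothetical. Like the formula, it credits each interval to the reading at its end, which is a simplification rather than an exact crossing time.

```python
from datetime import datetime, timedelta

# Sample data from the question: (time of day, temperature in °C)
READINGS = [
    ("11:25:29 AM", 69.3), ("11:26:29 AM", 69.6), ("11:27:29 AM", 69.8),
    ("11:28:29 AM", 70.0), ("11:29:29 AM", 70.2), ("11:35:29 AM", 70.8),
    ("11:36:29 AM", 70.3), ("11:37:29 AM", 69.5), ("11:38:29 AM", 68.5),
    ("11:39:29 AM", 67.5), ("12:39:29 PM", 66.3), ("1:39:29 PM", 52.1),
    ("2:39:29 PM", 12.1), ("3:39:29 PM", 5.0),
]


def time_at_or_above(readings, threshold):
    """Sum the intervals that end at a reading >= threshold.

    Mirrors =(A2-A1)*IF(B2>=$D$1,1,0) summed over column C.
    """
    parsed = [(datetime.strptime(t, "%I:%M:%S %p"), temp) for t, temp in readings]
    total = timedelta()
    # Pair every reading with the one before it, as the formula pairs A2 with A1.
    for (prev_time, _), (cur_time, cur_temp) in zip(parsed, parsed[1:]):
        if cur_temp >= threshold:
            total += cur_time - prev_time
    return total


print(time_at_or_above(READINGS, 70.0))
```

For the sample data, four intervals qualify (1 + 1 + 6 + 1 minutes), giving 9 minutes in total, which corresponds to 9/1440 = 0.00625 of a day in Excel.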

Excel IF OR Statement

I am having trouble determining the correct way to calculate a final rank order across four categories. Each of the four metrics makes up a higher group. A Top 10 of each category is applied to the respective product for risk analysis.
CURRENT LOGIC - Assignment of 25% max per category.
Columns - Y4
Parts
0.25
25
=IF(L9=1,$Y$4,IF(L9=2,$Y$4*0.9, IF(L9=3,$Y$4*0.8, IF(L9=4,$Y$4*0.7, IF(L9=5,$Y$4*0.6, IF(L9=6,$Y$4*0.5, IF(L9=7,$Y$4*0.4, IF(L9=8,$Y$4*0.3, IF(L9=9,$Y$4*0.2, IF(L9=10,$Y$4*0.1,0))))))))))
DESIRED...
I would like to use a statement to determine three criteria in order to apply a score (1=100, 2=90, 3=80, etc..).
SUM the rank positions of each of the four categories-apply product rank ascending (not including NULL since it's not in the Top 10)
IF a product is identified in more than one metric-apply a significant contribution weight of (*.75),
IF a product has the number 1 rank in any of the four metrics-apply a score of (100).
Data - UPDATED EXAMPLE
(Product) Parts Labor Overhead External Final Score
"XYZ" 3 1 7 7 100
"ABC" NULL 6 NULL 2 100
"LMN" 4 NULL NULL NULL 70
This is way beyond my capability. ANY assistance is appreciated greatly!!!
Jim
I figured this is a good start and I can alter the weight as needed to reflect the reality of the situation.
=AVERAGE(G28:I28)+SUM(G28:I28)*0.25
However, I couldn't figure out how to put a cap on the score of no more than 100 points.
I am still unclear on exactly what you are attempting and whether this will work, but how about this simple matrix using an array formula and some conditional formatting?
Array Formula in F2 (make sure to press Ctrl+Shift+Enter when exiting formula edit mode)
=MIN(100,SUM(IF(B2:E2<>"NULL",CHOOSE(B2:E2,100,90,80,70,60,50,40,30,20,10))))
Conditional Formatting defined as shown below.
Red = 100 value where it comes from a 1
Yellow = 100 value where it comes from more than 1 factor, but without a 1.
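The array formula above can be cross-checked with a small Python sketch. It only mirrors what that formula does (rank 1 scores 100, rank 2 scores 90, down to rank 10 scoring 10, NULL is skipped, and the sum is capped at 100); it does not implement the 0.75 contribution weight mentioned in the question. `final_score` is a hypothetical name and `None` stands in for NULL.

```python
def final_score(ranks):
    """Score a product from its ranks in the four categories.

    `ranks` holds integers 1..10, or None where the product is not in the Top 10.
    Equivalent to MIN(100, SUM(CHOOSE(rank, 100, 90, ..., 10))) over non-NULL ranks.
    """
    total = sum(110 - 10 * r for r in ranks if r is not None)
    return min(100, total)


# Rows from the updated example: Parts, Labor, Overhead, External
print(final_score([3, 1, 7, 7]))           # "XYZ"
print(final_score([None, 6, None, 2]))     # "ABC"
print(final_score([4, None, None, None]))  # "LMN"
```

These reproduce the Final Score column of the example (100, 100 and 70).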

Excel - Multiply Until Total Reached

I want to multiply x*y until x>=20, then multiply z by that value and have the results displayed as two values: the multiple and multiple*z.
The question behind the formula is, how many boxes of x capacity do I need to have a total capacity of 20 liters and how much does that cost.
x = volume of bottle
y = number of bottles in a box
z = price per box
This could be done very easily by hand, but I've been playing (with little effect) in excel for a while and would like a solution.
I hope that makes sense
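Read as "how many boxes are needed to reach 20 liters, and what do they cost", the calculation can be sketched as follows. This is a hedged Python illustration, not a spreadsheet solution: `boxes_needed` is a hypothetical name, volumes are assumed to be in liters, and the example figures are made up for illustration.

```python
import math


def boxes_needed(bottle_volume, bottles_per_box, price_per_box, target=20):
    """Return (number of boxes, total cost) to reach at least `target` liters.

    bottle_volume   -> x, bottles_per_box -> y, price_per_box -> z
    """
    box_capacity = bottle_volume * bottles_per_box  # liters per box (x*y)
    boxes = math.ceil(target / box_capacity)        # smallest whole number of boxes
    return boxes, boxes * price_per_box


# Illustrative values only: 1.5 l bottles, 6 per box, 4.0 per box.
print(boxes_needed(1.5, 6, 4.0))
```

In a worksheet the same rounding-up step would typically be done with a ceiling or round-up function on 20/(x*y).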
I rather think what you would like is the formula provided by @Jeeped, but for:
I want to multiply x*y until x>=20, then multiply z by that value and have the results displayed as two values: the multiple and multiple*z.
Label two arrays from 1 to 20 for columns and rows as shown, populate V1 with the price per box, and in B2 enter:
=IF(AND($A2*B$1>20,A2>=20),"",$A2*B$1)
and in X2:
=IF(B2="","",$V$1*B2)
with both formulae copied across 19 columns and those two sets of 20 formulae then copied down 19 rows. The result should be similar to:

How to calculate average of a column of numbers linked to each frequency bin making up a histogram, Excel 2010?

I have three columns. Column A consists of numbers, column B consists of bin ranges, and column C consists of number data relevant to the individual data in column A.
Using columns A and B, I created a frequency histogram where all the data in column A have been grouped into the bins of column B. I would like to calculate the average value of each bin using the data from column C (i.e., calculate a mean value for each bin using data from column C that is associated to each value (from column A) that made up each bin).
Can anybody help?
Thanks for the replies. Here is an example of the data (Unfortunately I can not paste in images):
Below are three columns with headers Jar Type (volume in ml), Cookies (the number of chocolate chip cookies in the jar), and Interval for bins (bins to count the jar types):
Jar type-cookies-intervals for bins
500 3 100
500 1 150
500 0.5 200
250 3 250
150 1 300
500 1 350
150 2 400
250 2 450
### # 500
Making a histogram of the frequency of jar types gives this grouping:
Bin-Frequency
100 0
150 2
200 0
250 2
300 0
350 0
400 0
450 0
500 4
More 0
Now what I am trying to do is find the mean number of cookies in each type of jar. For example, for the 500 ml bin we know that there are 4 x 500 ml jars, and that those jars hold 3+1+0.5+1 = 5.5 cookies in total. The mean would be 5.5/4 = 1.375 cookies.
My issue is that I have 5000+ numbers that separate into 100 bins.
The question calls for a "wandering trace" of a scatterplot: the values of column A (plot them on the horizontal axis) are placed into bins, which therefore comprise vertical strips in the scatterplot. The values of column C (plotted on the vertical axis) are averaged within each strip. This technique smooths out and summarizes apparent trends in the scatterplot.
In this example with 100 records the original data are in black and computed values are in green. Here is the wandering trace of means:
The open circles plot column C (associated values) against column A (data) while the solid squares, connected with a dashed red trace, plot the bin means (column G) against the midpoints (column F).
Any statistical package will provide functions for grouping data and performing operations on those groups. Excel does this to a limited extent with its SUMIF and COUNTIF functions. To use them, create a column (D in the spreadsheet) showing the grouping factor. (That's a simple lookup in the sorted BINS vector using the VLOOKUP function with its "range" option set to true.) SUMIF computes sums by group factor and COUNTIF counts by group factor. Their ratios are the bin means.
Here is what the formulas look like:
Only three formulas were actually entered and then copied down as needed:
=VLOOKUP(A2, Bins, 1, TRUE) computes the group for the value in cell A2. Bins is a name for the array $(-2,-3, \ldots, 3)$ in column B.
=AVERAGE(B3:B4) computes the midpoint of the first bin. This was used as a horizontal plotting position in the scatterplot.
=SUMIF(Bin,"="&B3,NewValues)/COUNTIF(Bin, "="&B3) is where all the work is done. Bin refers to the group codes in column D and NewValues refers to the associated values in column C. The tricky parts are the constructs "="&B3: these form a text value instructing the data to be grouped by comparison to the number in cell B3, which is the first endpoint. Because this is a formula, copying it down automatically updates the B3 to B4, then B5, etc.
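The VLOOKUP plus SUMIF/COUNTIF recipe above can be mirrored in a few lines of Python to check the bin means. This is a sketch under assumptions: it uses the jar and cookie data from the question rather than the 100-record example, `BINS`, `JARS` and `bin_means` are hypothetical names, and `bisect_right` is used to imitate VLOOKUP's approximate match (largest bin value not exceeding the lookup value).

```python
from bisect import bisect_right
from collections import defaultdict

# Sorted bin values (column B) and data rows (column A value, column C value)
BINS = [100, 150, 200, 250, 300, 350, 400, 450, 500]
JARS = [(500, 3), (500, 1), (500, 0.5), (250, 3),
        (150, 1), (500, 1), (150, 2), (250, 2)]


def bin_means(data, bins):
    """Return {bin: mean of the associated values whose data value falls in that bin}."""
    groups = defaultdict(list)
    for value, associated in data:
        # Largest bin <= value, like VLOOKUP(value, Bins, 1, TRUE) on a sorted vector.
        i = bisect_right(bins, value) - 1
        if i < 0:
            continue  # below the first bin; VLOOKUP would return #N/A here
        groups[bins[i]].append(associated)
    # Sum divided by count per group, i.e. SUMIF / COUNTIF.
    return {b: sum(vals) / len(vals) for b, vals in groups.items()}


print(bin_means(JARS, BINS))
```

For the jar data this gives 1.375 for the 500 bin (5.5 cookies over 4 jars), 2.5 for the 250 bin and 1.5 for the 150 bin; empty bins are simply absent from the result.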

Resources