I have a spreadsheet containing data from hospital patients. I'm running Excel 2021.
I need to create a function (or a macro) that tells me how many people live in the household with the biggest age difference between its oldest and youngest member. This is what my data looks like:
EDIT: I've replaced the screenshot of the data with a table so it's easier to work with.
hserial  hhsize  age
101051   1       92
101151   1       63
101201   1       56
101271   2       38
101271   2       25
101351   3       37
101351   3       14
101351   3       10
101371   2       35
101371   2       29
where:
age: age of the patient
hserial: serial number of household. This is how we identify a household.
hhsize: household size
I was thinking of maybe using the FILTER function to get the difference between the oldest and youngest person in each household, and then taking the maximum of those differences.
For O365, you can try the following in cell E2:
=LET(hs, A2:A11, hsize, B2:B11, age, C2:C11, ux, UNIQUE(hs),
diff, MAP(ux, LAMBDA(u, LET(f, FILTER(age, hs=u), MAX(f)-MIN(f)))),
x, XLOOKUP(MAX(diff), diff, ux), INDEX(hsize, XMATCH(x, hs)))
Instead of INDEX/XMATCH, you can use XLOOKUP(x, hs, hsize).
Excel 2021 doesn't have MAP, but you can replace the second line of the previous formula with MAXIFS/MINIFS. This version also uses XLOOKUP instead of INDEX/XMATCH, though either works:
=LET(hs, A2:A11, hsize, B2:B11, age, C2:C11, ux, UNIQUE(hs),
diff, MAXIFS(age,hs, ux) - MINIFS(age,hs, ux),
x, XLOOKUP(MAX(diff), diff, ux), XLOOKUP(x, hs, hsize))
Both formulas return the same result; for the sample data that is 3 (household 101351, where the age difference is 37 - 10 = 27).
Excel 2021 does not have LAMBDA functions, but you can still do this with the following:
=LET(rng, A2:C11,
hs, INDEX(rng,,1),
hh, INDEX(rng,,2),
age, INDEX(rng,,3),
mx, MAXIFS(age,hs,hs),
mn, MINIFS(age,hs,hs),
diff, mx-mn,
INDEX(hh,XMATCH(MAX(diff),diff)))
Or, if several different hhsize values could share the same maximum age difference:
=LET(rng, A2:C11,
hs, INDEX(rng,,1),
hh, INDEX(rng,,2),
age, INDEX(rng,,3),
mx, MAXIFS(age,hs,hs),
mn, MINIFS(age,hs,hs),
diff, mx-mn,
UNIQUE(FILTER(hh,diff=MAX(diff))))
It takes the full range A2:C11 and splits it into separate named ranges: hs for hserial, hh for hhsize and age for age.
Then it calculates mx, the conditional maximum of age over the rows whose hs value equals that row's own hs value.
mn is the same, but with the minimum.
Then diff is the array of differences between mx and mn.
Finally, either INDEX/XMATCH or FILTER returns the hh value in the row(s) where diff is at its maximum.
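If you want to cross-check the result outside Excel, the same steps (group by hserial, take max minus min of age, return the hhsize of the widest group) can be sketched in Python. This is only an illustration; the function name hhsize_of_widest_age_gap is made up here and is not part of either answer.

```python
from collections import defaultdict

def hhsize_of_widest_age_gap(rows):
    """rows: iterable of (hserial, hhsize, age) tuples (hypothetical layout)."""
    ages = defaultdict(list)   # hserial -> all ages in that household
    size = {}                  # hserial -> hhsize (taken from the first row seen)
    for hserial, hhsize, age in rows:
        ages[hserial].append(age)
        size.setdefault(hserial, hhsize)
    # Household whose max(age) - min(age) is largest; the first one wins on ties.
    widest = max(ages, key=lambda h: max(ages[h]) - min(ages[h]))
    return size[widest]

rows = [
    (101051, 1, 92), (101151, 1, 63), (101201, 1, 56),
    (101271, 2, 38), (101271, 2, 25),
    (101351, 3, 37), (101351, 3, 14), (101351, 3, 10),
    (101371, 2, 35), (101371, 2, 29),
]
print(hhsize_of_widest_age_gap(rows))  # 3
```

On the sample table the widest gap is household 101351 (37 - 10 = 27), so the sketch returns 3.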
I have the following sample data.
Date Category Price Quantity
02-01-2019 BASE_Y-20 279 1
02-01-2019 BASE_Y-21 271.25 0
03-01-2019 BASE_Y-20 276.5 2
03-01-2019 BASE_Y-21 266.5 0
04-01-2019 BASE_Y-20 272.88 14
04-01-2019 BASE_Y-21 266.5 1
07-01-2019 BASE_Y-20 270.48 29
07-01-2019 BASE_Y-21 262.75 0
08-01-2019 BASE_Y-20 270 4
08-01-2019 BASE_Y-21 264 0
09-01-2019 BASE_Y-20 270.06 31
09-01-2019 BASE_Y-21 262.85 0
What is a dynamic formula that I can use to return the last 5 prices corresponding to category BASE_Y-20? If fewer than 5 values are present, the formula must return whatever prices are available, which is the challenging part. (E.g., for the given data, 270.06, 270, 270.48, 272.88 and 276.5 must be returned. If we only had the 1st row, it must return 279.)
I have tried SUMPRODUCT, which of course gives the corresponding prices, and OFFSET can be used to get the last 5 rows. But I have found no dynamic way to get the last 5 prices for a specific category.
You can try:
Formula in F3:
=TAKE(SORT(FILTER(A:C,B:B=F1),1),-F2,-1)
A few notes:
The latest price will be at the bottom;
If your data is always sorted to begin with, just ditch the nested SORT() and use =TAKE(FILTER(A:C,B:B=F1),-F2,-1);
If no value is present at all, wrap the formula in =IFERROR(<Formula>,"") and replace "" with whatever you'd like to display in that case.
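To see the "filter, then take from the bottom" idea in isolation, here is a rough Python equivalent. It assumes the rows are already in date order, as in the second note above; the name last_prices is invented for this sketch.

```python
def last_prices(rows, category, count=5):
    """rows: (date, category, price) tuples, assumed already sorted by date."""
    matches = [price for _, cat, price in rows if cat == category]
    # Like TAKE with a negative count: returns fewer items if fewer exist.
    return matches[-count:] if count > 0 else []

rows = [
    ("02-01-2019", "BASE_Y-20", 279),    ("02-01-2019", "BASE_Y-21", 271.25),
    ("03-01-2019", "BASE_Y-20", 276.5),  ("03-01-2019", "BASE_Y-21", 266.5),
    ("04-01-2019", "BASE_Y-20", 272.88), ("04-01-2019", "BASE_Y-21", 266.5),
    ("07-01-2019", "BASE_Y-20", 270.48), ("07-01-2019", "BASE_Y-21", 262.75),
    ("08-01-2019", "BASE_Y-20", 270),    ("08-01-2019", "BASE_Y-21", 264),
    ("09-01-2019", "BASE_Y-20", 270.06), ("09-01-2019", "BASE_Y-21", 262.85),
]
print(last_prices(rows, "BASE_Y-20"))      # [276.5, 272.88, 270.48, 270, 270.06]
print(last_prices(rows[:1], "BASE_Y-20"))  # [279]
```

As with the formula, the latest price ends up last in the result.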
Last Matches From Bottom to Top
EDIT
With great help from P.b, the formula got reduced to the following:
=LET(cData,B2:B13,rData,C2:C13,cStr,G1,rCount,G2,
rFiltered,IFERROR(TAKE(TAKE(FILTER(HSTACK(cData,rData),cData=cStr),,-1),-rCount),""),
Result,SORTBY(rFiltered,SEQUENCE(ROWS(rFiltered)),-1),Result)
Screenshot Formulas
J2 =HSTACK(B2:B13,C2:C13)
L2 =FILTER(J2#,B2:B13=G1)
N2 =TAKE(L2#,,-1)
O2 =TAKE(N2#,-G2)
P2 =ROWS(O2#)
Q2 =SEQUENCE(P2)
R2 =SORTBY(O2#,Q2#,-1)
Issues in the Initial Post
I'm not sure what drove me to the decision that the data is A3:D13 when it is obviously B3:B13 and C3:C13.
TAKE will work if there are fewer rows/columns than asked for i.e. if you need five rows and there are only two, two will be returned.
Instead of using ROWS with the SEQUENCE function and then using it with INDEX, it is simpler to use SORTBY to sort by the sequence, in this particular case descending (-1).
Initial Post (Bad)
LET
=LET(Data,A2:D13,cCol,2,cStr,G1,rCol,3,rCount,G2,
cData,INDEX(Data,,cCol),rData,INDEX(Data,,rCol),Both,HSTACK(cData,rData),
bFiltered,FILTER(Both,cData=cStr),rFiltered,TAKE(bFiltered,,-1),rRows,ROWS(rFiltered),
fRows,IF(rRows>rCount,rCount,rRows),rSequence,SEQUENCE(fRows,,rRows,-1),
Result,INDEX(rFiltered,rSequence),Result)
Screenshot Formulas
J3 =INDEX(A2:D13,,2)
K3 =INDEX(A2:D13,,3)
L3 =HSTACK(J3#,K3#)
N3 =FILTER(L3#,J3#=G1)
P3 =TAKE(N3#,,-1)
Q3 =ROWS(P3#)
R3 =IF(Q3>G2,G2,Q3)
S3 =SEQUENCE(R3,,Q3,-1)
T3 =INDEX(P3#,S3#)
I'm looking for a solution for a problem I'm facing in Excel. This is my table simplified:
Every sale has a unique ID, but more than one person can have contributed to a sale. The columns "Name" and "Share of sales(%)" show who contributed and what their percentage was.
Sale_ID  Name      Share of sales(%)
1        Person A  100
2        Person B  100
3        Person A  30
3        Person C  70
Now I want to add a column to my table that shows the name of the person with the highest share of sales percentage per Sale_ID. Like this:
Sale_ID  Name      Share of sales(%)  Highest sales
1        Person A  100                Person A
2        Person B  100                Person B
3        Person A  30                 Person C
3        Person C  70                 Person C
So when multiple people have contributed the new column shows only the one with the highest value.
I hope someone can help me, thanks in advance!
You can try this in cell D2:
=LET(maxSales, MAXIFS(C2:C5,A2:A5,A2:A5),
INDEX(B2:B5, XMATCH(A2:A5&maxSales,A2:A5&C2:C5)))
or, dropping the LET since maxSales is used only once:
=INDEX(B2:B5, XMATCH(A2:A5&MAXIFS(C2:C5,A2:A5,A2:A5),A2:A5&C2:C5))
In cell E2 I provided another solution via MAP/XLOOKUP:
=LET(maxSales, MAXIFS(C2:C5,A2:A5,A2:A5),
MAP(A2:A5, maxSales, LAMBDA(a,b, XLOOKUP(a&b, A2:A5&C2:C5, B2:B5))))
similarly without LET:
=MAP(A2:A5, MAXIFS(C2:C5,A2:A5,A2:A5),
LAMBDA(a,b, XLOOKUP(a&b, A2:A5&C2:C5, B2:B5)))
For the sample data, each of these formulas returns Person A, Person B, Person C, Person C.
Explanation
The trick here is to identify the max share of sales per each group and this can be done via MAXIFS(max_range, criteria_range1, criteria1, [criteria_range2, criteria2], ...). The size and shape of the max_range and criteria_rangeN arguments must be the same.
MAXIFS(C2:C5,A2:A5,A2:A5)
it produces the following output:
maxSales
100
100
70
70
MAXIFS will provide an output of the same size as criteria1, so it returns for each row the corresponding maximum sales for each Sale_ID column value.
It is the array equivalent of filling the following formula down:
MAXIFS($C$2:$C$5,$A$2:$A$5,A2)
INDEX/XMATCH Solution
Having the array with the maximum share of sales per row, we just need to find the row position via XMATCH and return the corresponding B2:B5 cell via INDEX. We use concatenation (&) in the XMATCH arguments to match on more than one criterion at once.
MAP/XLOOKUP Solution
MAP takes each pair of values (a, b), one from each of its first two array arguments, row by row; XLOOKUP then finds where that group's maximum occurs and returns the corresponding Name value. To look up on two criteria at once, we concatenate (&) them in the first two XLOOKUP arguments.
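For anyone who wants to check the logic independently of Excel, here is a small Python sketch of the same idea: find the top share per Sale_ID, then report that person's name on every row of the group. The function name top_contributor_per_row is made up for the sketch, and the first row wins on ties, which is also what a first-match lookup would give.

```python
def top_contributor_per_row(rows):
    """rows: (sale_id, name, share) tuples; returns one name per input row."""
    best = {}  # sale_id -> (highest share seen so far, name holding it)
    for sale_id, name, share in rows:
        if sale_id not in best or share > best[sale_id][0]:
            best[sale_id] = (share, name)
    return [best[sale_id][1] for sale_id, _, _ in rows]

rows = [
    (1, "Person A", 100),
    (2, "Person B", 100),
    (3, "Person A", 30),
    (3, "Person C", 70),
]
print(top_contributor_per_row(rows))
# ['Person A', 'Person B', 'Person C', 'Person C']
```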
Consider the following sheet/table:
A B
1 90 71
2 40 25
3 60 16
4 110 13
5 87 82
I want to have a general formula in cell C1 that sums the greatest value in column A (which is 110), plus the sum of the other values in column B (which are 71, 25, 16 and 82). I would appreciate if the formula wasn't an array formula (as in requiring Ctrl + Shift + Enter). I don’t have Office 365, I have Excel 2019.
My attempt
Getting the greatest value in column A is easy, we use MAX(A1:A5).
So the formula I want in cell C1 should be something like:
=MAX(A1:A5) + SUM(array_of_values_to_be_summed)
Obtaining the values of the other rows in column B (what I called array_of_values_to_be_summed in the previous formula) is the hard part. I've read about using INDEX, MATCH, their combination, and obtaining arrays by using parentheses and equal signs, and I've tried that, without success so far.
For example, I noticed that NOT((A1:A5 = MAX(A1:A5))) yields an array/list containing ones (or TRUEs) for the relative position of the rows to be summed, and containing a zero (or FALSE) for the relative position of the row to be omitted. Maybe this is useful, I couldn't find how.
Any ideas? Thanks.
Edit 1 (solution)
I managed to obtain what I wanted. I simply multiplied the array obtained with the NOT formula, by the range B1:B5. The final formula is:
=MAX(A1:A5) + SUM(NOT((A1:A5 = MAX(A1:A5))) * B1:B5)
Edit 2 (duplicate values)
I forgot to explain what the formula should do if there are duplicates in column A. In that case, the first term of my final formula (the term that has the MAX function) would be the one whose corresponding value in column B is smallest, and the value in column B of the other duplicates would be used in the second term (the one containing the SUM function).
For example, consider the following sheet/table:
A B
1 90 71
2 110 25
3 60 16
4 110 13
5 110 82
Based on the above table, the formula should yield 110 + (71 + 25 + 16 + 82) = 304.
Just to give context, the reason I want such a formula is that I’m writing a spreadsheet that automatically calculates the electric current rating of the short-circuit protective device of the feeder of a group of electric motors in a house, building or mall, as required by Article 430.62(A) of the US National Electrical Code. Column A is the current rating of the short-circuit protective device of the branch circuit of each motor, and column B is the full-load current of each motor.
You can use this formula
=MAX(A1:A5)
+SUM(B1:B5)
-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)
Following #Anupam Chand's hint about duplicate max values: there could also be duplicate min values in column B among the rows that hold the max value in column A. :) This formula accounts for that:
=SUM(B1:B5)
+(MAX(A1:A5)-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1))
*SUMPRODUCT((A1:A5=MAX(A1:A5))*(B1:B5=AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)))
Or, in #Anupam Chand's shorter, more readable and overall better style :)
=SUM(B1:B5)
+(MAX(A1:A5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
*COUNTIFS(A1:A5,MAX(A1:A5),B1:B5,MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
The explanation works for both solutions:
The SUM part just sums the whole list.
The second line takes the max value of column A and the min value of column B among the rows holding that max, adding the former and subtracting the latter.
The third line counts how many times that min value occurs among the max-value rows and multiplies the second line by it.
Can you try this?
=MAX(A1:A5)+SUM(B1:B5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5))
What we're doing is adding the max of A to all rows of B and then subtracting the min value of B where A is the max.
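As a quick sanity check of that arithmetic against both tables from the question, here is the same calculation in Python. The function name feeder_rating is just a label chosen for this sketch.

```python
def feeder_rating(a, b):
    """MAX(A) + SUM(B) - MINIFS(B, A, MAX(A)), written out step by step."""
    max_a = max(a)
    # Smallest B value among the rows where A equals its maximum.
    min_b_at_max = min(bv for av, bv in zip(a, b) if av == max_a)
    return max_a + sum(b) - min_b_at_max

b = [71, 25, 16, 13, 82]
print(feeder_rating([90, 40, 60, 110, 87], b))    # 304 (first table)
print(feeder_rating([90, 110, 60, 110, 110], b))  # 304 (table with duplicates)
```

In both cases the excluded B value is 13, so the result is 110 + (71 + 25 + 16 + 82) = 304.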
If you have Excel 365 you can use the following LET-Formula
=LET(A,A1:A5,
B,B1:B5,
MaxA,MAX(A),
MinBExclude, MINIFS(B,A,MaxA),
sumB1,SUMPRODUCT(B*(A=MaxA)*(B<>MinBExclude)),
sumB2,SUMPRODUCT(B*(A<>MaxA)),
MaxA+sumB1+sumB2)
A and B are shortcuts for the two ranges
MaxA returns the max value for A (110)
MinBExclude filters the values of column B by the MaxA-value (25, 13, 82) and returns the min-value of the filtered result (13)
sumB1 returns the sum of the other MaxA values from column B (25 + 82)
sumB2 returns the sum of the values from B where the value in A <> MaxA (71 + 16)
and finally the result is returned
If you don't have Excel 365 you can add helper columns for MaxA, MinBExclude, sumB1 and sumB2 and the final result
I have several water tanks and 2 employees who measure them. Sometimes they measure the volume of a tank on the same day and sometimes not. I want to see how much their measurements differ. Since the dates are not always the same, I would like to look up the volume of the same tank on the date closest to each of Bob's reading dates.
Bob's Readings
Water_Tank_Name  Date        Volume
Red              15/02/2021  300
Blue             15/02/2021  145
Red              21/02/2021  280
Red              04/03/2021  339
Blue             05/03/2021  170
Sarah's Readings
Water_Tank_Name  Date        Volume
Blue             15/02/2021  148
Blue             19/02/2021  190
Red              23/02/2021  294
Blue             01/03/2021  140
I used XLOOKUP, but that only returns a value if both the exact Water_Tank_Name and the exact Date are found. However, I would like to match the Water_Tank_Name exactly and match to the closest Date.
=XLOOKUP(Bob!A2 & Bob!B2, Sarah!A:A & Sarah!B:B, Sarah!C:C)
You could use (with Excel 365):
=LET( tf, Bob!A2, df, Bob!B2,
tS, Sarah!A:A, dS,Sarah!B:B, dV, Sarah!C:C,
L, tS & dS,
S, SIGN(ABS(IFERROR(XLOOKUP(tf & df, L, dS,,-1)-df, 999)) - ABS(IFERROR(XLOOKUP(tf & df, L, dS,,1)-df, 999))),
XLOOKUP(tf & df, L, dV,,S) )
Where tf is the tank identity that you want to use for the search and df is the date value that you want to search for. This finds the nearest date, determines whether it is smaller or larger than df, and then tells XLOOKUP to search for the next larger or next smaller item (S is either 1 or -1), which lands on the nearest date. It might be possible to replace the two XLOOKUPs for S with FILTERs, but I am not sure if it would be faster. The whole-column references for Sarah should be replaced with Excel table columns; otherwise, it will run slowly.
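The underlying goal (exact match on tank, nearest match on date) can be stated more directly outside Excel. Below is a Python sketch of that goal, not a translation of the LET formula: closest_reading is a name made up here, and when two dates are equally close it simply returns the first one listed.

```python
from datetime import date

def closest_reading(tank, when, readings):
    """readings: (tank, date, volume) tuples; returns a volume, or None if no match."""
    # Keep only the same tank, pairing each volume with its distance in days.
    same_tank = [(abs((d - when).days), v) for t, d, v in readings if t == tank]
    if not same_tank:
        return None
    return min(same_tank, key=lambda pair: pair[0])[1]

sarah = [
    ("Blue", date(2021, 2, 15), 148),
    ("Blue", date(2021, 2, 19), 190),
    ("Red",  date(2021, 2, 23), 294),
    ("Blue", date(2021, 3, 1),  140),
]
print(closest_reading("Blue", date(2021, 3, 5), sarah))   # 140
print(closest_reading("Red",  date(2021, 2, 15), sarah))  # 294
```

For Bob's Blue reading on 05/03/2021, Sarah's nearest Blue reading is 01/03/2021 (4 days away), so 140 is returned.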
I am almost positive there is something that is less verbose than this monstrosity, but it works and makes sense if you take it apart piece by piece.
=IFERROR(FILTER($EG$12:$EG$15,($EE$12:$EE$15=$EE5)*($EF$12:$EF$15=$EF5+MIN(ABS(EF5-FILTER($EF$12:$EF$15, $EE$12:$EE$15=$EE5))))),FILTER($EG$12:$EG$15,($EE$12:$EE$15=$EE5)*($EF$12:$EF$15=$EF5-MIN(ABS(EF5-FILTER($EF$12:$EF$15, $EE$12:$EE$15=$EE5))))))
So I have some results which may or may not have totals. In this case, food. Each food is given an amount and a weight total. However, I have information below the results which I do not want to shift. I would use a table, but if I hide rows, it shifts items up, and if I sort the rows, the names of the foods with no results still show. Any idea how I could reformat my results into a new referenced range that lists only the foods with results? I'm trying to automate this with no button pressing, and without using macros/VBA. I could use something like =IF(ISBLANK(B21),"noResults",A21), but how could I list them all like my example below?
a1 b1 c1
Amount Weight
x Apples 5 10
x Oranges
x Peaches 6 10
x Lemons 2 10
x Tomatos
x Avacados 3 10
x
x
x xxxxdon't shift or move xxxxxxxxx
TO:
x g1 h1 i1
x Amount Weight
x Apples 5 10
x Peaches 6 10
x Lemons 2 10
x Avacados 3 10
x
xxxxxxxxdon't shift or move xxxxxxxxx
If you can add an ID column to the foods, you could do the following using the SMALL() function:
=IF(D3<>"",B3,"")
=IFERROR(SMALL($F$3:$F$8,ROW(C1)),"")
=VLOOKUP(H3,$B$3:$F$8,2,FALSE)
To Explain Further
The SMALL() function takes an array of numbers and will return the 1st smallest, 2nd smallest, or whatever smallest number you specify. Because this example only has Food names, I had to add an ID column (column B) to get an array of numbers.
Since we only want to view rows with data (amount & weight), I added another column, ID Formula (column F), that only displays the ID if the row has data.
Now that we are only displaying IDs with data, we can use the SMALL() function to get the 1st smallest ID still showing, then the 2nd smallest ID still showing, and so on. Notice that I used the ROW() function to get 1, 2, 3...
Lastly, I used a simple VLOOKUP() to add in the Food, Amount, and Weight for the respective IDs.
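The three helper steps (give each row an ID, keep only the IDs of rows with data, then look the rows back up by ID in ascending order) can be mimicked in a few lines of Python if you want to convince yourself the approach compacts the list correctly. The function name rows_with_results and the use of None for an empty cell are assumptions of this sketch.

```python
def rows_with_results(rows):
    """rows: (food, amount, weight) tuples; amount is None when the cell is empty."""
    by_id = dict(enumerate(rows, start=1))  # ID column: 1, 2, 3, ...
    # "ID Formula" column: keep the ID only when the row has data.
    ids = [i for i, (_, amount, _) in by_id.items() if amount is not None]
    # SMALL(..., 1), SMALL(..., 2), ... followed by a lookup on each ID.
    return [by_id[i] for i in sorted(ids)]

foods = [
    ("Apples", 5, 10), ("Oranges", None, None), ("Peaches", 6, 10),
    ("Lemons", 2, 10), ("Tomatos", None, None), ("Avacados", 3, 10),
]
for food, amount, weight in rows_with_results(foods):
    print(food, amount, weight)
# Apples 5 10
# Peaches 6 10
# Lemons 2 10
# Avacados 3 10
```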