How to return the last n number of values corresponding to a specific category? - excel

I have the following sample data.
Date Category Price Quantity
02-01-2019 BASE_Y-20 279 1
02-01-2019 BASE_Y-21 271.25 0
03-01-2019 BASE_Y-20 276.5 2
03-01-2019 BASE_Y-21 266.5 0
04-01-2019 BASE_Y-20 272.88 14
04-01-2019 BASE_Y-21 266.5 1
07-01-2019 BASE_Y-20 270.48 29
07-01-2019 BASE_Y-21 262.75 0
08-01-2019 BASE_Y-20 270 4
08-01-2019 BASE_Y-21 264 0
09-01-2019 BASE_Y-20 270.06 31
09-01-2019 BASE_Y-21 262.85 0
What dynamic formula can I use to return the last 5 prices for category BASE_Y-20? The challenging part is that the formula must return whatever prices are available when fewer than 5 are present. (E.g., for the given data, 270.06, 270, 270.48, 272.88 and 276.5 must be returned. If we only had the first row, it must return 279.)
I have tried SUMPRODUCT, which of course gives the corresponding prices, and OFFSET can be used to get the last 5 rows. But I have found no dynamic way to get the last 5 prices for a specific category.

You can try:
Formula in F3 (with the category in F1 and the number of values in F2):
=TAKE(SORT(FILTER(A:C,B:B=F1),1),-F2,-1)
A few notes:
The latest price will be at the bottom;
If your data is always sorted to begin with, just ditch the nested SORT() and use =TAKE(FILTER(A:C,B:B=F1),-F2,-1);
If no matching value is present at all, nest the formula in =IFERROR(<Formula>,"") to return whatever value you'd like to display in that event.
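To sanity-check the expected output outside Excel, the filter, sort, and take steps can be sketched in Python. This is only an illustration of the idea, not the formula itself. The function name `last_n_prices` and the tuple layout are made up for the example, and the dates are assumed to be day-first (02-01-2019 is 2 January 2019).

```python
from datetime import date

# Sample rows from the question as (date, category, price).
# Assumption: the dates are day-first, so 02-01-2019 is 2 January 2019.
rows = [
    (date(2019, 1, 2), "BASE_Y-20", 279.0),
    (date(2019, 1, 2), "BASE_Y-21", 271.25),
    (date(2019, 1, 3), "BASE_Y-20", 276.5),
    (date(2019, 1, 3), "BASE_Y-21", 266.5),
    (date(2019, 1, 4), "BASE_Y-20", 272.88),
    (date(2019, 1, 4), "BASE_Y-21", 266.5),
    (date(2019, 1, 7), "BASE_Y-20", 270.48),
    (date(2019, 1, 7), "BASE_Y-21", 262.75),
    (date(2019, 1, 8), "BASE_Y-20", 270.0),
    (date(2019, 1, 8), "BASE_Y-21", 264.0),
    (date(2019, 1, 9), "BASE_Y-20", 270.06),
    (date(2019, 1, 9), "BASE_Y-21", 262.85),
]


def last_n_prices(rows, category, n):
    """Hypothetical helper: filter by category, sort by date, keep the last n prices."""
    if n <= 0:
        return []
    matching = sorted((r for r in rows if r[1] == category), key=lambda r: r[0])
    # Slicing with -n returns fewer items when fewer than n rows match.
    return [price for _, _, price in matching[-n:]]


print(last_n_prices(rows, "BASE_Y-20", 5))      # [276.5, 272.88, 270.48, 270.0, 270.06]
print(last_n_prices(rows[:1], "BASE_Y-20", 5))  # [279.0]
```

As in the formula, the latest price ends up last, and asking for more values than exist simply returns what is there.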

Last Matches From Bottom to Top
EDIT
With great help from P.b, the formula got reduced to the following:
=LET(cData,B2:B13,rData,C2:C13,cStr,G1,rCount,G2,
rFiltered,IFERROR(TAKE(TAKE(FILTER(HSTACK(cData,rData),cData=cStr),,-1),-rCount),""),
Result,SORTBY(rFiltered,SEQUENCE(ROWS(rFiltered)),-1),Result)
Screenshot Formulas
J2 =HSTACK(B2:B13,C2:C13)
L2 =FILTER(J2#,B2:B13=G1)
N2 =TAKE(L2#,,-1)
O2 =TAKE(N2#,-G2)
P2 =ROWS(O2#)
Q2 =SEQUENCE(P2)
R2 =SORTBY(O2#,Q2#,-1)
Issues in the Initial Post
I'm not sure what drove me to the decision that the data is A2:D13 when it is obviously B2:B13 and C2:C13.
TAKE will work if there are fewer rows/columns than asked for, i.e. if you need five rows and there are only two, two will be returned.
Instead of using ROWS with the SEQUENCE function and then using it with INDEX, it is simpler to use SORTBY to sort by the sequence, in this particular case descending (-1).
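The reduced formula's steps (filter, take the last rows, then reverse by sorting on a descending sequence) can be mirrored in a few lines of Python to check the order of the result. This is a rough sketch only. The function name `last_n_latest_first` is hypothetical, and an empty list stands in for the "" that IFERROR returns.

```python
# Category and price columns from the question (B2:B13 and C2:C13).
categories = ["BASE_Y-20", "BASE_Y-21"] * 6
prices = [279.0, 271.25, 276.5, 266.5, 272.88, 266.5,
          270.48, 262.75, 270.0, 264.0, 270.06, 262.85]


def last_n_latest_first(categories, prices, wanted, n):
    """Hypothetical helper mirroring FILTER -> TAKE -> SORTBY by descending sequence."""
    # FILTER: keep the prices whose category matches.
    filtered = [p for c, p in zip(categories, prices) if c == wanted]
    # TAKE(..., -n): the last n rows, or fewer if not enough rows exist.
    taken = filtered[-n:] if n > 0 else []
    # SORTBY(..., SEQUENCE(ROWS(...)), -1): sort positions in descending order.
    order = sorted(range(len(taken)), reverse=True)
    return [taken[i] for i in order]


print(last_n_latest_first(categories, prices, "BASE_Y-20", 5))
# [270.06, 270.0, 270.48, 272.88, 276.5]
```

The output matches the order requested in the question, with the most recent price first.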
Initial Post (Bad)
LET
=LET(Data,A2:D13,cCol,2,cStr,G1,rCol,3,rCount,G2,
cData,INDEX(Data,,cCol),rData,INDEX(Data,,rCol),Both,HSTACK(cData,rData),
bFiltered,FILTER(Both,cData=cStr),rFiltered,TAKE(bFiltered,,-1),rRows,ROWS(rFiltered),
fRows,IF(rRows>rCount,rCount,rRows),rSequence,SEQUENCE(fRows,,rRows,-1),
Result,INDEX(rFiltered,rSequence),Result)
Screenshot Formulas
J3 =INDEX(A2:D13,,2)
K3 =INDEX(A2:D13,,3)
L3 =HSTACK(J3#,K3#)
N3 =FILTER(L3#,J3#=G1)
P3 =TAKE(N3#,,-1)
Q3 =ROWS(P3#)
R3 =IF(Q3>G2,G2,Q3)
S3 =SEQUENCE(R3,,Q3,-1)
T3 =INDEX(P3#,S3#)

Related

Sum of the greatest value in one column, plus the sum of the other values in another column

Consider the following sheet/table:
A B
1 90 71
2 40 25
3 60 16
4 110 13
5 87 82
I want to have a general formula in cell C1 that sums the greatest value in column A (which is 110), plus the sum of the other values in column B (which are 71, 25, 16 and 82). I would appreciate if the formula wasn't an array formula (as in requiring Ctrl + Shift + Enter). I don’t have Office 365, I have Excel 2019.
My attempt
Getting the greatest value in column A is easy, we use MAX(A1:A5).
So the formula I want in cell C1 should be something like:
=MAX(A1:A5) + SUM(array_of_values_to_be_summed)
Obtaining the values of the other rows in column B (what I called array_of_values_to_be_summed in the previous formula) is the hard part. I've read about using INDEX, MATCH, their combination, and obtaining arrays by using parentheses and equal signs, and I've tried that, without success so far.
For example, I noticed that NOT((A1:A5 = MAX(A1:A5))) yields an array/list containing ones (or TRUEs) for the relative position of the rows to be summed, and containing a zero (or FALSE) for the relative position of the row to be omitted. Maybe this is useful, I couldn't find how.
Any ideas? Thanks.
Edit 1 (solution)
I managed to obtain what I wanted. I simply multiplied the array obtained with the NOT formula, by the range B1:B5. The final formula is:
=MAX(A1:A5) + SUM(NOT((A1:A5 = MAX(A1:A5))) * B1:B5)
Edit 2 (duplicate values)
I forgot to explain what the formula should do if there are duplicates in column A. In that case, the first term of my final formula (the term that has the MAX function) would be the one whose corresponding value in column B is smallest, and the value in column B of the other duplicates would be used in the second term (the one containing the SUM function).
For example, consider the following sheet/table:
A B
1 90 71
2 110 25
3 60 16
4 110 13
5 110 82
Based on the above table, the formula should yield 110 + (71 + 25 + 16 + 82) = 304.
Just to give context, the reason I want such a formula is that I’m writing a spreadsheet that automatically calculates the electric current rating of the short-circuit protective device of the feeder of a group of electric motors in a house, building, or mall, as required by Article 430.62(A) of the US National Electrical Code. Column A is the current rating of the short-circuit protective device of the branch circuit of each motor, and column B is the full-load current of each motor.
You can use this formula
=MAX(A1:A5)
+SUM(B1:B5)
-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)
Based on @Anupam Chand's hint for max-value duplicates, there could also be min-value duplicates in column B for the corresponding max-value duplicates in column A. :) This formula would account for that:
=SUM(B1:B5)
+(MAX(A1:A5)-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1))
*SUMPRODUCT((A1:A5=MAX(A1:A5))*(B1:B5=AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)))
Or with @Anupam Chand's shorter, more readable, and overall better style :)
=SUM(B1:B5)
+(MAX(A1:A5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
*COUNTIFS(A1:A5,MAX(A1:A5),B1:B5,MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
The explanation works for both solutions:
The SUM part just sums the whole of column B.
The second line takes the max value of column A and subtracts the smallest column B value found among the rows that hold that max.
The third line counts how many times that min value occurs among the max rows and multiplies the second line by it.
Can you try this ?
=MAX(A1:A5)+SUM(B1:B5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5))
What we're doing is adding the max of A to all rows of B and then subtracting the min value of B where A is the max.
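As a quick check of that arithmetic, here is a small Python sketch of the same three terms applied to both tables from the question. It is only meant to confirm the expected totals. The function name `feeder_rating` is invented for the example.

```python
def feeder_rating(a, b):
    """Hypothetical helper: max of A, plus all of B, minus the smallest B among the max-A rows."""
    max_a = max(a)
    # Counterpart of MINIFS(B, A, MAX(A)): smallest B value on rows where A is the max.
    min_b_at_max = min(bv for av, bv in zip(a, b) if av == max_a)
    return max_a + sum(b) - min_b_at_max


b = [71, 25, 16, 13, 82]
print(feeder_rating([90, 40, 60, 110, 87], b))    # 304 (first table)
print(feeder_rating([90, 110, 60, 110, 110], b))  # 304 (table with duplicate maxima)
```

Both tables give 304 because in each case the row dropped from column B is the one holding 13.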
If you have Excel 365 you can use the following LET-Formula
=LET(A,A1:A5,
B,B1:B5,
MaxA,MAX(A),
MinBExclude, MINIFS(B,A,MaxA),
sumB1,SUMPRODUCT(B*(A=MaxA)*(B<>MinBExclude)),
sumB2,SUMPRODUCT(B*(A<>MaxA)),
MaxA+sumB1+sumB2)
A and B are shortcuts for the two ranges
MaxA returns the max value for A (110)
MinBExclude filters the values of column B by the MaxA-value (25, 13, 82) and returns the min-value of the filtered result (13)
sumB1 returns the sum of the other MaxA values from column B (25 + 82)
sumB2 returns the sum of the values from B where the value in A <> MaxA (71 + 16)
and finally the result is returned
If you don't have Excel 365 you can add helper columns for MaxA, MinBExclude, sumB1 and sumB2 and the final result
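The intermediate values of the LET steps can be traced with the duplicate-maxima table from the question. The Python below is just a step-by-step sketch of the same names, not a replacement for the formula. The snake_case variable names are adaptations chosen for the example.

```python
# Duplicate-maxima table from the question.
a = [90, 110, 60, 110, 110]
b = [71, 25, 16, 13, 82]

max_a = max(a)  # 110
# Smallest B value among the rows where A equals the max (candidates 25, 13, 82).
min_b_exclude = min(bv for av, bv in zip(a, b) if av == max_a)  # 13
# B values on the max rows, except the excluded minimum: 25 + 82.
sum_b1 = sum(bv for av, bv in zip(a, b) if av == max_a and bv != min_b_exclude)
# B values on all rows where A is not the max: 71 + 16.
sum_b2 = sum(bv for av, bv in zip(a, b) if av != max_a)

result = max_a + sum_b1 + sum_b2
print(sum_b1, sum_b2, result)  # 107 87 304
```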

Multiple Return Vlookup Horizontal match range with Vertical return range

I have binary data running in the horizontal direction: For example the match ranges look like:
Mike 0 1 0 0 0 1
Julie 1 1 0 1 1 0
Joe 1 1 1 0 0 0
And the return Range contains textual data:
Q1: What is the capital of NY?
Q2: What is the capital of Ohio?
Q3: What is the capital of Washington?
.
.
.
I need to match every occurrence of 1 with corresponding data that runs in the vertical direction. i.e. horizontal index corresponding with vertical index. I have found several instances where a multiple return vlookup was accomplished by using:
=IFERROR(INDEX(return_range,SMALL(IF((1=match_range),ROW(match_range)-1),ROW(1:1)),2),"")
However, this isn't working. I assume it isn't working because it is meant for two vertical data sets. I have tried switching "row" for "column" in the function, but didn't have any luck.
Also, the match range and return range are on different sheets.
The match range (in horizontal direction) is binary information on whether a question was answered correctly. The return range is the corresponding set of questions (in vertical direction). Therefore, the output would be an array:
Mike: Q2 Q6
Julie: Q1 Q2 Q4 Q5
Joe: Q1 Q2 Q3
How can this function be modified to accomplish this?
To get the correct row as an array that can then be used in other formulas, we use INDEX:
INDEX($A:$G,MATCH($I2,$A:$A,0),0)
This will return all the values in columns A through G in the row where the name matches the one in I2.
It can be used as such in an INDEX/AGGREGATE formula:
=IFERROR(INDEX($A$1:$G$1,AGGREGATE(15,6, COLUMN(INDEX($A:$G,MATCH($I2,$A:$A,0),0))/(INDEX($A:$G,MATCH($I2,$A:$A,0),0)=1),COLUMN(A:A))),"")
My best guess as to your data set up:
Use a formula like this:
=IFERROR(INDEX($I:$I,AGGREGATE(15,6, COLUMN(INDEX($A:$G,MATCH($K2,$A:$A,0),0))/(INDEX($A:$G,MATCH($K2,$A:$A,0),0)=1),COLUMN(A:A))),"")
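The pairing the formula performs (the position of each 1 in a person's row selects the question at the same position in the vertical list) can be sketched in Python with the sample data. This is a plain illustration of the matching logic. The function name `matched_questions` and the short labels Q1 to Q6 are assumptions taken from the desired output in the question.

```python
# Horizontal match ranges from the question: one row of 0/1 flags per person.
answers = {
    "Mike":  [0, 1, 0, 0, 0, 1],
    "Julie": [1, 1, 0, 1, 1, 0],
    "Joe":   [1, 1, 1, 0, 0, 0],
}
# Vertical return range, shortened here to the question labels.
questions = ["Q1", "Q2", "Q3", "Q4", "Q5", "Q6"]


def matched_questions(flags, questions):
    """Hypothetical helper: return the questions whose position holds a 1 in the flag row."""
    return [q for flag, q in zip(flags, questions) if flag == 1]


for name, flags in answers.items():
    print(name, matched_questions(flags, questions))
# Mike ['Q2', 'Q6']
# Julie ['Q1', 'Q2', 'Q4', 'Q5']
# Joe ['Q1', 'Q2', 'Q3']
```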

Conditional formatting on the first x number of rows, regardless of filter or sort, in Excel

I'm trying to find a way to easily identify the first ten rows in a table column, no matter how it's been sorted/filtered. Is there a way to use conditional formatting to highlight these cells?
Examples of desired results...
Sample data:
product price units code
Item02 15.97 2191 7UQC
Item05 12.95 1523 TAAI
Item13 9.49 1410 LV9E
Item01 5.69 591 6DOY
Item04 15.97 554 ZCN2
Item08 10.68 451 2GN0
Item03 13.95 411 FP6A
Item07 25.45 174 PEWK
Item09 14.99 157 B5S4
Item06 18 152 XJ4G
Item10 11.45 148 BY8M
Item11 16.99 66 86C2
Item12 24.5 17 X31K
Item14 24.95 14 QJEI
When sorting by price the first 10 products highlighted differ from those in the next example.
The first 10 visible products are highlighted after filtering out Item12, Item05, and Item08.
Choosing to sort by units automatically highlights a different set of products.
Use this formula in the Conditional Formatting:
=SUBTOTAL(3,$A$2:$A2)<11
Make sure it applies to the entire dataset.
The formula returns a running count of the visible rows, i.e. each row's position among the visible rows. When a row above is filtered out, every row beneath it counts one lower than it otherwise would, so the first ten visible rows always get 1 to 10.
To see how it works place SUBTOTAL(3,$A$2:$A2) in an empty column. Then filter the table and watch as the numbers change.
The 3 refers to the COUNTA() function, which will count any non-empty cell.
Subtotal is designed to work with data that gets filtered to return only the visible data.
So the Formula will only count the visible cells that are not empty.
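The running count of visible rows can be simulated in Python to see which products end up highlighted after a filter. This is a sketch of the counting idea only. The function name `highlight_flags` is made up, and hidden rows are simply reported as not highlighted.

```python
# Products in the sample's sheet order.
products = ["Item02", "Item05", "Item13", "Item01", "Item04", "Item08", "Item03",
            "Item07", "Item09", "Item06", "Item10", "Item11", "Item12", "Item14"]


def highlight_flags(visible, limit=10):
    """Hypothetical helper: flag the first `limit` visible rows, in sheet order."""
    flags = []
    count = 0  # running count of visible rows so far
    for is_visible in visible:
        if is_visible:
            count += 1
        # Hidden rows are never flagged; visible rows are flagged while count <= limit.
        flags.append(is_visible and count <= limit)
    return flags


# Filter out Item12, Item05 and Item08, as in the question's second example.
hidden = {"Item12", "Item05", "Item08"}
visible = [p not in hidden for p in products]
highlighted = [p for p, f in zip(products, highlight_flags(visible)) if f]
print(highlighted)
# ['Item02', 'Item13', 'Item01', 'Item04', 'Item03', 'Item07', 'Item09', 'Item06', 'Item10', 'Item11']
```

With three rows filtered out, Item10 and Item11 move into the first ten visible rows, while Item14 is the eleventh and stays unhighlighted.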
In the conditional formatting dialog, choose New rule -> Use a formula.... Enter =row()<=10.

Excel ranking based on grouping priorities

Hi everyone, I have an Excel question on how to rank based first on a ranking, and then on a second priority of a group. The formula is written in column 'Final_Rank', and I hid a bunch of rows to show a clear example. The column Rank is just a normal rank function. I want the priority to follow Rank first, but then to give the next ranks to the remaining items of the same group. So if you look at group HYP, its other items supersede the rest and are ranked 3 and 4, and then 5 is given to the next group.
I hope this is a clear explanation, thanks.
Group Rank Final_Rank_Manual
TAM 1 1
HYP 2 2
GAB 3 5
HYO 4 8
ALO 5 9
HYP 7 3
ACO 8 12
IBU 9 13
ACO 11 14
ALO 18 10
GAB 44 6
IBU 53 15
IBU 123 16
GAB 167 7
HYP 199 4
You can do this with an extra helper column. Assuming your table currently occupies columns A-C, with one header row, put the following in C2:
=SMALL(IF($A$2:$A$6=A2,$B$2:$B$6,9999999999),1)+(B2*0.000000001)
You'll need to enter this as an array formula by using Ctrl+Shift+Enter. Copy it down throughout the whole column, extending the $A$2:$A$6 and $B$2:$B$6 ranges to cover all of your rows. This gives you the group's ranking, and it adds a tiny decimal indicating the individual value's position within each group (e.g. the 3rd "HYP" value is converted to something like 2.000000199, because out of all the available values, the second lowest belongs to "HYP", and this specific "HYP" value is 199).
Next, enter the following in D2 and copy it down throughout the column:
=RANK(C2,$C$2:$C$6,1)
This will give you the "Final" rankings. There won't be any ties because of the tiny decimals we added in the previous formula. The results end up looking just like your sample.
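The two-step idea (group's best rank first, own rank second) can be tried in Python on the visible rows of the question. In this sketch a tuple sort key of (best rank in the group, own rank) plays the role of the helper column's integer plus tiny decimal. The function name `final_ranks` is hypothetical, and rank values are assumed to be unique. The output agrees with the manual column for TAM, HYP, GAB, HYO and ALO. The question's later manual values differ, presumably because of the hidden rows, so they are not reproduced here.

```python
# Visible rows from the question.
groups = ["TAM", "HYP", "GAB", "HYO", "ALO", "HYP", "ACO", "IBU",
          "ACO", "ALO", "GAB", "IBU", "IBU", "GAB", "HYP"]
ranks = [1, 2, 3, 4, 5, 7, 8, 9, 11, 18, 44, 53, 123, 167, 199]


def final_ranks(groups, ranks):
    """Hypothetical helper: rank by (best rank within the group, own rank)."""
    best = {}
    for g, r in zip(groups, ranks):
        best[g] = min(r, best.get(g, r))
    # Tuple key standing in for the helper column value (group minimum + tiny decimal).
    keys = [(best[g], r) for g, r in zip(groups, ranks)]
    ordered = sorted(keys)
    # Assumes rank values are unique, so every key has exactly one position.
    return [ordered.index(k) + 1 for k in keys]


print(final_ranks(groups, ranks))
# [1, 2, 5, 8, 9, 3, 11, 13, 12, 10, 6, 14, 15, 7, 4]
```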

In excel, I need to find the maximum date based on the employee number

I have tried to use the following formula when trying to find the max date of these columns based on the employee number in my hundreds of thousands lines of data. The formula bar gives me 'yes' when it is the max, however in my cell it says 'no'. I cannot figure out what the issue is. Thanks for the help.
Tamara
Formula used: =IF(AQ2=MAX(IF($C:$C=C2,$AQ:$AQ)),"YES","NO")
A B Employee Number Max?
11-Mar-13 12-Mar-13 199 NO
24-Mar-13 26-Mar-13 199 NO
1-Aug-13 6-Aug-13 199 NO
22-Dec-13 27-Dec-13 199 NO
15-Apr-13 17-Apr-13 206 NO
18-Apr-13 18-Apr-13 206 NO
8-Aug-13 10-Aug-13 206 NO
17-Oct-13 18-Oct-13 206 NO
25-Dec-13 20-Feb-14 206 YES
8-May-13 8-May-13 214 NO
You can also accomplish this without an array formula if all of the dates for a specific employee ID are unique, that is, you won't have two of the same date. In this case, the following formula checks that (a) the number of dates with the employee ID is equal to (b) the number of dates with that employee ID that are less than or equal to the current date. This will only be true for the "max" date for said employee ID:
=IF(COUNTIFS($C:$C,C2)=COUNTIFS($C:$C,C2,$A:$A,"<="&A2),"Yes","No")
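The two counts being compared can be reproduced in Python with the column A dates from the question. This is a small sketch of the counting logic, not the formula itself. The function name `is_max_date` is made up for the example.

```python
from datetime import date

# Column A dates and employee numbers from the question.
dates = [date(2013, 3, 11), date(2013, 3, 24), date(2013, 8, 1), date(2013, 12, 22),
         date(2013, 4, 15), date(2013, 4, 18), date(2013, 8, 8), date(2013, 10, 17),
         date(2013, 12, 25), date(2013, 5, 8)]
employees = [199, 199, 199, 199, 206, 206, 206, 206, 206, 214]


def is_max_date(dates, employees):
    """Hypothetical helper: 'Yes' where no date of the same employee is later."""
    result = []
    for d, e in zip(dates, employees):
        # (a) number of rows for this employee
        total = sum(1 for e2 in employees if e2 == e)
        # (b) number of rows for this employee dated on or before the current date
        not_later = sum(1 for d2, e2 in zip(dates, employees) if e2 == e and d2 <= d)
        result.append("Yes" if total == not_later else "No")
    return result


print(is_max_date(dates, employees))
# ['No', 'No', 'No', 'Yes', 'No', 'No', 'No', 'No', 'Yes', 'Yes']
```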
If I understand your question correctly, you want to find the set of dates with the largest time span between them. If this is the case, then I would recommend using two separate functions, the =DAYS360 function and the =MAX function.
I have re-created your sheet and it will end up looking similar to this:
Here is the same picture of the same sheet with functions revealed, so that you can see how the functions are used:
The =DAYS360 function takes two inputs and returns the number of days between two dates, based on a 360-day year. The MAX function simply finds the largest number in a range. Please let me know if this helped.
EDIT: Also, if you want to see the actual word Max next to the largest date range, you can nest the MAX function from my column E within an IF function, like this:
=IF(MAX(D:D)=D2,"Max","")
If I understand you correctly, do you want "YES" to appear for each employee's max date range? Assuming column AQ contains the spans between dates in columns A and B (i.e. =B2-A2 copied down), your formula should work.
This only works as an array formula, so make sure you press CTRL+SHIFT+ENTER when entering the formula, then copy it down to all cells in the same column.
=IF(AQ2=MAX(IF($C:$C=C2,$AQ:$AQ)),"YES","NO"), entered in D2 using CTRL+SHIFT+ENTER and copied down, produces the following:
A B C D ... AQ
11-Mar-13 12-Mar-13 199 NO 1
24-Mar-13 26-Mar-13 199 NO 2
1-Aug-13 6-Aug-13 199 YES 5
22-Dec-13 27-Dec-13 199 YES 5
15-Apr-13 17-Apr-13 206 NO 2
18-Apr-13 18-Apr-13 206 NO 0
8-Aug-13 10-Aug-13 206 NO 2
17-Oct-13 18-Oct-13 206 NO 1
25-Dec-13 20-Feb-14 206 YES 57
8-May-13 8-May-13 214 YES 0
If you are simply looking for the greatest date range, the formula =IF(E2=MAX($E:$E),"YES","NO") entered in D2 and copied down will do the trick.
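The per-employee result in the table above can be checked with a short Python sketch that computes the spans (B minus A) and flags each employee's largest one. It is an illustration under the same assumption that column AQ holds B minus A. The function name `max_span_flags` is invented for the example.

```python
from datetime import date

# Columns A, B and C from the table above.
starts = [date(2013, 3, 11), date(2013, 3, 24), date(2013, 8, 1), date(2013, 12, 22),
          date(2013, 4, 15), date(2013, 4, 18), date(2013, 8, 8), date(2013, 10, 17),
          date(2013, 12, 25), date(2013, 5, 8)]
ends = [date(2013, 3, 12), date(2013, 3, 26), date(2013, 8, 6), date(2013, 12, 27),
        date(2013, 4, 17), date(2013, 4, 18), date(2013, 8, 10), date(2013, 10, 18),
        date(2014, 2, 20), date(2013, 5, 8)]
employees = [199, 199, 199, 199, 206, 206, 206, 206, 206, 214]


def max_span_flags(starts, ends, employees):
    """Hypothetical helper: spans in days, plus YES/NO for each employee's largest span."""
    spans = [(e - s).days for s, e in zip(starts, ends)]
    best = {}
    for emp, span in zip(employees, spans):
        best[emp] = max(span, best.get(emp, span))
    flags = ["YES" if span == best[emp] else "NO" for emp, span in zip(employees, spans)]
    return spans, flags


spans, flags = max_span_flags(starts, ends, employees)
print(spans)  # [1, 2, 5, 5, 2, 0, 2, 1, 57, 0]
print(flags)  # ['NO', 'NO', 'YES', 'YES', 'NO', 'NO', 'NO', 'NO', 'YES', 'YES']
```

Ties are flagged on every tied row, which is why employee 199 gets two YES entries.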
