Return Kth largest Value of range that is determined by an Index & Match lookup - excel

My question is similar to one asked here, but I am having trouble making this work with my data. I have a data set with seeded numbers in row 1 that I use to INDEX/MATCH columns. This is because drop-down menus change the match column based on the user's selection, so the columns cannot be referenced directly. My data very roughly looks like this:
        45   46   50   28
Route
CCS    500  325   40  200
CCS    370  100  380   10
RCS     90  825   50  999
CCS    100   50   32  358
So when my user makes a selection, the number in AE2 changes to reflect the column seed I want (in the example, either 45, 46, 50, or 28). I want to be able to return the Kth largest number in that column that is also "CCS". So let's say the user chooses 46 and I want the 2nd largest number that has "CCS" in Route. The formula searches row 1 for "46", then once it finds that column, it looks down it for the 2nd largest CCS value -- which is 100. I have tried to modify the formula suggested in the other question (below), but that just seems to stop at the first observation, and I need it to search all of the observations.
LARGE(IF( 'Program Data'!O:O="CCS", INDEX('Program Data'!$A:$GB,0,(MATCH($AE$2,'Program Data'!$1:$1,0)))),1)
Any tips as to what I'm doing wrong?

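Your formula works for me, but it is an "array formula", so you need to confirm it with CTRL+SHIFT+ENTER so that curly braces { and } appear around the formula in the formula bar. The final argument of LARGE is K, so change the 1 to 2 to get the 2nd largest value.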
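To cross-check the expected result outside Excel, the same lookup can be sketched in Python. This is only an illustration of the logic (find the seed's column, keep the CCS rows, take the Kth largest); the function name `kth_largest_for_route` and the in-memory layout are assumptions made for the example, not part of the workbook.

```python
def kth_largest_for_route(seeds, rows, seed, route, k):
    """Return the k-th largest value in the column whose seed matches,
    looking only at rows whose route equals `route`."""
    col = seeds.index(seed)  # plays the role of MATCH(seed, row 1, 0)
    # Keep only the values of the matching route, like IF(route = "CCS", column)
    values = [vals[col] for r, vals in rows if r == route]
    # Sort descending and pick the k-th entry, like LARGE(values, k)
    return sorted(values, reverse=True)[k - 1]


# Sample data from the question (hypothetical in-memory layout)
seeds = [45, 46, 50, 28]
rows = [
    ("CCS", [500, 325, 40, 200]),
    ("CCS", [370, 100, 380, 10]),
    ("RCS", [90, 825, 50, 999]),
    ("CCS", [100, 50, 32, 358]),
]

print(kth_largest_for_route(seeds, rows, 46, "CCS", 2))  # 100
```

Note that the RCS value 825 in column 46 is ignored, which is why the 2nd largest CCS value is 100 and not 325.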

Related

Sum of the greatest value in one column, plus the sum of the other values in another column

Consider the following sheet/table:
A B
1 90 71
2 40 25
3 60 16
4 110 13
5 87 82
I want to have a general formula in cell C1 that sums the greatest value in column A (which is 110), plus the sum of the other values in column B (which are 71, 25, 16 and 82). I would appreciate if the formula wasn't an array formula (as in requiring Ctrl + Shift + Enter). I don’t have Office 365, I have Excel 2019.
My attempt
Getting the greatest value in column A is easy, we use MAX(A1:A5).
So the formula I want in cell C1 should be something like:
=MAX(A1:A5) + SUM(array_of_values_to_be_summed)
Obtaining the values of the other rows in column B (what I called array_of_values_to_be_summed in the previous formula) is the hard part. I've read about using INDEX, MATCH, their combination, and obtaining arrays by using parenthesis and equal signs, and I've tried that, without success so far.
For example, I noticed that NOT((A1:A5 = MAX(A1:A5))) yields an array/list containing ones (or TRUEs) for the relative position of the rows to be summed, and containing a zero (or FALSE) for the relative position of the row to be omitted. Maybe this is useful, I couldn't find how.
Any ideas? Thanks.
Edit 1 (solution)
I managed to obtain what I wanted. I simply multiplied the array obtained with the NOT formula, by the range B1:B5. The final formula is:
=MAX(A1:A5) + SUM(NOT((A1:A5 = MAX(A1:A5))) * B1:B5)
Edit 2 (duplicate values)
I forgot to explain what the formula should do if there are duplicates in column A. In that case, the first term of my final formula (the term that has the MAX function) would be the one whose corresponding value in column B is smallest, and the value in column B of the other duplicates would be used in the second term (the one containing the SUM function).
For example, consider the following sheet/table:
A B
1 90 71
2 110 25
3 60 16
4 110 13
5 110 82
Based on the above table, the formula should yield 110 + (71 + 25 + 16 + 82) = 304.
Just to give context, the reason I want such a formula is that I’m writing a spreadsheet that automatically calculates the electric current rating of the short-circuit protective device of the feeder of a group of electric motors in a house, building, or mall, as required by article 430.62(A) of the US National Electrical Code. Column A is the current rating of the short-circuit protective device of the branch circuit of each motor, and column B is the full-load current of each motor.
You can use this formula
=MAX(A1:A5)
+SUM(B1:B5)
-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)
Based on @Anupam Chand's hint about duplicate max values: there could also be duplicate min values in column B among the rows that hold the duplicate max value in column A. :) This formula would account for that:
=SUM(B1:B5)
+(MAX(A1:A5)-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1))
*SUMPRODUCT((A1:A5=MAX(A1:A5))*(B1:B5=AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)))
Or, in @Anupam Chand's shorter, more readable, and overall better style :)
=SUM(B1:B5)
+(MAX(A1:A5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
*COUNTIFS(A1:A5,MAX(A1:A5),B1:B5,MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
The explanation works for both solutions:
The SUM part just sums the whole list.
The second line takes the max value of column A and the corresponding min value of column B (among the rows holding that max in column A), adding the former and subtracting the latter.
The third line counts how many times that min value occurs among the max-value rows and multiplies the second line by that count.
Can you try this?
=MAX(A1:A5)+SUM(B1:B5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5))
What we're doing is adding the max of A to all rows of B and then subtracting the min value of B where A is the max.
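The arithmetic described above can be checked with a short Python sketch. It assumes the interpretation from Edit 2 (among rows tied for the max in column A, the one with the smallest column B value is the one left out of the sum); the function name `feeder_total` is made up for this example.

```python
def feeder_total(a, b):
    """Max of column A plus every column B value except the smallest
    B value found on a row where A equals that max."""
    max_a = max(a)  # MAX(A1:A5)
    # MINIFS(B1:B5, A1:A5, MAX(A1:A5))
    min_b_at_max = min(bv for av, bv in zip(a, b) if av == max_a)
    return max_a + sum(b) - min_b_at_max


# Table from Edit 2 (duplicate max values in column A)
a = [90, 110, 60, 110, 110]
b = [71, 25, 16, 13, 82]

print(feeder_total(a, b))  # 304
```

Here the rows with A = 110 carry B values 25, 13, and 82, so 13 is dropped and the result is 110 + (71 + 25 + 16 + 82) = 304.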
If you have Excel 365 you can use the following LET-Formula
=LET(A,A1:A5,
B,B1:B5,
MaxA,MAX(A),
MinBExclude, MINIFS(B,A,MaxA),
sumB1,SUMPRODUCT(B*(A=MaxA)*(B<>MinBExclude)),
sumB2,SUMPRODUCT(B*(A<>MaxA)),
MaxA + sumB1 + sumB2)
A and B are shortcuts for the two ranges
MaxA returns the max value for A (110)
MinBExclude filters the values of column B by the MaxA-value (25, 13, 82) and returns the min-value of the filtered result (13)
sumB1 returns the sum of the other MaxA values from column B (25 + 82)
sumB2 returns the sum of the values from B where value in A <> MaxA (71 + 16)
and finally the result is returned
If you don't have Excel 365 you can add helper columns for MaxA, MinBExclude, sumB1 and sumB2 and the final result

Reverse MATCH with a non existing value

I have data in Excel in the following format:
Column A Column B
20/03/2018 300
21/03/2018 200
22/03/2018 100
23/03/2018 90
24/03/2018 300
25/03/2018 200
26/03/2018 100
27/03/2018 50
28/03/2018 90
29/03/2018 100
30/03/2018 110
31/03/2018 120
I would like to get the date after which B never drops below 99 again, chronologically. In the example above, that would be the 29th of March.
If I try to get it with: =INDEX(A:A,MATCH(99,B1:B12,-1)) the value returned is 22/03/2018 as it is the first occurrence found, searched from top to bottom.
In this case it would be perfect to be able to do a reverse match(e.g. a match that searches from bottom to top of the range) but this option is not available. I have seen that it is possible to do reverse matches with the lookup function but in that case I need to provide a value that is actually in my data set (99 would not work).
The workaround I have found is to add a third column like the following (holding the minimum of the current and all later values of B) and to INDEX/MATCH on top of it.
Column A Column B Column C
20/03/2018 300 50
21/03/2018 200 50
22/03/2018 100 50
23/03/2018 90 50
24/03/2018 300 50
25/03/2018 200 50
26/03/2018 100 50
27/03/2018 50 50
28/03/2018 90 90
29/03/2018 100 100
30/03/2018 110 110
31/03/2018 120 120
Is there a way of achieving this without a third column?
The AGGREGATE function is great for problems like these:
=AGGREGATE(14,4,(B2:B13<99)*A2:A13,1)+1
What are those numeric arguments?
14 tells the function to behave like LARGE
4 tells it to ignore nothing (other options let this function skip error values, hidden rows, and so on)
The trailing +1 adds one day to the last date on which B was below 99.
More info here.
If your dates aren't always consecutive, you'll need to add a bit more to the function:
=INDEX(A1:A12,MATCH(AGGREGATE(14,6,(B1:B12<99)*A1:A12,1),A1:A12,0)+1)
=INDEX(A1:A12,LARGE(IF(B1:B12<=99,ROW(B1:B12)+1),1))
This is an array formula (confirm with Ctrl+Shift+Enter while still in the formula bar).
It builds an array of row numbers, each one row below a value that is less than or equal to 99. LARGE then returns the largest of those row numbers for INDEX.
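Both answers rely on the same idea: locate the last row whose value is below the threshold and step one row down. A minimal Python sketch of that idea, using the sample data from the question, might look like this; the function name `first_date_after_last_dip` is hypothetical, and a strict "below 99" comparison is assumed.

```python
from datetime import date, timedelta


def first_date_after_last_dip(dates, values, threshold):
    """Return the date one row after the last value below `threshold`,
    or None if the last row itself is below the threshold."""
    # Index of the last row below the threshold (-1 if there is none)
    last_below = max(
        (i for i, v in enumerate(values) if v < threshold), default=-1
    )
    nxt = last_below + 1
    return dates[nxt] if nxt < len(dates) else None


# 20/03/2018 through 31/03/2018, as in the question
dates = [date(2018, 3, 20) + timedelta(days=i) for i in range(12)]
values = [300, 200, 100, 90, 300, 200, 100, 50, 90, 100, 110, 120]

print(first_date_after_last_dip(dates, values, 99))  # 2018-03-29
```

Because the sketch steps to the next row instead of adding one day, it also covers the case where the dates are not consecutive.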

Using Offset to get to the next instance of a variable?

I am attempting to find the next instance of a variable in order to generate a list based on another variable:
Mkt ID
10 908
15 915
15 416
25 312
25 215
32 482
Similar to the above. There are two drop-downs, one for market and one for ID. I want the user to be able to select a market and have the data validation in the ID drop-down filter to the list of IDs belonging to the market chosen in the first drop-down. Let's say the market drop-down is $G$2. Market is column A, and ID is column B.
Here's the formula I have so far:
OFFSET(ADDRESS(MATCH($G$2,A:A,0),1),0,1,COUNTIFS(A:A,$G$2),1)
This formula references the market, offsets by 0 rows and 1 column, uses the number of instances of that market as the height, and 1 column as the width. I do not see why this is not working. Excel just gives the typical "are you really trying to type a formula?" error.
ADDRESS returns a string that looks like a cell reference. You need INDIRECT to turn that into a real cell reference that OFFSET can use.
=OFFSET(INDIRECT(ADDRESS(MATCH($G$2, A:A, 0), 1)), 0, 1, COUNTIFS(A:A, $G$2), 1)
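What the corrected formula selects can be illustrated with a small Python sketch: MATCH gives the first row of the chosen market, COUNTIFS gives the height of the block, and OFFSET returns that many cells from the ID column. The function name `ids_for_market` is made up for this example, and, like the formula, the sketch assumes that all rows of a market sit next to each other.

```python
def ids_for_market(markets, ids, selected):
    """Return the IDs of the contiguous block of rows for `selected`.
    Assumes rows for the same market are adjacent (sorted data)."""
    start = markets.index(selected)   # MATCH($G$2, A:A, 0); raises ValueError if absent
    height = markets.count(selected)  # COUNTIFS(A:A, $G$2)
    return ids[start:start + height]  # OFFSET(..., 0, 1, height, 1)


# Sample data from the question
markets = [10, 15, 15, 25, 25, 32]
ids = [908, 915, 416, 312, 215, 482]

print(ids_for_market(markets, ids, 15))  # [915, 416]
```

If the market column were not grouped, the slice would pick up IDs of other markets, and the same holds for the OFFSET range.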

Conditional formatting on the first x number of rows, regardless of filter or sort, in Excel

I'm trying to find a way to easily identify the first ten rows in a table column, no matter how it's been sorted/filtered. Is there a way to use conditional formatting to highlight these cells?
Examples of desired results...
Sample data:
product price units code
Item02 15.97 2191 7UQC
Item05 12.95 1523 TAAI
Item13 9.49 1410 LV9E
Item01 5.69 591 6DOY
Item04 15.97 554 ZCN2
Item08 10.68 451 2GN0
Item03 13.95 411 FP6A
Item07 25.45 174 PEWK
Item09 14.99 157 B5S4
Item06 18 152 XJ4G
Item10 11.45 148 BY8M
Item11 16.99 66 86C2
Item12 24.5 17 X31K
Item14 24.95 14 QJEI
When sorting by price the first 10 products highlighted differ from those in the next example.
The first 10 visible products are highlighted after filtering out Item12, Item05, and Item08.
Choosing to sort by units automatically highlights a different set of products.
Use this formula in the Conditional Formatting:
=SUBTOTAL(3,$A$2:$A2)<11
Make sure it applies to the entire dataset.
The formula returns a running count of the visible rows from the top of the data down to the current row. Hidden rows are skipped, so the row beneath a hidden row is counted as one greater than the last visible row above it.
To see how it works place SUBTOTAL(3,$A$2:$A2) in an empty column. Then filter the table and watch as the numbers change.
The 3 refers to the COUNTA() function, which will count any non-empty cell.
Subtotal is designed to work with data that gets filtered to return only the visible data.
So the Formula will only count the visible cells that are not empty.
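The running count can be pictured with a short Python sketch. It is only a model of the behaviour described above: `visible` is a hypothetical list of flags standing in for Excel's filter state, and hidden rows are simply reported as not highlighted.

```python
def highlight_flags(visible, limit=10):
    """For each row, True if it is visible and among the first `limit`
    visible rows, mirroring SUBTOTAL(3, $A$2:$A2) < limit + 1."""
    flags = []
    count = 0  # running count of visible, non-empty rows so far
    for is_visible in visible:
        if is_visible:
            count += 1
        flags.append(is_visible and count <= limit)
    return flags


# Products in the order shown in the question, with three filtered out
products = ["Item02", "Item05", "Item13", "Item01", "Item04", "Item08", "Item03",
            "Item07", "Item09", "Item06", "Item10", "Item11", "Item12", "Item14"]
hidden = {"Item12", "Item05", "Item08"}
visible = [p not in hidden for p in products]

highlighted = [p for p, f in zip(products, highlight_flags(visible)) if f]
print(len(highlighted), highlighted[-1])  # 10 Item11
```

With the three items filtered out, eleven rows remain visible, so only Item14 at the bottom is left without highlighting.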
In the conditional formatting dialog, choose New rule -> Use a formula.... Enter =row()<=10.

In excel, I need to find the maximum date based on the employee number

I have tried to use the following formula when trying to find the max date of these columns based on the employee number in my hundreds of thousands lines of data. The formula bar gives me 'yes' when it is the max, however in my cell it says 'no'. I cannot figure out what the issue is. Thanks for the help.
Tamara
Formula used: =IF(AQ2=MAX(IF($C:$C=C2,$AQ:$AQ)),"YES","NO")
A B Employee Number Max?
11-Mar-13 12-Mar-13 199 NO
24-Mar-13 26-Mar-13 199 NO
1-Aug-13 6-Aug-13 199 NO
22-Dec-13 27-Dec-13 199 NO
15-Apr-13 17-Apr-13 206 NO
18-Apr-13 18-Apr-13 206 NO
8-Aug-13 10-Aug-13 206 NO
17-Oct-13 18-Oct-13 206 NO
25-Dec-13 20-Feb-14 206 YES
8-May-13 8-May-13 214 NO
You can also accomplish this without an array formula if all of the dates for a specific employee ID are unique, that is, you won't have two of the same date. In this case, the following formula checks that (a) the number of dates for the employee ID equals (b) the number of dates for that employee ID that are less than or equal to the current date. This will only be true for the "max" date for that employee ID:
=IF(COUNTIFS($C:$C,C2)=COUNTIFS($C:$C,C2,$A:$A,"<="&A2),"Yes","No")
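The counting trick can be verified with a small Python sketch that mirrors the two COUNTIFS calls. The function name `max_date_flags` is an assumption for this example, and the quadratic loop is only meant to show the logic, not to be efficient on hundreds of thousands of rows.

```python
from datetime import date


def max_date_flags(employee_ids, dates):
    """Return "Yes" for rows holding the latest date of their employee."""
    flags = []
    for emp, d in zip(employee_ids, dates):
        # COUNTIFS($C:$C, C2): rows for this employee
        total = sum(1 for e in employee_ids if e == emp)
        # COUNTIFS($C:$C, C2, $A:$A, "<=" & A2): rows for this employee
        # whose date is not later than the current row's date
        not_later = sum(
            1 for e, other in zip(employee_ids, dates) if e == emp and other <= d
        )
        flags.append("Yes" if total == not_later else "No")
    return flags


# Column A dates and employee numbers from the question
employee_ids = [199, 199, 199, 199, 206, 206, 206, 206, 206, 214]
dates = [
    date(2013, 3, 11), date(2013, 3, 24), date(2013, 8, 1), date(2013, 12, 22),
    date(2013, 4, 15), date(2013, 4, 18), date(2013, 8, 8), date(2013, 10, 17),
    date(2013, 12, 25), date(2013, 5, 8),
]

print(max_date_flags(employee_ids, dates))
```

The two counts only match when no other date for the same employee is later than the current one, which is exactly the condition for the max date.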
If I understand your question correctly, you want to find the pair of dates with the largest time span between them. If this is the case, then I would recommend using two separate functions, the =DAYS360 function and the =MAX function.
I have re-created your sheet and it will end up looking similar to this:
Here is the same picture of the same sheet with functions revealed, so that you can see how the functions are used:
The =DAYS360 function takes two inputs and returns the number of days between two dates (based on a 360-day year). The MAX function simply finds the largest number in a range. Please let me know if this helped.
EDIT: Also, if you want to see the actual word Max next to the largest date range, you can nest the MAX function from my column E within an IF function, like this:
=IF(MAX(D:D)=D2,"Max","")
If I understand you correctly, do you want "YES" to appear for each employee's max date range? Assuming column AQ contains the spans between dates in columns A and B (i.e. =B2-A2 copied down), your formula should work.
This only works as an array formula, so make sure you press CTRL+SHIFT+ENTER when entering the formula, then copy it down to all cells in the same column.
=IF(AQ2=MAX(IF($C:$C=C2,$AQ:$AQ)),"YES","NO"), entered in D2 using CTRL+SHIFT+ENTER and copied down, produces the following:
A B C D ... AQ
11-Mar-13 12-Mar-13 199 NO 1
24-Mar-13 26-Mar-13 199 NO 2
1-Aug-13 6-Aug-13 199 YES 5
22-Dec-13 27-Dec-13 199 YES 5
15-Apr-13 17-Apr-13 206 NO 2
18-Apr-13 18-Apr-13 206 NO 0
8-Aug-13 10-Aug-13 206 NO 2
17-Oct-13 18-Oct-13 206 NO 1
25-Dec-13 20-Feb-14 206 YES 57
8-May-13 8-May-13 214 YES 0
If you are simply looking for the greatest date range, the formula =IF(E2=MAX($E:$E),"YES","NO") entered in D2 and copied down will do the trick.
