Finding nth Occurrence in multiple columns in Excel

Finding nth Occurrence in multiple columns in Excel - excel

I have two columns with team names and two columns with corresponding stats. I need to go through the 2 columns and find the stats that match the team name, and they need to be in order. VLOOKUP, MATCH, and SEARCH don't seem to work with multiple columns. Does anyone know how this can be done?

Assuming in your picture, the "Home" title is in cell B2 then the following array formula can be put in the cells H3:L7 (array formulas need to be entered with Ctrl+Shift+Enter)
=IFERROR(OFFSET($D$1,-1+SMALL(IF(($B$3:$B$7=H$2)+($C$3:$C$7=H$2),ROW($B$3:$B$7),"X"),ROW()-ROW($2:$2)),--NOT((INDEX($B:$B,SMALL(IF(($B$3:$B$7=H$2)+($C$3:$C$7=H$2),ROW($B$3:$B$7),"X"),ROW()-ROW($2:$2)))=H$2))),"")
Let me break it down...
the logic is: OFFSET(top_of_results,row_number_that_has_Nth_team_score,0_or_1_for_home_or_away)
this is wrapped in an IFERROR in for where there isn't e.g. a 5th score for team A
using the array part IF(($B$3:$B$7=H$2)+($C$3:$C$7=H$2),ROW($B$3:$B$7),"X" we get an array of that has the ROW number if either (done using +) B or C have a value matching our team header (H$2) or an X otherwise
using SMALL(...,ROW()-ROW($2:$2)) we get the Nth smallest row, based on 1st being in row 3, 2nd in row 4 etc.
to get whether it is home or away, we check column B on our row to see if it matches --NOT((INDEX($B:$B,row_number_that_has_Nth_team_score-O)=H$2)) this gives 0 for home, 1 for away and this is used to offset the column
Hopefully it makes sense. Array formulas are very powerful, if a little confusing :-) I recommend CPearson's intro for more information.
Good luck!

Related

Find a value in a range (any of multiple column or rows) and return the value in far left most column of that row

Looking for something like a typical index match formula that can look to the right and return value to the left, but look at all columns in a range. Take below valid formula for example.
(Excel 2021.)
Finds A1's value in column D, and it returns value from column C.
=INDEX($C$1:$C$10,MATCH(A1,$D$1:$D$10,0))
In my ideal world I can Keep $D$1 and change $D$10 to $F$10 so it searches all columns D/E/F, and still returns C like below. However that does not work in our real world, any other ideas please? Thanks!
=INDEX($C$1:$C$10,MATCH(A1,$D$1:$F$10,0))
Update*
To clarify there are mix of letters and numbers. Also this table will be about 50k rows so hoping as simple as possible.
Also Column C will all be unique for sure, and D-F should be unique values but there is a chance a mistake and a few duplicates might be in.

You need MMUL() with INDEX(). Try below formula if you have Excel-365.
=FILTER(C1:C10,MMULT(--(D1:F10=A1),SEQUENCE(COLUMNS(D1:F1))))
For older version try
=INDEX($C$1:$C$10,LARGE(MMULT(--($D$1:$F$10=A1),TRANSPOSE({1,1,1}))*ROW($C$1:$C$10),1))

Since your INDEX/MATCH take from the same rows, you can first simplify your original search with
=XLOOKUP(A1,$D$1:$D$10,$C$1:$C$10)
XLOOKUP combines HLOOKUP and VLOOKUP with exact match being the default.
This will work for searching three rows
IFERROR(IFERROR(XLOOKUP(A1,$D$1:$D$10,$C$1:$C$10), XLOOKUP(A1,$E$1:$E$10,$C$1:$C$10)), XLOOKUP(A1,$F$1:$F$10,$C$1:$C$10))
We can name the columns colC, colD, colE, and colF and it becomes
IFERROR(IFERROR(XLOOKUP(A1,colD,colC), XLOOKUP(A1,colE,colC)), XLOOKUP(A1,colF,colC))
As with other lookups, this returns the first value or #N/A error.
This could be made more scalable for higher number of rows if we are allowed to add a column somewhere.

excel formula: report a value from the B column that corresponds to the matching value A column

I am not familiar enough with excel to build this formula:
I am trying to write an excel formula to search for matching data in columns A and F, then report a value from column B that corresponds to the same row as the "found" A column into column G.
So basically, I'm putting the formula into G607 and I am searching, in column A for a value that matches F607.
If I find a matching value in A104, I want the value that reports in G607 to be B104.
My spreadsheet does not have any duplicate values in column A.

VLOOKUP is cool, but I would warn you away from it before you get comfortable with it. VLOOKUP is silly because:
It requires you to define an array where the unique IDs are ALWAYS the left column.
It requires you to COUNT COLUMNS LIKE A HEATHEN to tell it what the output is.
There are memory and other structural problems not worth going into.
You have to sort Column A if you care about exact matches. Amateur hour here.
Guys, I'm joking, I don't care that much, but seriously, Index/Match is cooler.
In G607, put this:
=INDEX(Sheet1!B:B,MATCH(Sheet1!F607,Sheet1!A:A,0))
Break it down:
INDEX() helps us by saying, "What is the answer you want? Cool, now tell me the row and column?" Obviously, if we knew the row/column, we wouldn't care.
Que the MATCH() Equation - this is where we say, "Hey, see F607? Yeah, find where it matches in Column A."
IF THERE WERE DUPLICATES IN COLUMN A, it would stop at the first entry and report that. Not a concern here since you don't have duplicates!
The 3rd argument (has a 0) in the MATCH equation just says "Hey, make it an exact match".
Index/Match like this makes sure that we can:
Pick any match column we want. VLOOKUP wouldn't work if Column B had the Unique IDs, and Column A was the answer instead.
Pick any output column we want without counting. Seriously, who counts in 2018?
Arrays are minimized b/c 1 and 2 above, plus other reasons.
No sorting required. Winner. Winner.

Assuming all cells are on the same sheet,
G607 = VLOOKUP(F607, A:B, 2, 0)

Find and remove duplicate IDs and replace

I have very basic user Excel knowledge. I have a spreadsheet where I keep track of reloading data. Each load I enter gets a unique load number that is calculated automatically with a formula, based on the caliber name and an incrementally increasing number. As of now, every load I enter gets a number, even if it's been repeated before. Popular loads that I repeat often are all the same except for the date and numbers of rounds made but currently will have different load numbers. Is there a way to skips these repeated loads and assign it the previous load number or not assign a load number at all, with a formula instead of manually?
I know this is asking for a great deal but I'd greatly appreciate any help! I'm certainly open to suggestions if this isn't even the best way to go about this.
Sample workbook at:
https://www.dropbox.com/s/v5y1ufxjiosmnap/My%20Reloading%20Data%20-%20Sample.xlsx?dl=0
Here's what I've tried so far:
In column Q2, combine all the criteria.
=C17&E17&F17&G17&H17&L17&M17&N17&P17
In column R2 look for duplicates.
=IF(COUNTIF($Q$2:$Q17, $Q17)>1, "Duplicate", "")
D2 is the Load # column.
=IF(R17="Duplicate","",(TEXT(C17,0)&"-"&TEXT(COUNTIF($C$2:C17,C17),"000")))
This will skip the duplicate loads and not give them a load # leaving the cell blank. I'd love to find and match what that load # should be and insert it. Also, when the sequential numbering resumes it acts as if it's counted the duplicate row. For instance D2 might look like:
9mm-001
9mm-002
(Skipped for duplicate and left blank, but would like it to find, match, and insert the duplicate load #)
9mm-004 (I'd like to to be 9mm-003)

You should be able to achieve this with a VLOOKUP formula or a combination of MATCH and INDEX.
VLOOKUP (Vertical Lookup) looks for a match in another cell and returns a value from an offset column. A non match, if you use FALSE as the last parameter, returns a #N/A error.
So, in D20 (for example) you could, using column Q as your determinant, use the following, assuming you had a copy of D in column R:
=IFERROR(VLOOKUP(Q20,Q$1:R19,2,FALSE),[value for newly found loadno])
What this formula does is calculates a VLOOKUP - if that doesn't find a record, calculate a new value. The VLOOKUP will look at the concatenated key in the current row Q column, search through all previous columns (note it is anchored at row 1, but not anchored for the bottom of the range so you can copy the formula), it uses the column 2 (Q is column 1, so R is column 2) for the result, and demands an exact match (FALSE). If it doesn't find one, return NA and let the second half of the IFERROR take over.
See how you go with this.
The MATCH INDEX may work better because you won't need the additional R column due to VLOOKUP only being able to look to the right of the key.
Here is an INDEX and MATCH solution - slightly harder to understand, but a more flexible solution.
=IFERROR(INDEX(D$1:D19,MATCH(Q20,Q$1:Q19,0)),[value for newly found load number])
I prefer this.
The outer function says return the nth value in the list. The inner MATCH function says find this value (Q20) in this list (Q1:Q19). The 0 as the third parameter of the MATCH function says the match has to be exact.

Transpose multiple occurrences

EDIT: I have revived the source data source to remove the ambiguity of my last screen shots
I am trying to transpose spreadsheet data where there are many rows where the customer name may be duplicated but each row contains a different product.
For instance
revised original data source
to
revised proposed data format
I would like to do it with formulae if possible as I struggle with VB
Thank you for any help

I realise this is a huge answer, apologies but I wanted to be clear. If you need anything from me, drop me a comment and I'll help out.
Here's the output from my formula:
EDITED ANSWER - Named ranges used for ease of understanding:
These are just an example of a few of the named ranges I have used, you can reference the ranges directly or name them yourself (simplest way is to highlight the data then put the name in the drop down next to the formula bar [top left])
Be wary that as we will be using Array formulas for AccNum and AccType, you will not want to select the entire column and instead opt for either the exact data length or overshoot it by 100 or so. Large array formulas tend to slow down calculation and will calculate every cell individually regardless of it being empty.
First formula
=IF(COUNTIF(D2:D11,">""")>0,CONCATENATE("Account Number ",LEFT((COLUMN(A:A)+1)/2,1)),"")
This formula is identical to the one in the original answer apart form the adjusted heading title.
=IF(Condition,True,False) - There are so many uses for the IF logic, it is the best formula in Excel in my opinion. I have used to IF with COUNTIF to check whether there is more than 0 cells that are more than BLANK (or ""). This is just a trick around using ISBLANK() or other blank identifiers that get confused when formula is present.
If the result is TRUE, I use CONCATENATE(Text1,Text2,etc.) to build a text string for the column header. ROW(1:1) or COLUMN(A:A) is commonly used to initiate an automatically increasing integer for formulas to use based on whether the count increase is required horizontally or vertically. I add 1 to this increasing integer and divide it by 2 so that the increase for each column is 0.5 (1 > 1.5 > 2 > 2.5) I then use LEFT formula to just take the first digit to the left of this decimal answer so the number increases only once every 2 columns.
If the result is FALSE then leave the cell blank ,""). Standard stuff here, no explanation needed.
Second Formula
=CONCATENATE(INDEX(Forename,MATCH(Sheet4!$A2,Reference,0)))
=CONCATENATE(INDEX(Surname,MATCH(Sheet4!$A2,Reference,0)))
CONCATENATE has only been used here to force blank cells to remain blank when pulled by INDEX. INDEX will read blank cells as values and therefore 0's whereas CONCATENATE will read them as text and therefore "".
INDEX(Range,Row,Column): This is a lookup formula that is much more advanced than VLOOKUP or HLOOKUP and not limited in the way that they are.
The range i have used is the expected output range - Forename or Surname
The row is then calculated using MATCH(Criteria,Range,Match Type). Match will look through a range and return the position as an integer where a match occurs. For this I have set the criteria to the unique reference number in column A for that row, the range to the named range Reference and the match type as 0 (1 Less than, 0 Exact Match, -1 Greater than).
I did not define a column number for INDEX as it defaults to the first column and I am only giving it one column of data to output from anyway.
Third Formula
Remember these need to be entered as an array (when in the formula bar hit Ctrl+Shift+Enter)
=IFERROR(INDEX(AccNum,SMALL(IF(Reference=Sheet4!$A2,ROW(Reference)-ROW(INDEX(Reference,1,1))+1),ROUNDDOWN((COLUMN(A:A)+1)/2,0))),"")
=IFERROR(INDEX(AccType,SMALL(IF(Reference=Sheet4!$A2,ROW(Reference)-ROW(INDEX(Reference,1,1))+1),ROUNDDOWN((COLUMN(B:B)+1)/2,0))),"")
As you can see, one of these is used for AccNum and the other for AccType.
IFERROR(Value): The reason that this has been used is that we are not expecting the formula to always return something. When the formula cannot return something or SMALL has run out of matches to go through then an error will occur (usually #VALUE or #NUM!) so i use ,"") to force a blank result instead (again standard stuff).
I have already explained the INDEX formula above so let's just dive in to how I have worked out the rows that match what we are looking for:
SMALL(IF(Reference=Sheet4!$A2,ROW(Reference)-ROW(INDEX(Reference,1,1))+1),ROUNDDOWN((COLUMN(B:B)+1)/2,0))
The IF statement here is fairly self explanatory but as we have used it as an array formula, it will perform =Sheet4!$A2 which is the unique reference on every cell in the named range Reference individually. In your mock data this returns a result of: {FALSE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE} for the first entry (I included titles in the range, hence the initial FALSE). IF will do my row calculation* for every true but leave the FALSEs as they are.
This leaves a result of {FALSE;2;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE} that SMALL(array,k) will use. SMALL will only work on numeric values and will display the 'k'th result. Again the column trick has been used but to cover more ground, I used another method: ROUNDDOWN(Number,digits) as opposed to using LEFT() Digits here means decimal places so I used 0 to round down to a whole integer for the same result. As this copies across the columns like so: 1, 1, 2, 2, 3, 3, SMALL will alternatively (as the formulas alternate) grab the 1st smallest AccNum then the 1st Smallest AccType before grabbing the 2nd AccNum and Acctype and so forth.
*(Row number of the match minus the first row number of the range then plus 1, again fairly common as a foolproof way to always get the correct row regardless of where the data starts; actually as your data starts on row 1 we could just do ROW(Reference) but I left it as is incase you had data in a different format)
ORIGINAL ANSWER - Same logic as above
Here's your solution in 3 parts
Part 1 being a trick for the auto completion of the titles so that they will hide when not used (in case you will just copay and paste values the whole lot to speed up use again).
=IF(COUNTIF(C2:C11,">""")>0,CONCATENATE("Product ",LEFT((COLUMN(A:A)+1)/2,1)),"") in C
=IF(COUNTIF(D2:D11,">""")>0,CONCATENATE("Prod code ",LEFT((COLUMN(B:B)+1)/2,1)),"") in D
Highlight both of the cells and drag across to stagger the outputs "Product " and "Prod code "
Part 2 would be inputting the unique IDs to the new sheet, I would suggest copying your entire column A across to a new sheet and using DATA > REMOVE DUPLICATES > Continue with current selection to trim out the multiple occurrences of unique IDs.
In column B use =INDEX(Sheet2!$B$1:$B$7,MATCH(Sheet4!$A2,Sheet2!$A$1:$A$7,0)) to get the names pulled across.
Part 3, the INDEX
Once again, we are doing a staggered input here before copying the formula across the page to cover the entirety of the data.
=IFERROR(INDEX(Sheet2!$C$1:$D$11,SMALL(IF(Sheet2!$A$1:$A$11=Sheet4!$A2,ROW(Sheet2!$A$1:$A$11)-ROW(INDEX(Sheet2!$A$1:$A$11,1,1))+1),ROUNDDOWN((COLUMN(A:A)+1)/2,0)),1),"") in C
=IFERROR(INDEX(Sheet2!$C$1:$D$11,SMALL(IF(Sheet2!$A$1:$A$11=Sheet4!$A2,ROW(Sheet2!$A$1:$A$11)-ROW(INDEX(Sheet2!$A$1:$A$11,1,1))+1),ROUNDDOWN((COLUMN(B:B)+1)/2,0)),2),"") in D
The formulas of Part 3 will need to be entered as an array (when in the formula bar hit Ctrl+Shift+Enter) . This will need to be done before copying the formulas across.
These formulas can now be dragged / copied in all directions and will feed off of the unique ID in column A.
My Answer is already rather long so I haven't gone on to break the formula down. If you have any trouble understanding how this works, let me know and I will be happy to write up a quick guide, breaking it down chunk by chunk for you.

Vlookup and get the min value (date)

TOP Table is Input, and bottom table is preview for required output.
For Each ID I need to find earliest datetime. I also need other information from other columns (please see image below).
My current solution is:
In Cell E2 =A2
Cell E3 drag down =IF(E2<>A3,IF(E1=A3,"",A3),"")
In Cell F2 drag down =IF(E2<>"",MIN(IF($A$2:$A$14=E2,$C$2:$C$14)),"") Ctrl+Shift+Enter

One more option without any intermediate calculations:
Select the whole range starting E2 and to the last row where IDs are located - for the sample given it's row 14, so select range E2:E14: =IFERROR(INDEX($A$2:$A$14,SMALL(IF(MATCH($A$2:$A$14,$A$2:$A$14,0)=ROW(INDIRECT("1:"&ROWS($A$2:$A$14))),MATCH($A$2:$A$14,$A$2:$A$14,0),""),ROW(INDIRECT("1:"&ROWS($A$2:$A$14))))),"") and press CTRL+SHIFT+ENTER instead of usual ENTER - this will define a Multicell ARRAY formula and will result in curly {} brackets around it (but do NOT type them manually!).
F2 (ID2): =IF(E2="","",SUMPRODUCT(--(E2=$A$2:$A$14),--(G2=$C$2:$C$14),$B$2:$B$14)) - normal formula.
G2 (Min Date): =IF(E2="","",MIN(IF(E2=$A$2:$A$14,$C$2:$C$14,2^100))) and press CTRL+SHIFT+ENTER instead of usual ENTER - this will define an ARRAY formula and will result in curly {} brackets around it (but do NOT type them manually!).
H2 (InCh): =IF(E2="","",INDEX($D$2:$D$14,SUMPRODUCT(--(E2=$A$2:$A$14),--(F2=$B$2:$B$14),--(G2=$C$2:$C$14),ROW(INDIRECT("1:"&ROWS($D$2:$D$14)))))) - normal formula.
Remarks:
To make the solution more compact and easy to read, define named range for ID column, and then reference other data columns using OFFSET.
ID2 values may not be unique - as they are on the sample for IDs 1...3.
Resulting set for Min Date should be formatted the same way as source Date row.
The key formula of the solution - is multicell monster which returns unique IDs without empty rows - as OP requested)
Sample file: https://www.dropbox.com/s/d2098updfh8djnf/MinDateIDs.xlsx

This is quite a challenge... I think I have found an approach that works. For the sake of clarity, I used a few helper columns. Also, I did not use any named ranges but stuck with the column-row indications. You might want to change that.
It looks like this:
and zooming in to the relevant columns:
Column F contains an array formula to filter out duplicates. An approach is explained here. The formula I used in F2 is
=INDEX($A$2:$A$14, MATCH(MIN(IF(COUNTIF($F$1:F1,$A$2:$A$14)=0, 1, MAX((COUNTIF($A$2:$A$14, "<"&$A$2:$A$14)+1)*2))*(COUNTIF($A$2:$A$14, "<"&$A$2:$A$14)+1)), COUNTIF($A$2:$A$14, "<"&$A$2:$A$14)+1, 0))
Use Ctrl-Shift-Enter to confirm as array formula. Drag this down or copy into column F. Then columns G and H contain the starting and ending indices of the duplicate ID values. This answer helped, please upvote it :-). The two formulas used are:
=MATCH(2,1/FREQUENCY($F2,$A$2:$A$14))
in G2, and
=FREQUENCY($A$2:$A$14,$F2)
in H2. Again, drag them down to get the full column filled. Next, column I is for clarification only -- and for sanity checking. It contains the desired minimum date from each sub-array. Column J substitutes that formula into a MATCH to find the actual index of the desired date.
=MIN(OFFSET($C$2:$C$14,$G2-1,0,1+$H2-$G2,1))
in I2 and
=$G2-1+MATCH(2,1/FREQUENCY(MIN(OFFSET($C$2:$C$14,$G2-1,0,1+$H2-$G2,1)), OFFSET($C$2:$C$14,$G2-1,0,1+$H2-$G2,1))
in J2. Finally, columns L, M and N index into the original set of data via
=INDEX(B$2:B$14,$J2)
in L2, which you can drag horizontally and then vertically.
When you are done, you can hide the helper columns, or fold everything into big formulas. Good luck with that... There might be an easier way to achieve this, but I did not find it.

If you want the value from column D in G then assuming that column C values are unique you could just use a VLOOKUP, i.e. in G2 copied down
=VLOOKUP(F2,C$2:D$14,2,0)

Per your picture, they're all in the same sheet. Just sort by ID, then Date (ascending). As you work your way down the ID column, each time the ID changes, you know you've found the row with the minimum Date for that specific ID. Create an extra column to signify where ID changes occur, and filter for those rows (hide the column if you so desire).
And... voila.

Know this link is old, but there is a much shorter and easier way!
How about using a pivot table using the Minimum as field setting and then do a =GETPIVOTDATA() to get the information back!
Seems a lot simpler as these formulas!

Actually, I just realized I've been overthinking this...Excel keeps the top item and removes all that follow when removing duplicates.
So if you are going to create an extra working table anyway, why not just copy the range/columns you want to keep, then use the basic sort.
Sort first by ID, then by the column you want as the second filter. Be sure the sorts are in the order you want (e.g. newest to oldest, oldest to newest, A to Z, Largest to smallest, etc).
Once the data is sorted, remove duplicates based on ID. You are left with all of your columns of data, filtered by newest/oldest/largest/smallest per individual.
This worked for my table with 30,000+ records, filtered down to 1500 unique individuals with most recent (plus associated amount), and with a second filter, the largest (plus associated date) for each person.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Finding nth Occurrence in multiple columns in Excel - excel

I have two columns with team names and two columns with corresponding stats. I need to go through the 2 columns and find the stats that match the team name, and they need to be in order. VLOOKUP, MATCH, and SEARCH don't seem to work with multiple columns. Does anyone know how this can be done?

Related

Find a value in a range (any of multiple column or rows) and return the value in far left most column of that row

excel formula: report a value from the B column that corresponds to the matching value A column

Find and remove duplicate IDs and replace

Transpose multiple occurrences

Vlookup and get the min value (date)

Categories

Resources