Multiple Match Results From Array Search - excel

For the sake of MWE, I have an array in $AP$4:$BO$20 with a single string in each cell. The data in each cell is an alphanumeric code, such as 1,1a,2b,3c, etc.
Row 22, starting in column AQ, contains a single string that matches one or more of the strings in the array named above. Goal: using each string in AQ22:AO22, create a formula that extracts EVERY row number of the cells in the array $AP$4:$BO$20 that contain exactly the value in AQ22:AO22.
Bonus for doing it without using an array formula. VBA is not an option since this is Google Sheets, and I'd really prefer to avoid g-apps-script.
I've attempted using
=INDIRECT(ADDRESS(MIN(IF(NOT(ISERROR(FIND(AQ22,$AP$4:$BO$20,1))),ROW($AP$4:$BO$20),"")),1))
and
=IFERROR(INDEX($AP$4:$BO$20,SMALL(IF($AP$4:$BO$20=AQ22,ROW($AP$4:$BO$20)-4),ROW(A1)),2),"")
and even the illustrious
=IF(ISERROR(INDEX($AP$4:$BO$20,SMALL(IF($AP$4:$BO$20=AQ22,ROW($AP$4:$BO$20)),ROW(1:1)),2)),"",INDEX($AP$4:$BO$20,SMALL(IF($AP$4:$BO$20=AQ22,ROW($AP$4:$BO$20)),ROW(1:1)),2))
Here is a toy sheet to test out ideas with using this information. Note the comment on the cell where the formula will begin.

Not sure if I understand the desired result correctly, but
= IfError( Filter( Row($AL$4:$AL$16), RegExMatch( $AL$4:$AL$16, "\b" & AQ22 & "\b" ) ), "")
results in 7, with 9 in a separate cell below it. \b is a word boundary: it matches between an alphanumeric and a non-alphanumeric character. If you want the results in one cell, you can join them:
=IfError(Join(",", Filter(Row($AL$4:$AL$16), RegExMatch($AL$4:$AL$16, "\b"&AQ22&"\b"))), "")
You can also match multiple values:
=IfError(Filter(Row($AL$4:$AL$16), RegExMatch($AL$4:$AL$16, "\b(" & Join("|", AQ22:AZ22) & ")\b")), "")
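To check the word-boundary logic outside the sheet, here is a minimal Python sketch of what the Filter/RegExMatch formulas compute. The sample cells, the starting row and the function name are assumptions for illustration only. Unlike the formulas, the sketch escapes the code before building the pattern.

```python
import re

def matching_rows(cells, code, first_row=4):
    """Return sheet row numbers whose cell contains `code` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(code) + r"\b")
    return [first_row + i for i, cell in enumerate(cells) if pattern.search(cell)]

# Hypothetical column contents, starting at sheet row 4.
cells = ["1,1a", "2b", "3c,1", "11", "1a"]
print(matching_rows(cells, "1"))    # "11" and "1a" are not whole-word matches for "1"
print(matching_rows(cells, "1a"))
```

The word boundary is what keeps a search for 1 from also matching 11 or 1a.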

In Excel you can do it without a CSE entered formula, but I don't know if the AGGREGATE function is available in Sheets:
=IFERROR(AGGREGATE(15,6,1/((AQ$22=arr)*(LEN(arr)>0))*ROW(arr),ROWS($1:1)),"")
For an array entered formula:
=IFERROR(SMALL(IF((AQ$22=arr)*(LEN(arr)>0),ROW(arr),""),ROWS($1:1)),"")
For either one of those formulas, enter in AQ24, then fill down until you get blanks, and across. When you fill down, none of the "target rows" can be hidden (or else the result of the formula will be hidden).
arr refers to $AP$4:$BO$20
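As a rough illustration of what the SMALL(IF(...), k) pattern does as it is filled down, here is a Python sketch. The grid contents and the function name are hypothetical; the sketch only mirrors the logic and is not a Sheets or Excel feature.

```python
def kth_matching_row(grid, target, k, first_row=4):
    """Mimic SMALL(IF(grid=target, ROW(grid)), k); return "" when fewer than k cells match."""
    rows = sorted(
        first_row + r
        for r, row in enumerate(grid)
        for cell in row
        if cell != "" and cell == target
    )
    return rows[k - 1] if 1 <= k <= len(rows) else ""

# Hypothetical 3x3 block whose first row sits on sheet row 4.
grid = [["1", "2b", ""], ["3c", "1", "1"], ["", "1a", "2b"]]
print([kth_matching_row(grid, "1", k) for k in range(1, 5)])
```

Note that a row holding the target in two cells is reported twice, because the test runs per cell rather than per row.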

Though this is a very specific application in response to this question, for the sake of the knowledge base, I'd like to show how I dealt with an instance of multiple match values. There is likely a much better way, but here is one way.
To give this context, imagine the LIST_CELL is a list of question numbers
(which are entered in as a header row, call the range QUESTIONS) on a test that correspond to certain standards, and the goal is to average only the questions that correspond to the standard next to which the list is written, and for each student. Using
=iferror(join(",",ArrayFormula(match(split(LIST_CELL,","),QUESTIONS,FALSE))),"")
The split function splits a hand-entered list of questions on commas, the match function returns the column number of that particular question in QUESTIONS, and the join function joins the data back together. ArrayFormula allows the match to be performed on an array instead of just the first value.
Another single row heading lists the standards to which each question has been matched (possibly to more than one standard) by the comma separated list in LIST_CELL. For a column list of students in A:A, each standard needs to average the scores of every question that is listed next to the standard. This is accomplished by the nifty (if clunky):
average(ArrayFormula(hlookup(split(vlookup(LOOKUP_VAL,SEARCH_RANGE,COL_W_LIST),","),DATA_SOURCE,row(CURRENT_CELL))))
Breakdown from center outward:
LOOKUP_VAL is the value being looked up (the one that has multiple matches); in the example context, it's the standard.
SEARCH_RANGE is a range of cells containing both the list of lookup value (the standards in context) and the comma separated lists of column numbers generated by the first function. COL_W_LIST is the column number in the array SEARCH_RANGE that contains the list of row numbers matched from LIST_CELL.
Split takes the elements apart and places them in a temporary array so that hlookup can be performed on each element. Via ArrayFormula, the hlookup grabs each value on the same row in the appropriate QUESTIONS column; in context, it grabs the point scores for each question matched to the standard.
Finally, average is self-explanatory, and does take an array as input apparently.
These two functions in combination allow the use of indirect cell references in an array formula, and solve the much-asked "how do I include multiple matches in a calculation" question. At least in this specific context.
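The two steps can be sketched in Python as follows. This is a simplified model under stated assumptions: the question labels, the scores and both function names are made up, and one flat list of scores stands in for a student's row in DATA_SOURCE.

```python
def match_positions(list_cell, questions):
    """Like JOIN(",", MATCH(SPLIT(list_cell, ","), questions, FALSE)): 1-based positions."""
    return ",".join(str(questions.index(q.strip()) + 1) for q in list_cell.split(","))

def average_for_standard(positions, scores):
    """Average the scores found at the comma-separated 1-based positions."""
    picked = [scores[int(p) - 1] for p in positions.split(",")]
    return sum(picked) / len(picked)

questions = ["1", "1a", "2b", "3c"]   # hypothetical header row (QUESTIONS)
scores = [4, 2, 5, 3]                 # hypothetical scores for one student
positions = match_positions("1a,3c", questions)
print(positions, average_for_standard(positions, scores))
```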

Related

Generate a unique ID (As much As possible) from a string in Excel using string functions

Let's say I have two strings in two cells
Cell A1 = Customer Country
Cell B1 = Customer City
I need to generate a unique ID using the Excel string functions (LEN, LEFT, MID, RIGHT etc.) or any other (CONCAT etc.) along with the ROW function.
Get first letter & last letter of each word, remove spaces and dashes, get the row number and return a unique string.
If I use
=IF(LEN(A$1)-LEN(SUBSTITUTE(A$1," ",""))=0,LEFT(A$1,1),IF(LEN(A$1)-LEN(SUBSTITUTE(A$1," ",""))=1,LEFT(A$1,1)&MID(A$1,FIND(" ",A$1)+1,1),LEFT(A$1,1)&MID(A$1,FIND(" ",A$1)+1,1)&MID(A$1,FIND(" ",A$1,FIND(" ",A$1)+1)+1,1))) &ROW(A$1)
I get results as CC1 in both cases. How would I get a unique ID in such a case?
The idea in the comment section by @JosWoolley is a good one. Be careful about how and where you add a column index, though. If you just append the column index number, you create confusion between, say, CC111 from row 11 column 1 and the same string from row 1 column 11. Adding the actual address of the cell instead of these indices helps, but it can still cause confusion if you don't add a delimiter first. Therefore I'd suggest something along the lines of:
Formula in D1:
=CONCAT(LEFT(TEXTSPLIT(A1," ")),"|",ADDRESS(ROW(A1),COLUMN(A1),4))
Note: If you don't yet have access to TEXTSPLIT() you can swap this with FILTERXML(). Also, you mentioned CONCAT() but if used with Excel 2019 you may need to CSE the formula.
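To make the delimiter argument concrete, here is a small Python sketch of the same idea: initials, then "|", then the cell address. The helper names are hypothetical, and str.split() collapses repeated spaces, which TEXTSPLIT does not, so treat it as an approximation of the formula rather than an exact port.

```python
def column_letters(col):
    """Convert a 1-based column number to spreadsheet letters (1 -> A, 27 -> AA)."""
    letters = ""
    while col > 0:
        col, rem = divmod(col - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters

def make_id(text, row, col):
    """First letter of each word, a delimiter, then the relative cell address."""
    initials = "".join(word[0] for word in text.split())
    return f"{initials}|{column_letters(col)}{row}"

print(make_id("Customer Country", 1, 1))
print(make_id("Customer City", 1, 2))
```

Both strings start with CC, but the address after the delimiter keeps them distinct.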

Find common text within a range of cells(range containing blanks as well)

This is the problem I am facing with an Excel formula:
In column F, I want to find the common text across A2 to E2 (which contains blanks).
My question:
Is there a simple way to get the result without VB?
Any help is appreciated, thanks.
I found that Google Sheets has some really cool functions.
If you put the formula =SPLIT(A1, ",", TRUE,FALSE) in the cell after your row of common text (or probably even in a different sheet; "probably" because I haven't tried it, though it should work), the next x cells (where x is the number of comma-separated pieces in A1, since "," is the delimiter) will contain the text.
Then you can put the formula =IF(SUM(ARRAYFORMULA(if(REGEXMATCH($A$1:$D$1,F1),1,0)))=COUNTA($A$1:$D$1),F1,"") into an equal number of cells after that (you should probably just fill the maximum number), and =CONCATENATE(I1:L1) into the last cell.
To tweak this for yourself: ARRAYFORMULA lets you pass an array in place of a single cell to a function inside it. I have read that it works like a for loop, but I can't really vouch for that. Here it lets REGEXMATCH (a Boolean check on whether the given cell contains the given regex) check each cell in the array.
The SUM adds them up, and the IF compares the total against COUNTA to determine whether the number of cells in the array that contain this string equals the number of non-empty cells.
The CONCATENATE at the end joins all the cells containing the IF formula together, and since the only non-empty cells will be the ones with the common string, that is what this cell returns (with no spaces).
If you need in specifically Excel... this won't help.
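The split, test and concatenate steps above can be sketched in Python like this. The sample row and the function name are hypothetical. The containment test is a plain substring check, which stands in for REGEXMATCH only as long as the pieces contain no regex metacharacters.

```python
def common_text(cells, delimiter=","):
    """Return the pieces of the first non-empty cell that occur in every non-empty cell."""
    filled = [c for c in cells if c.strip()]
    if not filled:
        return []
    pieces = [p.strip() for p in filled[0].split(delimiter) if p.strip()]
    # Keep a piece only if the count of cells containing it equals the non-empty count.
    return [p for p in pieces if sum(p in cell for cell in filled) == len(filled)]

row = ["apple,pear", "", "pear,plum", "kiwi,pear", ""]   # hypothetical A2:E2
print(common_text(row))
```

Like the formula, this matches substrings, so "pear" would also count as present in a cell that holds "pearl".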
We can use power query to achieve the desired result.
Unpivot the columns in Power query
Split all the columns by Comma delimiter
Create a custom column to see if the first column records exist in the remaining columns.
Use the function Text.Contains.
Sample function: =Text.Contains([column.1],[column.1]&[column.2]&[column.3])
If the above function returns TRUE, then take the first column's result (this is the expected result) and load the data back into Excel.

Three Dimensional Lookup Using INDEX/MATCH

This was taken and improved slightly from a question that has since been deleted.
For those who can see deleted posts, it was taken from here: https://stackoverflow.com/questions/39793322/three-dimensional-lookup-no-concatenate-or-named-ranges-excel
I'm trying to do a three dimensional lookup without named ranges or concatenates. Simplified, my data is on the form:
Column1 Column2 Column3
Scott
P 1 2 3
M 4 5 6
N 7 8 9
George
P 10 11 12
M 13 14 15
N 16 17 18
I now want to search for a specific Name and then for a specific letter within that name's table; I then want to match this row number with a specific column.
I tried a simple INDEX/MATCH:
=INDEX(A:D,MATCH("M",A:A,0),MATCH("Column1",1:1,0))
And that works for the first name but not any others, as it finds the first instance of M.
How do I modify it to look for a different name?
I have answered below, but want to see if someone has a better solution.
I used an IF() statement array formula to find what the P row number was after the George row... I also needed to use the MIN() function to get the first P row number after the name.
Beyond that, it's a simple INDEX() function.... that racked my brain for over an hour :).
=INDEX($A$1:$D$9,MIN(IF((ROW(A1:A9)>MATCH($F$4,A1:A9,0))*(A1:A9=$F$5),ROW(A1:A9),"")),MATCH($F$6,$A$1:$D$1,0))
Don't Forget!
Use Ctrl+Shift+Enter when finishing the formula, so it gets evaluated as an array formula.
You can use two other INDEX/MATCH's inside the first MATCH to set the lookup range. Then you simply need to add the MATCH() to find the absolute position of the name.
=INDEX(A:D,MATCH($H$4,INDEX(A:A,MATCH($H$3,A:A,0)):INDEX(A:A,MATCH($H$3,A:A,0)+4),0)+MATCH($H$3,A:A,0)-1,MATCH($H$5,$1:$1,0))
This one works better and does not have a size constraint:
=INDEX(A:D,MATCH(F4,INDEX(A:A,MATCH(F3,A:A,0)):A1040000,0)+MATCH(F3,A:A,0)-1,MATCH(F5,A1:D1,0))
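The idea behind this formula, searching for the letter only from the name row downward, can be sketched in Python. The list layout mirrors the sample data in the question, and the function name is an assumption for illustration.

```python
def lookup(rows, name, letter, column):
    """rows[0] is the header row; the first field of every row is the key in column A."""
    keys = [r[0] for r in rows]
    start = keys.index(name)          # position of the name row
    hit = keys.index(letter, start)   # first matching letter at or below the name
    return rows[hit][rows[0].index(column)]

rows = [
    ["", "Column1", "Column2", "Column3"],
    ["Scott"],
    ["P", 1, 2, 3],
    ["M", 4, 5, 6],
    ["N", 7, 8, 9],
    ["George"],
    ["P", 10, 11, 12],
    ["M", 13, 14, 15],
    ["N", 16, 17, 18],
]
print(lookup(rows, "George", "M", "Column1"))
```

As with the formula, if a name has no row for the requested letter, the search runs on into the next name's block and returns that value instead.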
You can do this just by adding the results of two matches together. One match for the names plus one match for the letter equals the total row.
=INDEX(A:D,MATCH(G5,A3:A5,0)+MATCH(G3,A:A,0),MATCH(G4,1:1,0))
In other words: Index(All of the Data, Match(Name, In name column, exact) + Match(Letter, In letter column, exact), Match(Column name, in Column row, exact))
Screen capture of working sheet
My answer attempts the general case with only one caveat:
That a letter is single-character text, and a name is more than 1 character. Otherwise I feel there is no logical difference between letters and names, and the lookup then becomes impossible...
RE-EDIT for better function construction:
{=INDEX($A$1:$D$17, MATCH($H$3,$A1:$A17, 0)+MATCH($H$4, INDEX($A1:$A17, MATCH($H$3,$A1:$A17, 0)):INDEX($A:$A, SMALL(IFERROR(MATCH($H$3,$A1:$A17, 0)+POWER(SQRT(IF(LEN($A$1:$A$17)>1, ROW($A$1:$A$17), 0)-MATCH($H$3,$A$1:$A$17, 0)), 2)-1, ROWS($A$1:$A$17)), 2)), 0)-1, MATCH($H$5, $A$1:$D$1, 0))}
This uses an array formula along column A, and checks if the length is > 1 and throws the row nums into an array, with letters given a 0.
Then match row of unique name(e.g. George) is subtracted from each.
We then use a min(of all other name rows, with the last data row as the final default - SMALL function with 2 parameter) to find the next name row(or last data row if there is no following name).
Rest is standard index/match etc.
It will correctly return #N/A if there is no such letter under the chosen name...
My dataset is A1:A17, and the formula could use A:A instead each time, but the array calc inside the IF needs the A1:A17 for speed.
EDIT for better function construction:
If we wanted to avoid editing the formula when the data length changes, then we could let full column references of A:A go through the entire construction(and lose speed/efficiency) with the last data row in colA calculated via ROWS(A:A):
Re-edit:
{=INDEX($A:$D, MATCH($H$3,$A:$A, 0)+MATCH($H$4, INDEX($A:$A, MATCH($H$3,$A:$A, 0)):INDEX($A:$A, SMALL(IFERROR(MATCH($H$3,$A:$A, 0)+POWER(SQRT(IF(LEN($A:$A)>1, ROW($A:$A), 0)-MATCH($H$3,$A:$A, 0)), 2)-1, ROWS($A:$A)), 2)), 0)-1, MATCH($H$5,1:1, 0))}
It really depends on the setup...
Edit again for version which takes blanks as separators for names
If you want to use blanks as the separator for names, where no blanks are in the data results, but blanks appear in columns B to D where there is a name, then a tiny change in the above formulae will result in this:
=INDEX($A$1:$D$17, MATCH($H$3,$A$1:$A$17, 0)+MATCH($H$4, INDEX($A:$A, MATCH($H$3,$A:$A, 0)):INDEX($A:$A, SMALL(IFERROR(MATCH($H$3,$A:$A, 0)+POWER(SQRT(IF($B$1:$B$17="", ROW($A$1:$A$17), 0)-MATCH($H$3,$A$1:$A$17, 0)), 2)-1, ROWS($A$1:$A$17)), 2)), 0)-1, MATCH($H$5, $A$1:$D$1, 0))
This means that the names and letters do not have to be any specified length, but just one proviso is that blanks appear in the row with the name.
A small amendment to the condition to find the end range to search for the letter by replacing this: SQRT(IF(LEN($A$1:$A$17)>1, with this:
SQRT(IF($B$1:$B$17="",
I would use the area (4th parameter) of Index(). Below is a screenshot of test data. This example assumes the same columns and keys are sorted and consistent.
This works by using (Range1,Range2) as the first parameter of index. For the 4th parameter of index, use N for which area in the () you want Index to return.
I think this may be slightly tidier, and a little easier to modify maybe.
=INDEX(OFFSET(INDIRECT("A"&MATCH($H$3,$A:$A,0),TRUE),0,0,4,4),MATCH($H$4,$A:$A,0),MATCH(H5,$1:$1,0))
Using offset to create the range first, we're able to use the name from H3 to set that up, and then beyond that we are just indexing within that new range.
Now this is still dependent on the names staying in Column A.
Assuming the format of the data is always Name then P, M and N this formula does the work:
=INDEX($A:$D,
MATCH($H$3,$A:$A,0)
+MATCH($H$4,{"P","M","N"},0),
MATCH($H$5,$1:$1,0))
This solution works in almost all conditions. One restriction I found is when one of the subjects (Names) does not have data for any of the details (letters), but as of now the same occurs with all the other answers.
The formula assumes the data is located at B6:F30 (in order to ensure it can be applied regardless of the source range location).
The formula uses the Index\Match functions:
First, a MATCH to retrieve the position of the Name:
MATCH($H8,$B$6:$B$30,0)
With that info it uses INDEX to build a range that is used to obtain the position of the Detail (letter) using a second MATCH Function:
+ MATCH($I8,INDEX($B$6:$B$30, 1 + MATCH($H8,$B$6:$B$30,0))
:INDEX($B$6:$B$30,ROWS($B$6:$B$30)),0),
Adding the results of the first and second MATCH functions gives the position of the Name\Detail combination, which is then used in an INDEX over the entire data. The position of the required data column is obtained with a MATCH:
INDEX($B$6:$F$30, 1st.MATCH + 2nd.MATCH,
MATCH(J$6,$B$6:$F$6,0))
With the results located at G6:L30, enter this formula in J8, then copy it to J8:L30:
=IFERROR( INDEX( $B$6:$F$30,
MATCH( $H8, $B$6:$B$30, 0)
+MATCH( $I8, INDEX( $B$6:$B$30 , 1 + MATCH( $H8, $B$6:$B$30 ,0))
: INDEX( $B$6:$B$30, ROWS($B$6:$B$30) ),0),
MATCH( J$6, $B$6:$F$6, 0)),"")
This solution works in all conditions discussed so far (let me know of any condition that it does not work and I’ll try to cover it).
I’m posting this as a separated answer as the formulas applied in prior answer rightly apply to the conditions stated in them, as such they will be useful to users with those specific scenarios, so they don’t need to apply these long formulas.
This formula assumes the data is located at B6:E30 (in order to ensure it can be applied regardless of the source range location).
This formula uses the Index\Match functions and it’s a Formula Array.
Array formulas are entered by pressing [Ctrl] + [Shift] + [Enter] simultaneously; you should see { and } around the formula if it was entered correctly.
Syntax:
=IFERROR(INDEX(DataRng,
MATCH(Value1,NamesRng,0)
+IFERROR(MATCH(Value2,INDEX(NamesRng,
1+MATCH(Value1,NamesRng,0))
:INDEX(NamesRng, IFERROR(MATCH(Value1,NamesRng,0)
+MATCH("#",IF((INDEX(Col1Rng,1+MATCH(Value1,NamesRng,0))
:INDEX(Col1Rng,ROWS(NamesRng)))="","#","!"),0),
ROWS(NamesRng))),0),NA()),MATCH(ValCol,DataHdr,0)),"")
Arguments:
Assuming the data is located at B6:E30.
Value1= Name to be found in Data, i.e. George, Scott, etc.
Value2= Detail to be found in Data, i.e. Detail1, Detail2, etc.
ValCol = Column to be found in Data i.e. Column1, Column2, etc.
DataRng= $B$6:$E$30
DataHdr= $B$6:$E$6
NamesRng= $B$6:$B$30
Col1Rng= $C$6:$C$30
1st MATCH: Retrieves the position of the Name:
MATCH(Value1,NamesRng,0)
2nd MATCH: Retrieves the end position of the Name’s corresponding Details, which is determined by a blank value in column C or the end of the data range:
MATCH("#",IF((INDEX(Col1Rng, 1 + 1stMATCH)
:INDEX(Col1Rng,ROWS(NamesRng)))="","#","!"),0),
Builds a Range (vRange): With the Names's Details using the 1st and 2nd match functions. If 2nd Match returns an error then it uses the last row of the Data range:
INDEX(NamesRng, 1 + 1stMATCH )
:INDEX(NamesRng, IFERROR( 1stMATCH + 2ndMATCH, ROWS(NamesRng)))
3rd MATCH: Retrieves the position of the Detail within the vRange. It returns #N/A if the combination is not present.
IFERROR(MATCH(Value2, vRange,0), NA())
Adding the results of the 1st and 3rd MATCH functions gives the row index of the Name\Detail combination, or #N/A if it is not found.
The column index is obtained with a MATCH against the header of the data.
Applying the INDEX function to the data range then returns the value of the Name\Detail\Column combination.
If the Name\Detail combination is not found, it returns blank.
=IFERROR( INDEX( DataRng, 1stMATCH + 3rdMATCH, MATCH(ValCol,DataHdr,0)),"")
With the results located at H6:L37 enter this Formula Array in J8 then copy to K8:L37 and to J9:L37:
=IFERROR( INDEX($B$6:$E$30,
MATCH($H8,$B$6:$B$30,0)
+IFERROR( MATCH($I8, INDEX($B$6:$B$30,
1+MATCH($H8,$B$6:$B$30,0))
:INDEX($B$6:$B$30, IFERROR(MATCH($H8,$B$6:$B$30,0)
+MATCH("#", IF((INDEX($C$6:$C$30,1+MATCH($H8,$B$6:$B$30,0))
:INDEX($C$6:$C$30,ROWS($B$6:$B$30)))="","#","!"),0),
ROWS($B$6:$B$30))),0),NA()),
MATCH(J$6,$B$6:$E$6,0)), "")
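The bounded search this formula performs can be sketched in Python. The sample rows and the function name are hypothetical, and the sketch assumes name rows have a blank in the first data column, as the answer does. One small difference: the sketch stops the search just before the next name row, whereas the formula's range includes it.

```python
def lookup_in_block(rows, name, letter, column):
    """Find `letter` only inside the block of rows that belongs to `name`."""
    keys = [r[0] for r in rows]
    try:
        start = keys.index(name)
        end = len(rows)                      # default: the block runs to the last row
        for i in range(start + 1, len(rows)):
            if rows[i][1] == "":             # blank first data column marks the next name row
                end = i
                break
        hit = keys.index(letter, start + 1, end)
        return rows[hit][rows[0].index(column)]
    except ValueError:                       # name, letter or column not found
        return ""

rows = [
    ["Name", "Column1", "Column2", "Column3"],
    ["Scott", "", "", ""],
    ["P", 1, 2, 3],
    ["M", 4, 5, 6],
    ["George", "", "", ""],
    ["P", 10, 11, 12],
    ["N", 16, 17, 18],
]
print(lookup_in_block(rows, "George", "N", "Column2"))
print(repr(lookup_in_block(rows, "Scott", "N", "Column1")))
```

Here Scott has no N row, so the result is blank rather than George's value.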
Wow... So many solutions already.
I think a simpler solution could be using offset to get a more generic answer.
=INDEX($A$1:$D$9, MATCH($G$3,OFFSET($A$1,MATCH($G$2,$A$1:$A$9,0),0,3,1),0)+MATCH($G$2,$A$1:$A$9,0), MATCH($G$4,$B$1:$D$1,0)+1)
The only variable to look for is 3 which is the number of M/N/P options present because that will affect the number of rows. Otherwise, the solution works fine in all possible scenarios and different orders.
When I have more than two inputs for a data search I prefer to have the data organized as shown in the figure, so that I can use a pivot table and get it to organize the data in rows and columns as I like.
Then I use GETPIVOTDATA to search for a value.
Cell G9 contains this formula:
=GETPIVOTDATA("Value";$F$3;"Name";G15;"Letter";G16;"Column";G17)

Returning all possible values instead of a VLOOKUP

So I've looked up tutorials on how to do this, and I'm still struggling, so I could use some expert help. I know it involves a very complex nested formula with things like SMALL, ROW, INDEX, etc...
So here are two screenshots that provide a sample of what I'm looking for. In reality there are over 1000 rows, but this makes it easier for you guys.
So here is my first example, lets call this Sheet1!:
Code, ID_1 and ID_2. So as you can see (and just focus on the input in A2) there will be two separate IDs in the linked workbook. That sheet, or at least a tiny sample of it, looks like this:
In the first column we see the code we're looking for (which is what we have in A2 of the first one), each of them with different IDs. So as I'm sure you can tell by now, I'm looking for a formula that will allow me to return those values in ID_1 and ID_2 in the first sheet.
I have been going at this for an hour and I'm stumped, so I would greatly appreciate any help provided!
This is a more generic code if the ids are NOT listed consecutively: Obviously I have done this as an example to take in a more general case where the ids occur anywhere throughout the second dataset, AND where there are potentially several.
=IFERROR(INDEX($V$2:$V$15, SMALL(IF($U$2:$U$15=$M2, ROW($U$2:$U$15), FALSE), COLUMNS($N2:N2))-ROW($V$1), 1), "")
This formula must be entered with Ctrl-Shift-Enter before copying across and down! Note all absolute and relative referencing/locking ($ signs)
The logical steps in constructing such a formula:
1) We use IF function to test if the values in the column U match the value in column M.
2) In the 'value-if-true' parameter, we will get the corresponding row number of values in column U. These numbers will be fed later in the SMALL function.
3) In the value-if-false part, we just return false, as that will later be used as a non-number in the SMALL function
Above 3 steps in the part: IF($U$2:$U$15=$M2, ROW($U$2:$U$15), FALSE)
4) We now have an array of mixed row numbers and FALSE values, which we want to feed to the INDEX function to simply get the corresponding value in column V (our second dataset). BUT as we wish to retrieve the different row matches for each code, we have to fish them out of the mixed array with the SMALL function.
5) using our columns as an incrementer, we apply the SMALL function to the array with a varying k parameter. We USE the COLUMNS function (note carefully the different $ sign usage), so that as we drag the formula across, the column count increments: COLUMNS($N2:N2) - giving K values of 1, 2, 3, 4 as we drag the formula across from column N to column Q. Note that it is useful that the SMALL function disregards FALSE values when looking through the array for the values by size.
6) There is an adjustment to account for the fact that the rows are relative to the 'Ids' range which we will feed into the INDEX function to retrieve the different ids. SMALL(IF($U$2:$U$15=$M2, ROW($U$2:$U$15), FALSE), COLUMNS($N2:N2))-ROW($V$1).
This can be avoided if we use the entire column V as the look-up array parameter in the INDEX function, but that's another way...
7) This resulting value can now be passed to the INDEX function to obtain the various ids. The column_num parameter of 1 which I put in the function isn't necessary in a single-column look-up array, but is there for completeness.
8) The entire construction is then wrapped in an IFERROR function to give an empty string if there is no match, but some people may wish to have error outputs there...
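The eight steps can be condensed into a short Python sketch. The codes, ids and function name below are made up for illustration; the `width` parameter plays the role of the number of columns the formula is dragged across.

```python
def all_matches(codes, ids, target, width):
    """Return every id whose code equals `target`, padded with "" to `width` cells."""
    # Steps 1-5: collect the matches in row order (what SMALL with k = 1, 2, ... yields).
    found = [i for c, i in zip(codes, ids) if c == target]
    # Step 8: cells beyond the last match show an empty string, like IFERROR(..., "").
    return (found + [""] * width)[:width]

codes = ["A7", "B2", "A7", "C9", "A7"]   # hypothetical column U
ids = [101, 102, 103, 104, 105]          # hypothetical column V
print(all_matches(codes, ids, "A7", 4))
```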
If the two IDs will always be consecutive in the second list, try this:
=index('workbookname'SheetName!columnrangeofserialnumbers,match(A2,'workbookname'Sheetname!columnrangeofIDs,0))
Assuming your other workbook is called Serials, and all the info is on sheet1 you would enter the follow in B2:
=index('serials'sheet1!$B$2:$B$1000,match(A2,'serials'sheet1!$B$2:$B$1000,0))
in C2 enter the following (assuming ids will show up consecutively)
=index('serials'sheet1!$B$2:$B$1000,match(A2,'serials'sheet1!$B$2:$B$1000,0)+1)
As far as I know, this only works if the other workbook is open, and it relies on the two IDs being listed consecutively in the list.

Comparing unique strings of Excel data across worksheets

long time reader, first time pos(t)er of questions.
I have an Excel 2013 worksheet of about 4,000 unique records (rows) of data. We'll call this the data dump. I've filtered the data dump using any one of about six different data elements (columns). After each filter I save the results to a new worksheet. I clear the filter to start over, and ultimately wound up with about six different worksheets.
I need to be able to account for each unique record in the data dump--each one should (in theory) appear on at least one of the filtered worksheets, and I need to identify any that don't.
My big problem is that the only way to uniquely identify each record is by concatenating a text string out of five consecutive cells in each row. I cannot add a column of concatenated text to these worksheets (for which reasons I'll presently spare you), so essentially I'm trying to build a formula that says the following:
For a given, unique, concatenated string of text of five consecutive cells from one record on this data dump worksheet, identify any exact matching strings from any of the other worksheets and return TRUE if found or FALSE if not.
I will, of course, have to apply this formula to every record in the data dump.
Thoughts or tips? Ultimately I think it comes down to a lot of small moving parts that I could manage individually, but that I'm not confident I could manage collectively.
Any help is appreciated and I'll be happy to clarify where needed. And forgiveness if a similar question has been asked previously--I searched pretty fruitlessly for an answer all afternoon.
Thank you!
You could use Index to create a concatenated range that serves a lookup range to Match(). Match() can concatenate the lookup term. It then returns a number for a match or an error if no match is found. Wrap error trapping formulas around this for the TRUE/FALSE result. Along the lines of
=iferror(match(sheet1!A1&sheet1!B1&sheet1!C1&sheet1!d1&sheet1!e1,index(Sheet2!$a$1:$a$1000&Sheet2!$b$1:$b$1000&Sheet2!$c$1:$c$1000&Sheet2!$d$1:$d$1000&Sheet2!$e$1:$e$1000,0),0),FALSE)
Note that any match will return a number (which will evaluate to a boolean TRUE in summarising formulas) and a non-match will return a FALSE.
This will get you the row number of the match for the first row of original data on sheet1, where the first extract lives on Sheet2 in the first 1000 rows. Use the same principle for the other four sheets and wrap the five formulas into an OR() statement to arrive at a final TRUE or FALSE.
Note that the Index ranges should not encompass whole columns, but only the rows with data. Otherwise the formula will be very slow to recalculate, especially if you use it 4000 times.
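For anyone who wants to sanity-check the approach, the concatenate-and-match logic looks roughly like this in Python. The records, sheet contents and function names are hypothetical. The `delimiter` argument is an addition that the formula above does not have: with no delimiter, ("ab", "c") and ("a", "bc") produce the same key.

```python
def record_key(record, delimiter=""):
    """Join the first five fields of a record into one lookup string."""
    return delimiter.join(str(v) for v in record[:5])

def missing_records(dump, sheets, delimiter=""):
    """Return the dump records whose key appears on none of the filtered sheets."""
    seen = set()
    for sheet in sheets:
        seen.update(record_key(r, delimiter) for r in sheet)
    return [r for r in dump if record_key(r, delimiter) not in seen]

dump = [
    ("a", "b", "c", "d", "e", "x"),
    ("f", "g", "h", "i", "j", "y"),
    ("k", "l", "m", "n", "o", "z"),
]
sheet1 = [dump[0]]
sheet2 = [dump[2]]
print(missing_records(dump, [sheet1, sheet2]))
```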
Here is one way. If you have your datadump records from A1 downwards.
And assuming you can have your filter sheets similarly. Then adjust your filter ranges so that the formula calls the fixed ranges properly.
You might be able to name them...
This formula needs CSE for it to work.
Edit by teylyn: This formula is an array formula and needs to be confirmed with Ctrl-Shift-Enter. It
will not work if you only hit the Enter key after editing the formula.
Control-Shift-Enter is sometimes referred to as CSE. People also call
it "array-entering" a formula. Excel will put curly braces around the
formula, which you can see in the formula bar when the cell is
selected.
=OR(
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet1!$A$1:$A$200&FilterSheet1!$B$1:$B$200&FilterSheet1!$C$1:$C$200&FilterSheet1!$D$1:$D$200&FilterSheet1!$E$1:$E$200, 0), FALSE),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet2!$A$1:$A$200&FilterSheet2!$B$1:$B$200&FilterSheet2!$C$1:$C$200&FilterSheet2!$D$1:$D$200&FilterSheet2!$E$1:$E$200, 0), FALSE),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet3!$A$1:$A$200&FilterSheet3!$B$1:$B$200&FilterSheet3!$C$1:$C$200&FilterSheet3!$D$1:$D$200&FilterSheet3!$E$1:$E$200, 0), FALSE),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet4!$A$1:$A$200&FilterSheet4!$B$1:$B$200&FilterSheet4!$C$1:$C$200&FilterSheet4!$D$1:$D$200&FilterSheet4!$E$1:$E$200, 0), FALSE),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet5!$A$1:$A$200&FilterSheet5!$B$1:$B$200&FilterSheet5!$C$1:$C$200&FilterSheet5!$D$1:$D$200&FilterSheet5!$E$1:$E$200, 0), FALSE))
I have put hard returns in so you can see what is going on more easily. Obviously you must join the formula back into one piece.
EDIT for new requirement: Ctrl+Shift+Enter required again
=CONCATENATE(
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet1!$A$1:$A$200&FilterSheet1!$B$1:$B$200&FilterSheet1!$C$1:$C$200&FilterSheet1!$D$1:$D$200&FilterSheet1!$E$1:$E$200, 0) & " - FilterSheet1", ""),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet2!$A$1:$A$200&FilterSheet2!$B$1:$B$200&FilterSheet2!$C$1:$C$200&FilterSheet2!$D$1:$D$200&FilterSheet2!$E$1:$E$200, 0) & " - FilterSheet2", ""),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet3!$A$1:$A$200&FilterSheet3!$B$1:$B$200&FilterSheet3!$C$1:$C$200&FilterSheet3!$D$1:$D$200&FilterSheet3!$E$1:$E$200, 0) & " - FilterSheet3", ""),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet4!$A$1:$A$200&FilterSheet4!$B$1:$B$200&FilterSheet4!$C$1:$C$200&FilterSheet4!$D$1:$D$200&FilterSheet4!$E$1:$E$200, 0) & " - FilterSheet4", ""),
IFERROR(MATCH(A1&B1&C1&D1&E1,FilterSheet5!$A$1:$A$200&FilterSheet5!$B$1:$B$200&FilterSheet5!$C$1:$C$200&FilterSheet5!$D$1:$D$200&FilterSheet5!$E$1:$E$200, 0) & " - FilterSheet5", ""))
My edit for the new requirement just takes the matches found, as @Messy Jesse said, and appends the sheet name too. If no match is found in a sheet, then a zero-length string is added instead. The pieces are then concatenated...
