Excel - how to find all values with matching keys?

In a sorted list, I'm trying to find all values in 'B' for each person in 'A' and display them as shown in column 'D'. I've gotten as far as what I have in column 'F' and I know how to concatenate the values if I had them but I'm stuck. How can I find all the values in 'B' that correspond to 'A'? (without vba)
https://imgur.com/a/hICdzwX
I'm new to excel so I'm sorry if this doesn't make sense.

Pretty sure this is what you're looking for:
IF(A5=A4,"",TEXTJOIN("-",FALSE,(B5:INDIRECT("B"&(ROW()-1+COUNTIF(A:A,A5))))))
(I hope the picture uploaded :-)
@Jeeped was correct to use TEXTJOIN(), which I wasn't familiar with, but your primary criterion was (essentially) matching rows. The conventional approach is to loop through the matching names, but loops mean VBA, which you want to avoid. The alternative is to work out the length of each name's block in the list.
Breaking down the formula:
IF(A5=A4, # the list is sorted, so if this name equals the one above, this is not the first row of its group...
"", # ...and we just return a blank; otherwise we build the joined list:
TEXTJOIN( # (thanks @Jeeped -- this made the work a lot easier)
"-", # your list delimiter
FALSE, # FALSE (include empty cells) or TRUE (ignore empty cells) -- your call
(B5: # the range starts at the current 'B' cell (same row as A5) and runs through...
INDIRECT("B"&(ROW()-1+COUNTIF(A:A,A5)))))) # ...the last row of this name's block
INDIRECT turns the text it is given into a real reference. Here the end row is determined by the COUNTIF(), that is, how many times the key appears in the list. Adding that count to the current row overshoots by one because we're already sitting on the starting row, hence the -1. So this parameter starts with "B5:" and appends, via INDIRECT(), the letter B plus the current row MINUS 1 PLUS the count of cells in column A that match the value of column A in the current row. E.g. ROW = 5, count = 2 => "B" & (5-1+2),
which passes 'B5:B6' to TEXTJOIN().
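For anyone who finds it easier to read as code, here is a rough Python sketch of what the formula does when it is filled down a sorted list. The function name and the sample data are made up for illustration; they are not from the question's sheet.

```python
def join_values_per_key(keys, values, delimiter="-"):
    """Mimic IF(A5=A4,"",TEXTJOIN(...)) filled down a sorted key column."""
    results = []
    for i, key in enumerate(keys):
        if i > 0 and keys[i - 1] == key:
            # Same key as the row above: the formula returns a blank.
            results.append("")
            continue
        count = keys.count(key)        # COUNTIF(A:A, A5)
        block = values[i:i + count]    # B5:INDIRECT("B" & ROW()-1+count)
        results.append(delimiter.join(str(v) for v in block))
    return results


# Hypothetical sample data (sorted by key, as the formula requires)
keys = ["Ann", "Ann", "Bob", "Cid", "Cid", "Cid"]
values = [1, 2, 3, 4, 5, 6]
print(join_values_per_key(keys, values))
```

Like the formula, this only works if equal keys sit on adjacent rows, because the block is taken as "the next count rows starting here".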
Screen shot of Formula & results...

Related

How to reverse Match for cell containing string Excel 2010

From column O I would like to lookup column A, starting from my current row, to find the first cell with a comma. Goal is to have the correct date in each row.
Table I'm working in https://i.imgur.com/BByfjzy.png
=MATCH("*"&","&"*",$A$1:INDIRECT("A" & ROW()),0)
If I could just run it backwards that would be great, but I'm not finding a way that works with wildcards or "contains" in Excel 2010. My other thought was to make a range based on position, invert it, find the index and do length - index, but I'm not sure how I would go about that. I'm pretty new to Excel, so any help would be appreciated.
=MAX(IF(ISNUMBER(FIND(",",A1:INDEX(A:A,ROW()))),ROW(A1:INDEX(A:A,ROW())),))
Instead of MATCH which looks from top to bottom and returns the first match, use MAX to return the max row number of the cell containing ,. You can use either FIND or SEARCH.
If you wrap it in INDEX you get your value:
=INDEX(A:A,MAX(IF(ISNUMBER(FIND(",",A1:INDEX(A:A,ROW()))),ROW(A1:INDEX(A:A,ROW())),)))
It might need to be entered with Ctrl+Shift+Enter. I'm unable to test it in an older Excel version.
Edit for further explanation of how it works:
A1:INDEX(A:A,ROW()) is to be read as cell A1 up to the current row in column A. So if you're at row # 10 it would equal A1:A10.
Wrapping that range in FIND returns the position of the character you try to find.
If the given character is not found in the cell, it returns the error #VALUE!.
So if rows 1 and 9 contain , in this case, it returns an array of numbers for the hits and errors for the non-hits, for instance {2,#VALUE!,#VALUE!,#VALUE!,#VALUE!,#VALUE!,#VALUE!,#VALUE!,6,#VALUE!}
Wrapping that in ISNUMBER changes the non errors to TRUE and the errors to FALSE.
IF takes that array and in case of TRUE (a number) it returns the row number (same indexed range is used).
Then MAX returns the largest row number of that array.
Instead of FIND you could also use SEARCH. FIND is case sensitive and SEARCH isn't; for this purpose they otherwise operate the same.
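The same idea can be sketched in Python, in case the array logic is easier to follow that way. The function name and the sample column are invented for illustration; the real sheet is only visible in the screenshot.

```python
def last_row_with_comma(column, current_row):
    """Return the 1-based row of the last cell containing ',' at or above
    current_row, or 0 if there is none (like MAX over the IF array)."""
    rows = [
        row_number if "," in str(cell) else 0   # IF(ISNUMBER(FIND(",", ...)), ROW(...), )
        for row_number, cell in enumerate(column[:current_row], start=1)
    ]
    return max(rows, default=0)                 # MAX(...)


# Hypothetical column A: date cells contain a comma, the other cells do not
column_a = ["Mon, 3 Jan", "09:00", "10:30", "Tue, 4 Jan", "08:15"]
row = last_row_with_comma(column_a, 5)
print(row, column_a[row - 1])   # the INDEX(A:A, ...) step
```

Non-matching rows contribute 0, so taking the maximum picks the match closest to the current row, which is the "backwards MATCH" the question asks for.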

Appending two lists in excel

I have been trying and searching how to append two lists in excel to use in a formula. The lists do not exist in columns, they are created using a formula. I want to combine the two lists in a single one, not to show the values but to use the new list in a formula. I am using excel 365 (UNIQUE function). Let me replace my initial text by a real small case.
I have an excel file with 3 work sheets. Sheet1 is:
Sheet2 is:
Now I want to run some analysis in Sheet3. In my example I want to count how many unique values from column A have column B containing one of the letters 'a', 'b', 'c', or 'd'. For instance, in Sheet1, the letter 'a' appears in all rows. Column A has 3 unique values. So my result for 'a' is 3. The letter 'b' does not appear for the case where column A is '3'. Therefore the result for 'b' is '2'.
So I create a Sheet3 to show my results. The first column contains a list of letters {a, b, c, d}. I then use the formula:
=COUNT(UNIQUE(FILTER(Sheet1!$A$1:$A$100, ISNUMBER(SEARCH(A1, Sheet1!$B$1:$B$100)))))
From the inside out: SEARCH looks in cells B1 to B100 (I can live with specifying a larger range) for the value given in column A of the current sheet. If the value is found, SEARCH returns its position as a number. I check whether the return value is a number (ISNUMBER) and use this to filter the values in column A of Sheet1. I then apply UNIQUE to these values and finally count them.
Then I do the same with values in Sheet2. And it works. This is the output:
Column B is the number of unique values (as specified above) from Sheet1 and Column C the same from Sheet2.
So far so good. But now I want to have the counting of unique values globally. Not for each Sheet. One cannot just add the values from column B and C, as there might be an overlap. For example, the result for 'a' should be 3, not 5.
The solution here would be to grab the two unique lists (one from Sheet1 and the other from Sheet2), join them, UNIQUE this new list, and count. How do I join them ? That is my question.
Note that this 'counting of unique values' is just an example. I might want to find the maximum, or sort them, or find only prime numbers, or the average, or the median, or something else. So I need a general approach to join the results.
I got options close to a workable thing when all the data is in the same worksheet.
Finally, note that the data size I have is not huge, but it is large (thousands of lines at the most).
Here is something you could try:
=LET(x,{"A","B","C"},y,{"D","E"},z,CHOOSE({1,2},x,y),cnt,MAX(COUNTA(x),COUNTA(y)),seq,SEQUENCE(cnt*2),final,INDEX(z,MOD(seq-1,cnt)+1,CEILING(seq/cnt,1)),FILTER(final,NOT(ISERROR(final))))
Here both 'x' and 'y' variables are placeholders for your two (vertical) arrays. In this case I used: {"A","B","C"} and {"D","E"}. Assuming you just want to place the 2nd array directly under the 1st one, the above suggestion does just that:
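To show the index arithmetic behind that LET formula, here is a small Python sketch of the same steps. It assumes CHOOSE({1,2},x,y) pads the shorter column with errors, which is what the final FILTER removes; the NA sentinel and the function name are illustrative only.

```python
import math


def append_lists(x, y):
    """Stack y under x using the same row/column arithmetic as the LET formula."""
    NA = object()                           # stands in for the #N/A padding cells
    cnt = max(len(x), len(y))               # cnt, MAX(COUNTA(x), COUNTA(y))
    columns = [
        list(x) + [NA] * (cnt - len(x)),    # column 1 of z
        list(y) + [NA] * (cnt - len(y)),    # column 2 of z
    ]
    final = []
    for seq in range(1, cnt * 2 + 1):       # SEQUENCE(cnt*2)
        row = (seq - 1) % cnt + 1           # MOD(seq-1, cnt)+1
        col = math.ceil(seq / cnt)          # CEILING(seq/cnt, 1)
        final.append(columns[col - 1][row - 1])
    return [item for item in final if item is not NA]   # FILTER(final, NOT(ISERROR(final)))


combined = append_lists(["A", "B", "C"], ["D", "E"])
print(combined)
print(len(set(combined)))   # e.g. the global count of unique values
```

Once the two lists are one list, any follow-up step (unique count, max, sort, median) can be applied to it, which is the general approach the question asks for.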

VBA : distinct values in a column and their associated distinct values

I'm trying to do something similar in vba which I have an idea of only in python for loops. Can someone teach me how to do this in vba, either in function or module macro please :
For each distinct value in column A4:A30, there should be no more than 9 distinct values in column C4:C30. If true, return 'OK' in cell A1. If false, return 'Error' in cell A1.
e.g As in the picture, Sam should not have more than 9 distinct fruits. Same goes to Mary
Update :
I have tried the FILTERXML method, but unfortunately it didn't seem to work for me: [1] https://i.stack.imgur.com/cbmTs.png
Solution for excel with filter/unique formulas
The easiest way to achieve this in Excel 365 is to add an extra column that counts the unique values (fruits) for each key (name), and then find the maximum value in that column.
Start with a formula that finds every non-blank value that matches the key.
=FILTER($C$4:$C$30,($A$4:$A$30=A4)*($C$4:$C$30<>""))
Then delete duplicates:
=UNIQUE(FILTER($C$4:$C$30,($A$4:$A$30=A4)*($C$4:$C$30<>"")))
Then check how many cells we have in filtered data without blanks and duplicates:
=COUNTA(UNIQUE(FILTER($C$4:$C$30,($A$4:$A$30=A4)*($C$4:$C$30<>""))))
Then fill this formula down the new column (column B in my case) for every row of keys.
Finally, add a formula to A1 that checks the maximum counter:
=IF(MAX($B$4:$B$30)<10,"OK","Error - too many values")
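Since the question mentions thinking in Python loops, the FILTER/UNIQUE/COUNTA/MAX chain corresponds roughly to the sketch below. The function name, the default limit argument and the sample data are assumptions made for illustration.

```python
from collections import defaultdict


def check_distinct_limit(names, fruits, limit=9):
    """Return "OK" if no name has more than `limit` distinct non-blank fruits."""
    distinct = defaultdict(set)
    for name, fruit in zip(names, fruits):
        if fruit != "":                 # ($C$4:$C$30<>"") filter
            distinct[name].add(fruit)   # UNIQUE(FILTER(...)) per key
    worst = max((len(s) for s in distinct.values()), default=0)   # MAX of the helper column
    return "OK" if worst <= limit else "Error - too many values"


# Hypothetical data: Sam has 10 distinct fruits, Mary has 1
names = ["Sam"] * 10 + ["Mary"]
fruits = [f"fruit{i}" for i in range(10)] + ["apple"]
print(check_distinct_limit(names, fruits))
```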
Below how the worksheet looks in my testfile
Solution for older versions of excel
I've checked whether I can make it work without these functions, and it is possible.
We start by counting how many times the current key/value pair has appeared so far, up to and including the current row:
=COUNTIFS($A$4:A4,A4,$C$4:C4,C4)
In case we have duplicates above, they should be already counted so we skip them:
=IF(COUNTIFS($A$4:A4,A4,$C$4:C4,C4)>1,"",1)
Now we have a column with 1s or blanks. Next, for each non-blank cell, we count the non-empty cells above it that have the same key (name) and add 1, so instead of "" and 1 we get "" or 1, 2, 3, 4, ...
=IF(COUNTIFS($A$4:A4,A4,$C$4:C4,C4)>1,"",COUNTIFS($A$3:A3,A4,$F$3:F3,">0")+1)
Edit
I have added one extra IF to skip keys if value is blank:
=IF(C4="","",IF(COUNTIFS($A$4:A4,A4,$C$4:C4,C4)>1,"",COUNTIFS($A$3:A3,A4,$B$3:B3,">0")+1))
The formula in cell A1 is the same:
=IF(MAX($B$4:$B$30)<10,"OK","Error - too many values")
Quick Note:
Some formulas have a range that starts on the 3rd row instead of the 4th. This is intended: we are counting the cells above, and on the first row of data we need to point at something above it. This assumes that you don't have numbers (in column B) or names (in column A) in row 3.
Below I am attaching a screenshot with an example. It has additional columns (D-F) which aren't required; they are only there to show how the final formula was built up.
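Here is a rough Python equivalent of the final helper-column formula, in case the running-count logic is hard to follow from the COUNTIFS alone. The function name and sample rows are hypothetical.

```python
def running_distinct_counter(names, values):
    """Mimic the helper column: blank for blank or repeated values,
    otherwise 1, 2, 3, ... per name in order of first appearance."""
    seen_pairs = set()     # pairs already counted (the COUNTIFS(...)>1 test)
    per_name = {}          # how many numbered rows each name has so far
    helper = []
    for name, value in zip(names, values):
        if value == "" or (name, value) in seen_pairs:
            helper.append("")
            continue
        seen_pairs.add((name, value))
        per_name[name] = per_name.get(name, 0) + 1   # COUNTIFS(rows above, name, ">0") + 1
        helper.append(per_name[name])
    return helper


names = ["Sam", "Sam", "Mary", "Sam", "Mary", "Sam"]
values = ["apple", "pear", "apple", "apple", "", "kiwi"]
helper = running_distinct_counter(names, values)
print(helper)
print(max(v for v in helper if v != ""))   # what MAX($B$4:$B$30) looks at
```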

Three Dimensional Lookup Using INDEX/MATCH

This was taken and improved slightly from Question that has since been deleted
For those who can see deleted posts, it was taken from here: https://stackoverflow.com/questions/39793322/three-dimensional-lookup-no-concatenate-or-named-ranges-excel
I'm trying to do a three-dimensional lookup without named ranges or concatenation. Simplified, my data is of the form:
Column1 Column2 Column3
Scott
P 1 2 3
M 4 5 6
N 7 8 9
George
P 10 11 12
M 13 14 15
N 16 17 18
I now want to search for a specific name and then for a specific letter within that name's table. I then want to match this row number with a specific column.
I tried a simple INDEX/MATCH:
=INDEX(A:D,MATCH("M",A:A,0),MATCH("Column1",1:1,0))
That works for the first name but not for any others, as it finds the first instance of M.
How do I modify it to look for a different name?
I have answered below, but want to see if someone has a better solution.
I used an IF() statement array formula to find what the P row number was after the George row... I also needed to use the MIN() function to get the first P row number after the name.
Beyond that, it's a simple INDEX() function.... that racked my brain for over an hour :).
=INDEX($A$1:$D$9,MIN(IF((ROW(A1:A9)>MATCH($F$4,A1:A9,0))*(A1:A9=$F$5),ROW(A1:A9),"")),MATCH($F$6,$A$1:$D$1,0))
Don't Forget!
Use Ctrl+Shift+Enter when finishing the formula, so it gets evaluated as an array formula.
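As a cross-check of the MIN(IF(...)) idea, here is a small Python sketch using the sample data from the question laid out as rows A1:D9. The function name and the empty strings used for blank cells are my own choices.

```python
def lookup_3d(rows, name, letter, column):
    """Find the first `letter` row below the `name` row, then pick `column`."""
    header = rows[0]
    labels = [row[0] for row in rows]      # column A
    name_row = labels.index(name)          # MATCH(name, A1:A9, 0)
    # IF((ROW(A1:A9) > name row) * (A1:A9 = letter), ROW(A1:A9), "")
    candidates = [i for i, label in enumerate(labels) if i > name_row and label == letter]
    if not candidates:
        return None
    target = min(candidates)               # MIN(...)
    return rows[target][header.index(column)]   # INDEX(..., row, MATCH(column, header, 0))


rows = [
    ["", "Column1", "Column2", "Column3"],
    ["Scott", "", "", ""],
    ["P", 1, 2, 3],
    ["M", 4, 5, 6],
    ["N", 7, 8, 9],
    ["George", "", "", ""],
    ["P", 10, 11, 12],
    ["M", 13, 14, 15],
    ["N", 16, 17, 18],
]
print(lookup_3d(rows, "George", "M", "Column2"))
```

One thing the sketch makes visible: the search is not bounded by the next name, so if a letter were missing under the chosen name, it would pick up that letter from a later name's block.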
You can use two other INDEX/MATCH pairs inside the first MATCH to set the lookup range. Then you simply need to add the MATCH() that finds the absolute position of the name.
=INDEX(A:D,MATCH($H$4,INDEX(A:A,MATCH($H$3,A:A,0)):INDEX(A:A,MATCH($H$3,A:A,0)+4),0)+MATCH($H$3,A:A,0)-1,MATCH($H$5,$1:$1,0))
This one works better and does not have a size constraint:
=INDEX(A:D,MATCH(F4,INDEX(A:A,MATCH(F3,A:A,0)):A1040000,0)+MATCH(F3,A:A,0)-1,MATCH(F5,A1:D1,0))
You can do this just by adding the results of two matches together. One match for the names plus one match for the letter equals the total row.
=INDEX(A:D,MATCH(G5,A3:A5,0)+MATCH(G3,A:A,0),MATCH(G4,1:1,0))
In other words: Index(All of the Data, Match(Name, In name column, exact) + Match(Letter, In letter column, exact), Match(Column name, In column row, exact))
Screen capture of working sheet
My answer attempts the general case with only one caveat:
That a letter is single-character text, and a name is more than 1 character. Otherwise I feel there is no logical difference between letters and names, and it is then impossible to really do...
RE-EDIT for better function construction:
{=INDEX($A$1:$D$17, MATCH($H$3,$A1:$A17, 0)+MATCH($H$4, INDEX($A1:$A17, MATCH($H$3,$A1:$A17, 0)):INDEX($A:$A, SMALL(IFERROR(MATCH($H$3,$A1:$A17, 0)+POWER(SQRT(IF(LEN($A$1:$A$17)>1, ROW($A$1:$A$17), 0)-MATCH($H$3,$A$1:$A$17, 0)), 2)-1, ROWS($A$1:$A$17)), 2)), 0)-1, MATCH($H$5, $A$1:$D$1, 0))}
This uses an array formula along column A, and checks if the length is > 1 and throws the row nums into an array, with letters given a 0.
Then match row of unique name(e.g. George) is subtracted from each.
We then use a min(of all other name rows, with the last data row as the final default - SMALL function with 2 parameter) to find the next name row(or last data row if there is no following name).
Rest is standard index/match etc.
It will correctly return #N/A if there is no such letter under the chosen name...
My dataset is A1:A17, and the formula could use A:A instead each time, but the array calc inside the IF needs the A1:A17 for speed.
EDIT for better function construction:
If we wanted to avoid editing the formula when the data length changes, then we could let full column references of A:A go through the entire construction(and lose speed/efficiency) with the last data row in colA calculated via ROWS(A:A):
Re-edit:
{=INDEX($A:$D, MATCH($H$3,$A:$A, 0)+MATCH($H$4, INDEX($A:$A, MATCH($H$3,$A:$A, 0)):INDEX($A:$A, SMALL(IFERROR(MATCH($H$3,$A:$A, 0)+POWER(SQRT(IF(LEN($A:$A)>1, ROW($A:$A), 0)-MATCH($H$3,$A:$A, 0)), 2)-1, ROWS($A:$A)), 2)), 0)-1, MATCH($H$5,1:1, 0))}
It really depends on the setup...
Edit again for version which takes blanks as separators for names
If you want to use blanks as the separator for names, where no blanks are in the data results, but blanks appear in columns B to D where there is a name, then a tiny change in the above formulae will result in this:
=INDEX($A$1:$D$17, MATCH($H$3,$A$1:$A$17, 0)+MATCH($H$4, INDEX($A:$A, MATCH($H$3,$A:$A, 0)):INDEX($A:$A, SMALL(IFERROR(MATCH($H$3,$A:$A, 0)+POWER(SQRT(IF($B$1:$B$17="", ROW($A$1:$A$17), 0)-MATCH($H$3,$A$1:$A$17, 0)), 2)-1, ROWS($A$1:$A$17)), 2)), 0)-1, MATCH($H$5, $A$1:$D$1, 0))
This means that the names and letters do not have to be any specified length, but just one proviso is that blanks appear in the row with the name.
A small amendment to the condition to find the end range to search for the letter by replacing this: SQRT(IF(LEN($A$1:$A$17)>1, with this:
SQRT(IF($B$1:$B$17="",
I would use the area (4th parameter) of Index(). Below is a screenshot of test data. This example assumes the same columns and keys are sorted and consistent.
This works by using (Range1,Range2) as the first parameter of index. For the 4th parameter of index, use N for which area in the () you want Index to return.
I think this may be slightly tidier, and a little easier to modify maybe.
=INDEX(OFFSET(INDIRECT("A"&MATCH($H$3,$A:$A,0),TRUE),0,0,4,4),MATCH($H$4,$A:$A,0),MATCH(H5,$1:$1,0))
Using offset to create the range first, we're able to use the name from H3 to set that up, and then beyond that we are just indexing within that new range.
Note that this still depends on the names staying in column A.
Assuming the format of the data is always Name then P, M and N this formula does the work:
=INDEX($A:$D,
MATCH($H$3,$A:$A,0)
+LOOKUP($H$4,{"P",1;"M",2;"N",3}),
MATCH($H$5,$1:$1,0))
This solution works under almost all conditions. One restriction I found is when one of the subjects (names) does not have data for one of the details (letters), but as of now the same is true of all the other answers.
The formula assumes the data is located at B6:F30 (in order to ensure it can be applied regardless of the source range location).
The formula uses the Index\Match functions:
First, a MATCH to retrieve the position of the Name:
MATCH($H8,$B$6:$B$30,0)
With that info it uses INDEX to build a range that is used to obtain the position of the Detail (letter) using a second MATCH Function:
+ MATCH($I8,INDEX($B$6:$B$30, 1 + MATCH($H8,$B$6:$B$30,0))
:INDEX($B$6:$B$30,ROWS($B$6:$B$30)),0),
Adding the results of the first and second MATCH functions gives the position of the Name\Detail combination, which is then used in an INDEX over the entire data. The position of the required data column is obtained with a MATCH:
INDEX($B$6:$F$30, 1st.MATCH + 2nd.MATCH,
MATCH(J$6,$B$6:$F$6,0))
With the results located at G6:L30, enter this formula in J8 and then copy it to J8:L30:
=IFERROR( INDEX( $B$6:$F$30,
MATCH( $H8, $B$6:$B$30, 0)
+MATCH( $I8, INDEX( $B$6:$B$30 , 1 + MATCH( $H8, $B$6:$B$30 ,0))
: INDEX( $B$6:$B$30, ROWS($B$6:$B$30) ),0),
MATCH( J$6, $B$6:$F$6, 0)),"")
This solution works in all conditions discussed so far (let me know of any condition that it does not work and I’ll try to cover it).
I’m posting this as a separated answer as the formulas applied in prior answer rightly apply to the conditions stated in them, as such they will be useful to users with those specific scenarios, so they don’t need to apply these long formulas.
This formula assumes the data is located at B6:E30 (in order to ensure it can be applied regardless of the source range location).
This formula uses the Index\Match functions and it’s a Formula Array.
Array formulas are entered by pressing [Ctrl] + [Shift] + [Enter] simultaneously; you should see { and } around the formula if it was entered correctly.
Syntax:
=IFERROR(INDEX(DataRng,
MATCH(Value1,NamesRng,0)
+IFERROR(MATCH(Value2,INDEX(NamesRng,
1+MATCH(Value1,NamesRng,0))
:INDEX(NamesRng, IFERROR(MATCH(Value1,NamesRng,0)
+MATCH("#",IF((INDEX(Col1Rng,1+MATCH(Value1,NamesRng,0))
:INDEX(Col1Rng,ROWS(NamesRng)))="","#","!"),0),
ROWS(NamesRng))),0),NA()),MATCH(ValCol,DataHdr,0)),"")
Arguments:
Assuming the data is located at B6:E30.
Value1= Name to be found in Data, i.e. George, Scott, etc.
Value2= Detail to be found in Data, i.e. Detail1, Detail2, etc.
ValCol = Column to be found in Data i.e. Column1, Column2, etc.
DataRng= $B$6:$E$30
DataHdr= $B$6:$E$6
NamesRng= $B$6:$B$30
Col1Rng= $C$6:$C$30
1st MATCH: Retrieves the position of the Name:
MATCH(Value1,NamesRng,0)
2nd MATCH: Retrieves the end position of the Name’s corresponding Details, which is determined by a blank value in column C or the end of the data range:
MATCH("#",IF((INDEX(Col1Rng, 1 + 1stMATCH)
:INDEX(Col1Rng,ROWS(NamesRng)))="","#","!"),0),
Builds a range (vRange) with the name's details, using the 1st and 2nd MATCH functions. If the 2nd MATCH returns an error, it uses the last row of the data range:
INDEX(NamesRng, 1 + 1stMATCH )
:INDEX(NamesRng, IFERROR( 1stMATCH + 2ndMATCH, ROWS(NamesRng)))
3rd MATCH: Retrieves the position of the Detail within the vRange. It returns #NA if the combination is not present.
IFERROR(MATCH(Value2, vRange,0), NA())
Adding the results of the 1st and 3rd MATCH functions gives the row index of the Name\Detail combination, or #N/A if it is not found.
The column index is obtained with a MATCH against the header of the data.
Applying the INDEX function to the data range then returns the value of the Name\Detail\Column combination.
If the Name\Detail combination is not found, it returns blank.
=IFERROR( INDEX( DataRng, 1stMATCH + 3rdMATCH, MATCH(ValCol,DataHdr,0)),"")
With the results located at H6:L37 enter this Formula Array in J8 then copy to K8:L37 and to J9:L37:
=IFERROR( INDEX($B$6:$E$30,
MATCH($H8,$B$6:$B$30,0)
+IFERROR( MATCH($I8, INDEX($B$6:$B$30,
1+MATCH($H8,$B$6:$B$30,0))
:INDEX($B$6:$B$30, IFERROR(MATCH($H8,$B$6:$B$30,0)
+MATCH("#", IF((INDEX($C$6:$C$30,1+MATCH($H8,$B$6:$B$30,0))
:INDEX($C$6:$C$30,ROWS($B$6:$B$30)))="","#","!"),0),
ROWS($B$6:$B$30))),0),NA()),
MATCH(J$6,$B$6:$E$6,0)), "")
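The bounded search above can be sketched in Python as follows. It assumes, as the answer does, that a name row is recognisable by a blank in the first data column; the function name and sample rows are invented, and the sketch stops just before the next name row, which is a slight simplification of the range the formula builds.

```python
def lookup_in_block(rows, name, detail, column):
    """Look up `detail` only inside the block of rows that belongs to `name`."""
    header = rows[0]
    labels = [row[0] for row in rows]
    if name not in labels or column not in header:
        return ""                                # outer IFERROR(..., "")
    start = labels.index(name)                   # 1st MATCH
    end = len(rows)                              # fallback: last row of the data
    for i in range(start + 1, len(rows)):
        if rows[i][1] == "":                     # 2nd MATCH: next blank in the first data column
            end = i
            break
    block = labels[start + 1:end]                # vRange
    if detail not in block:
        return ""                                # 3rd MATCH failed
    row = start + 1 + block.index(detail)
    return rows[row][header.index(column)]


rows = [
    ["Name", "Column1", "Column2", "Column3"],
    ["Scott", "", "", ""],
    ["P", 1, 2, 3],
    ["N", 7, 8, 9],
    ["George", "", "", ""],
    ["P", 10, 11, 12],
    ["M", 13, 14, 15],
]
print(repr(lookup_in_block(rows, "Scott", "M", "Column1")))   # Scott has no M row
print(lookup_in_block(rows, "George", "M", "Column3"))
```

Because the search range ends at the next name, a detail that is missing under one name no longer leaks in from the following block.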
Wow... So many solutions already.
I think a simpler solution could be using offset to get a more generic answer.
=INDEX($A$1:$D$9, MATCH($G$3,OFFSET($A$1,MATCH($G$2,$A$1:$A$9,0),0,3,1),0)+MATCH($G$2,$A$1:$A$9,0), MATCH($G$4,$B$1:$D$1,0)+1)
The only variable to look for is 3 which is the number of M/N/P options present because that will affect the number of rows. Otherwise, the solution works fine in all possible scenarios and different orders.
When I have more than two inputs for a data search, I prefer to have the data organized as shown in the figure, so that I can use a pivot table and have it arrange the data in rows and columns as I like.
Then I use GETPIVOTDATA to search for a value.
Cell G9 contains this formula:
=GETPIVOTDATA("Value";$F$3;"Name";G15;"Letter";G16;"Column";G17)

Sort Order formula to alphabetise in Excel

I am currently drawing up a spreadsheet that will automatically remove duplicates and alphabetize a list:
I am using the COUNTIF() function in column G to create a sort order and then VLOOKUP() to find the sort in column J.
The problem I am having is that I can't seem to get my SortOrder column to function properly. At the moment it creates an index for two number 1's meaning the cell highlighted in yellow is missed out and the last entry in the sorted list is null:
If anyone can find and rectify this mistake for me I'll be very grateful as it has been driving me insane all day! Many thanks.
I'll provide my usual method for doing an automatic pulling-in of raw data into a sorted, duplicate-removed list:
Assume the raw data is in column A. In column B, use this formula to increase the counter each time the row shows a non-duplicate item in column A. Hardcode B2 to be "1", and use this formula in B3 and drag down.
=if(iserror(match(A3,$A$2:A2,0)),B2+1,B2)
This takes advantage of the fact that when we refer to this row counter in our revised list, we will use the match function, which only checks for the first matching number. Then say you want your new list of data on column D (usually I do this for display purposes, so either 'group-out' [hide] columns that form the formulas, or do this on another tab). You can avoid this step, but if you are already using helper columns I usually do each step in a different column - easier to document. In column C, starting in C3 [C2 hardcoded to 1] and drag down, just have a simple counter, which error-checks to the stop at the end of your list:
=if(C2<max(B:B),C2+1," ")
Then in column D, starting at D2 and dragged down:
=iferror(index(A:A,match(C2,B:B,0)),"")
The index function is like half of the vlookup function - it pulls the result out of a given array, when you provide it with a row number. The match function is like the other half of the vlookup function - it provides you with the row number where an item appears in a given array.
Hope this helps you in the future as well.
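For reference, the three helper columns behave roughly like this Python sketch (function name and sample data are made up). Note that, as written, it lists items in order of first appearance; getting an alphabetical list would need a separate sort step.

```python
def unique_in_order(raw):
    """Mimic columns B, C and D: running counter, 1..max, then INDEX/MATCH."""
    counter = []                                   # column B
    for i, item in enumerate(raw):
        if i == 0:
            counter.append(1)                      # B2 hardcoded to 1
        elif item not in raw[:i]:
            counter.append(counter[-1] + 1)        # no match above: B2+1
        else:
            counter.append(counter[-1])            # duplicate: keep B2
    result = []                                    # column D
    for n in range(1, max(counter, default=0) + 1):    # column C: 1, 2, ... MAX(B:B)
        # MATCH(n, B:B, 0) returns the first row where the counter equals n,
        # which is the row where that item first appeared.
        result.append(raw[counter.index(n)])
    return result


print(unique_in_order(["pear", "apple", "pear", "kiwi", "apple"]))
```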
The actual reason that this is going wrong as implied by Jeeped's comment is that you can't meaningfully compare a string to a number unless you do a conversion because they are stored differently. So COUNTIF counts numbers and text separately.
20212 will give a count of 1 because it is the only (or lowest) number.
CS10Z002 will give a count of 1 because it is the first text string in alphabetical order.
Another approach is to add the count of numbers to the count if the current cell contains text:-
=COUNTIF(INDIRECT("$D$2:$D$"&$F$3),"<="&D2)+ISTEXT(D2)*COUNT(INDIRECT("$D$2:$D$"&$F$3))
It's easier to show the result of three different conversions with some test data:-
(0) No conversion - just use COUNTIF
=COUNTIF(D$2:D$7,"<="&D2)
"999"<"abc"<"def", 999<1000
(1) Count everything as text
=SUMPRODUCT(--(D$2:D$7&""<=D2&""))
"1000"<"999"
(2) Count numbers before text
=COUNTIF(D$2:D$7,"<="&D2)+ISTEXT(D2)*COUNT(D$2:D$7)
999<1000<"999"
(3) Count everything as text but convert numbers with leading zeroes
=SUMPRODUCT(--(TEXT(D$2:D$7,"000000")<=TEXT(D2,"000000")))
"000999" = "000999", "000999"<"001000"

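To make conversion (2) concrete, here is a small Python sketch of "count numbers before text". It assumes cells are either Python numbers or strings and compares text case-insensitively; the function name and sample cells are illustrative only.

```python
def sort_order_numbers_first(cells):
    """Rank each cell: numbers among numbers, text among text plus all numbers."""
    numbers = [c for c in cells if isinstance(c, (int, float))]
    texts = [c for c in cells if isinstance(c, str)]
    order = []
    for c in cells:
        if isinstance(c, str):
            # COUNTIF over text only, plus ISTEXT(D2)*COUNT(range)
            rank = sum(t.lower() <= c.lower() for t in texts) + len(numbers)
        else:
            # COUNTIF over numbers only
            rank = sum(n <= c for n in numbers)
        order.append(rank)
    return order


print(sort_order_numbers_first([1000, "def", 999, "abc"]))
```

Each cell gets a distinct sort index as long as there are no duplicates, which avoids the "two number 1's" problem from the question.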