excel matching data - excel

Hi I have a table with item codes in it eg.
   A          B        C        D        E
   Item       500ml    1000ml   2000ml   4000ml
1  Juice      8819686  8819687
2  Vinegar    8813998  8809981  8809982
3  Ice cream  8805690  8805691  8819815
Then I have another list of the above items (I've placed this next to the above table)
A B
Item Code
500ml Juice 8819686
1000ml Juice 8819687
500ml Vinegar 8813998
1000ml Vinegar 8809981
2000ml Vinegar 8809982
500ml Ice Cream 8805690
1000ml Ice Cream 8805691
2000ml Ice Cream 8819815
4000ml Ice Cream 8809984
I want to know which item codes in the list do not appear in the table above (i.e. 8809984 is not in the table).
I tried using =IF(ISNA(MATCH(b2,$B$1:$E$E,0)),"Not Found", "Found"), but it does not work: it returns "Not Found" for every row.
Thank you

You can just use Countif for what you describe:
=CountIf(Sheet2!$B$1:$E$3,B2)>0
You'll get TRUE or FALSE as a result.
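For a cross-check outside Excel, the same membership test can be sketched in Python. This is only a minimal sketch, assuming the table and the list are typed in by hand from the question; the names `table_codes`, `list_codes` and `missing_codes` are illustrative and not part of the answer above.

```python
# Codes from the size table in the question (blank cells omitted).
table_codes = {
    8819686, 8819687,            # Juice
    8813998, 8809981, 8809982,   # Vinegar
    8805690, 8805691, 8819815,   # Ice cream
}

# Codes from the second list, in the order given in the question.
list_codes = [
    8819686, 8819687, 8813998, 8809981, 8809982,
    8805690, 8805691, 8819815, 8809984,
]

def missing_codes(codes, table):
    """Return the codes that do not appear anywhere in the table."""
    return [code for code in codes if code not in table]

print(missing_codes(list_codes, table_codes))
```

Like COUNTIF over the whole block, the set lookup does not care which row or column a code sits in. MATCH, by contrast, expects a single row or a single column as its lookup range, which is why the original attempt reports "Not Found" everywhere.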

how about this way...
{=sum(if(b2=sheet2!$b$1:$e$3,1,0))}
This returns 0 if b2 does not occur in the target area, and 1 (or more if there are duplicates). It is an array formula, so you type everything except the {} and then press Ctrl+Shift+Enter instead of a regular Enter.
Once you have confirmed that b2 occurs only once in the table, you can use the following two formulas to find its index (assuming you do want to know).
{=sum(if(b2=sheet2!$b$1:$e$3,1,0)*{1,2,3,4})}
{=sum(if(b2=sheet2!$b$1:$e$3,1,0)*{1;2;3})}
The top one gives the column, the bottom one gives the row.
Alternatively you could rearrange the original data somehow, but that's messy too...
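The "mask times index" trick in these array formulas can be sketched in Python. This is a rough sketch under assumptions: the `grid` layout (three rows, four size columns, `None` for blanks) is my reading of the question's table, and `locate` is a hypothetical helper name.

```python
# The 3 x 4 block of codes from the first table; None marks a blank cell
# (assumed layout, typed in by hand).
grid = [
    [8819686, 8819687, None, None],      # Juice
    [8813998, 8809981, 8809982, None],   # Vinegar
    [8805690, 8805691, 8819815, None],   # Ice cream
]

def locate(value, block):
    """Mimic the array formulas: count matches, then sum mask * index."""
    mask = [[1 if cell == value else 0 for cell in row] for row in block]
    count = sum(sum(row) for row in mask)
    # Multiply each mask entry by its 1-based column / row number and sum.
    col = sum(m * (c + 1) for row in mask for c, m in enumerate(row))
    row_index = sum(m * (r + 1) for r, row in enumerate(mask) for m in row)
    return count, row_index, col

print(locate(8809982, grid))
```

As with the formulas, the row and column sums are only meaningful when the count is exactly 1; with duplicates the indices of all hits are added together.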

Related

Random of Table Header Name of another table via INDEX MATCH

I have these 2 tables:
In column B I'm trying to get one of the header names of a feature that is not empty in Table B. I want it to be selected randomly. The order of the items in Table A can be different from the order of the items in Table B, so I'll need some sort of INDEX MATCH here too.
Excel Version: Office 365
Attempted Formula: I tried to base my formula on this:
=INDEX(datarange,RANDBETWEEN(1,COLUMNS(datarange)),1)
but there are more things to consider, like header name if the index match of the same fruit isn't empty, so I know it is more complex.
Any help will be greatly appreciated.
Assuming you have Excel 365 and a volatile result is acceptable:
=LET(
Fruits, Table_B[Fruit],
Properties, Table_B[[Red]:[Green]],
PropertiesHeaders, Table_B[[#Headers],[Red]:[Green]],
ThisFruit, [@Fruits],
ThisProperties, FILTER(Properties, Fruits = ThisFruit),
ThisPropertiesFiltered, FILTER(PropertiesHeaders, ThisProperties <> 0),
ThisPropertiesCount, COUNTA(ThisPropertiesFiltered),
IndexRand, RANDBETWEEN(1,ThisPropertiesCount),
IFERROR(INDEX(ThisPropertiesFiltered,IndexRand),"-")
)
ThisProperties is the row in Table_B for your fruit. I left out the column for the fruit names.
ThisPropertiesFiltered is the names of the properties that the fruit has. I filtered the header names based on if the fruit row had a non-zero value or not.
IndexRand gets a random number between 1 and the number of available properties. Note, if there are zero available properties, ThisPropertiesFiltered returns #CALC! so ThisPropertiesCount will return 1. This is handled later on.
Last we use INDEX to get the random property name. IFERROR returns "-" if no properties were available.
Here are the tables:
Table_A:
Fruits      Result
Watermelon  Heavy
Melon       Green
Banana      Tropic
Peach       Red
Apple       Green
Table_B:
Fruit  Red  Yellow  Tropic  Heavy  Green
Apple x x
Banana x x
Peach x
Melon x
Watermelon x x
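The steps of the LET formula (take the fruit's row, keep the headers with a non-empty cell, pick one at random, fall back to "-") can be sketched in Python. This is a sketch only: the placement of the "x" marks in `table_b` is an assumption made for illustration, the "Empty" row is made up to exercise the fallback, and `random_property` is a hypothetical name.

```python
import random

headers = ["Red", "Yellow", "Tropic", "Heavy", "Green"]

# Hypothetical Table_B rows: the placement of the "x" marks is assumed.
table_b = {
    "Apple":      ["x", "", "", "", "x"],
    "Banana":     ["", "x", "x", "", ""],
    "Watermelon": ["", "", "", "x", "x"],
    "Empty":      ["", "", "", "", ""],   # made-up fruit with no properties
}

def random_property(fruit, rng=random):
    """Pick a random header whose cell is non-empty for this fruit."""
    row = table_b.get(fruit)
    if row is None:
        return "-"
    available = [h for h, cell in zip(headers, row) if cell != ""]
    # Same fallback as the IFERROR(..., "-") branch of the formula.
    return rng.choice(available) if available else "-"

print(random_property("Banana"))
```

Passing a seeded `random.Random` instance as `rng` makes the pick repeatable, which the volatile RANDBETWEEN in the worksheet does not offer.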
Since you have access to dynamic arrays you could try:
Formula in B2:
=LET(X,FILTER(E$1:I$1,INDEX(E$2:I$6,MATCH(A2,D$2:D$6,0),0)<>"","No Feature"),INDEX(X,RANDBETWEEN(1,COUNTA(X))))
Or without LET():
=@SORT(SORT(CHOOSE({1;2;3},E$1:I$1,FILTER(E$2:I$6,D$2:D$6=A2),RANDARRAY(1,5)),3,1,1),2,-1,1)
If you are working with actual tables, this should spill down results under Random Feature automatically. However, if you do not use tables, you could nest the above in BYROW() if you are a 365 Insider:
=BYROW(A2:A6,LAMBDA(r,LET(X,FILTER(E$1:I$1,INDEX(E$2:I$6,MATCH(r,D$2:D$6,0),0)<>"","No Feature"),INDEX(X,RANDBETWEEN(1,COUNTA(X))))))
This would not work with the 2nd option, where we used '@' to return only the top-left value of our array (implicit intersection).
The idea is that:
A combination of INDEX() & MATCH() will 'slice' the row of interest out of the lookup-table based on our input.
In the 2nd step we use FILTER() to keep only those headers where the elements of the array returned by the first step are not empty. If all elements are empty, this function returns the value "No Feature" as a heads-up for the user.
In our final step we combine INDEX() with RANDBETWEEN(). The latter returns a random integer between a lower bound (1 in our case) and an upper bound, which we base on the number of returned elements.
I tried to visualize this below.

Is there a way to compare text strings in Excel and output a complete/partial/no match column (with the information missing listed)?

I have a large spreadsheet (upwards of 119K rows) of mismatched data. Column A contains a list of names in full (and occasionally a Trustee or company name), and Column B contains initialized first/middle names with full names (and occasionally Trustee or company names).
I do not currently have a way to compare them short of doing so manually, as there are many variables, and I am looking for some assistance.
So far I have tried using a VBA script from (How do I fuzzy match just adjacent cells?) to see if it can output the difference (which would allow me to eliminate the cells in Column 2 that had no matching data), but this did not function as intended.
I have also tried various LEFT/RIGHT to trim the names from Column A and then match this to Column B, but this has also not worked due to variance in text in Column A.
Here are some examples of the cells. Note that the names in Column A are not always in alphabetical order, but Column B is:
Example (complete match):
Column A: | Column B:
Smith Marcus John | J M Smith
Page Binder Book, Quoth Nevermore Raven | B B Page, R N Quoth
Orange Apple Banana, Orange Pear Plum | A B & P P Orange
Koala Bear, Koala Marsupial Pouch, Koala Gum Tree | B, P M & T G Koala
S & P Limited | S & P Limited
S & P Limited | A D Cumin (S & P Limited)
Example (partial):
Column A: | Column B:
Page Binder Book, Quoth Nevermore Raven | B B Page
Orange Apple Banana, Orange Pear Plum | A B & P P Orange (Fruit 2019 Limited)
Koala Bear, Koala Marsupial Pouch, Goanna Gumtree, Koala Gum Tree | B, P M & T G Koala
Example (no match):
Column A: | Column B:
Smith Marcus John | H J Hyde
Sheppard Garrus Thane | B B Page, R N Quoth
What I am hoping to do:
Firstly, I am hoping to correctly mark each cell in Column B as complete/partial/no match with a fill (green/yellow/red). Secondly, for partial matches (whether Column A has extra information, or Column B is missing information) I want to output in Column C the missing information, like so:
Column A: | Column B: | Column C:
Page Binder Book, Quoth Nevermore Raven | B B Page | Quoth Nevermore Raven
Orange Apple Banana, Orange Pear Plum | A B & P P Orange (Fruit 2019 Limited) | (Fruit 2019 Limited)
Is this kind of thing even possible, or are there just too many variations in the way the data is presented in each column?
Very new to both this site and excel functions in general, this is my first task!
Thank you for your assistance/knowledge/time.
Importing and using this VBA module: https://github.com/kyledeer-32/vba_fuzzymatching
which contains several User Defined Functions (UDFs), will get you a near-optimal solution (you will still have to review the matches). With it you can fuzzy match, then calculate the similarity between strings, and then rank them with a simple "=IF" function. Using this VBA module, I got the following results:
I noted that "Koala Bear..." in Column A matched to "S & P..." in Column B. I expected the value in Column B with "...Koala" to match. I checked the script and the Levenshtein edit distance was actually equal for both. Rare cases like this mean you will need to review your matches, but you can do this quickly by ranking your results based on string similarity. Here is a formula view of what I did:
To import the VBA module linked in the beginning of this answer - here is a guide: https://www.excelcampus.com/vba/copy-import-vba-code/
Note: after importing this module, you will need to enable the "Microsoft Scripting Runtime" library in the Visual Basic Editor window for it to run. Steps to do this (takes less than a minute):
From Excel Workbook:
Select Developer tab on ribbon
Select Visual Basic
Select Tools on the Toolbar
Select References
Scroll down until you see Microsoft Scripting Runtime, then check the box
Press OK
Then you're all set! You can use the UDFs (just like in my second image above) just as you would use normal Excel functions! Hope this helps!
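For readers who want to see what a Levenshtein edit distance and a similarity score look like, here is a minimal Python sketch. It is not the code of the linked VBA module: it is the textbook dynamic-programming version, and the `similarity` scaling (distance divided by the longer length) is an assumption chosen for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]

def similarity(a: str, b: str) -> float:
    """Scale the distance to 0..1, where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest

print(levenshtein("Smith Marcus John", "J M Smith"))
print(similarity("S & P Limited", "S & P Limited"))
```

Because the score is purely character based, initialised names such as "J M Smith" can score poorly against their full form, which is one reason the matches still need a manual review.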

Excel Formula to extract previous word (towards left) from a specific position

I have multiple records as below in an excel file say Col A:
Infogain India (P) Ltd. 3-6 yrs Noida
ROBOSPECIES TECHNOLOGIES PVT LTD 0-2 yrs New Delhi
Red Lemon 0-3 yrs Noida(Sector-7 Noida)
Within the data there is a range of years mentioned e.g. 3-6 yrs in the first list item.
I want to extract the data 3-6, 0-2, 0-3 etc from above 3 list items. I understand a search for " yrs " in all the strings will give me the end position. However, I am unable to determine how to find the starting position of the Number of years.
I require the excel formula which will give me the year range.
I do not want to use any VBA for the solution.
If there are no spaces between numbers then you can use following formula.
=TRIM(RIGHT(SUBSTITUTE(TRIM(LEFT(SUBSTITUTE(A3," yrs",REPT(" ",99)),99))," ",REPT(" ",99)),99))
Try,
=TRIM(RIGHT(REPLACE(A1, FIND(" yrs", A1), LEN(A1), TEXT(,)), 4))
Try the following, though I am pretty sure it can be condensed. I have attempted to handle additional white space potentially being present and also the years being multi-digit in length, e.g. 12-15. It incorporates a method by Raystafarian to find the last occurrence of a character.
=RIGHT(TRIM(LEFT(TRIM(SUBSTITUTE(A1,CHAR(32)," ")),FIND("yrs",TRIM(SUBSTITUTE(A1,CHAR(32)," ")),1)-1)),LEN(TRIM(LEFT(TRIM(SUBSTITUTE(A1,CHAR(32)," ")),FIND("yrs",TRIM(SUBSTITUTE(A1,CHAR(32)," ")),1)-1)))-LOOKUP(9.9999999999E+307,FIND(" ",TRIM(LEFT(TRIM(SUBSTITUTE(A1,CHAR(32)," ")),FIND("yrs",TRIM(SUBSTITUTE(A1,CHAR(32)," ")),1)-1)),ROW($1:$1024))))
Try with below formula
=TRIM(RIGHT(SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ","|",LEN(LEFT(A1,SEARCH("yrs",A1)-1))-LEN(SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ",""))-1),LEN(SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ","|",LEN(LEFT(A1,SEARCH("yrs",A1)-1))-LEN(SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ",""))-1))-SEARCH("|",SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ","|",LEN(LEFT(A1,SEARCH("yrs",A1)-1))-LEN(SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ",""))-1))))
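The idea shared by these formulas (cut the text at " yrs", then keep the last word of what is left) can be checked quickly in Python. This is just a sketch: `word_before_yrs` and `year_range` are hypothetical names, and the regular-expression variant assumes the range always looks like digits, a hyphen, digits.

```python
import re

records = [
    "Infogain India (P) Ltd. 3-6 yrs Noida",
    "ROBOSPECIES TECHNOLOGIES PVT LTD 0-2 yrs New Delhi",
    "Red Lemon 0-3 yrs Noida(Sector-7 Noida)",
]

def word_before_yrs(text):
    """Return the last space-separated word before the first ' yrs'."""
    head, marker, _ = text.partition(" yrs")
    if not marker:
        return ""          # no " yrs" in the text
    words = head.split()
    return words[-1] if words else ""

def year_range(text):
    """Stricter variant: require a digits-digits token before 'yrs'."""
    match = re.search(r"(\d+-\d+)\s*yrs", text)
    return match.group(1) if match else ""

print([word_before_yrs(r) for r in records])
```

The first function mirrors the "last word before the marker" approach and so breaks if the range itself contains spaces; the second is less forgiving about the format but tolerates extra white space before "yrs".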

VLOOKUP MULTIPLE RANGES

Columns A and B contain an item and a country post code. Column B contains post codes for two countries, USA and UK. We have dispatched the same part to both countries. I am trying to create a VLOOKUP formula that matches against the ranges below, but it returns #N/A. Please help me.
Country code ranges;
USA Angeles10 Angeles20 Angeles30 Angeles40 Angeles50 Angeles60 Angeles70 Angeles80 Angeles90 Angeles100 Angeles110 Angeles120 Angeles130 Angeles140 Angeles150
UK London10 London20 London30 London40 London50 London60 London70 London80 London90 London100 London110 London120 London130 London140 London150
DATA
ITEM POST CODE
4 Angeles10
4 Angeles20
110489 Angeles30
110489 Angeles40
113388 Angeles50
113388 Angeles60
113636 Angeles70
113636 Angeles80
11363613001 Angeles90
11363613001 Angeles100
11363613002 Angeles110
11363613002 Angeles120
11363613003 Angeles130
11363613003 Angeles140
1136362001 Angeles150
4 London10
4 London20
110489 London30
110489 London40
113388 London50
113388 London60
113636 London70
113636 London80
11363613001 London90
11363613001 London100
11363613002 London110
11363613002 London120
11363613003 London130
11363613003 London140
1136362001 London150
DESIRED RESULT
ITEM USA UK
4 Los Angeles10 London10
I put the first data on a sheet named datasheet, starting in A1.
Then use a formula like so in E3:
=INDEX($B:$B,AGGREGATE(15,6,ROW($B$2:$B$31)/((ISNUMBER(MATCH($B$2:$B$31,INDEX(datasheet!$1:$1048576,MATCH(E$2,datasheet!$A:$A,0),0),0)))*($A$2:$A$31=$D3)),1))
Then copy/drag over and down.
Easiest Answer
If your data isn't changing and you know exactly where Angeles stops and London starts, you can just use a standard VLOOKUP formula. You just give the bottom part of the table to the UK column.
E3: =VLOOKUP(D3,A$3:B$6,2,)
F3: =VLOOKUP(D3,A$7:B$10,2,)
A little more complicated
If you need to be able to add rows or locations, this solution will work better. Add helper columns for each of the locations you need and a helper column which combines the item ID with the location. You can then use VLOOKUP by searching for the combination of item ID and location.
B3: =A3&CONCAT(D3:E3) (can expand past E3 for extra locations)
D3: =IF(ISERR(SEARCH(D$2,$C3)),"",D$2)
E3: =IF(ISERR(SEARCH(E$2,$C3)),"",E$2) (can drag right for each extra location)
H3: =VLOOKUP($G3&H$2,$B$3:$C$10,2,)
I3: =VLOOKUP($G3&I$2,$B$3:$C$10,2,) (can drag right for each extra location)
My favorite Answer
Just use Scott Craner's approach! ☺
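The helper-column idea (look up the combination of item and location) maps naturally onto a dictionary keyed by an (item, country) pair. The following Python sketch assumes a shortened copy of the question's data; `build_lookup` is a hypothetical helper, not something from the answers above.

```python
# A shortened copy of the DATA block: (item, post code) pairs.
data = [
    ("4", "Angeles10"), ("4", "Angeles20"),
    ("110489", "Angeles30"), ("110489", "Angeles40"),
    ("4", "London10"), ("4", "London20"),
    ("110489", "London30"), ("110489", "London40"),
]

# Country code ranges, truncated to the codes used above.
ranges = {
    "USA": {"Angeles10", "Angeles20", "Angeles30", "Angeles40"},
    "UK": {"London10", "London20", "London30", "London40"},
}

def build_lookup(rows, country_ranges):
    """Key each row by (item, country), keeping the first post code found."""
    lookup = {}
    for item, post_code in rows:
        for country, codes in country_ranges.items():
            if post_code in codes:
                # setdefault keeps the first hit, like an exact-match VLOOKUP.
                lookup.setdefault((item, country), post_code)
    return lookup

lookup = build_lookup(data, ranges)
print(lookup[("4", "USA")], lookup[("4", "UK")])
```

Each item appears twice per country in the data, so, as with VLOOKUP, only the first post code per item and country is returned.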

Column number where it contains certain text?

Is there a formula in Excel that returns the value of a row that's under a specific header?
For example, my current sheet looks something like this.
Pet Cost (£) Age
Car 12 5
Dog 11 7
Rabbit 13 9
Snake 5 3
Pet Cost ($) Age
Car 10 5
Dog 13 7
Rabbit 16 9
Snake 8 3
If I want to pull out the first figure for Rabbit that is under the Cost ($) header, how do I go about it? And then for the second figure that is on that row.
I realise I can do it with INDEX/MATCH, but i'm not sure how to specify an instance or one that occurs under a certain header.
First case: if you have the simple layout shown, you can find the row of the second Rabbit using:
=SMALL(IF(A:A="Rabbit";ROW(A:A));2) entered with Ctrl+Shift+Enter
That gives the row index. Then, with that result in D2:
=INDEX(B1:B21;D2)
gives the value.
Second Case:
If you want to use the Header, for the first, use:
=INDEX(INDIRECT("B"&MATCH("Cost ($)";B1:B21;0)&":B9999");MATCH("Rabbit";INDIRECT("A"&MATCH("Cost ($)";B1:B21;0)&":A9999");0))
Third Case:
If you want to use the header because you can have more than one Rabbit under each header, use:
=SMALL(IF(INDIRECT("A" & MATCH("Cost ($)";B1:B21;0) & ":A9999")="Rabbit";ROW(INDIRECT("A" & MATCH("Cost ($)";B1:B21;0) & ":A9999")));1)
for the first, and:
=SMALL(IF(INDIRECT("A" & MATCH("Cost ($)";B1:B21;0) & ":A9999")="Rabbit";ROW(INDIRECT("A" & MATCH("Cost ($)";B1:B21;0) & ":A9999")));2)
for the second. Both return a row number, to be passed to INDEX as in the first case.
Enter them with Ctrl+Shift+Enter.
Assuming you have Pets listed in A2:A100 and headers in B1:Z1 you can use this "array formula" to get the value at the intersection of the nth instance of "Rabbit" in the [first] column named cost:
=INDEX(INDEX(B2:Z100,0,MATCH("cost",B1:Z1,0)),SMALL(IF(A2:A100="Rabbit",ROW(A2:A100)-ROW(A2)+1),n))
confirmed with CTRL+SHIFT+ENTER
Replace n with the required instance number
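Both ideas from this thread (the n-th occurrence of a pet, and the first occurrence below a given header) can be sketched in Python. This is a sketch under assumptions: `rows` is columns A and B of the question's stacked sheet typed in by hand, and `nth_value` and `value_under_header` are hypothetical names.

```python
# Column A and column B of the stacked sheet from the question, top to bottom.
rows = [
    ("Pet", "Cost (£)"), ("Car", 12), ("Dog", 11), ("Rabbit", 13), ("Snake", 5),
    ("Pet", "Cost ($)"), ("Car", 10), ("Dog", 13), ("Rabbit", 16), ("Snake", 8),
]

def nth_value(pet, n):
    """Value in column B for the n-th row (1-based) whose column A equals pet."""
    hits = [value for name, value in rows if name == pet]
    return hits[n - 1] if 1 <= n <= len(hits) else None

def value_under_header(pet, header):
    """First value for pet found below the row carrying the given header."""
    start = next((i for i, (_, v) in enumerate(rows) if v == header), None)
    if start is None:
        return None
    for name, value in rows[start + 1:]:
        if name == pet:
            return value
    return None

print(nth_value("Rabbit", 2), value_under_header("Rabbit", "Cost ($)"))
```

`nth_value` corresponds to the SMALL/IF pattern with an instance number, while `value_under_header` corresponds to first locating the header row and then searching only below it.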

Resources