Excel: substitute a substring in a cell (sheet A) with the corresponding string in a "lookup" (sheet B)

How can I do this in MS Excel 15.4?
I want to process Column A so that it becomes Column B:
Column A | Column B
----------------------------------------------------------------------
one, 1, two, 2, three, 3 | one apple, two bananas, three strawberries
one, 1, four, 4 | one apple, four oranges
.......................
... many other rows ...
.......................
two, 2, four, 4, three, 3 | two bananas, four oranges, three strawberries
A cell in Column A can contain any number (n) of substrings that match entries in the lookup sheet.
I have another sheet (a lookup table) that lists what each substring in Column A should be replaced with:
Match col | Replace col
----------------------------
one, 1 | one apple
two, 2 | two bananas
three, 3 | three strawberries
four, 4 | four oranges
... and many more ...
I want to replace every substring found in Column A with the corresponding Replace col value from the lookup table.
It looks like I may be able to combine VLOOKUP with SUBSTITUTE, but I am struggling with it.

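To do it in individual cells, enter this formula in the first output cell and fill it to the right; each column picks up the next pair of comma-separated items: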
=IFERROR(VLOOKUP(TRIM(SUBSTITUTE(MID(SUBSTITUTE($A1,",",REPT(" ",999)),(COLUMN(A:A)-1)*2*999+1,2*999),REPT(" ",999),",")),Sheet2!$A:$B,2,FALSE),"")
If you have a subscription to Office 365 Excel [1], you can use this array formula to put it all in one cell:
=TEXTJOIN(",",TRUE,IFERROR(LOOKUP(TRIM(SUBSTITUTE(MID(SUBSTITUTE($A1,",",REPT(" ",999)),(ROW(INDIRECT("1:" & INT((LEN($A1)-LEN(SUBSTITUTE($A1,",","")))/2)+1))-1)*2*999+1,2*999),REPT(" ",999),",")),Sheet2!A:A,Sheet2!B:B),""))
Being an array formula, it must be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode. If done correctly, Excel will put {} around the formula.
One caveat is that the reference data must be sorted on the lookup column, because LOOKUP assumes the lookup values are in ascending order.
[1] If you do not have Office 365 but want to use this formula, you can place the code from my answer HERE, which mimics TEXTJOIN(), in a module attached to the worksheet. Then use the formula as described above.
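For readers who want to check the expected result outside Excel, here is a minimal Python sketch of the same idea: split the cell on commas, regroup the tokens into pairs, look each pair up, and join the hits. The function name substitute_pairs and the dictionary lookup are hypothetical stand-ins for the formula and for Sheet2; pairs with no match are skipped, much like the IFERROR(...,"") branch above. It joins with ", " to match the output shown in the question.

```python
def substitute_pairs(cell, lookup, group_size=2):
    """Replace each group of comma-separated tokens with its lookup value."""
    tokens = [t.strip() for t in cell.split(",")]
    out = []
    for i in range(0, len(tokens), group_size):
        # Rebuild the key in the same "one, 1" form used by the lookup table.
        key = ", ".join(tokens[i:i + group_size])
        value = lookup.get(key)
        if value is not None:  # unmatched groups are dropped
            out.append(value)
    return ", ".join(out)


# Hypothetical stand-in for the lookup sheet from the question.
lookup = {
    "one, 1": "one apple",
    "two, 2": "two bananas",
    "three, 3": "three strawberries",
    "four, 4": "four oranges",
}

print(substitute_pairs("one, 1, two, 2, three, 3", lookup))
```

Unlike LOOKUP, a dictionary does not need the lookup data to be sorted.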

I have a rather clunky solution, but it'll work for you if you don't mind taking perhaps a few extra steps. (No VBA required).
With your original data, highlight all of it and do Text to Columns with a comma delimiter. Set the destination to wherever you like. I chose the column just right of it (so, B2):
So now you have it all split up.
I put the VLOOKUP() table in "Sheet2":
And back on Sheet1, in I2, I used this formula:
=IFERROR(VLOOKUP(TRIM(B2)&", "&TRIM(C2),Sheet2!$A$1:$B$4,2,FALSE),"")
And drag right. You'll have some empty columns which you can hide/Delete, then copy all the data.

Related

Excel formula to check if items from a comma separated list in one cell exist in a comma separated list in another cell?

I have a spreadsheet where every row has two columns, each containing a comma separated list of words or phrases.
Column 1 | Column 2
---------------------------------------------------------
Orange, Pear, Sugar apple, Kiwi | Orange, Sugar apple
Banana, Watermelon, Pomegranate | Strawberry, Banana
I'm trying to create a formula that checks if the items listed in Column 2 are a subset of the items listed in Column 1 and outputs true or false.
In the above example the output should be true for Row 1 and false for Row 2.
The solutions I tried using the search and find functions only work if the items in Column 2 are listed in the same sequence, i.e. if Column 2 is a substring of Column 1.
Use this array formula:
=AND(ISNUMBER(SEARCH(", " & TRIM(MID(SUBSTITUTE(B1,",",REPT(" ",99)),(ROW($XFD$1:INDEX(XFD:XFD,LEN(B1)-LEN(SUBSTITUTE(B1,",",""))+1))-1)*99+1,99)) & ",",", "&A1&",")))
Being an array formula, it must be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode.
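The check the formula performs can be sketched in Python as follows. This is only an illustration of the logic, not a translation of the formula: is_subset is a hypothetical name, and the comparison is made case-insensitive on the assumption that it should behave like SEARCH.

```python
def is_subset(needles_cell, haystack_cell):
    """True if every comma-separated item in needles_cell appears in haystack_cell."""
    haystack = {item.strip().lower() for item in haystack_cell.split(",")}
    needles = [item.strip().lower() for item in needles_cell.split(",")]
    return all(n in haystack for n in needles)


# Column 2 is checked against Column 1, as in the question.
print(is_subset("Orange, Sugar apple", "Orange, Pear, Sugar apple, Kiwi"))
print(is_subset("Strawberry, Banana", "Banana, Watermelon, Pomegranate"))
```

Because whole items are compared, the order of the items in either cell does not matter.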

Search for duplicate text string in two columns and highlight, excel

I'm looking for a way to find and highlight duplicate text strings in two different columns in Excel. The cell contents don't have to be identical: if the content of a cell in column A is contained anywhere in a cell of column B, both cells should be highlighted.
For example, let's say that I have two columns, one named "Patient" and another one called "Couples". I need to compare the two columns, and if one of the patient names appears within a couple, both cells should be highlighted:
Column A. Patient name | Column B. Couple name
John Smith | Adriana Lewis - Mark Rutte
Peter Brown | Giaccomo Down - Rosy Lawn
Jerry Goldsmith | Bob Loewe - Gigi Pink
Ewan Thompson | Sonia Farrel - John Smith
In this example, the content of A2 ("John Smith") is also contained in B5 ("Sonia Farrel - John Smith"), so both A2 and B5 should be highlighted. Also, the two columns don't have the same length; one is shorter than the other, since there are more names than couples. It can also happen that two names in different cells are contained in a single couple, in which case all three cells should be highlighted.
I have tried everything, with no success... please help!
There are multiple ways to do this, but here's one option with conditional formatting.
Rule applied to data in column A, using COUNTIF and wildcards.
=COUNTIF($B$2:$B$5,"*"&A2&"*")>0
Rule applied to data in column B, using ISNUMBER, SEARCH and SUMPRODUCT.
=SUMPRODUCT(--ISNUMBER(SEARCH($A$2:$A$5,B2)))>0
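To see which cells the two rules would flag, the same logic can be sketched in Python. The function name highlight_flags is hypothetical, and the lists simply copy the sample data from the question; the comparison is lowercased on the assumption that it should be case-insensitive like COUNTIF wildcards and SEARCH.

```python
def highlight_flags(patients, couples):
    """Return two lists of booleans: which patients and which couples to highlight."""
    # A patient is flagged if the name occurs inside any couple cell.
    patient_flags = [
        any(p.lower() in c.lower() for c in couples) for p in patients
    ]
    # A couple is flagged if it contains any patient name.
    couple_flags = [
        any(p.lower() in c.lower() for p in patients) for c in couples
    ]
    return patient_flags, couple_flags


patients = ["John Smith", "Peter Brown", "Jerry Goldsmith", "Ewan Thompson"]
couples = [
    "Adriana Lewis - Mark Rutte",
    "Giaccomo Down - Rosy Lawn",
    "Bob Loewe - Gigi Pink",
    "Sonia Farrel - John Smith",
]

print(highlight_flags(patients, couples))
```

With the sample data, only the first patient and the last couple are flagged, which corresponds to A2 and B5 in the question.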

Swapping street names

I have a list of street intersections in Excel. Of course it reads S 74th St / Rogers Ave as being different from Rogers Ave / S 74th St. I am trying to swap the cells in the columns so that intersections like that all end up looking the same. I have broken them down into two columns and have been trying the IFERROR/INDEX/MATCH functions, but I'm obviously not doing it right. If there is a macro I could write, that would be ideal. Any ideas?
Assuming your data always appears in a single cell, in the format "[Street 1] / [Street 2]", this can be done with some helper columns.
First in column B, use the following formula, which will pull out the left name from the intersection:
=LEFT(A1,SEARCH(" / ",A1)-1)
Then do a similar thing in column C:
=RIGHT(A1,LEN(A1)-SEARCH(" / ",A1)-2)
Then, in column D, you will decide which street goes first, so that the new text string in column E has a consistent order. The order is based on the first 4 characters of each road; it is only roughly alphabetical. You can do this as follows:
First, consider the formula below, which takes the character code of each of the first 4 characters of the text in B1 and adds them up:
=SUM(CODE(MID(LOWER(B1),{1,2,3,4},1)))
This creates a single number which equals the sum of the specific code for each character. We can use that to sort the priority of one cell over another, by comparing with the sum of the same formula for the cell in C1, like so:
=SUM(CODE(MID(LOWER(B1),{1,2,3,4},1)))>SUM(CODE(MID(LOWER(C1),{1,2,3,4},1)))
This will show TRUE if the sum of those codes in B1 is bigger than the sum of those codes in C1. Put this formula in D1 and copy down.
Finally, recreate your ordered string as follows, in column E:
=IF(D1,B1&" / "&C1,C1&" / "&B1)
Now this can be used as a column of consistently ordered data, so both orderings of the same intersection produce the same string [assuming the two streets in an intersection never give the same code sum, and no street is written in two different ways, e.g. 5 Ave vs 5th Ave].
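For comparison, here is a small Python sketch of the same normalisation. Note that it swaps in a different technique: instead of summing the codes of the first 4 characters, it compares the full street names case-insensitively, so ties between different names cannot occur. The function name normalize_intersection is hypothetical, and the sketch assumes every value contains exactly the " / " separator.

```python
def normalize_intersection(text, sep=" / "):
    """Return the intersection with its two street names in a fixed order."""
    left, right = text.split(sep, 1)  # raises ValueError if sep is missing
    first, second = sorted([left.strip(), right.strip()], key=str.lower)
    return f"{first}{sep}{second}"


print(normalize_intersection("S 74th St / Rogers Ave"))
print(normalize_intersection("Rogers Ave / S 74th St"))
```

Both calls print the same string, so the results can be compared or de-duplicated directly. Different spellings of the same street (5 Ave vs 5th Ave) are still treated as different names.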

Add cell string to another cell if 2 cells are the same for 2 rows

I'm trying to make a macro that will go through a spreadsheet, and based on the first and last name being the same for 2 rows, add the contents of an ethnicity column to the first row.
eg.
FirstN|LastN |Ethnicity |ID |
Sally |Smith |Caucasian |55555 |
Sally |Smith |Native American | |
Sally |Smith |Black/African American | |
(after the macro runs)
Sally |Smith |Caucasian/Native American/Black/African American|55555 |
Any suggestions on how to do this? I read several different methods for VBA but have gotten confused as to what way would work to create this macro.
EDIT
There may be more than 2 rows that need to be combined, and the lower row(s) need to be deleted or removed some how.
If you can use a formula, then you can do this:
Couple of assumptions I'm making:
Sally is in cell A2 (there are headers in row 1).
No person has more than 2 ethnicities.
Now, for the steps:
Put a filter and sort by name and surname. This makes sure all rows for the same person are adjacent (i.e. if there is a 'Sally Smith' at the top, there is no other 'Sally Smith' further down the sheet after different people).
In column D, put the formula =IF(AND(A2=A3,B2=B3),C2&"/"&C3,"")
Extend the filter to column D and filter out all the blanks.
What this does is check whether cells A2 and A3 are equal (the names are the same) and whether cells B2 and B3 are equal (the surnames are the same).
If both are true, it's the same person, so we concatenate the two ethnicities (using & is another way to concatenate besides using CONCATENATE()).
Otherwise, if the name, the surname, or both are different, the cell is left blank.
To delete the redundant rows altogether, copy/paste values on column D, filter on the blank cells in column D and delete. Sort afterwards.
EDIT: As per edit of question:
The new steps:
1. Put a filter and sort by name and surname (already explained above).
2. In column E, put the formula =IF(AND(A1=A2,B1=B2),E1&"/"&C2,C2) (I changed the formula to adapt to the new method).
3. In column F, put the formula =IF(AND(A1=A2,B1=B2),F1+1,1)
4. In column G, put the formula =IF(F3<=F2,1,0)
5. In column H, put the formula =IF(AND(D2="",A1=A2,B1=B2),H1,D2) (this carries the ID along to the combined row).
Enter the formulae starting from row 2. Step 3 puts an incremental number against the rows of people with the same name.
Step 4 checks where column F goes back to 1 on the next row, which identifies your 'final rows to be kept'.
Here's my output from those formulae:
The green rows are what you keep (notice that there is 1 in column G that allows you to quickly spot them), and the columns A, B, C, E and H are the columns you keep in the final sheet. Don't forget to copy/paste values once you are done with the formulae and before deleting rows!
If the first Sally is in A1, then =IF(AND(A1=A2,B1=B2),C1&"/"&C2,"") copied down as appropriate might suit. This assumes that, where the names are not the same, a blank ("") is preferred to repeating the C value.
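As a cross-check of the result the question asks for, here is a short Python sketch that groups rows by first and last name and joins the ethnicities with "/". It is not VBA and not one of the formula approaches above; merge_rows is a hypothetical name, and the rows are assumed to be (first name, last name, ethnicity, ID) tuples with an empty string where the ID is missing.

```python
def merge_rows(rows):
    """Combine rows that share first and last name into one row per person."""
    merged = {}
    for first, last, ethnicity, person_id in rows:
        key = (first, last)
        if key not in merged:
            merged[key] = {"ethnicities": [], "id": ""}
        merged[key]["ethnicities"].append(ethnicity)
        # Keep the first non-empty ID seen for this person.
        if person_id and not merged[key]["id"]:
            merged[key]["id"] = person_id
    return [
        (first, last, "/".join(info["ethnicities"]), info["id"])
        for (first, last), info in merged.items()
    ]


rows = [
    ("Sally", "Smith", "Caucasian", "55555"),
    ("Sally", "Smith", "Native American", ""),
    ("Sally", "Smith", "Black/African American", ""),
]

print(merge_rows(rows))
```

Because the grouping uses a dictionary, the rows do not need to be sorted first, and any number of rows per person is handled.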

Return value of last match

I need a formula to return the value of Data for the last match of "Text". Row number is also acceptable. Macro is NOT acceptable. Name column is unsorted and cannot be sorted!
Only column "Name" is used as lookup value. I would rather use a/multiple helper column(s) instead of an array formula.
Row | Name | Data
-----------------
1 | Joe | 10
2 | Tom | 20
3 | Eva | 30
4 | Adam | 40
5 | Tom | 21
LARGE only works with numbers, and VLOOKUP only returns the first match. LOOKUP only works sometimes, so it's out too.
So if I wanted the last match for "Tom" then it should return "21".
Array formulas could be avoided with a helper column.
Suppose F1 holds the name to match (e.g. Tom).
In the helper column, in cell C2, enter
=IF(A2<>$F$1,0,ROW())
Then copy the formula down alongside your data.
Column C now contains 0 for the unmatched names and the row number for the matched ones. Taking the maximum of the column yields the row of the last match.
The result is then simply a matter of using the correct offset with the OFFSET function:
=OFFSET(B1,max(C:C)-1,0)
PS: my copy of Excel is in Italian, so I can't test this English translation of the formulas.
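The helper-column idea can be sketched in Python like this, using the sample data from the question. The name last_match is hypothetical, and positions are counted from 1 within the lists rather than being real sheet rows.

```python
def last_match(names, data, target):
    """Return the data value for the last occurrence of target, or None."""
    # Helper "column": the 1-based position for matches, 0 otherwise.
    helper = [
        pos if name == target else 0
        for pos, name in enumerate(names, start=1)
    ]
    last_pos = max(helper, default=0)
    return data[last_pos - 1] if last_pos else None


names = ["Joe", "Tom", "Eva", "Adam", "Tom"]
data = [10, 20, 30, 40, 21]

print(last_match(names, data, "Tom"))
```

As with the helper column, the names do not need to be sorted; the maximum position is by construction the last match.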
I think this is the easiest way to do it.
=LOOKUP("Tom";A2:B7)
Create a column with an array formula (enter it with Ctrl+Shift+Enter):
=VLOOKUP(MAX(IF($B$2:$B$6=B2, $A$2:$A$6, 0)), $A$2:$C$6, 3, FALSE)
To make sure you did it right, click on the cell, and the formula should be shown encased in curly brackets ({}).
Note: This assumes that "Row" is in A1.
I have come up with a solution, but it requires that the numbers in Data are consecutive for each name (a running count), like so
Name Data
Joe 1
Tom 1
Eva 1
Adam 1
Tom 2
Tom 3
Eva 2
But that's okay, since my data looks like that anyway. So if a Name has been used before, its Data value must be the old highest + 1, i.e. consecutive.
The Name header is in A1 and the Data header is in B1, and this formula goes into C2:
=FLOOR(SQRT(2*SUMIF(A2:A7,A2,B2:B7)),1)
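The reason this works: if a name's Data values are 1, 2, ..., n, SUMIF returns n(n+1)/2, so twice the sum is n(n+1), which lies between n^2 and (n+1)^2, and the floored square root is therefore n. The Python sketch below checks that claim for a range of n; highest_from_sum is a hypothetical name, and math.isqrt (Python 3.8+) is used as an exact integer stand-in for FLOOR(SQRT(...),1).

```python
import math


def highest_from_sum(total):
    """Recover n from total = 1 + 2 + ... + n."""
    return math.isqrt(2 * total)  # floor of the square root, computed exactly


# Tom has Data values 1, 2 and 3 in the example, so the sum is 6.
print(highest_from_sum(1 + 2 + 3))

# Check the identity for every n from 1 to 1000.
all_ok = all(highest_from_sum(n * (n + 1) // 2) == n for n in range(1, 1001))
print(all_ok)
```

The result is only meaningful under the stated assumption that each name's values form the unbroken sequence 1 to n.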

Resources