Comparing Two Columns with Different Two Columns in same sheet and Hightlighting which is not matched - excel

I am looking out to compare two columns “C and H” with “D and I” and return the specific rows in highlighted which are not matched with those two sets (which I call it as Failed)
Where column C and H contains name of the hotel and D and I contains the address of that particular hotel.
I don’t want those to be matched exactly, but if they both have a slight part of the content, its acceptable.
Example 1:
C1 contains - Hatta Fort Hotel
D1 contains - Hatta Oman Road
H1 column contains - Hatta Fort Hotel Dubai
I1 column contains - Old Dubai Abu Dhabi Road Dubai
In this case, I want to compare C1 and H1 and D1 and I1, and those two sets contains a part of the text, so it’s a Passed (in my words).
Example 2:
C2 contains - Hotel Arcobaleno
D2 contains - Provinciale
H2 contains - Hotel Arcobaleno
I2 contains - Contrada Taureana
In this case, If I compare C2 with H2 it matches, but if I compare D2 with I2 it does not, so I want this whole row to be highlighted with some color.
Is there any formula or macro for the same
Looking forward to hear a positive and helpful response.

Select columns C, D, H, and I (by using ctrl) and then apply this conditional format formula:
=AND($C1<>"",$D1<>"",$H1<>"",$I1<>"",NOT(AND(SUM(COUNTIF($H1,"*"&TRIM(MID(SUBSTITUTE($C1," ",REPT(" ",255)),255*(ROW(INDIRECT("1:"&LEN($C1)-LEN(SUBSTITUTE($C1," ",""))+1))-1)+1,255))&"*"))>0,SUM(COUNTIF($I1,"*"&TRIM(MID(SUBSTITUTE($D1," ",REPT(" ",255)),255*(ROW(INDIRECT("1:"&LEN($D1)-LEN(SUBSTITUTE($D1," ",""))+1))-1)+1,255))&"*"))>0)))
Example workbook here:
https://docs.google.com/file/d/0Bz-nM5djZBWYX0EwMk1GN1NjMmc/edit?usp=sharing

Related

Excel: Find the Position/Location of Each Occurrence of a Specific Word in a String in a Column

Using an MS Excel formula, I would like to find the Position(s)/Location of specific words found within a String of text located in a Range/Column of cells.
I'm using a formula that only identifies and finds the position(s) of a keyword by a single cell versus a column. I'm not able to repeat this action by looking throughout the column cells, using my numeric helper column (Cell D2:D12 "Occurrence") which provides the occurrence of the next position to be found.
Helper columns are welcomed if necessary to achieve the desired results.
The cells highlighted in "Red" is what I'm looking for as the final output results.
See below for formulas used for Column C and D. Text string is located in Column A2:A12.
COLUMN A
DATA TEXT
Dolly would not count her eggs but Count her apples
Tony drove a pickup truck to work
Over many nights he could not sleep
Only this and nothing more
She went by to pickup her son
To many times he would count over
They went shopping for Christmas
You can count him to pickup someone
They count so much they would sleep in his pickup
Nobody would play with Timmy
Trying to find the position location of each word
COLUMN B
Keyword List
Toe
Shoe
Count
Pumpkin
Pickup
Randy
Sally
Sleep
Jonathan
C2: =SUMPRODUCT((LEN($A$2:$A$12)-LEN(SUBSTITUTE((UPPER($A$2:$A$12)),UPPER(B2),"")))/LEN(B2))
D2&E2: =FILTER(B2:C10,C2:C10>0)
F2: =IF(E2="","",REPT(D2&"^",E2))
G2: =TEXTJOIN("",TRUE,F2:F4)
H2: =TRIM(MID(SUBSTITUTE($G$2,"^",REPT(" ",LEN($G$2))),(COUNTIF($H$1:H1,"<>&""")-1)*LEN($G$2)+1,LEN($G$2)))
I2: =IF(H2="","",IF(COUNTIF($H$2:H2,H2)>1,SUM(I1+1),1))
Try the below picture set up and formulas solution as in.
1] C2 "Total occurrence", formula copied down :
=SUMPRODUCT(LEN($A$2:$A$12)-LEN(SUBSTITUTE(LOWER($A$2:$A$12),LOWER(B2),"")))/LEN(B2)
2] D2 "Count", formula copied across to F2"Sleep" and all copied down :
=SUMPRODUCT(LEN($A2)-LEN(SUBSTITUTE(LOWER($A2),LOWER(D$1),"")))/LEN(D$1)
3] G2 "Keywords" , formula copied down:
=LOOKUP(ROW(A1),SUMIF(OFFSET(C$1,,,ROW($1:$12),),"<>")+1,B$2:B$4)&""
4] H2 "Occurrence", formula copied down:
=IF(G2="","",COUNTIF(G$2:G2,G2))
5] I2 "Content Keyword Data Text", formula copied down:
=IF(G2="","",LOOKUP(H2,SUMIF(OFFSET(INDEX($1:$1,MATCH(G2,$1:$1,0)),,,ROW($1:$12),),"<>")+1,$A$2:$A$12))
6] J2 "Position", formula copied down:
=IF(G2="","",FIND("~",SUBSTITUTE(LOWER(I2),LOWER(G2),"~",COUNTIFS(G$2:G2,G2,I$2:I2,I2))))

Formula to Return Text in the Row of Largest Number

Column A Has Text & Columns B, C & D contain numbers.
For Ex.)
A... …B C D
John 4 6 2
Dave 4 6 4
Mike 4 5 1
Bill 2 5 9
I would like a cell to return the name in column A that has the Largest Number in Column B. And if there are similar numbers, go to the next column and determine which is highest, and if that is tied go to the next column and so on.
Any help would be appreciated.
We can de-conflict ties.In E1 enter:
=B1 + C1/(10*MAX(C:C))+D1/(100*MAX(D:D))
and copy down. Then in another cell enter:
=INDEX(A:A,MATCH(MAX(E:E),E:E,0))
EDIT#1
This is only good for 3 columns of numbers, but it is very easy to add additional de-confliction terms if necessary:
=B1 + C1/(10*MAX(C:C))+D1/(100*MAX(D:D))+E1/(1000*MAX(E:E))
For an expandable number of rows/columns, use a helper row with the same number of columns as number columns in your data. The formulas below reference the following image (the data are in A1:G7):
B9-->=MAX(B1:B7)
C9 (fill over the remaining columns to G9)-->
=MAX(IF(MMULT(--($B1:B7=$B9:B9),--(ROW(INDIRECT("1:"&COLUMNS($B9:B9)))>0))=COLUMNS($B9:B9),C1:C7))
The following formula will give the answer (shown in A9 above):
=INDEX(A1:A7,MATCH(TRUE,(MMULT(--($B1:G7=$B9:G9),--(ROW(INDIRECT("1:"&COLUMNS($B9:G9)))>0))=COLUMNS($B9:G9)),0))
UPDATE WITH ALTERNATIVE METHOD
Using a helper column instead, again referencing the image below (the data are in A1:G7):
I1 (fill down to I7)-->
=SUM(--(MMULT(SIGN(B1:G1-$B$1:$G$7)*2^(COLUMN(G1)-COLUMN(A1:F1)),--(ROW(INDIRECT("1:"&COLUMNS(B1:G1)))>0))>0))
The following formula will give the answer (shown in J1 above):
=INDEX(A1:A7,MATCH(MAX(I1:I7),I1:I7,))
As a bonus, notice that the helper column corresponds to the order that you would get from sorting the data by each column left-to-right. In other words, you could use the helper column to perform a formula-based multi-column sort on strictly numeric data. For the last image, entering the following array formula into a range with the same dimensions as A1:G7 gives a descending sort on columns B through G:
=IF(A1:A7=A1:A7,INDEX(A1:G7,MATCH(ROW(A7)-ROW(A1:A7),I1:I7,0),))

Splitting addresses into columns in Excel

I need help splitting addresses into columns in Excel.
Addresses in COLUMN A are written like:
601 W Houston St Abbott, TX 76621 United States
13498 US 301 South Riverview, FL 33578 United States
COLUMN B is actually a helper column. It contains only the city names from COLUMN A. My idea was to somehow match COLUMN B with COLUMN A and then all matches move to another column. That would separate City from the Address.State, Zip and Country I can use "split text to columns" since "comma" is delimiter. But I need help splitting address and the city.
There is a "comma" right after the city name, but some cities has more than one word in city name.
What I need to do is split the addresses like it's highlighted in green in the image below.
What is the best way to do that in Excel? What would be the formula for that?
We can use a quirk of LOOKUP to get this working.
=LOOKUP(1E+99,FIND(B$2:B$100,A2),B$2:B$100) in D2 will return the city based on searching for matches in column B. Note that this will need the full range of column B specified to be filled.
Then we can put =LEFT(A2,FIND(D2,A2)-2) in C2 to get the first part of the address.
The rest is easy if we can assume that the state, Zip and country are of constant length (if you've got any addressses outside the US then you'll need to alter this):
=LEFT(RIGHT(A2,22),3) in E2
=LEFT(RIGHT(A2,19),5) in F2
=RIGHT(A2,13) in G2
Since you already have City in Col B, just replace the city in A
D2 =SUBSTITUTE(A2,C2,"")
Column C Paste special values in Col C
Split Column C using comma.
Then split the Column D using "space". Assuming you have all records in US, you can add the country to all rows if required.
EDIT
I missed that the city name in the row does not correspond to the address. To match the city from the Master, you can use this array formula:
C2 =INDEX(B:B,MATCH(1,MATCH(""&$B:$B&"",A2,0),0))
Array formula must be confirmed with Ctrl-Shift-Enter.
However, this will find the first match. If you have cities Foster & Foster City in your master, Foster City wlll never be matched. So, sort the cities in descending order of length.
Once you have the City name matched you can follow the steps I gave earlier. Note that I have adjusted the formula to take into account the city name that has been matched by this new formula.
So, it is possible to get it done with the formula. It may not be the best way, but I got what I needed.
I've added a new sheet and named it "cities"
I moved the city list from sheet 1 to COL A in sheet "cities"
In sheet 1, B1 = =INDEX(cities!$A$1:$A$10000;LOOKUP(99^99;MATCH(RIGHT(TRIM(LEFT(SUBSTITUTE(A2;",";REPT(" ";255));255));ROW($A$1:$A$100));cities!$A$1:$A$10000;0)))
Then I've simply use SUBSTITUTE to remove city names from column A in Sheet 1: C1==SUBSTITUTE(A1, B1, "")
And that's it!

Swapping street names

I have a list of street intersections in excel. Of course it reads S 74th St / Rogers Ave as being different from Rogers Ave / S 74th St. I am trying to swap the cells on the columns so that intersections like that all end up looking the same. I have broken them down into two columns and having been trying the iferror/index/match functions but obviously not doing it right. If there is a macro I could write, that would be ideal. Any ideas?
Assuming your data always appears in a single Cell, in the format "[Street 1] / [Street 2]", this can be done with some helper columns.
First in column B, use the following formula, which will pull out the left name from the intersection:
=LEFT(A1,SEARCH(" / ",A1)-1)
Then do a similar thing in column C:
=RIGHT(A1,LEN(A1)-SEARCH(" / ",A1)-2)
Then, in column D, you will create a new text string showing the intersection, sorted [sort of] alphabetically by the first 4 characters of each road. You can do this as follows:
First, consider the below formula, which picks up the ASCII character value of the first 4 characters of the word found in B1:
=SUM(CODE(MID(LOWER(B1),{1,2,3,4},1)))
This creates a single number which equals the sum of the specific code for each character. We can use that to sort the priority of one cell over another, by comparing with the sum of the same formula for the cell in C1, like so:
=SUM(CODE(MID(LOWER(B1),{1,2,3,4},1)))>SUM(CODE(MID(LOWER(C1),{1,2,3,4},1)))
This will show TRUE if the sum of those codes in B1 is bigger than the sum of those codes in C1. Put this formula in D1 and copy down.
Finally, recreate your ordered string as follows, in column E:
=IF(D1,B1&" / "&C1,C1&" / "&B1)
Now this can be used as a column of ordered data, which should eliminate matches in the streets [assuming no streets have the same 4 characters as any other, and no duplicate streets start differently - ie 5 Ave vs 5th Ave].

Excel Find Nth Instance of Multiple Criteria

I have 3 columns of data. Col A contains Names, Col B contains a client ID, Col C contains a date.
I'm trying to figure out how to write a formula that will find the top 2 and top 3 instances of a specific Name in Col A and client ID in Col B and return the value in Col C.
Trying to avoid using VBA, but not sure if this is doable.
So for example data looks like this and I would want to return that Sam dealt with Client ABC the 2nd time around on 12/16.
Sam ABC 12/3
Adam XYZ 12/5
John DEF 12/9
Sam ABC 12/16
Adam HIJ 12/18
Assuming
your headers are in A1:C1
your data starts from A3 (yes, not A2)
You enter the name in G2 & Client ID in G3 & you want the list of
dates starting from G5
Enter these formula/values:
A2: =G2
B2: =G3
C2: =0
G5:
=IFERROR(INDEX(($A$2:$A$500=$G$2)*($B$2:$B$500=$G$3)*($C$2:C$500),MATCH(0,COUNTIF($G$4:G4,($A$2:$A$500=$G$2)*($B$2:$B$500=$G$3)*($C$2:C$500)),0)),"End")
(Formula in G5 is an array formula; confirm this with Ctrl+Shift+Enter)
Drag the formula in G5 down until you see 'End'
Value in cell G5 will always be 0 or '1/0' based on your formatting.
The list of dates corresponding to the name & client ID combination will start from G6.
Let me see if I understood your need. Correct me if I'm wrong.
You want to be able to inform a Name and a Client ID and have Excel tell you the last 3 occurrences of that combination?
By "top 2 and top 3 instances of a specific name" I'm assuming you mean the top 2 and 3 dates found for that specific name and ID.
If so, try this:
Supposing you have your example data table starting at Cell A1 and ending at Cell C6 (including column headers) and that you'll enter the name in F1 and Client ID on F2
A B C
1 Name Client ID Date
2 Sam ABC 12/3
3 Adam XYZ 12/5
4 John DEF 12/9
5 Sam ABC 12/16
6 Adam HIJ 12/18
Type this formula where you want to return the date of the last occurrence:
=IFERROR(LARGE(IF($A$2:$A$6=$F$1,IF($B$2:$B$6=$F$2,$C$2:$C$6)),1),"-")
This should be entered as an Array Formula, so don't forget to press CTRL + SHIFT + ENTER or it'll not work.
To bring the 2nd last occurrence on another cell, just copy and paste the formula and change the number 1 to 2 (as indicated below):
=IFERROR(LARGE(IF($A$2:$A$6=$F$1,IF($B$2:$B$6=$F$2,$C$2:$C$6)),2),"-")
If you typed 'Sam' on F1 and 'ABC' on F2, this formula would return '12/16' as the last occurrence, '12/3' as the 2nd last occurrence and a dash (-) as the 3rd last occurrence, since there isn't one.
Of course, you'll have to adjust the ranges and other cell references accordingly in your real data set.
Hope this helps.

Resources