Separate full address into street address, city, state, zip, country in Excel

I have more than 47K full addresses from different countries, and I want to split each one into Address, City, State, Zip Code, and Country.
I have tried many approaches, but no formula works because the addresses differ in structure and pattern.
N.B.: I don't have much knowledge of Excel VBA or macros.

This task is too complicated for a simple formula; you'll need VBA. Here is some guidance:
You can count the number of commas to guess the content (apparently some addresses start with the name of the building). If the building name is missing, add a comma so that every address has the same format.
Once everything has a similar format (the same number of commas everywhere), you can start splitting on the comma as a separator. The results will be "Name", "Full street name and number", "Full city ID", ...
Fields that are still composed of several items (like "Full city ID") can be split further by taking the first part (a number, separated from the rest by a space) and the second part (the rest of the "Full city ID").
Edit: added a small macro
This macro uses the functions Split() and IsNumeric(), which are all you need:
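Sub test()
    ' Each variable needs its own type; "Dim A, B As Integer" would leave A as a Variant
    Dim A As Integer, B As Integer
    Dim T() As String
    T = Split("1, 2, X", ",")   ' T(0) = "1", T(1) = " 2", T(2) = " X"
    ' Keep numeric parts, flag everything else with -1
    If IsNumeric(T(0)) Then A = T(0) Else A = -1
    If IsNumeric(T(2)) Then B = T(2) Else B = -1
    MsgBox "Result : A=[" & CStr(A) & "], B=[" & CStr(B) & "]"
End Sub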
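The steps described above (pad every address to the same number of commas, split on the comma, then split the city field at its first space) could be sketched roughly as follows. This is only a sketch under assumptions: the names SplitAddress and SplitCityId are hypothetical, the default of five fields is a guess that must be adapted to the real data, and it assumes that a missing field is always the leading building name.

```vba
' Hypothetical helper: pads an address to FieldCount comma-separated
' fields, then returns the trimmed fields as a zero-based array.
' Assumption: when a field is missing, it is the leading building name.
Function SplitAddress(ByVal FullAddress As String, _
                      Optional ByVal FieldCount As Long = 5) As String()
    Dim Parts() As String
    Dim I As Long

    Parts = Split(FullAddress, ",")

    ' Too few fields: prepend a comma (an empty first field) and re-split.
    Do While UBound(Parts) + 1 < FieldCount
        FullAddress = "," & FullAddress
        Parts = Split(FullAddress, ",")
    Loop

    ' Remove the spaces that follow each comma.
    For I = 0 To UBound(Parts)
        Parts(I) = Trim$(Parts(I))
    Next I

    SplitAddress = Parts
End Function

' Hypothetical helper: splits a "Full city ID" at its first space when
' the leading part is numeric; otherwise the whole text is kept as City.
Sub SplitCityId(ByVal CityId As String, ByRef Zip As String, ByRef City As String)
    Dim P As Long

    Zip = ""
    City = CityId
    P = InStr(1, CityId, " ")
    If P > 1 Then
        ' Nested Ifs on purpose: VBA's And does not short-circuit.
        If IsNumeric(Left$(CityId, P - 1)) Then
            Zip = Left$(CityId, P - 1)
            City = Mid$(CityId, P + 1)
        End If
    End If
End Sub
```

Addresses with more commas than expected are returned with all their parts, so the caller should still check UBound of the result before writing the fields to cells.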

Related

Vba to break up text within a cell? Text to columns not working

How can I break up text within a cell with VBA? I exported emails to an Excel file using VBA, and the information exported in one of the cells is formatted as seen below:
Name * xxxxxx
Country of residence * xxxxxx Email * xxxxx#gmail.com mailto:xxxxxxx#gmail.com
Mobile phone number * 0xxxxxx
Do you want to become a member of Assoc? Yes Check all that apply *
Members
Education
Ethical Conduct
Events
Regulation
I tried the solution below and it’s not working.
From article: If you need to build a formula to remove these line breaks all you need to know is that this ‘character’ is character 10 in Excel. You can create this character in an Excel cell with the formula =CHAR(10).
So to remove it we can use the SUBSTITUTE formula and replace CHAR(10) with nothing ( shown as “”).
https://www.auditexcel.co.za/blog/removing-line-breaks-from-cells-the-alt-enters/#:~:text=Building%20a%20formula%20to%20remove%20the%20ALT%20ENTER%20line%20breaks,-If%20you%20need&text=You%20can%20create%20this%20character,cell%20with%20no%20line%20breaks.
My understanding is that you dump an email into one Excel cell and hope to separate a series of strings (Country, Email, etc.) that are separated by line breaks?
I suggest using the Split function to separate the strings into an array, then looping through that array to put the information in the desired cells. Mind you, this will only work if the items are in the same order every time; if the order can change, you will need to add a data verification step, e.g. If InStr([Range], "#") > 0 Then it's an email...
Split([string to split], [delimiter])
https://learn.microsoft.com/en-us/office/vba/language/reference/user-interface-help/split-function
Sub SplitEmailDump()
    Dim strEmail As String      'Email dump
    Dim arrEmail() As String    'Array for looping
    Dim ItemsInArray As Long    'Upper bound of the array (item count minus 1)
    Dim i As Long               'Counter
    'Cell your email dumps to (A1 is only an example address)
    strEmail = ActiveSheet.Range("A1").Value
    arrEmail = Split(strEmail, Chr(10))    'Populate array; Chr(10) is the line break
    ItemsInArray = UBound(arrEmail)
    For i = 0 To ItemsInArray
        'Write each item one column further to the right, starting at B1 (example target)
        ActiveSheet.Cells(1, 2 + i).Value = arrEmail(i)
    Next i
End Sub
when i = 0 it's a country code
when i = 1 it's an email
when i = 2 it's a phone #
etc....
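If the order of the items can change, the data verification step mentioned above could be sketched like this. It is only an illustration: the function name GuessField is hypothetical, and the keywords it looks for are assumptions taken from the sample in the question (the "#" stands in for "@" there, so both are checked).

```vba
' Hypothetical classifier: guesses what a line of the email dump holds
' by looking at its content instead of its position in the array.
' Assumption: the labels match the sample in the question.
Function GuessField(ByVal LineText As String) As String
    If InStr(1, LineText, "#") > 0 Or InStr(1, LineText, "@") > 0 Then
        GuessField = "Email"
    ElseIf InStr(1, LineText, "phone", vbTextCompare) > 0 Then
        GuessField = "Phone"
    ElseIf InStr(1, LineText, "Name", vbTextCompare) = 1 Then
        ' The text starts with the label "Name"
        GuessField = "Name"
    Else
        GuessField = "Other"
    End If
End Function
```

Inside the loop, the result of GuessField(arrEmail(i)) can then decide which column an item is written to, rather than relying on the value of i.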

Categorization Based on Text-Code in Columns

I have two columns with codes relating to various products. Each is a 3-part code delimited with a '-'. The length of each of the 3 parts is not constant, and the parts are alphanumeric.
I need to categorize them according to the following criteria:
Compare each of the codes in B against A and vice-versa, and categorize them all as below and in the image:
1. Exactly Matched codes
2. Prefix or Suffix Changes codes
3. Totally New codes
However, there seems to be a complication. The codes in the two columns are not necessarily sorted, and a match can be anywhere in the other column. Is there a way to look up the text and then do the compare function? I know this opens up a lot of complications; my thought is to look up the value and then pass the parameters to get the category. Thanks again!
Kindly help me achieve this. Is there any formula that checks through an array using Find functions? Many thanks for the support.
You can use the Split function and Select Case to deal with your problem. I assume you know how to use a UDF.
Function CompareCode(Text1, Text2, Optional Delim = "-")
    Dim T1, T2, CC
    T1 = Split(Text1, Delim)
    T2 = Split(Text2, Delim)
    ' In VBA, True is -1, so each differing part contributes -100, -10 or -1
    CC = (T1(0) <> T2(0)) * 100 + (T1(1) <> T2(1)) * 10 + (T1(2) <> T2(2)) * 1
    ' Flip the sign and pad to three digits: one digit per part, 1 = changed
    CC = Format(-CC, "000")
    Select Case CC
        Case "000": CompareCode = "Same code"
        Case "100": CompareCode = "Prefix changed"
        Case "010": CompareCode = "Base changed"
        Case "110": CompareCode = "Prefix and base changed"
        Case "001": CompareCode = "Suffix changed"
        Case "101": CompareCode = "Prefix and suffix changed"
        Case "011": CompareCode = "Base and suffix changed"
        Case "111": CompareCode = "Totally new code"
        Case Else:
    End Select
End Function
This is merely a partial answer:
For the first part, the exactly matching codes, you can use a simple lookup formula such as SUMIFS(), matching the items in Column B against the whole set in Column A.
For the other two requirements, if I wanted to do this by formula, I would use the LEN(), LEFT() and RIGHT() formulas to extract the prefix, base, and suffix into separate columns. Do this for both Group A and B.
Finding your matching groups should be fairly straightforward from that point on.
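For the complication raised in the question (the columns are not sorted, so a match can sit anywhere in the other column), one option is a UDF that scans the whole other column for each code. The following is a rough sketch, not a definitive solution: the name CategorizeCode is hypothetical, and it assumes that "prefix or suffix changed" means the middle part of the code still matches some code in the other column.

```vba
' Hypothetical UDF: compares Code against every entry in Codes, in any order.
' Codes may be a worksheet range or a VBA array.
' Assumption: "prefix or suffix changed" means the middle part still matches.
Function CategorizeCode(ByVal Code As String, Codes As Variant) As String
    Dim Item As Variant
    Dim A() As String, B() As String
    Dim BaseFound As Boolean

    A = Split(Code, "-")
    If UBound(A) <> 2 Then
        CategorizeCode = "Not a 3-part code"
        Exit Function
    End If

    For Each Item In Codes
        B = Split(CStr(Item), "-")
        ' Entries that are empty or not 3-part codes are skipped
        If UBound(B) = 2 Then
            If A(1) = B(1) Then
                BaseFound = True
                If A(0) = B(0) And A(2) = B(2) Then
                    CategorizeCode = "Exactly matched"
                    Exit Function
                End If
            End If
        End If
    Next Item

    If BaseFound Then
        CategorizeCode = "Prefix or suffix changed"
    Else
        CategorizeCode = "Totally new"
    End If
End Function
```

In a worksheet it could be called as =CategorizeCode(B2, $A$2:$A$100), where both addresses are placeholders for the actual data; swapping the columns gives the comparison in the other direction.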

Prevent Partial Duplicates in Excel

I have a worksheet with products where the people in my office can add new positions. The problem we're running into is that the products have specifications but not everybody puts them in (or inputs them wrong).
Example:
"cool product 14C"
Is there a way to set up the Data Validation option so that it warns me when I enter "very cool product 14B" or anything else that contains an already existing string of characters (say, longer than 4), like "cool produKt 14C", but also "good product 15" and so on?
I know that I can prevent 100% matches using COUNTIF and spot words that start/end in the same way using LEFT/RIGHT but I need to spot partial matches within the entries as well.
Thanks a lot!
If you want to cover typos, word wraps, figure permutations etc., maybe a SOUNDEX algorithm would suit your problem. Here's an implementation for Excel ...
So if you insert this as a user defined function, and create a column =SOUNDEX(A1) for each product row, upon entry of a new product name you can filter for all product rows with same SOUNDEX value. You can further automate this by letting user enter the new name into a dialog form first, do the validation, present them a Combo Box dropdown with possible duplicates, etc. etc. etc.
edit:
small function to find parts of strings terminated by blanks in a range (in answer to your comment)
Function FindSplit(Arg As Range, LookRange As Range) As String
    Dim LookFor() As String, LookCell As Range
    Dim Idx As Long
    LookFor = Split(Arg)
    FindSplit = ""
    For Idx = 0 To UBound(LookFor)
        For Each LookCell In LookRange.Cells
            If InStr(1, LookCell, LookFor(Idx)) <> 0 Then
                If FindSplit <> "" Then FindSplit = FindSplit & ", "
                FindSplit = FindSplit & LookFor(Idx) & ":" & LookCell.Row
            End If
        Next LookCell
    Next Idx
    If FindSplit = "" Then FindSplit = "Cool entry!"
End Function
This is a bit crude, but what it does is the following:
1. split a single cell argument into pieces and put them into an array --> split()
2. process each piece --> For Idx = ...
3. search another range for strings that contain the piece --> For Each ...
4. add the piece and the row number of the cell where it was found to a result string
You can enter/copy this as a formula next to each cell input and know immediately whether you've done a cool input or not.
Value of cell D8 is [asd:3, wer:4]
Note the use of absolute addressing in the start of lookup range; this way you can copy the formula well down.
edit 17-Mar-2015
further to comment Joanna 17-Mar-2015, if the search argument is part of the range you're scanning, e.g. =FINDSPLIT(C5; C1:C12) you want to make sure that the If Instr(...) doesn't hit if LookCell and LookFor(Idx) are really the same cell as this would create a false positive. So you would rewrite the statement to
...
...
If InStr(1, LookCell, LookFor(Idx)) <> 0 And _
Not (LookCell.Row = Arg.Row And LookCell.Column = Arg.Column) _
Then
hint
Do not use a complete column (e.g. $C:$C) as the second argument as the function tends to become very slow without further precautions
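One possible precaution, sketched here under assumptions, is to clip the lookup range to the sheet's used range before looping, so that passing a whole column does not scan every empty row. The function name CountHits is hypothetical; it only demonstrates the clipping and counts matching cells instead of building the result string.

```vba
' Hypothetical variant of the inner loop: counts the cells in LookRange
' that contain Piece, scanning only the part of LookRange that overlaps
' the sheet's used range.
Function CountHits(ByVal Piece As String, LookRange As Range) As Long
    Dim Scan As Range, Cell As Range

    ' Intersect returns Nothing when the two ranges do not overlap
    Set Scan = Intersect(LookRange, LookRange.Worksheet.UsedRange)
    If Scan Is Nothing Then Exit Function
    If Len(Piece) = 0 Then Exit Function   ' InStr would match an empty piece everywhere

    For Each Cell In Scan.Cells
        If InStr(1, Cell.Text, Piece) <> 0 Then CountHits = CountHits + 1
    Next Cell
End Function
```

The same two lines (Set Scan = Intersect(...) and the Nothing check) could be placed at the top of FindSplit, with the For Each loop running over Scan.Cells instead of LookRange.Cells.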

How to concatenate a list of words into a sentence with "and" before last item in Excel?

I want to join a list of words in Excel (not in VBA... with an Excel formula in the worksheet) to the following specifications:
Formula should ignore empty cells.
Formula should concatenate the words with "and" before final item if there is more than one item in the array of cells.
Formula should add "," between items if there are more than two items.
Examples:
A1=dog
A2=cat
A3=bird
A4=fish
Result would be: dog, cat, bird, and fish
A1=dog
A2=cat
A3=(empty cell)
A4=fish
Result would be: dog, cat, and fish
A1=dog
A2=(empty cell)
A3=bird
A4=(empty cell)
Result would be: dog and bird
A1=dog
A2=(empty cell)
A3=(empty cell)
A4=(empty cell)
Result would be: dog
Pretty please? I promise I've searched and searched for the answer.
Edit: Thank you, ExcelArchitect, I got it! This was the first time I'd ever used a custom function. You use it just like any other function in the worksheet! This is so great.
Not to push my luck, but how do I get two cells to concatenate with my result if there is only one word in the result and two other cells if there is more than one word? Example: If the function you made for me returns just "dog", I'd want it to concatenate a cell with the text (B1) "My favorite thing to wear is a " and then "dog" and then another cell (B2) that says " costume." to make the sentence "My favorite thing to wear is a dog costume." But if it returns more than one animal, it would concatenate two other cells like this: Cell C1 "My favorite things to wear are " and "dog, cat, and bird" and Cell C2 " costumes." so that it would say "My favorite things to wear are dog, cat, and bird costumes."
If you're curious, my data really has nothing to do with animals or costumes. I am writing a program that will score a psychological test and then create an interpretive report from the test scores (I'm a psychologist).
-Mary Anne
Mary Anne:
This would be a great time to use VBA! But if you don't want to, there is a way to accomplish your goal without it.
You have to account for all of the possible outcomes here. With 4 different animals that means you have 15 outcomes:
Your equation just has to take into account all 15. It is VERY long and drawn out as a result. As such, if you have more than 4 animals that you'd like to turn into phrases, you should go the VBA route.
Here is my set up:
The formula in A7 is the following:
=IF(AND(A2<>"", A3="", A4="", A5=""), A2, IF(AND(A2="", A3<>"", A4="", A5=""), A3, IF(AND(A2="", A3="", A4<>"", A5=""), A4, IF(AND(A2="", A3="", A4="", A5<>""), A5, IF(AND(A2<>"", A3<>"", A4="", A5=""), A2&" and "&A3, IF(AND(A2<>"", A3="", A4<>"", A5=""), A2&" and "&A4, IF(AND(A2<>"", A3="", A4="", A5<>""), A2&" and "&A5, IF(AND(A2="", A3<>"", A4<>"", A5=""),A3&" and "&A4, IF(AND(A2="", A3<>"", A4="", A5<>""), A3&" and "&A5, IF(AND(A2="", A3="", A4<>"", A5<>""),A4&" and "&A5, IF(AND(A2<>"", A3<>"", A4<>"", A5=""), A2&", "&A3&", and "&A4, IF(AND(A2<>"", A3<>"", A4="", A5<>""), A2&", "&A3&", and "&A5, IF(AND(A2<>"", A3="", A4<>"", A5<>""), A2&", "&A4&", and "&A5, IF(AND(A2="", A3<>"", A4<>"", A5<>""), A3&", "&A4&", and "&A5, A2&", "&A3&", "&A4&", and "&A5))))))))))))))
Here it is via Excel:
Mary Anne - I'm such a nerd that I had to do this. Here is the VBA solution, and you can have as many names as you want! Paste this code into a new module in the workbook (go to Developer -> Visual Basic, then Insert -> New Module, and paste), then you can use it in your worksheet like a regular function. Just give it the range where the names are and you should be good to go! -Matt
Function CreatePhrase(NamesRng As Range) As String
    'Creates a comma-separated phrase given a list of words or names
    Dim Cell As Range
    Dim l As Long
    Dim cp As String
    'Add commas between the values in the cells
    For Each Cell In NamesRng
        If Not IsEmpty(Cell) And Not Cell.Value = "" And Not Cell.Value = " " Then
            cp = cp & Cell.Value & ", "
        End If
    Next Cell
    'Remove trailing comma and space
    If Right(cp, 2) = ", " Then cp = Left(cp, Len(cp) - 2)
    'If there is only one value (no commas) then quit here
    If InStr(1, cp, ",", vbTextCompare) = 0 Then
        CreatePhrase = cp
        Exit Function
    End If
    'Add "and" to the end of the phrase
    For l = 1 To Len(cp)
        If Mid(cp, Len(cp) - l + 1, 1) = "," Then
            cp = Left(cp, Len(cp) - l + 2) & "and" & Right(cp, l - 1)
            Exit For
        End If
    Next l
    'If there are only two words or names (only one comma) then remove the comma
    If InStr(InStr(1, cp, ",", vbTextCompare) + 1, cp, ",", vbTextCompare) = 0 Then
        cp = Left(cp, InStr(1, cp, ",", vbTextCompare) - 1) & Right(cp, Len(cp) - InStr(1, cp, ",", vbTextCompare))
    End If
    CreatePhrase = cp
End Function
Hope that helps!
Matt, via ExcelArchitect.com
VBA is simpler. A formula is quite complicated, since Excel has no native functions allowing concatenation of a range. However, given that you have written that you would have up to eight animals, it is doable with the following formula which concatenates the contents of A1:A8 according to your rules. You can change those locations in the formula in the obvious locations.
I made one change: I may be wrong, but I believe English rules indicate that the comma preceding the last and should be omitted, so I did so. It could be added in if necessary. EDIT: Further investigation reveals a difference between US and UK rules: US rules are as you requested, UK rules omit the comma before the conjunction. I will modify the formulas and UDF to comply with US conventions.
In the formulas, the modification is to place a comma immediately prior to the and. The change in the UDF is likewise minor.
The formula was constructed from the following sequences:
So putting those formulas together, so as only to refer to A1:A8, we wind up with this monster:
=SUBSTITUTE(IFERROR(SUBSTITUTE(MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(CONCATENATE(",",A1,",",A2,",",A3,",",A4,",",A5,",",A6,",",A7,",",A8,","),",,",","),",,",","),",,",","),2,LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(CONCATENATE(",",A1,",",A2,",",A3,",",A4,",",A5,",",A6,",",A7,",",A8,","),",,",","),",,",","),",,",","))-2),",",",and ",LEN(MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(CONCATENATE(",",A1,",",A2,",",A3,",",A4,",",A5,",",A6,",",A7,",",A8,","),",,",","),",,",","),",,",","),2,LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(CONCATENATE(",",A1,",",A2,",",A3,",",A4,",",A5,",",A6,",",A7,",",A8,","),",,",","),",,",","),",,",","))-2))-LEN(SUBSTITUTE(MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(CONCATENATE(",",A1,",",A2,",",A3,",",A4,",",A5,",",A6,",",A7,",",A8,","),",,",","),",,",","),",,",","),2,LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(CONCATENATE(",",A1,",",A2,",",A3,",",A4,",",A5,",",A6,",",A7,",",A8,","),",,",","),",,",","),",,",","))-2),",",""))),MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(CONCATENATE(",",A1,",",A2,",",A3,",",A4,",",A5,",",A6,",",A7,",",A8,","),",,",","),",,",","),",,",","),2,LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(CONCATENATE(",",A1,",",A2,",",A3,",",A4,",",A5,",",A6,",",A7,",",A8,","),",,",","),",,",","),",,",","))-2)),",",", ")
Here is a VBA solution which will allow for any number of items; it concatenates according to the same rules as above.
Option Explicit
Function ConcatRangeWithAnd(RG As Range, Optional Delim As String = ", ")
    Dim COL As Collection
    Dim C As Range
    Dim S As String
    Dim I As Long
    Set COL = New Collection
    For Each C In RG
        If Len(C.Text) > 0 Then COL.Add C.Text
    Next C
    Select Case COL.Count
        Case 0
            Exit Function
        Case 1
            ConcatRangeWithAnd = COL(1)
        Case 2
            ConcatRangeWithAnd = COL(1) & " and " & COL(2)
        Case Else
            For I = 1 To COL.Count - 1
                S = S & COL(I) & ", "
            Next I
            ConcatRangeWithAnd = S & "and " & COL(COL.Count)
    End Select
End Function
With the new TEXTJOIN function, this can be done very easily.
Step 1: Use the TEXTJOIN function with the ", " delimiter and set ignore_empty to TRUE. This gives you a comma-separated, concatenated string that ignores the blank values.
Step 2: Count the number of non-blank entries in the list using the COUNTA function and subtract 1 from it. This is the instance number of the last comma. Floor the value at 1 using the MAX function, because SUBSTITUTE does not accept an instance number below 1.
Step 3: Use the SUBSTITUTE function to replace the last instance of ", ", whose instance number was calculated in Step 2, with " and ".
Putting it all together:
=SUBSTITUTE(TEXTJOIN(", ",TRUE,A1:A14),", "," and ",MAX(1,COUNTA(A1:A14)-1))
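Plug in any range you want instead of A1:A14 in the above formula, and you will get a comma-separated concatenation with "and" before the last word. Note that, because the last ", " is replaced outright, the result has no comma before "and" (for example: dog, cat and fish).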
Regarding duplicates:
Firstly, I really love Matt's solution and I've added this to my collection of custom functions.
What I do miss though is the possibility to remove duplicates from the phrase without removing them from the original range.
As you can't create a virtual range (a range that you can just play with in VBA independently from your source data), the solution would probably involve converting the range to an array, running some deduplication code and then creating the phrase from that.
My solution (albeit inelegant) is just to use the UNIQUE and FILTER functions to get a deduplicated list elsewhere on the spreadsheet (can be hidden if it bothers you) and to use Matt's function on that.
=UNIQUE(FILTER(yourRange,yourRange<>""))
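The VBA route described above (collect the values, deduplicate them, then build the phrase) could look roughly like the following sketch. It is not Matt's function: the name UniquePhrase is hypothetical, and it assumes that duplicates should be compared case-insensitively, which is how keys of a VBA Collection behave.

```vba
' Hypothetical UDF: builds "a, b, and c" from Items while skipping blanks
' and repeated words. Items may be a worksheet range or a VBA array.
' Assumption: "dog" and "Dog" count as the same word (Collection keys
' are not case-sensitive); the first spelling encountered is kept.
Function UniquePhrase(Items As Variant) As String
    Dim Seen As New Collection
    Dim Item As Variant
    Dim Word As String
    Dim I As Long

    For Each Item In Items
        Word = Trim$(CStr(Item))
        If Len(Word) > 0 Then
            On Error Resume Next      ' Add raises an error on a duplicate key
            Seen.Add Word, Word
            On Error GoTo 0
        End If
    Next Item

    Select Case Seen.Count
        Case 0
            ' Nothing to join: return an empty string
        Case 1
            UniquePhrase = Seen(1)
        Case 2
            UniquePhrase = Seen(1) & " and " & Seen(2)
        Case Else
            For I = 1 To Seen.Count - 1
                UniquePhrase = UniquePhrase & Seen(I) & ", "
            Next I
            UniquePhrase = UniquePhrase & "and " & Seen(Seen.Count)
    End Select
End Function
```

Used as =UniquePhrase(yourRange), this leaves the source range untouched and needs no helper list elsewhere on the sheet.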

Excel Formula with concatenate

I am trying to generate a customer number using the first three letters of the customer's last name, the first name initial and middle initial, followed by the last four digits of their phone number. How would I do this? All I need is the formula.
First_Name Middle_Initial Last_Name Street_Address City State Zip Phone
Nathaniel E. Conn 6196 View Ct Lancing TN 37770 567-273-3956
Something like this (assuming a table with [structured-references], fill in the actual cell names if not):
=LEFT([LastName] & "---", 3)
& LEFT([FirstName] & "-", 1)
& LEFT([MiddleInitial] & "-", 1)
& RIGHT([PhoneNumber] & "----", 4)
I have used dashes ("-") to fill in any spaces where the field might be smaller than the number of characters you need from it. You can change them to any fill character that suits you.
It depends on whether each piece of data has its own column; it looks like it does.
You can use the LEFT/RIGHT functions to parse the data out of your columns.
=CONCATENATE(LEFT(C1,3), LEFT(A1,1), LEFT(B1,1), RIGHT(H1,4))
I would do:
=MID(CELL_LAST_NAME;1;3)&MID(CELL_FIRST_NAME;1;1)&MID(CELL_MIDDLE_NAME;1;1)&MID(CELL_PHONE;LEN(CELL_PHONE)-3;4)
