How to extract groups of numbers from a string in vba

I have a string of the following shape:
RRP 90 AVE DE GAULLE 92800 PUTEAUX 0109781431-0149012126
The numbers might be separated by characters other than hyphens (e.g. spaces). I know how to differentiate them afterwards with Len().
I need every group of digits to be stored separately (in an array, for example), so that I can tell them apart with Len() and then use them.
I have found how to strip the characters away from the string :
How to find numbers from a string?
But it doesn't suit my problem...
Could you direct me to a function or bit of code that could help me with that?

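A regular expression should run much faster than looping over the characters. The function below keeps only digits and spaces and turns hyphens into spaces; it needs a reference to Microsoft VBScript Regular Expressions 5.5.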
Public Function NumericOnly(s As String) As String
Dim s2 As String
Dim replace_hyphen As String
replace_hyphen = " "
Static re As RegExp
If re Is Nothing Then Set re = New RegExp
re.IgnoreCase = True
re.Global = True
re.Pattern = "[^0-9 -]" 'includes space, if you want to exclude space "[^0-9]"
s2 = re.Replace(s, vbNullString)
re.Pattern = "[^0-9 ]"
NumericOnly = re.Replace(s2, replace_hyphen)
End Function
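Since the question asks for each run of digits as a separate item, matching on \d+ may be a more direct fit than replacing characters. The following is only a sketch under stated assumptions: the name ExtractNumberGroups is hypothetical (it does not come from the answers here), and the code late-binds VBScript.RegExp through CreateObject so that no reference has to be ticked.

```vba
Option Explicit

' Hypothetical helper (not from the answers above): returns every run of
' digits in s as a zero-based array of strings.
Public Function ExtractNumberGroups(ByVal s As String) As Variant
    Dim re As Object, matches As Object
    Dim result() As String
    Dim i As Long

    Set re = CreateObject("VBScript.RegExp")   ' late binding, no reference required
    re.Global = True
    re.Pattern = "\d+"                         ' one or more consecutive digits

    Set matches = re.Execute(s)
    If matches.Count = 0 Then
        ExtractNumberGroups = Array()          ' empty array when nothing is found
        Exit Function
    End If

    ReDim result(0 To matches.Count - 1)
    For i = 0 To matches.Count - 1
        result(i) = matches(i).Value           ' kept as text, so leading zeros survive
    Next i
    ExtractNumberGroups = result
End Function

Sub DemoExtractNumberGroups()
    Dim groups As Variant, g As Variant
    groups = ExtractNumberGroups("RRP 90 AVE DE GAULLE 92800 PUTEAUX 0109781431-0149012126")
    For Each g In groups
        Debug.Print g, Len(g)                  ' each group with its length
    Next g
End Sub
```

Because the groups stay strings, Len() can be applied to each element directly, which is what the question wants to do afterwards.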

Try the code below. It collects every digit of the input into a single string:
Function parseNum(strSearch As String) As String
' Dim strSearch As String
'strSearch = "RRP 90 AVE DE GAULLE 92800 PUTEAUX 0109781431-0149012126"
Dim i As Integer, tempVal As String
For i = 1 To Len(strSearch)
If IsNumeric(Mid(strSearch, i, 1)) Then
tempVal = tempVal + Mid(strSearch, i, 1)
End If
Next
parseNum = tempVal
End Function

So I realize this was a long time ago... but I was looking for a similar solution online.
Some previous history on my programming skillz (sic): I started with Python and with Python I have a handy tool called List. VBA doesn't have this, so what I'm left with is something that I can input in a variable I called sample below, i.e. sample = [1,4,5].
Back to the small code. I made it so holder would only contain groups of numbers, as how you specified they should be grouped.
Sub ListNumberGroups()
    Dim count1 As Long                 ' next output row in column A
    Dim holder As String               ' digits of the group being built
    Dim sample As String, smallSample As String
    count1 = 1
    sample = "1ab33 efa 123 adfije-23423 123124-23423"
    holder = ""
    ' Consume the string one character at a time from the left
    Do While Len(sample) > 0
        smallSample = Left(sample, 1)
        If smallSample Like "#" Then   ' "#" in a Like pattern matches a single digit
            holder = holder & smallSample
        Else
            If holder <> "" Then
                Cells(count1, 1) = holder
                count1 = count1 + 1
            End If
            holder = ""
        End If
        sample = Right(sample, Len(sample) - 1)
    Loop
    ' Write the last group when the string ends with a digit
    If holder <> "" Then Cells(count1, 1) = holder
End Sub
The output I got was
1
33
123
23423
123124
23423
after I ran the code.

Great Simple Python-style loop above.
Extended for a list of strings in column A.
The numbers discovered will show in the columns to the right - B, C, etc.
Sub ListNumberGroupsInColumnA()
    Dim count1 As Long                 ' column offset for the next number found
    Dim holder As String               ' digits of the group being built
    Dim sample As String, smallSample As String
    Dim r As Long
    Dim c As Long
    r = 1
    c = 1
    ' Walk down column A of Sheet2 until the first empty cell
    Do While Sheet2.Cells(r, c) <> ""
        count1 = 1
        sample = Sheet2.Cells(r, c)
        holder = ""
        Do While Len(sample) > 0
            smallSample = Left(sample, 1)
            If smallSample Like "#" Then   ' "#" in a Like pattern matches a single digit
                holder = holder & smallSample
            Else
                If holder <> "" Then
                    Sheet2.Cells(r, c + count1).Value = holder
                    count1 = count1 + 1
                End If
                holder = ""
            End If
            sample = Right(sample, Len(sample) - 1)
        Loop
        ' Write the last group when the cell text ends with a digit
        If holder <> "" Then Sheet2.Cells(r, c + count1).Value = holder
        r = r + 1
    Loop
End Sub

If you get the compile error "User-defined type not defined" while using @Scott's answer, enable the regular expression reference (in the VBA editor: Tools / References, then tick Microsoft VBScript Regular Expressions 5.5) as shown in steps 1 and 2 here: How to Use/Enable (RegExp object) Regular Expression using VBA (MACRO) in word (Works in Excel also)
P.S. Scott's solution worked well for me.

Related

Extract only numbers with 8 digits that are not followed by certain characters (. , #)

I am using a function to extract numbers from a string, with the conditions that a number has exactly 8 digits and is not followed by any of the characters (. , #).
It works for 8-digit numbers, but if the number is followed by one of the characters (. , #), it extracts that number as well, which is not wanted.
My string is 11111111 12345678.1 11111112 11111113 and the expected output is 11111111 11111112 11111113, without 12345678.1.
I tried the negative lookahead \d{8}(?!.,#), but it did not help.
Thanks all for your help.
Function Find8Numbers(st As String) As Variant
Dim regex As New RegExp
Dim matches As MatchCollection, mch As match
regex.Pattern = "\d{8}" 'Look for variable length numbers only
regex.IgnoreCase = True
regex.Global = True
regex.MultiLine = True
If (regex.Test(st) = True) Then
Set matches = regex.Execute(st) 'Execute search
For Each mch In matches
Find8Numbers = LTrim(Find8Numbers) & " " & mch.value
Next
End If
End Function

In line with your question and current attempt, you could indeed use regex:
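Function Find8Numbers(st As String) As String
    ' Late-bound objects, so no reference to the RegExp library is needed
    Dim matches As Object, mch As Object
    With CreateObject("vbscript.regexp")
        .Pattern = "(?:^|\s)(\d{8})(?![.,#\d])"
        .Global = True
        If .Test(st) Then
            Set matches = .Execute(st)
            For Each mch In matches
                ' SubMatches(0) is the 8-digit capture group without the leading whitespace
                Find8Numbers = LTrim(Find8Numbers & " " & mch.submatches(0))
            Next
        End If
    End With
End Function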
Invoke through:
Sub Test()
Dim s As String: s = "11111111 12345678.1 11111112 11111113"
Debug.Print Find8Numbers(s)
End Sub
Prints:
11111111 11111112 11111113
Pattern used:
(?:^|\s)(\d{8})(?![.,#\d])
See an online demo
(?:^|\s) - No lookbehind in VBA thus used a non-capture group to match start-line anchor or whitespace;
(\d{8}) - Exactly 8 digits in capture group;
(?![.,#\d]) - Negative lookahead to assert position isn't followed by any of given characters including digits.
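To illustrate why the lookahead from the question, \d{8}(?!.,#), did not help: inside a lookahead, .,# is read as a three-character sequence (any character, then a comma, then #), not as a set of alternatives, whereas the character class [.,#] means "any one of these". The sketch below assumes a hypothetical helper named RegexHit that only reports whether a pattern matches.

```vba
Option Explicit

' Hypothetical helper: True when pattern pat matches anywhere in s.
Function RegexHit(ByVal pat As String, ByVal s As String) As Boolean
    With CreateObject("VBScript.RegExp")
        .Pattern = pat
        RegexHit = .Test(s)
    End With
End Function

Sub DemoLookahead()
    ' Sequence form: only rejects 8 digits followed by <any char>, comma, #
    Debug.Print RegexHit("\d{8}(?!.,#)", "12345678.1")      ' True: the number is still matched
    ' Character-class form: rejects 8 digits followed by . or , or #
    Debug.Print RegexHit("\d{8}(?![.,#])", "12345678.1")    ' False
    Debug.Print RegexHit("\d{8}(?![.,#])", "12345678 x")    ' True
End Sub
```

The class form alone would still pick the first 8 digits out of a longer number, which is why the pattern in the answer also lists \d inside the lookahead.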

I'm not sure you need Regex for what is a reasonably simple pattern. You could just go with a VBA solution:
Public Function Find8Numbers(str As String) As String
Dim c As String, c1 As String
Dim i As Long, numStart As Long
Dim isNumSeq As Boolean
Dim result As String
If Len(str) < 8 Then Exit Function
For i = 1 To Len(str)
c = Mid(str, i, 1)
If i = Len(str) Then
c1 = ""
Else
c1 = Mid(str, i + 1, 1)
End If
If c >= "0" And c <= "9" Then
If isNumSeq Then
If i - numStart + 1 = 8 Then
If Not (c1 Like "[.,#0-9]") Then ' also reject a following digit, so longer numbers are skipped
If result <> "" Then result = result & " "
result = result & Mid(str, numStart, 8)
isNumSeq = False
End If
End If
Else
If i > Len(str) - 8 + 1 Then Exit For
isNumSeq = True
numStart = i
End If
Else
isNumSeq = False
End If
Next
Find8Numbers = result
End Function

Excel - VBA code works only for a few tries

I created this code in VBA so that every time I delete a number or a cell in D7:O36 is empty, the code runs automatically (on selection change).
The code runs fine if a small number of cells (~100) are emptied at once; the cells then get filled with a "-".
The problem is that after doing it all at once more than around 100 times (each cell), Excel stops working with run-time error 1004. I've read about the error, but it doesn't look like it applies here, at least not to the naked eye.
I don't know if the problem is how I implemented it or that I'm doing something too heavy for Excel to handle.
UPDATE:
Thanks to Tim Williams' comment below, the issue was not only fixed (for some reason it worked), but the code got much smaller and simpler, it runs faster, and each time a "-" is added Excel no longer pulls you to the active cell (you can activate another cell while the code is running).
Comment:
Maybe simpler: Dim c As Range: For Each c In Me.Range("D7:O36").Cells: If Len(c.Value)=0 Then c.Formula = "=""-""": Next c –
Tim Williams
Here is the updated code:
Private Sub Worksheet_SelectionChange(ByVal Target As Range)
Dim c As Range
For Each c In Me.Range("D7:O36").Cells
If c.Value = "" Then
c.Formula = "-"
End If
Next c
End Sub
Original code:
Private Sub Worksheet_SelectionChange(ByVal Target As Range)
Dim num As Integer
Dim letter As String
Dim count As Integer
Dim cellvalue As String
Dim cellnum As String
letter = "D"
num = 7
count = 0
For i = 0 To 432
cellnum = letter & num
cellvalue = Range(cellnum).Value
If cellvalue = "" Then
Range(cellnum).Select
ActiveCell.FormulaR1C1 = "-"
End If
If num = 36 Then
If count = 0 Then
letter = "E"
ElseIf count = 1 Then
letter = "F"
ElseIf count = 2 Then
letter = "G"
ElseIf count = 3 Then
letter = "H"
ElseIf count = 4 Then
letter = "I"
ElseIf count = 5 Then
letter = "J"
ElseIf count = 6 Then
letter = "K"
ElseIf count = 7 Then
letter = "L"
ElseIf count = 8 Then
letter = "M"
ElseIf count = 9 Then
letter = "N"
ElseIf count = 10 Then
letter = "O"
End If
num = 6
count = count + 1
End If
num = num + 1
Next i
End Sub

Convert String Characters to Text

I'm trying to convert a large amount of data into a written description of the text. Ex. Convert YYYY####### to "4 digit year, 7 numeric digits" and YYMMDD-#### to "2 digit year, 2 digit month, 2 digit day, hyphen, 4 numeric digits"
The constant characters are Y, M, D, #, - and X (X is for non-defined alpha characters). There are also some defined alpha characters (Y, M, D and X are never used for anything other than Year, Month, Day and Alpha respectively), e.g. RP-YYYY#####, and I want to capture those (anything other than the constant characters) and state them as they are. So the written text for RP-YYYY##### would be "RP, hyphen, 4 digit year, 5 numeric digits"
I'm able to get a count of each character using the Len and Replace methods, however I'm struggling to figure out how to produce the written text in the correct order, or to capture non constant characters like RP and state them as is.
Any help would be much appreciated!
Sub getcharacters()
Dim casenumber As String
casenumber = Range("A1")
InitialCount = Len(casenumber)
YearDigits = Len(casenumber) - Len(Replace(casenumber, "Y", ""))
MonthDigits = Len(casenumber) - Len(Replace(casenumber, "MM", ""))
DayDigits = Len(casenumber) - Len(Replace(casenumber, "DD", ""))
NumberDigits = Len(casenumber) - Len(Replace(casenumber, "#", ""))
AlphaDigits = Len(casenumber) - Len(Replace(casenumber, "X", ""))
HyphenDigits = Len(casenumber) - Len(Replace(casenumber, "-", ""))
FinalCount = InitialCount - YearDigits - MonthDigits - DayDigits - Digits - AlphaDigits
If YearDigits = "0" Then WrittenYear = ""
If YearDigits = "2" Then WrittenYear = "Two digit year"
If YearDigits = "4" Then WrittenYear = "Four digit year"
If MonthDigits = "0" Then WrittenMonth = "" Else WrittenMonth = "Two digit month"
If DayDigits = "0" Then WrittenDay = "" Else WrittenDay = "Two digit day"
If NumberDigits = "0" Then WrittenDigits = "" Else WrittenDigits = NumberDigits & " digits"
If AlphaDigits = "0" Then WrittenAlpha = "" Else WrittenAlpha = AlphaDigits & " alpha characters"
WrittenCaseNumber = WrittenYear & WrittenMonth & WrittenDay & WrittenDigits & WrittenAlpha
End Sub

You may use Regex for the matching and replacing part along with some helper functions to convert digits to words, etc. I admit it's a bit ugly but it does the trick.
First, add the following two references to your project:
Microsoft VBScript Regular Expressions 5.5
Microsoft Scripting Runtime
Second, add the following code to a Module:
Option Explicit
Private DictAlphaCharacters As Scripting.Dictionary
Private Sub InitializeDictAlphaCharacters()
Set DictAlphaCharacters = New Scripting.Dictionary
DictAlphaCharacters.Add "Y", "digit year"
DictAlphaCharacters.Add "M", "digit month"
DictAlphaCharacters.Add "D", "digit day"
DictAlphaCharacters.Add "#", "numeric digits"
End Sub
Public Function DescribeThis(s As String) As String
If DictAlphaCharacters Is Nothing Then InitializeDictAlphaCharacters
Dim tmpStr As String: tmpStr = s
Dim regEx As New RegExp
regEx.Global = True
Dim matches As MatchCollection
Dim m As Match
Dim k As Variant ' Dictionary key.
Dim alpha As String ' The corresponding sentence for an alpha char.
Dim l As Integer ' Length of the matched string (consecutive alpha chars).
Dim w As String ' The corresponding word of a digit.
For Each k In DictAlphaCharacters.Keys
alpha = DictAlphaCharacters.Item(k)
regEx.Pattern = k & "{1,9}"
Set matches = regEx.Execute(tmpStr)
For Each m In matches
l = m.Length
w = DigitToWord(l)
' Pattern ex. = "([^Y])?,?Y{2}(?!Y)"
regEx.Pattern = "([^" & k & "])?,?" & k & "{" & l & "}(?!" & k & ")"
' Replacement example: "$1,Two digit year,"
tmpStr = regEx.Replace(tmpStr, "$1," & w & " " & alpha & ",")
Next
Next
regEx.Pattern = ",?-,?"
tmpStr = regEx.Replace(tmpStr, ",hyphen,")
regEx.Pattern = "^,+|,+$"
DescribeThis = regEx.Replace(tmpStr, "")
End Function
Public Function DigitToWord(d As Integer) As String
Select Case d
Case 1: DigitToWord = "One"
Case 2: DigitToWord = "Two"
Case 3: DigitToWord = "Three"
Case 4: DigitToWord = "Four"
Case 5: DigitToWord = "Five"
Case 6: DigitToWord = "Six"
Case 7: DigitToWord = "Seven"
Case 8: DigitToWord = "Eight"
Case 9: DigitToWord = "Nine"
End Select
End Function
Usage:
Sub Test()
Debug.Print DescribeThis("YYYY#######")
Debug.Print DescribeThis("YYMMDD-####")
Debug.Print DescribeThis("RP-YYYY#####")
Debug.Print DescribeThis("YYYMMM-YYMM")
End Sub
Output:
Four digit year,Seven numeric digits
Two digit year,Two digit month,Two digit day,hyphen,Four numeric digits
RP,hyphen,Four digit year,Five numeric digits
Three digit year,Three digit month,hyphen,Two digit year,Two digit month

This seems to accomplish what you want.
As written it assumes that all of the "like" characters in the set of [YMD#] are contiguous. If groups of Y's, for example, could repeat in different parts of the string, we just need to change the charCnt function.
Option Explicit
'set reference to Microsoft Scripting Runtime
Function convStr(S As String) As String
Dim myDict As Dictionary
Dim sRes() As String
Dim I As Long
Dim CH As String
Set myDict = New Dictionary
myDict.CompareMode = TextCompare
myDict.Add "Y", "digit year"
myDict.Add "M", "digit month"
myDict.Add "D", "digit day"
myDict.Add "#", "numeric digits"
myDict.Add "-", "hyphen"
ReDim sRes(0)
For I = 1 To Len(S)
CH = Mid(S, I, 1)
If myDict.Exists(CH) Then
sRes(UBound(sRes)) = IIf(CH <> "-", charCnt(S, CH) & " ", "") & myDict(CH)
I = I + charCnt(S, CH)
Else
Do While I <= Len(S) And Not myDict.Exists(CH) ' stop at the end of the string
sRes(UBound(sRes)) = sRes(UBound(sRes)) & CH
I = I + 1
CH = Mid(S, I, 1)
Loop
End If
I = I - 1
ReDim Preserve sRes(UBound(sRes) + 1)
Next I
ReDim Preserve sRes(UBound(sRes) - 1)
convStr = Join(sRes, ", ")
End Function
Function charCnt(S As String, CH As String) As Long
Dim startChar As Long
startChar = InStr(S, CH)
If startChar > 0 Then
charCnt = Len(S) - Len(Replace(S, CH, ""))
Else
charCnt = 0
End If
End Function
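If groups of the same character can repeat in different parts of the string (for example YYMM-YY), one option is to count only the contiguous run that starts at the current position instead of every occurrence in the string. The following is a sketch of that idea, not the answer's own code: RunLength, DescribeRuns and AppendPart are hypothetical names, and it uses Select Case instead of a Dictionary so that no reference is needed.

```vba
Option Explicit

' Hypothetical helper: length of the run of identical characters starting at startPos.
Function RunLength(ByVal s As String, ByVal startPos As Long) As Long
    Dim ch As String, n As Long
    ch = Mid$(s, startPos, 1)
    Do While startPos + n <= Len(s)
        If Mid$(s, startPos + n, 1) <> ch Then Exit Do
        n = n + 1
    Loop
    RunLength = n
End Function

' Appends part to result, separated by ", ".
Private Sub AppendPart(ByRef result As String, ByVal part As String)
    If result <> "" Then result = result & ", "
    result = result & part
End Sub

' Describes s run by run; characters outside YMD#X- are passed through as-is.
Function DescribeRuns(ByVal s As String) As String
    Dim i As Long, n As Long
    Dim ch As String, literal As String, result As String

    i = 1
    Do While i <= Len(s)
        ch = Mid$(s, i, 1)
        If InStr(1, "YMD#X-", ch, vbBinaryCompare) = 0 Then
            literal = literal & ch              ' collect literal text such as "RP"
            i = i + 1
        Else
            If literal <> "" Then AppendPart result, literal: literal = ""
            n = RunLength(s, i)
            Select Case ch
                Case "Y": AppendPart result, n & " digit year"
                Case "M": AppendPart result, n & " digit month"
                Case "D": AppendPart result, n & " digit day"
                Case "#": AppendPart result, n & " numeric digits"
                Case "X": AppendPart result, n & " alpha characters"
                Case "-": AppendPart result, "hyphen": n = 1   ' one entry per hyphen
            End Select
            i = i + n
        End If
    Loop
    If literal <> "" Then AppendPart result, literal
    DescribeRuns = result
End Function
```

For example, DescribeRuns("RP-YYYY#####") returns "RP, hyphen, 4 digit year, 5 numeric digits".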

Separate address elements from 1 cell in Excel

I have thousands of addresses in this format:
123 Happy St. Kansas City, MO 64521
9812 Main Street Minneapolis, MN 62154
12 Virgina Ave, Apt 8, Dallas, TX 54334
I want to extract the address, city, state, zip into individual cells (without using VB if possible). I've tried a couple variations of other methods posted, but I can't quite get desired results.

Analyze your problem!
1) you want to split your address string at the comma
2) you then want to split the right fragment from (1) at the first blank
ad 1): you get the position of the comma using =FIND(",", A1), and use the result in a =LEFT(...) and a =RIGHT(...); for the latter you also need the string length (=LEN(...)). The formulas below use ";" as the argument separator; use "," instead if your regional settings require it.
B1: =LEFT(A1;FIND(",";A1)-1)
C1: =RIGHT(A1;LEN(A1)-LEN(B1)-2)
Now comes the fun part ... in your 3rd example we mustn't split on the first comma, but on the third comma ... or, as a more general rule, we must always split on the last comma ... but how do we find how many commas we have in the string, so that we can locate the last one?
Quick answer: look at Stackoverflow (exactly here) ... very clever ... subtract the length of the string with all commas removed from the original length to get the number of commas, and then replace the last occurrence of the comma with something else, because =SUBSTITUTE(...) can target a given occurrence, whilst =FIND() only works on position. If you incorporate all this, you will have
B1: =LEFT(A1;FIND("#";SUBSTITUTE(A1;",";"#"; LEN(A1)-LEN(SUBSTITUTE(A1;",";""))))-1) --> full address
C1: (same as above)
Here we use "#" as a neutral substitution string for the final comma, as we assume that no address uses the "#"
ad 2): you apply the above (with blank instead of comma) once again to the right part. You can use the simple first version of the formulae, as it's clear you want to split at the first blank
D1: =LEFT(C1;FIND(" ";C1)-1) --> state
E1: =RIGHT(C1;LEN(C1)-LEN(D1)-1) --> ZIP code
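For readers who do not mind VBA, the same "split at the last comma, then at the first blank" idea can be sketched as a user-defined function. This is only an illustration under assumptions: the function name AddressPart and the part keywords "address", "state" and "zip" are made up for this example, and InStrRev is used to find the last comma directly.

```vba
Option Explicit

' Hypothetical UDF (name and part keywords are assumptions): splits an address
' at its LAST comma, then splits the remainder at the first blank.
Public Function AddressPart(ByVal addr As String, ByVal part As String) As String
    Dim pos As Long, sp As Long
    Dim tail As String

    pos = InStrRev(addr, ",")                 ' position of the last comma
    If pos = 0 Then Exit Function             ' no comma: return an empty string

    tail = Trim$(Mid$(addr, pos + 1))         ' e.g. "TX 54334"
    sp = InStr(tail, " ")                     ' first blank inside the tail

    Select Case LCase$(part)
        Case "address"
            AddressPart = Trim$(Left$(addr, pos - 1))
        Case "state"
            If sp > 0 Then
                AddressPart = Left$(tail, sp - 1)
            Else
                AddressPart = tail
            End If
        Case "zip"
            If sp > 0 Then AddressPart = Trim$(Mid$(tail, sp + 1))
    End Select
End Function
```

On a worksheet it could be called as =AddressPart(A1;"state") (or with "," as the separator, depending on regional settings). Like the formulas, it leaves the city inside the address part.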

This VBA function extracts Zip, State, City, Street1, and Street2 (Suite, Apt, etc.) into separate columns. Would need minor modification to remove commas.
Option Explicit
Function ParseAddress(ByVal varAddress As Variant, ByVal strAddressPart As String) As String
Dim aryAddressTokens() As String
Dim strCity As String
Dim intCtr As Integer
Dim intStreet2Tokens As Integer
Dim strStreet1 As String, strStreet2 As String
If IsMissing(varAddress) Or varAddress = vbNullString Then
ParseAddress = ""
Else
aryAddressTokens = Split(Trim(varAddress), " ")
'
If strAddressPart = "Zip" Then
ParseAddress = aryAddressTokens(UBound(aryAddressTokens))
ElseIf strAddressPart = "State" Then
ParseAddress = UCase(aryAddressTokens(UBound(aryAddressTokens) - 1))
ElseIf strAddressPart = "City" Then
strCity = aryAddressTokens(UBound(aryAddressTokens) - 2)
If Right(strCity, 1) = "," Then strCity = Left(strCity, Len(strCity) - 1)
ParseAddress = strCity
ElseIf strAddressPart = "Street1" Or strAddressPart = "Street2" Then
'Find Street2 if present because Street1 output is dependent on it.
' Assume address never begins with a # or Suite.
intCtr = 1
strStreet2 = ""
intStreet2Tokens = 0
While (intCtr < UBound(aryAddressTokens) - 2) And strStreet2 = ""
If Left(aryAddressTokens(intCtr), 1) = "#" Then
If Len(aryAddressTokens(intCtr)) = 1 Then
strStreet2 = aryAddressTokens(intCtr) & aryAddressTokens(intCtr + 1)
intStreet2Tokens = 2
Else
strStreet2 = aryAddressTokens(intCtr)
intStreet2Tokens = 1
End If
ElseIf Left(aryAddressTokens(intCtr), 5) = "Suite" Then
If Len(aryAddressTokens(intCtr)) = 5 Then
strStreet2 = aryAddressTokens(intCtr) & " " & aryAddressTokens(intCtr + 1)
intStreet2Tokens = 2
Else
strStreet2 = aryAddressTokens(intCtr)
intStreet2Tokens = 1
End If
ElseIf Left(aryAddressTokens(intCtr), 3) = "Apt" Then
strStreet2 = aryAddressTokens(intCtr) & " " & aryAddressTokens(intCtr + 1)
intStreet2Tokens = 2
End If
intCtr = intCtr + 1
Wend
If Not IsEmpty(strStreet2) Then
If Right(strStreet2, 1) = "," Then strStreet2 = Left(strStreet2, Len(strStreet2) - 1)
End If
' Now Street1.
strStreet1 = ""
For intCtr = 0 To UBound(aryAddressTokens) - (3 + intStreet2Tokens)
strStreet1 = strStreet1 & " " & aryAddressTokens(intCtr)
Next
If Right(strStreet1, 1) = "," Then strStreet1 = Left(strStreet1, Len(strStreet1) - 1)
'Assign.
If strAddressPart = "Street1" Then
ParseAddress = Trim(strStreet1)
Else
ParseAddress = Trim(strStreet2)
End If
End If
End If
End Function

How to call a macro, to Convert Accented Characters to Regular, that does not appear in the list?

I am trying to replace accented characters with regular characters.
When I try to run the macro it doesn't appear in the list.
Option Explicit
'-- Add more chars to these 2 string as you want
'-- You may have problem with unicode chars that has code > 255
'-- such as some Vietnamese characters that are outside of ASCII code (0-255)
Const AccChars = "ŠŽšžŸÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåçèéêëìíîïðñòóôõöùúûüýÿ"
Const RegChars = "SZszYAAAAAACEEEEIIIIDNOOOOOUUUUYaaaaaaceeeeiiiidnooooouuuuyy"
Sub StripAccent(aRange As Range)
'-- Usage: StripAccent Sheet1.Range("A1:C20")
Dim A As String * 1
Dim B As String * 1
Dim i As Integer
For i = 1 To Len(AccChars)
A = Mid(AccChars, i, 1)
B = Mid(RegChars, i, 1)
aRange.Replace What:=A, _
Replacement:=B, _
LookAt:=xlPart, _
MatchCase:=True
Next
End Sub

I do not see the option to run the macro in my macros list. The macro name is not appearing in the list to select. I have macros enabled and I have a bunch of others I use so I do not understand why it's not showing. – BvilleBullet 4 mins ago
Please see the comment in the above code.
'-- Usage: StripAccent Sheet1.Range("A1:C20")
You have to call it like this
Option Explicit
'-- Add more chars to these 2 string as you want
'-- You may have problem with unicode chars that has code > 255
'-- such as some Vietnamese characters that are outside of ASCII code (0-255)
Const AccChars = "ŠŽšžŸÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåçèéêëìíîïðñòóôõöùúûüýÿ"
Const RegChars = "SZszYAAAAAACEEEEIIIIDNOOOOOUUUUYaaaaaaceeeeiiiidnooooouuuuyy"
'~~> This is how you have to call it. Now You can see the macro "Sample" in the list
Sub Sample()
StripAccent Sheet1.Range("A1:C20")
End Sub
Sub StripAccent(aRange As Range)
'-- Usage: StripAccent Sheet1.Range("A1:C20")
Dim A As String * 1
Dim B As String * 1
Dim i As Integer
For i = 1 To Len(AccChars)
A = Mid(AccChars, i, 1)
B = Mid(RegChars, i, 1)
aRange.Replace What:=A, _
Replacement:=B, _
LookAt:=xlPart, _
MatchCase:=True
Next
End Sub

For those who need to remove accent marks from ALL Roman characters, including extended ones like those used in Vietnamese, then follow the instructions below.
First, let's prepare the spreadsheet to do its VBA magic. In the Microsoft VBA editor, select Tools / References and put a checkmark next to "Microsoft Scripting Runtime". We'll need this to define a dictionary object in the subsequent steps.
Next, we create a global dictionary in order to map the accented characters to their unaccented equivalents. This is done when the Workbook_Open event is triggered so that the dictionary is only initiated once when you open the spreadsheet instead of every time you call the function. AsciiDict is defined as a public variable in step 3. In the "Project - VBAProject" panel, double-click on ThisWorkbook to open the workbook scope. Paste the following code there (below Option Explicit):
Private Sub Workbook_Open()
InitDictionary
End Sub
Sub InitDictionary()
AsciiDict(192) = "A"
AsciiDict(193) = "A"
AsciiDict(194) = "A"
AsciiDict(195) = "A"
AsciiDict(196) = "A"
AsciiDict(197) = "A"
AsciiDict(199) = "C"
AsciiDict(200) = "E"
AsciiDict(201) = "E"
AsciiDict(202) = "E"
AsciiDict(203) = "E"
AsciiDict(204) = "I"
AsciiDict(205) = "I"
AsciiDict(206) = "I"
AsciiDict(207) = "I"
AsciiDict(208) = "D"
AsciiDict(209) = "N"
AsciiDict(210) = "O"
AsciiDict(211) = "O"
AsciiDict(212) = "O"
AsciiDict(213) = "O"
AsciiDict(214) = "O"
AsciiDict(217) = "U"
AsciiDict(218) = "U"
AsciiDict(219) = "U"
AsciiDict(220) = "U"
AsciiDict(221) = "Y"
AsciiDict(224) = "a"
AsciiDict(225) = "a"
AsciiDict(226) = "a"
AsciiDict(227) = "a"
AsciiDict(228) = "a"
AsciiDict(229) = "a"
AsciiDict(231) = "c"
AsciiDict(232) = "e"
AsciiDict(233) = "e"
AsciiDict(234) = "e"
AsciiDict(235) = "e"
AsciiDict(236) = "i"
AsciiDict(237) = "i"
AsciiDict(238) = "i"
AsciiDict(239) = "i"
AsciiDict(240) = "d"
AsciiDict(241) = "n"
AsciiDict(242) = "o"
AsciiDict(243) = "o"
AsciiDict(244) = "o"
AsciiDict(245) = "o"
AsciiDict(246) = "o"
AsciiDict(249) = "u"
AsciiDict(250) = "u"
AsciiDict(251) = "u"
AsciiDict(252) = "u"
AsciiDict(253) = "y"
AsciiDict(255) = "y"
AsciiDict(352) = "S"
AsciiDict(353) = "s"
AsciiDict(376) = "Y"
AsciiDict(381) = "Z"
AsciiDict(382) = "z"
AsciiDict(258) = "A"
AsciiDict(259) = "a"
AsciiDict(272) = "D"
AsciiDict(273) = "d"
AsciiDict(296) = "I"
AsciiDict(297) = "i"
AsciiDict(360) = "U"
AsciiDict(361) = "u"
AsciiDict(416) = "O"
AsciiDict(417) = "o"
AsciiDict(431) = "U"
AsciiDict(432) = "u"
AsciiDict(7840) = "A"
AsciiDict(7841) = "a"
AsciiDict(7842) = "A"
AsciiDict(7843) = "a"
AsciiDict(7844) = "A"
AsciiDict(7845) = "a"
AsciiDict(7846) = "A"
AsciiDict(7847) = "a"
AsciiDict(7848) = "A"
AsciiDict(7849) = "a"
AsciiDict(7850) = "A"
AsciiDict(7851) = "a"
AsciiDict(7852) = "A"
AsciiDict(7853) = "a"
AsciiDict(7854) = "A"
AsciiDict(7855) = "a"
AsciiDict(7856) = "A"
AsciiDict(7857) = "a"
AsciiDict(7858) = "A"
AsciiDict(7859) = "a"
AsciiDict(7860) = "A"
AsciiDict(7861) = "a"
AsciiDict(7862) = "A"
AsciiDict(7863) = "a"
AsciiDict(7864) = "E"
AsciiDict(7865) = "e"
AsciiDict(7866) = "E"
AsciiDict(7867) = "e"
AsciiDict(7868) = "E"
AsciiDict(7869) = "e"
AsciiDict(7870) = "E"
AsciiDict(7871) = "e"
AsciiDict(7872) = "E"
AsciiDict(7873) = "e"
AsciiDict(7874) = "E"
AsciiDict(7875) = "e"
AsciiDict(7876) = "E"
AsciiDict(7877) = "e"
AsciiDict(7878) = "E"
AsciiDict(7879) = "e"
AsciiDict(7880) = "I"
AsciiDict(7881) = "i"
AsciiDict(7882) = "I"
AsciiDict(7883) = "i"
AsciiDict(7884) = "O"
AsciiDict(7885) = "o"
AsciiDict(7886) = "O"
AsciiDict(7887) = "o"
AsciiDict(7888) = "O"
AsciiDict(7889) = "o"
AsciiDict(7890) = "O"
AsciiDict(7891) = "o"
AsciiDict(7892) = "O"
AsciiDict(7893) = "o"
AsciiDict(7894) = "O"
AsciiDict(7895) = "o"
AsciiDict(7896) = "O"
AsciiDict(7897) = "o"
AsciiDict(7898) = "O"
AsciiDict(7899) = "o"
AsciiDict(7900) = "O"
AsciiDict(7901) = "o"
AsciiDict(7902) = "O"
AsciiDict(7903) = "o"
AsciiDict(7904) = "O"
AsciiDict(7905) = "o"
AsciiDict(7906) = "O"
AsciiDict(7907) = "o"
AsciiDict(7908) = "U"
AsciiDict(7909) = "u"
AsciiDict(7910) = "U"
AsciiDict(7911) = "u"
AsciiDict(7912) = "U"
AsciiDict(7913) = "u"
AsciiDict(7914) = "U"
AsciiDict(7915) = "u"
AsciiDict(7916) = "U"
AsciiDict(7917) = "u"
AsciiDict(7918) = "U"
AsciiDict(7919) = "u"
AsciiDict(7920) = "U"
AsciiDict(7921) = "u"
AsciiDict(7922) = "Y"
AsciiDict(7923) = "y"
AsciiDict(7924) = "Y"
AsciiDict(7925) = "y"
AsciiDict(7926) = "Y"
AsciiDict(7927) = "y"
AsciiDict(7928) = "Y"
AsciiDict(7929) = "y"
AsciiDict(8363) = "d"
End Sub
Finally, we create a function called StripDiacritics() to normalize the text. In the "Project - VBAProject" panel, double-click on Modules / Module1 to open the module scope (if you don't see it, then you'll have to add it by right-clicking on ThisWorkbook and selecting Insert / Module). Paste the following code there (below Option Explicit):
'Dictionary initiated in Workbook_Open()
Public AsciiDict As New Scripting.Dictionary
Function StripDiacritics(Text As String) As String
Text = Trim(Text)
If Text = "" Then Exit Function
Dim Char As String, _
NormalizedText As String, _
UnicodeCharCode As Long, _
i As Long
'Remove accent marks (diacritics) from text
For i = 1 To Len(Text)
Char = Mid(Text, i, 1)
UnicodeCharCode = AscW(Char)
If (UnicodeCharCode < 0) Then
'See http://support.microsoft.com/kb/272138
UnicodeCharCode = 65536 + UnicodeCharCode
End If
If AsciiDict.Exists(UnicodeCharCode) Then
NormalizedText = NormalizedText & AsciiDict.Item(UnicodeCharCode)
Else
NormalizedText = NormalizedText & Char
End If
Next
StripDiacritics = NormalizedText
End Function
Save and re-open the spreadsheet for the mapping dictionary to be properly initiated.
Usage:
=StripDiacritics("Hermès Prêt à Porter")
Outputs "Hermes Pret a Porter"
=StripDiacritics("Việt Nam Textiles")
Outputs "Viet Nam Textiles"
For those who are curious, the complete mappings can be found here: https://goo.gl/Vvn9px. The dictionary keys correspond to the Dec column.

Function stripAccent(Text As String) As String
Const AccChars = "ŠŽšžŸÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåçèéêëìíîïðñòóôõöùúûüýÿ"
Const RegChars = "SZszYAAAAAACEEEEIIIIDNOOOOOUUUUYaaaaaaceeeeiiiidnooooouuuuyy"
Dim A As String * 1
Dim B As String * 1
Dim i As Integer
For i = 1 To Len(AccChars)
A = Mid(AccChars, i, 1)
B = Mid(RegChars, i, 1)
Text = Replace(Text, A, B)
Next
stripAccent = Text
End Function

Do you mean the list of macros in the Macro dialog box? If so, it's because StripAccent takes a Range parameter: the Macro dialog box lists only procedures without parameters.

You can use a userForm with a refEdit and button control.
The routine that call the form is something like:
Sub ShowForm()
Dim d As dlg
Set d = New dlg
d.Show
Set d = Nothing
End Sub
...and in the click event of the button:
Private Sub cmdBtn_Click()
On Error GoTo cmdBtn_Click_Err
Dim strRange As String
Dim rng As Range
strRange = refeditControl.Text
Set rng = Range(strRange)
Call StripAccent(rng)
cmdBtn_Click_Exit:
Exit Sub
cmdBtn_Click_Err:
MsgBox Err.Description
Resume cmdBtn_Click_Exit
End Sub
This assumes the userForm is named dlg, the button cmdBtn and the refEdit control refEditControl.

The stripAccent function provided by @notGeek worked for me, except that it converted lower-case accented characters to upper-case non-accented characters.
This seems to be because the Replace function by default is case insensitive. This can be changed by adding the compare setting of vbBinaryCompare as below
Text = Replace(Text, A, B, , , vbBinaryCompare)
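As a small illustration of the explicit compare argument, the sketch below builds the accented characters from their code points with ChrW instead of typing them into the editor, which may sidestep code-page problems with characters above 255. The function name StripAccentSafe and the short five-entry mapping are assumptions made for this example; a real mapping would be much longer.

```vba
Option Explicit

' Hypothetical variant of stripAccent: case-sensitive replacement driven by
' code points, so no accented literal has to be typed into the VBA editor.
Function StripAccentSafe(ByVal Text As String) As String
    Dim codes As Variant, plain As Variant
    Dim i As Long

    ' Code points and replacements (sample only): é è ê à ệ
    codes = Array(233, 232, 234, 224, 7879)
    plain = Array("e", "e", "e", "a", "e")

    For i = LBound(codes) To UBound(codes)
        ' vbBinaryCompare forces an exact, case-sensitive match
        Text = Replace(Text, ChrW(codes(i)), plain(i), , , vbBinaryCompare)
    Next i
    StripAccentSafe = Text
End Function
```

Because Text is passed ByVal, the caller's string is left unchanged.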

Use this code to remove special characters from a string.
Function Remove(Str As String) As String
Dim xChars As String
Dim I As Long
xChars = "/.',_#$%#!()^*&"
For I = 1 To Len(xChars)
Str = Replace$(Str, Mid$(xChars, I, 1), "")
Next
Remove = Str
End Function
