Extract a specific set of digits from random strings in Excel VBA

Disclaimer: my case is specific, and my code works because I know the pattern.
I looked everywhere for an answer, and the code I tried was not quite what I needed, so here is my solution in case you are looking for a specific set of numbers.
In my case, I was looking for 7 digits, starting with the digit 1, in a column of random strings; some strings had the number and others didn't.
The number appears in one of three forms: "1XXXXXX", "PXXXXXXXX", or "PXXXXXXXXX" (the last has more characters because of a slash).
Here are some example strings:
9797 P/O1743061 465347 Hermann Schatte Earl Lowe
9797 Po 1743071 404440 Claude Gaudette Jose Luis Lopez
9817 1822037 463889 Jean Caron Mickelly Blaise
My Code
Sub getnum()
'i don't use explicit so i didn't declare everything
Dim stlen As String
Dim i As Integer
Dim arra() As String
Dim arran() As String
Orig.AutoFilterMode = False
Call BeginMacro
LastRow = Orig.Cells(Rows.Count, 1).End(xlUp).Row
Orig.Range("J2:J" & LastRow).Clear
'loop through column
For n = 2 To LastRow
celref = Orig.Cells(n, 4).Value
'split string on white spaces
arra() = Split(celref, " ")
'turn string to multiple strings
For counter = LBound(arra) To UBound(arra)
strin = arra(counter)
'remove white spaces from string
storage = Trim(strin)
lenof = Len(storage)
'if string has 9 characters, check for conditions
If lenof = 9 Then
'first and last character
somstr = Mid(storage, 1, 1)
somot = Mid(storage, 9, 1)
'parentheses are needed because And binds tighter than Or
If (somstr = "P" Or somstr = "p") And IsNumeric(somot) Then
'removes Po or PO and keeps only 7 digits
storage = Right(storage, 7)
'stores in column J
Orig.Cells(n, 10).Value = storage
End If
ElseIf lenof = 10 Then
somstr = Mid(storage, 1, 1)
somot = Mid(storage, 10, 1)
'other conditions
If (somstr = "P" Or somstr = "p") And IsNumeric(somot) Then
'removes Po or PO and keeps only 7 digits
storage = Right(storage, 7)
'stores in column J
Orig.Cells(n, 10).Value = storage
End If
End If
'eliminate comma within
arran() = Split(storage, ",")
If Orig.Cells(n, 10).Value <> storage Then
For counter2 = LBound(arran) To UBound(arran)
strin2 = arran(counter2)
storage2 = Trim(strin2)
'final condition if is 7 digits and starts with 1
If IsNumeric(storage2) = True And Len(storage2) = 7 Then
car = Mid(storage2, 1, 1)
If car = 1 Then
'stores in columns J at specific position
Orig.Cells(n, 10).Value = storage2
End If
Else
If IsNumeric(Orig.Cells(n, 10).Value) = True And _
Len(Orig.Cells(n, 10).Value) = 7 Then
'a valid number is already stored, keep it
Orig.Cells(n, 10).Value = Orig.Cells(n, 10).Value
Else
Orig.Cells(n, 10).Value = "no po# in D"
End If
End If
Next counter2
End If
Next counter
Next n
Call EndMacro
End Sub

You may try this:
Option Explicit
Sub getnum()
Dim position As Variant
Dim cell As Range
With Worksheets("Orig") ' change it to your actual sheet name
With Intersect(.UsedRange, .Columns("J"))
.Replace what:="P/O", replacement:="P/O ", lookat:=xlPart
For Each cell In .Cells
position = InStr(cell.Text, " 1")
If position > 0 Then cell.Value = Mid(cell.Value, position + 1, 7)
Next
End With
End With
End Sub

This code pastes two formulas, one in column G and one in column J. The first formula checks whether the first character of the cell in column B is "P"; if so, it extracts the last 7 characters of the string and puts them in column G. The second formula checks that the first character is not "P" and, in that case, puts the last 7 characters of the string in column J.
Sub Extract()
Dim ws As Worksheet
Dim lRow As Long
Set ws = ThisWorkbook.Sheets("Sheet3")
lRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
ws.Range("G2:G" & lRow).Formula = "=IF(LEFT(B2)=""P"",(RIGHT(B2,7)),"""")"
ws.Range("J2:J" & lRow).Formula = "=IF(LEFT(B2)<>""P"",(RIGHT(B2, 7)),"""")"
End Sub

You may use a RegEx to extract the number in the desired format.
Please give this a try...
Function Get10DigitNumber(ByVal Str As String) As String
Dim RE As Object
Set RE = CreateObject("VBScript.RegExp")
With RE
.Global = False
.Pattern = "1\d{6}"
End With
If RE.test(Str) Then
Get10DigitNumber = RE.Execute(Str)(0)
End If
End Function
Then if you want to use this function on the worksheet itself, assuming your string is in A2, try this...
=Get10DigitNumber(A2)
OR
You may use this function in another sub routine/macro like this...
Debug.Print Get10DigitNumber(<pass your string variable here>)
Edited Function:
Function Get10DigitNumber(ByVal Str As String) As String
Dim RE As Object, Matches As Object
Set RE = CreateObject("VBScript.RegExp")
With RE
.Global = False
.Pattern = "[Pp]?\/?[Oo]?(1\d{6})\b"
End With
If RE.test(Str) Then
Set Matches = RE.Execute(Str)
Get10DigitNumber = Matches(0).SubMatches(0)
End If
End Function
And use it as already described above.
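If you want to fill column J from column D in one pass, as in the question, the function could be called from a small macro. This is only a sketch: the name FillPoNumbers and the use of ActiveSheet are assumptions, while columns D and J and the "no po# in D" text come from the question.

```vba
' Sketch only: FillPoNumbers is a made-up name and ActiveSheet is an assumption.
' Relies on the Get10DigitNumber function shown above.
Sub FillPoNumbers()
    Dim ws As Worksheet
    Dim lastRow As Long, r As Long
    Dim found As String

    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, 4).End(xlUp).Row   ' last used row in column D

    For r = 2 To lastRow
        found = ""
        ' Skip error cells (#N/A etc.), which cannot be converted to a string
        If Not IsError(ws.Cells(r, 4).Value) Then
            found = Get10DigitNumber(CStr(ws.Cells(r, 4).Value))
        End If
        If Len(found) = 0 Then found = "no po# in D"
        ws.Cells(r, 10).Value = found                     ' column J
    Next r
End Sub
```

Note that the pattern has no boundary in front of the 1, so a longer run of digits that happens to contain a 1 followed by six digits at its end would also match; whether that matters depends on your data.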

After understanding what you were doing, I think this will work. Any feedback would be appreciated.
Dim cell As Range, LRow As Long
LRow = ActiveSheet.Cells(Rows.Count, 4).End(xlUp).Row
For Each cell In Range("D2:D" & LRow)
If cell.Value Like "*Po *" Then
cell.Offset(0, 6).Value = Split(cell.Value, " ")(2)
Else: cell.Offset(0, 6).Value = Split(cell.Value, " ")(1)
End If
Next cell
For Each cell In Range("J2:J" & LRow)
If Len(cell.Value) > 7 Then
cell.Value = Right(cell.Value, 7)
End If
Next

Related

Simple Excel VBA takes ages

I have written a simple VBA script (code below) that should inspect every cell in a certain column. I want to do some string manipulation there (I wanted to search for "." in the string and take everything to its right, but because I could not get that to work, I always start from the 4th character). I then copy the manipulated string into another cell and later back. The code works, but for some reason it takes ages to run on only 35 cells!
I'm still a beginner at VBA and would like input on what the reason could be and what I could improve to get a faster runtime. Is it because I take the whole string from character 4 up to 50?
Sub EditStatus()
Application.DisplayAlerts = False
ActiveSheet.Name = "Backend"
myNum = Application.InputBox("Please enter the row number until which you would like to update the status column (only for new entries)")
For i = 2 To myNum
Cells(i, 20).Value = Mid(Cells(i, 10), 4, 50)
Cells(i, 10).Value = Cells(i, 20).Value
Next i
Range("T1:T200").Clear
Application.DisplayAlerts = True
End Sub
Thanks
No need for a loop. You can enter the formula in the entire range in one go, then convert the formulas to values before putting the values back in column J.
Replace
For i = 2 To myNum
Cells(i, 20).Value = Mid(Cells(i, 10), 4, 50)
Cells(i, 10).Value = Cells(i, 20).Value
Next i
With
With Range("T2:T" & myNum)
.Formula = "=MID(J2, 4, 50)"
.Value = .Value
Range("J2:J" & myNum).Value = .Value
End With
Alternatively, you can perform the same action directly in column J without the helper column T, all in one line.
Simply replace
For i = 2 To myNum
Cells(i, 20).Value = Mid(Cells(i, 10), 4, 50)
Cells(i, 10).Value = Cells(i, 20).Value
Next i
Range("T1:T200").Clear
with
Range("J2:J" & myNum).Value = Evaluate("Index(MID(" & "J2:J" & myNum & ", 4, 50),)")
Replace Values In-Place
Adjust the values in the constants section.
This solution overwrites the data and doesn't use a helper column, but you can test it with one indicated near the end of the code.
Solve the renaming (Backend) part as needed.
The Code
Option Explicit
Sub EditStatus()
' Define constants.
Const sPrompt As String = "Please enter the row number until which you " _
& "would like to update the status column (only for new entries)"
Const sTitle As String = "Enter Number"
Const wsName As String = "Backend"
Const First As Long = 2
Const cCol As Long = 10 ' J
Const Delim As String = "."
' Define workbook.
Dim wb As Workbook: Set wb = ThisWorkbook ' Workbook containing this code.
' Require input.
Dim Last As Variant
Last = Application.InputBox( _
Prompt:=sPrompt, Title:=sTitle, Default:=First, Type:=1)
' Validate input.
If VarType(Last) = vbBoolean Then
MsgBox "You cancelled."
Exit Sub
End If
If Last < First Then
MsgBox "Enter a number greater than " & First - 1 & "."
Exit Sub
End If
If Int(Last) <> Last Then
MsgBox "Enter a WHOLE number greater than " & First - 1 & "."
Exit Sub
End If
' Define column range.
Dim rg As Range
Set rg = wb.Worksheets(wsName).Cells(First, cCol).Resize(Last - First + 1)
' Write values from column range to array.
Dim Data As Variant
If rg.Rows.Count > 1 Then
Data = rg.Value
Else
ReDim Data(1 To 1, 1 To 1): Data(1, 1) = rg.Value
End If
' Declare additional variables
Dim cValue As Variant ' Current Value
Dim i As Long ' Current Row (Array Row Counter)
Dim fPos As Long ' Current Delimiter Position
' Replace strings containing the delimiter, with the sub string
' to the right of it.
For i = 1 To UBound(Data)
cValue = Data(i, 1)
If Not IsError(cValue) Then
fPos = InStr(1, cValue, Delim)
If fPos > 0 Then
Data(i, 1) = Right(cValue, Len(cValue) - fPos)
End If
End If
Next i
' Maybe rather test with the following (writes to column 20 ("T")).
'rg.Offset(, 10).Value = Data
' Write values from array to column range.
rg.Value = Data
End Sub

VBA translate column value in alphabet to a numeral

How do I convert the column-letter part of an Excel address such as "$C$2" to a number, giving 3 and 2, so that I can reuse it in a Cells object?
If the address is "$E$4", I need two separate values, 5 (for the letter E) and 4, so that I can reference the cell as Cells(4, 5).
Basically, I am trying to un-merge cells using the code below, and that is where the need for the numeric column of the Excel cell came from.
Sub UnMerge()
Dim i As Integer
Dim fromRange() As String
Dim toRange() As String
Dim temp() As String
ActiveSheet.UsedRange.MergeCells = False
fromRange() = Split(ActiveCell.Address, "$")
temp() = Split(Selection.Address, ":")
toRange() = Split(temp(1), "$")
For i = fromRange(2) To toRange(2)
If Cells(i, Range(temp(0)).Column) = "" Then
Cells(i, Range(temp(0)).Column) = Cells(i - 1, Range(temp(0)).Column).Value
End If
Next i
End Sub
Debug.Print Range("$E$4").Row & ", " & Range("$E$4").Column
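To spell that out: a Range object exposes its row and column as numbers, so there is no need to parse the address string. A minimal sketch (the sub name ShowRowAndColumn is just for illustration):

```vba
' Illustration only: ShowRowAndColumn is a made-up name.
Sub ShowRowAndColumn()
    Dim rng As Range
    Set rng = Range("$E$4")

    Debug.Print rng.Row       ' prints 4
    Debug.Print rng.Column    ' prints 5 (column E)

    ' The two numbers can be fed straight back into Cells(row, column)
    Debug.Print Cells(rng.Row, rng.Column).Address   ' prints $E$4
End Sub
```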
Changing and splitting strings to get to numbers is slow. Just use Selection.Row, Selection.Rows.Count and Selection.Column:
Sub UnMerge()
Selection.MergeCells = False
With ActiveSheet
Dim i As Long
For i = Selection.Row To Selection.Rows.Count + Selection.Row - 1
If .Cells(i, Selection.Column) = "" Then
.Cells(i, Selection.Column) = .Cells(i - 1, Selection.Column).Value
End If
Next i
End With
End Sub

Formatting Cells to only display characters after X number of Characters

Instead of receiving the first 12 character from the right, what's needed is the same cell without the first 12 characters.
Dim arrData As Variant, LastRow As Long, i As Long, ws As Worksheet
Set ws = ThisWorkbook.Sheets("Long List 15032019") 'change the name of the sheet to the one you are doing the code
With ws
LastRow = .Cells(.Rows.Count, 3).End(xlUp).Row
arrData = .Range("A2", .Cells(LastRow, "C")).Value
For i = 1 To UBound(arrData)
If arrData(i, 3) Like "Bus*" Then
arrData(i, 1) = "BU CRM"
Else
arrData(i, 1) = "CSI ACE"
End If
If arrData(i, 3) Like "CSI*" Or arrData(i, 3) = vbNullString Then
arrData(i, 2) = vbNullString
Else: arrData(i, 2) = Right(arrData(i, 3), 12)
End If
Next i
.Range("A2", .Cells(LastRow, "C")).Value = arrData
End With
i.e. If C3 = "Example (ID:15654534)" then B3 = "(ID:15654534)" or If C3 = "Example (ID:152)" then B3 = "(ID:152)"
I did attempt using Left or -12. But the outcome was not what I needed.
I hope that this clarifies my question.
Thank you
Sounds like you want a regex. In this case you would have
arrData(i,2) = GetId(arrData(i, 3) , "ID:\d+")
Code:
Option Explicit
Public Sub test()
Dim items(), item As Variant
items = Array("Example (ID:15654534)", "Example (ID:152)")
For Each item In items
Debug.Print GetId(item, "ID:\d+")
Next
End Sub
Public Function GetId(ByVal inputString As String, ByVal sPattern As String) As String
Dim matches As Object, iMatch As Object, arrMatches(), i As Long
i = 1
With CreateObject("vbscript.regexp")
.Global = True
.MultiLine = True
.IgnoreCase = True
.Pattern = sPattern
If .test(inputString) Then
Set matches = .Execute(inputString)
ReDim arrMatches(1 To matches.Count)
For Each iMatch In matches
arrMatches(i) = iMatch.Value
i = i + 1
Next iMatch
Else
GetId = "No match"
Exit Function
End If
End With
GetId = arrMatches(1)
End Function
This can be done without VBA, but only if the input always has the same structure. Your data is currently:
text(idnumber), and you want to obtain only the (idnumber) part, including both parentheses.
This formula will work ONLY as long as the data structure stays the same. If the input changes, it may not work properly (for example, if the first parenthesis is missing).
The formula I've used is this one:
=MID(C1;SEARCH("(";C1;1);SEARCH(")";C1)-SEARCH("(";C1;1)+1)
Hope you can adapt it to your needs.
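If you would rather stay in VBA but avoid a regex, the same idea as the formula (find the opening and closing parenthesis and take what lies between, inclusive) can be sketched with InStr and Mid. The function name ExtractParenPart is hypothetical, and like the formula it assumes at most one parenthesised part per cell.

```vba
' Sketch only: ExtractParenPart is a made-up name.
' Returns the text from the first "(" up to the next ")", both included,
' or an empty string when either parenthesis is missing.
Function ExtractParenPart(ByVal s As String) As String
    Dim openPos As Long, closePos As Long

    openPos = InStr(1, s, "(")
    If openPos = 0 Then Exit Function

    closePos = InStr(openPos, s, ")")
    If closePos = 0 Then Exit Function

    ExtractParenPart = Mid$(s, openPos, closePos - openPos + 1)
End Function
```

In the loop from the question, this could replace the Right(arrData(i, 3), 12) call, for example arrData(i, 2) = ExtractParenPart(CStr(arrData(i, 3))), provided column C holds no error values. With "Example (ID:152)" it returns "(ID:152)".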

VBA Splitting a String into multiple cells when it has variable Delimiters

If I have the below info all contained in a single cell and I want to split it into separate cells. I understand how to use the space as a delimiter but in this case, the name also has spaces and I want the name to stay together in a single cell. To further complicate the matter, the name is not always just first and last, it can also include middle so is not always a standard two names.
2172571122 Jane Doe 3143332222 John Doe
2172242237 Mary Mixer 2223334444 Mike M Martin
Want it to end up looking like this:
Cell 1 = 2172242237
Cell 2 = Mary Mixer
Cell 3 = 2223334444
Cell 4 = Mike M Martin
Any suggestions?
This regex based function alternates each split between numbers and text (words).
Option Explicit
Function customSplit(str As String, _
Optional ndx As Integer = 1) As Variant
Static rgx As Object, cmat As Object
Set rgx = CreateObject("VBScript.RegExp")
With rgx
.Global = True
.MultiLine = True
.IgnoreCase = True
If CBool(ndx Mod 2) Then
.Pattern = "[0-9]{10}"
ndx = (ndx + 1) \ 2
Else
.Pattern = "[A-Z]{1,9}\s[A-Z]{1,9}(\s[A-Z]{1,9})?"
ndx = ndx \ 2
End If
If .test(str) Then
Set cmat = .Execute(str)
If ndx <= cmat.Count Then
customSplit = cmat.Item(ndx - 1)
End If
End If
End With
End Function
You could try:
Option Explicit
Sub test()
Dim strToSplit As String, strImport As String
Dim arrwords As Variant
Dim i As Long, counter As Long
With ThisWorkbook.Worksheets("Sheet1")
strToSplit = .Range("A1").Value
arrwords = Split(strToSplit, " ")
counter = 1
For i = LBound(arrwords) To UBound(arrwords)
If IsNumeric(arrwords(i)) = True Then
strImport = arrwords(i)
.Cells(3, counter).Value = strImport
counter = counter + 1
ElseIf Not IsNumeric(arrwords(i)) = True Then
If Not IsNumeric(.Cells(3, counter - 1).Value) Then
strImport = .Cells(3, counter - 1) & " " & arrwords(i)
.Cells(3, counter - 1).Value = strImport
counter = counter
Else
strImport = arrwords(i)
.Cells(3, counter).Value = strImport
counter = counter + 1
End If
End If
Next
End With
End Sub
I have a few ideas on what you could do.
1) Read a line.
Do a Split(line, " ") and loop through the indices, calling IsNumeric() on each split value. If the value is not numeric, add it to a string array and set a flag to true.
Then, when a value is numeric, expect another name to follow and set the flag to true.
2) Read a line.
Then loop through each character, calling IsNumeric() on it; if it is not numeric, add that character to a string array and set the flag until a character is numeric again, and so on.
I hope that helps or at least points you in the right direction.
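The first idea could look roughly like the sketch below. It is one possible reading of the description, not a definitive implementation: the names SplitNumbersAndNames and DemoSplit are made up, and a Collection plus a running string is used in place of the array and flag.

```vba
' Sketch only: SplitNumbersAndNames and DemoSplit are made-up names.
' Splits on spaces; each numeric token becomes its own item and the
' non-numeric tokens between two numbers are joined into one name.
Function SplitNumbersAndNames(ByVal s As String) As Collection
    Dim parts As New Collection
    Dim token As Variant
    Dim current As String          ' name being built up

    For Each token In Split(Trim$(s), " ")
        If Len(token) > 0 Then     ' ignore empty tokens from double spaces
            If IsNumeric(token) Then
                If Len(current) > 0 Then parts.Add current
                parts.Add CStr(token)
                current = ""
            Else
                If Len(current) > 0 Then current = current & " "
                current = current & token
            End If
        End If
    Next token

    If Len(current) > 0 Then parts.Add current   ' trailing name
    Set SplitNumbersAndNames = parts
End Function

Sub DemoSplit()
    Dim item As Variant
    For Each item In SplitNumbersAndNames("2172242237 Mary Mixer 2223334444 Mike M Martin")
        Debug.Print item
    Next item
End Sub
```

DemoSplit prints the four pieces on separate lines: 2172242237, Mary Mixer, 2223334444, Mike M Martin. Keep in mind that IsNumeric also accepts tokens such as "1e5", so a stricter digit check may be safer if names can look like that.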
An additional variant to those already posted:
Sub ZZZ()
Dim dic As Object: Set dic = CreateObject("Scripting.Dictionary")
Dim num$, cl As Range, data As Range, key, x
Dim Result As Worksheet
Set data = Range([A1], Cells(Rows.Count, "A").End(xlUp))
For Each cl In data
x = "": num = "":
For Each x In Split(cl, " ")
If IsNumeric(x) Then
num = x
dic.Add x, ""
ElseIf x <> "" And num <> "" Then
dic(num) = Trim(dic(num) & " " & x)
End If
Next x
Next cl
Set Result = Worksheets.Add
With Result
.Name = "Result " & Replace(Now, ":", "-")
x = 1
For Each key In dic
.Cells(x, "A").Value2 = key
.Cells(x, "B").Value2 = dic(key)
x = x + 1
Next key
.Columns("A:B").AutoFit
End With
End Sub

How to read e & é as the same thing using search macro in excel

I'm not entirely sure how to word this but, I have an Excel macro that enables a search functionality within my workbook. My issue is that I need the search to understand 'é' as 'e'. So that if I search for 'Belem', my result would come back with 'Belém'. How would I go about this? Thanks for any time and consideration.
Sub city()
If ActiveSheet.Name <> "City" Then Exit Sub
LastRow = ActiveSheet.Cells.SpecialCells(xlCellTypeLastCell).Row
Sheets("Results").Range("3:10000").Delete
SearchTerm = Application.InputBox("What are you looking for?")
Application.ScreenUpdating = False
Range("W1") = SearchTerm
Range("W2:W" & LastRow).FormulaR1C1 = _
"=IF(ISERR(SEARCH(R1C23,RC[-22]&RC[-21]&RC[-20]&RC[-19]&RC[-18]&RC[-17]&RC[-16]&RC[-15]&RC[-15]&RC[-14]&RC[-13]&RC[-12]&RC[-11]&RC[-10]&RC[-9]&RC[-8]&RC[-7]&RC[-6]&RC[-5]&RC[-4]&RC[-3]&RC[-2]&RC[-1])),0,1)"
If WorksheetFunction.CountIf(Columns(23), 1) = 0 Then
Columns(23).Delete
Application.ScreenUpdating = True
MsgBox "None found."
Else
For Each Cell In Range("A2:A" & LastRow)
If Cell.Offset(, 22) = 1 Then
Cell.Resize(, 51).Copy Sheets("Results").Range("A" & Rows.Count).End(xlUp).Offset(1)
x = x + 1
End If
Next Cell
Columns(22).Delete
Application.ScreenUpdating = True
If x = 1 Then
MsgBox "1 matching record was copied to Search Results tab."
Else
MsgBox x & " matching records were copied to Search Results tab."
End If
End If
End Sub
You can modify the search parameter and then use the like operator as follows:
Sub city()
Dim rngResult As Range
Dim searchTerm As String, counter As Integer
Dim values As Variant, value As Variant
If ActiveSheet.Name <> "City" Then Exit Sub
'First Cell with the results
Set rngResult = <First cell of the result Range>
'Uses a variant array to get all values from the range. This speeds up the routine
values = <Area of Search>.Value
'Converts to lowercase to do a case insensitive search (e.g. Belem = belem)
searchTerm = LCase(Application.InputBox("What are you looking for?"))
If searchTerm = "" Then Exit Sub
' "§" is just a placeholder
searchTerm = Replace(searchTerm, "e", "§")
searchTerm = Replace(searchTerm, "é", "§")
searchTerm = Replace(searchTerm, "§", "[eé]")
Application.ScreenUpdating = False
counter = 0
For Each value In values
If LCase(value) Like searchTerm Then
rngResult = value
Set rngResult = rngResult.Offset(1, 0) 'Moves to the next line
counter = counter + 1
End If
Next value
If counter = 0 Then
MsgBox "None found."
Else
MsgBox "Found " & counter & " results"
'Do what you need to do with the results
End If
Application.ScreenUpdating = True
End Sub
All the results will be at the column of rngResult.
The code works by replacing "e" and "é" by "§" and then replacing "§" by "[eé]", (e.g. "bélem" -> "bél§m" -> "b§l§m" -> "b[eé]l[eé]m").
The Like operator will match either "e" or "é" at that position. You can learn more about it in the help files. Here is an example:
bélem Like "b[eé]l[eé]m" ' true
belem like "b[eé]l[eé]m" ' true
recife like "b[eé]l[eé]m" ' false
You can match more characters by adding other criteria, like:
'Like will match "a","á", "à" and "ã"
searchTerm = Replace(searchTerm, "a", "§")
searchTerm = Replace(searchTerm, "á", "§")
searchTerm = Replace(searchTerm, "à", "§")
searchTerm = Replace(searchTerm, "ã", "§")
searchTerm = Replace(searchTerm, "§", "[aáàã]")
This method has the advantage that you only need one "translation" in order to do comparisons. This can improve the performance if you have a large dataset
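If the list of letters grows, the chain of Replace calls could be folded into one helper that builds the Like pattern character by character. The following is only a sketch under assumptions: the function name AccentPattern and the particular accent groups are illustrative, and the surrounding "*" wildcards (for a "contains" search) are an addition not present in the code above.

```vba
' Sketch only: AccentPattern and the groups listed here are assumptions.
' Turns e.g. "Belem" into "*b[eéèê]l[eéèê]m*" for use with the Like operator.
Function AccentPattern(ByVal term As String) As String
    Dim groups As Variant, g As Variant
    Dim i As Long
    Dim ch As String, piece As String, result As String

    groups = Array("aáàâã", "eéèê", "iíì", "oóòôõ", "uúùü", "cç")
    term = LCase$(term)

    For i = 1 To Len(term)
        ch = Mid$(term, i, 1)
        piece = ch                          ' default: keep the character
        For Each g In groups
            If InStr(1, g, ch, vbBinaryCompare) > 0 Then
                piece = "[" & g & "]"       ' any member of the group matches
                Exit For
            End If
        Next g
        result = result & piece
    Next i

    AccentPattern = "*" & result & "*"
End Function
```

It would be used as If LCase(value) Like AccentPattern(searchTerm) Then. Search terms that themselves contain Like wildcards (*, ?, #, [) are not escaped here, so they would need extra handling.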
You can keep an array of all the characters you want to replace and what you want to replace them with. It's easier if you search your data a little differently than with that formula. Here's how I would do it.
Sub FindCity()
Dim shResults As Worksheet
Dim vaData As Variant
Dim i As Long, j As Long
Dim sSearchTerm As String
Dim sData As String
Dim rNext As Range
'Put all the data into an array
vaData = ActiveSheet.UsedRange.Value
'Get the search term
sSearchTerm = Application.InputBox("What are you looking for?")
'Define and clear the results sheet
Set shResults = ActiveWorkbook.Worksheets("Results")
shResults.Range("A3").Resize(shResults.UsedRange.Rows.Count, 1).EntireRow.Delete
'Loop through the data
For i = LBound(vaData, 1) To UBound(vaData, 1)
For j = LBound(vaData, 2) To UBound(vaData, 2)
'Get rid of diacritical characters
sData = LCase(Anglicize(vaData(i, j)))
'Look for a match
If InStr(1, sData, LCase(Anglicize(sSearchTerm))) > 0 Then
'Write the row to the next available row on Results
Set rNext = shResults.Cells(shResults.Rows.Count, 1).End(xlUp).Offset(1, 0)
rNext.Resize(1, UBound(vaData, 2)).Value = Application.Index(vaData, i, 0)
'Stop looking in that row after one match
Exit For
End If
Next j
Next i
End Sub
Public Function Anglicize(ByVal sInput As String) As String
Dim vaGood As Variant
Dim vaBad As Variant
Dim i As Long
Dim sReturn As String
'Replace any 'bad' characters with 'good' characters
vaGood = Split("S,Z,s,z,Y,A,A,A,A,A,A,C,E,E,E,E,I,I,I,I,D,N,O,O,O,O,O,U,U,U,U,Y,a,a,a,a,a,a,c,e,e,e,e,i,i,i,i,d,n,o,o,o,o,o,u,u,u,u,y,y", ",")
vaBad = Split("Š,Ž,š,ž,Ÿ,À,Á,Â,Ã,Ä,Å,Ç,È,É,Ê,Ë,Ì,Í,Î,Ï,Ð,Ñ,Ò,Ó,Ô,Õ,Ö,Ù,Ú,Û,Ü,Ý,à,á,â,ã,ä,å,ç,è,é,ê,ë,ì,í,î,ï,ð,ñ,ò,ó,ô,õ,ö,ù,ú,û,ü,ý,ÿ", ",")
sReturn = sInput
For i = LBound(vaBad) To UBound(vaBad)
sReturn = Replace$(sReturn, vaBad(i), vaGood(i))
Next i
Anglicize = sReturn
End Function
List of characters from Excel 2007 VBA Converting Accented Characters to Regular

Resources