Extract 5 Digit Number From a String with Excel-VBA

I'm not a developer and I'm stuck on a problem. I have a column in Excel that contains data like this:
46843 xxxx xxx x
xxxx 65483 xxxx
xxxx xxx 65432 xxxxx 4 xx
where "x" stands for ordinary characters.
I want to extract only the numbers that have exactly five digits.
I started with something like this, but I'm struggling to add a loop so that it scans the whole string:
Function test()
val_in = "rue 4 qsdqsd CURIE 38320 EYBENS"
Filte = Left(val_in, 5)
If IsNumeric(Filte) Then
test = Left(val_in, 5)
Else
sp1 = InStr(1, val_in, " ")
sp2 = InStr(sp1 + 1, val_in, " ")
spt = sp2 + sp1
If spt > 5 Then
extr = Mid(val_in, spt, 5)
End If
End If
End Function
How could I turn the part after "Else" into a loop so that it scans every space-separated part of the string and extracts only the numbers that contain 5 digits?

Using regex
Option Explicit
Public Function GetNumbers(ByVal rng As Range) As Variant
Dim arr() As String, i As Long, matches As Object, re As Object
Set re = CreateObject("VBScript.RegExp")
With re
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = "\b\d{5}\b"
If .test(rng.Value) Then
Set matches = .Execute(rng.Value)
ReDim arr(0 To matches.Count - 1)
For i = LBound(arr) To UBound(arr)
arr(i) = matches(i)
Next i
Else
ReDim arr(0 To 0) ' no match: size the array before writing to it
arr(0) = rng.Value
End If
End With
GetNumbers = Join(arr, ",")
End Function
If there is more than one match, a comma-separated list is returned.

Sub TestMe()
Dim valIn As String
valIn = "rue 4 qsdqsd CURIE 38320 EYBENS 43443"
Dim i As Long
Dim splitted As Variant
splitted = Split(valIn)
For i = LBound(splitted) To UBound(splitted)
If IsNumeric(splitted(i)) And Len(splitted(i)) = 5 Then
Debug.Print splitted(i)
End If
Next i
End Sub
This assumes, as in your example, that the 5-digit numbers are separated by spaces. The code splits the string on spaces into an array and loops through its elements. Any element that is 5 characters long and numeric is printed.
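One caveat with IsNumeric: it also accepts text such as "-1234" or "1e234", which are 5 characters long but are not five digits. A minimal sketch of a stricter check, assuming the tokens are still separated by spaces, uses the Like operator with the "#####" pattern (exactly five digit characters). The function name FiveDigitTokens is made up for illustration:

```vba
Option Explicit

' Hypothetical helper: returns the space-separated tokens of valIn that
' consist of exactly five digits, joined by commas.
Public Function FiveDigitTokens(ByVal valIn As String) As String
    Dim parts() As String
    Dim i As Long
    Dim result As String

    parts = Split(valIn, " ")
    For i = LBound(parts) To UBound(parts)
        ' "#####" matches five digit characters and nothing else,
        ' so "-1234" and "1e234" are rejected
        If parts(i) Like "#####" Then
            If Len(result) > 0 Then result = result & ","
            result = result & parts(i)
        End If
    Next i

    FiveDigitTokens = result
End Function
```

Since it is a function, it could also be called from a cell, for example =FiveDigitTokens(A1).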
If you cannot count on the spaces being there, here is a different implementation:
Sub TestMe()
Dim valIn As String
valIn = "44244rue4qsdqsdCURIE383201EYBENS43443"
Dim i As Long
For i = 1 To Len(valIn) - 4
If IsNumeric(Mid(valIn, i, 5)) Then
Debug.Print Mid(valIn, i, 5)
End If
Next i
End Sub
It loops through the string and checks whether each 5-character window is numeric. A run of 6 digits therefore gives two results: characters 1 to 5 and characters 2 to 6. Thus 383201 produces the following two matches:
38320
83201
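If overlapping results like these are not wanted, one option is to measure each run of consecutive digits and keep only the runs that are exactly five digits long. A rough sketch (FiveDigitRuns is a hypothetical name, not part of the answer above):

```vba
Option Explicit

' Hypothetical helper: returns every run of exactly five consecutive
' digits in valIn, joined by commas. Longer or shorter runs are skipped.
Public Function FiveDigitRuns(ByVal valIn As String) As String
    Dim i As Long
    Dim runStart As Long
    Dim runLen As Long
    Dim result As String

    ' Go one position past the end: Mid$ returns "" there, which
    ' closes a run that ends with the string
    For i = 1 To Len(valIn) + 1
        If Mid$(valIn, i, 1) Like "#" Then
            If runLen = 0 Then runStart = i
            runLen = runLen + 1
        Else
            If runLen = 5 Then
                If Len(result) > 0 Then result = result & ","
                result = result & Mid$(valIn, runStart, 5)
            End If
            runLen = 0
        End If
    Next i

    FiveDigitRuns = result
End Function
```

With the sample string above, this should return 44244,43443 and skip the six-digit run 383201 entirely.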

If there is always a space between the words/numbers, then this should do:
Sub test()
Dim TestStr As String
Dim Temp As Variant
Dim i As Long, FoundVal As Long
TestStr = "rue 4 qsdqsd CURIE 38320 EYBENS"
Temp = Split(TestStr, " ")
For i = 0 To UBound(Temp)
If Len(Trim(Temp(i))) = 5 And IsNumeric(Temp(i)) Then
FoundVal = Temp(i)
MsgBox FoundVal
End If
Next i
End Sub

Since you are trying to create a custom function in VBA, I understand that you actually need to use the result in a formula.
To find a five-digit number in cell A1, you can use the following formula without VBA:
=IF(ISERROR(FIND("0"&REPT("1",5)&"0",CONCAT(0+(ISNUMBER(-MID(" "&A1&" ",ROW(INDIRECT("1:"&LEN(A1)+2)),1))+0)))),"",MID(A1,FIND("0"&REPT("1",5)&"0",CONCAT(0+(ISNUMBER(-MID(" "&A1&" ",ROW(INDIRECT("1:"&LEN(A1)+2)),1))+0))),5))
To search for a different number of digits, change the three occurrences of 5 in the formula to the desired digit count.

Related

Remove special characters from range in VBA

I have written VBA code to remove all special characters from a column. For example, every cell in the column holds alphanumeric text with some special characters:
Suppose a cell holds the value: abc#123!-245
After running my code, the output is abc 123 245
The code successfully removes all the special characters. Here it is:
Sub ReplaceSpecial()
Dim cel As Range
Dim strVal As String
Dim i As Long
Application.ScreenUpdating = False
For Each cel In Selection
strVal = cel.Value
For i = 1 To Len(strVal)
Select Case Asc(Mid(strVal, i, 1))
Case 32, 48 To 57, 65 To 90, 97 To 122
' Leave ordinary characters alone
Case Else
Mid(strVal, i, 1) = " "
End Select
Next i
cel.Value = strVal
Next cel
Application.ScreenUpdating = True
End Sub
Now I want to remove the spaces from the output as well, so that it looks like abc123245. How can I do that in VBA?
Input: abc#123!-245
Current Output: abc 123 245
Required Output: abc123245
You could construct a new string with just the permitted characters.
Sub ReplaceSpecial()
Dim cel As Range
Dim strVal As String, temp As String
Dim i As Long
Application.ScreenUpdating = False
For Each cel In Selection
strVal = cel.Value
temp = vbNullString
For i = 1 To Len(strVal)
Select Case Asc(Mid(strVal, i, 1))
Case 32, 48 To 57, 65 To 90, 97 To 122
temp = temp & Mid(strVal, i, 1)
End Select
Next i
cel.Value = temp
Next cel
Application.ScreenUpdating = True
End Sub
My sole intention for this late post was to
test some features of the Application.Match() function (comparing a string input against valid characters) and to
demonstrate a nice way to "split" a string into single characters as an alternative and possibly instructive solution (see help function String2Arr()).
I don't intend, however, to show better or faster code here.
Application.Match() allows not only to execute 1 character searches in an array, but to compare even two arrays in one go,
i.e. a character array (based on an atomized string input) against an array of valid characters (blanks, all digits and chars from A to Z).
As Application.Match is case insensitive, it suffices to take e.g. lower case characters.
All findings of input chars return their position in the valid characters array (otherwise resulting in Error 2042).
Furthermore it was necessary to exclude the wild cards "*" and "?", which would have been considered as findings otherwise.
Function ValidChars(ByVal s, Optional JoinResult As Boolean = True)
'Purp: return only valid characters if space,digits,"A-Z" or "a-z"
'compare all string characters against valid characters
Dim tmp: tmp = foundCharAt(s) ' get array with found positions in chars
'overwrite tmp array
Dim i As Long, ii As Long
For i = 1 To UBound(tmp)
If IsNumeric(tmp(i)) Then ' found in valid positions
If Not Mid(s, i, 1) Like "[?*]" Then ' exclude wild cards
ii = ii + 1
tmp(ii) = Mid(s, i, 1) ' get char from original string
End If
End If
Next
ReDim Preserve tmp(1 To ii) ' reduce to new size
'join tmp elements to resulting string (if argument JoinResult = True)
ValidChars = IIf(JoinResult, Join(tmp, ""), tmp)
End Function
Help function foundCharAt()
Returns an array of found character positions in the valid chars array:
Function foundCharAt(ByVal s As String) As Variant
'Purp: return array of found character positions in chars string
'Note: (non-findings show Error 2042; can be identified by IsError + Not IsNumeric)
Dim chars: chars = String2Arr(" 0123456789abcdefghijklmnopqrstuvwxyz")
foundCharAt = Application.Match(String2Arr(s), chars, 0)
End Function
Help function String2Arr()
Assigns an array of single characters after atomizing a string input:
Function String2Arr(ByVal s As String) As Variant
'Purp: return array of all single characters in a string
'Idea: https://stackoverflow.com/questions/13195583/split-string-into-array-of-characters
s = StrConv(s, vbUnicode)
String2Arr = Split(s, vbNullChar, Len(s) \ 2)
End Function
Use a regular expression's object and replace all unwanted characters by using a negated character class. For demonstration purposes:
Sub Test()
Dim str As String: str = "abc#123!-245"
With CreateObject("vbscript.regexp")
.Global = True
.Pattern = "[^0-9A-Za-z ]"
str = .Replace(str, "")
End With
Debug.Print str
End Sub
The pattern [^0-9A-Za-z ] is a negated character class that matches everything that is not an alphanumeric or a space character. You'll find a more in-depth explanation in this online demo.
At the time of writing I'm unsure whether you want to leave out the space characters or not. If so, just remove the space from the pattern.
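To apply the same pattern to a cell value or inside the loop from the question, it could be wrapped in a small function. This is only a sketch; the name StripSpecial and the keepSpaces parameter are my own additions:

```vba
Option Explicit

' Hypothetical wrapper: removes every character that is not a letter
' or a digit. Spaces are removed too unless keepSpaces is True.
Public Function StripSpecial(ByVal s As String, _
                             Optional ByVal keepSpaces As Boolean = False) As String
    With CreateObject("VBScript.RegExp")
        .Global = True ' replace all matches, not just the first
        .Pattern = IIf(keepSpaces, "[^0-9A-Za-z ]", "[^0-9A-Za-z]")
        StripSpecial = .Replace(s, "")
    End With
End Function
```

For the input abc#123!-245 this gives abc123245, and inside the question's loop it would be used as cel.Value = StripSpecial(cel.Value).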
Thought I'd chuck in another alternative using the Like operator:
For i = Len(str) To 1 Step -1
If Mid(str, i, 1) Like "[!0-9A-Za-z ]" Then
str = Application.Replace(str, i, 1, "")
End If
Next
Or with a second String variable (as per @BigBen's answer):
For i = 1 To Len(str)
If Mid(str, i, 1) Like "[0-9A-Za-z ]" Then
temp = temp & Mid(str, i, 1)
End If
Next
If you want to build on your current effort, replace:
cel.Value = strVal
with:
cel.Value = Replace(strVal, " ", "")
Consider:
Sub ReplaceSpecial()
Dim cel As Range
Dim strVal As String
Dim i As Long
Application.ScreenUpdating = False
For Each cel In Selection
strVal = cel.Value
For i = 1 To Len(strVal)
Select Case Asc(Mid(strVal, i, 1))
Case 32, 48 To 57, 65 To 90, 97 To 122
' Leave ordinary characters alone
Case Else
Mid(strVal, i, 1) = " "
End Select
Next i
cel.Value = Replace(strVal, " ", "")
Next cel
Application.ScreenUpdating = True
End Sub

Count number of occurrences of specific values in a string

I have multiple rows with words separated by semicolons (;), and I need to count how many times a certain word appears in the cell strings of column A on Sheet1.
Using two rows for example:
Column "A"
Banana; Apple; Orange
Banana; Banana; Apple
I came up with this code for the counting of the specific word I want to count:
Sub count()
'The count will be registered in "B6"
strcount = "Banana"
For i = 2 to 30
If InStr(Sheets("Sheet1").Cells(i, "A").Text, strcount) <> 0 Then
Cells(6, "B").Value = Cells(6, "B").Value + 1
End If
Next i
End Sub
The problem with this code is that it doesn't recognize the 2 appearances of "Banana" in the second row, so it returns a count of 2 instead of 3:
Results for each fruit:
Banana: 2
Apple: 2
Orange: 1
I see that the problem is InStr only recognizes if the string is there, but how can I overcome this?
Solution:
Both basodre's and Алексей's answers worked.
For basodre's code I only had to change the delimiter from ";" to "; " (with a space after the semicolon) to match my string.
aFoods = Split(rIterator.Value, "; ")
Алексей's answer works perfectly too, but at the time of this edit it is limited to Excel 2019 or later, since it uses the TEXTJOIN function and I couldn't come up with a replacement for that.
Here's an example that I think does what you need. Please review, modify to your range, and let us know if it works.
Sub CountWords()
Dim rng As Range
Dim aFoods As Variant
Dim rIterator As Range
Dim counter As Long, i As Long
Const theFood As String = "Banana"
Set rng = Range("A1:A3")
counter = 0
For Each rIterator In rng
aFoods = Split(rIterator.Value, ";")
For i = LBound(aFoods) To UBound(aFoods)
If aFoods(i) = theFood Then
counter = counter + 1
End If
Next i
Next rIterator
Debug.Print counter
End Sub
Solution with RegExp:
Option Explicit
Sub test1()
Dim re As Object, result As Object, text As String, fruit As Variant
Set re = CreateObject("vbscript.regexp")
re.Global = True
text = WorksheetFunction.TextJoin(";", True, Columns("A"))
'In Excel < 2019 you can use: text = Join(WorksheetFunction.Transpose(Intersect(Columns("A"), ActiveSheet.UsedRange)), ";")
For Each fruit In Array("Banana", "Apple", "Orange")
re.Pattern = "\b" & fruit & "\b"
Set result = re.Execute(text)
Debug.Print "Amount of [" & fruit & "] = " & result.Count
Next
End Sub
Output:
Amount of [Banana] = 3
Amount of [Apple] = 2
Amount of [Orange] = 1
Using regular expression
Sub FindEntries()
Dim mc, rw
Const word$ = "Banana"
With CreateObject("VBScript.RegExp")
.IgnoreCase = True: .Global = True: .Pattern = "(^|;\s+)" & word & "(?=;|$)"
For rw = 2 To Cells(Rows.Count, "A").End(xlUp).Row
Set mc = .Execute(Cells(rw, "A")): [B6] = [B6] + mc.Count
Next
End With
End Sub
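As the question points out, InStr only tells you whether the word is present, not how often. Another way around that, without splitting or regular expressions, is to compare the length of the text before and after removing the word. A minimal sketch (CountOccurrences is a hypothetical name). Note that it counts substrings, so "Banana" would also be counted inside "Bananas", and the comparison is case-sensitive:

```vba
Option Explicit

' Hypothetical helper: counts how many times word occurs in text by
' measuring how much shorter the text gets when word is removed.
Public Function CountOccurrences(ByVal text As String, ByVal word As String) As Long
    If Len(word) = 0 Then Exit Function ' avoid division by zero
    CountOccurrences = (Len(text) - Len(Replace(text, word, ""))) \ Len(word)
End Function
```

For "Banana; Banana; Apple" and "Banana" this returns 2, so adding the result for each row in the loop from the question (instead of adding 1) should give the expected total of 3.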

VBA printing a substring from a string

I would like to print each substring between the "..." separators of this string: "...covid...is...very...scary" into consecutive cells of a column in Excel.
This is my VBA code:
Sub copyd()
findandcopy("...covid...is...very...scary") 'not sure how to print in consecutive cells of a column
End Sub
Function findandcopy(brokenstr As String) As String
Dim first, second As Integer
Dim strtarget as string
strtarget = "..."
Do until second =0. 'second=0 so that loop ends when there is no more "..." found
first = InStr(brokenstr, strtarget)
second = InStr(first + 3, brokenstr, strtarget)
findandcopy = Mid(purpose, first +3, second - first -3) 'referred to https://stackoverflow.com/questions/2543225/how-to-get-a-particular-part-of-a-string#_=_
first = second 'so that loop can find next "..."
Loop
End Function
Can anyone please advise? Thank you for your help :)
Try this code:
Option Explicit
Sub copyd()
Dim arr As Variant
' get splitted text into horizontal array arr()
arr = Split("...covid...is...very...scary", "...")
If UBound(arr) > 0 Then ' if there is something in the array, display it on the sheet
' put onto sheet values from transposed array arr()
ThisWorkbook.Worksheets(1).Range("A1"). _
Resize(UBound(arr) + 1, 1).Value = _
WorksheetFunction.Transpose(arr)
End If
End Sub
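Because the sample string starts with "...", Split returns an empty first element, so the code above leaves the first cell blank. If that is not wanted, a simple loop can skip empty pieces and write the rest into consecutive cells. A small sketch, assuming the output should start at A1 of the active sheet (WritePieces is a hypothetical name):

```vba
Option Explicit

' Hypothetical variant: writes each non-empty piece into the next
' free row of column A on the active sheet, starting at row 1.
Sub WritePieces()
    Dim pieces() As String
    Dim i As Long
    Dim nextRow As Long

    pieces = Split("...covid...is...very...scary", "...")
    nextRow = 1
    For i = LBound(pieces) To UBound(pieces)
        If Len(pieces(i)) > 0 Then ' skip the empty leading piece
            ActiveSheet.Cells(nextRow, "A").Value = pieces(i)
            nextRow = nextRow + 1
        End If
    Next i
End Sub
```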
Ahh, why not just split the string by "..."?
Like:
Function findandcopy(brokenstr As String, targetStr as string)
dim substr()
if instr(1, brokenstr, targetStr, vbTextCompare) > 0 then
'brokenstr has at least one instance of targetStr in it
brokenstr2 = split(brokenstr,targetStr)
if brokenstr2(0) = "" then
redim substr(ubound(brokenstr2)-1)
iStart = 1
else
redim substr(ubound(brokenstr2))
iStart = 0
end if
for i = iStart to ubound(brokenstr2)
substr(i-iStart) = brokenstr2(i)
next i
else
'No instances of targetStr in brokenstr
redim substr(0)
substr(0) = brokenstr
end if
findandcopy = substr
end function
This returns an array of the strings found between the occurrences of targetStr. You can then do with it as you please in the calling sub.
If you start doing comparisons with the results and run into issues, you can strip whitespace by changing the assignment inside the loop to:
substr(i - iStart) = Trim(brokenstr2(i))
and your calling code:
Sub main()
Dim covid as string
Dim remove as string
covid = "...covid...is....very...scary"
'covid = "really...covid...is...very...scary" 'For testing
remove = "..."
rtn = findandcopy(covid, remove)
end sub

Concatenate specific values from a cell with specific values from another cell into a particular format

A B C
1 numbers signs **Result**
2 *001* *alpha* 001-alpha
3 *001*111*221*104* *alpha*kappa*epislon*ETA* 001-alpha, 111-kappa, 221-epislon, 104-ETA
4 *001*085* *alpha*delta* 001-alpha, 085-delta
I'm trying to concatenate the values in columns A and B into the following format under the result section. Anything helps, thanks.
Formula solution
This uses the TEXTJOIN and FILTERXML functions. TEXTJOIN is available in Office 365 and Excel 2019; FILTERXML is available in Excel 2013 and later.
In C2, enter this as an array formula (confirm with Ctrl+Shift+Enter) and copy it down:
=TEXTJOIN(", ",1,IFERROR(TEXT(FILTERXML("<a><b>"&SUBSTITUTE(A2,"*","</b><b>")&"</b></a>","//b"),"000")&FILTERXML("<a><b>"&SUBSTITUTE(B2,"*","</b><b>-")&"</b></a>","//b"),""))
I'm assuming this is doable with formulas but it might get unwieldy, so perhaps a UDF like this:
Public Function JoinNumbersAndSigns(ByVal numbersRng As Range, ByVal signsRng As Range) As String
Dim nums As String
nums = numbersRng.Cells(1).Value
nums = Mid$(nums, 2, Len(nums) - 2) ' remove leading and trailing *
Dim signs As String
signs = signsRng.Cells(1).Value
signs = Mid$(signs, 2, Len(signs) - 2) ' remove leading and trailing *
Dim tempNums As Variant
tempNums = Split(nums, "*")
Dim tempSigns As Variant
tempSigns = Split(signs, "*")
Dim i As Long
For i = LBound(tempNums) To UBound(tempNums)
Dim tempString As String
Dim sep As String
tempString = tempString & sep & tempNums(i) & "-" & tempSigns(i)
sep = ", "
Next i
JoinNumbersAndSigns = tempString
End Function
The nums = Mid$(nums, 2, Len(nums) - 2) and similar line for signs could probably be made more robust, but should work given your current data.
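One way to make that step more robust, as a sketch: strip asterisks only when they are actually present, instead of always cutting off the first and last character. TrimStars is a hypothetical helper name, not part of the UDF above:

```vba
Option Explicit

' Hypothetical helper: removes any leading and trailing "*" characters
' and leaves the string alone if there are none.
Private Function TrimStars(ByVal s As String) As String
    Do While Left$(s, 1) = "*"
        s = Mid$(s, 2)
    Loop
    Do While Right$(s, 1) = "*"
        s = Left$(s, Len(s) - 1)
    Loop
    TrimStars = s
End Function
```

In the UDF, the two Mid$ lines could then become nums = TrimStars(nums) and signs = TrimStars(signs), which also avoids an error when a cell is empty or contains a single character.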
Here's another approach using regular expressions ...
Option Explicit
Public Function Link(vNumbers As Range, vSigns As Range) As Variant
' ADD REFERENCE TO "Microsoft VBScript Regular Expressions 5.5"
Dim vRegEx As New RegExp
Dim vNumbersMatches As MatchCollection
Dim vSignsMatches As MatchCollection
Dim vCounter As Long
' The two parameters must only reference a single cell
If vNumbers.Cells.Count <> 1 Or vSigns.Cells.Count <> 1 Then
Link = CVErr(xlErrRef)
Exit Function
End If
' use regular expression to get the numbers
vRegEx.Pattern = "([0-9]+)"
vRegEx.Global = True
vRegEx.MultiLine = True
Set vNumbersMatches = vRegEx.Execute(vNumbers.Text)
' Use regular expression to get the signs
vRegEx.Pattern = "([^\*]+)"
vRegEx.Global = True
vRegEx.MultiLine = True
Set vSignsMatches = vRegEx.Execute(vSigns.Text)
' If the number of Numbers and Signs differs, then return an error
If vNumbersMatches.Count <> vSignsMatches.Count Then
Link = CVErr(xlErrValue)
Exit Function
End If
' Loop through the Numbers and Signs, appending each set
For vCounter = 0 To vNumbersMatches.Count - 1
Link = Link & vNumbersMatches.Item(vCounter) & "-" & vSignsMatches.Item(vCounter) & IIf(vCounter < vNumbersMatches.Count - 1, ", ", "")
Next
End Function
As long as columns A and B always contain the same number of elements, this will work:
Sub SplitandConcat()
' Declare working vars
Dim lRow As Long: lRow = 2
Dim sOutputString As String
Dim iWorkIndex As Integer
Dim CommaSpace As String
While ActiveSheet.Cells(lRow, 1) <> ""
CommaSpace = ""
'Split the incoming string on delimiter
arInput1 = Split(ActiveSheet.Cells(lRow, 1), "*")
arInput2 = Split(ActiveSheet.Cells(lRow, 2), "*")
' For each non-blank item in the first array, join the corresponding item in the second
For iWorkIndex = 0 To UBound(arInput1)
If arInput1(iWorkIndex) <> "" Then
ActiveSheet.Cells(lRow, 3) = ActiveSheet.Cells(lRow, 3) & CommaSpace & arInput1(iWorkIndex) & "-" & arInput2(iWorkIndex)
CommaSpace = ", "
End If
Next iWorkIndex
' check next row
lRow = lRow + 1
Wend
End Sub

Split and sort strings components using Excel

I have a column in Excel with the format:
A01G45B45D12
I need a way to reformat it like this: divide the string into groups of three characters, sort the groups alphabetically, and then join them together with a + sign in between:
A01+B45+D12+G45
I wonder if this is possible using the built-in formulas in Excel, or if I have to do this with VBA or something else. I already have the code for this in C#, if there is an easy way to use that from Excel. I have not written plugins for Excel before.
Edit to add:
The above is just an example. The string can be of any length, but its length is always divisible by three, and the order is random, so I cannot assume anything about it beforehand.
Sub ArraySort()
Dim strStarter As String
Dim strFinish As String
Dim intHowMany As Integer
Dim intStartSlice As Integer
strStarter = ActiveCell.Offset(0, -1).Value 'Pulls value from cell to the left
intHowMany = Int(Len(strStarter) / 3)
ReDim arrSlices(1 To intHowMany) As String
intStartSlice = 1
For x = 1 To intHowMany
arrSlices(x) = Mid(strStarter, intStartSlice, 3)
intStartSlice = intStartSlice + 3
Next x
Call BubbleSort(arrSlices)
For x = 1 To intHowMany
strFinish = strFinish + arrSlices(x) & "+"
Next x
strFinish = Left(strFinish, Len(strFinish) - 1)
ActiveCell.Value = strFinish 'Puts result into activecell
End Sub
Sub BubbleSort(list() As String)
'Taken from power programming with VBA
'It’s a sorting procedure for 1-dimensional arrays named List
'The procedure takes each array element, if it is greater than the next element, the two elements swap positions.
'The evaluation is repeated for every pair of items (that is n-1 times)
Dim First As Long, Last As Long
Dim i As Long, j As Long
Dim temp As String
First = LBound(list)
Last = UBound(list)
For i = First To Last - 1
For j = i + 1 To Last
If list(i) > list(j) Then
temp = list(j)
list(j) = list(i)
list(i) = temp
End If
Next j
Next i
End Sub

Resources