Anybody have a good solution for recursive replace?
For example, you still end up with commas in this string returned by MsgBox:
Dim s As String
s = "32,,,,,,,,,,,,,,,,23"
MsgBox Replace(s, ",,", ",")
I only want one comma.
Here is code that I developed, but it doesn't compile:
Function RecursiveReplace(ByVal StartString As String, ByVal Find As String, ByVal Replace As String) As String
Dim s As String
s = Replace(StartString, Find, Replace)
t = StartString
Do While s <> t
t = s
s = Replace(StartString, Find, Replace)
Loop
RecursiveReplace = s
End Function
The compiler complains about the second line in the function:
s = Replace(StartString, Find, Replace)
It says Expected Array.
???
You can use a regular expression. This shows the basic idea:
Function CondenseCommas(s As String) As String
Dim RegEx As Object
Set RegEx = CreateObject("VBScript.RegExp")
RegEx.Global = True ' condense every run of commas, not only the first one
RegEx.Pattern = ",+"
CondenseCommas = RegEx.Replace(s, ",")
End Function
Tested like:
Sub test()
Dim s As String
s = "32,,,,,,,,,,,,,,,,23"
MsgBox CondenseCommas(s)
End Sub
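On the compile error in the question: the third parameter is named Replace, so inside the function that name refers to the String parameter rather than VBA's built-in Replace function, which is why the compiler reports "Expected Array". The loop also passes StartString again on every pass instead of the previous result, so it could never do more than one round of replacing. If you would rather keep a loop than use a regex, a minimal sketch could look like this (RepeatedReplace, FindText and ReplaceText are names chosen here for illustration, not from the original post):

```vba
Function RepeatedReplace(ByVal StartString As String, _
                         ByVal FindText As String, _
                         ByVal ReplaceText As String) As String
    Dim s As String
    Dim t As String

    ' Guard (an assumption about the desired behaviour): if the replacement
    ' contains the search text, repeating would never settle, so do one pass.
    If Len(FindText) = 0 Or InStr(1, ReplaceText, FindText) > 0 Then
        RepeatedReplace = Replace(StartString, FindText, ReplaceText)
        Exit Function
    End If

    s = StartString
    Do
        t = s                                   ' remember the previous result
        s = Replace(t, FindText, ReplaceText)   ' replace in the previous result
    Loop While s <> t                           ' stop once nothing changes

    RepeatedReplace = s
End Function
```

Called as RepeatedReplace("32,,,,,,,,,,,,,,,,23", ",,", ","), this should leave a single comma between the two numbers.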
How can I grab matching regular expressions from a string, remove the duplicates, and append them to a string variable that separates each by a comma?
For example, in the string, "this is an example of the desired regular expressions: BPOI-G8J7R9, BPOI-G8J7R9 and BPOI-E5Q8D2" the desired output string would be "BPOI-G8J7R9,BPOI-E5Q8D2"
I have attempted to use a dictionary to remove the duplicates, but my function is spitting out the dreaded #Value error.
Can anyone see where I'm going wrong here? Or is there any suggestion for a better way of going about this task?
Code below:
Public Function extractexpressions(ByVal text As String) As String
Dim regex, expressions, expressions_dict As Object, result As String, found_expressions As Variant, i As Long
Set regex = CreateObject("VBScript.RegExp")
regex.Pattern = "[A-Z][A-Z][A-Z][A-Z][-]\w\w\w\w\w\w"
regex.Global = True
Set expressions_dict = CreateObject("Scripting.Dictionary")
If regex.Test(text) Then
expressions = regex.Execute(text)
End If
For Each item In expressions
If Not expressions_dict.exists(item) Then expressions_dict.Add item, 1
Next
found_expressions = expressions_dict.items
result = ""
For i = 1 To expressions_dict.Count - 1
result = result & found_expressions(i) & ","
Next i
extractexpressions = result
End Function
If you call your function from a Sub you will be able to debug it.
See the comment below about adding the matches as keys to the dictionary - if you add the match object itself, instead of explicitly specifying the match's value property, your dictionary won't de-duplicate your matches (because two or more match objects with the same value are still distinct objects).
Sub Tester()
Debug.Print extractexpressions("ABCD-999999 and DFRG-123456 also ABCD-999999 blah")
End Sub
Public Function extractexpressions(ByVal text As String) As String
Dim regex As Object, expressions As Object, expressions_dict As Object
Dim item
Set regex = CreateObject("VBScript.RegExp")
regex.Pattern = "[A-Z]{4}-\w{6}"
regex.Global = True
If regex.Test(text) Then
Set expressions = regex.Execute(text)
Set expressions_dict = CreateObject("Scripting.Dictionary")
For Each item In expressions
'A dictionary can have object-type keys, so make sure to add the match *value*
' and not the match object itself
If Not expressions_dict.Exists(item.Value) Then expressions_dict.Add item.Value, 1
Next
extractexpressions = Join(expressions_dict.Keys, ",")
End If
End Function
VBA's regex object actually supports the backreference to a previous capture group. Hence we can get all the unique items through the expression itself:
([A-Z]{4}-\w{6})(?!.*\1)
To put this in practice:
Sub Test()
Debug.Print extractexpressions("this is an example of the desired regular expressions: BPOI-G8J7R9, BPOI-G8J7R9 and BPOI-E5Q8D2")
End Sub
Public Function extractexpressions(ByVal text As String) As String
With CreateObject("VBScript.RegExp")
.Pattern = "([A-Z]{4}-\w{6})(?!.*\1)|."
.Global = True
extractexpressions = Replace(Application.Trim(.Replace(text, "$1 ")), " ", ",")
End With
End Function
Prints:
BPOI-G8J7R9,BPOI-E5Q8D2
I need to find numbers from a string. How does one find numbers from a string in VBA Excel?
Assuming you mean you want the non-numbers stripped out, you should be able to use something like:
Function onlyDigits(s As String) As String
' Variables needed (remember to use "option explicit"). '
Dim retval As String ' This is the return string. '
Dim i As Integer ' Counter for character position. '
' Initialise return string to empty '
retval = ""
' For every character in input string, copy digits to '
' return string. '
For i = 1 To Len(s)
If Mid(s, i, 1) >= "0" And Mid(s, i, 1) <= "9" Then
retval = retval + Mid(s, i, 1)
End If
Next
' Then return the return string. '
onlyDigits = retval
End Function
Calling this with:
Dim myStr as String
myStr = onlyDigits ("3d1fgd4g1dg5d9gdg")
MsgBox (myStr)
will give you a dialog box containing:
314159
and those first two lines show how you can store it into an arbitrary string variable, to do with as you wish.
Regular expressions are built to parse. While the syntax can take a while to pick up, this approach is efficient and flexible enough to handle more complex string extractions and replacements.
Sub Tester()
MsgBox CleanString("3d1fgd4g1dg5d9gdg")
End Sub
Function CleanString(strIn As String) As String
Dim objRegex
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Global = True
.Pattern = "[^\d]+"
CleanString = .Replace(strIn, vbNullString)
End With
End Function
Expanding on brettdj's answer, in order to parse disjoint embedded digits into separate numbers:
Sub TestNumList()
Dim NumList As Variant 'Array
NumList = GetNums("34d1fgd43g1 dg5d999gdg2076")
Dim i As Integer
For i = LBound(NumList) To UBound(NumList)
MsgBox i + 1 & ": " & NumList(i)
Next i
End Sub
Function GetNums(ByVal strIn As String) As Variant 'Array of numeric strings
Dim RegExpObj As Object
Dim NumStr As String
Set RegExpObj = CreateObject("vbscript.regexp")
With RegExpObj
.Global = True
.Pattern = "[^\d]+"
NumStr = .Replace(strIn, " ")
End With
GetNums = Split(Trim(NumStr), " ")
End Function
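An alternative to replacing the non-digits and splitting is to match the digit runs directly with \d+ and read them from the match collection. This is only a sketch; GetNumsByMatch is a hypothetical name, and it returns an empty array when the input contains no digits:

```vba
Function GetNumsByMatch(ByVal strIn As String) As Variant
    Dim matches As Object
    Dim result() As String
    Dim i As Long

    With CreateObject("VBScript.RegExp")
        .Global = True          ' find every run of digits, not just the first
        .Pattern = "\d+"
        Set matches = .Execute(strIn)
    End With

    If matches.Count = 0 Then
        GetNumsByMatch = Array()    ' no digits at all
        Exit Function
    End If

    ' Copy each match value into a zero-based string array
    ReDim result(0 To matches.Count - 1)
    For i = 0 To matches.Count - 1
        result(i) = matches(i).Value
    Next i

    GetNumsByMatch = result
End Function
```

The result can be used the same way as the array returned by GetNums above, for example with LBound/UBound or Join.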
Use the built-in VBA function Val, if the numbers are at the front end of the string:
Dim str as String
Dim lng as Long
str = "1 149 xyz"
lng = Val(str)
' lng is now 1149
Val Function, on MSDN
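To make the "front end of the string" condition concrete, here is a small sketch (DemoVal is a hypothetical name) showing that Val skips spaces but stops reading at the first character it cannot treat as part of a number:

```vba
Sub DemoVal()
    Debug.Print Val("1 149 xyz")   ' 1149: spaces are skipped, " xyz" is ignored
    Debug.Print Val("12abc34")     ' 12: reading stops at "a"
    Debug.Print Val("abc123")      ' 0: the string does not start with a number
End Sub
```

So Val is only suitable when the digits you want come before any other text.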
I was looking for an answer to the same question, and after a while I found my own solution, which I want to share for anyone who needs it in the future. Here is another approach that does not use a separate function.
Dim control As Boolean
Dim controlval As String
Dim resultval As String
Dim i as Integer
controlval = "A1B2C3D4"
For i = 1 To Len(controlval)
control = IsNumeric(Mid(controlval, i, 1))
If control = True Then resultval = resultval & Mid(controlval, i, 1)
Next i
' resultval is now "1234"
This is a variant of brettdj's and pstraton's posts.
This will return a true Value and not give you the #NUM! error. And \D is shorthand for anything but digits. The rest is much like the others only with this minor fix.
Function StripChar(Txt As String) As Variant
With CreateObject("VBScript.RegExp")
.Global = True
.Pattern = "\D"
StripChar = Val(.Replace(Txt, " "))
End With
End Function
This is based on another answer, but is just reformatted:
Assuming you mean you want the non-numbers stripped out, you should be able to use something like:
'
' Skips all characters in the input string except digits
'
Function GetDigits(ByVal s As String) As String
Dim char As String
Dim i As Integer
GetDigits = ""
For i = 1 To Len(s)
char = Mid(s, i, 1)
If char >= "0" And char <= "9" Then
GetDigits = GetDigits + char
End If
Next i
End Function
Calling this with:
Dim myStr as String
myStr = GetDigits("3d1fgd4g1dg5d9gdg")
Call MsgBox(myStr)
will give you a dialog box containing:
314159
and those first two lines show how you can store it into an arbitrary string variable, to do with as you wish.
Alternative via Byte Array
If you assign a string to a Byte array you typically get the number equivalents of each character in pairs of the array elements. Use a loop for numeric check via the Like operator and return the joined array as string:
Function Nums(s$)
Dim by() As Byte, i&, ii&
by = s: ReDim tmp(UBound(by)) ' assign string to byte array; prepare temp array
For i = 0 To UBound(by) - 1 Step 2 ' check num value in byte array (0, 2, 4 ... n-1)
If Chr(by(i)) Like "#" Then tmp(ii) = Chr(by(i)): ii = ii + 1
Next i
Nums = Trim(Join(tmp, vbNullString)) ' return string with numbers only
End Function
Example call
Sub testByteApproach()
Dim s$: s = "a12bx99y /\:3,14159" ' [1] define original string
Debug.Print s & " => " & Nums(s) ' [2] display original string and result
End Sub
would display the original string and the result string in the immediate window:
a12bx99y /\:3,14159 => 1299314159
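The "pairs of array elements" come from VBA storing strings as two bytes per character. A minimal sketch (DemoByteLayout is a hypothetical name) that prints the layout for a two-character string:

```vba
Sub DemoByteLayout()
    Dim b() As Byte
    Dim i As Long

    b = "A1"    ' a 2-character string becomes 4 bytes (indexes 0 to 3)

    ' Prints index and byte value: 0 65, 1 0, 2 49, 3 0
    ' 65 and 49 are the codes of "A" and "1"; the odd positions hold
    ' the high byte, which is 0 for these characters.
    For i = LBound(b) To UBound(b)
        Debug.Print i, b(i)
    Next i
End Sub
```

That is why the loop in Nums steps through the array two elements at a time.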
Based on @brettdj's answer, using a VBScript regex object, with two modifications:
The function accepts a variant and returns a variant, so that the Null case is handled; and
It uses explicit object creation, with a reference to the "Microsoft VBScript Regular Expressions 5.5" library.
Function GetDigitsInVariant(inputVariant As Variant) As Variant
' Returns:
' Only the digits found in a variant.
' Examples:
' GetDigitsInVariant(Null) => Null
' GetDigitsInVariant("") => ""
' GetDigitsInVariant("2021-/05-May/-18, Tue") => "20210518"
' GetDigitsInVariant("2021-05-18") => "20210518"
' Notes:
' If the inputVariant is null, null will be returned.
' If the inputVariant is "", "" will be returned.
' Usage:
' VBA IDE Menu > Tools > References ...
' > "Microsoft VBScript Regular Expressions 5.5" > [OK]
' With an explicit object reference to RegExp we can get intellisense
' and review the object hierarchy with the object browser
' (VBA IDE Menu > View > Object Browser).
Dim regex As VBScript_RegExp_55.RegExp
Set regex = New VBScript_RegExp_55.RegExp
Dim result As Variant
result = Null
If IsNull(inputVariant) Then
result = Null
Else
With regex
.Global = True
.Pattern = "[^\d]+"
result = .Replace(inputVariant, vbNullString)
End With
End If
GetDigitsInVariant = result
End Function
Testing:
Private Sub TestGetDigitsInVariant()
Dim dateVariants As Variant
dateVariants = Array(Null, "", "2021-/05-May/-18, Tue", _
"2021-05-18", "18/05/2021", "3434 ..,sdf,sfd 444")
Dim dateVariant As Variant
For Each dateVariant In dateVariants
Debug.Print dateVariant & ": ", , GetDigitsInVariant(dateVariant)
Next dateVariant
Debug.Print
End Sub
Public Function ExtractChars(strRef$) As String
'Extract characters from a string according to a set of allowed characters, e.g. '+.-1234567890'
Dim strA$, e%, strExt$, strCnd$: strExt = "": strCnd = "+.-1234567890"
For e = 1 To Len(strRef): strA = Mid(strRef, e, 1)
If InStr(1, strCnd, strA) > 0 Then strExt = strExt & strA
Next e: ExtractChars = strExt
End Function
In the immediate debug dialog:
? ExtractChars("a-5d31.78K")
-531.78
I have a requirement to remove the text between two strings.
An example of the text is:
Abc%678x”cv ","#metadata":{abxkl "DataArray"}},{"columnName":"
The requirement is to start removing text from ,"#met till "}
The requirement is to use ,"#met and "} as start and end identifiers and remove text between them including the identifiers.
There could be multiple occurrence of this start and end identifiers within the file.
The output should look like this:
Abc%678x”cv "},{"columnName":"
How to write an Excel formula or simple VBA script to remove text between two strings, including identifiers?
Formula:
=LEFT(A1,FIND(",""#met",A1)-1)&RIGHT(A1,LEN(A1)-FIND("}",A1,FIND("""#met",A1)))
VBA function:
Function RemoveBetweenSeparators( _
ByVal MyString As String, _
ByVal SepL As String, _
ByVal SepR As String) _
As String
Dim sL As String
Dim sR As String
sL = Split(MyString, SepL)(0)
sR = Replace(MyString, sL, "")
sR = Replace(sR, Split(sR, SepR)(0) & SepR, "")
RemoveBetweenSeparators = sL & sR
End Function
Which can be used like this:
=RemoveBetweenSeparators(A1,",""#met","}")
EDIT: I also missed the 'multiple occurrences' requirement, first time round! That makes it a little trickier, but try this:
Function RemoveBetweenSeparatorsMultiple( _
ByVal MyString As String, _
ByVal SepL As String, _
ByVal SepR As String) _
As String
Dim sOut As String
Dim sL As String
Do Until InStr(MyString, SepL) = 0
sL = Split(MyString, SepL)(0)
sOut = sOut & sL
MyString = Replace(MyString, sL & SepL, "", 1, 1)
sL = Split(MyString, SepR)(0)
MyString = Replace(MyString, sL & SepR, "", 1, 1)
Loop
RemoveBetweenSeparatorsMultiple = sOut & MyString
End Function
My apologies, I didn't notice that there could be multiple occurrences. I'll edit my answer later.
Assuming the original text is stored in A1:
A2=LEFT(A1,FIND(",""#met",A1)-1)&RIGHT(A1,LEN(A1)-FIND("""}",A1)-1)
Note: To make Excel treat a double quote as literal text inside a string, type two of them; each "" represents one ".
If there may be multiple occurrences, try this:
Private Function RemoveText(ByVal tgtString As String, ByVal StartText As String, ByVal EndText As String) As String
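Dim startPos As Long, endPos As Long
startPos = InStr(1, tgtString, StartText)
Do While startPos > 0
' Look for the end marker only after the start marker
endPos = InStr(startPos + Len(StartText), tgtString, EndText)
If endPos = 0 Then Exit Do ' unmatched start marker: leave the rest untouched
tgtString = Left(tgtString, startPos - 1) & Mid(tgtString, endPos + Len(EndText))
startPos = InStr(1, tgtString, StartText)
Loop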
RemoveText = tgtString
End Function
Private Sub test()
'remove certain string in A1 and store the result in A2
Range("A2").Value = RemoveText(Range("A1").Value, ",""#met", """}")
End Sub
Maybe something like this (not tested though!):
Function cleanedStr(inpStr As String, beginDel As String, endDel As String) As String
Dim idx as long
Dim take as boolean
Dim outStr as String
Dim myCh as String
take = true
outStr = ""
for idx = 1 to len(inpStr)
myCh = mid(inpStr, idx, 1)
if myCh = beginDel then take = false
if take then
outStr = outStr & myCh
else
if myCh = endDel then take = true
end if
next idx
cleanedStr = outStr
end Function
Mind, the begin-identifier is 1 character only.
beginDel would be # and endDel would be }
This can be done easily using VBA and regular expressions:
Option Explicit
Function RemoveBetweenDelimiters(S As String) As String
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
With RE
.Global = True
.ignorecase = True
.Pattern = ",""#met[^}]+}"
RemoveBetweenDelimiters = .Replace(S, "")
End With
End Function
The regex interpretation:
,"#met[^}]+}
Options: Case insensitive; ^$ match at line breaks
,"#met : match the character string ,"#met literally
[^}]+ : match any character that is NOT a "}", between one and unlimited times, as many times as possible, giving back as needed (greedy)
} : match the character "}" literally
Created with RegexBuddy
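The pattern above stops at the first "}" after the start identifier. If you want the end identifier to be the two-character sequence "} from the question, one option is a lazy quantifier, which the VBScript regex engine supports. A sketch under that assumption (RemoveBetweenLazy is a hypothetical name; [\s\S] is used so the match can also span line breaks):

```vba
Function RemoveBetweenLazy(ByVal S As String) As String
    With CreateObject("VBScript.RegExp")
        .Global = True      ' remove every occurrence, not only the first
        ' ,"#met  then as few characters as possible  then "}
        .Pattern = ",""#met[\s\S]*?""\}"
        RemoveBetweenLazy = .Replace(S, "")
    End With
End Function
```

On the sample text this removes the same span as the original pattern; the two only differ when a lone } appears before the closing "}.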
If I have a string: "foo, bar" baz, test, blah, how do I remove a specific comma, i.e. not all of them, but just one of my choosing?
With Replace and InStr it looks like I have to know where the comma is. The problem is, I only want to remove the comma if it appears between quotation marks.
So, I may want to remove the first comma and I may not.
Put more clearly, if there is a comma between a set of quotation marks, I need to remove it. if not, then there's nothing to do. But, I can't just remove all the commas, as I need the others in the string.
Try with Regexp in this way:
Sub foo()
Dim TXT As String
TXT = """foo, bar"" baz, test, blah"
Debug.Print TXT
Dim objRegExp As Object
Set objRegExp = CreateObject("vbscript.regexp")
With objRegExp
.Global = True
.Pattern = "(""\w+)(,)(\s)(\w+"")"
Debug.Print .Replace(TXT, "$1$3$4")
End With
End Sub
It works as expected for the sample value you have provided but could require additional adjustments by changing .Pattern for more complicated text.
EDIT: If you want to use this solution as an Excel function, then use this code:
Function RemoveCommaInQuotation(TXT As String)
Dim objRegExp As Object
Set objRegExp = CreateObject("vbscript.regexp")
With objRegExp
.Global = True
.Pattern = "(""\w+)(,)(\s)(\w+"")"
RemoveCommaInQuotation = .Replace(TXT, "$1$3$4")
End With
End Function
Ugh. Here's another way
Public Function foobar(yourStr As String) As String
Dim parts() As String
parts = Split(yourStr, Chr(34))
parts(1) = Replace(parts(1), ",", "")
foobar = Join(parts, Chr(34))
End Function
With some error-checking for odd number of double quotes:
Function myremove(mystr As String) As String
Dim sep As String
sep = """"
Dim strspl() As String
strspl = Split(mystr, sep, -1, vbBinaryCompare)
Dim imin As Integer, imax As Integer, nstr As Integer, istr As Integer
imin = LBound(strspl)
imax = UBound(strspl)
nstr = imax - imin
If ((nstr Mod 2) <> 0) Then
myremove = "Odd number of double quotes"
Exit Function
End If
For istr = imin + 1 To imax Step 2
strspl(istr) = Replace(strspl(istr), ",", "")
Next istr
myremove = Join(strspl(), """")
End Function
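Another way to express the same idea without Split or a regex is to walk the string once and track whether the current position is inside quotation marks. A minimal sketch, assuming plain double quotes with no escaping (RemoveCommasInQuotes is a hypothetical name):

```vba
Function RemoveCommasInQuotes(ByVal S As String) As String
    Dim i As Long
    Dim ch As String
    Dim inQuotes As Boolean
    Dim result As String

    For i = 1 To Len(S)
        ch = Mid$(S, i, 1)
        ' Each double quote toggles the inside/outside state
        If ch = """" Then inQuotes = Not inQuotes
        ' Keep every character except a comma found inside quotes
        If Not (inQuotes And ch = ",") Then result = result & ch
    Next i

    RemoveCommasInQuotes = result
End Function
```

With an unbalanced quote, everything after the last opening quote is treated as quoted, so you may still want the odd-count check shown above.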