In Excel VBA, how can I remove every xyz(*) sub-string from a string that contains several instances of it?
The * in xyz(*) stands for everything between the two parentheses. For example, the string "COVID-19 xyz(aaa) affects xyz(bbbbbb) so much families." should become "COVID-19 affects so much families."
You should use a regular expression, for example:
Sub a()
    ' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
    Dim Regex As New RegExp
    Dim SubjectString As String
    SubjectString = "COVID-19 xyz(test) affects xyz(test) so much, families."
    With Regex
        .Global = True
        ' The parentheses are escaped so they match literal ( and )
        .Pattern = "\sxyz\(\S*\)"
    End With
    Dim ResultString As String
    ResultString = Regex.Replace(SubjectString, "")
    MsgBox ResultString
End Sub
The leading \s grabs the one whitespace character before xyz, so the replacement doesn't leave two consecutive spaces. The pattern then matches the literal string xyz and an opening parenthesis, followed by \S* (\S is any non-whitespace character and * means zero or more times), and finally the closing parenthesis.
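If the text between the parentheses can itself contain spaces, \S* will not match it. A minimal sketch of a variant, assuming a hypothetical helper name RemoveXyzTokens and using late binding so no library reference is needed, matches everything up to the first closing parenthesis instead:

```vba
' Hypothetical helper (not from the original answer).
' Removes every "xyz(...)" token together with the whitespace in front of it.
Function RemoveXyzTokens(ByVal source As String) As String
    With CreateObject("VBScript.RegExp")
        .Global = True
        ' \s*     optional whitespace before the token
        ' xyz\(   the literal text "xyz("
        ' [^)]*   any characters except a closing parenthesis
        ' \)      the literal closing parenthesis
        .Pattern = "\s*xyz\([^)]*\)"
        RemoveXyzTokens = .Replace(source, "")
    End With
End Function
```

Calling RemoveXyzTokens on the example string from the question should give "COVID-19 affects so much families."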
Here's a solution that avoids regexp, which I tend to avoid whenever possible and convenient (as seems to be the case here):
Dim s As String
s = "COVID-19 xyz(aaa) affects xyz(bbbbbb) so much families."
Dim v As Variant
For Each v In Filter(Split(s, " "), "xyz(")
s = Replace(s, v & " ", vbNullString)
Next
I got the use of Filter() from this post
I have a column of hexadecimal strings with many trailing zeros.
The problem is that the trailing zeros need to be removed from each string.
I have searched for a VBA function such as Trim, but my attempts have not worked.
Is there a VBA function I can use to remove all the trailing zeros from each of the strings?
An example of the hex string is 4153523132633403277E7F0000000000000000000000000000. I would like to have it in the format 4153523132633403277E7F.
The big issue is that the hexadecimal strings can be of various lengths.
Formula:
You could try:
Formula in B1:
=LET(a,TEXTSPLIT(A1,,"0"),TEXTJOIN("0",0,TAKE(a,XMATCH("?*",a,2,-1))))
This splits the input with TEXTSPLIT() and then uses XMATCH() with the wildcard pattern ?* to return the position of the last non-empty string. However, since TEXTSPLIT() accepts arrays as delimiters, a slightly less verbose version could be:
=TEXTBEFORE(A1,TAKE(TEXTSPLIT(A1,TEXTSPLIT(A1,"0",,1)),,-1),-1)
Or another option, though more verbose, is to use REDUCE() for what it's intended to do, which is to loop a given array:
=REDUCE(A1,SEQUENCE(LEN(A1)),LAMBDA(a,b,IF(RIGHT(a)="0",LEFT(a,LEN(a)-1),a)))
VBA:
If VBA is a must, one way of dealing with this is through the RTrim() function. Since your HEX-string should not contain spaces to begin with I think the following is a safe bet:
Sub Test()
Dim s As String: s = "4153523132633403277E7F0000000000000000000000000000"
Dim s_new As String
s_new = Replace(RTrim(Replace(s, "0", " ")), " ", "0")
Debug.Print s_new
End Sub
If you happen to have spaces anywhere else in your string, another option would be to look for trailing zeros using a regular expression:
Sub Test()
Dim s As String: s = "4153523132633403277E7F0000000000000000000000000000"
Dim s_new As String
With CreateObject("vbscript.regexp")
.Pattern = "0+$"
s_new = .Replace(s, "")
End With
Debug.Print s_new
End Sub
Both the above options should print: 4153523132633403277E7F
As far as I know, there is no function to do that for you. The way I would do it is presented in the pseudo-code below:
while last character is "0"
remove last character
end while
It is quite slow, but VBA itself is no race car either, so you will probably not notice, especially if you do not need to do this many times at once.
A more elegant solution would need VBA to be able to anchor a search at the beginning or the end of a string.
An improvement on the solution above is to scan the string backwards, count the "0" characters, and then remove them all at once.
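The backwards scan described above could be sketched as follows. The function name TrimTrailingZeros is an assumption made for illustration; the loop only moves an index, and the string is cut once at the end with Left$.

```vba
' Hypothetical helper: strips all trailing "0" characters in one cut.
Function TrimTrailingZeros(ByVal s As String) As String
    Dim n As Long
    n = Len(s)
    ' Walk backwards until a character other than "0" is found.
    ' The test sits inside the loop because VBA's And does not short-circuit,
    ' so Mid$(s, 0, 1) would otherwise raise an error when n reaches 0.
    Do While n > 0
        If Mid$(s, n, 1) <> "0" Then Exit Do
        n = n - 1
    Loop
    TrimTrailingZeros = Left$(s, n)
End Function
```

For the example string in the question this should return 4153523132633403277E7F; a string made up only of zeros comes back empty.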
I have a String in VBA with this text: < History Version="1.10" Client="TestClient001" >
I want to get this TestClient001 or anything that's inside Client="xxxx"
I made this code but it's not working
Client = MID(text,FIND("Client=""",text)+1,FIND("""",text)-FIND("Client=""",text)-1)
Is there a way to specifically get the text inside Client="xxxx"?
There's no such function as Find in VBA - that's a worksheet function. The VBA equivalent is InStr, but I don't think you need to use it here.
The best tool for extracting one string from another in VBA is often Split. It takes one string and splits it into an array based on a delimiting string. The best part is that the delimiter doesn't have to be a single character - you can make it an entire string. In this case, we'd probably do well with two nested Split functions.
Client = Split(Split(text,"Client=""")(1),Chr(34))(0)
The inner Split breaks your text string where it finds "Client="". The (1) returns array element 1. Then the outer Split breaks that returned text where it finds a " character, and returns array element 0 as the final result.
For better maintainability, you may want to use constants for your delimiters as well.
Sub EnclosedTextTest()
Const csFlag1 As String = "Client="""
Const csFlag2 As String = """"
Const csSource As String = "< History Version=""1.10"" Client=""TestClient001"" >"
Dim strClient As String
strClient = Split(Split(csSource, csFlag1)(1), csFlag2)(0)
Debug.Print strClient
End Sub
However, if the Split method doesn't work for you, we can use a method similar to the one you were using, with InStr. There are a couple of options here as well.
InStr will return the position in a string that it finds a matching value. Like Split, it can be given an entire string as its delimiter; however, if you use more than one character you need to account for the fact that it will return where it finds the start of that string.
InStr(1,text,"Client=""")
will return 26, the start of the string "Client="" in the text. This is one of the places where it's helpful to have your delimiter stored in a constant.
intStart = InStr(1,text,csFlag1)+len(csFlag1)
This returns the position where the delimiter starts plus the length of the delimiter, which puts you at the first character of the text you want.
If you store this position in a variable, it makes the next part easier as well. You can use that position to run a second InStr and find the next occurrence of the " character.
intEnd = InStr(intStart,text,csFlag2)
With those values, you can perform your Mid. Your code overall will look something like this:
Sub InstrTextTest()
Const csFlag1 As String = "Client="""
Const csFlag2 As String = """"
Const csSource As String = "< History Version=""1.10"" Client=""TestClient001"" >"
Dim strClient As String
Dim intPos(0 To 1) As Integer
intPos(0) = InStr(1, csSource, csFlag1) + Len(csFlag1)
intPos(1) = InStr(intPos(0), csSource, csFlag2)
strClient = Mid(csSource, intPos(0), intPos(1) - intPos(0))
Debug.Print strClient
End Sub
This will work, but I prefer the Split method for ease of reading and reuse.
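For reuse, the InStr approach can be wrapped in a small function that also copes with a missing delimiter (the nested Split raises a "subscript out of range" error when the first delimiter is absent). This is only a sketch; the name TextBetween and its parameters are assumptions, not part of the answer above.

```vba
' Hypothetical helper: returns the text between startFlag and the next endFlag,
' or an empty string when either delimiter is not found.
Function TextBetween(ByVal source As String, ByVal startFlag As String, _
                     ByVal endFlag As String) As String
    Dim startPos As Long
    Dim endPos As Long

    startPos = InStr(1, source, startFlag)
    If startPos = 0 Then Exit Function        ' start delimiter missing

    startPos = startPos + Len(startFlag)      ' first character after the delimiter
    endPos = InStr(startPos, source, endFlag)
    If endPos = 0 Then Exit Function          ' end delimiter missing

    TextBetween = Mid$(source, startPos, endPos - startPos)
End Function
```

With the source string from the examples, TextBetween(csSource, "Client=""", """") should return TestClient001.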
You can use the Split function to split the string at the = character. Then take the last element of the resulting array and remove the quote characters and the > with the Replace function, and you will get the required output.
In the end I got it, thanks to the idea given by @alok and @Bigben:
Dim cl() As String
Dim ClientCode As String
If (InStr(1, temp, "Client=", vbTextCompare) > 0) Then
    cl = Split(temp, "=")
    ClientCode = cl(UBound(cl))
    ClientCode = Replace(ClientCode, """", "")
    ClientCode = Replace(ClientCode, ">", "")
End If
It's XML, so you could do this:
Dim sXML As String
sXML = "<History Version=""1.10"" Client=""TestClient001"">"
With CreateObject("MSXML.Domdocument")
.LoadXML Replace(sXML, ">", "/>") 'close the element
Debug.Print .FirstChild.Attributes.getnameditem("Client").Value
End With
I need to be able to remove all alphabetical characters from a string, leaving just the numbers behind.
I don't need to worry about any other characters like ,.?# and so on, just the letters of the alphabet a-z, regardless of case.
The closest I could get to a solution was the exact opposite, the below VBA is able to remove the numbers from a string.
Function removenumbers(ByVal input1 As String) As String
    Dim x
    Dim tmp As String
    tmp = input1
    For x = 0 To 9
        tmp = Replace(tmp, x, "")
    Next
    removenumbers = tmp
End Function
Is there any modification I can make to remove the letters rather than numbers to the above, or am I going at this completely wrong.
The letters could fall anywhere in the string, and there is no pattern to the strings.
Failing this I will use CTRL + H to remove all letters one by one, but may need to repeat this again each week so UDF would be much quicker.
I'm using Office 365 on Excel 16
Option Explicit
' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
Dim regex As New RegExp
Private Function rgclean(ByVal mystring As String) As String
    ' Removes every character in mystring that matches the regex pattern
    ' and returns the cleaned string
    With regex
        .Global = True ' replace all matches found in the string
        .Pattern = "[A-Z]" ' matches upper-case letters; use [A-Za-z] to match lower case as well
    End With
    rgclean = regex.Replace(mystring, "") ' replace each matched letter with ""
End Function
Try using a regular expression.
Make sure you enable the regular expression library under Tools > References > checkbox: "Microsoft VBScript Regular Expressions 5.5".
The function removes any character in [A-Z]; to include lower case, change the pattern to [A-Za-z] (.Pattern = "[A-Za-z]").
You just pass the string into the function, and the function uses the regular expression to remove every letter from the string.
Thanks
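If setting a library reference is not an option, the same result can be had without regular expressions. A minimal sketch, assuming a hypothetical function name RemoveLetters and the default Option Compare Binary (under Option Compare Text the [A-Za-z] range may also cover accented letters):

```vba
' Hypothetical helper: keeps every character that is not an ASCII letter.
Function RemoveLetters(ByVal source As String) As String
    Dim i As Long
    Dim ch As String
    Dim result As String

    For i = 1 To Len(source)
        ch = Mid$(source, i, 1)
        ' Like with a character class tests a single character here
        If Not (ch Like "[A-Za-z]") Then result = result & ch
    Next i

    RemoveLetters = result
End Function
```

As a worksheet UDF it would be called as =RemoveLetters(A1); digits, punctuation and spaces are left in place.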
Is there a way to check whether a string begins with any 4 letters? I am looking for something like this:
If string like "####*" then
'DO STUFF
end if
"#" is for digits, I need the same thing but for letters only.
Can this be done without regEx?
I don't know a way to do this without using regular expressions. We can try using regex Test along with the pattern ^[A-Z]{4}.*$:
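Dim inputText As String ' Input is a reserved word in VBA, so it cannot be used as a variable name
Dim regex As Object
Set regex = New RegExp ' requires a reference to Microsoft VBScript Regular Expressions 5.5
regex.Pattern = "^[A-Z]{4}.*$"
inputText = "ABCD blah"
If regex.Test(inputText) Then
    'DO STUFF
End If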
You can do it with Like in almost the same way as with RegEx.
A repeat count such as {4} doesn't exist for the Like operator, but "[A-Z]" is absolutely valid:
If str Like "[A-Z][A-Z][A-Z][A-Z]*" Then
    'DO STUFF
End If
Can this be done without regEx?
Yes. There is no specific need for regular expressions, since the Like operator is quite capable of handling the situation, as the writer of this article explains. Also, RegEx is somewhat slow on larger data sets. Nonetheless, RegEx is a great tool to use!
The solution provided by @AlexandruHapco tells you whether the string starts with 4 capital letters. To account for lower OR upper case, you can extend this logic:
If str Like "[a-zA-Z][a-zA-Z][a-zA-Z][a-zA-Z]*" Then
However, to shorten this a bit we can use [!charlist] to tell the operator we are looking for something that is NOT in the provided range. In other words, we could use:
If str Like "[!0-9][!0-9][!0-9][!0-9]*" Then
This last solution gives false positives when the string contains characters other than alphanumeric ones, since punctuation and spaces are not digits either.
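Since Like has no repeat count, the repeated character class can be generated rather than typed out. A short sketch, assuming a hypothetical function name StartsWithLetters and the default Option Compare Binary, so that [A-Za-z] covers only the ASCII letters:

```vba
' Hypothetical helper: True when the first n characters of s are ASCII letters.
Function StartsWithLetters(ByVal s As String, ByVal n As Long) As Boolean
    Dim likePattern As String
    ' Space$(n) gives n blanks; each blank is swapped for one character class.
    ' (String(n, "[A-Za-z]") would not work: it repeats only the first character.)
    likePattern = Replace(Space$(n), " ", "[A-Za-z]") & "*"
    StartsWithLetters = (s Like likePattern)
End Function
```

For example, StartsWithLetters("ABcd blah", 4) should be True, while StartsWithLetters("AB1d blah", 4) and StartsWithLetters("AB", 4) should be False.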
Approach using the FilterXML function
The worksheet function FilterXML() was added in Excel 2013 and lets you specify any XPath search string for a given XML document. The document doesn't have to be a locally saved file (which would need the WebService() function); it can be a string wrapped in well-formed opening and closing nodes, i.e. our test string with a few simple node additions (partly comparable to an HTML structure).
Example call
Sub TextXML()
Dim myString As String
myString = "ABCD blah"
If check(myString) Then
'DO STUFF
Debug.Print "okay"
Else
Debug.Print "oh no"
End If
End Sub
Help function
Function check(ByVal teststring As String) As Boolean
    ' unusual character, e.g. Chr(185): "¹" (a Const cannot be set from a function call, so use a variable)
    Dim s As String: s = Chr(185)
    On Error GoTo oops
    If Len(WorksheetFunction.FilterXML("<all><i>" & teststring & "</i></all>", "//i[substring(translate(.,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','" & _
        String(26, s) & "'),1,4)='" & String(4, s) & "']")) > 0 Then check = True
    Exit Function
oops:
    Err.Clear
End Function
tl;dr - how to use VBA in Excel versions before 2013
For the sake of completeness, here is the classic way to use XPath via XMLDOM methods:
Example call
Sub TextXML2()
Dim myString As String
myString = "ABCD blah"
If check2(myString) Then
'DO STUFF
Debug.Print "okay"
Else
Debug.Print "oh no"
End If
End Sub
Help functions
Function check2(ByVal teststring As String) As Boolean
' Purpose: check if first 4 characters of a test string are upper case letters A-Z
' [0] late bind XML document
Dim xDoc As Object
Set xDoc = CreateObject("MSXML2.DOMDocument.6.0")
' [1] form XML string by adding opening and closing node names ("tags")
teststring = "<all><i>" & teststring & "</i></all>"
' [2] load XML
If xDoc.LoadXML(teststring) Then
' [3a] list matching item(s) via XPath
Dim myNodeList As Object
Set myNodeList = xDoc.SelectNodes(XPath())
'Debug.Print teststring, " found: " & myNodeList.Length
' [3b] return true if the item matches, i.e. the list length is greater than zero
If myNodeList.Length > 0 Then check2 = True
End If
End Function
Function XPath() As String
' Purpose: create XPath string to get nodes where the first 4 characters are upper case letters A-Z
' Result: //i[substring(translate(.,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹'),1,4)="¹¹¹¹"]
' get UPPER case alphabet
Const ABC As String = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
' define replacement string consisting of an unusual character repeated 26 times
Const UNUSUAL As String = "¹" ' << replace with your preferred character
Dim replacement As String: replacement = String(Len(ABC), UNUSUAL)
'return XPath string
XPath = "//i[substring(translate(.,'" & ABC & "','" & replacement & "'),1,4)=""" & String(4, UNUSUAL) & """]"
End Function
To test a few characters, the first 4 in this case, you can always do the following:
' True only when none of the first four characters is a digit
If Not (Mid(str, 1, 1) Like "#" Or Mid(str, 2, 1) Like "#" _
    Or Mid(str, 3, 1) Like "#" Or Mid(str, 4, 1) Like "#") Then
    ' DO STUFF
End If
It's a bit more to type than a single Like pattern, but so what? Also, you can use Select Case in a loop...
Another option is to use IsNumeric(Mid(str, i, 1)) instead of Mid(str, i, 1) Like "#", etc.
Granted, this approach is still quite practical with 4 characters, but it is not as flexible or as scalable as RegEx.
I am currently building a number plate checker in an Excel spreadsheet that will determine whether the letters and numbers of the number plate are in the correct places and are valid.
The criterion is that the number plate matches one of these 3 formats:
(I have represented a number as 1 and a letter as A)
AAA111A
A111AAA
AA11AAA
The ultimate objective is for the program to ask the question "Look at these number plates, do they follow a format as shown above."
So far I have only been able to check to see if I have numbers in certain places, however I cannot specify the characters A - Z when trying to do a search function from the left, right and centre.
=ISNUMBER(--MID(A3,1,3))
If I wanted to search within a cell for example, the first character, is it a letter a-z, return true or false? How would I go about doing this?
An example in this instance might be:
DJO148R
The formula
=ISNUMBER(--MID(A5,4,3))
This would return TRUE because the 4th character is a number and so are the next 2.
With the same numberplate, how do I change it to search for letters rather than numbers within the numberplate?
Here is a simpler RegEx implementation. Make sure you add a reference to Microsoft VBScript Regular Expressions 5.5. This goes in a newly inserted module.
Function PlateCheck(cell As Range) As Boolean
    Dim rex As New RegExp
    ' A letter first and last, with five letters or digits in between
    ' (inside a character class | is a literal character, so it is left out)
    rex.Pattern = "^[A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][A-Z]$"
    PlateCheck = rex.Test(cell.Value)
End Function
As per the comments, here's how you do it with regex:
Make sure to include Microsoft VBScript Regular Expressions 5.5 as a reference.
To do that, in your VBA IDE, go to Tools > References and then tick the regex reference.
Then add this in a new module:
Function VerifyLicensePlate(ip As Range) As String
    Dim regex As New RegExp
    Dim inputstr As String: inputstr = ip.Value
    Dim i As Long
    With regex
        .Global = True
        .IgnoreCase = True
    End With
    ' ^ and $ anchor each pattern so that longer strings do not match
    Dim strpattern(2) As String
    strpattern(0) = "^[A-Z][A-Z][A-Z][0-9][0-9][0-9][A-Z]$"
    strpattern(1) = "^[A-Z][A-Z][0-9][0-9][A-Z][A-Z][A-Z]$"
    strpattern(2) = "^[A-Z][0-9][0-9][0-9][A-Z][A-Z][A-Z]$"
    For i = 0 To 2
        regex.Pattern = strpattern(i)
        If regex.Test(inputstr) Then
            VerifyLicensePlate = "Match"
            Exit Function
        Else
            VerifyLicensePlate = "No match"
        End If
    Next
End Function
Occam's Razor would suggest,
=NOT(ISNUMBER(--MID(A5,4,3)))
... or,
=ISERROR(--MID(A5,4,3))
Here's a version that uses late binding, so there is no need to set a reference. It is case insensitive, as that seemed to be implied in your question, but that is easily changed.
Option Explicit
Function MatchPattern(S As String) As Boolean
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
With RE
.Global = True
.Pattern = "\b(?:[A-Z]{3}\d{3}[A-Z]|[A-Z]{2}\d{2}[A-Z]{3}|[A-Z]\d{3}[A-Z]{3})\b"
.ignorecase = True
MatchPattern = .test(S)
End With
End Function
But, as pointed out by G Serg, you don't really need regex for this:
Option Explicit
Option Compare Text 'Case Insensitive
Function MatchPattern(S As String) As Boolean
Const S1 As String = "[A-Z][A-Z][A-Z]###[A-Z]"
Const S2 As String = "[A-Z]###[A-Z][A-Z][A-Z]"
Const S3 As String = "[A-Z][A-Z]##[A-Z][A-Z][A-Z]"
MatchPattern = False
If Len(S) = 7 Then
If S Like S1 Or _
S Like S2 Or _
S Like S3 Then _
MatchPattern = True
End If
End Function
Here is a rather complicated formula that seems to match your specifications:
=AND(LEN(A1)=7,
OR(MMULT(--(CODE(MID(A1,{1,2,3,4,5,6,7},1))>64),--(TRANSPOSE(CODE(MID(A1,{1,2,3,4,5,6,7},1))<91)))={4,5}),
CODE(LEFT(A1,1))>64,CODE(LEFT(A1,1))<91,
CODE(RIGHT(A1,1))>64,CODE(RIGHT(A1,1))<91,
ISNUMBER(-MID(A1,MIN(FIND({1,2,3,4,5,6,7,8,9,0},A1&"0123456789")),
7-MMULT(--(CODE(MID(A1,{1,2,3,4,5,6,7},1))>64),--(TRANSPOSE(CODE(MID(A1,{1,2,3,4,5,6,7},1))<91))))))
Ensure we have only seven characters
The OR(MMULT... function counts the number of letters and returns TRUE if four or five.
Check to make sure first and last character is a letter
There should remain a consecutive string of either two or three digits (seven less the number of letters)
If you want to make the formula case insensitive, replace the instances of A1 with UPPER(A1)
I think the UDF solution is better.