What's the best way to keep regex matches in Excel? - excel

I'm working from the excellent information in "How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops", but I've hit a wall trying to keep the matched expression rather than the unmatched portion:
"2022-02-14T13:30:00.000Z" converts to "T13:30:00.000Z" instead of "2022-02-14" when the function is used in a spreadsheet. The code below was taken from "How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops". I thought negating strPattern2 would work, but I'm still having issues. Any help is greatly appreciated.
Function simpleCellRegex(Myrange As Range) As String
    Dim regEx As New RegExp
    Dim strPattern As String
    Dim strPattern2 As String
    Dim strInput As String
    Dim strReplace As String
    Dim strOutput As String
    strPattern = "^T{0-9][0-9][:]{0-9][0-9][:]{0-9][0-9][0-9][Z]"
    strPattern2 = "^(19|20)\d\d([- /.])(0[1-9]|1[012])\2(0[1-9]|[12][0-9]|3[01])"
    If strPattern2 <> "" Then
        strInput = Myrange.Value
        strReplace = ""
        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = strPattern2
        End With
        If regEx.test(strInput) Then
            simpleCellRegex = regEx.Replace(strInput, strReplace)
        Else
            simpleCellRegex = "Not matched"
        End If
    End If
End Function

Replace is very powerful, but you need to do two things:
Specify all the characters you want to drop, if your regexp is <myregexp>, then change it to ^.*?(<myregexp>).*$ assuming you only have one date occurrence in your string. The parentheses are called a 'capturing group' and you can refer to them later as part of your replacement pattern. The ^ at the beginning and the $ at the end ensure that you will only match one occurrence of your pattern even if Global=True. I noticed you were already using a capturing group as a back-reference - you need to add one to the back-reference number because we added a capturing group. Setting up the pattern this way, the entire string will participate in the match and we will use the capturing groups to preserve what we want to keep.
Change your strReplace="" to strReplace="$1", indicating you want to replace whatever was matched with the contents of capturing group #1.
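The two changes above can be sketched as a complete function. This is a minimal sketch, not the screenshot's RegexpReplace function: the name KeepDateMatch is made up for illustration, and it uses late binding (CreateObject) so no library reference is assumed.

```vba
' Hypothetical helper: keeps only the date found in the input string.
Function KeepDateMatch(ByVal strInput As String) As String
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp") ' late binding, no reference needed

    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        ' The added outer parentheses are group 1, so the separator group
        ' moves from 2 to 3 and the back-reference \2 becomes \3.
        .Pattern = "^.*?((19|20)\d\d([- /.])(0[1-9]|1[012])\3(0[1-9]|[12][0-9]|3[01])).*$"
    End With

    If regEx.Test(strInput) Then
        ' The whole string takes part in the match; $1 keeps only the date.
        KeepDateMatch = regEx.Replace(strInput, "$1")
    Else
        KeepDateMatch = "Not matched"
    End If
End Function
```

With "2022-02-14T13:30:00.000Z" in A1, =KeepDateMatch(A1) should return 2022-02-14.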
Here is a screenprint from Excel using my RegexpReplace User Defined Function to process your example with my suggestions:
I had to fix up your time portion regexp because you used curly brackets three times where you meant square, and you left out the seconds part completely. Notice by adjusting where you start and end your capturing group parentheses you can keep or drop the T & Z at either end of the time string.
Also, if your program is being passed system timestamps from a reliable source then they are already well-formed and you don't need those long, long regular expressions to reject March 32. You can code both parts in one as
([-0-9/.]{10,10})T([0-9:.]{12,12})Z and when you want the date part use $1 and when you want the time part use $2.
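As a rough sketch of that combined pattern in use, assuming well-formed timestamps like the one in the question: the function name TimestampPart and its part argument are invented for this example.

```vba
' Hypothetical helper: part = 1 returns the date, part = 2 returns the time.
Function TimestampPart(ByVal strInput As String, ByVal part As Long) As String
    TimestampPart = "Not matched"
    If part <> 1 And part <> 2 Then Exit Function

    With CreateObject("VBScript.RegExp")
        .Pattern = "([-0-9/.]{10,10})T([0-9:.]{12,12})Z"
        If .Test(strInput) Then
            ' "$1" keeps the date group, "$2" keeps the time group.
            ' Any text outside the matched timestamp is left in place.
            TimestampPart = .Replace(strInput, "$" & part)
        End If
    End With
End Function
```

For the example string, =TimestampPart(A1,1) should give 2022-02-14 and =TimestampPart(A1,2) should give 13:30:00.000.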

Related

Delete character after ':' but not delete the rest

For example my excel column is
CodeandPrice
3&12|4&200|2&
5&|2&
4&|
2&12|35&744
Here & separates the code from the price, and | separates two items.
I want to only get the code, so character before &.
CodeandPrice
3&|4&|2&
5&|2&
4&|
2&|35&
I've googled, but all I found was how to remove every character after or before a delimiter. What I want is
to remove the characters after each & but not everything, since another code follows.
For Excel 2010, maybe the easiest approach is to quickly throw together a UDF and invoke it as a function in your sheet, for example:
Function RegexReplace(s_in As String, pat As String, repl As String) As String
    With CreateObject("vbscript.regexp")
        .Global = True
        .Pattern = pat
        RegexReplace = .Replace(s_in, repl)
    End With
End Function
Invoke like: =RegexReplace(A2,"&\d+","&")
If one happens to have newer functionality available try:
Formula in B2:
=MAP(A2:A5,LAMBDA(a,TEXTJOIN("|",,TAKE(TEXTSPLIT(a,"&","|"),,1)&"&")))

Excel formula function to remove spaces between characters

In an Excel sheet I built a form for customers to fill out. One cell is for the customer's email address, and I want to validate it as thoroughly as I can. I am nearly there; this is what I have so far:
' this formula checks the email structure
=ISNUMBER(MATCH("*#*.???",A5,0))
' this formula checks for spaces at the start and the end
=IF(LEN(A5)>LEN(TRIM(A5)),FALSE,TRUE)
But if I write, for example, (admin#ad min.com), the second formula does not detect the space inside the email address. Any clue?
Use SUBSTITUTE()
=IF(LEN(A5)>LEN(SUBSTITUTE(A5," ","")),FALSE,TRUE)
based on Jeeped's comment:
=A5=SUBSTITUTE(A5," ","")
You can use VBA to perform validation using regular expressions - after removing any spaces.
Option 1
Returning a Boolean True/False
Public Function validateEmail(strEmail As String) As Boolean
    ' Remove spaces
    strEmail = Replace(strEmail, " ", "")
    ' Validate email using regular expressions
    With CreateObject("VBScript.RegExp")
        .IgnoreCase = True
        .Pattern = "^[-.\w]+#[-.\w]+\.\w{2,5}$"
        If .Test(strEmail) Then validateEmail = True
    End With
End Function
This can be used as a normal worksheet function such as:
=validateEmail("yourEmail#test.com")
=validateEmail($A1)
It can also be used in VBA:
Debug.Print validateEmail("yourEmail#test.com")
Option 2
Returning the email itself, or False
If you would prefer that it returns the validated email instead of a Boolean (true/False), then you can do something like:
Public Function validateEmail(strEmail As String) As Variant
    ' Remove spaces
    strEmail = Replace(strEmail, " ", "")
    ' Validate email using regular expressions
    With CreateObject("VBScript.RegExp")
        .IgnoreCase = True
        .Pattern = "^[-.\w]+#[-.\w]+\.\w{2,5}$"
        If .Test(strEmail) Then
            validateEmail = strEmail
        Else
            validateEmail = False
        End If
    End With
End Function
So, as a worksheet function, =validateEmail("yourEmail # test.com") will return the string yourEmail#test.com. However, if the email is invalid, as in validateEmail("yourEmailtest.com"), it will return False.
Why use regular expressions? Checking for a simple # in the string is only a minimal way to validate an email. An input such as ()#&&*^$#893---------6584.ido would match your =ISNUMBER(MATCH("*#*.???",A5,0)) formula, yet it is obviously not a valid email. There is no way to validate an email with 100% certainty, but this does a decent job of at least ensuring the email could be valid.

Excel: Extract text from cell where text is always #.#

I have a bunch of text in cells but many of the cells contain some text in the format of #.# (where # is actually a number from 0-9).
I'm using this formula which works okay, but sometimes there is junk in the cell that causes the formula to return the wrong information.
=MID(B7,(FIND({"."},B7,1)-1),3)
For instance, sometimes a cell contains: "abc (1st. list) testing 8.7 yay". Thus I end up with t. instead of the desired 8.7.
Any ideas?
Thank you!
Here is a User Defined Function that will return a numeric pattern in the string if and only if it matches the pattern you describe. If the pattern you describe is not exactly representative, you'll need to provide a better example:
Option Explicit
Function reValue(S As String)
    Dim RE As Object, MC As Object
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True
        .Pattern = "\b\d\.\d\b"
        If .test(S) = True Then
            Set MC = .Execute(S)
            reValue = CDbl(MC(0))
        Else
            reValue = ""
        End If
    End With
End Function

Trying to parse excel string

I am trying to parse a string from TeamSpeak. I am new to Excel functions. I have accomplished this with PHP, but I am driving myself nuts with Excel. This is the string I am trying to parse:
[URL=client://4792/noEto+VRGdhvT9/iV375Ck1ZIfo=~Rizz]Rizz[/URL]
This is what I have accomplished so far:
=TRIM(MID(B22, 15, FIND("=",B22,12) - FIND("//",B22)))
which returns
4792/noEto+VRGdhvT9/iV375Ck1ZIfo=~
I am trying to get it to return:
noEto+VRGdhvT9/iV375Ck1ZIfo=
Any suggestions? I have looked into splitting strings, but the phrasing is really confusing. Any help would be appreciated.
Paste the URL in A3, then this formula in B3. You can adjust the cell references as needed. It's a lot of nested functions, but it works.
=left(right(A3, len(A3)-find("/",A3,find("//",A3,1)+2)),find("=",right(A3, len(A3)-find("/",A3,find("//",A3,1)+2)),1))
Or you can use a user-defined function in VBA:
Function RegexExtract(myRange As Range) As String
    'VBA Editor, menu Tools - References, add reference to Microsoft VBScript Regular Expressions 5.5
    Dim regex As New RegExp, allMatches As MatchCollection
    With regex
        .Global = True
        .Pattern = "\d+/(.+=)"
    End With
    Set allMatches = regex.Execute(myRange.Value)
    With allMatches
        If .Count = 1 Then
            RegexExtract = .Item(0).SubMatches(0)
        Else
            RegexExtract = "N/A"
        End If
    End With
End Function
Then use it as formula:
=RegexExtract(A1)
I am trying to parse a string
For that:
=MID(A1,20,28)
works.
Now, if you have more than one string, the others may not follow an identical pattern, so the above might not work for them. In that case, to help you we would need to know something about the shape of the others.

Identify pattern in words

I have a question I believe is quite simple, but I don't know the proper way to do it.
Basically, I would like my program to be able to identify words with a certain pattern in it, and if so, to extract what's before the pattern.
The pattern would be, in this case /F, specifically at the end of the word, and it would extract what's before.
For example, if the program finds 21/F, it will identify it as a good match and will extract 21. But if the word was 21/Fudge, it wouldn't do anything.
Do you know the way to look for a match at a specific position in the word?
I would do:
If str Like "*/F" Then
    before = Left(str, Len(str) - Len("/F"))
Else
    'No match!
End If
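The same end-of-word check could be wrapped in a reusable function. This is only a sketch: the name TextBeforeSuffix is invented here, and it swaps the Like operator for a plain Right$ comparison so the suffix is always treated literally.

```vba
' Hypothetical helper: returns the text before suffix,
' or an empty string when word does not end with suffix.
Function TextBeforeSuffix(ByVal word As String, ByVal suffix As String) As String
    If Len(word) >= Len(suffix) And Right$(word, Len(suffix)) = suffix Then
        TextBeforeSuffix = Left$(word, Len(word) - Len(suffix))
    Else
        TextBeforeSuffix = ""
    End If
End Function
```

TextBeforeSuffix("21/F", "/F") should return 21, while TextBeforeSuffix("21/Fudge", "/F") should return an empty string because the word does not end with /F.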
I would use a regular expression, something like this:
\b\w*?(\d+)\/F\b
This matches the digits immediately before "/F" at the end of a word and ignores any earlier part of the word. The \w*? (rather than \w+?) lets the word begin with the digits, so 21/F captures 21 and not just 1. To use it in VBA you will need to add a reference to 'Microsoft VBScript Regular Expressions 5.5'; the VBA is below. Pattern is "\b\w*?(\d+)/F\b"
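Public Sub Extract(Pattern As String, Text As String)
    Dim regEx As VBScript_RegExp_55.RegExp
    Dim matches As VBScript_RegExp_55.MatchCollection
    Set regEx = CreateObject("VBScript.RegExp") ' Create a regular expression.
    regEx.Global = True ' Return every match, not only the first one.
    regEx.Pattern = Pattern
    Set matches = regEx.Execute(Text)
    Dim i As Long
    For i = 0 To (matches.Count - 1)
        ' SubMatches(0) is the first capturing group (the digits before "/F").
        ' This assumes the pattern contains at least one capturing group.
        Debug.Print matches.Item(i).SubMatches(0)
    Next i
End Sub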
Hope this helps.

Resources