Excel formula function to remove spaces between characters - excel

In an Excel sheet I built a form for customers to fill out. One cell is for the customer's email address, and I need to validate that cell as thoroughly as I can. I am nearly there; this is what I have so far:
' this formula is for email structuring
=ISNUMBER(MATCH("*#*.???",A5,0))
' this formula checks whether there are spaces at the start or the end
=IF(LEN(A5)>LEN(TRIM(A5)),FALSE,TRUE)
But if I write, for example, (admin#ad min.com), the second formula does not detect the space inside the email address. Any clue?

Use SUBSTITUTE()
=IF(LEN(A5)>LEN(SUBSTITUTE(A5," ","")),FALSE,TRUE)

How about:
=IF(LEN(A5)>LEN(SUBSTITUTE(A5," ","")),FALSE,TRUE)
Or, based on Jeeped's comment:
=A5=SUBSTITUTE(A5," ","")

You can use VBA to perform validation using regular expressions - after removing any spaces.
Option 1
Returning a Boolean True/False
Public Function validateEmail(strEmail As String) As Boolean
    ' Remove spaces
    strEmail = Replace(strEmail, " ", "")
    ' Validate email using regular expressions
    With CreateObject("VBScript.RegExp")
        .ignorecase = True
        .Pattern = "^[-.\w]+#[-.\w]+\.\w{2,5}$"
        If .test(strEmail) Then validateEmail = True
    End With
End Function
This can be used as a normal worksheet function such as:
=validateEmail("yourEmail#test.com")
=validateEmail($A1)
It can also be called from VBA:
Debug.Print validateEmail("yourEmail#test.com")
Option 2
Returning the email itself, or False
If you would prefer that it returns the validated email instead of a Boolean (True/False), then you can do something like:
Public Function validateEmail(strEmail As String) As Variant
    ' Remove spaces
    strEmail = Replace(strEmail, " ", "")
    ' Validate email using regular expressions
    With CreateObject("VBScript.RegExp")
        .ignorecase = True
        .Pattern = "^[-.\w]+#[-.\w]+\.\w{2,5}$"
        If .test(strEmail) Then
            validateEmail = strEmail
        Else
            validateEmail = False
        End If
    End With
End Function
So, in a worksheet, =validateEmail("yourEmail # test.com") returns the string yourEmail#test.com. If the email is invalid, as in validateEmail("yourEmailtest.com"), it returns False.
Why use regular expressions? Checking for a simple # in the string is only a minimal workaround. An input such as ()#&&*^$#893---------6584.ido would pass your =ISNUMBER(MATCH("*#*.???",A5,0)) formula, yet it is obviously not a valid email. There is no way to validate an email with 100% certainty, but this at least ensures that the email could be valid.

Related

What's the best way to keep regex matches in Excel?

I'm working off of the excellent information provided in "How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops", however I'm running into a wall trying to keep the matched expression, rather than the un-matched portion:
"2022-02-14T13:30:00.000Z" converts to "T13:30:00.000Z" instead of "2022-02-14" when the function is used in a spreadsheet. Listed below is the code, which was taken from "How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops". I thought a negation of strPattern2 would work, but I'm still having issues. Any help is greatly appreciated.
Function simpleCellRegex(Myrange As Range) As String
Dim regEx As New RegExp
Dim strPattern As String
Dim strPattern2 As String
Dim strInput As String
Dim strReplace As String
Dim strOutput As String
strPattern = "^T{0-9][0-9][:]{0-9][0-9][:]{0-9][0-9][0-9][Z]"
strPattern2 = "^(19|20)\d\d([- /.])(0[1-9]|1[012])\2(0[1-9]|[12][0-9]|3[01])"
If strPattern2 <> "" Then
strInput = Myrange.Value
strReplace = ""
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern2
End With
If regEx.test(strInput) Then
simpleCellRegex = regEx.Replace(strInput, strReplace)
Else
simpleCellRegex = "Not matched"
End If
End If
End Function
Replace is very powerful, but you need to do two things:
Specify all the characters you want to drop, if your regexp is <myregexp>, then change it to ^.*?(<myregexp>).*$ assuming you only have one date occurrence in your string. The parentheses are called a 'capturing group' and you can refer to them later as part of your replacement pattern. The ^ at the beginning and the $ at the end ensure that you will only match one occurrence of your pattern even if Global=True. I noticed you were already using a capturing group as a back-reference - you need to add one to the back-reference number because we added a capturing group. Setting up the pattern this way, the entire string will participate in the match and we will use the capturing groups to preserve what we want to keep.
Change your strReplace="" to strReplace="$1", indicating you want to replace whatever was matched with the contents of capturing group #1.
Here is a screenprint from Excel using my RegexpReplace User Defined Function to process your example with my suggestions:
I had to fix up your time portion regexp because you used curly brackets three times where you meant square, and you left out the seconds part completely. Notice by adjusting where you start and end your capturing group parentheses you can keep or drop the T & Z at either end of the time string.
Also, if your program is being passed system timestamps from a reliable source, then they are already well-formed and you don't need those long regular expressions to reject March 32. You can code both parts in one as
([-0-9/.]{10,10})T([0-9:.]{12,12})Z
and use $1 when you want the date part and $2 when you want the time part.
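As a minimal sketch of the two changes described in this answer, the function below wraps the question's date pattern in an outer capturing group and replaces the whole match with $1. The function name KeepDatePart is hypothetical, and late binding via CreateObject is an assumption made so that no library reference is needed. Because the outer group becomes group 1, the original back-reference \2 is renumbered to \3.

```vba
' Hypothetical helper: keeps only the date part of a timestamp string.
Public Function KeepDatePart(ByVal strInput As String) As String
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")   ' late binding (assumption)

    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        ' ^.*?( ... ).*$ makes the whole string take part in the match;
        ' group 1 is the date, and the separator back-reference is now \3.
        .Pattern = "^.*?((19|20)\d\d([- /.])(0[1-9]|1[012])\3(0[1-9]|[12][0-9]|3[01])).*$"
    End With

    If regEx.Test(strInput) Then
        ' Replace everything that matched with the contents of group 1.
        KeepDatePart = regEx.Replace(strInput, "$1")
    Else
        KeepDatePart = "Not matched"
    End If
End Function
```

With the question's input, KeepDatePart("2022-02-14T13:30:00.000Z") returns 2022-02-14.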

How do you lookup a value in Excel against a list of wildcard possibilities? AHRI wildcard information

I've been racking my brain for the past few weeks trying to "decode" a list of products and determine an easy way to check a value against what I would best describe as an "input mask."
The table that contains the value I'm looking for contains a value that has multiple wildcards.
Ex. 48FC**04***5* or 48LC**04*2*1A*****
The value I'd be searching for is an active/inactive flag in the same table as the wildcard. But, I only have the actual string in the cell of my lookup value. Ex. 48FC000406850
The wildcards are not in the same places most of the time, and the length of each string varies throughout.
The approximation parameters in VLOOKUP, XLOOKUP, and INDEX/MATCH will get me there 80% of the way, but I can't afford this to be less than 100% accurate. It's also too difficult to predefine every value; there are 130,000 unique "masks".
In the table below, I used the approximation for VLOOKUP and it returned the wrong result.
String           Active/Inactive   Lookup Value           Result
48FC**04***1*    Active            48FC0660001            48FC**06***6*
48FC**04***3*    Active            Formula                Expected Result
48FC**04***5*    Active            =VLOOKUP(C2,B:B,1,1)   48FC**06***1*
48FC**04***6*    Active
48FC**05***1*    Active
48FC**05***3*    Active
48FC**05***5*    Active
48FC**05***6*    Active
48FC**06***1*    Inactive
48FC**06***3*    Inactive
48FC**06***5*    Inactive
48FC**06***6*    Inactive
Replace each "*" in the masks with "?" (the usual single-character wildcard).
To check if values match your masks:
' Returns True if pSearched matches pPattern with wildcards
' Late-bound via CreateObject, so no reference to
' "Microsoft VBScript Regular Expressions 5.5" is required
' Note: the pattern is not anchored, so a match anywhere in pSearched returns True
Function RegexMatch(ByVal pSearched As String, ByVal pPattern As String, _
                    Optional ByVal pIgnoreCase As Boolean = True) As Boolean
    Dim rRegex As Object
    Set rRegex = CreateObject("VBScript.RegExp")
    ' Translate wildcards to regex: literal dot, then * and ?
    pPattern = Replace(pPattern, ".", "[.]")
    pPattern = Replace(pPattern, "*", ".*")
    pPattern = Replace(pPattern, "?", ".")
    With rRegex
        .IgnoreCase = pIgnoreCase
        .Pattern = pPattern
    End With
    RegexMatch = rRegex.test(pSearched)
End Function
' ------------------------------------------------------------------------
Sub TestMe()
    Debug.Print RegexMatch("48FC660001", "48FC??0001*") 'True
    Debug.Print RegexMatch("48FC660001", "48FC*000*") 'True
End Sub
In D2, enter the formula:
=INDEX(A2:A13,MATCH(1,INDEX(0+ISNUMBER(SEARCH(A2:A13,C2)),0),0))
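The advice to turn each "*" into "?" can also be sketched with VBA's built-in Like operator, which compares the whole string and treats "?" as exactly one character. This assumes every "*" in a mask stands for exactly one character, which is how the question's example 48FC**04***5* lines up with 48FC000406850. The function name MaskMatch is hypothetical. Like also gives special meaning to "#" and "[", which do not appear in the masks shown here.

```vba
' Hypothetical helper: True when pValue fits pMask over its full length.
' Assumption: each "*" in the mask stands for exactly one character.
Public Function MaskMatch(ByVal pValue As String, ByVal pMask As String) As Boolean
    ' Like matches the entire string, so no extra anchoring is needed.
    MaskMatch = pValue Like Replace(pMask, "*", "?")
End Function
```

For example, MaskMatch("48FC000406850", "48FC**04***5*") returns True, while MaskMatch("48FC000406850", "48FC**04***1*") returns False because the twelfth character differs.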

VBA - Parsing Date from Free Form Text String

I am attempting to parse out clean target DATES from cells populated with free form TEXT STRINGS.
For example, TEXT STRING: "ETA: 11/22 (Spring 4.5)" or "ETA 10/30/2019 EOD"
As you can see, there is no clear standard for the position of the date in the string, rendering LEFT or RIGHT formulas futile.
I tried leveraging a VBA function that I found which essentially breaks up the string into parts based on spaces in the string; however it has not been working.
Public Function GetDate(ResNotes As String) As Date
Dim TarDate As Variant
Dim part As Variant
TarDate = Split(ResNotes, " ")
For Each part In ResNotes
If IsDate(part) = True Then
GetDate = part
Exit Function
End If
Next
GetDate = "1/1/2001"
End Function
I'm referring to the cells with text strings as "ResNotes", short for "Resolution Notes" which is the title of the column
"TarDate" refers to the "Target Date" that I am trying to parse out
The result of the custom GETDATE function in Excel gives me a #NAME? error.
I expected the result to give me something along the lines of "10/30/2019"
Unless you need VBA for some other part of your project, this can also be done using worksheet formulas:
=AGGREGATE(15,6,DATEVALUE(MID(SUBSTITUTE(A1," ",REPT(" ",99)),seq_99,99)),1)
where seq_99 is a named formula and refers to:
=IF(ROW($A$1:INDEX($A:$A,255,1))=1,1,(ROW($A$1:INDEX($A:$A,255,1))-1)*99)
*seq_99 generates an array of numbers {1;99;198;297;396;495;...
Format the cell with the formula as a Date of some type.
If there are no dates, it will return an error which you can either leave, or wrap the function in an IFERROR(your_formula,your_error_message)
Algorithm
Split the cell on the spaces
Replace each space with 99 spaces
Using the MID function, return an array of substrings 99 characters long
Apply the DATEVALUE function which will return either an error (if the substring is not a date) or a date serial number.
Since dates in Excel are serial numbers since 1/1/1900, we can use the AGGREGATE function to pick out a value, and ignore errors.
If you are getting #NAME?, then the code is not stored in a general module. It should NOT be in a worksheet module or the ThisWorkbook module.
Also, there are a few errors in the code. Split returns a String array, and the loop must iterate over that array (TarDate) rather than over the input string ResNotes. Since IsDate returns TRUE/FALSE, the = True is not needed.
As per #MathieuGuindon, we can convert the string to a date in the code if one is found and return an error if not. For that we need to allow the return value to be a Variant.
Public Function GetDate(ResNotes As String)
    Dim TarDate() As String
    Dim part As Variant
    TarDate = Split(ResNotes, " ")
    For Each part In TarDate
        If IsDate(part) Then
            GetDate = CDate(part)
            Exit Function
        End If
    Next
    GetDate = "1/1/2001"
    'Instead of a hard coded date, one can return an error, just use the next line instead
    'GetDate = CVErr(xlErrValue)
End Function
Approach isolating the date string via Filter function
Just for fun another approach demonstrating the use of the Filter function in combination with Split to isolate the date string and split it into date tokens in a second step; finally these tokens are transformed to date using DateSerial:
Function getDat(rng As Range, Optional ByVal tmp = " ") As Variant
If rng.Cells.count > 1 Then Set rng = rng.Cells(1, 1) ' allow only one cell ranges
If Len(rng.value) = 0 Then getDat = vbNullString: Exit Function ' escape empty cells
' [1] analyze cell value; omitted year tokens default to current year
' (valid date strings must include at least one slash, "11/" would be interpreted as Nov 1st)
tmp = Filter(Split(rng.Value2, " "), Match:="/", include:=True) ' isolate Date string
tmp = Split(Join(tmp, "") & "/" & Year(Now), "/") ' split Date tokens
' [2] return date
Const M% = 0, D% = 1, Y& = 2 ' order of date tokens
getDat = VBA.DateSerial(Val(tmp(Y)), Val(tmp(M)), _
IIf(tmp(D) = vbNullString, 1, Val(tmp(D))))
End Function

how to format many emails with regex using vba

With VBA, I want to validate a list of emails separated by semicolons. Every email must end with #customercurrency.com, and the user can enter two, three, four or as many emails as they want.
Example : aung#customercurrency.com;thet#customercurrency.com;htoo#customercurrency.com
My code is below, but something in it must be wrong.
Public Function ValidateEmailAddressWithSemi(ByRef strEmailAddress As String) As Boolean
'Create Regular expression object
Dim objRegExp As New RegExp
'Set Case insensitive
objRegExp.IgnoreCase = True
objRegExp.pattern = "^\s?([_a-z0-9-]+(.[a-z0-9-]+)#customconcurrency.com)+([;.]([_a-z0-9-]+(.[a-z0-9-]+)#customconcurrency.com)*$"
ValidateEmailAddress = objRegExp.Test(strEmailAddress)
End Function
Try this pattern:
"^\s?([_a-z0-9-]+(.[a-z0-9-]+)#customercurrency.com)+([;.]([_a-z0-9-]+(.[a-z0-9-]+)#customercurrency.com))*$"
(The original had a typo in the domain name, customconcurrency instead of customercurrency, and a closing parenthesis was missing.)
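An alternative that may be easier to maintain is to split the list on semicolons and test each entry on its own. The sketch below is only one way to do it: the function name AllEmailsValid and the simplified per-entry pattern are assumptions, and the separator is written as # to follow the patterns in this thread. Note also that the question's code assigns its result to ValidateEmailAddress although the function is named ValidateEmailAddressWithSemi, so as posted it would not return the test result.

```vba
' Hypothetical helper: True only if every semicolon-separated entry
' ends with #customercurrency.com and has a plausible local part.
Public Function AllEmailsValid(ByVal strEmails As String) As Boolean
    Dim re As Object
    Dim part As Variant

    Set re = CreateObject("VBScript.RegExp")   ' late binding (assumption)
    re.IgnoreCase = True
    ' Simplified pattern (assumption): dot-separated words, then the fixed domain.
    re.Pattern = "^[_a-z0-9-]+(\.[_a-z0-9-]+)*#customercurrency\.com$"

    ' An empty input is treated as invalid.
    If Len(Trim$(strEmails)) = 0 Then Exit Function

    For Each part In Split(strEmails, ";")
        ' Stop at the first entry that fails; the result stays False.
        If Not re.Test(Trim$(part)) Then Exit Function
    Next part

    AllEmailsValid = True
End Function
```

With the example list from the question this returns True; a list containing an entry from another domain, or ending in a stray semicolon, returns False.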

Excel: Extract text from cell where text is always #.#

I have a bunch of text in cells but many of the cells contain some text in the format of #.# (where # is actually a number from 0-9).
I'm using this formula which works okay, but sometimes there is junk in the cell that causes the formula to return the wrong information.
=MID(B7,(FIND({"."},B7,1)-1),3)
For instance, sometimes a cell contains: "abc (1st. list) testing 8.7 yay". Thus I end up with t. instead of the desired 8.7.
Any ideas?
Thank you!
Here is a User Defined Function that will return a numeric pattern in the string if and only if it matches the pattern you describe. If the pattern you describe is not exactly representative, you'll need to provide a better example:
Option Explicit
Function reValue(S As String)
    Dim RE As Object, MC As Object
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True
        .Pattern = "\b\d\.\d\b"
        If .test(S) = True Then
            Set MC = .Execute(S)
            reValue = CDbl(MC(0))
        Else
            reValue = ""
        End If
    End With
End Function
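One thing to be aware of: CDbl follows the regional settings, so on a system that uses a comma as the decimal separator it may not read "8.7" as intended. The variant below is a sketch using Val, which always treats the period as the decimal separator. The function name FirstDecimalPair is hypothetical.

```vba
' Hypothetical variant: returns the first #.# match as a number, or "" if none.
Public Function FirstDecimalPair(ByVal S As String) As Variant
    Dim re As Object, mc As Object

    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\b\d\.\d\b"        ' one digit, a period, one digit

    Set mc = re.Execute(S)           ' Global is False, so at most one match
    If mc.Count > 0 Then
        ' Val reads "." as the decimal separator regardless of locale.
        FirstDecimalPair = Val(mc(0).Value)
    Else
        FirstDecimalPair = ""
    End If
End Function
```

For the question's example, FirstDecimalPair("abc (1st. list) testing 8.7 yay") returns 8.7.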
