How to get string in between two characters in excel/spreadsheet - excel

I have this string
Weiss,Emery/Ap #519-8997 Quam. Street/Hawaiian Gardens,IN - 79589|10/13/2010
How do I get only "Hawaiian Gardens"?
I already tried using
=mid(left(A1,find("/",A1)-1),find(",",A1)+1,len(A1))
but it gives me "Emery" instead.

If there are always two slashes before the string you want to extract, then based on Tyler M's answer you can use this:
=MID(E1,
FIND("~",SUBSTITUTE(E1,"/","~",2))+1,
FIND(",",RIGHT(E1,LEN(E1)-FIND("~",SUBSTITUTE(E1,"/","~",2))))-1
)
This substitutes the second occurrence of / with a character that would not normally occur in the address, which makes that position findable.
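The same sentinel trick can be sketched outside Excel. The Python below is only an illustration and not part of the original answer; the names `replace_nth` and `between_second_slash_and_comma` are made up for this example, and `~` is assumed not to occur in the data.

```python
def replace_nth(text: str, old: str, new: str, n: int) -> str:
    """Replace only the n-th occurrence of `old`, like SUBSTITUTE(text, old, new, n)."""
    pos = -1
    for _ in range(n):
        pos = text.find(old, pos + 1)
        if pos == -1:
            return text  # fewer than n occurrences: leave the text unchanged
    return text[:pos] + new + text[pos + len(old):]


def between_second_slash_and_comma(text: str, sentinel: str = "~") -> str:
    """Return the text after the second '/' up to (not including) the next ','."""
    marked = replace_nth(text, "/", sentinel, 2)
    start = marked.find(sentinel)
    if start == -1:
        return ""  # no second slash found
    rest = marked[start + 1:]
    end = rest.find(",")
    return rest if end == -1 else rest[:end]


address = "Weiss,Emery/Ap #519-8997 Quam. Street/Hawaiian Gardens,IN - 79589|10/13/2010"
print(between_second_slash_and_comma(address))  # Hawaiian Gardens
```

As with the formula, this breaks if the sentinel character already appears earlier in the string, so pick one that cannot occur in your data.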

Was your intention to also include Google Sheets (looking at your title)? If so, you can use the REGEXEXTRACT() function. For example, in B1:
=REGEXEXTRACT(A1,"\/([\w\s]*)\,")
In Excel you could build a UDF using this regex rule like so (as an example):
Function REGEXEXTRACT(S As String, PTRN As String) As String
    'We will get the last possible match in your string...
    Dim regex As Object, matches As Object
    Dim Match As Object, subMatch As Variant
    Set regex = CreateObject("VBScript.RegExp")
    With regex
        .Pattern = PTRN
        .Global = True
    End With
    Set matches = regex.Execute(S)
    'Each pass overwrites the result, so the last submatch of the last match wins
    For Each Match In matches
        If Match.SubMatches.Count > 0 Then
            For Each subMatch In Match.SubMatches
                REGEXEXTRACT = subMatch
            Next subMatch
        End If
    Next Match
End Function
Call the function in B1 like so:
=REGEXEXTRACT(A1,"\/([\w\s]*)\,")
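If you want to check the pattern before putting it in a sheet, here is a rough Python equivalent of the UDF above. It is a sketch only: `regex_extract_last` is a made-up name, and Python's `re` engine is assumed to treat this particular pattern the same way as VBScript.RegExp and Google Sheets do.

```python
import re


def regex_extract_last(text: str, pattern: str) -> str:
    """Return the last capture group of the last match, or '' if nothing matched."""
    result = ""
    for match in re.finditer(pattern, text):
        for group in match.groups():
            # A group that did not take part in the match is None in Python
            result = group if group is not None else ""
    return result


address = "Weiss,Emery/Ap #519-8997 Quam. Street/Hawaiian Gardens,IN - 79589|10/13/2010"
print(regex_extract_last(address, r"/([\w\s]*),"))  # Hawaiian Gardens
```

The first slash does not produce a match because `#` is neither a word character nor whitespace, so only the text between the second slash and the comma is captured.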

Related

Remove characters A-Z from string [duplicate]

I need to be able to remove all alphabetical characters from a string, leaving just the numbers behind.
I don't need to worry about any other characters like ,.?# and so on, just the letters of the alphabet a-z, regardless of case.
The closest I could get to a solution was the exact opposite, the below VBA is able to remove the numbers from a string.
Function removenumbers(ByVal input1 As String) As String
Dim x
Dim tmp As String
tmp = input1
For x = a To Z
tmp = Replace(tmp, x, "")
Next
removenumbers = tmp
End Function
Is there any modification I can make to the above to remove the letters rather than the numbers, or am I going about this completely wrong?
The letters could fall anywhere in the string, and there is no pattern to the strings.
Failing this I will use CTRL + H to remove all letters one by one, but may need to repeat this again each week so UDF would be much quicker.
I'm using Office 365 on Excel 16
Option Explicit
Dim regex As New RegExp
Private Function rgclean(ByVal mystring As String) As String
    'Removes every character that matches the regex pattern and returns the cleaned string
    With regex
        .Global = True 'replace all matches found in the string, not just the first
        .Pattern = "[A-Z]" 'matches upper-case letters only; use "[A-Za-z]" to match lower case as well
    End With
    rgclean = regex.Replace(mystring, "") 'each matched letter is replaced with ""
End Function
Try using a regular expression.
Make sure you enable the regular expression library: Tools > References > tick "Microsoft VBScript Regular Expressions 5.5".
The function removes anything matching [A-Z]; to include lower-case letters as well, change the pattern to [A-Za-z] (.Pattern = "[A-Za-z]").
You just pass the string into the function, and it uses the regular expression to remove every matching letter from the string.
Thanks
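For comparison, the same replace-all-letters idea looks like this in Python. This is only a sketch to show what the pattern does; `remove_letters` is a made-up name and the `[A-Za-z]` variant is used because the question asks for both cases.

```python
import re


def remove_letters(text: str) -> str:
    """Delete every ASCII letter, leaving digits and punctuation untouched."""
    return re.sub(r"[A-Za-z]", "", text)


print(remove_letters("ab12C3?d"))  # 123?
```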

Removing Whole Numbers from an Alphanumeric String

I'm having trouble finding a way to remove floating integers from a cell without removing numbers attached to the end of my string. Could I get some help as to how to approach this issue?
For example, in the image attached, instead of:
john123 456 hamilton, I want:
john123 hamilton
This can be done using regular expressions. You will match on the data you want to remove, then replace this data with an empty string.
Since you didn't provide any code, all I can do is provide a function that you can adapt to your own project. It can be called from VBA or used as a worksheet function, such as =ReplaceFloatingIntegers(A1).
You will need to add a reference to Microsoft VBScript Regular Expressions 5.5 by going to Tools, References in the VBE menu.
Function ReplaceFloatingIntegers(Byval inputString As String) As String
With New RegExp
.Global = True
.MultiLine = True
.Pattern = "(\b\d+\b\s?)"
If .Test(inputString) Then
ReplaceFloatingIntegers = .Replace(inputString, "")
Else
ReplaceFloatingIntegers = inputString
End If
End With
End Function
Breaking down the pattern
( ... ) This is a capturing group. It is not strictly required here, because .Replace() replaces the entire match, not just the captured group.
\b This is a word boundary. We use this because we want to test from edge to edge of any 'words' (which includes words that contain only digits in our case).
\d+\b This matches any digit (\d), one or more times (+), up to the next word boundary (\b).
\s? This matches a single whitespace character if one is present; the ? makes it optional.
You can look at this personalized Regex101 page to see how this matches your data. Anything matched here is replaced with an empty string.
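A quick way to see the pattern at work without opening Excel is the Python sketch below. It is an illustration only: `remove_standalone_integers` is a made-up name, and the capturing group is dropped because it is not needed for a plain replace.

```python
import re


def remove_standalone_integers(text: str) -> str:
    """Remove whole-word runs of digits plus one trailing whitespace character."""
    return re.sub(r"\b\d+\b\s?", "", text)


print(remove_standalone_integers("john123 456 hamilton"))  # john123 hamilton
```

The digits in john123 survive because there is no word boundary between the n and the 1.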

excel formula find part number in file path text string

I have an extract of all the files on a network drive, and some of the file names contain a part number in the format 0000-000000-00. There are 600,000+ path names in this file, and I'm trying to figure out how to extract the part numbers from them. I think a MID formula might work, but I am at a loss as to how to tell it to find anything in the part-number format 0000-000000-00 and extract only those 14 characters from the path.
input looks like this
c:\users\stuff\folder_name\1234-000001-01_ baskets_1.pdf
c:\users\stuff\folder_name\1234-000001-02_ baskets_2.pdf
c:\users\stuff\folder_name\1234-000001-03_ baskets_3.pdf
c:\users\stuff\folder_name\1234-000030-01_ tree_30.pdf
c:\users\stuff\folder_name\random text_1234-000030-02_ tree_30.pdf
c:\users\stuff\folder_name\more random stuff_1234-000030-02_ tree_30.pdf
output I'm hoping for
1234-000001-01
1234-000001-02
1234-000001-03
1234-000030-01
Since you have a pattern we can exploit, use this:
=MID(A1,SEARCH("????-??????-??",A1),14)
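This finds the start of the pattern and returns the 14 characters beginning there. Note that ? matches any single character, not only digits, so a path containing other hyphens spaced the same way would also match.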
You wanted a formula but a UDF could also be used to apply a regex to get the pattern (a little overkill in this instance but worth being aware of):
Option Explicit
Public Sub GetCustomString()
Dim i As Long, tests()
tests = Array("c:\users\stuff\folder_name\1234-000001-01_ baskets_1.pdf", _
"c:\users\stuff\folder_name\1234-000001-02_ baskets_2.pdf", _
"c:\users\stuff\folder_name\1234-000001-03_ baskets_3.pdf", _
"c:\users\stuff\folder_name\1234-000030-01_ tree_30.pdf", _
"c:\users\stuff\folder_name\random text_1234-000030-02_ tree_30.pdf", _
"c:\users\stuff\folder_name\more random stuff_1234-000030-02_ tree_30.pdf")
For i = LBound(tests) To UBound(tests)
Debug.Print GetString(tests(i))
Next
End Sub
Public Function GetString(ByVal inputString As String) As String
Dim arr() As String, i As Long, matches As Object, re As Object
Set re = CreateObject("VBScript.RegExp")
With re
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = "\d{4}-\d{6}-\d{2}"
If .test(inputString) Then
GetString = .Execute(inputString)(0)
Else
GetString = vbNullString
End If
End With
End Function
Using UDF in sheet:
Pattern: \d{4}-\d{6}-\d{2}
Explanation:
\d{4} matches a digit (equal to [0-9])
{4} Quantifier — Matches exactly 4 times
"-" matches the character - literally (case sensitive)
\d{6} matches a digit (equal to [0-9])
{6} Quantifier — Matches exactly 6 times
"-" matches the character - literally (case sensitive)
\d{2} matches a digit (equal to [0-9])
{2} Quantifier — Matches exactly 2 times
Global pattern flags:
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
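The pattern can also be tried outside Excel. The Python sketch below is not part of the original answer; `get_part_number` is a made-up name, and `[0-9]` is used instead of `\d` so that only ASCII digits match.

```python
import re

# Four digits, hyphen, six digits, hyphen, two digits
PART_NUMBER = re.compile(r"[0-9]{4}-[0-9]{6}-[0-9]{2}")


def get_part_number(path: str) -> str:
    """Return the first part number found in the path, or '' if there is none."""
    match = PART_NUMBER.search(path)
    return match.group(0) if match else ""


paths = [
    r"c:\users\stuff\folder_name\1234-000001-01_ baskets_1.pdf",
    r"c:\users\stuff\folder_name\random text_1234-000030-02_ tree_30.pdf",
    r"c:\users\stuff\folder_name\no part number here.pdf",
]
for p in paths:
    print(get_part_number(p))
```

Unlike the ? wildcard formula, this only matches when the characters between the hyphens are actually digits.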

Search for a specific number (in inch) where the number isn't part of a larger expression

I want to use an Excel formula that returns the correct index in the cell when the text in the cell contains the term '2"' (two inch). This is possible with the search function.
The catch is that I only want to find instances where it's actually '2"', not cases where you have other expressions such as '1/2"' or '12"'. See the image below for an example to clarify where search works and where it doesn't.
I think a VBA solution using Regular Expressions will be easiest in order to be able to return measurements like 1 1/2".
To enter this User Defined Function (UDF), alt-F11 opens the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.
To use this User Defined Function (UDF), enter a formula like
=FindMeasure(A1,$E$1)
in some cell, where E1 contains a value like 2" or 1 1/2"
Option Explicit
Function FindMeasure(sSearch As String, ByVal sMeasure As String)
Dim RE As Object, MC As Object, SM As Variant
Dim sPat As String
sPat = "\D(\s+)" & sMeasure & "|^" & sMeasure
Set RE = CreateObject("vbscript.regexp")
With RE
.Global = False
.MultiLine = True
.Pattern = sPat
End With
If RE.test(sSearch) = True Then
Set MC = RE.Execute(sSearch)
SM = MC(0).submatches(0)
FindMeasure = MC(0).firstindex + Len(SM) + IIf(Len(SM) > 0, 2, 1)
Else
FindMeasure = 0
End If
End Function
EDIT: Reviewing my answer reveals that under certain circumstances, incorrect results will be returned.
If there is a "word" preceding the measurement which ends with a digit, the routine will fail to recognize the measurement. This can be avoided by ensuring that there is at least one non-digit in the string preceding the measurement (by modifying the regex). However, if the entire word consists of digits, the measurement will not be recognized.
If the line starts with a SPACE, the measurement will not be recognized. This can be corrected by modifying both the code and the regex to account for that possibility.
If the cell containing the measurement, or the cell containing the string, is blank, then the result will be incorrect. This can be avoided by testing for those conditions, by modifying the code.
Modified Code
Option Explicit
Function FindMeasure(sSearch As String, ByVal sMeasure As String)
Dim RE As Object, MC As Object, SM As Variant
Dim sPat As String
sPat = "(\S*\D\S*\s+)" & sMeasure & "|(^\s*)" & sMeasure
Set RE = CreateObject("vbscript.regexp")
With RE
.Global = False
.MultiLine = True
.Pattern = sPat
End With
If RE.test(sSearch) = True And _
Len(sSearch) > 0 And _
Len(sMeasure) > 0 Then
Set MC = RE.Execute(sSearch)
SM = MC(0).submatches(0) & MC(0).submatches(1)
FindMeasure = MC(0).firstindex + Len(SM) + 1
Else
FindMeasure = 0
End If
End Function
Explanation of Regex with sMeasure = 2"
(\S*\D\S*\s+)2"|(^\s*)2"
Options: Case insensitive; ^$ match at line breaks
Match this alternative (\S*\D\S*\s+)2"
    Match the regex below and capture its match into backreference number 1 (\S*\D\S*\s+)
        Match a single character that is NOT a “whitespace character” \S*
            Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
        Match a single character that is NOT a “digit” \D
        Match a single character that is NOT a “whitespace character” \S*
            Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
        Match a single character that is a “whitespace character” \s+
            Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
    Match the character string “2"” literally 2"
Or match this alternative (^\s*)2"
    Match the regex below and capture its match into backreference number 2 (^\s*)
        Assert position at the beginning of a line ^
        Match a single character that is a “whitespace character” \s*
            Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
    Match the character string “2"” literally 2"
Created with RegexBuddy
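To experiment with the modified pattern outside VBA, here is a Python sketch of the same logic. It is an approximation, not the answer's code: `find_measure` is a made-up name, and unlike the UDF it escapes the measurement with `re.escape` so that any regex metacharacters in it are taken literally.

```python
import re


def find_measure(text: str, measure: str) -> int:
    """Return the 1-based position of a standalone `measure` in `text`, or 0."""
    if not text or not measure:
        return 0
    m_esc = re.escape(measure)  # assumption: treat the measurement as literal text
    pattern = r"(\S*\D\S*\s+)" + m_esc + r"|(^\s*)" + m_esc
    match = re.search(pattern, text, flags=re.MULTILINE)
    if match is None:
        return 0
    # Exactly one of the two groups takes part in a match; the other is None
    prefix = (match.group(1) or "") + (match.group(2) or "")
    return match.start() + len(prefix) + 1


print(find_measure('pipe 2" long', '2"'))  # 6
print(find_measure('1/2" pipe', '2"'))     # 0
```

The prefix length is added to the start of the match because the match begins at the preceding word, not at the measurement itself.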
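Use:
=SEARCH(" 2"""," " & A1)-1
There are three tiny tricks associated with this formula:
1) we search for {space}2" rather than just 2"
2) we place a blank at the start of the string, so a 2" at the very start is also found
3) we account for the blank by subtracting one from the position
As written, the result is the position of the space just before the 2" (0 when the cell starts with 2"), and the formula returns #VALUE! when there is no match.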
EDIT#1:
This may be better. With data in A3 and the string in B2 try:
=IFERROR(IF(LEFT(A3,2)=$B$2,1,SEARCH(" " & $B$2, A3)+1),0)
If I'm reading this right, the result should be "OK" when any of the following holds, and zero otherwise:
1) the string starts with 2" (followed by a space)
2) there is a 2" in the middle of the string (with a space on each side)
3) the string ends with 2" (preceded by a space)
If those are the requirements, this formula should work:
=IF(OR(LEFT(A2,3)="2"" ",ISNUMBER(SEARCH(" 2"" ",A2)),RIGHT(A2,3)=" 2"""),"OK",0)
-- or, depending on your requirements, you may have to use this for the second row --
=IF(OR(LEFT(A2,2)="2""",ISNUMBER(SEARCH(" 2"" ",A2)),RIGHT(A2,3)=" 2"""),"OK",0)
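The three conditions behind the first of these formulas can be written out as a small Python sketch. This is illustrative only; `has_standalone_measure` is a made-up name and it returns True/False rather than "OK"/0.

```python
def has_standalone_measure(text: str, measure: str = '2"') -> bool:
    """True if `measure` appears at the start, middle or end, delimited by spaces."""
    return (
        text.startswith(measure + " ")      # 1) at the start, followed by a space
        or (" " + measure + " ") in text    # 2) in the middle, a space on each side
        or text.endswith(" " + measure)     # 3) at the end, preceded by a space
    )


for sample in ['2" pipe', 'pipe 2" long', 'pipe 2"', '1/2" pipe', '12" pipe']:
    print(sample, has_standalone_measure(sample))
```

A cell containing nothing but 2" satisfies none of the three conditions, which is the case the second formula's shorter LEFT test is meant to cover.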

Excel ISTEXT alternative?

I am currently building a numberplate checker in an Excel spreadsheet that will determine whether the letters and numbers of a numberplate are in the correct places and are valid.
The three criteria I have are that the numberplate is in one of these formats:
(I have represented a number as 1 and a letter as A)
AAA111A
A111AAA
AA11AAA
The ultimate objective is for the program to ask the question "Look at these number plates, do they follow a format as shown above."
So far I have only been able to check to see if I have numbers in certain places, however I cannot specify the characters A - Z when trying to do a search function from the left, right and centre.
=ISNUMBER(--MID(A3,1,3))
If I wanted to search within a cell for example, the first character, is it a letter a-z, return true or false? How would I go about doing this?
An example in this instance might be:
DJO148R
The formula
=ISNUMBER(--MID(A5,4,3))
This returns TRUE because the 4th character is a number and so are the next 2.
With the same numberplate, how do I change it to search for letters rather than numbers within the numberplate?
Here is a simpler RegEx implementation. Make sure you include a reference to Microsoft VBScript Regular Expressions 5.5. This goes in a newly inserted module:
Function PlateCheck(cell As Range) As Boolean
    Dim rex As New RegExp
    'Loose check: a letter, then five letters or digits, then a letter.
    'Inside a character class "|" is a literal pipe, not "or", so it is left out.
    rex.Pattern = "[A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][A-Z]"
    PlateCheck = rex.Test(cell.Value)
End Function
As per the comments, here's how you do it with regex:
Make sure to include Microsoft VBScript Regular Expressions 5.5 as a reference.
To do that, in your VBA IDE, go to Tools > References and tick the regex reference.
Then add this in a new module:
Function VerifyLicensePlate(ip As Range) As String
    Dim regex As New RegExp
    Dim inputstr As String: inputstr = ip.Value
    Dim i As Long
    With regex
        .Global = True
        .IgnoreCase = True
    End With
    'One pattern per allowed format; ^ and $ make each pattern cover the whole string
    Dim strpattern(2) As String
    strpattern(0) = "^[A-Z][A-Z][A-Z][0-9][0-9][0-9][A-Z]$"
    strpattern(1) = "^[A-Z][A-Z][0-9][0-9][A-Z][A-Z][A-Z]$"
    strpattern(2) = "^[A-Z][0-9][0-9][0-9][A-Z][A-Z][A-Z]$"
    VerifyLicensePlate = "No match"
    For i = 0 To 2
        regex.Pattern = strpattern(i)
        If regex.Test(inputstr) Then
            VerifyLicensePlate = "Match"
            Exit Function
        End If
    Next
End Function
Occam's Razor would suggest,
=NOT(ISNUMBER(--MID(A5,4,3)))
... or,
=ISERROR(--MID(A5,4,3))
Here's a version that uses late binding, so there is no need to set a reference. It is case insensitive, as that seemed to be implied in your question, but that is easily changed.
Option Explicit
Function MatchPattern(S As String) As Boolean
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
With RE
.Global = True
.Pattern = "\b(?:[A-Z]{3}\d{3}[A-Z]|[A-Z]{2}\d{2}[A-Z]{3}|[A-Z]\d{3}[A-Z]{3})\b"
.ignorecase = True
MatchPattern = .test(S)
End With
End Function
But, as pointed out by G Serg, you don't really need regex for this:
Option Explicit
Option Compare Text 'Case Insensitive
Function MatchPattern(S As String) As Boolean
Const S1 As String = "[A-Z][A-Z][A-Z]###[A-Z]"
Const S2 As String = "[A-Z]###[A-Z][A-Z][A-Z]"
Const S3 As String = "[A-Z][A-Z]##[A-Z][A-Z][A-Z]"
MatchPattern = False
If Len(S) = 7 Then
If S Like S1 Or _
S Like S2 Or _
S Like S3 Then _
MatchPattern = True
End If
End Function
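The three plate formats can also be tested outside Excel. The Python sketch below is not from any of the answers; `is_valid_plate` is a made-up name, and it assumes plates are plain ASCII and that case should be ignored.

```python
import re

# AAA111A, AA11AAA or A111AAA (A = letter, 1 = digit)
PLATE = re.compile(
    r"[A-Z]{3}[0-9]{3}[A-Z]|[A-Z]{2}[0-9]{2}[A-Z]{3}|[A-Z][0-9]{3}[A-Z]{3}",
    re.IGNORECASE | re.ASCII,
)


def is_valid_plate(plate: str) -> bool:
    """True if the whole string is in one of the three allowed formats."""
    return PLATE.fullmatch(plate) is not None


for plate in ["DJO148R", "A123BCD", "AB12CDE", "AB123CD", "DJO148RX"]:
    print(plate, is_valid_plate(plate))
```

Using `fullmatch` plays the same role as the Len(S) = 7 check in the Like version: extra characters before or after the plate make the test fail.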
Here is a rather complicated formula that seems to match your specifications:
=AND(LEN(A1)=7,
OR(MMULT(--(CODE(MID(A1,{1,2,3,4,5,6,7},1))>64),--(TRANSPOSE(CODE(MID(A1,{1,2,3,4,5,6,7},1))<91)))={4,5}),
CODE(LEFT(A1,1))>64,CODE(LEFT(A1,1))<91,
CODE(RIGHT(A1,1))>64,CODE(RIGHT(A1,1))<91,
ISNUMBER(-MID(A1,MIN(FIND({1,2,3,4,5,6,7,8,9,0},A1&"0123456789")),
7-MMULT(--(CODE(MID(A1,{1,2,3,4,5,6,7},1))>64),--(TRANSPOSE(CODE(MID(A1,{1,2,3,4,5,6,7},1))<91))))))
Ensure we have only seven characters
The OR(MMULT... function counts the number of letters and returns TRUE if four or five.
Check to make sure first and last character is a letter
There should remain a consecutive string of either two or three digits (seven less the number of letters)
If you want to make the formula case insensitive, replace the instances of A1 with UPPER(A1)
I think the UDF solution is better.

Resources