Removing particular string from a cell - excel

I have text in a range of cells like
Manufacturer#||#Coaster#|#|Width (side to side)#||#20" W####Height (bottom to top)#||#35" H#|#|Depth (front to back)#||#20.5" D####Seat Depth#||#14.25"**#|#|Material & Finish####**Composition#||#Wood Veneers & Solids#|#|Composition#||#Metal#|#|Style Elements####Style#||#Contemporary#|#|Style#||#Casual
From this cell I need to remove only the strings between #|#| and #### (as in #|#|"needtoremove"####), without affecting the other strings.
I have tried Find and Replace, finding #|#|*#### and replacing it with #|#|. However, it does not give the exact result.
Can anyone help me?

The other solution will remove anything between the first #|#| and ####, even the #||# etc.
If you need to remove the text between #|#| and #### only when there is no #||# in between, I think the simplest way is to use a regex.
You will need to activate the Microsoft VBScript Regular Expressions 5.5 library in Tools->References from the VBA editor.
Change Range("D166") to wherever your cell is. The expression as it stands ("#\|#\|[A-Za-z0-9& ]*####") matches any text that starts with #|#|, ends with #### and has any number of alphanumeric characters, & or spaces in between. You can add other characters between the brackets if needed.
Sub remove()
Dim reg As New RegExp
Dim pattern As String
Dim replace As String
Dim strInput As String
strInput = Range("D166").Value
replace = ""
pattern = "#\|#\|[A-Za-z0-9& ]*####"
With reg
.Global = True
.MultiLine = True
.IgnoreCase = False
.pattern = pattern
End With
If reg.test(strInput) Then Range("D166").Value = reg.replace(strInput, replace)
End Sub
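If adding the library reference is not an option, the same pattern can probably be applied with late binding instead. The following is only a sketch, assuming Excel on Windows where VBScript.RegExp is available; the names StripBetweenDelimiters and CleanSelection are made up for illustration and are not part of the answer above.

```vba
Option Explicit

'Hypothetical helper: removes every "#|#|text####" run from s.
'Late binding via CreateObject, so no library reference is needed.
Function StripBetweenDelimiters(ByVal s As String, _
                                Optional ByVal replacement As String = "") As String
    Dim reg As Object
    Set reg = CreateObject("VBScript.RegExp")
    With reg
        .Global = True
        .IgnoreCase = False
        .Pattern = "#\|#\|[A-Za-z0-9& ]*####"
    End With
    StripBetweenDelimiters = reg.Replace(s, replacement)
End Function

'Applies the helper to every text cell in the current selection.
Sub CleanSelection()
    Dim c As Range
    For Each c In Selection.Cells
        If VarType(c.Value) = vbString Then
            c.Value = StripBetweenDelimiters(c.Value)
        End If
    Next c
End Sub
```

Passing "#|#|" as the second argument keeps the opening delimiter, which is what the Find and Replace attempt in the question was aiming for.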

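Something like this, if that value is in cell A1. It removes the text between the first #|#| and the first #### that follows it, keeping both delimiters:
Sub RemoveBetweenDelimiters()
Dim str As String
Dim i As Long
Dim i2 As Long
Dim ws As Excel.Worksheet
Set ws = Application.ActiveSheet
str = ws.Range("A1").Value
i = InStr(str, "#|#|")
If i = 0 Then Exit Sub 'no opening delimiter found
i2 = InStr(i + 4, str, "####") 'first #### after the opening delimiter
If i2 = 0 Then Exit Sub 'no closing delimiter found
'keep both delimiters, drop only the text between them
str = Left(str, i + 3) & Mid(str, i2)
ws.Range("A1").Value = str
End Sub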

Related

VBA extract all fieldname within a string

I have a string like this one :
"'where CAST(a.DT_NPE_SORTIE as integer ) < cast (add_months (cast (a.dt_nep_restructuration as date format 'YYYYMMDD'), 12) as integer) and a.DT_NPE_SORTIE is not null and a.DT_NPE_SORTIE <> '99991231' and a.dt_npe_restructuration is not null and a.dt_npe_restructuration <> '99991231'"
I need to extract all "a.FIELDNAME" tokens, such as a.dt_nep_restructuration and a.DT_NPE_SORTIE, from the string above.
I need to do this in VBA for a project at work.
So far I used If & InStr to check if a list of value is present in the string. But it will be easier for me to extract all a.FIELDNAME then check if they match with fieldname in an array.
Best regards,
jouvzer
I saw there are many with "NPE" and one with "NEP"? Is that a typo? If it is, then will it always start with a.dt_nep_...? – Siddharth Rout 9 mins ago
No it's a typo in order to raise an error for my vba function. It will always start with "a." – Jouvzer 6 mins ago
I have handled both NPE/NEP. Is this what you are trying?
Option Explicit
Private Sub simpleRegex()
Dim strPattern As String: strPattern = "a\.dt_(nep|npe)_\w+" 'dot escaped so it matches a literal period
Dim regEx As Object
Dim strInput As String
Dim inputMatches As Object
Dim i As Long
strInput = "'where CAST(a.DT_NPE_SORTIE as integer ) < cast (add_months (cast (a.dt_nep_restructuration as date format 'YYYYMMDD'), 12) as integer) and a.DT_NPE_SORTIE is not null and a.DT_NPE_SORTIE <> '99991231' and a.dt_npe_restructuration is not null and a.dt_npe_restructuration <> '99991231'"
Set regEx = CreateObject("VBScript.RegExp")
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = True
.Pattern = strPattern
End With
Set inputMatches = regEx.Execute(strInput)
If inputMatches.Count <> 0 Then
For i = 0 To inputMatches.Count - 1
Debug.Print inputMatches.Item(i)
Next i
End If
End Sub
Note: If it always starts with a.dt, then you can also use a\.dt_\w+_\w+
Here is one RegExp based code which will print output for cell A1 in immediate window.
Public Sub FindMatches()
Dim oRgEx As Object, oMatches As Object
Dim i As Long
Set oRgEx = CreateObject("VBScript.RegExp")
With oRgEx
.Global = True
.MultiLine = True
.Pattern = "\ba\.[A-z_]+\b" 'caution: A-z also matches [ \ ] ^ and `, which sit between Z and a in ASCII
Set oMatches = .Execute(Range("A1").Value)
If oMatches.Count <> 0 Then
For i = 0 To oMatches.Count - 1
Debug.Print oMatches.Item(i)
Next i
End If
End With
End Sub
Depending on your actual data, you can adjust this further and adapt in your code.
Note: I am a basic level user of RegExp so you may want to consider suggestions indicated below such as \ba\.[A-Za-z_]+\b or \ba\.\w+\b if you get unusual results.
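Since the goal in the question is to compare the extracted names against a list, it may help to collapse the duplicates first. The following is a sketch only, assuming Windows (for VBScript.RegExp and Scripting.Dictionary) and the \ba\.\w+\b pattern mentioned in the note above; UniqueFieldNames is a hypothetical name.

```vba
Option Explicit

'Hypothetical helper: returns the distinct "a.FIELD" names found in sql.
'Names that differ only by case are treated as the same field.
Function UniqueFieldNames(ByVal sql As String) As Variant
    Dim re As Object, dict As Object, m As Object
    Set re = CreateObject("VBScript.RegExp")
    Set dict = CreateObject("Scripting.Dictionary")
    dict.CompareMode = vbTextCompare   'must be set while the dictionary is empty

    re.Global = True
    re.IgnoreCase = True
    re.Pattern = "\ba\.\w+\b"

    For Each m In re.Execute(sql)
        If Not dict.Exists(m.Value) Then dict.Add m.Value, True
    Next m

    UniqueFieldNames = dict.Keys   'zero-based Variant array
End Function

Sub TestUniqueFieldNames()
    Debug.Print Join(UniqueFieldNames("a.DT_NPE_SORTIE and a.dt_nep_x and a.dt_npe_sortie"), vbNewLine)
End Sub
```

The returned array can then be looped over and checked against the array of valid field names, which would also surface a misspelling such as the deliberate NEP/NPE one.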
Simple alternative via Filter()
Filtering the Split() elements of the string for the identifying prefix a. already yields an array with all wanted elements (see sections a) and b)).
A final cosmetic step removes any unnecessary characters before the identifying prefix "a." (see section c)).
Function ExtractFieldnames(s As String)
Const PREFIX As String = "a."
'a) split string into tokens
Dim tmp() As String
tmp = Split(s, " ")
'b) leave only elements that include fieldnames
tmp = Filter(tmp, PREFIX, True)
'c) let them start with "a."
Dim i As Long
For i = 0 To UBound(tmp)
tmp(i) = PREFIX & Split(tmp(i), PREFIX)(1)
Next
'd) return array as function result
ExtractFieldnames = tmp
End Function
Example call
Sub TestExtract()
Dim s As String
s = "'where CAST(a.DT_NPE_SORTIE as integer ) < cast (add_months (cast (a.dt_nep_restructuration as date format 'YYYYMMDD'), 12) as integer) and a.DT_NPE_SORTIE is not null and a.DT_NPE_SORTIE <> '99991231' and a.dt_npe_restructuration is not null and a.dt_npe_restructuration <> '99991231'"
Debug.Print Join(ExtractFieldnames(s), vbNewLine)
End Sub
Results in VB Editor's immediate window
a.DT_NPE_SORTIE
a.dt_nep_restructuration
a.DT_NPE_SORTIE
a.DT_NPE_SORTIE
a.dt_npe_restructuration
a.dt_npe_restructuration

In VBA, how to extract the string before a number from the text

From ActiveWorkbook.name, I would like to extract the strings that are before (left side of ) the numbers. Since I want to use the same code in multiple workbooks, the file names would be variable, but every file name has date info in the middle (yyyymmdd).
In case of excel file, I can use the below formula, but can I apply the same kind of method in VBA?
=LEFT(A1,MIN(FIND({0,1,2,3,4,5,6,7,8,9},ASC(A1)&1234567890))-1)
Example: MyExcelWorkbook_Management_20200602_MyName.xlsm
In above case, I want to extract "MyExcelWorkbook_Management_".
The most basic thing you could do is to replicate something that worked for you in Excel through Evaluate:
Sub Test()
Dim str As String: str = "MyExcelWorkbook_Management_20200602_MyName.xlsm"
Debug.Print Evaluate(Replace("=LEFT(""X"",MIN(FIND({0,1,2,3,4,5,6,7,8,9},ASC(""X"")&1234567890))-1)", "X", str))
End Sub
Pretty? Not really, but it does the job, though it has its limitations.
You could use Regular Expressions to extract any letters / underscores before the number as well
Dim str As String
str = "MyExcelWorkbook_Management_20200602_MyName.xlsm"
With CreateObject("vbscript.regexp")
.Pattern = "^\D*"
.Global = True
MsgBox .Execute(str)(0)
End With
Gives:
MyExcelWorkbook_Management_
So basically you want to use the Mid function to look for the first numeric character in your input string, and then cut your input string at that position.
That means we need to loop through the string from left to right, look at one character at a time and see if it is a digit or not.
This code does exactly that:
Option Explicit
Sub extratLeftText()
Dim someString As String
Dim result As String
someString = "Hello World1234"
Dim i As Long
Dim c As String 'one character of your string
For i = 1 To Len(someString)
c = Mid(someString, i, 1)
If IsNumeric(c) Then 'Left(someString, 0) returns "", so a digit in the first position needs no special case
result = Left(someString, i - 1)
Exit For
End If
Next i
MsgBox result
End Sub
The last thing you need to do is to pass a workbook name into your VBA code. Generally this is done with the .Name property of the Workbook object:
Sub workbookName()
Dim wb As Workbook
Set wb = ActiveWorkbook
MsgBox wb.Name
End Sub
Of course you would need to find some way to replace the Set wb = ActiveWorkbook line with code that suits your purpose.
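Putting the two pieces together, the loop can be wrapped in a function that takes the workbook name as an argument. This is a sketch under a couple of assumptions: TextBeforeFirstDigit and ShowWorkbookPrefix are hypothetical names, only the ASCII digits 0-9 count as numbers, and a name without any digit is returned whole (the loop above would return an empty string in that case).

```vba
Option Explicit

'Hypothetical helper: returns everything before the first digit 0-9 in s,
'or the whole string when it contains no digit.
Function TextBeforeFirstDigit(ByVal s As String) As String
    Dim i As Long
    For i = 1 To Len(s)
        If Mid$(s, i, 1) Like "[0-9]" Then
            TextBeforeFirstDigit = Left$(s, i - 1)
            Exit Function
        End If
    Next i
    TextBeforeFirstDigit = s
End Function

Sub ShowWorkbookPrefix()
    MsgBox TextBeforeFirstDigit(ActiveWorkbook.Name)
End Sub
```

Because the function only receives a string, the same code can be reused in any workbook by passing ActiveWorkbook.Name or ThisWorkbook.Name.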

How to remove the last character of a word in a text string and insert to another cell using VBA in Excel?

Everything is working except for that little comma in the 5th word. How to remove that? My code is as follows.
The text looks like this: The data as of 20.12.2019, and so on.
I only want 20.12.2019 without that comma. Any clue? Thanks.
Public Function FindWord(Source As String, Position As Integer)
Dim arr() As String
arr = VBA.Split(Source, " ")
xCount = UBound(arr)
If xCount < 1 Or (Position - 1) > xCount Or Position < 0 Then
FindWord = ""
Else
FindWord = arr(Position - 1)
End If
End Function
This subroutine calls the function:
Sub InsertDate()
Sheets("Sheet1").Range("B3").Value = FindWord(Sheets("Sheet2").Range("A2"), 5)
End Sub
So just for fun, a short introduction to regular expressions (which, by no means, I am an expert in):
Sub Test()
Dim str As String: str = "The data as of 20.12.2019, and so on."
Dim regex As Object: Set regex = CreateObject("VBScript.RegExp")
regex.Pattern = "\b(\d{2}\.\d{2}\.\d{4})" 'dots escaped so they match literal periods only
regex.Global = True
Debug.Print regex.Execute(str)(0)
End Sub
This would be good practice if your string won't follow that same pattern all the time. However when it does, there are some other good alternatives mentioned in comments and answers.
One option is to Replace:
Sub InsertDate()
With Sheets("Sheet1").Range("B3")
.Value = FindWord(Sheets("Sheet2").Range("A2"), 5)
.Value = Replace(.Value, ",", "")
End With
End Sub
This is still text-that-looks-like-a-date, so you can call DateValue to convert it.
.Value = Replace(.Value, ",", "")
.Value = DateValue(.Value) '<~ add this line
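One caveat: DateValue interprets text according to the regional settings of the machine, so "20.12.2019" may fail or be misread on systems that do not use that day.month.year format. A possible workaround is to split the text and build the date explicitly. This is a sketch only; ParseDottedDate is a hypothetical name, and it assumes the comma has already been removed and the input is always dd.mm.yyyy.

```vba
Option Explicit

'Hypothetical helper: converts "dd.mm.yyyy" text to a Date without
'relying on the regional settings that DateValue uses.
Function ParseDottedDate(ByVal s As String) As Date
    Dim parts() As String
    parts = Split(Trim$(s), ".")
    If UBound(parts) <> 2 Then Err.Raise 5, , "Expected dd.mm.yyyy"
    'DateSerial takes year, month, day in that order
    ParseDottedDate = DateSerial(CInt(parts(2)), CInt(parts(1)), CInt(parts(0)))
End Function
```

In the With block above, .Value = ParseDottedDate(.Value) could then take the place of the DateValue line.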

how to remove the first comma in a string in excel vba

If I have a string: "foo, bar" baz, test, blah, how do I remove a specific comma, i.e. not all of them, but just one of my choosing?
With Replace and InStr, it looks like I have to know where the comma is. The problem is, I only want to remove the comma if it appears between quotation marks.
So, I may want to remove the first comma and I may not.
Put more clearly, if there is a comma between a set of quotation marks, I need to remove it. If not, then there's nothing to do. But I can't just remove all the commas, as I need the others in the string.
Try with Regexp in this way:
Sub foo()
Dim TXT As String
TXT = """foo, bar"" baz, test, blah"
Debug.Print TXT
Dim objRegExp As Object
Set objRegExp = CreateObject("vbscript.regexp")
With objRegExp
.Global = True
.Pattern = "(""\w+)(,)(\s)(\w+"")"
Debug.Print .Replace(TXT, "$1$3$4")
End With
End Sub
It works as expected for the sample value you have provided but could require additional adjustments by changing .Pattern for more complicated text.
EDIT: If you want to use this solution as an Excel function, then use this code:
Function RemoveCommaInQuotation(TXT As String)
Dim objRegExp As Object
Set objRegExp = CreateObject("vbscript.regexp")
With objRegExp
.Global = True
.Pattern = "(""\w+)(,)(\s)(\w+"")"
RemoveCommaInQuotation = .Replace(TXT, "$1$3$4")
End With
End Function
Ugh. Here's another way
Public Function foobar(yourStr As String) As String
Dim parts() As String
parts = Split(yourStr, Chr(34))
parts(1) = Replace(parts(1), ",", "")
foobar = Join(parts, Chr(34))
End Function
With some error-checking for odd number of double quotes:
Function myremove(mystr As String) As String
Dim sep As String
sep = """"
Dim strspl() As String
strspl = Split(mystr, sep, -1, vbBinaryCompare)
Dim imin As Integer, imax As Integer, nstr As Integer, istr As Integer
imin = LBound(strspl)
imax = UBound(strspl)
nstr = imax - imin
If ((nstr Mod 2) <> 0) Then
myremove = "Odd number of double quotes"
Exit Function
End If
For istr = imin + 1 To imax Step 2
strspl(istr) = Replace(strspl(istr), ",", "")
Next istr
myremove = Join(strspl(), """")
End Function

#VALUE error in Excel 2010 RegExp

I am trying to implement Regular Expressions in Excel 2010 on a mac, but with any formulas and data all I get is #VALUE errors
Here is my implementation in a module:
Function RegExp1(ReplaceIn, ReplaceWhat As String, _
ReplaceWith As String, Optional IgnoreCase As Boolean = False)
Dim re As Object
Set re = CreateObject("VBScript.RegExp")
re.IgnoreCase = IgnoreCase
re.Pattern = ReplaceWhat
re.Global = True
RegExp1 = re.Replace(ReplaceIn, ReplaceWith)
End Function
And then in the cell I try:
=RegExp1(D2,"(PR2001\.)(\d)","$100$2")
All of this is executing on cells similar to:
PR2001.1
PR2001.2
PR2001.3
etc... I am trying to add zeros in between the last digit and period to format for easier sorting. Any help would be appreciated
Excel X does not support VBScript, so you will not be able to do this. The function works and the expression is fine, by the way.
What I can suggest to you is to write a function using InStrRev (actually this is a good solution even if you could use regexp).
Function AddZeros(ByVal text As String) As String
Dim lastPeriod As Long
lastPeriod = InStrRev(text, ".")
If lastPeriod <> 0 Then
AddZeros = Left$(text, lastPeriod) & ("00" & Mid$(text, lastPeriod + 1))
Else
AddZeros = text
End If
End Function
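Note that AddZeros always inserts two zeros, so PR2001.10 would become PR2001.0010. If the suffixes can grow beyond one digit, padding to a fixed width may sort better. The following is a sketch under assumptions: PadSuffix is a hypothetical name, the width of three digits is arbitrary, and the text after the last period is expected to be a short run of digits (anything else is returned unchanged).

```vba
Option Explicit

'Hypothetical variant: pads the number after the last period to three digits.
'Text without a period, or with a non-numeric suffix, is returned unchanged.
Function PadSuffix(ByVal text As String) As String
    Dim lastPeriod As Long
    Dim suffix As String

    PadSuffix = text   'default: leave the input as it is
    lastPeriod = InStrRev(text, ".")
    If lastPeriod = 0 Then Exit Function

    suffix = Mid$(text, lastPeriod + 1)
    If Len(suffix) = 0 Or Len(suffix) > 9 Then Exit Function   'empty, or too long for CLng
    If suffix Like "*[!0-9]*" Then Exit Function               'contains a non-digit

    PadSuffix = Left$(text, lastPeriod) & Format$(CLng(suffix), "000")
End Function
```

Like AddZeros, it needs no regular expression support, so it should also work where VBScript.RegExp is unavailable; in a cell it would be called as =PadSuffix(D2).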

Resources