Find replace on the entire string fragments between delimiters VBA - excel

I have pipe-delimited strings I need to find and replace on the entire substring between the pipes
So if my strings looks like
AAAP|AAA TTT|AAA|000 or AAA|AAAP|AAA TTT|AAA|000 Or AAA|AAAP|AAA TTT|AAA|AAA
The AAA can be anywhere in the string. beginning and/or end or exist multiple times
and I want to replace AAA with ZZZ
The result I need:
AAAP|AAA TTT|ZZZ|000 or ZZZ|AAAP|AAA TTT|ZZZ|000 or ZZZ|AAAP|AAA TTT|ZZZ|ZZZ
The result I am getting
AAAP|ZZZ TTT|ZZZ|000 ...
How to restrict the replacement to the entire substring
Sub ExtractSubstringReplace()
Dim strSource, strReplace, strFind, RegExpReplaceWord, r, a
strSource = "AAAP|AAA TTT|AAA|000"
strFind = "AAA"
strReplace = "ZZZ"
Dim re As Object
Set re = CreateObject("VBScript.RegExp")
re.Global = True
re.IgnoreCase = True
re.Pattern = "\b" & strFind & "\b"
RegExpReplaceWord = re.Replace(strSource, strReplace)
MsgBox RegExpReplaceWord
End Sub

You can use
re.Pattern = "(^|\|)" & strFind & "(?![^|])"
RegExpReplaceWord = re.Replace(strSource, "$1" & strReplace)
See the (^|\|)AAA(?![^|]) regex demo. Note it is equal to (^|\|)AAA(?=\||$).
Details:
(^|\|) - Capturing group 1: either start of string or a pipe char
AAA - search string
(?![^|]) / (?=\||$) - a lookahead that makes sure there is either | or end of string immediately to the right of the current location.
NOTE: if your strFind can contain special regex metacharacters, make sure you escape the string using the solution from Regular Expression and embedded special characters.

Put the string on which replacement is to be made (AAAP|AAA TTT|AAA|000) in cell A1 of Sheet1 and run the following code.
You will get the whole string with replacements made in cell A2.
Sub ExtractSubstringReplace()
Dim strArr, str, strArrNew(), strSource As String, strReplace As String, strFind As String, i As Long
strSource = Sheet1.Range("A1").Value
strFind = "AAA"
strReplace = "ZZZ"
strArr = Split(Sheet1.Range("A1").Value, "|")
For Each str In strArr
If str = strFind Then str = strReplace
ReDim Preserve strArrNew(i)
strArrNew(i) = str
i = i + 1
Next str
For Each str In strArrNew
Debug.Print str
Next str
Sheet1.Range("A2").Value = Join(strArrNew, "|")
End Sub

I would go with the option presented in the comments by #Siddharth Rout as it is probably the most efficient.
Sub Test()
MsgBox ExtractSubstringReplace("AAAP|AAA TTT|AAA|000", "AAA", "ZZZ")
MsgBox ExtractSubstringReplace("AAA|AAAP|AAA TTT|AAA|000", "AAA", "ZZZ")
MsgBox ExtractSubstringReplace("AAA|AAAP|AAA TTT|AAA|AAA", "AAA", "ZZZ")
End Sub
Public Function ExtractSubstringReplace(ByVal strSource As String _
, ByVal strFind As String _
, ByVal strReplace As String _
, Optional ByVal delimiter As String = "|" _
) As String
Dim result As String
'Duplicate delimiter and also add leading and trailing delimiter
result = delimiter & Replace(strSource, delimiter, delimiter & delimiter) & delimiter
'Replace substrings
result = Replace(result, delimiter & strFind & delimiter, delimiter & strReplace & delimiter)
'Remove leading and trailing delimiter that we added previously
result = Mid$(result, Len(delimiter) + 1, Len(result) - Len(delimiter) * 2)
'Restore delimiters
ExtractSubstringReplace = Replace(result, delimiter & delimiter, delimiter)
End Function
You can also use the function in an Excel cell.

Slightly shortened code
As addition to #ChristianBuse 's fine solution another fast approach based on the same idea (needing only 0.00 to max. 0.02 seconds):
Function SubRep(src, fnd, repl, Optional ByVal delim As String = "|") As String
SubRep = delim & src & delim
Dim i As Long
For i = 1 To 2
SubRep = Replace(SubRep, delim & fnd & delim, delim & repl & delim)
Next
SubRep = Mid$(SubRep, 2, Len(SubRep) - 2)
End Function
Example call
Sub ExampleCall()
Dim terms
terms = Array("AAAP|AAA TTT|AAA|000", "AAA|AAAP|AAA TTT|AAA|000", "AAA|AAAP|AAA TTT|AAA|AAA")
Dim i As Long
For i = LBound(terms) To UBound(terms)
Debug.Print Format(i, "0 ") & terms(i) & vbNewLine & " " & _
SubRep(terms(i), "AAA", "ZZZ")
Next
End Sub
Results in VB Editor's immediate window
0 AAAP|AAA TTT|AAA|000
AAAP|AAA TTT|ZZZ|000
1 AAA|AAAP|AAA TTT|AAA|000
ZZZ|AAAP|AAA TTT|ZZZ|000
2 AAA|AAAP|AAA TTT|AAA|AAA
ZZZ|AAAP|AAA TTT|ZZZ|ZZZ

Related

Search and replace text in a string

I use the code below to search and replace a part of a text in a string. It works fine for almost 97 % of the replacements but not when one string that is supposed to be replaced is identical with another part of the string. Is there a straightforward method to avoid this?
Sub Macro1()
Dim i As Integer
For i = 2 To Worksheets("table1").Range("A1").End(xlDown).Row
Worksheets("table1").Range("H:H").Replace What:=Worksheets("table2").Range("A" & i), Replacement:=Worksheets("table2").Range("B" & i), LookAt:= _
xlPart, SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False, FormulaVersion:=xlReplaceFormula2
Next i
End Sub
Important: The delimiter is not always ",". It can also be any combination blank space(s) with a comma. Examples:
", "
" ,"
" , "
This is what is called a False Positive. If the delimiter is going to be always , then split the string. Do the replace and then join them again.
Is this what you are trying? I have commented the code. If you still have questions then simply ask.
Option Explicit
'~~> This is the delimiter. Change as applicable
Private Const Delim As String = ","
Sub Sample()
Dim wsTblA As Worksheet
Dim wsTblB As Worksheet
Dim lRow As Long
Dim i As Long, j As Long
Dim ArTable1 As Variant
Dim ArTable2 As Variant
'~~> Change this to the relevant worksheet
Set wsTblA = Worksheets("Table2")
Set wsTblB = Worksheets("Table1")
'~~> Get the values in Col A and B from Sheet Table2 in an array
With wsTblA
lRow = .Range("A" & .Rows.Count).End(xlUp).Row
ArTable2 = .Range("A2:B" & lRow).Value2
End With
'~~> Get the values in Col H from Sheet Table1 in an array
With wsTblB
lRow = .Range("H" & .Rows.Count).End(xlUp).Row
ArTable1 = .Range("H2:H" & lRow).Value2
End With
'~~> Loop through the array
For i = LBound(ArTable2) To UBound(ArTable2)
For j = LBound(ArTable1) To UBound(ArTable1)
'~~> Check if the search string is present
If InStr(1, ArTable1(j, 1), ArTable2(i, 1), vbTextCompare) Then
'~~> If it is present then attempt a replace
ArTable1(j, 1) = ReplaceText(ArTable1(j, 1), ArTable2(i, 1), ArTable2(i, 2))
End If
Next j
Next i
'~~> Write the array back to the worksheet
wsTblB.Range("H2").Resize(UBound(ArTable1), 1).Value = ArTable1
End Sub
'~~> Function to split the text and then compare. If exact match, then replace
Private Function ReplaceText(CellValue As Variant, ReplaceWhat As Variant, ReplaceWith As Variant) As String
Dim tmpAr As Variant
Dim ReplacedText As String
Dim k As Long
'~~> Split the test using the delimiter
tmpAr = Split(CellValue, Delim)
'~~> If exact match, then replace
For k = LBound(tmpAr) To UBound(tmpAr)
If UCase(Trim(tmpAr(k))) = UCase(Trim(ReplaceWhat)) Then
tmpAr(k) = ReplaceWith
End If
Next k
'~~> Rejoin using delimiter
ReplacedText = Join(tmpAr, Delim)
ReplaceText = ReplacedText
End Function
Sheets TABLE2
Sheets TABLE1
Sheets TABLE1 OUTPUT
EDIT
Thank you for your wonderful solution. Problem is the delimiter is not always ",". It can also be a blank space " ". Problem using a blank space as additional delimiter might be the case that each element of the string e. g. "4711 Text_A" always has a blank space after the first 4 chars. – D3merzel 44 mins ago
In that case, you can take another approach. The text can appear in 3 positions. At the begining (TEXT & Delim), in the middle (Delim & TEXT & Delim) and in the end (Delim & TEXT)
Can you try the below code. I have not extensively tested it. If you find a scenario where it doesn't work then share it, I will tweak the code.
Option Explicit
'~~> This is the delimiter. Change as applicable
Private Const Delim As String = " "
Sub Sample()
Dim wsTblA As Worksheet
Dim wsTblB As Worksheet
Dim lRow As Long
Dim i As Long, j As Long
Dim ArTable1 As Variant
Dim ArTable2 As Variant
'~~> Change this to the relevant worksheet
Set wsTblA = Worksheets("Table2")
Set wsTblB = Worksheets("Table1")
'~~> Get the values in Col A and B from Sheet Table2 in an array
With wsTblA
lRow = .Range("A" & .Rows.Count).End(xlUp).Row
ArTable2 = .Range("A2:B" & lRow).Value2
End With
'~~> Get the values in Col H from Sheet Table1 in an array
With wsTblB
lRow = .Range("H" & .Rows.Count).End(xlUp).Row
ArTable1 = .Range("H2:H" & lRow).Value2
End With
'~~> Loop through the array
For i = LBound(ArTable2) To UBound(ArTable2)
For j = LBound(ArTable1) To UBound(ArTable1)
'~~> Check if the search string is present
If Left(ArTable1(j, 1), Len(ArTable2(i, 1) & Delim)) = ArTable2(i, 1) & Delim Then
ArTable1(j, 1) = Replace(ArTable1(j, 1), ArTable2(i, 1) & Delim, ArTable2(i, 2) & Delim)
ElseIf InStr(1, ArTable1(j, 1), Delim & ArTable2(i, 1) & Delim, vbTextCompare) Then
ArTable1(j, 1) = Replace(ArTable1(j, 1), Delim & ArTable2(i, 1) & Delim, Delim & ArTable2(i, 2) & Delim)
ElseIf Right(ArTable1(j, 1), Len(Delim & ArTable2(i, 1))) = Delim & ArTable2(i, 1) Then
ArTable1(j, 1) = Replace(ArTable1(j, 1), Delim & ArTable2(i, 1), Delim & ArTable2(i, 2))
End If
Next j
Next i
'~~> Write the array back to the worksheet
wsTblB.Range("H2").Resize(UBound(ArTable1), 1).Value = ArTable1
End Sub
Sheets TABLE2
Sheets TABLE1
Sheets TABLE1 OUTPUT
EDIT
The above code handles all the ranges in one go! But if the code is too overwhelming (which it should not be), the above code can be reduced to a function to handle say individual string. One can use this function to check if the replace is happening correctly using a single string. For example
Debug.Print SidRepcl("bbb b_ bb b__ ccc_ bb b_ ccc", "ccc_", "ccc", " ")
Output: bbb b_ bb b__ ccc bb b_ ccc
As I mentioned earlier, all my codes above are based on the below logic
Logic: The text can appear in 3 positions. At the begining (TEXT & Delim), in the middle (Delim & TEXT & Delim) and in the end (Delim & TEXT)
Option Explicit
Function SidRepcl(txt As String, srch As String, repl As String, Delim As String) As String
Dim i As Long
Dim RetVal As String: RetVal = txt
'~~> Check if the search string is present
If Left(txt, Len(srch & Delim)) = srch & Delim Then
RetVal = Replace(txt, srch & Delim, repl & Delim)
ElseIf InStr(1, txt, Delim & srch & Delim, vbTextCompare) Then
RetVal = Replace(txt, Delim & srch & Delim, Delim & repl & Delim)
ElseIf Right(txt, Len(Delim & srch)) = Delim & srch Then
RetVal = Replace(txt, Delim & srch, Delim & repl)
End If
SidRepcl = RetVal
End Function
Flexible solution with any combinations of blank space(s) with comma(ta)
As alternative to Siddharth 's approaches you could change the logic by
splitting the input text via the ►search string itself instead of applying punctuation delimiters like e.g. ", ", "," or " ";
checking the last character in the current token and the starting character in each following token to execute replacements.
The following (edited 2023-01-02) function solves the additional requirements in comment that
... the delimiter is not always ",". It can also be a blank space " ". Problem using a blank space as additional delimiter might be the case that each element of the string e. g. "4711 Text_A" always has a blank space after the first 4 chars
by checking only one right or left neighbour character to each contained search string for " " or "," (c.f. returned helper function result IsMatch = curEndChar Like "[ ,]" And nxtStartChar Like "[ ,]" as well as comments to function TMreplc()).
Note that the substitution logic doesn't focus only to these evident delimiters but will change input strings like e.g. "4711 TEXT_A" also to e.g. 4711 TEXT_A/1.
Function TMreplc(txt As String, srch As String, repl As String) As String
'a) special case: replace entire text if identical to search string
If txt = srch Then TMreplc = repl: Exit Function
'b) get tokens by splitting via "search string" itself
Dim tokens: tokens = Split(txt, srch)
Dim ub As Long: ub = UBound(tokens)
'c) remember penultimate item
Dim mem As String: If ub > 0 Then mem = tokens(ub - 1)
'd) check most right token for content
Dim chk As Boolean: chk = tokens(ub) = vbNullString
If chk And ub > 0 Then
tokens(ub - 1) = tokens(ub - 1) & IIf(Len(mem) = 0, srch, repl)
If ub = 1 And tokens(0) = vbNullString Then tokens(0) = repl
End If
'e) Check predecessing tokens for substitutability
Dim i As Long
For i = 0 To ub - IIf(chk, 2, 1) ' if no srch finding at all (ignores: 0 To -1)
tokens(i) = tokens(i) & IIf(IsMatch(tokens, i), repl, srch)
Next i
'f) return result string
TMreplc = Join(tokens, vbNullString)
End Function
Function IsMatch(tokens, ByVal idx) As Boolean
Dim curEndChar As String
curEndChar = Right(IIf(idx = 0 And Len(tokens(0)) = 0, " ", "") & tokens(idx), 1)
Dim nxtStartChar As String: nxtStartChar = Left(tokens(idx + 1), 1)
IsMatch = curEndChar Like "[ ,]" And nxtStartChar Like "[ ,]"
End Function
Output examples
History
My first incomplete attempt below tried to include the cited additional requirement by checking only the following character, but didn't take into account those cases where the search string included preceding characters in the current token. I leave this attempt for learning purposes. - See Siddharth 's helpful comments that pointed me in the right direction.
A. First incomplete try
Function replc(txt As String, srch As String, repl As String) As String
'a) split input text into tokens via srch delimiter
Dim tokens: tokens = Split(txt, srch)
Dim ub As Long: ub = UBound(tokens)
'b) check possible change in last search item
Dim chg As Boolean: chg = tokens(ub) = vbNullString
If chg Then tokens(ub - 1) = tokens(ub - 1) & repl
'c) modify tokens
Dim i As Long
For i = 0 To ub - IIf(chg, 2, 1)
Dim nxtStartChar As String: nxtStartChar = Left(tokens(i + 1), 1)
tokens(i) = IIf(nxtStartChar Like "[ ,]", tokens(i) & repl, tokens(i) & srch)
Next i
'd) return joined tokens
replc = Join(tokens, vbNullString)
End Function
Additional note
It might be instructive, too how I tried to solve the original question (originally without the need of a different delimiter than ", "). Note the 2nd argument in the Match() function passed as array of a single string value.
Function replc2(txt As String, srch As String, repl As String) As String
Dim tokens: tokens = Split(txt, ", ")
Dim mtch: mtch = Application.Match(tokens, Array(srch), 0)
Dim i As Long
For i = 1 To UBound(mtch)
If IsNumeric(mtch(i)) Then tokens(i - 1) = repl
Next i
replc2 = Join(tokens, ", ")
End Function
B. My second try (as of 2022-12-13) includes a helper function IsMatch, but failed on certain scenarios (e.g. if the input txt is 100% identical to the search string - see last edit on top of post); I include it only for comparison reasons to complete history:
Function replc(txt As String, srch As String, repl As String) As String
Dim tokens: tokens = Split(txt, srch)
Dim i As Long
Dim ub As Long: ub = UBound(tokens)
Dim chg As Boolean: chg = tokens(ub) = vbNullString
If chg Then tokens(ub - 1) = tokens(ub - 1) & repl
For i = 0 To ub - IIf(chg, 2, 1)
tokens(i) = tokens(i) & IIf(IsMatch(tokens, i), repl, srch)
Next i
replc = Join(tokens, vbNullString)
End Function
Function IsMatch() - see top of post
Replace in Delimited Strings
Main
Sub ReplaceData()
Const SRC_DELIMITER As String = ","
Const DST_DELIMITER As String = ", "
Dim wb As Workbook: Set wb = ThisWorkbook ' workbook containing this code
' Write the values from the source range to an array.
Dim sws As Worksheet: Set sws = wb.Sheets("Table2")
If sws.AutoFilterMode Then sws.AutoFilterMode = False ' turn off AutoFilter
Dim srg As Range
Set srg = sws.Range("A2:B" & sws.Cells(sws.Rows.Count, "A").End(xlUp).Row)
Dim Data(): Data = srg.Value
' Sort the array by length descending so that the longer strings
' are first matched to avoid finding shorter strings before longer ones.
BubbleSortDataByLen Data, 1, True
' Write the unique values from the array to a dictionary.
Dim dict As Object: Set dict = TwoColumnsToDictionary(Data, 1, 2)
' Write the values from the destination range to an array.
Dim dws As Worksheet: Set dws = wb.Sheets("Table1")
If dws.AutoFilterMode Then dws.AutoFilterMode = False ' turn off AutoFilter
Dim drg As Range
Set drg = dws.Range("H2", dws.Cells(dws.Rows.Count, "H").End(xlUp))
Data = drg.Value
' Replace.
ReplaceSingleColumnData Data, dict, SRC_DELIMITER, DST_DELIMITER
' Write back to the range.
drg.Value = Data
' Inform
MsgBox "Data replaced.", vbInformation
End Sub
Sort
Sub BubbleSortDataByLen( _
ByRef Data() As Variant, _
ByVal SortColumnIndex As Long, _
Optional ByVal Descending As Boolean = False)
Dim rLB As Long, rUB As Long: rLB = LBound(Data, 1): rUB = UBound(Data, 1)
Dim cLB As Long, cUB As Long: cLB = LBound(Data, 2): cUB = UBound(Data, 2)
Dim T, i As Long, j As Long, c As Long, IsNotsorted As Boolean
For i = rLB To rUB - 1
For j = rLB + 1 To rUB
If Descending Then
If Len(CStr(Data(i, SortColumnIndex))) < Len(CStr( _
Data(j, SortColumnIndex))) Then IsNotsorted = True
Else
If Len(CStr(Data(i, SortColumnIndex))) > Len(CStr( _
Data(j, SortColumnIndex))) Then IsNotsorted = True
End If
If IsNotsorted Then
For c = cLB To cUB
T = Data(i, c): Data(i, c) = Data(j, c): Data(j, c) = T
Next c
End If
Next j
Next i
End Sub
Dictionary
Function TwoColumnsToDictionary( _
Data() As Variant, _
ByVal KeyColumnIndex As Long, _
ByVal ItemColumnIndex As Long, _
Optional ByVal MatchCase As Boolean = False) _
As Object
Dim dict As Object: Set dict = CreateObject("Scripting.Dictionary")
dict.CompareMode = IIf(MatchCase, vbBinaryCompare, vbTextCompare)
Dim r As Long, kStr As String
For r = LBound(Data, 1) To UBound(Data, 1)
kStr = CStr(Data(r, KeyColumnIndex))
If Len(kStr) > 0 Then ' exclude blanks
' Use the first occurrences if any duplicates (shouldn't be any).
If Not dict.Exists(kStr) Then
dict(kStr) = CStr(Data(r, ItemColumnIndex))
End If
End If
Next r
If dict.Count = 0 Then Exit Function
Set TwoColumnsToDictionary = dict
End Function
Replace
Sub ReplaceSingleColumnData( _
ByRef Data() As Variant, _
ByVal dict As Object, _
ByVal InDelimiter As String, _
ByVal OutDelimiter As String)
Dim r As Long, n As Long
Dim sStrings() As String, sStr As String
For r = LBound(Data, 1) To UBound(Data, 1)
sStr = CStr(Data(r, 1))
If Len(sStr) > 0 Then
sStrings = Split(sStr, InDelimiter)
For n = 0 To UBound(sStrings)
sStr = Application.Trim(sStrings(n)) ' reusing 'sStr'
If dict.Exists(sStr) Then
sStrings(n) = dict(sStr)
Else
sStrings(n) = sStr
End If
Next n
Data(r, 1) = Join(sStrings, OutDelimiter)
End If
Next r
End Sub
you may want to stick to the Range.Replace() approach as much as possible
Option Explicit
Sub Macro1()
Const delimiter As String = " "
With Worksheets("table2") ' reference the find&replace sheet
Dim findRng As Range
Set findRng = .Range("A2", .Cells(.Rows.Count, 1).End(xlUp)) ' set the range in referenced sheet from column A row 2 down to last not empty row
End With
With Worksheets("table1") ' reference the data sheet
With .Range("H1", .Cells(.Rows.Count, "H").End(xlUp)) ' reference referenced sheet column "H" range from row 1 down to last not empty row
'-----------
'normalize the referenced range values to:
' - start with the delimiter
' - end with delimiter
' - only single spaces
Dim cel As Range
For Each cel In .Cells
cel.Value = delimiter & WorksheetFunction.Trim(cel.Value) & delimiter
Next
.Replace what:=" " & delimiter, replacement:=delimiter, lookat:=xlPart
.Replace what:=delimiter & " ", replacement:=delimiter, lookat:=xlPart
'-----------
' do the replacement
For Each cel In findRng
.Replace what:=cel.Value & delimiter, replacement:=cel.Offset(, 1) & delimiter, _
lookat:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
Next
' un-normalize the references range
' - remove leading delimiters
' - remove trailing delimiters
For Each cel In .Cells
cel.Value = Mid$(cel.Value, 2, Len(cel.Value) - 2)
Next
End With
End With
End Sub
Where you only have to set the needed delimiter in Const delimiter As String = " "
Of course, should you suffer from speed issues, you can switch to a "range to array" approach.
First by acting on the "normalize" and "un-normalize" code sections, only
If necessary, acting on the "do the replacement" section, too
For simplicity, this should work
Sub Macro1()
Dim i As Integer
Dim rng As Range
Set rng = Application.Intersect(Worksheets("table1").Range("H:H"), Worksheets("table1").UsedRange)
endDel = ", , ,,,"
For Each cell1 In rng
cell1.Value = cell1.Value & endDel
Next cell1
For i = 2 To Worksheets("table1").Range("A1").End(xlDown).Row
Worksheets("table1").Range("H:H").Replace What:=Worksheets("table2").Range("A" & i) & " ", _
Replacement:=Worksheets("table2").Range("B" & i) & " ", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, ReplaceFormat:=False ', FormulaVersion:=xlReplaceFormula2
Worksheets("table1").Range("H:H").Replace What:=Worksheets("table2").Range("A" & i) & ",", _
Replacement:=Worksheets("table2").Range("B" & i) & ",", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, ReplaceFormat:=False ', FormulaVersion:=xlReplaceFormula2
Next i
rng.Replace What:=endDel, _
Replacement:="", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, ReplaceFormat:=False ', FormulaVersion:=xlReplaceFormula2
End Sub
If you have multiple delimiters, I assume they don't really matter and the string doesn't need to be completely identical apart from the replacements. Assuming that, the easiest way would be just to replace all the possible delimiters, with just one. You can then easily split the string, test each individually then recombine for a standardised string.
Example code uses Split(", # ,# , ", "#") with a delimiter that is not , for ease
Sub Blah()
Debug.Print Replacement("A, B , d,e,f,g , h", "e", "G")
End Sub
Function Replacement(strInput As String, ToFind As String, ToReplace As String) As String
Dim i As Long
Dim DelimArray() As String: DelimArray = Split(", # ,# , ", "#")
For i = LBound(DelimArray) To UBound(DelimArray)
strInput = Replace(strInput, DelimArray(i), ",")
Next i
Dim TextArray() As String: TextArray = Split(strInput, ",")
For i = LBound(TextArray) To UBound(TextArray)
If TextArray(i) = ToFind Then TextArray(i) = ToReplace
Next i
Replacement = Join(TextArray, ",")
End Function
Problem: The Range.Replace method (Excel) generates unreliable results under the following conditions:
The strings to be replaced are also part of other strings.
Strings in the target range have multiple delimiters.
The strings to be replaced contain one or more of the delimiters.
Data:
Old (What)
New (Replacement)
4711 TEXT_A
4711 TEXT_A/1
4710 TEXT_B
4710 TEXT_B/1
String
4711 TEXT_A 4710 TEXT_B 4711 TEXT_AA,4710 TEXT_BB , 4711 TEXT_A , 4710 TEXT_B,4711 TEXT_AA, 4710 TEXT_BB, 4711 TEXT_A,4710 TEXT_B, 4711 TEXT_AA, 4710 TEXT_BB
The string above presents all the conditions previously mentioned:
Solution Proposed:
This problem can be solved using the Range_Replace_With_MultiDelimiters procedure:
Syntax
Range_Replace_With_MultiDelimiters (rTarget, aDelimiters, aOldNew, [blIncludeFormulas])
Parameters
Remarks
Delimiters that contain other delimiters must be placed in the array before the delimiters it contains, e.g.:
Variables:
Method:
1 - Mask all sOld strings to be replaced: As the strings to be replaced may contain one or more of the delimiters; when we try to standardize the delimiters, the sOld strings contained in the target strings will be affected, particularly when the delimiter is {space} therefore we need to modify (Mask) the sOld strings before standardizing the delimiters.
To do this we define a one-character constant to be used as a Mask Character:
Const kMsk As String = "‡"
This character must not be present in the rTarget range, nor in any sNew string.
The sOld will be masked using the format ‡i‡, where i (position of sOld in the aOldNew array) is wrapped with the Mask Character ‡.
2 - Standardize the Delimiters: Define a one-character constants to be used as Standard Delimiter:
Const kChr As String = "§" '(ASCII code 167)
This character must not be present in the rTarget range, nor in any sNew string.
Then convert all delimiters to a Standard Delimiter Index using the format §i§, where i (position of the delimiter in the array) is wrapped by the Standard Delimiter §.
This is the standardizing formula:
= "§" & SUBSTITUTE( … SUBSTITUTE( rTarget, aDelimiters(1), "§1§" ) … , aDelimiters(n), "§n§") & "§"
After the entire replacement process is completed, all delimiters will be reset to their original value. This is the reset formula:
= SUBSTITUTE(SUBSTITUTE( … SUBSTITUTE( rTarget, "§1§", aDelimiters(1) ), … , "§n§", aDelimiters(n) ), "§", TEXT(,) )
These formulas will be created by the function: Range_Replace_ƒDelimiters_Formula and applied to the rTarget using the Application.Evaluate method (Excel).
3 - Replace masked sOld strings with sNew string: Before replacing the masked sOld strings with the corresponding sNew string, we need to wrap both masked sOld strings and the sNew strings with the Standard Delimiter constant previously defined:
sOld = kChr & kMsk & lIdx & kMsk & kChr '[kMsk & lIdx & kMsk] is masked sOld
sNew = kChr & aOldNew(lIdx, 2) & kChr
4 - Replace Mask strings with sOld string: Notice that as we wrapped the masked sOld string before replacing when the sOld string was contained in another string it was not replaced as it did not match the wrapped masked sOld string, achieving the expected result. Now we need to replace back the remaining masked sOld strings with the original sOld strings where partial matches within a larger string happened.
5 - Reset the delimiters: Replace the Standard Delimiter Index back to each original delimiter, using the formula mentioned (step 2). This step could also be used to reset the original delimiters to a standard delimiter, however, as the list of delimiters includes the {space} it’s advisable not doing it.
The results returned by the Range_Replace_With_MultiDelimiters procedure mathed the expected results:
Procedures:
Sub Search_and_Replace_Text()
Dim aDelimiters() As Variant
aDelimiters = Array( _
" , ", _
" ,", _
", ", _
",", _
" ")
Dim rTarget As Range, aOldNew() As Variant
Dim sWsh As String, sRng As String, sFml As String
Rem Set array with strings to be replaced (Old\New)
sWsh = "Table2"
sRng = "A:B"
With ThisWorkbook.Worksheets(sWsh).Range(sRng)
Rem Activate target worksheet (needed to apply the Application.Evaluate method)
Application.Goto .Cells(1), 1
With .Cells(2, 1).Resize(-1 + .Cells(.Rows.Count, 1).End(xlUp).Row, 2)
sFml = "=UNIQUE(FILTER(" & .Address _
& ",(" & .Columns(1).Address & "<>TEXT(,))))"
aOldNew = Application.Evaluate(sFml)
End With: End With
Rem Set range to apply the replace method
sWsh = "Table1"
sRng = "H:H"
With ThisWorkbook.Worksheets(sWsh).Range(sRng)
Set rTarget = Range(.Cells(2), .Cells(.Rows.Count).End(xlUp))
End With
Call Range_Replace_With_MultiDelimiters(rTarget, aDelimiters, aOldNew)
End Sub
Private Sub Range_Replace_With_MultiDelimiters( _
ByVal rTarget As Range, aDelimiters() As Variant, aOldNew() As Variant, _
Optional blIncludeFormulas As Boolean)
Rem Uncomment the lines the start with [':]
Rem to have in the Immediate Window a record of each step perfomed by the procedure
Const kChr As String = "§"
Const kMsk As String = "‡"
Dim rArea As Range
Dim sOld As String, sNew As String, lIdx As Long
Dim sFmlA As String, sFmlB As String
Dim sFml As String, aValues As Variant
Rem Built Delimiters Formula - Standardization & Reset
sFmlA = Range_Replace_ƒDelimiters_Formula(aDelimiters, kChr)
sFmlB = Range_Replace_ƒDelimiters_Formula(aDelimiters, kChr, True)
': Debug.Print vbLf; "Built Delimiters Formula - Standardization & Reset"
': Debug.Print "Standardization: "; vbLf; "sFmlA: "; sFmlA
': Debug.Print "Reset: "; vbLf; "sFmlB: "; sFmlB
Rem Exclude Formulas from Target range
If Not (blIncludeFormulas) Then
With rTarget
Set rTarget = Union(.SpecialCells(xlCellTypeBlanks), _
.SpecialCells(xlCellTypeConstants, 23))
End With: End If
With rTarget
Rem Activate target range worksheet (needed to apply the Application.Evaluate method)
Application.Goto .Worksheet.Cells(1), 1
For Each rArea In .Areas
With rArea
Rem Replace Old string To Mask string
': Debug.Print vbLf; "Replace Old To Mask"
': Debug.Print vbTab; "Old"; Tab(21); "New"
For lIdx = 1 To UBound(aOldNew)
sOld = aOldNew(lIdx, 1)
sNew = kMsk & lIdx & kMsk
': Debug.Print vbTab; sOld; Tab(21); sNew
.Replace What:=sOld, Replacement:=sNew, _
LookAt:=xlPart, SearchOrder:=xlByRows, _
MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False, FormulaVersion:=xlReplaceFormula2
Next
Rem Standardize Delimiters
sFml = Replace(sFmlA, "#RNG", .Address(0, 0))
aValues = Application.Evaluate(sFml)
.Value2 = aValues
': Debug.Print vbLf; "Standardize Delimiters"
': Debug.Print "Fml: "; sFml
Rem Replace Mask string To New string
': Debug.Print vbLf; "Replace Mask To New"
': Debug.Print vbTab; "Old"; Tab(21); "New"
For lIdx = 1 To UBound(aOldNew)
sOld = kChr & kMsk & lIdx & kMsk & kChr
sNew = kChr & aOldNew(lIdx, 2) & kChr
': Debug.Print vbTab; sOld; Tab(21); sNew
.Replace What:=sOld, Replacement:=sNew, _
LookAt:=xlPart, SearchOrder:=xlByRows, _
MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False, FormulaVersion:=xlReplaceFormula2
Next
Rem Replace Mask string To Old string
': Debug.Print vbLf; "Replace Mask To Old"
': Debug.Print vbTab; "Old"; Tab(21); "New"
For lIdx = 1 To UBound(aOldNew)
sOld = kMsk & lIdx & kMsk
sNew = aOldNew(lIdx, 1)
': Debug.Print vbTab; sOld; Tab(21); sNew
.Replace What:=sOld, Replacement:=sNew, _
LookAt:=xlPart, SearchOrder:=xlByRows, _
MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False, FormulaVersion:=xlReplaceFormula2
Next
Rem Reset Delimiters
sFml = Replace(sFmlB, "#RNG", .Address(0, 0))
aValues = Application.Evaluate(sFml)
.Value2 = aValues
': Debug.Print vbLf; "Reset Delimiters"
': Debug.Print "Fml: "; sFml
End With: Next: End With
End Sub
Private Function Range_Replace_ƒDelimiters_Formula( _
aDelimiters() As Variant, sChr As String, Optional blReset As Boolean) As String
Dim sOld As String, sNew As String
Dim sFml As String
Dim vItem As Variant, bItem As Byte
Rem Separators
For Each vItem In aDelimiters
Rem Separators Old & New
bItem = 1 + bItem
sOld = IIf(blReset, sChr & bItem & sChr, vItem)
sNew = IIf(blReset, vItem, sChr & bItem & sChr)
Rem Built Formula - Delimiters Array
Select Case bItem
Case 1: sFml = "SUBSTITUTE(#RNG,""" & sOld & """,""" & sNew & """)"
Case Else: sFml = "SUBSTITUTE(" & sFml & ",""" & sOld & """,""" & sNew & """)"
End Select
Next
Rem Built Formula - Delimiters Character
Select Case blReset
Case True: sFml = "=SUBSTITUTE(" & sFml & ",""" & sChr & """,TEXT(,))"
Case Else: sFml = "= """ & sChr & """&" & sFml & "&""" & sChr & """"
End Select
Range_Replace_ƒDelimiters_Formula = sFml
End Function

Search sentence for all values in a list of values and return all found into adjacent cell

I am looking to search for values from a list in Sheet1 in each cell of column C on sheet2 to be separated by commas.
Sheet1 has a list of names:
Sheet 2 has a set of sentences in column C. The output in column D should be the names in Sheet1.
I have searched but haven't found a solution.
I don't have any code to show that has been effective in this regard but I did come across a function that seemed promising but, since I don't know what would surround the name per cell it isn't quite what I need.
Function RegexExtract(ByVal text As String, _
ByVal extract_what As String, _
Optional separator As String = ", ") As String
Dim allMatches As Object
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
Dim i As Long, j As Long
Dim result As String
RE.Pattern = extract_what
RE.Global = True
Set allMatches = RE.Execute(text)
For i = 0 To allMatches.Count - 1
For j = 0 To allMatches.Item(i).submatches.Count - 1
result = result & (separator & allMatches.Item(i).submatches.Item(j))
Next
Next
If Len(result) <> 0 Then
result = Right$(result, Len(result) - Len(separator))
End If
RegexExtract = result
End Function
Using regexp Test:
Function CheckList(ByVal text As String, list As Range) As String
Static RE As Object
Dim arr, sep, r As Long, result As String, v
If RE Is Nothing Then Set RE = CreateObject("vbscript.regexp")
If Len(text) > 0 Then
arr = list.Value
'check each name
For r = 1 To UBound(arr, 1)
v = arr(r, 1)
If Len(v) > 0 Then
RE.Pattern = "\b" & v & "\b" '<< whole word only
If RE.test(text) Then
result = result & sep & v
sep = ", " 'populate the separator
End If
End If
Next r
End If
CheckList = result
End Function
You can use a Dictionary object to check each string against the NameList, assuming that the names in the sample string do not have punctuation.
If they do, this method can still be used, but would require some modification. For example, one could replace all of the punctuation with spaces; or do something else depending on how complex things might be.
eg:
Option Explicit
Function ckNameList(str As String, nameList As Range) As String
Dim D As Dictionary
Dim vNames, I As Long, V, W
Dim sOut As String
vNames = nameList
Set D = CreateObject("Scripting.Dictionary")
D.CompareMode = TextCompare
For I = 1 To UBound(vNames)
If Not D.Exists(vNames(I, 1)) Then _
D.Add vNames(I, 1), vNames(I, 1)
Next I
V = Split(str, " ")
sOut = ""
For Each W In V
If D.Exists(W) Then _
sOut = sOut & ", " & W
Next W
ckNameList = Mid(sOut, 3)
End Function
Scott showed how to use TEXTJOIN, when you don't have access to this function. Your best best might be VBA. We could emulate some sort of TEXTJOIN, possibly like so:
Function ExtractNames(nms As Range, str As Range) As String
ExtractNames = Join(Filter(Evaluate("TRANSPOSE(IF(ISNUMBER(SEARCH(" & nms.Address & "," & str.Address & "))," & nms.Address & ",""|""))"), "|", False), ", ")
End Function
Called in D2 like: =ExtractNames($A$2:$A$7,C2) and dragged down. Downside of this Evalate method is that it's making use of an array formula, however the native TEXTJOIN would have been so too. Plusside is that it's avoiding iteration.
EDIT
As #TimWilliams correctly stated, this might end up confusing substrings that hold part of what we are looking for, e.g. > Paul in Pauline.
I also realized that to overcome this, we need to substitute special characters. I've rewritten my function to the below:
Function ExtractNames(nms As Range, str As Range) As String
Dim chr() As Variant, arr As Variant
'Create an array of characters to ignore
chr = Array("!", ",", ".", "?")
'Get initial array of all characters, with specified characters in chr substituted for pipe symbol
arr = Evaluate("TRANSPOSE(IF(ISNUMBER(MATCH(MID(" & str.Address & ",ROW(A1:A" & Len(str.Value) & "),1),{""" & Join(chr, """,""") & """},0)),""|"",MID(" & str.Address & ",ROW(A1:A" & Len(str.Value) & "),1)))")
'Get array of words used to check against names without any specified characters
arr = Split(Join(Filter(arr, "|", False), ""), " ")
'Check which names occur in arr
For Each cl In nms
If IsNumeric(Application.Match(cl.Value, arr, 0)) Then
If ExtractNames = "" Then
ExtractNames = cl.Value
Else
ExtractNames = Join(Array(ExtractNames, cl.Value), ", ")
End If
End If
Next cl
End Function
As you can tell, it's possible still, but my conclusion and recommendation would be to go with RegEx. #TimWilliams has a great answer explaining this, which I slightly adapted to prevent an extra iteration:
Function ExtractNames(nms As Range, str As Range) As String
Dim regex As Object: Set regex = CreateObject("VBScript.RegExp")
Dim arr() As Variant: arr = Application.Transpose(nms.Value)
Dim del As String
regex.Pattern = "\b(?:" & Join(arr, "|") & ")\b"
regex.Global = True
regex.Ignorecase = True
Set hits = regex.Execute(str.Value)
For Each hit In hits
ExtractNames = ExtractNames & del & hit
del = ", "
Next hit
End Function
Or even without iteration:
Function ExtractNames(nms As Range, str As Range) As String
Dim regex As Object: Set regex = CreateObject("VBScript.RegExp")
Dim arr() As Variant: arr = Application.Transpose(nms.Value)
Dim del As String
regex.Global = True
regex.Ignorecase = True
'Perform 1st replace on non-alphanumeric characters
regex.Pattern = "[^\w]"
ExtractNames = Application.Trim(regex.Replace(str.Value, " "))
'Perferom 2nd replace on all words that are not in arr
regex.Pattern = "\b(?!" & Join(arr, "|") & ")[\w-]+\b"
ExtractNames = Application.Trim(regex.Replace(ExtractNames, " "))
ExtractNames = Replace(ExtractNames, " ", ", ")
End Function

How to extract Upper case words (with at least two letters)

I need to extract last names, which are allways written upper case, from cells, where is written all the name. The name can have different shapes e.g.:
Jan H. NOVAK
Petr Karel POUZAR
Frantisek Ix GREGOR
I have tried to find some VBAs on the web. I found this one, but it extract also the one letter middle names which are also upper case:
Function UpperWords(str As Variant) As String
Dim i As Integer, sTemp As String, StrTmp As String
For i = 0 To UBound(Split(str, " "))
StrTmp = Split(str, " ")(i)
If UCase(StrTmp) = StrTmp Then sTemp = sTemp & " " & StrTmp
Next i
UpperWords = Trim(sTemp)
End Function
I need to define in the VBA tahat the upper case word which I want to extract has at least two letters.
Thank you for your ideas.
Add the test to the If:
Function UpperWords(str As Variant) As String
Dim i As Integer, sTemp As String, StrTmp As String
For i = 0 To UBound(Split(str, " "))
StrTmp = Split(str, " ")(i)
If UCase(StrTmp) = StrTmp And Len(StrTmp) > 1 Then sTemp = sTemp & " " & StrTmp
Next i
UpperWords = Trim(sTemp)
End Function
If what you want is to extract last names, then you can use a formula/function to do just that. Capitalization would seem to be irrelevant.
worksheet formula
=TRIM(RIGHT(SUBSTITUTE(TRIM(A1)," ",REPT(" ",99)),99))
UDF
Function lastName(S As String) As String
lastName = Mid(S, InStrRev(Trim(S), " ") + 1)
End Function
Dim regEx As Object
Dim allMatches As Object
Dim Surname As Variant
Dim rng As Range
Dim cell As Range
Dim m As Match
Set rng = ThisWorkbook.Worksheets("Sheet2").Range("A1:A4")
For Each cell In rng
Set regEx = CreateObject("VBScript.RegExp")
With regEx
.IgnoreCase = True
.MultiLine = False
.Pattern = "\s([A-Z]+)$"
.Global = True
End With
Set allMatches = regEx.Execute(cell.Value)
For Each m In allMatches
Surname = m.SubMatches(0)
Debug.Print Surname
Next
Next cell

Excel VBA: How to remove substrings from a cell?

I have a cell value like this:
This is a <"string">string, It should be <"changed">changed to <"a"> a number.
There are some words repeated in this part <" ">.
I want use Excel VBA to change the cell value to:
This is a string, It should be changed to a number.
Any help will be appreciated. Thanks.
Following up on the suggestion to use regular expressions, here's an example:
Option Explicit
Sub RemoveByRegexWithLateBinding()
Dim strIn As String
Dim strOut As String
Dim objRegex As Object
'input
strIn = "This is a <""string"">string, It should be <""changed"">changed to <""a""> a number."
Debug.Print "Input:" & vbCr & strIn
'create and apply regex
Set objRegex = CreateObject("VBScript.RegExp")
objRegex.Pattern = "<""[^<>""]*"">"
objRegex.Global = True
strOut = objRegex.Replace(strIn, "")
'test output
Debug.Print "Output:" & vbCr & strOut
End Sub
Produces this output:
Input:
This is a <"string">string, It should be <"changed">changed to <"a"> a number.
Output:
This is a string, It should be changed to a number.
Diagram of regular expression:
Which can be explained as finding a string that:
begins with <"
contains anything apart from the characters <, > and "
ends with ">
Assuming the text in cell A1, then try this code
Sub DelDoubleString()
Dim Text As String, Text2Replace As String, NewText As String
On Error Resume Next 'Optional, in case there's no double string to be deleted
Text = Cells(1, 1)
Do
Text2Replace = Mid$(Text, InStr(Text, "<"), InStr(Text, ">") - InStr(Text, "<") + 1)
NewText = Application.WorksheetFunction.Substitute(Text, Text2Replace, vbNullString)
Text = NewText
Loop Until InStr(NewText, "<") = 0
Cells(1, 1) = NewText
End Sub
Select the cells containing your text and run this short macro:
Sub Kleanup()
Dim d As Range, s As String, rng As Range
Dim gather As Boolean, L As Long, DQ As String
Dim i As Long, s2 As String, CH As String
Set rng = Selection
DQ = Chr(34)
For Each r In rng
s = Replace(r.Text, "<" & DQ, Chr(1))
s = Replace(s, DQ & ">", Chr(2))
gather = True
L = Len(s)
s2 = ""
For i = 1 To L
CH = Mid(s, i, 1)
If CH = Chr(1) Then gather = False
If CH = Chr(2) Then gather = True
If gather And CH <> Chr(2) Then s2 = s2 & CH
Next i
r.Value = s2
Next r
End Sub
U can Use Replace function
ActiveSheet.Cells(1, 1).Value = Replace(ActiveSheet.Cells(1, 1).Value, "String", "Number")

Extract number sets from string, and return as comma seperated values in new string

Can anyone give me any pointers as to how to achieve this? I already managed to extract all numbers, but can't get them separated in the correct position. Hoping to achieve this through an excel function, but VBA will be OK too.
This is what I have now:
=SUMPRODUCT(MID(0&A1;LARGE(INDEX(ISNUMBER(--MID(A1;ROW($1:$25);1))*ROW($1:$25);0);ROW($1:$25))+1;1)*10^ROW($1:$25)/10)
Example of the required result:
Cell A1: Result:
123abc 123
abc456 456
123abc456 123,456
789sometext753 789,753
6ty cents, 5% 6,5
You could try this UDF:
Public Function GetNumbers(sMark As String) As String
'regexp declaration
Dim objRegExp As Object
Set objRegExp = CreateObject("vbscript.regexp")
With objRegExp
.Global = True
.Pattern = "(\d+)\D*"
Dim i As Long
If .Execute(sMark).Count > 1 Then
For i = 0 To .Execute(sMark)(0).submatches.Count
GetNumbers = GetNumbers & .Execute(sMark)(i).submatches(0) & ","
Next
Else
GetNumbers = .Execute(sMark)(0).submatches(0) & ","
End If
End With
GetNumbers = Left(GetNumbers, Len(GetNumbers) - 1)
End Function
it's tried and tested for examples you provided and hope it will work for all situations you need.
You could use this UDF:
Function GetNums(target As Range)
Dim MyStr As String, i As Integer
MyStr = ""
If Len(target.Value) = 0 Then GoTo GoExit
For i = 1 To Len(target.Value)
If IsNumeric(Mid(target, i, 1)) Then
MyStr = MyStr & Mid(target, i, 1)
Else
MyStr = MyStr & " "
End If
Next i
GoExit:
GetNums = Application.WorksheetFunction.Substitute(Application.WorksheetFunction.Trim(MyStr), " ", ",")
End Function

Resources