VBA Function to exclude parts of a string

My sub compares two lists of strings and returns the closest matches. I've found that the sub gets tripped up over some common words such as "the" and "facility". I would like to write a function that would be supplied an array of words to exclude and check each string for these words and exclude them if found.
Here is a sample input:
|aNames         |bNames           |words to exclude
|thehillcrest   |oceanview health |the
|oceanview, the |hillCrest        |health
Intended Output:
|aResults  |bResults
|hillcrest |hillcrest
|oceanview |oceanview
So far I have:
Dim ub as Integer
Dim excludeWords() As String
'First grab the words to be excluded
If sheet.Cells(2, 7).Value <> "" Then
For y = 2 To sheet.Range("G:G").End(xlDown).Row
ub = UBound(excludeWords) + 1 'I'm getting a subscript out of range error here..?
ReDim Preserve excludeWords(0 To ub)
excludeWords(ub) = sheet.Cells(y, 7).Value
Next y
End If
Then my comparison function, using a double loop, will compare each string in column A with column B. Before the comparison, the value in column a and b will go through our function which will check for these words to exclude. It's possible that there will be no words to exclude, so the parameter should be optional:
Public Function normalizeString(s As String, ParamArray a() As Variant)
if a(0) then 'How can I check?
for i = 0 to UBound(a)
s = Replace(s, a(i))
next i
end if
normalizeString = Trim(LCase(s))
End Function
There's probably a few parts in this code that won't work. Might you be able to point me in the right direction?
Thank you!

To store the list in the array, you can do this
Sub Sample()
Dim excludeWords As Variant
Dim lRow As Long
With Sheet1 '<~~ Change this to the relevant sheet
'~~> Get last row in Col G
lRow = .Range("G" & .Rows.Count).End(xlUp).Row
excludeWords = .Range("G2:G" & lRow).Value
'Debug.Print UBound(excludeWords)
'For i = LBound(excludeWords) To UBound(excludeWords)
'Debug.Print excludeWords(i, 1)
'Next i
End With
End Sub
And then pass the array to your function. The array above is a 2-D array, so it needs to be indexed accordingly (see the commented section in the code above).
Also, as I mentioned in the comments above:
How does "oceanview, the" become "Oceanview"? You can replace "the", but that would give you "oceanview," (notice the comma) and not "Oceanview".
You may have to pass those special characters to Col G in the sheet or you can handle them in your function using a loop. For that you will have to use the ASCII characters. Please see this
Followup from comments
Here is something that I wrote quickly so it is not extensively tested. Is this what you are looking for?
Sub Sample()
Dim excludeWords As Variant
Dim lRow As Long
With Sheet1
lRow = .Range("G" & .Rows.Count).End(xlUp).Row
excludeWords = .Range("G2:G" & lRow).Value
'~~> My column G has the word "habilitation" and "this"
Debug.Print normalizeString("This is rehabilitation", excludeWords)
'~~> Output is "is rehabilitation"
End With
End Sub
Public Function normalizeString(s As String, a As Variant) As String
Dim i As Long, j As Long
Dim tmpAr As Variant
If InStr(1, s, " ") Then
tmpAr = Split(s, " ")
For i = LBound(a) To UBound(a)
For j = LBound(tmpAr) To UBound(tmpAr)
If LCase(Trim(tmpAr(j))) = LCase(Trim(a(i, 1))) Then tmpAr(j) = ""
Next j
Next i
s = Join(tmpAr, " ")
Else
For i = LBound(a) To UBound(a)
If LCase(Trim(s)) = LCase(Trim(a(i, 1))) Then
s = ""
Exit For
End If
Next i
End If
normalizeString = Trim(LCase(s))
End Function
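The question also asked how to make the exclusion list optional and how to deal with punctuation such as the comma in "oceanview, the". Below is a minimal sketch, not part of the original answer. It assumes the list arrives as the 2-D array read from column G, or is omitted entirely. The name NormalizeOptional and the set of punctuation characters are illustrative choices.

```vba
Public Function NormalizeOptional(ByVal s As String, Optional excludeWords As Variant) As String
    Dim words() As String
    Dim punct As Variant, p As Variant
    Dim i As Long, j As Long

    ' Assumption: these characters should be treated as word separators
    punct = Array(",", ".", ";", ":")
    For Each p In punct
        s = Replace(s, p, " ")
    Next p

    ' IsMissing only works for Optional parameters declared As Variant
    If Not IsMissing(excludeWords) Then
        If IsArray(excludeWords) Then
            words = Split(s, " ")
            For j = LBound(words) To UBound(words)
                For i = LBound(excludeWords, 1) To UBound(excludeWords, 1)
                    ' Whole-word, case-insensitive comparison
                    If StrComp(words(j), Trim$(excludeWords(i, 1)), vbTextCompare) = 0 Then
                        words(j) = ""
                        Exit For
                    End If
                Next i
            Next j
            s = Join(words, " ")
        End If
    End If

    ' Collapse the double spaces left behind by removed words
    Do While InStr(s, "  ") > 0
        s = Replace(s, "  ", " ")
    Loop

    NormalizeOptional = LCase$(Trim$(s))
End Function
```

Because the match is on whole words, this sketch turns "oceanview, the" into "oceanview" but leaves "thehillcrest" unchanged. Stripping "the" from inside a word would need a plain Replace, which also hits words such as "other". Note too that if column G holds a single word, .Value returns a plain value rather than an array, and the sketch then skips filtering.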

First of all, you cannot call the UBound function on a dynamic array that has not been sized yet:
Dim excludeWords() As String
ub = UBound(excludeWords) + 1 'there is no size yet
To remove some of the unwanted words, use the Replace function:
String1 = Replace(String1, "the", "")
To do the comparison you described, I would use the Like operator. Here is the documentation:
http://msdn.microsoft.com/pl-pl/library/swf8kaxw.aspx
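One way to avoid calling UBound on an unsized array is to size the array once from the row count instead of growing it inside the loop. This is a minimal sketch, assuming the words sit in column G from row 2 downward on a worksheet supplied by the caller. The function name ReadExcludeWords is hypothetical.

```vba
Public Function ReadExcludeWords(ByVal ws As Worksheet) As String()
    Dim result() As String
    Dim lastRow As Long, r As Long

    lastRow = ws.Cells(ws.Rows.Count, "G").End(xlUp).Row
    If lastRow < 2 Then
        ' No words listed: Split on an empty string yields a zero-length
        ' array (LBound 0, UBound -1), so callers can still loop safely
        ReadExcludeWords = Split(vbNullString)
        Exit Function
    End If

    ' Size the array once; no UBound call on an unallocated array
    ReDim result(0 To lastRow - 2)
    For r = 2 To lastRow
        result(r - 2) = CStr(ws.Cells(r, "G").Value)
    Next r

    ReadExcludeWords = result
End Function
```

The result is a one-dimensional String array, so a caller can loop from LBound to UBound without the second index that an array read directly from a range would need.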

Related

Concatenate and Replace with VBA on Excel

Can someone point me in the right direction with code or keywords to lookup? I am trying to concatenate, add email, and replace:
|Example   |Names
|Example 1 |FirstName1 LastName1, FirstName2 Lastname2, FirstName3 Lastname3
Into something like:
|Example   |Names
|Example 1 |FirstName1.LastName1#email.com, FirstName2.Lastname2#email.com, FirstName3.Lastname3#email.com
Names will be separated by commas.
Thanks in advance!
Please use the following function:
Function processNameMailAccount(x As String) As String
Dim arrNames, i As Long
Const domain As String = "#email.com"
arrNames = Split(x, ", ")
For i = 0 To UBound(arrNames)
arrNames(i) = Join(Split(arrNames(i)), ".") & domain
Next i
processNameMailAccount = Join(arrNames, ", ")
End Function
It can be tested in the following way:
Sub testprocessNameMailAccount()
Dim x As String
x = "FirstName1 LastName1, FirstName2 Lastname2, FirstName3 Lastname3"
Debug.Print processNameMailAccount(x)
End Sub
Assuming the names are in column B, you can apply the function to the whole column with code like the following. For testing purposes, it writes the results to column D:
Sub processNames()
Dim sh As Worksheet, lastR As Long, arr, i As Long
Set sh = ActiveSheet 'use here the sheet you need
lastR = sh.Range("B" & sh.rows.count).End(xlUp).row
arr = sh.Range("B2:B" & lastR).Value2
For i = 1 To UBound(arr)
arr(i, 1) = processNameMailAccount(CStr(arr(i, 1)))
Next i
'drop the processed array result in D:D:
sh.Range("D2").Resize(UBound(arr), 1).Value2 = arr
End Sub
If it returns what you need, you can change "D2" in the last code line to "B2", and the initial values in B:B will be replaced by the processed strings.

Appending to an array while looping

I've got the following:
Dim dupArray As Variant
dupArray = Array("dup")
For j = LBound(numArray) To UBound(numArray)
If IsInArray(Range(numArray(j)).Text, dupArray) Then
Range(numArray(j)).Interior.ColorIndex = 3
MsgBox "Duplicate value"
ElseIf IsInArray(Range(numArray(j)).Text, keyArray) Then
Application.OnKey Range(numArray(j)).Text, comArray(j)
ReDim Preserve dupArray(j)
Dim cellEntry As String
cellEntry = Range(numArray(j)).Text
MsgBox cellEntry
dupArray(j) = cellEntry
Else
Range(numArray(j)).Interior.ColorIndex = 3
MsgBox "Unrecognized key code in " & numArray(j)
End If
Next j
In essence, I want VBA to read the contents of a cell and append them to an array for comparison. Code to read the contents of a cell is Range(numArray(j)).Text... For some reason, cellEntry and dupArray(j) are not equal. More specifically, for the cell A6, cellEntry is "b" (which is the correct contents), but dupArray(j) is "A6"... any thoughts? There's no error code, it's just not putting the correct value in the array.
Thank you!
(Edit) Code for Function IsInArray:
Function IsInArray(stringToBeFound As String, arr As Variant) As Boolean
IsInArray = Not IsError(Application.Match(stringToBeFound, arr, 0))
End Function
(Edit 2) Don't pay attention to much else... I'm just wondering why cellEntry doesn't match dupArray(j) for all values of j when they should clearly be the same thing.
Your code seems to work, but it depends on keyArray being populated.
I've run this demo code, including populating numArray and keyArray with test values, to illustrate what happens.
If how I've populated these arrays doesn't match your code, please add that info to your Q.
Sub Demo()
Dim dupArray As Variant
Dim numArray As Variant
Dim keyArray As Variant
Dim comArray As Variant
Dim j As Long
' for testing
numArray = Application.Transpose([A1:A6].Value)
ReDim keyArray(1 To 3)
keyArray(1) = "x"
keyArray(2) = "a"
keyArray(3) = "s"
dupArray = Array("dup")
For j = LBound(numArray) To UBound(numArray)
If IsInArray(Range(numArray(j)).Text, dupArray) Then
Range(numArray(j)).Interior.ColorIndex = 3
'MsgBox "Duplicate value"
ElseIf IsInArray(Range(numArray(j)).Text, keyArray) Then
'Application.OnKey Range(numArray(j)).Text, comArray(j)
ReDim Preserve dupArray(j)
Dim cellEntry As String
cellEntry = Range(numArray(j)).Text
'MsgBox cellEntry
dupArray(j) = cellEntry
Else
Range(numArray(j)).Interior.ColorIndex = 4 ' changed to be distinct for testing
'MsgBox "Unrecognized key code in " & numArray(j)
End If
Next j
End Sub
[Screenshots omitted: sheet before, sheet after, and variable values at end of execution]
As you can see, dupArray has been populated sparsely, in line with numArray. This is fine for how it's used with IsInArray. If it's used for something else too, you can change how it's populated.
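If a densely packed dupArray is preferred, each new value can be appended at UBound + 1 rather than at index j. This is a minimal sketch. The helper name AppendValue is hypothetical, and it assumes dupArray is a Variant holding a one-dimensional array, as created by Array(...).

```vba
' Grows arr by one element and stores newValue in the new last slot.
Public Sub AppendValue(ByRef arr As Variant, ByVal newValue As Variant)
    ReDim Preserve arr(LBound(arr) To UBound(arr) + 1)
    arr(UBound(arr)) = newValue
End Sub

Sub DemoAppend()
    Dim dupArray As Variant
    dupArray = Array("dup")

    AppendValue dupArray, "a"
    AppendValue dupArray, "s"

    ' Prints: dup,a,s
    Debug.Print Join(dupArray, ",")
End Sub
```

In the original loop this would replace the two lines that resize and assign dupArray(j) with a single call such as AppendValue dupArray, cellEntry.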

Getting an array of strings from cells in excel - VBA

I am working on a program that needs to read an array of values from cells in another worksheet in the same workbook. I am able to read a single value just fine, but when I try to read multiple, I cannot return an array.
Here is what I am trying to do:
Dim list() As Variant
list = ActiveWorkbook.Worksheets("Sheet2").Range("A2:C2").value
Debug.Print TypeName(list)
Debug.Print UBound(list)
Debug.Print LBound(list)
Debug.Print TypeName(list(UBound(list)))
For which the output is:
Variant()
1
1
Subscript out of range
However, If I try it where I expect a single string, instead of an array of strings
Dim value As String
Let value = ActiveWorkbook.Worksheets("Site IDs and CJONs").Range("A2").value
Debug.Print TypeName(value)
Debug.Print value
for which I get the output
String
Expected Value
According to this question I should be able to simply return an array from the range function (example from the answer below), but it doesn't seem to be working for me. What am I doing wrong?
Dim DirArray As Variant
DirArray = Range("a1:a5").Value
Although it is not obvious, this:
Dim DirArray As Variant
DirArray = Range("a1:a5").Value
actually is like:
Dim DirArray(1 To 5, 1 To 1) As Variant
DirArray(1, 1) = Range("A1").Value
DirArray(2, 1) = Range("A2").Value
DirArray(3, 1) = Range("A3").Value
DirArray(4, 1) = Range("A4").Value
DirArray(5, 1) = Range("A5").Value
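Pulling a multi-cell range into an array always produces a 1-based 2-D array, even when the range is a single row or a single column. Only a single cell returns a plain value.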
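Because the array is two-dimensional, both indices are needed to read an element, which is why list(UBound(list)) fails in the question. This is a minimal sketch, assuming the sheet name and range from the question. The procedure name PrintRangeValues is hypothetical.

```vba
Sub PrintRangeValues()
    Dim list As Variant
    Dim r As Long, c As Long

    list = ActiveWorkbook.Worksheets("Sheet2").Range("A2:C2").Value

    ' Dimension 1 is rows, dimension 2 is columns; both are 1-based
    For r = LBound(list, 1) To UBound(list, 1)
        For c = LBound(list, 2) To UBound(list, 2)
            Debug.Print r, c, list(r, c)
        Next c
    Next r
End Sub
```

For A2:C2 the row index only takes the value 1, and the column index runs from 1 to 3.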
NOTE:
If you want to go from array to worksheet cells then, for example:
Sub ytrewq()
Dim DirArray(1 To 5, 1 To 1) As Variant
DirArray(1, 1) = "Larry"
DirArray(2, 1) = "Moe"
DirArray(3, 1) = "Curly"
DirArray(4, 1) = "Shepp"
DirArray(5, 1) = "James"
Range("B9").Resize(5, 1) = DirArray
End Sub
I might as well put my comment as an answer:
Option Explicit
Sub test()
Dim list As Variant
list = Application.Transpose(Application.Transpose(ActiveWorkbook.Worksheets("Sheet1").Range("A2:C2").Value))
Debug.Print TypeName(list)
Debug.Print UBound(list)
Debug.Print LBound(list)
'Debug.Print UBound(list, 2) Error
'Debug.Print LBound(list, 2) Error
Debug.Print TypeName(list(UBound(list)))
Debug.Print list(UBound(list))
End Sub
Gives output:
Variant()
3
1
String
x
where C2 contains letter x.

How do I remove anything that starts with numbers at the end of the text string in VBA?

So currently my VBA code looks like this:
Sub Testing()
Dim K As Long
Dim LR As Long
LR = Cells(Rows.Count, 1).End(xlUp).Row
For K = 2 To LR
Cells(K, 2).Value = StripAfter(Cells(K, 1), "_", 6)
Next K
End Sub
Function StripAfter(ByVal txt As String, ByVal delimiter As String, ByVal occurrence As Long) As String
Dim x As Variant
x = Split(expression:=txt, delimiter:=delimiter, limit:=occurrence + 1)
StripAfter = x(UBound(x))
End Function
I have this linked to a button that will output the data like this:
(Side note: Column A is pasted in, Column B is the result after having the VBA Macro run)
With this output it's exactly what the formula is made to do which is great! My question is and I can't wrap my head around this (I'm new with VBA Macros, trying to learn as best as I can) for the results in Column B, they all end in numbers with an X between the numbers. How would I adjust my code to make it so it deletes that portion of text? So the result would look like:
As you can see from the results I'm looking for compared to the results that are given, the ###X### is taken out at the end. I've played around outside of VBA and found this to work but its essentially a two step process:
=RIGHT(SUBSTITUTE(A1,"_",CHAR(10),12),LEN(A1)-FIND(CHAR(10),SUBSTITUTE(A1,"_",CHAR(10),12),1)+1)
^^^ This will grab the last section of the string from A1 (First image)
=LEFT(A20,MIN(FIND({0,1,2,3,4,5,6,7,8,9},A20&"0123456789")) -1)
^^^ (A20 is the cell holding the result of the formula above, i.e. the last section of the string in A1.) This deletes everything from the very first number onward. It works exactly how I want it to, but I have no idea where to begin implementing this in the VBA code above.
Any help would be greatly appreciated!
I also struggle with formulae like that, so I have used regular expressions, adding a few lines to your function. It's basically pattern matching. Your sub is as before.
Function StripAfter(ByVal txt As String, ByVal delimiter As String, ByVal occurrence As Long) As String
Dim x As Variant
x = Split(expression:=txt, delimiter:=delimiter, limit:=occurrence + 1)
StripAfter = x(UBound(x))
With CreateObject("VBScript.RegExp")
.Global = True
.Pattern = "\d+x\d+$" 'match 1+ numbers followed by x followed by 1+ numbers at the end of a string
If .Test(StripAfter) Then StripAfter = .Replace(StripAfter, "") 'if pattern found replace with empty string
End With
End Function
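If the goal is instead to mirror the second worksheet formula from the question, which cuts the text at the first digit, the same regular-expression approach can be sketched as follows. The function name StripFromFirstDigit is hypothetical, and the sketch assumes single-line text.

```vba
Function StripFromFirstDigit(ByVal txt As String) As String
    With CreateObject("VBScript.RegExp")
        .Global = False
        .Pattern = "\d.*$"   ' the first digit and everything after it
        StripFromFirstDigit = .Replace(txt, "")
    End With
End Function
```

For example, StripFromFirstDigit("Banner300x250") returns "Banner". It could be applied to the result of StripAfter in place of the "\d+x\d+$" replacement when any trailing part that starts with a number should go.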
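You could also try the following, which removes every digit from the string in column A (not only the trailing ones) and writes the result to column B: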
Option Explicit
Sub test()
Dim LastRow As Long, i As Long, Chr As Long
Dim str As String, NewStr As String
With ThisWorkbook.Worksheets("Sheet1")
LastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
For i = 2 To LastRow
str = .Range("A" & i).Value
For Chr = 1 To Len(str)
If Not IsNumeric(Mid(str, Chr, 1)) Then
NewStr = NewStr & Mid(str, Chr, 1)
End If
Next Chr
.Range("B" & i).Value = NewStr
NewStr = ""
Next i
End With
End Sub

Excel: Find All Instances of Text Strings in Range and Return Each Instance

I need to find all instances of particular identifier that may occur in one column and concatenate them into one string.
The identifier will start with "ECP", be separated by a dash or space, and have several characters after the separator, e.g. "ECP 05-00012A1", "ECP-123456".
I was using the formula below, but didn't think of multiple "ECP numbers."
=INDEX('Raw WAM Data'!$A$1:$A$10000,MATCH(VLOOKUP("*"&"ECP"&"*",'Raw WAM Data'!$A$1:$A$10000,1,FALSE),'Raw WAM Data'!$A$1:$A$10000,0))
I was then parsing the data in an adjacent cell using: =LEFT($C$62,FIND(" ", $C$62, FIND(" ", $C$62)+1))
This string was then loaded into a UserForm TextBox.
I would then need to concatenate all the returned values into one string separated by commas so that it can load into the UserForm TextBox.
I would think that VBA would be ideal for this, but I am open to any suggestions.
If I have understood correctly what you are trying to achieve, then you can use something like this:
Sub TEST()
Dim dic As Object: Set dic = CreateObject("Scripting.Dictionary")
Dim cl As Range, x&
With Sheets("Sheet1") 'replace sheet1 by name of your sheet
x = .Cells(Rows.Count, "A").End(xlUp).Row
For Each cl In .Range(.[A1], .Cells(x, "A"))
If UCase(cl.Value2) Like "ECP*" And Not dic.exists(cl.Value2) Then
dic.Add cl.Value2, Nothing
End If
Next cl
End With
Debug.Print Join(dic.keys, Chr(10))
End Sub
Updated
What's the best way to put the results in Column E relative to the cell in which it was found? Also, if I wanted to search multiple columns, how should I adapt the code?
You can do it this way:
Sub TEST2()
Dim cl As Range, x&
With Sheets("Sheet1") 'replace sheet1 by name of your sheet
x = .[A:C].Find("*", , , , xlByRows, xlPrevious).Row 'get the last used row in range
For Each cl In .Range(.[A1], .Cells(x, "C"))
If UCase(cl.Value2) Like "*ECP*" Then
If .Cells(cl.Row, "E").Value2 = "" Then
.Cells(cl.Row, "E").Value2 = cl.Value2
Else
.Cells(cl.Row, "E").Value2 = .Cells(cl.Row, "E").Value2 & "; " & cl.Value2
End If
End If
Next cl
End With
End Sub
If your values are in column A of a worksheet this routine will gather your ECP numbers and load them into an array. You can then load the array into your TextBox.
Sub GatherECPs()
Dim ECParr As Variant, x As Long
'Loop down each row starting at row 2 (assuming you have headers)
For x = 2 To SourceSheet.Range("A2").End(xlDown).Row
'Check if the start of the string is ECP
If Left(SourceSheet.Cells(x, 1).Value, 3) = "ECP" Then
'Add a row to the array
If IsEmpty(ECParr) Then
ReDim ECParr(0)
Else
ReDim Preserve ECParr(UBound(ECParr) + 1)
End If
'Add the value to the array
ECParr(UBound(ECParr)) = Right(SourceSheet.Cells(x, 1).Value, Len(SourceSheet.Cells(x, 1).Value) - 4)
End If
Next
End Sub
Replace SourceSheet with the sheet where your values exist.
To do it in a fast way which also works for multiple "ECP" in one cell just use this function:
Public Function getStr(rng As Range, ident As String) As String
Dim i As Long, x As Variant, y As Variant
For Each x In Intersect(rng, rng.Parent.UsedRange).Value
y = Split(x, ident)
If UBound(y) > 0 Then
For i = 1 To UBound(y)
getStr = getStr & ", " & ident & Split(y(i), ",")(0)
Next
End If
Next
getStr = Mid(getStr, 3)
End Function
It will return a comma-separated string. Use it like this: getStr(Range("A:A"), "ECP")
If you still have any questions, just ask ;)

Resources