Extracting numbers from a string in each cell - excel

I'm trying to write code that extracts X consecutive numbers from text.
For example, if I want to extract 5 consecutive numbers in my text:
Cell A1: dsuad28d2hr 22222222 11111 d33d11103
Cell B2: 11111 (wanted)
I could make it work for texts with only 5 numbers but the problem is if my text contains other consecutive numbers higher than 5.
Sub ExtractNum2()
Dim Caract() As String
Dim i As Integer
Dim j As Integer
Dim z As Integer
Dim cont As Integer
Dim goal As Integer
Dim Protocolo() As String
Dim cel As String
Dim lin As Long
lin = Range("A1", Range("A1").End(xlDown)).Rows.Count 'Repeat for each line
For z = 1 To lin
cel = Cells(z, 1)
ReDim Caract(Len(cel))
ReDim Protocolo(Len(cel))
cont = 0
For i = 1 To Len(cel)
Caract(i) = Left(Mid(cel, i), 1)
If IsNumeric(Caract(i)) Then 'Character check
cont = cont + 1
Protocolo(cont) = Caract(i)
'If Not IsNumeric(Caract(6)) And cont = 5 Then**
If cont = 5 '
Dim msg As String
For j = 1 To 5
msg = msg & Protocolo(j)
Next j
Cells(z, 2) = msg 'fills column B
msg = ""
End If
Else
cont = 0
End If
Next i
Next z 'end repeat
End Sub
I'm trying to use:
If Not IsNumeric(Caract(6)) And cont = 5 Then
But it is not working, my output is: B2: 22222 but I want 11111.
What am I missing?
EDIT
Sorry i wasnt clear. I want to extract X numbers with 6>x>4 (x=5). I dont want 22222 since it has 8 consecutive numbers and 11111 has 5 in my example.

UDF:
Function GetNum(cell)
With CreateObject("VBScript.RegExp")
.Pattern = "\b(\d{5})\b"
With .Execute(cell)
If .Count > 0 Then GetNum = .Item(0).SubMatches(0)
End With
End With
End Function
UPDATE:
If you want to return error (say, #N/A) instead of callee's default data type, you could write the following:
Function GetNum(cell)
With CreateObject("VBScript.RegExp")
.Pattern = "\b(\d{5})\b"
With .Execute(cell)
If .Count > 0 Then
GetNum = .Item(0).SubMatches(0)
Else
GetNum = CVErr(xlErrNA)
End If
End With
End With
End Function

I tried this with a Cell containing "Yjuj 525211111x5333332s5" to test whether 2 consecutive 5 characters get catch, and it worked great.
Sub Macro_Find_Five()
Dim str As String
Dim tmp As String
Dim cntr As Integer
Dim result As String
str = Sheet1.Cells(1, 1).Value
tmp = ""
cntr = 1
col = 2
result = ""
'For Loop for tracing each charater
For i = 1 To Len(str)
'Ignore first starting character
If i > 1 Then
'If the last character matches current character then
'enter the if condition
If tmp = Mid(str, i, 1) Then
'concatenate current character to a result variable
result = result + Mid(str, i, 1)
'increment the counter
cntr = cntr + 1
Else
'if the previous character does not match
'reset the cntr to 1
cntr = 1
'as well initialize the result string to "" (blank)
result = ""
End If
End If
'if cntr matches 5 i.e. 5 characters traced enter if condition
If cntr = 5 Then
'adding to next column the result found 5 characters same
Sheet1.Cells(1, col).Value = result
'increment the col (so next time it saves in next column)
col = col + 1
'initializing the variables for new search
cntr = 1
tmp = ""
result = ""
End If
'stores the last character
tmp = Mid(str, i, 1)
'if first character match concatenate.
If cntr = 1 Then
result = result + Mid(str, i, 1)
End If
Next i
End Sub

Related

VBA Excel: Feasible combination creator using single list of elements with no element repeating

I have the following Excel sheet which has random number combinations build using numbers from 2 to 50 in set of 3, 2 and 1 in Column A.
I am trying to build whole possible combinations between Column A elements such that the obtained combination doesn't have any repeating numbers in them and contains all the number from 2 to 50.
My current code starts from A2 and builds only a single combination set. It doesn't evaluate other possible combinations with starting element as in A2, it then goes to A3 and then builds only one combination set using A3. This step continues for A4,A5...
This is my current code.
Private Sub RP()
Dim lRowCount As Long
Dim temp As String, s As String
Dim arrLength As Long
Dim hasElement As Boolean
Dim plans() As String, currentPlan() As String
Dim locationCount As Long
Dim currentRoutes As String
Dim line As Long
Worksheets("Sheet1").Activate
Application.ActiveSheet.UsedRange
lRowCount = ActiveSheet.UsedRange.Rows.Count
locationCount = -1
line = 2
Debug.Print ("*********")
For K = 2 To lRowCount - 1
currentRoutes = ""
For i = K To lRowCount
s = ActiveSheet.Cells(i, 1)
Do
temp = s
s = Replace(s, " ", "")
Loop Until temp = s
currentPlan = Split(Trim(s), ",")
arrLength = UBound(currentPlan) - LBound(currentPlan) + 1
hasElement = False
If Len(Join(plans)) > 0 Then
For j = 0 To arrLength - 1
pos = Application.Match(currentPlan(j), plans, False)
If Not IsError(pos) Then
hasElement = True
Exit For
End If
Next j
End If
If Not hasElement Then
currentRoutes = currentRoutes & (Join(currentPlan, ",")) & " "
If Len(Join(plans)) > 0 Then
plans = Split(Join(plans, ",") & "," & Join(currentPlan, ","), ",")
Else
plans = currentPlan
End If
End If
Next i
If locationCount < 0 Then
locationCount = UBound(plans) - LBound(plans) + 1
End If
If (UBound(plans) - LBound(plans) + 1) < locationCount Then
Debug.Print ("Invalid selection")
Else
Debug.Print (Trim(currentRoutes))
Worksheets("Sheet1").Cells(line, 11) = currentRoutes
line = line + 1
End If
Erase plans
Debug.Print ("*********")
Next K
End Sub

VBA: sorted collection

The code below extracts & format values from the range B6:E6, and then stores them in the variable. Afterwards, the routine sorts the collection of 4 variables in the ascending order. When sorted they're being put into the range L31:O31.
The problem is that if there are less than 4 variables selected, say 3, the routine will skip L31 cell, and put the rest to M31:O31. Whilst it should be input as L31:N31, and O31 - blank.
How can the code be modified to make it fulfill the data starting from L31 if less than 4 variables are in the collection?
Function ExtractKey(s As Variant) As Long
Dim v As Variant, n As Long
v = Trim(s) 'remove spaces leave only spaces between words
If v Like "*(*)" Then 'if it's SOPXX (YYYY) then
n = Len(v) 'find number of the characters
If n = 11 Then
v = Mid(v, n - 7, 7) 'find the number of SOP + year in bracket
ElseIf n = 12 Then
v = Mid(v, n - 8, 8)
End If
v = Replace(v, "(", "") 'replace the brackets with nothing
v = Replace(v, " ", "")
'SOP10 (2015) doesn't have to go first before SOP12 (2014); switch figures
If n = 11 Then
v = Right(v, 4) + Left(v, 1)
ElseIf n = 12 Then
v = Right(v, 4) + Left(v, 2)
End If
ExtractKey = CLng(v)
Else
ExtractKey = 0
End If
End Function
Sub Worksheet_Delta_Update()
Dim SourceRange As Range, TargetRange As Range
Dim i As Long, j As Long, minKey As Long, minAt As Long
Dim v As Variant
Dim C As New Collection
Set SourceRange = Worksheets("t").Range("B6:E6")
Set TargetRange = Worksheets("x").Range("L31:O31")
For i = 1 To 4
v = SourceRange.Cells(1, i).Value
C.Add Array(ExtractKey(v), v)
Next i
'transfer data
For i = 1 To 4
minAt = -1
For j = 1 To C.Count
If minAt = -1 Or C(j)(0) < minKey Then
minKey = C(j)(0)
minAt = j
End If
Next j
TargetRange.Cells(1, i).Value = C(minAt)(1)
C.Remove minAt
Next i
End Sub
You could add one variable e.g. col which will be used instead of variable i when the value is inserted into TargetRange. This variable will work the same way as the i works but it will be incremented only when the value which is inserted is not empty. HTH
'transfer data
Dim col As Integer
col = 1
For i = 1 To 4
minAt = -1
For j = 1 To C.Count
If minAt = -1 Or C(j)(0) < minKey Then
minKey = C(j)(0)
minAt = j
End If
Next j
If (C(minAt)(1) <> "") Then
TargetRange.Cells(1, col).Value = C(minAt)(1)
col = col + 1
End If
C.Remove minAt
Next i

Matching two titles by words and calculate %

I was trying to automate an Excel file which has title in both A and B columns and I have to search each word from A within B and calculate the % by using the "no of words matched/total no of words (in column A)" formula.
I'm using the below code, however its not giving me the accurate % for which the title has repeated words (Duplicate words).
Sub percentage()
Dim a() As String, b() As String
Dim aRng As Range, cel As Range
Dim i As Integer, t As Integer
Set aRng = Range(Range("A1"), Range("A5").End(xlDown))
For Each cel In aRng
a = Split(Trim(cel), " ")
b = Split(Trim(cel.Offset(, 1)), " ")
d = 0
c = UBound(a) + 1
If cel.Value <> "" Then
If InStr(cel, cel.Offset(, 1)) Then
d = UBound(b) + 1
Else
For i = LBound(a) To UBound(a)
For t = LBound(b) To UBound(b)
If UCase(a(i)) = UCase(b(t)) Then
d = d + 1
End If
Next
Next
End If
End If
cel.Offset(0, 2).Value = (d / c)
Next
End Sub
If Title 1 : Really Nice pack with Nice print and Title 2 : Nice Print Nice pack then result should be 3/6 i.e. 67%.
But I'm getting a result as 100%.
Can anyone help me out please.
Titles are
Great job dud
Really Nice pack with Nice print
To give success and success process
Don’t eat too much. If you eat too much you will get sick
I have tried =noDuplicate(celladdress)
First, you should delete duplicate word in column B.
My function delete word and return array of word that not duplicate.
Function noDuplicate(ByVal str As String) As String()
Dim splitStr() As String
Dim result() As String
Dim i As Integer
Dim j As Integer
Dim k As Integer
Dim addFlag As Boolean
splitStr = Split(UCase(str), " ")
ReDim result(UBound(splitStr))
'
result(0) = splitStr(0)
k = 0
For i = 1 To UBound(splitStr)
addFlag = True
For j = 0 To k
If splitStr(i) = result(j) Then
addFlag = False
Exit For
End If
Next j
If addFlag Then
result(k + 1) = splitStr(i)
k = k + 1
End If
Next i
ReDim Preserve result(k)
noDuplicate = result
End Function
Then calculate the percentage of number of match word and number of word in column A.
Function percentMatch(ByVal colA As String, ByVal colB As String) As Double
Dim splitColA() As String
Dim splitColB() As String
Dim i As Integer
Dim j As Integer
Dim matchCount As Integer
splitColA = Split(UCase(colA), " ")
splitColB = noDuplicate(colB)
matchCount = 0
For i = 0 To UBound(splitColA)
For j = 0 To UBound(splitColB)
If splitColA(i) = splitColB(j) Then
matchCount = matchCount + 1
Exit For
End If
Next j
Next i
percentMatch = matchCount / (UBound(splitColA) + 1)
End Function
After add these two function, you can write your new code to below
Sub percentage()
Dim aRng As Range, cel As Range
Set aRng = Range(Range("A1"), Range("A5").End(xlDown))
For Each cel In aRng
cel.Offset(0, 2).Value = percentMatch(cel.Value, cel.Offset(0, 1).Value)
Next
End Sub
Note, I not protect for empty string in the function.
If you F8 through the code, you can see the problem.
The first Nice in column A loops through column B and counts 2 occurences.
Pack in column A loops through column B and counts 1 occurence.
The second Nice in column A loops through column B and counts 2 occurences.
Print in column A loops through column B and counts 1 occurence.
So you get a count of 6 against the 6 words in column A; 100%
If you add a random word to column A, you'll get 6 out of 7.

Why is my code ignoring underscores?

I am writing a very simple vba code that is designed to iterate through the letters/numbers in each cell in row 2 and only keep the numbers in the beginning. Each cell starts with between 4 and 7 numbers, and then will be followed by either a letter (1 or more), a . and a number, or an underscore and a letter and a number.
The issue I am having is that my code is only returning the correct values for some of the cells. Only the cells that have a . are cleaned correctly. The ones with the underscore delete everything after the _, but keep the _ itself and the cells with letters keep the letters but delete the . and anything after it.
This is my code:
Sub getIDs()
Dim counter As Integer
counter = 1
Dim rowCounter As Integer
rowCounter = 2
Dim original As String
original = ""
Dim newText As String
newText = ""
Do While Len(Cells(rowCounter, 2)) > 0
Do While counter <= Len(Cells(rowCounter, 2))
If Not IsNumeric((Mid(Cells(rowCounter, 2).Value, counter, 1))) Or Mid(Cells(rowCounter, 2).Value, counter, 1) = "_" Then
Exit Do
Else
counter = counter + 1
End If
Loop
newText = Left(Cells(rowCounter, 2), counter)
Cells(rowCounter, 2) = newText
rowCounter = rowCounter + 1
Loop
End Sub
Examples: The original cells contain these four types of info (numbers vary):
Input Desired output Actual output Actual output OK?
----------------|-----------------|--------------|-------------------------
12345_v2.jpg 12345 12345_ No, "_" should be removed
293847.psd 293847 293847 OK
82364382.1.tga 82364382 82364382 OK
172982C.5.tga 172982 172982C No, "C" should be removed
So I found two problems with your code. The first that counter, really needs to be counter-1 when you set new text since that is the position of the non-numeric or underscore character. Copying at the counter point will give you an extra character.
The second issue is that you need to reset the counter variable outside of the inner Do loop, otherwise you will start at the position of the previous last found character. Try this.
Sub getIDs()
Dim counter As Integer
counter = 1
Dim rowCounter As Integer
rowCounter = 2
Dim original As String
original = ""
Dim newText As String
newText = ""
Do While Len(Cells(rowCounter, 2)) > 0
counter = 1
Do While counter <= Len(Cells(rowCounter, 2))
If Not IsNumeric((Mid(Cells(rowCounter, 2).Value, counter, 1))) Or Mid(Cells(rowCounter, 2).Value, counter, 1) = "_" Then
Exit Do
Else
counter = counter + 1
End If
Loop
newText = Left(Cells(rowCounter, 2), counter - 1)
Cells(rowCounter, 2) = newText
rowCounter = rowCounter + 1
Loop
End Sub
This will remove dots and underscores.
I have to run; no time to implement removing letters. But this should put you on the right track. I may continue tomorrow.
Sub lkjhlkjh()
Dim s As String
Dim iUnderscore As Long
Dim iDot As Long
Dim iCut As Long
s = Sheet1.Cells(1, 1)
iUnderscore = InStr(s, "_")
iDot = InStr(s, ".")
If iUnderscore = 0 Then
If iDot = 0 Then
'Don't cut
Else
iCut = iDot
End If
Else
If iDot = 0 Then
iCut = iUnderscore
Else
If iDot > iUnderscore Then
iCut = iUnderscore
Else
iCut = iDot
End If
End If
End If
If iCut > 0 Then s = Left(s, iCut - 1)
Sheet1.Cells(1, 1) = s
End Sub

in vba how to define array and compare them?

i have 2 sheets , i want to find the same rows in 2 sheets , so i put the first row in array , and by a for next i define the first array ...then i define another array from second sheet , then i compare them .... why it doesn't work?
Sub compare()
Dim n(4) As Variant
Dim o(4) As Variant
Dim i As Integer
For i = 3 To 20 'satrha
For j = 2 To 4 'por kardan
n(j) = Sheets("guys").Cells(i, j)
Next 'por kardan
k = 3
Do 'hhhh
For Z = 2 To 4 'por dovomi
o(Z) = Sheets("p").Cells(k, Z)
Next 'por dovomi
If n(j) = o(Z) Then
Sheets("guys").Cells(i, 1) = Sheets("p").Cells(k, 2)
flag = True
Else
flag = False
k = k + 1
End If
Loop Until flag = False 'hhhhh
Next 'satrha
End Sub
Guessing from your existing code, my following code will copy the value from sheet "p" column B into sheet "guys" column A when a match is found.
Sub compare()
Dim i As Integer
Dim j As Integer
Dim l As Integer
l = Sheets("p").Range("B65535").End(xlUp).Row
Debug.Print l
For i = 3 To 20
For j = 3 To l
If Sheets("guys").Cells(i, 2).Value = Sheets("p").Cells(j, 2).Value And _
Sheets("guys").Cells(i, 3).Value = Sheets("p").Cells(j, 3).Value And _
Sheets("guys").Cells(i, 4).Value = Sheets("p").Cells(j, 4).Value Then
Sheets("guys").Cells(i, 1).Value = Sheets("p").Cells(j, 2).Value
Exit For
End If
Next
Next
End Sub
Noted that I explicitly said Value in my code. That will copy the computed value (e.g. result of a formula) instead of the "original" content.

Resources