I've been looking for a while for a code that would give me the digits between without using regex (I want my macro to be used by anyone especially non-computer friendly people). This is a small part of a code creating series for a chart dynamically creating the chart etc.
Here is the type of data I am dealing with "C23H120N5O4Cl" so I'd like to save in a variable 23 then in another one 120 the rest should not matter (it could be nothing).
My digits will likely be between single characters (C,H,or else) but I need the numbers after C and H. So at the moment here is my code :
RangeOccupied = Range("C2").End(xlDown).row
For i = 1 To RangeOccupied
If i <> RangeOccupied Then
'Look for digits after C
pos = InStr(1, Cells(i + 1, 2), "C") + 1
pos1 = InStr(pos, Cells(i + 1, 2), "H")
NumC = Mid(Cells(i + 1, 2), pos, pos1 - pos)
'Look for digits after H
pos = InStr(1, Cells(i + 1, 2), "H") + 1
pos1 = InStr(pos, Cells(i + 1, 2), "O")
NumH = Mid(Cells(i + 1, 2), pos, pos1 - pos)
End If
Next
Ideally I'd like the pos1 numbers not to be dependent on a specific character but any character. i.e having pos1=InStr(pos,Cells(i+1,2),"ANY NON-NUMBER CHARACTER").
I do not know if it is possible without using regex.
This function will return an array of the digit strings in a text string
Option Explicit
Function myDigits(str As String) As String()
Dim col As Collection
Dim I As Long, S() As String
I = 0
Set col = New Collection
Do Until I > Len(str)
I = I + 1
If IsNumeric(Mid(str, I, 1)) Then
col.Add Val(Mid(str, I, Len(str)))
I = I + 1
Do Until Not IsNumeric(Mid(str, I, 1))
I = I + 1
Loop
End If
Loop
ReDim S(0 To col.Count - 1)
For I = 1 To col.Count
S(I - 1) = col(I)
Next I
myDigits = S
End Function
Okay, I'm absolutely certain there is a more efficient way of doing this. But I think the following example makes it fairly clear on one way to separate your values.
Option Explicit
Sub test()
Dim testValues() As String
Dim val1 As Long
Dim val2 As Long
testValues = Split("C23H120N5O4Cl,C23O120N5H4Cl,C4H120", ",")
Dim testValue As Variant
For Each testValue In testValues
ExtractValues testValue, val1, val2
Debug.Print "For " & testValue & ": " & val1 & " and " & val2
Next testValue
End Sub
Public Sub ExtractValues(ByVal inString As String, _
ByRef output1 As Long, _
ByRef output2 As Long)
Dim outString1 As String
Dim outString2 As String
Dim stage As String
stage = "Begin"
Dim thisCharacter As String
Dim i As Long
For i = 1 To Len(inString)
thisCharacter = Mid$(inString, i, 1)
Select Case stage
Case "Begin"
If thisCharacter = "C" Then stage = "First Value"
Case "First Value"
If (Asc(thisCharacter) >= Asc("0")) And _
(Asc(thisCharacter) <= Asc("9")) Then
outString1 = outString1 & thisCharacter
Else
'--- if we get here, we're done with this value
output1 = CLng(outString1)
'--- verify the next character is the "H"
If thisCharacter = "H" Then
stage = "Second Value"
Else
stage = "Next Value"
End If
End If
Case "Next Value"
If thisCharacter = "H" Then stage = "Second Value"
Case "Second Value"
If (Asc(thisCharacter) >= Asc("0")) And _
(Asc(thisCharacter) <= Asc("9")) Then
outString2 = outString2 & thisCharacter
Else
'--- if we get here, we're done with this value
output2 = CLng(outString2)
stage = "Finished"
Exit For
End If
End Select
Next i
If Not (stage = "Finished") Then
output2 = CLng(outString2)
End If
End Sub
Here's another method that's more generic and efficient than my first solution. This approach uses a function to extract the number following a given substring -- in this case it's a single letter "C" or "H". The function accounts for the value being at the end of the input value as well.
Option Explicit
Sub test()
Dim testValues() As String
Dim val1 As Long
Dim val2 As Long
testValues = Split("C23H120N5O4Cl,C23O120N5H4Cl,C4H120", ",")
Dim testValue As Variant
For Each testValue In testValues
val1 = NumberAfter(testValue, "C")
val2 = NumberAfter(testValue, "H")
Debug.Print "For " & testValue & ": " & val1 & " and " & val2
Next testValue
End Sub
Private Function NumberAfter(ByVal inString As String, _
ByVal precedingString As String) As Long
Dim outString As String
Dim thisToken As String
Dim foundThisToken As Boolean
foundThisToken = False
Dim i As Long
For i = 1 To Len(inString)
thisToken = Mid$(inString, i, 1)
If thisToken = precedingString Then
foundThisToken = True
ElseIf foundThisToken Then
If thisToken Like "[0-9]" Then
outString = outString & thisToken
Else
Exit For
End If
End If
Next i
NumberAfter = CLng(outString)
End Function
I found this solution from here Extract numbers from chemical formula
Public Function ElementCount(str As String, element As String) As Long
Dim i As Integer
Dim s As String
For i = 1 To 3
s = Mid(str, InStr(str, element) + 1, i)
On Error Resume Next
ElementCount = CLng(s)
On Error GoTo 0
Next i
End Function
Which works but if simple molecules like CH4 are put in it does not work since no number are shown... but I (we) can probably work that out.
Thanks again for all the solutions !
EDIT:
Here is the function I use that I think takes all possible scenarios into account ! Thanks again for your help !
Public Function ElementCount(str As String, element As String) As Long
Dim k As Integer
Dim s As String
For k = 1 To Len(str)
s = Mid(str, InStr(str, element) + 1, k)
On Error Resume Next
ElementCount = CLng(s)
On Error GoTo 0
If InStr(str, element) > 0 And ElementCount = 0 Then
ElementCount = 1
End If
Next k
End Function
EDIT
Changed the function to use and return dictionaries having keys of "C" and "H" paired with their numbers. Included a screenshot below.
Made sure it handles for tricky situations where multiple letters are packed ontop of each other:
Code:
Sub mainLoop()
Dim numbers As Scripting.Dictionary: Set numbers2 = New Scripting.Dictionary
For i = 1 To 5
Set numbers = returnDict(Cells(i, 1).Value)
printout numbers, i
Next
End Sub
Function returnDict(cellValue As String) As Scripting.Dictionary
Dim i As Integer: i = 1
Dim holder As String: holder = ""
Dim letter As String
Set returnStuff = New Scripting.Dictionary
While i < Len(cellValue)
If Mid(cellValue, i, 1) = "C" Or Mid(cellValue, i, 1) = "H" Then
i = i + 1
If IsNumeric(Mid(cellValue, i, 1)) Then
letter = (Mid(cellValue, i - 1, 1))
Do While IsNumeric(Mid(cellValue, i, 1))
holder = holder & Mid(cellValue, i, 1)
i = i + 1
If i > Len(cellValue) Then Exit Do
Loop
returnStuff.Add letter, holder
holder = ""
ElseIf Mid(cellValue, i, 1) <> LCase(Mid(cellValue, i, 1)) Then
returnStuff.Add Mid(cellValue, i - 1, 1), "1"
End If
Else
i = i + 1
End If
Wend
End Function
And heres a quick little function used to print out the contents of the dictionary
Sub printout(dict As Scripting.Dictionary, row As Integer)
Dim i As Integer: i = 2
For Each Key In dict.Keys
Cells(row, i).Value = Key & ": " & dict.Item(Key)
i = i + 1
Next
End Sub
My 2c:
Sub tester()
Dim r, arr, v
arr = Array("C", "Z", "Na", "N", "O", "Cl", "Br", "F")
For Each v In arr
Debug.Print v, ParseCount("C15H12Na2N5O4ClBr", v)
Next v
End Sub
Function ParseCount(f, s)
Const ALL_SYMBOLS As String = "Ac,Al,Am,Sb,Ar,As,At,Ba,Bk,Be,Bi,Bh,Br,Cd,Ca,Cf,Ce,Cs,Cl," & _
"Cr,Co,Cn,Cu,Cm,Ds,Db,Dy,Es,Er,Eu,Fm,Fl,Fr,Gd,Ga,Ge,Au,Hf,Hs,He,Ho,In,Ir,Fe,Kr,La,Lr," & _
"Pb,Li,Lv,Lu,Mg,Mn,Mt,Md,Hg,Mo,Mc,Nd,Ne,Np,Ni,Nh,Nb,No,Og,Os,Pd,Pt,Pu,Po,Pr,Pm,Pa,Ra," & _
"Rn,Re,Rh,Rg,Rb,Ru,Rf,Sm,Sc,Sg,Se,Si,Ag,Na,Sr,Ta,Tc,Te,Ts,Tb,Tl,Th,Tm,Sn,Ti,Xe,Yb,Zn," & _
"Zr,B,C,F,H,I,N,O,P,K,S,W,U,V,Y"
Dim atoms, rv, pos, i As Long
atoms = Split(ALL_SYMBOLS, ",")
rv = 0 'default return value
If IsError(Application.Match(s, atoms, 0)) Then
rv = -1 'not valid atomic symbol
Else
i = 1
pos = InStr(i, f, s, vbBinaryCompare)
If pos > 0 Then
If Len(s) = 2 Then
'should be a true match...
rv = ExtractNumber(f, pos + 2)
ElseIf Len(s) = 1 Then
'check for false positives eg "N" matches on "Na"
Do While pos > 0 And Mid(f, pos + 1, 1) Like "[a-z]"
i = pos + 1
pos = InStr(i, f, s, vbBinaryCompare)
Loop
If pos > 0 Then rv = ExtractNumber(f, pos + 1)
Else
'exotic chemistry...
End If
End If
End If
ParseCount = rv
End Function
'extract consecutive numeric digits from f starting at pos
' *returns 1 if no number present*
Function ExtractNumber(f, pos)
Dim rv, s, i As Long
Do While (pos + i) <= Len(f)
If Not Mid(f, pos + i, 1) Like "#" Then Exit Do
i = i + 1
Loop
ExtractNumber = IIf(i = 0, 1, Mid(f, pos, i))
End Function
I'm trying to write code that extracts X consecutive numbers from text.
For example, if I want to extract 5 consecutive numbers in my text:
Cell A1: dsuad28d2hr 22222222 11111 d33d11103
Cell B2: 11111 (wanted)
I could make it work for texts with only 5 numbers but the problem is if my text contains other consecutive numbers higher than 5.
Sub ExtractNum2()
Dim Caract() As String
Dim i As Integer
Dim j As Integer
Dim z As Integer
Dim cont As Integer
Dim goal As Integer
Dim Protocolo() As String
Dim cel As String
Dim lin As Long
lin = Range("A1", Range("A1").End(xlDown)).Rows.Count 'Repeat for each line
For z = 1 To lin
cel = Cells(z, 1)
ReDim Caract(Len(cel))
ReDim Protocolo(Len(cel))
cont = 0
For i = 1 To Len(cel)
Caract(i) = Left(Mid(cel, i), 1)
If IsNumeric(Caract(i)) Then 'Character check
cont = cont + 1
Protocolo(cont) = Caract(i)
'If Not IsNumeric(Caract(6)) And cont = 5 Then**
If cont = 5 '
Dim msg As String
For j = 1 To 5
msg = msg & Protocolo(j)
Next j
Cells(z, 2) = msg 'fills column B
msg = ""
End If
Else
cont = 0
End If
Next i
Next z 'end repeat
End Sub
I'm trying to use:
If Not IsNumeric(Caract(6)) And cont = 5 Then
But it is not working, my output is: B2: 22222 but I want 11111.
What am I missing?
EDIT
Sorry i wasnt clear. I want to extract X numbers with 6>x>4 (x=5). I dont want 22222 since it has 8 consecutive numbers and 11111 has 5 in my example.
UDF:
Function GetNum(cell)
With CreateObject("VBScript.RegExp")
.Pattern = "\b(\d{5})\b"
With .Execute(cell)
If .Count > 0 Then GetNum = .Item(0).SubMatches(0)
End With
End With
End Function
UPDATE:
If you want to return error (say, #N/A) instead of callee's default data type, you could write the following:
Function GetNum(cell)
With CreateObject("VBScript.RegExp")
.Pattern = "\b(\d{5})\b"
With .Execute(cell)
If .Count > 0 Then
GetNum = .Item(0).SubMatches(0)
Else
GetNum = CVErr(xlErrNA)
End If
End With
End With
End Function
I tried this with a Cell containing "Yjuj 525211111x5333332s5" to test whether 2 consecutive 5 characters get catch, and it worked great.
Sub Macro_Find_Five()
Dim str As String
Dim tmp As String
Dim cntr As Integer
Dim result As String
str = Sheet1.Cells(1, 1).Value
tmp = ""
cntr = 1
col = 2
result = ""
'For Loop for tracing each charater
For i = 1 To Len(str)
'Ignore first starting character
If i > 1 Then
'If the last character matches current character then
'enter the if condition
If tmp = Mid(str, i, 1) Then
'concatenate current character to a result variable
result = result + Mid(str, i, 1)
'increment the counter
cntr = cntr + 1
Else
'if the previous character does not match
'reset the cntr to 1
cntr = 1
'as well initialize the result string to "" (blank)
result = ""
End If
End If
'if cntr matches 5 i.e. 5 characters traced enter if condition
If cntr = 5 Then
'adding to next column the result found 5 characters same
Sheet1.Cells(1, col).Value = result
'increment the col (so next time it saves in next column)
col = col + 1
'initializing the variables for new search
cntr = 1
tmp = ""
result = ""
End If
'stores the last character
tmp = Mid(str, i, 1)
'if first character match concatenate.
If cntr = 1 Then
result = result + Mid(str, i, 1)
End If
Next i
End Sub
I am looking for a formula to list occurrences of values only if they are greater than 2 times; and the result would be shown as in the image.
For example, if a value repeats 2 times, it's shown by "2", and 3 times by "3". so if there are two numbers repeating in the range, then it would be shown by "32" as in the image below. (There is no need for a comma between the numbers). Thanks.
Here is a simple UDF:
Function mycount(rng As Range) As String
Dim str As String
Dim rngcnt As Range
For Each rngcnt In rng
If InStr("," & str & ",", "," & rngcnt.Value & ",") = 0 Then
If Application.WorksheetFunction.CountIf(rng, rngcnt) > 1 Then
mycount = mycount & Application.WorksheetFunction.CountIf(rng, rngcnt)
str = str & "," & rngcnt
End If
End If
Next rngcnt
End Function
So your call on the sheet would be:
=mycount(A2:H2)
Then copy down.
The way I got it is defining a VBA function.This function uses a dictionary, so it is necessary to add th reference to 'Microsoft Scripting Runtime' (look here). Also, I have used a function to sort the characters in string from here
Function Repetitions(rng As Range)
Dim dict As New Scripting.Dictionary
Dim res() As Integer
For aux = 1 To rng.Count
Dim numero As Integer
numero = rng.Cells(1, aux).Value
If Not dict.Exists(numero) Then
dict.Add numero, 1
Else
dict(numero) = dict(numero) + 1
End If
Next aux
Dim result As String
result = ""
For aux = 0 To UBound(dict.Items)
If dict.Items(aux) > 1 Then result = result & dict.Items(aux)
Next aux
While Len(result)
iTemp = 1
Temp = Left(result, 1)
For I = 2 To Len(result)
If StrComp(Mid(result, I, 1), Temp, vbTextCompare) = 0 Then
If StrComp(Mid(result, I, 1), Temp, vbBinaryCompare) = 1 Then
Temp = Mid(result, I, 1)
iTemp = I
End If
End If
If StrComp(Mid(result, I, 1), Temp, vbTextCompare) = 1 Then
Temp = Mid(result, I, 1)
iTemp = I
End If
Next I
Repetitions = Repetitions & Temp
result = Left(result, iTemp - 1) & _
Mid(result, iTemp + 1)
Wend
End Function
After all, you will be able to use the function as formula in Excel, calling it as following for example:
=Repetitions(A2:F2)
I am facing problem when receiving a long message as below
40=1.22.50=0.002.60=35.
The system use the dot as separator while there is also decimal values for numeric value.
The desired output is
40=1.22
50=0.002
60=35
I am now using manual way to format the message. Hope to have a better way to overcome this.
Assuming you have one dot "." as the decimal position, and another "." that separates each element in the array. You can use the code below to read all values of the Long string into an array (Nums is the name of the array).
Option Explicit
Sub Seperate_DecimNumbers()
Dim Nums As Variant
Dim FullStr As String
Dim DotPosition As Integer
Dim i As Integer
' init array size to a large size , will redim it at the end to number of elements found
ReDim Nums(1 To 100)
FullStr = "40=1.22.50=0.002.60=35."
i = 1 ' init array elements counter
Do Until Len(FullStr) = 0
' call FindN function , searching for the 2nd "."
DotPosition = FindN(FullStr, ".", 2)
' unable to find 2 "." in the string >> last element in the array
If DotPosition = 0 Then
Nums(i) = FullStr
Exit Do
Else ' was able to find 2 "." in the string
Nums(i) = Left(FullStr, DotPosition - 1)
End If
i = i + 1
FullStr = Right(FullStr, Len(FullStr) - DotPosition)
Loop
' redim array back to maximum of numbers found in String
ReDim Preserve Nums(1 To i)
' place output start location from Range A2 and below (till number of elements in the array)
Range("A1").Offset(1, 0).Resize(UBound(Nums), 1).Value = Application.Transpose(Nums)
End Sub
Function FindN(sInputString As String, sFindWhat As String, N As Integer) As Integer
' this function find the Nth position of a certain character in a string
Dim J As Integer
Application.Volatile
FindN = 0
For J = 1 To N
FindN = InStr(FindN + 1, sInputString, sFindWhat)
If FindN = 0 Then ' unable to find the 2nd "." >> last appearance
Exit For
End If
Next
End Function
See result below:
Here's my take on the answer, which splits things on the = rather than the .. Doing it this way allows for input such as 40=1.22.50=0.002.60=35.70=120. (i.e. the part to the right of an = does not have to contain a ., it could be an integer.)
Sub SplitDotEqual()
Dim s As String
Dim a() As String
Dim i As Integer
Dim d As Integer
'Read from A1
s = Range("A1").Value
'Split on the "="
a = Split(s & ".", "=") ' include an extra "." to ensure that
' the final field is ended
For i = 0 To UBound(a) - 1
'Put the "=" back
a(i) = a(i) & "="
'Find the last "." before the next "="
d = InStrRev(a(i + 1), ".")
'Append everything prior to the "."
a(i) = a(i) & Left(a(i + 1), d - 1)
'Write to A2:Ax
Cells(i + 2, 1).Value = a(i)
'Strip off everything prior to the ".",
'leaving just the stuff prior to the "="
a(i + 1) = Mid(a(i + 1), d + 1)
Next
End Sub
Let's assume that every other dot is a separator. This code changes the odd-numbered dots into pipes and then parses on the pipes:
Sub parser()
Dim FlipFlop As Boolean, dot As String, pipe As String
Dim s As String, L As Long, i As Long, CH As String
dot = "."
pipe = "|"
s = Range("A1").Value
L = Len(s)
FlipFlop = True
For i = 1 To L
CH = Mid(s, i, 1)
If CH = dot Then
If FlipFlop Then
Else
Mid(s, i, 1) = pipe
End If
FlipFlop = Not FlipFlop
End If
Next i
msg = s & vbCrLf
ary = Split(s, pipe)
For Each a In ary
msg = msg & vbCrLf & a
Next a
MsgBox msg
End Sub
got more closer message and the code partially works.
8=TEST.1.2.9=248.35=D.49=MMUIJ.56=FGTUH.34=32998.50=MMTHUJ.57=AY/ABCDE.52=20161216-07:58:07.11=00708991.1=A-12345-