Parsing comma-delimited string of serial numbers - excel

I would like some direction on how to approach a VBA solution for my scenario; details follow. I am very comfortable with VBA coding. I am really looking for advice on how to approach the problem, not any specific solution.
My department bears the highly-enviable task of daily label-making. We receive a spreadsheet from Production that has a cell/cells of serial numbers to be printed (examples below). The numbers are often not contiguous, but the basic (human-generated) 'format' is the same (hyphens for ranges, commas for single numbers). The serial numbers in the example below are 6 digits, but often are different lengths, adding to the complexity. I am looking for feedback on how to ultimately parse the cell.text into a complete list of serial numbers that can be ultimately used as a source for our label printer's software.
Again, I think I have the ability to actually code this; I am asking how to approach parsing the cell.value(s), identifying spaces, commas, and hyphens as needed, and retrieving a list of serial numbers, in any usable format. I can SPLIT at commas, and I can code the range before and after a hyphen. How do I approach the 6 digit format, as well as the change to the first three characters (364-365, could be many).
EXAMPLE SPREADSHEET CELL.VALUE: "364701-703, 705, 706, 708-710, 365100-104, 121" is a request for 14 labels:
EXPECTED PARSED RESULT: 364701, 364702, 364703, 364705, 364706, 364708, 364709, 364710, 365100, 365101, 365102, 365103, 365104, 365121

It's just a matter of how you keep track of things.
Given your data, the following will output what you want. Note that I added a standalone six-digit serial number (364800) to the request, since in your sample the full-length numbers appear only at the start of ranges:
Option Explicit
Sub labelMaker()
Const sRequest As String = "364701-703, 705, 706, 708-710,364800, 365100-104, 121"
Dim V, W, X
Dim lFirstThree As Long, I As Long, J As Long
'Dim D As Dictionary 'early binding
Dim D As Object 'late binding
'Set D = New Dictionary 'early binding
Set D = CreateObject("Scripting.Dictionary") 'late binding
V = Split(Replace(sRequest, " ", ""), ",")
For Each W In V
X = Split(W, "-")
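If Len(X(0)) = 6 Then lFirstThree = CLng(Left(X(0), 3)) 'we start a new series
For I = CLng(Right(X(LBound(X)), 3)) To CLng(Right(X(UBound(X)), 3))
'Format keeps leading zeros in the suffix, e.g. 365001 rather than 3651
D.Add lFirstThree & Format(I, "000"), lFirstThree & Format(I, "000")
Next I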
Next W
'write the results to the worksheet
V = WorksheetFunction.Transpose(D.Keys)
With Cells(1, 1).Resize(D.Count) 'will be on active sheet
.EntireColumn.Clear
.Value = V
End With
End Sub
The above works only with six-digit serial numbers, which is what you provided. I'm reasonably sure the variability can be coded for, but without knowing how the numbers vary (which part is fixed and which part is variable), it would be hard to provide a one-size-fits-all solution.
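One possible way to handle serial numbers of other lengths is sketched below. It rests on an assumption that is not confirmed anywhere in the question: a short token (such as 703 or 121) replaces the trailing digits of the most recent full-length serial number, and any token at least as long as that serial number starts a new series. The names ExpandSerials and CompleteSerial are hypothetical, and the sketch returns strings so that leading zeros survive.

```vba
Option Explicit

' Expands a request such as "364701-703, 705" into a Collection of strings.
' ASSUMPTION: a short token replaces the trailing digits of the most recent
' full-length serial number; a token of equal or greater length is a new serial.
Public Function ExpandSerials(ByVal request As String) As Collection
    Dim result As New Collection
    Dim token As Variant
    Dim parts() As String
    Dim lastFull As String
    Dim firstSerial As String, lastSerial As String
    Dim n As Long

    For Each token In Split(Replace(request, " ", ""), ",")
        If Len(token) > 0 Then
            parts = Split(token, "-")
            ' Complete the start of the range (or the single number)
            firstSerial = CompleteSerial(parts(0), lastFull)
            lastFull = firstSerial
            ' For a single number the last element is also the first
            lastSerial = CompleteSerial(parts(UBound(parts)), lastFull)
            For n = CLng(firstSerial) To CLng(lastSerial)
                ' Pad to the serial's own length to keep leading zeros
                result.Add Format$(n, String$(Len(firstSerial), "0"))
            Next n
        End If
    Next token
    Set ExpandSerials = result
End Function

' Hypothetical helper: prepends the missing leading digits from lastFull.
Private Function CompleteSerial(ByVal piece As String, _
                                ByVal lastFull As String) As String
    If Len(piece) >= Len(lastFull) Then
        CompleteSerial = piece
    Else
        CompleteSerial = Left$(lastFull, Len(lastFull) - Len(piece)) & piece
    End If
End Function
```

Called with the sample request from the question, this should yield the 14 serial numbers listed there. If Production ever mixes full serial numbers of different lengths in one cell, the length comparison becomes ambiguous and a different rule would be needed.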

You might code it however you would and post that to https://codereview.stackexchange.com/ and then you could see how some other people might approach it.
I don't have any illuminating advice, so I'll just show you how I'd do it. The splitting is easy enough and you just have to keep track of the first three numbers for when they're missing.
Public Sub GenerateSerialNumbers(ByVal sNumbers As String)
Dim vaComma As Variant, vaHyph As Variant
Dim i As Long, j As Long
Dim lPrefix As Long, lStart As Long, lEnd As Long
Dim sInput As String
Dim dc As Scripting.Dictionary
Set dc = New Scripting.Dictionary
vaComma = Split(sNumbers, ",")
For i = LBound(vaComma) To UBound(vaComma)
sInput = Trim$(vaComma(i))
If InStr(1, sInput, "-") > 0 Then
vaHyph = Split(sInput, "-")
'If you get a full one, keep the first three
If Len(vaHyph(0)) = 6 Then lPrefix = Val(Left$(sInput, 3)) * 1000
'Add the prefix if needed
lStart = Val(vaHyph(0))
If lStart < 1000 Then lStart = lPrefix + lStart
lEnd = Val(vaHyph(1))
If lEnd < 1000 Then lEnd = lPrefix + lEnd
Else
If Len(sInput) = 6 Then lPrefix = Val(Left$(sInput, 3)) * 1000
lStart = Val(sInput)
If lStart < 1000 Then lStart = lPrefix + lStart
lEnd = lStart
End If
'Generate the list
For j = lStart To lEnd
dc.Add j, j
Next j
Next i
Sheet1.Range("a1").Resize(dc.Count, 1).Value = Application.Transpose(dc.Items)
End Sub

Try this:
Function trlMyString(myString As String) As String
On Error GoTo trlMyStringError
Dim i As Integer
Dim j As Integer
Dim helpArray() As String
Dim strg As String
Dim label1 As String
Dim label2 As String
strg = ""
helpArray() = Split(myString, ", ")
For i = LBound(helpArray) To UBound(helpArray)
If Len(helpArray(i)) > 3 And InStr(1, helpArray(i), "-") <> 4 Then
label1 = Left$(helpArray(i), 3)
helpArray(i) = Right$(helpArray(i), Len(helpArray(i)) - 3)
End If
If InStr(1, helpArray(i), "-") > 0 Then
For j = CInt(Left$(helpArray(i), 3)) To CInt(Right$(helpArray(i), 3))
'Debug.Print CInt(Left$(helpArray(i), 3)), CInt(Right$(helpArray(i), 3))
label2 = Trim$(Str$(j))
strg = strg & label1 & label2 & ", "
Next j
Else
label2 = helpArray(i)
strg = strg & label1 & label2 & ", "
End If
Next i
'Debug.Print strg
trlMyStringExit:
trlMyString = strg
Exit Function
trlMyStringError:
Resume trlMyStringExit
End Function

Related

TEXTTOARRAY(). Has anyone written an optimal VBA function that can do the inverse of =ARRAYTOTEXT(,1)?

At first glance, a mixture of MID and LEN (to remove the curly brackets) and TEXTSPLIT would achieve this. However, this does not deal with edge cases where a semicolon or comma is present in an individual element. See the example below.
Let A1=1
Let B1="Semicolon ; in me"
Let A2="Comma , in me"
Let B2=4
ARRAYTOTEXT(A1:B2,1)={1,"Semicolon ; in me";"Comma , in me",4} = (C)
ARRAYTOTEXT_INV(C) = Spilled range identical to A1:B2
Now using a text split of (C) would find the semicolons and commas within the speech marks and split the text too much. I think I need some use of regex to get the desired result.
The inverse function will be applied to many such ranges so needs to be optimal. The answer needs to also be able to deal with numbers and blank values adequately.
Edit: needs to be able to solve for the below cases as well as normal text:
Numbers that don't have speech marks.
Blanks that are not surrounded by quotation marks.
Sets of sets (which is less likely to happen granted) such as
{"{"a,"," b,";" a,"," b,"}","{"c,"," d,";"c,"," d,"}"}
Edge cases such as {",",";"}; you can imagine an element being the formula
"=FIND(",",a1)" for example.
In the image below you can use ARRAYTOTEXT(B3:C4,1) to get to the value in B7. I want a function that can be placed in B10 (to spill into B10:C11) to give me the original values back i.e. the inverse of ARRAYTOTEXT.
See Excel Example
This is actually not so simple at all. But maybe try:
Formula in A3:
=DROP(DROP(REDUCE(0,MID(A1,SEQUENCE(LEN(A1)),1),LAMBDA(a,b,TOCOL(LET(x,TAKE(a,1),IF(b="""",VSTACK(NOT(x),DROP(a,1)),IF(x+ISNUMBER(--b),VSTACK(DROP(a,-1),TAKE(a,-1)&b),VSTACK(a,"")))),3))),1),-1)
I don't think this will tick your edge-cases.
I had a crack at my own problem. This seems to work for all cases. Can anyone make this more efficient?
Function TEXTTOARRAY(inarr As String)
Dim nDbleQuote As Long
Dim charLng As String
Dim BrkElum() As Long
Dim lenArr As Long
Dim nCol As Long, nRow As Long, nElum As Long
Dim iLng As Long, iRows As Long, iCols As Long, iElum As Long
Dim RowSep As String, ColSep As String
RowSep = Application.International(xlRowSeparator)
ColSep = Application.International(xlColumnSeparator)
'Remove curly brackets
Dim Arr As String: Arr = Mid$(inarr, 2, Len(inarr) - 2)
ReDim BrkElum(1 To 1): BrkElum(1) = 0
nElum = 1
nRow = 1
nCol = 1
lenArr = Len(Arr)
'Iterate through string and find break points
For iLng = 1 To lenArr
charLng = Mid$(Arr, iLng, 1)
If charLng = Chr(34) Then nDbleQuote = nDbleQuote + 1
If WorksheetFunction.IsEven(nDbleQuote) Then
If charLng = ColSep Then
If nRow = 1 Then nCol = nCol + 1
nElum = nElum + 1
ReDim Preserve BrkElum(1 To nElum)
BrkElum(nElum) = iLng
ElseIf charLng = RowSep Then
nRow = nRow + 1
nElum = nElum + 1
ReDim Preserve BrkElum(1 To nElum)
BrkElum(nElum) = iLng
End If
End If
Next iLng
ReDim Preserve BrkElum(1 To nElum + 1)
BrkElum(nElum + 1) = lenArr + 1
'Create array
Dim ArrOut() As Variant
ReDim ArrOut(1 To nRow, 1 To nCol)
For iRows = 1 To nRow
For iCols = 1 To nCol
iElum = (iRows - 1) * nCol + iCols
ArrOut(iRows, iCols) = Mid$(Arr, BrkElum(iElum) + 1, BrkElum(iElum + 1) - BrkElum(iElum) - 1)
If Left$(ArrOut(iRows, iCols), 1) = Chr(34) Then 'Remove outside quotes and replace internal double double quotes with single double quotes
ArrOut(iRows, iCols) = Replace(Mid$(ArrOut(iRows, iCols), 2, Len(ArrOut(iRows, iCols)) - 2), Chr(34) & Chr(34), Chr(34))
ElseIf IsNumeric(ArrOut(iRows, iCols)) Then 'Check if numeric and if so change from text to number
ArrOut(iRows, iCols) = CDbl(ArrOut(iRows, iCols))
End If
Next iCols
Next iRows
TEXTTOARRAY = ArrOut
End Function
You can see in the image linked below the original range in B4:D6.
You can see in B8 ARRAYTOTEXT(B4:D6,1).
You can see in B10:D12 TEXTTOARRAY(B8) (the desired function).
You can see in B14:D16 that all cells in B4:D6=B10:D12.
How it has worked out
Based on the observation that typing the formula ={1,"Semicolon ; in me";"Comma , in me",4} produces the desired result, this can be done as a UDF using Evaluate, made sensitive to the data type of the input:
Function TextToArray(r As Variant) As Variant
If TypeOf r Is Range Then
TextToArray = Application.Evaluate(r.Value2)
Else
TextToArray = Application.Evaluate(r)
End If
End Function
Regarding incomplete input (e.g. {"a",}): to use this you'd have to preprocess the input to add the missing elements (e.g. an empty string ""). But if you do that, you might as well just process the whole string.
Note that this answer is based on your input string being a valid formula. {"a",} is not a valid formula.

Convert UDF that Parses Time-Starts to an Array formula

I created a UDF to parse time-starts from a delimited string.
- Returns an Array(0 to 23) that represent hours in the day
- Each time-start is separated by a comma
- # is used to signify multiple time-starts
For example, 5#8p puts 5 at index 20 of the 0-based array (8 PM is hour 20).
AssignmentList("2#12a,3#6a,10#12p,6p,5#8p")(0)
Sub Setup()
Range("A1:AA1").Value = Array("1st", "2nd", "3rd", "12AM", "1AM", "2AM", "3AM", "4AM", "5AM", "6AM", "7AM", "8AM", "9AM", "10AM", "11AM", "12PM", "1PM", "2PM", "3PM", "4PM", "5PM", "6PM", "7PM", "8PM", "9PM", "10PM", "11PM")
Range("A2:C2").Value = Array("12a", "10a,3#12p", "6p,5#8p")
Range("D2:AA2").FormulaArray = "=AssignmentList($A2:$C2)"
End Sub
Function AssignmentList(ByRef Source As Variant) As Variant
Dim Assignments(0 To 23) As Double
Dim Item As Variant, At As Variant
Dim Text As String
Text = WorksheetFunction.TextJoin(",", True, Source)
For Each Item In Split(Text, ",")
If InStr(Item, "#") > 0 Then
At = Split(Item, "#")
Assignments(Hour(At(1))) = Assignments(Hour(At(1))) + At(0)
Else
Assignments(Hour(Item)) = Assignments(Hour(Item)) + 1
End If
Next
AssignmentList = Assignments
End Function
I would like to convert this function to an Array Formula but do not know where to start. References or advice as where to start would be greatly appreciated.
I am also interested in any way that I could improve my UDF. Ultimately, I will use whichever function gives me the best performance.
I would stick with the UDF -- it will be much simpler to maintain.
I wouldn't bother with joining.
I'd modify your routine a bit, but retain similar logic:
Unless you will be dealing with fractions or very large numbers, I'd use Long instead of Double.
Function AssignmentList(Source) As Long()
Dim Assignments(1 To 1, 1 To 24) As Long
Dim I As Long, V As Variant, W As Variant
Dim vSrc As Variant
Dim t As Date, l As Long
vSrc = Source 'assumed to be a single horizontal row
For I = LBound(vSrc, 2) To UBound(vSrc, 2)
V = Split(vSrc(1, I), ",")
For Each W In V
If InStr(W, "#") > 0 Then
l = Split(W, "#")(0)
t = Split(W, "#")(1)
Else
l = 1
t = W
End If
Assignments(1, Hour(t) + 1) = Assignments(1, Hour(t) + 1) + l 'accumulate, as the original UDF does
Next W
Next I
AssignmentList = Assignments
End Function

VBA text loop optimisation - Extract emails from text

I need a bit of help with a small project. I just started VBA and I think I could use learning to optimise my code.
Cell A2, contains a text with many email address separated by ",". I managed to extract all the email addresses but I think I made too much use of cells, and I was wondering if you can help me reduce that and use the variables defined instead.
Screenshot of the working code
Sub fpor()
Dim Text As String
Dim full As Integer
Dim i As Integer
Dim e As Integer
Dim part As String
Dim part_len As Integer
Dim part_a As Integer
Dim Text_2 As String
x = 5
ActiveCell = Range("A2")
Text = Range("A2")
full = Len(Text)
'full = InStrRev(Text, ",")
For i = 1 To full
Cells((x + i), 1).Value = Text
part = InStr(Text, ",")
Cells((x + i), 2).Value = part
Cells((x + i), 3) = Left(Text, part)
Cells((x + i), 4) = full - part
Text = Right(Cells((x + i), 1), Cells((x + i), 4))
If part = 0 Then
full = 0
Cells((x + i), 3) = Text
Exit For
Else:
full = Len(Text)
End If
Next i
MsgBox (full)
MsgBox (part)
End Sub
How do you think I can better optimise the For Loop?
Thank you all for your answers you awesome people : )
You can greatly simplify your code with the Split() function, as follows:
Option Explicit
Sub fpor()
Dim emailsArr As Variant
With Worksheets("emails") '<--change "emails" with your actual sheet name
emailsArr = Split(.Range("a2"), ",") '<--| split all emails names delimited by a ',' into an array
.Range("A6").Resize(UBound(emailsArr) + 1).value = Application.Transpose(emailsArr) '<--| write array content from cell A6 downwards (the array is zero-based, so it holds UBound + 1 items)
End With
End Sub
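If the addresses in A2 are separated by a comma followed by a space, or the text ends with a trailing comma, Split alone leaves padded or empty items. A minimal sketch of a variables-only helper that tidies this up is shown here; the name ParseEmails is hypothetical, and the idea that the input may contain such spaces or empty items is an assumption, since the cell contents are only visible in the screenshot.

```vba
Option Explicit

' Hypothetical helper: returns the comma-separated addresses as a Collection,
' trimmed and with empty items skipped. No worksheet cells are used.
Public Function ParseEmails(ByVal source As String) As Collection
    Dim result As New Collection
    Dim piece As Variant

    For Each piece In Split(source, ",")
        piece = Trim$(piece)                    ' drop spaces around the address
        If Len(piece) > 0 Then result.Add piece ' ignore empty items
    Next piece
    Set ParseEmails = result
End Function
```

It could be called as ParseEmails(Range("A2").Value), after which the items can be looped over with For Each or written to the sheet in one go.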

How to target and remove multiple sections out of a cell

Using VBA or a standard formula, I need to remove the following from cells.
I need to remove everything up to and including "Path:",
Then I need it to find | and start over until it reaches the end of the Cell
Example:
Category Name: Ladies, Category Path: Ladies|Category Name: Sale, Category Path: Sale|Category Name: New, Category Path: New|
Goal:
Ladies|Sale|New
It can include NO "|" or it can include up to 20 "|"
Edit: Realized I needed to show my work AFTER the tour. :)
I have spent a day or two on this and so far this is all I can come up with...
Dim s As String
s = Range("Z7").Value
Dim indexOfPath As Integer
Dim indexOfPipe As Integer
Dim indexOfCat As Integer
indexOfPath = InStr(1, s, "Path:")
indexOfPipe = InStr(1, s, "|")
Dim finalString As String
Dim pipeString As String
finalString = Right(s, Len(s) - indexOfPath - 5)
indexOfCat = InStr(1, finalString, "Path:")
pipeString = Right(finalString, Len(finalString) - indexOfCat - 5)
Range("A47").Value = finalString
Range("A48").Value = pipeString
However, I have got to the point where I am now confusing myself...
Split the cell value on "|", then split each value in the resulting array on "Path:" and take the second element from the result of that.
Like this:
Sub Tester()
Dim s As String, arr, v, arr2
s = "Category Name: Ladies, Category Path: Ladies|Category Name:" & _
" Sale, Category Path: Sale|Category Name: New, Category Path: New|"
arr = Split(s, "|")
For Each v In arr
v = Trim(v)
If Len(v) > 0 Then
arr2 = Split(v, "Path:")
If UBound(arr2) > 0 Then Debug.Print arr2(1)
End If
Next v
End Sub
Try this Function:
Function splitonbar(rng As Range) As String
Dim tempArr() As String
Dim temp As String
Dim i As Integer
tempArr = Split(rng.Value, "|")
For i = LBound(tempArr) To UBound(tempArr)
If Len(tempArr(i)) > 0 Then
temp = temp & "|" & Trim(Mid(tempArr(i), InStr(tempArr(i), "Path:") + 5))
End If
Next i
splitonbar = Mid(temp, 2)
End Function
It can be used as Formula on the sheet, or be called from another sub. To use as a UDF put in a module in the workbook then simply call it with a formula:
=splitonbar(Z7)
Or you can call it with a sub like this:
Sub splitstring()
Dim t As String
t = splitonbar(Range("Z7"))
Debug.Print t
End Sub
To directly fit your needs:
Public Function test(ByVal arg As Variant) As String
Dim i As Long
arg = Split(arg, "Category Name: ")
For i = 1 To UBound(arg)
arg(i) = Left(arg(i), InStr(arg(i), ",") - 1)
Next
test = Mid(Join(arg, "|"), 2)
End Function
The Split itself cuts everything in front of the keyword. The Left cuts everything after the comma (including the comma itself)
If you still have questions left, just ask :)

Parsing and comparing a complicated string

I am hoping someone could help me out with a VBA Excel macro.
I have received a worksheet in Excel 2007 which contains product names in one column, and I need to sort this into a logical format so I can use it. However, the list itself is not in any kind of logical order, is 10 000 rows long and I am going to have to do this every month!!
Basically, what I would like to do is search for certain keywords which are common to most of the entries and move them into separate cells in different columns (but in the same row as the original entry).
Regarding keywords: There are 3 different types, two of which I have a complete list of.
Example of keywords: some are measures such as cm (centimetre), mm (millimetre), m (metre) etc.). Then there are other keywords such as % and finally a last set of keywords which is wood, plastic, glass etc.
If this was not complicated enough, the measures (cm for example) are duplicated in some instances and are important details, so I can't just separate them but would ideally like them in two adjacent cells.
Fortunately, there is a space after each measure, % sign and item material.
Working from right to left is the easiest way I can think of achieving this as the first description in the string varies wildly between entries and that can stay as is.
So, below is an example string; let's say this is in Cell A1. (Inverted commas are not included in the string and the word "by" appears in only about 100 cases. Usually it is missing...)
"Chair Leg Wood 100% 1m by 20cm"
I would ideally like for the string to be split up into cells as follows
Cell B1 - Chair Leg
Cell C1 - Wood
Cell D1 - 1m
Cell E1 - 20cm
Cell F1 - 100%
Having the % measures in the same column would be extremely helpful
Can anyone please help me with this or the beginnings of a macro which does this and then moves down the list - I have tried using some basic "find" and "len" formulas but really am at my wits end on how to deal with this!
The task boils down to producing a robust definition of the structure of the input data.
From the info provided, a candidate definition might be
<Description, one or more words> <Material, one word> <A value followed by %> <Dimension A> <optional "by"> <Dimension B>
The following macro will process data that conforms to this spec. The definition may need expanding, e.g. for two-word materials (such as Mild Steel).
You will need to add error handling in case any rows don't conform, e.g. no % in the string, or a % character elsewhere in the string.
Option Explicit
Dim dat As Variant
Sub ProcessData()
Dim r As Range
Dim i As Long
Set r = Intersect(ActiveSheet.UsedRange, ActiveSheet.Columns(1)).Resize(, 5)
dat = r
For i = 1 To UBound(dat, 1)
ParseRow i, CStr(dat(i, 1))
Next
r = dat
ActiveSheet.Columns(5).Style = "Percent"
End Sub
Sub ParseRow(rw As Long, s As String)
'Chair Leg Wood 100% 1m by 20cm
Dim i As Long
Dim sDim As String, sPCnt As String, sMat As String, sDesc As String
Dim sA As String, sB As String
i = InStr(s, "% ")
sDim = Trim(Replace(Mid(s, i + 2), " by ", " ")) ' text to right of %, remove "by"
sA = Trim(Left(sDim, InStr(sDim, " "))) ' split dimension string in two
sB = Trim(Mid(sDim, InStr(sDim, " ")))
s = Left(s, i)
i = InStrRev(s, " ")
sPCnt = Mid(s, i + 1) ' text back to first space before %
s = Trim(Left(s, i))
i = InStrRev(s, " ") ' last word in string
sMat = Mid(s, i + 1)
sDesc = Trim(Left(s, i)) ' whats left
dat(rw, 1) = sDesc
dat(rw, 2) = sMat
dat(rw, 3) = sA
dat(rw, 4) = sB
dat(rw, 5) = sPCnt
End Sub
First, I'd use the Split function to separate the parts into an array, this will avoid most of the string functions and string math:
Dim parts As Variant
parts = Split(Range("A1").Value) 'default delimiter is a single space
Then, I'd do my comparisons to each part.
Finally, I'd concatenate the parts I didn't breakout, and place all parts on the sheet.
This is based on your example, which has spaces in between every part. Something similar could work otherwise; you would just have to do more work with each part.
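The three steps described above (split, compare each part, concatenate the rest) might look roughly like this. It is only a sketch under stated assumptions: the function name ClassifyParts is hypothetical, the material and unit lists contain just the examples named in the question, and the word "by" is simply dropped wherever it appears.

```vba
Option Explicit

' Hypothetical helper: returns a zero-based array holding
' (0) description, (1) material, (2) percent, (3) dimension A, (4) dimension B.
' ASSUMPTIONS: parts are separated by spaces; materials and units are limited
' to the examples given in the question; "by" is always a filler word.
Public Function ClassifyParts(ByVal productText As String) As Variant
    Dim parts() As String
    Dim result(0 To 4) As String
    Dim i As Long
    Dim word As String

    parts = Split(Trim$(productText))
    For i = LBound(parts) To UBound(parts)
        word = parts(i)
        If Len(word) = 0 Or LCase$(word) = "by" Then
            ' skip blanks caused by double spaces and the optional "by"
        ElseIf Right$(word, 1) = "%" Then
            result(2) = word
        ElseIf word Like "*#mm" Or word Like "*#cm" Or word Like "*#m" Then
            ' first measure found goes to A, the next one to B
            If Len(result(3)) = 0 Then result(3) = word Else result(4) = word
        ElseIf InStr(1, "|wood|plastic|glass|", "|" & LCase$(word) & "|") > 0 Then
            result(1) = word
        Else
            ' anything not recognised is part of the description
            result(0) = Trim$(result(0) & " " & word)
        End If
    Next i
    ClassifyParts = result
End Function
```

For the example string "Chair Leg Wood 100% 1m by 20cm" this should return Chair Leg, Wood, 100%, 1m and 20cm, which can then be written to B1:F1 in whatever column order is preferred.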
Here's my stab at it. We could use about 10 more examples, but this should be a start. To use, select a one-column range with your descriptions and run SplitProducts. It will split each description out to the right of its cell.
Sub SplitProducts()
Dim rCell As Range
Dim vaSplit As Variant
Dim i As Long
Dim aOutput() As Variant
Dim lCnt As Long
Const lCOLDESC As Long = 1
Const lCOLMAT As Long = 2
Const lCOLPCT As Long = 3
Const lCOLREM As Long = 4
If TypeName(Selection) = "Range" Then
If Selection.Columns.Count = 1 Then
For Each rCell In Selection.Cells
'split into words
vaSplit = Split(rCell.Value, Space(1))
ReDim aOutput(1 To 1, 1 To 1)
'loop through the words
For i = LBound(vaSplit) To UBound(vaSplit)
Select Case True
Case IsPercent(vaSplit(i))
'percents always go in the same column
lCnt = lCOLPCT
If UBound(aOutput, 2) < lCnt Then
ReDim Preserve aOutput(1 To 1, 1 To lCnt)
End If
aOutput(1, lCnt) = vaSplit(i)
Case IsInList(vaSplit(i))
'list items always go in the same column
lCnt = lCOLMAT
If UBound(aOutput, 2) < lCnt Then 'only grow the array, never shrink it
ReDim Preserve aOutput(1 To 1, 1 To lCnt)
End If
aOutput(1, lCnt) = vaSplit(i)
Case IsMeasure(vaSplit(i))
'measurements go in the last column(s)
If UBound(aOutput, 2) < lCOLREM Then
lCnt = lCOLREM
Else
lCnt = UBound(aOutput, 2) + 1
End If
ReDim Preserve aOutput(1 To 1, 1 To lCnt)
aOutput(1, lCnt) = vaSplit(i)
Case Else
'everything else gets concatenated in the desc column
aOutput(1, lCOLDESC) = aOutput(1, lCOLDESC) & " " & vaSplit(i)
End Select
Next i
'remove any extraneous spaces
aOutput(1, lCOLDESC) = Trim(aOutput(1, lCOLDESC))
'write the values to the right of the input range
rCell.Offset(0, 1).Resize(1, UBound(aOutput, 2)).Value = aOutput
Next rCell
Else
MsgBox "Select a one column range"
End If
End If
End Sub
Function IsPercent(ByVal sInput As String) As Boolean
IsPercent = Right$(sInput, 1) = "%"
End Function
Function IsInList(ByVal sInput As String) As Boolean
Dim vaList As Variant
Dim vaTest As Variant
'add list items as needed
vaList = Array("Wood", "Glass", "Plastic")
vaTest = Filter(vaList, sInput)
IsInList = UBound(vaTest) > -1
End Function
Function IsMeasure(ByVal sInput As String) As Boolean
Dim vaMeas As Variant
Dim i As Long
'add measurements as needed
vaMeas = Array("mm", "cm", "m")
For i = LBound(vaMeas) To UBound(vaMeas)
'any number of characters that end in a number and a measurement
If sInput Like "*#" & vaMeas(i) Then
IsMeasure = True
Exit For
End If
Next i
End Function
No guarantees that this will be speedy on 10k rows.
