Split data string over columns AND rows using VBA - excel

I am trying to speed up a currently working automated workbook.
PHP sends a string similar to the below to VBA:
1[|:#:|]text-one[|:#:|]code-one[|:#:|]qty-one[{:#:}]
2[|:#:|]text-two[|:#:|]code-two[|:#:|]qty-two[{:#:}]
where
[|:#|] represents "new column"
[{:#:}] represents "new row"
When it is parsed by the VBA this is the output:
I currently use the following VBA code to parse this into a workbook:
myArray = Split(myReply, "[{:#:}]")
myRow = 1
For Each element In myArray
myRow = myRow + 1
subArray = Split(element, "[|:#:|]")
myCol = 2
For Each subelement In subArray
myCol = myCol + 1
Cells(myRow, myCol).Value = subelement
Next subelement
Next element
I am about to start optimising the code in this workbook and I am aware I can do something like (pseudo code):
for each element....
Range("C2:F2").Value = Split(element, "[|:#:|]") 'Example row number would be incremental
However is there a way to do it so that I can split into the entire Range?
For example, If I know there are 29 "rows" within the data that has been returned, I would like to be able to use split to place the data into all the rows.
I imagine the syntax would be something similar to the below, however this doesn't seem to work:
Range("C2:F29").Value = Split(Split(element, "[|:#:|]"),"[{:#:}]")

The optimal thing to do is to do everything in native VBA code and not interact with the Excel sheet until the end. Writing to sheet is a time consuming operation, so this procedure does it once and once only, writing the whole two-dimensional array at once, rather than writing it line by line. Therefore no need to disable screen updating, calculation, or anything else.
Function phpStringTo2DArray(ByVal phpString As String) As Variant
Dim iRow As Long
Dim iCol As Long
Dim nCol As Long
Dim nRow As Long
Dim nColMax As Long
Dim lines() As String
Dim splitLines() As Variant
Dim elements() As String
lines = Split(phpString, "[{:#:}]")
nRow = UBound(lines) - LBound(lines) + 1
ReDim splitLines(1 To nRow)
For iRow = 1 To nRow
splitLines(iRow) = Split(lines(iRow - 1), "[|:#:|]")
nCol = UBound(splitLines(iRow)) - LBound(splitLines(iRow)) + 1
' in case rows have different number of columns:
If nCol > nColMax Then nColMax = nCol
Next iRow
Erase lines
'We now have a (Variant) array of arrays. Convert this to a regular 2D array.
ReDim elements(1 To nRow, 1 To nColMax)
For iRow = 1 To nRow
nCol = UBound(splitLines(iRow)) - LBound(splitLines(iRow)) + 1
For iCol = 1 To nCol
elements(iRow, iCol) = splitLines(iRow)(iCol - 1)
Next iCol
Next iRow
Erase splitLines
phpStringTo2DArray = elements
End Function
Example usage:
Dim s As String
Dim v As Variant
s = "1[|:#:|]text-one[|:#:|]code-one[|:#:|]qty-one[{:#:}]2[|:#:|]text-two[|:#:|]code-two[|:#:|]qty-two[{:#:}]"
v = phpStringTo2DArray(s)
'Write to sheet
Range("A1").Resize(UBound(v, 1), UBound(v, 2)) = v
If you want to ignore the final line break [{:#:}], could add this line at the top of the function:
If Right(phpString, 7) = "[{:#:}]" Then phpString = Left(phpString, Len(phpString) - 7)

This wasn't as easy as I originally thought. I can get rid of one loop easily. But there's still an if test, so it doesn't break on empty strings etc. I feel a guru could make this even more efficient.
My worry is that for you this process is taking a lot of time. If you are trying to speed things up, your code doesn't look too horribly inefficient.
More likely if it's running slowly, is that the application.calculation & application.screenUpdating settings are set incorrectly.
Sub takePHP(myString As String)
'This sub takes specially formatted strings from a PHP script,
'and parses into rows and columns
Dim myRows As Variant
Dim myCols As Variant
Dim subRow As Variant
Application.ScreenUpdating = False
Application.Calculation = xlCalculateManual
myRows = Split(myString, "[{:#:}]")
x = 1
For Each subRow In myRows
bob = Split(subRow, "[|:#:|]")
If UBound(bob) <> -1 Then
Range(Cells(x, 1), Cells(x, UBound(bob) + 1)).Value = bob
x = x + 1
End If
Next
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
End Sub

Related

VBA Filter Column Store Unique Values of Another Column

I'm trying to accomplish the following.
Filter Column A
Grab Unique values from Column B
So given the following table...
I'd like to filter for "One" in column A and get an array back that I can paste unto column C like this...
I've tried to use dictionaries but I have little understanding of how that works. There can be thousands of rows so speed can be an issue and I'd rather not loop through each if it's not necessary.
I've seen solutions that bring back unique values of a column using advanced filter but never a combination of filtering one column and then using the filtered results to get a unique list of values.
Example of code (partial) I've tried:
On Error Resume Next
enterpriseReportSht.ShowAllData
On Error GoTo 0
With enterpriseReportSht
.AutoFilterMode = False
With .Range(Cells(1, 1).Address, Cells(entRptLR, entRptLC).Address)
.AutoFilter Field:=manLevel2CN, Criteria1:=userInputsArr(i, manLevel2InputCN)
'.SpecialCells(xlCellTypeVisible).Copy Destination:=resultsSht.Range("A1")
End With
End With
filteredColArr = enterpriseReportSht.UsedRange.columns(manLevel4CN).Value
RemoveDuplicatesFromArray (filteredColArr)
with this function:
Public Function RemoveDuplicatesFromArray(sourceArray As Variant)
Dim duplicateFound As Boolean
Dim arrayIndex As Integer, i As Integer, j As Integer
Dim deduplicatedArray() As Variant
arrayIndex = -1
deduplicatedArray = Array(1)
For i = LBound(sourceArray) To UBound(sourceArray)
duplicateFound = False
For j = LBound(deduplicatedArray) To UBound(deduplicatedArray)
If sourceArray(i) = deduplicatedArray(j) Then
duplicateFound = True
Exit For
End If
Next j
If duplicateFound = False Then
arrayIndex = arrayIndex + 1
ReDim Preserve deduplicatedArray(arrayIndex)
deduplicatedArray(arrayIndex) = sourceArray(i)
End If
Next i
RemoveDuplicatesFromArray = deduplicatedArray
End Function
My concerns with it is that it's not grabbing the filtered data. It's grabbing all of it I believe. I'm also getting an error with the remove duplicates function.
This should do what you are looking for using a dictionary.
You could speed it up by loading the range into an array and iterating through that, but it's a bit of a pain to do that with a filtered range as well as getting the upperbound of a two dimensional array, you'll need to transpose it into a one dimensional array first. Probably not worth it unless you notice the speed is really slow. I tested with 15k rows it was < 1 second.
Dim i As Long
Dim lr As Long
Dim filterrange As Range
Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")
With Sheet1 'Change as needed
lr = .Cells(.Rows.Count, 1).End(xlUp).Row
Set filterrange = .Range(.Cells(1, 1), .Cells(lr, 2))
filterrange.AutoFilter 1, "One"
lr = .Cells(.Rows.Count, 1).End(xlUp).Row 'Only really necessary if you have a lot of rows
For i = 1 To lr
If .Rows(i).EntireRow.Hidden = False Then
If Not dict.exists(.Cells(i, 2).Value) Then
dict.Add .Cells(i, 2).Value, ""
End If
End If
Next i
filterrange.AutoFilter
Dim key As Variant
i = 1
For Each key In dict
.Cells(i, 3).Value = key
i = i + 1
Next key
End With
Assuming you have office 365 and no mention of VB only solution, this can achieved using worksheet functions itself
=UNIQUE(FILTER(B:B,A:A=A1))

how can I insert a condition "if it contains" to search for a specific letter in VBA?

I want to count how many times appear the parameters CA, CU and CH, in an excel that looks like this:
I have tried to use the following code, but as the cells don't contain only the parameter I am searching for, it doesn't work:
Sub ContarOV()
Dim cont As Variant
Dim sumaCA As Variant
Dim sumaCU As Variant
Dim sumaCH As Variant
sumaCA = 0
sumaCU = 0
sumaCH = 0
For cont = 3 To 12
If Cells(cont, 2) = ("CA") Then
sumaCA = sumaCA + 1
End If
If Cells(cont, 2) = ("CU") Then
sumaCU = sumaCU + 1
End If
If Cells(cont, 2) = ("CH") Then
sumaCH = sumaCH + 1
End If
Next cont
End Sub
As per #BigBen, I would try to avoid any iteration. What about one of the following options (assuming your data sits from A2:A?):
Sub Test()
Dim lr As Long, x As Long
Dim arr As Variant
Dim rng As Range
With Sheet1 'Change according to your sheets CodeName
'Get last used row
lr = .Cells(.Rows.Count, 1).End(xlUp).Row
'Get data into memory for method 1
arr = Application.Transpose(.Range("A2:A" & lr).Value)
'Create range object for method 2
Set rng = .Range("A2:A" & lr)
'Method 1: Count values with FILTER
Debug.Print UBound(Filter(arr, "CA")) + 1
Debug.Print UBound(Filter(arr, "CU")) + 1
Debug.Print UBound(Filter(arr, "CH")) + 1
'Method 2: Count values with COUNTIF
Debug.Print WorksheetFunction.CountIf(rng, "CA*")
Debug.Print WorksheetFunction.CountIf(rng, "CU*")
Debug.Print WorksheetFunction.CountIf(rng, "CH*")
End With
End Sub
Btw, I would give sumaCA and your other variables a meaningfull data type, Long in this case.
You can use InStr() to return the position of the desired characters in the string. This would look something like If Not InStr(1, Cells(cont,2).Text, "CH") = 0 Then, but looping through strings is generally a slow process. Unless you have a specific need for looping, I like BigBen's answer a lot better than I like looping with InStr().

Nested loops causing Excel crash

I am attempting to run a VBA macro that iterates down about 67,000 rows with 100 columns in each row. For each of the cells in these rows, the value is compared against a column with 87 entries in another sheet. There are no errors noted when the code is run but Excel crashes every time. The odd thing is that the code seems to work; I have it set to mark each row in which a match is found and it does so before crashing. I have attempted to run it many times and it has gotten through between 800 and 11,000 rows before crashing, depending on the attempt.
My first suspect was memory overflow due to the volume of calculations but my system shows CPU utilization at 100% and memory usage around 50% while running this code:
Sub Verify()
Dim codes As String
Dim field As Object
For i = 2 To Sheets("DSaudit").Rows.Count
For Each field In Sheets("Dsaudit").Range(Cells(i, 12), Cells(i, 111))
r = 1
While r <= 87
codes = ThisWorkbook.Sheets("287 Denominator CPT").Cells(r, 1).Value
If field = codes Then
Cells(i, 112).Value = "True"
r = 88
Else
r = r + 1
End If
Wend
Next field
i = i + 1
Next i
End Sub
It should also be noted that I am still very new to VBA so it's likely I've made some sort of egregious rookie mistake. Can I make some alterations to this code to avoid a crash or should I scrap it and take a more efficient approach?
When ever possible iterate variant arrays. This limits the number of times vba needs to access the worksheet.
Every time the veil between vba and Excel is pierced cost time. This only pierces that veil 3 times not 9,031,385,088
Sub Verify()
With Sheets("DSaudit")
'Get last row of Data
Dim lastrow As Long
lastrow = .Cells(.Rows.Count, 12).End(xlUp).Row 'if column 12 ends before the last row of data change to column that has them all.
'Load Array with input Values
Dim rng As Variant
rng = .Range(.Cells(2, 12), .Cells(lastrow, 111)).Value
'Create output array
Dim outpt As Variant
ReDim outpt(1 To UBound(rng, 1), 1 To 1)
'Create Match array
Dim mtch As Variant
mtch = Worksheets("287 Denominator CPT").Range("A1:A87").Value
'Loop through first dimension(Row)
Dim i As Long
For i = LBound(rng, 1) To UBound(rng, 1)
'Loop second dimension(Column)
Dim j As Long
For j = LBound(rng, 2) To UBound(rng, 2)
'Loop Match array
Dim k As Long
For k = LBound(mtch, 1) To UBound(mtch, 1)
'If eqaul set value in output and exit the inner loop
If mtch(k, 1) = rng(i, j) Then
outpt(i, 1) = "True"
Exit For
End If
Next k
'If filled true then exit this for
If outpt(i, 1) = "True" Then Exit For
Next j
Next i
'Assign the values to the cells.
.Cells(2, 112).Resize(UBound(outpt, 1), 1).Value = outpt
End With
End Sub

Loop through column, store values in an array

I am trying to find a way to:
Loop through a column (B column)
Take the values, store them in an array
Loop through that array and do some text manipulation
However, I cannot think of a way to loop through a column and take those values, storing them in an array. I have looked through Stack Overflow and google but have not found a successful solution.
In advance, thank you for your help.
Sub collectNums()
Dim eNumStorage() As String ' initial storage array to take values
Dim i as Integer
Dim j as Integer
Dim lrow As Integer
lrow = Cells(Rows.Count, "B").End(xlUp).Row ' The amount of stuff in the column
For i = lrow To 2 Step -1
If (Not IsEmpty(Cells(i, 2).Value)) Then ' checks to make sure the value isn't empty
i = eNumStorage ' I know this isn't right
Next i
If (IsEmpty(eNumStorage)) Then
MsgBox ("You did not enter an employee number for which to query our database. Quitting")
Exit Sub
End If
End Sub
This is the easiest way to get column to array:
Public Sub TestMe()
Dim myArray As Variant
Dim cnt As Long
myArray = Application.Transpose(Range("B1:B10"))
For cnt = LBound(myArray) To UBound(myArray)
myArray(cnt) = myArray(cnt) & "something"
Next cnt
For cnt = LBound(myArray) To UBound(myArray)
Debug.Print myArray(cnt)
Next cnt
End Sub
It takes the values from B1 to B10 in array and it gives possibility to add "something" to this array.
The Transpose() function takes the single column range and stores it as an array with one dimension. If the array was on a single row, then you would have needed a double transpose, to make it a single dimension array:
With Application
myArray = .Transpose(.Transpose(Range("A1:K1")))
End With
MSDN Transpose
CPearson Range To Array
Creating an Array from a Range in VBA
Just adding a variation on Vityata's which is the simplest way. This method will only add non-blank values to your array. When using your method you must declare the size of the array using Redim.
Sub collectNums()
Dim eNumStorage() As String ' initial storage array to take values
Dim i As Long
Dim j As Long
Dim lrow As Long
lrow = Cells(Rows.Count, "B").End(xlUp).Row ' The amount of stuff in the column
ReDim eNumStorage(1 To lrow - 1)
For i = lrow To 2 Step -1
If (Not IsEmpty(Cells(i, 2).Value)) Then ' checks to make sure the value isn't empty
j = j + 1
eNumStorage(j) = Cells(i, 2).Value
End If
Next i
ReDim Preserve eNumStorage(1 To j)
'Not sure what this bit is doing so have left as is
If (IsEmpty(eNumStorage)) Then
MsgBox ("You did not enter an employee number for which to query our database. Quitting")
Exit Sub
End If
For j = LBound(eNumStorage) To UBound(eNumStorage) ' loop through the previous array
eNumStorage(j) = Replace(eNumStorage(j), " ", "")
eNumStorage(j) = Replace(eNumStorage(j), ",", "")
Next j
End Sub

excel vba - efficiently loop 2d array

I'm hopelessly trying to find a better way of filling a range contents. This way produces the correct results but is very slow. Can anyone point me in the correct direction in terms of how to fill a 2d array or otherwise to speed up the algorithm? I would love a code snippet someone has had success with or even just links that show a cleaner method.
here is my OLD code:
----------------
f = 1
maxcol = 'func call to get last non blank col ref .ie could return T, R, H.etc
For f = 1 To UBound(filenames)
Set aDoc = LoadXmlDoc(filenames(f))
For Each c In Worksheets("Results").Range("A1:" & maxcol & "1")
c.Offset(f, 0).Value = aNode.Text
Next c
Worksheets("Results").Range(maxcol & "1").Offset(f, 0).Value = filenames(f)
Next f
UPDATED CODE:
----------
Dim aDoc As DOMDocument
Dim aNode As IXMLDOMNode
Dim numOfXpaths As Integer
Dim filenames As Variant
Dim f As Integer
Dim maxcol As String
Dim rngStart As Range
Dim nColIndex As Long
Dim lngCalc As Long
'Dim numOfFiles As Integer
Dim aXpaths As Variant
numOfFiles = UBound(filenames)
colToRow aXpaths, numOfXpaths
maxcol = Number2Char(numOfXpaths)
ReDim aValues(1 To numOfFiles, 1 To numOfXpaths + 1) As Variant
For f = 1 To numOfFiles
Set aDoc = LoadXmlDoc(filenames(f))
For nColIndex = 1 To numOfXpaths
If aDoc.parseError Then
aValues(f, nColIndex) = "XML parse error:"
Else
Set aNode = aDoc.selectSingleNode(aXpaths(nColIndex))
aValues(f, nColIndex) = aNode.Text
End If
Next nColIndex
aValues(f, numOfXpaths + 1) = filenames(f)
Next f
Worksheets("Results").Range("A1").Offset(1, 0).Resize(numOfFiles, numOfXpaths + 1).Value = aValues
Function colToRow(ByRef aXpaths As Variant, ByRef numOfXpaths As Integer)
Dim xpathcount As Integer
Dim c As Integer
'Dim aXpaths As Variant
xpathcount = Worksheets("Xpaths").Cells(Rows.Count, "A").End(xlUp).Row - 1
ReDim aXpaths(1 To xpathcount + 1) As Variant
For c = 0 To xpathcount
Worksheets("Results").Range("A1").Offset(0, c) = Worksheets("Xpaths").Range("A1").Offset(c, 0)
Worksheets("Results").Range("A1").Offset(0, c).Columns.AutoFit
aXpaths(c + 1) = Worksheets("Xpaths").Range("A1").Offset(c, 0)
Next c
Worksheets("Results").Range("A1").Offset(0, xpathcount + 1) = "Filename"
'colToRow = xpathcount + 1
numOfXpaths = xpathcount + 1
End Function
Function Number2Char(ByVal c) As String
Number2Char = Split(Cells(1, c).Address, "$")(1)
End Function
To do this efficiently you should generate a 2-dimensional data with the data you want to write, then write it all in one go.
Something like the following. I prefer 0-based arrays for compatibility with other languages whereas you seem to be using a 1-based array (1 to UBound(filenames). So there may be off-by-one errors in the following untested code:
f = 1
maxcol = 'func call to get last non blank col ref .ie could return T, R, H.etc
' 2D array to hold results
' 0-based indexing: UBound(filenames) rows and maxcol columns
Dim aValues(0 to UBound(filenames)-1, 0 To maxcol-1) As Variant
Dim rngStart As Range
Dim nColIndex As Long
For f = 1 To UBound(filenames)
Set aDoc = LoadXmlDoc(filenames(f))
aValues(f-1, 0) = filenames(f)
For nColIndex = 1 To maxCol-1
aValues(f-1, nColIndex) = aNode.Text
Next nColIndex
Next f
' Copy the 2D array in one go
Worksheets("Results").Offset(1,0).Resize(UBound(filenames),maxCol).Value = aValues
As you're getting you results from XML, have you looked into using XML Maps to display the information - might not be suitable for your situation, but worth a try.
This link below shows some stuff about using XML maps in Excel.
The syntax of the line to load an XML string into a define map is similar to this:
ActiveWorkbook.XmlMaps("MyMap").ImportXml(MyXMLDoc,True)
You might want to look at my code in "Using Variant Arrays in Excel VBA for Large Scale Data Manipulation", http://www.experts-exchange.com/A_2684.html (further detail provided in the hyperlink)
Note that as I don't have your data above to work with the article provides a sample solution (in this case efficiently deleting leading zeroes) to meet you filling a range from a 2d array requirement.
Key points to note
The code handles non contigious ranges by use of Areas
When using variant arrays alwasy test that the range setting the array size is bigger than 1 cell - if not you cant use a variant
The code readas from a range, runs a manipulation, then dumps back to the same range
Using Value2 is slightly moe efficient than Value
Here is the code:
'Press Alt + F11 to open the Visual Basic Editor (VBE)
'From the Menu, choose Insert-Module.
'Paste the code into the right-hand code window.
'Press Alt + F11 to close the VBE
'In Xl2003 Goto Tools … Macro … Macros and double-click KillLeadingZeros
Sub KillLeadingZeros()
Dim rng1 As Range
Dim rngArea As Range
Dim lngRow As Long
Dim lngCol As Long
Dim lngCalc As Long
Dim objReg As Object
Dim X()
On Error Resume Next
Set rng1 = Application.InputBox("Select range for the replacement of leading zeros", "User select", Selection.Address, , , , , 8)
If rng1 Is Nothing Then Exit Sub
On Error GoTo 0
'See Patrick Matthews excellent article on using Regular Expressions with VBA
Set objReg = CreateObject("vbscript.regexp")
objReg.Pattern = "^0+"
'Speed up the code by turning off screenupdating and setting calculation to manual
'Disable any code events that may occur when writing to cells
With Application
lngCalc = .Calculation
.ScreenUpdating = False
.Calculation = xlCalculationManual
.EnableEvents = False
End With
'Test each area in the user selected range
'Non contiguous range areas are common when using SpecialCells to define specific cell types to work on
For Each rngArea In rng1.Areas
'The most common outcome is used for the True outcome to optimise code speed
If rngArea.Cells.Count > 1 Then
'If there is more than once cell then set the variant array to the dimensions of the range area
'Using Value2 provides a useful speed improvement over Value. On my testing it was 2% on blank cells, up to 10% on non-blanks
X = rngArea.Value2
For lngRow = 1 To rngArea.Rows.Count
For lngCol = 1 To rngArea.Columns.Count
'replace the leading zeroes
X(lngRow, lngCol) = objReg.Replace(X(lngRow, lngCol), vbNullString)
Next lngCol
Next lngRow
'Dump the updated array sans leading zeroes back over the initial range
rngArea.Value2 = X
Else
'caters for a single cell range area. No variant array required
rngArea.Value = objReg.Replace(rngArea.Value, vbNullString)
End If
Next rngArea
'cleanup the Application settings
With Application
.ScreenUpdating = True
.Calculation = lngCalc
.EnableEvents = True
End With
Set objReg = Nothing
End Sub

Resources