Let me explain.
I have a state similarity implementation in excel. One module, called a state, contains many public dictionaries at the top level.
I fill each dictionary with a huge number of object classes - mostly just data from sheets. Simple practice. The problem starts after the macro has worked normally, it leaves all these dictionaries in memory, in the task manager the Excel occupies from 2GB - this is also normal.
State Module - Standalone module
public Dict1 as Dictionary
public Dict2 as Dictionary
'Persists sheets data
public Dict3 as Dictionary
public Dict4 as Dictionary
'For persists renaming some objs
public Dict5 as Dictionary
public Dict6 as Dictionary
public Dict7 as Dictionary
Class Module - Data Interface Example - clsData
Public Name as string
Public Prop1 as string
Public Prop2 as Integer
Public Prop3 as Date
Public Prop4 as string
Public Value as double
The code below is just an example for stackoverflow. In my modules I am grabbing data from a sheet with the Range.CurrentRegion and iterate lbound to ubound.
Another data grabber
Function DataGrabberFromSheet(ByRef CurrentDict as Dict) as String
Dim i as long
Dim data as variant
Dim DataObj as clsData
set CurrentDict = New Dictionary <--- That's recreate dict obj and start clear old data for some how, but i do not need that anymore.
data = Sheet1.Range("A1:Q5000").Values
for i = 1 to 5000
set DataObj = new clsData
DataObj.Name = data(i, 1)
DataObj.Prop1 = data(i,2)
...
call CurrentDict.add(DataObj.Name, DataObj)
next
'For logging, it just an example:
If Success then
DataGrabberFromSheet = "Success"
else
DataGrabberFromSheet = "Bad"
endif
end Function
And so I decide in the debbuger to forcibly stop the program by pressing the stop button. At this point, Excel freezes for a long time, because I already have 50 or more of these dictionaries in the RAM, some of them have 200k elements each. Sometimes it takes about 300 seconds, and sometimes it instantly clears it somehow.
Old data erase by item, I think, but I do not need that data anymore. Can excel just skip that part of memory and just rewrite a new data.
How can I instantly redefine these dictionaries when the macro is rerun and not have to wait for that long cleanup? data is no longer needed at this point
Classic internet methods optimisations not solved that problem:
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False
...
I think this is a known issue with VBA - clearing a large number of objects takes a long time: regardless of whether they're stored in a dictionary, collection, or an array.
For example:
Dim arr() As clsData
Sub Tester()
Const NUM As Long = 120000
Dim i As Long, obj As clsData, t
t = Timer
Debug.Print "---------"
ReDim arr(1 To NUM)
Debug.Print "Reset", Timer - t
t = Timer
For i = 1 To NUM
Set obj = New clsData
obj.Prop1 = "Item" & i
obj.Prop2 = "Item" & i
obj.Prop3 = "Item" & i
obj.Prop4 = "Item" & i
Set arr(i) = obj
Next i
Debug.Print "Fill", Timer - t
End Sub
Where clsData is just:
Public Prop1
Public Prop2
Public Prop3
Public Prop4
Output from first run (after clicking "Stop" in VBE):
Reset 0
Fill 0.34375
Second run:
Reset 8.601563 <<<<<
Fill 0.3554688
Related
I've been wrestling with this problem for a while. My problem is that I have a bunch of JSON data and I want to represent it as objects.
Arrays are problematic.
I create a class module such as FancyCat with a public Name as String for its name.
Then I can set this with
Dim MyFancyCat as FancyCat
Set MyFancyCat = new FancyCat
FancyCat.Name = JSONData("Name")
I've typed that from memory but I think it's correct. Anyhoo, it works fine.
The problem is that a fancy cat has several pairs of socks. The number of socks is variable.
In vba you cannot for some reason have a public array. So this code is illegal:
public Socks() as FancySock 'Illegal
Looking on SO I found two solutions, one, to make it private and use a property to access it, and the other, to declare it as Variant and then stick an array into it later.
My approach to populating this array, is to examine the JSON array to get the Count, and then to ReDim the array to match and then populate it.
The problem is my ReDim statement refuses to work.
It seems I cannot redim a property, I get an error. And I also get an error trying to redim the public variant field. My ReDim works OK if I declare a local array and redim it, so potentially I could do that and then assign it to the property... but it just seems bizarre that I can't redim it directly.
Any idea why it's not working?
With the Variant approach above my code is:
ReDim MyFancyCat.Socks(socksLength) As FancySocks
And in the FancyCat class module:
public Socks As Variant
I get Method or Data Member Not Found.
The error for the other approach was different but I rejigged all my code to try the second approach so I am not sure what it was.
Edit: I'm gonna explain what I am trying to do a bit more clearly. I have some JSON data coming in, and I want to store it as an object hierarchy.
In C# I would do this (pseudo code without linq shortcuts):
var myData = ReadJsonData(); // Produces a kind of dictionary
var myFancyCat = new FancyCat();
myFancyCat.Name = myData["Name"];
myFancyCat.Age = myData["Age"];
myFancyCat.Socks = new List<FancySock>();
foreach (var sock in myData["Socks"])
{
myFancyCat.Socks.Add(sock);
}
In excel I want to do the same thing.
So I make a class module for FancyCat and FancySock and give FancyCat public members for Name, Age etc but then I also want an array of socks that my cat owns. I wanted to do this with strongly typed references, e.g. my c# code above I can do:
myFancyCat.Socks[0].Colour // Intellisense works, shows colour as a property
However it seems in excel you can't have publicly declared arrays. So you can get around this according to the comments by declaring it as variant and then sticking an array in anyway, but you would lose the intellisense. Or you can use a get/let property which kinda works but is more fiddly as it seems you can't actually expose an array using a get/let you have to have it take an index and expose elements individually.
So at this point I am thinking forget the strongly typed it's not happening, perhaps use a collection?
The FancySock class may have further nested arrays within it. I've read that there's no ByRef for arrays (at least, not completely - I think you can get an array ByRef but not set one?). I am not sure if that would create problems with trying to set it.
But ultimately, I just want to end up with my JSON data represented easily in an OO way, so that in my excel ultimately I can just do
myFancyCat.Name or myFancyCat.Socks.Count or myFancyCat.Socks(1).Colour etc
It seems much harder than it looks to simply deserialise JSON into 'objects' in vba.
Please, try the next way:
Insert a class module, name it FancyCat and copy the next code:
Option Explicit
Private arrL As Object
Public myName As String, myAge As Long
Public Sub Class_Initialize()
Set arrL = CreateObject("System.Collections.ArrayList")
End Sub
Public Property Let Name(strName As String)
myName = strName
End Property
Public Property Let Age(lngAge As String)
myAge = lngAge
End Property
Public Property Let SocksAdd(sMember)
arrL.Add sMember
End Property
Public Property Get Socks() As Variant
Socks = arrL.toarray()
End Property
Use it in the next testing Sub:
Sub testClassDictListArray()
Dim myFancyCat As New FancyCat, myData As Object
Dim arrSocks, sock
Set myData = CreateObject("Scripting.Dictionary") 'this should be the dictionary returned by ParseJSON
myData.Add "Name", "John Doe": myData.Add "Age", 35
myData.Add "Socks", Array("Blue", "White", "Red", "Green", "Yellow")
myFancyCat.Name = myData("Name")
myFancyCat.Age = myData("Age")
For Each sock In myData("Socks")
myFancyCat.SocksAdd = sock
Next sock
arrSocks = myFancyCat.Socks
Debug.Print Join(arrSocks, "|")
End Sub
I am not sure I perfectly understand the scenario you try putting in discussion...
If you want to benefit of instellisense suggestions, I will tell you what references to be added. Even, I will send two pieces of code to automatically add the necessary references (I mean, Scripting.Dictionary and ArrayList`).
Please, test it and send some feedback.
In your class:
Private m_Name As String
Private m_Socks() As String
Public Property Let Name(Name As String)
m_Name = Name
End Property
Public Property Get Name() As String
Name = m_Name
End Property
Public Sub SetSize(Quantity As Long)
ReDim m_Socks(1 To Quantity)
End Sub
Public Property Let Socks(Index As Long, Sock As String)
m_Socks(Index) = Sock
End Property
Public Property Get Socks(Index As Long) As String
Socks = m_Socks(Index)
End Property
In a regular module:
Sub UseFancyCat()
Dim MyFancyCat As FancyCat
Set MyFancyCat = New FancyCat
MyFancyCat.Name = "Fancy Name"
MyFancyCat.SetSize 2
MyFancyCat.Socks(1) = "Sock1"
MyFancyCat.Socks(2) = "Sock2"
Debug.Print MyFancyCat.Name
Debug.Print MyFancyCat.Socks(1)
Debug.Print MyFancyCat.Socks(2)
End Sub
I have the below code to populate a listbox, therefore I want to remove duplicates from my combobox. I Don't know how to do it:
Private Sub CommandButton1_Click()
Dim ws_suivi As Worksheet
Set ws_suivi = ActiveWorkbook.Worksheets("suivi")
Fin_Liste_suivi = ws_suivi.Range("A65530").End(xlUp).Row
For i = 2 To Fin_Liste_suivi
UserForm_SDE.ComboBox_Type_Rapp.AddItem ws_suivi.Range("AD" & i)
Next
UserForm_SDE.Show
End Sub
It is often worth searching to see if a Library for VBA exists that will save you reinventing the wheel.
It is a particular annoyance of VBA that whilst we have such useful structures as Collections and Scripting.Dictionaries there is no easy way to get information into such objects or to do much processing of the data once those objects are populated.
I had a project which had a lot of processing of arrays/scripting.dictionariews and to make my life a little easier I created a VBA library in C# called Kvp (for Key Value Pairs) which is a bit like a Scripting.Dictionary on steriods.
You can download the library, source code, documentation for the Kvp object from here
Once you have added a reference to the Kvp library you can declare a Kvp object in the standard way.
Dim myKvp as Kvp
Set myKvp=New Kvp
You can then add a 1D range from an excel spreadsheet in a single statement
myKvp.AddByIndexFromArray <excel range>.Value
which gives a Kvp of long integers vs cell values
The OP wishes a list of unique values. To do this with a Kvp we can use the Mirror method to create a Kvp of the unique values.
Dim myMirroredKvp as Kvp
set myMirroredKvp=myKvp.Mirror
The Mirror method returns a Two item Kvp where item 0 is a Kvp of unique items vs the first Key at which the item was found and item 1 is a Kvp of original Keys vs value where the values are a duplicate.
You can then get an array of the keys using the GetKeys method
Dim myUniqueValues as Variant
myUniqueValues = myMirroredKvp.GetItem(0).GetKeys
Or should you want the items sorted in reverse order
myUniqueValues - myMirroredKvp.GetItem(0).GetKeysDescending
The above can be shortened to
myUniqueValues = myKvp.Mirror.GetItem(0).GetKeysDescending
I've found the Kvp library quite useful. I hope you do to!!
While you could load the list to a Dictionary, you might find it simpler to try using WorksheetFunction.CountIf to check if the item is further up your list (and has, thus, already been included):
If (i=2) OR (WorksheetFunction.CountIf(ws_suivi.Range(ws_suivi.Cells(2,30),ws_suivi.Cells(i-1,30)), ws_suivi.cells(i,30).Value)<1) Then
UserForm_SDE.ComboBox_Type_Rapp.AddItem ws_suivi.Range("AD" & i)
End If
As a side-note: Since Excel 2007 increased the Row Limit from 65536 (216) to 1048576 (220), you may want to change Fin_Liste_suivi = ws_suivi.Range("A65530").End(xlUp).Row to Fin_Liste_suivi = ws_suivi.Cells(ws_suivi.Rows.Count, 1).End(xlUp).Row
I found :
Dim Valeur As String
Dim i As Integer
Dim j As Integer
'For each element in the list
For i = 0 To lst_ref.ListCount - 1
Valeur = Combobox.List(i)
For j = i + 1 To Combobox.ListCount - 1
'If the element exist, delete it
If Valeur = Combobox.List(j) Then
Call Combobox.RemoveItem(j)
End If
Next j
Next i
It take the beggining of the combobox and check if the value is red again in to the end of the combobox.
I am coding a Manager in Excel-VBA with several buttons.
One of them is to generate a tab using another Excel file (let me call it T) as input.
Some properties of T:
~90MB size
~350K lines
Contains sales data of the last 14 months (unordered).
Relevant columns:
year/month
total-money
seller-name
family-product
client-name
There is not id columns (like: cod-client, cod-vendor, etc.)
Main relation:
Sellers sells many Products to many Clients
I am generating a new Excel tab with data from T of the last year/month grouped by Seller.
Important notes:
T is the only available input/source.
If two or more Sellers sells the same Product to the same Client, the total-money should be counted to all of those Sellers.
This is enough, now you know what I have already coded.
My code works, but, it takes about 4 minutes of runtime.
I have already coded some other buttons using smaller sources (not greater than 2MB) which runs in 5 seconds.
Considering T size, 4 minutes runtime could be acceptable.
But I'm not proud of it, at least not yet.
My code is mainly based on Scripting.Dictionary to map data from T, and then I use for each key in obj ... next key to set the grouped data to the new created tab.
I'm not sure, but here are my thoughts:
If N is the total keys in a Scripting.Dictionary, and I need to check for obj.Exists(str) before aggregating total-money. It will run N string compares to return false.
Similarly it will run maximun N string compares when I do Set seller = obj(seller_name).
I want to be wrong with my thoughts. But if I'm not wrong, my next step (and last hope) to reduce the runtime of this function is to code my own class object with Tries.
I will only start coding tomorrow, what I want is just some confirmation if I am in the right way, or some advices if I am in the wrong way of doing it.
Do you have any suggestions? Thanks in advance.
Memory Limit Exceeded
In short:
The main problem was because I used a dynamic programming approach of storing information (preprocessing) to make the execution time faster.
My code now runs in ~ 13 seconds.
There are things we learn the hard way. But I'm glad I found the answer.
Using the Task Manager I was able to see my code reaching 100% memory usage.
The DP approach I mentioned above using Scripting.Dictionary reached 100% really faster.
The DP approach I mentioned above using my own cls_trie implementation also reached 100%, but later than the first.
This explains the ~4-5 min compared to ~2-3 min total runtime of above attempts.
In the Task Manager I could also see that the CPU usage never hited 2%.
Solution was simple, I had to balance CPU and Memory usages.
I changed some DP approaches to simple for-loops with if-conditions.
The CPU usage now hits ~15%.
The Memory usage now hits ~65%.
I know this is relative to the CPU and Memory capacity of each machine. But in the client machine it is also running in no more than 15 seconds now.
I created one GitHub repository with my cls_trie implementation and added one excel file with an example usage.
I'm new to the excel-vba world (4 months working with it right now). There might probably have some ways to improve my cls_trie implementation, I'm openned to suggestions:
Option Explicit
Public Keys As Collection
Public Children As Variant
Public IsLeaf As Boolean
Public tObject As Variant
Public tValue As Variant
Public Sub Init()
Set Keys = New Collection
ReDim Children(0 To 255) As cls_trie
IsLeaf = False
Set tObject = Nothing
tValue = 0
End Sub
Public Function GetNodeAt(index As Integer) As cls_trie
Set GetNodeAt = Children(index)
End Function
Public Sub CreateNodeAt(index As Integer)
Set Children(index) = New cls_trie
Children(index).Init
End Sub
'''
'Following function will retrieve node for a given key,
'creating a entire new branch if necessary
'''
Public Function GetNode(ByRef key As Variant) As cls_trie
Dim node As cls_trie
Dim b() As Byte
Dim i As Integer
Dim pos As Integer
b = CStr(key)
Set node = Me
For i = 0 To UBound(b) Step 2
pos = b(i) Mod 256
If (node.GetNodeAt(pos) Is Nothing) Then
node.CreateNodeAt pos
End If
Set node = node.GetNodeAt(pos)
Next
If (node.IsLeaf) Then
'already existed
Else
node.IsLeaf = True
Keys.Add key
End If
Set GetNode = node
End Function
'''
'Following function will get the value for a given key
'Creating the key if necessary
'''
Public Function GetValue(ByRef key As Variant) As Variant
Dim node As cls_trie
Set node = GetNode(key)
GetValue = node.tValue
End Function
'''
'Following sub will set a value to a given key
'Creating the key if necessary
'''
Public Sub SetValue(ByRef key As Variant, value As Variant)
Dim node As cls_trie
Set node = GetNode(key)
node.tValue = value
End Sub
'''
'Following sub will sum up a value for a given key
'Creating the key if necessary
'''
Public Sub SumValue(ByRef key As Variant, value As Variant)
Dim node As cls_trie
Set node = GetNode(key)
node.tValue = node.tValue + value
End Sub
'''
'Following function will validate if given key exists
'''
Public Function Exists(ByRef key As Variant) As Boolean
Dim node As cls_trie
Dim b() As Byte
Dim i As Integer
b = CStr(key)
Set node = Me
For i = 0 To UBound(b) Step 2
Set node = node.GetNodeAt(b(i) Mod 256)
If (node Is Nothing) Then
Exists = False
Exit Function
End If
Next
Exists = node.IsLeaf
End Function
'''
'Following function will get another Trie from given key
'Creating both key and trie if necessary
'''
Public Function GetTrie(ByRef key As Variant) As cls_trie
Dim node As cls_trie
Set node = GetNode(key)
If (node.tObject Is Nothing) Then
Set node.tObject = New cls_trie
node.tObject.Init
End If
Set GetTrie = node.tObject
End Function
You can see in the above code:
I hadn't implemented any delete method because I didn't need it till now. But it would be easy to implement.
I limited myself to 256 children because in this project the text I'm working on is basically lowercase and uppercase [a-z] letters and numbers, and the probability that two text get mapped to the same branch node tends zero.
as a great coder said, everyone likes his own code even if other's code is too beautiful to be disliked [1]
My conclusion
I will probably never more use Scripting.Dictionary, even if it is proven that somehow it could be better than my cls_trie implementation.
Thank you all for the help.
I'm convinced that you've already found the right solution because there wasn't any update for last two years.
Anyhow, I want to mention (maybe it will help someone else) that your bottleneck isn't the Dictionary or Binary Tree. Even with millions of rows the processing in memory is blazingly fast if you have sufficient amount of RAM.
The botlleneck is usually the reading of data from worksheet and writing it back to the worksheet. Here the arrays come very userfull.
Just read the data from worksheet into the Variant Array.
You don't have to work with that array right away. If it is more comfortable for you to work with dictionary, just transfer all the data from array into dictionary and work with it. Since this process is entirely made in memory, don't worry about the performance penalisation.
When you are finished with data processing in dictionary, put all data from dictionary back to the array and write that array into a new worksheet at one shot.
Worksheets("New Sheet").Range("A1").Value = MyArray
I'm pretty sure it will take only few seconds
I am trying to create a class (named ClassSection) that contains a collection (named DefectCollection). It needs a function to add items to that collection but I'm having trouble making it work. I get Error 91 "Object variable or with block variable not set."
I have looked at the other answers on here, which is what got me this far, but I don't understand what I'm missing.
Here is the class module code:
Public DefectCollection As Collection
Private Sub Class_Initialise()
Set DefectCollection = New Collection
End Sub
Public Function AddDefect(ByRef defect As CDefect)
DefectCollection.Add defect [<---- error 91]
End Function
And here is the code that calls the function: ('defect' is another class, which works fine - I want each 'ClassSection' to be able to hold an unlimited number of 'defects')
Dim SC As Collection
Dim section As ClassSection
Set SC = New Collection
Dim SurveyLength As Double
For Each defect In DC
SurveyLength = WorksheetFunction.Max(SurveyLength, defect.Pos, defect.EndPos)
Next defect
SurveyLength = Int(SurveyLength)
For i = 0 To numSurveys
For j = 0 To SurveyLength
Set section = New ClassSection
section.ID = CStr(j & "-" & dates(i))
SC.Add Item:=section, Key:=section.ID
Next j
Next i
Dim meterage As Double
For Each defect In DC
meterage = Int(defect.Pos)
Set section = SC.Item(meterage & "-" & defect.SurveyDate)
section.AddDefect defect
Next defect
Thanks!
You get the error because the DefectCollection is Nothing. This is due to the fact that you mispelled the initalization method:
Private Sub Class_Initialise() '<-- it's with "Z", not "S"
Hence, the initialization of the class is never called, the object remain Nothing by default and the method fails when trying to add an object to Nothing
I have code like this:
Dim MyACL As Variant
Dim Person As List
Redim MyACL(0)
Person("Detail1") = "Something1"
.
.
.
Person(Detailx") = "Somethingx"
ForAll name in names
ReDim Preserve MyAcl(Ubound(MyACL)+1)
Person("Name") = name
MyACL = ArrayAppend(MyACL,Person)
End ForAll
It throws error "Type Mismatch". Do you know, how to create an array of lists? Thank you.
This is a typical example of when you want to use a class instead, and create an array of that class. That class, in turn can contain a list (as well as other things). Can be very powerful!
Updated:
The benefit of using a class is that you can add business logic in the class, and it is very easy to extend it with more functionality later. Below is an example, based on the question above, but with additional functionality.
Class PersonObject
Public PersonAttribute List As String
Public NABdoc As NotesDocument
Public PersonName As String
Public Sub New(personname As String)
Dim nab as New NotesDatabase("Server/Domain","names.nsf")
Dim view as NotesView
'*** Get person document from Domino directory
Set view = nab.GetView("PeopleByFirstName")
Set me.NABdoc = view.GetDocumentByKey(personname)
'*** Set person name in object
me.PersonName = personname
'*** Set some values from person doc
me.PersonAttribute("Email") = GetValue("InternetAddress")
me.PersonAttribute("Phone") = GetValue("OfficePhone")
End Sub
Public Function GetValue(fieldname as String) as String
GetValue = me.NABdoc.GetItemValue(fieldname)(0)
End Function
Public Sub AddAttribute(attributename as String, value as string)
me.PersonAttribute(attributename) = value
End Sub
End Class
You can now very easily build you a list, using this class (and assuming that names is a list of unique names):
Dim person List As PersonObject
Dim personname As String
ForAll n in names
'*** Create person object based on name
person(n) = New PersonObject(n)
'*** Store additional info about this person
person.AddAttribute("Age","35")
End ForAll
Hopefully this gives you an idea of what you can do with classes.
You can also take a look at the following two blog entries about the basics of object oriented Lotusscript:
http://blog.texasswede.com/object-oriented-lotusscript-for-beginners-part-1/
http://blog.texasswede.com/object-oriented-lotusscript-for-beginners-part-2/
If you explicitely declare a variable as Array (as you do in your Redim Statement), then it can not be "reassigned" using arrayappend.
And it is NOT necessary to do it that way. just replace the line MyACL = ArrayAppend(MyACL,Person) with MyACL(Ubound(MyACL)) = Person
Take care: With that example code you will never fill MyACL(0) as the first Element filled is MyACL(1)
To begin filling the array with element 0 the code needs to be changed like this:
Dim max As Integer
max = 0
ForAll thisName In names
ReDim Preserve MyAcl(max)
Person("Name") = thisName
MyACL(max) = Person
max = max + 1
End ForAll
BUT: I don't know, if this is a good idea, as you can not access the "Detail1- Property" of Person directly.
Something like
detail = MyACL(1)("Detail1")
is not possible. You always have to have a temporary variable like this:
person = MyACL(1)
detail = person("Detail1")