I am coding a Manager in Excel-VBA with several buttons.
One of them is to generate a tab using another Excel file (let me call it T) as input.
Some properties of T:
~90MB size
~350K lines
Contains sales data of the last 14 months (unordered).
Relevant columns:
year/month
total-money
seller-name
family-product
client-name
There is not id columns (like: cod-client, cod-vendor, etc.)
Main relation:
Sellers sells many Products to many Clients
I am generating a new Excel tab with data from T of the last year/month grouped by Seller.
Important notes:
T is the only available input/source.
If two or more Sellers sells the same Product to the same Client, the total-money should be counted to all of those Sellers.
This is enough, now you know what I have already coded.
My code works, but, it takes about 4 minutes of runtime.
I have already coded some other buttons using smaller sources (not greater than 2MB) which runs in 5 seconds.
Considering T size, 4 minutes runtime could be acceptable.
But I'm not proud of it, at least not yet.
My code is mainly based on Scripting.Dictionary to map data from T, and then I use for each key in obj ... next key to set the grouped data to the new created tab.
I'm not sure, but here are my thoughts:
If N is the total keys in a Scripting.Dictionary, and I need to check for obj.Exists(str) before aggregating total-money. It will run N string compares to return false.
Similarly it will run maximun N string compares when I do Set seller = obj(seller_name).
I want to be wrong with my thoughts. But if I'm not wrong, my next step (and last hope) to reduce the runtime of this function is to code my own class object with Tries.
I will only start coding tomorrow, what I want is just some confirmation if I am in the right way, or some advices if I am in the wrong way of doing it.
Do you have any suggestions? Thanks in advance.
Memory Limit Exceeded
In short:
The main problem was because I used a dynamic programming approach of storing information (preprocessing) to make the execution time faster.
My code now runs in ~ 13 seconds.
There are things we learn the hard way. But I'm glad I found the answer.
Using the Task Manager I was able to see my code reaching 100% memory usage.
The DP approach I mentioned above using Scripting.Dictionary reached 100% really faster.
The DP approach I mentioned above using my own cls_trie implementation also reached 100%, but later than the first.
This explains the ~4-5 min compared to ~2-3 min total runtime of above attempts.
In the Task Manager I could also see that the CPU usage never hited 2%.
Solution was simple, I had to balance CPU and Memory usages.
I changed some DP approaches to simple for-loops with if-conditions.
The CPU usage now hits ~15%.
The Memory usage now hits ~65%.
I know this is relative to the CPU and Memory capacity of each machine. But in the client machine it is also running in no more than 15 seconds now.
I created one GitHub repository with my cls_trie implementation and added one excel file with an example usage.
I'm new to the excel-vba world (4 months working with it right now). There might probably have some ways to improve my cls_trie implementation, I'm openned to suggestions:
Option Explicit
Public Keys As Collection
Public Children As Variant
Public IsLeaf As Boolean
Public tObject As Variant
Public tValue As Variant
Public Sub Init()
Set Keys = New Collection
ReDim Children(0 To 255) As cls_trie
IsLeaf = False
Set tObject = Nothing
tValue = 0
End Sub
Public Function GetNodeAt(index As Integer) As cls_trie
Set GetNodeAt = Children(index)
End Function
Public Sub CreateNodeAt(index As Integer)
Set Children(index) = New cls_trie
Children(index).Init
End Sub
'''
'Following function will retrieve node for a given key,
'creating a entire new branch if necessary
'''
Public Function GetNode(ByRef key As Variant) As cls_trie
Dim node As cls_trie
Dim b() As Byte
Dim i As Integer
Dim pos As Integer
b = CStr(key)
Set node = Me
For i = 0 To UBound(b) Step 2
pos = b(i) Mod 256
If (node.GetNodeAt(pos) Is Nothing) Then
node.CreateNodeAt pos
End If
Set node = node.GetNodeAt(pos)
Next
If (node.IsLeaf) Then
'already existed
Else
node.IsLeaf = True
Keys.Add key
End If
Set GetNode = node
End Function
'''
'Following function will get the value for a given key
'Creating the key if necessary
'''
Public Function GetValue(ByRef key As Variant) As Variant
Dim node As cls_trie
Set node = GetNode(key)
GetValue = node.tValue
End Function
'''
'Following sub will set a value to a given key
'Creating the key if necessary
'''
Public Sub SetValue(ByRef key As Variant, value As Variant)
Dim node As cls_trie
Set node = GetNode(key)
node.tValue = value
End Sub
'''
'Following sub will sum up a value for a given key
'Creating the key if necessary
'''
Public Sub SumValue(ByRef key As Variant, value As Variant)
Dim node As cls_trie
Set node = GetNode(key)
node.tValue = node.tValue + value
End Sub
'''
'Following function will validate if given key exists
'''
Public Function Exists(ByRef key As Variant) As Boolean
Dim node As cls_trie
Dim b() As Byte
Dim i As Integer
b = CStr(key)
Set node = Me
For i = 0 To UBound(b) Step 2
Set node = node.GetNodeAt(b(i) Mod 256)
If (node Is Nothing) Then
Exists = False
Exit Function
End If
Next
Exists = node.IsLeaf
End Function
'''
'Following function will get another Trie from given key
'Creating both key and trie if necessary
'''
Public Function GetTrie(ByRef key As Variant) As cls_trie
Dim node As cls_trie
Set node = GetNode(key)
If (node.tObject Is Nothing) Then
Set node.tObject = New cls_trie
node.tObject.Init
End If
Set GetTrie = node.tObject
End Function
You can see in the above code:
I hadn't implemented any delete method because I didn't need it till now. But it would be easy to implement.
I limited myself to 256 children because in this project the text I'm working on is basically lowercase and uppercase [a-z] letters and numbers, and the probability that two text get mapped to the same branch node tends zero.
as a great coder said, everyone likes his own code even if other's code is too beautiful to be disliked [1]
My conclusion
I will probably never more use Scripting.Dictionary, even if it is proven that somehow it could be better than my cls_trie implementation.
Thank you all for the help.
I'm convinced that you've already found the right solution because there wasn't any update for last two years.
Anyhow, I want to mention (maybe it will help someone else) that your bottleneck isn't the Dictionary or Binary Tree. Even with millions of rows the processing in memory is blazingly fast if you have sufficient amount of RAM.
The botlleneck is usually the reading of data from worksheet and writing it back to the worksheet. Here the arrays come very userfull.
Just read the data from worksheet into the Variant Array.
You don't have to work with that array right away. If it is more comfortable for you to work with dictionary, just transfer all the data from array into dictionary and work with it. Since this process is entirely made in memory, don't worry about the performance penalisation.
When you are finished with data processing in dictionary, put all data from dictionary back to the array and write that array into a new worksheet at one shot.
Worksheets("New Sheet").Range("A1").Value = MyArray
I'm pretty sure it will take only few seconds
Related
Let me explain.
I have a state similarity implementation in excel. One module, called a state, contains many public dictionaries at the top level.
I fill each dictionary with a huge number of object classes - mostly just data from sheets. Simple practice. The problem starts after the macro has worked normally, it leaves all these dictionaries in memory, in the task manager the Excel occupies from 2GB - this is also normal.
State Module - Standalone module
public Dict1 as Dictionary
public Dict2 as Dictionary
'Persists sheets data
public Dict3 as Dictionary
public Dict4 as Dictionary
'For persists renaming some objs
public Dict5 as Dictionary
public Dict6 as Dictionary
public Dict7 as Dictionary
Class Module - Data Interface Example - clsData
Public Name as string
Public Prop1 as string
Public Prop2 as Integer
Public Prop3 as Date
Public Prop4 as string
Public Value as double
The code below is just an example for stackoverflow. In my modules I am grabbing data from a sheet with the Range.CurrentRegion and iterate lbound to ubound.
Another data grabber
Function DataGrabberFromSheet(ByRef CurrentDict as Dict) as String
Dim i as long
Dim data as variant
Dim DataObj as clsData
set CurrentDict = New Dictionary <--- That's recreate dict obj and start clear old data for some how, but i do not need that anymore.
data = Sheet1.Range("A1:Q5000").Values
for i = 1 to 5000
set DataObj = new clsData
DataObj.Name = data(i, 1)
DataObj.Prop1 = data(i,2)
...
call CurrentDict.add(DataObj.Name, DataObj)
next
'For logging, it just an example:
If Success then
DataGrabberFromSheet = "Success"
else
DataGrabberFromSheet = "Bad"
endif
end Function
And so I decide in the debbuger to forcibly stop the program by pressing the stop button. At this point, Excel freezes for a long time, because I already have 50 or more of these dictionaries in the RAM, some of them have 200k elements each. Sometimes it takes about 300 seconds, and sometimes it instantly clears it somehow.
Old data erase by item, I think, but I do not need that data anymore. Can excel just skip that part of memory and just rewrite a new data.
How can I instantly redefine these dictionaries when the macro is rerun and not have to wait for that long cleanup? data is no longer needed at this point
Classic internet methods optimisations not solved that problem:
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False
...
I think this is a known issue with VBA - clearing a large number of objects takes a long time: regardless of whether they're stored in a dictionary, collection, or an array.
For example:
Dim arr() As clsData
Sub Tester()
Const NUM As Long = 120000
Dim i As Long, obj As clsData, t
t = Timer
Debug.Print "---------"
ReDim arr(1 To NUM)
Debug.Print "Reset", Timer - t
t = Timer
For i = 1 To NUM
Set obj = New clsData
obj.Prop1 = "Item" & i
obj.Prop2 = "Item" & i
obj.Prop3 = "Item" & i
obj.Prop4 = "Item" & i
Set arr(i) = obj
Next i
Debug.Print "Fill", Timer - t
End Sub
Where clsData is just:
Public Prop1
Public Prop2
Public Prop3
Public Prop4
Output from first run (after clicking "Stop" in VBE):
Reset 0
Fill 0.34375
Second run:
Reset 8.601563 <<<<<
Fill 0.3554688
I have the below code to populate a listbox, therefore I want to remove duplicates from my combobox. I Don't know how to do it:
Private Sub CommandButton1_Click()
Dim ws_suivi As Worksheet
Set ws_suivi = ActiveWorkbook.Worksheets("suivi")
Fin_Liste_suivi = ws_suivi.Range("A65530").End(xlUp).Row
For i = 2 To Fin_Liste_suivi
UserForm_SDE.ComboBox_Type_Rapp.AddItem ws_suivi.Range("AD" & i)
Next
UserForm_SDE.Show
End Sub
It is often worth searching to see if a Library for VBA exists that will save you reinventing the wheel.
It is a particular annoyance of VBA that whilst we have such useful structures as Collections and Scripting.Dictionaries there is no easy way to get information into such objects or to do much processing of the data once those objects are populated.
I had a project which had a lot of processing of arrays/scripting.dictionariews and to make my life a little easier I created a VBA library in C# called Kvp (for Key Value Pairs) which is a bit like a Scripting.Dictionary on steriods.
You can download the library, source code, documentation for the Kvp object from here
Once you have added a reference to the Kvp library you can declare a Kvp object in the standard way.
Dim myKvp as Kvp
Set myKvp=New Kvp
You can then add a 1D range from an excel spreadsheet in a single statement
myKvp.AddByIndexFromArray <excel range>.Value
which gives a Kvp of long integers vs cell values
The OP wishes a list of unique values. To do this with a Kvp we can use the Mirror method to create a Kvp of the unique values.
Dim myMirroredKvp as Kvp
set myMirroredKvp=myKvp.Mirror
The Mirror method returns a Two item Kvp where item 0 is a Kvp of unique items vs the first Key at which the item was found and item 1 is a Kvp of original Keys vs value where the values are a duplicate.
You can then get an array of the keys using the GetKeys method
Dim myUniqueValues as Variant
myUniqueValues = myMirroredKvp.GetItem(0).GetKeys
Or should you want the items sorted in reverse order
myUniqueValues - myMirroredKvp.GetItem(0).GetKeysDescending
The above can be shortened to
myUniqueValues = myKvp.Mirror.GetItem(0).GetKeysDescending
I've found the Kvp library quite useful. I hope you do to!!
While you could load the list to a Dictionary, you might find it simpler to try using WorksheetFunction.CountIf to check if the item is further up your list (and has, thus, already been included):
If (i=2) OR (WorksheetFunction.CountIf(ws_suivi.Range(ws_suivi.Cells(2,30),ws_suivi.Cells(i-1,30)), ws_suivi.cells(i,30).Value)<1) Then
UserForm_SDE.ComboBox_Type_Rapp.AddItem ws_suivi.Range("AD" & i)
End If
As a side-note: Since Excel 2007 increased the Row Limit from 65536 (216) to 1048576 (220), you may want to change Fin_Liste_suivi = ws_suivi.Range("A65530").End(xlUp).Row to Fin_Liste_suivi = ws_suivi.Cells(ws_suivi.Rows.Count, 1).End(xlUp).Row
I found :
Dim Valeur As String
Dim i As Integer
Dim j As Integer
'For each element in the list
For i = 0 To lst_ref.ListCount - 1
Valeur = Combobox.List(i)
For j = i + 1 To Combobox.ListCount - 1
'If the element exist, delete it
If Valeur = Combobox.List(j) Then
Call Combobox.RemoveItem(j)
End If
Next j
Next i
It take the beggining of the combobox and check if the value is red again in to the end of the combobox.
For a class project my partner and I have created a Rock Paper Scissors simulation using Markov Chain. We have the input for what the computer does, but we don't know how we can keep track of the score.
How can we use VBA or maybe a function to get the score after each round?
We've tried things in VBA we tried different functions. But there is no data to summarize it.
Sub Score()
Dim sVariable As String
Dim iNumber As Integer
Dim iPC As Variant
Dim iPlayer As Variant
sVariable = Sheets("Model").Range("D10")
iPC = Sheets("Model").Range("E6") + 1
iPlayer = Sheets("Model").Range("F6") + 1
iNumber = 1
If sVariable = "PC Winner!" Then
Sheets("Model").Range("E6") = iPC
ElseIf sVariable = "Player Winner!" Then
Sheets("Model").Range("F6") = iPlayer
End If
End Sub
That code is the closest we have gotten and we added a button to make it run since it doesn't do it automatically. But now every time we add the score the move changes for the PC because of the random function we have for the Markov data. We want to keep the score and reset it everytime the game is over.
Probably easiest way is to create a global variable and increment the score upon individual wins and then Call a procedure after each round to update the scores.
Note: Depending on your implementation a global variable may not even be necessary and could be easily passed via an argument. It's just hard to tell without further details provided
Public playerScore as Integer
Public pcScore as Integer
Private Sub update_score()
Sheets("Model").Range("E6") = pcScore
Sheets("Model").Range("F6") = playerScore
End Sub
Private Sub Score()
' ... your code here ...'
If sVariable = "PC Winner!" Then
pcScore = pcScore + 1
Else
playerScore = playerScore + 1
End If
update_score
End Sub
and upon new game you re-initate the score
Private Sub new_game()
pcScore = 0
playerScore = 0
' ... your code here ...'
End Sub
I'm not exactly sure, if I've gotten your question right, but this should work.
In your future questions, it would be welcome, if you did bit of a
better job explaining what data you're working with and how your
desired result should look like, as per Minimal, Complete and
Verifiable Example, because from
your current question it's not clear:
when exactly is the game over
where exactly you want to update your score
on which condition should exactly the score increment
which procedures you are calling upon aforementioned events
So I had to do a lot of guess-work in your question. Either way, should be more than enough to guide you to the right path :)
I'm looking to implement a "Stack" Class in VBA for Excel. I want to use a Last In First Out structure. Does anyone came across this problem before ? Do you know external libraries handling structure such as Stack, Hastable, Vector... (apart the original Excel Collection etc...)
Thanks
Here is a very simple stack class.
Option Explicit
Dim pStack As Collection
Public Function Pop() As Variant
With pStack
If .Count > 0 Then
Pop = .Item(.Count)
.Remove .Count
End If
End With
End Function
Public Function Push(newItem As Variant) As Variant
With pStack
.Add newItem
Push = .Item(.Count)
End With
End Function
Public Sub init()
Set pStack = New Collection
End Sub
Test it
Option Explicit
Sub test()
Dim cs As New cStack
Dim i As Long
Set cs = New cStack
With cs
.init
For i = 1 To 10
Debug.Print CStr(.Push(i))
Next i
For i = 1 To 10
Debug.Print CStr(.Pop)
Next i
End With
End Sub
Bruce
Bruce McKinney provided code for a Stack, List, and Vector in this book (it was VB5(!), but that probably doesn't matter much):
http://www.amazon.com/Hardcore-Visual-Basic-Bruce-McKinney/dp/1572314222
(It's out of print, but used copies are cheap.)
The source code appears to be available here:
http://vb.mvps.org/hardweb/mckinney2a.htm#2
(Caveat - I've never used any of his code, but I know he's a highly regarded, long-time VB expert, and his book was included on MSDN for a long time.)
I'm sure there are also many different implementations for these things floating around the internet, but I don't know if any of them are widely used by anybody but their authors.
Of course, none of this stuff is that hard to write your own code for, given that VBA supports resizeable arrays (most of the way to a vector) and provides a built-in Collection class (most of the way to a list). Charles William's answer for a stack is about all the info you need. Just provide your own wrapper around either an array or a Collection, but the code inside can be relatively trivial.
For a hashtable, the MS Scripting Runtime includes a Dictionary class that basically is one. See:
Hash Table/Associative Array in VBA
I do not know of any external VBA libraries for these structures.
For my procedure-call stack I just use a global array and array pointer with Push and Pop methods.
You can use the class Stack in System.Collections, as you can use Queue and others. Just search for vb.net stack for documentation. I have not tried all methods (e.g. Getenumerator - I don't know how to use an iterator, if at all possible in VBA). Using a stack or a queue gives you some nice benefits, normally not so easy in VBA. You can use
anArray = myStack.ToArray
EVEN if the stack is empty (Returns an array of size 0 to -1).
Using a custom Collections Object, it works very fast due to its simplicity and can easily be rewritten (e.g. to only handle strongly typed varibles). You might want to make a check for empty stack. If you try to use Pop on an empty stack, VBA will not handle it gracefully, as all null-objects. I found it more reasonable to use:
If myStack.Count > 0 Then
from the function using the stack, instead of baking it into clsStack.Pop. If you bake it into the class, a call to Pop can return a value of chosen type - of course you can use this to handle empty values, but you get much more grief that way.
An example of use:
Private Sub TestStack()
Dim i as long
Dim myStack as clsStack
Set myStack = New clsStack
For i = 1 to 2
myStack.Push i
Next
For i = 1 to 3
If myStack.Count > 0 Then
Debug.Print myStack.Pop
Else
Debug.Print "Stack is empty"
End If
Next
Set myStack = Nothing
End Sub
Using a LIFO-stack can be extremely helpful!
Class clsStack
Dim pStack as Object
Private Sub Class_Initialize()
set pStack = CreateObject("System.Collections.Stack")
End Sub
Public Function Push(Value as Variant)
pStack.Push Value
End Function
Public Function Pop() As Variant
Pop = pStack.Pop
End Function
Public Function Count() as long
Count = pstack.Count
End Function
Public Function ToArray() As Variant()
ToArray = pStack.ToArray()
End Function
Public Function GetHashCode() As Integer
GetHashCode = pStack.GetHashCode
End Function
Public Function Clear()
pStack.Clear
End Function
Private Sub Class_terminate()
If (Not pStack Is Nothing) Then
pStack.Clear
End If
Set pStack = Nothing
End Sub
I have created a VSTO plugin with my own RTD implementation that I am calling from my Excel sheets. To avoid having to use the full-fledged RTD syntax in the cells, I have created a UDF that hides that API from the sheet.
The RTD server I created can be enabled and disabled through a button in a custom Ribbon component.
The behavior I want to achieve is as follows:
If the server is disabled and a reference to my function is entered in a cell, I want the cell to display Disabled.
If the server is disabled, but the function had been entered in a cell when it was enabled (and the cell thus displays a value), I want the cell to keep displaying that value.
If the server is enabled, I want the cell to display Loading.
Sounds easy enough. Here is an example of the - non functional - code:
Public Function RetrieveData(id as Long)
Dim result as String
// This returns either 'Disabled' or 'Loading'
result = Application.Worksheet.Function.RTD("SERVERNAME", "", id)
RetrieveData = result
If(result = "Disabled") Then
// Obviously, this recurses (and fails), so that's not an option
If(Not IsEmpty(Application.Caller.Value2)) Then
// So does this
RetrieveData = Application.Caller.Value2
End If
End If
End Function
The function will be called in thousands of cells, so storing the 'original' values in another data structure would be a major overhead and I would like to avoid it. Also, the RTD server does not know the values, since it also does not keep a history of it, more or less for the same reason.
I was thinking that there might be some way to exit the function which would force it to not change the displayed value, but so far I have been unable to find anything like that.
EDIT:
Due to popular demand, some additional info on why I want to do all this:
As I said, the function will be called in thousands of cells and the RTD server needs to retrieve quite a bit of information. This can be quite hard on both network and CPU. To allow the user to decide for himself whether he wants this load on his machine, they can disable the updates from the server. In that case, they should still be able to calculate the sheets with the values currently in the fields, yet no updates are pushed into them. Once new data is required, the server can be enabled and the fields will be updated.
Again, since we are talking about quite a bit of data here, I would rather not store it somewhere in the sheet. Plus, the data should be usable even if the workbook is closed and loaded again.
Different tack=new answer.
A few things I've discovered the hard way, that you might find useful:
1.
In a UDF, returning the RTD call like this
' excel equivalent: =RTD("GeodesiX.RTD",,"status","Tokyo")
result = excel.WorksheetFunction.rtd( _
"GeodesiX.RTD", _
Nothing, _
"geocode", _
request, _
location)
behaves as if you'd inserted the commented function in the cell, and NOT the value returned by the RTD. In other words, "result" is an object of type "RTD-function-call" and not the RTD's answer. Conversely, doing this:
' excel equivalent: =RTD("GeodesiX.RTD",,"status","Tokyo")
result = excel.WorksheetFunction.rtd( _
"GeodesiX.RTD", _
Nothing, _
"geocode", _
request, _
location).ToDouble ' or ToString or whetever
returns the actual value, equivalent to typing "3.1418" in the cell. This is an important difference; in the first case the cell continues to participate in RTD feeding, in the second case it just gets a constant value. This might be a solution for you.
2.
MS VSTO makes it look as though writing an Office Addin is a piece of cake... until you actually try to build an industrial, distributable solution. Getting all the privileges and authorities right for a Setup is a nightmare, and it gets exponentially worse if you have the bright idea of supporting more than one version of Excel. I've been using Addin Express for some years. It hides all this MS nastiness and let's me focus on coding my addin. Their support is first-rate too, worth a look. (No, I am not affiliated or anything like that).
3.
Be aware that Excel can and will call Connect / RefreshData / RTD at any time, even when you're in the middle of something - there's some subtle multi-tasking going on behind the scenes. You'll need to decorate your code with the appropriate Synclock blocks to protect your data structures.
4.
When you receive data (presumably asynchronously on a separate thread) you absolutely MUST callback Excel on the thread on which you were intially called (by Excel). If you don't, it'll work fine for a while and then you'll start getting mysterious, unsolvable crashes and worse, orphan Excels in the background. Here's an example of the relevant code to do this:
Imports System.Threading
...
Private _Context As SynchronizationContext = Nothing
...
Sub New
_Context = SynchronizationContext.Current
If _Context Is Nothing Then
_Context = New SynchronizationContext ' try valiantly to continue
End If
...
Private Delegate Sub CallBackDelegate(ByVal GeodesicCompleted)
Private Sub GeodesicComplete(ByVal query As Query) _
Handles geodesic.Completed ' Called by asynchronous thread
Dim cbd As New CallBackDelegate(AddressOf GeodesicCompleted)
_Context.Post(Function() cbd.DynamicInvoke(query), Nothing)
End Sub
Private Sub GeodesicCompleted(ByVal query As Query)
SyncLock query
If query.Status = "OK" Then
Select Case query.Type
Case Geodesics.Query.QueryType.Directions
GeodesicCompletedTravel(query)
Case Geodesics.Query.QueryType.Geocode
GeodesicCompletedGeocode(query)
End Select
End If
' If it's not resolved, it stays "queued",
' so as never to enter the queue again in this session
query.Queued = Not query.Resolved
End SyncLock
For Each topic As AddinExpress.RTD.ADXRTDTopic In query.Topics
AddinExpress.RTD.ADXRTDServerModule.CurrentInstance.UpdateTopic(topic)
Next
End Sub
5.
I've done something apparently akin to what you're asking in this addin. There, I asynchronously fetch geocode data from Google and serve it up with an RTD shadowed by a UDF. As the call to GoogleMaps is very expensive, I tried 101 ways and several month's of evenings to keep the value in the cell, like what you're attempting, without success. I haven't timed anything, but my gut feeling is that a call to Excel like "Application.Caller.Value" is an order of magnitude slower than a dictionary lookup.
In the end I created a cache component which saves and re-loads values already obtained from a very-hidden spreadsheet which I create on the fly in Workbook OnSave. The data is stored in a Dictionary(of string, myQuery), where each myQuery holds all the relevant info.
It works well, fulfils the requirement for working offline and even for 20'000+ formulas it appears instantaneous.
HTH.
Edit: Out of curiosity, I tested my hunch that calling Excel is much more expensive than doing a dictionary lookup. It turns out that not only was the hunch correct, but frighteningly so.
Public Sub TimeTest()
Dim sw As New Stopwatch
Dim row As Integer
Dim val As Object
Dim sheet As Microsoft.Office.Interop.Excel.Worksheet
Dim dict As New Dictionary(Of Integer, Integer)
Const iterations As Integer = 100000
Const elements As Integer = 10000
For i = 1 To elements + 1
dict.Add(i, i)
Next
sheet = _ExcelWorkbook.ActiveSheet
sw.Reset()
sw.Start()
For i As Integer = 1 To iterations
row = 1 + Rnd() * elements
Next
sw.Stop()
Debug.WriteLine("Empty loop " & (sw.ElapsedMilliseconds * 1000) / iterations & " uS")
sw.Reset()
sw.Start()
For i As Integer = 1 To iterations
row = 1 + Rnd() * elements
val = sheet.Cells(row, 1).value
Next
sw.Stop()
Debug.WriteLine("Get cell value " & (sw.ElapsedMilliseconds * 1000) / iterations & " uS")
sw.Reset()
sw.Start()
For i As Integer = 1 To iterations
row = 1 + Rnd() * elements
val = dict(row)
Next
sw.Stop()
Debug.WriteLine("Get dict value " & (sw.ElapsedMilliseconds * 1000) / iterations & " uS")
End Sub
Results:
Empty loop 0.07 uS
Get cell value 899.77 uS
Get dict value 0.15 uS
Looking up a value in a 10'000 element Dictionary(Of Integer, Integer) is over 11'000 times faster than fetching a cell value from Excel.
Q.E.D.
Maybe... Try making your UDF wrapper function non-volatile, that way it won't get called unless one of its arguments changes.
This might be a problem when you enable the server, you'll have to trick Excel into calling your UDF again, it depends on what you're trying to do.
Perhaps explain the complete function you're trying to implement?
You could try Application.Caller.Text This has the drawback of returning the formatted value from the rendering layer as text, but seems to avoid the circular reference problem.Note: I have not tested this hack under all possible circumstances ...