Why are VBA UDFs so slow compared with native Excel functions?

Let's suppose I write the following VBA UDF:
Function TestFunction()
    TestFunction = 0
End Function
I then use it in the first 100,000 rows of my sheet, and it takes several minutes to calculate.
By contrast, if I use TODAY() in the same number of rows, it takes just 3-4 seconds.
Can anyone tell me why, and whether there is a way to speed up UDFs?
Thank you!

Several reasons.
VBA functions run sequentially on the UI/main thread, and the compiled p-code has to be interpreted by the VBA runtime.
Native functions are native. They're (presumably - AFAIK they're written in C++) already compiled to machine code that's readily executable and doesn't need to be recompiled and/or interpreted. Some native functions can also leverage multithreaded and "background" computing.
As for speeding up your UDFs, we'd need to see your UDFs for that. A function that does nothing other than assign a literal return value doesn't have much room for optimization, does it?
UDFs are great. But they're not a silver bullet. If I wanted to write the value 0 to A1:A1000000, I'd do Sheet1.Range("A1:A1000000").Value = 0 and that would be near-instant.
Consider looking into macros rather than UDFs if you're going to have hundreds of thousands of them to calculate.
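As a rough sketch of the macro approach described above (the sheet index, the range address and the procedure name are assumptions made for illustration), a single assignment can replace 100,000 UDF formulas:

```vba
Option Explicit

' Hypothetical macro: fills a column with a constant in one call
' instead of placing one UDF formula in every cell.
Public Sub FillWithZero()
    Dim target As Range

    ' Assumed location: first worksheet, A1:A100000.
    Set target = ThisWorkbook.Worksheets(1).Range("A1:A100000")

    Application.ScreenUpdating = False
    target.Value = 0                  ' one write for the whole range
    Application.ScreenUpdating = True
End Sub
```

Unlike a UDF, this writes static values, so it has to be rerun whenever the result should change.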

There are a number of different reasons for this.
For one, VBA UDFs are interpreted, whereas native Excel worksheet functions are compiled. You would get a big speed increase if you compiled your VBA code as VB6, for example. VBA and VB6 code are the same or almost exactly the same, so the speedup would come purely from the VB6 code being compiled rather than interpreted.
VBA code also doesn't produce the same type of worksheet functions that Excel does. VBA UDFs lack worksheet IntelliSense, for example, and you can't get it in any way through VBA. You can get it through external add-ins, however (e.g. Excel DNA).
Another reason is that VBA isn't the best API for writing performant UDFs. That would be the C API, although UDFs are harder to write with the C API than in VBA.
There are also a number of other things that could affect speed, like your underlying hardware, or the algorithm you're using in the UDF. It's hard to give you useful suggestions without seeing your code.
Are you sure you need UDFs? The only advantages UDFs have over macros (that I'm aware of, anyway) are that they don't clear the undo stack when they're called, whereas most macros do, and that they recalculate dynamically, whereas you have to keep rerunning a macro (unless you're using a worksheet event or something).
If you're doing a ton of calculations on a range of cells, it's probably better to just write the range to an array, manipulate it in VBA, and then just write it back to the range.
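The read, manipulate, write-back pattern might look roughly like this. It is only a sketch: the range address, the procedure name and the "double every number" operation are placeholders, not something from the question.

```vba
Option Explicit

' Hypothetical example: doubles every plain number in A1:A100000.
Public Sub DoubleValuesViaArray()
    Dim target As Range
    Dim data As Variant
    Dim r As Long

    ' Assumed location: first worksheet, A1:A100000.
    Set target = ThisWorkbook.Worksheets(1).Range("A1:A100000")

    data = target.Value               ' one read: 2-D, 1-based Variant array
    For r = LBound(data, 1) To UBound(data, 1)
        ' Plain numeric cells arrive as Double; skip text, blanks and errors.
        If VarType(data(r, 1)) = vbDouble Then
            data(r, 1) = data(r, 1) * 2
        End If
    Next r
    target.Value = data               ' one write back to the sheet
End Sub
```

The point of the design is that the sheet is touched exactly twice, however many cells are processed in between.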

Related

UDFs Vs Built In Functions in Excel

Is there a way to assess a speed comparison between UDFs, User-Defined-Functions, and Excel's built in functions without benchmark testing? A purely mathematical approach if you will.
For instance, I recently posted the following question:
Unique Count (Excel VBA vs Formulas) Faster Approach
After writing the UDF it was obvious that it was much faster, but I frankly assumed it wouldn't have been, and am looking for clarity on how I can better align my judgement BEFORE writing VBA that proves useless.
I found this:
Speed, VBA VS Excel Formulas, but that question wasn't exactly the same as mine.

How to make an Excel function that include a pause without stalling Excel

I have an Excel "project" that includes a .dll where I have written some complex statistical calculations called through VBA. I have done that for speed reasons. The calculations take about a second each. Since they are called through VBA it stalls Excel for the duration of the calculations and that is acceptable. (The choice of Excel is not mine but a result of the way a third party has chosen to deliver data)
But for the purpose of the project I need to have the results of the calculations turn up after not one second but after ten. I could either expand the calculations for greater accuracy or simply include a pause in the code. But since it is done via VBA it stalls the whole project for all ten seconds and that is not acceptable.
I have looked into ExcelDNA since it avoids VBA completely and might make it possible to do ALL that is done via VBA with ExcelDNA or existing built-in functions. I have modified this example for testing:
https://grumpyop.wordpress.com/2009/11/25/a-third-way-dna/
and added a simple Thread.Sleep(10000); to the code to simulate the pause. But that ALSO stalls Excel for the duration of the calculation.
Is there a way to include a pause in functions that doesn't make Excel wait for the result, but where the result is "pushed" to the cell or the cell "subscribes" to the result? Can it be done via ExcelDNA, XLL or via a third solution? I would prefer a solution where I can use C or very lightly modified C, since all the statistical functions are written in C.

You need to make your function asynchronous.
Excel has supported this since Excel 2010:
https://msdn.microsoft.com/en-us/library/office/ff796219%28v=office.14%29.aspx
ExcelDNA also supports asynchronous functions:
https://exceldna.codeplex.com/wikipage?title=Asynchronous%20Functions
But you cannot use a VBA UDF to call an external resource asynchronously: the UDF has to be an XLL.

Modify Iterative VBA Routine to Run on High Performance Cluster

I have an Excel 2010 64-bit model with a single VBA subroutine that runs through 16 combinations of inputs. All of them are processed using the same Excel model calculations, and the results are output to tabs in the model. I have access to a high performance cluster (HPC) and want to run the 16 combinations in parallel on the HPC instead of the current sequential process. How should I approach this? For example, do I need to put each combination into a separate subroutine and have a main VBA subroutine call each of them? Is there front-end and back-end VBA code that I need to include in order to run the model on the HPC?

Excel VBA does not directly allow multithreading, so unfortunately there is no simple VBA solution for this.
I can see a couple options here, and it will depend on your problem whether you will be able to use them.
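In Excel 2007 and 2010, worksheet functions can execute in parallel (multithreaded recalculation). If your VBA code is a function and not a sub, and if most of your data comes from the worksheet, you could try to take advantage of that. Be aware, though, that VBA UDFs themselves are not treated as thread-safe and are evaluated on the main thread, so any gain would come from the built-in functions and thread-safe XLL functions around them.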
You could write a DLL that handles multithreading yourself, and call it from Excel. For this, you'd have to port your code to VB 6 or VB.NET (or straight up rewrite it in C/C++), and manually deal with multithreading.

Why function SUMIFS is better than iterate (for)

Excel's SUMIFS function can't execute if the other workbook is closed. So I wrote my own SUMIFS function that opens my workbook, iterates with a 'for' loop and checks whether each value in the column needs to be added to my total variable.
I then wrote another function that removes the 'for' loop and uses "WorksheetFunction.SumIfs(...)". The new function ran faster than the old one.
What is the magic behind Excel functions compared with VBA iteration?

From: http://msdn.microsoft.com/en-us/library/ff726673.aspx#xlUsingFuncts
(emphasis added)
User-Defined Functions
User-defined functions that are programmed in C or C++ and that use the C API (XLL add-in functions) generally perform faster than user-defined functions that are developed using VBA or Automation (XLA or Automation add-ins). For more information, see Developing Excel 2010 XLLs.
XLM functions can also be fast, because they use the same tightly coupled API as C XLL add-in functions. The performance of VBA user-defined functions is sensitive to how you program and call them.
Faster VBA User-Defined Functions
It is usually faster to use the Excel formula calculations and worksheet functions than to use VBA user-defined functions. This is because there is a small overhead for each user-defined function call and significant overhead transferring information from Excel to the user-defined function. But well-designed and called user-defined functions can be much faster than complex array formulas.
One way to get more insight is to test how the performance ratio changes with different input sizes. For example, if the performance ratio remains about the same when the input size increases 100-fold, then it is probably due to VBA abstraction overhead.
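A minimal timing sketch for that kind of test is shown here. It assumes keys in column A and numbers in column B of the first worksheet, with no error values or text in column B, and the match key 1 is arbitrary. The procedure names are made up for illustration.

```vba
Option Explicit

' Sums vals where the matching key equals key, using a plain VBA loop.
Private Function LoopSumIf(keys As Variant, vals As Variant, key As Variant) As Double
    Dim r As Long
    Dim total As Double

    For r = LBound(keys, 1) To UBound(keys, 1)
        If keys(r, 1) = key Then total = total + vals(r, 1)
    Next r
    LoopSumIf = total
End Function

' Prints loop time and SumIfs time (in seconds) for two input sizes.
Public Sub CompareSumIfTimings()
    Dim ws As Worksheet
    Dim n As Variant
    Dim keyRng As Range, valRng As Range
    Dim t0 As Double, tLoop As Double, tNative As Double
    Dim a As Double, b As Double

    Set ws = ThisWorkbook.Worksheets(1)   ' assumed layout: keys in A, numbers in B
    For Each n In Array(1000, 100000)     ' sizes differ 100-fold
        Set keyRng = ws.Range("A1").Resize(n, 1)
        Set valRng = ws.Range("B1").Resize(n, 1)

        t0 = Timer
        a = LoopSumIf(keyRng.Value, valRng.Value, 1)
        tLoop = Timer - t0

        t0 = Timer
        b = Application.WorksheetFunction.SumIfs(valRng, keyRng, 1)
        tNative = Timer - t0

        Debug.Print n, tLoop, tNative
    Next n
End Sub
```

Timer has limited resolution, so the smaller size may need to be repeated many times in a loop before the numbers mean much.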

Why is Excel's 'Evaluate' method a general expression evaluator?

A few questions have come up recently involving the Application.Evaluate method callable from Excel VBA. The old XLM macro language also exposes an EVALUATE() function. Both can be quite useful. Does anyone know why the evaluator that is exposed can handle general expressions, though?
My own hunch is that Excel needed to give people a way to get ranges from string addresses, and to get the value of named formulas, and just opening a portal to the expression evaluator was the easiest way. (The help for the VBA version does say its purpose is to "convert a Microsoft Excel name to an object or a value".) But of course you don't need the ability to evaluate arbitrary expressions just to do that. (That is, Excel could provide a Name.Evaluate method or something instead.)
Application.Evaluate seems kind of...unfinished. Its full behavior isn't very well documented, and there are quite a few quirks and limitations in what is exposed (as described by Charles Williams here: http://www.decisionmodels.com/calcsecretsh.htm).
I suppose the answer could be simply "why not expose it?", but I'd be interested to know what design decisions led to this feature taking the form that it does. Failing that, I'd be interested to hear other hunches.

Well, I think it's required to enable VBA to get the result from a named formula (or a string containing a formula). (OK, there is also the ugly method of inserting the formula into a spare cell and then reading back the result, but that won't work from inside a UDF, for example.)
In VBA it's complex to determine whether a defined name contains a range reference or a formula. Using Evaluate works for both cases.
It's also sometimes very efficient, and simpler, to build Excel formulae as strings and evaluate them rather than bringing all the data from Excel into VBA and doing the calculations there. (It's expensive to get data from Excel into VBA, and even worse with current .NET implementations.)
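To illustrate both uses, here is a small sketch. The sheet name Sheet1, the ranges and the defined name "MyName" are all hypothetical, and the SUMPRODUCT formula is just an example of a calculation left to Excel.

```vba
Option Explicit

Public Sub EvaluateExamples()
    Dim result As Variant

    ' A formula built as a string and evaluated by Excel itself,
    ' so the cell data never has to be copied into VBA.
    result = Application.Evaluate( _
        "SUMPRODUCT((Sheet1!A1:A1000>0)*(Sheet1!B1:B1000))")
    If IsError(result) Then
        Debug.Print "Formula returned an error value"
    Else
        Debug.Print result
    End If

    ' A defined name (hypothetical). Assigning without Set yields the
    ' value(s) whether the name refers to cells or holds a formula.
    result = Application.Evaluate("MyName")
    If IsArray(result) Then
        Debug.Print "MyName evaluates to an array of values"
    ElseIf IsError(result) Then
        Debug.Print "MyName could not be evaluated"   ' e.g. the name does not exist
    Else
        Debug.Print result
    End If
End Sub
```

Evaluate reports problems such as an unknown name as an Excel error value rather than raising a VBA runtime error, which is why the sketch checks IsError.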

Resources