I'm having an issue with a memory leak while running some VBA code I wrote. The code looks at a source spreadsheet, pulls new data, does some work on it, and saves it to other spreadsheets. It then uses Application.OnTime to call itself again in a few minutes, giving me a continually updating dataset. All the Excel files involved are under 10 MB. The result is that after a few hours of running, the Excel process grows to multiple gigabytes, as does the Kernel Memory Paged Pool. Alternatively, I've tried controlling the looping from a Word macro so that I can kill the Excel process after each run completes. This keeps the Excel process's memory usage in check, but the Kernel Memory Paged Pool still grows seemingly without end; after about two days the paged pool reaches about 10 GB.
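For reference, the self-rescheduling pattern I'm describing looks roughly like this (the procedure name and interval here are just illustrative):

Public Sub PullAndProcess()
    ' ...open the source workbook, pull new data, do the work, save the outputs...
    ' re-schedule this same routine to run again in 5 minutes
    Application.OnTime Now + TimeValue("00:05:00"), "PullAndProcess"
End Sub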
I've seen some advice about being wary of using two dots with COM objects as a source of memory issues, for example:
How do I properly clean up Excel interop objects?
https://www.add-in-express.com/creating-addins-blog/2013/11/05/release-excel-com-objects/
But from what I've seen, this advice doesn't address the situation where you're coding on the Microsoft Visual Basic for Applications side of Excel. Does the two-dot issue still apply if my code is all in VBA?
If it does, how far do I need to take this idea when dealing with things like Ranges, Rows, Columns, etc.? For example, a common task is finding the number of used rows in a sheet, which I do with:
Dim ws As Excel.Worksheet
Dim rowsize As Long
Set ws = ThisWorkbook.Worksheets("name")
With ws
    ' count rows from A1 down to the last used cell in column A
    rowsize = .Range("A1", .Range("A" & .Rows.Count).End(xlUp)).Rows.Count
End With
Do you need to create variables that hold Worksheet.Range and Range.Rows to avoid double dots? If so, what would the above code look like when properly written to observe the no-two-dots rule of thumb?
PS
I've tried debugging the memory leak more directly using poolmon.exe. This repeatedly shows CMNb as the culprit tag, but I can't get any further down this debug path, as I'm unable to locate the tag using strings and findstr as demonstrated in the link below:
https://blogs.technet.microsoft.com/markrussinovich/2009/03/10/pushing-the-limits-of-windows-paged-and-nonpaged-pool/
Related
I'm using Excel VBA to pull data from an MS Access DB, with Excel 2013 and Access 2013, both 32-bit. The code has historically used:
Provider=Microsoft.Jet.OLEDB.4.0;
However, some computers have been upgraded to 64-bit Excel 2016, and the Jet provider is not available for 64-bit. I have changed the code to:
Provider=Microsoft.ACE.OLEDB.12.0;
which works on both 64-bit and 32-bit systems. However, I have noticed a significant speed drop in loading/saving data just from changing this line. Does anyone know why this happens and how I can improve it?
You are correct that you have to choose the ACE provider for x64.
And the big advantage of JET was that it was (and still is) installed on all copies of Windows by default, so there is no need to install Access, the Access runtime, or (previously) the Office connectivity package.
As for performance? There have been a few comments about performance in regards to ACE x64.
However, one trick or suggestion is to ensure that the connection stays open. In other words, are you sure it is the row processing that is slow, and not the overall time? Perhaps add a simple timing test to your code, e.g.:
Dim T As Single
T = Timer
' your code here
Debug.Print Timer - T
The above will print the elapsed time to the debug window (while in the VBA IDE, hit Ctrl+G to display the Immediate/Debug window).
The reason I suggest the force-open idea is that you will often find ACE takes a VERY long time to open. But once open, the data reading has good performance (same as before).
So I suggest you check and try this fix.
So open a table (any table) and KEEP it open. Now run your existing code (which may well open and close other tables). The issue is that when ACE attempts to open a table, it tries to put locks on the mdb/accdb file, and it is this locking process that takes a VERY long time.
However, if you force (keep) open one table, then this very slow process of ACE attempting to lock the file for read/write does not occur each time you execute a query or create additional recordsets in code.
So, if the row-reading speed is fast but the time to START and open is very slow, then before you run and test your routines, force open a table into some recordset (keep it active and in scope), and THEN try your code.
I find that 9 times out of 10 this eliminates the slow speed, and I have often seen results that are nothing short of spectacular (it will run faster than before!).
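A minimal sketch of the keep-one-table-open idea, using late-bound ADO (the database path and table name here are placeholders):

Dim cn As Object, rsHold As Object
Set cn = CreateObject("ADODB.Connection")
cn.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\data\sample.accdb;"
' open one small recordset and KEEP it open for the whole session,
' so ACE holds its lock on the file instead of re-locking every query
Set rsHold = CreateObject("ADODB.Recordset")
rsHold.Open "SELECT TOP 1 ID FROM SomeTable", cn
' ... run your normal queries/recordsets against cn here ...
rsHold.Close
cn.Close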
There are a lot of great examples of how to take an Excel range, create an image from it, and save it to the drive. Here is one: Export pictures from excel file into jpg using VBA
This works great on a small scale, but when you try to run this through 3,000 or more iterations, a "memory leak" caused by the repeated use of the clipboard eventually erodes the process and the macro fails somewhere along the way. This occurs even when running 64-bit Excel on a powerful machine (50+ GB of RAM).
Are there any ways to do this without using the clipboard? My first thought was to try to fix the memory leak itself, but all of those attempts have been unsuccessful. For context, I'm basically using the exact code provided in the solution linked above (with a couple of added features to try to reduce memory leaking, like auto-saving the workbook after every 100 images, etc.).
I'm also looking for what you mentioned; here's how to do it with a chart:
Dim file As String ' the path to the saved image, in the temp dir
file = Environ$("temp") & "\chart.gif"
' note: the ChartObjects collection is 1-based, so the first chart is index 1
Sheets("Sheet1").ChartObjects(1).Chart.Export Filename:=file, FilterName:="GIF"
There was ultimately no solution for the memory leak; it seems to be a systemic problem with VBA.
For those trying to programmatically generate charts, it is much easier to build in PHP.
I am running VBA code on a large Excel spreadsheet. How do I clear the memory between procedures/calls to prevent an "out of memory" issue from occurring?
The best way to help memory get freed is to nullify large objects:
Sub Whatever()
    Dim someLargeObject As SomeObject
    ' ...expensive computation...
    Set someLargeObject = Nothing ' release the reference
End Sub
Also note that global variables remain allocated from one call to the next, so if you don't need persistence you should either not use global variables or nullify them when you no longer need them.
However this won't help if:
you need the object after the procedure (obviously)
your object does not fit in memory
Another possibility is to switch to a 64-bit version of Excel, which should be able to use more RAM before crashing (32-bit versions are typically limited to around 1.3 GB).
I've found a workaround. At first it seemed like it would take more time, but it actually makes everything work smoother and faster due to less swapping and more memory being available. This is not a scientific approach, and it needs some testing before you rely on it.
In the code, make Excel save the workbook every now and then. I had to loop through a sheet with 360,000 lines, and it choked badly. After every 10,000 rows I made the code save the workbook, and now it works like a charm, even on 32-bit Excel.
If you start Task Manager at the same time, you can see the memory utilization drop drastically after each save.
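A rough sketch of the periodic-save idea (the row count, interval, and processing are placeholders):

Dim r As Long
For r = 1 To 360000
    ' ...process row r...
    If r Mod 10000 = 0 Then
        ThisWorkbook.Save ' flushing to disk appears to release accumulated memory
    End If
Next r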
The answer is that you can't clear memory explicitly, but you should be freeing it in your routines.
Some tips to help with memory:
Make sure you set objects to Nothing before exiting your routine.
Ensure you call Close on objects if they require it.
Don't use global variables unless absolutely necessary
I would also recommend checking the memory usage after performing the routine again and again; you may have a memory leak.
Found this thread while looking for a solution to my problem. Mine required a different solution that I figured out, and which might be of use to others. My macro was deleting rows, shifting up, and copying rows to another worksheet. Memory usage was exploding to several gigabytes and causing "out of memory" after processing only around 4,000 records. What solved it for me?
Application.ScreenUpdating = False
I added that at the beginning of my code (be sure to set it back to True at the end).
I knew that would make it run faster, which it did, but I had no idea about the memory effect.
After making this small change, memory usage didn't exceed 135 MB. Why did that work? No idea, really. But it's worth a shot and might apply to you.
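A minimal sketch of the toggle, with an error handler so the setting is restored even if the macro fails partway (the processing is a placeholder):

Sub ProcessReport()
    On Error GoTo CleanUp
    Application.ScreenUpdating = False
    ' ...delete rows, shift up, copy rows to the other worksheet...
CleanUp:
    Application.ScreenUpdating = True ' always restored, error or not
End Sub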
If you operate on a large dataset, it is very likely that arrays are being used.
For me, creating a few arrays from a worksheet of 500,000 rows and 30 columns caused this error. I solved it simply by using the line below to get rid of an array that was no longer necessary, before creating another one:
Erase vArray
Also, if only 2 of the 30 columns are used, it is a good idea to create two 1-column arrays instead of one with 30 columns. It doesn't affect speed, but there will be a difference in memory usage.
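As a sketch, assuming the two needed columns are A and F of a worksheet named "Data" (sheet name, columns, and row count are all placeholders):

Dim vColA As Variant, vColF As Variant
With ThisWorkbook.Worksheets("Data")
    vColA = .Range("A1:A500000").Value ' one 1-column array each,
    vColF = .Range("F1:F500000").Value ' instead of one 30-column array
End With
' ...work with the arrays...
Erase vColA
Erase vColF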
I had a similar problem that I resolved myself. I think my code was hogging too much memory while holding too many "big things".
In my application, the workbook goes out and grabs another department's "daily report", and I extract all the information our team needs (to minimize mistakes and data entry).
I pull in their sheets directly, but I hate the fact that they use merged cells, which I get rid of (i.e. unmerge, then find the resulting blank cells and fill them with the values from above).
I made my problem go away by:
a) unmerging only the "used cells", rather than attempting to unmerge the entire column, i.e. finding the last used row in the column and unmerging only that range (there are literally thousands of rows on each of the sheets I grab); see the sketch after this list.
b) knowing that undo only tracks roughly the last 16 actions, between each unmerge I inserted 15 trivial actions that flush out what is stored in the undo stack (i.e. go to some cell with data in it and copy/paste-special value). I was GUESSING that the accumulated sum of 30 sheets, each with 3 columns' worth of data, might be taxing the memory set aside for undoing.
Yes, this removes any chance of an Undo, but the entire purpose is to purge the old information and pull in the new, time-sensitive data for analysis, so it wasn't an issue.
It sounds corny, but my problem went away.
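A sketch of point a), assuming the merged cells are in column A of a sheet named "DailyReport" (both names are my assumptions):

Dim lastRow As Long
With Sheets("DailyReport")
    lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
    With .Range("A1:A" & lastRow)
        .UnMerge ' only the used rows, not the whole column
        On Error Resume Next ' SpecialCells raises an error if there are no blanks
        .SpecialCells(xlCellTypeBlanks).FormulaR1C1 = "=R[-1]C" ' fill blanks from above
        On Error GoTo 0
        .Value = .Value ' turn the fill-down formulas into static values
    End With
End With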
I was able to fix this error by simply initializing a variable that was being used later in my program. At the time, I wasn't using Option Explicit in my class/module.
Here is the result of a profiled simulation run of my MATLAB program. I need to run this simulation several hundred thousand times (~100,000 times).
Thus I need a faster way to read the Excel file.
Specifications: the Excel file is 10000x2 cells, and each simulation run reads one such sheet from each of 5 separate Excel files.
UPDATE: I put the xlsread in basic mode and also reduced the number of calls by combining my input into a single file. Next target is xlswrite now. Ah, that sinking feeling. :|
NOTE: Although writing to a CSV file using dlmwrite is very fast (around 20 times faster), I need the comfort of separate sheets that an .xls file provides.
I don't think you would be able to wring much out of xlswrite if you need Excel sheets as the output.
How about parallelizing?
Do you have access to the parallel computing toolbox? Or maybe you can run two instances of MATLAB if your box supports it. If so, you could consider two approaches:
Have the first process do the xlsread part and the simulation part, then write to MAT files/plain binary/CSV, whichever is fastest while preserving your data integrity. Have another process convert the MAT files/intermediate data files into Excel using xlswrite.
Have N MATLAB instances/workers (N depends on your physical machine's capacity). Parallelize the whole read-process-write pipeline across the N workers. Note: I am not sure how Excel will scale when called by N workers (xlswrite uses ActiveX/MS Excel to write the data).
As with any parallel approach, your mileage will vary with the complexity of the simulation versus the required file I/O and its performance.
We have a massive spreadsheet which does a lot of calculations and not much drawing/writing to sheets.
My question is: does monitoring the spreadsheet while it is running via RDP actually make it slower?
Put differently, if RDP were disconnected, would this result in improved speed?
I've actually done a lot of work from home via Remote Desktop that involved an Excel Workbook (and Access Applications) doing lots of hefty calculations. From my experience, I didn't notice any slowdown in the calculations on the Excel sheet, but occasionally the connection would slow and anything that refreshed the screen a lot would make the PC difficult to use.
The most important thing, however, is to write code that modifies the visual elements of the screen as little as possible. For example, instead of looping through a bunch of cells and setting each one as the active cell to find its value, loop through a set of Range values, which doesn't require the sheet to refresh. This, by far, has created the biggest performance boost in my VBA code.
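For illustration, a sketch of the two approaches (sheet, range, and variable names are mine):

' Slow: selecting each cell forces the screen to track the active cell
Dim c As Range, total As Double
Sheets("Data").Activate ' Select only works on the active sheet
For Each c In Sheets("Data").Range("A1:A10000")
    c.Select
    total = total + ActiveCell.Value
Next c

' Fast: read the block into an array once; no screen activity at all
Dim v As Variant, i As Long
v = Sheets("Data").Range("A1:A10000").Value
For i = 1 To UBound(v, 1)
    total = total + v(i, 1)
Next i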
If your code is already fairly optimized, you'll probably not see any difference monitoring it over RDP. However, if monitoring is your issue, you ought to consider outputting data to a separate Excel or text file stored on a shared server. If done correctly, I imagine that would have a smaller impact on your CPU than RDP. This will still allow you to monitor the progress of the Excel application without having to log in.
Just look at the CPU usage of Excel and the RDP server. If Excel isn't getting its 100% while calculating, or if the RDP server seems to be using too much... then yes, RDP is making things slower.