Reducing occupied memory by multidimensional variant array in vba - excel

I have a rather large array:
Global add_vit(0 To 6, 0 To 6, 2 To 18, 0 To 1300, 0 To 8) As Variant
I fill it partially in module a, sub one, which takes a long time. I want to execute module a, sub one only once.
Once the execution has finished, I want to access the above Variant array in module b, sub two, and run module b, sub two multiple times independently, in order to verify the code I am working on there.
I learnt that the "Global" part means that the array stays filled even after module a, sub one has completely finished. That is what I need, but at the same time I am on the verge of getting out-of-memory errors.
I have several such arrays in module a, sub two which are all interlinked, and under certain conditions I need specific entries to be copied to specific entries of other arrays. This also prevents me from separating the computation of this last global array into another module.
I am also confused by the apparent randomness of when an "out of memory" error occurs when I run the same script with the same initial conditions at different times, though I assume it is because the amount of memory available to Excel is not static but depends on the other processes running on my laptop at the same time.
Does anyone have a suggestion on how to keep the same number of entries (doubles and longs, or doubles and booleans; approx. 8,000,000) accessible from different modules once the initial computation is finished, without occupying so much memory?*
*Without manually storing them into an excel sheet for it is tedious and slows down computation drastically.
I also try to reset the entire array manually before computations with the following script:
For A = 0 To 6
    For B = 0 To 5
        For C = 2 To 18
            For d = 0 To 1300
                For e = 0 To 4
                    add_vit(A, B, C, d, e) = ""
                Next e
                For f = 6 To 7
                    ' index with f here; e is already 5 after the loop above
                    add_vit(A, B, C, d, f) = False
                Next f
            Next d
        Next C
    Next B
Next A

Thank you for your tips!
@A.S.H Thank you, I have tested and validated that a public indeed also retains its value after the sub/module has completed.
So I apologize, @ThunderFrame, you were right: I could use a public instead of a sub, thank you.
@user3598756 Thank you, that was the statement I was initially looking for: the Erase statement.
@Absinthe I forgot to mention that module 2 is actually a sub in a form that is generally called with userfomX.show; I still have to look up how the variables are passed through to that. But mainly the separation of the two modules was what I was looking for, meaning that I wanted to run module 2 without needing to run module 1 every time, whilst still using the stored output of a single run of module 1.
Thank you so much everyone, you are awesome!
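For anyone landing here with the same memory problem, here is a minimal sketch of the two ideas from the comments: typed arrays instead of Variant, and Erase instead of a reset loop. The declared array has 7 x 7 x 17 x 1301 x 9 = 9,753,597 elements. A Variant takes 16 bytes in 32-bit VBA (24 bytes in 64-bit), so that is roughly 156 MB, whereas a Double takes 8 bytes and a Boolean 2. The names vitNum and vitFlag and the way the last dimension is split between numbers and flags are assumptions for illustration, not the asker's actual layout.

```vba
Option Explicit

' Hypothetical typed replacements for the single Variant array.
' Public module-level arrays keep their contents after the Sub that
' filled them has finished (until the VBA project is reset).
Public vitNum() As Double
Public vitFlag() As Boolean

Public Sub AllocateArrays()
    ' ReDim without Preserve allocates the array and initialises every
    ' element to its default (0 for Double, False for Boolean).
    ReDim vitNum(0 To 6, 0 To 6, 2 To 18, 0 To 1300, 0 To 5)   ' assumed split
    ReDim vitFlag(0 To 6, 0 To 6, 2 To 18, 0 To 1300, 6 To 8)  ' assumed split
End Sub

Public Sub ReleaseArrays()
    ' Erase on a dynamic array frees its memory; on a fixed-size array
    ' it would instead reset every element to its default value.
    Erase vitNum
    Erase vitFlag
End Sub
```

Calling AllocateArrays again before a fresh computation replaces the nested reset loop, and ReleaseArrays hands the memory back once the data is no longer needed.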

Related

Cut string from the first number from the left using Excel VBA

I need to get the column letters of a range for a macro. I specifically need the column letters, the numbers you get for columns directly using VBA Address functions won't work. Since the ranges are always from one column only, this simplifies the task. The range retrieved could be something like B3 or B3:B5, but always the same column and are inside a table.
So, what I need (in this case) to get would be B as a string. I tried to do the following:
RangeOfInterest = Worksheets("Sheet1").Range("Table1[Column1]").Address(0, 0)
RangeColumn = Right(RangeOfInterest, Len(RangeOfInterest) - InStr(RangeOfInterest, [0-9]))
However, I run into a series of issues with this. First, there is the InStr function. I thought this was the best way, because this function searches for the position of a character starting from the left, which is exactly what I need. However, I would need it to search for many values (any number from 0 to 9). Could I add all the numbers as search arguments or use some kind of trick to search between a range of numbers? What I tried certainly doesn't work.
On the other hand, I assume that if I somehow manage to add all numbers from 0 to 9, the function would start searching for them one by one instead of stopping the first time there is any number in the string? This would result in the issue that if for example there is a range like B3:B10 it will begin searching for a 0 and return the position of the 0 and finish, hence my code will return the string B3:B1 instead of just B.
Also, I can't just use a fixed solution like Left(RangeOfInterest, 1) to get the B because the code should work with any range, and once you reach the Z the column letters are double and go like AA, AB and so on.
I thought that another alternative would be to loop, but all my tries resulted in very complex pieces of code for what seems to have a pretty easy solution. Also, if possible, I would like to avoid looping although that doesn't matter if there is no other option. I would really appreciate any suggestion to solve this.
More often, one wants the number of a column rather than its name, because if you feed Excel the name it will convert it to the corresponding number for processing. Therefore I hope you don't need the name for the purpose of addressing a cell using VBA. Anyway, here you go:
Dim RangeOfInterest As String
RangeOfInterest = Worksheets("Sheet2").Range("Table1[Column1]").EntireColumn.Address(0, 0)
RangeOfInterest = Split(RangeOfInterest, ":")(0)
Debug.Print RangeOfInterest
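The same trick can be wrapped in a small function so it works for any single-column range. This is only a sketch: ColumnLetters and DemoColumnLetters are made-up names, and the demo assumes the active sheet is a worksheet.

```vba
Option Explicit

' Hypothetical helper: returns the column letters of a single-column range.
Public Function ColumnLetters(ByVal target As Range) As String
    ' EntireColumn.Address(0, 0) yields e.g. "B:B" or "AA:AA";
    ' the part before the colon is the column name.
    ColumnLetters = Split(target.EntireColumn.Address(0, 0), ":")(0)
End Function

Public Sub DemoColumnLetters()
    Debug.Print ColumnLetters(ActiveSheet.Range("B3:B10"))  ' prints B
    Debug.Print ColumnLetters(ActiveSheet.Range("AA5"))     ' prints AA
End Sub
```

Because the address is taken from the whole column, no row digits appear in it, so there is nothing to search for with InStr and no loop is needed.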

Assigning and reading multidimensional arrays in Python

I'm stumped.
for a in range(0,500): # 500 is a highly variable number but using it for example purposes
    b = findall(r'<(.*?)>', d) # d yields a highly variable number of matches, anywhere from 45-10000
    c.append([b])
print(c[0][1])
This raises an error because everything from 'b' goes into c[0][0]. I can understand this. The question is how do I split 'b' apart and put it into c so that I can
print(c[0][234])
and have it give me back the 235th, err, element 234 of the 1st, err, 0th line?
As I said above, the number of times going through 'b' will vary. For now, until I get the entire file prepped, I can only say that 'b' will end up way north of 10,000 and probably closer to 100,000 by the time I have finished all the data collection. The number of elements stored can and will vary widely depending on the file they come from. They all come from a csv file, but I'm hoping not to add any 'complexity' by having to deal with the csv module, since I've never used it before and that would probably just lead to more questions.
I have tried something similar to the following (with different variables, naturally, so that things match up):
d = list(zip(*(e.split(',') for e in b)))
All this did was split on each and every letter instead of on the comma.
Your error is coming from the square brackets you have in c.append([b]). The brackets create an extra list that contains the list b. So rather than a two dimensional data structure, you're ending up with three dimensions. Your indexing fails because c[0][1] is trying to get a second value from the middle list (which only ever has one item in it).
You might get what you want with c[0][0][1] instead. But you probably don't actually want that extra level in your data structure. You can avoid creating it by using: c.append(b)

VBA Excel avoiding for loops with variable column

I’m using VBA for the first time. Up till now, I succeeded in building a model. However, I would still like to speed up calculations (I have already turned off ScreenUpdating, EnableEvents, xlCalculationAutomatic, DisplayPageBreaks). I have read on the internet that for-loops are quite time-consuming. Unfortunately, I really have a lot of them. Therefore, my question:
Assume I have this type of code:
For p = 1 To Periods
Demand(p) = Worksheets("Sheet1").Cells(3, 1 + p)
Next p
My first question: Does this for-loop really slow down run time?
Second question: How can I rewrite it, thereby speeding up calculations?
I have been trying the following:
Demand = Worksheets("Sheet1").Range(Worksheets("Sheet1").Cells(3, 2), Worksheets("Sheet1").Cells(3, 1 + Periods))
But unfortunately, this does not seem to work. I have already Googled this, but I do not seem to find an answer.
Many thanks in advance for your help!
It's not the For loops that take the time, it's what you do inside them. With Excel VBA, one of the slowest things you can do is interact with worksheets.
One quick change you could consider is this:
With Worksheets("Sheet1")
    For p = 1 To Periods
        Demand(p) = .Cells(3, 1 + p)
    Next p
End With
... which means that your code knows it's always dealing with the same worksheet and doesn't have to look up the right one each time you go around the loop. You're probably not going to notice much difference, but you've gone from one worksheet lookup per period to one per run. You still have a cell lookup per period, though.
You can get to a single call to the worksheet though: get all the values in one call and avoid the loop completely:
Demand = Worksheets("Sheet1").Cells(3, 2).Resize(1, Periods).Value
That's going to get you a 1 x Periods array in Demand. Try it - use "View...Locals Window" to see what you've got.
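If you would rather keep Demand as a one-dimensional array of numbers, a rough sketch is to read the block once into a Variant and then copy it over in memory. The value of Periods and the Sub name LoadDemand are assumptions here, and the copy will raise a type mismatch if a cell holds non-numeric text.

```vba
Option Explicit

Public Sub LoadDemand()
    Const Periods As Long = 12      ' assumed value, for illustration only
    Dim raw As Variant              ' receives a 1 x Periods, 1-based 2-D array
    Dim Demand() As Double
    Dim p As Long

    ' One call to the worksheet. Note that for a single cell (Periods = 1)
    ' .Value returns a scalar rather than an array.
    raw = Worksheets("Sheet1").Cells(3, 2).Resize(1, Periods).Value

    ReDim Demand(1 To Periods)
    For p = 1 To Periods
        Demand(p) = raw(1, p)       ' in-memory copy, no worksheet access
    Next p
End Sub
```

The loop is still there, but it only touches arrays, which is cheap compared with reading cells one by one.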
It actually depends on how large your end counter is and how small your steps are. You're likely to experience noticeable slowdowns if your end counter is > 500K (really big).
However, you should consider the following:
Optimizing the instructions inside the loop
Your speed is more likely to be influenced by the instructions inside the loop. You're better off optimizing them than worrying about whether you should use a For Each or a For loop.
Example: Using the .Rows().EntireRow property within a loop will often decrease the speed of your code, because Excel treats a row as a range spanning every column of the sheet (16,384 cells per row in current Excel versions). That's exactly what's going to slow your code down.
Avoid using repetitive referencing
With statements are a good way to increase the speed of your code. Just make sure that your loops are nested within a With statement. That way, VBA does not have to go through parent references multiple times.
@Mike Woodhouse's answer is a good example of how you can use a With statement to increase the speed of your code.
VBA will execute your instructions Periods times with a single reference to Worksheets("Sheet1"):
With Worksheets("Sheet1")
    For p = 1 To Periods
        Demand(p) = .Cells(3, 1 + p)
    Next p
End With
However, in your old code (below), VBA resolves Worksheets("Sheet1") Periods times, once on every pass through the loop:
For p = 1 To Periods
    Demand(p) = Worksheets("Sheet1").Cells(3, 1 + p)
Next p

Excel VBA code execution dramatically faster after breaking and resuming

I have a piece of code which is pretty straight forward:
Dim r As Integer, c As Integer
Dim rcnt As Integer, ccnt As Integer
With ActiveSheet
    .Unprotect
    Application.ScreenUpdating = False
    rcnt = .UsedRange.Rows.Count
    ccnt = .UsedRange.Columns.Count
    For r = 3 To rcnt
        For c = 1 To ccnt
            If Not .Cells(r, c).Locked Then
                .Cells(r, c) = ""
            End If
        Next
    Next
    Application.ScreenUpdating = True
    ThisWorkbook.ProtectSheet ActiveSheet
End With
It is run as part of a larger context where I manually shuffle stuff into several sheets from an external file. The really, really odd thing is that when I execute the larger set of procedures (of which this snippet is a part) it will be very, very slow (30-70 seconds). However, if I hit CTRL-BREAK, step into debug mode, and then immediately resume execution, the code performs as expected, meaning a sub-second time span for all consecutive sheets.
I'm posting here to see if someone has run across similar behaviour, and if so, how did you fix it?
Thanks in advance!
/Martin Rydman
The same thing happened to me and cost me some hours. So, just for the record: when the execution stops, Excel processes every pending event in the spreadsheet, which can easily get blocked if you are updating a lot of cells. Adding a "DoEvents" command in key portions of the code solved the problem. It unblocks the spreadsheet and makes everything work "dramatically faster".
I hope this will save someone else some time...
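As a rough sketch of what that could look like for the loop in the question: the interval of 50 rows and the Sub name are arbitrary assumptions, and the sheet is assumed to be unprotected already.

```vba
Option Explicit

Public Sub ClearUnlockedWithDoEvents()
    Dim r As Long, c As Long
    Dim rcnt As Long, ccnt As Long

    With ActiveSheet
        rcnt = .UsedRange.Rows.Count
        ccnt = .UsedRange.Columns.Count
        For r = 3 To rcnt
            For c = 1 To ccnt
                If Not .Cells(r, c).Locked Then .Cells(r, c).ClearContents
            Next c
            ' Yield to Excel every 50 rows so pending events get processed
            ' (the interval is an arbitrary choice, tune it for your sheet).
            If r Mod 50 = 0 Then DoEvents
        Next r
    End With
End Sub
```

Calling DoEvents on every single cell would add overhead of its own, which is why the sketch only yields once per batch of rows.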
I've seen Excel behave oddly many times so I believe you. Most of the time I do not stop to diagnose but instead choose to code around it or quite simply take a different strategy.
However, if I look at your code then perhaps we might be able to tighten it further.
On the line
.Cells(r, c) = ""
you are actually allocating a string, even though the string is empty. String allocation does take time. Perhaps you could use
.Cells(r, c).Clear
or
.Cells(r, c) = Empty
Both of these clear the cell's value, as your line does, although .Clear also removes the cell's formatting (use .ClearContents if you want to keep it). Will you try these and then see if you still have oddness?
Also you might want to consider clearing contiguous ranges in one statement. So
Range("B2:D4").Clear
will clear nine cells in one go; imagine if you could clear hundreds of cells in one go. I realise from your code that you have cells that you do not want cleared, which you are marking/protecting with the Locked property, but nevertheless I reckon you can write some more complex code to identify blocks of cells that can be cleared in one go.
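One possible way to do that, sketched under the assumption that the sheet is already unprotected and has no merged cells: collect the unlocked cells with Union and clear them in a single statement. The Sub name is made up, and for very large numbers of scattered cells building the Union can itself become slow.

```vba
Option Explicit

Public Sub ClearUnlockedCells()
    Dim cell As Range
    Dim toClear As Range

    ' Collect every unlocked cell from row 3 downwards into one range.
    For Each cell In ActiveSheet.UsedRange.Cells
        If cell.Row >= 3 And Not cell.Locked Then
            If toClear Is Nothing Then
                Set toClear = cell
            Else
                Set toClear = Union(toClear, cell)
            End If
        End If
    Next cell

    ' One write to the sheet instead of one per cell.
    If Not toClear Is Nothing Then toClear.ClearContents
End Sub
```

The loop still reads the Locked property cell by cell, but the expensive part, writing to the sheet, happens only once.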
And finally, it looks like you are constantly recycling your sheets, i.e. populating them and then clearing them down. You might want to instead start with a fresh sheet, put in your labels, lock any cells as necessary and then populate with variable data; actually come to think of it you could use a template sheet, see this link.
http://office.microsoft.com/en-gb/excel-help/about-excel-templates-HP005229286.aspx

User Defined Functions in Excel and Speed Issues

I have an Excel model that uses almost all UDFs. There are say, 120 columns and over 400 rows. The calculations are done vertically and then horizontally --- that is first all the calculations for column 1 are done, then the final output of column 1 is the input of column 2, etc. In each column I call about six or seven UDFs which call other UDFs. The UDFs often output an array.
The inputs to each of the UDFs are a number of variables, some range variables, some doubles. The range variables are converted to arrays internally before their contents are accessed.
My problem is the following, I can build the Excel model without UDFs and when I run simulations, I can finish all computations in X hours. When I use UDFs, the simulation time is 3X hours or longer. (To answer the obvious question, yes, I need to work with UDFs because if I want to make small changes to the model (like say add another asset type (it is a financial model)) it takes nearly a day of remaking the model without UDFs to fit the new legal/financial structure, with UDFs it takes about 20 minutes to accommodate a different financial structure.)
In any case, I have turned off screen updating, there is no copying and pasting in the functions, the use of Variant types is minimal, all the data is contained in one sheet, and I convert all Range-type variables to arrays before getting the contents.
What else can I do other than getting a faster computer or the equivalent to make the VBA code/Excel file run faster? Please let me know if this needs more clarification.
Thanks!
Couple of general tips.
Take your function and work out where the bottlenecks really are. See this question for the use of timer in excel. I'm sure there are VBA profilers out there... but you probably don't need to go that far. (NB: do this with one cell of data first...)
Think about your design... 400x120 cells of data is not a lot, and for it to take hours must be painful. (In the past I've cracked it after waiting a minute for 1,000s of VLOOKUPs to return.) Anyway, instead of having a stack of UDFs, why not have a simple subroutine that loops (For Each) through the range and does what you need it to do? 48,000 cells could take seconds or maybe just minutes. You could then associate the subroutine with a button or menu item for the user.
Out of interest I had a quick look at option 2 and created MyUDF(). Using the sub DoMyUDF() to call it for the active selection worked 10x faster for me than having the UDF in each and every cell.
Option Explicit
Function MyUDF(myVar As Variant) As Variant
    MyUDF = myVar * 10
End Function
Sub DoMyUDF()
    Dim r As Range
    Dim c As Variant
    Dim t As Single
    t = Timer   ' Timer returns seconds since midnight
    If TypeName(Selection) <> "Range" Then
        Exit Sub
    End If
    Set r = Selection.Cells
    Application.DisplayStatusBar = True
    For Each c In r
        c.Value = MyUDF(c.Value)
        Application.StatusBar = "DoMyUDF(): " & Format(Timer - t, "0.0000") & " s"
    Next
    Debug.Print "DoMyUDF(): " & Format(Timer - t, "0.0000") & " s"
End Sub
If you replace MyUDF() with your UDF this may only save you 4.5 minutes... but it's possible there are some other economies you can build in. Especially if you are repeating the same calcs over and over again.
There is a slowdown bug in the way Excel handles UDFs. Each time a UDF gets calculated Excel refreshes the VBE title bar (you can see it flicker). For large numbers of UDFs this is very slow.
The bypass is very simple in Manual Calculation mode: just initiate the calculation from VBA using something like Application.Calculate (you can trap F9 etc with OnKey).
see http://www.decisionmodels.com/calcsecretsj.htm for some more details.
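A minimal sketch of that bypass, assuming you are happy to leave the workbook in manual calculation mode; the Sub names are invented for illustration.

```vba
Option Explicit

Public Sub RecalcFromVba()
    ' Keep Excel from recalculating on its own, then trigger it from VBA.
    Application.Calculation = xlCalculationManual
    Application.Calculate
End Sub

Public Sub TrapF9()
    ' Route the F9 key to the VBA-initiated recalculation.
    Application.OnKey "{F9}", "RecalcFromVba"
End Sub

Public Sub ReleaseF9()
    ' Calling OnKey without a procedure restores the normal F9 behaviour.
    Application.OnKey "{F9}"
End Sub
```

Run TrapF9 once (for example from Workbook_Open) and ReleaseF9 when the workbook closes, so other workbooks are not affected.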
Have you considered replacing the UDF (which gets called once per output cell) with a macro, which can operate on a range of cells in a loop?
UDF setup/teardown is very slow, and anything each UDF call does in common with other UDFs (reading from overlapping inputs, for example) becomes extra effort.
I've been able to improve performance 10-50x doing this--had a situation less than a month ago involving a spreadsheet with 4000-30000 UDF calls, replaced them with a single macro that operates on a few named ranges.
Control and minimize recalculations with
wks.EnableCalculation = False
or
Application.Calculation = xlCalculationManual
Also, minimize the exchanges between VBA and the workbooks. It is faster to read and write a block of cells at once into an array
MyArray = range("B2:B20000")
rather than cell by cell (for each...).
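To illustrate the round trip, here is a sketch that reads the block, works on it in memory, and writes it back in one go. The multiplication by 10 is just a stand-in for the real calculation, and note that writing the array back replaces any formulas in the range with values.

```vba
Option Explicit

Public Sub ScaleRangeInMemory()
    Dim data As Variant
    Dim i As Long

    With ActiveSheet.Range("B2:B20000")
        data = .Value                   ' one read: 2-D, 1-based Variant array
        For i = LBound(data, 1) To UBound(data, 1)
            ' Skip blanks, text and error values
            If IsNumeric(data(i, 1)) And Not IsEmpty(data(i, 1)) Then
                data(i, 1) = data(i, 1) * 10
            End If
        Next i
        .Value = data                   ' one write back to the sheet
    End With
End Sub
```

That is two exchanges with the worksheet in total, instead of two per cell.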
Make sure you start the recalc from the VBA and not from the spreadsheet, see http://msdn.microsoft.com/en-us/library/aa730921.aspx#Office2007excelPerf_Overview
and the section about Faster VBA User-Defined Functions.
That is
Application.Calculate
is much faster (in my test case 100 times) than pressing F9 in the spreadsheet.
