I have a workbook that I'm essentially using as a place to log data. Currently it contains about 400 rows, but a new row is added each time I run a process. One of the columns contains =RAND() on each row which is used to sort the data in a random order when it gets exported.
Because of the number of RAND() calculations, the sheet is starting to take a lot of time to insert each row. I'm wondering if something like RANDBETWEEN(0,9999) would be more efficient to serve this purpose - or if there is another alternative that I'm not aware of...
RAND() is a volatile function and will cause a recalculation of the whole workbook whenever any cell is changed. Note that RANDBETWEEN() is volatile too, so swapping it in won't help.
You can stop that with the calculation settings on the Formulas ribbon: set calculation to Manual and click Calculate Now or Calculate Sheet when you want to run a recalculation.
Be aware that the setting applies to Excel as an application and will affect all workbooks. This can be dangerous if you forget that you have manual calculation applied and expect results in other workbooks to recalculate automatically.
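If the rows are added by code, a minimal VBA sketch of the same idea (the sheet name and column layout are assumptions) is to suspend calculation around the insert and recalculate once at the end:

Sub AddLogRow()
    Dim prev As XlCalculation
    prev = Application.Calculation
    Application.Calculation = xlCalculationManual    ' keep RAND() from recalculating on every edit
    With ThisWorkbook.Worksheets("Log")              ' hypothetical sheet name
        Dim nextRow As Long
        nextRow = .Cells(.Rows.Count, "A").End(xlUp).Row + 1
        .Cells(nextRow, "A").Value = Now             ' example logged value
        .Cells(nextRow, "B").Formula = "=RAND()"     ' the random sort key
    End With
    Application.Calculate                            ' recalculate once, at the end
    Application.Calculation = prev                   ' restore the previous setting
End Sub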
Currently, I am trying to calculate daily averages of the availability and performance of machines. In the raw data, each machine has a different number of data points from which the average needs to be calculated.
Is there a way to get VBA to auto-detect the group of rows, and from then, take the average of the data in those rows?
At the moment, I have it being completed in an inefficient and buggy way held together by duct tape. The ActiveCell moves down until it finds a non-blank cell. It stores the value of the cell into an array and adds more data until it reaches a blank row. Then, it writes the average to the average column, clears the array, and moves down until it reaches a non-blank cell.
I am aware that ActiveCell.Select is a terrible way to code, but I don't know of any more elegant way to accomplish this task.
A sample of how the data will look
If you know there will not be duplicated machines, AVERAGEIFS() is a great option; just make sure to add an AND() to check the current row and the next row, as seen below.
If there may be duplicated machines, you can use XMATCH() to find the first empty row, and then either use INDIRECT() to build your AVERAGEIFS() formula (like I did... but only useful if your data will always be in the same place) or use XMATCH() along with OFFSET() and SUM() if you need the data to move more easily. One drawback is requiring a blank row at the top of the data, but this could be worked around with an additional IF statement.
Please note: Yellow cells indicate where the formula was pulled from.
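A minimal sketch of the first approach, assuming the machine name is repeated on every row of its group in column A (with the blank row above each group mentioned above), data values in column B, and the formula filled down column C:

=IF(AND(A2<>"", A1=""), AVERAGEIFS(B:B, A:A, A2), "")

The AND() writes the average only on the first row of each machine's block; AVERAGEIFS() then averages every data point for that machine.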
I'm having some serious performance issues with an Excel workbook I created. I need to pull data from another worksheet in the workbook that has 7 columns of data and about 300 rows.
The amount of data should be no problem - I think the issue I'm having comes down to an INDEX/MATCH array formula that has multiple match conditions. I'm wondering if there's another approach I can take, because the workbook is becoming aggravating to work with.
Here's some made up data:
This data is aggregated in a separate program from a database, and I output it to an Excel file.
Here's a sample of a made up segment of a report:
Where the formulas for the rows "Active Accounts" and "Online Enabled Accounts" are:
{=IFERROR(INDEX($D:$G,MATCH($K$2&M$1,$C:$C&$A:$A,0),MATCH($L2,$D$1:$G$1,0)),0)}
and the formula for the rows "Both", "Online", and "Paper" are as follows:
{=IFERROR(INDEX($G:$G,MATCH($K2&M$1&$L6,$C:$C&$A:$A&$F:$F,0)),0)}
I have about 5 other "segments" that present similar data, broken out differently, in this format across 13 months. With only 300 data records this workbook is still painfully slow to even apply formatting, so I'm hoping there's a better approach than just using these array formulas with INDEX/MATCH.
Select any range in your dataset, hit CTRL + A, then hit CTRL + T. This will create a table you can reference as a named range.
Write the INDEX/MATCH formula as you normally would, except make sure to select only the data you're looking for. That means do not select the entire column; that is what's weighing down your system. Simply choose the ranges with the data you're looking at.
You'll notice that as you highlight your data, the formula will show something like Table1[Accounts], which means it's referencing that named column's data range. This allows your formula to scale as the table grows (or shrinks) and not calculate any farther than needed, which will save your computer a tremendous amount of computing power while it calculates.
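As an illustration only (Table1 and the column names Accounts, Month, and Balance are hypothetical stand-ins), the question's first formula might become:

=IFERROR(INDEX(Table1[Balance], MATCH($K$2&M$1, Table1[Accounts]&Table1[Month], 0)), 0)

This is still entered as an array formula (Ctrl+Shift+Enter) in older Excel versions, exactly like the original, but it now spans only the table's used rows.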
I'm attempting to run an index/match query for 65,000 cells as part of a store inventory calculation in Excel. We have 65,000 unique items in our database.
Anyways, here is the formula I am pasting down a single column for 65,000 rows. Obviously, it runs EXTREMELY slow. What could I possibly change to speed things up?
=INDEX(SAQTY!H:H, MATCH(A2&"GRA", SAQTY!C:C&SAQTY!F:F, 0))
On a side note, the INDEX/MATCH is cross-checking across multiple sheets; does that have anything to do with performance?
Make the references dynamic, like this:
=INDEX(SAQTY!H:H, MATCH(A2&"GRA", SAQTY!C1:INDEX(SAQTY!C:C,MATCH(1E+99,SAQTY!H:H))&SAQTY!F1:INDEX(SAQTY!F:F,MATCH(1E+99,SAQTY!H:H)), 0))
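The key idiom here is matching a huge number against a numeric column, which returns the position of the last number in that column:

=MATCH(1E+99, SAQTY!H:H)

That row number becomes the end point of the C1:INDEX(...) and F1:INDEX(...) ranges, so the expensive concatenated arrays cover only the rows actually in use instead of all 1,048,576 rows of each column.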
Help!
I have an 8MB 2010 .xlsx workbook (no macros) that runs a full calc in about 2 seconds. It has only 2 worksheets, each with fewer than 1,500 rows. However, they have 100 and 200 columns respectively. It takes 20+ seconds to insert or delete a row (and much, much longer when I delete a group of rows).
It does have a fair amount of calculations in the workbook, largely made up of INDEX/MATCH formulas. I went through a process to simplify that by only calculating the matches (for the most part) at the top and left of the worksheet. For example, all of F7:DV7 will point to only 2 rows on worksheet 2, so the MATCH() is only done once, in columns C and D.
I realize INDEX/MATCH is more complicated than a simple a+b, and that Excel likes rows more than columns, but this file isn't that big at all and it seems like it should be able to handle it. And the fact that calculation is fine, and it's only when I add/delete rows that it's so slow, has me bewildered.
I came across a similar issue recently, and I found this question while searching for an answer online. Unfortunately, it didn't include an answer, so I moved on. However, I found the reason why the worksheet I was working on was taking so long to delete rows and wanted to return to this question and add my 2 cents.
In my case, it turned out one of the VLOOKUP formulas included a table array written as SheetName!$A$1:D5000. Because the end of the range wasn't anchored, when the formula was copied down the range expanded by one row in every cell: the next cell down had the table array defined as SheetName!$A$1:D5001, and this went on for a few thousand rows. Turning off automatic calculation had no effect on reducing the wait when deleting rows.
Anyway, changing the table array in the VLOOKUP to SheetName!A:D and copying that down the column did the trick. You didn't mention using VLOOKUP, but the same thing could be happening in your INDEX/MATCH formulas.
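For illustration (the lookup values and column index are made up), the drift and the fix look like this:

=VLOOKUP($A2, SheetName!$A$1:D5000, 4, FALSE)     (row 2)
=VLOOKUP($A3, SheetName!$A$1:D5001, 4, FALSE)     (row 3: the unanchored end grew by one)
=VLOOKUP($A2, SheetName!A:D, 4, FALSE)            (the fix: a stable reference in every row)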
This is an areas problem. When you filter your data and select an entire column, you are selecting multiple non-contiguous ranges, i.e. multiple areas. A workaround could be:
1) Sort your data from A to Z so the rows you want to delete are grouped into only one area
2) Filter the values you want to delete
3) Delete the rows
Enjoy!
If the actual order of your data is important to you, just add a column and fill it with numbers from 1 to n before you start. Perform steps 1), 2) and 3), then sort by that column to restore the original order.
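A minimal VBA sketch of the same three steps (the sheet name, the two-column layout, and the criterion "X" in column B are all assumptions):

Sub DeleteFilteredRowsFast()
    Dim ws As Worksheet
    Set ws = ThisWorkbook.Worksheets("Data")    ' hypothetical sheet
    Dim lastRow As Long
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
    ' 1) Sort so the rows to delete end up in one contiguous area
    ws.Range("A1:B" & lastRow).Sort Key1:=ws.Range("B1"), Order1:=xlAscending, Header:=xlYes
    ' 2) Filter down to the values to delete
    ws.Range("A1:B" & lastRow).AutoFilter Field:=2, Criteria1:="X"
    ' 3) Delete the visible (now contiguous) rows in one operation
    On Error Resume Next                        ' in case the filter matches nothing
    ws.Range("A2:B" & lastRow).SpecialCells(xlCellTypeVisible).EntireRow.Delete
    On Error GoTo 0
    ws.AutoFilterMode = False
End Sub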
I am finding that inserting rows in table structures or in normal cells - manually or otherwise - is very, very slow. It takes more than 10 minutes to insert 7 rows in a table (containing literal strings only) or in adjacent cells, on a sheet with no conditional formatting.
The workbook has 45 worksheets and 20 tables, with the bigger tables having XML files of about 10KB. There are 33MB worth of spreadsheet XMLs, most being around 300KB, five more than 1MB, and one at 15MB. It's fairly complex but not massive. All of the calculations flow nicely from left to right, top to bottom, and right sheet to left sheet, and I've mostly managed to avoid array formulas. All of the tables have regular structures, with each calculated column having only one formula. Most of the table columns are calculated, with only a couple of smaller ones containing literal data.
I do have a lot of conditional formatting on a couple of sheets, but I've been very careful to keep it rational and stop it from fragmenting: I have about 45 rules for the whole sheet, and these are generalised to cover all columns. The main processing for the formatting decisions is moved into the tables as helper columns and, as I said, is very regular in structure.
It seems that these types of edits are not thread safe, so only one processor loads up and there is very light disc activity. I can't understand what Excel is doing all that time.
Of course I set calculation to manual...
I've seen comments attributing this type of thing to the increased row and column limits, but I don't understand why this should be a factor. If I look at the XML files of the spreadsheets, there is only code for rows and columns that are occupied with values or formulas. So why are the unoccupied cells in play?
This is having a massive effect on my productivity - although I'm learning a lot by reading sites like this in my new-found spare time. I really need to figure out what the problem is so that I can avoid or work around this issue if possible.
Can anybody help me on that?
Just in case people are wondering about this, the answer is to use Power Query and Power View in Excel. I find medium (500k lines) datasets and complex structures and transformations all work without a hitch. I never use formulae in tables anymore. The other thing is that this naturally leads you to Power BI, which is great. That's my tip.
Long insertion times may be due to INDEX (or other functions) that reference a whole column, or a whole row.
I had a very similar problem: a not-too-complex worksheet (about 2,500 rows, with 15 columns of data (results from a query) and about 10 columns of formulas to extract data from the query results). When I inserted a column, the first column might insert within 4 seconds or so, but the second insert would take over a minute. Yikes! I searched the internet and found this Microsoft article: http://support.microsoft.com/kb/2755145.
My experience:
I was using a formula like =INDEX(11:11,1,MATCH(AC$5,$10:$10,0)) about 25,000 times in my worksheet. You can see that each formula references an entire row twice. Apparently, when I added a column, every row was affected, and therefore every one of my formulas was affected, so Excel would dutifully go to work trying to figure out what to do about that.
Based on what I learned from the Microsoft website, I changed the formula to =INDEX(QueryResults,ROW()-ROW(QueryHeaders),MATCH(AC$5,QueryHeaders,0)), where QueryResults and QueryHeaders are simple named ranges.
After I made this change throughout the sheet, inserting a column became almost instantaneous - less than a second.
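For context, QueryHeaders and QueryResults are just ordinary bounded named ranges; the sheet name and extents below are assumptions for illustration:

QueryHeaders   refers to   =QuerySheet!$A$10:$Z$10     (the header row of the query output)
QueryResults   refers to   =QuerySheet!$A$11:$Z$2510   (the data rows beneath it)

Because both names cover a bounded block rather than entire rows, a column insert no longer forces Excel to revisit every formula in the sheet.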
This sounds like the problem described here http://fastexcel.wordpress.com/2012/01/30/excel-2010-tableslistobject-slow-update-and-how-to-bypass/
If so, you have to break one of the conditions to bypass it:
For this slowdown to occur, each of the following conditions must be true:
A cell within the Table must be selected
The sheet containing the Table must be the Active Sheet
The cell being updated must be on the same sheet as the table, but does not have to be within the table
There must be a reasonable number of formulas in the workbook.
Maybe you could do the update indirectly via VBA with another sheet active. Or maybe moving all the formulas to a separate workbook would bypass it. Or convert your Tables back to normal ranges (and use dynamic range names if necessary).
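A minimal sketch of the first idea, assuming a hypothetical scratch sheet named "Scratch" and the Table living on a sheet named "Data":

Sub UpdateCellWithOtherSheetActive()
    Dim original As Worksheet
    Set original = ActiveSheet
    Application.ScreenUpdating = False
    ThisWorkbook.Worksheets("Scratch").Activate               ' break the "active sheet" condition
    ThisWorkbook.Worksheets("Data").Range("A2").Value = 42    ' the update itself (example value)
    original.Activate                                         ' put the user back where they were
    Application.ScreenUpdating = True
End Sub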
Try removing conditional formatting and then reapplying it with VBA after the main code is through. Worked for me.
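A rough sketch of that approach (the range and the single rule shown are hypothetical; with many rules you would record and rebuild each one):

Sub StripAndReapplyConditionalFormatting()
    Dim rng As Range
    Set rng = ThisWorkbook.Worksheets("Data").Range("A1:D1000")
    rng.FormatConditions.Delete        ' strip the rules before the slow row edits
    ' ... do the slow inserts/deletes here ...
    With rng.FormatConditions.Add(Type:=xlCellValue, Operator:=xlGreater, Formula1:="=100")
        .Interior.Color = vbYellow     ' example rule: highlight values over 100
    End With
End Sub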