Excel 2010/2013: inserting rows is very slow

I am finding that inserting rows in table structures or in normal cells - manually or otherwise - is extremely slow. It takes more than 10 minutes to insert 7 rows in a table (containing literal strings only) or in adjacent cells, in a sheet with no conditional formatting.
The workbook has 45 worksheets and 20 tables, with the bigger tables having XML files of about 10KB. There are 33MB worth of spreadsheet XMLs, most of them around 300KB, with 5 larger than 1MB and one of 15MB. It's fairly complex but not massive. All of the calculations flow nicely from left to right, top to bottom, and right sheet to left sheet, and I've mostly managed to avoid array formulas. All of the tables have regular structures, with each calculated column having only one formula. Most of the table columns are calculated, with only a couple of smaller ones containing literal data.
I do have a lot of conditional formatting on a couple of sheets, but I've been very careful to keep it rational and to stop it from fragmenting: I have about 45 rules for the whole sheet and these are generalised to cover all columns. The main processing for the formatting decisions is moved into the tables as helper columns which are, as I said, very regular in structure.
It seems that these types of edits are not thread safe, so only one processor is loaded and there is very light disk activity. I can't understand what Excel is doing all that time.
Of course I set calculation to manual...
I've seen comments attributing this type of thing to the increased row and column limits, but I don't understand why this should be a factor. If I look at the XML files of the spreadsheets, there is only code for rows and columns that are occupied with values or formulas. So why are the unoccupied cells in play?
This is having a massive effect on my productivity - although I'm learning a lot by reading sites like this in my new-found spare time. I really need to figure out what the problem is so that I can avoid or work around this issue if possible.
Can anybody help me with that?
Just in case people are wondering about this, the answer is to use Power Query and Power View in Excel. I find that medium-sized datasets (500k lines) and complex structures and transformations all work without a hitch. I never use formulae in tables anymore. The other thing is that this naturally leads you to Power BI, which is great. That's my tip.

Long insertion times may be due to INDEX (or other functions) that reference a whole column, or a whole row.
I had a very similar problem: a not too complex worksheet (about 2500 rows, with 15 columns of data (results from a query), and about 10 columns of formulas to extract data from the query results). When I inserted a column, the first columns might insert within 4 seconds or so, but the second insert would take over a minute. Yikes! I searched the internet and found this site: http://support.microsoft.com/kb/2755145.
My experience:
I was using a formula like =INDEX(11:11,1,MATCH(AC$5,$10:$10,0)), about 25000 times in my worksheet. You can see that each formula references an entire row twice. Apparently, when I added a column, since each row is affected, and therefore each of my formulas was affected, Excel would dutifully go to work trying to figure out what to do about that.
Based on what I learned from the Microsoft website, I changed the formula to =INDEX(QueryResults,ROW()-ROW(QueryHeaders),MATCH(AC$5,QueryHeaders,0)), where QueryResults and QueryHeaders are simple named ranges.
After I made this change throughout the sheet, inserting a column became almost instantaneous - less than a second.
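To make the logic of the rewritten formula concrete, here is a minimal Python sketch of what =INDEX(QueryResults,ROW()-ROW(QueryHeaders),MATCH(AC$5,QueryHeaders,0)) computes. Only the names QueryResults and QueryHeaders come from the formula above; the headers, the values and the helper function names are invented for illustration.

```python
# Hypothetical stand-ins for the two named ranges; all values are invented.
QUERY_HEADERS = ["Region", "Units", "Revenue"]   # the single header row
QUERY_RESULTS = [                                # the bounded data block below it
    ["North", 12, 340.0],
    ["South", 7, 125.5],
]


def match_exact(value, row):
    """Like MATCH(value, row, 0): 1-based position of the first exact match."""
    return row.index(value) + 1  # raises ValueError where Excel returns #N/A


def index(block, row_num, col_num):
    """Like INDEX(block, row_num, col_num) with 1-based coordinates."""
    return block[row_num - 1][col_num - 1]


def lookup(rows_below_header, header):
    """rows_below_header plays the role of ROW()-ROW(QueryHeaders)."""
    return index(QUERY_RESULTS, rows_below_header, match_exact(header, QUERY_HEADERS))


print(lookup(2, "Units"))  # second data row, "Units" column
```

The point of the sketch is that both lookups touch only the bounded block and its header row, not every cell of an entire worksheet row.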

This sounds like the problem described here: http://fastexcel.wordpress.com/2012/01/30/excel-2010-tableslistobject-slow-update-and-how-to-bypass/
If so, you have to break one of the conditions to bypass it.
For this slowdown to occur, each of the following conditions must be true:
A cell within the Table must be selected
The sheet containing the Table must be the Active Sheet
The cell being updated must be on the same sheet as the table, but does not have to be within the table
There must be a reasonable number of formulas in the workbook.
Maybe you could do the update indirectly via VBA with another sheet active. Or maybe moving all the formulas to a separate workbook would bypass it. Or convert your Tables back to normal ranges (and use dynamic range names if necessary).

Try removing the conditional formatting and then reapplying it with VBA after the main code is through. Worked for me.

Related

Excel - Speed Up Index|Match Array Calculations

I'm having some serious performance issues with an excel workbook I created. I need to pull data from another worksheet in the book that has 7 columns of data and about 300 rows.
The amount of data should be no problem - I think the issue I'm having comes down to an index|match array that has multiple match conditions. I'm wondering if there's another approach I can take, because the workbook is becoming aggravating to work with.
Here's some made up data:
This data is aggregated in a separate program from a database, and I output it to an excel file.
Here's a sample of a made up segment of a report:
The formula for the rows "Active Accounts" and "Online Enabled Accounts" is:
{=IFERROR(INDEX($D:$G,MATCH($K$2&M$1,$C:$C&$A:$A,0),MATCH($L2,$D$1:$G$1,0)),0)}
and the formula for the rows "Both", "Online", and "Paper" are as follows:
{=IFERROR(INDEX($G:$G,MATCH($K2&M$1&$L6,$C:$C&$A:$A&$F:$F,0)),0)}
I have about 5 other "segments" that reflect similar but different data in this format across 13 months. With only 300 data records, this workbook is still painfully slow even to apply formatting, so I'm hoping there's a better approach than just using these arrays with Index|Match.
Select any range in your dataset, press CTRL + A, then press CTRL + T. This will create a table you can reference as a named range.
Write the index match formula as you normally would, but make sure to select only the data you're looking for. Do not select the entire column; that is what's weighing down your system. Simply choose the ranges that contain the data you're looking at.
You'll notice that as you highlight your data it will say something like Table1[Accounts]. This means it's accessing that named column's data range. This allows your formula to grow (or shrink) with the table and not calculate any farther than needed, which saves a tremendous amount of computing power during calculation.
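As a rough illustration of why the bounded range matters, the sketch below mimics the multi-condition MATCH(key1&key2, colA&colB, 0) from the question in Python. The column names and values are invented, since the original screenshots are not available. The work done is proportional to the number of rows passed in: about 300 for the populated range, versus up to 1,048,576 (the row count of a full column in Excel 2007 and later) when whole columns are concatenated.

```python
def multi_match(keys, *columns):
    """Like MATCH(k1&k2, colA&colB, 0) entered as an array formula.

    Returns the 1-based row of the first match, or None where Excel
    would return #N/A (which IFERROR then turns into 0).
    """
    target = "".join(str(k) for k in keys)
    # One concatenation per row: the cost grows with the length of the columns.
    for position, parts in enumerate(zip(*columns), start=1):
        if "".join(str(p) for p in parts) == target:
            return position
    return None


# Hypothetical sample data standing in for the populated rows only.
segment = ["Retail", "Retail", "Business"]
month = ["Jan", "Feb", "Jan"]
active = [120, 135, 40]

row = multi_match(("Retail", "Feb"), segment, month)
value = active[row - 1] if row else 0  # INDEX(...) wrapped in IFERROR(..., 0)
print(value)
```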

One long sheet vs many sheets

I am planning to start a project in Excel which will eventually contain about 25 tables, each one with something like 70x30 cells, and one main table that draws information from all the other tables and presents results accordingly.
Since I don't know the exact number of rows and columns and I don't want to leave spaces to make up for rows/columns needed in the future, I was thinking about putting each table in a separate sheet. Now my question is: how will this affect the speed of my project? (All the cells in the tables contain a formula about a line long, with calculations.)
Can't answer for Google but for Excel distributing over many sheets will not make a significant difference.
For the white spaces, I was using something similar to get the data and skip empty cells; you can take this as a starting point:
=QUERY({(sheet1!E4:E);(sheet2!Y4:Y)},"select * where Col1 is not null")

Changing range in a lookup formula based upon a value

I have a very large Excel file with approximately 200 sheets with fields. Each sheet is a ranking of a subset of values which was output from an R program. There are 2 versions for nearly every entry. The subset data is not in the original sheet - only the name of the sheet, and the summary table I'm trying to build. I'd like to automatically determine which range (sheet) the lookup queries.
The manual answer is to sort, filter, create a lookup, consolidate the summary data, copy the formula, find and replace the range reference, fill, and repeat. Hopefully there is a solution other than copy-pasting and editing hundreds of times.
You may want to re-think the data architecture. If possible, apply the golden rule: keep the data in one sheet and do the reporting on other sheets.
Find a way to have all the 200 sheets' data in just one sheet. You may have to introduce a few additional columns to distinguish each row.
Then you can start building reports on all the data, using Pivot Tables, or more sophisticated tools like Power Pivot.
With the next-to-nothing information you provide about your data set, it is hard to give more concrete advice.
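The consolidation idea can be sketched in a few lines of Python. This is only an illustration under assumed names: the sheet names, the "item" and "rank" fields and the "sheet" column are invented, because the question does not describe the actual layout.

```python
def consolidate(sheets):
    """Stack the rows of every sheet into one long table.

    Each output row gets an extra 'sheet' column so that rows from
    different sheets can still be told apart.
    """
    combined = []
    for sheet_name, rows in sheets.items():
        for row in rows:
            combined.append({"sheet": sheet_name, **row})
    return combined


# Hypothetical workbook: sheet name -> list of rows (all values invented).
workbook = {
    "subset_A_v1": [{"item": "x", "rank": 1}, {"item": "y", "rank": 2}],
    "subset_A_v2": [{"item": "y", "rank": 1}],
}

long_table = consolidate(workbook)
# Choosing "which sheet to look in" becomes a filter on one column
# instead of a hand-edited range reference.
print([r["item"] for r in long_table if r["sheet"] == "subset_A_v2"])
```

Once the data has this shape, a Pivot Table (or a single lookup keyed on the extra column) can replace the per-sheet formulas.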

Excel 2010: adding and deleting rows is very slow but calculation is not

Help!
I have an 8MB Excel 2010 .xlsx workbook (no macros) that runs a full calc in about 2 seconds. It has only 2 worksheets, each with fewer than 1,500 rows. However, they have 100 and 200 columns. It takes 20+ seconds to insert or delete a row (and much, much longer when I delete a group of rows).
It does have a fair amount of calculations in the workbook, largely made up of index/match formulas. I went through a process to simplify that by only calculating the matches (for the most part) at the top and left of the worksheet. For example, all of F7:DV7 will point to only 2 rows on worksheet 2, so the match() is only done once in column C and D.
I realize index/match is more complicated than a simple a+b and that Excel likes rows more than columns, but this file isn't that big at all and it seems like it should be able to handle it. And the fact that the calculation is fine, and it's only when I add/delete rows that it's so slow, has me bewildered.
I came across a similar issue recently, and I found this question while searching for an answer online. Unfortunately, it didn't include an answer, so I moved on. However, I found the reason why the worksheet I was working on was taking so long to delete rows and wanted to return to this question and add my 2 cents.
In my case, it turned out that one of the vlookup formulas included a table array written as something like SheetName!$A$1:D5000. When the formula was copied down, the range expanded by one row in every cell, so the next cell down had its table array defined as SheetName!$A$1:D5001. And this went on for a few thousand rows. Turning off automatic calculation had no effect on reducing the wait whenever deleting rows.
Anyway, changing the table array in the vlookup to SheetName!A:D and copying that vlookup down the column did the trick. You didn't mention you used a vlookup, but it could be happening in the index/match formulas.
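To show how the half-anchored reference drifts, here is a small Python sketch that generates the range text each copied cell ends up with. The function name is made up, and the count of 3000 rows is only a stand-in for the "few thousand rows" mentioned above. The sketch shows that every row references a different range; it does not claim to explain what Excel does internally when rows are deleted.

```python
def filled_down_range(first_end_row, offset):
    """Range text after copying SheetName!$A$1:D<first_end_row> down `offset` rows.

    The start ($A$1) is absolute and stays put; the end (D5000) is
    relative, so its row number shifts by one for every row copied down.
    """
    return f"SheetName!$A$1:D{first_end_row + offset}"


first_three = [filled_down_range(5000, n) for n in range(3)]
print(first_three)

# Assumed size: 3000 formula rows. Each one has its own distinct range.
distinct_ranges = {filled_down_range(5000, n) for n in range(3000)}
print(len(distinct_ranges))
```

With the fully consistent reference SheetName!A:D, every copied formula shares the same range text.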
This is an areas problem. When you filter your data and select an entire column, you are selecting multiple non-contiguous ranges, i.e. multiple areas. A workaround could be:
1) Sort your data from A to Z to group the rows you want to delete in only one area
2) Filter the values you want to delete
3) Delete rows
Enjoy!
If the actual order of your data is important to you, just add a column, fill it with numbers from 1 to n. Perform steps 1) 2) and 3), then restore the original order. Perform step 3).
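The sort, delete and restore sequence can be sketched in Python as follows. This is a simplified model, not Excel code: it sorts directly on the delete criterion rather than A to Z, and the sample rows and function name are invented.

```python
def delete_matching_keep_order(rows, should_delete):
    """Delete matching rows as one contiguous block, then restore the order."""
    # Helper column: number the rows 1..n so the order can be restored later.
    indexed = [(i, row) for i, row in enumerate(rows, start=1)]
    # Sort so that the rows to delete form one contiguous block (one area).
    # The sort is stable and False sorts before True, so kept rows come first.
    indexed.sort(key=lambda pair: should_delete(pair[1]))
    # Find where the block to delete starts and cut it off in one slice.
    first_deleted = next(
        (k for k, pair in enumerate(indexed) if should_delete(pair[1])),
        len(indexed),
    )
    kept = indexed[:first_deleted]
    # Restore the original order by sorting on the helper column.
    kept.sort(key=lambda pair: pair[0])
    return [row for _, row in kept]


data = ["keep a", "drop b", "keep c", "drop d", "keep e"]  # invented sample rows
print(delete_matching_keep_order(data, lambda r: r.startswith("drop")))
```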

Is there an easy way to reformat a poorly-formatted tree to a two-column table?

I have a table representing a series of components and their subcomponents, and the subcomponents' respective subcomponents, and so on. It currently looks like a tree (one-to-many relations), but it could change at some point to resemble a graph (many-to-many relations) instead. Unfortunately, it was poorly formatted by its author, and looks something like this:
The above format is poor because there is a lot of data duplication and it is limited to a set number (4) of tiers. I would instead prefer if it looked something like this:
The above format is nice because there is very little data duplication, and it is not limited to a set number of tiers.
In case there is any confusion about what the tables represent, here is a graphical representation of the data:
It is simple enough to convert from the poor format to the nice format, but there are hundreds of root components, and manual data entry would be far too time-consuming and tedious.
I suspect this problem is unique and I am prepared to write some VBA code myself to parse the table into the nice format, but I thought I'd make sure that this wasn't a common problem with a pre-rolled solution before I rolled my own.
Is there a technical term to describe the poor formatting in the first table? Is there an easier way to reformat the data than to write a VBA macro?
This may be a complete aberration but it works for your sample (and at the moment I don’t have time to break it!)
Add an index (and a label for it) and reverse pivot (eg see An excel formula to find a row/column index in array).
Instead of drilling down on the Grand Totals intercept, drill down on each of the totals for the Tiers.
Reassemble the tables side by side, delete all columns except the Value ones and copy table to another area with Paste Special Values. Remove Duplicates on the range. Every time the value in the column immediately to the right does not change, delete and shift the values in the cells to the left. Reorder the columns right to left.
I copied each pair of adjacent columns in the Tier table (Tier 1 & Tier 2, Tier 2 & Tier 3, Tier 3 & Tier 4) and pasted them stacked vertically into a single pair of columns (Subcomponent & Component).
Next, I removed duplicates by selecting both of my new columns and clicking Remove Duplicates in the Data ribbon tab.
Next, I had to remove all rows which contained a blank cell in the Subcomponent column. To do this, I selected both columns again and filtered the data by clicking Filter in the Data ribbon tab. I selected (Blanks) in the Filter menu on the Subcomponent column and deleted all visible rows. I removed the filter by selecting (Select All) in the filter menu.
The resulting table contained many blank rows, so again I removed duplicates, and then manually shifted the data up one row to displace the one remaining blank row.
In the end, it took about a half hour, which is probably less time than it would have taken me to code a macro, and definitely less time than manual data entry.
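For anyone who would rather script it, the manual steps above (stack adjacent tier columns, drop blanks, remove duplicates) translate into a short Python sketch. It assumes that in each adjacent pair the left tier is the component and the right tier is its subcomponent, and that a blank cell means the branch ends there; the sample rows are invented because the original tables are not shown.

```python
def tiers_to_pairs(rows):
    """Turn fixed-width tier rows into unique (component, subcomponent) pairs."""
    pairs = []
    seen = set()
    for row in rows:
        # Stack each pair of adjacent tiers: (Tier 1, Tier 2), (Tier 2, Tier 3), ...
        for component, subcomponent in zip(row, row[1:]):
            if not subcomponent:      # drop pairs with a blank subcomponent
                continue
            pair = (component, subcomponent)
            if pair not in seen:      # the "Remove Duplicates" step
                seen.add(pair)
                pairs.append(pair)
    return pairs


# Hypothetical tier table; "" marks a branch that ends before Tier 4.
tier_table = [
    ["Car", "Engine", "Piston", "Ring"],
    ["Car", "Engine", "Valve", ""],
    ["Car", "Wheel", "", ""],
]
for component, subcomponent in tiers_to_pairs(tier_table):
    print(component, "->", subcomponent)
```

Because the function only looks at adjacent pairs, it is not tied to four tiers; rows of any width work the same way.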

Resources