I am planning to start a project in Excel which will eventually contain about 25 tables, each one with something like 70x30 cells, and one main table that draws information from all the other tables and presents results accordingly.
Since I don't know the exact number of rows and columns and I don't want to leave spaces to make up for future needed rows/cols, I was thinking about putting each table in a separate sheet. Now my question is: how will this affect the speed/performance of my project? (All the cells in the tables contain a roughly line-long formula with calculations.)
Can't answer for Google, but for Excel, distributing the tables over many sheets will not make a significant difference.
For the white spaces, I was using something similar to get the data and leave out the empty cells; you can take this as a starting point:
=QUERY({(sheet1!E4:E);(sheet2!Y4:Y)},"select * where Col1 is not null")
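The {range1;range2} literal stacks the ranges vertically, so more sheets can be appended the same way, e.g. (sheet3 here is just an illustration):
=QUERY({(sheet1!E4:E);(sheet2!Y4:Y);(sheet3!E4:E)},"select * where Col1 is not null")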
Every month I get given a budget from one of our clients in a Google sheet, which I need to convert into a SQL query so it can be uploaded into our database. As the number of rows and columns changes, I want to write some formulas to semi-automate the process, to save time and eliminate mistakes.
This budget has spends in multiple columns, which I've managed to write formulas to combine into one column, with the correct details in the columns next to it (see example links below).
How I've transformed the data so far
The issue is that this budget, per country and partner, then has to be split again across multiple options. This leaves me with three columns' worth of spend values that I'd really like to combine back into one column, ideally skipping all the zero values.
I've found an array formula on this site that will skip the zeroes, but I can't get it to work on more than one column.
=IFERROR(INDEX($U:$U,SMALL(ROW(myRange)*(myRange<>0),SUMPRODUCT(N(myRange=0))+ROWS($1:1))),"")
From this Question's Answer
Is it possible to write a formula that skips the zero values down one column and then starts at the next? One that will also allow me to keep the correct matching details from the other columns alongside it, as well as bring in the column headers for the options as entries in a new column?
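For illustration, I imagine something like a stacked QUERY might be the shape of the answer (a rough sketch only; the ranges, A:B for the detail columns and D, E and F for the three spend columns, are assumptions, and the option headers would still need bringing in separately):
=QUERY({A2:B,D2:D;A2:B,E2:E;A2:B,F2:F},"select * where Col3 <> 0 and Col3 is not null")
Each block pairs the detail columns with one spend column, the blocks stack vertically, and the where clause skips the zeros.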
Thanks
Edit:
Here is the final format I'm looking for:
There is a concatenated field off the end that combines all the columns. Most of the values are populated by various VLOOKUPs, to transform the text versions into the database IDs needed to fill the table.
It's also worth saying that not being able to skip the zeros is OK, as I can delete them manually fairly easily.
But as the number of countries and partners can and will change, I want the formula to be able to move columns as the end of the dataset shifts.
Help!
I have an 8MB 2010 .xlsx workbook (no macros) that runs a full calc in about 2 seconds. It has only 2 worksheets, each with fewer than 1,500 rows. However, the sheets have 100 and 200 columns. It takes 20+ seconds to insert or delete a row (and much, much longer when I delete a group of rows).
It does have a fair amount of calculations, largely made up of INDEX/MATCH formulas. I went through a process to simplify things by only calculating the matches (for the most part) at the top and left of the worksheet. For example, all of F7:DV7 will point to only 2 rows on worksheet 2, so the MATCH() is only done once, in columns C and D.
I realize INDEX/MATCH is more complicated than a simple a+b, and that Excel likes rows more than columns, but this file isn't that big at all and it seems like it should be able to handle it. And the fact that calculation is fine, and it's only adding/deleting rows that is so slow, has me bewildered.
I came across a similar issue recently, and I found this question while searching for an answer online. Unfortunately, it didn't include an answer, so I moved on. However, I found the reason why the worksheet I was working on was taking so long to delete rows and wanted to return to this question and add my 2 cents.
In my case, it turned out one of the VLOOKUP formulas included a table array written as something like SheetName!$A$1:D5000. When the formula was copied down, the range expanded by one with every cell down, so the next cell down had its table array defined as SheetName!$A$1:D5001, and this went on for a few thousand rows. Turning off automatic calculation had no effect on reducing the wait when deleting rows.
Anyway, changing the table array in the VLOOKUP to SheetName!A:D and copying that VLOOKUP down the column did the trick. You didn't mention using a VLOOKUP, but the same thing could be happening in your INDEX/MATCH formulas.
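For illustration, the broken pattern and the fix would look like this (a sketch; the key in column A and the fourth return column are assumptions):
Broken, because the end of the range grows as the formula is copied down:
=VLOOKUP($A2,SheetName!$A$1:D5000,4,FALSE)
Fixed, because the whole-column reference stays the same in every row:
=VLOOKUP($A2,SheetName!A:D,4,FALSE)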
This is an areas problem. When you filter your data and select an entire column, you are selecting multiple non-contiguous ranges, i.e. multiple areas. A workaround could be:
1. Sort your data from A to Z to group the rows you want to delete into only one area
2. Filter the values you want to delete
3. Delete the rows
Enjoy!
If the actual order of your data is important to you, just add a column and fill it with numbers from 1 to n before you start. Perform steps 1, 2 and 3, then sort on that column to restore the original order.
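A rough VBA sketch of the same workaround (the sheet name, filter column and criterion are assumptions):
Sub DeleteFilteredRowsInOneArea()
    Dim ws As Worksheet
    Set ws = ThisWorkbook.Worksheets("Sheet1")   ' sheet holding the data (assumption)
    Application.Calculation = xlCalculationManual
    With ws.Range("A1").CurrentRegion
        ' 1) sort on the filter column so the rows to delete form one contiguous area
        .Sort Key1:=ws.Range("B1"), Order1:=xlAscending, Header:=xlYes
        ' 2) filter down to the values you want to delete
        .AutoFilter Field:=2, Criteria1:="DeleteMe"
        ' 3) the visible data rows are now a single area, so this is one fast delete
        .Offset(1).Resize(.Rows.Count - 1).SpecialCells(xlCellTypeVisible).EntireRow.Delete
    End With
    ws.AutoFilterMode = False
    Application.Calculation = xlCalculationAutomatic
End Sub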
I am finding that inserting rows into table structures or into normal cells, manually or otherwise, is very, very slow: it takes more than 10 minutes to insert 7 rows into a table (containing literal strings only) or into adjacent cells, in a sheet with no conditional formatting.
The workbook has 45 worksheets and 20 tables, with the bigger tables having XML files of about 10KB. There are 33MB worth of worksheet XML, most files being around 300KB, with 5 over 1MB and one at 15MB. It's fairly complex but not massive. All of the calculations flow nicely from left to right, top to bottom, and right sheet to left sheet, and I've mostly managed to avoid array formulas. All of the tables have regular structures, with each calculated column having only one formula. Most of the table columns are calculated, with only a couple of smaller ones containing literal data.
I do have a lot of conditional formatting on a couple of sheets, but I've been very careful to keep it rational and stopped it from fragmenting: I have about 45 rules for the whole sheet, and these are generalised to cover all columns. The main processing for the formatting decisions is moved into the tables as helper columns and, as I said, very regular in structure.
It seems that these types of edits are not thread-safe, so only one processor is loading up, and there is very light disc activity. I can't understand what Excel is doing all that time.
Of course I set calculation to manual...
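In full, the edits get wrapped roughly like this (all standard Application switches):
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False
Application.EnableEvents = False
' ... insert or delete the rows here ...
Application.EnableEvents = True
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic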
I've seen comments attributing this type of thing to the increased row and column limits, but I don't understand why this should be a factor. If I look at the XML files of the spreadsheets, there is only code for rows and columns that are occupied with values or formulas. So why are the unoccupied cells in play?
This is having a massive effect on my productivity, although I'm learning a lot by reading sites like this in my new-found spare time. I really need to figure out what the problem is so that I can avoid or work around it if possible.
Can anybody help me with that?
Just in case people are wondering about this, the answer is to use Power Query and Power View in Excel. I find medium (500k-row) datasets and complex structures and transformations all work without a hitch. I never use formulae in tables anymore. The other thing is that this naturally leads you to Power BI, which is great. That's my tip.
Long insertion times may be due to INDEX (or other functions) that reference a whole column, or a whole row.
I had a very similar problem: a not-too-complex worksheet (about 2,500 rows, with 15 columns of data (results from a query) and about 10 columns of formulas to extract data from the query results). When I inserted a column, the first column might insert within 4 seconds or so, but the second insert would take over a minute. Yikes! I searched the internet and found this page: http://support.microsoft.com/kb/2755145
My experience:
I was using a formula like =INDEX(11:11,1,MATCH(AC$5,$10:$10,0)), about 25000 times in my worksheet. You can see that each formula references an entire row twice. Apparently, when I added a column, since each row is affected, and therefore each of my formulas was affected, Excel would dutifully go to work trying to figure out what to do about that.
Based on what I learned from the Microsoft website, I changed the formula to =INDEX(QueryResults,ROW()-ROW(QueryHeaders),MATCH(AC$5,QueryHeaders,0)), where QueryResults and QueryHeaders are simple named ranges.
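For example (names and extents illustrative), QueryHeaders could refer to =$A$10:$O$10 and QueryResults to =$A$11:$O$2500 on the data sheet, so that ROW()-ROW(QueryHeaders) yields the right row offset into the results.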
After I made this change throughout the sheet, inserting a column became almost instantaneous - less than a second.
This sounds like the problem described here: http://fastexcel.wordpress.com/2012/01/30/excel-2010-tableslistobject-slow-update-and-how-to-bypass/
If so, you have to break one of the conditions to bypass it:
For this slowdown to occur each of the following conditions must be true:
A cell within the Table must be selected
The sheet containing the Table must be the Active Sheet
The cell being updated must be on the same sheet as the table, but does not have to be within the table
There must be a reasonable number of formulas in the workbook.
Maybe you could do the update indirectly via VBA with another sheet active. Or maybe moving all the formulas to a separate workbook would bypass it. Or convert your Tables back to normal ranges (and use dynamic range names if necessary).
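A minimal sketch of the indirect VBA route (the sheet names and target cell are assumptions):
Sub UpdateWhileAnotherSheetIsActive()
    Dim tblSheet As Worksheet
    Set tblSheet = ThisWorkbook.Worksheets("Sheet1")  ' sheet containing the Table (assumption)
    ThisWorkbook.Worksheets("Sheet2").Activate        ' make some other sheet active first
    tblSheet.Range("A1").Value = 42                   ' update without the Table's sheet being active
End Sub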
Try removing the conditional formatting and then reapplying it with VBA after the main code is through. Worked for me.
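A minimal sketch of that approach (the reapplied rule is just an illustration):
Sub StripThenReapplyCF()
    ActiveSheet.Cells.FormatConditions.Delete          ' remove all conditional formatting
    ' ... run the slow main code here ...
    ' then reapply the rules, e.g. highlight values over 100 in A1:A100 (illustrative)
    With ActiveSheet.Range("A1:A100").FormatConditions _
            .Add(Type:=xlCellValue, Operator:=xlGreater, Formula1:="=100")
        .Interior.Color = vbYellow
    End With
End Sub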
Hopefully I can explain this decently.
I am attempting to merge two unique Excel spreadsheets, with some of the same data, into one spreadsheet. Where needed, I would like to remove the data from the incoming spreadsheet. I am doing this as it would make it easier to edit one "like" spreadsheet rather than keeping and updating two copies. I do not want to hide the incoming data; I NEED to completely remove it when needed.
Thanks!
It depends on what the spreadsheets look like and what, exactly, you mean by merge.
If, for example, the two worksheets contain a table each, then you could copy/append one table to the bottom of the other and use Excel's Remove Duplicates feature (on the Data tab) to delete rows.
The duplicates can be identified either by a single code-number column, by all of the columns (meaning that the entire row is duplicated), or by a selection of columns. Be aware that it is the first duplicated row that is kept; the subsequent duplicates will be removed.
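In VBA the same feature is a single call, e.g. (the range and key column are assumptions):
Range("A1").CurrentRegion.RemoveDuplicates Columns:=1, Header:=xlYes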
If, on the other hand, you want to find values in the rows of one of the worksheets, based on a code number contained in a column of the other worksheet, and insert them into specific cells, then this requires more effort, perhaps with the help of the VLOOKUP function (or similar).
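For example (a sketch; it assumes the code numbers are in column A of both sheets and the wanted value sits in the third column of the lookup table):
=VLOOKUP($A2,Sheet2!$A:$E,3,FALSE)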
I need to count identical items in Excel.
The Excel sheet has rows with the following data (a large amount of data in one column):
data: natural,amenity,highway,amenity,amenity,highway,shop,highway,place,place,sport,barrier
amenity,highway,barrier,highway,highway,highway,amenity,amenity,amenity,amenity, natural,amenity,highway,amenity,amenity,highway,shop,highway,place,place,sport,barrier
amenity,highway,barrier,highway,highway,highway,amenity,amenity,amenity,amenity, natural,amenity,highway,amenity,amenity,highway,shop,highway,place,place,sport,barrier
amenity,highway,barrier,highway,highway,highway,amenity,amenity,amenity,amenity.
From this, how can I get the count of amenity and the count of shop?
Thank you
Several ways to do it, listed out here: Count how often a value occurs
The Pivot table approach would be more organized and can be just refreshed if new entries are added. Insert a Pivot table and drag your "Data" field both in the Row Labels and Values of the pivot table (which defaults to Count of Values).
PS: Though you have tagged VBA for this question, please note this is not needed for this simple count.
While I would go with @Kash's answer, if you know the values you want, and assuming the data is in column A, then you could use the formula:
=COUNTIF(A:A,"amenity")
replacing "amenity" with each value you want to count.
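Note that if several values share one cell, comma-separated as in your sample, an exact-match COUNTIF will miss them; a substring count along these lines would be needed (assuming the data sits in A1:A10):
=SUMPRODUCT((LEN(A1:A10)-LEN(SUBSTITUTE(A1:A10,"amenity","")))/LEN("amenity"))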
My Duplicate Master add-in is another potential solution. While for your question as asked I would normally go with the PivotTable suggestion from Kash (but using dynamic ranges to capture any data-size changes), the add-in provides further flexibility and output options that may be of use:
From your question you want only a duplicate count, not values that occur only once (the add-in can handle both dupes and uniques)
Ignoring Case Sensitivity
Ignoring any white spaces (spaces, line breaks, carriage returns, the non-printing Char 160)
Ability to work across multiple sheets
Highlighting, Deletion and Selection options as well as the summary option you need