I have a table that is pulling thousands of rows of data from a very large sheet. Some of the columns in the table are getting their data from every 5th row on that large sheet. In order to speed up the process of creating the cell references, I used an OFFSET formula to grab a cell from every 5th row:
=OFFSET('Large Sheet'!B$2572,(ROW(1:1)-1)*5,,)
=OFFSET('Large Sheet'!B$2572,(ROW(2:2)-1)*5,,)
=OFFSET('Large Sheet'!B$2572,(ROW(3:3)-1)*5,,)
=OFFSET('Large Sheet'!B$2572,(ROW(4:4)-1)*5,,)
=OFFSET('Large Sheet'!B$2572,(ROW(5:5)-1)*5,,)
etc...
OFFSET can eat up resources during calculation of large tables though, and I'm looking for a way to speed up/simplify my formula. Is there any easy way to convert the OFFSET formula into just a simple cell reference like:
='Large Sheet'!B2572
='Large Sheet'!B2577
='Large Sheet'!B2582
='Large Sheet'!B2587
='Large Sheet'!B2592
etc...
I can't just paste values either. This needs to be an active reference, because the large sheet will change.
Thanks for your help.
And here is one last approach to this that does not use VBA or formulas. It's just a quick and dirty use of AutoFilter and deleting rows.
Main idea
Add a reference to a cell =Sheet1!A1 and copy it down to match as many rows as there are in the main data.
Add another formula in B1 to be =MOD(ROW(), 5)
Filter column B and uncheck the 0s (or any single number)
Delete all the rows that are visible
Delete column B
Voila, formulas for every 5th row
Some reference images, these are all taken on Sheet2.
Formulas with AutoFilter ready.
Filtered and ready to delete
Delete all those rows (select A1, CTRL+SHIFT+DOWN ARROW, SHIFT+SPACE, CTRL+MINUS)
Delete column B to get final result with "pure" formulas every 5th row.
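The filter-and-delete steps above amount to keeping only the rows whose row number is a multiple of 5. Here is a rough Python sketch of that selection, only to illustrate the arithmetic. The function names and the default sheet/column are my own placeholders, not anything in the workbook.

```python
def rows_kept_after_filter(total_rows, step=5):
    """Return the 1-based row numbers that survive the filter-and-delete.

    Mirrors =MOD(ROW(), step): rows with remainder 0 are unchecked in the
    filter (hidden), every visible row is deleted, so only those remain.
    """
    return [row for row in range(1, total_rows + 1) if row % step == 0]


def remaining_formulas(total_rows, sheet="Sheet1", column="A", step=5):
    """Formulas left in the column after the deletion, top to bottom."""
    kept = rows_kept_after_filter(total_rows, step)
    return [f"={sheet}!{column}{row}" for row in kept]
```

Unchecking a different remainder in the filter would shift which rows survive (for example remainder 1 keeps rows 1, 6, 11 and so on).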
If you want to take a VBA approach to this, you can generate the references very quickly using simple For loops.
Here is some very crude code which can get you started. It uses hard-coded sheet names and variables. I am really just trying to show the i*5 part.
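Sub CreateReferences()
    Dim i As Long, j As Long
    ' i counts rows in the summary, j counts columns
    For i = 0 To 12
        For j = 0 To 5
            ' Summary row i points at the master row i * 5 below A5
            Sheet2.Range("H1").Offset(i, j).Formula = _
                "=Sheet1!" & Sheet1.Range("A5").Offset(i * 5, j).Address
        Next j
    Next i
End Sub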
It works by building a quick formula using the Address from a reference to a cell on Sheet1. The only key here is to have one index count cells in the "summary" rows and multiply it by 5 to get the reference to the "master" sheet. I am starting at A5 just to match the results from INDEX.
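To make the i*5 idea easier to check outside Excel, here is a small Python sketch that builds the same formula strings the macro would write. The helper names are hypothetical, and it assumes the default absolute style of Range.Address (for example $A$5).

```python
def column_letter(index):
    """Convert a 0-based column index to an Excel column letter (0 -> A)."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def build_references(rows=13, cols=6, start_row=5, step=5, sheet="Sheet1"):
    """Return a rows x cols grid of formula strings, one per summary cell.

    Summary row i points at master row start_row + i * step, which is the
    same arithmetic as Offset(i * 5, j) applied to A5 in the macro.
    """
    return [
        [f"={sheet}!${column_letter(j)}${start_row + i * step}" for j in range(cols)]
        for i in range(rows)
    ]
```

The defaults mirror the loop bounds in the macro (0 To 12 and 0 To 5). Changing start_row and step would adapt it to other layouts.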
Results show the formula input for H1 and over. I am comparing to the INDEX results generated above.
Here is one approach using INDEX instead of OFFSET. I am not sure if it is faster, I guess you can check. INDEX is not volatile, so you might get some advantage from that.
Picture of ranges, you can see that Sheet1 has a lot of data and Sheet2 is pulling every 5th row from that sheet. The data in Sheet1 goes from A1:F1000 and just reports the address of the current cell.
Formulas use INDEX and are copied down and across from A1 on Sheet2.
=INDEX(Sheet1!$A$1:$F$1000,ROW()*5,COLUMN())
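As a quick sanity check of the ROW()*5 mapping, this Python sketch models Sheet1 as a grid where each cell reports its own address (as in the example data) and shows what the INDEX formula would return in a given Sheet2 cell. The function names are placeholders of mine.

```python
def sheet1_value(row, col):
    """Stand-in for Sheet1!A1:F1000: each cell holds its own address."""
    return f"{'ABCDEF'[col - 1]}{row}"


def sheet2_value(row, col):
    """Result of =INDEX(Sheet1!$A$1:$F$1000, ROW()*5, COLUMN()) in Sheet2.

    row and col are the 1-based position of the Sheet2 cell holding the formula.
    """
    source_row = row * 5
    if not (1 <= source_row <= 1000 and 1 <= col <= 6):
        # INDEX reports #REF! when asked for a cell outside the range
        raise IndexError("#REF!: outside Sheet1!A1:F1000")
    return sheet1_value(source_row, col)
```

With 1000 source rows, the formula can be copied down 200 rows before it runs off the end of the range.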
I have an Excel document with two sheets, data and edu-plan. The sheet data has the following information:
The sheet edu-plan looks like this:
My question is: how do I create an Excel formula that checks whether the target group on a specific row in edu-plan! has the course name in question on the same row as the target group in sheet data!, i.e. whether Sales and Sales course are on the same row in the sheet data!?
In reality, the data sheet has a couple of hundred rows and will change over time, so I am trying to develop a formula that I can apply easily to all rows/columns in edu-plan!.
The desired result in edu-plan would look like this:
A pivot table might be a good way to go.
If you would like to do it by formula, then you can just use a COUNTIFS
=IF(COUNTIFS(data!$A$2:$A$10,$A2,data!$B$2:$B$10,B$1),"X","")
A possible way to solve your issue with an array formula:
Write in B2 of sheet edu-plan
{=IFERROR(IF(MATCH('edu-plan'!$A2&'edu-plan'!B$1,data!$A$2:$A$6&data!$B$2:$B$6,0)>0,"x",""),"")}
Since it is an array formula, you need to enter it with Ctrl + Shift + Enter.
Here is the formula broken down:
MATCH('edu-plan'!$A2&'edu-plan'!B$1,data!$A$2:$A$6&data!$B$2:$B$6,0)
checks whether the combination of row header and column header is in the data table. MATCH returns the index of the found combination. Since we are not interested in the location, we only ask IF(MATCH > 0, "x", "") to write an "x" if a match was found. If MATCH finds nothing, it returns an error, which is why we add an IFERROR(VALUE, "") around the construct.
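The same MATCH / IF / IFERROR flow can be sketched in Python to show what each layer does. This is only an illustration. The function name and the sample rows in the usage note are made up (only the "Sales" / "Sales course" pair comes from the question), and the comparison here is exact, whereas MATCH ignores case.

```python
def mark(target_group, course, data_rows):
    """Return "x" if (target_group, course) appears on one row of data_rows.

    data_rows is a list of (target group, course name) pairs, standing in
    for data!A2:A6 and data!B2:B6.
    """
    # data!$A$2:$A$6 & data!$B$2:$B$6 : one concatenated key per row
    keys = [group + name for group, name in data_rows]
    try:
        # MATCH(..., 0) gives the 1-based position of the first exact hit
        position = keys.index(target_group + course) + 1
    except ValueError:
        # MATCH found nothing; IFERROR(..., "") turns the error into a blank
        return ""
    return "x" if position > 0 else ""
```

For example, mark("Sales", "Sales course", [("IT", "Security"), ("Sales", "Sales course")]) gives "x", while asking for ("IT", "Sales course") gives an empty string. One thing to keep in mind with plain concatenation is that different pairs can produce the same key ("ab" & "c" equals "a" & "bc"), so a separator character may be worth adding if the names could overlap like that.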
I'm learning to use array formulas and have been successful doing simple things like adding 2 columns together in a third column. For example, I can put =arrayformula(B:B+C:C) in D1 and it adds B and C for each row.
But now I have a situation where I want to subtract two numbers in the same column. I want to take the value of that column in the current row and subtract the previous row's value from it. Without array formulas this is simple: in O7 I put =N7-N6 and copy that down so O8 gets =N8-N7, etc. But that requires copying down every time - can I do the same thing with an array formula?
Basically, can I do something like =arrayformula(B:B+(B-1):(B-1)) ?
Context: column N is a monthly account balance. I would like to calculate how much that balance changed each month. So for row 7, =N7-N6 gives me that difference. But I'm changing the entire spreadsheet to array formulas so I can stop pasting all of the formulas, and I'm stuck on this one since it's comparing the same column.
I'm trying to get everything into Row 1 so my values and calculations can start in Row 2. For example, here's one of my formulas in Row 1:
=arrayformula(if(row(A:A)=1,"Total gross income",if(LEN(B:B),B:B+C:C,"")))
Unfortunately, in Column O (the one I asked about originally) if I do this:
=arrayformula(if(row(A:A)=1,"Amount saved this month",if(row(A:A)>1,if(LEN(N:N),N2:N-N:N,""))))
Or this:
=arrayformula(if(row(A:A)=1,"Amount saved this month",if(row(A:A)>1,if(LEN(N:N),offset(N:N,1,0)-N:N,""))))
Every row is off by 1 - the result that should go in Row 3 goes in Row 2, etc. And if I do this:
=arrayformula(if(row(A:A)=1,"Amount saved this month",if(row(A:A)>1,if(LEN(N:N),N:N-offset(N:N,-1,0),""))))
Then it gives me an error because the offset function is trying to evaluate something out of range (possibly it starts with N1 and tries to grab a value 1 row above N1?)
Any advice on how to handle that out-of-range error?
I think the error occurs because the range N:N starts at N1, and offsetting it by -1 tries to shift it one cell up, which takes the formula off the top of the sheet.
Try this formula instead:
=arrayformula(
{"Amount saved this month";
if(LEN(N2:N),N2:N-offset(N2:N,-1,0),"")})
It uses {} to make an array. See more info:
https://support.google.com/docs/answer/6208276?hl=en
Bonus. There is no reason to check row number now.
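To see the row alignment, here is a small Python model of what the array literal produces: the header first, then "this row minus the row above" for every row from N2 down. The function name is mine. Leaving the first data row blank when the cell above is the header text is an assumption of this sketch; the sheet formula itself would try to subtract the header there.

```python
def amount_saved(column_n):
    """Model of {"Amount saved this month"; N2:N - offset(N2:N,-1,0)}.

    column_n[0] is N1 (the header cell), column_n[1] is N2, and so on.
    Blank cells are represented by "".
    """
    result = ["Amount saved this month"]  # first element of the {} array
    for row in range(1, len(column_n)):
        current, previous = column_n[row], column_n[row - 1]
        if current == "":
            # if(LEN(N2:N), ..., "") leaves empty rows empty
            result.append("")
        elif isinstance(previous, (int, float)):
            result.append(current - previous)
        else:
            # Assumption: show a blank when the row above is not a number
            result.append("")
    return result
```

With balances 100, 150, 130 under a header, the output column reads header, blank, 50, -20, so each difference lands on the same row as the later balance.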
I have a sheet setup that calculates totals. Easy enough if the data is already there but not if adding new data. So what I would like to be able to do is to not specify a specific end cell for the sum formula but let it update as more columns are added.
How can I do this with =SUM(m4:m?)
Suppose you need totals for data in M4:M10.
To make the range effectively open-ended, you can set the lower limit far beyond the data: =SUM(M4:M100000).
Alternatively you can make it as =SUM(M:M) - SUM(M1:M3)
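A tiny Python model of the whole-column-minus-header idea, just to show why it keeps working as values are appended. The helper names are placeholders. Like SUM, the helper ignores text and blanks.

```python
def excel_sum(cells):
    """Sum only the numeric cells, the way SUM skips text and blanks."""
    return sum(
        c for c in cells
        if isinstance(c, (int, float)) and not isinstance(c, bool)
    )


def open_ended_total(column_m, first_data_row=4):
    """Model of =SUM(M:M) - SUM(M1:M3) for data starting at first_data_row."""
    header_cells = column_m[: first_data_row - 1]  # M1:M3
    return excel_sum(column_m) - excel_sum(header_cells)
```

With column_m = ["Title", 2024, "", 10, 20, 30] the result is 60, and appending another value simply adds to it without touching the formula.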
But this is not applicable when you need to have totals value just below the set of values. In this case you have 2 ways.
Using Excel embedded features
The formula will look like this: =SUM(M4:M10). If you insert a new row between M4 and M10 (for instance, select row 5, right-click, insert row), your formula will be automatically adjusted to =SUM(M4:M11).
The problem may happen if you want to insert a new value above the first row (select row 4, right-click, insert row) or below the last row (select row 11, right-click, insert row). In these cases totals formula will not be adjusted.
Possible workarounds:
For the "above first row" issue, I prefer to make some empty row above and hide it. In our case I would hide row 3 and make totals formula look like =SUM(M3:M10), so, when you insert a new row above the first row, in fact you insert a row to the middle of the table, and totals formula will be adjusted.
For the "below last row" - leave empty row below; but in this case you cannot hide it; just make it different color and make some remark like "new values shall be inserted ABOVE this line".
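The three cases above (insert inside the range, at the first row, or just below the last row) can be summarised in a few lines of Python. This is only a model of how the reference in the totals formula moves, as I understand Excel's behaviour, and the function name is made up.

```python
def adjust_range(first, last, inserted_at):
    """Rows covered by =SUM(M<first>:M<last>) after inserting one row.

    inserted_at is the row number selected before choosing "insert row";
    the new empty row takes that number and everything below shifts down.
    """
    if inserted_at <= first:
        # Whole range shifts down; the new row ends up above it, outside
        return first + 1, last + 1
    if inserted_at <= last:
        # Inserted inside the range, so the range grows by one row
        return first, last + 1
    # Inserted below the last row: the formula is not adjusted
    return first, last
```

For SUM(M4:M10), adjust_range(4, 10, 5) gives (4, 11), adjust_range(4, 10, 4) gives (5, 11) and adjust_range(4, 10, 11) gives (4, 10). With the hidden-row workaround, adjust_range(3, 10, 4) gives (3, 11), so the new row 4 is included.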
INDEX()
An interesting trick is to use the INDEX() function, which returns a reference to a cell in the array. In our case, the array can be the whole column M, and the index is the row number.
For the "above first row" issue make totals formula like this =SUM(INDEX(M:M;4):M10). So, calculation will always start at row 4, even if some lines will be added/deleted.
"below last row". Suppose you have your "totals cell" in M13 and you want to have totals for all value between M4 and the "totals cell". The formula may look like =SUM(M4:INDEX(M:M;ROW(M13))) or, considering "above first row" case: =SUM(INDEX(M:M;4):INDEX(M:M;ROW(M13)))
Hope this helps
Sum(m4:m?) suggests that you are looking to add more rows rather than more columns of data.
If you want to automatically sum the data as rows are added, you can use something like:
=SUM(OFFSET(A1;0;0;COUNT(A:A);1))
However this assumes that the data is contiguous in each cell and also empties are not allowed for 0 because it gets the count wrong.
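Here is a short Python model of that caveat, showing how a single blank makes the COUNT-based height come up short. The function names are placeholders, and blanks are represented by "".

```python
def is_number(value):
    """True for numeric cells (what COUNT would count)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def offset_count_sum(column_a):
    """Model of =SUM(OFFSET(A1;0;0;COUNT(A:A);1)).

    COUNT gives the number of numeric cells in the column, and OFFSET
    then takes that many cells starting at A1.
    """
    height = sum(1 for cell in column_a if is_number(cell))
    if height == 0:
        raise ValueError("OFFSET with a height of 0 is an error")
    return sum(cell for cell in column_a[:height] if is_number(cell))
```

With [1, 2, 3, 4] the result is 10, but with [1, 2, "", 4] the count is 3, the range stops at A3, and the 4 in A4 is left out (result 3).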
However: You could also define a table for the data range. If you add data to columns/rows that are in that data range, they will be included in the adjusted formula automatically - very nice indeed.
Select your data range, then Select Insert:Table. This will give your table a name like Table1.
Your sum function would now be adjusted to look something like:
=SUM(Table1)
Now, as you add to the range, the table resizes, and your function just works.
The beauty of using a table, is that if you add data to the row/column immediately after the table it resizes and includes that range. This is hard to do without a table. You can also change the format of the table, or make the format colours invisible but you're probably better off with some format to show the data area of the table to the user.
You can compute the last row containing a number using this formula:
=LOOKUP(2,1/ISNUMBER($J:$J),ROW($J:$J))
This formula would not have a problem if you had text or blanks in the range.
You could then define that formula as a Defined Name (called LastRow in the formula below) and use the formula:
=SUM(OFFSET(J4,0,0,LastRow-3))
to Sum the range. Note the -3 at the end to compensate for the first cell being in row 4.
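The LastRow idea and the -3 adjustment can be checked with a small Python model. It scans from the bottom for the last numeric cell, then sums from row 4 through that row. The function names are mine, and blanks are modelled as "" or None.

```python
def is_number(value):
    """True for numeric cells (what ISNUMBER would accept)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def last_numeric_row(column_j):
    """1-based row of the last numeric cell, like the LOOKUP(2,1/ISNUMBER(...)) trick."""
    for row in range(len(column_j), 0, -1):
        if is_number(column_j[row - 1]):
            return row
    raise ValueError("#N/A: no numeric cell in the column")


def sum_from_row_4(column_j):
    """Model of =SUM(OFFSET(J4,0,0,LastRow-3))."""
    height = last_numeric_row(column_j) - 3  # rows 4..LastRow inclusive
    if height < 1:
        raise ValueError("no numeric data at or below row 4")
    cells = column_j[3:3 + height]           # J4 is index 3 in a 0-based list
    return sum(cell for cell in cells if is_number(cell))
```

With ["Header", "", "", 5, "text", None, 7, ""] the last numeric row is 7, the height is 4, and the sum is 12, so text and blanks inside the range do no harm.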
Another option would be to just set your range to a fixed range that you can guarantee will be larger than any range you might actually use:
=SUM(J4:J1000)
You can use a counta to find the max row number. Then pushing that into an indirect will give you the range you need.
=SUM(INDIRECT("A1:A" & COUNTA(A1:A1000000);TRUE))
Assumptions:
Data are on column A
Data start from first row
There are no blank rows
I need help with the following:
I have a worksheet containing some data. Row 1 is header and from row 2 downward is the data. At the end there is total for all the data above. This worksheet is dynamic, i.e., if week 1 has 200 rows of data, then week 2 could have 250 or 190 rows of data.
Likewise, the columns across, change every week. This week I have 18 columns and next week I could have 20 columns.
Within row # 1, the header, I have two headings "CTAEO1P" and "CTAEO2P".
On another worksheet, I want to add the "totals" of both of those columns i.e., Individual totals of CTAEO1P = 32.98 + CTAEO2P = 46.25 = 79.23
I am using named ranges and named the whole of the worksheet with data as "MT". The range is whole of the worksheet so when next week I copy the data over from another worksheet, I should not have to adjust the range.
I am using the following formula, courtesy of another expert on this forum:
=HLOOKUP("CT*",MT,MATCH(9^99,INDEX(MT,0,MATCH("CT*",INDEX(MT,1,0),0))),0)
This formula looks for any column that starts with "CT"; the "MATCH(9^99" and "INDEX" parts then find the last number within that column (the total in this case) and return that value on the worksheet. In this case the formula returns "32.98" only, as this is the first occurrence.
I think I can use "Sumproduct" formula here but then a) I would have to create more than one named range, one for the header row and another for the "Total" row, b) every week I would have to adjust the range for "Total" row. Unless, if I can nest "Match(9^99..." part within "SUMPRODUCT" function.
I want to use "MT" range alone and want to add the totals of all the columns that start with "CT".
I hope I have been able to explain my problem well enough to make some sense. However, if you need any further information, please let me know.
Regards
Tariq
I will forget about the MT range, as long as your data starts in A1 this will work
=SUMPRODUCT(ISNUMBER(SEARCH("CT*";OFFSET(A1;0;0;1;MATCH(9^99;2:2))))*OFFSET(A1;MATCH(9^99;A:A)-1;0;1;MATCH(9^99;2:2)))
Depending on your regional settings you may need to replace field separator ";" by ","
I think you can use a relatively simple SUMPRODUCT solution like this
=SUMPRODUCT((LEFT(INDEX(MT,1,0),2)="CT")*ISNUMBER(MT),MT)/2
SUMPRODUCT will total all values in the relevant columns, including the totals, so division by 2 will ensure you get the correct total
If you don't like that approach then assuming first column of MT always has data and that the totals for each column will all be in the same row you can use SUMIF like this
=SUMIF(INDEX(MT,1,0),"CT*",INDEX(MT,MATCH(9^99,INDEX(MT,0,1)),0))
That should be more efficient than the first version
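To convince yourself that the two formulas agree, here is a Python model of both on a small made-up table. Only the header names CTAEO1P and CTAEO2P come from the question. The other column names and all the numbers are invented, and the first column is assumed to be numeric so the last-number lookup can find the totals row.

```python
def is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def ct_total_sumproduct(mt):
    """Model of =SUMPRODUCT((LEFT(header,2)="CT")*ISNUMBER(MT),MT)/2.

    Adds every number in the "CT" columns (data rows plus the totals row),
    then halves it because the totals row repeats the data.
    """
    header = mt[0]
    total = 0.0
    for row in mt:
        for name, value in zip(header, row):
            if str(name)[:2] == "CT" and is_number(value):
                total += value
    return total / 2


def ct_total_sumif(mt):
    """Model of the SUMIF version: add the "CT*" cells of the totals row.

    The totals row is taken as the last row whose first cell is numeric,
    which is what MATCH(9^99, first column) locates.
    """
    header = mt[0]
    totals_row = next(row for row in reversed(mt) if is_number(row[0]))
    return sum(
        value for name, value in zip(header, totals_row)
        if str(name).startswith("CT") and is_number(value)
    )
```

For a table with header ["ID", "CTAEO1P", "AB", "CTAEO2P"], data rows [1, 1.5, 9, 3.0] and [2, 2.5, 9, 4.0], and totals row [3, 4.0, 18, 7.0], both functions return 11.0. The halving trick depends on the totals row being the exact sum of the rows above it.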
This may have a simple solution which I haven't found yet. but here's the situation.
I have a data in Excel sheet:
This is a log file generated from simulations. The simulation is run tens of times (the number varies), and each run generates one block starting at "-------------------" and ending before the next "-----------------" divider. The number of rows between these dividers is variable, but certain things are fixed: the number and order of columns; the first row and cell being the divider; the next row in the same column holding the date stamp; and the row after that holding the column headings. The divider and date stamp are each contained in a single cell.
What I need to do is find the MAX of CNT & SIM_TIME for each simulation run. I will then take the average of these. I only need to do this for the "Floor 1" table from the screenshot.
What's the best way to proceed? which functions should I use? (I have Office 2010 if that has new functions not present in 2007)
General approach, by example:
Data sheet: Sheet1
Results on separate sheet: Sheet2
Number of rows in data: Cell F2
=COUNTA(Sheet1!B:B)
Intermediate result, Row of data set Cell A3
=MATCH(Sheet1!$B$1,OFFSET(Sheet1!$B$1,A2,0,$F$2),0)+A2
Intermediate result, row of next data Cell B3
=IF(IFERROR(N(A4),0)=0,IF(ISNA(A3),"",$F$2),A4)
Max of CNT data set, Cell C3
=IF(B3<>"",MAX(OFFSET(Sheet1!$B$1,$A3+2,0,$B3-$A3-3)),"")
Max of SIM_TIME, Cell D3
=IF(C3<>"",MAX(OFFSET(Sheet1!$B$1,$A3+2,3,$B3-$A3-3)),"")
Date from data set, Cell E3
=IF(D3<>"",OFFSET(Sheet1!$B$1,$A3,),"")
To expand to give results for all available data, copy range C3:E3 down for as many rows as are in the data. Any extra rows will show #N/A in column A and blanks in the others
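The block-by-block logic of those formulas can be prototyped in Python before wiring it into the sheet. This sketch assumes each row is a tuple of the four cells from column B to column E, with CNT in the first position and SIM_TIME in the fourth (matching the column offsets 0 and 3 used above). The function names and the sample values in the usage note are made up.

```python
def is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def block_maxima(rows):
    """Return (date stamp, max CNT, max SIM_TIME) for every simulation block.

    Like the MATCH against Sheet1!$B$1, the divider is taken to be whatever
    text sits in the first cell. Each block is laid out as: divider row,
    date stamp row, headings row, then the data rows.
    """
    divider = rows[0][0]
    starts = [i for i, row in enumerate(rows) if row[0] == divider]
    results = []
    for n, start in enumerate(starts):
        end = starts[n + 1] if n + 1 < len(starts) else len(rows)
        date_stamp = rows[start + 1][0]
        data = rows[start + 3:end]  # skip divider, date stamp and headings
        cnt = [row[0] for row in data if is_number(row[0])]
        sim_time = [row[3] for row in data if is_number(row[3])]
        if cnt and sim_time:
            results.append((date_stamp, max(cnt), max(sim_time)))
    return results


def averages(results):
    """Average of the per-run maxima: (mean max CNT, mean max SIM_TIME)."""
    count = len(results)
    return (
        sum(r[1] for r in results) / count,
        sum(r[2] for r in results) / count,
    )
```

Two blocks with maxima (2, 1.5) and (4, 2.5) would give averages of (3.0, 2.0), which is the final figure the question is after.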
Screen shot of results:
Screen Shot of formulas:
I am not sure I understood what you want to do.
Perhaps something like this would work, although it is not automatic. I am making the assumption that the last value of each simulation is the MAX value.
Put the following formula at cell "I4" =if(B5 = "---------" ; B4 ; "")
Pull the cell formula down till the last row of "Floor 1"
Calculate the average with =average(I:I). Don't put this formula in column I itself, or it will refer to its own cell (a circular reference).
Notes
use as many "-" as there are at cell B22
you may want to insert a new column between I and J, in order to average SIM_TIME. The procedure is the same. Only the cells change.
You could easily automate this procedure a little bit with macros.