This may have a simple solution that I haven't found yet, but here's the situation.
I have data in an Excel sheet:
This is a log file generated from simulations. The simulation is run tens of times (the number varies), and each run generates one block starting at "-------------------" and ending before the next "-----------------" divider. The number of rows between these dividers is variable, but certain things are fixed: the number and order of columns; the first row and cell being the divider; the next row in the same column holding the date stamp; and the row after that holding the column headings. The divider and the date stamp each occupy only one cell.
What I need to do is find the MAX of CNT and SIM_TIME for each simulation run. I will then take the average of these. I only need to do this for the "Floor 1" table from the screenshot.
What's the best way to proceed? Which functions should I use? (I have Office 2010, in case it has new functions not present in 2007.)
General approach, by example:
Data sheet: Sheet1
Results on separate sheet: Sheet2
Number of rows in data: Cell F2
=COUNTA(Sheet1!B:B)
Intermediate result, Row of data set Cell A3
=MATCH(Sheet1!$B$1,OFFSET(Sheet1!$B$1,A2,0,$F$2),0)+A2
Intermediate result, row of next data Cell B3
=IF(IFERROR(N(A4),0)=0,IF(ISNA(A3),"",$F$2),A4)
Max of CNT data set, Cell C3
=IF(B3<>"",MAX(OFFSET(Sheet1!$B$1,$A3+2,0,$B3-$A3-3)),"")
Max of SIM_TIME, Cell D3
=IF(C3<>"",MAX(OFFSET(Sheet1!$B$1,$A3+2,3,$B3-$A3-3)),"")
Date from data set, Cell E3
=IF(D3<>"",OFFSET(Sheet1!$B$1,$A3,),"")
To give results for all available data, copy the range C3:E3 down for as many rows as there are in the data. Any extra rows will show N/A in column A and blanks in the others.
Screen shot of results:
Screen Shot of formulas:
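For anyone who wants to check the logic outside Excel, the block-by-block idea above can be sketched in Python. This is only an illustration under assumptions: the rows are assumed to be already loaded as lists of cell values, a divider is taken to be any first cell starting with "---", and the function name max_per_run and the sample data are hypothetical.

```python
from statistics import mean

def is_divider(row):
    """A divider row is assumed to have a first cell starting with '---'."""
    return bool(row) and str(row[0]).startswith("---")

def max_per_run(rows, columns=("CNT", "SIM_TIME")):
    """Return one dict of column maxima per simulation block.

    Assumed layout (from the question): divider row, date-stamp row,
    header row, then a variable number of data rows.
    """
    results = []
    i = 0
    while i < len(rows):
        if not is_divider(rows[i]):
            i += 1
            continue
        if i + 2 >= len(rows):
            break  # truncated block without a header row
        date_stamp = rows[i + 1][0]
        header = rows[i + 2]
        # Collect data rows until the next divider or the end of the log.
        j = i + 3
        data = []
        while j < len(rows) and not is_divider(rows[j]):
            data.append(rows[j])
            j += 1
        if data:
            run = {"date": date_stamp}
            for name in columns:
                col = header.index(name)
                run[name] = max(row[col] for row in data)
            results.append(run)
        i = j
    return results

# Hypothetical sample log with two runs of different lengths.
log = [
    ["-----"], ["2012-01-01"], ["CNT", "A", "B", "SIM_TIME"],
    [1, 0, 0, 10.0], [2, 0, 0, 25.5],
    ["-----"], ["2012-01-02"], ["CNT", "A", "B", "SIM_TIME"],
    [1, 0, 0, 5.0], [2, 0, 0, 7.5], [3, 0, 0, 9.0],
]
runs = max_per_run(log)
avg_cnt = mean(r["CNT"] for r in runs)
avg_sim_time = mean(r["SIM_TIME"] for r in runs)
```

Looking the columns up by header name rather than by fixed position is a choice made for the sketch; the formulas above use fixed column offsets instead.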
I am not sure I understood what you want to do.
Perhaps something like this would work, although it is not automatic. I am making the assumption that the last value of each simulation is the MAX value.
Put the following formula in cell "I4": =if(B5 = "---------" ; B4 ; "")
Copy the formula down to the last row of "Floor 1".
Calculate the average with =average(I:I). Don't put this formula in column I, or it will create a circular reference.
Notes
Use as many "-" characters as there are in cell B22.
You may want to insert a new column between I and J in order to average SIM_TIME. The procedure is the same; only the cell references change.
You could easily automate this procedure a little bit with macros.
I'm entering data in column A of a table. Based on the data, I compute sums starting from the yellow cell (this yellow cell is actually the highest value in column A). So there is a sum every 10 cells, until the end of the data.
I'd like to automatically report the first seven grey cells (the sums) to another table. The problem is that, depending on the data, the highest value is not always in the same place, so the sums are not in the same place either.
How can I do this?
Thank you for your help.
My error:
And the message when I press Ctrl+Shift+Enter at the same time:
You might use this array formula in your report.
=INDEX($F:$F,SMALL(ROW($F$4:$F$117)+(100*(F$4:$F$117="")), ROW(F1)))&""
Bear in mind that, as an array formula, it must be confirmed with Control+Shift+Enter. Enter the formula in the row where you have Somme = 1, then copy down to 6. Note that Row(F1) is a counter. You have a similar counter (1 to 6) in F124:F130. Therefore you can replace ROW(F1) with $F124 (if that is where the "1" is) to make it easier to understand, perhaps.
The formula retrieves the value of the 1st, 2nd, 3rd etc non-blank cell in the range F4:F117. If those cells contain a formula they will be considered "blank" if their result equals "".
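To make the SMALL(ROW(...) + penalty) idea easier to follow, here is a rough Python sketch of the same trick. It is not the Excel formula itself: the function name kth_non_blank and the sample column are hypothetical, and the penalty is assumed to be larger than the last row number so that penalized (blank) rows always sort after real ones.

```python
def kth_non_blank(cells, k, first_row=4):
    """Return the k-th (1-based) non-blank value, or "" if there is none.

    Blank cells get a penalty added to their row number, so sorting the
    keys puts every non-blank row ahead of every blank one.
    """
    last_row = first_row + len(cells) - 1
    penalty = last_row + 1  # assumption: larger than any real row number
    keys = sorted(
        row + (penalty if value == "" else 0)
        for row, value in enumerate(cells, start=first_row)
    )
    if k < 1 or k > len(keys) or keys[k - 1] > last_row:
        return ""  # fewer than k non-blank cells
    return cells[keys[k - 1] - first_row]

# Hypothetical column F contents, starting at row 4.
column_f = ["", "", 12, "", "", 30, "", 7]
first_three = [kth_non_blank(column_f, k) for k in (1, 2, 3)]
```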
BTW, if you don't always have 113 results to evaluate you might consider giving a name to the range E4:E117. For example, if you name that range as "Results" then =SUM(Results) would be the same as =SUM($E$4:$E$117), but as you insert or delete rows within the named range the formula doesn't need to be amended. Use of a named range would simplify understanding your existing formula. You could do the same with column F.
I finally found a solution to report the values from F to another table. As the positions of the values depend on the row of the MAX in E (one every 10 cells), I built this formula (EQUIV and GRANDE.VALEUR are the French-locale names of MATCH and LARGE):
For the first : INDEX(E4:F117;EQUIV(GRANDE.VALEUR($E$4:$E$117;1);$E$4:$E$117;0)+10;2)
For the second :
INDEX(E4:F117;EQUIV(GRANDE.VALEUR($E$4:$E$117;1);$E$4:$E$117;0)+20;2)
Etc...
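The same "find the row of the maximum, then step down by 10" logic could be sketched in Python as follows. The function name values_after_max, its parameters, and the sample lists are hypothetical; the two lists stand in for columns E and F.

```python
def values_after_max(col_e, col_f, step=10, count=7):
    """Pick values from col_f at fixed steps below the maximum of col_e."""
    start = col_e.index(max(col_e))  # position of the first maximum
    picked = []
    for n in range(1, count + 1):
        pos = start + n * step
        if pos >= len(col_f):
            break  # ran past the end of the data
        picked.append(col_f[pos])
    return picked

# Small hypothetical example with a step of 2 instead of 10.
col_e = [3, 9, 1, 4, 2, 6, 0]
col_f = ["a", "b", "c", "d", "e", "f", "g"]
sums = values_after_max(col_e, col_f, step=2, count=3)
```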
I have a sheet setup that calculates totals. Easy enough if the data is already there but not if adding new data. So what I would like to be able to do is to not specify a specific end cell for the sum formula but let it update as more columns are added.
How can I do this with =SUM(m4:m?)
Suppose you need totals for data in M4:M10.
To make the range open-ended, you can simply set the lower limit far beyond the data: =SUM(M4:M100000).
Alternatively you can write it as =SUM(M:M) - SUM(M1:M3)
But neither works when the total has to sit just below the set of values, in the same column. In that case you have two options.
Using Excel embedded features
The formula will look like this: =SUM(M4:M10). If you insert a new row between M4 and M10 (for instance, select row 5, right-click, insert row), your formula will be automatically adjusted to =SUM(M4:M11).
A problem may occur if you want to insert a new value above the first row (select row 4, right-click, insert row) or below the last row (select row 11, right-click, insert row). In these cases the totals formula will not be adjusted.
Possible workarounds:
For the "above first row" issue, I prefer to add an empty row above and hide it. In our case I would hide row 3 and make the totals formula =SUM(M3:M10), so that when you insert a new row above the first row, you are in fact inserting a row in the middle of the range, and the totals formula will be adjusted.
For the "below last row" issue, leave an empty row below; in this case you cannot hide it, so just give it a different color and add a remark like "new values shall be inserted ABOVE this line".
INDEX()
An interesting trick is to use the INDEX() function, which returns a reference to a cell in an array. In our case, the array can be the whole M column and the index is the row number.
For the "above first row" issue, write the totals formula like this: =SUM(INDEX(M:M;4):M10). The calculation will then always start at row 4, even if rows are added or deleted.
For the "below last row" issue, suppose you have your "totals cell" in M13 and you want the total of all values between M4 and the "totals cell". The formula may look like =SUM(M4:INDEX(M:M;ROW(M13))) or, also covering the "above first row" case: =SUM(INDEX(M:M;4):INDEX(M:M;ROW(M13)))
Hope this helps
Sum(m4:m?) implies that you are looking to add more rows, as opposed to adding columns of data.
If you want to automatically sum the rows in a column, you can use something like:
=SUM(OFFSET(A1;0;0;COUNT(A:A);1))
However, this assumes that the data is contiguous and starts in A1. Empty cells cannot stand in for 0, because COUNT skips them and the summed range ends up too short.
However: You could also define a table for the data range. If you add data to columns/rows that are in that data range, they will be included in the adjusted formula automatically - very nice indeed.
Select your data range, then Select Insert:Table. This will give your table a name like Table1.
Your sum function would now be adjusted to look something like:
=SUM(Table1)
Now, as you add to the range, the table resizes, and your function just works.
The beauty of using a table, is that if you add data to the row/column immediately after the table it resizes and includes that range. This is hard to do without a table. You can also change the format of the table, or make the format colours invisible but you're probably better off with some format to show the data area of the table to the user.
You can compute the last row containing a number using this formula:
=LOOKUP(2,1/ISNUMBER($J:$J),ROW($J:$J))
This formula would not have a problem if you had text or blanks in the range.
You could then define that formula as a Defined Name (LastRow in this example)
and use the formula:
=SUM(OFFSET(J4,0,0,LastRow-3))
to Sum the range. Note the -3 at the end to compensate for the first cell being in row 4.
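As a rough illustration of what the LastRow approach computes, here is a Python sketch. The function name sum_to_last_number and the sample column are hypothetical; index 0 of the list is treated as row 1, and text or blank cells are simply skipped when summing, as SUM does.

```python
def is_number(value):
    """True for ints and floats, but not for booleans or text."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def sum_to_last_number(column, first_row=4):
    """Return (last_row, total) for the numbers from first_row to last_row.

    last_row is the 1-based row of the last numeric cell, or 0 if the
    column holds no numbers at all.
    """
    last_row = 0
    for row, value in enumerate(column, start=1):
        if is_number(value):
            last_row = row
    cells = column[first_row - 1:last_row]
    return last_row, sum(v for v in cells if is_number(v))

# Hypothetical column J: header, blanks and text mixed with numbers.
column_j = ["Header", None, "note", 10, 20, None, "text", 5, None]
last_row, total = sum_to_last_number(column_j)
```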
Another option would be to just set your range to a fixed range that you can guarantee will be larger than any range you might actually use:
=SUM(J4:J1000)
You can use COUNTA to find the last row number. Then passing that into INDIRECT will give you the range you need.
=SUM(INDIRECT("A1:A" & COUNTA(A1:A1000000);TRUE))
Assumptions:
Data are on column A
Data start from first row
There are no blank rows
I have two different sheets with 300,000 rows of data in Excel.
First sheet contains:
S2_Symbol Start_Pos End Position
STE 254857 267891
PRI 748578 758962
ILA 852741 963369
VIS 789456 796325
Second:
S1_Location
789460
852898
748678
My output should be like this:
S1_Location Symbol
789460 VIS
852898 ILA
748678 PRI
I have to find which S2 range (Start_Pos to End Position) each S1_Location falls in, and return the corresponding Symbol. I have used the INDEX formula in Excel, but I have to change the reference cell manually for each cell, which I can't do for 300,000 rows.
How can I do this in Excel, or should I use a script?
This solution assumes the following:
Start and End Positions for each S2 Symbol are unique (i.e. there is no intersection between the ranges allocated to each symbol)
Data in the first sheet is located at A1:D17 (adjust ranges in formulas as needed)
Data in the second sheet is located at A1:B300010 (adjust ranges in formulas as needed)
The solution requires:
Adding a working column to the first worksheet. Enter this formula in D2 and copy it down to the last record.
=ROWS($A$1:$A2)
Fig. 1
Then in second worksheet enter this formula at B2 and copy till last record.
=INDEX( Sheet1!$A$1:$A$17,
SUMIFS( Sheet1!$D$1:$D$17,
Sheet1!$B$1:$B$17, "<=" & $A2, Sheet1!$C$1:$C$17, ">=" & $A2 ) )
Fig. 2
It took less than 14 seconds to copy the formulas down and calculate them in sheet 2.
As can be seen in figures 1 and 2, neither table needs to be sorted.
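The reason the SUMIFS trick works can be shown with a short Python sketch: when the ranges do not overlap, at most one row satisfies both conditions, so the "sum" of matching row numbers is just that one row number. The function name lookup_symbol is hypothetical, the table uses the sample data from the question, and row numbers here are 1-based positions in the list rather than sheet rows.

```python
def lookup_symbol(location, table):
    """table is a list of (symbol, start, end) tuples in any order.

    Sums the 1-based row numbers of all rows whose range contains the
    location. With non-overlapping ranges the sum is a single row number;
    with overlapping ranges it would point at the wrong row.
    """
    row_sum = sum(
        row
        for row, (_symbol, start, end) in enumerate(table, start=1)
        if start <= location <= end
    )
    if row_sum == 0:
        return None  # no range contains the location
    return table[row_sum - 1][0]

# Sample data from the question (unsorted on purpose).
table = [
    ("STE", 254857, 267891),
    ("PRI", 748578, 758962),
    ("ILA", 852741, 963369),
    ("VIS", 789456, 796325),
]
found = [lookup_symbol(x, table) for x in (789460, 852898, 748678)]
```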
Assuming both sheets start in A1, and First sheet ColumnB is sorted ascending, in Second sheet B2 please try:
=INDEX(First!A:A,MATCH(A2,First!B:B))
copied down to suit. It relies on inexact matching.
Assuming we have a Sheet1 like this:
Note: Sheet1 is sorted by Start_Pos, End_Pos in ascending order.
and a Sheet2 like this:
Then the formula in Sheet2!B2 downwards could be:
=INDEX(Sheet1!A:A,IF(MATCH(A2,Sheet1!B:B)>IFERROR(MATCH(A2-(10^-10),Sheet1!C:C),0),MATCH(A2,Sheet1!B:B),NA()))
See MATCH: https://support.office.com/en-us/article/MATCH-function-e8dffd45-c762-47d6-bf89-533f4a37673a
The idea is: MATCH without exact matching (without the match_type parameter) returns the row of the largest value that is less than or equal to the search value. So in the Start_Pos column it returns the row from which we can get the S2_Symbol. In the End_Pos column, however, it should return the row before that one, unless the value lies outside the given ranges.
There is only one exception. If the value is exactly equal to a value in the End_Pos column, MATCH returns the same row as in the Start_Pos column. To handle this exception, we search the End_Pos column with a slightly smaller value. Thanks to Tom Sharpe for his comment.
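The comparison of the two MATCH results can be sketched in Python with the standard bisect module. This is an analogy rather than a translation: bisect_right counts the starts that are less than or equal to the value (like MATCH on Start_Pos), and bisect_left counts the ends that are strictly smaller (like MATCH with the slightly reduced value on End_Pos). The function name find_symbol is hypothetical, and the rows are assumed to be sorted by start with no overlapping ranges.

```python
from bisect import bisect_left, bisect_right

def find_symbol(location, symbols, starts, ends):
    """Return the symbol whose [start, end] range contains location.

    Assumes the three lists are aligned, sorted by start, and that the
    ranges do not overlap. Returns None when no range matches.
    """
    started = bisect_right(starts, location)  # ranges with start <= location
    finished = bisect_left(ends, location)    # ranges with end < location
    if started > finished:
        # Exactly one range has begun but not yet ended.
        return symbols[started - 1]
    return None

# Sample data from the question, sorted by Start_Pos.
symbols = ["STE", "PRI", "VIS", "ILA"]
starts = [254857, 748578, 789456, 852741]
ends = [267891, 758962, 796325, 963369]
matches = [find_symbol(x, symbols, starts, ends)
           for x in (789460, 852898, 748678)]
```

Each lookup is a pair of binary searches, which is why this style of formula stays fast even with hundreds of thousands of rows.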
The formula in Sheet2!D2 downwards is:
{=INDEX(Sheet1!A:A,MIN(IF($A2>=Sheet1!$B$2:$B$300000,IF($A2<=Sheet1!$C$2:$C$300000,ROW(Sheet1!$A$2:$A$300000),2^20+1))))}
This is an array formula that implements the requirements literally. Its performance is very poor when it is used in many cells, but with it Sheet1 does not need to be sorted.
Benchmark test:
Have the following Sheet1:
Formulas:
A2:A300002: ="S"&(ROW(A1)-1)*10&"-"&(ROW(A1)-1)*10+7
B2:B300002: =(ROW(A1)-1)*10
C2:C300002: =B2+7
and the following Sheet2:
Formulas:
A2:A300002: =RANDBETWEEN(0,3000007)
B2:B300002: =INDEX(Sheet1!A:A,IF(MATCH(A2,Sheet1!B:B)>IFERROR(MATCH(A2-10^-9,Sheet1!C:C),0),MATCH(A2,Sheet1!B:B),NA()))
Note the -10^-9 instead of -10^-10 in previous version. This is because we have only 16 digits precision. In previous version this was maximum 6 digits integer part and then 10 digits decimal part. Now it is maximum 7 digits integer part and then 9 digits decimal part.
Calculation after pressing F9 in Sheet2 takes ca. 2 s. (Excel 2007, Windows 7, 4 core processor).
I would have gone for something like this which gives you the first match if there is one:-
=INDEX(First!A:A,MATCH(1,(First!B:B<=A2)*(First!C:C>=A2),0))
assuming keys and start and end values are in a sheet called First and lookup values start in A2.
This is an array formula, which must be entered with Ctrl+Shift+Enter.
In response to the question from @pnuts about how long it will take, I set up a similar benchmark with 300,000 rows in each sheet, and it reached 1% after 90 minutes, so it should take about 150 hours to reach 100%, or roughly one week. This is to be expected, as the number of computations required is (rows in sheet 1) X (rows in sheet 2)
300,000 X 300,000
but in fact because the multiplication applies to complete columns, I believe it is more correctly
300,000 X 1,048,576
i.e. > 300 billion.
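The arithmetic behind these estimates can be checked with a few lines of Python. The variable names are only for illustration; 1,048,576 is the number of rows in a full worksheet column in Excel 2007 and later.

```python
rows_sheet1 = 300_000
rows_sheet2 = 300_000
full_column = 1_048_576  # rows in a whole-column reference such as B:B

# Comparisons needed if the ranges were limited to the actual data.
bounded = rows_sheet1 * rows_sheet2

# Comparisons needed when every lookup scans a complete column.
whole_columns = rows_sheet2 * full_column

# Extrapolating the benchmark: 1% took 90 minutes.
minutes_per_percent = 90
estimated_hours = minutes_per_percent * 100 / 60
estimated_days = estimated_hours / 24
```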
A practical version which gives good response for smaller ranges is as follows:-
I define three named ranges Range1, Range2 and Range3
=First!$A$1:INDEX(First!$A:$A,MATCH("ZZZ",First!$A:$A))
=First!$B$1:INDEX(First!$B:$B,MATCH(9.9E+307,First!$B:$B))
=First!$C$1:INDEX(First!$C:$C,MATCH(9.9E+307,First!$C:$C))
and the modified formula is
=INDEX(Range1,MATCH(1,(Range2<=A2)*(Range3>=A2),0))
I was thinking of deleting this answer, but would rather it stood as a counter-example.
I have a table that is pulling thousands of rows of data from a very large sheet. Some of the columns in the table are getting their data from every 5th row on that large sheet. In order to speed up the process of creating the cell references, I used an OFFSET formula to grab a cell from every 5th row:
=OFFSET('Large Sheet'!B$2572,(ROW(1:1)-1)*5,,)
=OFFSET('Large Sheet'!B$2572,(ROW(2:2)-1)*5,,)
=OFFSET('Large Sheet'!B$2572,(ROW(3:3)-1)*5,,)
=OFFSET('Large Sheet'!B$2572,(ROW(4:4)-1)*5,,)
=OFFSET('Large Sheet'!B$2572,(ROW(5:5)-1)*5,,)
etc...
OFFSET can eat up resources during calculation of large tables though, and I'm looking for a way to speed up/simplify my formula. Is there any easy way to convert the OFFSET formula into just a simple cell reference like:
='Large Sheet'!B2572
='Large Sheet'!B2577
='Large Sheet'!B2582
='Large Sheet'!B2587
='Large Sheet'!B2592
etc...
I can't just paste values either. This needs to be an active reference, because the large sheet will change.
Thanks for your help.
And here is one last approach to this that does not use VBA or formulas. It's just a quick and dirty use of AutoFilter and deleting rows.
Main idea
Add a reference to a cell =Sheet1!A1 and copy it down to match as many rows as there are in the main data.
Add another formula in B1 to be =MOD(ROW(), 5)
Filter column B and uncheck the 0s (or any single number)
Delete all the rows that are visible
Delete column B
Voila, formulas for every 5th row
Some reference images, these are all taken on Sheet2.
Formulas with AutoFilter ready.
Filtered and ready to delete
Delete all those rows (select A1, CTRL+SHIFT+DOWN ARROW, SHIFT+SPACE, CTRL+MINUS)
Delete column B to get final result with "pure" formulas every 5th row.
If you want to take a VBA approach to this, you can generate the references very quickly using simple For loops.
Here is some very crude code which can get you started. It uses hard-coded sheet names and variables. I am really just trying to show the i*5 part.
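Sub CreateReferences()
    Dim i As Long, j As Long
    ' i walks the summary rows, j walks the columns.
    For i = 0 To 12
        For j = 0 To 5
            ' Summary row i points at master row i * 5, counted from A5.
            Sheet2.Range("H1").Offset(i, j).Formula = _
                "=Sheet1!" & Sheet1.Range("A5").Offset(i * 5, j).Address
        Next j
    Next i
End Sub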
It works by building a quick formula using the Address from a reference to a cell on Sheet1. The only key here is to have one index count the rows in the "summary" and multiply it by 5 to get the reference to the "master" sheet. I am starting at A5 just to match the results from INDEX.
Results show the formula input for H1 and over. I am comparing to the INDEX results generated above.
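The address arithmetic is the whole trick, so here it is in isolation as a small Python sketch that just builds the formula strings. The function name reference_formulas and its parameters are hypothetical, and it assumes the sheet name contains no apostrophes (which would need escaping in a real formula).

```python
def reference_formulas(sheet, column, first_row, step, count):
    """Build plain cell-reference formulas for every step-th row."""
    return [
        f"='{sheet}'!{column}{first_row + i * step}"
        for i in range(count)
    ]

# The references asked for in the question: B2572, B2577, B2582, ...
formulas = reference_formulas("Large Sheet", "B", 2572, 5, 5)
```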
Here is one approach using INDEX instead of OFFSET. I am not sure if it is faster, I guess you can check. INDEX is not volatile, so you might get some advantage from that.
Picture of ranges, you can see that Sheet1 has a lot of data and Sheet2 is pulling every 5th row from that sheet. The data in Sheet1 goes from A1:F1000 and just reports the address of the current cell.
Formulas use INDEX and are copied down and across from A1 on Sheet2.
=INDEX(Sheet1!$A$1:$F$1000,ROW()*5,COLUMN())
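To see which source rows ROW()*5 picks up, here is the same mapping as a Python sketch. The function name every_nth_row is hypothetical; result row r (counting from 1) takes source row r * n, so the first row returned is row 5, not row 1.

```python
def every_nth_row(rows, n=5):
    """Return source rows n, 2n, 3n, ... (1-based), like INDEX(..., ROW()*n)."""
    return [rows[r * n - 1] for r in range(1, len(rows) // n + 1)]

# Hypothetical source data where each row just holds its own row number.
source = list(range(1, 21))
picked = every_nth_row(source)
```

In Python the same result is available as a slice, source[4::5], which makes the "start at row 5, then every 5th" behaviour explicit.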
I need help with the following:
I have a worksheet containing some data. Row 1 is header and from row 2 downward is the data. At the end there is total for all the data above. This worksheet is dynamic, i.e., if week 1 has 200 rows of data, then week 2 could have 250 or 190 rows of data.
Likewise, the columns across, change every week. This week I have 18 columns and next week I could have 20 columns.
Within row # 1, the header, I have two headings "CTAEO1P" and "CTAEO2P".
On another worksheet, I want to add the "totals" of both of those columns i.e., Individual totals of CTAEO1P = 32.98 + CTAEO2P = 46.25 = 79.23
I am using named ranges and named the whole of the worksheet with data as "MT". The range is whole of the worksheet so when next week I copy the data over from another worksheet, I should not have to adjust the range.
I am using the following formula, courtesy of another expert on this forum:
=HLOOKUP("CT*",MT,MATCH(9^99,INDEX(MT,0,MATCH("CT*",INDEX(MT,1,0),0))),0)
This formula looks for any column whose heading starts with "CT"; the "Match(9^99" and "index" parts then find the last number within that column (the total in this case) and return that value to the worksheet. In this case the formula returns "32.98" only, as this is the first occurrence.
I think I can use "Sumproduct" formula here but then a) I would have to create more than one named range, one for the header row and another for the "Total" row, b) every week I would have to adjust the range for "Total" row. Unless, if I can nest "Match(9^99..." part within "SUMPRODUCT" function.
I want to use "MT" range alone and want to add the totals of all the columns that start with "CT".
I hope I have been able to explain my problem better enough to make some sense, however, if you need any further information, then please let me know.
Regards
Tariq
I will forget about the MT range, as long as your data starts in A1 this will work
=SUMPRODUCT(ISNUMBER(SEARCH("CT*";OFFSET(A1;0;0;1;MATCH(9^99;2:2))))*OFFSET(A1;MATCH(9^99;A:A)-1;0;1;MATCH(9^99;2:2)))
Depending on your regional settings you may need to replace field separator ";" by ","
I think you can use a relatively simple SUMPRODUCT solution like this
=SUMPRODUCT((LEFT(INDEX(MT,1,0),2)="CT")*ISNUMBER(MT),MT)/2
SUMPRODUCT will total all values in the relevant columns, including the totals themselves, so dividing by 2 ensures you get the correct total
If you don't like that approach then assuming first column of MT always has data and that the totals for each column will all be in the same row you can use SUMIF like this
=SUMIF(INDEX(MT,1,0),"CT*",INDEX(MT,MATCH(9^99,INDEX(MT,0,1)),0))
That should be more efficient than the first version
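For comparison, the SUMIF idea (find the totals row via the last number in the first column, then add up the cells under matching headings) could be sketched in Python like this. The function name sum_totals and the sample table are hypothetical; the table is a list of rows with the header first, and the heading match ignores case, as wildcard criteria in SUMIF do.

```python
def is_number(value):
    """True for ints and floats, but not for booleans or text."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def sum_totals(table, prefix="CT"):
    """Sum the totals-row cells whose column heading starts with prefix.

    The totals row is assumed to be the last row whose first cell holds
    a number, mirroring MATCH(9^99, INDEX(MT,0,1)).
    """
    header = table[0]
    numeric_rows = [i for i, row in enumerate(table) if is_number(row[0])]
    if not numeric_rows:
        return 0.0
    totals = table[numeric_rows[-1]]
    return sum(
        value
        for heading, value in zip(header, totals)
        if isinstance(heading, str)
        and heading.upper().startswith(prefix.upper())
        and is_number(value)
    )

# Hypothetical sheet using the two totals quoted in the question.
mt = [
    ["ID", "CTAEO1P", "OTHER", "CTAEO2P"],
    [1, 10.00, 5, 20.00],
    [2, 22.98, 6, 26.25],
    [3, 32.98, 11, 46.25],  # totals row
]
ct_total = sum_totals(mt)
```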