Averaging a Dynamic range in Excel - excel

I have a list of values that looks like this. Every 30min the top row in my data is shifted down and a new latest value is inserted. I'd like to be able to take the average of all the values in the model column from 18:30 to the top most cell each day (so just the latest cell that has an 18:30 value to the top most cell)
Any help would be greatly appreciated
I've been able to find the row that contains the date I need using this:
=MATCH(DATE(YEAR(TODAY()),MONTH(TODAY()),DAY(TODAY()-1))+TIME(18,30,0),AF47:AF4595,0)
So now I just need to able to average from row 1 to the output of that formula. I've tried Index(NamedRange,1:3,0) but that didn't work

=INDEX(AF:AF,1):INDEX(AF:AF,MATCH(DATE(YEAR(TODAY()),MONTH(TODAY()),DAY(TODAY()-1))+TIME(18,30,0),AF47:AF4595,0))
Although I would change your formula just to shorten it...your formula is more readable. I would change your formula to:
MATCH(INT(TODAY())-1+TIME(18,30,0),AF:AF,0)
I was assuming your data was all in AF based on your formula. Adjust AF to match the column of your needs.
Since INDEX returns a cell address, you just need to wrap the range defined by the two indexes in an AVERAGE function:
=AVERAGE(INDEX(AF:AF,1):INDEX(AF:AF,MATCH(DATE(YEAR(TODAY()),MONTH(TODAY()),DAY(TODAY()-1))+TIME(18,30,0),AF47:AF4595,0)))

This is pretty ugly and a roundabout way of doing it, but you can first identify the last row you care about using sumproduct() and some sneakiness. And then pump that into your =Average() formula using Indirect:
=AVERAGE(INDIRECT("B1:B" & SUMPRODUCT((HOUR(A:A)=18)*(MINUTE(A:A)=30)*ROW(A:A))))
It's going to be SLOW though hitting up all of A:A so maybe narrow that range down like A1:A1000 or whatever row is likely to be the max possible row to hold that 18:30 time.

Here is another way using AVERAGEIF:
=IF(NOW()-TODAY()<5.5/24,AVERAGEIF(A:A,">" & (TODAY()-5.5/24),B:B),AVERAGEIF(A:A,">" & (TODAY()+18.5/24),B:B))
It first checks if the current time is before 18:30 then calculates the average as described (from the last 18:30 to now)

An AVERAGEIF/AVERAGEIFS can solve your problem but I see at least three problematic areas.
The 18:30:00 time you want to stop at may or may not be in the same day as the day in the top of the Prediction Area column. This means that you cannot construct a limiting date time from the day at the top of column AF and add TIME(18, 30, 0). You might need to subtract a day if the top of column AF is after midnight and before 18:30:00. A TRUE is considered 1 in a worksheet formula and FALSE is considered zero so taking the date at the top of column AF and subtracting
'Is the time of the datetime in AF47 less than 18:30:00?
(MOD(AF47, 1)<TIME(18, 30, 0))
will make the deciding date today or yesterday depending on the time in AF47.
Your program is inserting Prediction Area datetimes and Model numbers at the top of the list. Depending on the method, this can disrupt conventional formula range addressing, even when using absolute row references. While INDIRECT can be used to 'lock' a cell address by converting from a text string, INDEX is a more efficient method.
'cell address with 'absolute' row (will change with inserted rows)
AF$47
'INDIRECT method (will not change with inserted rows)
INDIRECT("AF47")
'better INDEX method (will not change with inserted rows)
INDEX(AF:AF, 47)
'range of datetimes in column AF
INDEX(AF:AF, 47):INDEX(AF:AF, 95)
Time values typically resolve to pseudo-irrational numbers (aka *repeating decimals') that cannot be relied upon in Excel's 15 significant digit floating point mathematics. For this reason you should use the 00:30:00 interval to ensure true comparity by avoiding >= and <= in favor of > and < against a time value offset by a second. IOW, 18:30:00 is not equal to either 18:29:59.999 or 18:30:00.001.
So if your datetimes (e.g. 04/25/2019 15:00) start in AF47 and are incremented every thirty minutes then the furthest cell you need to examine is AF95 (47 + 24*2).
With Prediction Area in AF46 and and Model in AG46 this is my effort.
' |<----------AG47:AG95---------->| |<----------AF47:AF95---------->| |<----AF47---->| |<----AF47---->|
=AVERAGEIFS(INDEX(AG:AG, 47):INDEX(AG:AG, 95), INDEX(AF:AF, 47):INDEX(AF:AF, 95), ">"&INT(INDEX(AF:AF, 47))+TIME(18, 29, 59)-(MOD(INDEX(AF:AF, 47), 1)<TIME(18, 30, 1)))
Truth be told, the AVERAGEIFS function does suffer significantly by using full column references so if there are no rogue values that could produce false positives above or below the ranges referenced, full column references could be used for the average_range and criteria1_range cell range references. Full column references would not be affected by row/data insertion.

Related

Excel Calculate a running average on filtered data with criteria

I am trying to calculate a running average on a filtered data table. All other posts online use Sumproduct(Subtotal) on the entire range but do not calculate a row by row running average
I am stuck on how to calculate columns C and D.
If column B (Score) > 0, I want to sum and average it under column C (Average Win)
If column B (Score) < 0, I want to sum and average it under column D (Average Loss)
The table is filterable by column A (Type) and the results should look as follows
Progress so far:
I have figured out how to calculate a Cumulative score based on filtered data. However this does not fully solve my problem. I appreciate any help!
=SUBTOTAL(3,B3)*SUBTOTAL(9,B$3:B3)
SUBTOTAL(3,B3) checks if the current row is visible, SUBTOTAL(9,B$3:b3) sums the values.
Final update needed
Jos - Thank you for your detailed explanation on how subtotal() works. I learned a ton through your explanation and will continue to study it. This is my first time being exposed to structured referencing so some of the syntax is a bit confusing to me still
The last formula I need is a running win % column where a Win is defined by score > 0. Please see the picture below
My assumptions believe that the same formula would work, except that we average a 1 or 0 in each row instead of the [Score] column.
Using the prior solution, why can't we modify the output of your prior solution to calculate a running win %?
[...] IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Win])))),0)
Where [Win] is a helper column with the outputs 1 for win, 0 for loss.
This could be done by saying
if([#score]>0,1,0)
Instead of averaging out the actual #Score, this would average out a column of 1's and 0's with the desired output 0%, 50%, 66%, etc.
I am aware that the solution I provided does not work but I am trying to embrace the correct logic. I still struggle to understand how these structured column references are calculated on a row by row basis.
For example: Average(If([Score]>0,[Score])
How is this calculated on a row by row basis? When A3 does If([Score] > 0,), does this equal If({-10}>0)? When on A4, does If([Score]>0) equal If({-10,20} >0)? Thank you for your patience and help thus far.
I disagree with your result for Average Loss for the last row of your unfiltered table (surely -9.33...?), but try this for Average Win:
=IFERROR(AVERAGE(IF(SUBTOTAL(3,OFFSET(INDEX([Score],1),ROW([Score])-MIN(ROW([Score])),)),IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Score])))),0)
Same formula for Average Loss, changing [Score]>0 to [Score]<0.
Explanation:
Using the data you provided and assuming:
The table's top-left cell is in A1
The table is filtered on the Type column for "A"
In order to determine which rows are filtered, we must pass an array of range references - i.e. for each cell within a chosen column of the table - to the SUBTOTAL function. It's a touch unfortunate that such an array of range references can only be generated via a volatile function (INDIRECT or OFFSET), but here, unless we resort to helper columns, we are left with no choice.
INDEX([Score],1)
simply returns a range reference to the first cell within the Score column. When using Excel tables, it's preferable not to write formulas which include a mixture of structured and non-structured referencing, even if that results in slightly longer expressions. So here, for example, we would not reference A2 within the formula.
ROW([Score])-MIN(ROW([Score]))
generates an array of integers from 0 up to one fewer than the number of rows in the table, i.e.
{0;1;2;3;4}
and so
=IFERROR(AVERAGE(IF(SUBTOTAL(3,OFFSET(INDEX([Score],1),ROW([Score])-MIN(ROW([Score])),)),IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Score])))),0)
becomes
=IFERROR(AVERAGE(IF(SUBTOTAL(3,OFFSET(A2,{0;1;2;3;4},)),IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Score])))),0)
OFFSET then generates an array of range references (though note that you will not be able to 'see' this step within the Evaluate Formula window - rather, an array of #VALUE! errors is displayed):
=IFERROR(AVERAGE(IF(SUBTOTAL(3,{A2;A3;A4;A5;A6}),IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Score])))),0)
SUBTOTAL then determines which of these range references is filtered (note that care must be given here to the choice of first parameter), returning the relevant Boolean, so that:
SUBTOTAL(3,{A2;A3;A4;A5;A6})
resolves to:
{1;1;1;0;1}
And so we now have:
=IFERROR(AVERAGE(IF({1;1;1;0;1},IF([Score]>0,IF(ROW([Score])<=ROW([#Score]),[Score])))),0)
and the rest is straightforward.
So, I would use averageifs().
=averageifs(B:B,B:B,">=1",A:A,"A")
is one example, note I have added the control of Type A in the example.
See:

Finding the last occupied cell using a formulae

This formulae "works", but it seems a bit long winded. Can anyone suggest an "easier" formulae. It returns the value in the 5th Column to a date in the 1st column which is called from a separate (and discrete) worksheet. If there is no data in the 5th column (because we haven't reached or passed that date), then the latest available data is called ($A$20, is the "calling date, on a current worksheet):-
IF(VLOOKUP($A$20,'[FTSE100.xlsx]Meteor Step Down
Plans'!$A$18:$E$2828,5)="",VLOOKUP(HLOOKUP(A20,'[FTSE100.xlsx]Meteor Step
Down Plans'!$A$18:$A$2628,MATCH(TRUE,INDEX(ISBLANK('[FTSE100.xlsx]Meteor
Step Down Plans'!$E$18:$E$2828),0,0),0)-1),'[FTSE100.xlsx]Meteor Step Down
Plans'!$A$18:$E$2828,5),VLOOKUP($A$20,'[FTSE100.xlsx]Meteor Step Down
Plans'!$A$18:$E$2828,5))
The formula is in a different workbook to the data. The formula is used to determine the last value in an array, the first column of which is a date range (01/01/2000 to 31/12/2027....say) The required data is in column 5. Entries may be up to date (i.e. the last entry was made today, therefore it will pickup todays value. The last entry may have been several days (weeks) ago, therefore it will pick up the last value in the designated column (column 5). I don't see why it can't go into a named range, as the array size is fixed. Does that help?
Thanks Ron, nearly there. That works beautifully for "missing" dates in the array. i.e. when Saturday and Sunday dates are missing, it finds the value (in column 5) of the previous Friday. However (and there is usually a "however"), if $A$20 is a date in advance of the date of the last data in the array i.e. column 5 is blank at that date, your formulae returns 0 (zero), instead of the data as at the last date that data has been entered. i.e. enter 02/04/2018 and your formulae returns 0, not the data at the last date available 28/02/2018. Is that a little more clear?? (Thank you anyway, for spending the time - I'm still trying to work out how your formulae works as it does. Is the "/" significant (other than being a "divide by" symbol, if so it's something I've never come across before? Regards...........Mike. P.S. The dates in the array (Column 1) are all in from 2016 to 2027 (excluding weekend and bank holidays), and are arranged in ascending order i.e. from 01/01/2016 to 31/12/2027
Ron - Sorry mate, buy I can't see a "second" formulae???
If I understand what you want to do correctly, (and that is hard to do since you give NO examples of your data, actual output or desired output), try:
=LOOKUP(2,1/((myDate>=A:A)*(LEN(A:A)>0)),E:E)
If it is the case that you have dates in column A and no entries in Column E, then you might have to do something like:
=LOOKUP(2,1/((myDate>=A:A)*(LEN(A:A)>0)*(LEN(E:E)>0)),E:E)
Of course, you will have to modify the cell references and named range to refer to the proper workbooks and cell.
Possibly something like:
=LOOKUP(2,1/(($A$20>='[FTSE100.xlsx]Meteor Step Down Plans'!$A$18:$E$2828)*(LEN('[FTSE100.xlsx]Meteor Step Down Plans'!$A$18:$E$2828)>0)),'[FTSE100.xlsx]Meteor Step Down Plans'!$E$18:$E$2828)
This should return the item from Column E that is either at or above the same row as myDate in Column A. If myDate does not exist in column A, it will match the nearest earlier date in Column A and return from that row in Column E.
In the second version of the formula, it also checks that there is an entry in Column E matching the column A date, and, if not, it will look above.

How to find row at which summed total is greater than a certain amount?

I would like to find the row at which running summed value has reached a specified amount and several criteria have been met (similar to sumifs).
I can't just add a cumulative row, as suggested here:
Count rows until the sum value of the rows is greater than a value
....because I have other criteria to meet in the data, and therefore can't have a running total.
In the following dummy example, I'd like to find the date at which the "Design" project has spent or exceeded $30,000
A late answer, but one that doesn't use a Helper Column.
By using Matrix Multiplication (MMULT) on a TRANSPOSEd array of whether the Row is larger than itself and the list itself, we can produce an array of the Running Total.
MMULT(--(TRANSPOSE(ROW(D2:D21))<=ROW(D2:D21)), D2:D21)
If we shrink this to just the first 3 rows, to demonstrate, then you are making this calculation:
[[--(2<=2)][--(3<=2)][--(4<=2] [[10,000]
[--(2<=3)][--(3<=3)][--(4<=3] ∙ [ 8,000]
[--(2<=4)][--(3<=4)][--(4<=4]] [ 6,000]]
Which gives us this:
[[1][0][0] [[10,000] [[10,000]
[1][1][0] ∙ [ 8,000] = [18,000]
[1][1][1]] [ 6,000]] [24,000]]
We can then compare that against the target value as a condition and divide by the result to eliminate values with #DIV0! errors, and use AGGREGATE to get the smallest non-error value
=AGGREGATE(15, 6, Row(D2:D21) / (MMULT(--(TRANSPOSE(ROW(D2:D21))<=ROW(D2:D21)), D2:D21)<30000), 1)
This will give us the first row where the Running Total is >= $30,000
make a cumulative row, only adding up if column B equals 'Design'
If you want to be able to do a vlookup, you should make an extra column that checks if amount exceeded 30k, and then output anything that will be your key for the vlookup.
Ok, for anyone who is interested, here is what I ended up doing. I created a separate table that had all of my possible weeks (10/17, 10/24, 10/31, etc.) in one column, and corresponding sequential numbers in the next column. I ended up with 54 in my actual project.
Then, I ended up having to insert one column into my dataset for the purposes of looking up that "Week No", for each "Week" in all rows. Then on my other sheet where I was solving, I had a cell be my decision variable for the week #. I had another be my target $. I created a formula that took my target amount minus the SUMIFS for all of my criteria (Project, Name, etc.) with the last criteria being that the week number had to be "<=" & (decision cell). I then used Solver to minimize the output by changing the target week with constraints that the target week had to be integer, >=1, <=54, and that the output had to be >=0. That got me to the week prior to where the funding went negative. I then had a cell lookup that week number +1 on my weeks table to find the week at which my target amount would be met.
Sorry, had to explain it that way, vs. the actual formula, as my actual formula has a lot of SUMIFS criteria and cell references that wouldn't make any sense here.

Excel (numbers) formula

this should be simple enough, but numbers (on OSX) keeps throwing an error about the ranges being different sizes.
I have a list of numbers, each with an associated date, and I want a sum of all numbers within a particular month (to give, on a separate sheet, a monthly total).
Here is what I've tried:
SUMIFS(
Sheet1::Table 1::D2:D84,
MONTH(Sheet1::Table 1::A2:A84), "=04",
YEAR(Sheet1::Table 1::A2:A84), "=2014"
)
Sorry if this is a stupid question, but I've tried fiddling with it and it just won't accept it.
Thanks in advance.
You cannot put a function inside the range:
=SUMIFS(C1:C25;Month(A1:A25);"=3")
than you need to add two (hidden?) columns with Month & year function.
After you build SUMIFS based on the new columns
=SUMIFS(C1:C25;D1:D25;"=4";E1:E25;"=2014")
sum all value from C1 to C25 if column of month (D) is equal to 4 and column of year (E) is equal to 2014.
I would suggest considering using a SUMIFS function with an upper\lower limit, and then either referencing a cell with the dates, or using their numerical value in the formula (the former is my preference, to reduce hard coded values = easily updated\reused). So something like:
=SUMIFS(D2:D84, A2:A84, ">="&E1, A2:A84, "<="&E2)
In this example:
column 'D' has the values you want to sum
column 'A' has the date values
columns 'B' and 'C' are treated as irrelevant (for the sake of this formula)
column 'E' has 2 values, in row 1, the lower limit (for this specific question, the first of April) and in row 2 the upper limit (for this specific question, the final day of April)
Then have your lower limit for dates (the first day from when you would like column 'D' to be counted) in cell E1, and your date upper limit in E2.
Easily updated for future months, so might save you some work down the line.
The next easier option would be to update it to be formatted as a table, because your formula would be slightly more legible, add in some named ranges (in this case, E1 = 'lowerlimit' and E2 = 'upperlimit) to once again make it easier to read the formula, in which case you'd end up with something like:
=SUMIFS(table[FigureToBeAccrued], table[dates], ">="&lowerLimit, table[dates], "<="&upperlimit)
Might've overcooked this answer, it's my first, so wanted to make sure I didn't skimp. Let me know if you've got any follow up questions.

Finding the value in excel or vba

Column A Column B
13-06-2013 10:50
13-06-2013 11:30
13-06-2013 12:40
14-06-2013 10:30
I need to find the values which are before a particular entry date and time.
For example, say I want to find the values in the example table above that are immediately prior to the values "13-06-2013" and "12:30".
Since 12:30 is not in column B, how do I find the values I am looking for? The answer should be 13-06-2013 and 11:30.
C7 =VLOOKUP(A7&B7,A1:C4,3,TRUE)
Here A1 = B1&C1
A B C
1 414380.451388888888889 13-06-2013 10:50
2 414380.479166666666667 13-06-2013 11:30
3 414380.527777777777778 13-06-2013 12:40
4 414390.4375 14-06-2013 10:30
5
6 Enter date Enter Time Returned Time
7 13-06-2013 12:30 11:30:00
Setting 'range_lookup' as 'True' adds the flexibility to return the closest approximate value if the exact value is not available.
I think you're looking for something like this. using index and match.
I didn't take into account the date for now. but this gives you an example.
You can compare date strings with operators like > or < etc. Concatenate your values in columns A & B, compare to the desired date/time string. In cell C1 put the following formula, and then drag down:
="13-06-2013 12:30"<A1&" "&B1
or more specifically, depending on which "12:30" you want (AM or PM), ="13-06-2013 12:30AM" or ="13-06-2013 12:30PM"
Your data in column B may default to AM unless otherwise specified/imported differently, so you may need to tweak the data or to account for this.
Here is another approach to answering your question that uses a combination of MATCH, INDEX, and array operations to provide a compact formula solution that does not rely on helper columns.
I'll assume that your two columns of dates and times are in cells A2:B5, and the two date and time values that you want to look up are in cells A9:A10. Then the following two formulas will return what you require, the latest date and time values in your data that are less than or equal to the date and time that you are looking up. (The dollar signs in the formulas are hard on the eyes, but they are important if you will need to copy the formulas to other locations; for clarity, I omit them in the discussion that follows.)
DATE: =INDEX($A$2:$B$5,MATCH(A9+A10,$A$2:$A$5+$B$2:$B$5,1),1) --> 13-06-2013
TIME: =INDEX($A$2:$B$5,MATCH(A9+A10,$A$2:$A$5+$B$2:$B$5,1),2) --> 11:30 AM
These are array formulas and need to be entered with the Control-Shift-Enter key combination. (Of course, only the bits starting with the equal (=) sign and ending with the last parenthesis need to be entered into the worksheet.)
Things to consider:
The formulas assume that your data are valid Excel date and time values. Excel date values are whole numbers that count the number of days that have elapsed since January 1, 1900; Excel time values are decimal amounts between 0 and 1 that represent the fraction of 24 hours that a particular time represents. While your example data don't display AM or PM, I assume that their underlying values do have that information.
If your values are text (having been imported from another source, for instance), you should convert them to date/time values, if lucky, using only Excel's DATEVALUE and TIMEVALUE functions; if not so lucky, using some combination of Excel's string manipulation functions as well. (The values could be kept as strings, but you would almost certainly need to massage them so they would compare correctly "alphabetically" - much easier just to deal with Excel date/time values.)
If they are not already, your dates and times will need to be sorted from smallest to largest. (Your sample looks like they are sorted, and the formulas assume as much.)
How the formulas work
The basic idea behind the formulas is two-fold: first find the row in your data that holds the latest (largest) date and time that is still less than or equal to the date and time you are looking up. That row information can then be used to fetch the final result from each column of the data range (one for date and one for time).
Since both date and time figure in to what point in time is latest, the date and time components of both the value to be looked up and the values that will be searched must be combined somehow.
This can be achieved by simply adding the dates and times together. This does nothing more than what Excel does: an Excel date/time value has an integer part (the number of days since 1/1/1900) and a decimal part (the fraction of 24 hours that a particular time represents).
What is neat here is that the adding up of the dates and times - and the lookup of the particular date and time - can be done all at once, on the fly.
Take a look at the MATCH: The cells that contain the date and time to be looked up - A9 and A10 - are added together, and then this sum is matched against the sum of the date column (A2:A5) and the time column (B2:B5) - an operation that is possible of Excel's array arithmetic capabilities. The match returns a value of 2, indicating correctly that the date and time that fill your requirements are in row 2 of the data table.
DATE/TIME MATCH: = MATCH( A9+A10, A2:A5 + B2:B5, 1 ) --> 2
The 1 that is the final argument to the MATCH function is an instruction that the match results be calculated to be less than or equal to the value to be looked. It is the default value and is often omitted, or replaced with another value (for example, using a value of 0 will produce an exact match, if there is one).
(For readability, I've removed the dollar signs that are in the full formula; these anchor a range so that it remains the same even if the formula is copied to another location.)
Having figured out the row to look in, the rest of the formula is straightforward. The INDEX function returns the value in a data range that is at the intersection of a specified row and column. So, for the date in question, the formula reduces to:
DATE FETCH: = INDEX( A2:B5, 2, 1) --> 13-06-2013
In other words, INDEX is to return the value in the second row and first column of the data range A2:B5.
The formula for the time proceeds in exactly the same fashion, with the only difference that the value is returned from the second column of the data range.

Resources