Excel average if function for obtaining hourly data - excel-formula

I have minute data for air temperature and relative humidity (about 930,000 cells) for a building and I am trying to obtain the hourly average using excels "AverageIF" function. Here's the criteria that I have come up with: whenever one hour passes, average the air temperature and relative humidity for that hour. I am having trouble communicating these criteria to excel and any help would be appreciated. I have included a pic of what my data looks like:
the yellow row is me manually adding a row after an hour has passed and calculated the average using the average function.

In the example above I went a bit long winded in the formula to deal with time intervals where there was no data, as that generates a divide by 0 error.
Essentially what I did was setup a summary table to the right. I gave a start and end time for each hour (ASSUMING this is based on a 24 hour clock and not 1 hour from some arbitrary start time)
I used the AVERAGEIFS function as you have two criteria. Everything after a start time and everything before an end time. Only lines that match BOTH criteria are included.
The basic formula looks like this and is based on my example cell J4:
AVERAGEIFS(E$4:E$31,$C$4:$C$31,">="&$H4,$C$4:$C$31,"<="&$I4)
If the date is important, then you would need to add a date column to the table on the right and add a third criteria to the formula above.
In order to deal with divide by zero error and display blank instead of zero so you can distinguish between actual average of 0 and no data and still presenting an error when another type of error comes up I use the following formula in J4:
=IFERROR(AVERAGEIFS(E$4:E$31,$C$4:$C$31,">="&$H4,$C$4:$C$31,"<="&$I4),IF(ERROR.TYPE(AVERAGEIFS($E$4:$E$31,$C$4:$C$31,">="&$H4,$C$4:$C$31,"<="&$I4))=2,"","ERROR"))
The above formulas can be copied down and to the right.

Related

Is a date within several date ranges

I'm trying to put together an excel spreadsheet to show the day-by-day impact of stays by 3rd country citizens in the EU's Schengen area (it's a fairly complex rolling 180 day window). At the moment, I have a column with every day from a start date (so lots of rows) and another where I enter a 1 if there was a stay in a Schengen country for that day. Then other columns calculate when the 180 day period started and how many days have been in Schengen for that day.
It's a bit of a faff having to enter a 1 for every day away. I'd like to have another worksheet where I simply enter the start and end dates of (an arbitrary number of) each stay and then the calculation spreadsheet simply works out if each date falls within a stay or not. Working out whether a date falls within one date range is pretty straightforward, but working out if it falls within several date ranges isn't obvious to me. Any suggestions please?
After a very fair comment, here's a couple of images that hopefully illustrate what I'm trying to do (please ignore the colours).
Example of date ranges, but the number of these should be arbitrary:
The calculated sheet - currently I have to enter the 1s individually whereas I'd like them evaluated from the date ranges:

Forcing order with time format on Excel

I run a test overnight and have the results written on a txt file that I later extract the time and value from. The result is something like this:
Time data
Because it goes overnight and the time resets, the resulting plot is this:
Out of order plot
Is there any way to force the order in which the values are plotted without modifying the original .txt file so that it looks like my daytime tests?
in order plot
Edit:
Indeed the actual data skips and the timestamps are not periodic, here's how it looks around midnight
Not periodic timestamps
You will need a helper column and you need to make sure your time is actual excel time, and not a time string. Assuming your time is in column B starting in cell B2 you can use the following process:
1) Test cell
=ISNUMBER(B2)
If that is TRUE then you know you have your time stored in excel format. In excel time is stores as a decimal which represents the portion of a day. 12 noon is 0.5. Dates are stored as integers and represent number of days since 1900/01/01 with that date being day 1.
2) Add 24 hour increment
In order to properly sort your order for graphing purposes, you will need to add a day everytime your data passes the 24 / midnight mark. Assuming your data is sorted chronologically as per your example, use the following formulas in lets say column C
First cell C2
=B2
In C3 and copy down
=IF(B2>B3, B3+1,B3)
which can be rewritten as
=B3+(B2>B3)
Now format the cells in C for time. The date will not display, only time will. The times after the 23:59:59 mark will all be the following day.

Excel: Count Total Schedules at 30 Minute intervals taking day into account

In assessing how many agents can be added to certain times of day without exceeding the number of seats in the call center, I'm trying to discern how many agents are scheduled for each half hour interval on each day of the week.
Using the =SUMPRODUCT(((A$2:A$1000<=D2)+(B$2:B$1000>D2)+(A$2:A$1000>B$2:B$1000)=2)+0) formula I've been able to identify how many total agents work for each interval, however this doesn't take the day of week into account.
I currently have my spreadsheet setup this way:
K is the start time of the shift, L is the end time of the shift, M to S pulls data from another sheet that shows a 1 if the agent works on that day of the week and 0 if they do not, and then U has all the time intervals listed out. In the example, it's cut off but the columns continue down as needed. U goes to 49 and I've just been using a range from 2 to 500 for the others as we currently do not have that many shifts and I'm leaving space for the moment.
After some Googling, I tried =SUMPRODUCT(--(M2:M500="1"),(((K$2:K$1000<=U2)+(L$2:L$1000>U2)+(K$2:K$1000>L$2:L$1000)=2)+0)) but it only returns #VALUE! so I'm not sure what I'm doing wrong.
Any suggestions of how I can make this work? Please let me know if more information would be useful. Thanks.
=sumproduct(($K$2:$K$1000<=U2)*($L$2:$L$1000>=U2))
That will count the number of occurrences where the start time is less than or equal to the time in U2 AND the end time is greater than or equal to U2. It will check time from row 2 to row 1000. Any time one condition is checked and its true the comparison will result in value of TRUE and FALSE when its not true. The * act like an AND condition while at the same time converts TRUE and FALSE values to 1 and 0. So both conditions have to be true for a value of 1 to result. Sumproduct then totals up all the 1 and 0 to get you a count.
In order to consider the days of the week, you will need one thing to be true. Your headers in M1:S1 will need to be unique (which I believe they are). You will need to either repeat them in adjacent columns to M or in say V1 you have a cell that can change to the header of the day of the week you are interested in. I would assume the former so you can see each day at the same time.
In order to do this you want to add more conditions and pay attention to you reference locks $.
In V2 you could use the following formula and copy down and to the right as needed:
=sumproduct(($K$2:$K$1000<=$U2)*($L$2:$L$1000>=$U2)*(M$2:M$1000))
UPDATE #1
ISSUE 1 Time ranges ending the following day (after midnight)
It is based on the assumption that the ending time is later than the start time. When a shift starts at 22:00 and end at 6:30 as an example, our mind says that the 0630 is later than 22:00 because it is the following day. Excel however has no clue that 0630 is the following day and without extra information assumes it to be the same day. In excel date is stored as an integer and time is stored as a decimal. When you only supply the time during entry it uses 0 for the date.
In excel the date is the number of days since 1900/01/00. So one way to deal with your time out is to add a day to it. This way excel will know your out time is after your in time when the hour is actually earlier in the day.
See your sample data below.
Using your sample data, I did a straight copy of the value in L and placed it in M (=L3 and copy down). I then changed the cell display format from time to general. This allows you to see how excel sees the time. Note how all the time is less than 1.
In column N I added 1 to the value when the out time was less than the in time to indicate that it was the following day and we had not actually invented time travel. I also used the trick of a math operation will convert a TRUE/FALSE result to 1 or 0 to shorten the equation. I used =M3+(L3<K3) and copied down. You will notice in the green bands that the values are now greater than 1.
In the next column I just copied the values from N over using =N3 copied down, and then I applied just a time display format for the cell. Because it is only time format, the date is ignored and visually your time in column O looks the same as column L. The big difference is excel now knows that it is the following day.
you can quickly fix your out times by using the following formula in a blank column and then copying the results and pasting them back over the source column using paste special values.
=M2+(L2<K2)
The next part is for your time check. When looking at the 12:00 time you need to look at 1200 of the current day incase a shift started at 12:00 and you need to look at the 1200 period of the following day. In order to do that we need to modify the the original formula as follows:
=sumproduct(($K$2:$K$1000<=$U2)*($L$2:$L$1000>=$U2)*(M$2:M$1000)+($K$2:$K$1000<=$U2+1)*($L$2:$L$1000>=$U2+1)*(M$2:M$1000))
Note that the + in the middle of (M$2:M$1000) + ($K$2:$K$1000<=$U2+1)? This + acts like an OR function.
Issue 2 Time In/Out 15 minute increments, range 30 minute increments
You may be able to achieve this with the ROUNDDOWN Function or MROUND. I would combine this with the TIME function. Basically you want to have any quarter hour start time be treated as 15 minutes sooner.
=ROUNDDOWN(E3/TIME(0,30,0),0)*TIME(0,30,0)
Where E3 is your time to be converted
So your formula may wind up looking something like:
=sumproduct((ROUNDDOWN($K$2:$K$1000/TIME(0,30,0),0)*TIME(0,30,0)<=$U2)*($L$2:$L$1000>=$U2)*(M$2:M$1000)+((ROUNDDOWN($K$2:$K$1000/TIME(0,30,0),0)*TIME(0,30,0)<=$U2+1)*($L$2:$L$1000>=$U2+1)*(M$2:M$1000))
similar option could be used for the leaving time and rounding it up to the next 30 minute interval. In which case just use the ROUNDUP function. Though I am not sure it is required.

Excel AVERAGEIFS else statement

I'm trying to perform an AVERAGEIFS formula on some data, but there are 2 possible results and as far as I can tell AVERAGEIFS doesn't deal with that situation.
I basically want to have an ELSE inside it.
At the moment I have 2 ranges of data:
The first column only contains values 'M-T' and 'F' (Mon-Thurs and Fri).
The second column contains a time.
The times on the rows with an 'F' value in column 1 are an hour behind the rest.
I want to take an average of all the times, adjusting for the hour delay on Fridays.
So for example I want it to take an average of all the times, but subtract 1 hour from the values which are in a row with an 'F' value in it.
The way I've been doing it so far is by having 2 separate results for each day, then averaging them again for a final one:
=AVERAGEIFS(G3:G172, B3:B172, "M-T")
=AVERAGEIFS(G3:G172, B3:B172, "F")
I want to combine this into just one result.
The closest I can get is the following:
=AVERAGE(IF(B3:B172="M-T",G3:G172,((G3:G172)-1/24)))
But this doesn't produce the correct result.
Any advice?
Try this
=(SUMPRODUCT(G3:G172)-(COUNTIF(B3:B172,"=F")/24))/COUNTIF(B3:B172,"<>""""")
EDIT
Explaining various steps in the formula as per sample data in the snapshot.
SUMPRODUCT(G3:G17) sums up all the value from G3 to G17. It gives a
value of 4.635416667. This after formatting to [h]:mm gives a value
of 111.15
OP desires that Friday time be one hour less. So I have kept one hour less for Friday's in the sample data. Similar SUMPRODUCT on H3:H17 leads to a value of 4.510416667. This after formatting to [h]:mm gives a value
of 108.15. Which is exactly three hours less for three occurrences of Fridays in the sample data.
=COUNTIF(B3:B17,"=F") counts the occurrences of Friday's in the B3:B17 range which are 3 occurrences.Hence 3 hours have to less. These hours are to be represented in terms of 24 hours hence the Function COUNTIF() value is divided by 24. This gives 0.125. Same is the difference of 4.635416667 and 4.510416667 i.e. 0.125
Demonstration column H is for illustrative purposes only. Infact Friday accounted values that is 108.15 in sample data has to be divided by total data points to get the AVERAGE. The occurrences of data points are calculated by =COUNTIF(B3:B17,"<>""""") with a check for empty columns.
Thus 108:15 divided by 15 data points give 7:13 in the answer.
Revised EDIT Based upon suggestions by #Tom Sharpe
#TomSharpe has been kind enough to point the shortcomings in the method proposed by me. COUNTIF(B3:B172,"<>""""") gives too many values and is not advised. Instead of it COUNTA(B3:B172) or COUNT(G3:G172) are preferable. Better Formula to get AVERAGE as per his suggestion gives very accurate results and is revised to:
=AVERAGE(IF(B3:B172="M-T",G3:G172,((G3:G172)-1/24)))
This is an Array Formula. It has to be entered with CSE and further cell to be formatted as time.
If your column of M-T and F is named Day and your column of times is named TIME then:
=SUMPRODUCT(((Day="M-T")*TIME + (Day="F")*(TIME-1/24)))/COUNT(TIME)
One simple solution would be to create a separate column that maps the time column and performs the adjustment there. Then average this new column.
Is that an option?
Ended up just combining the two averageifs. No idea why I didn't just do that from the start:
=((AVERAGEIFS(G$3:G171, $B$3:$B171, "F")-1/24)+AVERAGEIFS(G$3:G171, $B$3:$B171, "M-T"))/2

Time series mock data generation for 16 years of quarterly data in Excel or Matlab

I would like to generate a mock time series quarterly dataset from, say, 2000-2016 for a variable (quarterly credit growth) that averages around a certain value (say, 30%). Can anyone give a suggestion on how to do this in principle?
Edit: what I was implying were the actual data values for each time period, i.e. data with a certain mean and variance.
Found a solution with a code in Matlab, for anyone interested, see below in answers.
Excel approach:
You can make column A your date list. In A1, or A2 or more if you have header rows, you will have to seed your list by providing the first start date. Lets assume you put your seed date in A2. I would then go about adding 3 month to you start date using a formula, and copy down until you have hit your desired date. In order to add the 3 months I would use the following in A3.
=date(year(A2),Month(A2)+3,day(A1)
that will give you the first day of the month every 3 months. If you want the first day of the month every 3 months, set the day to 1 like so:
=date(year(A2),Month(A2)+3,day(A1)
And end of month could be calculated as:
=eomonth(date(year(A2),Month(A2)+3,day(A1)),0)
however I would prefer to do the end of month calculation based on the row you are in so I would do it more like:
=EOMONTH($A$2,(rows($A$2:A3)-1)*3)

Resources