Conditional averages (AVERAGEIF, AVERAGEIFS, or other option?) - excel

Perhaps it's just been a long week, but I can't think of how to get a pretty simple average.
Here's my data (two columns):
1/3/1994 1165
1/4/1994 1162
1/5/1994 1133
1/6/1994 1133
1/7/1994 1138
1/10/1994 1143
1/11/1994 1118
1/12/1994 1150
1/13/1994 1171
1/14/1994 1177
1/17/1994 1161
1/18/1994 1162
1/19/1994 1121
1/20/1994 1112
1/21/1994 1129
1/24/1994 1136
1/25/1994 1124
1/26/1994 1118
1/27/1994 1127
1/28/1994 1133
1/31/1994 1088
2/1/1994 1055
2/2/1994 1051
2/3/1994 1071
2/4/1994 1079
2/7/1994 1054
2/8/1994 1079
2/9/1994 1079
2/10/1994 1089
2/11/1994 1074
2/14/1994 1083
2/15/1994 1068
2/16/1994 1075
2/17/1994 1071
As you can see, it's a column of dates (that continue until Sept. 9 2015, so it's long), and another of price. I am just trying to get the average for each month of each year (i.e. Jan 1994, Jan 1995, Jan 1996 ... Jan 2015, then Feb 1994, etc.).
Here's the table I plan on using the formula in:
2007 2008 2009 2010 2011
January
February
March
April
So, in the cell right of "January" and below "2007", I want the average of prices that are in Jan, 2007.
I tried using this (again, my data starts in A1 and B1):
=AverageIfs(B:B,year(A:A),1994,month(A:A),1) (regular and as array), but it doesn't work - I keep getting the error "The formula you typed contains an error." (I'd really prefer this to be a formula, rather than a VB solution)
Thanks for any ideas!
Edit: In the meantime, I have created two helper columns that are just the Month() and Year() of each row of data. Then I can use =AverageIfs(B:B,[month helper range],1,[year helper range],2007). Is there a way to do this without a helper column, though?

Try this
=AVERAGE(IF(YEAR(A:A)=1994,IF(MONTH(A:A)=1,B:B,""),""))
entered as an array formula (CTRL-SHIFT-ENTER). If you want to use the month as text you could use
=AVERAGE(IF(YEAR(A:A)=1994,IF(TEXT(A:A,"mmmm")="January",B:B,""),""))
Hope that helps

assuming your data has a header: "Date" and "Price" in cells A1, B1.
assuming your data begins in A2 = "1/3/1994" and B2 = 1165
C1 = "Month"
D1 = "Year"
C2 = =TEXT(A2,"Mmmm")
D2 = =YEAR(A2)
Copy Cells C2+D2 down ...
I place your new table in:
H2 = "January"
H3 = "February"
... etc.
I1 = 1994
J1 = 1995
... etc.
I2 = =AVERAGEIFS($B:$B,$C:$C,$H2,$D:$D,I$1)
and copy that formula throughout the table.
Cheers!

Yes, you can use AVERAGEIFS() and you should. This is about a thousand times faster than the accepted answer:
=AVERAGEIFS(B:B,A:A,">="&DATE(1994,1,1),A:A,"<"&DATE(1994,2,1))
You can even do it this way for a more concise formula, but I believe it raises problems for non-USA users because of their date format settings:
=AVERAGEIFS(B:B,A:A,">=1/1/94",A:A,"<2/1/94")
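For anyone who wants to check the logic outside Excel, here is a minimal Python sketch of the same date-range idea: keep the prices whose date is on or after the first of the month and strictly before the first of the next month, then average them. The function name average_in_month is made up for illustration, and the rows are a few lines taken from the sample data in the question.

```python
from datetime import date

def average_in_month(rows, year, month):
    """Average prices whose date lies in [first of month, first of next month)."""
    start = date(year, month, 1)
    # Roll over to January of the following year after December.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    prices = [price for day, price in rows if start <= day < end]
    # Return None when nothing matches (Excel would show #DIV/0! here).
    return sum(prices) / len(prices) if prices else None

# A few rows from the question's sample data.
rows = [
    (date(1994, 1, 28), 1133),
    (date(1994, 1, 31), 1088),
    (date(1994, 2, 1), 1055),
    (date(1994, 2, 2), 1051),
]

print(average_in_month(rows, 1994, 1))  # 1110.5
print(average_in_month(rows, 1994, 2))  # 1053.0
```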

I don't know if this is the most concise solution, but it works. You can use SUMPRODUCT as follows:
=SUMPRODUCT((MONTH($A:$A)=1)*(YEAR($A:$A)=1994)*$B:$B)/SUMPRODUCT((MONTH($A:$A)=1)*(YEAR($A:$A)=1994))
What this is essentially doing is summing the values in column B based on the two criteria, and then counting the number of rows that matched the criteria and dividing by that number.
For each row, the MONTH and YEAR conditions evaluate to either 1 (true) or 0 (false) and then those two values are multiplied with the value in column B, resulting in column B's value if both conditions are true, or 0 if one or both conditions are false.
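The 1/0 multiplication described above can be sketched in Python as follows. This is only an illustration of the arithmetic, not something Excel runs; the name sumproduct_average is hypothetical and the rows come from the question's sample data.

```python
from datetime import date

def sumproduct_average(dates, prices, year, month):
    # Each condition becomes 1 (true) or 0 (false), like the arrays inside SUMPRODUCT.
    flags = [int(d.month == month) * int(d.year == year) for d in dates]
    total = sum(f * p for f, p in zip(flags, prices))  # first SUMPRODUCT: sum of matching prices
    count = sum(flags)                                 # second SUMPRODUCT: number of matching rows
    # Like the Excel formula, this fails with a division by zero when no row matches.
    return total / count

dates = [date(1994, 1, 3), date(1994, 1, 4), date(1994, 2, 1), date(1994, 2, 2)]
prices = [1165, 1162, 1055, 1051]

print(sumproduct_average(dates, prices, 1994, 1))  # 1163.5
```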

This solution requires the "table with the results" to use month numbers instead of month names.
It also assumes that the "table with the results" starts at F2, with the years across row 2 (G2, H2, ...) and the month numbers down column F (F3, F4, ...).
Then use this formula:
=IFERROR(AVERAGEIFS($B:$B,$A:$A,">="&DATE(G$2,$F3,1),$A:$A,"<="&EOMONTH(DATE(G$2,$F3,1),0)),"N/A")
The formula shows "N/A" if there are no prices for the period (year/month); if you would rather see a blank, replace it with "".
I made small changes to your sample data so that it covers several periods.
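As a rough cross-check of the approach, here is a Python sketch that fills a small month-by-year table the same way: the range runs from the first day of the month to its last day (the job EOMONTH does in the formula), and empty periods show "N/A". The names monthly_average and table are assumptions for illustration only.

```python
import calendar
from datetime import date

def monthly_average(rows, year, month):
    first = date(year, month, 1)
    # calendar.monthrange returns (weekday of first day, number of days in month);
    # the second value gives the last day, like EOMONTH(first, 0).
    last = date(year, month, calendar.monthrange(year, month)[1])
    prices = [price for day, price in rows if first <= day <= last]
    return sum(prices) / len(prices) if prices else "N/A"  # same role as IFERROR(..., "N/A")

# A few rows from the question's sample data.
rows = [
    (date(1994, 1, 28), 1133),
    (date(1994, 1, 31), 1088),
    (date(1994, 2, 1), 1055),
]

# Keys are (month number, year), matching a table with month numbers down and years across.
table = {(month, 1994): monthly_average(rows, 1994, month) for month in (1, 2, 3)}
print(table)
```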

Related

Summing every first month columns in Excel

I am trying to sum the first 7 columns, then the next 7 columns, and so on in Excel. For example, with the data below, I need the values added up weekly:
Day 02/01/2017 03/01/2017 04/01/2017 05/01/2017 06/01/2017 07/01/2017 08/01/2017 09/01/2017 10/01/2017 11/01/2017 12/01/2017 13/01/2017 14/01/2017 15/01/2017
Presented Calls 1000 1550 900 1455 789 987 1435 1200 1675 1230 1232 1400 999 650
So if I want to add the presented calls from 02/01 - 08/01, this should be sum(B2:H2).
Then the sum of the presented calls from 09/01 - 15/01 should be sum(I2:O2),
etc.
However, at the moment Excel produces sum(B2:H2) and then sum(C2:I2), which is incorrect. Can anyone help?
You can use the OFFSET() function combined with the COLUMN() function and a bit of arithmetic to get the desired range to sum.
Try entering this formula and fill across.
=SUM(OFFSET($B$2,0,(COLUMN()-COLUMN($B$2))*7,1,7))
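To see what the OFFSET arithmetic is doing, here is a small Python sketch of the same idea: the n-th result skips n*7 values and sums the next 7. The function name block_sums is made up; the numbers are the Presented Calls row from the question.

```python
def block_sums(values, width=7):
    # Block n starts at index n * width, just as the formula offsets by (column index) * 7.
    return [sum(values[start:start + width]) for start in range(0, len(values), width)]

presented_calls = [1000, 1550, 900, 1455, 789, 987, 1435,
                   1200, 1675, 1230, 1232, 1400, 999, 650]

print(block_sums(presented_calls))  # [8116, 8386]
```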

Conditional formatting on the first x number of rows, regardless of filter or sort, in Excel

I'm trying to find a way to easily identify the first ten rows in a table column, no matter how it's been sorted/filtered. Is there a way to use conditional formatting to highlight these cells?
Examples of desired results...
Sample data:
product price units code
Item02 15.97 2191 7UQC
Item05 12.95 1523 TAAI
Item13 9.49 1410 LV9E
Item01 5.69 591 6DOY
Item04 15.97 554 ZCN2
Item08 10.68 451 2GN0
Item03 13.95 411 FP6A
Item07 25.45 174 PEWK
Item09 14.99 157 B5S4
Item06 18 152 XJ4G
Item10 11.45 148 BY8M
Item11 16.99 66 86C2
Item12 24.5 17 X31K
Item14 24.95 14 QJEI
When sorting by price the first 10 products highlighted differ from those in the next example.
The first 10 visible products are highlighted after filtering out Item12, Item05, and Item08.
Choosing to sort by units automatically highlights a different set of products.
Use this formula in the Conditional Formatting:
=SUBTOTAL(3,$A$2:$A2)<11
Make sure it applies to the entire dataset.
The formula returns each row's position among the visible rows. Thus, when a row is hidden, each row beneath it returns one less than it otherwise would.
To see how it works place SUBTOTAL(3,$A$2:$A2) in an empty column. Then filter the table and watch as the numbers change.
The 3 refers to the COUNTA() function, which will count any non-empty cell.
Subtotal is designed to work with data that gets filtered to return only the visible data.
So the Formula will only count the visible cells that are not empty.
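If it helps, the running count of visible rows can be sketched in Python like this. It is only a model of what the SUBTOTAL rule computes; first_visible is a hypothetical name, and the product order is the one from the sample data in the question.

```python
def first_visible(products, hidden, limit=10):
    highlighted = []
    visible_count = 0
    for name in products:
        if name in hidden:
            continue  # filtered-out rows are not counted
        visible_count += 1  # running count of visible rows so far
        if visible_count <= limit:  # same test as SUBTOTAL(3,$A$2:$A2)<11
            highlighted.append(name)
    return highlighted

products = ["Item02", "Item05", "Item13", "Item01", "Item04", "Item08", "Item03",
            "Item07", "Item09", "Item06", "Item10", "Item11", "Item12", "Item14"]

# Filtering out Item12, Item05 and Item08 leaves 11 visible rows; the first 10 get highlighted.
print(first_visible(products, hidden={"Item12", "Item05", "Item08"}))
```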
In the conditional formatting dialog, choose New rule -> Use a formula.... Enter =row()<=10.

Spreadsheet aggregation/manipulation

I have a spreadsheet structured like
2005 Alameda total HS graduates 1234
2005 Alameda UC enrollees 112
2006 Alameda total HS graduates 892
2006 Alameda UC enrollees 84
...
2009 Yolo total HS graduates 1300
2009 Yolo UC enrollees 93
and so on for every CA county for several years.
I want to generate a spreadsheet like this:
county 2005 2006 ...
Alameda 11.1% 9%
Alpine 7% 8%
...
Yolo 5.5% 4%
i.e. I want to project the years from rows to columns and have a row for each county, then divide the number of graduates (the data from each odd-numbered row in the original sheet) by the number of UC enrollees (even-row data) for each year, and insert it in the appropriate cell.
This would be easy enough for me to do in Java, but I want to get a feel for what's possible just using excel/Google sheets alone - how might I go about accomplishing this?
Assuming the counties are sorted, and they start in cell B2, enter =B2 in cell F2, and enter the following in F3:
=INDIRECT("B"&COUNTIF(B3:B$9999,"<="&F2)+ROW())
You can change 9999 based on the number of records, but it's fine as-is.
Copy F3 down as many rows as are needed:
You can then calculate percentages using SUMPRODUCT:
=IFERROR(
SUMPRODUCT(($A$2:$A$100=G$1)*
($B$2:$B$100=$F2)*
($C$2:$C$100="UC enrollees")*
$D$2:$D$100
)
/
SUMPRODUCT(($A$2:$A$100=G$1)*
($B$2:$B$100=$F2)*
($C$2:$C$100="total HS graduates")*
$D$2:$D$100
),
"")
The first SUMPRODUCT totals UC enrollees that match the year and county. The second SUMPRODUCT does the same for HS graduates. The results are divided, and IFERROR handles divide-by-zero errors for missing data.
Since your example shows percentages, I assume you want to divide UC enrollees by HS graduates, and not the other way around. Either way, I don't get the same totals as you, so let me know if I misunderstood.
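For comparison with the Java route mentioned in the question, here is a minimal Python sketch of the same calculation: total each measure per county and year, then divide UC enrollees by HS graduates and skip pairs with no graduate count. The name enrollment_rates is an assumption; the records are the rows shown in the question.

```python
from collections import defaultdict

def enrollment_rates(records):
    # (county, year) -> measure name -> summed value, like the two SUMPRODUCT totals.
    totals = defaultdict(lambda: defaultdict(float))
    for year, county, measure, value in records:
        totals[(county, year)][measure] += value
    rates = {}
    for key, sums in totals.items():
        graduates = sums.get("total HS graduates", 0)
        if graduates:  # skip missing data instead of dividing by zero, like IFERROR(..., "")
            rates[key] = sums.get("UC enrollees", 0) / graduates
    return rates

records = [
    (2005, "Alameda", "total HS graduates", 1234),
    (2005, "Alameda", "UC enrollees", 112),
    (2006, "Alameda", "total HS graduates", 892),
    (2006, "Alameda", "UC enrollees", 84),
    (2009, "Yolo", "total HS graduates", 1300),
    (2009, "Yolo", "UC enrollees", 93),
]

for (county, year), rate in sorted(enrollment_rates(records).items()):
    print(county, year, f"{rate:.1%}")
```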
Here is the pivot table way of doing it for comparison.
There are many ways of doing this, but I've added column headers and chosen to use this formula to put percentages in even rows of column E and zeroes in odd rows in sheet 1:
=IF(ISEVEN(ROW()),D3/D2*100,0)
Then I've inserted a pivot table in sheet 2 referring to my data in sheet 1 and set up the fields as shown, and it's pretty automatic.

SUMIF dynamically change summing column

I am using SUMIFS and want the sum_range dynamically to change according to the name I have of a column.
I have a table with about 100 columns. Say one of these columns is Paid_BC_items. I want a formula that looks for which column Paid_BC_items is in and somehow insert that into the SUMIF here where the Sheet4!J:J part is. I have a few other criteria here too which are fixed so they don't need to be dynamic.
=SUMIFS(Sheet4!J:J,Sheet4!$C:$C,Sheet2!$D$3,Sheet4!$E:$E, Sheet2!$C6, Sheet4!$G:$G, Sheet2!$D6)
If for example I changed the column heading to something else I want the SUMIF then to look for that column in the big tables and return that.
I know it has something to do with indexing, matching and indirects but I just can't figure it out right now.
Year Week Total Orders Paid_BC_items Free_BC_items
2014 1 971 147 104
2014 2 1565 339 213
2014 3 1289 391 209
2014 4 1171 389 228
2014 5 1163 375 240
2014 6 1298 405 330
2014 7 1233 404 292
Try using this in place of the sum range:
INDEX(Sheet4!$A:$DZ,0,MATCH("Paid_BC_Items",Sheet4!$A$1:$DZ$1,0))
When you use INDEX with 0 as the row argument you get the whole column, and MATCH picks the right column based on the header.
The whole formula becomes:
=SUMIFS(INDEX(Sheet4!$A:$DZ,0,MATCH("Paid_BC_Items",Sheet4!$A$1:$DZ$1,0)),Sheet4!$C:$C,Sheet2!$D$3,Sheet4!$E:$E, Sheet2!$C6, Sheet4!$G:$G, Sheet2!$D6)
To make it dynamic, so that you can drag it across for all column headings, put the heading in a cell (D$1 here) and reference that cell in the MATCH:
MATCH(D$1,Sheet4!$A$1:$DZ$1,0)
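The header lookup can be sketched in Python as follows, purely to illustrate the idea of finding the column by name first and summing it second. The function sum_by_header and the way criteria are passed are assumptions for this example; the rows are the first three from the table in the question.

```python
def sum_by_header(header, rows, target, criteria):
    # Like MATCH: find the position of the column whose header equals target
    # (MATCH ignores case for text, so compare case-insensitively here too).
    col = [h.casefold() for h in header].index(target.casefold())
    # Like SUMIFS: add the chosen column for rows that meet every criterion.
    positions = {name: header.index(name) for name in criteria}
    return sum(row[col] for row in rows
               if all(row[positions[name]] == value for name, value in criteria.items()))

header = ["Year", "Week", "Total Orders", "Paid_BC_items", "Free_BC_items"]
rows = [
    (2014, 1, 971, 147, 104),
    (2014, 2, 1565, 339, 213),
    (2014, 3, 1289, 391, 209),
]

print(sum_by_header(header, rows, "Paid_BC_Items", {"Year": 2014}))  # 877
print(sum_by_header(header, rows, "Free_BC_items", {"Year": 2014}))  # 526
```

Changing only the target name switches the summed column, which is what swapping the heading cell does in the formula.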

In excel, I need to find the maximum date based on the employee number

I have tried to use the following formula when trying to find the max date of these columns based on the employee number in my hundreds of thousands lines of data. The formula bar gives me 'yes' when it is the max, however in my cell it says 'no'. I cannot figure out what the issue is. Thanks for the help.
Tamara
Formula used: =IF(AQ2=MAX(IF($C:$C=C2,$AQ:$AQ)),"YES","NO")
A B Employee Number Max?
11-Mar-13 12-Mar-13 199 NO
24-Mar-13 26-Mar-13 199 NO
1-Aug-13 6-Aug-13 199 NO
22-Dec-13 27-Dec-13 199 NO
15-Apr-13 17-Apr-13 206 NO
18-Apr-13 18-Apr-13 206 NO
8-Aug-13 10-Aug-13 206 NO
17-Oct-13 18-Oct-13 206 NO
25-Dec-13 20-Feb-14 206 YES
8-May-13 8-May-13 214 NO
You can also accomplish this without an array formula if all of the dates for a specific employee ID are unique, that is, you won't have two of the same date. In this case, the following formula checks that (a) the number of rows with the employee ID is equal to (b) the number of rows with that employee ID whose date is less than or equal to the current row's date. This will only be true for the "max" date for that employee ID:
=IF(COUNTIFS($C:$C,C2)=COUNTIFS($C:$C,C2,$A:$A,"<="&A2),"Yes","No")
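Here is a small Python sketch of the same counting comparison, in case it makes the logic easier to follow. The function name latest_date_flags is hypothetical; the rows use the column A dates and employee numbers from the question.

```python
from datetime import date

def latest_date_flags(rows):
    flags = []
    for day, employee in rows:
        # (a) rows for this employee, like COUNTIFS($C:$C,C2)
        same_employee = sum(1 for d, e in rows if e == employee)
        # (b) rows for this employee dated on or before this row, like the second COUNTIFS
        not_later = sum(1 for d, e in rows if e == employee and d <= day)
        flags.append("Yes" if same_employee == not_later else "No")
    return flags

rows = [
    (date(2013, 3, 11), 199), (date(2013, 3, 24), 199),
    (date(2013, 8, 1), 199), (date(2013, 12, 22), 199),
    (date(2013, 4, 15), 206), (date(2013, 4, 18), 206),
    (date(2013, 8, 8), 206), (date(2013, 10, 17), 206),
    (date(2013, 12, 25), 206), (date(2013, 5, 8), 214),
]

print(latest_date_flags(rows))
```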
If I understand your question correctly, you want to find the set of dates with the largest time span between them. If this is the case, I would recommend using two separate functions, the =DAYS360 function and the =MAX function.
I have re-created your sheet and it will end up looking similar to this:
Here is the same picture of the same sheet with functions revealed, so that you can see how the functions are used:
The =DAYS360 function takes two inputs and returns the number of days between two dates, based on a 360-day year. The MAX function simply finds the largest number in a range. Please let me know if this helped.
EDIT: Also, if you want to see the actual word Max next to the largest date range, you can nest the MAX function from my column E within an IF function, like this:
=IF(MAX(D:D)=D2,"Max","")
If I understand you correctly, do you want "YES" to appear for each employee's max date range? Assuming column AQ contains the spans between dates in columns A and B (i.e. =B2-A2 copied down), your formula should work.
This only works as an array formula, so make sure you press CTRL+SHIFT+ENTER when entering the formula, then copy it down to all cells in the same column.
=IF(AQ2=MAX(IF($C:$C=C2,$AQ:$AQ)),"YES","NO"), entered in D2 using CTRL+SHIFT+ENTER and copied down, produces the following:
A B C D ... AQ
11-Mar-13 12-Mar-13 199 NO 1
24-Mar-13 26-Mar-13 199 NO 2
1-Aug-13 6-Aug-13 199 YES 5
22-Dec-13 27-Dec-13 199 YES 5
15-Apr-13 17-Apr-13 206 NO 2
18-Apr-13 18-Apr-13 206 NO 0
8-Aug-13 10-Aug-13 206 NO 2
17-Oct-13 18-Oct-13 206 NO 1
25-Dec-13 20-Feb-14 206 YES 57
8-May-13 8-May-13 214 YES 0
If you are simply looking for the greatest date range, the formula =IF(E2=MAX($E:$E),"YES","NO") entered in D2 and copied down will do the trick.
