Excel average with missing values - excel

I have data like the following in cells A1:F4.
Quarter   | FY15Q4 | FY16Q1 | FY16Q2 | FY16Q3 | FY16Q4
Company A | 0.34%  | 0.48%  | 0.55%  | 0.68%  | ------
Company B | 0.32%  | 0.36%  | 0.34%  | 0.35%  | 0.35%
Company C | 1.18%  | ------ | ------ | ------ | ------
For each quarter, I'm trying to average, across companies, each company's most recent non-missing value from the trailing 4 columns. So for:
FY15Q4, I want the average of 0.34%, 0.32%, and 1.18%
FY16Q1, I want the average of 0.48%, 0.36%, and 1.18%
FY16Q2, I want the average of 0.55%, 0.34%, and 1.18%
FY16Q3, I want the average of 0.68%, 0.35%, and 1.18%
FY16Q4, I want the average of 0.68% and 0.35% (Company C has no data for the most recent 4 quarters, so it is excluded from the calculation of the average)
The following array formula works as I want for FY16Q3...
{=AVERAGE(IF(COUNT(B2:E2)=0,"",INDIRECT(ADDRESS(ROW(B2:E2),MAX((B2:E2<>"")*COLUMN(B2:E2))))),IF(COUNT(B3:E3)=0,"",INDIRECT(ADDRESS(ROW(B3:E3),MAX((B3:E3<>"")*COLUMN(B3:E3))))),IF(COUNT(B4:E4)=0,"",INDIRECT(ADDRESS(ROW(B4:E4),MAX((B4:E4<>"")*COLUMN(B4:E4))))))}
But for FY16Q4, the same formula structure...
=AVERAGE(IF(COUNT(C2:F2)=0,"",INDIRECT(ADDRESS(ROW(C2:F2),MAX((C2:F2<>"")*COLUMN(C2:F2))))),IF(COUNT(C3:F3)=0,"",INDIRECT(ADDRESS(ROW(C3:F3),MAX((C3:F3<>"")*COLUMN(C3:F3))))),IF(COUNT(C4:F4)=0,"",INDIRECT(ADDRESS(ROW(C4:F4),MAX((C4:F4<>"")*COLUMN(C4:F4))))))
returns the #VALUE! error.
It seems that the AVERAGE function, which usually deals well with blank cell ("") values, is struggling because of the added complexity of the array formula.
Any suggestions on how I can make this work without either (i) using find-and-replace to replace "" with a truly blank cell; or (ii) using VBA?
Surely there must be a way to make this work using only formulas...

One method without array formulas is to use helper cells. I set up another table with the same column/row headers.
Then in the first cell I put:
=IFERROR(INDEX($A:$F,MATCH($A12,$A:$A,0),IF(MATCH(1E+99,INDEX($A:$F,MATCH($A12,$A:$A,0),1):INDEX($A:$F,MATCH($A12,$A:$A,0),MATCH(B$11,$1:$1,0)))<=MATCH(B$11,$1:$1,0)-4,NA(),MIN(MATCH(1E+99,INDEX($A:$F,MATCH($A12,$A:$A,0),1):INDEX($A:$F,MATCH($A12,$A:$A,0),MATCH(B$11,$1:$1,0))),MATCH(B$11,$1:$1,0)))),"")
Then drag it across and down to fill in the table with the correct numbers.
Then it is just a simple AVERAGE() formula:
=AVERAGE(B12:B14)
When dragged across, this ignores the blank cells.
A few caveats pointed out by @Jeeped and @XOR LX:
This only works if the "blank" cells are not actually zeros formatted to display as something else.
As companies and quarters are added the reference table will need the same columns and rows added.
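As a cross-check on the numbers the helper table should produce, here is a minimal Python sketch of the same logic, not of the Excel formula itself. The names `latest_in_window` and `trailing_average` are hypothetical, missing cells are modelled as `None`, and the data is copied from the question's table.

```python
from statistics import mean

quarters = ["FY15Q4", "FY16Q1", "FY16Q2", "FY16Q3", "FY16Q4"]
# Values in percent; None stands for a missing ("------") cell.
data = {
    "Company A": [0.34, 0.48, 0.55, 0.68, None],
    "Company B": [0.32, 0.36, 0.34, 0.35, 0.35],
    "Company C": [1.18, None, None, None, None],
}

def latest_in_window(values, end, window=4):
    """Most recent non-missing value in the `window` columns ending at index `end`."""
    start = max(0, end - window + 1)
    for i in range(end, start - 1, -1):
        if values[i] is not None:
            return values[i]
    return None  # nothing in the window, so this company is skipped

def trailing_average(table, end, window=4):
    """Average the per-company picks, ignoring companies with no data in the window."""
    picks = [latest_in_window(v, end, window) for v in table.values()]
    picks = [p for p in picks if p is not None]
    return mean(picks) if picks else None

averages = {q: trailing_average(data, i) for i, q in enumerate(quarters)}
```

For FY16Q4 this averages only Company A (0.68) and Company B (0.35), because Company C has nothing in the last four columns.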

Related

Excel Offset: Automating the Reference Cell by using Index Match

If the answer was already provided, please feel free to post the link, but I did not see one.
This is a real estate example. I have a table of a lot of properties and information. One of the fields is "Occupancy." I am trying to use the AVERAGE OFFSET functions combined with CELL and INDEX/MATCH to calculate the average Occupancy of the top 5 and top 10 buildings in this table. To accomplish this, I wanted to create a "Summary Table" below the Property table to list the averages.
Specifically:
For OFFSET, in the "Reference" section, I want to look up "Occupancy" in my summary cell, use INDEX MATCH to find "Occupancy" in the table, and then;
I want the Cell Reference returned AS A VALUE and then move one row down and then highlight the top 5.
Most importantly, I want everything cell referenced without hardcoding or navigating the table to highlight the cell reference. I am not sure if this is possible, but after poring over a bunch of articles, I am having trouble finding a solution.
To make things simple, let's pull out all the columns I had in the original table and go off the below. Assume all the data types are in the proper format:
**COLUMN A** **COLUMN B**
ROW 1 | Properties | Occupancy |
ROW 2 | Property A | 58% |
ROW 3 | Property B | 56% |
ROW 4 | Property C | 77% |
ROW 5 | Property D | 85% |
ROW 6 | Property E | 92% |
ROW 7
ROW 8 |Top 5 | | 'NOTE: I used a custom format, "Top "#, so please do not think this is text.
ROW 10|Occupancy | **CODE HERE**| 'Format is "General"
ROW 11| | |
ROW 12| | |
Here's how I initially coded it in B10:
'=AVERAGE(OFFSET(CELL("address",INDEX($A$1:$B$1,MATCH(A10,$A$1:$B$1,0))),1,0,5,1))
I am getting "there is a problem with your formula", which I am guessing is due to CELL returning text in lieu of the cell reference as a value. So I error-checked myself, stripped out the AVERAGE and OFFSET, and put this in B11:
'=CELL("address",INDEX($A$1:$B$1,MATCH(A10,$A$1:$B$1,0)))
I know I get $B$1, so I then put $B$1 to replace the CELL function within B10 to see if my average offset is wrong:
=AVERAGE(OFFSET($B$1,1,0,5,1))
I now get the right solution since I went and hardcoded the reference, which does not solve the issue.
If I use INDIRECT and then CELL in B12:
'=INDIRECT(CELL("address",INDEX($A$1:$B$1,MATCH(A10,$A$1:$B$1,0))))
I then get the title "Occupancy" and not the cell reference.
Obviously, I am aware of the fact that the CELL function will return text and not a value. I can't use INDIRECT because when I do, INDIRECT will just display the title of the column in text and not return the cell reference as a value. ADDRESS is not useful because you have to hardcode and the goal is to have zero hardcodes.
How can I code finding the cell reference of specific text in a table without actually going to the table to cell reference when I am building out my OFFSET function?
As you can see, I went through a variety of ways to dissect the problem and I would appreciate anyone's help to come to a solution. Thanks in advance.

Excel: Cannot average blanks determined by formula

I have some data in B1:B10 (values) and in C1:C10 (strings) that I want to average.
My values are (from row 1-10):
B | C
-----
1 | Approved
1 | Approved
1 | Approved
1 | Approved
| N/A
| N/A
| N/A
1 | Approved
1 | Approved
0 | Disapproved
When I enter the following formula in A1 to average the data in column B, I get a result (0.857143), no problem:
=AVERAGE(B1,B2,B3,B4,B5,B6,B7,B8,B9,B10)
When I instead enter the following formula in D1, I get a #VALUE! error instead, though from what I can tell, the logic is the same (replacing N/A's with blanks):
=AVERAGE(
IF(C1="Approved",1,IF(C1="Disapproved",0,IF(C1="N/A","",""))),
IF(C2="Approved",1,IF(C2="Disapproved",0,IF(C2="N/A","",""))),
IF(C3="Approved",1,IF(C3="Disapproved",0,IF(C3="N/A","",""))),
IF(C4="Approved",1,IF(C4="Disapproved",0,IF(C4="N/A","",""))),
IF(C5="Approved",1,IF(C5="Disapproved",0,IF(C5="N/A","",""))),
IF(C6="Approved",1,IF(C6="Disapproved",0,IF(C6="N/A","",""))),
IF(C7="Approved",1,IF(C7="Disapproved",0,IF(C7="N/A","",""))),
IF(C8="Approved",1,IF(C8="Disapproved",0,IF(C8="N/A","",""))),
IF(C9="Approved",1,IF(C9="Disapproved",0,IF(C9="N/A","",""))),
IF(C10="Approved",1,IF(C10="Disapproved",0,IF(C10="N/A","","")))
)
What gives, and what do I need to change in order to get 0.857143 as a result in the formula for the strings values in column C?
I also tried changing the "if true" and "if false" parts for N/A to VALUE("") and VALUE(0). With VALUE("") it still results in a #VALUE! error, and with VALUE(0) it still counts the blank in the average, which is not desired, as I only want an average of the 1's and 0's.
Additional info: If I split up the formula for the strings to evaluate each one in a separate cell, THEN pull an average on THAT range, it works fine. Considering the data set I am working with, though, I would rather not add them all separately, as it clutters the workspace enormously.
AVERAGE ignores empty cells and text when they sit inside a referenced cell or range (as in your first example), but text passed to it directly as an argument, such as the "" returned by your IF expressions, cannot be converted to a number and produces #VALUE! (your second example). So try this instead:
=COUNTIF(C1:C10,"Approved")/SUM(COUNTIF(C1:C10,{"Approved","Disapproved"}))
This will leave N/A out of the equation.
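To see why the ratio matches the expected 0.857143, here is a small Python sketch of the same counting idea. The `statuses` list is copied from column C of the question; the variable names are only illustrative.

```python
# Column C from the question, rows 1-10.
statuses = ["Approved"] * 4 + ["N/A"] * 3 + ["Approved"] * 2 + ["Disapproved"]

approved = statuses.count("Approved")        # numerator: rows that score 1
disapproved = statuses.count("Disapproved")  # rows that score 0

# N/A rows appear in neither count, so they never enter the average.
rate = approved / (approved + disapproved)
```

Six approvals out of seven scored rows gives 6/7, the same value the AVERAGE over column B returns.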

Retrieve unique counts in excel

I'm looking to create a formula that counts the number of sign ups for a given day. My input data is based off an event log, and looks something like this:
AccountName | SignUpDay | EventLogged
acct1 | 06/01/2016 | 06/01/2016
acct1 | 06/01/2016 | 06/05/2016
acct2 | 06/01/2016 | 06/02/2016
acct3 | 06/01/2016 | 06/04/2016
acct3 | 06/01/2016 | 06/06/2016
acct4 | 06/03/2016 | 06/06/2016
The above is dummy data. But lets say I have 10k lines for my input data. For my output, given a specific day I want to look at the input data and return the number of signups for that particular day. What I want to achieve is something like this:
SignUpDay | Count
06/01/2016 | 3
06/02/2016 | 0
06/03/2016 | 1
I know I could probably do something like this in R, but I'm working within what I have right now, which is excel. Anyone have any ideas on how to achieve this?
With Excel's FREQUENCY() and an Array formula:
Assuming
your sheet is set up with AccountNames in A2:A100
your SignupDates are stored in B2:B100
the current Date you want to subset for is in F2
then enter:
=SUM(IF(FREQUENCY(MATCH(IF(B$2:B$100=F2,A$2:A$100,""),IF(B$2:B$100=F2,A$2:A$100,""),0),MATCH(IF(B$2:B$100=F2,A$2:A$100,""),IF(B$2:B$100=F2,A$2:A$100,""),0))>0,1))-1
and press CTRL+SHIFT+ENTER to enter this as an array formula.
With your list of sequential dates in, for example, F2:Fn, you may try this array-entered formula:
G2: =SUM(1/COUNTIF(AccountName,AccountName)*(SignUpDay=F2))
AccountName and SignUpDay should refer only to the existing data range (no blanks). If there are blanks, a more complex formula would be required. If you use a table, with structured addressing, the names can adjust automatically.
Also, the formula assumes that the same account will not sign up on more than one day. If that might be the case, a more complex formula would be required.
eg:
=SUM(1/COUNTIF(SignUpTable[AccountName],SignUpTable[AccountName])*(SignUpTable[SignUpDay]=F2))
To array-enter a formula, after entering the formula into the cell or formula bar, hold down Ctrl+Shift while hitting Enter. If you did this correctly, Excel will place braces {...} around the formula.
EDIT: If your data does not fit into the above constraints, I would suggest one of the solutions linked to by @pnuts in his comments, where you add an additional column to your data with the formula:
=IF(SUMPRODUCT(($A$2:$A2=A2)*($B$2:$B2=B2))>1,0,1)
and then construct a Pivot table with SignUpDay in the Rows area, and this new column in the Values area. A disadvantage of the Pivot Table solution is that dates with zero signups will not be represented in the table.
In Excel 2013+ it is possible to generate Unique Counts in the Values area, but you mentioned you are using Excel 2010, so that is not a possibility.
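For checking a formula's output against the expected table, the distinct-account count can be sketched in Python as below. This is only an illustration of the counting logic under the question's sample data; `signups_per_day` is a hypothetical helper name.

```python
from collections import defaultdict

# (AccountName, SignUpDay, EventLogged) rows from the question's dummy data.
rows = [
    ("acct1", "06/01/2016", "06/01/2016"),
    ("acct1", "06/01/2016", "06/05/2016"),
    ("acct2", "06/01/2016", "06/02/2016"),
    ("acct3", "06/01/2016", "06/04/2016"),
    ("acct3", "06/01/2016", "06/06/2016"),
    ("acct4", "06/03/2016", "06/06/2016"),
]

def signups_per_day(rows, days):
    """Count distinct accounts per sign-up day; days with no sign-ups report 0."""
    accounts = defaultdict(set)
    for account, signup_day, _event in rows:
        accounts[signup_day].add(account)  # a set drops repeated log lines
    return {day: len(accounts.get(day, ())) for day in days}

counts = signups_per_day(rows, ["06/01/2016", "06/02/2016", "06/03/2016"])
```

Passing the list of days explicitly keeps zero-sign-up dates in the result, which is the gap noted above for the Pivot Table approach.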

Excel - INDEX function for multiple columns of same value

I'm trying to put something together in excel that allows me to find the location of a lift at any time for a specific floor. A sample of the data is as follows:
Time | Catering Area | Waste Area | Waste Area
5:00am | L2 | L2 | L4
5:05am | L2 | L7 | L5
5:10am | B1 | L3 | L7
5:15am | B2 | L4 | L9
I have set up two dropdown fields to select a level and area of the building (e.g. L7 and Waste Area). Based on these selections, I want to show when the lift is at the desired level within the area of the building; i.e.:
Level Selected: L7; Building Area Selected: Waste Area
Time | Lift At L7?
5:00am | No
5:05am | Yes
5:10am | Yes
5:15am | No
I have set up an INDEX function, however I need to search across multiple columns with the same header name, ie. "Waste Area". The function so far is as follows:
INDEX($A$1:$D$5,MATCH(A10,A:A,0),[col_num])
This would then be paired with an IF statement to check whether the returned level matches the desired level from the dropdown field. The result will be a list of "Yes" or "No" for each time as shown above.
Any help would be greatly appreciated.
Try the COUNTIFS() worksheet function.
=COUNTIFS(B$1:D$1,"Waste Area",B2:D2,"L7")
=COUNTIFS(B$1:D$1,"Waste Area",B3:D3,"L7")
=COUNTIFS(B$1:D$1,"Waste Area",B4:D4,"L7")
=COUNTIFS(B$1:D$1,"Waste Area",B5:D5,"L7")
It will count the entries when the area matches and the floor matches. Non-zero = "Yes".
Because your output time column is exactly the same as the input, I assume you don't actually need to search for the correct row. If you do, there are more tricks I can show you...
(edit...)
If you need to search for a specific time, replace the third argument B2:D2 with a formula that finds the correct row.
OFFSET(B$1:D$1,MATCH(A16,A$2:A$5),0)
So it becomes
=COUNTIFS( B$1:D$1, AREA_NAME, OFFSET(B$1:D$1,MATCH(A16,A$2:A$5),0), FLOOR)
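The two-condition count can be sketched in Python to confirm the expected Yes/No column. This is a rough model of the counting logic rather than of Excel's behaviour in every edge case; `lift_count` is a hypothetical name and the data is the question's sample.

```python
# Header row B1:D1 and the data rows B2:D5 from the question.
headers = ["Catering Area", "Waste Area", "Waste Area"]
schedule = {
    "5:00am": ["L2", "L2", "L4"],
    "5:05am": ["L2", "L7", "L5"],
    "5:10am": ["B1", "L3", "L7"],
    "5:15am": ["B2", "L4", "L9"],
}

def lift_count(levels, area, level):
    """Count columns where the header matches the area AND the cell matches the level."""
    return sum(1 for h, lv in zip(headers, levels) if h == area and lv == level)

# Non-zero count means "Yes", mirroring the IF wrapped around the count.
answers = {
    time: "Yes" if lift_count(levels, "Waste Area", "L7") else "No"
    for time, levels in schedule.items()
}
```

Because both conditions are tested column by column, the duplicate "Waste Area" header needs no special handling.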
Assuming your Data is in A1:D5 and that you have NAME'd your ranges (note that Waste covers two columns)
Catering =Sheet1!$B$2:$B$5
Times =Sheet1!$A$2:$A$5
Waste =Sheet1!$C$2:$D$5
Duplicate your list of Times in some cell; I used A11:A14
With the Area dropdown values matching the defined Names, B9 containing the selected level, and D9 containing the selected building area (both from the dropdown lists), you can use the following formula to return TRUE or FALSE depending on whether the lift is in position.
For Yes/No responses, use this formula as the Logical_test in an IF statement.
This formula must be array-entered, then fill down adjacent to each time:
=OR((A11=Times)*($B$9=INDIRECT($D$9)))
To array-enter a formula, after entering the formula into the cell or formula bar, hold down Ctrl+Shift while hitting Enter. If you did this correctly, Excel will place braces {...} around the formula.

Find one cell inside a string and then populate another cell based off of that

I have a basic financial spreadsheet that I am trying to make even nicer. I have a table that looks like this:
| Description | Category | Amount
| 2341234 Chick-fil-a | Eating Out | 5.89
| 11234 Redbox/Georgetown | Entertainment | 1.21
And another table that looks like this:
| Name | Category |
| Chick-fil-a | Eating out |
| Redbox | Entertainment |
I want to find the strings from the first column of the second table in the first column of the first table and automatically populate the second column of the first table. I am pretty sure I can find the position of the string by doing a {=find(2ndSheet!A:A,1stSheetA:A)} or something of the like, but I have no idea how to populate another cell based off of that.
Does it have to be a macro?
This is a bit ugly but seems to work. It handles the situation where your description may not necessarily match the lookup table (for instance, with Redbox | Redbox/Georgetown). Replace the [Lookup_Table] with the absolute range of your lookup table, and enter as an array formula (Control+Shift+Enter):
=INDEX([Lookup_Table],MATCH(TRUE,NOT(ISERR((FIND([Lookup_Table],A2)))),0),2)
As for what it's doing:
=INDEX(
[Lookup_Table],
MATCH(
TRUE,
NOT(ISERR((FIND([Lookup_Table],A2)))),
0),
2)
It's using an INDEX formula with the [Lookup_Table] as the base range. For the row to match, it looks for instances where running the FIND formula using the current cell does not cause an error. So in the case of Redbox, you'd get an array like {#VALUE!, 7}. You then set the values to TRUE/FALSE based on whether they are errors, and then flip them so that non-errors are True. You then match the TRUE condition against that array, and a hit will return the row number in your lookup table. You then use that row number and column 2 to find the corresponding value.
Again, that is not a pretty one (and there is probably a better way to handle all of your cases), but it seems to work in my extremely exhaustive test of three rows :)
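The same "first name that occurs inside the description" idea can be sketched in Python. It assumes, like FIND, a case-sensitive substring test against the Name column only; `categorize` is a hypothetical helper and the lookup rows come from the question.

```python
# (Name, Category) rows from the second table in the question.
lookup = [
    ("Chick-fil-a", "Eating out"),
    ("Redbox", "Entertainment"),
]

def categorize(description, table=lookup):
    """Return the category of the first lookup name found inside the description."""
    for name, category in table:
        if name in description:  # case-sensitive substring test, like FIND
            return category
    return None  # no name matched; the Excel formula would return an error here

eating = categorize("2341234 Chick-fil-a")
entertainment = categorize("11234 Redbox/Georgetown")
```

Matching on a substring is what lets "Redbox" hit "11234 Redbox/Georgetown" even though the full description is not in the lookup table.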
Assuming that your first table starts in column A, entering this formula starting in B2 should find the value for you.
=VLOOKUP(RIGHT(A2,LEN(A2)-FIND(" ",A2,1)),[Range of Helper Table],2,FALSE)
You'll need to substitute [Range of Helper Table] with the actual range. Make sure you use absolute references using $, for example 2ndSheet!A$2:B$20, so that you can copy and paste.
