Average the sum of rows without creating a new column in Excel - excel

Here's a sample of my matrix:
  A B C D E
1 1 0 0 1 1
2 0 0 0 0 0
3 (entire row blank)
4 0 0 1 1 0
5 0 2 1 (two of the five cells blank)
You can think of each row as a respondent and each column as an item on a questionnaire.
My goal is to take an average of the sum of each row (i.e. total score for each respondent) without creating a new column AND accounting for the fact that some or all of the entries in a given row are empty (e.g., some respondents missed some items [see row 5] or didn't complete the questionnaire at all [see row 3]).
The desired solution for this matrix is 1.67, whereby
([1+0+0+1+1 = 3] + [0+0+0+0+0 = 0] + [0+0+1+1+0 = 2]) / 3 = 5/3 = 1.67
As you can see, we have averaged over three values despite there being five rows, because two rows have missing data.
I am already able to take an average of the row sums in which a row is summed only if none of its entries are missing, e.g.:
=AVERAGE(IF(AND(A1<>"",B1<>"",C1<>"",D1<>"",E1<>""),SUM(A1:E1)),IF(AND(A2<>"",B2<>"",C2<>"",D2<>"",E2<>""),SUM(A2:E2)),IF(AND(A3<>"",B3<>"",C3<>"",D3<>"",E3<>""),SUM(A3:E3)),IF(AND(A4<>"",B4<>"",C4<>"",D4<>"",E4<>""),SUM(A4:E4)),IF(AND(A5<>"",B5<>"",C5<>"",D5<>"",E5<>""),SUM(A5:E5)))
However, this results in a value of 1 because it treats any row with some or all values missing as 0.
It does the following:
([1+0+0+1+1 = 3] + [0+0+0+0+0 = 0] + [0+0+0+0+0 = 0] + [0+0+1+1+0 = 2] + [0+0+0+0+0 = 0]) / 5 = 5/5 = 1
Does anyone have any ideas about how to adapt the current code to average over non-missing values or an alternative way of achieving the desired result?

You can do this more concisely with an array formula, but the short fix for your existing formula relies on the fact that AVERAGE ignores blank cells. If you have a blank cell somewhere in your sheet (say it's F1), return it from each IF when the row is incomplete, so change your formula to
=AVERAGE(IF(AND(A1<>"",B1<>"",C1<>"",D1<>"",E1<>""),SUM(A1:E1),F1),IF(AND(A2<>"",B2<>"",C2<>"",D2<>"",E2<>""),SUM(A2:E2),F1),IF(AND(A3<>"",B3<>"",C3<>"",D3<>"",E3<>""),SUM(A3:E3),F1),IF(AND(A4<>"",B4<>"",C4<>"",D4<>"",E4<>""),SUM(A4:E4),F1),IF(AND(A5<>"",B5<>"",C5<>"",D5<>"",E5<>""),SUM(A5:E5),F1))
This would be one array formula version of your formula - it uses OFFSET to pull out each row of the matrix then SUBTOTAL to see if every cell in that row has a number in it. Then it uses SUBTOTAL again to work out the sum of each row and AVERAGE to get the average of rows.
=AVERAGE(IF(SUBTOTAL(2,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1)))=COLUMNS(A1:E1),SUBTOTAL(9,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1))),""))
This has to be entered as an array formula using Ctrl+Shift+Enter.
Note 1 - some people don't like using OFFSET because it is volatile - you can use matrix multiplication instead but it's arguably less easy to understand.
Note 2 - I used "" instead of referring to an empty cell. Interesting that the non-array formula needed an actual blank cell but the array formula needed an empty string.
You can omit the empty string; IF then returns FALSE for the incomplete rows, and AVERAGE ignores logical values in an array:
=AVERAGE(IF(SUBTOTAL(2,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1)))=COLUMNS(A1:E1),SUBTOTAL(9,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1)))))
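To cross-check the result outside Excel, here is a minimal Python sketch of the same rule: sum each row only if every cell is filled, then average those sums. The function name is hypothetical, None stands for a blank cell, and the positions of the blanks in row 5 are an assumption made for illustration.

```python
from typing import Optional, Sequence

Row = Sequence[Optional[float]]


def mean_of_complete_row_sums(rows: Sequence[Row]) -> Optional[float]:
    """Average the row totals, skipping any row that has a blank (None) cell."""
    totals = [sum(row) for row in rows if all(cell is not None for cell in row)]
    return sum(totals) / len(totals) if totals else None


# The question's matrix; None marks a blank cell.
# Which cells of row 5 are blank is an assumption made for illustration.
matrix = [
    [1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [None, None, None, None, None],
    [0, 0, 1, 1, 0],
    [0, None, 2, None, 1],
]

result = mean_of_complete_row_sums(matrix)
print(round(result, 2))  # 1.67
```

Only rows 1, 2 and 4 are complete, so the average is taken over three row sums, as in the desired solution.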

Basically, what you're describing here for your desired result is the =AVERAGEA() function
The Microsoft Excel AVERAGEA function returns the average (arithmetic
mean) of the numbers provided. The AVERAGEA function is different from
the AVERAGE function in that it treats TRUE as a value of 1 and FALSE
as a value of 0.
With that in mind, the formula should look like this.
=SUM(AVERAGEA(A1:A4),AVERAGEA(B1:B4),AVERAGE(C1:C4),AVERAGEA(D1:D4),AVERAGEA(E1:E4))
Produces the expected result:
Note, if you want to ROUND() the result to two digits, add the following formula to it:
=ROUND(SUM(AVERAGEA(A1:A4),AVERAGEA(B1:B4),AVERAGE(C1:C4),AVERAGEA(D1:D4),AVERAGEA(E1:E4)), 2)

Related

lookup for a signal (lookup of 2 consecutive points) in EXCEL

I have a data set. A signal is triggered at one point, where the value changes from 1 to 0 (I know the column number). The column looks like this:
00000111111000022222233333 (transpose this line please)
I want to write a command that does this (not necessarily a macro):
if row(x) == 1 && row(x+1) == 0
return x
The problem is that I don't know how to use IF(AND... without the row number...
Thank you for your help in advance
Supposing the column for the signal is B, starting in row 1, then in another column (say C, starting in C2) enter
=IF(AND($B1=1,$B2=0),"Trigger","")
and copy down, then filter on Trigger.
The following formula will return the row number of the 1 that precedes the 0
=LOOKUP(2,1/((myRng=1)*(OFFSET(myRng,1,0)=0)),ROW(myRng))
You did not write what you want to happen if there are multiple triggers. The above will return the row number of the last trigger.
myRng cannot refer to the entire column, and could be replaced by, for example, A1:A100. If you do that, you might as well replace the OFFSET(myRng,1,0) with A2:A101 to make the formula non-volatile
Explanation:
(myRng=1)*(OFFSET(myRng,1,0)=0)
This multiplies the result of testing each cell in myRng for 1 by the result of testing the next cell for 0. If the first = 1 and the second = 0, then the factors resolve to TRUE * TRUE and return a 1; otherwise the product is 0. So the above will return an array looking like:
{0;0;0;0;0;0;0;0;0;0;1;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0}
Dividing 1 by that array returns an array of #DIV/0! errors and 1s.
Because the lookup value 2 is larger than anything in that array, LOOKUP skips the errors and matches the position of the last 1; using an array of row numbers for result_vector, it then returns the corresponding row number.
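The same search can be sketched in Python as a sanity check on what the LOOKUP formula should return. This is only an illustration under stated assumptions: the function name is hypothetical, the data is the sample line from the question, and it is assumed to start in row 1.

```python
def last_trigger_row(values, first_row=1):
    """Return the row of the last 1 that is immediately followed by a 0, or None."""
    hits = [
        first_row + i
        for i in range(len(values) - 1)
        if values[i] == 1 and values[i + 1] == 0
    ]
    return hits[-1] if hits else None


# Sample column from the question, one digit per row.
signal = [int(ch) for ch in "00000111111000022222233333"]
print(last_trigger_row(signal))  # 11
```

Like the formula, the sketch reports the last trigger when there are several.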

Excel: Sum up N largest numbers in non-continuous array of cells

I am not that fluent in Excel, so my question is potentially very simple but nevertheless gives me a headache.
I have a spreadsheet with two types of values: one absolute value, and one that is calculated as a percentage of this absolute value.
A
1 10
2 0.1
3 20
4 0.2
5 30
6 0.3
7 40
8 0.4
9 50
10 0.5
In this example, the second value is 1% of the first value (e.g. 0.1 from 10). In my actual table these values differ and the numbers are random. The percentage used for the second value depends on some key etc. So this is a simplified representation for the sake of a minimal example.
I want to determine the sum of the largest 4 (out of 5) numbers, but only from those 1% (e.g. 0.1, not 10) values. The numbers are all below each other. Basically, I want to ignore the absolute numbers (e.g. 10) and use only the relative (e.g. 0.1) numbers.
The LARGE function returns the k-th largest number; wrapped in SUM it has the following format:
=SUM(LARGE(array, k))
The array represents a continuous range in the table. However, I need to pass in a selected set of cells. Is there a way to do this with a set of cells?
In other words, if I use the array I have
=SUM(LARGE(A1:A10, {1,2,3,4}))
the formula will always pick up 20, 30, 40 and 50.
Ideally, I want something like this:
=SUM(LARGE(array(A2,A4,A6,A8,A10), {1,2,3,4}))
Help?
Using your provided sample data, something like this regular formula (does not require array entry) should work for you because the percentages will always be less than or equal to 1:
=SUMPRODUCT(LARGE((A1:A10<=1)*A1:A10,{1,2,3,4}))
If you want a more flexible way of grabbing the top N numbers, you can substitute the {1,2,3,4} with the ROW() function, like so:
=SUMPRODUCT(LARGE((A1:A10<=1)*A1:A10,ROW(1:4)))
EDIT: If the only way to get relative values is if they are every other row, starting in row 2, you can use this formula instead:
=SUMPRODUCT(LARGE(INDEX((MOD(ROW(A1:A10),2)=0)*A1:A10,),ROW(1:4)))
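As a rough check of the every-other-row variant, the selection can be sketched in Python. The function name is hypothetical and the data is the sample column from the question. The sketch drops the odd rows, whereas the formula zeroes them out; for positive values the two give the same sum.

```python
def sum_largest_even_rows(values, n):
    """Sum the n largest values taken from even-numbered rows (row numbers start at 1)."""
    even_rows = [v for row, v in enumerate(values, start=1) if row % 2 == 0]
    return sum(sorted(even_rows, reverse=True)[:n])


# Sample column A from the question (rows 1 to 10).
column_a = [10, 0.1, 20, 0.2, 30, 0.3, 40, 0.4, 50, 0.5]
total = sum_largest_even_rows(column_a, 4)
print(round(total, 10))  # 1.4
```

The four largest relative values are 0.5, 0.4, 0.3 and 0.2, which is what the SUMPRODUCT formula should also pick up for this data.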
For your simplified example, suppose that B1:B4 contains the values 1,2,3,4. Then in C1:C4 enter the array formula:
{=LARGE(IF(MOD(ROW(A1:A10),2)=0,A1:A10,-1),B1:B4)}
Similarly, the formula
{=SUM(LARGE(IF(MOD(ROW(A1:A10),2)=0,A1:A10,-1),B1:B4))}
(using Ctrl + Shift + Enter to accept as an array formula)
will give you the sum of the top 4.
This assumes that the numbers are all positive. If that assumption isn't valid, you can replace the -1 in the formula with the minimum of all the values, minus 1.
Another approach, if the criterion for being a relative cell is too ad hoc to be captured by a simple formula but you have a listing of the cells, is to use the INDIRECT function:
In the above screenshot I have a listing of the cells containing the relative values. In D1 I put the formula =INDIRECT(C1) and copied down. Then, the formula
=SUM(LARGE(D1:D5,{1,2,3,4}))
returns the desired sum.
There might be a way to dispense with the helper column, though the function INDIRECT seems to not play very nicely with array formulas.
On Edit: Here is a VBA solution:
'The following function returns the sum of the largest k
'elements in range R that are at the list of indices
'if indices is left blank, then the sum of
'the largest k in R is returned
Function SumLargest(R As Range, k As Long, ParamArray indices() As Variant) As Double
    Dim A As Variant
    Dim i As Long, n As Long
    Dim sum As Double
    n = UBound(indices)
    If n = -1 Then
        For i = 1 To k
            sum = sum + Application.WorksheetFunction.Large(R, i)
        Next i
        SumLargest = sum
        Exit Function
    Else
        ReDim A(0 To n)
        For i = 0 To n
            A(i) = R.Cells(indices(i)).Value
        Next i
        For i = 1 To k
            sum = sum + Application.WorksheetFunction.Large(A, i)
        Next i
        SumLargest = sum
    End If
End Function
If you put this function in a standard code module then it can be used from the worksheet like:
=SumLargest(A1:A10,4,2,4,6,8,10)
This returns the sum of the largest 4 values drawn from the entries at positions 2, 4, 6, 8 and 10.

excel average if-function advanced

I am trying to get the average value(s) of some specific entries. I have two columns: A, which is an index column (it goes e.g. from 1 to 1000), and B, which is the values column.
I know there is an AVERAGE function and there is an AVERAGEIF function, which will probably help me, but I can't seem to get it working the way I need to.
What I need to do is to get the average value of the entries in column B that match this description for the index in column A: 3 + (3*n) in which n >= 0. In this case I need the average of the values in column B, whose entries in A are 3, 6, 9, 12, 15...
Is it possible to do this with excel or do you think it would be better to write a program to get those values?
Thanks for your tips!!
-Jordi
You can use an "array formula" with AVERAGE function, e.g.
=AVERAGE(IF(MOD(A2:A100,3)=0,IF(A2:A100>0,B2:B100)))
confirmed with CTRL+SHIFT+ENTER
To modify according to your comments in simoco's answer you can use this version
=AVERAGE(IF(MOD(A2:A100-11,3)=0,IF(A2:A100-11>=0,B2:B100)))
That will average for 11, 14, 17, 20 etc.
You can use SUMPRODUCT for this:
=SUMPRODUCT((MOD(A1:A1000,3)=0)*B1:B1000)/MAX(1,SUMPRODUCT(1*(MOD(A1:A1000,3)=0)))
Explanation:
MOD(A1,3) gives you 0 only if the value in A1 is of the form 3*n
MOD(A1:A1000,3)=0 gives you an array of TRUE/FALSE values {FALSE,FALSE,TRUE,FALSE,..}
since FALSE is cast to 0 and TRUE is cast to 1 when multiplied by any value, (MOD(A1:A1000,3)=0)*B1:B1000 returns an array of the values in column B where the corresponding value in column A is of the form 3*n (otherwise 0): {0,0,12,0,..}
SUMPRODUCT((MOD(A1:A1000,3)=0)*B1:B1000) gives you the sum of those values in column B
SUMPRODUCT(1*(MOD(A1:A1000,3)=0)) gives you the number of values of the form 3*n in column A
and the last thing: MAX(1,SUMPRODUCT(1*(MOD(A1:A1000,3)=0))) protects you from a #DIV/0! error in case there are no values of the form 3*n in column A
UPD:
in the general case, say for the rule 11+3*n, you could use MOD(A1:A1000-11,3)=0
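The selection rule can also be sketched in Python to check a result by hand. This is an illustration only: the function name and the sample data are hypothetical, and the start and step parameters correspond to the 3 + 3*n and 11 + 3*n rules discussed above.

```python
def average_for_rule(index, values, start=3, step=3):
    """Average the values whose index has the form start + step*n with n >= 0."""
    picked = [
        v for i, v in zip(index, values)
        if i >= start and (i - start) % step == 0
    ]
    return sum(picked) / len(picked) if picked else None


index = list(range(1, 13))        # hypothetical column A: 1..12
values = [i * 10 for i in index]  # hypothetical column B: 10, 20, ..., 120
print(average_for_rule(index, values))            # 75.0
print(average_for_rule(index, values, start=11))  # 110.0
```

Returning None when nothing matches plays the role of the MAX(1, ...) guard against division by zero, without reporting a misleading 0.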

Excel: how many times a value OR another value appears in a column

I have a column in Excel. I found out how to count how many times a value appears in it (e.g. how many 1s are in the column), but I can't find how to count how many times one OR another value appears in this column (e.g. how many cells contain the value 1 OR 2).
Here is an example of my column:
A1 1;
A2 1;2;
A3 1;3;
A4 2;
A5 1;
A6 2;3;
A7 1;2;
In this column, if I want to find how many cells with the number 1 there are, I would do :
=COUNTIF(A1:A7,"1") and then the result would be : 5
But if I want to find how many cells have the number 1 OR the number 2, I can't find a way to do it, although I know the result is 7 (because all of these cells have either the number 1 or the number 2).
The only way I found is to calculate the number of cells with the number 1 which don't have the number 2, calculate the number of cells with the number 2 which don't have the number 1, and add the sum of those to the number of cells with the value "1;2", which gives me a long formula like:
=(COUNTIF(A1:A7,"1")-COUNTIFS(A1:A7,"1",A1:A7,"2"))+(COUNTIF(A1:A7,"2")-COUNTIFS(A1:A7,"2",A1:A7,"1"))+COUNTIF(A1:A7,"1;2")
Does anyone have a simpler formula?
Thank you so much if someone can resolve this!!
I'm a little confused by your formula, this part, for example
=COUNTIFS(A1:A7,"1",A1:A7,"2")
...can only ever return 0 because COUNTIFS works on an "AND" basis and no cell can be both = 1 and = 2 at the same time
and if the data is exactly as shown, with semicolons, then surely this formula
=COUNTIF(A1:A7,"1")
will return zero too because none of your cell values are exactly 1
Are you over-simplifying your data for your question? I don't see how that formula will give you a result of 7
Try this formula to count how many cells contain either 1 or 2 (or both)
=SUMPRODUCT((ISNUMBER(FIND(1,A1:A7))+ISNUMBER(FIND(2,A1:A7))>0)+0)
...of course it will also count a cell if it contains 22 or 11, do you want it to do so in that case?
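If cells containing 11 or 22 should not be counted, one option is to compare whole semicolon-separated entries rather than substrings. A minimal Python sketch of that idea, with a hypothetical function name and the sample column from the question, assuming every entry is delimited by a semicolon:

```python
def count_cells_with_any(cells, wanted):
    """Count cells whose semicolon-separated tokens include any wanted value."""
    wanted = {str(w) for w in wanted}
    count = 0
    for cell in cells:
        # Split on ";" and drop the empty token left by the trailing semicolon.
        tokens = {t.strip() for t in str(cell).split(";") if t.strip()}
        if tokens & wanted:
            count += 1
    return count


column = ["1;", "1;2;", "1;3;", "2;", "1;", "2;3;", "1;2;"]
print(count_cells_with_any(column, [1, 2]))            # 7
print(count_cells_with_any(["11;", "22;3;"], [1, 2]))  # 0
```

Each cell is counted once even when it contains both values, which matches the expected result of 7.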
You could create a vba function and reference it.
something like
Function MyOrTest(cell As Range) As Long
    ' Returns 1 if the cell text contains "1" or "2", otherwise 0
    If InStr(1, cell.Value, "1") > 0 Or InStr(1, cell.Value, "2") > 0 Then
        MyOrTest = 1
    Else
        MyOrTest = 0
    End If
End Function
You will need to add it to a new Module in VBA. Using this will allow just a normal sum function.

Excel Solver: Solving based on an average

I have a parameter in A1 that influences "TOTAL" randomly, with a very high standard deviation. Let's say A1 is 2; then TOTAL values could be 1, 5, 17, 3, 2, 2, etc. If A1 is 1, then TOTAL values could be 1, 3, 5, 15, 9, 10, etc.
I would like Solver to figure out which value in A1 gives the best AVERAGE of TOTAL after X runs, where I can define X.
In my example you can tell that A1=1 is better on average after 6 runs. However, if you run solver normally it would say A1=2 is the best, because it produced a value of 17.
This doesn't seem to be the kind of problem you solve with solver. Why not write a macro that loops through the values of A1, X times, keeping a running sum of the TOTAL values for each A1? When it's all over, the largest sum is also the largest average.
The inner loop will be something like this:
ReDim tSum(1 To maxA1)
For i = 1 To maxA1
    tSum(i) = 0
    For j = 1 To X
        [A1] = i
        Application.Calculate
        ' TOTAL stands for the value of your TOTAL cell
        tSum(i) = tSum(i) + TOTAL
    Next j
Next i
' Now step through tSum. The index of the largest value
' is the value of A1 desired. Put it in a handy cell.
It has to be a macro, not a function, because it changes A1.
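The looping idea can be tried outside Excel first. The following Python sketch is only an illustration: best_parameter and replay are hypothetical names, and instead of recalculating a sheet it replays the TOTAL values quoted in the question for A1 = 1 and A1 = 2.

```python
def best_parameter(candidates, total_for, runs):
    """Return (best candidate, its average TOTAL) after `runs` trials per candidate."""
    averages = {}
    for a1 in candidates:
        samples = [total_for(a1) for _ in range(runs)]
        averages[a1] = sum(samples) / runs
    best = max(averages, key=averages.get)
    return best, averages[best]


# Stand-in for recalculating the sheet: replay the TOTAL values from the question.
observed = {1: iter([1, 3, 5, 15, 9, 10]), 2: iter([1, 5, 17, 3, 2, 2])}


def replay(a1):
    return next(observed[a1])


best, avg = best_parameter([1, 2], replay, runs=6)
print(best, round(avg, 2))  # 1 7.17
```

With six runs each, A1 = 1 wins on average (43/6 against 30/6) even though A1 = 2 produced the single largest value, 17.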
