Excel: 'SUMIFS' ignoring #N/A

I'm trying to sum some values (both positive and negative) in a column, but it won't sum as there are #n/a values. Is there any way to get around this? Please see the example below:
Each city has lower areas that have income, crime and unemployment scores (either positive or negative).
My goal is to get city-wide income, crime and unemployment scores by summing lower areas' scores.
I'm using 'sumifs' in G2 as in =SUMIFS(C$2:C$10,$A$2:$A$10,$A2) and then dragging it to I10. In this toy example, there are only 10 rows, but my data have 1 million rows, so I cannot really do dragging. Any suggestions on that would also be helpful!
But, most importantly my problem is that I cannot use 'sumifs' due to the #N/A values. I want to ignore them.
Or simply replacing all the #N/A values with 0 in the 1 million rows would also be an option.
p.s. I have done some research on previous similar questions, but they seem to use different 'sumifs' formulae...

Using SUMIFS with <0 AND >0
You could use this:
=SUMIFS(C$2:C$10,$A$2:$A$10,$A2,C$2:C$10,"<0")+SUMIFS(C$2:C$10,$A$2:$A$10,$A2,C$2:C$10,">0")
It avoids zero (which makes no difference in this case).
It also avoids cells that contain values that are not numbers, such as NA, which often prevents summing (avoiding these solves your issue in this case).
If I understand correctly, you want the sum of both positive and negative numbers (i.e. their actual values, not absolute values), but your existing approach fails because of the NA items.
If this is not correct, please advise.
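For readers who want to check the logic outside Excel, here is a minimal Python sketch of what the two-part SUMIFS above does: it sums per city and skips anything that is not a non-zero number. The sample rows and the "#N/A" string marker are made up for illustration; they are not the asker's data.

```python
from collections import defaultdict

# Hypothetical sample rows: (city, score). "#N/A" stands in for an Excel error cell.
rows = [
    ("London", 1.5),
    ("London", "#N/A"),
    ("London", -0.5),
    ("Paris", 2.0),
    ("Paris", 0),
    ("Paris", "#N/A"),
]

def sum_by_city(rows):
    """Sum numeric scores per city, mimicking SUMIFS(..., "<0") + SUMIFS(..., ">0")."""
    totals = defaultdict(float)
    for city, score in rows:
        # Only real numbers can satisfy "<0" or ">0"; error markers and text are skipped.
        if isinstance(score, (int, float)) and not isinstance(score, bool):
            if score < 0 or score > 0:
                totals[city] += score
    return dict(totals)

totals = sum_by_city(rows)
print(totals)
```

Skipping zero changes nothing in the total, which is why the "<0" plus ">0" split is safe here.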
USING AN INTERIM LOOKUP, THEN SUMIF WITH <0 AND >0
You could alternatively insert a lookup column, which concatenates the values to evaluate, and then use SUMIF.
For example, new column J:
=$A2&C2
(noting the $ for the A as well as the absence of $ for the C)
Then fill J1 to the right to K1 and L1
Then in M1:
=SUMIF(J$2:J$10,"<0",C$2:C$10)+SUMIF(J$2:J$10,">0",C$2:C$10)
Then fill right to N1 and O1
The potential issue with this approach is if you fill down millions of rows the spreadsheet may slow down significantly.
REPLACING NA
Replacing the NA with 0 will also work, but you may not wish to lose the distinction between "0" and "not shown in source data".
Please post your existing formula for the NA values if you would like to go down that path. They may be reworkable into something instead of NA that is meaningful and does not prevent summing.
It is often preferable when looking up data to trap for the possibility of NA and return some other more meaningful (or in this case, more sum-friendly) result.

Perhaps the easiest way is to use "<>#N/A" as a criterion, as in =SUMIFS(C$2:C$10,$A$2:$A$10,$A2, C$2:C$10, "<>#N/A"), which ignores #N/A in the sum column. You can add further criteria for other condition columns if they interfere with numeric condition checks.
Thanks: https://www.mrexcel.com/board/threads/sumifs-while-ignoring-n-a.1042338/ for the tip.

Related

Give Sum for matching Column and Row values - repeating variables

The first table below shows how much each person owes and who pays it (it's part of a larger model so I simplified it for our purposes here).
Our goal in the second table below is to give a sum when both the column and row value match.
For example: A (column C) paid $244.17 (D36:H48) in expenses for B (row 35).
Where am I wrong here? I have tried different methods suggested here.
This is another alternative, that only requires to extend the formula down, but not to the left, because on each row it returns an array with all column values. In cell I3 enter the following formula:
=MMULT(N(TRANSPOSE($A$3:$A$15=H3)),IF($B$3:$F$15="", 0, $B$3:$F$15))
or using LET to avoid repetition of the same range:
=LET(set, $B$3:$F$15, MMULT(N(TRANSPOSE($A$3:$A$15=H3)),IF(set="", 0, set)))
Notes:
MMULT only works with numeric values, so empty cells need to be converted.
You can replace TRANSPOSE with TOROW if you want.
$-notation is not required in H3, because we extend the formula only down
Here is the output:
Note: This solution assumes the header values to compare are the same, i.e. the same values for Paid For (I2:M2) and Paid By (H3:H7), which is the most common situation. That is why the formula uses only the Paid By column. If that is not the case, then the solution provided by @JB-007 is more flexible, because the values can be different, but then you need to extend the formula in both directions.
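As a rough illustration of why the MMULT formula returns one total per column, here is a small Python sketch: a 1 x n row of 1s and 0s selects the matching rows, and the matrix product adds them up column by column. The payer names and amounts are invented, and `mmult` is a hand-written helper, not an Excel or library function.

```python
def mmult(a, b):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# Hypothetical data: who paid (like column A) and amounts per person (like B:D).
paid_by = ["A", "B", "A", "C"]
amounts = [
    [10, "", 5],
    ["", 20, ""],
    [1, 2, 3],
    [7, "", ""],
]

def totals_for(payer):
    # N(TRANSPOSE(range = payer)): a 1 x n row of 1s and 0s marking matching rows.
    selector = [[1 if p == payer else 0 for p in paid_by]]
    # IF(set = "", 0, set): blanks become 0 so every entry is numeric.
    numeric = [[0 if v == "" else v for v in row] for row in amounts]
    return mmult(selector, numeric)[0]

print(totals_for("A"))
```

Each call returns a whole row of totals, which matches the answer's point that the formula only has to be filled down, not across.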
screenshot/s here refer:
=SUM($C$4:$E$6*($C$3:$E$3=C$8)*($B$4:$B$6=$B9))
(SUMIFS will really struggle to work across different dimensions)
PS: as you'll see, most will advise SUMPRODUCT. I think it's overdue for deprecation, because there's very little (if anything) you can do with SUMPRODUCT that you cannot do with SUM. You can even do counts with SUM: =SUM(1*($C$3:$E$3=C$8)*($B$4:$B$6=$B9)) returns the count of cells where these values match.
Save yourself the extra seven letters over and over! ☺
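The mask-multiplication idea behind that SUM formula can be sketched in Python as follows. Each comparison yields True/False, which behave as 1/0 when multiplied, so only cells whose column header and row label both match contribute. The headers, labels and values are hypothetical.

```python
# Hypothetical 3 x 3 block of amounts with a header row and a label column.
col_headers = ["x", "y", "x"]
row_labels = ["p", "q", "p"]
values = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

def sum_where(col_key, row_key):
    """Like SUM(values * (col_headers = col_key) * (row_labels = row_key))."""
    return sum(
        values[r][c] * (col_headers[c] == col_key) * (row_labels[r] == row_key)
        for r in range(len(row_labels))
        for c in range(len(col_headers))
    )

def count_where(col_key, row_key):
    """Like SUM(1 * (col_headers = col_key) * (row_labels = row_key))."""
    return sum(
        1 * (col_headers[c] == col_key) * (row_labels[r] == row_key)
        for r in range(len(row_labels))
        for c in range(len(col_headers))
    )

print(sum_where("x", "p"), count_where("x", "p"))
```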

Compare multiple columns as pair-wise for Excel/Google Sheets

I am new to Excel/Google Sheets. I am having difficulty writing a formula to compare columns pair-wise, because the formula gets bigger with every day that is added.
For example, there are 2 main columns, Foo and Bar. I want to find the total number of days on which Foo and Bar are equal, so the current formula is =IF(A3 = G3, 1, 0)+IF(B3 = H3, 1, 0)+IF(C3 = I3, 1, 0)+...
But this is tedious because there are ~40 days to compare. Is there a more efficient way to write the formula? Either Google Apps Script or an Excel formula would be appreciated.
Cheers!
Try the Google Sheets formula below. Adjust the ranges as needed.
=ArrayFormula(SUM(IF(A3:E3=G3:K3,1,0)))
Assuming that you're needing to get such a total for each row and not merely a single row, try this:
=ArrayFormula(IF(A3:A="",,MMULT(IF(A3:F=G3:L,1,0),SEQUENCE(COLUMNS(A:F),1,1,0))))
Of course you will need to adjust the three ranges to match your own FOO and BAR ranges.
This one formula will produce all results for all rows.
The MMULT function is tricky to explain to those as yet unfamiliar with it. But it's a powerful tool. I'll add a picture I created that may best explain what it does:
By making the second matrix a simple SEQUENCE of 1s as long as the other matrix is wide, we wind up multiplying everything by 1 before adding together. And since anything multiplied by 1 is itself, this combination serves only to do a row-by-row add.
Things to keep in mind with MMULT:
1.) Every cell in every matrix must be a number or it will produce an error.
2.) As in the above formula, there are ways to use either/or conditions to turn every cell in a matrix into a number.
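To make the "multiply by a column of 1s" idea concrete, here is a minimal Python sketch of the same row-by-row add. The `foo` and `bar` grids are invented, and plain lists stand in for the sheet ranges.

```python
# Hypothetical Foo and Bar blocks: one row per record, one column per day.
foo = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
bar = [[1, 0, 3], [0, 0, 0], [7, 8, 9]]

def matches_per_row(foo, bar):
    # IF(foo = bar, 1, 0): an indicator matrix of 1s and 0s.
    indicator = [
        [1 if f == b else 0 for f, b in zip(foo_row, bar_row)]
        for foo_row, bar_row in zip(foo, bar)
    ]
    # SEQUENCE(COLUMNS(...), 1, 1, 0): a column of 1s, one per indicator column.
    ones = [1] * len(foo[0])
    # MMULT(indicator, ones): the dot product of each row with the 1s is the row sum.
    return [sum(x * o for x, o in zip(row, ones)) for row in indicator]

print(matches_per_row(foo, bar))
```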

How to improve the formula writing and avoid repeating the entire formula depending on the condition

So, say I have at cell A1:
=IF(A2=1,A2,0)
That's OK; it's a tiny formula that is easy to understand.
If the formula starts to grow, I would have something like:
IF(...big formula here...=1,...repeat the big formula here...,0)
It's a dummy example, but the key point is that when I repeat the big formula in the TRUE position, the formula doubles in size, which can make it harder to debug, for example.
Is there a way to avoid writing the whole formula twice in this situation?
I don't want to use any macro/VBA to do this or any other 'helper' cells.
Thanks
In this particular case you don't need an IF statement; you can just use
=--(A2=1)
Or for some other value, say 2,
=(A2=2)*2
These work if one of the results you want is zero.
It is a little more difficult if you have an IF statement like
=IF(A2>2,A2,2)
but you can often use MAX or MIN to avoid the IF statement
=MAX(A2,2)
If you had a chain of IF statements to divide the number in A2 into ranges like
=IF(A2>=2,20,IF(A2>=1,10,0))
You could replace it with a lookup
=IFERROR(VLOOKUP(A2,{1,10;2,20},2),0)
Sometimes you can replace a series of IF statements with CHOOSE, e.g. to return "Negative", "Positive" or "Zero"
=CHOOSE(SIGN(A2)+2,"Negative","Zero","Positive")
One tricky way I have seen is to use inverse functions one of which gives an error under certain conditions, so you could try
=IFERROR((SQRT(A2-2)^2)+2,2)
but I'm not sure I could recommend it as these methods can be vulnerable to rounding errors.
See this previous question
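As a sanity check on the lookup replacement above, here is a small Python sketch that compares the nested IF chain with an approximate-match lookup over the same thresholds. It uses `bisect_right` as a stand-in for VLOOKUP's approximate match; the function names are made up for illustration.

```python
from bisect import bisect_right

thresholds = [1, 2]  # first column of the lookup array {1,10;2,20}
results = [10, 20]   # second column of the lookup array

def banded(value):
    """Mimic IFERROR(VLOOKUP(value, {1,10;2,20}, 2), 0) with approximate match."""
    i = bisect_right(thresholds, value)  # how many thresholds are <= value
    # Below the first threshold there is no match: that is the IFERROR fallback.
    return results[i - 1] if i > 0 else 0

def nested_if(value):
    """The original chain: IF(A2>=2, 20, IF(A2>=1, 10, 0))."""
    return 20 if value >= 2 else (10 if value >= 1 else 0)

samples = [-1, 0, 0.5, 1, 1.5, 2, 3]
print([banded(v) for v in samples])
```

Adding another band then means adding one entry to each list rather than nesting another IF.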
Create a helper column -- say, col X -- that calculates your big formula. Hide the column if you don't want to confuse other spreadsheet viewers.
Then your long, difficult to debug formula becomes IF(X1=1,...X1...,0).

Sumproduct or Countif on a 2D matrix

I'm working on data from a population of people with allergies. Each person has a unique ExceptionID, and each allergen has a unique AllergenID (451 in total).
I have a data table with 2 columns (ExceptionID and AllergenID), where each person's allergies are listed row by row. This means that the ExceptionID column has repeated values for people with multiple allergies, and the AllergenID column has repeated values for the different people who have that allergy.
I am trying to count how many times each pair of allergies is present in this population (e.g. Allergen#107 & Allergen#108, Allergen#107 & Allergen#109,etc). To keep it simple I've created a matrix of 451 rows X 451 columns, representing every pair (twice actually because A/B and B/A are equivalent).
I somehow need to use the row name (allergenID) to lookup the ExceptionID in my data table, and count the cases where that matches the ExceptionIDs from the column name (also AllergenID). I have no problem using Vlookup or Index/Match, but I'm struggling with the correct combination of a lookup and Sumproduct or Countif formula.
Any help is greatly appreciated!
Mike
PS I'm using Excel 2016 if that changes anything.
-=UPDATE=-
So the methods suggested by Dirk and MacroMarc both worked, though I couldn't apply the latter to my full data set (17,000+ rows) because it was taking a long time.
I've since decided to turn this into a VBA macro because we now want to see the counts of triplets instead of pairs.
With the 2 columns you start with, it is as good as impossible... You would need to check every ExceptionID to have 2 different specific AllergenID. Better use a helper-table with ExceptionID as rows and AllergenID as columns (or the opposite... whatever you like). The helper table needs a formula like:
=COUNTIFS($A:$A,$D2,$B:$B,E$1)
Which then can be auto-filled. (The ranges are from my example, you need to change them to your needs).
With this helper-matrix you can easily go for your bigger matrix like this:
=COUNTIFS(E:E,1,INDEX($E:$G,,MATCH($I2,$E$1:$G$1,0)),1)
Again, you can auto-fill with this formula, but you need to change it, so it fits your needs.
Because the columns have the same ID2 (which would be your AllergenID), there is no need to look them up, because E:E changes automatically with the auto-fill.
Most important part of the formulas are the $ which should not be messed up, or you can not auto-fill it.
Picture of my self-made example (formulas are from the upper left cell in each table):
If you still have any questions, just ask :)
It can be done straight from your original set-up with array formulas:
Please note that array formulas MUST be entered with Ctrl-Shift-Enter, before copying across and down:
In the example pic, I have NAMED the data ranges $A$2:$A$21 as 'People' and $B$2:$B$21 as 'Allergens' to make it a nicer set-up. You can see in the formula bar how that looks as a formula. However you could use the standard references like this in your first matrix cell:
EDIT: silly me, N function is not needed to turn the booleans into 1's and 0's, since multiplying booleans will do the trick. Below formula works...
SUM(IF(MATCH($A$2:$A$21,$A$2:$A$21,0)=ROW($A$2:$A$21)-1, NOT(ISERROR(MATCH($A$2:$A$21&$E2,$A$2:$A$21&$B$2:$B$21,0)))*NOT(ISERROR(MATCH($A$2:$A$21&F$1, $A$2:$A$21&$B$2:$B$21,0))), 0))
Then copy from F2 across and down. It can be perhaps improved in technique with sumproduct or whatever, but it's just a rough example of the technique....
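Since the asker mentions moving to a macro to count triplets, here is a minimal Python sketch of the underlying counting problem. It is not a translation of either formula above: it groups allergens by person and counts every combination of a given size. The IDs and the `count_groups` name are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical (ExceptionID, AllergenID) rows, one row per person-allergen pair.
rows = [(1, 107), (1, 108), (1, 109), (2, 107), (2, 108), (3, 109)]

def count_groups(rows, size=2):
    """Count how many people have each combination of `size` allergens."""
    by_person = defaultdict(set)
    for person, allergen in rows:
        by_person[person].add(allergen)  # a set ignores duplicate rows
    counts = Counter()
    for allergens in by_person.values():
        # Sorting makes (107, 108) and (108, 107) the same key.
        counts.update(combinations(sorted(allergens), size))
    return counts

pairs = count_groups(rows, size=2)
triplets = count_groups(rows, size=3)
print(pairs)
print(triplets)
```

Because each pair is stored once with its IDs in sorted order, this gives one half of the symmetric 451 x 451 matrix rather than both halves.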

How to get not empty row count?

I find myself doing a lot of operations on tables where I am not sure about entry count.
I simply guess that it should be lower than 100 and just do =SUM(A1:A100). Now if I have only 2 entries, all the other rows are useless for other things.
How can I solve this problem? Maybe I can automatically detect continuous values without an empty row in between or something?
I am not concerned about performance. If I use 100 rows for some formula just to be safe in the future, but only 3 rows have values, I have wasted a lot of spreadsheet space and made the sheet harder to use and read.
EDIT
To explain what I mean by saying 'waste of space'.
I don't know how many name:value pairs I will have. Maybe 5, maybe 100. So in this case I have 3 entered but 5 empty columns. That means I have wasted 2 columns of space. When I want to be sure my calculations will handle a lot of values, I just use something like =SUM(A2:A100) and leave it like that, but then it's impossible to place other attributes or more values.
You can use the =COUNTA() function.
I'm still not convinced how one could 'waste spreadsheet space' and I would recommend using simply =SUM(A:A) in such a case.
If you must sum up to the very last cell in column A, then maybe this formula would suit you:
=SUM(A1:INDEX(A:A,MATCH(9^99,A:A)))
This formula skips over any blanks and sums down to the last numeric value in column A.
Another possible (and maybe simpler) formula is with SUMIF:
=SUMIF(A:A, "<>0")
Since blanks are considered as 0, they won't get summed, but as I said, I find it much simpler to just use SUM(A:A) since blanks are zeros anyway.
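For comparison, here is a minimal Python sketch of the "sum down to the last numeric value" idea, roughly what the MATCH(9^99, A:A) trick is used for. The sample column is invented, `None` stands in for a blank cell, and positions are 0-based here whereas MATCH returns 1-based positions.

```python
# Hypothetical column: numbers, blanks (None) and a stray text cell.
column = [3, None, 4, "note", 5, None, None]

def is_number(cell):
    return isinstance(cell, (int, float)) and not isinstance(cell, bool)

def last_numeric_index(cells):
    """0-based position of the last numeric cell, or None if there is none."""
    for i in range(len(cells) - 1, -1, -1):
        if is_number(cells[i]):
            return i
    return None

def sum_to_last(cells):
    """Sum the numeric cells from the top down to the last numeric cell."""
    last = last_numeric_index(cells)
    if last is None:
        return 0
    return sum(c for c in cells[: last + 1] if is_number(c))

print(last_numeric_index(column), sum_to_last(column))
```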
