Count distinct values in Excel with multiple criteria (strings) - excel

I have an excel sheet with a large table with 3 main columns (Forward, Reverse and Species).
I need to count distinct species for each pair from Forward and Reverse columns but I can't seem to make it work.
I've tried going around it but the closest thing I found was this: "=SUM(IF(("Jennifer"=$D$6:$D$27)*($B$6:$B$27<=DATE(2011, 1, 31)), 1/COUNTIFS($D$6:$D$27, "Jennifer", $E$6:$E$27, $E$6:$E$27, $B$6:$B$27, "<="&DATE(2011, 1, 31))), 0))", which comes from this website https://www.get-digital-help.com/count-unique-distinct-values-that-meet-multiple-criteria-in-excel/ . Still, it doesn't work when I apply it to my sheet.
I'm a complete newbe with excel so I'm looking for some help please.
I'll drop a print of the sheet as well.
Thanks in advance.

Well, this is the formula, you need to use to get the Sum of Distinct Species
Formula used in cell I21
=SUMPRODUCT(IF(($A$2:$A$27=$H21)*($B$2:$B$27=I$20),
1/COUNTIFS($D$2:$D$27,$D$2:$D$27,$A$2:$A$27,$A$2:$A$27,
$B$2:$B$27,$B$2:$B$27),0))
And Fill Down!

Related

How to sum the output of multiple DATEDIF results in Excel

So, I’ve created an Excel sheet for work that sums the total between two dates (on multiple different lines) with each individual lines result being in the D, M , Y format. No problem there.
Here’s a screen shot of my Excel sheet thus far.
The formula I’m using is =DATEDIF(A2,C2, “y”) &” years, “&DATEDIF(A2,C2, “ym”) &” months, “ &DATEDIF(A2,C2, “md”) &” days”
But now I need a formula to sum the results of all the lines into one aggregate total of them all. Here’s the curveball - this sheet will be used as a calculator and the list of dates will change between each use.
Any help is greatly appreciated!
Thank you!
It would probably be easiest to make three columns: one for =DATEDIF(A2,B2,"Y"), one for =DATEDIF(A2,B2,"YM"), and one for =DATEDIF(A2,B2,"MD"),
Then you can have a function in either a different column with Sum(C:C) or in the top row of that column with Sum(C2:C50) (this is assuming that you will have max 50 rows. you can set it to the bottom possible row if you want).
Here is how I did it (The sum function is a little different because I used google sheets):

Excel: Get values in non-adjacent cells based on multiple criteria

I have an excel sheet sort of like this:
I'm trying to figure out how to get the totals in cells B1 through B4.
I tried INDEX-MATCH, where I tried to match the words in A1:A4 with the words in row 7, get the numbers relative to them, and then sum them, but it was a lot of Google searching and stabbing in the dark -- every attempt returned an error.
I also tried to INDEX-MATCH the words in A1:A4 with row 7, and then nest a VLOOKUP in there where it'd get the number relative to "visits:" but that didn't work at all either.
Is INDEX-MATCH even the correct function? Any help would be much appreciated, I'm not even sure what to Google anymore.
EDIT: I need to use a search function of some kind, like the INDEX-MATCH method, rather that static formulas because the sheet will change periodically and I don't want to have to update the formula every time I add an animal.
Your data table is unusual in structure.
However, if you are gong to keep a fixed rule such that the number of visits is always offset 2 rows and 1 column from the animal type(and that itself is always in row 7), you could do:
In B1:
=SUM(IF($A$7:$AAA$7=$A1, $B$9:$AAB$9, 0))
Confirm with Ctrl-Shift-Enter, and then copy down..
DOes this work?
=SUM(IF($B$7=A1,$C$9,0),IF($D$7=A1,$E$9,0),IF($F$7=A1,$G$9,0),IF($H$7=A1,$I$9,0))
I'm not sureto have fully grasped your challenge. Yet it seems the following solution would work:
Add the following formula in each box where the number of visits is added as
=+SUMIF($A$1:$A$end;animal;$B$1:$B$end)
Where end is a number of the last cell in the first and second columns data contain the data.
And animal is the cell that contains the name of the animal.
Therefore in your simple example, the formulas on cells C9;E9;G9 and I9 would be respectively:
=+SUMIF($A$1:$A$4;B7;$B$1:$B$4) ; =+SUMIF($A$1:$A$4;D7;$B$1:$B$4); =+SUMIF($A$1:$A$4;F7;$B$1:$B$4) and =+SUMIF($A$1:$A$4;H7;$B$1:$B$4).

Taking average of certain values in one Excel column based on values in another

I have a (large) array of data in Excel of which I need to compute the average value of certain values in one column, based on the values of another column. For example, here's a snippet of my data:
So specifically, I want to take the average of the F635 mean values corresponding with Row values of 1. To take it a step further, I want this to continue to Row values of 2, Row values of 3 etc.
I'm not familiar with how to run code in Excel but have attempted to solve this by using the following:
=IF($C = "1", AVERAGE($D:$D), "")
which (to my understanding) can be interpreted as "if the values (anywhere) in column C are equal to 1, then take the average of the corresponding values in column D."
Of course, as I try this I get a formula error from Excel.
Any guidance would be incredibly appreciated. Thanks in advance.
For more complicated cases, I would use an array-formula. This one is simple enough for the AVERAGEIF formula. For instance =AVERAGEIF(A1:A23;1;B1:B23)
Array-formula allows for more elaborate ifs. To replicate the above, you could do =SUM(IF($A$1:$A$23=1;$B$1:$B$23;0))/COUNT(IF($A$1:$A$23=1;$B$1:$B$23;0)).
Looks like more work but you can create extremely elaborate if-statements. Instead of hitting ENTER, do CTRL-ENTER when entering the formula. Use * between criteria to replicate AND or + for OR. Example: SUM(IF(($A$1:$A$23="apple")*($B$1:$B$23="green");$C$1:$C$23;0)) tallies values for green apples in c1:c23.
Your sample data includes three columns with potential ifs so my guess is that you're going to need array formulas at some point.
Excel already has a builtin function for exactly this use; AVERAGEIF().
=AVERAGEIF(C:C,1,D:D)

Sumproduct or Countif on a 2D matrix

I'm working on data from a population of people with allergies. Each person has a unique ExceptionID, and each allergen has a unique AllergenID (451 in total).
I have a data table with 2 columns (ExceptionID and AllergenID), where each person's allergies are listed row by row. This means that the ExceptionID column has repeated values for people with multiple allergies, and the AllergenID column has repeated values for the different people who have that allergy.
I am trying to count how many times each pair of allergies is present in this population (e.g. Allergen#107 & Allergen#108, Allergen#107 & Allergen#109,etc). To keep it simple I've created a matrix of 451 rows X 451 columns, representing every pair (twice actually because A/B and B/A are equivalent).
I somehow need to use the row name (allergenID) to lookup the ExceptionID in my data table, and count the cases where that matches the ExceptionIDs from the column name (also AllergenID). I have no problem using Vlookup or Index/Match, but I'm struggling with the correct combination of a lookup and Sumproduct or Countif formula.
Any help is greatly appreciated!
Mike
PS I'm using Excel 2016 if that changes anything.
-=UPDATE=-
So the methods suggested by Dirk and MacroMarc both worked, though I couldn't apply the latter to my full data set (17,000+ rows) because it was taking a long time.
I've since decided to turn this into a VBA macro because we now want to see the counts of triplets instead of pairs.
With the 2 columns you start with, it is as good as impossible... You would need to check every ExceptionID to have 2 different specific AllergenID. Better use a helper-table with ExceptionID as rows and AllergenID as columns (or the opposite... whatever you like). The helper table needs a formula like:
=COUNTIFS($A:$A,$D2,$B:$B,E$1)
Which then can be auto-filled. (The ranges are from my example, you need to change them to your needs).
With this helper-matrix you can easily go for your bigger matrix like this:
=COUNTIFS(E:E,1,INDEX($E:$G,,MATCH($I2,$E$1:$G$1,0)),1)
Again, you can auto-fill with this formula, but you need to change it, so it fits your needs.
Because the columns have the same ID2 (would be your AllergenID), there is no need to lookup them because E:E changes automatically with the auto-fill.
Most important part of the formulas are the $ which should not be messed up, or you can not auto-fill it.
Picture of my self-made example (formulas are from the upper left cell in each table):
If you still have any questions, just ask :)
It can be done straight from your original set-up with array formulas:
Please note that array formulas MUST be entered with Ctrl-Shift-Enter, before copying across and down:
In the example pic, I have NAMED the data ranges $A$2:$A$21 as 'People' and $B$2:$B$21 as 'Allergens' to make it a nicer set-up. You can see in the formula bar how that looks as a formula. However you could use the standard references like this in your first matrix cell:
EDIT: silly me, N function is not needed to turn the booleans into 1's and 0's, since multiplying booleans will do the trick. Below formula works...
SUM(IF(MATCH($A$2:$A$21,$A$2:$A$21,0)=ROW($A$2:$A$21)-1, NOT(ISERROR(MATCH($A$2:$A$21&$E2,$A$2:$A$21&$B$2:$B$21,0)))*NOT(ISERROR(MATCH($A$2:$A$21&F$1, $A$2:$A$21&$B$2:$B$21,0))), 0))
Then copy from F2 across and down. It can be perhaps improved in technique with sumproduct or whatever, but it's just a rough example of the technique....

Combining INDEX/MATCH with COUNTIF

I'm working on a sensitive Excel file where I'm not allowed any VBA code, so I have to help myself with basic formulas in Excel. Please note that I am not allowed to add extra columns and the solution has to be in one single cell. What I have are two columns:
Column A has numbers from 1-10.
Column B has numbers from 1-10.
I'd like to know e.g. for all 3's in A, how many 5's are in B.
The result would be as well seen with the use of two filters, but I don't want to do this over and over again, since the size of the columns will only get bigger.
I tried to use INDEX-MATCH command in the COUNTIF, but it's not that simple. The main problem is to define how to look and search in each row and then count/sum/whatever.
Does anyone have any ideas?
Though a function not available in versions of Excel before 2007, it seems:
=COUNTIFS(A:A,3,B:B,5)
met the requirement here.

Resources