Count how many values have duplicates in a column - excel

I have values in a column like:
08FHI800
08FHI800
08FHI800
07FJM933
07FJM933
89MNA900
I need a formula that tells me how many items in the column have corresponding duplicates. In this case, it would be 2.

Try this formula, assuming the data is in A2:A100:
=SUMPRODUCT((A2:A100<>"")/COUNTIF(A2:A100,A2:A100&"")-(COUNTIF(A2:A100,A2:A100&"")=1))
It ignores blank cells.
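For readers who want to check the expected result outside Excel, here is a minimal Python sketch of the same count. The function name `count_duplicated_values` and the trailing blank cell in the sample are assumptions made for illustration; they are not part of the original question.

```python
from collections import Counter

def count_duplicated_values(cells):
    """Count distinct non-blank values that occur more than once."""
    counts = Counter(c for c in cells if c != "")  # skip blank cells
    return sum(1 for n in counts.values() if n > 1)

# Sample column from the question, plus one blank cell (assumed) to show it is skipped.
column = ["08FHI800", "08FHI800", "08FHI800", "07FJM933", "07FJM933", "89MNA900", ""]
print(count_duplicated_values(column))  # 2
```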

=SUMPRODUCT((A1:A12<>"")/COUNTIF(A1:A12,A1:A12&"")-(COUNTIF(A1:A12,A1:A12&"")=1))
Using this data as an example
Row# ColA
1 1
2 2
3 2
4 2
5 3
6 4
7 5
8 5
9 6
10 6
11 6
12 7
Break the function apart into 3 components:
(A1:A12<>"")
COUNTIF(A1:A12,A1:A12&"")
COUNTIF(A1:A12,A1:A12&"")=1
Component 1
(A1:A12<>"") evaluates to an array containing {T, T, T, T, T, T, T, T, T, T, T, T} ---------(1)
Component 2
COUNTIF(A1:A12,A1:A12&"")
evaluates to
COUNTIF({1,2,2,2,3,4,5,5,6,6,6,7},{1,2,2,2,3,4,5,5,6,6,6,7})
and it counts the number of times each value appears in the range
This in turn evaluates to:
{1,3,3,3,1,1,2,2,3,3,3,1} -----------------(2)
(the &"" makes a blank cell count as a match for the other blanks, which avoids a #DIV/0! error)
Now, because division is evaluated before subtraction, we need to evaluate (Component 1 / Component 2) first before looking at Component 3.
Component 1/Component 2 is
(A1:A12<>"")/COUNTIF(A1:A12,A1:A12&"")
So from (1) and (2),
{T,T,T,T,T,T,T,T,T,T,T,T}/{1,3,3,3,1,1,2,2,3,3,3,1}
which evaluates to:
{1,0.3333,0.3333,0.3333,1,1,0.5,0.5,0.333,0.333,0.333,1}
Now we can look at
Component 3
COUNTIF(A1:A12,A1:A12&"")=1
We already have the first bit of this:
COUNTIF(A1:A12,A1:A12&"")
from (2), which evaluates to
{1,3,3,3,1,1,2,2,3,3,3,1}
Combining this with =1 becomes
COUNTIF(A1:A12,A1:A12&"")=1
Which in turn evaluates to
{T,F,F,F,T,T,F,F,F,F,F,T}
So, finally combining this all together, we have
SUMPRODUCT({1,0.3333,0.3333,0.3333,1,1,0.5,0.5,0.333,0.333,0.333,1} - {T,F,F,F,T,T,F,F,F,F,F,T})
Now, T equates to 1 and F equates to 0 so this now becomes:
SUMPRODUCT({1-1,0.333-0,0.333-0,0.333-0,1-1,1-1,0.5-0,0.5-0,0.333-0,0.333-0,0.333-0,1-1})
Becomes
SUMPRODUCT({0,0.333,0.333,0.333,0,0,0.5,0.5,0.333,0.333,0.333,0})
As there is only one array, SUMPRODUCT simply sums the elements
(3 x 0.333) + (2 x 0.5) + (3 x 0.333) = 1+1+1 = 3
There are 3 values which are duplicated (2, 5 and 6).
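The walkthrough above can be replayed element by element in Python. This is only a sketch of the arithmetic, not of Excel itself: the function name `sumproduct_duplicates` is made up for illustration, and exact fractions are used instead of floating point so the thirds add up to whole numbers.

```python
from fractions import Fraction

def sumproduct_duplicates(cells):
    """Mirror SUMPRODUCT((rng<>"")/COUNTIF(rng,rng&"") - (COUNTIF(rng,rng&"")=1))."""
    total = Fraction(0)
    for c in cells:
        n = sum(1 for other in cells if other == c)  # COUNTIF for this cell (always >= 1)
        non_blank = 1 if c != "" else 0              # Component 1
        is_single = 1 if n == 1 else 0               # Component 3
        total += Fraction(non_blank, n) - is_single  # Component 1 / Component 2 - Component 3
    return total

col_a = [1, 2, 2, 2, 3, 4, 5, 5, 6, 6, 6, 7]
print(sumproduct_duplicates(col_a))  # 3
```

Each duplicated value contributes n terms of 1/n, which sum to 1, while each value that appears once contributes 1 - 1 = 0.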

Related

Finding Closest Available Non-0 or NA value

I have an excel dataset that looks something like this:
Variable 1.2018 2.2018 3.2018 ...
A 4 5 8 ...
B 4 5 n.a ...
C 4 0 5 ...
D 4 n.a 9 ...
On a separate sheet I have a summary table that extracts numbers from this dataset using an index match function.
However, I would like my function to skip 0 or n.a values. For example, if I want to compare growth between A and B at 3.2018, variable B contains n.a, so the comparison isn't very useful. In this case I would rather compare A and B at 2.2018 instead.
Variable 3.2017 3.2018 Growth
A 5 8 60%
B 5 n.a #VALUE
Variable 2.2017 2.2018 Growth
A 3 5 66%
B 4 5 25%
In the other case, say I were comparing between C and D. If I were to compare them at 3.2018, I would have no problems because they do not contain 0 or n.a values. However if I were to compare them at 2.2018, then I would want the formula to take the values from 1.2018 instead.
In the above cases, I would also like to know when it is the case that the values do not come from the 'ideal' time frame.
I tried to do an "if" before the index match but in the case of the first example it will only change the number of B and not A. It also does not work if I have 2 or more 0's or na's in a row.
Use an IF() function that checks whether either of your 3.2018 values is 0 or #N/A (assuming these are actual Excel #N/A errors, and not a string representation like "n.a."). If either is, use the 2.2018 values; otherwise use the 3.2018 values:
=IF( OR(IFNA(D3=0, TRUE), IFNA(D2=0, TRUE)), C2=C3, D2=D3)
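The question also asks about several unusable periods in a row and about flagging when the ideal period was not used. One way to reason about that is a backwards search. The Python sketch below is an illustration only: the names `is_usable` and `latest_usable_period` are hypothetical, and n.a is represented as `None`.

```python
def is_usable(value):
    """A value is usable when it is neither n.a (None here) nor 0."""
    return value is not None and value != 0

def latest_usable_period(row_a, row_b, ideal):
    """Walk back from the ideal period index until both rows are usable.

    Returns (index, used_ideal), or None when no period qualifies.
    """
    for i in range(ideal, -1, -1):
        if is_usable(row_a[i]) and is_usable(row_b[i]):
            return i, i == ideal
    return None

periods = ["1.2018", "2.2018", "3.2018"]
data = {"A": [4, 5, 8], "B": [4, 5, None], "C": [4, 0, 5], "D": [4, None, 9]}

print(latest_usable_period(data["A"], data["B"], 2))  # (1, False): falls back to 2.2018
print(latest_usable_period(data["C"], data["D"], 1))  # (0, False): falls back to 1.2018
print(latest_usable_period(data["C"], data["D"], 2))  # (2, True): ideal period is fine
```

The second element of the result tells you whether the values came from the ideal time frame.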

Subtract cells when the condition matches in a Row

How do I subtract values across multiple rows when a condition matches?
If a match is found, return the subtracted value; if not, return the current value.
I tried the formula below, but it cannot handle multiple matches:
=IF(ROW(A3)=2,0,D3-D2)
Date Type Content Value Answer
1-Oct-18 Type 1 Content 1 7 7
1-Oct-18 Type 1 Content 1 7 0
1-Oct-18 Type 1 Content 1 9 2
2-Oct-18 Type 2 Content 1 8 8
2-Oct-18 Type 2 Content 2 10 10
2-Oct-18 Type 2 Content 2 3 -7
Put this in E2 and copy down:
=D2-SUMIFS($E$1:E1,$B$1:B1,B2,$C$1:C1,C2)
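To see why the SUMIFS formula reproduces the Answer column, here is a rough Python equivalent. The function name `running_difference` is an assumption for illustration. Each answer is the current value minus the sum of all earlier answers with the same Type and Content.

```python
def running_difference(rows):
    """rows: list of (type, content, value) tuples; returns the Answer column."""
    answers = []
    for i, (typ, content, value) in enumerate(rows):
        # SUMIFS over the answers already computed above this row
        prior = sum(
            answers[j]
            for j in range(i)
            if rows[j][0] == typ and rows[j][1] == content
        )
        answers.append(value - prior)
    return answers

rows = [
    ("Type 1", "Content 1", 7),
    ("Type 1", "Content 1", 7),
    ("Type 1", "Content 1", 9),
    ("Type 2", "Content 1", 8),
    ("Type 2", "Content 2", 10),
    ("Type 2", "Content 2", 3),
]
print(running_difference(rows))  # [7, 0, 2, 8, 10, -7]
```

Because the earlier answers in a group always sum to the previous value in that group, the result is the difference from the last matching row, or the value itself when there is no earlier match.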
You do not need to check the row here as your current formula does. ROW(A3) always returns 3, so your test statement reduces to 3 = 2, which is always FALSE.
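The equation you are looking for is closer to =IF(AND(B2=B1,C2=C1), D2-D1, D2): subtract the value in the row above when Type and Content match it, otherwise return the current value. This assumes matching rows are adjacent.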

What does this formula mean in excel? (A cell equals a range)

I see the following formula in an Excel spreadsheet and cannot understand it. Can anyone explain what the test condition "N5=N4:N741" means?
=MIN(IF(N5=N4:N741,K4:K741))
I made some experiments and still cannot get a clue...
I'm assuming this is an array formula.
It takes the minimum of the values in K4:K741 for the rows where the value in N4:N741 equals the value in N5.
Let's look at a smaller example. K4:N9 is shown below.
      K   L   M   N
    ----------------
4 |   4           2
5 |   8           7
6 |   3           4
7 |   2           1
8 |   7           9
9 |   1           7
The expression N5=N4:N9 is true in row 5 and row 9, since both of those match N5 (value = 7), giving the array {False,True,False,False,False,True}. Thus IF(N5=N4:N9,K4:K9) returns {False,8,False,False,False,1}, since each True is replaced by the corresponding value in column K. The MIN() function then ignores the False entries and returns the minimum of the remaining values from column K (the value 1, since 1 < 8).
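The same conditional minimum can be sketched in Python using the small example above. The function name `min_if` is hypothetical. Returning 0 when nothing matches is meant to mirror MIN over an array with no numbers, and is stated here as an assumption about how you would want the edge case handled.

```python
def min_if(criteria_range, target, value_range):
    """Minimum of value_range entries whose criteria_range entry equals target."""
    matches = [v for c, v in zip(criteria_range, value_range) if c == target]
    return min(matches) if matches else 0  # no matches: 0 (assumed edge-case behaviour)

col_k = [4, 8, 3, 2, 7, 1]   # K4:K9
col_n = [2, 7, 4, 1, 9, 7]   # N4:N9
n5 = col_n[1]                # N5 is the second cell of N4:N9

print(min_if(col_n, n5, col_k))  # 1
```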
I believe it returns an array of true and false values. I also believe the true shows up for the 3 because it is the third item in the array, but that is a guess on my part.
{false, false, true, false,false}
If you change your 5 in E1 to a 1, it will return a true.
Research all of many things excel

CountifS + multiple criteria + distinct count

I'm looking for a formula that calculates a distinct count with multiple criteria.
COUNTIFS() handles the criteria but does not do a distinct count.
Here is an example.
I have a table in which I want to count the number of distinct items (column Item) satisfying multiple conditions on columns A and B: A>2 and B<5.
Line Item ColA ColB
1 QQQ 3 4
2 QQQ 3 3
3 QQQ 5 4
4 TTT 4 4
5 TTT 2 3
6 TTT 0 1
7 XXX 1 2
8 XXX 5 3
9 zzz 1 9
COUNTIFS works this way: COUNTIFS([ColumnA], criteria A, [ColumnB], criteria B)
COUNTIFS([ColumnA], > 2, [ColumnB], < 5)
Returns: lines 1,2,3,4,8 => Count = 5
How can I add a distinct count based on the Item column?
Lines 1,2,3 share the item QQQ
Line 4 is on the item TTT
Line 8 is on the item XXX
Returns Count = 3
How can I count 3?
Thanks
Newer versions of Excel allow this problem to be solved in a (relatively) simpler way. It is certainly easier to follow and understand conceptually.
First, filter the table on multiple criteria (join the conditions with *, which acts as AND):
=FILTER(Table,(Table[Column A]>2)*(Table[Column B]<5))
Then, grab the "Item" column with INDEX:
=INDEX(FILTER(Table,(Table[Column A]>2)*(Table[Column B]<5)),,2)
Next, filter for unique entries:
=UNIQUE(INDEX(FILTER(Table,(Table[Column A]>2)*(Table[Column B]<5)),,2))
Finally, perform a count:
=COUNTA(UNIQUE(INDEX(FILTER(Table,(Table[Column A]>2)*(Table[Column B]<5)),,2)))
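The filter, pick column, unique, count pipeline maps directly onto a set comprehension. The Python sketch below uses the table from the question; the function name `distinct_items` is an assumption for illustration.

```python
# (Line, Item, ColA, ColB) rows from the question
rows = [
    (1, "QQQ", 3, 4), (2, "QQQ", 3, 3), (3, "QQQ", 5, 4),
    (4, "TTT", 4, 4), (5, "TTT", 2, 3), (6, "TTT", 0, 1),
    (7, "XXX", 1, 2), (8, "XXX", 5, 3), (9, "zzz", 1, 9),
]

def distinct_items(table):
    """Count distinct items among rows where ColA > 2 and ColB < 5."""
    # FILTER keeps matching rows, INDEX picks Item, UNIQUE becomes the set, COUNTA becomes len
    return len({item for _, item, a, b in table if a > 2 and b < 5})

print(distinct_items(rows))  # 3
```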
Ugly formula, but it works.
=SUM(((FREQUENCY(IF(C2:C10>2,1,0)*IF(D2:D10<5,1,0)*(COUNTIF(B2:B10,">"&B2:B10)+1),ROW(B2:B10)-ROW(B2)))*(ROW(B2:B11)-ROW(B2))>0)*1)
I'll start with the criteria IFS:
IF(C2:C10>2,1,0)*IF(D2:D10<5,1,0)
Gives an array of 1s and 0s for the rows that satisfy both criteria. ARRAY = {1;1;1;1;0;0;0;1;0} for your example.
Where B2:B10 is the Item column, the countif formula:
COUNTIF(B2:B10,">"&B2:B10)
returns {6;6;6;3;3;3;1;1;0} where each number is the count of item values in B2:B10 that are alphabetically greater than the tested item value.
QQQ goes to 6 [3"TTT", 2"XXX", 1"zzz"]
TTT goes to 3 [2"XXX", 1"zzz"]
XXX goes to 1 [1"zzz"]
zzz goes to 0 [0 greater than "zzz"]
Need to add 1 to this array to make sure there are no 0 values:
{7;7;7;4;4;4;2;2;1}.
So when multiplying the criteria and the countif statement:
IF(C2:C10>2,1,0)*IF(D2:D10<5,1,0)*(COUNTIF(B2:B10,">"&B2:B10)+1)
You get ARRAY = {7;7;7;4;0;0;0;2;0}.
FREQUENCY(ARRAY,ROW(B2:B10)-ROW(B2))
ROW(B2:B10)-ROW(B2) sets the frequency bins to {0;1;2;3;4;5;6;7;8}. So the output of the frequency formula is {4;0;1;0;1;0;0;3;0;0} where the last 0 is for all values greater than 8.
((ROW(B2:B11)-ROW(B2)>0)*1) equals {0;1;1;1;1;1;1;1;1;1}. Multiplying ARRAY by this removes the 0 count at the start: ARRAY = {0;0;1;0;1;0;0;3;0;0}. [NOTE: B11 is lowest item column cell+1 because of the added array value from the frequency formula for values over 8]
((ARRAY)>0)*1 = {0;0;1;0;1;0;0;1;0;0}
SUM this = 3.
Enter it with Ctrl + Shift + Enter, because it's an array formula.
Use Cmd + Shift + Enter on a Mac.
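The intermediate arrays in this answer can be reproduced with a short Python sketch. The function name `frequency_distinct_count` is hypothetical, and Python's case-sensitive string comparison is assumed to be an acceptable stand-in for COUNTIF's comparison on this particular data.

```python
def frequency_distinct_count(items, col_a, col_b):
    """Replay the criteria * rank -> FREQUENCY -> count-nonzero steps."""
    criteria = [1 if a > 2 and b < 5 else 0 for a, b in zip(col_a, col_b)]
    # COUNTIF(items, ">" & item) + 1: how many items sort after this one, plus 1
    rank = [sum(1 for other in items if other > it) + 1 for it in items]
    keyed = [c * r for c, r in zip(criteria, rank)]
    top = len(items) - 1                      # bins are 0..top, as in ROW(B2:B10)-ROW(B2)
    freq = [keyed.count(b) for b in range(top + 1)]
    freq.append(sum(1 for k in keyed if k > top))  # overflow bin for values above the last bin
    # Skip bin 0 (rows that failed the criteria), then count the bins that are in use
    distinct = sum(1 for b, f in enumerate(freq) if b > 0 and f > 0)
    return keyed, freq, distinct

items = ["QQQ", "QQQ", "QQQ", "TTT", "TTT", "TTT", "XXX", "XXX", "zzz"]
col_a = [3, 3, 5, 4, 2, 0, 1, 5, 1]
col_b = [4, 3, 4, 4, 3, 1, 2, 3, 9]

keyed, freq, distinct = frequency_distinct_count(items, col_a, col_b)
print(keyed)     # [7, 7, 7, 4, 0, 0, 0, 2, 0]
print(freq)      # [4, 0, 1, 0, 1, 0, 0, 3, 0, 0]
print(distinct)  # 3
```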
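You could try this, which counts the distinct values in B2:B10 (note that it does not apply the ColA and ColB criteria):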
=SUMPRODUCT(1/COUNTIF(B2:B10,B2:B10))
Credit where credit due, however ... I found it over here:
https://exceljet.net/formula/count-unique-values-in-a-range-with-countif

Spreadsheets: how do I SUM values in a column, only if the text column has a 1?

Let's say I have this data
4 1
4 0
4 1
3 0
5 1
How do I write a function (using SUM or something like that) to add all the values on the left if the values on the right are 1, or true?
The total should be 13
Assuming columns A and B are used...
Check just for 1:
=SUMIF(B1:B5,"=1",A1:A5)
If you want to check for TRUE also:
=SUMIF(B1:B5,"=1",A1:A5) + SUMIF(B1:B5,"=TRUE",A1:A5)
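As a quick check of the expected total, here is the same conditional sum as a Python sketch. The function name `sum_if_flag` is an assumption for illustration. In Python, `True == 1`, so a single comparison covers both the 1 and TRUE cases.

```python
def sum_if_flag(values, flags):
    """Sum the values whose flag is 1 (or True, since True == 1 in Python)."""
    return sum(v for v, f in zip(values, flags) if f == 1)

values = [4, 4, 4, 3, 5]   # column A
flags = [1, 0, 1, 0, 1]    # column B

print(sum_if_flag(values, flags))  # 13
```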
Sort by column B, then auto-sum the values you want in column A. A cheap and lazy solution.
