Check the number of unique cells in a range - excel

I have an excel sheet.
Under column E, I have 425 cells with data. I want to check if the same data (i.e. text inside the cell) is repeated anywhere else in any of the remaining 424 cells under column E. How do I do this?
For example, in E54 I have
Hello Jack
How would I check this value to see if it was in any other of these cells?

You could use
=SUMPRODUCT(1/COUNTIF(E1:E425,E1:E425))
to count the number of unique values in E1:E425.
An answer of 425 means all the values are unique.
An answer of 421 means 4 values are duplicates of other value(s)
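As a rough illustration of why the SUMPRODUCT(1/COUNTIF(...)) trick works, here is a minimal Python sketch. The function name and sample data are hypothetical; it assumes the cells have already been read into a list. Each value contributes 1 divided by the number of times it occurs, so every group of identical values adds up to exactly 1. Note that COUNTIF ignores case while this sketch does not.

```python
from collections import Counter

def count_distinct(values):
    """Mirror SUMPRODUCT(1/COUNTIF(rng, rng)): each cell adds 1/(its count)."""
    counts = Counter(values)  # plays the role of COUNTIF(rng, rng)
    # A value occurring n times contributes n * (1/n) = 1 in total.
    return round(sum(1 / counts[v] for v in values))

cells = ["Hello Jack", "Hello Jill", "Hello Jack", "Bye"]
print(count_distinct(cells))  # 3
```

If the result equals the number of cells, every value is unique.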

Use Conditional Formatting on all the cells that will highlight based on this formula:
=COUNTIF(E:E,E1)<>1
This is based on the column being E, and starting on E1, modify otherwise.
In Excel 2010 it's even easier, just go into Conditional Formatting and choose
Format only unique or duplicate values

If you have to compensate for blank cells, take the formula supplied above by @brettdj and:
Adjust the numerator of the unique count to check for non-blanks.
Append a zero-length string to the COUNTIF's criteria argument.
=SUMPRODUCT((E1:E425<>"")/COUNTIF(E1:E425,E1:E425&""))
Checking for non-blank cells in the numerator means that any blank cell will return a zero. Any fraction with a zero in its numerator will be zero no matter what the denominator is. The empty string appended to the criteria portion of the COUNTIF is sufficient to avoid #DIV/0! errors.
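The same idea can be sketched in Python, assuming blank cells are represented as empty strings (the function name and sample data are made up for illustration). The boolean in the numerator behaves like (E1:E425<>""): blanks contribute 0, everything else contributes 1 divided by its count.

```python
from collections import Counter

def count_distinct_nonblank(values):
    """Mirror SUMPRODUCT((rng<>"")/COUNTIF(rng, rng&"")) for a list of cells."""
    counts = Counter(values)  # blanks are counted too, so no division by zero
    # (v != "") is 0 for blanks, so they add nothing to the total.
    return round(sum((v != "") / counts[v] for v in values))

cells = ["a", "", "b", "a", ""]
print(count_distinct_nonblank(cells))  # 2
```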
More information at Count Unique with SUMPRODUCT() Breakdown.

This formula outputs "unique" or "duplicates" depending if the column values are all unique or not:
{=IF(
SUM(IF(ISBLANK(E1:E425),0,ROW(E1:E425)))
=
SUM(IF(ISBLANK(E1:E425),0,MATCH(E1:E425,E1:E425,0)))
,"unique","duplicates")}
This is an array formula. You don't type the enclosing {} yourself. Instead, enter the formula without the {} and press cmd-enter on a Mac (Ctrl+Shift+Enter on Windows). If you want to split your formula over multiple lines for readability, use cmd-ctrl-return on a Mac.
The formula works by comparing two SUM() results. If they are equal, all the nonblank entries (numeric or text) are unique. If they are not equal there are some duplicates. The formula does not tell you where the duplicates are.
The first sum is what you get by adding up the row numbers of every non-blank entry.
The second sum does a lookup of each nonblank entry using MATCH(). If all entries are unique, MATCH() finds each entry at its own position, and the result is the same as the first sum. But if there are duplicate entries then a later duplicate will match an earlier duplicate and the later duplicate will contribute a different value to the sum, and the sums won't match.
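To make the comparison of the two sums concrete, here is a small Python sketch under the same assumptions as the formula (the range starts at row 1, blanks are empty strings). The names are hypothetical; list.index() stands in for MATCH(...,0) because both return the first position at which a value occurs.

```python
def all_unique(values):
    """Compare the sum of row numbers with the sum of first-match positions."""
    nonblank = [(row, v) for row, v in enumerate(values, start=1) if v != ""]
    row_sum = sum(row for row, _ in nonblank)
    # index() returns the first occurrence, like MATCH(value, range, 0).
    match_sum = sum(values.index(v) + 1 for _, v in nonblank)
    return "unique" if row_sum == match_sum else "duplicates"

print(all_unique(["a", "", "b"]))   # unique
print(all_unique(["a", "b", "a"]))  # duplicates
```

In the second call the last "a" sits in row 3 but matches row 1, so the sums are 6 and 4.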
You might have to adjust this formula:
if you want cells containing "" to count as blank, then use LEN(...)=0 in place of ISBLANK(...). I suppose you could put other tests in there if you wanted, but I have not tried that.
if you want to test an array not starting at row 1, then you should subtract a constant from ROW(...).
if you have a huge column of cells, you might get integer overflow when computing this sum. I don't have a solution to that.
It's a shame that Excel does not have an ISUNIQUE() function!

This may be a simpler solution. Assume column A contains the data in question. Sort on that column. Then, starting in B2 (or the first non-blank cell), use the following formula:
=IF(A2=A1,1,0)
Then sum that column. When the sum is 0, all values are unique.
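The sort-then-compare approach can be sketched in Python like this (hypothetical function name, sample data invented for illustration). After sorting, equal values are adjacent, so comparing each entry with the one above it is enough.

```python
def duplicate_flags(values):
    """Sort, then flag each entry that equals the entry above it (helper column B)."""
    ordered = sorted(values)
    # Equivalent of =IF(A2=A1,1,0) filled down from the second row.
    return [1 if cur == prev else 0 for prev, cur in zip(ordered, ordered[1:])]

flags = duplicate_flags(["b", "a", "b", "c"])
print(flags)       # [0, 1, 0]
print(sum(flags))  # 1, so there is at least one duplicate
```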

Highlight column E and, on the Home tab, select Conditional Formatting > Highlight Cells Rules > Duplicate Values...
It will then highlight everything that is repeated.

Related

Countifs does not work when a range with multiple column is selected

I need a formula that counts the cells meeting certain criteria. This should be done with COUNTIFS. The formula is the following:
=COUNTIFS(Orders!D:D;"*Ecolab*";Orders!B:B;">=01/01/2019";Orders!U:U;">=36";Orders!K:Q;">=1")
This formula returns a #VALUE! error.
The formula works well until I introduce the last condition, Orders!K:Q;">=1".
I would like a formula that counts a row if the word Ecolab is present in the cell, the date is on or after 01/01/2019, column U holds a number greater than or equal to 36, and there is at least a "1" in the row's cells from column K to column Q. I could do this by replicating the COUNTIFS several times, i.e. =COUNTIFS(Orders!D:D;"*Ecolab*";Orders!B:B;">=01/01/2019";Orders!U:U;">=36";Orders!K:K;">=1")+COUNTIFS(Orders!D:D;"*Ecolab*";Orders!B:B;">=01/01/2019";Orders!U:U;">=36";Orders!L:L;">=1")+...........+COUNTIFS(Orders!D:D;"*Ecolab*";Orders!B:B;">=01/01/2019";Orders!U:U;">=36";Orders!Q:Q;">=1")
But I would rather not include such a long formula, as it would confuse the end user of the Excel sheet.
Per my comment above, you could use SUMPRODUCT (avoid using whole columns for that) or an array with OFFSET like this:
=SUM(COUNTIFS(Orders!D:D;"*Ecolab*";Orders!B:B;">=01/01/2019";Orders!U:U;">=36";OFFSET(Orders!J:J;0;{1;2;3;4;5;6;7});">=1"))
If the count for K:Q should be 1 when there may be more than one cell greater or equal to 1 in a single row then you need to apply OR criteria in a SUMPRODUCT.
SUMPRODUCT formulas should not use full column references; there is too much wasted calculation. The following is for rows 2:99; adjust for your own use.
=SUMPRODUCT(--ISNUMBER(SEARCH("ecolab", Orders!D2:D99)),
--(Orders!B2:B99>=DATE(2019, 1, 1)),
--(Orders!U2:U99>=36),
SIGN((Orders!K2:K99>=1)+(Orders!L2:L99>=1)+(Orders!M2:M99>=1)+(Orders!N2:N99>=1)+(Orders!O2:O99>=1)+(Orders!P2:P99>=1)+(Orders!Q2:Q99>=1)))
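For readers who find the SIGN(...) term hard to follow, here is a hedged Python sketch of the same counting rule. The field names ("customer", "date", "u", "k_to_q") and the sample rows are hypothetical stand-ins for columns D, B, U and K:Q. The point is that any() acts like the OR criteria: a row is counted once no matter how many of its K:Q cells are at least 1.

```python
from datetime import date

def count_matching(rows):
    """Count rows meeting all criteria, with an OR test across the K:Q cells."""
    total = 0
    for r in rows:
        any_kq = any(v >= 1 for v in r["k_to_q"])  # like SIGN of the summed tests
        if ("ecolab" in r["customer"].lower()      # like ISNUMBER(SEARCH(...))
                and r["date"] >= date(2019, 1, 1)
                and r["u"] >= 36
                and any_kq):
            total += 1
    return total

orders = [
    {"customer": "Ecolab GmbH", "date": date(2019, 3, 1), "u": 40, "k_to_q": [1, 0, 2, 0, 0, 0, 0]},
    {"customer": "Ecolab", "date": date(2018, 12, 31), "u": 40, "k_to_q": [1] * 7},
    {"customer": "ECOLAB Inc", "date": date(2019, 1, 1), "u": 36, "k_to_q": [0] * 7},
]
print(count_matching(orders))  # 1
```

Adding up one COUNTIFS per column would instead count the first row twice, because two of its K:Q cells qualify.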

In Excel, how do I get the header corresponding to the max value from a subset of a range?

I'm pushing beyond my Excel knowledge here. I'm trying to do a poll like thing in Excel. My problem lies on showing the selected result. Here's what I have so far:
I need to select the header corresponding to the cell with the highest value in the range B2:G2 (type 1). However, if there's a tie, I need to select the header corresponding to the highest value in the range B3:G3 amongst the cells with highest values in the range B2:G2.
In my sample, column "bb" and "cc" both share highest value on type 1 (5). So, in order to determine the winner, I need to compare the highest value for type 2 between them. Since "bb" is 0 and "cc" is 1, I expect "cc" as final result.
Components for formula are below:
J2: Displays the count of cells on line 2 with the highest value in the range. So, 2. I did that with COUNTIF comparing with MAX.
K2: Displays the first header it finds with the highest value on line 2. I managed with the following formula:
=INDEX($B$1:$G$1;0;MATCH(MAX($B$2:$G$2);$B$2:$G$2;0))
To be honest, I don't fully understand that formula. Did it with help of tutorials from the internet.
I2: Displays "TIE" when there's a tie on range B2:G2. Otherwise display the winning header (K2).
J3: Displays the number of cells with the maximum value on range B3:G3 but only considering winning cells from line 2. I did that with COUNTIFS.
=COUNTIFS(B3:G3;LARGE(B3:G3;1);B2:G2;MAX(B2:G2))
Edit: Just found out by entering number "4" on B3 that this formula above is also not working...
I3: Should follow the same pattern as the cell above. Displays "TIE" when there's still a TIE. Otherwise would display winning header (to be presented on K3).
K3: I don't know what to put here. Probably because I don't quite understand that formula with INDEX, MATCH and so on, I can't figure out a way to check the highest value between the two "winning" columns from the line above and get the header.
Could somebody help me with this?
First, let's establish if there is a tie. As you have discovered, you can do this by counting how many times the highest number appears in the range.
=COUNTIF($B2:$G2;MAX($B2:$G2))
If that count is more than 1, then there is a tie.
=IF(COUNTIF($B2:$G2;MAX($B2:$G2))>1;"TIE";"no tie")
In case of a tie you want to involve the values in row 3 as a tie breaker. We could add them to the values in row 2 using this array formula. You must confirm the array formula with Ctrl+Shift+Enter, not just Enter, otherwise it won't work.
=INDEX($B$1:$G$1;MATCH(MAX(((IF(B2:G2=MAX(B2:G2);MAX(B2:G2);0))+B3:G3));INDEX((B2:G2+B3:G3);0)))
You only want to factor in row 3 if there is a tie, though, so you can reuse the IF statement from above and replace the "TIE" result with the array formula. Remember to confirm it with Ctrl+Shift+Enter!
=IF(COUNTIF($B$2:$G$2;MAX($B$2:$G$2))>1;INDEX($B$1:$G$1;MATCH(MAX(((IF(B2:G2=MAX(B2:G2);MAX(B2:G2);0))+B3:G3));INDEX((B2:G2+B3:G3);0)));"no tie")
You already have the formula to look up the value if there is no tie.
My system uses the comma as the list separator. I have manually replaced these with semicolons in the formulas I posted, but please bear with me if I may have missed one.
Now you can copy these formulas down to row 3. If there is a tie in the data in row 3, you will need data in row 4 to break the tie.
To understand the Index/Match combo, start with your first formula and read it from the inside out. The Max() finds the largest number. The Match() returns the position, i.e. column number, of the largest number in the range B2 to G2, i.e. 2 (the second column in the range). Index looks at B1 to G1 and returns the column value from the position that the Match returned, i.e. the 2nd column, which is the text bb.
Using row 3 as the tie breaker, the formula works pretty much the same, only that rows 2 and 3 are added together when the value in row 2 is the Max value and then that number is used to find the Max and the Match.
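The rule described in the question (highest value in row 2 wins, and row 3 breaks ties only among the tied columns) can be written out step by step in Python. This is a sketch of the intended behaviour, not a translation of the array formula; the function name and the sample numbers are assumptions.

```python
def winner(headers, row2, row3):
    """Return the header of the top value in row2, using row3 to break ties."""
    top = max(row2)
    tied = [i for i, v in enumerate(row2) if v == top]
    if len(tied) == 1:
        return headers[tied[0]]
    # Only columns tied in row 2 take part in the tie-break.
    best = max(row3[i] for i in tied)
    finalists = [i for i in tied if row3[i] == best]
    return headers[finalists[0]] if len(finalists) == 1 else "TIE"

headers = ["aa", "bb", "cc", "dd", "ee", "ff"]
type1 = [3, 5, 5, 1, 0, 2]
type2 = [4, 0, 1, 2, 0, 0]  # the 4 under "aa" is ignored: "aa" is not tied
print(winner(headers, type1, type2))  # cc
```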
Here is an approach with SUMPRODUCT. I don't really understand what results you want in I3, J3, and K3; I will try to work that out.
I2:
=IF(SUMPRODUCT(--(B2:G2=MAX(B2:G2)))>1,"TIE","")
J2:
=SUMPRODUCT(--(B2:G2=MAX(B2:G2)))
K2:
=IF(B7>1,OFFSET(B1,0,SUMPRODUCT(--(B3:G3=MAX(B3:G3))*--(B2:G2=MAX(B2:G2))*{0,1,2,3,4,5})),OFFSET(B1,0,SUMPRODUCT(--(B2:G2=MAX(B2:G2))*{0,1,2,3,4,5})))
The {0,1,2,3,4,5} corresponds to the number of headers; if there are more, this array needs to be extended.

Splitting a column by consecutive zeros

I'm new to excel and I am struggling to do something very easy. I have a column of unsorted numbers. I want to set all of the numbers above the last two zeros in consecutive rows in the column to zero (and also in another column set all of the numbers below the first two consecutive zeros to zero).
For example, I have highlighted all of the cells in column A that are above the last two consecutive zeros. I want a method (VBA or formula) that will set these cells to zero and create the output shown in column B.
Would anyone be able to help?
There are various ways of doing it but I would be inclined to use a helper cell (say D1) to work out the last row where that row and the next row contain a pair of zeroes:-
=MAX(IF((A1:A19+A2:A20)=0,ROW(A1:A19)))
(this is an array formula and must be entered with Ctrl+Shift+Enter)
Then enter the following formula in B1:
=IF(ROW()<$D$1,0,A1)
If you had negative numbers as well as positive, this would be better in D1:-
=MAX(IF((A1:A19=0)*(A2:A20=0),ROW(A1:A19)))
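The helper-cell logic can be sketched in Python as follows (hypothetical function name, invented sample column). It uses the second, stricter test, where both cells of the pair must equal zero, and then zeroes everything above the last such pair.

```python
def zero_above_last_pair(col):
    """Set every entry above the last pair of consecutive zeros to 0."""
    # Positions where this entry and the next are both zero (helper cell D1).
    pair_starts = [i for i in range(len(col) - 1) if col[i] == 0 and col[i + 1] == 0]
    if not pair_starts:
        return list(col)  # no pair found: leave the column unchanged
    last = max(pair_starts)
    # Equivalent of =IF(ROW()<$D$1,0,A1) filled down column B.
    return [0 if i < last else v for i, v in enumerate(col)]

column_a = [3, 0, 0, 5, 2, 0, 0, 7, 1]
print(zero_above_last_pair(column_a))  # [0, 0, 0, 0, 0, 0, 0, 7, 1]
```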

Excel: Find intersection of a row and a column

My question is how can I find an intersecting cell of a specific column and row number?
My situation is this: with some calculations I find two cells, let's say B6 and E1. I know that I need the row of the first one and the column of the second one. So I could just use the ROW and COLUMN functions to get the numbers. After that, I need to find the intersecting cell, which would be E6 in this example.
I would just use INDEX(A1:Z100;ROW;COLUMN) but I don't know the exact area that I'm going to need - it depends on other stuff. I could use something like A1:XFG65000 but that is way too lame. I could also use a combination of INDIRECT(ADDRESS()) but I'm pulling data from a closed workbook so INDIRECT will not work.
If this would help to know what is this all for - here's a concrete example:
I need to find limits of a section of a sheet that I would work with. I know that it starts from the column B and goes all the way down to the last non-empty cell in this column. This range ends with a last column that has any value in first row. So to define it - I need to find the intersection of this last column and the last row with values in B column.
I use this array formula to find the last column:
INDEX(1:1;MAX((1:1<>"")*(COLUMN(1:1))))
And this array formula to find the last row:
INDEX(B:B;MAX((B:B<>"")*(ROW(B:B))))
Last column results in E1 and last row results in B6. Now I need to define my range as B1:E6, so how can I get E6 out of all this to put into the resulting formula? I've been thinking for a while now and, not being an Excel expert, I couldn't come up with anything. So any help would really be appreciated. Thanks!
You can use an Index/Match combination and use the Match to find the relevant cell. Use one Match() for the row and one Match() for the column.
The following Index/Match formula finds the last cell in a sheet under these premises:
column B is the leftmost table column
row 1 is the topmost table row
data in column B and in row 1 can be a mix of text and numbers
there can be empty cells in column B and row 1
the last populated cell in column B marks the last row of the table
the last populated cell in row 1 marks the last column of the table
With these premises, the following will return correct results, used in a Sum() with A1 as the starting cell and Index to return the lower right cell of the range:
=SUM(A1:INDEX(1:1048576,MAX(IFERROR(MATCH(99^99,B:B,1),0),IFERROR(MATCH("zzzz",B:B,1),0)),MAX(IFERROR(MATCH(99^99,1:1,1),0),IFERROR(MATCH("zzzz",1:1,1),0))))
Since you seem to be on a system with the semicolon as the list delimiter, here is the formula with semicolons:
=SUM(A1:INDEX(1:1048576;MAX(IFERROR(MATCH(99^99;B:B;1);0);IFERROR(MATCH("zzzz";B:B;1);0));MAX(IFERROR(MATCH(99^99;1:1;1);0);IFERROR(MATCH("zzzz";1:1;1);0))))
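Under the premises listed above, the lookup boils down to "last populated position in column B" and "last populated position in row 1". A minimal Python sketch of that idea, assuming the sheet has been loaded as a list of rows with empty strings for blank cells (the grid and function names are hypothetical):

```python
def last_nonblank(cells):
    """1-based position of the last non-empty entry, or 0 if all are empty."""
    return max((i for i, v in enumerate(cells, start=1) if v != ""), default=0)

def table_corner(grid):
    """Return (last_row, last_col) of a table bounded by column B and row 1."""
    last_row = last_nonblank([row[1] for row in grid])  # column B is index 1
    last_col = last_nonblank(grid[0])                   # row 1 is index 0
    return last_row, last_col

grid = [
    ["", "h1", "h2", "", "h4"],
    ["", 1, 2, 3, 4],
    ["", "", 5, 6, 7],
    ["", 3, 8, 9, 10],
    ["", "x", 11, 12, 13],
    ["", 9, 14, 15, 16],
]
row, col = table_corner(grid)
print(row, col)                 # 6 5, i.e. cell E6
print(grid[row - 1][col - 1])   # 16
```

Gaps in column B or row 1 do not matter, because only the last populated position is used.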
Offset would seem to be the way to go
=OFFSET($A$1,ROW(CELL1)-1,COLUMN(CELL2)-1)
(The -1 is needed because we already have 1 column and 1 row in A1)
in your example, =OFFSET($A$1,ROW(B6)-1,COLUMN(E1)-1) would give the value in E6
There is also ADDRESS if you want the location: =ADDRESS(ROW(B6),COLUMN(E1)) gives the answer $E$6
The following webpage has a much easier solution, and it seems to work.
https://trumpexcel.com/intersect-operator-in-excel/
For example, in a cell, type simply: =C:C 6:6. Be sure to include one space between the column designation and the row designation. The result in your cell will be the value of cell C6. Of course, you can use more limited ranges, such as =C2:C13 B5:D5 (as shown on the webpage).
As I was searching for the answer to the same basic question, it astounded me that there is no INTERSECT worksheet function in Excel. There is an INTERSECT feature in VBA (I think), but not a worksheet function.
Anyway, the simple spacing method shown above seems to work, at least in straightforward cases.

How to use IF and SUM in excel to count unique entries in a row?

Basically I have a large set of data in excel, and I was wondering how to count across a row how many cells are not #N/A?? I think it should be possible with IF and SUM but I'm not entirely certain.
To count all values except blanks and #N/A errors try COUNTIFS like this for data in row 2
=COUNTIFS(2:2,"<>#N/A",2:2,"<>")
If you don't want to count duplicates then this version will give you a count of all different values (except blanks and errors)
=SUM(IF(1-ISERROR(2:2),(2:2<>"")/COUNTIF(2:2,2:2&"")))
that's an "array formula" that needs to be confirmed with CTRL+SHIFT+ENTER
Note that the first formula uses COUNTIFS function and therefore will not work in versions of excel before 2007 - this is an alternative that will work in those versions
=COUNTA(2:2)-COUNTIF(2:2,"#N/A")
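For comparison, here is a short Python sketch of both counts, assuming the row has been read into a list where blanks are empty strings and errors appear as the text "#N/A" (both representations, and the function names, are assumptions for illustration).

```python
NA = "#N/A"  # assumed text representation of the error value

def count_valid(row):
    """Count cells that are neither blank nor #N/A (first formula)."""
    return sum(1 for v in row if v not in ("", NA))

def count_distinct_valid(row):
    """Count different values, ignoring blanks and #N/A (array formula)."""
    return len({v for v in row if v not in ("", NA)})

row2 = ["a", NA, "", "b", "a", 3]
print(count_valid(row2))           # 4
print(count_distinct_valid(row2))  # 3
```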
Try using =COUNTIF(RANGE, VALUE). Here's an example that counts the number of cells containing "Yes":
=COUNTIF(A:A, "Yes")
or
=COUNTIF(A1:D16, "Yes")
To count the cells that contain a value (i.e., are not empty), use =COUNTA(A:A)
When you want to "mark" the duplicates, use this in an empty column:
=COUNTIF($A$2:$A2,A2)>1
Put the formula in row 2 and copy it all the way down to the last used row.
(What I usually do: Somewhere in column A, press [Ctrl]+[Down] to jump to the last item, then move sideways to the column where you want to put your formula and enter something, e.g. an "X". Then jump all the way up with [Ctrl]+[Up], put the formula in row 2, copy it, press [Shift]+[Ctrl]+[Down] to select the whole range in this column from row 2 to the last used row, and press [Enter] to paste your formula.)
In this formula, the search area increases, the further you copy this down.
So the first time an item is found, the count will be 1 and the formula returns FALSE; the second, third or later times the same item is found, the count will be greater than 1 and the formula returns TRUE.
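The expanding-range behaviour is perhaps easier to see in a small Python sketch (hypothetical function name and data). A running tally plays the role of COUNTIF($A$2:$A2,A2): it only ever looks at the rows seen so far.

```python
def mark_repeats(values):
    """Flag each entry that has already appeared earlier in the list."""
    seen = {}
    flags = []
    for v in values:
        seen[v] = seen.get(v, 0) + 1  # count of v from the top down to this row
        flags.append(seen[v] > 1)     # same test as COUNTIF(...)>1
    return flags

print(mark_repeats(["x", "y", "x", "x", "z"]))
# [False, False, True, True, False]
```

The first occurrence of each item stays unmarked, which makes the flags convenient for filtering out only the extra copies.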

Resources