Identify unique and duplicate data in multiple rows across three columns - excel-formula

I have the following data:
Column A Column B Column C
AA AC AD
AC BK DD
AA AC AD
CC CA CA
CC BA CC
I need a formula to identify Row 1 and Row 3 as duplicates, without adding an additional column.
I have already tried concatenating Columns A, B and C and then using COUNTIF to identify duplicates, but I do not want an additional column.
Is there a way?

Restricting this to a formula without a helper column certainly cuts down on the options, but I think it is possible.
Select columns A:C and apply a Conditional Formatting formula rule of:
=COUNTIFS($A:$A,$A1,$B:$B,$B1,$C:$C,$C1)>1
with formatting of your choice. That formatting should distinguish the relevant cells (provided the same formatting is not already applied elsewhere).
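For readers who want to check the logic outside Excel, here is a minimal Python sketch of what the COUNTIFS rule computes: a row is flagged when the same (A, B, C) combination occurs more than once. The function name `duplicate_flags` is made up for illustration; the data is the sample from the question.

```python
from collections import Counter

# Sample rows from the question (columns A, B, C).
rows = [
    ("AA", "AC", "AD"),
    ("AC", "BK", "DD"),
    ("AA", "AC", "AD"),
    ("CC", "CA", "CA"),
    ("CC", "BA", "CC"),
]

def duplicate_flags(rows):
    """Return True for each row whose full (A, B, C) combination occurs more than once."""
    counts = Counter(rows)  # plays the role of COUNTIFS over all three columns
    return [counts[row] > 1 for row in rows]

flags = duplicate_flags(rows)
print(flags)  # [True, False, True, False, False]
```

Rows 1 and 3 are the only ones flagged, which matches the cells the conditional formatting rule should highlight.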

Related

How to find non matching records from two columns while accounting for duplicate values in Excel

I have two large columns.
Column A contains 100,000 different numbers/rows. Column B contains 100,210 numbers/rows. They have the same numbers except column B has 210 extra rows. I need to be able to get the values of those extra 210 rows.
The issue I'm having is that the numbers in these rows are not unique.
For example,
Column A contains the following numbers: 2,1,3,4,5,5,6,7
Column B contains the following numbers: 1,2,3,4,5,5,5,5,6,6,7,8
I want the outcome result to be: 5,5,6,8
I can't seem to wrap my head around a way to do this.
I have the two columns in a text file that I'm importing into Excel. If there are better ways to do it outside of Excel, I am open to that too.
With the dynamic array function FILTER:
=FILTER(B1:B12,COUNTIF(OFFSET(B1,0,,SEQUENCE(ROWS(B1:B12))),B1:B12)>COUNTIF(A:A,B1:B12))
Without FILTER:
Put this in the first cell and copy down:
=IFERROR(INDEX(B:B,AGGREGATE(15,7,ROW(B1:B12)/(COUNTIF(OFFSET(B1,0,,ROW(INDEX($ZZ:$ZZ,1):INDEX($ZZ:$ZZ,ROWS(B1:B12)))),B1:B12)>COUNTIF(A:A,B1:B12)),ROW($ZZ1))),"")
Try to follow these steps, supposing that Column A has fewer values than Column B and the rows start at 1:
A. Create Column C.
In the cell C1 place the function: =COUNTIF(A:A,B1)
Copy this function to the rest of the cells, for all items of Column B. So, cell C2 will have the function =COUNTIF(A:A,B2) and so on.
B. Create column D.
In the cell D1 place the function: =COUNTIF($B$1:$B1,B1)
Copy this function to the rest of the cells, for all items of Column B. So, cell D2 will have the function =COUNTIF($B$1:$B2,B2) and so on.
C. Create column E.
In the cell E1 place the function: =IF(D1<=C1,"Exists","Missing")
Copy this function to the rest of cells, for all items of Column B. So, cell E2 will have the function =IF(D2<=C2,"Exists","Missing") and so on.
D. Filter to show only the rows that Column E values are "Missing".
Of course you can combine all above 3 columns to one (e.g. in Column F), so these cells will have the functions:
F1: =IF(COUNTIF($B$1:$B1,B1)<=COUNTIF(A:A,B1),"Exists","Missing")
F2: =IF(COUNTIF($B$1:$B2,B2)<=COUNTIF(A:A,B2),"Exists","Missing")
and so on
Explanation:
In Column C we count how many times the value of the respective cell of Column B exists in the whole of Column A.
In Column D we count how many times we have "met" this value in Column B so far.
In Column E we check whether we have "met" the value more times than it exists in Column A. If we have, we mark the cell as "Missing".
Tested with the example you provided and works okay.
I hope it helps!
Good luck!
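Since the question is open to approaches outside Excel, the same running-count idea can be sketched in Python. This is only an illustration under the assumption that both columns have been read into plain lists; `extra_values` is a made-up name, and the comments map each step back to the helper columns described above.

```python
from collections import Counter

def extra_values(col_a, col_b):
    """Return the entries of col_b that exceed their number of occurrences in col_a."""
    available = Counter(col_a)  # Column C: COUNTIF(A:A, Bn)
    seen = Counter()            # Column D: running COUNTIF($B$1:$Bn, Bn)
    extras = []
    for value in col_b:
        seen[value] += 1
        # Column E: "Missing" when the value has been met more often than A holds it.
        if seen[value] > available[value]:
            extras.append(value)
    return extras

col_a = [2, 1, 3, 4, 5, 5, 6, 7]
col_b = [1, 2, 3, 4, 5, 5, 5, 5, 6, 6, 7, 8]
print(extra_values(col_a, col_b))  # [5, 5, 6, 8]
```

Each value is handled in a single pass with dictionary lookups, so this should stay fast for the 100,000-row columns mentioned in the question.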

How to find duplicates values in rows?

How to find duplicate values in rows using conditional formatting?
You can use an additional column (column E here) in which you join columns B, C and D:
=B2&C2&D2
and in conditional formatting on columns B, C and D use the formula
=COUNTIF($E:$E;$E2)>1
Select the column you want to test, and then click the following in the home ribbon:
conditional formatting > highlight cells rules > duplicate values
Repeat this for each column you want to test.
Or if you only want to highlight rows where all values are the same, make a new column that concatenates all the value columns, and test that.
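One caveat with concatenating columns is that different rows can produce the same joined text. The Python sketch below illustrates this with made-up data and a hypothetical helper `row_key`; the "|" separator is an assumption that only works if that character never appears in the data. In Excel the equivalent would be a helper formula along the lines of =B2&"|"&C2&"|"&D2.

```python
from collections import Counter

def row_key(cells, sep=""):
    """Join the cells of one row into a single comparison key."""
    return sep.join(cells)

# Made-up rows: the first two differ, but plain concatenation gives "ABCD" for both.
rows = [("AB", "C", "D"), ("A", "BC", "D"), ("AB", "C", "D")]

naive = Counter(row_key(r) for r in rows)           # like =B2&C2&D2
delimited = Counter(row_key(r, "|") for r in rows)  # like =B2&"|"&C2&"|"&D2

print(naive["ABCD"])        # 3: the middle row is wrongly counted as a duplicate
print(delimited["AB|C|D"])  # 2: only the genuinely identical rows match
```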

VBA to come up with all the combinations of two columns of data

I have two columns of data (column A, column B) and I want to list all the combinations of these in columns C and D. For example, if column A has 5 numbers in a list and column B has 3 numbers in a list, I should have 15 combinations listed in columns C and D. This is just an example; the length of the data in columns A and B changes dynamically.
I am pretty new to VBA, so a simple step by step guide would be appreciated.
VBA is not required for this.
In C1 enter:
=INDEX(A:A,ROUNDUP(ROW()/COUNT(B:B),0))
and copy down. In D1 enter:
=INDEX(B:B,MOD(ROW()-1,COUNT(B:B))+1)
and copy down.
You can add or remove items from either list. Note that COUNT only counts numeric cells, so the formulas as written work for lists of numbers; for text values, use COUNTA in place of COUNT.
EDIT#1
To remove the unnecessary zeros at the bottom of columns C and D, in C1 use:
=IF(ROW()>COUNT(A:A)*COUNT(B:B),"",INDEX(A:A,ROUNDUP(ROW()/COUNT(B:B),0)))
before the copy-down and in D1 use:
=IF(ROW()>COUNT(A:A)*COUNT(B:B),"",INDEX(B:B,MOD(ROW()-1,COUNT(B:B))+1))
This is based on knowing that there can be only Na x Nb combinations, where Na and Nb are the number of items in columns A and B.
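The index arithmetic behind the two formulas may be easier to follow in code. Below is a small Python sketch, not the answer's own method, that reproduces the ROUNDUP and MOD steps for each row number; `combinations_by_row` is a made-up name and the lists are illustrative.

```python
import math

def combinations_by_row(list_a, list_b):
    """Build all (a, b) pairs using the same row arithmetic as the formulas."""
    n_a, n_b = len(list_a), len(list_b)
    pairs = []
    for row in range(1, n_a * n_b + 1):  # 1-based, like ROW(); stops at Na x Nb
        a_index = math.ceil(row / n_b)   # ROUNDUP(ROW()/Nb, 0)
        b_index = (row - 1) % n_b + 1    # MOD(ROW()-1, Nb) + 1
        pairs.append((list_a[a_index - 1], list_b[b_index - 1]))
    return pairs

print(combinations_by_row([1, 2], ["x", "y", "z"]))
# [(1, 'x'), (1, 'y'), (1, 'z'), (2, 'x'), (2, 'y'), (2, 'z')]
```

Each item of the first list is repeated Nb times while the second list cycles, which is why the row number alone is enough to pick both values.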

How to highlight 1st, 2nd and 3rd highest from two column

I tried selecting the range, then Conditional Formatting > cell value equal to > =LARGE($C:$E,1), and filling with a golden color for the highest value. This formula applies to the entire columns. I need help applying the formula within a range, i.e. rows 31 to 41. I have percentage values in column C (C31:C41) and column E (E31:E41). I want a golden color for the 1st, silver for the 2nd and yellow for the 3rd highest value of the two columns. Column D has names, so column D cannot be selected.
sample data
C D E
2.54% vinu 5.69%
119.90% anand 157.34%
49.32% tanaji 7.39%
82.28% umesh 121.21%
-21.66% chandu 94.10%
-60.45% rajan -25.71%
-20.12% mule 37.02%
-16.05% jafgtap 31.085%
-3.50% kunal 282.62%
-3.27% ramesh 14.58%
-8.12% rajesh 5.86%
Select the cells C31:C41 and insert a new formatting rule using a formula. This is the formula if your data starts in row 31 and the active cell is C31.
=C31=LARGE($C$31:$E$41,1)
Note the placement of the $ signs. It is important: the range is absolute, while C31 is relative so that each cell is compared with its own value. Format this to be gold, then create two more rules with 2 and 3 as the last parameter for silver and bronze.
If the currently selected cell is on a different row than row 31, use the respective row number. My screenshot starts with row 1.
Select the cells in the worksheet, click the Format Painter on the Home ribbon and select cell E1 to apply the same rule to the cells in column E. In the screenshot I changed your sample numbers so column C has the third highest value.
Edit after comment:
If you explicitly want to exclude the values in column D, you can perform the Large() function on a limited list of ranges like this:
=LARGE(($C$31:$C$41,$E$31:$E$41),1)
Unfortunately, Conditional Formatting rules will not accept a union reference like this directly. The solution is to create three defined names with these formulas:
Gold =LARGE((Sheet11!$C$31:$C$41,Sheet11!$E$31:$E$41),1)
Silver =LARGE((Sheet11!$C$31:$C$41,Sheet11!$E$31:$E$41),2)
Bronze =LARGE((Sheet11!$C$31:$C$41,Sheet11!$E$31:$E$41),3)
Then you can use three conditional formatting rules that compare the value in the range with the values of the defined names Gold, Silver and Bronze.
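As a rough cross-check of what the three defined names evaluate to, here is a Python sketch using the sample data from the question. The helper `large` is an illustrative stand-in for Excel's LARGE, not an Excel API. It also shows that LARGE counts duplicates as separate ranks, so tied values can occupy two adjacent positions.

```python
def large(values, k):
    """k-th largest value, counting duplicates separately (like Excel's LARGE)."""
    return sorted(values, reverse=True)[k - 1]

# Sample percentages from the question: column C and column E, rows 31 to 41.
col_c = [2.54, 119.90, 49.32, 82.28, -21.66, -60.45, -20.12, -16.05, -3.50, -3.27, -8.12]
col_e = [5.69, 157.34, 7.39, 121.21, 94.10, -25.71, 37.02, 31.085, 282.62, 14.58, 5.86]

scores = col_c + col_e  # both ranges together; the names in column D are left out
gold, silver, bronze = large(scores, 1), large(scores, 2), large(scores, 3)
print(gold, silver, bronze)  # 282.62 157.34 121.21

# With a tie, ranks 2 and 3 return the same value.
tied = [5, 4, 4, 3]
print(large(tied, 2), large(tied, 3), large(tied, 4))  # 4 4 3
```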
This post has been quiet for a while, but maybe you can help me.
I have a large file with criteria across the top, labels on the left and scores in the middle. I've correctly modified the formula above for row based eval.
=B2=LARGE($B2:$X2,1) and =B2=LARGE($B2:$X2,2) and =B2=LARGE($B2:$X2,3)
I've noticed that if there are joint second largest values, then the third one doesn't work, but the 4th one does. It's not a major pain.
What I want to do now is apply the conditional formatting for the second row to all other rows (about 40). I can't see a way to copy the conditional formatting and I don't really want to enter the three of them 40 times.
Any ideas?
Thanks

Matching Excel Columns

Would really appreciate some help. Pretty basic problem. In column A I have SSNs, in column B I also have SSNs, and in column C I have dates associated with the SSNs in column B. Problem 1: the two SSN columns don't match and I need them to. Problem 2: the dates in column C need to stay associated with the same SSNs in column B.
This is fairly simple.
Either in a new sheet or in separate columns from your original data, create a column that mirrors column A (the original SSNs). This can easily be done with a simple reference formula, =A1, autofilled down. You can do the same for the second column, which is a copy of the first SSN column.
For the third column just use a simple INDEX and MATCH formula like this:
=INDEX(C:C,MATCH(E1,B:B,0))
In this example the new data is in columns E-G, with this formula in column G.
The formula looks for the value of E1 within column B (that is, the first SSN within column B) and then returns the date from column C on the row where it was found. If the same SSN appears more than once in column B, only the date of the first occurrence is returned.
Note: You have to set the format of the formula cell to Date.
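To illustrate the lookup, here is a minimal Python sketch of the first-exact-match behavior of INDEX with MATCH. The identifiers and dates are made-up placeholders, and `index_match` is a hypothetical helper, not an Excel function.

```python
def index_match(lookup_value, lookup_col, result_col):
    """Return the result on the row of the first exact match, or None if absent."""
    try:
        position = lookup_col.index(lookup_value)  # MATCH(value, lookup_col, 0)
    except ValueError:
        return None                                # Excel would show #N/A here
    return result_col[position]                    # INDEX(result_col, position)

# Placeholder data: column A order, column B order, and the dates in column C.
col_a = ["333", "111", "222"]
col_b = ["111", "222", "333", "111"]
col_c = ["2020-01-05", "2020-02-10", "2020-03-15", "2021-07-01"]

aligned = [(ssn, index_match(ssn, col_b, col_c)) for ssn in col_a]
print(aligned)
# [('333', '2020-03-15'), ('111', '2020-01-05'), ('222', '2020-02-10')]
```

The second entry for "111" in column B is never reached, which mirrors the limitation with repeated SSNs.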
