Find common rows from a given range of rows in excel - excel

I have to clean data in which, within a given range of rows, two or three rows may be exactly the same, while the rest differ in at least one column. I need a way to find the identical rows without checking them manually. I've tried conditional formatting, but that only works on columns.
In the image you can see rows 550:569; a few of them are exactly the same. How do I highlight or otherwise identify them? I don't want to check each column manually.

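Insert a helper column (say, column AG) with a formula like =TEXTJOIN(",",TRUE,A2:AF2). Use FALSE as the second argument if blank cells should count as differences, since TRUE skips them.
Sort the range by the new column so that identical rows end up next to each other.
Remove the duplicates with Excel's Remove Duplicates tool.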
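
The same idea can be sketched outside Excel. The following is a minimal Python illustration, assuming the rows have already been read into a list of lists (for example from a CSV export). The function name, the sample data, and the starting row number are hypothetical. Instead of joining the cells into a string, it uses the whole row as a key, so cell positions and blanks are preserved.

```python
from collections import defaultdict

def find_duplicate_rows(rows, first_row_number=1):
    """Return lists of row numbers whose cells are all identical."""
    groups = defaultdict(list)
    for offset, row in enumerate(rows):
        # A tuple keeps every cell in position, so blanks count as values.
        groups[tuple(row)].append(first_row_number + offset)
    # Only keys seen more than once are duplicates.
    return [numbers for numbers in groups.values() if len(numbers) > 1]

# Hypothetical sample standing in for the first rows of the range 550:569.
sample = [
    ["A1", 10, "x"],
    ["A1", 10, "y"],
    ["A1", 10, "x"],
    ["B2", 7, ""],
]
duplicates = find_duplicate_rows(sample, first_row_number=550)
print(duplicates)  # [[550, 552]]
```

This reports the duplicates rather than deleting them, which matches the request to highlight or find them first.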

Related

Determine if an Excel column has any value?

Question: With or without VBA, how can we determine whether a specific column (say, column "A") has a value in any of its cells?
Remarks: The question is about finding out whether a column (with a header, say, LastName) contains any value (text or number). The search is not for a specific value.
Reason for the Question:
We have an Excel file of more than 1 GB with about 1 million rows and several columns with headings. When we scroll down one of the columns, it looks empty, but with more than 1 million rows, scrolling all the way down to confirm that takes too much time. We may also have to do the same for other columns that seem empty, so we are looking for a better way to do it.
The issue is somewhat related to what's described in item 4 of this article: Tackling the most common errors when trying to import a CSV
Consider Conditional Formatting, applied to the header row only:
Select the header row cells that contain headers (not the whole row).
Add a CF formula rule and set the format to suit your preference:
=COUNTA(A:A)>1
This highlights the headers of columns that contain data. To highlight the headers of columns that don't contain data, use
=COUNTA(A:A)<=1
Note: this treats cells that contain an empty string (e.g. from a formula) as containing data. To treat those cells as empty and highlight the headers of empty columns, use
=COUNTBLANK(A:A)=ROWS(A:A)-1
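
For a file this large, the same check can also be run outside Excel. The sketch below is a rough Python equivalent of the COUNTA and COUNTBLANK rules above, assuming the sheet has been loaded as a header list plus a list of rows. The function name, the `blank_strings_count` flag, and the sample data are hypothetical.

```python
def columns_with_data(header, rows, blank_strings_count=False):
    """Return the headers of columns holding at least one value below the header."""
    def has_value(cell):
        if cell is None:
            return False
        # Empty strings count as data only when asked, mirroring COUNTA.
        if cell == "" and not blank_strings_count:
            return False
        return True

    return [
        name
        for index, name in enumerate(header)
        if any(has_value(row[index]) for row in rows)
    ]

# Hypothetical sample: LastName is empty, Notes holds only an empty string.
header = ["LastName", "FirstName", "Notes"]
rows = [
    [None, "Ann", ""],
    [None, "Bob", None],
]
print(columns_with_data(header, rows))                            # ['FirstName']
print(columns_with_data(header, rows, blank_strings_count=True))  # ['FirstName', 'Notes']
```

Because `any` stops at the first value it finds, a populated column is confirmed without scanning all of its rows.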

Identifying conflicting cells with duplicate IDs in Excel

I have an Excel table consisting of rows that each have their own ID. Each ID should be unique, but we have a lot of duplicate IDs that I'm trying to remove. I first tried to identify and remove the duplicates using Conditional Formatting and the Remove Duplicates tool, but I'm still left with several rows that share an ID and haven't been removed, so now I have to look through them manually. Is there a way to have Excel highlight the discrepancies between those rows? I've provided an example below.
ID       number   description
123abc   3        Three
123abc   4        Three
456def   5        Five
In this sort of table the discrepancies are 3 and 4: they belong to the same ID but differ in value, so I would want Excel to highlight them. Is this possible in Excel? Thank you.
If you just want to highlight duplicate and unique cells, use Conditional Formatting; it has a built-in rule for duplicate values.
In the Number column, use a Conditional Formatting formula rule like:
=COUNTIFS($A$2:$A$3000,A2)>=2
This assumes the ID is in column A; you may need to change the 3000 to cover your last row.
Sorting by the ID column would also be useful.
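
The COUNTIFS rule flags every row whose ID occurs more than once, not only the cells that disagree. As an illustration of the narrower check the question asks for, here is a minimal Python sketch that groups rows by ID and reports only the columns whose values differ. It uses the example table from the question; the function name is hypothetical.

```python
from collections import defaultdict

def find_conflicts(header, rows, id_column=0):
    """Map each duplicated ID to the columns whose values disagree."""
    by_id = defaultdict(list)
    for row in rows:
        by_id[row[id_column]].append(row)

    conflicts = {}
    for key, group in by_id.items():
        if len(group) < 2:
            continue  # a unique ID cannot conflict with anything
        differing = {
            header[i]: [row[i] for row in group]
            for i in range(len(header))
            if i != id_column and len({row[i] for row in group}) > 1
        }
        if differing:
            conflicts[key] = differing
    return conflicts

header = ["ID", "number", "description"]
rows = [
    ["123abc", 3, "Three"],
    ["123abc", 4, "Three"],
    ["456def", 5, "Five"],
]
print(find_conflicts(header, rows))  # {'123abc': {'number': [3, 4]}}
```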

Extract all rows from excel data set when multiple cells contains certain text

This question was asked and the answer ALMOST works for me.
THE PROBLEM
Very simply, from the above dataset I wish to recreate this range, but filtered for blood type O only.
The answer given is:
=IFERROR(INDEX(A:A,SMALL(IF(ISNUMBER(SEARCH("O",INDIRECT("$A2:$A"&COUNTA(A:A)))),ROW(INDIRECT("$A2:$A"&COUNTA(A:A))),""),ROW()-1)),"")
This works only when the formula starts in row 2. I have tried everything to make it start in a different row and column (I also want the output to be in a different row and column), but whenever I update the formula, nothing is returned.
ED, please see this new picture:
In the image above, I placed your example data set in the range C3:F13. Based on your question, it sounded like you were trying to filter your list by blood type, but I was not sure whether you wanted just the names or some other combination of columns. This solution assumes you want all columns in the order they are presented. I placed the following formula in I5:
=IFERROR(INDEX($D$4:$F$13,AGGREGATE(15,6,(ROW($C$4:$C$13)-3)/($C$4:$C$13=$I$2),ROW(A1)),COLUMN(A1)),"")
In I2 is the value of the blood type you are filtering your list for.
In the formula above, adjust the ranges to suit the location of your data. The -3 in the formula accounts for the number of rows above the first row of data. If you have headers or other space and your first piece of data is in row 15, you would need to change -3 to -14.
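
For readers who find the INDEX/AGGREGATE combination hard to follow, the logic amounts to "keep every row whose blood type equals the value in I2, in original order". A minimal Python sketch of that filter follows. The function name and the sample rows are hypothetical, since the original data set is only shown in the image.

```python
def filter_rows(rows, column, wanted):
    """Keep rows whose value in `column` equals `wanted`, ignoring case like Excel's =."""
    return [row for row in rows if str(row[column]).upper() == wanted.upper()]

# Hypothetical stand-in for the data set: name, blood type, age.
people = [
    ["Ann", "O", 34],
    ["Bob", "AB", 51],
    ["Cy", "o", 27],
]
type_o = filter_rows(people, column=1, wanted="O")
print(type_o)  # [['Ann', 'O', 34], ['Cy', 'o', 27]]
```

Unlike the SEARCH-based formula in the question, this compares the whole cell, so a value that merely contains the letter would not match.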

Find and Compare Two Columns Excel (With Screenshots)

I have a spreadsheet that occasionally gets new data whose contents I don't know in advance; I just have to add it to the spreadsheet. Some of the new data updates rows that are already in the spreadsheet, and some of it adds new rows. I'm looking for a way to add a column that tells me whether anything in a row has changed when I compare the old spreadsheet to the new one.
The sheets have one column that will always have a unique value among all the rows, so I can use that to match rows if the sheets aren't sorted the same way. Here are some screenshots to show what I'm trying to do:
Old Spreadsheet:
New Spreadsheet:
The only solution I can think of is a large nested IF formula that compares each column one by one, something like:
=IF(Old!B2=New!B2,IF(Old!C2=New!C2,"NO","YES"),"YES")
The problem with that is that it gets very hard to look at since my actual data is using 33 columns (not including this "Changed?" column) and new columns could be added in the future.
I'm not very technical with Excel, nor have I ever used VBA, so I apologize in advance if there is a simple/obvious solution that I'm missing.
Thanks in advance for your help.
Using your example, in the 'New' sheet cell D2 and copied down:
=IF(COUNTIF(Old!A:A,A2)=0,"YES",IF(SUMPRODUCT(COUNTIF(INDEX(Old!A:AG,MATCH(A2,Old!A:A,0),0),LEFT(A2:AG2,254)&"*"))=SUMPRODUCT(COUNTIF(A2:AG2,LEFT(A2:AG2,254)&"*")),"NO","YES"))
VLOOKUP would also work well for this problem.
In D2, the formula would be:
=IF(AND(VLOOKUP(A2,Old!A:C,2,FALSE)=B2,VLOOKUP(A2,Old!A:C,3,FALSE)=C2),"NO","YES")
The column numbers (2 and 3) identify the columns in the old sheet that hold the data you are comparing, looked up by the ID column.
If the column names in the new sheet match those in the old sheet, MATCH can find the appropriate column instead.
This makes the formula look more complex, but Excel would adjust the Old!A:C reference if more columns are inserted.
The formula to match against column names would look like this:
=IF(AND(VLOOKUP(A2,Old!A:C,MATCH($B$1,Old!$1:$1,0),FALSE)=B2,VLOOKUP(A2,Old!A:C,MATCH($C$1,Old!$1:$1,0),FALSE)=C2),"NO","YES")
The only difference from the previous formula is the use of MATCH($B$1,Old!$1:$1,0) to find the column (the $ signs anchor the references).
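
Both formula approaches come down to the same comparison: look up each new row's ID in the old sheet and check whether every column still matches. A minimal Python sketch of that comparison is below, assuming both sheets are available as lists of rows with the unique ID in the first column. The function name and sample data are hypothetical. It compares whole rows, so it does not need to change when columns are added.

```python
def changed_flags(old_rows, new_rows, key_column=0):
    """Return "YES"/"NO" per new row: YES if the row is new or differs from the old one."""
    old_by_key = {row[key_column]: row for row in old_rows}
    # A missing key yields None, which never equals a row, so new rows get "YES".
    return [
        "NO" if old_by_key.get(row[key_column]) == row else "YES"
        for row in new_rows
    ]

# Hypothetical sample; the new sheet is sorted differently from the old one.
old = [[1, "a", "x"], [2, "b", "y"]]
new = [[2, "b", "y"], [1, "a", "z"], [3, "c", "w"]]
flags = changed_flags(old, new)
print(flags)  # ['NO', 'YES', 'YES']
```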
In this case, specialized Excel comparison software is a better fit.
My company uses this software; check it out:
http://www.suntrap-systems.com/ExcelDiff/
http://www.youtube.com/watch?v=QQgnWr_RT-8

How can I mark duplicate values in a column based on a second value in a different column in Excel 2007?

I have been trying to mark duplicates in the same Excel column based on a criterion in a different column, and I would love some help. In the example below, I would like to highlight in red every row that duplicates another row, and put a Y in a third column ("Delete" in the example below). When a value in the Name column duplicates another, ignoring case, I would like to mark all but one of the rows based on a hierarchy in the Status column, i.e. Excellent, Good, and Bad.
Only one row for each unique value can be left unmarked. If two rows share the same status and no duplicate has a higher status, either one can be marked (the one further down the list, if that's easier to specify).
I have been looking around the site and have found lots of similar entries on deleting duplicates, but nothing quite the same. I need to highlight the duplicate rows rather than delete them, and I have not been able to find anything that lets me sort based on a hierarchy in a second column. I only need to run this once rather than repeatedly, so the time it takes is not a concern. Any help you can throw my way would be greatly appreciated.
See if the sheet and steps below solve your issue.
Status_Order formula in E2: =VLOOKUP(D2,$I$2:$J$4,2,FALSE)
Sort A1:E15 by Name, then by Status_Order, smallest to largest.
Delete formula: =IF(A2=A3,"Delete","")
Fill all formulas down.
Add Conditional Formatting to all columns to turn them red if column C = "Delete".
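
The rule from the question (keep one row per name, preferring the best status, and mark the rest) can also be sketched in Python. This is a minimal illustration, not the sheet's formulas: the rank numbers, the function name, and the sample rows are assumptions, and a lower rank is taken to mean a better status.

```python
# Assumed hierarchy: lower number = better status.
STATUS_ORDER = {"excellent": 1, "good": 2, "bad": 3}

def mark_duplicates(rows):
    """rows: list of (name, status) pairs; returns 'Y' for rows to delete, '' otherwise."""
    best = {}
    for index, (name, status) in enumerate(rows):
        key = name.lower()  # names are compared without regard to case
        rank = STATUS_ORDER[status.lower()]
        # Strict < keeps the earlier row when two rows tie on status.
        if key not in best or rank < best[key][0]:
            best[key] = (rank, index)
    keep = {index for _, index in best.values()}
    return ["" if i in keep else "Y" for i in range(len(rows))]

# Hypothetical sample data.
rows = [
    ("Smith", "Good"),
    ("smith", "Excellent"),
    ("Jones", "Bad"),
    ("Jones", "Bad"),
    ("SMITH", "Bad"),
]
marks = mark_duplicates(rows)
print(marks)  # ['Y', '', '', 'Y', 'Y']
```

On a tie, the row further down the list is the one marked, as the question suggests.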
