Finding duplicate rows in excel - excel

I have an Excel spreadsheet with two columns. One is for name, and the other is for ID. Is there any way I can find the rows that have the exact same ID and easily see them? For example, I have the rows:
Name: Id
Hello 1
World 5
Mylo 1
Jack 6
Jil 9
Frank 5
So in the above example data, Excel should somehow mark the rows with Hello and Mylo, and World and Frank, to indicate that those rows have duplicate IDs.

The absolute fastest and easiest way: use Conditional Formatting > Highlight Cells Rules > Duplicate Values on the ID column. Then filter the column (presumably in a table) by color; the option sits above the check boxes in the filter drop-down.

In the third column you could add this formula (in cell C2, for example) and copy it down:
=IF(COUNTIF(B$2:B$7,"="&B2)>1,"<--Dup!","")
You will have to adjust the range B$2:B$7 to cover your actual data range.
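For anyone who wants to run the same check outside Excel, here is a minimal Python sketch of the COUNTIF idea: count how often each ID occurs, then flag every row whose ID occurs more than once. The function name `flag_duplicate_ids` and the list-of-tuples layout are assumptions made for illustration; they are not part of the original answer.

```python
from collections import Counter

def flag_duplicate_ids(rows):
    """Return (name, id, marker) tuples; marker is "<--Dup!" when the id repeats."""
    # Count occurrences of every id, like COUNTIF over the whole ID column.
    counts = Counter(row_id for _, row_id in rows)
    return [
        (name, row_id, "<--Dup!" if counts[row_id] > 1 else "")
        for name, row_id in rows
    ]

# Sample data from the question.
rows = [("Hello", 1), ("World", 5), ("Mylo", 1), ("Jack", 6), ("Jil", 9), ("Frank", 5)]
for name, row_id, marker in flag_duplicate_ids(rows):
    print(name, row_id, marker)
```

With the sample data, the rows for Hello, Mylo, World and Frank get the marker, and Jack and Jil do not.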

Related

Using two values in a sheet to filter and return values from a table in another sheet

I'm fairly new to coding and I've been googling around for the last few hours trying to solve this problem, but it seems to be a little beyond what I'm able to do, so I would be very grateful for some help.
In Sheet1, I have a table which has columns between M - CV (175 columns). For each column, I have an "ID number" value in row 3. From row 6 to the end of the table, I have several "search terms" separated by commas in column CV.
In Sheet2, the corresponding "ID Numbers" are in column B. Column AN contains strings.
For each ID number value in Sheet1, I'm looking to find all the corresponding cells in Sheet2 where the ID number in column B is the same, and column AN of Sheet2 contains at least one of the "search terms" in column CV.
For each ID number, I'm hoping to join the entries in column AN of Sheet2 which match the criteria above and paste them into row 5 of the respective column in Sheet1.
I've gone around in quite a few circles trying to do this and I'm back to square one with no code to show for it.
I've tried to research both the AutoFilter function and using for loops. The research I've done indicates that for loops are rather slow to run on a large data set.
I'm hoping to find a solution which is as easy to read and understand as possible.
I hope I've given enough information for everyone to understand and help.
Thank you in advance.
My Excel subscription has expired and I've started using Google Sheets for most of my spreadsheet work, so I tested this there. Some conversion may be required. I also did this using formulas, not VBA; not sure if that changes things for you.
If I understand correctly, you have two sheets with a shared key column, sheet 1 contains search terms across multiple columns, and sheet 2 contains search terms comma delimited in a single column.
With this setup we want to bring the search term column of sheet 2 into the correct row of sheet 1 by key using VLOOKUP. I made a named range in sheets which contained all my data on sheet 2 and called it "dst". My formula was then =VLOOKUP(A2, dst, 7, true) since my key in sheet 1 was in column A, dst was the range I was searching, my column with my delimited search terms was column 7 in relation to dst, and I had ordered sheet 2 by key. I pasted this formula relatively down all rows as needed.
We want to construct a regex string using our search terms across multiple columns in sheet 1, into a single cell. I used =JOIN("|", B2:E2) on sheet 1 since my search terms were in columns B:E, and this resulted in a regex that looked like this for me: alligator|dog|rabbit|lizard where alligator, dog, rabbit, and lizard, were all search terms in that row. Paste down relative as needed.
We want to run our regex against our search target cell containing the comma delimited search terms. I ran =REGEXMATCH(F2, G2) where F2 was my delimited search terms from sheet 2, and G2 was my constructed regex for the row. Paste down relative as needed.
A screenshot of my completed sheet 1:
Once you know which cells have matches you can do whatever you want.
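The JOIN and REGEXMATCH steps can also be sketched in Python, for readers who would rather script the check. This is only an illustration under assumptions: the helper names `build_pattern` and `has_match` are made up here, and the terms are passed through `re.escape`, which the spreadsheet formula does not do, so that a term containing regex characters is matched literally.

```python
import re

def build_pattern(terms):
    """Join non-empty search terms into one alternation, e.g. alligator|dog."""
    # re.escape is an addition to the formula: it keeps terms literal.
    cleaned = [re.escape(t.strip()) for t in terms if t.strip()]
    return "|".join(cleaned)

def has_match(target, terms):
    """True if the target text contains at least one of the search terms."""
    pattern = build_pattern(terms)
    # An empty pattern would match everything, so treat it as "no match".
    return bool(pattern) and re.search(pattern, target) is not None

terms = ["alligator", "dog", "rabbit", "lizard"]
print(build_pattern(terms))
print(has_match("cat, dog, horse", terms))
print(has_match("cat, horse", terms))
```

Like REGEXMATCH, `re.search` looks for the pattern anywhere in the text, so "dog" would also match inside "dogma". Add word boundaries if that matters for your data.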

amounts and ageing in excel

I have two excel files that I need to cross reference amounts in.
The first sheet looks like the below:
What I need to do is find any amounts that are contained on sheet 2 and the month they fall into.
Sheet 2 looks like below:
For example, on sheet 1 I have 56.49 in column C for reference AK1080117 in column A, and this shows as Person 8 on sheet 2.
I can see this is correct, as on sheet 1 it has a transaction date of 08-Jan and on sheet 2 it is in the column JAN.
There is no common reference that can be used between sheet 1 and 2, as sheet 1 has Reference and sheet 2 has Name.
Can anyone advise the best way to do this?
The complete sheets are hundreds of lines long.
Many thanks,
Note: Make sure your data has unique values AND does not extend beyond the year 2017 (the formulas only look at the month, so dates from different years would be mixed together).
If so, I have found a way to do this in a few steps. The formulas below use commas as argument separators; use semicolons instead if your regional settings require them.
1: Add another column E to your first sheet, put this formula in the second row of the column, and drag it all the way down:
=SUBSTITUTE(ADDRESS(1,MONTH(B2)+1,4),"1","")
2: Now create another column F next to the freshly made one and put this formula in the second row:
=MATCH(C2,INDIRECT("Sheet2!"&E2&":"&E2),0)
3: Now create a third column G and put this formula in the second row, to be dragged down (the range is absolute so it does not shift as you drag; adjust it to your data):
=INDEX(Sheet2!$A$1:$D$15,F2,1)
4: Now you have created a cross-reference: column G shows the person with a matching amount.
Edit: You can obviously combine the three formulas directly, but my personal preference is to break things down to make them easier to understand :)
=INDEX(Sheet2!$A$1:$D$15,MATCH(C2,INDIRECT("Sheet2!"&SUBSTITUTE(ADDRESS(1,MONTH(B2)+1,4),"1","")&":"&SUBSTITUTE(ADDRESS(1,MONTH(B2)+1,4),"1","")),0),1)
Good luck with it!
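The same month-column lookup can be sketched in Python. This is a rough illustration, not the answerer's method: the function name `find_person` and the data layout (each sheet 2 row as a name plus a list of twelve monthly amounts) are assumptions, and only the Person 8 row comes from the question.

```python
from datetime import date

def find_person(amount, transaction_date, sheet2):
    """Return the first name whose column for the transaction month holds the amount."""
    month_index = transaction_date.month - 1  # 0 = JAN ... 11 = DEC
    for name, monthly_amounts in sheet2:
        # Exact comparison, like MATCH with match type 0.
        if monthly_amounts[month_index] == amount:
            return name
    return None  # no match, the analogue of #N/A

sheet2 = [
    ("Person 7", [10.00] + [None] * 11),  # hypothetical row for illustration
    ("Person 8", [56.49] + [None] * 11),  # from the question: 56.49 in JAN
]
print(find_person(56.49, date(2017, 1, 8), sheet2))
```

As with the formulas, this only works if an amount is unique within its month; otherwise the first match wins.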
You need a third table that has Reference and Name. Then you can use lookup functions or table relationships to link the data together.
Ask the source of the first table to include Name as a field.

assigning a value based on another column in Excel

I am trying to find a quick way to assign a value to a column depending on the value of another. I want to increase the value of column A by one each time the value of column B changes.
Column A Column B
1 (520)998-7765
1 (520)998-7765
1 (520)998-7765
2 (450)877-4563
2 (450)877-4563
2 (450)877-4563
2 (450)877-4563
3 (650)989-7654
3 (650)989-7654
3 (650)989-7654
.... ....
I need to know if there is a formula that I can use to sort through 27,000 lines of data rather than assigning them one by one.
I am using a phone number as a unique identifier and I want the ID# to increase by one every time the phone number changes.
Please Help!!
Put the value 1 in A2.
In A3 and below, put the following formula:
=IF(B3=B2,A2,A2+1)
EDIT
You can make a single formula for all of the rows.
Put something like this in A2 and copy it down:
=IF(ROW(A2)=2,1,IF(B2=B1,A1,A1+1))
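The logic of that formula, compare each value with the one above it and bump the counter on a change, can be sketched in Python as follows. The function name `assign_group_ids` is an assumption for illustration.

```python
def assign_group_ids(values):
    """Give each value an ID that increases by one whenever the value changes."""
    ids = []
    current = 0
    previous = object()  # sentinel that never equals real data
    for value in values:
        if value != previous:
            current += 1
            previous = value
        ids.append(current)
    return ids

phones = ["(520)998-7765"] * 3 + ["(450)877-4563"] * 4 + ["(650)989-7654"] * 3
print(assign_group_ids(phones))
```

Like the formula, this counts changes between neighbouring rows, so the data should be sorted by phone number first; a number that reappears further down would otherwise get a new ID.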
It seems like you are trying to extract the unique list of phone numbers? If so, there is a Remove Duplicates function in Excel (under the Data tab) that should do what you need.
You would select your full range (27,000 phone numbers) and run Remove Duplicates. Excel would then leave behind a single row for each unique number.
I believe Remove Duplicates is available for Excel 2007 or later.
If you are using an older version of Excel, here is a link that gives more information on filtering for a unique list:
Count unique values among duplicates

Combining Like Data in Excel

So here is my situation: I need to take two spreadsheets in excel and combine the data together so that any additional data is paired up with common data between the cells. Here's an example of what I mean.
Sheet 1
1234567, JOHN, DOE, 1234567.JPG
Sheet 2
JOHN, DOE, 6634
First and Last names are common data, but the number in the second sheet does not exist in the first. The user list in both sheets are slightly different from each other so I can't simply alphabetize the names and move the additional column over. I have about 500 users to go through and may have to use what ever solution I come up with for similar lists of users.
Any assistance would be great.
There are various techniques you can use to combine data, but you'll have to be a bit more specific. For example, is there a fixed number of columns that sheet 2 has and sheet 1 doesn't?
The basic technique would be to create some sort of unique identifier, perhaps by concatenating the names together in both sheets. That way you can use VLOOKUP to pull the missing data from one sheet into the other.
Not sure I understood "I can't alphabetize the names". However, if the names have the same spelling, i.e. John is John in both sheets, you can concatenate John and Doe in sheet 1, do the same in sheet 2, and use a VLOOKUP function. Something like:
A = cellcontainingJohn & cellcontainingDoe in sheet 1
B = cellcontainingJohn & cellcontainingDoe in sheet 2
C = VLOOKUP(A, rangeforB, columnnumber, FALSE)
Here's what I would do:
Select the sheet into which you want to pull data from the other. I'll assume we're pulling data from sheet 2 into sheet 1.
In sheet 2, insert a column to the left of what you have already. JOHN is now in column B, DOE in column C, and 6634 in column D.
In sheet 2, column A, row 2 (assuming you have a row of column headers) which is currently empty, use the formula
=CONCATENATE(B2,C2)
Now, head back over to sheet 1. Let's assume you also have a row of column headers in sheet 1, so the cell immediately to the right of your 1234567.JPG is E2 and it's empty. In E2, use the following formula
=IFERROR(VLOOKUP(B2&C2,'Sheet 2'!$A:$D,4,FALSE),"")
That should give you what you're asking for, if I understand your question correctly.
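For comparison, here is a minimal Python sketch of the same key-based lookup. It is an illustration under assumptions: the function name `merge_by_name` and the tuple layouts are made up, and the second sheet 1 row is hypothetical. It uses a (first, last) tuple as the key rather than a concatenated string, which avoids accidental collisions such as "AB"+"C" and "A"+"BC".

```python
def merge_by_name(sheet1, sheet2):
    """Append the sheet 2 number to each sheet 1 row, matched on first and last name."""
    lookup = {}
    for first, last, number in sheet2:
        # Keep the first occurrence, as an exact-match VLOOKUP does.
        lookup.setdefault((first, last), number)
    return [
        # Missing names get an empty string, like IFERROR(..., "").
        (user_id, first, last, photo, lookup.get((first, last), ""))
        for user_id, first, last, photo in sheet1
    ]

sheet1 = [
    ("1234567", "JOHN", "DOE", "1234567.JPG"),  # from the question
    ("7654321", "JANE", "ROE", "7654321.JPG"),  # hypothetical user not in sheet 2
]
sheet2 = [("JOHN", "DOE", "6634")]
for row in merge_by_name(sheet1, sheet2):
    print(row)
```

One difference to keep in mind: VLOOKUP ignores case, while the dictionary lookup here is case-sensitive, so normalise the names (for example with `.upper()`) if the two lists differ in capitalisation.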

HOWTO add a unique id to each unique row?

I have data in two columns:
a 1
a 1
a 2
b 3
b 4
In the list there are 4 unique rows. I would like to add a unique id to each unique row.
Like this:
1 a 1
1 a 1
2 a 2
3 b 3
4 b 4
Of course I have many more rows and columns, and the data are more complex than in this example.
Any way to do this in Excel?
Kind regards, Kresten Buch
I have the same issue, and I have developed a three-formula approach to this. I could probably combine them into one by nesting, but whatevs, this works.
Assume the data you want to 'number' is in column A, and the first row of the table is row 3.
The first formula (in column B) counts occurrences of the 'value'; its range expands from the top of the table downwards as the table grows:
=COUNTIF($A$3:A3,A3)
The second column's formula (in column C) also expands as the row count does, and simply adds 1 to the running count every time it encounters a 1 (i.e. the first occurrence of a new unique value) in column B:
=IF(B3=1,MAX($C$2:C2)+1,"")
This one worked for me even in the first row of the table, by the way. I was expecting to have to manually input a 1 to start the list. Having it work without the manual entry is a good thing: it means the formulas all work even if you re-sort the table data into a different order.
The third one, in column D, uses a VLOOKUP to find the value. Note that when VLOOKUP finds more than one match, it always pulls the first occurrence.
=VLOOKUP(A3,$A$3:C3,3,FALSE)
Note that this will renumber all the data outcomes dynamically if you do re-sort the entire thing, i.e. the formulas all work, but the number 'assigned' to a particular set of data might be different, as it all works from whatever order the list of items is in.
My use case for these formulas assumes that every month I paste a new set of data at the bottom of the table, some items of which are repeats from previous months (i.e. are already in the table) and some of which are new.
If the dynamic renumbering is a problem, use a 'row key' so you can re-sort back to the original order at the end.
Assuming your data is in B2:C6 please try =IF(AND(B1=B2,C1=C2),A1,A1+1) in A2, copied down
If your data is not sorted, it's more complicated... but you can use something like this in A2:
=IF(COUNTIFS($B$1:B2,B2,$C$1:C2,C2)>1,INDEX($A$1:A1,IFERROR(MATCH(B2&"-"&C2,$B$1:B1&"-"&$C$1:C1,0),1)),MAX($A$1:A1)+1)
I'm assuming that there are no headers and you have already put 1 in cell A1 for the first record.
It basically checks the whole columns above the formula and if there's already a similar record, it'll assign the previously given unique ID and if not, it'll give a new ID.
This is an array formula, so it only works if you confirm it with Ctrl+Shift+Enter rather than Enter alone.
The IFERROR() is there because MATCH(B2&"-"&C2,$B$1:B1&"-"&$C$1:C1,0) would return an error if it is on row 2 (the first record to check).
Once you put that in the first cell, you can fill down the formula.
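The idea behind that formula, reuse the ID of an earlier identical record and otherwise hand out the next number, can be sketched in Python with a dictionary. The function name `assign_unique_ids` is an assumption for illustration.

```python
def assign_unique_ids(rows):
    """Give identical rows the same ID, numbered in order of first appearance."""
    seen = {}  # maps each distinct row to the ID it received
    ids = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            # New record: the next ID is one more than the IDs handed out so far.
            seen[key] = len(seen) + 1
        ids.append(seen[key])
    return ids

# Sample data from the question.
rows = [("a", 1), ("a", 1), ("a", 2), ("b", 3), ("b", 4)]
print(assign_unique_ids(rows))
```

Because it remembers every record seen so far, this works on unsorted data too, just like the array formula.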
I deal with this issue all the time when structuring a data set into panel data. Say you have multiple columns of data, and each row is identified by the name of someone, like:
ANNE
ROSE
ANNE
FRANK
TOM
ROSE
ANNE
but instead of having each column related to Anne, Rose, Frank, or Tom, you want it to look like this:
1 ANNE
2 ROSE
1 ANNE
3 FRANK
4 TOM
2 ROSE
1 ANNE
So that each name now has a unique numerical identifier that can be used in place of the name.
Make a pivot table of your data and place only the column that has the names (or whatever the identifier may be) into the Rows section. This will single out all the different names used within the dataset. Copy this pivot table and paste it as values anywhere on the sheet, so that the names are in actual cells and not part of a pivot table. To the right of the names, enter 1 next to the first name, then =B1+1 in the cell below it, and fill down so that each name gets a unique number; then copy this column and paste it as values so that the formulas are erased. Finally, go to your original dataset and perform a VLOOKUP so that each name gets whatever unique value was assigned to it in the pasted list. Make sure to copy and paste as values once done to remove the VLOOKUP formula.
Takes literally 2 minutes to do, depending on size of dataset, and is very easy. It will work perfectly every time.

Resources