Excel: delete row if column contains value from to-remove list

Let's say that I've got a sheet - number one - with over 5000 rows (say, columns 'A' - 'H' each).
In another sheet - number two - I have a "to-remove-list" - a single column 'A' with 400 values, each an alphanumeric string (example: xxx1234).
I have to remove each entire row from sheet number one if its column 'E' contains any value from the "to-remove-list" (from column 'A' of sheet number two).
By removing the entire row, I mean deleting the row and shifting the rows below it up (not leaving a blank row).
How do I achieve that? Any help would be much appreciated.

Given sheet 2:
ColumnA
-------
apple
orange
You can flag the rows in sheet 1 where a value exists in sheet 2:
ColumnA ColumnB
------- --------------
pear =IF(ISERROR(VLOOKUP(A1,Sheet2!A:A,1,FALSE)),"Keep","Delete")
apple =IF(ISERROR(VLOOKUP(A2,Sheet2!A:A,1,FALSE)),"Keep","Delete")
cherry =IF(ISERROR(VLOOKUP(A3,Sheet2!A:A,1,FALSE)),"Keep","Delete")
orange =IF(ISERROR(VLOOKUP(A4,Sheet2!A:A,1,FALSE)),"Keep","Delete")
plum =IF(ISERROR(VLOOKUP(A5,Sheet2!A:A,1,FALSE)),"Keep","Delete")
The resulting data looks like this:
ColumnA ColumnB
------- --------------
pear Keep
apple Delete
cherry Keep
orange Delete
plum Keep
You can then easily filter or sort sheet 1 and delete the rows flagged with 'Delete'.
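The flagging logic above can also be sketched outside Excel. This is only an illustration in Python, not part of the answer: the function name `flag_rows` and the sample lists are assumptions, and unlike VLOOKUP the comparison here is case-sensitive.

```python
def flag_rows(values, to_remove):
    """Return "Delete" for each value found in to_remove, otherwise "Keep"."""
    remove_set = set(to_remove)  # set lookup avoids rescanning the list for every row
    return ["Delete" if value in remove_set else "Keep" for value in values]


sheet1_column_a = ["pear", "apple", "cherry", "orange", "plum"]
sheet2_column_a = ["apple", "orange"]

flags = flag_rows(sheet1_column_a, sheet2_column_a)
for value, flag in zip(sheet1_column_a, flags):
    print(value, flag)
```

The printed pairs correspond to the Keep/Delete column shown in the resulting data above.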

I've found a more reliable method (at least on Excel 2016 for Mac) is:
Assuming your long list is in column A, and the list of things to be removed from this is in column B, then paste this into all the rows of column C:
= IF(COUNTIF($B$2:$B$99999,A2)>0,"Delete","Keep")
Then just sort the list by column C to find what you have to delete.

Here is how I would do it if working with a large number of "to remove" values that would take a long time to manually remove.
-Put Original List in Column A
-Put To Remove list in Column B
-Select both columns, then "Conditional Formatting"
-Select "Highlight Cells Rules" --> "Duplicate Values"
-The duplicates should be highlighted in both columns
-Then select Column A and then "Sort & Filter" ---> "Custom Sort"
-In the dialog box that appears, select the middle option "Sort On" and pick "Cell Color"
-Then select the next option "Sort Order" and choose "No Cell Color" "On bottom"
-All the highlighted cells should be at the top of the list.
-Select all the highlighted cells by scrolling down the list, then click delete.

For a more modern answer, bring the data into Power Query and merge the second sheet into the first with a left outer join. Expand the merged column. Use the drop-down filter to keep only the rows where the expanded column is null (these are the rows with no match in the to-remove list). Remove the test column, then use Close & Load to send the result back to Excel.

New Answer 9/28/2022
Now you can use the FILTER function, which simplifies this.
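=FILTER(A3:B7, ISNA(MATCH(A3:A7,D3:D4,0)))
Here A3:B7 is taken to be the data and D3:D4 the to-remove list. ISNA(MATCH(...)) is TRUE for rows whose value is not on the list, so FILTER returns only the rows to keep.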
Note: the question asks to modify the original data sheet. In general this is not recommended, because you are altering the input; it is better to keep a working sheet with the required transformations.
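The result the question asks for, a copy of the data without the listed rows while the input stays untouched, can be sketched in Python as follows. The function name `remove_listed_rows`, the `key_index` parameter and the sample rows are hypothetical and only serve the illustration.

```python
def remove_listed_rows(rows, to_remove, key_index=0):
    """Return a new list holding only the rows whose key cell is not in to_remove."""
    remove_set = set(to_remove)
    return [row for row in rows if row[key_index] not in remove_set]


# Hypothetical data: the key is in the first cell of each row.
data = [
    ("pear", 1),
    ("apple", 2),
    ("cherry", 3),
    ("orange", 4),
    ("plum", 5),
]
to_remove = ["apple", "orange"]

kept = remove_listed_rows(data, to_remove)
print(kept)
```

Because a new list is built, `data` itself is left unchanged, which matches the advice about not altering the input.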

Related

Excel: Turn duplicates in a list into blank cells

I have a list of names that includes duplicates (all in Column A of my worksheet). What I am trying to do is convert the duplicates into blank cells. I do need to keep the values in the rows where column A ends up with blanks. In the example below, the first table shows what I currently have in my list and the second table shows what I need the result to look like. The names Mike, Bill and Jim are the duplicates that are converted to blanks, but next to those blanks I still have the values I need (Xs in columns 1 and 2). The reason I want to get blanks is that I will filter those blanks out and remove them from a master table I am working on.
I used the available tools in Excel to identify duplicates (conditional formatting), and then "remove duplicates", but when I do that Mike, Bill, and Jim are deleted and all of my data shifts around, so it doesn't work for what I am trying to do.
I am wondering if there is a formula that I can possibly use? Or perhaps a macro/VBA? Any help would be greatly appreciated! Thank you.
name | column_1 | column_2
bill | |
jim | |
mike | |
sandra | |
mike | x |
bill | x | x
dave | x | x
jim | x |

name | column_1 | column_2
bill | |
jim | |
mike | |
sandra | |
 | x |
 | x | x
dave | x | x
 | x |
I have your data starting in cell A1 with a header (called Name in A1). Then I use this formula in cell B2: =COUNTIF($A$2:A2,A2). Note that the first part of the range is anchored with an absolute reference (the '$' signs), so the range always starts at the top and, when copied down, grows to cover every row from the first cell (with Bill in it) down to the current row.
So column B shows how many times that name has occurred so far in column A. I would filter on column B to show all cells NOT equal to 1 (uncheck the filter box for '1') and remove those rows; you will be left with the first occurrence of each name.
When you delete, delete entire rows in the filtered data (this will remove the spaces).
Create a helper column and put the following formula into all of its cells:
=COUNTIF(A$2:A2;A2)-1
Every entry that duplicates a value in a row above it will contain a value > 0.
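The running count that both formulas rely on can be illustrated with a short Python sketch. The function names `running_counts` and `blank_repeats` are made up for this example, and the comparison is case-sensitive, unlike COUNTIF.

```python
from collections import Counter


def running_counts(names):
    """For each position, count how often that name has appeared so far (like COUNTIF($A$2:A2,A2))."""
    seen = Counter()
    counts = []
    for name in names:
        seen[name] += 1
        counts.append(seen[name])
    return counts


def blank_repeats(names):
    """Keep the first occurrence of each name and replace later occurrences with an empty string."""
    return [name if count == 1 else "" for name, count in zip(names, running_counts(names))]


names = ["bill", "jim", "mike", "sandra", "mike", "bill", "dave", "jim"]
print(running_counts(names))
print(blank_repeats(names))
```

Only the name column is blanked here; the other cells of each row would stay as they are, which is what the question asks for.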

Fill in table based a column of categories in Excel

I have a table that looks like this:
Type Value
Movie 5
Food 3
Gas 10
Food 2
.... ....
And there's a second table I want to fill in with "Value" based on the type in the first table, so that the corresponding rows look like this:
Rent Food Movie Gas Clothing ... ( appear in specific order bc they are subcategories)
5
3
10
2
The title row is already there, so I was thinking there might be some kind of lookup method to do this? How do I do that?
Your second table appears to hold one value per row, but the rows don't have a label. Each row does correlate to the original row number; is this by design or coincidence?
If this is by design, then you can keep those two columns (hide them if you like) and get a unique list of categories by copying your labels to a new column, removing duplicates via the Data tab, then using Paste Special > Transpose in C1 to create the column headers.
So columns A and B remain unchanged.
Row 1 contains the headers, starting at column C.
Your data starts at C2.
This is the formula:
=IFERROR(VLOOKUP(C$1,$A2:$B2,2,FALSE),"")
Drag it down and to the right.
You can copy and Paste Special > Values when done to remove the formulas.
For something with only a hundred or a thousand cells this will be one of the easier options, but I would not do this on large tables; for those I would use Power Query or VBA.
Assuming your first table is in Sheet1 and the second table is in Sheet2, you may try entering this in Sheet2!A2:
=IF(Sheet1!$A2=A$1,Sheet1!$B2,"")
and dragging it across and down the whole table. Hope you get how it works and that it does what you need.
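What both formulas compute, placing each value under the header that matches its type and leaving the other cells empty, can be sketched in Python. The function name `spread_by_type` and the header list are assumptions taken from the example in the question.

```python
def spread_by_type(records, headers):
    """Build one output row per (type, value) record, with the value under the matching header."""
    return [
        [value if record_type == header else "" for header in headers]
        for record_type, value in records
    ]


headers = ["Rent", "Food", "Movie", "Gas", "Clothing"]
records = [("Movie", 5), ("Food", 3), ("Gas", 10), ("Food", 2)]

table = spread_by_type(records, headers)
print(headers)
for row in table:
    print(row)
```

Each output row keeps the position of its source row, so row n of the second table always corresponds to row n of the first.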

Highlight duplicates, ignoring same row

I have a worksheet containing names in 2 dimensions. Each row represents a general location, every other column represents a specific slot in that location (each location has the same number of available slots), alternating with a parameter belonging to that name. There is a name in each cell. Here's a simplified version to show what my data looks like:
Location 0 ( ) 1 ( ) 2 ( ) 3 ( )
Garden Tim 3 Pete 1 Oscar 1 Lucy 2
Room1 Lucy 1 Tim 1 Lucy 5 Anna 1
Kitchen Frank 1 Frank 2 Frank 1 Lucy 1
What I want to achieve is to highlight (using conditional formatting, I'm open to alternative methods though) each entry that also appears in another row. So basically it should highlight duplicates, but ignore duplicates in the same row. The first row and column are to be excluded from the operation (no big deal, I just don't select them), as are the parameter columns (this is a big deal, as this pretty much breaks everything I've tried including the first answers given). I have access to the entire meaningful data area (all cells containing names) by the name "entries" and all meaningful entries in a given row by the name "row".
In my example above, all Tim and Lucy entries should be highlighted because they have duplicates in other rows. Pete, Oscar and Anna are unique, so they're not highlighted. Frank, while having duplicates, only has them in the same row, no other row contains Frank, so he should not be highlighted. Excel's own highlight duplicates would highlight Frank, while handling all the others correctly.
How can I modify the conditional formatting's behaviour to ignore duplicates in the same row?
The following formula (thanks to @Dave) resulted in a #VALUE! error:
=(COUNTIF(entries;B2)-COUNTIF(row;B2))>0
Or you could just do this (there is no need for an IF() when the formula is used in the Conditional Formatting formula box):
=COUNTIF($B$2:$I$4;$B2)>COUNTIF($B2:$I2;$B2)
This single formula should prevent the parameters from being highlighted
select B2:I2 and
put this (exactly) in the conditional formatting box: =AND(NOT(ISNUMBER(B2));COUNTIF($B$2:$I$4;B2)>COUNTIF($B2:$I2;B2))
Something like this:
=(COUNTIF($B$2:$E$4,B2)-COUNTIF($B2:$E2,B2))>0
The first COUNTIF counts all instances in the whole range; the second one subtracts the count of entries in the current row. If there are more instances in the entire range than in the row, the formula returns TRUE.
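The comparison of the two counts can be sketched in Python. It assumes the grid holds only the name cells (the parameter columns and the Location column are already left out), and the function name `duplicated_in_other_rows` is hypothetical.

```python
from collections import Counter


def duplicated_in_other_rows(grid):
    """Return a grid of booleans: True where the name also occurs in some other row."""
    total = Counter(name for row in grid for name in row)  # count over the whole range
    result = []
    for row in grid:
        in_row = Counter(row)  # count within the current row only
        result.append([total[name] > in_row[name] for name in row])
    return result


grid = [
    ["Tim", "Pete", "Oscar", "Lucy"],    # Garden
    ["Lucy", "Tim", "Lucy", "Anna"],     # Room1
    ["Frank", "Frank", "Frank", "Lucy"], # Kitchen
]

for row in duplicated_in_other_rows(grid):
    print(row)
```

Frank appears three times in total and three times in his own row, so the comparison is false for him, while Tim and Lucy are flagged.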

Pick Out Email From a Row

I have over 100k rows similar to:
Joe | 123 | email@domain.com
Bob | 456
Ben | 567 | Denver | email@email.com
The emails could be at any point/cell in that row and some cells have # in them.
How could I get the following output:
Joe | email@domain.com
Bob |
Ben | email@email.com
or maybe just tag the email address onto the end of the row and then hide the other columns?
I've tried various bits and pieces but am getting nowhere fast.
Now that I know the email can be in any column: add a column to the left of column A. In your example you then have B1="Joe", etc.
Then put this formula in A1:
=IFERROR(OFFSET(A1,0,SUM(IFERROR(IF(FIND("@",$B1:$O1)>0,1,0),0)*COLUMN($B1:$O1))-1),"")
Adjust the range $B1:$O1 to match your needs. I suggest you make it as tight as possible because array formulas are resource-intensive.
========================
If the email addresses were always in the last column of a given row, and if there weren't any blanks in the row before the last value, you could just do this:
First, add a column to the left of column A. In your example you then have B1="Joe", etc.
Then put this formula in cell A1:
=OFFSET(A1,0,COUNTA($B1:$XFD1))
and fill it down all your rows. (I'm using Excel 2010, hence XFD in the above formula. Adjust as you see fit; just make sure you use a range that covers the maximum number of columns in your dataset.)
A bit of fun really.
Embolden the names and, for the rest, Replace All @ without formatting with @ and Font style Bold. Copy into Word, Select All, Find what: * with Font Not Bold, Use wildcards, and replace with nothing. Copy back into Excel and Go To Special, Blanks. Right-click on one of the selection and Delete…, Shift cells left.
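The idea behind the first formula, scanning each row for the cell that contains the email marker, can be sketched in Python. The function name `pick_email` and its `marker` parameter are hypothetical; the default marker "@" is an assumption about what identifies an email cell, and the first cell of each row is assumed to be the name.

```python
def pick_email(row, marker="@"):
    """Return the first cell after the name that contains the marker, or "" if there is none."""
    for cell in row[1:]:
        if marker in str(cell):  # str() so that numeric cells can be checked too
            return cell
    return ""


rows = [
    ["Joe", 123, "email@domain.com"],
    ["Bob", 456],
    ["Ben", 567, "Denver", "email@email.com"],
]

result = [[row[0], pick_email(row)] for row in rows]
for name, email in result:
    print(name, "|", email)
```

Unlike the array formula, this stops at the first matching cell, so a row with more than one matching cell still yields a single value.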

Add cell string to another cell if 2 cells are the same for 2 rows

I'm trying to make a macro that will go through a spreadsheet, and based on the first and last name being the same for 2 rows, add the contents of an ethnicity column to the first row.
eg.
FirstN|LastN |Ethnicity |ID |
Sally |Smith |Caucasian |55555 |
Sally |Smith |Native American | |
Sally |Smith |Black/African American | |
(after the macro runs)
Sally |Smith |Caucasian/Native American/Black/African American|55555 |
Any suggestions on how to do this? I read several different methods for VBA but have gotten confused as to what way would work to create this macro.
EDIT
There may be more than 2 rows that need to be combined, and the lower row(s) need to be deleted or removed some how.
If you can use a formula, then you can do this:
A couple of assumptions I'm making:
Sally is in cell A2 (there are headers in row 1).
No person has more than 2 ethnicities.
Now, for the steps:
Put a filter on and sort by name and surname. This handles cases where a person's rows are separated (i.e. if there is a 'Sally Smith' at the top, there are no more 'Sally Smith' rows further down the sheet after different people).
In column D, put the formula =IF(AND(A2=A3,B2=B3),C2&"/"&C3,"")
Extend the filter to column D and filter out all the blanks.
What this does is check whether cells A2 and A3 are equal (names are the same), and whether cells B2 and B3 are equal (surnames are the same).
If both are true, it's the same person, so we concatenate the two ethnicities (using & is another way to concatenate besides CONCATENATE()).
Otherwise, if the name, the surname, or both are different, the cell is left blank.
To delete the redundant rows altogether, copy/paste values on column D, filter on the blank cells in column D and delete. Sort afterwards.
EDIT: As per edit of question:
The new steps:
Put a filter and sort by name and surname. (already explained above)
In column E, put the formula =IF(AND(A1=A2,B1=B2),E1&"/"&C2,C2) (I changed the formula to adapt to the new method)
In column F, put the formula =if(and(A1=A2,B1=B2),F1+1,1)
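In column G, put the formula =IF(F3<=F2,1,0) (when the count in the next row does not increase, the current row is the last one for that person, including a person who has only one row)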
In column H, put the formula =if(and(D2="",A1=A2,B1=B2),H1,D2) (this takes the ID wherever it goes).
Enter the formulae starting from row 2. Step 3 puts an incrementing number against the rows of people with the same name.
Step 4 checks for when column F goes back to 1. This identifies your 'final rows to be kept'.
Here's my output from those formulae:
The green rows are what you keep (notice that there is 1 in column G that allows you to quickly spot them), and the columns A, B, C, E and H are the columns you keep in the final sheet. Don't forget to copy/paste values once you are done with the formulae and before deleting rows!
If the first Sally is in A1, then =IF(AND(A1=A2,B1=B2),C1&"/"&C2,"") copied down as appropriate might suit. This assumes that, where the names are not the same, a blank ("") is preferred to repeating the C value.
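Since the question asks for a macro, the merging logic itself can be sketched in a language-neutral way. The Python below is only an illustration, not VBA: the function name `merge_people` and the tuple layout (first name, last name, ethnicity, ID) are assumptions based on the example table, and it handles any number of rows per person without sorting first.

```python
def merge_people(rows):
    """Merge rows that share first and last name, joining ethnicities with "/" and keeping the first ID."""
    merged = {}  # dicts keep insertion order, so people stay in their original order
    for first, last, ethnicity, person_id in rows:
        key = (first, last)
        if key not in merged:
            merged[key] = {"ethnicities": [], "id": ""}
        merged[key]["ethnicities"].append(ethnicity)
        if person_id and not merged[key]["id"]:
            merged[key]["id"] = person_id  # take the ID from whichever row has it
    return [
        (first, last, "/".join(info["ethnicities"]), info["id"])
        for (first, last), info in merged.items()
    ]


rows = [
    ("Sally", "Smith", "Caucasian", "55555"),
    ("Sally", "Smith", "Native American", ""),
    ("Sally", "Smith", "Black/African American", ""),
]

print(merge_people(rows))
```

The output has one row per person, so the lower rows do not need to be deleted in a separate step.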
