Excel - Remove duplicate row if column value is null - excel

If I have 3 rows in Microsoft Excel with the following data
id name address
id01 john rundown avenu
id02 bill maptown drive
id01 john null
What is the easiest way to remove the third row, given that id01 already exists AND already has an address? Formula or ribbon buttons... I don't care how.
Thanks for any advice!

There are a number of ways you could do this, including a VBA procedure. However, one easy way that needs no VBA is to use the next available column to mark rows for deletion. If that is column D in the example above, paste the following formula into cell D2:
=AND(COUNTIF(A$2:A2,A2)>1, C2="null")
This can then be pasted down the remaining rows. The A$2 reference stays fixed (because of the dollar sign), while the other A2 references change relative to the cell they are pasted into.
You can then turn on AutoFilter, show only the TRUE rows in column D, delete those rows, and clear the filter.
Let me know if you would prefer an automated solution, as the VBA for this would also be pretty straightforward.
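For anyone who would rather script the check than filter by hand, here is a minimal Python sketch of the same rule. It is not from the original answer: the function name rows_to_delete and the tuple layout (id, name, address) are assumptions made for illustration. It flags a row when its id has already appeared above it and its address is the text "null", which is what the COUNTIF/AND formula tests.

```python
def rows_to_delete(rows):
    """Return the indexes of rows whose id was already seen and whose address is 'null'."""
    seen = set()
    flagged = []
    for i, (row_id, _name, address) in enumerate(rows):
        # Mirrors AND(COUNTIF(A$2:A2,A2)>1, C2="null"):
        # the id occurred in an earlier row and the address cell holds "null".
        # lower() is used because Excel's "=" comparison ignores case.
        if row_id in seen and address.lower() == "null":
            flagged.append(i)
        seen.add(row_id)
    return flagged


# Sample data copied from the question.
rows = [
    ("id01", "john", "rundown avenu"),
    ("id02", "bill", "maptown drive"),
    ("id01", "john", "null"),
]
flagged = rows_to_delete(rows)
cleaned = [row for i, row in enumerate(rows) if i not in flagged]
```

Like the formula, this only flags a "null" row that comes after another row with the same id; a "null" row that appears first would be kept.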

Related

MS Excel - Match 3 columns between two sheets. When matches are found, copy a 4th column from each to a third sheet

When all three columns (Last Name, First Name and DOB) match on any rows between the 2 sheets, I need to have the account numbers from the matching rows listed on a third sheet. There are thousands of rows in each sheet. There will likely be multiple matches for some accounts. I prefer to put the functions on the 3rd sheet so that I can change out the lists in the first 2 sheets without needing to update them.
Sheet1
Acct # Last Name First Name DOB
89158 Stevens John 1/23/2012
Sheet2
Acct # Last Name First Name DOB
124578 Stevens John 1/23/2012
Sheet3
Sheet1 Acct # Sheet2 Acct#
89158 124578
Thank you in advance!
This works with duplicate values
The formula needs to be entered as an array formula (after pasting it, while still in the formula bar, press CTRL+SHIFT+ENTER), with the ranges adjusted to fit your data.
=IFERROR(INDEX(Sheet1!$A$2:$D$50,MATCH(1,($C2=Sheet1!$B$2:$B$50)*($D2=Sheet1!$C$2:$C$50)*($E2=Sheet1!$D$2:$D$50),0),1),"No match found")
The exact same formula can be used again changing Sheet1! to Sheet2!.
This will search for the last name that is in Sheet3!C2, first name that is in Sheet3!D2 and the DOB that is in Sheet3!E2.
I have only locked off the column in case you are looking to use this for a large volume of data and wish to drag it down.
If you want to display the additional account numbers that meet the search criteria and are only using sheet 3 to search for one person, then you will need to look at using INDEX(), MATCH() and SMALL().
I had looked to include this alternative in my answer too but I am now leaving the office. It won't take me long to conjure if you struggle so drop me a comment and I will be happy to explain how it all works.
EDIT: To list all IDs found for the search criteria, leaving blanks where none is found:
=IFERROR(INDEX(Sheet1!$A$2:$D$50,SMALL(IF(COUNTIF($C$2,Sheet1!$B$2:$B$50)*COUNTIF($D$2,Sheet1!$C$2:$C$50)*COUNTIF($E$2,Sheet1!$D$2:$D$50),ROW(Sheet1!$A$2:$D$50)-MIN(ROW(Sheet1!$A$2:$D$50))+1),ROW(Sheet1!1:1)),1),"")
Again, replace Sheet1 with Sheet2 and update the ranges to match what you are searching through. Use CTRL+SHIFT+ENTER in the formula bar to make the formula an array formula; you can then drag it down to cover all the potential matches. It is best to drag it down as many rows as the highest number of duplicate IDs you expect.
Let me know how you get on, if this answers your question, please mark this as an answer using the tick to the left, thank you.
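As a cross-check outside Excel, the three-column match can also be sketched in Python. This is only an illustration and not part of the answer above: the function names and the row layout (acct, last, first, dob) are assumptions, and the second Sheet2 row is made-up test data. It builds an index keyed on (last name, first name, DOB) for each sheet and pairs up every account number that shares a key, so multiple matches are all listed.

```python
from collections import defaultdict


def index_by_person(rows):
    """Group account numbers by the (last name, first name, DOB) key."""
    index = defaultdict(list)
    for acct, last, first, dob in rows:
        index[(last, first, dob)].append(acct)
    return index


def matching_accounts(sheet1, sheet2):
    """Return (sheet1 acct, sheet2 acct) pairs whose three key columns all match."""
    idx1 = index_by_person(sheet1)
    idx2 = index_by_person(sheet2)
    pairs = []
    for key, accts1 in idx1.items():
        for acct1 in accts1:
            for acct2 in idx2.get(key, []):
                pairs.append((acct1, acct2))
    return pairs


# First rows are from the question; the "Smith" row is invented for the demo.
sheet1 = [("89158", "Stevens", "John", "1/23/2012")]
sheet2 = [
    ("124578", "Stevens", "John", "1/23/2012"),
    ("555000", "Smith", "Ann", "2/2/2000"),
]
pairs = matching_accounts(sheet1, sheet2)
```

Note that this comparison is case-sensitive, whereas Excel's "=" is not, so names may need normalising first.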
I ended up putting list A and List B into a sheet and then using this formula: =ISNA(MATCH(B2&C2&D2,I:I&J:J&K:K,0)) Then I created another sheet where I added list B and then list A and used the same formula. Once the calculation was complete for each sheet, I used filters to only list the matches for the first lists in each spreadsheet.
I then took the matches for each sheet and listed them side by side in the third sheet. I made sure the sort was matched between the two lists and then was able to scroll through a page at a time and identify a few users with multiple account numbers in each system.
It's not as automated as I would like, but it is done for now. Thank you Stack Overflow!

Can I pull correct values for a variable from an old spreadsheet into a new one that's missing those values?

I have an excel spreadsheet with several columns, each representing different variables collected from various patients (rows). One of the columns is the unique medical record #, another is a unique visit identification #. The problematic one is "age." I must have inadvertently dragged and replaced the ages of about half of my subjects, since I doubt that >3000 of my 6000 patients are 54 years old.
I have the original file with correct ID# and age pairs, but I've done considerable work on this file and cannot start over. Is there a way in my new file to look at the ID# in column C, go to the old excel file, find that ID#, go over 3 cells to column F (age), copy that age value, go back to the new excel file and paste the correct age for each ID#?
I cannot simply sort both files by ID# and copy/paste all of the ages as a number of the cases have been intentionally removed and so the ID#s wouldn't match up because the total N is different.
I also have SPSS and R available to me, although I'm not particularly proficient with either.
Just, as an example, here's what the two spreadsheets look like:
http://imgur.com/OjZsLEJ
I've manually highlighted the bad values, but in reality there are 3000+ of them and manually checking would be very time consuming.
Thanks in advance!
A VLOOKUP function should work here:
=VLOOKUP(C3,[OldWorkBook.xlsx]Sheet1!$C:$F,4,FALSE)
If you place this formula in row 3 of the age column of the new workbook (not in column C itself, which holds the ID being looked up) and change "OldWorkBook.xlsx" to the name of your old workbook, it should return the correct age from the old workbook. The 4 is the position of column F within the range C:F.
You can then copy that formula and paste it into the remaining cells in that column.
If the values are correct, you can copy them, Right-Click and select "paste values" to solidify them in your new workbook.
If I've understood your question, that should fix the problem. If not, please let me know.
You can do that with a VLOOKUP formula.
It should look like this (check if the cell references are right, and also the file and sheet name).
You should put this in a new column in your "NewFile".
The formula references the "OldFile" and should bring the value for the "F" column in the "OldFile" whenever the values for the "C" column are the same.
This example would be for the second row of the file (I am assuming the first row are column headers).
=VLOOKUP(C2,'[OldFile.xls]Sheet1'!$C$2:$F$6000,4,FALSE)
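Since the question mentions being open to other tools, here is a rough Python sketch of what the VLOOKUP does, under stated assumptions: rows are plain dictionaries with hypothetical keys "id" and "age", and the sample values are invented. It builds an ID-to-age table from the old file and overwrites the age in the new file wherever the ID is found.

```python
def restore_ages(new_rows, old_rows):
    """Return copies of new_rows with 'age' replaced by the value from old_rows."""
    age_by_id = {}
    for row in old_rows:
        # setdefault keeps the first occurrence, like VLOOKUP with FALSE (exact match).
        age_by_id.setdefault(row["id"], row["age"])

    fixed = []
    for row in new_rows:
        corrected = dict(row)  # leave the input rows untouched
        if row["id"] in age_by_id:
            corrected["age"] = age_by_id[row["id"]]
        fixed.append(corrected)
    return fixed


# Invented sample data: the new file has bad ages (54) and fewer/different rows.
old_rows = [{"id": "A1", "age": 34}, {"id": "A2", "age": 61}, {"id": "A3", "age": 47}]
new_rows = [
    {"id": "A1", "age": 54, "note": "x"},
    {"id": "A3", "age": 54, "note": "y"},
    {"id": "A9", "age": 54, "note": "z"},
]
fixed = restore_ages(new_rows, old_rows)
```

One difference from the formula: an ID missing from the old file keeps its current age here, whereas VLOOKUP would show #N/A, which is arguably easier to spot.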

Find and Compare Two Columns Excel (With Screenshots)

I have a spreadsheet that will occasionally get new data that I don't know the contents of, I just have to add it to the spreadsheet. Some of the new data is just updating rows that are already in the spreadsheet, and other data is adding new rows. I'm looking for a way to add a column that will tell me if something has changed in the row when I compare the old spreadsheet to the new one.
The sheets have one column that will always have a unique value among all the rows, so I can use that to match rows if the sheets aren't sorted the same way. Here are some screenshots to show what I'm trying to do:
Old Spreadsheet:
New Spreadsheet:
The only solution I can think of is a large nested IF formula that compares each column one by one, something like:
=IF(Old!B2=New!B2,IF(Old!C2=New!C2,"NO","YES"),"YES")
The problem with that is that it gets very hard to look at since my actual data is using 33 columns (not including this "Changed?" column) and new columns could be added in the future.
I'm not very technical with Excel, nor have I ever used VBA, so I apologize in advance if there is a simple/obvious solution that I'm missing.
Thanks in advance for your help.
Using your example, in the 'New' sheet cell D2 and copied down:
=IF(COUNTIF(Old!A:A,A2)=0,"YES",IF(SUMPRODUCT(COUNTIF(INDEX(Old!A:AG,MATCH(A2,Old!A:A,0),0),LEFT(A2:AG2,254)&"*"))=SUMPRODUCT(COUNTIF(A2:AG2,LEFT(A2:AG2,254)&"*")),"NO","YES"))
VLOOKUP would also work well for this problem.
In D2, the formula would be:
=IF(AND(VLOOKUP(A2,Old!A:C,2,FALSE)=B2,VLOOKUP(A2,Old!A:C,3,FALSE)=C2),"NO","YES")
The column numbers (2 and 3) are the columns that correspond to the data you are trying to match, using the ID column.
It's possible to find the appropriate column using MATCH, provided the column names in the new sheet match the column names in the old sheet.
This makes the formula look more complex, but the column index no longer has to be hard-coded, and Excel will adjust the Old!A:C reference if more columns are inserted inside it.
The formula would look like this to match against column names:
=IF(AND(VLOOKUP(A2,Old!A:C,MATCH($B$1,Old!$1:$1,0),FALSE)=B2,VLOOKUP(A2,Old!A:C,MATCH($C$1,Old!$1:$1,0),FALSE)=C2),"NO","YES")
The difference between this and the previous formula is the use of MATCH($B$1,Old!$1:$1,0) to find the column (using $ signs to anchor the header cell and the header row).
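To show the idea independent of the number of columns, here is a small Python sketch of the "Changed?" check. It is an illustration only: the function name changed_flags, the tuple rows, and the sample values are assumptions. Rows are matched on the unique key column, and a row is flagged "YES" when it is new or when any of its cells differs from the old version.

```python
def changed_flags(old_rows, new_rows, key=0):
    """Map each new row's key to 'YES' (new or changed) or 'NO' (identical)."""
    old_by_key = {row[key]: row for row in old_rows}
    flags = {}
    for row in new_rows:
        old = old_by_key.get(row[key])
        # A missing key means the row was added; otherwise compare every column.
        is_changed = old is None or list(old) != list(row)
        flags[row[key]] = "YES" if is_changed else "NO"
    return flags


# Invented sample data; the first column is the unique identifier.
old = [("k1", "red", 1), ("k2", "blue", 2)]
new = [("k2", "blue", 2), ("k1", "green", 1), ("k3", "pink", 3)]
flags = changed_flags(old, new)
```

Because whole rows are compared, nothing needs to change when a 34th column is added, as long as both sheets have the same columns in the same order.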
In this case, specialized Excel comparison software is a better fit.
My company uses this tool. Check it out.
http://www.suntrap-systems.com/ExcelDiff/
http://www.youtube.com/watch?v=QQgnWr_RT-8

How do I get my formula to always reference to the last sheet?

I currently have 2 worksheets in my Excel file.
The first sheet is the Summary page, which displays a summary of the second sheet.
The second sheet holds the raw data. An example would be a column named Fruits:
Apple
Apple
Apple
Banana
Banana
Pear
In the first sheet, I have formulas that count the number of times each fruit appears, with the results displayed in different cells.
=COUNTIF(Fruits!A2:A7,"Apple")
=COUNTIF(Fruits!A2:A7,"Banana")
What I want to know is whether I can write the formula so that every time I add a new sheet of raw data (a 3rd sheet), the statistics on the first sheet reference the latest sheet to get the information.
(Assuming that the positioning of the data is the same as in the second sheet.)
What I have done so far is to come up with a function GETLASTWSNAME() which always retrieves the name of the last worksheet, but it seems impossible to nest the function within the COUNTIF formula itself.
=COUNTIF((GETLASTWSNAME())!A2:A7,"Apple")
The above formula is how I want my formula to work, but sadly Excel does not allow me to do that.
Any comments would be appreciated. Thanks!
You can use the XLM/range name workaround for this instead of VBA if you prefer.
Define a range name, wshNames, to hold the array of sheet names:
=RIGHT(GET.WORKBOOK(1),LEN(GET.WORKBOOK(1))-FIND("]",GET.WORKBOOK(1)))
This uses David Hager's technique.
Use this Excel formula to extract the last sheet name from the array of sheet names:
=INDEX(wshNames,COUNTA(wshNames)+RAND()*0)
This formula looks at all the sheets, then returns the last one (using COUNTA). The RAND()*0 portion ensures that the formula is volatile and recalculates whenever Excel does.
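If you do use VBA, you will need to ensure your GETLASTWSNAME function is volatile, i.e. it gets recalculated when changes occur. You can then wrap it in INDIRECT:
=COUNTIF(INDIRECT(GETLASTWSNAME() & "!A2:A7"),"Apple")
If a sheet name can contain spaces, wrap it in single quotes: INDIRECT("'" & GETLASTWSNAME() & "'!A2:A7").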
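The idea behind the INDIRECT approach is simply "resolve the last sheet's name at calculation time, then count in that sheet". As a rough illustration only, here is that logic in Python, with the workbook modelled as an ordered dictionary of sheet name to column values. The function name count_in_last_sheet and the sheet name "Fruits2" are hypothetical.

```python
def count_in_last_sheet(workbook, value):
    """Return (last sheet name, number of times value occurs in that sheet)."""
    # Dictionaries keep insertion order, so the last key is the newest sheet.
    last_name = list(workbook)[-1]
    return last_name, workbook[last_name].count(value)


workbook = {
    "Summary": [],
    "Fruits": ["Apple", "Apple", "Apple", "Banana", "Banana", "Pear"],
}
before = count_in_last_sheet(workbook, "Apple")

# Adding a new raw-data sheet changes which sheet the count refers to.
workbook["Fruits2"] = ["Apple", "Pear"]
after = count_in_last_sheet(workbook, "Apple")
```

The count follows whichever sheet is last, which is the behaviour the summary formulas are meant to have.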
In versions of Excel with dynamic arrays (spill) and the new SEQUENCE() function, you can list all your sheet names with the same technique from a single cell, first to last or last to first, your choice. With TRANSPOSE you get a column header for each sheet (note: volatile).
After defining the named formula wshNames as described by Tomalak (thanks for the tip), I used:
=INDEX(wshNames;COUNTA(wshNames)+1-SEQUENCE(COUNTA(wshNames);1;COUNTA(wshNames);-1))
My Excel uses ";" as the argument separator; you may have to replace the semicolons with commas.
Rolf H

How to mask accounts in excel sheet?

I'm working for a bank where customer account numbers of varying lengths need to be masked in an Excel sheet.
Is there any macro or formatting method which could help me do this?
Eg:
Cell No Value:
A10 46579094628
A11 NL6539123747796621
This would turn to
A10 46XXXXXXX28
A11 NLXXXXXXXXXXXXXX21
I want to keep the first 2 characters and the last 2 or 3 characters intact. Please advise.
Who is going to be using this spreadsheet?
I ask because you can certainly create a new column with values computed by manipulating the text in the account-number column and you can get it to look exactly the way you want. But then you'll have to hide the original column. That may be inconvenient because:
Harder to maintain. Who/how are new accounts going to be added?
How do you know users won't just unhide the column? Seems like now you've got a password to manage.
Are you sure bank officials are ok with this?
Option two is to create the spreadsheet by manipulating account numbers from an export from a more secure DB so that they never make it into Excel. Then you don't have to worry about passwords, hidden cells, etc.
If you're looking for a way to create a separate column with the mask you can use the following:
=LEFT(<targetcell>,2)&REPT("X",LEN(<targetcell>)-4)&RIGHT(<targetcell>,2)
Just replace <targetcell> with the cell containing the account number.
I'd then copy the formula down for the full column; copy and paste the formula column "As Value" and delete the original account column to remove the sensitive information. (These steps could be entirely automated in VBA but this is an easy solution.)
Mask by
=LEFT(sheet_org!A1;2) & LEFT("XXXXXXXXXXXXXXXX";LEN(sheet_org!A1)-4) & RIGHT(sheet_org!A1;2)
If sharing without the original content, copy and paste to a new spreadsheet by macro,
or get external data in a new workbook pointing at the sheet with the masked values.
regards,
//t
If your value is in cell A10, then insert the following formula in any blank cell:
=LEFT(A10,2)&REPT("X", LEN(A10)-4)&RIGHT(A10, 2)
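If the masking is done on an export before the data ever reaches Excel, the same LEFT/REPT/RIGHT logic is easy to sketch in Python. This is an illustrative sketch, not a vetted banking control: the function name mask_account and the keep parameter are assumptions, and so is the choice to mask values that are too short entirely (the Excel formula would return an error for those).

```python
def mask_account(account, keep=2):
    """Keep the first and last `keep` characters and replace the rest with X."""
    hidden = len(account) - 2 * keep
    if hidden <= 0:
        # Assumption: values too short to mask partially are masked in full.
        return "X" * len(account)
    # Same shape as LEFT(cell,2) & REPT("X",LEN(cell)-4) & RIGHT(cell,2).
    return account[:keep] + "X" * hidden + account[-keep:]


# The two account numbers from the question.
masked_a10 = mask_account("46579094628")
masked_a11 = mask_account("NL6539123747796621")
```

The account numbers should be passed as text, since long digit strings stored as numbers can lose precision or leading zeros.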
