Get unique sheet from 2 separate sheets - excel

I have a single Excel document with 2 sheets. The first sheet contains "active" clients and the 2nd "inactive" clients, but we want to merge both into a 3rd sheet, "all clients". We want to ensure that there are no duplicate rows. Column A in both sheets is the "identifier", which is a 16-digit numeric value. Both sheets have the same columns, so effectively I want to match column A in both sheets and return the entire row if it has not been found yet. There are around 1.2 million rows combined in both sheets, which is why I cannot just copy and paste them into a single document.
How would I go about doing this?

The advice in the comments is good, but even after duplicates are removed the de-duplicated list may not fit on one sheet. I would suggest starting by determining how many duplicates there are and deciding which list to delete them from (otherwise you might end up with an incomplete combined list and no practical way to extend it). In both sheets add a column with:
=MATCH(A1,Sheet2!A:A,0)
(referencing Sheet2 in the formula on Sheet1, and Sheet1 in the formula on Sheet2) and copy down to suit. Rows that return a number have a match in the other sheet; rows that return #N/A do not.
Then check that the de-duplicated combined total is less than 1,048,576 rows. If it is more, the rows won't fit on a single sheet without an additional set of columns. Even if it is less, a database is to be preferred, though not obligatory, with the Excel version possibly convenient for upload.
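As a rough illustration of the same check done outside Excel, here is a minimal Python sketch. It assumes each sheet has already been read into a list of rows whose first element is the column A identifier; the sample rows and the names `merge_unique` and `EXCEL_MAX_ROWS` are made up for the example.

```python
# Sketch only: each row is a list whose first element is the identifier
# (column A). Identifiers are kept as strings so no digits are lost.
EXCEL_MAX_ROWS = 1_048_576  # row limit of a single worksheet


def merge_unique(active, inactive):
    """Combine two row lists, keeping the first row seen for each identifier."""
    seen = set()
    merged = []
    for row in active + inactive:
        key = row[0]
        if key not in seen:
            seen.add(key)
            merged.append(row)
    return merged


# Made-up sample rows standing in for the two sheets.
active = [["1000000000000001", "Ann"], ["1000000000000002", "Bob"]]
inactive = [["1000000000000002", "Bob"], ["1000000000000003", "Cy"]]

all_clients = merge_unique(active, inactive)
duplicates = len(active) + len(inactive) - len(all_clients)
fits_on_one_sheet = len(all_clients) <= EXCEL_MAX_ROWS
print(len(all_clients), duplicates, fits_on_one_sheet)
```

Counting the duplicates first, as suggested above, tells you whether the combined list can go back into a single sheet at all.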

Related

Macro to Look up words in a column from a word list on separate sheet, then copy row to new sheet

I'm still pretty clumsy with macros and have only managed to get this working by running lots (I mean lots!) of separate scripts to achieve the task, and it takes way too long handling large data sheets.
I need a way of filtering out certain rows based on a list of words on another sheet.
I have a sheet of raw data containing the stocked items, warehouse locations, quantities, etc.
I need to look through the 'Long description' (Sheet1, Column E) for words that appear in the word list on a separate sheet (Sheet2, Column A). If there is no match, the row should be copied to a new sheet.
So for instance, if I want to filter out the word 'Guard', I'll add it to the list, and only rows not containing 'Guard' will be copied over.
So adding 'Guard' to my word list on Sheet2 ignores rows 4 & 5 and copies the other rows to Sheet3.
The raw data can be thousands of lines, and my word list is a couple of hundred words I need to filter out.
I can do this manually, which can take hours or days! I've tried automating this one script at a time, but it still seems to get it wrong, or crash in the process.
I've found a few scripts that are close to what I need, but they only look for an exact match in the cell, not a partial string.
I also want to copy the row, not move it, as I want to preserve the raw data on Sheet1.
I hope this makes sense :)
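The filter described in this question can be sketched in Python as below. This is only an illustration of the partial-match logic, not a macro: it assumes the rows and the word list have already been read into lists, that the long description is the fifth element of each row (Column E), and that matching should ignore case. The function name and the sample rows are hypothetical.

```python
def rows_without_words(rows, words, column=4):
    """Return copies of rows whose text in `column` contains none of `words`.

    Matching is a case-insensitive substring test, so 'Guard' also
    excludes 'Mudguard'. Index 4 corresponds to Column E.
    """
    lowered = [w.lower() for w in words if w]
    kept = []
    for row in rows:
        text = str(row[column]).lower()
        if not any(w in text for w in lowered):
            kept.append(list(row))  # copy, so the raw data is left untouched
    return kept


# Made-up sample data: item, location, quantity, unit, long description.
raw = [
    ["A1", "W1", 3, "ea", "Chain guard, steel"],
    ["A2", "W1", 5, "ea", "Brake lever"],
    ["A3", "W2", 1, "ea", "Front mudguard"],
]
word_list = ["Guard"]

filtered = rows_without_words(raw, word_list)
print(filtered)
```

The original list is not modified, which matches the requirement to copy rows rather than move them.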

Matching multiple criteria across 2 sheets and pulling some data from that match in excel

I have two sheets of data. One is a list of 4000+ companies and some data about each company (including a CUSIP and an issue date). The other lists stock prices per day for that list of companies, spanning multiple years.
I need to match the CUSIP and the issue date from the first sheet with those of the second, extract a number from sheet 2 where both match, and put it in sheet one in a column next to the other data for that company.
Sheet 1
Sheet 2
I tried =VLOOKUP(E1076&O1076;Sheet1!A:Sheet1!K;11;FALSE) but all this did was give me a #NAME error. The same happened when I tried to do this on the same sheet.
I tried =INDEX(W:AP,MATCH(1,(X:X=D5)*(AE:AE=N5),0),42) but that just tells me it isn't a formula to begin with
Combined Sheets
In Column R:
{=INDEX(AG:AG,MATCH(D2&N2,U:U&AB:AB,0))}
will work for you but will likely be slow (make sure to enter it with Ctrl+Shift+Enter).
You can try to mitigate the lag by using defined ranges (e.g. U2:U4000&AB2:AB4000), but since your list is ever growing I'd guess that the lag will come back pretty quickly.
To keep things faster, I suggest you use a helper column where you concatenate U and AB. Let's say column AC:
=U2&AB2
(copied all the way down)
You can then use a simple INDEX/MATCH:
=INDEX(AG:AG,MATCH(D2&N2,AC:AC,0))
You could also concatenate D and N to another column and use that column as your lookup value.
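The helper-column idea amounts to building the combined key once and then looking it up directly. A minimal Python sketch of that idea follows; it is not the Excel formula itself, and the function name, column positions and sample values are all made up for illustration.

```python
def build_index(rows, key_cols, value_col):
    """Map a composite key (tuple of the key columns) to the value column."""
    index = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        # Keep the first occurrence, as MATCH(..., 0) returns the first hit.
        index.setdefault(key, row[value_col])
    return index


# Made-up price rows: (CUSIP, date, price).
prices = [
    ("AAA111", "2015-01-02", 10.0),
    ("AAA111", "2015-01-05", 11.5),
    ("BBB222", "2015-01-02", 20.0),
]
index = build_index(prices, key_cols=(0, 1), value_col=2)

# Made-up company rows: (CUSIP, issue date).
companies = [("AAA111", "2015-01-05"), ("BBB222", "2015-02-01")]
matched = [index.get(company) for company in companies]
print(matched)
```

A tuple key is used here instead of string concatenation so that two different pairs can never produce the same combined key. Pairs with no match come back as `None`, the counterpart of `#N/A`.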

how to optimize speed excel 2007 (±20,000 rows)

I'm in the process of working with an Excel file that contains two columns (old URL and new URL). But it contains about 20,000 rows.
And I have another file containing about 400 old/new URL that needs to be imported in the big ±20,000 rows file.
I have to do all kinds of processing, like:
- Find all duplicate rows (the same two columns more than once...). That functionality would be in a column, and it would be good to run that function each time I add 1 row to check if that URL combination already exists in the file
Note that I already turned the sheet into a table.
2 questions now:
1) Should I do some kind of vlookup between the ±20,000 rows sheet and the ±400 rows sheet, or use VBA? I don't know what would be the best way to do this (i.e. if a row from the ±400 rows sheet is not in the ±20,000 rows sheet, add it...). Should I use vlookups or populate arrays in VBA (speed-wise)? If I use vlookup, is it true that it is possible to put the vlookup function in a sheet and refer to it in every row instead of putting a vlookup function directly in every row?
2) How can I optimize the 20,000 rows sheet? Right now, each time I want to sort or filter, it takes an eternity to redraw and it freezes my PC for that time!
Thanks for your help.
Firstly, to omit the dupes from the 400ish-row sheet that needs to be added in, use a COUNTIFS formula against the big sheet, then sort by this value and only copy in rows where the value is less than 1 (or an error).
Secondly, I would probably do the same thing in the big sheet but referencing itself; anything with a value above 1 is a dupe.
Lastly, are there formulas in the 20,000 row sheet? I could set up a 20,000 row sheet with just a "1" in range A1:A20000 and doing anything on it would be super quick. It all comes down to what data you have in there and what you can do to reduce its load on the system (i.e. convert formulas to values if they no longer need to be calculated).
Excel 2007 has a built-in feature, also available from VBA, that you can use for your situation: Range.RemoveDuplicates, or Data tab -> Data Tools group -> Remove Duplicates.
For example data:
Click the Remove Duplicates button:
And you are done!
The VBA equivalent is:
ActiveSheet.Range("$A$1:$B$10").RemoveDuplicates Columns:=Array(1, 2), Header:=xlYes
Note that the 1 and 2 do not mean Columns A and B of the worksheet. They are column positions within the selected Range.
If your worksheet only contains 2 columns, you could use UsedRange instead.
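For readers who want to see the logic spelled out, here is a small Python sketch of duplicate removal on two columns that keeps the first occurrence of each pair. It is an illustration under assumptions, not Excel's implementation: the URL pairs are assumed to be already loaded as tuples, and the function name and sample values are hypothetical.

```python
def remove_duplicates(pairs):
    """Drop repeated (old URL, new URL) pairs, keeping the first occurrence."""
    seen = set()
    unique = []
    for pair in pairs:
        if pair not in seen:
            seen.add(pair)
            unique.append(pair)
    return unique


# Made-up sample data: the big list and the smaller list to import.
big = [("/a", "/x"), ("/b", "/y"), ("/a", "/x")]
small = [("/b", "/y"), ("/c", "/z")]

# Appending the small list and de-duplicating adds only the missing pairs.
merged = remove_duplicates(big + small)
print(merged)
```

Set membership tests take roughly constant time, so a single pass like this stays fast at 20,000 rows.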

Merge two unique excel spreadsheets, remove one spreadsheet when needed

Hopefully I can explain this decently.
I am attempting to merge two unique Excel spreadsheets, with some of the same data, into one spreadsheet. When needed I would like to remove the data from the incoming spreadsheet. I am doing this as it would make it easier to edit one "like" spreadsheet, rather than keep and update two copies. I do not want to hide the incoming data, I NEED to completely remove it when needed.
Thanks!
It depends on what the spreadsheets look like and what, exactly, you mean by merge.
If, for example, the two worksheets contain a table each, then you could copy/append one table to the bottom of the other and use Excel's Remove Duplicates feature (on the Data tab) to delete rows.
The duplicates can be identified either by a single code-number column, by all of the columns (meaning that the entire row is duplicated) or by a selection of columns. Be aware that the first duplicated row is kept and the subsequent duplicates are removed.
If, on the other hand, you want to find values in the rows of one of the worksheets, based on a code number contained in a column of the other worksheet, and insert them into specific cells, then this requires more effort, perhaps with the help of the VLOOKUP function (or similar).

Combining Excel sheets/groups of columns by a condition in Excel 2007

Is there a way to combine 2 Excel sheets (or groups of columns inside one Excel sheet) so that the rows in one sheet/group are appended to the other sheet/group where certain column values match?
To clarify:
Let's say I have 2 sheets, Sheet1 and Sheet2. Sheet1 has the columns A,B,C,D. Sheet2 has columns A,E,F,G. Column A in both sheets contains the same data but differently sorted (it is not sorted in a conventional way, alphabetically or numerically). I need to combine these 2 sheets into one, but they need to be combined so that the values in column A match (if possible the result should be ordered in the same way as Sheet2).
Ideally, the functionality I'm looking for would need to be like SQL's INNER JOIN command.
I'm using Excel 2007.
Thanks
I think you basically described the VLOOKUP function.
You have your two sheets, now you want to create a list, which extends A,B,C,D to A,B,C,D,E,F,G.
For that, you could just use
Sheet1!E1=VLOOKUP(Sheet1!A1,Sheet2!A:G,5,FALSE)
Sheet1!F1=VLOOKUP(Sheet1!A1,Sheet2!A:G,6,FALSE)
Sheet1!G1=VLOOKUP(Sheet1!A1,Sheet2!A:G,7,FALSE)
If you need to create an extra sheet3 as a result, use this:
Sheet3!A1=Sheet1!A1
Sheet3!B1=VLOOKUP(Sheet3!A1,Sheet1!A:D,2,FALSE)
Sheet3!C1=VLOOKUP(Sheet3!A1,Sheet1!A:D,3,FALSE)
Sheet3!D1=VLOOKUP(Sheet3!A1,Sheet1!A:D,4,FALSE)
Sheet3!E1=VLOOKUP(Sheet3!A1,Sheet2!A:G,5,FALSE)
Sheet3!F1=VLOOKUP(Sheet3!A1,Sheet2!A:G,6,FALSE)
Sheet3!G1=VLOOKUP(Sheet3!A1,Sheet2!A:G,7,FALSE)
Hope this interpretation was correct.
Edit:
By the way, because Excel is not mainly intended to function as a database, this operation is a bit messy: it does not scale dynamically, at least with the second approach, using a third sheet. You will have to copy row 1 down at least far enough to match the last used row of Sheet1. And if you copy it down further, so you won't have to worry about it for a while, you might need to error-proof against the empty cells.
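For comparison, the INNER JOIN behaviour asked for in the question can be sketched in a few lines of Python. This is an illustration only: it assumes both sheets are already loaded as lists of rows with the key in the first position, and the function name and sample rows are made up.

```python
def inner_join(sheet1, sheet2):
    """Join rows on their first column, keeping Sheet2's row order."""
    by_key = {}
    for row in sheet1:
        by_key.setdefault(row[0], row)  # first match wins, like VLOOKUP
    joined = []
    for row in sheet2:
        match = by_key.get(row[0])
        if match is not None:  # keys missing from Sheet1 are skipped
            joined.append(match + row[1:])
    return joined


# Made-up sample rows: Sheet1 holds A,B,C,D and Sheet2 holds A,E,F,G.
sheet1 = [
    ["k2", "b2", "c2", "d2"],
    ["k1", "b1", "c1", "d1"],
    ["k3", "b3", "c3", "d3"],
]
sheet2 = [
    ["k1", "e1", "f1", "g1"],
    ["k2", "e2", "f2", "g2"],
    ["k4", "e4", "f4", "g4"],
]

joined = inner_join(sheet1, sheet2)
print(joined)
```

Iterating over Sheet2 gives the result in Sheet2's order, and rows whose key appears in only one sheet are left out, as with an inner join.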
