I have one Excel file that contains 2 columns: Words & Definition (3000 rows). I have another Excel file which contains only words not definition (200 rows).
How can I extract only those rows (words and definitions, from the 3000) whose words appear in the 2nd Excel file (200 rows)?
Basically I want to filter those.
In SQL I would write:
SELECT table1.* FROM table1 JOIN table2 ON table1.words = table2.words
How do I implement this in Excel?
Please give me the procedure too...
If it's only the values from the 2 columns you need to copy, I'd use the VLOOKUP() function in the 2nd file to look up and return the matches from the first file. Don't forget to set the range_lookup parameter to FALSE so that only exact matches are returned.
If you really need to copy the entire row, then a loop in a VBA macro would be a better choice.
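If it helps to see the lookup logic outside Excel, here is a minimal sketch in Python of the same "keep only rows whose word is in the short list" filter. The function name and the sample words are made up for illustration. The comparison ignores case, as VLOOKUP does.

```python
def rows_in_word_list(rows, wanted_words):
    """Keep (word, definition) rows whose word appears in wanted_words."""
    # Case-insensitive comparison, similar to how VLOOKUP matches text.
    wanted = {w.casefold() for w in wanted_words}
    return [row for row in rows if row[0].casefold() in wanted]


# Hypothetical sample data standing in for the 3000-row and 200-row files.
big_list = [("apple", "a fruit"), ("brook", "a small stream"), ("cedar", "a tree")]
small_list = ["Cedar", "apple"]

matches = rows_in_word_list(big_list, small_list)
for word, definition in matches:
    print(word, "-", definition)
```

The output keeps the order of the big list, not the order of the short list.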
I want to exclude rows in an Excel table based on certain values.
For example:
I need to exclude all rows where column A is equal to any of these numbers (5840, 4302, 4432, and so on).
The table will be huge, so I want to filter it down to only the data that I need.
One way is to use Excel's Table feature together with the FILTER() spreadsheet function. NB. You will need a relatively recent Excel version for this (FILTER is available in Microsoft 365 and Excel 2021 or later). Using a Table provides some extra useful functionality (such as automatically adding rows and allowing reference by column name).
The OP's input data may already be a Table, if so, this first step can be skipped.
Put the input and filter list into Tables (see the Excel help page on creating tables). After the Tables had been created, I used the Table Design menu (which appears in the menu bar when a cell in the Table is selected) to turn off the row banding format and header filters. This is also where you can rename the Tables. I have named them "Input" and "Exclude".
For the filtered output, choose where you want the output to start (cell H3 in my example), and enter a formula to copy the headers: =Input[#Headers]. Of course you can copy and paste the headers manually if you like. Here I've used the Format Painter to copy across the cell formats for the headers.
In the next cell down (H4 in my example), enter this formula: =FILTER(Input,(LEN(Input[ID])>0) * ISERROR(MATCH(Input[ID],Exclude[IDs to exclude],0))).
You should be able to add or delete new rows (right-click in the Table and choose Delete) in both the Input and Exclude tables, and the output should react (if you have Calculation set to Automatic).
NB. The Output range is NOT a Table. Excel doesn't let you convert dynamic ranges into Tables.
EDIT: If you don't want to use Tables, you can simply supply the ranges as the parameters to the FILTER function.
In this example =FILTER(B4:D13,(LEN(B4:B13)>0) *ISERROR(MATCH(B4:B13,F4:F5,0)))
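For readers who want to check what the FILTER formula is doing, here is a rough Python sketch of the same logic: keep a row only if its ID is non-blank (the LEN(...)>0 part) and not found in the exclude list (the ISERROR(MATCH(...)) part). The function name and sample rows are illustrative, not taken from the workbook.

```python
def exclude_rows(rows, excluded_ids):
    """Return rows whose first field is non-blank and not in excluded_ids."""
    excluded = set(excluded_ids)
    return [
        row
        for row in rows
        if row[0] not in (None, "")   # LEN(ID) > 0
        and row[0] not in excluded    # MATCH(...) returned an error
    ]


# Hypothetical input: (ID, description) pairs, including one blank ID.
input_rows = [(5840, "a"), (1001, "b"), (None, "blank"), (4302, "c"), (2002, "d")]
ids_to_exclude = [5840, 4302, 4432]

kept = exclude_rows(input_rows, ids_to_exclude)
print(kept)
```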
I have data in the format shown in the first image with thousands of rows and different attributes. I would like for the data to be changed to a format in the 2nd image which would be easier to use and analyze. Are there any useful functions that could help me achieve this?
I would recommend the following (it is not as easy as applying a function, but it works). My Excel is in Portuguese, but I hope the images help you.
Copy the first column to another worksheet
Delete duplicates
Insert a new line before the first line on the new worksheet
Copy the second column of the original worksheet and paste it transposed (right click, Transpose). Remember to copy each value only once. For example, in your case, just copy Name, Age, Height and Weight.
Finally, you can add a header row to your original data. For example, call the first column ID, the second Attribute and the third Value. Then filter the Attribute column by the first possibility (Name, in your case) and copy the Value column to the corresponding column on the new worksheet. Then filter by Attribute = Age and do the same. Repeat for all possibilities.
So, on the last step, the number of copy-paste is exactly the number of different values you have for your second column.
It is possible to write VBA code to automate this, if you are going to do it frequently.
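As an illustration of what the manual steps achieve, here is a small Python sketch of the same long-to-wide reshaping. It assumes the data is a list of (ID, Attribute, Value) triples, as in the header row suggested above. The function name and sample values are invented.

```python
def pivot_long_to_wide(records):
    """Turn (id, attribute, value) triples into one row per id."""
    attributes = []   # distinct attributes, in first-seen order (the new headers)
    wide = {}         # id -> {attribute: value}
    for record_id, attribute, value in records:
        if attribute not in attributes:
            attributes.append(attribute)
        wide.setdefault(record_id, {})[attribute] = value

    table = [["ID"] + attributes]
    for record_id, values in wide.items():
        # Missing attributes are left as empty cells.
        table.append([record_id] + [values.get(a, "") for a in attributes])
    return table


# Hypothetical sample in the long format described in the question.
long_data = [
    (1, "Name", "Ann"), (1, "Age", 30),
    (2, "Name", "Bob"), (2, "Age", 25), (2, "Height", 180),
]

wide_table = pivot_long_to_wide(long_data)
for row in wide_table:
    print(row)
```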
#1. You can use a distinct-values formula to get your data index (column A),
or use the Remove Duplicates button on the Excel ribbon.
A2:A10 is the source list.
B1 is the cell above the first cell of the distinct list.
#2. Create a row of unique attribute names (see my screen capture).
#3. You can use a VLOOKUP formula to match your data from Sheet1 to Sheet2.
I am trying to write an excel function/combination of functions that will loop without the use of macros.
I have one table with two columns and another table with 4 columns. The only important columns are the first two.
I need a function that searches through the first column of the top table and finds all materials on line 51. Material codes that are on rows with line numbers that match 51 will then be placed into the line 51 table below the first table.
NB: the first table (the one with data) will most likely never be sorted, so I cannot make my life easier via sorting. This first table will also be changing, as it is fed from an Excel add-in. I must not use macros/VBA, as it needs to be sustainable for the average Excel user to comprehend. I've tried nested IFs inside VLOOKUPs and many combinations of formulas. I am thinking INDEX is the way to go, but I can't find a way to use INDEX to reach my desired goal.
Even though the Line column contains some multi-line entries (e.g. 8/9), these are not a factor, as my formula should only look for a specific line, 51.
This formula should work:
=IFERROR(INDEX($B$2:$B$6,AGGREGATE(15,6,(ROW($A$2:$A$6)-1)/(ISNUMBER(SEARCH(G2,$A$2:$A$6))),ROW(1:1))),"")
It is an array formula so limit the reference range to the extents of the dataset.
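To make the formula easier to follow, here is a rough Python sketch of its logic: collect the positions where the line column contains the target, then return the k-th of them, which is what AGGREGATE(15,6,...,ROW(1:1)) does as the formula is filled down. Like SEARCH, the test is a case-insensitive substring match, so a target of 51 would also match a line such as 151. The function name and material codes are made up.

```python
def kth_match(lines, materials, target, k):
    """Return the material on the k-th row whose line contains target, else ''."""
    needle = str(target).lower()
    # Positions of matching rows, smallest first (the SMALL part of AGGREGATE).
    hits = [i for i, line in enumerate(lines) if needle in str(line).lower()]
    # IFERROR(..., "") equivalent: blank once the matches run out.
    return materials[hits[k - 1]] if 1 <= k <= len(hits) else ""


# Hypothetical unsorted data: line numbers and material codes.
lines = [51, "8/9", 51, 12, "51"]
materials = ["M-100", "M-200", "M-300", "M-400", "M-500"]

line_51 = [kth_match(lines, materials, 51, k) for k in range(1, 5)]
print(line_51)
```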
I have about 70,000 excel files each of size about 300kb. The first column is date and time and rest columns are all doubles.
How do I merge them into 1 single csv file or bring them all together into one sheet of an excel work book. I was thinking about using Matlab but it runs out of memory.
You could try RDBMerge. It's a free add-in for Excel built for this sort of work.
Alternatively, you may find the following info useful:
http://ask.metafilter.com/106144/Combining-a-ton-of-Excel-files-into-one-Excel-file
If it's a one off this is what I would do:
Copy all details from both sheets to a single sheet in a new workbook
Sort descending by your date and time column
Duplicate the date and time column at the end (this is for VLOOKUP; if you are comfortable using INDEX you can avoid this)
Duplicate the sheet
Delete the data in the date and time column
Data / Remove duplicates to get a distinct set
Use a VLOOKUP on some reference column to get the date and time (make the last argument of the VLOOKUP 0)
This will pull in the first instance it finds for the data (i.e. the latest date/time, because of the descending sort).
Should take about 2 mins all up.
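The net effect of the sort, remove-duplicates and lookup steps above is to keep the latest date/time for each distinct record. A minimal Python sketch of that idea follows, assuming each row is reduced to a (key, timestamp) pair and that timestamps are text in a sortable format such as YYYY-MM-DD HH:MM. The names and data are illustrative only.

```python
def latest_per_key(rows):
    """Map each key to its most recent timestamp."""
    latest = {}
    for key, stamp in rows:
        # Keep the largest timestamp seen so far for this key.
        if key not in latest or stamp > latest[key]:
            latest[key] = stamp
    return latest


# Hypothetical combined data from both sheets.
combined = [
    ("A", "2020-01-01 10:00"),
    ("B", "2020-01-02 09:00"),
    ("A", "2020-01-03 08:00"),
]

result = latest_per_key(combined)
print(result)
```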
I'm in the process of working with an Excel file that contains two columns (old URL and new URL). But it contains about 20,000 rows.
And I have another file containing about 400 old/new URL that needs to be imported in the big ±20,000 rows file.
I have to do all kinds of processing, like:
- Find all duplicate rows (the same two columns more than once). That functionality would be in a column, and it would be good to run that function each time I add a row, to check whether that URL combination already exists in the file
Note that I already turned the sheet into a table.
2 questions now:
1) Should I do some kind of VLOOKUP between the ±20,000 rows sheet and the ±400 rows sheet, or use VBA? I don't know what would be the best way to do this (i.e. if a row from the ±400 rows sheet is not in the ±20,000 rows sheet, add it). Should I use VLOOKUPs or populate arrays in VBA (speed-wise)? If I use VLOOKUP, is it true that it is possible to put the VLOOKUP function in a sheet and refer to it in every row instead of putting a VLOOKUP function directly in every row?
2) How can I optimize the 20,000 rows sheet? Right now, each time I want to sort or filter, it takes an eternity to redraw and it freezes my PC for that time!
Thanks for your help.
Firstly, to omit the dupes from the 400ish row sheet that need to be added in, use a COUNTIFS formula against the big sheet, then sort by this value and only copy in the rows where the count is 0 (or an error).
Secondly, I would probably do the same thing in the big sheet but referencing itself; anything with a value above 1 is a dupe.
Lastly, are there formulas in the 20,000 row sheet? I could set up a 20,000 row sheet with just a "1" in range A1:A20000 and doing anything on it would be super quick. It all comes down to what data you have in there and what you can do to reduce its load on the system (i.e. convert formulas to values if they no longer need to be calculated).
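The two COUNTIFS checks can be pictured with a short Python sketch: one function finds incoming old/new URL pairs that are not yet in the big sheet (count of 0), the other finds pairs that occur more than once in the big sheet (count above 1). The function names and URLs are invented. Note that COUNTIFS ignores case, while this sketch compares text exactly.

```python
from collections import Counter


def new_pairs(big_rows, incoming_rows):
    """Incoming (old, new) pairs not already present; each returned once."""
    seen = set(big_rows)
    additions = []
    for pair in incoming_rows:
        if pair not in seen:      # COUNTIFS against the big sheet would be 0
            additions.append(pair)
            seen.add(pair)        # avoid adding the same new pair twice
    return additions


def duplicated_pairs(rows):
    """Pairs that occur more than once (COUNTIFS against itself > 1)."""
    counts = Counter(rows)
    return [pair for pair, n in counts.items() if n > 1]


# Hypothetical data: rows are (old URL, new URL) tuples.
big = [("/a", "/x"), ("/b", "/y"), ("/a", "/x")]
incoming = [("/b", "/y"), ("/c", "/z"), ("/c", "/z")]

to_add = new_pairs(big, incoming)
dupes = duplicated_pairs(big)
print(to_add, dupes)
```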
Excel 2007 has a built-in feature you can use for your situation, available both in the UI and in VBA: Data tab -> Data Tools group -> Remove Duplicates, or Range.RemoveDuplicates
For example data:
Click the Remove Duplicates button:
And you are done!
The VBA equivalent is:
ActiveSheet.Range("$A$1:$B$10").RemoveDuplicates Columns:=Array(1, 2), Header:=xlYes
Note that the 1 & 2 do not mean columns A & B. They are column positions relative to the selected Range (its first and second columns).
If your worksheet only contains 2 columns, you could use UsedRange instead.