How to count the number of matches between two excel files - excel

Say I have a data set in a worksheet. I want to compare the contents of column A with the contents of column B in another file and put the result in column C. To do this, I could use the formula
=IFNA(MATCH($A1,'location/[filename.xlsx]worksheet'!$B:$B,0),FALSE)
If I then wanted to count how many matches there were, I would only need to count the entries in column C that contain a numeric value, using
=COUNT($C:$C)
But, what if I wanted to count the number of matches between columns A and B, and each of those columns existed in separate Excel files?
I need cell C1 to calculate the number of matches between entries in column A of file 1 and column B of file 2. Is this possible without editing files 1 or 2? Manually copying the data is one way, but every day has different data, and C2 will need to do the same thing for tomorrow, and C3 for the next day, etc. Manually copying the data would make the workbook size balloon quickly.

Create pivot tables in File 3 of the data from File 1 and File 2. The pivot table for File 2 would have an occurrence count for every entry in the sheet. The pivot table for File 1 would have only row labels, giving you the unique entries that you are going to search for.
Beside each pivot label for File 1, enter a formula similar to the following:
=+GETPIVOTDATA("Item",PivotTableFile1!$H$7,"Item",A4)
You should also add some error checking, which makes the formula:
=IFERROR(GETPIVOTDATA("Item",PivotTableFile1!$H$7,"Item",A4),0)
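The counting logic behind this approach can be sketched outside Excel. The following is a minimal Python illustration, not the answerer's method: the function name and the sample lists are hypothetical, and it assumes both columns have already been read into plain lists.

```python
from collections import Counter

def count_matches(column_a, column_b):
    """Count the entries of column_a that occur at least once in column_b."""
    # Plays the role of the File 2 pivot table: value -> occurrence count.
    occurrences = Counter(v for v in column_b if v is not None)
    # Plays the role of COUNT over the MATCH results in column C.
    return sum(1 for v in column_a if v is not None and occurrences[v] > 0)

# Hypothetical sample data standing in for File 1 column A and File 2 column B.
file1_column_a = ["apple", "pear", "kiwi", "apple"]
file2_column_b = ["pear", "apple", "plum"]
print(count_matches(file1_column_a, file2_column_b))  # 3
```

Like the MATCH-based formula in the question, this counts every row of column A that has a match, so a repeated value in column A is counted each time it appears.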

Related

How to delete duplicates within a large database on a column by column basis

I have a large set of data (over 3000 columns) for work, with text in every cell. The columns are unrelated to each other. Within each column there are potentially duplicates, and I need to keep only the first instance. However, there is no way to highlight the duplicates on a column-by-column basis: when the whole data set is highlighted, Excel treats the rows as related data and looks for duplicates on a row-by-row basis. I have tried using macros (I am a total novice), but the macros don't work.
Image shows the columns of data with some duplicates in the columns.
If you use a modern version of Excel (Excel 365 or Excel 2021), you could use the UNIQUE function, which returns an array of the unique elements.
Just duplicate the sheet and in the copy delete everything below the lines with "Processor 1" and "Processor 2". Then in the first column use UNIQUE referring to the first respective column of the original sheet.
Just fill the formula right (Ctrl + R) and in the new sheet each column will have only the unique elements.
You can then paste the whole resulting table as values and delete the original one.
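For readers who want to see the per-column logic spelled out, here is a rough Python sketch of what applying UNIQUE to each column independently amounts to. The function name and sample data are hypothetical, and it assumes each column is available as a list.

```python
def unique_per_column(columns):
    """Return each column with duplicates removed, keeping the first occurrence."""
    # dict.fromkeys preserves insertion order, so the first instance of each
    # value survives and later repeats are dropped.
    return [list(dict.fromkeys(column)) for column in columns]

# Hypothetical sample: two unrelated columns, each with its own duplicates.
columns = [["a", "b", "a"], ["x", "x", "y"]]
print(unique_per_column(columns))  # [['a', 'b'], ['x', 'y']]
```

Each column is processed on its own, so a value repeated across different columns is not treated as a duplicate.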

Excel copy cell value when row does not contain certain text

I'm looking to avoid VBA if possible and manipulate some Excel data.
I have data with one column of values, containing blanks and numbers. For example column A: 1, 5, , 2. And another column containing text, for example column B: Include, exclude, include, exclude. Not sure how to put a table here, but basically:
1        include
5        exclude
(blank)  include
2        exclude
I am looking to (in any order) copy data to a new column that both does not contain the word "exclude" in its row and does not contain blank data in its cell.
To get rid of blanks I use
=IFERROR(INDEX(AL$2:AL$3309,SMALL(IF(AL$2:AL$3309<>"",ROW(AL$2:AL$3309)-ROW(AL$2)+1),ROWS(AW$2:AW2))),"")
where AW is where I want copied values and AL is where the values and blanks are. Data in rows 2 through 3309.
How can I also copy only data that does not include the text "exclude" in its row? (There is a separate column for each value with a text value) If this can be done in one function, great! If not, that is fine.
Thank you!
As a small separate issue, is there a way to auto detect the last value in a column? So that for my formula, instead of updating every instance of 3309 to 3310 if I were to add another row of data, I could just say "last row containing data" if that makes sense.
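The filter being asked for can be stated precisely with a short sketch. This is only an illustration of the intended logic in Python, not an Excel solution; the function name is hypothetical, and it assumes "exclude" should be matched case-insensitively anywhere in the text cell.

```python
def keep_values(values, labels):
    """Keep non-blank values whose row label does not contain 'exclude'."""
    return [
        value
        for value, label in zip(values, labels)
        # Drop blanks (None or empty string) and any row labelled "exclude".
        if value not in (None, "") and "exclude" not in label.lower()
    ]

# The example from the question: column A values and column B labels.
values = [1, 5, None, 2]
labels = ["Include", "exclude", "include", "exclude"]
print(keep_values(values, labels))  # [1]
```

Because the lists are iterated to their end, there is no hard-coded last row to update when another row of data is added.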

Using two values in a sheet to filter and return values from a table in another sheet

I'm fairly new to coding and I've been googling around for the last few hours trying to solve this problem, but it seems to be a little beyond what I'm able to do, so I would be very grateful for some help.
In Sheet1, I have a table which has columns between M - CV (175 columns). For each column, I have an "ID number" value in row 3. From Row 6 to the end of the table, I have several "search terms" separated by commas in the column CV.
In Sheet2, the corresponding "ID Numbers" are in column B. Column AN contains strings.
For each ID Number value in Sheet1, I'm looking to find all the corresponding cells in Sheet2 where the ID number in Column B is the same, and Column AN of Sheet2 contains at least one of the "search terms" in column CV.
For each ID number, I'm hoping to join the entries in Column AN of Sheet2 which match the criteria above and paste them into Row 5 of the respective column in Sheet1.
I've gone around in quite a few circles trying to do this and I'm back to square 1 with no code to show for it.
I've tried to research both the autofilter function and using for loops. The research I've done indicates that for loops are rather slow to run for a large data set.
I'm hoping to find a solution which is as easy to read and understand as possible.
I hope I've given enough information for everyone to understand and help.
Thank you in advance.
My Excel subscription has expired and I've started using Google Sheets for most of my spreadsheet work, so I tested this there; some conversion may be required. I also did this using formulas, not VBA; not sure if that changes things for you.
If I understand correctly, you have two sheets with a shared key column, sheet 1 contains search terms across multiple columns, and sheet 2 contains search terms comma delimited in a single column.
With this setup we want to bring the search term column of sheet 2 into the correct row of sheet 1 by key using VLOOKUP. I made a named range in sheets which contained all my data on sheet 2 and called it "dst". My formula was then =VLOOKUP(A2, dst, 7, true) since my key in sheet 1 was in column A, dst was the range I was searching, my column with my delimited search terms was column 7 in relation to dst, and I had ordered sheet 2 by key. I pasted this formula relatively down all rows as needed.
We want to construct a regex string using our search terms across multiple columns in sheet 1, into a single cell. I used =JOIN("|", B2:E2) on sheet 1 since my search terms were in columns B:E, and this resulted in a regex that looked like this for me: alligator|dog|rabbit|lizard where alligator, dog, rabbit, and lizard, were all search terms in that row. Paste down relative as needed.
We want to run our regex against our search target cell containing the comma delimited search terms. I ran =REGEXMATCH(F2, G2) where F2 was my delimited search terms from sheet 2, and G2 was my constructed regex for the row. Paste down relative as needed.
A screenshot of my completed sheet 1:
Once you know which cells have matches you can do whatever you want.
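The JOIN and REGEXMATCH steps above can be sketched in Python as follows. This is an illustration only: the function name and sample strings are hypothetical, and escaping the terms with re.escape is an added assumption so that each term is matched literally (the sheet formula passes the terms to the regex unescaped).

```python
import re

def row_matches(search_terms, target_text):
    """Return True if any search term occurs in the comma-delimited target text."""
    # Equivalent of =JOIN("|", B2:E2): build one alternation pattern per row.
    cleaned = [term.strip() for term in search_terms if term.strip()]
    if not cleaned:
        return False
    pattern = "|".join(re.escape(term) for term in cleaned)
    # Equivalent of =REGEXMATCH(F2, G2): case-sensitive search anywhere in the text.
    return re.search(pattern, target_text) is not None

terms = ["alligator", "dog", "rabbit", "lizard"]
print(row_matches(terms, "cat, dog, fish"))  # True
print(row_matches(terms, "cat, fish"))       # False
```

Note that this is a substring match, so the term dog would also match a cell containing dogfish; the sheet formula behaves the same way.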

Excel if cell in range contains specific text, export and compile on new worksheet

I am working with a decent-size data set in Excel. It has 5 columns and up to 5000 rows of data. Column A data is irrelevant to my goal, as are Columns B and E. The two columns I care about are Column C, which contains numbers only, and Column D, which contains text (comments left about the numbers in Column C). For example, the cell D2 would state: “Machine is acting erratically and repairs have not been attempted. Issue has been ongoing for three months and multiple service call requests have been submitted.”
I am curious to know if there is a formula I can use to review ALL cells in Column D to determine if certain words are being used, such as “machine” or “not attempted”, and in some cases BOTH of those words/phrases. If the text IS contained in Column D, can the data of Column C be compiled and moved into a new sheet to form a list?
Another way of looking at it: review Column D and, if key words or phrases exist, export the cell from Column C in the same row to a list on a new sheet. Is this something that is outside of a mere formula and would require going into macros?
So far I’ve managed to get Excel to count the number of times the phrases and words I’m looking for can be found, but I would like it to be able to locate the referenced logs and move the log numbers into their own sheet.
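To make the requirement concrete, here is a small Python sketch of the selection logic. It is not an Excel formula or macro; the function name, the require_all flag, and the sample rows are all hypothetical, and matching is assumed to be a case-insensitive substring test.

```python
def logs_mentioning(rows, phrases, require_all=False):
    """Return the Column C numbers whose Column D comment mentions the phrases.

    rows: iterable of (log_number, comment) pairs.
    require_all: if True every phrase must appear, otherwise any one is enough.
    """
    wanted = [phrase.lower() for phrase in phrases]
    combine = all if require_all else any
    return [
        number
        for number, comment in rows
        if combine(phrase in comment.lower() for phrase in wanted)
    ]

# Hypothetical sample rows: (Column C number, Column D comment).
rows = [
    (101, "Machine fault, repair not attempted"),
    (102, "Routine check"),
    (103, "Repair not attempted"),
]
print(logs_mentioning(rows, ["machine", "not attempted"]))                    # [101, 103]
print(logs_mentioning(rows, ["machine", "not attempted"], require_all=True))  # [101]
```

An exact-phrase test is strict: the sample comment in the question says "have not been attempted", which would not match the phrase "not attempted".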

Unique Values in Drop Down List

I have two workbooks, a source file and an output file.
The source file contains information which occupies some drop-down lists in the output file.
For each drop-down list I have two 'names' (in the name manager) linked to it. For instance, the name 'SchemeID' in my output file refers to the same name in my source file. It consists of several rows/columns of data, and that populates my drop-down list.
There are some repeats in the source file (e.g. different names associated with the same number) which are appearing in the drop down lists, and I'd like to get rid of them so the list only displays unique values. Is it possible to do this using data from different workbooks?
The easiest way is going to be to go to the source workbook, Data ribbon -> Remove Duplicates. Anything else will require a couple of in-between data sheets or VBA to do cleanly. If your data doesn't change often, this manual method should be fine.
EDIT as you seem restricted from editing the Source File
In a different sheet (let's say Sheet2) you will need a formula which pulls in all of your data from your 2 source Names. To my knowledge there is no clean non-VBA way to combine two Named Ranges, so we will need to do this by dumping the data down to a sheet, and then picking it up again.
There are a lot of ways to do this, but I'm going to pick the one broken down to the most steps; it will be a pretty messy sheet, but you can hide it if you need to, which shouldn't be a huge concern as a non-VBA method will need a data dump sheet anyway.
In Cell D1, we will put the number of rows in SchemeID, as follows:
=ROWS(SchemeID)
In Cell D2, we will put the number of rows in SchemeID2 (which I assume is the name for your second list, which you didn't specify):
=ROWS(SchemeID2)
In column B we will be dumping in the data from both named lists, without sorting or eliminating duplicates. Do this as follows, starting at B1 and dragged down (if you want headers this gets a little trickier, so I will assume no headers).
=IF(ROW()<=$D$1,INDEX(SchemeID,ROW()),INDEX(SchemeID2,ROW()-$D$1))
This says - if the row is not more than the total entries in SchemeID, then pull the value from SchemeID at the current row #. Otherwise, pull the entry from SchemeID2, at the current row# less the total rows in SchemeID (so if we are at row 10, but SchemeID ends at row 4, then row 10 will pull the 6th entry from SchemeID2).
Now in Column A, we will be checking to see which row is a duplicate, as follows starting at A2 [A1 is hardcoded as 1]:
=IF(ISERROR(MATCH(B2,$B$1:B1,0)),A1+1,A1)
This checks to see if there's a duplicate of the current value in column B, in the rows above the current row. If there is, it keeps the same index # as the row above (which will be ignored when we use this as the index key next). If there's no duplicate, it adds 1 to the index number.
In cell D3, put the following formula to track how many unique IDs there are:
=MAX(A:A)
Next, in column C, put your new list, which pulls from column B for as many unique values as there are [drag down]:
=VLOOKUP(ROW(),A:B,2,0)
This is your new non-duplicate list. To make a clean reference to it, create a new named range with the following formula:
=INDIRECT("Sheet2!R1C3:R"&Sheet2!$D$3&"C3",FALSE)
This will simplify to [assuming 20 rows of data in column C, based on what D3 says]:
=Sheet2!R1C3:R20C3
Which is the R1C1-style reference for Sheet2!C1:C20.
This new named range should be what your dropdown lists refer to on your other tab.
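The helper-column logic above (dump both lists into column B, build a running index in column A, then look up each index in column C) can be checked with a short Python sketch. The function name and sample values are hypothetical; this mirrors the steps for illustration and is not a replacement for the sheet formulas.

```python
def merged_unique(scheme_id, scheme_id2):
    """Mirror the helper columns: returns (column_a_index, column_c_unique)."""
    # Column B: both named lists stacked, with no sorting or de-duplication.
    combined = list(scheme_id) + list(scheme_id2)

    # Column A: running index that increases only when a value is new.
    index = []
    seen = set()
    running = 0
    for value in combined:
        if value not in seen:
            seen.add(value)
            running += 1
        index.append(running)

    # Column C: for each index 1..MAX(A:A), take the first row carrying that
    # index, which is what an exact-match VLOOKUP on column A returns.
    unique = [combined[index.index(k)] for k in range(1, running + 1)]
    return index, unique

index, unique = merged_unique([10, 20, 20], [20, 30])
print(index)   # [1, 2, 2, 2, 3]
print(unique)  # [10, 20, 30]
```

The last value of the index list corresponds to cell D3, the number of unique IDs that sizes the named range.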

Resources