Matching columns in excel - excel

Long time lurker first time posting. I have a problem regarding matching columns in excel. I searched for different methods but I can't seem to find something that fits my needs and requirements.
I have two columns. I need to check and see if certain process's or values(column c) match a master list(column a). I have a rule set up for matches between the two columns and it highlights the matches in the master list column when compared and fills it green.
The problem I am having is that the master list contains duplicates. So two rows will have the same data. For example. a1=str123 and a4=str123.
When using the Match command, it turns both cells in column a (a1 & a4) green but this occurs even if the match column only contains one instance of the data (c5=str123).
I am looking for a way to do a Match but the master list to only turn a cell green if there is one instance of it and then ignore the match that occurred when checking for matches for another cell.
So basically I am looking for a way to setup matching between two cells but once one match has been found to ignore it when going through the rest of the list.
Any help would be appreciated.

This can be done by conditional formatting using two match formulas. I assume you have a fixed set of master values in column a, say in cells A2:A6. I will assume you have values to test if they exist in column C, say in C2:C6 and other columns are blank. In column B enter numbers 1..5 in B2..B6 as an index to the master values. In column D enter the formula =MATCH(C2,$A$2:$A$6,0) in D2 and copy down. In column E enter the formula =MATCH(B2,$D$2:$D$6,0) in E2 and copy down. Select the master range A2:A6 and choose conditional formatting, using the formula =ISNUMBER($E2) and format green fill or whatever you want. With this combination the formulas in column D find the first matching master item, and formula in col E tests to see if the index in col B is in the list of matches in col D.

Related

How to find non matching records from two columns while accounting for duplicate values in Excel

I have two large columns.
Column A contains 100,000 different numbers/rows. Column B contains 100,210 numbers/rows. They have the same numbers except column B has 210 extra rows. I need to be able get the values of that extra 210 rows.
The issue im having is that the numbers in these rows are not unique.
For example,
Column A contains the following numbers: 2,1,3,4,5,5,6,7
Column B contains the following numbers: 1,2,3,4,5,5,5,5,6,6,7,8
I want the outcome result to be: 5,5,6,8
I can't seem to wrap my head around a way to do this.
I have the two columns in a text file that im importing into excel. If there are better ways to do it outside of excel, I am open to it too.
With the Dynamic Array formula Filter:
=FILTER(B1:B12,COUNTIF(OFFSET(B1,0,,SEQUENCE(ROWS(B1:B12))),B1:B12)>COUNTIF(A:A,B1:B12))
Without FILTER:
Put this in the first cell and copy down:
=IFERROR(INDEX(B:B,AGGREGATE(15,7,ROW(B1:B12)/(COUNTIF(OFFSET(B1,0,,ROW(INDEX($ZZ:$ZZ,1):INDEX($ZZ:$ZZ,ROWS(B1:B12)))),B1:B12)>COUNTIF(A:A,B1:B12)),ROW($ZZ1))),"")
Try to follow these steps, supposing that Column A has less values than the Column B and the rows start at 1:
A. Create Column C.
In the cell C1 place the function: =COUNTIF(A:A;B1)
Copy this function to the rest of cells, for all items of Column B. So, cell C2 will have the function =COUNTIF(A:A;B2) and so on.
B. Create column D.
In the cell D1 place the function: =COUNTIF($B1:$B1;B1)
Copy this function to the rest of cells, for all items of Column B. So, cell D2 will have the function =COUNTIF($B$1:$B2;B2) and so on.
C. Create column E.
In the cell E1 place the function: =IF(D1<=C1,"Exists","Missing")
Copy this function to the rest of cells, for all items of Column B. So, cell E2 will have the function =IF(D2<=C2,"Exists","Missing") and so on.
D. Filter to show only the rows that Column E values are "Missing".
Of course you can combine all above 3 columns to one (e.g. in Column F), so these cells will have the functions:
F1: =IF(COUNTIF($B$1:$B1,B1)<=COUNTIF(A:A,B1),"Exists","Missing")
F2: =IF(COUNTIF($B$1:$B2,B2)<=COUNTIF(A:A,B2),"Exists","Missing")
and so on
Explanation:
In column C we count how many times the value of the respective cell
of Column B exist in the whole Column A.
In Column D we count how many times we have "met" this value in Column B so far.
In Column E we check if we have "met" the value more times that it exists in Column A. If indeed we have "met" it more times, then we mark the cell as "missing"
Tested with the example you provided and works okay.
I hope it helps!
Good luck!
EDIT - Addition of Screenshot

Excel - See if cells in one column contain text from another

Currently I am using the below to see if cells in column B exist in Column A...
=IF(ISERROR(MATCH(B1,A:A,0)),"False","True")
... However I've just realised the need to not check for an exact match, but rather if the cells in column B contain text that appear in any of the cells in column A.
For example, if a cell in column B contains "TIC081_CL1" then I could return "true" if column A contains a cell with only "TIC081"...
Can anyone help me with this please, as my current working only works for eact matches. Many thanks for any assistance!!!
Since it is guaranteed that there will be an underscore "_" separating the values you are searching for we only need to check 2 distinct conditions:
If the entire value B1 is found within A:A
If the value before the _ is found within A:A
We can accomplish that by using an additional MATCH derived from your original formula:
=IF(ISERROR(MATCH(MID(B1,1,FIND("_",B1)-1),A:A,0)),IF(ISERROR(MATCH(B1,A:A,0)),"False","True"),"True")
You can alternatively do this using 2 COUNTIF formulas:
=IF(COUNTIF(A:A,B1)+COUNTIF(A:A,MID(B1,1,FIND("_",B1)-1))>0,"True","False")

Find the difference between data in 2 columns in Excel

I have 2 sets of data. I put it in Excel e.g. column A and column B. Now I want to know which data from B is part of column A. I run this formula =IF(COUNTIF($A$1:$A$327238,B1)>0,"Exist", "Nope")
Then I 'filter it and look only 'Exist'. Based on that I know that all data in B that has label 'Exist' is part of column A
Now I want to know opposite i.e. which data from A are part of B. For that reason I use the same formula but I replace the data in columns i.e. data from B now in A and vice versa.
Then I randomly verify results.
For case 1 it looks it works fine but for second case it looks it's not accurate.
My assumption: should it work in case 2 as well ( maybe I just was not very accurate in some way ) and I should expect it to work?
Thanks
In cell C1 (assuming your data starts from 1st row) type the following =IF(A2=B2,"equal","no"), and then populate the same formula to the last row where there is still data, so that for row N, your formula in column C is =IF(AN=BN,"equal","no"). After that you will just need to count the cells with value "no" to know the differences. Sorry if I didn't get the question correctly.
Ok, assuming that the two sets of data are in columns A and B (they might be of different sizes), and the last rows of data are L and M respectively, click on D1 and type the following: =IFNA(INDEX(B$1:B$5,MATCH(A1,B$1:B$5,0),1),"Unique"). Drag down to apply this formula on D1 - DL. That's it, you have the duplicate elements. Since the duplicate elements are the same in both columns - A and B, you don't need to repeat this for column B. Note, that for all the unique elements the corresponding rows of column D have the word "Unique", so if you want the unique elements, you can just get the elements from A with the mentioned row numbers:
Just select any column's first row cell and type the following formula: =IF(D1="Unique",INDEX(A$1:A$L,ROW(D1)),"Duplicate").

Is there a way to delete cells if their value is contained in another column?

I have a spreadsheet with multiple columns with a few thousand rows and I would like to find the cells that are common across all columns. Is there a function that I can use to check if a cell value exists in a set of cells/column?
To find out if a value exist in all columns but in any row you can put this equation in the next open column and drag down:
=AND(MATCH(A1,B:B,0),MATCH(A1,C:C,0))
This assumes you have data in column A, B & C and the equation is in column D. now you can sort on column D for unique values.
Depending on your data type you might get an error. If that is the case try this:
=AND(IFERROR(MATCH(A1,B:B,0),FALSE),IFERROR(MATCH(A1,C:C,0),FALSE))

Count values in groups

I have a table with student IDs separated in groups. I need a handy way to count the total number of students in each group and populate it after the last row of each group (marked with ??)
Currently I just enter =COUNT() and then manually figure out the top and bottom borders of the range for each group. Not convenient at all.
I was thinking that a possible solution could be one of the following:
A some kind of pivot table permutation. I failed on this one.
Excel Data->Outline->Subtotals functions. Again, fail. It keeps creating new rows in my table.
A universal formula that can be pasted into each ?? cell. Not the most graceful solution, but still would do.
A macro. As a last remedy if nothing else works.
The following steps will calculate the subtotals while preserving the structuring and formatting of your worksheet.
Put this formula in cell C1 and copy the formula down the column:
=IF(NOT(ISERROR(SEARCH("Total",A1))),COUNTA(INDIRECT("B"&MATCH(LEFT(A1,LEN(A1)-7),A:A,0)+1&".B"&(MATCH(A1,A:A,0)+1))),IF(B1="","",B1))
Apply a conditional format to cell C1 with the formula rule =(MOD(ROW(C1),2)=0) and blue fill to match the shading on the other rows. Copy the format down the column using Paste Special Format.
Either hide column B, or copy the values in column C to column B using Paste Special Values and hide Column C. If you decide to copy the values to column B, you won't need to set the conditional formats.
Here is what the formula does:
First, check whether the formula's row is a Total row, by searching the cell in column A of the row for the word "Total," using the SEARCH function.
If the word "Total" is found:
Determine the range in the worksheet of the student IDs for the group for that total row:
a) Identify the rows in which the words "GroupX" and "GroupX Total" are found by using the MATCH function. With that, you know that the IDs for the group are in a range that starts at, say, row x and ends at row y.
b) With the starting and ending row numbers, construct the address range in which the IDs lie, which has to be the string "B" + (row x) + "." + "B" + (row y).
c) Turn the string into a range reference that can actually used in a formula using the INDIRECT function.
Count the number of students in the group using the COUNTA function and the range, and show that as the formula's result.
If the word "Total" is not found
Check whether the cell in column B is empty
a) If it is empty, show a blank as the formula's result
b) if it is not empty, it must be a student ID, so show the ID as the formula's result.
Add a column (I usually add it to the LEFT of the existing matrix) where you enter a formula from row 2 onwards that fills the blanks in the old column A. Then the old matrix including your new column can be used in a pivot.
So Insert a column left of your matrix, this is column A now. Put a header in Cell A1, for example "Group Name1"
Enter the following formula in cell B2 and extend it to the end:
=IF(B2="",A1,B2) This way your blanks will be filled.
Now apply a pivot on this matrix and there you are.
Maybe not the nicest looking solution, but its quick and works well.
If u have table like this
Students id Name of students group ........
then u can use countif/countifs formula

Resources