Excel - Comparing multiple columns to see if results are identical - excel-formula

I would like to compare three separate columns in an excel spreadsheet, across thousands of rows.
If any value appears in column A multiple times (say the word hello in column A rows 1 and 4, and the word bye in column A rows 3 and 5), I would like to check the corresponding values in column B for those rows (i.e. rows 1 & 4 and 3 & 5).
If the values in column B for rows 1 & 4 are, say, 15 and 15, and the values for rows 3 & 5 are 20 and 20, then I want to check column C.
Now that we know rows 1 & 4 and 3 & 5 have the same corresponding values in columns A and B, I would like to check the corresponding values in column C. If these are different, I would like to perform a specific calculation. If the values in column C are the same, I want to ignore these rows.
I am sorry this is very unclear, as I cannot paste an image to show what I mean. I can email you an example if it helps.
This is way beyond me and my excel skills and I do not know where to start. Any help would be appreciated. I am hoping I don't need to write a Macro.
Thanks in advance!

So, to summarize your question as I understand it:
Column A holds string values (text). There are some duplicates here.
Column B holds number values. When a duplicate occurs in column A, the data in column B may or may not be the same as for the other duplicate entries.
Column C holds values (you did not define what type of values, but I assume these are number values). Sometimes, duplicates in column A hold the same values in column B, and also the same values in column C. In this case, we can ignore the row as all the duplicates agree. Sometimes, duplicates in column A hold different values in column B. In this case, we can also ignore the values. Finally, sometimes duplicates in column A hold the same values in column B, but different values in column C. For these specific values, we want to perform some other type of calculation (which you did not specify).
Put the following in column D, starting at row 2 (assuming a header in row 1). This is the starting point of the formula we will build.
=IFERROR(VLOOKUP(A2,A$1:B1,2,0)=B2,"")
This says: Look up the text from the current row in column A, searching from row 1 down to the row above the current one. If it finds a match there, pull the result from column B. Does that result match column B in the current row? If it matches, it will say TRUE; if it doesn't match, it will say FALSE. If there are no duplicates yet in column A, it will say "".
Now add a new check: if the above formula is TRUE (i.e. there is a duplicate in column A, and the result in column B matches), then we want to check the results from column C:
=IFERROR(IF(VLOOKUP(A2,A$1:B1,2,0)=B2,VLOOKUP(A2,A$1:C1,3,0)=C2,""),"")
This will now return TRUE if the values in column C match for that duplicate in column A (which is only checked if the values in column B match too). Finally, add in your "special calculation", like so:
=IFERROR(IF(VLOOKUP(A2,A$1:B1,2,0)=B2,IF(VLOOKUP(A2,A$1:C1,3,0)=C2,"",C2+1),""),"")
Where I have C2+1, this is where you will perform your special calculation. This will only be recorded by Excel if: there is a duplicate in column A, that duplicate has a matching value in column B, and that duplicate has an unmatched value in column C.
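If you want to sanity-check the logic outside Excel, here is a minimal Python sketch of what the final formula is meant to compute for each row. It assumes each row is an (A, B, C) tuple and, like an exact-match VLOOKUP, it compares only against the first earlier row with the same column A value. The names flag_rows and calc are made up for illustration; calc stands in for the C2+1 placeholder.

```python
def flag_rows(rows, calc=lambda c: c + 1):
    """Return one result per (a, b, c) row, mimicking the column D formula."""
    first_seen = {}  # column A value -> (B, C) from the first row it appeared in
    results = []
    for a, b, c in rows:
        if a not in first_seen:
            # No earlier duplicate: the lookup fails and IFERROR yields "".
            results.append("")
            first_seen[a] = (b, c)
            continue
        prev_b, prev_c = first_seen[a]
        if prev_b == b and prev_c != c:
            # Same A and B, different C: run the "special calculation".
            results.append(calc(c))
        else:
            results.append("")
    return results


# Hypothetical sample data: "hello" repeats with a different C, "bye" with the same C.
rows = [("hello", 15, 1), ("x", 9, 9), ("bye", 20, 5), ("hello", 15, 2), ("bye", 20, 5)]
print(flag_rows(rows))  # ['', '', '', 3, '']
```

Only the second "hello" row gets a value, because it is the only row whose A and B match an earlier row while C differs.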

Related

How to amend a column based on duplicates in another and leave a unique value in Excel

I have a spreadsheet which has a lot of duplicates I need to cleanse but need to ensure the right data in another column is kept.
Data and desired outcome
Essentially in Column E there are duplicate values but these values could be duplicated any number of times, it is not the same amount each time.
In Column D for each record there should be either an A or B or blank.
Now the trouble is some duplicate sets have different values in column D. I need a way to remove all the duplicates from column E ensuring that each row in column E is unique while still ensuring the right value is kept from column D.
There are currently 3 different results in the raw data:
result 1: The duplicate sets (eg all HC0206 duplicates or HC0208 duplicates in column E) have the same value in column D (either all blank, all A or all B) - These are fine and don't cause a problem.
result 2: The duplicate sets have both blank and A in column D - When duplicates are removed an A must remain in column D.
result 3: The duplicate sets have both blank and B in column D - When duplicates are removed a B must remain in column D.
No duplicate sets have both A and B so we don't have to worry about that possibility.
I just can't work out how to ensure that when the duplicates are removed from results 2 and 3 above, that the letter remains and not the blank. If I could work out a way to ensure that all duplicate sets have the same value in column D then I could just remove duplicates without issue.
Any help would be greatly appreciated.
Thanks
Speaking of overthinking: you could do this with a formula in Office 365:
=LET(sorted,SUBSTITUTE(SORT(SORT(FILTER(D:E,E:E<>"","")),2),"",""),
uniqueE,UNIQUE(INDEX(sorted,,2)),
matchD,INDEX(INDEX(sorted,,1),MATCH(uniqueE,INDEX(sorted,,2),0)),
CHOOSE({1,2},matchD,uniqueE))
The sorted part sorts columns D:E by column 1, then by column 2, and shows blanks (which would otherwise come through as 0) as actual blanks. The sort order matters for the later steps.
The uniqueE part returns the unique values in column E.
The matchD part finds the first match of each uniqueE value in column 2 of sorted and returns the value from column 1 of sorted on that row.
matchD followed by uniqueE is your spilled result.
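As a cross-check, here is a small Python sketch of the outcome the question asks for: one row per unique column E value, keeping the letter from column D whenever the duplicate set has one. It is not a translation of the LET formula; the function name dedupe_keep_letter and the sample IDs are assumptions for illustration, and unlike the formula it keeps first-appearance order rather than sorting.

```python
def dedupe_keep_letter(rows):
    """rows are (d, e) pairs; return one (d, e) per unique e, preferring a non-blank d."""
    best = {}  # e -> chosen d; dicts preserve insertion order
    for d, e in rows:
        if e == "":
            continue  # skip rows with no value in column E
        if e not in best or (best[e] == "" and d != ""):
            best[e] = d
    return [(d, e) for e, d in best.items()]


# Hypothetical sample: blank + A, B + blank, and an all-blank set.
rows = [("", "HC0206"), ("A", "HC0206"), ("B", "HC0208"), ("", "HC0208"), ("", "HC0300")]
print(dedupe_keep_letter(rows))
# [('A', 'HC0206'), ('B', 'HC0208'), ('', 'HC0300')]
```

This relies on the question's statement that no duplicate set contains both A and B.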

I am checking to see if any values in column C match with values from column B

I need preferably a formula or macro in excel 2013 to do the following:
Check if any given values in column C match with values from column B.
If they do I want to take the corresponding value from the same row in column A as the matched items in column B.
I then want to take those values from column A and put them in the same rows in column D.
Specifically, I am checking to see if any ID's in column C match with ID's from column B. If they do I want to take the corresponding city ID from column A in the same row as the matched items in column B.
I then want to take those values from column A and put them in the same rows in column D.
I used this formula =VLOOKUP(C6; A2:B14; 1; FALSE) but it returns #N/A
VLOOKUP always searches the first column of the table array. In your case, you want to search the second column and return the value from the first column, so VLOOKUP is not appropriate.
Depending on your version of Excel, you could use INDEX/MATCH or XLOOKUP (XLOOKUP requires Excel 2021 or Microsoft 365, so in Excel 2013 use INDEX/MATCH):
=INDEX($A$2:$A$14,MATCH(C2,$B$2:$B$14,0))
=XLOOKUP(C2,$B$2:$B$14,$A$2:$A$14)
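The same "look up in B, return from A" idea can be sketched in Python for comparison. This is an illustrative sketch only: lookup_left and the sample city and ID values are invented, and None stands in for #N/A.

```python
def lookup_left(c_values, a_values, b_values):
    """For each value in column C, return the column A value from the first row where column B matches."""
    first_match = {}
    for a, b in zip(a_values, b_values):
        first_match.setdefault(b, a)  # keep the first occurrence, like MATCH(..., 0)
    return [first_match.get(c) for c in c_values]  # None where there is no match


# Hypothetical data: city IDs in A, IDs in B, IDs to look up in C.
col_a = ["NYC", "LON", "PAR"]
col_b = [101, 102, 103]
col_c = [103, 999, 101]
print(lookup_left(col_c, col_a, col_b))  # ['PAR', None, 'NYC']
```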

Remove duplicates in excel based on another column value

My excel values are like (Column A and B):
Now, I want to remove duplicates in column A if any of the duplicates has a value of zero in column B. So for the above example, I should keep a and b only.
I tried:
IF((COUNTIF(A:A,A2)>1)*(B2=0),"REMOVE","KEEP")
and I get:
But for all c's, it should be REMOVE.
What's wrong in my if condition?
Thanks.
Check if any row with the same value in column A contains a zero:
=IF(COUNTIFS(A:A,A1,B:B,0),"Remove","Keep")
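To see why this differs from the original attempt, here is a short Python sketch of the COUNTIFS logic: a row is marked for removal when any row with the same column A value has a zero in column B, not only when the current row itself is zero. The function name mark_rows and the sample rows are assumptions for illustration.

```python
def mark_rows(rows):
    """rows are (a, b) pairs; mark every row whose column A value has a zero in column B anywhere."""
    keys_with_zero = {a for a, b in rows if b == 0}
    return ["Remove" if a in keys_with_zero else "Keep" for a, _ in rows]


# Hypothetical data: only one of the "c" rows is zero, yet all "c" rows are marked.
rows = [("a", 1), ("b", 2), ("c", 0), ("c", 3), ("c", 4)]
print(mark_rows(rows))  # ['Keep', 'Keep', 'Remove', 'Remove', 'Remove']
```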

Find the difference between data in 2 columns in Excel

I have 2 sets of data. I put them in Excel, e.g. in column A and column B. Now I want to know which data from B is part of column A. I run this formula: =IF(COUNTIF($A$1:$A$327238,B1)>0,"Exist", "Nope")
Then I filter it and look only at 'Exist'. Based on that, I know that all data in B that has the label 'Exist' is part of column A.
Now I want to know the opposite, i.e. which data from A is part of B. For that, I use the same formula but swap the data in the columns, i.e. the data from B is now in A and vice versa.
Then I randomly verify the results.
For case 1 it looks like it works fine, but for the second case it looks like it's not accurate.
My assumption: it should work in case 2 as well (maybe I was just not very accurate in some way). Should I expect it to work?
Thanks
In cell C1 (assuming your data starts from the 1st row) type the following: =IF(A1=B1,"equal","no"), and then fill the same formula down to the last row that still has data, so that for row N, your formula in column C is =IF(AN=BN,"equal","no"). After that, you just need to count the cells with the value "no" to know the differences. Sorry if I didn't get the question correctly.
OK, assuming that the two sets of data are in columns A and B (they might be of different sizes), and the last rows of data are L and M respectively, click on D1 and type the following (replacing M with the last row number of column B): =IFNA(INDEX(B$1:B$M,MATCH(A1,B$1:B$M,0),1),"Unique"). Drag down to apply this formula to D1 through DL. That's it, you have the duplicate elements. Since the duplicate elements are the same in both columns A and B, you don't need to repeat this for column B. Note that for all the unique elements, the corresponding rows of column D contain the word "Unique", so if you want the unique elements, you can just get the elements from A in those rows:
Select the first-row cell of any empty column, type the following formula, and drag it down: =IF(D1="Unique",INDEX(A$1:A$L,ROW(D1)),"Duplicate").
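The two-way check from the question can also be sketched in Python, which makes it easy to confirm that the same membership test works in both directions. This is a sketch under assumptions: compare_columns and the sample values are invented, and Python compares values exactly, whereas COUNTIF ignores case when matching text.

```python
def compare_columns(col_a, col_b):
    """Label each value of B by whether it occurs in A, and each value of A by whether it occurs in B."""
    set_a, set_b = set(col_a), set(col_b)
    b_in_a = ["Exist" if v in set_a else "Nope" for v in col_b]
    a_in_b = ["Exist" if v in set_b else "Nope" for v in col_a]
    return b_in_a, a_in_b


# Hypothetical data with columns of different lengths.
col_a = ["x1", "x2", "x3", "x4"]
col_b = ["x3", "x9", "x1"]
b_in_a, a_in_b = compare_columns(col_a, col_b)
print(b_in_a)  # ['Exist', 'Nope', 'Exist']
print(a_in_b)  # ['Exist', 'Nope', 'Exist', 'Nope']
```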

Is there a way to delete cells if their value is contained in another column?

I have a spreadsheet with multiple columns with a few thousand rows and I would like to find the cells that are common across all columns. Is there a function that I can use to check if a cell value exists in a set of cells/column?
To find out if a value exists in all columns, in any row, you can put this formula in the next open column and drag it down:
=AND(MATCH(A1,B:B,0),MATCH(A1,C:C,0))
This assumes you have data in columns A, B and C and the formula is in column D. You can then sort or filter on column D to find the values common to all columns.
If a value is missing from column B or C, MATCH returns an error instead of FALSE. If that is the case, try this:
=AND(IFERROR(MATCH(A1,B:B,0),FALSE),IFERROR(MATCH(A1,C:C,0),FALSE))
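For comparison, the same "present in every column" test can be sketched in Python with set membership. The function name in_all_columns and the sample values are assumptions for illustration.

```python
def in_all_columns(col_a, col_b, col_c):
    """For each value in column A, report whether it also occurs in columns B and C."""
    set_b, set_c = set(col_b), set(col_c)
    return [v in set_b and v in set_c for v in col_a]


# Hypothetical data: 1 and 3 occur in all three columns, 2 does not.
col_a = [1, 2, 3]
col_b = [3, 1, 7]
col_c = [1, 9, 3]
print(in_all_columns(col_a, col_b, col_c))  # [True, False, True]
```

A missing value simply gives False here, which is the behavior the IFERROR version of the formula restores.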
