Advanced search to find duplicate records in Excel - excel

I want to find cells in Excel that contain the same set of characters. For example, one column has ABCDEF and another has DEAFBC; both cells contain the characters A, B, C, D, E and F. Is there a way to match them?
I tried the like, similar, and partial-match options in Excel, but they didn't give me the results I need.

Separate each letter into its own cell (Text to Columns, fixed width). Copy to a new column first if you want to keep the original data.
Sort each row (this is a little tricky, but it is possible - see my example.)
Combine the cells again by just stitching them together using &.
You can now sort to easily see duplicates or remove them etc.
The formula to paste into cell I2 in my example is:
=INDEX($B2:$G2, MIN(IF(SMALL(COUNTIF($B2:$G2, "<"&$B2:$G2), COLUMNS($H2:H2))=COUNTIF($B2:$G2, "<"&$B2:$G2), ROW($B2:$G2)-MIN(ROW($B2:$G2))+1)), MATCH(SMALL(COUNTIF($B2:$G2, "<"&$B2:$G2), COLUMNS($H2:H2)), COUNTIF($B2:$G2, "<"&INDEX($B2:$G2, MIN(IF(SMALL(COUNTIF($B2:$G2,"<"&$B2:$G2),COLUMNS($H2:H2))=COUNTIF($B2:$G2,"<"&$B2:$G2), ROW($B2:$G2)-MIN(ROW($B2:$G2))+1)), , 1)), 0), 1)
Make sure you press Ctrl+Shift+Enter when exiting the cell, since this is an array formula.
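The idea behind these steps (sort the letters of each cell so that any two cells built from the same characters become identical) can be sketched outside Excel. This is only a minimal Python illustration, assuming that case should be ignored; the function name `letter_signature` is made up for the example.

```python
def letter_signature(text):
    """Return the characters of text, upper-cased and sorted, as one string."""
    return "".join(sorted(text.upper()))

# Two cells match when their signatures are equal.
print(letter_signature("ABCDEF") == letter_signature("DEAFBC"))  # True
```

Sorting or de-duplicating on the signature then plays the same role as sorting on the stitched-together column in the spreadsheet.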

Related

Change part of excel formula with a constant value

I have an Excel formula down a column for which the base (the denominator) changes every "x" rows. Note that "x" is not constant; it varies from group to group. e.g.
=D1/SUM(D$1:D$4)
=D2/SUM(D$1:D$4)
=D3/SUM(D$1:D$4)
=D4/SUM(D$1:D$4)
=D5/SUM(D$5:D$9)
=D6/SUM(D$5:D$9)
=D7/SUM(D$5:D$9)
=D8/SUM(D$5:D$9)
=D9/SUM(D$5:D$9)
I am trying to change the first part of the formulas without changing the second, and vice versa, e.g. shifting the numerator by 10 cells.
=D11/SUM(D$1:D$4)
=D12/SUM(D$1:D$4)
=D13/SUM(D$1:D$4)
=D14/SUM(D$1:D$4)
=D15/SUM(D$5:D$9)
=D16/SUM(D$5:D$9)
=D17/SUM(D$5:D$9)
=D18/SUM(D$5:D$9)
=D19/SUM(D$5:D$9)
or, changing the base by 100. e.g.
=D1/SUM(D$100:D$104)
=D2/SUM(D$100:D$104)
=D3/SUM(D$100:D$104)
=D4/SUM(D$100:D$104)
=D5/SUM(D$105:D$109)
=D6/SUM(D$105:D$109)
=D7/SUM(D$105:D$109)
=D8/SUM(D$105:D$109)
=D9/SUM(D$105:D$109)
Sometimes, both. Any guidance on how this can be possible?
Thank you.
The first part of this problem seems easy, unless I am missing something.
Part 1:
Since the denominator rows are already anchored with $, you can select and COPY the whole range of formulas, PASTE it 10 rows down, then CUT that pasted range and paste it back into the original position. The COPY updates the relative numerators, and the CUT and PASTE moves the formulas back without changing their references, so they end up just as you want. The second question is a bit more of a challenge!
Part 2:
OK, without VBA I can only think of a really long-winded way to change your denominators, but I just checked that it does work:
To change the bottom:
1. Search and replace = with '= (now you can edit the formulas more freely).
2. Search and replace D with D%
3. Search and replace D%$ with D
4. Search and replace D% with D$
5. Get rid of the '= by using the Data > Text to Columns option.
Now use the copy and paste, cut and paste trick from Part 1.
Then, if you still need your $s back as they were, you essentially repeat steps 1 to 5 again.
Sorry, this looks really long-winded, but if you are desperate and back up before you start it should work.
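To see why the three D replacements swap the anchored and unanchored rows, here is a small Python sketch that applies the same substitutions to the formula text. It is an illustration only: the function name `swap_row_anchors` is made up, and like the search-and-replace itself it assumes that "%" does not already occur in the formula and that the column letter does not appear inside a function name.

```python
def swap_row_anchors(formula, col="D"):
    """Swap relative and $-anchored rows on every reference to column `col`."""
    marked = formula.replace(col, col + "%")      # D1 -> D%1, D$1 -> D%$1
    marked = marked.replace(col + "%$", col)      # D%$1 -> D1 (anchor removed)
    return marked.replace(col + "%", col + "$")   # D%1 -> D$1 (anchor added)

swapped = swap_row_anchors("=D1/SUM(D$1:D$4)")
print(swapped)                    # =D$1/SUM(D1:D4)
print(swap_row_anchors(swapped))  # =D1/SUM(D$1:D$4)
```

Running the swap a second time restores the original anchoring, which is why repeating the same steps brings the $s back.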
An Excel formula can't replace another cell's formula. One approach is to turn the formula into text and then transform it with other formulas. When the transformation is done, you can paste the formula back.
So for changing D1 -> D11, I would build a dummy series (column K), then write a formula (cell L1). Then I can copy the formula and paste it into the correct column.
Replace the "=" with a special character, and then you can transform the formulas (column F).
In column I, the formula used is: =RIGHT(F1,LEN(F1)-FIND("/",F1))
For changing D$1 -> D$100, I think I would just copy and replace it by searching in "Formulas".
This approach can be feasible for a couple of hundred cells. If the list is very long, I would recommend a VBA solution, where you can grab a cell's formula with .Range("A1").Formula
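The "treat the formula as text and transform it" idea can also be sketched in Python rather than VBA. This is a rough sketch under assumptions, not the answer's own method: the function name `shift_rows` and the regular expression are invented for the example, and the pattern only handles simple A1-style references on the same sheet.

```python
import re

# A cell reference: optional $ + column letters, optional $ before the row number.
_REF = re.compile(r"(?<![A-Za-z0-9_])(\$?[A-Z]{1,3})(\$?)(\d+)(?![\w(])")

def shift_rows(formula, offset, anchored=False):
    """Add `offset` to the row of each reference.

    anchored=False shifts only relative rows (D1); anchored=True shifts
    only $-anchored rows (D$1). Everything else is left untouched.
    """
    def bump(match):
        col, dollar, row = match.groups()
        if (dollar == "$") != anchored:
            return match.group(0)
        return f"{col}{dollar}{int(row) + offset}"
    return _REF.sub(bump, formula)

print(shift_rows("=D1/SUM(D$1:D$4)", 10))                  # =D11/SUM(D$1:D$4)
print(shift_rows("=D1/SUM(D$1:D$4)", 100, anchored=True))  # =D1/SUM(D$101:D$104)
```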

Return column(s) name(s) if cell(s) is(are) not blank (separated by a comma)

Given the following table:
Example Table (column names colored for reference)
What formula can I use to return into the yellow cells the name(s) of the column(s) of the blue cell(s) that is(are) not blank [separated by comma(s) if there is more than one]?
In the sample image I had to write the result manually. Is there any formula that can do it for me?
Thank you!
In the end there was a better option! Instead of duplicating the result in both LEFT and LEN, I was able to use =REPLACE(A1,1,2,""). That way I remove both the leading comma and the space after it without duplicating the large string or using a helper table.
=REPLACE(IF(ISBLANK(B2),"",", "&B$1)&IF(ISBLANK(C2),"",", "&C$1)&IF(ISBLANK(D2),"",", "&D$1),1,2,"")
This formula analyzes only 3 cells, but if you use it for more than that (as I will), it works better because you don’t need to duplicate the long string.
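The logic of that formula (prefix every non-blank column's header with ", ", join the pieces, then drop the first two characters) can be sketched in Python. The function name `nonblank_headers` and the sample headers are made up for illustration, and treating `None` as blank is an assumption.

```python
def nonblank_headers(headers, row):
    """Join the headers of the non-blank cells in row, separated by ', '."""
    joined = "".join(
        ", " + header
        for header, value in zip(headers, row)
        if value not in ("", None)
    )
    return joined[2:]  # same role as REPLACE(..., 1, 2, ""): drop the leading ", "

headers = ["Red", "Green", "Blue"]  # hypothetical column names
print(nonblank_headers(headers, ["x", "", "y"]))  # Red, Blue
```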

Find matching rows in 2 excel lists based on 2 matching columns

I know that similar questions have been asked, but I haven’t found the scenario that I am looking for. I want to highlight, using conditional formatting, matching rows in one list based on 2 matching columns in another list.
In the provided picture, the fifth row in the second list is highlighted because both the ID and the days match a record from the first list.
That means that I don’t care about the client and provider columns, but it also means that the third row, for example, won’t be highlighted, because it matches the ID column but not the days.
I have found examples with conditional formatting but only matching one column.
Concatenate and then match.
That is, make a hidden column (say, column D) in your first table with the formula
=A2 & "##" & C2
Then, to find a match, use the concatenated string as your lookup value and look in the concatenated column. Something like:
=IF(ISNA(MATCH(E2&"##"&G2, $D$2:$D$4, 0)), FALSE, TRUE)
If you don't want to use an extra column for this intermediate calculation, then look into using array formulas.
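The "##" separator in the concatenated key matters: without it, different ID/days pairs can collapse into the same string. A short Python sketch of that point, with made-up sample data and a hypothetical helper name `make_key`:

```python
def make_key(record_id, days, sep="##"):
    """Build a single lookup key from the two columns."""
    return f"{record_id}{sep}{days}"

first_list = [(1, 23), (7, 5)]  # hypothetical (ID, days) rows

with_sep = {make_key(i, d) for i, d in first_list}
without_sep = {make_key(i, d, sep="") for i, d in first_list}

print(make_key(12, 3) in with_sep)             # False: "12##3" is not "1##23"
print(make_key(12, 3, sep="") in without_sep)  # True: both collapse to "123"
```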
Using your provided example, create a new conditional formatting rule that applies to $E$2:$G$7 and use this formula:
=SUMPRODUCT(--($A$2:$A$4=$E2),--($C$2:$C$4=$G2))>0
If you're on Excel 2007 or higher, you can use the COUNTIFS function instead of SUMPRODUCT:
=COUNTIFS($A$2:$A$4,$E2,$C$2:$C$4,$G2)>0
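Both formulas count the rows of the first list where the ID and the days match at the same time, and the rule fires when that count is above zero. A minimal Python sketch of the same test, using invented sample data and hypothetical function names:

```python
def count_matches(first_list, record_id, days):
    """Count rows of first_list whose ID and days both match."""
    return sum(
        1 for row_id, row_days in first_list
        if row_id == record_id and row_days == days
    )

def rows_to_highlight(first_list, second_list):
    """One True/False flag per row of second_list."""
    return [count_matches(first_list, i, d) > 0 for i, d in second_list]

first_list = [(101, 30), (102, 45), (103, 60)]   # hypothetical (ID, days)
second_list = [(101, 45), (102, 45), (104, 60)]
print(rows_to_highlight(first_list, second_list))  # [False, True, False]
```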
Thanks a lot Josh! I tried your solution with partial success. First I concatenated the rows in column “E” and used the following formula as the conditional formatting expression.
=IF(ISNA(MATCH(G2&I2, $E$2:$E$4, 0)), FALSE, TRUE)
However, as shown in the picture, only the first column gets highlighted, and I need the entire row. I am selecting the entire range when applying the conditional formatting.
EDIT
I found the error! The column references in the formula need to be fixed with $ so that every cell in the row tests the same columns. This is the right formula:
=IF(ISNA(MATCH($G2&$I2, $E$2:$E$4, 0)), FALSE, TRUE)

remove duplicate value but keep rest of the row values

I have an Excel sheet (CSV) like this one:
and I want the output (tab-delimited) to be like this:
Basically:
replace duplicates with blanks, but
if the col6 value is different from the previous row for the same col1 value, all the data fields should be included.
I am struggling to create a formula which would do this.
If I try to "Remove Duplicates" it removes the value and shifts the values up one row. I want it to remove the duplicates but not shift the values up.
Given that duplicate data cells are next to each other and the data are in column A with a blank top row, this should work. It will remove duplicates except the first occurrence.
=IF(A1=A2,"",A2)
=IF(A2=A3,"",A3)
.
.
.
Try this (note: you need a blank top row; edit: actually, you're fine, since you have a header row):
=IF(A2<>A1,A2,IF(D2<>D1,A2,""))
=IF(A2<>A1,B2,IF(D2<>D1,B2,""))
=IF(A2<>A1,C2,IF(D2<>D1,C2,""))
etc. in the top row, and drag down.
Edit: Noticed you needed an additional condition.
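The rule in those formulas (show a row when its key column differs from the row above, or when the check column differs; otherwise blank it) can be sketched in Python. The function name `blank_repeats`, the column positions and the sample rows are assumptions for illustration only.

```python
def blank_repeats(rows, key=0, check=3):
    """Blank every row whose key and check columns both equal the previous row's."""
    result = []
    previous = None
    for row in rows:
        is_repeat = (
            previous is not None
            and row[key] == previous[key]
            and row[check] == previous[check]
        )
        result.append([""] * len(row) if is_repeat else list(row))
        previous = row  # compare against the original data, not the blanked output
    return result

rows = [
    ["a", 1, 2, "x"],
    ["a", 1, 2, "x"],  # same key and check as the row above: blanked
    ["a", 1, 2, "y"],  # same key, different check: kept
    ["b", 3, 4, "y"],  # new key: kept
]
for line in blank_repeats(rows):
    print(line)
```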
Try this
=IF((COUNTIF(A1:A$203,A1))=1,A1,"")
It counts the occurrences from the current row down and keeps the value only where that count is 1, i.e. at the last occurrence.
Try COUNTIF(A1:A$203,A1) on its own to understand the logic.
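As a rough Python illustration of that counting logic (the function name `keep_last_occurrence` is made up): each value is counted from its own position to the end of the list, and it is kept only when no later copy exists.

```python
def keep_last_occurrence(values):
    """Blank every value that appears again further down the list."""
    result = []
    for index, value in enumerate(values):
        remaining = values[index:].count(value)  # like COUNTIF(A1:A$203, A1)
        result.append(value if remaining == 1 else "")
    return result

print(keep_last_occurrence(["a", "a", "b", "a"]))  # ['', '', 'b', 'a']
```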
You asked for a formula? I suppose you could do something like this. Although it might be easier to use a macro:
=IF(COUNTIF($A$2:A6,"=" & A7),"",A7)
You could have a duplicate table on a separate tab using this formula to clear the rows you don't need - or however you want. Good Luck.
There is another way that doesn't involve a helper column... conditional formatting.
Highlight A2:G(whatever the last cell is)
Use a formula to decide which cells to highlight
Formula would be =AND($A2=$A1,$F2=$F1)
Set the format to be white text (or equal to whatever you have the background color)

How to split a single row into multiple columns in Excel

In Excel, I want to split up the data in row 1 into four single columns. A1:D1 should remain in place. E1:H1 should become A2:D2 and so on.
I am currently using this formula in A2:
INDEX($1:$1;(ROW()-1)*4+COLUMN()-1)
... But I am getting a #REF! error.
How do I solve this problem?
Try using this formula in cell A2 (you may need to use semicolons instead of commas in non-English versions of Excel):
=OFFSET(A$1,0,(ROW()-1)*4)
Then, copy & paste or drag the formula across columns B thru D, as many rows as you need.
Alternatively, your formula (=INDEX($1:$1,(ROW()-1)*4+COLUMN()-1)) is not returning an error for me; however, it is not returning the correct column. If you prefer the INDEX function you can use:
=INDEX($1:$1,(ROW()-1)*4+COLUMN())
Substitute semicolons for commas if needed.
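The index arithmetic can be checked with a small Python sketch. The function names are made up, and the width of 4 comes from the question; positions are 1-based to mirror ROW() and COLUMN().

```python
def source_index(row, col, width=4):
    """1-based position in row 1 that should feed the cell at (row, col)."""
    return (row - 1) * width + col

def reshape(values, width=4):
    """Split one long row into consecutive rows of `width` cells."""
    return [values[i:i + width] for i in range(0, len(values), width)]

print(source_index(2, 1))         # 5: A2 should read the 5th cell, E1
print(source_index(2, 1) - 1)     # 4: the "-1" version reads D1 instead
print(reshape(list("ABCDEFGH")))  # [['A', 'B', 'C', 'D'], ['E', 'F', 'G', 'H']]
```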
