Remove delimited duplicate values from within an Excel cell - excel

I am trying to figure out how to remove duplicate values from within an Excel cell. My columns have values separated by a semicolon. For example, in column A I have these 3 rows with the values below.
3;5;6
2;2;4
9;5;12;12
In an adjacent column, call it column B, I want a formula that returns:
3;5;6
2;4
9;5;12
What formula or function would I need to use to achieve this?

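The following formula works under some limiting conditions:
-requires Excel 2019 or later.
-all numbers in the string are > 0.
-all numbers in the string are integers.
-the largest number in the string is at most 10^6.
The range $1:$10 sets the largest number the formula can detect; change it to $1:$100 and so on to cover larger values. As written, the formula reads from B2 and splits on a comma, so for the semicolon-delimited data in the question, adjust the cell reference and replace "," with ";" in the SUBSTITUTE calls.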
C2=TEXTJOIN(";",,IF(1=FREQUENCY(IFERROR(MATCH(ROW($1:$10),--TRIM(MID(SUBSTITUTE(B2,",",REPT(" ",LEN(B2))),(ROW(INDIRECT("A1:A" & 1+ LEN(B2)-LEN(SUBSTITUTE(B2,",",""))))-1)*LEN(B2)+1,LEN(B2))),0),1000),IFERROR(MATCH(ROW($1:$10),--TRIM(MID(SUBSTITUTE(B2,",",REPT(" ",LEN(B2))),(ROW(INDIRECT("A1:A" & 1+ LEN(B2)-LEN(SUBSTITUTE(B2,",",""))))-1)*LEN(B2)+1,LEN(B2))),0),1000)),ROW($1:$10),""))

If you have Office 365, you can try the formula below and see if it meets your requirement.
=TEXTJOIN(";",TRUE,(UNIQUE(FILTERXML("<t><d>"&SUBSTITUTE(A2,";","</d><d>")&"</d></t>","//d"))))
For the FILTERXML part, I recommend going through the Stack Overflow post below, which shows how to create an array of individual elements from a base string:
Excel - Extract substring(s) from string using FILTERXML
The rest of the structure is fairly standard: UNIQUE returns the list of unique items, and TEXTJOIN joins them back together.
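To make the split, de-duplicate, rejoin idea concrete outside Excel, here is a minimal Python sketch. It is not the formula itself; the function name dedupe_delimited is hypothetical and only mirrors the FILTERXML / UNIQUE / TEXTJOIN steps described above.

```python
def dedupe_delimited(cell: str, sep: str = ";") -> str:
    """Split on sep, keep the first occurrence of each item, and rejoin."""
    # Split the cell text into individual items (the FILTERXML step)
    items = [part.strip() for part in cell.split(sep)]
    # dict.fromkeys keeps insertion order, so the first occurrence wins
    # (the UNIQUE step); empty items are skipped, like TEXTJOIN with TRUE
    unique_items = dict.fromkeys(item for item in items if item)
    # Join the remaining items back together (the TEXTJOIN step)
    return sep.join(unique_items)


for row in ["3;5;6", "2;2;4", "9;5;12;12"]:
    print(dedupe_delimited(row))
```

The original order of the values is preserved, which matches the output requested in the question.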

Another formula solution for Office 365 or Excel 2019.
In B1, enter this array (CSE) formula and copy it down:
=TEXTJOIN(";",1,IF(ISNUMBER(FIND(";"&ROW(A$1:A$9999)&";",";"&A1&";")),ROW(A$1:A$9999),""))
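The logic of this formula can be sketched in Python: every candidate number from 1 to 9999 is tested for presence in the cell, with delimiters padded on both sides so that 2 does not match inside 12. The function name dedupe_by_scan and the upper parameter are hypothetical. Because the candidates are scanned in ascending order, the result comes out sorted rather than in the original order.

```python
def dedupe_by_scan(cell: str, sep: str = ";", upper: int = 9999) -> str:
    """Return each integer 1..upper that appears in the cell, in ascending order."""
    # Pad with the delimiter on both sides, like ";"&A1&";" in the formula
    padded = f"{sep}{cell}{sep}"
    # Test every candidate, like FIND(";"&ROW(A$1:A$9999)&";", ...)
    found = [str(n) for n in range(1, upper + 1) if f"{sep}{n}{sep}" in padded]
    return sep.join(found)


print(dedupe_by_scan("2;2;4"))
print(dedupe_by_scan("9;5;12;12"))
```

Under this sketch the third example row yields 5;9;12, so the approach fits only if sorted output is acceptable.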

Related

Generate a unique ID (As much As possible) from a string in Excel using string functions

Let's say I have two strings in two cells
Cell A1 = Customer Country
Cell B1 = Customer City
I need to generate a unique ID using the Excel string functions (LEN, LEFT, MID, RIGHT etc.) or any other (CONCAT etc.) along with the ROW function.
Get first letter & last letter of each word, remove spaces and dashes, get the row number and return a unique string.
If I use
=IF(LEN(A$1)-LEN(SUBSTITUTE(A$1," ",""))=0,LEFT(A$1,1),IF(LEN(A$1)-LEN(SUBSTITUTE(A$1," ",""))=1,LEFT(A$1,1)&MID(A$1,FIND(" ",A$1)+1,1),LEFT(A$1,1)&MID(A$1,FIND(" ",A$1)+1,1)&MID(A$1,FIND(" ",A$1,FIND(" ",A$1)+1)+1,1))) &ROW(A$1)
I get CC1 in both cases. How would I get a unique ID in such a case?
The idea from @JosWoolley in the comments is a good one. Be careful how and where you add a column index, though. If you just append the column index number, you create ambiguity: CC111 could come from row 11, column 1 or from row 1, column 11. Appending the actual cell address instead of these indices helps, but it can still be confusing unless you add a delimiter first. Therefore I'd suggest something along the lines of:
Formula in D1:
=CONCAT(LEFT(TEXTSPLIT(A1," ")),"|",ADDRESS(ROW(A1),COLUMN(A1),4))
Note: If you don't yet have access to TEXTSPLIT(), you can swap it for FILTERXML(). Also, you mentioned CONCAT(); in Excel 2019 you may need to enter the formula with CSE (Ctrl+Shift+Enter).
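As a rough illustration of why the delimiter and cell address remove the ambiguity, here is a small Python sketch of the same idea. The helper names column_letters and make_id are hypothetical; the sketch assumes words are separated by single spaces and that the address is written in A1 style, as ADDRESS(...,4) does.

```python
def column_letters(col: int) -> str:
    """Convert a 1-based column index to A1-style letters (1 -> A, 27 -> AA)."""
    letters = ""
    while col > 0:
        col, rem = divmod(col - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def make_id(text: str, row: int, col: int) -> str:
    """First letter of each word, a "|" delimiter, then the cell address."""
    initials = "".join(word[0] for word in text.split(" ") if word)
    return f"{initials}|{column_letters(col)}{row}"


print(make_id("Customer Country", 1, 1))
print(make_id("Customer City", 1, 2))
```

Both strings share the initials CC, but the address after the delimiter keeps the two IDs distinct.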

Is there an excel formula that can check the number of words in consecutive cells and give an output based on conditions?

I am trying to create an output in Excel based on the number of words in cells. Essentially, I want to check whether the total number of words in 3 cells is 1, 2, or >= 3. I'm using the LEN formula, which I have used successfully for single-cell conditions, but I'm struggling to create a formula that checks multiple cells.
Below is an example of my data:
Column A Column B Column C
Cat;dog Bird
Formula
=SUMIF(AND(LEN(TRIM(A4))-LEN(SUBSTITUTE(B4," ",""))+1, LEN(TRIM(C4))-LEN(SUBSTITUTE(C4," ",""))+1, >=3), "Titanium")
https://docs.google.com/spreadsheets/d/1W6nFr-W0r-XWZnvrFWndsvdBEEGHMQUa/edit?usp=sharing&ouid=103068518904190156690&rtpof=true&sd=true
First I made a single formula to work on a single cell. It ignores semicolons and commas to calculate total words. That formula is in column F and it's:
=IF(LEN(E5)=0;0;LEN(TRIM(SUBSTITUTE(SUBSTITUTE(E5;";";" ");",";" ")))-LEN(SUBSTITUTE(TRIM(SUBSTITUTE(SUBSTITUTE(E5;";";" ");",";" "));" ";""))+1)
Notice I added an IF to make sure that blank cells count as 0 words (otherwise the +1 would wrongly be added for them).
Now you just need to sum up all results and we get 8 words.
What you want is to get this result with a single formula, and that can be done with array formulas. In cell F11 my formula is:
=SUM(IF(LEN(E5:E8)=0;0;LEN(TRIM(SUBSTITUTE(SUBSTITUTE(E5:E8;";";" ");",";" ")))-LEN(SUBSTITUTE(TRIM(SUBSTITUTE(SUBSTITUTE(E5:E8;";";" ");",";" "));" ";""))+1))
You need to enter this formula by pressing CTRL+SHIFT+ENTER or it won't work!
Now you have the result in a single formula, and you just need to add the conditions mentioned in your post.
UPDATE: In your Google Sheets, the correct formula would be:
=ArrayFormula(IF(SUM(IF(LEN(TRIM(A3:C3))=0,0,LEN(TRIM(A3:C3))-LEN(SUBSTITUTE(A3:C3," ",""))+1))>=3,"Good","Bad"))
Please note that Excel and Google Sheets are not the same, so formulas sometimes differ between them.
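The counting logic can be checked outside the spreadsheet with a short Python sketch. It follows the Excel version above (semicolons and commas are treated as spaces, blank cells count as 0 words) and borrows the "Good"/"Bad" labels and the threshold of 3 from the Google Sheets formula. The function names count_words and classify are hypothetical.

```python
import re


def count_words(cell: str) -> int:
    """Count words in one cell, treating ';' and ',' as word separators."""
    # Replace semicolons and commas with spaces, then split on whitespace;
    # an empty or blank cell splits into an empty list, so it counts as 0
    return len(re.sub(r"[;,]", " ", cell).split())


def classify(cells, threshold: int = 3) -> str:
    """Sum the word counts over all cells and compare against the threshold."""
    total = sum(count_words(c) for c in cells)
    return "Good" if total >= threshold else "Bad"


print(count_words("Cat;dog"))
print(classify(["Cat;dog", "", "Bird"]))
```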

Find common text within a range of cells(range containing blanks as well)

This is the problem I am facing with an Excel formula.
In column F, I want to find the common text across A2 to E2 (which may contain blanks).
My question:
Is there a simple way to get the result without VB?
Any help is appreciated, thanks.
I found that Google Sheets has some really useful functions.
If you put the formula =SPLIT(A1, ",", TRUE,FALSE) in the cell after your row of text (or probably even on a different sheet; I haven't tried that, though it should work), the next x cells will contain the individual pieces of text, where x is the number of comma-separated items in A1, since "," is the delimiter.
Then you can put the formula =IF(SUM(ARRAYFORMULA(if(REGEXMATCH($A$1:$D$1,F1),1,0)))=COUNTA($A$1:$D$1),F1,"") into an equal number of cells after that (it is probably simplest to fill the maximum number you expect), and =CONCATENATE(I1:L1) into the last cell.
To tweak this for yourself: I found that ARRAYFORMULA lets you pass an array where a function would normally take a single cell. I have read that it works like a for loop, though I can't vouch for that. Here it lets REGEXMATCH (a Boolean check on whether the given cell matches the given regex) check each cell in the array.
The SUM adds up the matches, and the IF compares that total against COUNTA to check whether the number of cells containing the string equals the number of non-empty cells.
The CONCATENATE at the end joins all the cells containing the REGEXMATCH formula; since the only non-empty cells are the ones holding a common string, that is what this cell returns (with no spaces).
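The steps above can be sketched in Python to show the logic in one place. This is an approximation, not the Sheets formulas: it uses a plain substring test where the formula uses REGEXMATCH, and it takes the candidates from the first non-empty cell (an assumption, since the formula splits A1). The function name common_items is hypothetical.

```python
def common_items(cells, sep: str = ",") -> str:
    """Return the items of the first non-empty cell found in every non-empty cell."""
    # Ignore blank cells, like comparing against COUNTA
    non_empty = [c for c in cells if c.strip()]
    if not non_empty:
        return ""
    # Split the first non-empty cell into candidate strings (the SPLIT step)
    candidates = [item.strip() for item in non_empty[0].split(sep) if item.strip()]
    # Keep a candidate only if every non-empty cell contains it
    common = [item for item in candidates if all(item in c for c in non_empty)]
    # Join without spaces, like CONCATENATE
    return "".join(common)


print(common_items(["red,blue,green", "", "blue,pink", "grey,blue"]))
```

Because the test is a substring match, a candidate such as "blue" would also match a cell containing "bluebird"; an exact comparison of split items would avoid that.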
If you specifically need Excel, this won't help.
We can use Power Query to achieve the desired result.
Unpivot the columns in Power Query.
Split all the columns by the comma delimiter.
Create a custom column to check whether the records in the first column exist in the remaining columns.
Use the function Text.Contains.
Sample function: =Text.Contains([column.1],[column.1]&[column.2]&[column.3])
If the function returns TRUE, take the first column's value (this is the expected result) and load the data back into Excel.

Check cell against multiple others, Output Matches "," Delimited

I have the following Data
SKU Cell (M5)
SKU Check Range(Table 1, Column J2:J8000)
Model Range(F2:F8000)
I need the following logic converted to a formula:
IF SKU = SKU Check, Output all models that match from Model Column
with ", " delimiters for multiple matches
I hope this makes sense
Any help on this would be incredibly appreciated
With Office 365 use:
=TEXTJOIN(", ",TRUE,FILTER(F2:F8000,J2:J8000 = M5,""))
Without Office 365, VBA will be needed. There are many UDFs that mimic TEXTJOIN, in which one can use an array IF() to return the correct values. Here is one:
VLOOKUP with multiple criteria returning values in one cell
Or this one that does TEXTJOINIFS:
Merge values of column B based on common values on column A
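For readers who want to see what the FILTER plus TEXTJOIN formula above computes, here is a minimal Python sketch. The function name models_for_sku and the sample data are hypothetical. One difference to be aware of: Excel's = comparison ignores case, while Python's == does not.

```python
def models_for_sku(sku, sku_checks, models, sep: str = ", ") -> str:
    """Join every model whose SKU check value equals the given SKU."""
    # Keep models on rows where the check value matches (the FILTER step);
    # empty model values are skipped, like TEXTJOIN with TRUE
    matches = [m for s, m in zip(sku_checks, models) if s == sku and m != ""]
    # No matches gives an empty string, like FILTER's "" fallback
    return sep.join(matches)


sku_checks = ["A100", "B200", "A100", "C300"]
models = ["Model X", "Model Y", "Model Z", "Model W"]
print(models_for_sku("A100", sku_checks, models))
```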

Remove values in a cell based on a string part

I have a column in Excel that contains a series of comma-delimited values. The number of values in each row is different, and the values I'm searching for can be in different positions within the cell. I would like to remove some of those values based on a string part.
Example cell:
2006CE3, 2007CE3, 2012CE1, 2012CE3, 2013CE1, 2013CE3, 2014CE2, 2015CE3, 2016CE2, 2019FA, 2020SP
Specifically, remove all values containing "CE". In the example above, I would like to remove 2006CE3, 2007CE3, 2012CE1, 2012CE3, 2013CE1, 2013CE3, 2014CE2, 2015CE3, 2016CE2, and keep 2019FA, 2020SP.
To do this with a formula one will need TEXTJOIN:
=TEXTJOIN(", ",TRUE,IF(ISNUMBER(SEARCH("CE",FILTERXML("<z><y>"&SUBSTITUTE(A1,",","</y><y>")&"</y></z>","//y"))),"",TRIM(FILTERXML("<z><y>"&SUBSTITUTE(A1,",","</y><y>")&"</y></z>","//y"))))
Please try this formula solution, which uses the TEXTJOIN function available in Office 365.
In B2, enter the formula:
=TEXTJOIN(", ",1,INDEX(FILTERXML("<a><b>"&SUBSTITUTE(A2,", ","</b><b>")&"</b></a>","//b[not(contains(.,'CE'))]"),0))
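The filtering both formulas perform can be sketched in Python as a split, filter, rejoin. The function name drop_containing is hypothetical. The sketch is case-sensitive, like the XPath contains() in the second formula; SEARCH in the first formula is case-insensitive.

```python
def drop_containing(cell: str, needle: str = "CE", sep: str = ",") -> str:
    """Remove every delimited value that contains needle, then rejoin."""
    # Split on the delimiter and trim surrounding spaces
    items = [item.strip() for item in cell.split(sep)]
    # Keep only non-empty values that do not contain the search string
    kept = [item for item in items if item and needle not in item]
    return ", ".join(kept)


cell = ("2006CE3, 2007CE3, 2012CE1, 2012CE3, 2013CE1, 2013CE3, "
        "2014CE2, 2015CE3, 2016CE2, 2019FA, 2020SP")
print(drop_containing(cell))
```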

Resources