I have a column which contains multiple unique and duplicate values, and I'm looking to extract only the non-repeating values. How do I go about it?
Example:
A2: 28; 33; 34; 37; 44
A3: 28; 34; 37
I'm trying to get:
A4: 33; 44
I've tried finding an answer but couldn't find a solution for this exact problem.
=LET(a,TEXTJOIN(";",,B2:B4),UNIQUE(--TEXTSPLIT(a,";"),TRUE,TRUE))
I don't see a way of doing this using only formulas. The best approach would be to save those values as a .csv file and import that file into Excel. Then, using Power Query, you would do the necessary transformations to get the desired result.
Here is a step by step solution:
Get your data into a .csv file
Import the csv file by going to Data > From Text/CSV and selecting the file. Then, select the delimiter "Semicolon" and click "Transform Data" to open Power Query:
With Power Query open, select all columns by pressing Ctrl + A, and then go to the "Transform" tab and click "Unpivot Columns". You should have something like this:
Delete the "Attribute" column
Remove duplicates from "Value" column
Click "Close & Load" in the Home tab
You then will have the desired result:
I have an answer which ignores any spaces between the values and formats the result by putting only one space between the values. Call the function with two parameters, the values of the two cells to be merged:
Public Function Uniques(stra As String, strb As String) As String
'Uniques("11; 12; 13; 14; 15", " 11;11; 14; 15; 88; 16 ") ==> "12; 13; 88; 16"
Dim ar() As String, delim As String, ss As String, s As String, alen As Integer
Dim z As Integer, p As Integer, c As Integer
delim = ";"
s = Replace(stra & delim & strb, " ", "")
ar = Split(s, delim)
s = delim & s & delim
alen = UBound(ar)
For c = 0 To alen
If ar(c) <> "" Then
ss = delim & ar(c) & delim
p = InStr(1, s, ss)
If (p > 0) Then
p = InStr(p + 1, s, ss)
If p > 0 Then
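'Remove whole delimited values only, so that ";11" is not stripped out of ";112"
Do While InStr(1, s, ss) > 0
s = Replace(s, ss, delim)
Loop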
For z = c + 1 To alen
If ar(z) = ar(c) Then ar(z) = ""
Next
End If
End If
End If
Next
If (s = delim) Then
Uniques = ""
Else
s = Mid(s, 2, Len(s) - 2)
s = Replace(s, delim, delim & " ")
Uniques = s
End If
End Function
If the two cells contain the same values, the function returns an empty string ("").
Use Reduce, Textsplit and Hstack to split each row and add it to a single-row array, then use Reduce again to check the number of times each element of the array occurs, and concatenate it to the output if it only occurs once.
=LET(
data, DROP(
TOCOL(A:A, 1),
1
),
row, DROP(
REDUCE(
"",
data,
LAMBDA(a, c,
HSTACK(
a,
TEXTSPLIT(
c,
"; "
)
)
)
),
,
1
),
MID(
REDUCE(
"",
row,
LAMBDA(a, c,
IF(
SUM(
N(
c =
row
)
) = 1,
a & "; " &
c,
a
)
)
),
3,
99
)
)
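To check the logic outside Excel, the counting step described above (keep a value only if it occurs exactly once across all cells) can be sketched in Python. This is only an illustration under assumptions: the function name non_repeating and the list-of-strings input are hypothetical, and values are compared as trimmed text.

```python
from collections import Counter


def non_repeating(cells, delim=";"):
    """Return the values that occur exactly once across all cells (hypothetical helper)."""
    # Split every cell on the delimiter and trim the surrounding spaces.
    values = [v.strip() for cell in cells for v in cell.split(delim) if v.strip()]
    counts = Counter(values)
    # Keep the original order, dropping anything seen more than once.
    return [v for v in values if counts[v] == 1]


print("; ".join(non_repeating(["28; 33; 34; 37; 44", "28; 34; 37"])))
```

With the two example cells from the question this prints 33; 44.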
I want to separate a character string using the special characters in that string as cutting lines. After each division the next group of strings should be copied in the next column. The picture below shows how it should work.
My first approach doesn't work and maybe it's too complicated. Is there a simple solution to this task?
Sub SeparateString()
Dim i, j, k, counterA, counterB As Integer
Dim str1, str2 As String
Const Sonderz As String = "^!§$%&/()=?`*'_:;°,.-#+´ß}][{³²"
For i = 1 To Worksheets("Tabelle1").Range("A1").End(xlDown).Row
counterA = 0
For j = 1 To Len(Worksheets("Tabelle1").Range("A" & i))
counterB = 0
For k = 1 To Len(Sonderz)
If Mid(Worksheets("Tabelle1").Range("A" & i), j, 1) = Mid(Sonderz, k, 1) Then
counterA = counterA + 1
End If
If Mid(Worksheets("Tabelle1").Range("A" & i), j, 1) <> Mid(Sonderz, k, 1) And counterA = 0 And counterB = 0 Then
Worksheets("Tabelle1").Range("B" & i) = Worksheets("Tabelle1").Range("B" & i) & Mid(Worksheets("Tabelle1").Range("A" & i), j, 1)
counterB = counterB + 1
End If
Next k
Next j
Next i
End Sub
If you are interested and you do have access to Microsoft 365's dynamic arrays:
Formula in B1:
=LET(X,MID(A1,SEQUENCE(LEN(A1)),1),TRANSPOSE(FILTERXML(CONCAT("<t><s>",IF(ISNUMBER(FIND(X,"^!§$%&/()=?`*'_:;°,.-#+´ß}][{³²")),"</s><s>",X),"</s></t>"),"//s")))
Or nest a SUBSTITUTE() if you need to return string variables:
=LET(X,MID(A1,SEQUENCE(LEN(A1)),1),TRANSPOSE(SUBSTITUTE(FILTERXML(CONCAT("<t><s>'",IF(ISNUMBER(FIND(X,"^!§$%&/()=?`*'_:;°,.-#+´ß}][{³²")),"</s><s>'",X),"</s></t>"),"//s"),"'","")))
If VBA is a must, you could think about a regular expression to replace all the characters from a certain class with a uniform delimiter to use Split() on:
Sub Test()
Dim s As String: s = "CD!02?WX12EF"
Dim arr() As String
With CreateObject("vbscript.regexp")
.Global = True
.Pattern = "[!§$%&\/()=?`*'_:;°,.#+´ß}\][{³²^-]"
arr = Split(.Replace(s, "!"), "!")
End With
For Each el In arr
Debug.Print el
Next
End Sub
The caret has been moved away from the first position so that it does not create a negated character class, and the hyphen has been moved to the end so that it is not read as a range between two characters. Funnily enough, if you want to be less verbose, you could put these characters in a more condensed class: [!#-/:;?[]-`{}§=°´ß³²].
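The same idea (one character class, then a split) can be tried outside VBA. The Python sketch below is an illustration, not the answer's code: the name split_on_special is hypothetical, and re.escape is used to build the class so that the caret, hyphen and brackets need no manual reordering.

```python
import re

# The special characters from the question's Sonderz constant.
SPECIAL_CHARS = "^!§$%&/()=?`*'_:;°,.-#+´ß}][{³²"

# re.escape makes every character literal inside the class.
SPECIAL_RE = re.compile("[" + re.escape(SPECIAL_CHARS) + "]")


def split_on_special(text):
    """Split text at every special character (hypothetical helper)."""
    return SPECIAL_RE.split(text)


print(split_on_special("CD!02?WX12EF"))
```

Two special characters in a row produce an empty string between them, which mirrors what Split() does in the VBA version.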
Assuming the first data is in cell A2,
I would go with the simple use of find() with left() mid() and right()
=left(A2,find("!",A2,1)-1)
then:
=mid(A2,find("!",A2,1)+1,find("?",A2,1)-find("!",A2,1)-1)
and:
=right(A2,len(A2)-find("?",A2,1))
Tested and working with one correction done.
You can also do this in Power Query which has a command to split by ANY delimiter.
In the code below, I generate a list of all possible special characters defined as characters NOT in the set of A..Za..z0..9 and you can easily add to that list by editing the code if you want to include other characters in the permitted list.
Edit: If you only have a few special characters, you can just hard-code the list, e.g. {"!","?"}, instead of using List.Generate. Your question did not restrict the list of special characters, though, even though those two are the only ones showing in your examples.
To use Power Query:
Select some cell in your Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to understand the algorithm
let
//change Table name in next line to reflect actual table name
Source = Excel.CurrentWorkbook(){[Name="Table15"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Source", type text}}),
//Generate list of "special characters" for splitting
//the List.Contains function contains the non-special characters
splitterList = List.RemoveNulls(
List.Generate(()=>0,
each _ <= 255,
each _ +1,
each if List.Contains({"A".."Z","a".."z","0".."9"}, Character.FromNumber(_)) then null else Character.FromNumber(_))),
splitIt = Table.SplitColumn(#"Changed Type", "Source",
Splitter.SplitTextByAnyDelimiter(splitterList))
in
splitIt
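For readers who want to see what the splitter list contains, here is a rough Python equivalent of the two steps in the M code: build the list of every character code from 0 to 255 that is not A..Z, a..z or 0..9, then split on any of them. The names SPLITTERS and split_by_any are hypothetical, and the loop is a plain re-implementation rather than a description of how Power Query works internally.

```python
import string

# Characters that are NOT treated as delimiters.
ALLOWED = set(string.ascii_letters + string.digits)

# Every character with code 0..255 that is not in the allowed set.
SPLITTERS = [chr(n) for n in range(256) if chr(n) not in ALLOWED]


def split_by_any(text, delimiters=SPLITTERS):
    """Split text whenever any single delimiter character is met (hypothetical helper)."""
    delimiter_set = set(delimiters)
    parts, current = [], []
    for ch in text:
        if ch in delimiter_set:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


print(split_by_any("CD!02?WX12EF"))
```

Passing a short hard-coded list such as ["!", "?"] as the second argument corresponds to the hard-coded variant mentioned in the edit above.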
I cannot extract the postal/zip code of a given address cell that comes like this :
"108, avenue du Grand Sud 37 170 CHAMBRAY les TOURS".
I have used :
=RECHERCHE(9^9;--("0"&STXT(A2;MIN(CHERCHE({0.1.2.3.4.5.6.7.8.9};A2&"0 123456789"));LIGNE($1:$100))))
Which sometimes works, sometimes not depending on the street number starting the address (here "108,").
The problem is the space of the pattern "37 170". I would like to remove the blank space in the pattern. Is there a regex way to search this pattern "## ###", and then to remove this poisonous blank space?
Thank you for your tricks.
I have tried this piece of code :
Function toto(r, Optional u = 0)
Application.Volatile
Dim i%, j%, adr$, cp$, loca$, x
x = Split(r)
For i = 0 To UBound(x)
If x(i) Like "#####" Then Exit For
Next
If i > UBound(x) Then
adr = r.Value 'facultatif
Else
cp = x(i)
For j = 0 To i - 1: adr = adr & x(j) & " ": Next
adr = Left$(adr, Len(adr) + (Len(adr) > 1))
For j = i + 1 To UBound(x): loca = loca & x(j) & " ": Next
loca = Left$(loca, Len(loca) + (Len(loca) > 1))
End If
x = Array(adr, cp, loca)
If 0 < u And u < 4 Then toto = x(u - 1) Else toto = x
End Function
The above code works fine for splitting addresses including street number, zip code, and city name. But it does not work when the zip code is ## ### = 2 digit integer - space - 3 digit integer.
Edit: 01 June 2021
Since it seems my question is not clear enough, let's rephrase :
Given an Excel worksheet containing in each cell of column A, from, say, A1 down to A10000, complete addresses like this one:
"2 rue Rene cassin Centre Commercial Châlon 2 Sud 71 100 CHALON SUR SAONE"
or this one :
"15, Rue Emile Schwoerer 68 000 COLMAR"
Where "71 100" and "68 000" are a zip code in incorrect format because of the extra space between the 2 first digits and 3 last digits.
I need to split the Ai cell content in order to obtain :
in cell Bi : the text (street, etc.) placed left before the 2 first digits of the "wrong" zip code,
in cell Ci : the zip code with its correct format ("71100" and not "71 100"),
in cell Di : the text (city name) after the zip code.
It's a kind of left and right extraction around the zip code.
The above code that I have posted does not work.
In order to obtain the correct zip code format, I have tried the regex following function :
Function FindReplaceRegex(rng As Range, reg_exp As String, replace As String)
Set myRegExp = New RegExp
myRegExp.IgnoreCase = False
myRegExp.Global = True
myRegExp.Pattern = reg_exp
FindReplaceRegex = myRegExp.replace(rng.Value, replace)
End Function
But I am unable to determine the correct regular expression pattern to get rid of the space in the zip code.
PEH gave me the following pattern :
(.*)([0-9]{2} ?[0-9]{3})(.*)
When using the function, I have tried to define the replacement pattern by:
(.*)([0-9]{2}[0-9]{3})(.*)
But it would not work. Hope this will clarify my question.
Any idea is welcome. Thanks
If these input strings always have the same pattern, try:
=CONCAT(FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[.*0=0]"))
Depending on your needs/edge-cases, you could add more xpath expressions.
If this is VBA, I have a fix for you (please forgive the crappy naming convention, I'm scribbling this down in work while waiting for SQL to refresh):
Sub test1()
a0 = Cells(1, 1) 'Get the text, in this case "108, avenue du Grand Sud 37 170 CHAMBRAY les TOURS"
aa = Replace(a0, ",", " ") 'Make all delimiters of same type, so removing commas, you may need to add more replace work here?
ab = Application.Trim(aa) 'Reduce all whitespace to single entries, i.e. one space rather than several in a row
ac = Split(ab, " ", -1) 'Now split by that single whitespace entry
Dim txt()
i2 = 0
lastIsNumeric = False
For i1 = 0 To UBound(ac) - 1 'Step through each entry in our "split" list
If IsNumeric(ac(i1)) = True And IsNumeric(ac(i1 + 1)) = True Then
'Two numbers back to back, join
ReDim Preserve txt(i2)
txt(i2) = ac(i1) + ac(i1 + 1)
i2 = i2 + 1
i1 = i1 + 1
Else
'Not two numbers back to back, don't join
ReDim Preserve txt(i2)
txt(i2) = ac(i1)
i2 = i2 + 1
End If
Next i1
If IsNumeric(ac(UBound(ac))) = False Then
'Need to add last entry to txt()
ReDim Preserve txt(UBound(txt) + 1)
txt(UBound(txt)) = ac(UBound(ac))
End If
End Sub
edit 2021-06-01:
The above will generate a list (txt) of all the entries within your address. You can then reassemble if you wish, or extract out the postcode only.
If you want it as a function, then it would be:
Public Function getPostcode(a0)
aa = Replace(a0, ",", " ")
ab = Application.Trim(aa)
ac = Split(ab, " ", -1)
Dim txt()
i2 = 0
lastIsNumeric = False
For i1 = 0 To UBound(ac) - 1
If IsNumeric(ac(i1)) = True And IsNumeric(ac(i1 + 1)) = True Then
'Two numbers back to back, join
ReDim Preserve txt(i2)
txt(i2) = ac(i1) + ac(i1 + 1)
i2 = i2 + 1
i1 = i1 + 1
Else
'Not two numbers back to back, don't join
ReDim Preserve txt(i2)
txt(i2) = ac(i1)
i2 = i2 + 1
End If
Next i1
If IsNumeric(ac(UBound(ac))) = False Then
'Need to add last entry to txt()
ReDim Preserve txt(UBound(txt) + 1)
txt(UBound(txt)) = ac(UBound(ac))
End If
'Re-assemble string for return
rtnTxt = ""
For i1 = 0 To UBound(txt)
rtnTxt = rtnTxt & " " & txt(i1)
Next i1
getPostcode = Trim(rtnTxt) 'Trim drops the leading space added by the loop above
End Function
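The three-way split asked for in the question (street, zip code without the space, city) can also be sketched with one regular expression. The Python below is an illustration under assumptions, not a tested VBA solution: it adapts the pattern quoted in the question by capturing the two digit groups separately, it assumes the zip code is the last "two digits, optional space, three digits" group in the address, and the name split_address is hypothetical.

```python
import re

# Greedy (.*) makes the match settle on the LAST 2+3 digit group in the text.
ZIP_RE = re.compile(r"^(.*)\b(\d{2}) ?(\d{3})\b\s*(.*)$")


def split_address(address):
    """Return (street, zip_code, city); zip_code has no inner space (hypothetical helper)."""
    match = ZIP_RE.match(address)
    if match is None:
        # No zip code found: hand the text back unchanged as the street part.
        return address, "", ""
    street, head, tail, city = match.groups()
    return street.strip(), head + tail, city.strip()


print(split_address("15, Rue Emile Schwoerer 68 000 COLMAR"))
```

The point of the two separate groups is that the space sits outside every capture, so joining the groups removes it. In a regex replace, the replacement text would then refer to the groups (for example $1, $2 and so on) instead of repeating the search pattern.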
I'm trying to make a molecular composition calculator, but I can't seem to separate a formula by case and numbers into different cells.
Is it possible to do this in Excel?
Eg:
Cl2H0 ----> Cl | 2 | H | 0
A bit crude but you could write a parsing function like this that returns an array:
Public Function parseChem(str As String) As Variant()
'should error-check first that entire string is correct
Dim retArr() As Variant
Dim i As Long, numBlocks As Long
Dim currentChar As String, currentElement As String, typeOfChar As String
Dim digitChain As Boolean
For i = 1 To Len(str)
currentChar = Mid(str, i, 1)
typeOfChar = charType(currentChar)
Select Case typeOfChar
Case Is = "upperCase"
If currentElement <> "" Then
'possibly cast numbers to longs here, and at the end...
retArr(numBlocks) = currentElement
End If
numBlocks = numBlocks + 1
ReDim Preserve retArr(1 To numBlocks)
currentElement = currentChar
digitChain = False
Case Is = "lowerCase"
currentElement = currentElement & currentChar
Case Is = "digit"
If digitChain Then
currentElement = currentElement & currentChar
Else
'new digit block
retArr(numBlocks) = currentElement
numBlocks = numBlocks + 1
ReDim Preserve retArr(1 To numBlocks)
digitChain = True
currentElement = currentChar
End If
Case Else
'do something to flag error
End Select
Next i
retArr(numBlocks) = currentElement
parseChem = retArr
End Function
Private Function charType(str As String) As String
Dim ascii As Long
ascii = Asc(str)
If ascii >= 65 And ascii <= 90 Then
charType = "upperCase"
Exit Function
Else
If ascii >= 97 And ascii <= 122 Then
charType = "lowerCase"
Exit Function
Else
If ascii >= 48 And ascii <= 57 Then
charType = "digit"
Exit Function
End If
End If
End If
End Function
OK the algorithm in the end is very simple
If at any point in the formula you have a number, then look for the next capital letter and output all characters up to that point.
If at any point in the formula you have a letter, then look for the next capital letter *or number* and output all characters up to that point.
The formula is rather long
=IF(ISNUMBER(MID($A$1,SUM(LEN($B$1:B1))+1,1)+0),
MID(MID($A$1,SUM(LEN($B$1:B1))+1,9),1,MIN(FIND( MID("ABCDEFGHIJKLMNOPQRSTUVWXYZ",ROW($1:$26),1),MID($A$1,SUM(LEN($B$1:B1))+2,9)&"ABCDEFGHIJKLMNOPQRSTUVWXYZ" ))),
MID(MID($A$1,SUM(LEN($B$1:B1))+1,9),1,MIN(FIND( MID("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789",ROW($1:$36),1),MID($A$1,SUM(LEN($B$1:B1))+2,9)&"ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789" ))))
It must be entered as an array formula using Ctrl+Shift+Enter, and the 9 would need increasing (or changing to LEN($A$1)) if the chemical formula is longer than 9 characters.
Here's a shorter version that doesn't have to be entered as an array formula
=IF(ISNUMBER(MID($A1,SUMPRODUCT(LEN($B1:B1))+1,1)+0),
MID(MID($A1,SUMPRODUCT(LEN($B1:B1))+1,9),1,AGGREGATE(15,6,FIND( MID("ABCDEFGHIJKLMNOPQRSTUVWXYZ",ROW($1:$26),1),MID($A1,SUMPRODUCT(LEN($B1:B1))+2,9)&"A" ),1)),
MID(MID($A1,SUMPRODUCT(LEN($B1:B1))+1,9),1,AGGREGATE(15,6,FIND( MID("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789",ROW($1:$36),1),MID($A1,SUMPRODUCT(LEN($B1:B1))+2,9)&"A"),1)))
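The two rules stated at the top of this answer are easier to follow when written as a loop. The Python sketch below is only an illustration of those rules, with a hypothetical function name (split_formula); it does no validation of the chemical formula.

```python
def split_formula(formula):
    """Split a formula such as 'Cl2H0' into element symbols and numbers (hypothetical helper)."""
    tokens = []
    i = 0
    while i < len(formula):
        j = i + 1
        if formula[i].isdigit():
            # Rule 1: a number runs up to the next capital letter.
            while j < len(formula) and not formula[j].isupper():
                j += 1
        else:
            # Rule 2: a letter runs up to the next capital letter or digit.
            while j < len(formula) and not (formula[j].isupper() or formula[j].isdigit()):
                j += 1
        tokens.append(formula[i:j])
        i = j
    return tokens


print(" | ".join(split_formula("Cl2H0")))
```

For the example in the question this prints Cl | 2 | H | 0.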
If you are familiar with VBA then you could write a function which reads in the cell value (e.g. Cl2H0) and then a For loop that splits the string into separate values. You would then write these separated values (Cl, 2, H and 0) back to individual columns on the Excel sheet.
One way of doing this would be to use the Asc() function in a loop, which will give you the ASCII number corresponding to an individual character. ASCII characters 65 to 90 are upper-case characters. In your case you would want to split the string when the character does not fall within this range.
If you want to try this and post your example then I can give some more guidance, but it's hard to give more advice without first understanding whether you are trying to achieve this with VBA or some other means.
I've seen several examples of Excel formulas that can search for multiple words in a cell like so:
=IF(SUMPRODUCT(--(NOT(ISERR(SEARCH({"mail","post"},A4)))))>0,1,"")
And:
=OR(NOT(ISERR(SEARCH("mail",A4))),NOT(ISERR(SEARCH("post",A4))))
However, the results will pick up any instance of "mail" (i.e. "mail" or "email") or "post" (i.e. "post" or "posture"). Is there a way to run a search for multiple words and only the specific words listed?
Following the scheme:
The formula checks the characters around the match against an array of word-ending characters (such as "." or a space).
It looks at the character just before and the character just after the match to decide whether the match is a complete word.
If you only need to check the first occurrence, you can use Excel formulas:
B3 -> =IF(IFERROR(VLOOKUP(MID(A2;SEARCH(B2;A2)+LEN(B2);1);{" ";"-";".";",";";";":"};1;);FALSE)=FALSE;FALSE;TRUE)
B4 -> =OR(SEARCH(B2;A2)=1;IF(IFERROR(VLOOKUP(MID(A2;SEARCH(B2;A2)-1;1);{" ";"-";".";",";";";":"};1;);FALSE)=FALSE;FALSE;TRUE))
B5 -> =AND(B3;B4)
If you need to check EACH occurrence, it's better to use VBA:
Public Function FindWords(xx As Range, Stri As String) As Boolean
For i = 1 To Len(xx.Value)
If Mid(xx.Value, i, Len(Stri)) = Stri Then
If (i = 1) Then
If InStr(1, " ,.-;:_", Mid(xx.Value, i + Len(Stri), 1)) > 0 Or (i + 1 + Len(Stri) > Len(xx.Value)) Then
FindWords = True
Exit Function
End If
ElseIf (InStr(1, " ,.-;:_", Mid(xx.Value, i - 1, 1)) > 0) Then
If InStr(1, " ,.-;:_", Mid(xx.Value, i + Len(Stri), 1)) > 0 Or (i + Len(Stri) > Len(xx.Value)) Then
FindWords = True
Exit Function
End If
End If
End If
Next
FindWords = False
End Function
Adding the function in a Module:
B7 -> =FindWords(A2;B2)
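The check that FindWords performs can be restated compactly: a match counts only if the character on each side is a delimiter or the edge of the text. The Python sketch below illustrates that rule using the same delimiter set as the VBA code; the name contains_word is hypothetical and the comparison is case-sensitive.

```python
# Same delimiter characters as in the VBA function above.
DELIMITERS = set(" ,.-;:_")


def contains_word(text, word):
    """True if word occurs in text bounded by delimiters or the text edges (hypothetical helper)."""
    start = text.find(word)
    while start != -1:
        end = start + len(word)
        before_ok = start == 0 or text[start - 1] in DELIMITERS
        after_ok = end == len(text) or text[end] in DELIMITERS
        if before_ok and after_ok:
            return True
        # Not a whole word here, so look for the next occurrence.
        start = text.find(word, start + 1)
    return False


print(contains_word("send an email today", "mail"))
print(contains_word("check the mail.", "mail"))
```

The first call prints False because "mail" is only part of "email"; the second prints True.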
I'm trying to pad a string with 0's to the left. The length of the output string should be 7.
Here's my code :
inputstr = "38"
in = string(7 - Len(inputStr),0) & inputStr
msgbox in
I'm getting the error "Expected statement".
Please help me
Thank You
The following code will run 5% faster:
inputStr = "38"
result = Right("0000000" & inputStr, 7)
msgbox result
This function will left-pad an input value to the given number of characters using the given padding character without truncating the input value:
Function LPad(s, l, c)
Dim n : n = 0
If l > Len(s) Then n = l - Len(s)
LPad = String(n, c) & s
End Function
Output:
>>> WScript.Echo LPad(12345, 7, "0")
0012345
>>> WScript.Echo LPad(12345, 3, "0")
12345
in is a reserved word so can't be used as a variable name and you must pass a string "0" not an integer 0, so:
inputStr = "38"
result = string(7 - Len(inputStr), "0") & inputStr
msgbox result
Function:
Private Function LPad (str, pad, length)
LPad = String(length - Len(str), pad) & str
End Function
Use:
LPad(12345, "0", 7)
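For comparison, the padding logic can be checked outside VBScript with a small Python sketch. It is an illustration with a hypothetical name (lpad): the pad count is clamped at zero, so inputs longer than the target length are returned unchanged, as in the non-truncating LPad shown earlier.

```python
def lpad(value, length, pad="0"):
    """Left-pad value with pad up to length characters, never truncating (hypothetical helper)."""
    text = str(value)
    # max(0, ...) keeps the repeat count from going negative for long inputs.
    return pad * max(0, length - len(text)) + text


print(lpad("38", 7))
print(lpad(12345, 3))
```

This prints 0000038 and then 12345.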