Add brackets to an expression with regex - excel

I want to add parentheses to parts of an expression.
The parts are any multiplied expressions that are NOT in parentheses.
For example, change:
a * b + c + d * e * f + g
to:
(a * b) + c + (d * e * f) + g
Another example, change:
(a1+b1) + ( c1 * d1) + e1 * f1 *g1 * h1 + J1
to:
(a1+b1) + ( c1 * d1) + (e1 * f1 *g1 * h1) + J1
The only elements in the expression to check will be * + ( ) and variable names with no spaces, like a1 f1 p3 etc.
I came up with something but it shows the expressions WITH brackets. I want the opposite of that.
This is what I have:
Function testParen()
Dim cText As String, cPattern As String
Dim itemFound As Variant, nItems
Dim getParen As New RegExp
getParen.Global = True
getParen.IgnoreCase = True
cText = "(a1*q1) + (b1*c1) + d1*e1*f1 * (g1+h1) + (k1*m1) "
cPattern = "(^|[^*])\(([^()]+)\)"
'cPattern = "\("
getParen.Pattern = cPattern
cText = Replace(cText, " ", "")
Dim mc As MatchCollection
Set mc = getParen.Execute(cText)
nItems = mc.Count
'Debug.Print vbNewLine
If nItems > 0 Then
Debug.Print nItems
For Each itemFound In mc
Debug.Print Replace(itemFound, "+", "")
Next
Else
Debug.Print "No items"
End If
End Function

I recommend you rethink your approach and parse the expression properly rather than just inserting parentheses. Perhaps initially splitting on + would be simpler, rather than using a regex that you don't fully understand and can't confidently adjust.
Anyway, here is an example in JavaScript of the kind of regex that could be used. I will leave any conversion to VBA to you.
const rx = /(\+\s*|^\s*)((?:\s*\w+\s*\*)+\s*\w+)(?=\s*\+|\s*$)/g;
const expression = '(a1+b1) + ( c1 * d1) + e1 * f1 * g1 * h1 + J1';
console.log(expression.replace(rx, '$1($2)'));
Note that the above has not been properly tested and may be brittle. I am not going to try to explain it, but you could paste the regex into, for example, https://regex101.com/ if you want a breakdown of how it works.
In the regex, the first \+ could be replaced with, for example, [^(\s\w] and the second \+ with [^)\s\w] if you also want operators such as - to work.
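The suggestion above to split on + first can be sketched as follows. This is a minimal illustration in JavaScript (matching the regex example above), not a tested solution for every input: splitTopLevel and wrapProducts are names made up here, the expression is assumed to contain only variable names, *, + and balanced parentheses, and the spacing around + in the output is normalized to single spaces.

```javascript
// Split text on a separator character, ignoring separators inside parentheses.
function splitTopLevel(text, separator) {
  const parts = [];
  let depth = 0;
  let current = '';
  for (const ch of text) {
    if (ch === '(') depth += 1;
    else if (ch === ')') depth -= 1;
    if (ch === separator && depth === 0) {
      parts.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  parts.push(current);
  return parts;
}

// Wrap every term that contains a top-level '*' in parentheses.
function wrapProducts(expression) {
  return splitTopLevel(expression, '+')
    .map((term) => term.trim())
    .map((term) => (splitTopLevel(term, '*').length > 1 ? `(${term})` : term))
    .join(' + ');
}

console.log(wrapProducts('a * b + c + d * e * f + g'));
console.log(wrapProducts('(a1+b1) + ( c1 * d1) + e1 * f1 *g1 * h1 + J1'));
```

Because the parenthesis depth is tracked explicitly, a term such as ( c1 * d1) is left alone, while a mixed term such as d1*e1*f1 * (g1+h1) is wrapped as a whole.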

Related

VBA replace a string EXCEL 2019

I cannot extract the postal/zip code of a given address cell that comes like this :
"108, avenue du Grand Sud 37 170 CHAMBRAY les TOURS".
I have used :
=RECHERCHE(9^9;--("0"&STXT(A2;MIN(CHERCHE({0.1.2.3.4.5.6.7.8.9};A2&"0 123456789"));LIGNE($1:$100))))
Which sometimes works, sometimes not depending on the street number starting the address (here "108,").
The problem is the space of the pattern "37 170". I would like to remove the blank space in the pattern. Is there a regex way to search this pattern "## ###", and then to remove this poisonous blank space?
Thank you for your tricks.
I have tried this piece of code :
Function toto(r, Optional u = 0)
Application.Volatile
Dim i%, j%, adr$, cp$, loca$, x
x = Split(r)
For i = 0 To UBound(x)
If x(i) Like "#####" Then Exit For
Next
If i > UBound(x) Then
adr = r.Value 'optional
Else
cp = x(i)
For j = 0 To i - 1: adr = adr & x(j) & " ": Next
adr = Left$(adr, Len(adr) + (Len(adr) > 1))
For j = i + 1 To UBound(x): loca = loca & x(j) & " ": Next
loca = Left$(loca, Len(loca) + (Len(loca) > 1))
End If
x = Array(adr, cp, loca)
If 0 < u And u < 4 Then toto = x(u - 1) Else toto = x
End Function
The above code works fine for splitting addresses including street number, zip code, and city name. But it does not work when the zip code is ## ### = 2 digit integer - space - 3 digit integer.
Edit: 01 June 2021
Since it seems my question is not clear enough, let's rephrase :
Given an Excel worksheet containing, in each cell of column A (say from A1 down to A10000), complete addresses like this one:
"2 rue Rene cassin Centre Commercial Châlon 2 Sud 71 100 CHALON SUR SAONE"
or this one :
"15, Rue Emile Schwoerer 68 000 COLMAR"
Where "71 100" and "68 000" are a zip code in incorrect format because of the extra space between the 2 first digits and 3 last digits.
I need to split the Ai cell content in order to obtain :
in cell Bi : the text (street, etc.) placed left before the 2 first digits of the "wrong" zip code,
in cell Ci : the zip code with its correct format ("71100" and not "71 100"),
in cell Di : the text (city name) after the zip code.
It's a kind of left and right extraction around the zip code.
The above code that I have posted does not work.
In order to obtain the correct zip code format, I have tried the following regex function:
Function FindReplaceRegex(rng As Range, reg_exp As String, replace As String)
Set myRegExp = New RegExp
myRegExp.IgnoreCase = False
myRegExp.Global = True
myRegExp.Pattern = reg_exp
FindReplaceRegex = myRegExp.replace(rng.Value, replace)
End Function
But I am unable to determine the correct regular expression pattern to get rid of the space in the zip code.
PEH gave me the following pattern :
(.*)([0-9]{2} ?[0-9]{3})(.*)
When using the function, I have tried to define the replacement pattern by:
(.*)([0-9]{2}[0-9]{3})(.*)
But it would not work. Hope this will clarify my question.
Any idea is welcome. Thanks
If these input strings always have the same pattern, try:
=CONCAT(FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[.*0=0]"))
Depending on your needs/edge-cases, you could add more xpath expressions.
If this is VBA, I have a fix for you (please forgive the crappy naming convention, I'm scribbling this down at work while waiting for SQL to refresh):
Sub test1()
    a0 = Cells(1, 1) 'Get the text, in this case "108, avenue du Grand Sud 37 170 CHAMBRAY les TOURS"
    aa = Replace(a0, ",", " ") 'Make all delimiters the same type by turning commas into spaces; you may need more Replace calls here
    ab = Application.Trim(aa) 'Reduce all whitespace to single entries, i.e. " " rather than "  "
    ac = Split(ab, " ", -1) 'Now split by that single whitespace entry
    Dim txt()
    i2 = 0
    For i1 = 0 To UBound(ac) - 1 'Step through each entry in our "split" list
        If IsNumeric(ac(i1)) And IsNumeric(ac(i1 + 1)) Then
            'Two numbers back to back, join
            ReDim Preserve txt(i2)
            txt(i2) = ac(i1) & ac(i1 + 1)
            i2 = i2 + 1
            i1 = i1 + 1 'Skip the entry that was just joined
        Else
            'Not two numbers back to back, don't join
            ReDim Preserve txt(i2)
            txt(i2) = ac(i1)
            i2 = i2 + 1
        End If
    Next i1
    If i1 = UBound(ac) Then
        'Last entry was not consumed by a join, so add it to txt()
        ReDim Preserve txt(i2)
        txt(i2) = ac(UBound(ac))
    End If
End Sub
edit 2021-06-01:
The above will generate a list (txt) of all the entries within your address. You can then reassemble if you wish, or extract out the postcode only.
If you want it as a function, then it would be:
Public Function getPostcode(a0)
    'Returns the whole address with back-to-back numbers joined, e.g. "37 170" becomes "37170"
    aa = Replace(a0, ",", " ")
    ab = Application.Trim(aa)
    ac = Split(ab, " ", -1)
    Dim txt()
    i2 = 0
    For i1 = 0 To UBound(ac) - 1
        If IsNumeric(ac(i1)) And IsNumeric(ac(i1 + 1)) Then
            'Two numbers back to back, join
            ReDim Preserve txt(i2)
            txt(i2) = ac(i1) & ac(i1 + 1)
            i2 = i2 + 1
            i1 = i1 + 1 'Skip the entry that was just joined
        Else
            'Not two numbers back to back, don't join
            ReDim Preserve txt(i2)
            txt(i2) = ac(i1)
            i2 = i2 + 1
        End If
    Next i1
    If i1 = UBound(ac) Then
        'Last entry was not consumed by a join, so add it to txt()
        ReDim Preserve txt(i2)
        txt(i2) = ac(UBound(ac))
    End If
    'Re-assemble string for return (Join avoids a leading space)
    getPostcode = Join(txt, " ")
End Function
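On the regex route asked about in the question: the replacement argument of RegExp.Replace is not a second pattern but a template that refers back to capture groups ($1, $2, ...), which is why passing a pattern as the replacement does not remove the space. The two digit groups also need to be captured separately so they can be re-joined without it. Here is a minimal sketch of that idea in JavaScript, like the regex example earlier on this page. The pattern syntax should carry over to the VBScript RegExp object, but that conversion is untested. The name splitAddress is made up for illustration, and the sketch assumes that the first standalone "## ###" or "#####" token in the address is the zip code.

```javascript
// Split an address into street, zip code (space removed) and city.
// Assumption: the first standalone 2-digit + 3-digit token is the zip code.
function splitAddress(address) {
  const pattern = /^(.*?)\s*\b(\d{2}) ?(\d{3})\b\s*(.*)$/;
  const match = pattern.exec(address);
  if (!match) return null; // no zip-like token found
  const [, street, zipHead, zipTail, city] = match;
  return { street, zip: zipHead + zipTail, city };
}

console.log(splitAddress('108, avenue du Grand Sud 37 170 CHAMBRAY les TOURS'));
console.log(splitAddress('15, Rue Emile Schwoerer 68 000 COLMAR'));
```

If only the space needs removing, the same groups work with a plain replace: pattern \b(\d{2}) (\d{3})\b with the replacement template $1$2. A street number that happens to look like "## ###" would be matched too, so real data may need a stricter rule.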

How to filter symmetric words from a column of data?

I have a list of words and I would like to find all symmetric words and somehow put the value 1 next to each of them (see the picture).
This does the trick:
=1*(A1=CONCAT(MID(A1,LEN(A1)-SEQUENCE(1,LEN(A1),0),1)))
It reads the string in a cell backwards using MID and SEQUENCE, and compares the CONCAT result with the original to see if it is the same, i.e. the string is symmetric.
Multiplying by 1 forces the Boolean into an integer.
With VBA. This assumes that a single character is symmetric:
Public Function Sym(s As String) As Long
Dim L As Long, L2 As Long
Dim p1 As String, p2 As String
L = Len(s)
L2 = Int(L / 2)
Sym = 0
If L Mod 2 = 0 Then
' even
p1 = Mid(s, 1, L2)
p2 = StrReverse(Mid(s, L2 + 1))
If p1 = p2 Then
Sym = 1
End If
Else
' odd
p1 = Mid(s, 1, L2)
p2 = StrReverse(Mid(s, L2 + 2))
If p1 = p2 Then
Sym = 1
End If
End If
End Function
This will handle both an even or odd number of characters.
EDIT#1:
Simply:
Public Function Sym(s As String) As Long
    Sym = 0
    If s = StrReverse(s) Then Sym = 1
End Function
With Microsoft365, try:
Formula in B1:
=EXACT(A1,CONCAT(MID(A1,SEQUENCE(LEN(A1),,LEN(A1),-1),1)))
Formula in C1:
=--EXACT(A1,CONCAT(MID(A1,SEQUENCE(LEN(A1),,LEN(A1),-1),1)))
If you are working in a version without CONCAT() it will get significantly more verbose, but it is still possible:
=SUMPRODUCT(--EXACT(MID(A1,ROW(A$1:INDEX(A:A,LEN(A1))),1),MID(A1,(LEN(A1)+1)-ROW(A$1:INDEX(A:A,LEN(A1))),1)))=LEN(A1)
This, again, can be wrapped to return either 1 or 0 if you prefer that over the Boolean results:
=--(SUMPRODUCT(--EXACT(MID(A1,ROW(A$1:INDEX(A:A,LEN(A1))),1),MID(A1,(LEN(A1)+1)-ROW(A$1:INDEX(A:A,LEN(A1))),1)))=LEN(A1))
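One difference between the answers above is worth spelling out: Excel's = operator compares text without regard to case, while EXACT is case-sensitive, so the formulas can disagree on a word like "Anna". The JavaScript sketch below shows the reverse-and-compare idea with that choice made explicit. The function name isSymmetric and the ignoreCase parameter are made up for illustration.

```javascript
// Return 1 if the word reads the same backwards, otherwise 0.
// ignoreCase mimics Excel's "=" comparison; the default mimics EXACT.
function isSymmetric(word, ignoreCase = false) {
  const text = ignoreCase ? word.toLowerCase() : word;
  const reversed = Array.from(text).reverse().join('');
  return text === reversed ? 1 : 0;
}

console.log(isSymmetric('level'));      // case-sensitive check
console.log(isSymmetric('Anna', true)); // case-insensitive check
```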

Converting a string of HH:MM:SS to numeric in VBA

The problem is pretty straightforward: convert the string representation into a number representing minutes. The only problem is that my code only works when I type the HH:MM:SS value within a set of double quotes. If the quotes are not included, the code runs but I get a #VALUE! error in the Excel cell. I figured this has to do with the length. With or without the quotes, the code gets as far as the first index of the converted character array (or I should say the 0th). Here is my code:
Public Function TimeToDbl(val As String)
'Convert string into character array
Dim buff() As String
ReDim buff(Len(val) - 1)
For i = 1 To Len(val)
buff(i - 1) = Mid$(val, i, 1)
Next
'Separate hours,minutes,seconds
Dim h, m, s As Double
h = 0
m = 0
s = 0
For i = 1 To 2
h = (h * 10 ^ (i - 1)) + CInt(buff(i))
Next i
For i = 4 To 5
m = (m * 10 ^ (i - 4)) + CInt(buff(i))
Next i
For i = 7 To 8
s = (s * 10 ^ (i - 7)) + CInt(buff(i))
Next i
'Combine values centering minutes
s = s * 0.017
h = h * 60
m = h + m + s
TimeToInt = m
End Function
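When you pass the string with quotes, the quotes are part of the string, so when it is copied to the array, buff(0) holds the opening " and the digits start at buff(1). When you pass it without quotes, the value is still of type String, but the digits start at buff(0), so the loops that begin at index 1 read the wrong characters and CInt fails on the ":".
Make a small change to the loops for h, m, and s so that each starts at the first digit of its field. For h it looks like this (m and s shift the same way, to indices 3 to 4 and 6 to 7):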
For i = 0 To 1
h = (h * (10 ^ i) + CInt(buff(i)))
Next i
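As a cross-check of the arithmetic, here is a minimal JavaScript sketch of the same conversion that accepts the text with or without surrounding quotes. It is an illustration, not a port of the VBA above: the name timeToMinutes is made up, and seconds are divided by 60 rather than multiplied by the rounded factor 0.017 used in the question.

```javascript
// Convert "HH:MM:SS" (optionally wrapped in double quotes) to minutes.
function timeToMinutes(text) {
  const match = /^"?(\d{1,2}):(\d{2}):(\d{2})"?$/.exec(text.trim());
  if (!match) return NaN; // not in HH:MM:SS form
  const [hours, minutes, seconds] = match.slice(1).map(Number);
  return hours * 60 + minutes + seconds / 60;
}

console.log(timeToMinutes('01:30:30'));
console.log(timeToMinutes('"01:30:30"'));
```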

Replace one character with two using replace function

I am trying to convert accented characters to regular characters. Some characters need to be replaced with two characters. I tried MID(string,i,2).
Function ChangeAccent(thestring As String)
Dim A As String * 1
Dim B As String * 1
Dim C As String * 1
Dim D As String * 1
Dim i As Integer
Const LatChars="ßÄÖÜäöü"
Const OrgChars= "SSAEOEUEaeoeue"
For i = 1 To Len(LatChars)
A = Mid(LatChars, i, 1)
B = Mid(OrgChars, i, 2)
thestring = Replace(thestring, A, B)
Next
Const AccChars="ŠŽšžŸÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåçèéêëìíîïðñòóôõöùúûüýÿ"
Const RegChars= "SZszYAAAAAACEEEEIIIIDNOOOOOUUUUYaaaaaaceeeeiiiidnooooouuuuyy"
For i = 1 To Len(AccChars)
C = Mid(AccChars, i, 1)
D = Mid(RegChars, i, 1)
thestring = Replace(thestring, C, D)
Next
ChangeAccent = thestring
End Function
The code works for one-to-one replacement (one character replaced by one character).
I want to replace one character in the variable LatChars with two characters from OrgChars, i.e. ß with SS, Ä with AE, and so on.
Mid(OrgChars, i, 2) is not extracting the two characters I expect.
Minor changes:
Dim B As String * 2
B = Mid(OrgChars, i * 2 - 1, 2)
Option Explicit
Function ChangeAccent(thestring As String)
Dim A As String * 1
Dim B As String * 2
Dim C As String * 1
Dim D As String * 1
Dim i As Integer
Const LatChars = "ßÄÖÜäöü"
Const OrgChars = "SSAEOEUEaeoeue"
For i = 1 To Len(LatChars)
A = Mid(LatChars, i, 1)
B = Mid(OrgChars, i * 2 - 1, 2)
thestring = Replace(thestring, A, B)
Next
Const AccChars = "ŠŽšžŸÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåçèéêëìíîïðñòóôõöùúûüýÿ"
Const RegChars = "SZszYAAAAAACEEEEIIIIDNOOOOOUUUUYaaaaaaceeeeiiiidnooooouuuuyy"
For i = 1 To Len(AccChars)
C = Mid(AccChars, i, 1)
D = Mid(RegChars, i, 1)
thestring = Replace(thestring, C, D)
Next
ChangeAccent = thestring
End Function
B = Mid(OrgChars, i,2)
Should probably be
B = Mid(OrgChars, i*2-1,2)
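The index arithmetic is easier to see when written out: the i-th character of LatChars corresponds to the i-th pair in OrgChars, and pairs start at every second position. The JavaScript sketch below walks the same two constants from the question. The name expandUmlauts is made up for illustration. Note that JavaScript strings are 0-based, so the pair starts at 2 * i there, whereas VBA's Mid is 1-based, which gives i * 2 - 1.

```javascript
const latChars = 'ßÄÖÜäöü';
const orgChars = 'SSAEOEUEaeoeue';

// Replace each character of latChars with its two-character pair in orgChars.
function expandUmlauts(text) {
  let result = text;
  for (let i = 0; i < latChars.length; i += 1) {
    // Pair i occupies offsets 2*i and 2*i + 1 (0-based).
    const pair = orgChars.slice(2 * i, 2 * i + 2);
    result = result.split(latChars[i]).join(pair);
  }
  return result;
}

console.log(expandUmlauts('Straße Äpfel über'));
```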
One method is to use two arrays. One that contains the character you wish to replace and the other its replacement. This method depends on both arrays being in sync with one another. Element 1 in the first array must match element 1 in the second, and so on.
This method allows you to ignore the string lengths. There is no longer any need to process 1 and 2 character replacement strings separately. This code can also scale to 3, 4 or more character replacements without a logic change.
I've used the split function to build the arrays. I find this saves time when typing out the code. But you may prefer to define the elements individually, which is arguably easier to read.
Example
Sub Demo001()
' Demos how to replace special characters of various lengths.
Dim ReplaceThis() As String ' Array of characters to replace.
Dim WithThis() As String ' Array of replacement characters.
Dim c As Integer ' Counter to loop over array.
Dim Sample As String ' Contains demo string.
' Set up demo string.
Sample = "ß - Ä - Š"
' Create arrays using split function and comma delimiter.
ReplaceThis = Split("ß,Ä,Š", ",")
WithThis = Split("SS,AE,S", ",")
' Loop over replacements.
For c = LBound(ReplaceThis) To UBound(ReplaceThis)
Sample = Replace(Sample, ReplaceThis(c), WithThis(c))
Next
' Show result.
MsgBox Sample
End Sub
Returns
SS - AE - S
EDIT: Answer rewritten, as my first attempt misunderstood (and did not answer) the OP's question.

How to increment value in for loop which is declared as string in code, but has numeric val?

This code returns an error in the For loop because of a type mismatch:
Dim sp, q, spb, spt, cp, cpb, cpt, g, n, i As String
n = Range("G3").Value
cp = ActiveCell.Offset(0, -4).Value
q = ActiveCell.Offset(0, -3).Value
cpt = cp * q
cpb = cp * n
i = 0.05
For i = 0.05 To 10
sp = cp + i
spt = (sp * q)
spb = spt * n
g = spt - cpt - spb - cpb
If g > 0 Then
Range("I10").Value = g
Exit For
End If
Next i
End Sub
Declare/Dim your variables to be of the correct type:
Dim i As Double
For the other variables, a numerical type would be appropriate too. If Excel has problems pulling them from Cells of String type, use a conversion (CDbl, CLng, ...), before you do the computations.
As stated above, dim your variable(s) correctly. For instance in your code sp, q, spb, spt, cp, cpb, cpt, g, n are all set to the "variant" type, which may or may not be what you're after. My guess is that you're wanting to set them all as strings, which would mean you'd have to do something like:
Dim sp as String, q as String, spb as String (etc, etc, etc). Of course, doing that will cause a lot of your code down the line to bomb (cpt = cp * q as an example).
Also, the default increment of a loop is 1. Is that what you're expecting in your for loop? In your example above, your for loop never gets to 10 (stops at 9.05).
I'm not sure if this helps you at all, but hopefully it will clear up any confusion down the road.
Good luck!
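To make the step-size point concrete: in VBA the increment is set with the Step keyword (For i = 0.05 To 10 Step 0.05). The sketch below, in JavaScript like the other runnable example on this page, reproduces the search from the question with an integer counter and derives the fractional increment from it, which avoids accumulating floating-point error over 200 additions. The function name and the sample inputs are invented for illustration. The formulas for cpt, cpb, spt, spb and g are copied from the question.

```javascript
// Find the smallest markup (in steps of 0.05, up to 10) that gives a positive gain.
function firstProfitableMarkup(cp, q, n) {
  const cpt = cp * q;
  const cpb = cp * n;
  for (let k = 1; k <= 200; k += 1) {
    const i = k / 20; // 0.05, 0.10, ..., 10.00
    const sp = cp + i;
    const spt = sp * q;
    const spb = spt * n;
    const g = spt - cpt - spb - cpb;
    if (g > 0) return { markup: i, gain: g };
  }
  return null; // no markup up to 10 is profitable
}

// Hypothetical inputs: cost 10, quantity 100, rate 0.01.
console.log(firstProfitableMarkup(10, 100, 0.01));
```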
There is a validation call you can make on the cell first:
If IsNumeric(Range("I10").Value) Then
    '// Do some numeric stuff
Else
    '// Do some non-numeric stuff
End If
There is a Val() function to try and interpret string content as numbers:
http://office.microsoft.com/en-gb/excel-help/HV080557263.aspx

Resources