Parsing and expanding algebraic equations that have variables using VBA - excel

I am working on a VBA script that is parsing down a user input string for later evaluation. I have managed to get it down to the following, where the letters are variables that contain strings, numbers, or other functions to evaluate later. They are being compared in AND/OR statements that have yet unspecified variables, so I can't just evaluate it ahead of time. Thus, I have simplified this:
( ( [$$FIELD_1] [$$REL_GREATER_THAN] [$$FUNC_ROUND_OPEN] 6000.25
[$$FUNC_ROUND_CLOSE] AND [$$FIELD_1] [$$REL_LESS_THAN] [$$FUNC_ABS_OPEN]
[$$FUNC_FLOOR_OPEN] - 7000.99 [$$FUNC_FLOOR_CLOSE] [$$FUNC_ABS_CLOSE] OR
[$$FIELD_1] [$$REL_LESS_THAN] [$$FIELD_3] ^ 2 + 8 ) AND ( [$$STRING_1] &
[$$STRING_2] & [$$STRING_3] [$$REL_NOT_EQUAL] [$$FIELD_5] & [$$STRING_4] &
[$$FIELD_4] ) ) OR [$$FIELD_6] [$$REL_EQUAL] [$$STRING_5]
by replacing the Booleans with variables, and I replaced AND with multiplication, OR with addition:
( ( A * B + C ) * ( D ) ) + E
What I need to do now is expand that out using the distributive property of algebra. The result would look something like
A*B*D + C*D + E
which I can then start substituting values back into.
The order the terms are in doesn't really matter, it just has to be grouped down into multiplication or addition. It has become a much harder problem than I initially thought, because I need it to be robust enough to handle other equations, not just this one particular example.
I have looked for something built into VBA that can do this naturally, but I haven't found anything yet. This is a similar question, but for Java and C: Parse non standard form to standard form in java. Unfortunately, the only answer there essentially boiled down to, "just parse it"... which is the part I need help with.
How can I programmatically figure out these distributive expansions? Is there an algorithm for this?

assuming you have a string with your equation
equation = "( ( 2 * 3 + 2.5 ) * ( -2.5 ) ) + 1.9"
you can Evaluate it
MsgBox equation & " = " & Evaluate(Replace(equation, ",", "."))
where the Replace is there to make sure you have the dot as decimal separator
here's a working snippet, where a I used a Dictionary to keep track of name variables and their value:
Dim equation As String
equation = "( ( A * B + C ) * ( D ) ) + E"
Dim varDict As Scripting.Dictionary
Set varDict = New Scripting.Dictionary
With varDict
.Add "A", 2# ' assigning variable "A" the value 2.0
.Add "B", 3 ' assigning variable "B" the value 3.0
.Add "C", 2.5 ' assigning variable "C" the value 2.5
.Add "D", -2.5 ' assigning variable "D" the value -2.5
.Add "E", 1.9 ' assigning variable "E" the value 1.9
End With
Dim key As Variant
For Each key In varDict ' loop all variables
equation = Replace(Replace(equation, key, varDict(key)), ",", ".") ' replace each variable name within 'equation' string with its corresponding value, and make sure you use dots for decimal separator
Next
MsgBox equation & " = " & Evaluate(equation) ' show the result

Related

Excel FindJobCode's problems

I am new in VBA and I have a code as below to find some job numbers in a description.
However, i have 3 problems on it...
if 1st character is small letter such as "s", "m", then it show error
i cannot solve Example3, the result will show "M3045.67," but all i need is "M3045.67" only, no comma
i don't know why it is failed to run the code Range("E2").Value = "Overhead" after Else in Example5
but for problem 3, i can run result "overhead" before i add 2nd criteria, is something wrong there ? Please help~~~thanks.
P.S. the looping will be added after solving above questions......
Sub FindCode()
'Example1 : G5012.123 Management Fee / Get Result = G5012.123
'Example2 : G3045.67 Management Fee / Get Result = G3045.67
'Example3 : M3045.67, S7066 Retenal Fee / Get Result = M3045.67,
'Example4 : P9876-123A Car Park / Get Result = P9876
'Example5 : A4 paper / Get result = Overehad
'Criteria1 : 1st Character = G / S / M / P
If Left(Range("A2"), 1) = "G" Or Left(Range("A2"), 1) = "S" Or Left(Range("A2"), 1) = "M" Or Left(Range("A2"), 1) = "P" Then
'Criteria2 : 2nd-5th Character = Number only
If IsNumeric(Mid(Range("A2"), 2, 4)) Then
'Get string before "space"
Range("E2").Value = Left(Range("A2"), InStr(1, Range("A2"), " ") - 1)
Else
'If not beginning from Crit 1&2, show "Overhead"
Range("E2").Value = "Overhead"
End If
End If
'If start from "P", get first 5 string
If Left(Range("A2"), 1) = "P" And IsNumeric(Mid(Range("A2"), 2, 4)) Then
Range("E2").Value = Left(Range("A2"), 5)
Else
End If
End Sub
The function below will extract the job number and return it to the procedure that called it.
Function JobCode(Cell As Range) As String
' 303
'Example1 : G5012.123 Management Fee / Get Result = G5012.123
'Example2 : G3045.67 Management Fee / Get Result = G3045.67
'Example3 : M3045.67, S7066 Rental Fee / Get Result = M3045.67,
'Example4 : P9876-123A Car Park / Get Result = P9876
'Example5 : A4 paper / Get result = Overhead
Dim Fun As String ' function return value
Dim Txt As String ' Text to extract number from
' Minimize the number of times your code reads from the sheet because it's slow
Txt = Cell.Value ' actually, it's Cells(2, 1)
' Criteria1 : 1st Character = G / S / M / P
If InStr("GSMP", UCase(Left(Txt, 1))) Then
Txt = Split(Txt)(0) ' split on blank, take first element
' Criteria2 : 2nd-5th Character = Number only
' Isnumeric(Mid("A4", 2, 4)) = true
If (Len(Txt) >= 5) And (IsNumeric(Mid(Txt, 2, 4))) Then
Fun = Replace(Txt, ",", "")
Fun = Split(Fun, "-")(0) ' discard "-123A" in example 4
End If
End If
' If no job number was extracted, show "Overhead"
If Len(Fun) = 0 Then Fun = "Overhead"
JobCode = Fun
End Function
The setup as a function, rather than a sub, is typical for this sort of search. In my trials I had your 5 examples in A2:A6 and called them in a loop, giving a different cell to the function on each loop. Very likely, this is what you are angling for, too. This is the calling procedure I used for testing.
Sub Test_JobCode()
' 303
Dim R As Long
For R = 2 To Cells(Rows.Count, "A").End(xlUp).Row
' I urge you not to use syntax for addressing ranges when addressing cells
Debug.Print JobCode(Cells(R, "A")) ' actually, it's Cells(2, 1)
Next R
End Sub
Of course, instead of Debug.Print JobCode(Cells(R, "A")) you could also have Cells(R, "B").Value = JobCode(Cells(R, "A"))
The reason why your Else statement didn't work was a logical error. The "Overhead" caption doesn't apply if criteria 1 & 2 aren't met but if all previous efforts failed, which is slightly broader in meaning. This combined with the fact that Isnumeric(Mid("A4", 2, 4)) = True, causing the test not to fail as you expected.
In rough terms, the code first checks if the first letter qualifies the entry for examination (and returns "Overhead" if it doesn't). Then the text is split into words, only the first one being considered. If it's too short or non-numeric no job code is extracted resulting in "Overhead" in the next step. If this test is passed, the final result is modified: The trailing comma is removed (it it exists) and anything appended with a hyphen is removed (if it exists). I'm not sure you actually want this. So, you can easily remove the line. Or you might add more modifications at that point.
What you are trying to do is FAR easier using regular expression matching and replacing, so I recommend enabling that library of functions. The best news about doing that is that you can invoke those functions in EXCEL formulas and do not need to use Visual Basic for Applications at all.
To enable Regular Expressions as Excel functions:
Step 1: Enable the Regular Expression library in VBA.
A. In the Visual Basic for Applications window (where you enter VBA code) find the Tools menu and
select it, then select the References... entry in the sub-menu.
B. A dialogue box will appear listing the possible "Available References:" in alphabetical order.
Scroll down to find the entry "Microsoft VBScript Regular Expressions 5.5".
C. Check the checkbox on that line and press the OK button.
Step 2: Create function calls. In the Visual Basic for Applications window select Insert..Module. Then paste the following VBA code into the blank window that comes up:
' Some function wrappers to make the VBScript RegExp reference Library useful in both VBA code and in Excel & Access formulas
'
Private rg As RegExp 'All of the input data to control the RegExp parsing
' RegExp object contains 3 Boolean options that correspond to the 'i', 'g', and 'm' options in Unix-flavored regexp
' IgnoreCase - pretty self-evident. True means [A-Z] matches lowercase letters and vice versa, false means it won't
' IsGlobal - True means after the first match has been processed, continue on from the current point in DataString and look to process more matches. False means stop after first match is processed.
' MultiLine - False means ^ and $ match only Start and End of DataString, True means they match embedded newlines. This provides an option to process line-by-line when Global is true also.
'
' Returns true/false: does DataString match pattern? IsGlobal=True makes no sense here
Public Function RegExpMatch(DataString As String, Pattern As String, Optional IgnoreCase As Boolean = True, Optional IsGlobal As Boolean = False, Optional MultiLine As Boolean = False) As Boolean
If rg Is Nothing Then Set rg = New RegExp
rg.IgnoreCase = IgnoreCase
rg.Global = IsGlobal
rg.MultiLine = MultiLine
rg.Pattern = Pattern
RegExpMatch = rg.Test(DataString)
End Function
'
' Find <pattern> in <DataString>, replace with <ReplacePattern>
' Default IsGlobal=True means replace all matching occurrences. Call with False to replace only first occurrence.
'
Public Function RegExpReplace(DataString As String, Pattern As String, ReplacePattern As String, Optional IgnoreCase As Boolean = True, Optional IsGlobal As Boolean = True, Optional MultiLine As Boolean = False) As String
If rg Is Nothing Then Set rg = New RegExp
rg.IgnoreCase = IgnoreCase
rg.Global = IsGlobal
rg.MultiLine = MultiLine
rg.Pattern = Pattern
RegExpReplace = rg.Replace(DataString, ReplacePattern)
End Function
Now you can call RegExpMatch & RegExpReplace in Excel formulas and we can start to think of how to solve your particular problem. To be a match, your string must start with G, S, M, or P. In a regular expression code that is ^[GSMP], where the up-arrow says to start at the beginning and the [GSMP] says to accept a G, S, M or P in the next position. Then any matching string must next have a number of numeric digits. Code that as \d+, where the \d means one numeric digit and the + is a modifier that means accept one or more of them. Then you could have a dot followed by some more digits, or not. This is a little more complicated - you would code it as (\.\d+)? because dot is a special character in regular expressions and \. says to accept a literal dot. That is followed by \d+ which is one or more digits, but this whole expression is enclosed in parentheses and followed by a ?, which means what is in parentheses can appear once or not at all. Finally, comes the rest of the line and we don't really care what is in it. We code .*$ for zero or more characters (any) followed by the line's end. That all goes together as ^[GSMP]\d+(\.\d+)?.*$.
Putting that pattern into our RegExpReplace call:
=RegExpReplace(A2,"^([GSMP]\d+(\.\d+)?).*$","$1")
We wrapped the part we were interested in keeping in parentheses because the "$1" as part of the replacement pattern says to use whatever was found inside the first set of parentheses. Here is that formula used in Excel
This works for all your examples but the last one, which is your else clause in your logic. We can fix that by testing whether the pattern matched using RegExpMatch:
=IF(regexpMatch(A2,"^([GSMP]\d+(\.\d+)?).*$"),RegExpReplace(A2,"^([GSMP]\d+(\.\d+)?).*$","$1"),"Overhead")
This gives the results you are looking for and you have also gained a powerful text manipulation tool to solve future problems.

How to identify and remove a single letter from a string in a cell?

I have a dataset of names in a column in Excel.
However, only some but not all of the names have a letter attached to the end of it (e.g. John Doe A, Kai Jin, Johnny Desplat Lang B, etc).
Can anyone think of a method to remove the letter from the end of the name from each row, if it is there? Such that, using the example above, I will be left with: John Doe, Kai Jin, Johnny Desplat Lang, etc.
I am fairly familiar with VBA and Excel and would be open to trying anything at all.
Thank you for your help with this question! Apologies beforehand if this seems like an elementary question but I have no idea how to begin to solve it.
"I am fairly familiar with VBA and Excel and would be open to trying anything at all."
If so, then this can be done with a simple formula if you wish to avoid VBA. With your value in A1:
=IF(MID(A1,LEN(A1)-1,1)=" ",LEFT(A1,LEN(A1)-2),A1)
If you must use VBA, I think the Like operator comes in handy:
Sub Test()
Dim arr As Variant: arr = Array("John Doe A", "Kai Jin", "Johnny Desplat Lang B")
For Each el In arr
If el Like "* ?" Then 'Or "* [A-Z]" if you must check for uppercase alpha.
Debug.Print Left(el, Len(el) - 2)
Else
Debug.Print el
End If
Next
End Sub
Just for fun and in order to demonstrate another approach via the Filter() function:
Function ShortenName(ByVal FullName As Variant) As String
'Purpose: remove a single last letter
Dim n: n = Split(FullName, " "): n = Len(n(UBound(n)))
ShortenName = Left(FullName, Len(FullName) + 2 * (n = 1))
End Function
Explanation
Applying the Split() function upon the full name and isolating the last name token (via UBound()) allows to check for a single letter length (variable n).
The function result returns the entire string length minus 2 (last letter plus preceding space) in case of a single letter (the the condition n = 1 then results in True equalling -1). - Alternatively you could have coded: ShortenName = Left(FullName, Len(FullName) - IIf(n = 1, 2, 0))

User defined column letter in a Range call

Asking to user the letter referencing a column ; then using the answer to get the range of this a cell with in this column at some row.
But the "Range" refuses to recognize the letter input.
What am I missing in the following simple two lines ?
UserCol = Application.InputBox(" Please enter the column...", Type:=0)
Set Z = Range(UserCol & 5)
You need to use Type:=2.
Using Type:=0 will return ="<input string>", rather than just <input string>.
So, after:
Dim UserCol As String
UserCol = Application.InputBox(" Please enter the column...", Type:=2)
You can either do:
Set Z = Cells(5, UserCol)
OR:
Set Z = Range(UserCol & "5")
I would also suggest that use Option Explicit and also fully qualify range references. e.g. instead of Set Z = Range(UserCol & "5"), use Set Z = Thisworkbook.sheets("MySheetName").Range(UserCol & "5")
Try this:
UserCol = Application.InputBox(" Please enter the column...", Type:=2)
Set Z = Range((UserCol & 5))
I set Type to 2 to return a string from your user's input (see her for more)
Additionally, I added a parenthesis to the Range, because (UserCol & 5) becomes e.g. A5, and you need Range(A5).
This way is easier:
Dim UserCol As String
UserCol = Application.InputBox(" Please enter the column...")
Set Z = Cells(5, UserCol)
I don't know how did you declare your UserCol or if you even declared it. If you didn't and to avoid problems always use Option Explicit on the top of your module.
This is a really simple bug to fix: Set Z = Range(UserCol & "5")
Why? Because when you use implicit conversion, by typing UserCol & 5, VBA includes a space between UserCol and the 5 (and also after the 5).
Here is a test:
MsgBox "A" & 5 'Outputs "A 5 "
MsgBox "A" & "5" 'Outputs "A5"
(As Gravitate points out, Application.InputBox(" Please enter the column...", Type:=0) is a Formula, so an input of "A" would give you "=""A"", and not "A" -since "=""A""5" is not a valid cell reference either, use Type:=2 for Text or InputBox(" Please enter the column...") without the Application. or the type filtering)

Comparing time in ASP classic/VBscript

I am working on a function for an asp page that compares if a time entered is greater than a time with added leeway. I noticed certain times when checked would fail the test when the times are equal. Included is a snip of my function to illustrate. Not sure why equal dates would fail, and would like to know if this is a good way to go about comparing time.
<%
function TimeTest(testTime, checkTime, buffer, try)
checkingTime = FormatDateTime(cdate(DateAdd("n", buffer, cdate(checkTime))),4)
if try = 1 then
testTime = FormatDateTime(testTime, 4)
checktime = FormatDateTime(checkTime, 4)
end if
if cdate(testTime) > DateAdd("n", buffer, cdate(checkTime)) then
TimeTest = "<p class = 'redS'>Fails! testTime: "&testTime&" < checkTime:"&checkingTime&"</p>"
else
TimeTest = "<p class = 'greenS'>Works! testTime: "&testTime&" > checkTime:"&checkingTime&"</p>"
end if
end function
response.write("<br><br><h1>Test2</h1><br>")
for i=0 to 23
for j=0 to 59
response.write(TimeTest(i&":"&j&":00", i&":00:00", j, 1))
response.write("<BR>")
next
next
%>
This problem has earned my attention! I can reproduce the results and it's very unclear what's going on behind the scenes in these comparisons. However, I have a workaround for you
Here is a modified version of the code that I've been using to analyse the issue...
<%
Option Explicit
Function TimeTest(a, b, buffer)
Dim c : c = DateAdd("n", buffer, b)
Dim s : s = Join(Array("a=" & a, "b=" & b, "c=" & c, "buffer=" & buffer), ", ")
Dim passed : passed = a <= c
'Dim passed : passed = DateDiff("s", a, c) <= 0
If passed Then Exit Function
Dim color : color = "red" : If passed Then color = "green"
TimeTest = "<div style='color:" & color & "'>" & s & "</div>"
End Function
Dim i, j, a, b
For i = 0 To 23
For j = 0 To 59
a = CDate(i & ":" & j & ":00")
b = CDate(i & ":00:00")
'a = CDate(Date() & " " & i & ":" & j & ":00")
'b = CDate(Date() & " " & i & ":00:00")
Response.Write(TimeTest(a, b, j))
Next
Response.Write("<hr>")
Next
%>
Note that commenting out line 13 will reveal lines that pass. By default, I'm showing only failures.
The first thing to note is that I have some commented variants on lines 24-25 where I add today's date to the value before casting it. Interestingly, doing this changes the pattern of which times fail the test. There are still roughly the same number of failures but they occur at different buffer values.
This leads me to believe that behind the scenes in VBScript, these datetimes might be cast to floating-point numbers when you use the native < <= > >= comparison operators on them and that's resulting in some precision errors. If they were converted to long integers, then they should surely be correct.
I did a version of the code where instead of using a direct comparison on the VBDateTimes, I compared the integer representation of them (unix time) using this function:
Function date2epoch(myDate)
date2epoch = DateDiff("s", "01/01/1970 00:00:00", myDate)
End Function
When doing that, all tests passed. However, it is an unusual way to do things. I thought there should be a more 'normal' way.
I then went back and replaced the straightforward <= operator with a call to DateDiff instead (comment out line 10, uncomment line 11). Whether I used seconds or minutes, the tests passed. So, I think the takeaway lesson here might be to always use DateDiff when comparing VBDateTimes. As someone who's used VBS for a while and never encountered issues with native comparisons before, this is a revelation and I may need to offer this advice to my colleagues too.

Can a logic test for an `if` statement be a variable in excel vba?

I have the following function:
Function get_equal_array_subset(column_label As String, _
loop_array() As Variant, _
values_array() As Variant)
' this function outputs an array of value from the values_array, based on a loop through the loop_array
' column_label is the first item in the array of the ouput array; i.e. the column lable of a new range
' loop_array is array being looped through and testing each value
' valus_array is the array from which values are taken with the test is met in the first array
' *** arrays have to be of equal lenght ***
Dim subset_array() As Variant
subset_array = Array(column_label)
Dim rows_dim As Long
Dim cols_dim As Integer
Dim agent_subset_counter As Long
agent_subset_counter = 0 ' counter to set the key for the new array
For rows_dim = 2 To UBound(loop_array, 1)
For cols_dim = 1 To UBound(loop_array, 2)
If loop_array(rows_dim, cols_dim) > 2 Then
agent_subset_counter = agent_subset_counter + 1 ' increase the subset counter by 1
ReDim Preserve subset_array(agent_subset_counter) ' resize the array account for the next id
subset_array(agent_subset_counter) = values_array(rows_dim, cols_dim) ' add the new id to the agent subset
End If
Next cols_dim
Next rows_dim
get_equal_array_subset = subset_array
End Function
Is there a way for me to make the If loop_array(rows_dim, cols_dim) > 2 Then a variable? Let's say I wanted the test to be > 3 or = 5 or non blank...etc.
I would go for the magic Application.Evaluate() method of the Application class. An example might be to define a series of tests into an array, let's say:
Dim myTests(4)
myTests(1) = "> 3"
myTests(2) = "= 5"
myTests(3) = "+3 < 5"
myTests(4) = "- 4 + sum(1,2) < 5"
Hence, using the simple statement:
If Application.Evaluate(loop_array(rows_dim, cols_dim) & myTests(j)) Then
Clearly, the variable j should be defined depending on the test you want to use and this kind of method would allow you to define several arrays of operators (one array for operators like +, - etc., another one for values like 3, 5 etc.)
NOTE If you don't know it yet, the Application.Evaluate() method will evaluate the expression and returning the result as Excel would do. It's basically using the same code that Excel uses to evaluate what you write in a cell:
Application.Evaluate("2+3") --> 5
Application.Evaluate("2 < 3") --> True
Application.Evaluate("IF(2=3,1,2)") --> 2
'etc.
If you wanted to make the "magic number" 2 into a variable, then you would use an array item in place of the 2.If, however, you wanted separate logic, then you use use a Select Case structure.

Resources