Does Excel's Worksheet Comparison Operator Give Inconsistent Results?

Consider the following. Place =400000000000000/3 in a cell, say "A1". Excel displays 133333333333333.0000. The precision is zero digits to the right of the decimal point because, apparently, Excel's floating point precision is no more than 15 digits. Now place the following formula in a cell:
=A1=ROUND(A1,0)
The formula produces True since there are no digits to the right of the decimal point ... or are there? Open the VBA editor, right click on your Workbook in the Projects pane and insert a VBA module. Define the following UDF:
Function Equals(dblOne As Double, dblTwo As Double) As Boolean
Equals = dblOne = dblTwo
End Function
Now go back to your Worksheet and place the following formula in a cell:
=Equals(A1,ROUND(A1,0))
The result is now False. Why?

Excel's 15 apparent decimal digits of precision are the object of an entire section of William Kahan's essay “How Futile are Mindless Assessments of Roundoff in Floating-Point Computation ?”:
What’s so special about 15 sig. dec.? Displaying at most 15 sig. dec.,
as Excel does, ensures that a number entered with at most 15 sig.
dec., converted to binary floating-point rounded correctly to 53 sig.
bits (which is what Excel’s arithmetic carries), and then displayed
converted back to decimal floating-point rounded correctly to at least
as many sig. dec. as were entered but no more than 15, will always
display exactly the same number as was entered. The decision to make
Excel’s arithmetic seem to be Decimal instead of Binary restricted
Excel’s display to at most 15 sig. dec., thus hiding the deception
well enough to reduce greatly the number of calls upon Excel’s
technical help-desk. When symptoms of the deception are perceived they
are routinely misdiagnosed; e.g., see David Einstein’s column on p. E2
of the San Francisco Chronicle for 16 and 30 May 2005.
The section concludes with:
This is no place to list all the corrections Excel needs. It was cited
here only to exemplify Errors Designed Not To Be Found.

If you run the code below, the MsgBox shows the difference between the two values. The difference is not zero, hence the function returns False.
Function Equals(dblOne As Double, dblTwo As Double) As Boolean
MsgBox dblOne - dblTwo
Equals = dblOne = dblTwo
End Function
The code below returns True:
Function Equals(dblOne As Double, dblTwo As Double) As Boolean
Equals = Round(dblOne, 0) = dblTwo
End Function
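The same discrepancy can be reproduced outside Excel. The Python sketch below assumes that the worksheet "=" operator behaves like a comparison at 15 significant digits, while the VBA UDF compares the raw doubles. `excel_like_equal` is a hypothetical helper written for illustration, not an Excel API.

```python
def excel_like_equal(a: float, b: float) -> bool:
    # Hypothetical stand-in for the worksheet "=" operator (an assumption):
    # compare both values after formatting to 15 significant digits.
    return f"{a:.15g}" == f"{b:.15g}"

a1 = 400000000000000 / 3        # keeps a fractional part as a double
rounded = float(round(a1))      # plays the role of ROUND(A1,0)

raw_equal = (a1 == rounded)                     # exact comparison, like the VBA UDF
display_equal = excel_like_equal(a1, rounded)   # 15-digit comparison
```

Under that assumption, `raw_equal` is False and `display_equal` is True, which matches the two results seen in the question.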

Related

Why do VBA and Excel disagree on whether two cells are equal? [duplicate]

I am trying to compare two cells in a table:
The column "MR" is calculated using the formula =ABS([#Value]-A1) to determine the moving range of the column "Value". The values in the "Value" column are not rounded. The highlighted cells in the "MR" column (B3 and B4) are equal. I can enter the formula =B3=B4 into a cell and Excel says that B3 is equal to B4.
But when I compare them in VBA, VBA says that B4 is greater than B3. I can select cell B3 and enter the following into the Immediate Window ? selection.value = selection.offset(1).value. That statement evaluates to false.
I tried removing the absolute value from the formula thinking that might have had something to do with it, but VBA still says they aren't equal.
I tried adding another row where Value=1.78 so MR=0.18. Interestingly, the MR in the new row (B5) is equal to B3, but is not equal to B4.
I then tried increasing the decimal of A4 to match the other values, and now VBA says they are equal. But when I added the absolute value back into the formula, VBA again says they are not equal. I removed the absolute value again and now VBA is saying they are not equal.
Why is VBA telling me the cells are not equal when Excel says they are? How can I reliably handle this situation through VBA going forward?

The problem is that the IEEE 754 Standard for Floating-Point Arithmetic is imprecise by design. Virtually every programming language suffers because of this.
IEEE 754 is an extremely complex topic and when you study it for months and you believe you understand fully, you are simply fooling yourself!
Accurate floating point value comparisons are difficult and error prone. Think long and hard before attempting to compare floating point numbers!
The Excel program gets around the issue by cheating on the application side. VBA on the other hand follows the IEEE 754 spec for Double Precision (binary64) faithfully.
A Double value is represented in memory using 64 bits. These 64 bits are split into three distinct fields that are used in binary scientific notation:
The SIGN bit (1 bit to represent the sign of the value: pos/neg)
The EXPONENT (11 bits, biased in value by +1023)
The MANTISSA (52 bits stored, plus 1 implied bit, for 53 bits of precision)
The mantissa in this system leverages the fact that every nonzero binary number in normalized scientific notation begins with a digit of 1, so that 1 is not stored in the bit-pattern. It is implied, increasing the mantissa precision to 53 bits for normal values.
The math works like this: Stored Value = SIGN VALUE * 2^UNBIASED EXPONENT * MANTISSA
Note that a stored value of 1 for the sign bit denotes a negative SIGN VALUE (-1) while a 0 denotes a positive SIGN VALUE (+1). The formula is SIGN VALUE = (-1) ^ (sign bit).
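As an illustration of the three fields and the formula above, here is a small Python sketch. The helper names `decompose` and `rebuild` are made up for this example, and `rebuild` only handles normal values (not zero, subnormals, infinities or NaN).

```python
import struct

def decompose(x: float):
    """Split a double into (sign bit, biased exponent, 52 stored mantissa bits)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return sign, exponent, fraction

def rebuild(sign: int, exponent: int, fraction: int) -> float:
    # Normal values only: the leading 1 of the mantissa is implied.
    mantissa = 1 + fraction / 2**52
    return (-1) ** sign * 2.0 ** (exponent - 1023) * mantissa
```

For example, `decompose(1.24)` yields a sign bit of 0 and a biased exponent of 1023, and `rebuild(*decompose(1.24))` gives back the same double.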
The problem always boils down to the same thing.
The vast majority of real numbers cannot be expressed precisely
within this system which introduces small rounding errors that propagate
like weeds.
It may help to think of this system as a grid of regularly spaced points. The system can represent ONLY the point-values and NONE of the real numbers between the points. All values assigned to a float will be rounded to one of the point-values (usually the closest point, but there are modes that enforce rounding upwards to the next highest point, or rounding downwards). Conducting any calculation on a floating-point value virtually guarantees the resulting value will require rounding.
To accent the obvious, there are an infinite number of real numbers between adjacent representable point-values on this grid; and all of them are rounded to the discrete grid-points.
To make matters worse, the gap size doubles at every Power-of-Two as the grid expands away from true zero (in both directions). For example, the gap length between grid points for values in the range of 2 to 4 is twice as large as it is for values in the range of 1 to 2. When representing values with large enough magnitudes, the grid gap length becomes massive, but closer to true zero, it is minuscule.
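The doubling of the gap can be checked directly. This is a minimal Python sketch that relies on `math.ulp` (available from Python 3.9), which returns the spacing of the grid at a given value; the variable names are only illustrative.

```python
import math

gap_1_to_2 = math.ulp(1.0)      # grid spacing for values in [1, 2)
gap_2_to_4 = math.ulp(2.0)      # grid spacing for values in [2, 4)
gap_near_1e18 = math.ulp(1e18)  # grid spacing around 10^18

doubles_at_power_of_two = (gap_2_to_4 == 2 * gap_1_to_2)
```

Around 10^18 the spacing is already far larger than 1, so neighbouring integers at that magnitude collapse onto the same grid point.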
With your example numbers...
1.24 is represented with the following binary:
Sign bit = 0
Exponent = 01111111111
Mantissa = 0011110101110000101000111101011100001010001111010111
The Hex pattern over the full 64 bits is precisely: 3FF3D70A3D70A3D7.
The precision is derived exclusively from the 53-bit mantissa and the exact decimal value from the binary is:
0.2399999999999999911182158029987476766109466552734375
In this instance a leading integer of 1 is implied by the hidden bit associated with the mantissa and so the complete decimal value is:
1.2399999999999999911182158029987476766109466552734375
Now notice that this is not precisely 1.24 and that is the entire problem.
Let's examine 1.42:
Sign bit = 0
Exponent = 01111111111
Mantissa = 0110101110000101000111101011100001010001111010111000
The Hex pattern over the full 64 bits is precisely: 3FF6B851EB851EB8.
With the implied 1 the complete decimal value is stored as:
1.4199999999999999289457264239899814128875732421875000
And again, not precisely 1.42.
Now, let's examine 1.6:
Sign bit = 0
Exponent = 01111111111
Mantissa = 1001100110011001100110011001100110011001100110011010
The Hex pattern over the full 64 bits is precisely: 3FF999999999999A.
Notice the repeating binary fraction in this case, which is cut off and rounded when the mantissa bits run out. 1.6 can never be written exactly with a finite number of binary digits, in the same way that 1/3 can never be written exactly with a finite number of decimal digits (0.33333333333333333333333 ≠ 1/3).
With the implied 1 the complete decimal value is stored as:
1.6000000000000000888178419700125232338905334472656250
And once more, not exactly 1.6. (Of the three values, 1.24 is actually the one stored closest to its intended decimal.)
Now let's subtract the full stored double precision representations:
1.60 - 1.42 = 0.18000000000000015987
1.42 - 1.24 = 0.17999999999999993782
So as you can see, they are not equal at all.
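The stored values and the two differences can be inspected with a few lines of Python, which uses the same binary64 format. This is only a sketch for checking the numbers above; `Decimal(float)` converts the stored binary value exactly, without rounding.

```python
from decimal import Decimal

# Exact decimal expansion of each stored double.
stored = {text: Decimal(float(text)) for text in ("1.24", "1.42", "1.6")}

diff_high = 1.6 - 1.42    # lands slightly above 0.18
diff_low = 1.42 - 1.24    # lands slightly below 0.18

for text, exact in stored.items():
    print(text, "is stored as", exact)
print(Decimal(diff_high))
print(Decimal(diff_low))
```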
The usual way to work around this is threshold testing, basically an inspection to see if two values are close enough... and that depends on you and your requirements. Be forewarned, effective threshold testing is way harder than it appears at first glance.
Here is a function to help you get started comparing two Double Precision numbers. It handles many situations well but not all because no function can.
Function Roughly(a#, b#, Optional within# = 0.00001) As Boolean
    Dim d#, x#, y#, z#
    Const TINY# = 1.17549435E-38 'SINGLE_MIN
    If a = b Then Roughly = True: Exit Function
    x = Abs(a): y = Abs(b): d = Abs(a - b)
    If a <> 0# Then
        If b <> 0# Then
            z = x + y
            If z > TINY Then
                Roughly = d / z < within
                Exit Function
            End If
        End If
    End If
    Roughly = d < within * TINY
End Function
The idea here is to have the function return True if the two Doubles are Roughly the same Within a certain margin:
MsgBox Roughly(3.14159, 3.141591) '<---displays True
The Within margin defaults to 0.00001, but you can pass whatever margin you need.
And while we know that:
MsgBox 1.60 - 1.42 = 1.42 - 1.24 '<---displays False
Consider the utility of this:
MsgBox Roughly(1.60 - 1.42, 1.42 - 1.24) '<---displays True
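For readers who want to try the idea outside VBA, here is a rough Python counterpart. It is not a port of Roughly: it leans on the standard library's `math.isclose` with a relative tolerance, and the function name `roughly` and the default margin are chosen only to mirror the example above.

```python
import math

def roughly(a: float, b: float, within: float = 1e-5) -> bool:
    # Relative test: the allowed difference scales with the larger magnitude.
    # With no absolute tolerance, a nonzero value never compares close to 0.
    return math.isclose(a, b, rel_tol=within)

exact_equal = (1.60 - 1.42 == 1.42 - 1.24)
close_enough = roughly(1.60 - 1.42, 1.42 - 1.24)
```

As with the VBA version, the right margin depends on the data, and values near zero need separate handling.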
@chris neilsen linked to an interesting Microsoft page about Excel and IEEE 754.
And please read David Goldberg's seminal What Every Computer Scientist Should Know About Floating-Point Arithmetic. It changed the way I understood floating point numbers.

How to get equal results when doing arithmetic operations vba/excel [Double variable precision]

I am trying to get equal result of two exact calculations which are computed in a cell formula and the other one with a UDF:
Function calc()
Dim num as Double
num = 30000000 * ((1 + 8 / 100 / 365) ^ 125)
calc = num
End Function
Result of the calculation is different
A1 = 30000000 * ((1 + 8 / 100 / 365) ^ 125) not equal to A2 = calc()
We can test it with =if(A1=A2, TRUE, FALSE), which is false. I understand that it has something to do with data types in VBA and with how the cell formula is executed. Do you know how to make calculations from VBA functions and from Excel cell formulas render the same result?

So, the calculation in application excel and the calculation in vba are presenting different outputs (what you've presented, with format displaying 20 decimal places):
As such, you would see false when comparing them. You will need to round() or format() to truncate the calculation at a level that is appropriate. E.g.:
calc = round(num,4)
calc = format(num,"0.###0")
The reason this is occurring is because of the inherent math you're using, specifically, ((1 + 8 / 100 / 365) ^ 125), and how that is being truncated/rounded in the allocated memory to each part of the calculation, which differs in VBA and in-application Excel.
Edit: Final image with the VBA changes I'd suggested:

Explanation
The Double data type cannot represent every value exactly beyond a certain number of digits. The documentation says as much:
Precision. When you work with floating-point numbers, remember that they do not always have a precise representation in memory. This could lead to unexpected results from certain operations, such as value comparison and the Mod operator.
Troubleshooting
That seems to be the case here. I put the result of the division into one cell as a value and the division itself into another cell as a formula. Although the Excel interface shows no difference between them, when the value is computed again the formula on the sheet seems to be more precise.
Actual result
Further thoughts
This seems to be a limit of the data type itself. If precision is not an issue, you may try rounding. If it is critical to be as precise as possible, I would suggest connecting through an API to something that can handle more precision. In this scenario, I would use xlwings to call Python.
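If the calculation is moved to Python as suggested, the standard `decimal` module is one way to carry more digits than a Double. The sketch below is only an illustration: the 50-digit working precision is an arbitrary assumption, and the function names are invented for the example.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # assumed working precision, chosen arbitrarily

def calc_decimal() -> Decimal:
    # Same expression as the UDF, evaluated in decimal arithmetic.
    daily_rate = Decimal(8) / Decimal(100) / Decimal(365)
    return Decimal(30000000) * (1 + daily_rate) ** 125

def calc_double() -> float:
    # Same expression evaluated with ordinary binary doubles.
    return 30000000 * ((1 + 8 / 100 / 365) ** 125)
```

Comparing `calc_decimal()` with `Decimal(calc_double())` shows how far the double result sits from the higher-precision one, which helps when choosing how many decimal places to round to.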

Apache POI not returning the proper value for large numbers coming from Excel

I have an excel file with the value 6228480018362050000 the exported csv looks like this...
Int,Bigint,String
1,6228480018362050000,Very big
When I try running the following code...
InputStream inp = new FileInputStream("/.../test.xlsx");
DataFormatter df = new DataFormatter(true);
df.formatCellValue(WorkbookFactory.create(inp).getSheetAt(0).getRow(1).getCell(1));
I get 6228480018362049500 which is the wrong number because precision is hosed. Is there a way to get the actual value?

If we put long numbers into Excel cells, those numbers are truncated to 15 significant digits. This is because Excel has no notion of big integers. It has only floating point to store numeric values, and for those it follows the IEEE 754 specification. But some numbers cannot be stored exactly as floating point numbers under IEEE 754. In your example, 6228480018362050000, which is 6.22848001836205E+018, cannot be stored as such. It will be 6.2284800183620495E+018, or 6228480018362049500, under the IEEE 754 specification.
Microsoft's knowledge base mentions: "Excel follows the IEEE 754 specification on how to store and calculate floating-point numbers. Excel therefore stores only 15 significant digits in a number, and changes digits after the fifteenth place to zeroes."
This is not the whole truth. In reality, at least with Office OpenXML (*.xlsx), it stores the values according to the IEEE 754 specification and not only 15 significant digits. With your example it stores <v>6.2284800183620495E+18</v>. But that's secondary. Even if it stored 6.22848001836205E+018, somewhere this must be converted back to floating point, and then it will be 6.2284800183620495E+18 again. Excel does the same while opening the workbook. It converts <v>6.2284800183620495E+18</v> to floating point and then displays only 15 significant digits.
So if you really need to store 6228480018362050000 as a number in Excel, then the only way to get the same results as in Excel is to do the same as Excel. To do so we can use BigDecimal and its round method, which can take a MathContext with a set precision.
Example:
import org.apache.poi.ss.usermodel.*;
import java.io.*;
import java.math.BigDecimal;
import java.math.MathContext;

class ReadExcelBigNumbers {
    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 10; i++) {
            String v = "6.2284800183620" + i + "E+018";
            double d = Double.parseDouble(v);
            System.out.print(v + "\t");
            System.out.print(d + "\t");
            BigDecimal bd = new BigDecimal(d);
            v = bd.round(new MathContext(15)).toPlainString();
            System.out.println(v);
        }
        InputStream inp = new FileInputStream("test.xlsx");
        Workbook wb = WorkbookFactory.create(inp);
        for (int i = 1; i < 9; i++) {
            double d = wb.getSheetAt(0).getRow(i).getCell(1).getNumericCellValue();
            BigDecimal bd = new BigDecimal(d);
            String v = bd.round(new MathContext(15)).toPlainString();
            System.out.println(v);
        }
    }
}
The first part prints:
6.22848001836200E+018 6.2284800183620004E18 6228480018362000000
6.22848001836201E+018 6.2284800183620096E18 6228480018362010000
6.22848001836202E+018 6.2284800183620198E18 6228480018362020000
6.22848001836203E+018 6.2284800183620301E18 6228480018362030000
6.22848001836204E+018 6.2284800183620403E18 6228480018362040000
6.22848001836205E+018 6.2284800183620495E18 6228480018362050000
6.22848001836206E+018 6.2284800183620598E18 6228480018362060000
6.22848001836207E+018 6.22848001836207E18 6228480018362070000
6.22848001836208E+018 6.2284800183620803E18 6228480018362080000
6.22848001836209E+018 6.2284800183620905E18 6228480018362090000
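There you can see the difference between the wanted floating point value, the real floating point value according to the IEEE 754 specification, and the reformatted BigDecimal. As you can see, only 6.22848001836207E+018 comes back from the double with the same 15 digits; the others need 17 digits to identify the stored value. (Even that one is not stored exactly; the shortest decimal that identifies its stored double just happens to have 15 digits.)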
The second part does the same using the following Excel sheet:
Another possible workaround is mentioned in the knowledge base article : "To work around this behavior, format the cell as text, then type the numbers. The cell can then display up to 1,024 characters. ". This is good if the numbers are not really numbers but Identifiers for example or some other strings where the digits are only meant as characters. Calculations with such "Text-Numbers" are of course not possible without reconverting them to floating point which will bring the problem again.

There is no change (loss or gain) of precision between 6228480018362050000 and 6228480018362049500. They are simply two different decimal presentations of the same internal binary value, which in decimal is exactly 6228480018362049536, by the way.
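This claim is easy to check in any language that uses IEEE 754 doubles. A minimal Python sketch (variable names are illustrative only):

```python
a = float("6228480018362050000")
b = float("6228480018362049500")

same_double = (a == b)   # both decimal strings land on the same binary value
exact_value = int(a)     # converting a float to int is exact

print(same_double, exact_value)
```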
Regardless of the cell format, Excel displays (not "stores") only up to the first 15 significant digits, rounding any digits to the right [1].
However, other applications and file formats show up to the first 17 significant digits (or more), which is really what the IEEE 754 standard requires in order to represent every binary value [2]. Apparently, that is true of Apache POI and OpenXML.
You can demonstrate this by doing the following.
In Excel, enter 6228480018362050000. Save as XML.
Open the XML file in Notepad. Note that the Cell/Data element shows 6.2284800183620495E+18, which is 6228480018362049500.
Open the XML file in Excel. Note that Excel still displays 6228480018362050000 in the Formula Bar and in the cell formatted as Number.
It is true that Excel truncates manually-entered numbers (including those read from CSV and TXT files) to the first 15 significant digits, replacing any digits to the right with zeros. But Excel VBA does not.
So for another demonstration, enter the following in VBA, then execute the procedure.
Sub doit()
Range("a1:a2").NumberFormat = "0"
Range("a1") = CDbl("6228480018362050000")
Range("a2") = CDbl("6228480018362049536")
Columns("a").AutoFit
Range("b2") = "=match(a1,a2,0)"
End Sub
Note that A1 and A2 display 6228480018362050000. B2 displays 1, indicating that the internal binary values are an exact match, and VBA does not truncate after the first 15 significant digits.
Explanation....
Excel and most applications use IEEE 754 double-precision to represent numeric values. The binary representation is the sum of 53 consecutive powers of 2 ("bits") times an exponential factor.
Consequently, every integer up to 9007199254740992 (2^53) can be represented exactly, but not every integer beyond it. (But note that Excel displays 9007199254740990 for =2^53 because of its 15-significant-digit formatting limitation.)
Most larger integers can only be approximated.
And that is true of most decimal fractions as well, regardless of the number of significant digits. That is part of the reason why =10.1-10 displays 0.0999999999999996 in the Formula Bar and in the cell formatted with 16 decimal places (15 significant digits).
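Both points can be reproduced with plain doubles. In this Python sketch, formatting with `.15g` is used as an assumed stand-in for Excel's 15-significant-digit display; it is not Excel's actual formatting code.

```python
limit = 2.0 ** 53
collides = (limit + 1 == limit)     # 2^53 + 1 has no double of its own

diff = 10.1 - 10
shown_15_digits = f"{diff:.15g}"    # assumed equivalent of Excel's display
shown_17_digits = f"{diff:.17g}"    # enough digits to identify the double

print(collides, shown_15_digits, shown_17_digits)
```

Reading the 15-digit string back does not recover the original double, while the 17-digit string does.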
But beware: a calculated value that displays as 6228480018362050000 might differ from the actual internal binary value.
For example, if you enter 6228480018362050000 into A1 and the formula =6228480018362050000+1600 into A2, both A1 and A2 display 6228480018362050000.
But =MATCH(A1,A2,0) returns #N/A, which indicates that the internal binary values are not an exact match.
And the XML file would show 6.2284800183620516E+18 in the Data element corresponding to the Cell element for A2, which is 6228480018362051600. The actual internal binary value, in decimal, is exactly 6228480018362051584.
(FYI, the Excel equal operator ("=") does not compare the internal binary values. Instead, it compares the values rounded to 15 significant digits. So =(A1=A2) returns TRUE misleadingly. It is intended to be a feature; but it is implemented inconsistently.)
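The difference between the two kinds of comparison can be sketched in Python. `same_at_15_digits` is a hypothetical helper that imitates the "=" behaviour described above by formatting to 15 significant digits; the exact `==` comparison plays the role of MATCH.

```python
def same_at_15_digits(a: float, b: float) -> bool:
    # Hypothetical imitation of the worksheet "=" operator (an assumption).
    return f"{a:.15g}" == f"{b:.15g}"

a1 = 6228480018362050000.0
a2 = 6228480018362050000.0 + 1600

binary_match = (a1 == a2)                    # like MATCH(A1,A2,0)
fifteen_digit_match = same_at_15_digits(a1, a2)   # like =(A1=A2)
exact_a2 = int(a2)                           # exact stored value of A2
```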
If you copy A2 and paste-value into A3, =MATCH(A1,A3,0) continues to return #N/A. But if you subsequently "edit" A3 (e.g. press f2, then Enter), =MATCH(A1,A3,0) returns 1. The internal value of A3 has been changed to the binary representation of 6228480018362050000.
I wonder if that is actually the mysterious problem that you encountered, and you inadvertently oversimplified it with your example.
Does that help?
[1] Cell format does not affect the internal binary value with two exceptions: (1) when Precision As Displayed is set, which is almost never recommended; and (2) when the cell value is calculated, and the worksheet is saved as a CSV or TXT file, then reopened or imported in Excel.
[2] Although IEEE 754 specifies that 17 significant decimal digits are the minimum needed to represent all binary values, that does not mean that only 17 significant decimal digits are "stored". As demonstrated above, 6228480018362049500 is actually stored as exactly 6228480018362049536.

Rounding error when using INT function

I have user input in two cells, named "UpperRangeHigh" and "UpperRangeLow". I have the following code:
dRangeUpper = [UpperRangeHigh] - [UpperRangeLow]
lLines = Int(dRangeUpper * 100 / lInterval)
The user inputs 120.3 and 120 into the input cells respectively. lInterval has the value 10. VBA produces the result of 2 for lLines, instead of 3.
I can overcome this problem by adding 0.000000001 to dRangeUpper, but I'm wondering if there is a known reason for this behaviour?

This appears to be a problem with Excel's calculation and significant digits. If you do:
=120.3 - 120 and format the cell to display 15 decimal places, the result appears as:
0.2999999999999970
Here is a brief overview which explains how Excel uses binary arithmetic and that this may result in results divergent from what you would expect:
http://excel.tips.net/T008143_Avoiding_Rounding_Errors_in_Formula_Results.html
You can overcome this by forcing a rounded precision, e.g., to 10 decimal places:
lLines = Int(Round(dRangeUpper, 10) * 100 / lInterval)
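The effect of rounding before truncating can be seen with the same numbers in Python, which uses the same double format. This is only a sketch: `math.floor` stands in for VBA's Int, and Python's `round` is assumed to be close enough to VBA's Round for this input.

```python
import math

d_range_upper = 120.3 - 120     # stored slightly below 0.3
interval = 10

# Truncating the raw result drops to 2 because the quotient is just under 3.
lines_truncated = math.floor(d_range_upper * 100 / interval)

# Rounding to 10 decimal places first removes the tiny shortfall.
lines_rounded = math.floor(round(d_range_upper, 10) * 100 / interval)

print(lines_truncated, lines_rounded)
```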

Kindly use single or double when working with decimals to get more accurate results.
Sub sample()
Dim dRangeUpper As Double
dRangeUpper = CDbl("120.3") - CDbl("120")
lLines = Round(CDbl(dRangeUpper * 100 / 10), 4)
End Sub
output = 3

This is a known Floating point issue within Excel
http://support.microsoft.com/kb/78113
From MSDN:
To minimize any effects of floating point arithmetic storage
inaccuracy, use the Round() function to round numbers to the number of
decimal places that is required by your calculation. For example, if
you are working with currency, you would likely round to 2 decimal
places:
=ROUND(1*(0.5-0.4-0.1),2)
In your case, using round() instead of INT should do the trick using 0 rather than 2

VBA rounding problem

I have this obscure rounding problem in VBA.
a = 61048.4599674847
b = 154553063.208822
c = a + b
debug.print c
Result:
154614111.66879
Here is the question: why did VBA round off variable c? I didn't call any rounding function. The value I was expecting was 154614111.6687894847. Even if I round or format variable c to 15 decimal places, I still don't get my expected result.
Any explanation would be appreciated.
Edit:
Got the expected results using cDec. I have read this in Jonathan Allen's reply in Why does CLng produce different results?
Here is the result to the test:
a = cDec(61048.4599674847)
b = cDec(154553063.208822)
c = a + b
?c
154614111.6687894847

The reason is the limited precision that can be stored in a floating point variable.
For a complete explanation you should read the paper What Every Computer Scientist Should Know About Floating-Point Arithmetic, by David Goldberg, published in the March, 1991 issue of Computing Surveys.
Link to paper
In VBA the default floating point type is Double, which is an IEEE 64-bit (8-byte) floating-point number.
There is another type available: Decimal, which is a 96-bit (12-byte) signed integer scaled by a variable power of 10.
Put simply, this provides floating point numbers to 28 digit precision.
To use in your example:
a = CDec(61048.4599674847)
b = CDec(154553063.208822)
c = a + b
debug.print c
Result:
154614111.6687894847
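The same contrast between a binary double and a decimal type shows up in Python, where the standard `decimal` module plays a role similar to VBA's Decimal. This sketch passes the values as strings so that they never pass through a double on the way in.

```python
from decimal import Decimal

# Binary doubles: the sum has to fit in 53 bits, so trailing digits are lost.
c_double = 61048.4599674847 + 154553063.208822

# Decimal arithmetic: the digits are kept as written.
c_decimal = Decimal("61048.4599674847") + Decimal("154553063.208822")

print(c_double)
print(c_decimal)
```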

It's not obscure, but it's not necessarily obvious.
I think you've sort of answered it, but the basic problem is one of the "size" of the values, that is, how much data can be stored in a variable of a given type.
If (and this is very crude) you count the number of digits in each of the numbers in your first example, you will see that you have 15. So whilst the range of values that a float (the default type) can represent is huge, the precision is limited to 15 digits (I'm sure someone will be along to correct this, I'll tick the wiki box...)
So when you add the two numbers together, it loses the least significant digits in order to remain within the allowable precision for a float.
By doing a CDec you're converting to a different type of variable (Decimal) that is capable of greater precision.
