I have this obscure rounding problem in VBA.
a = 61048.4599674847
b = 154553063.208822
c = a + b
debug.print c
Result:
154614111.66879
Here is the question: why did VBA round off variable c? I didn't call any rounding function. The value I was expecting was 154614111.6687894847. Even if I round or format variable c to 15 decimal places, I still don't get my expected result.
Any explanation would be appreciated.
Edit:
Got the expected results using cDec. I have read this in Jonathan Allen's reply in Why does CLng produce different results?
Here is the result to the test:
a = cDec(61048.4599674847)
b = cDec(154553063.208822)
c = a + b
?c
154614111.6687894847
The reason is the limited precision that can be stored in a floating point variable.
For a complete explanation you should read the paper What Every Computer Scientist Should Know About Floating-Point Arithmetic, by David Goldberg, published in the March, 1991 issue of Computing Surveys.
Link to paper
In VBA the default floating point type is Double, which is an IEEE 64-bit (8-byte) floating-point number.
There is another type available: Decimal, which is a 96-bit (12-byte) signed integer scaled by a variable power of 10.
Put simply, this provides floating point numbers to 28 digit precision.
To use in your example:
a = CDec(61048.4599674847)
b = CDec(154553063.208822)
c = a + b
debug.print c
Result:
154614111.6687894847
It's not obscure, but it's not necessarily obvious.
I think you've sort of answered it - but the basic problem is one of the "size" of the values that is how much data can be stored in a variable of a given type.
If (and this is very crude) you count the number of digits in each of the numbers in your first example, you will see that you have 15. So whilst the range of values that a Double (the default floating point type) can represent is huge, the precision is limited to about 15 digits (I'm sure someone will be along to correct this, I'll tick the wiki box...)
So when you add the two numbers together, it loses the least significant digits in order to remain within the allowable precision for a Double.
By doing a CDec you're converting to a different type of variable (Decimal) that is capable of greater precision.
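For comparison outside VBA, here is a minimal Python sketch of the same effect. It assumes Python's float (an IEEE 754 double, like VBA's Double) and uses the standard decimal module as a stand-in for CDec; the variable names are only illustrative.

```python
from decimal import Decimal

# Binary doubles: both inputs and the sum are rounded to the nearest double.
a = 61048.4599674847
b = 154553063.208822
float_sum = a + b

# Decimal built from strings keeps every decimal digit that was typed.
exact_sum = Decimal("61048.4599674847") + Decimal("154553063.208822")

print(exact_sum)                        # 154614111.6687894847
print(Decimal(float_sum) == exact_sum)  # False: the double sum is only close
```

Building the Decimal values from strings matters here: building them from the float literals would carry the binary rounding error into the decimal result.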
I am trying to compare two cells in a table:
The column "MR" is calculated using the formula =ABS([#Value]-A1) to determine the moving range of the column "Value". The values in the "Value" column are not rounded. The highlighted cells in the "MR" column (B3 and B4) are equal. I can enter the formula =B3=B4 into a cell and Excel says that B3 is equal to B4.
But when I compare them in VBA, VBA says that B4 is greater than B3. I can select cell B3 and enter the following into the Immediate Window ? selection.value = selection.offset(1).value. That statement evaluates to false.
I tried removing the absolute value from the formula thinking that might have had something to do with it, but VBA still says they aren't equal.
I tried adding another row where Value=1.78 so MR=0.18. Interestingly, the MR in the new row (B5) is equal to B3, but is not equal to B4.
I then tried increasing the decimal of A4 to match the other values, and now VBA says they are equal. But when I added the absolute value back into the formula, VBA again says they are not equal. I removed the absolute value again and now VBA is saying they are not equal.
Why is VBA telling me the cells are not equal when Excel says they are? How can I reliably handle this situation through VBA going forward?
The problem is that the IEEE 754 Standard for Floating-Point Arithmetic is imprecise by design. Virtually every programming language suffers because of this.
IEEE 754 is an extremely complex topic and when you study it for months and you believe you understand fully, you are simply fooling yourself!
Accurate floating point value comparisons are difficult and error prone. Think long and hard before attempting to compare floating point numbers!
The Excel program gets around the issue by cheating on the application side. VBA on the other hand follows the IEEE 754 spec for Double Precision (binary64) faithfully.
A Double value is represented in memory using 64 bits. These 64 bits are split into three distinct fields that are used in binary scientific notation:
The SIGN bit (1 bit to represent the sign of the value: pos/neg)
The EXPONENT (11 bits, biased in value by +1023)
The MANTISSA (52 bits stored + 1 bit implied, giving 53 bits of precision)
The mantissa in this system leverages the fact that every normalized, nonzero binary number begins with a digit of 1, and so that 1 is not stored in the bit-pattern. It is implied, increasing the mantissa precision to 53 bits for normal values.
The math works like this: Stored Value = SIGN VALUE * 2^UNBIASED EXPONENT * MANTISSA
Note that a stored value of 1 for the sign bit denotes a negative SIGN VALUE (-1) while a 0 denotes a positive SIGN VALUE (+1). The formula is SIGN VALUE = (-1) ^ (sign bit).
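The three fields and the formula above can be checked with a short Python sketch. It assumes Python's float is a binary64 value; the helper names decode_double and rebuild are made up for this illustration, and rebuild only handles normal values (no subnormals, infinities or NaN).

```python
import struct

def decode_double(x):
    """Split a float into (sign bit, unbiased exponent, 52 stored mantissa bits)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent = ((bits >> 52) & 0x7FF) - 1023   # remove the +1023 bias
    fraction = bits & ((1 << 52) - 1)          # stored bits, implied 1 not included
    return sign, exponent, fraction

def rebuild(sign, exponent, fraction):
    """Apply SIGN VALUE * 2^exponent * mantissa for a normal value."""
    mantissa = 1 + fraction / 2**52            # put the implied leading 1 back
    return (-1) ** sign * 2.0 ** exponent * mantissa

sign, exponent, fraction = decode_double(1.24)
print(sign, exponent, format(fraction, "052b"))
print(rebuild(sign, exponent, fraction) == 1.24)   # True
```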
The problem always boils down to the same thing.
The vast majority of real numbers cannot be expressed precisely
within this system which introduces small rounding errors that propagate
like weeds.
It may help to think of this system as a grid of regularly spaced points. The system can represent ONLY the point-values and NONE of the real numbers between the points. All values assigned to a float will be rounded to one of the point-values (usually the closest point, but there are modes that enforce rounding upwards to the next highest point, or rounding downwards). Conducting any calculation on a floating-point value virtually guarantees the resulting value will require rounding.
To accent the obvious, there are an infinite number of real numbers between adjacent representable point-values on this grid; and all of them are rounded to the discrete grid-points.
To make matters worse, the gap size doubles at every Power-of-Two as the grid expands away from true zero (in both directions). For example, the gap length between grid points for values in the range of 2 to 4 is twice as large as it is for values in the range of 1 to 2. When representing values with large enough magnitudes, the grid gap length becomes massive, but closer to true zero, it is minuscule.
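The doubling of the gap can be observed directly. This small Python sketch assumes Python 3.9 or later, where math.ulp(x) returns the distance from x to the next larger double.

```python
import math

# Gap between neighbouring doubles at a few magnitudes.
for x in (1.0, 2.0, 4.0, 1e18):
    print(x, math.ulp(x))

print(math.ulp(2.0) == 2 * math.ulp(1.0))   # True: the gap doubles at 2
print(math.ulp(4.0) == 2 * math.ulp(2.0))   # True: and again at 4
```

Near 1e18 the gap is already 128, so most integers of that size have no exact double.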
With your example numbers...
1.24 is represented with the following binary:
Sign bit = 0
Exponent = 01111111111
Mantissa = 0011110101110000101000111101011100001010001111010111
The Hex pattern over the full 64 bits is precisely: 3FF3D70A3D70A3D7.
The precision is derived exclusively from the 53-bit mantissa and the exact decimal value from the binary is:
0.2399999999999999911182158029987476766109466552734375
In this instance a leading integer of 1 is implied by the hidden bit associated with the mantissa and so the complete decimal value is:
1.2399999999999999911182158029987476766109466552734375
Now notice that this is not precisely 1.24 and that is the entire problem.
Let's examine 1.42:
Sign bit = 0
Exponent = 01111111111
Mantissa = 0110101110000101000111101011100001010001111010111000
The Hex pattern over the full 64 bits is precisely: 3FF6B851EB851EB8.
With the implied 1 the complete decimal value is stored as:
1.4199999999999999289457264239899814128875732421875000
And again, not precisely 1.42.
Now, let's examine 1.6:
Sign bit = 0
Exponent = 01111111111
Mantissa = 1001100110011001100110011001100110011001100110011010
The Hex pattern over the full 64 bits is precisely: 3FF999999999999A.
Notice the repeating binary fraction in this case that is truncated
and rounded when the mantissa bits run out? Obviously 1.6 when
represented in binary base2 can never be precisely accurate in the
same way as 1/3 can never be accurately represented by a finite number
of digits in decimal base10 (0.33333333333333333333333 ≠ 1/3).
With the implied 1 the complete decimal value is stored as:
1.6000000000000000888178419700125232338905334472656250
Not exactly 1.6 but closer than the others!
Now let's subtract the full stored double precision representations:
1.60 - 1.42 = 0.18000000000000015987
1.42 - 1.24 = 0.17999999999999993782
So as you can see, they are not equal at all.
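The same inspection can be repeated without reading bit patterns by hand. In this Python sketch (an illustration, not the VBA from the question), Decimal(x) converts the stored double to its exact decimal value, so the hidden digits become visible.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, with no rounding.
for x in (1.24, 1.42, 1.6):
    print(Decimal(x))

left = 1.6 - 1.42
right = 1.42 - 1.24
print(left == right)   # False
```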
The usual way to work around this is threshold testing, basically an inspection to see if two values are close enough... and that depends on you and your requirements. Be forewarned, effective threshold testing is way harder than it appears at first glance.
Here is a function to help you get started comparing two Double Precision numbers. It handles many situations well but not all because no function can.
Function Roughly(a#, b#, Optional within# = 0.00001) As Boolean
    Dim d#, x#, y#, z#
    Const TINY# = 1.17549435E-38 'SINGLE_MIN
    If a = b Then Roughly = True: Exit Function
    x = Abs(a): y = Abs(b): d = Abs(a - b)
    If a <> 0# Then
        If b <> 0# Then
            z = x + y
            If z > TINY Then
                Roughly = d / z < within
                Exit Function
            End If
        End If
    End If
    Roughly = d < within * TINY
End Function
The idea here is to have the function return True if the two Doubles are Roughly the same Within a certain margin:
MsgBox Roughly(3.14159, 3.141591) '<---displays True
The Within margin defaults to 0.00001, but you can pass whatever margin you need.
And while we know that:
MsgBox 1.60 - 1.42 = 1.42 - 1.24 '<---displays False
Consider the utility of this:
MsgBox Roughly(1.60 - 1.42, 1.42 - 1.24) '<---displays True
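For readers working outside VBA, Python's standard library has a comparable threshold test, math.isclose. It is not a port of Roughly: it scales the tolerance by the larger of the two magnitudes rather than by their sum, so treat this as a sketch of the same idea with a different formula.

```python
import math

left = 1.60 - 1.42
right = 1.42 - 1.24

print(left == right)                             # False
print(math.isclose(left, right, rel_tol=1e-5))   # True

# Near zero a relative tolerance is useless, so an absolute floor is needed.
print(math.isclose(1e-300, 0.0, rel_tol=1e-5))                  # False
print(math.isclose(1e-300, 0.0, rel_tol=1e-5, abs_tol=1e-12))   # True
```

The last two lines show the same difficulty that the TINY constant in Roughly is trying to address: comparisons against zero need a separate absolute threshold.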
@chris neilsen linked to an interesting Microsoft page about Excel and IEEE 754.
And please read David Goldberg's seminal What Every Computer Scientist Should Know About Floating-Point Arithmetic. It changed the way I understood floating point numbers.
I have an excel file with the value 6228480018362050000 the exported csv looks like this...
Int,Bigint,String
1,6228480018362050000,Very big
When I try running the following code...
InputStream inp = new FileInputStream("/.../test.xlsx");
DataFormatter df = new DataFormatter(true);
df.formatCellValue(WorkbookFactory.create(inp).getSheetAt(0).getRow(1).getCell(1));
I get 6228480018362049500 which is the wrong number because precision is hosed. Is there a way to get the actual value?
If we put long numbers into Excel cells, those numbers are truncated to 15 significant digits. This is because Excel has no big-integer type; it has only floating point to store numeric values, and with those it follows the IEEE 754 specification. But some numbers cannot be stored exactly as floating point numbers according to the IEEE 754 specification. In your example, 6228480018362050000, which is 6.22848001836205E+018, cannot be stored as such. It will be stored as 6.2284800183620495E+018, or 6228480018362049500, according to the IEEE 754 specification.
Microsoft's knowledge base mentions: "Excel follows the IEEE 754 specification on how to store and calculate floating-point numbers. Excel therefore stores only 15 significant digits in a number, and changes digits after the fifteenth place to zeroes."
This is not the whole truth. In reality, at least with Office OpenXML (*.xlsx), it stores the values according to the IEEE 754 specification and not only 15 significant digits. With your example it stores <v>6.2284800183620495E+18</v>. But that's secondary, because even if it stored 6.22848001836205E+018, somewhere this must be reconverted to floating point, and then it will be 6.2284800183620495E+18 again. Excel does the same while opening the workbook. It converts <v>6.2284800183620495E+18</v> to floating point and then it only displays 15 significant digits.
So if you really need to store 6228480018362050000 as a number in Excel, then the only way to get the same results as in Excel is to do the same as Excel. To do so we can use BigDecimal and its round method, which is able to use a MathContext with a set precision.
Example:
import org.apache.poi.ss.usermodel.*;
import java.io.*;
import java.math.BigDecimal;
import java.math.MathContext;

class ReadExcelBigNumbers {
    public static void main(String[] args) throws Exception {
        // Part 1: wanted text, parsed double, and the double rounded to 15 significant digits
        for (int i = 0; i < 10; i++) {
            String v = "6.2284800183620" + i + "E+018";
            double d = Double.parseDouble(v);
            System.out.print(v + "\t");
            System.out.print(d + "\t");
            BigDecimal bd = new BigDecimal(d);
            v = bd.round(new MathContext(15)).toPlainString();
            System.out.println(v);
        }

        // Part 2: the same rounding applied to numeric cells read from the workbook
        InputStream inp = new FileInputStream("test.xlsx");
        Workbook wb = WorkbookFactory.create(inp);
        for (int i = 1; i < 9; i++) {
            double d = wb.getSheetAt(0).getRow(i).getCell(1).getNumericCellValue();
            BigDecimal bd = new BigDecimal(d);
            String v = bd.round(new MathContext(15)).toPlainString();
            System.out.println(v);
        }
    }
}
The first part prints:
6.22848001836200E+018 6.2284800183620004E18 6228480018362000000
6.22848001836201E+018 6.2284800183620096E18 6228480018362010000
6.22848001836202E+018 6.2284800183620198E18 6228480018362020000
6.22848001836203E+018 6.2284800183620301E18 6228480018362030000
6.22848001836204E+018 6.2284800183620403E18 6228480018362040000
6.22848001836205E+018 6.2284800183620495E18 6228480018362050000
6.22848001836206E+018 6.2284800183620598E18 6228480018362060000
6.22848001836207E+018 6.22848001836207E18 6228480018362070000
6.22848001836208E+018 6.2284800183620803E18 6228480018362080000
6.22848001836209E+018 6.2284800183620905E18 6228480018362090000
There you can see the difference between the wanted floating point value, the real floating point value according to the IEEE 754 specification, and the reformatted BigDecimal. As you see, only 6.22848001836207E+018 can be stored according to the IEEE 754 specification directly.
The second part does the same using the following Excel sheet:
Another possible workaround is mentioned in the knowledge base article: "To work around this behavior, format the cell as text, then type the numbers. The cell can then display up to 1,024 characters." This is good if the numbers are not really numbers but identifiers, for example, or some other strings where the digits are only meant as characters. Calculations with such "text numbers" are of course not possible without reconverting them to floating point, which brings the problem back.
There is no change (loss or gain) of precision between 6228480018362050000 and 6228480018362049500. They are simply two different decimal presentations of the same internal binary value, which in decimal is exactly 6228480018362049536, by the way.
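The claim that both spellings are the same binary value can be checked with a few lines of Python (used here only as a neutral calculator for IEEE 754 doubles, not as part of the POI code in the question).

```python
value = float("6228480018362050000")

print(int(value))              # 6228480018362049536 (exact stored value)
print(format(value, ".17g"))   # 6.2284800183620495e+18
print(format(value, ".15g"))   # 6.22848001836205e+18

# Both decimal spellings parse to the same double.
print(float("6228480018362049500") == value)   # True
```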
Regardless of the cell format, Excel displays (not "stores") only up to the first 15 significant digits, rounding any digits to the right [1].
However, other applications and file formats show up to the first 17 significant digits (or more), which is really what the IEEE 754 standard requires in order to represent every binary value [2]. Apparently, that is true of Apache POI and OpenXML.
You can demonstrate this by doing the following.
In Excel, enter 6228480018362050000. Save as XML.
Open the XML file in Notepad. Note that the Cell/Data element shows 6.2284800183620495E+18, which is 6228480018362049500.
Open the XML file in Excel. Note that Excel still displays 6228480018362050000 in the Formula Bar and in the cell formatted as Number.
It is true that Excel truncates manually-entered numbers (including those read from CSV and TXT files) to the first 15 significant digits, replacing any digits to the right with zeros. But Excel VBA does not.
So for another demonstration, enter the following in VBA, then execute the procedure.
Sub doit()
Range("a1:a2").NumberFormat = "0"
Range("a1") = CDbl("6228480018362050000")
Range("a2") = CDbl("6228480018362049536")
Columns("a").AutoFit
Range("b2") = "=match(a1,a2,0)"
End Sub
Note that A1 and A2 display 6228480018362050000. B2 displays 1, indicating that the internal binary values are an exact match, and VBA does not truncate after the first 15 significant digits.
Explanation....
Excel and most applications use IEEE 754 double-precision to represent numeric values. The binary representation is the sum of 53 consecutive powers of 2 ("bits") times an exponential factor.
Consequently, every integer up to 9007199254740992 (2^53) can be represented exactly. (But note that Excel displays 9007199254740990 for =2^53 because of its 15-significant-digit formatting limitation.)
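A quick way to see the 2^53 boundary, sketched in Python, whose int type is exact and whose float type is a double:

```python
limit = 2 ** 53   # 9007199254740992

print(float(limit) == limit)           # True
print(float(limit + 1) == limit + 1)   # False: 2^53 + 1 is not representable
print(int(float(limit + 1)))           # 9007199254740992
print(float(limit + 2) == limit + 2)   # True: even integers just above 2^53 still fit
```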
Most larger integers can only be approximated.
And that is true of most decimal fractions as well, regardless of the number of significant digits. That is part of the reason why =10.1-10 displays 0.0999999999999996 in the Formula Bar and in the cell formatted with 16 decimal places (15 significant digits).
But beware: a calculated value that displays as 6228480018362050000 might differ from the actual internal binary value.
For example, if you enter 6228480018362050000 into A1 and the formula =6228480018362050000+1600 into A2, both A1 and A2 display 6228480018362050000.
But =MATCH(A1,A2,0) returns #N/A, which indicates that the internal binary values are not an exact match.
And the XML file would show 6.2284800183620516E+18 in the Data element corresponding to the Cell element for A2, which is 6228480018362051600. The actual internal binary value, in decimal, is exactly 6228480018362051584.
(FYI, the Excel equal operator ("=") does not compare the internal binary values. Instead, it compares the values rounded to 15 significant digits. So =(A1=A2) returns TRUE misleadingly. It is intended to be a feature; but it is implemented inconsistently.)
If you copy A2 and paste-value into A3, =MATCH(A1,A3,0) continues to return #N/A. But if you subsequently "edit" A3 (e.g. press f2, then Enter), =MATCH(A1,A3,0) returns 1. The internal value of A3 has been changed to the binary representation of 6228480018362050000.
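The A1/A2 experiment can be imitated in Python. The helper same_to_15_digits is hypothetical: it is only a rough stand-in for the 15-significant-digit comparison described above, not a reproduction of what Excel does internally.

```python
a1 = float("6228480018362050000")
a2 = a1 + 1600

print(int(a1))    # 6228480018362049536
print(int(a2))    # 6228480018362051584
print(a1 == a2)   # False: the binary values differ, as MATCH reports

def same_to_15_digits(x, y):
    """Compare two doubles after rounding each to 15 significant digits."""
    return format(x, ".14e") == format(y, ".14e")

print(same_to_15_digits(a1, a2))   # True: equal once rounded to 15 digits
```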
I wonder if that is actually the mysterious problem that you encountered, and you inadvertently oversimplified it with your example.
Does that help?
[1] Cell format does not affect the internal binary value, with two exceptions: (1) when Precision As Displayed is set, which is almost never recommended; and (2) when the cell value is calculated and the worksheet is saved as a CSV or TXT file, then reopened or imported in Excel.
[2] Although IEEE 754 specifies that 17 significant decimal digits are the minimum needed to represent all binary values, that does not mean that only 17 significant decimal digits are "stored". As demonstrated above, 6228480018362049500 is actually stored as exactly 6228480018362049536.
I am writing a program to approximate the golden ratio to the largest amount of precision possible. It works, but when I tell it to round to more than 16 decimal places, it just doesn't go past 15. This is my code:
# Using fractions to approximate the Golden Ratio
a = 1
b = 1
while b < 1000000000000000:
    g = a + b
    h = g / a
    print (round(h, 20))
    b = a
    a = g
I realize that the while loop probably isn't the best way to do this, so if there is a more efficient way, please inform me of that. But my main question is: is this rounding issue fixable? Or will I just have to settle for 15 decimal places? Thank you!
A float doesn't have more than about 15 to 17 significant decimal digits. Rounding it to more places is pointless, since the extra digits don't exist.
If you really care about precision, I believe you should be using Decimal numbers instead of integers and floats.
Regardless of the type you use, be sure that you are formatting your string the way you want, and not just using print's default.
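As a sketch of the Decimal suggestion, the Fibonacci integers can be kept as exact Python ints and only the final division done in Decimal. The function name golden_ratio and the defaults of 50 digits and 300 steps are arbitrary choices for this example.

```python
from decimal import Decimal, localcontext

def golden_ratio(digits=50, steps=300):
    """Ratio of consecutive Fibonacci numbers, divided at `digits` significant digits."""
    a, b = 1, 1
    for _ in range(steps):
        a, b = a + b, a          # exact integer arithmetic, no rounding
    with localcontext() as ctx:
        ctx.prec = digits        # precision used by the Decimal division
        return Decimal(a) / Decimal(b)

print(golden_ratio())
```

Printing a Decimal shows all of its digits, so no call to round is needed. If more digits are requested, steps has to grow as well, because the Fibonacci ratio itself is only an approximation of the golden ratio.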
I have user input in two cells, named "UpperRangeHigh" and "UpperRangeLow". I have the following code:
dRangeUpper = [UpperRangeHigh] - [UpperRangeLow]
lLines = Int(dRangeUpper * 100 / lInterval)
The user inputs 120.3 and 120 into the input cells respectively. lInterval has the value 10. VBA produces the result of 2 for lLines, instead of 3.
I can overcome this problem by adding 0.000000001 to dRangeUpper, but I'm wondering if there is a known reason for this behaviour?
This appears to be a problem with Excel's calculation and significant digits. If you do:
=120.3 - 120 and format the cell to display 15 decimal places, the result appears as:
0.2999999999999970
Here is a brief overview which explains how Excel uses binary arithmetic and that this may result in results divergent from what you would expect:
http://excel.tips.net/T008143_Avoiding_Rounding_Errors_in_Formula_Results.html
You can overcome this by forcing a rounded precision, e.g., to 10 decimal places:
lLines = Int(Round(dRangeUpper, 10) * 100 / lInterval)
Kindly use single or double when working with decimals to get more accurate results.
Sub sample()
    Dim dRangeUpper As Double
    Dim lLines As Long
    dRangeUpper = CDbl("120.3") - CDbl("120")
    lLines = Round(CDbl(dRangeUpper * 100 / 10), 4)
    Debug.Print lLines
End Sub
output = 3
This is a known Floating point issue within Excel
http://support.microsoft.com/kb/78113
From MSDN:
To minimize any effects of floating point arithmetic storage
inaccuracy, use the Round() function to round numbers to the number of
decimal places that is required by your calculation. For example, if
you are working with currency, you would likely round to 2 decimal
places:
=ROUND(1*(0.5-0.4-0.1),2)
In your case, using Round() instead of Int should do the trick, with 0 decimal places rather than 2.
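The same arithmetic can be traced in Python, whose float is also an IEEE 754 double. This is a sketch of the effect, not of VBA itself: int() truncates toward zero, which matches VBA's Int for the positive values used here, and the variable names are adapted from the question.

```python
d_range_upper = 120.3 - 120
interval = 10

print(d_range_upper)                                    # slightly below 0.3
print(int(d_range_upper * 100 / interval))              # 2
print(int(round(d_range_upper, 10) * 100 / interval))   # 3
print(round(d_range_upper * 100 / interval))            # 3
```

Rounding before truncating and rounding instead of truncating both give 3 here; which one is appropriate depends on whether partial intervals should count.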
Suppose I want to convert the number 0.011124325465476454 to a string in MATLAB.
If I hit
mat2str(0.011124325465476454,100)
I get 0.011124325465476453 which differs in the last digit.
If I hit num2str(0.011124325465476454,'%5.25f')
I get 0.0111243254654764530000000
which is padded with undesirable zeros and differs in the last digit (3 should be 4).
I need a way to convert numerics with random number of decimals to their EXACT string matches (no zeros padded, no final digit modification).
Is there such a way?
EDIT: Since I didn't have in mind the info about precision that Amro and nrz provided, I am adding some more additional info about the problem. The numbers I actually need to convert come from a C++ program that outputs them to a txt file and they are all of the C++ double type. [NOTE: The part that inputs the numbers from the txt file to MATLAB is not coded by me and I'm actually not allowed to modify it to keep the numbers as strings without converting them to numerics. I only have access to this code's "output" which is the numerics I'd like to convert]. So far I haven't gotten numbers with more than 17 decimals (NOTE: consequently the example provided above, with 18 decimals, is not very indicative).
Now, if the number has 15 digits eg 0.280783055069002
then num2str(0.280783055069002,'%5.17f') or mat2str(0.280783055069002,17) returns
0.28078305506900197
which is not the exact number (see last digits).
But if I hit mat2str(0.280783055069002,15) I get
0.280783055069002 which is correct!!!
There are probably a million ways to "code around" the problem (e.g. create a routine that does the conversion), but isn't there some way using the standard built-in MATLAB functions to get the desired results when I input a number with an arbitrary number of decimals (but no more than 17)?
My HPF toolbox also allows you to work with an arbitrary precision of numbers in MATLAB.
In MATLAB, try this:
>> format long g
>> x = 0.280783054
x =
0.280783054
As you can see, MATLAB writes it out with the digits you have posed. But how does MATLAB really "feel" about that number? What does it store internally? See what sprintf says:
>> sprintf('%.60f',x)
ans =
0.280783053999999976380053112734458409249782562255859375000000
And this is what HPF sees, when it tries to extract that number from the double:
>> hpf(x,60)
ans =
0.280783053999999976380053112734458409249782562255859375000000
The fact is, almost all decimal numbers are NOT representable exactly in floating point arithmetic as a double. (0.5 or 0.375 are exceptions to that rule, for obvious reasons.)
However, when stored in a decimal form with 18 digits, we see that HPF did not need to store the number as a binary approximation to the decimal form.
x = hpf('0.280783054',[18 0])
x =
0.280783054
>> x.mantissa
ans =
2 8 0 7 8 3 0 5 4 0 0 0 0 0 0 0 0 0
What niels does not appreciate is that decimal numbers are not stored in decimal form as a double. For example what does 0.1 look like internally?
>> sprintf('%.60f',0.1)
ans =
0.100000000000000005551115123125782702118158340454101562500000
As you see, matlab does not store it as 0.1. In fact, matlab stores 0.1 as a binary number, here in effect...
1/16 + 1/32 + 1/256 + 1/512 + 1/4096 + 1/8192 + 1/65536 + ...
or if you prefer
2^-4 + 2^-5 + 2^-8 + 2^-9 + 2^-12 + 2^-13 + 2^-16 + ...
To represent 0.1 exactly, this would take infinitely many such terms since 0.1 is a repeating number in binary. A double stops after 53 significant bits (52 of them stored). Just like 2/3 = 0.6666666666... as a decimal, 0.1 is stored only as an approximation as a double.
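The list of powers of two can be recovered mechanically. The sketch below is written in Python rather than MATLAB so that exact rational arithmetic is available from the standard library; binary_powers is a hypothetical helper and assumes 0 < x < 1.

```python
from fractions import Fraction

def binary_powers(x, count):
    """Exponents of the first `count` powers of two that sum to the double x (0 < x < 1)."""
    remaining = Fraction(x)          # exact value of the stored double
    powers = []
    exp = 0
    while remaining and len(powers) < count:
        exp -= 1
        term = Fraction(1, 2 ** -exp)
        if remaining >= term:
            remaining -= term
            powers.append(exp)
    return powers

print(binary_powers(0.1, 7))          # [-4, -5, -8, -9, -12, -13, -16]
print(len(binary_powers(0.1, 100)))   # 27: the stored value has finitely many terms
print((0.1).hex())                    # 0x1.999999999999ap-4
```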
This is why your problem really is completely about precision and the binary form that a double comprises.
As a final edit after chat...
The point is that MATLAB uses a double to represent a number. So it will take in a number with up to 15 decimal digits and be able to spew them out with the proper format setting.
>> format long g
>> eps
ans =
2.22044604925031e-16
So for example...
>> x = 1.23456789012345
x =
1.23456789012345
And we see that MATLAB has gotten it right. But now add one more digit to the end.
>> x = 1.234567890123456
x =
1.23456789012346
In its full glory, look at x, as MATLAB sees it:
>> sprintf('%.60f',x)
ans =
1.234567890123456024298320699017494916915893554687500000000000
So always beware the last digit of any floating point number. MATLAB will try to round things intelligently, but 15 digits is just on the edge of where you are safe.
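The 15-versus-17 digit boundary is the same for any language that uses doubles. Here is a short Python sketch of it (Python, not MATLAB; repr is Python's shortest round-trip formatting, which has no direct equivalent among the MATLAB calls shown above).

```python
import sys

x = 1.234567890123456              # 16 significant digits typed

print(sys.float_info.dig)              # 15
print(format(x, ".15g"))               # 1.23456789012346
print(float(format(x, ".15g")) == x)   # False: 15 digits lose the last one
print(float(format(x, ".17g")) == x)   # True: 17 digits always round-trip

# A value typed with 15 digits comes back unchanged from the shortest repr.
print(repr(0.280783055069002))         # 0.280783055069002
```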
Is it necessary to use a tool like HPF or MP to solve such a problem? No, as long as you recognize the limitations of a double. However tools that offer arbitrary precision give you the ability to be more flexible when you need it. For example, HPF offers the use and control of guard digits down in that basement area. If you need them, they are there to save the digits you need from corruption.
You can use Multiple Precision Toolkit from MATLAB File Exchange for arbitrary precision numbers. Floating point numbers do not usually have a precise base-10 presentation.
That's because your number is beyond the precision of the double numeric type (it gives you between 15 and 17 significant decimal digits). In your case, it is rounded to the nearest representable number as soon as the literal is evaluated.
If you need more precision than what the double-precision floating-points provides, store the numbers in strings, or use arbitrary-precision libraries. For example use the Symbolic Toolbox:
sym('0.0111243254654764549999999')
You cannot get the EXACT string, since the number is stored in a double type, or even a long double type.
The number stored will be slightly more or less than the number you give.
A computer only knows the binary digits 0 and 1. You must know that a number that terminates in one radix may not terminate in another. For example, take the number 1/3. Radix 10 yields 0.33333333... (the ellipsis indicates that more digits follow, here all 3s), and it will be truncated to something like 0.333333. Radix 3 yields 0.1, no more and no less, exactly the amount. Radix 2 yields 0.01010101..., so it will likely be truncated to something like 0.01010101 in a computer. That is 85/256, slightly less than 1/3, and the next time you fetch the number, it won't be the one you wanted.
So from the beginning, you should store the number in string instead of float type, otherwise it will lose precision.
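The 85/256 figure can be verified with exact fractions. This Python sketch uses the same 8-bit truncation as the illustration above; a real double keeps 53 bits, so its error is far smaller but still not zero.

```python
from fractions import Fraction

third = Fraction(1, 3)
truncated = Fraction(int("01010101", 2), 2 ** 8)   # 0.01010101 in binary

print(truncated)            # 85/256
print(truncated < third)    # True
print(third - truncated)    # 1/768

# The double nearest to 1/3 is also not exactly 1/3.
print(Fraction(1 / 3) == third)   # False
```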
Considering the precision problem, MATLAB provides symbolic computation to arbitrary precision.