Suppress scientific notation without knowing the length of the number? - python-3.x

In Python, how could I go about suppressing scientific notation, with complete precision, WITHOUT knowing the length of the number?
I need Python to be able to dynamically return the number in normal form with exact precision, no matter how large the number is, and to do it without any trailing zeros. The numbers will always be integers, but they will get very large, and I need them to be completely accurate. Even a single digit being rounded or changed would break my program.
Any ideas?

Use the decimal module.
Unlike hardware based binary floating point, the decimal module has a user alterable
precision (defaulting to 28 places) which can be as large as needed for a given
problem.
From https://docs.python.org/library/decimal.html
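For example, a minimal sketch (the values and the precision setting are illustrative); note that Python integers are themselves arbitrary precision and never print in scientific notation, so Decimal mainly matters once the value has passed through a float:

from decimal import Decimal, getcontext

# Raise the working precision above the default of 28 significant digits
# (only needed if you do arithmetic on very large values).
getcontext().prec = 100

x = Decimal(2) ** 200  # a 61-digit integer, kept exact
print(x)  # 1606938044258990275541962092341162602522202993782792835301376

# If the value arrives in scientific notation, the 'f' format always
# renders it in positional form with no exponent:
y = Decimal("1.5e+40")
print(format(y, "f"))  # 15000000000000000000000000000000000000000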

Related

To how many decimal places is bc accurate?

It is possible to print a square root to several hundred decimal places in bc, as it is in C. However, in C it is only accurate to about 15 decimal places. I have checked the square root of 2 to 50 decimal places and it is accurate, but what is the limit in bc? I can't find any reference to this.
To how many decimal places is bc accurate?
bc is an arbitrary precision calculator. Arbitrary precision just tells us how many digits it can represent (as many as will fit in memory), but doesn't tell us anything about accuracy.
However in C it is only accurate to 15
C uses your processor's built-in floating point hardware. This is fast, but has a fixed number of bits to represent each number, so is obviously fixed rather than arbitrary precision.
Any arbitrary precision system will have more ... precision than this, but could of course still be inaccurate. Knowing how many digits can be stored doesn't tell us whether they're correct.
However, the GNU implementation of bc is open source, so we can just see what it does.
The bc_sqrt function uses an iterative approximation (Newton's method, although the same technique was apparently known to the Babylonians by at least 1000 BC).
This approximation is just run, improving each time, until two consecutive guesses differ by less than the precision requested. That is, if you ask for 1,000 digits, it'll keep going until the difference is at most in the 1,001st digit.
The only exception is when you ask for an N-digit result and the original number has more than N digits. It'll use the larger of the two as its target precision.
Since the convergence rate of this algorithm is faster than one digit per iteration, there seems little risk of two consecutive iterations agreeing to some N digits without also being correct to N digits.
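Here is a minimal Python sketch of the same scheme (using the decimal module for arbitrary precision; the function name and guard-digit count are illustrative, and the stopping rule mirrors the one described above):

from decimal import Decimal, getcontext

def dec_sqrt(x, digits):
    """Approximate sqrt(x) by Newton's method, iterating until two
    consecutive guesses agree to within the requested precision."""
    getcontext().prec = digits + 5         # a few guard digits for intermediates
    x = Decimal(x)
    tol = Decimal(10) ** -digits
    guess = x if x > 1 else Decimal(1)
    while True:
        better = (guess + x / guess) / 2   # Newton/Babylonian update
        if abs(better - guess) < tol:
            break
        guess = better
    getcontext().prec = digits             # drop the guard digits again
    return +better                         # unary plus re-rounds to the context

print(dec_sqrt(2, 50))  # 1.4142135623730950488016887242096980785696718753769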

Why doesn't precision in Mathematica work consistently, or sometimes not at all?

Consider the following terminating decimal numbers.
3.1^2 = 9.61
3.1^4 = 92.3521
3.1^8 = 8528.91037441
The following shows how Mathematica treats these expressions
In[1]:= 3.1^2
Out[1]= 9.61
In[2]:= 3.1^4
Out[2]= 92.352
So far so good, but
In[3]:= 3.1^8
Out[3]= 8528.91
doesn't provide enough precision.
So let's try N[], NumberForm[], and DecimalForm[] with a precision of 12
In[4]:= N[3.1^8,12]
Out[4]= 8528.91
In[5]:= NumberForm[3.1^8,12]
Out[5]= 8528.91037441
In[6]:= DecimalForm[3.1^8,12]
Out[6]= 8528.91037441
In this case DecimalForm[] and NumberForm[] work as expected, but N[] only provided the default precision of 6, even though I asked for 12. So DecimalForm[] or NumberForm[] seem to be the way to go if you want exact results when the inputs are terminating decimals.
Next consider rational numbers with infinite repeating decimals like 1/3.
In[7]:= N[1/3,20]
Out[7]= 0.33333333333333333333
In[8]:= NumberForm[1/3, 20]
Out[8]=
1/3
In[9]:= DecimalForm[1/3, 20]
Out[9]=
1/3
Unlike the previous case, N[] seems to be the proper way to go here, whereas NumberForm[] and DecimalForm[] do not respect precisions.
Finally consider irrational numbers like Sqrt[2] and Pi.
In[10]:= N[Sqrt[2],20]
Out[10]= 1.4142135623730950488
In[11]:= NumberForm[Sqrt[2], 20]
Out[11]=
sqrt(2)
In[12]:= DecimalForm[Sqrt[2], 20]
Out[12]=
sqrt(2)
In[13]:= N[π^12,30]
Out[13]= 924269.181523374186222579170358
In[14]:= NumberForm[Pi^12,30]
Out[14]=
π^12
In[15]:= DecimalForm[Pi^12,30]
Out[15]=
π^12
In these cases N[] works, but NumberForm[] and DecimalForm[] do not. However, note that N[] switches to scientific notation at π^13, even with a larger precision. Is there a way to avoid this switch?
In[16]:= N[π^13,40]
Out[16]= 2.903677270613283404988596199487803130470*10^6
So there doesn't seem to be a consistent way to get decimal numbers with a requested precision while avoiding scientific notation. Sometimes N[] works, other times DecimalForm[] or NumberForm[] works, and at other times nothing seems to work.
Have I missed something or are there bugs in the system?
It isn't a bug; it is purposefully designed to behave this way. Precision is limited by the precision of your machine, your configuration of Mathematica, and the algorithm and performance constraints of the calculation.
The documentation for N[expr, n] states it attempts to give a result with n‐digit precision. When it cannot give the requested precision it gets as close as it can. DecimalForm and NumberForm work the same way.
https://reference.wolfram.com/language/ref/N.html explains the various cases behind this:
Unless numbers in expr are exact, or of sufficiently high precision, N[expr,n] may not be able to give results with n‐digit precision.
N[expr,n] may internally do computations to more than n digits of precision.
$MaxExtraPrecision specifies the maximum number of extra digits of precision that will ever be used internally.
The precision n is given in decimal digits; it need not be an integer.
n must lie between $MinPrecision and $MaxPrecision. $MaxPrecision can be set to Infinity.
n can be smaller than $MachinePrecision.
N[expr] gives a machine‐precision number, so long as its magnitude is between $MinMachineNumber and $MaxMachineNumber.
N[expr] is equivalent to N[expr,MachinePrecision].
N[0] gives the number 0. with machine precision.
N converts all nonzero numbers to Real or Complex form.
N converts each successive argument of any function it encounters to numerical form, unless the head of the function has an attribute such as NHoldAll.
You can define numerical values of functions using N[f[args]]:=value and N[f[args],n]:=value.
N[expr,{p,a}] attempts to generate a result with precision at most p and accuracy at most a.
N[expr,{Infinity,a}] attempts to generate a result with accuracy a.
N[expr,{Infinity,1}] attempts to find a numerical approximation to the integer part of expr.

Does the default memory allocated for a datatype play a role in rounding? In what manner is a float rounded if it exceeds the allocated memory?

Having a file test2.py with the following contents:
print(2.0000000000000003)
print(2.0000000000000002)
I get this output:
$ python3 test2.py
2.0000000000000004
2.0
I thought a lack of memory allocated for the float might be causing this, but 2.0000000000000003 and 2.0000000000000002 need the same amount of memory.
IEEE 754 64-bit binary floating point always uses 64 bits to store a number. It can exactly represent a finite subset of the binary fractions. Looking only at the normal numbers, if N is a power of two in its range, it can represent a number of the form, in binary, 1.s*N where s is a string of 52 zeros and ones.
All the 32 bit binary integers, including 2, are exactly representable.
The smallest exactly representable number greater than 2 is 2.000000000000000444089209850062616169452667236328125. It is twice the binary fraction 1.0000000000000000000000000000000000000000000000000001.
2.0000000000000003 is closer to 2.000000000000000444089209850062616169452667236328125 than to 2, so it rounds up and prints as 2.0000000000000004.
2.0000000000000002 is closer to 2.0, so it rounds down to 2.0.
To store numbers between 2.0 and 2.000000000000000444089209850062616169452667236328125 would require a different floating point format likely to take more than 64 bits for each number.
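You can see both neighbours directly in Python (math.nextafter requires Python 3.9+; the formatting width of 51 is chosen to show the exact value):

import math

up = math.nextafter(2.0, math.inf)   # smallest representable double above 2.0
print(up)                            # 2.0000000000000004
print(f"{up:.51f}")                  # 2.000000000000000444089209850062616169452667236328125

# Both literals from the question round to their nearest representable double:
print(2.0000000000000003 == up)      # True
print(2.0000000000000002 == 2.0)     # True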
Floats are not stored as integers are, with each bit signaling a yes/no term of 1, 2, 4, 8, 16, 32, ... that you add up to get the complete number. They are stored as sign + mantissa + exponent in base 2. Several bit combinations have special meanings (NaN, ±inf, -0, ...). Positive and negative numbers are identical in mantissa and exponent; only the sign differs.
At any given time they have a specific bit-length they are "put into". They cannot overflow.
They do, however, have limited precision; if you try to fit numbers into them that would need more precision, you get rounding errors – that's what you see in your example.
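A short Python sketch makes that layout visible (the helper name is made up for illustration):

import struct

def double_fields(x):
    # Reinterpret the 64 bits of a double as an integer, then slice out
    # the 1-bit sign, 11-bit biased exponent, and 52-bit mantissa fields.
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent = ((bits >> 52) & 0x7FF) - 1023   # remove the bias
    mantissa = bits & ((1 << 52) - 1)          # implicit leading 1 not stored
    return sign, exponent, mantissa

print(double_fields(2.0))                 # (0, 1, 0)
print(double_fields(-2.0))                # (1, 1, 0) -- only the sign differs
print(double_fields(2.0000000000000004))  # (0, 1, 1) -- lowest mantissa bit set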
More on floats and storage (with example):
http://stupidpythonideas.blogspot.de/2015/01/ieee-floats-and-python.html
(which links to a more technical https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html)
More on accuracy of floats:
- Floating Point Arithmetic: Issues and Limitations

directx local space coordinates float accuracy

I'm a bit confused about the local space coordinate system. Suppose I have a complex object in local space. I know that when I want to put it in world space I have to multiply it by a Scale, Rotate, Translate matrix. But the problem is that local coordinates range only from -1.0f to 1.0f; when I want a vertex like (1/500, 1/100, 1/100), things will not work – everything becomes 0 due to float accuracy problems.
The only solution I can see now is to separate the object into lots of local coordinate systems and project/view each one individually to put them together. That doesn't seem like the correct way to solve the problem. I've checked lots of books, but none of them mention this issue. I really want to know how to solve it.
when I want to have vertex like (1/500,1/100,1/100) things will not work
What makes you think that? The float accuracy problem does not mean something will coerce to 0 if it can't be accurately represented. It just means it will coerce to the floating point number closest to the intended figure.
It's the very same as writing down, e.g., 3/9 with at most 6 significant decimal digits: 0.333333 – it didn't coerce to 0. And the very same goes for floating point.
Now you may be familiar with scientific notation: x·10^y – this is essentially decimal floating point: a mantissa x and an exponent y, which specifies the order of magnitude. In binary floating point it becomes x·2^y. In either case the significant digits live in the mantissa. Your typical floating point number (in OpenGL) has a 23-bit mantissa plus an implicit leading bit, which amounts to 24 significant binary digits (about 7 decimal digits).
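A quick check in Python (the 32-bit round trip through struct stands in for what a GPU would store):

import struct

x = 1 / 500
print(x)            # 0.002 -- it did not coerce to 0
print(f"{x:.20f}")  # 0.00200000000000000004 -- the nearest double

# Squeezed into a 32-bit float, it is still ~0.002, just with fewer
# correct digits:
x32 = struct.unpack("f", struct.pack("f", x))[0]
print(f"{x32:.12f}")  # 0.002000000095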
I really want to know how to solve it.
The real trouble with floating point numbers arises when you have to mix and merge numbers across a large range of orders of magnitude. As long as the numbers are of similar orders of magnitude, everything happens within the mantissa alone. And that one last change in order of magnitude into the [-1, 1] range will not hurt you; in fact, this can be done by "normalizing" the floating point value and then simply dropping the exponent.
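In Python this normalization is exactly what math.frexp does (a small illustration):

import math

# frexp splits x into a mantissa m in [0.5, 1) and an exponent e such
# that x == m * 2**e; "dropping the exponent" leaves the significant
# digits untouched.
m, e = math.frexp(437.25)
print(m, e)              # 0.85400390625 9
print(math.ldexp(m, e))  # 437.25 -- reconstructed exactly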
Recommended read: http://floating-point-gui.de/
Update
One further thing: If you're writing 1/500 in a language like C, then you're performing an integer division and that will of course round down to 0. If you want this to be a floating point operation you either have to write floating point literals or cast to float, i.e.
1./500.
or
(float)1/(float)500
Note that casting one of the operands to float suffices to make this a floating point division.
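For comparison, Python 3 behaves differently here: / always performs true (floating point) division, while // floors like C's integer division:

print(1 / 500)   # 0.002 -- true division, always a float in Python 3
print(1 // 500)  # 0     -- floor division, matches C's integer 1/500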

lossless conversion of float to string and back: is it possible?

This question refers to the IEEE standard floating point numbers used on C/x86.
Is it possible to represent any numeric (i.e. excluding special values such as NaN) float or double as a decimal string such that converting that string back to a float/double will always yield exactly the original number?
If not, what algorithm tells me whether a given number will suffer a conversion error?
If so, consider this: some decimal fractions, when converted to binary, will not be numerically the same as the original decimal value, but the reverse is not true (every finite binary fraction has a finite decimal expansion, which is exact if not truncated), so here's another question...
Is it ever necessary to introduce deliberate errors into the decimal representation in order to trick the atof (or other) function into yielding the exact original number, or will a naive, non-truncating toString function be adequate (assuming exact conversion is possible in general)?
According to this page:
Actually, the IEEE754-1985 standard says that 17 decimal digits is
enough in all cases. However, it seems that the standard is a little
vague on whether conforming implementations must guarantee lossless
conversion when 17 digits are used.
So storing a double as a decimal string with at least 17 significant digits (correctly rounded) guarantees that it can be converted back to a binary double without any data loss.
In other words, if every single possible double-precision value were converted to a correctly rounded decimal string of 17 digits, they would all map to distinct strings. Thus there is no data loss.
I'm not sure of the minimum cut-off for single precision, though. But I'd suspect that it is 8 or 9 digits.
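A quick Python check of both claims (the 32-bit round trip via struct simulates a C float; 9 digits do suffice for single precision):

import struct

# Double precision: 17 significant digits round-trip losslessly.
x = 0.1 + 0.2
s = f"{x:.17g}"
print(s)              # 0.30000000000000004
print(float(s) == x)  # True

# Single precision: 9 significant digits suffice.
y = struct.unpack("f", struct.pack("f", 0.1))[0]
t = f"{y:.9g}"
print(t)              # 0.100000001
print(struct.unpack("f", struct.pack("f", float(t)))[0] == y)  # True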
Given that the IEEE format can only represent a finite number of (binary) digits, and therefore has limited precision (cf. machine epsilon), you only need a finite number of (decimal) digits. Of course it is preferable if the implementation (strtod, snprintf) behaves as an identity mapping between {all floats} and the set of {one decimal representation for each float}.
In Java, it is possible to convert a double to and from a string by constructing an intermediate BigDecimal object:
double doubleValue = ...;
// From double to string
String valueOfDoubleAsString = new BigDecimal(doubleValue).toString();
// And back
double doubleValueFromString = new BigDecimal(valueOfDoubleAsString).doubleValue();
// doubleValue == doubleValueFromString
There is no locale issue with this method.
However, special double values (Infinity, NaN) will of course not work.
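For what it's worth, Python's decimal.Decimal constructor does the same thing as new BigDecimal(double): it captures the exact binary value of the float, so the round trip is also lossless:

from decimal import Decimal

x = 0.1
s = str(Decimal(x))
print(s)              # 0.1000000000000000055511151231257827021181583404541015625
print(float(s) == x)  # True -- converting back is lossless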
