How to calculate and store the number in scientific notation? - python-3.x

I am using Python to model statistical physics, so I have to deal with very small numbers.
For example,
a = 2.22e-300, b = 3e-200
and I want to calculate
a * b = 6.66e-500.
However, in Python 3 it shows 0.0.
I am thinking of designing a data type: the first part stores the float number, which is 6.66 here, and the second part stores the order of magnitude, which is -500.
May I ask how I can implement this? Or is there a better way to deal with numbers in scientific notation?

Create a class:
class Sci_note:
    def __init__(self, base, exp):
        self.base = base
        self.exp = exp
    def __mul__(self, other):
        return Sci_note(self.base * other.base,
                        self.exp + other.exp)
    def __str__(self):
        return str(self.base) + 'e' + str(self.exp)
and it functions as you would expect:
>>> a = Sci_note(2.22, -300)
>>> b = Sci_note(3, -200)
>>> c = a * b
>>> c.base
6.66
>>> c.exp
-500
update
I added a __str__ method (above), so they are displayed properly when printed:
>>> print(a)
2.22e-300
Of course, I have only implemented the multiplication method here, but I will leave it up to you to implement the others when required. It may be the case that you only need multiplication so I would be wasting everyone's time if I wrote them now!
In addition, creating a __float__ handler would not be useful here: a Python float cannot represent a nonzero value much smaller than about 5e-324, so converting a result such as 6.66e-500 back to a float would just give 0.0!
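One thing the class above does not do is keep the base in a fixed range, so repeated multiplication lets the base itself grow or shrink. The following is a minimal sketch of a normalizing variant. The name SciNumber and the normalization step are assumptions for illustration, not part of the original answer, and the log10-based shift can be off by one for bases that sit extremely close to a power of ten.

```python
import math


class SciNumber:
    """Hypothetical variant of Sci_note that keeps 1 <= |base| < 10."""

    def __init__(self, base, exp):
        if base != 0:
            # Move whole powers of ten out of the base and into the exponent.
            shift = math.floor(math.log10(abs(base)))
            base = base / 10 ** shift
            exp = exp + shift
        self.base = base
        self.exp = exp

    def __mul__(self, other):
        return SciNumber(self.base * other.base, self.exp + other.exp)

    def __str__(self):
        return f"{self.base}e{self.exp}"


product = SciNumber(5, -300) * SciNumber(4, -200)
print(product)
```

Here the raw product has base 20, so the constructor stores it as base 2.0 with exponent -499.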

I strongly suggest you use something like the built-in decimal module and increase its precision to your needs. For example:
>>> from decimal import *
>>> getcontext().prec = 100
>>> a = Decimal("2.22e-300")
>>> b = Decimal("3e-200")
>>> a
Decimal('2.22E-300')
>>> b
Decimal('3E-200')
>>> a*b
Decimal('6.66E-500')
Note that, to be on the safe side, I create a and b using strings such as "3e-200" to let the decimal module parse them exactly. If you pass float literals instead, Python first converts them to inexact binary floating-point values, and Decimal then copies that inexact value.
In the above code, we set the precision to 100 significant digits.
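The difference between building a Decimal from a string and from a float can be seen in a short sketch. The variable names are illustrative only; the value 0.1 is chosen because it has no exact binary representation.

```python
from decimal import Decimal, getcontext

getcontext().prec = 100  # significant digits kept by arithmetic results

from_float = Decimal(0.1)     # copies the binary float exactly, error included
from_string = Decimal("0.1")  # parsed as exactly one tenth

print(from_float == from_string)
print(from_float)

# Values far below the float range are fine when built from strings.
product = Decimal("2.22e-300") * Decimal("3e-200")
print(product)
```

Printing from_float shows the long tail of digits that the binary float actually stores, which is why string construction is the safer habit.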

Related

How to prevent the "overflow" when multiplying several small numbers?

I have a list with x very small numbers and want to compute their product. I only want to use pure Python and/or NumPy.
import numpy as np
# List A with x very small numbers
A = np.array([1.20223398e-072, 1.53678559e-067, 6.04813112e-041, 3.26046833e-104,
              3.09114525e-048, 7.65394632e-118, 4.58886892e-209, 7.02220200e-044,
              3.40963578e-085, 2.79721084e-060, 6.99320974e-052, 7.65701921e-039,
              3.05321642e-103, 2.33360119e-050, 2.92905105e-044, 5.13970623e-044,
              6.46863409e-180, 1.78254565e-177, 6.26061488e-068, 5.86281346e-043])
# creating the product of all elements in A
np.prod(A)
Output:
0.0
And this is a problem, maybe an overflow?!
What have I tried?
I tried to do this in a loop => very bad running time, and it did not work
Input:
prod = 1
for i in A:
    prod = prod * i
Output:
0.0
I tried to sort by magnitude and then always multiply the first and last elements of the array, then the second and penultimate, the third and third from last, ...
=> Did not work either
Input:
prod = 1
A.sort()
for i in range(len(A)):
    prod = prod * (A[i]*A[-i-1])
Output:
0.0
Does anyone have an idea how to solve this problem?
Best regards
Christian
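The underlying floating-point storage cannot hold numbers this small, especially their product, which grows smaller the more you multiply. This is underflow rather than overflow: once the product drops below the smallest representable positive double (about 5e-324), it is rounded to 0.0.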
>>> A[0]*A[1]
1.8475758562723483e-139
>>> A[0]*A[1]*A[2]
1.1174381032881437e-179
>>> A[0]*A[1]*A[2]*A[3]
3.6433715465062613e-283
>>> A[0]*A[1]*A[2]*A[3]*A[4]
0.0
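However, with these specific numbers, the product actually fits in a np.float128 type (where available: it maps to the platform's long double and does not exist on every platform, for example Windows builds of NumPy):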
>>> np.prod(A, dtype=np.float128)
6.439950307032109978e-1637
Of course, this just moves the goal post: Multiplying other numbers could again give you a zero.
An alternative is to use the decimal module in Python, which gives you the most flexibility in dealing with exact decimal numbers. It is much slower than IEEE floats, but it works:
>>> import decimal
>>> B = [decimal.Decimal(x) for x in A]
>>> B[0]*B[1]
Decimal('1.847575856272348173712320037E-139')
>>> B[0]*B[1]*B[2]
Decimal('1.117438103288143607857271302E-179')
>>> B[0]*B[1]*B[2]*B[3]
Decimal('3.643371546506261078793026112E-283')
>>> B[0]*B[1]*B[2]*B[3]*B[4]
Decimal('1.126219064996798351280705310E-330')
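If only ordinary floats are wanted, another common workaround is to add logarithms instead of multiplying, and to report the product as a mantissa and a power of ten. The sketch below is an assumption-laden illustration rather than part of the answer above: product_as_sci is a hypothetical helper, it requires every value to be positive, and the mantissa is only as accurate as the floating-point logarithms.

```python
import math


def product_as_sci(values):
    """Return (mantissa, exponent) with product == mantissa * 10**exponent.

    Assumes every value is a positive float.
    """
    # The sum of log10 values is the log10 of the product and never underflows.
    log_sum = math.fsum(math.log10(v) for v in values)
    exponent = math.floor(log_sum)
    mantissa = 10 ** (log_sum - exponent)
    return mantissa, exponent


A = [1.20223398e-072, 1.53678559e-067, 6.04813112e-041, 3.26046833e-104,
     3.09114525e-048, 7.65394632e-118, 4.58886892e-209, 7.02220200e-044,
     3.40963578e-085, 2.79721084e-060, 6.99320974e-052, 7.65701921e-039,
     3.05321642e-103, 2.33360119e-050, 2.92905105e-044, 5.13970623e-044,
     6.46863409e-180, 1.78254565e-177, 6.26061488e-068, 5.86281346e-043]

mantissa, exponent = product_as_sci(A)
print(f"{mantissa:.6f}e{exponent}")
```

The result can be compared against the np.float128 product shown earlier in this answer; for many statistical computations the log of the product is all that is needed, in which case log_sum can be used directly.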

Why does this n choose r python code not work?

These two variations of n choose r code give different answers, although both follow the correct definition.
I saw that this code works:
import math
def nCr(n,r):
    f = math.factorial
    return f(n) // f(r) // f(n-r)
But mine did not:
import math
def nCr(n,r):
    f = math.factorial
    return int(f(n) / (f(r) * f(n-r)))
The test case nCr(80,20) shows the difference in results. Please advise why they differ in Python 3, thank you!
There is no error message. The right answer should be 3535316142212174320, but mine gave 3535316142212174336.
That's because int(a / b) isn't the same as a // b.
int(a / b) evaluates a / b first, which is floating-point division. Floating-point numbers are prone to roundoff errors; for example, .1 + .2 == 0.30000000000000004. So, at some point, your code divides really big numbers, and the quotient is rounded because floating-point numbers are of fixed size and thus cannot be infinitely precise.
a // b is integer division, which is a different thing. Python's integers can be arbitrarily huge, and their division doesn't cause roundoff errors, so you get the correct result.
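The two behaviours can be put side by side in a short sketch. The function names ncr_exact and ncr_via_float are chosen here for illustration; the bodies mirror the two versions from the question.

```python
import math


def ncr_exact(n, r):
    f = math.factorial
    return f(n) // (f(r) * f(n - r))      # integer arithmetic throughout


def ncr_via_float(n, r):
    f = math.factorial
    return int(f(n) / (f(r) * f(n - r)))  # true division rounds to a float first


exact = ncr_exact(80, 20)
rounded = ncr_via_float(80, 20)
print(exact, rounded, rounded - exact)
```

On Python 3.8 or later, math.comb(80, 20) computes the same exact integer without spelling out the factorials.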
Speaking about floating-point numbers being of fixed size. Take a look at this:
>>> import math
>>> f = math.factorial
>>> f(20) * f(80-20)
20244146256600469630315959326642192021057078172611285900283370710785170642770591744000000000000000000
>>> f(80) / _
3.5353161422121743e+18
The float prints as 3.5353161422121743e+18, the shortest decimal string that identifies it, but it holds one specific binary value. A double has a 53-bit significand, so near 3.5e18 adjacent representable values are 512 apart and nothing finer than that can be stored. int() returns the exact value of the stored double, 3535316142212174336, which is the representable value nearest to the true quotient rather than the true quotient itself.
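The gap between neighbouring doubles at this magnitude can be computed directly. This is a small sketch using only math.frexp; the variable names are illustrative, and the constant 53 is the significand width of an IEEE 754 double, which is what CPython floats use on common platforms.

```python
import math

# The true value of nCr(80, 20), rounded to the nearest double.
x = float(3535316142212174320)

# frexp gives x == m * 2**e with 0.5 <= m < 1. A double keeps 53 significant
# bits, so neighbouring doubles near x differ by 2**(e - 53).
m, e = math.frexp(x)
spacing = 2 ** (e - 53)

print(int(x))
print(spacing)
```

Because the exact answer lies between two representable values, the float can only hold the nearer one, and int() faithfully reports it.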

math.sqrt function python gives same result for two different values [duplicate]

Why does the math module return the wrong result?
First test
A = 12345678917
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)
Result
A = 12345678917
B = 12345678917
Here, the result is correct.
Second test
A = 123456758365483459347856
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)
Result
A = 123456758365483459347856
B = 123456758365483467538432
Here the result is incorrect.
Why is that the case?
Because math.sqrt(..) first converts the number to a floating-point value, and floats have a limited mantissa: it can only represent the leading part of the number correctly. So float(A**2) is not equal to A**2. Next it calculates the square root of that float, which is again only approximately correct.
Most functions working with floating-point values will never fully match their integer counterparts. Floating-point calculations are almost inherently approximate.
If one calculates A**2 one gets:
>>> 12345678917**2
152415787921658292889L
Now if one converts it to a float(..), one gets:
>>> float(12345678917**2)
1.5241578792165828e+20
But if you now ask whether the two are equal:
>>> float(12345678917**2) == 12345678917**2
False
So information has been lost while converting it to a float.
You can read more about how floats work and why they are approximate in the Wikipedia article about IEEE 754, the formal definition of how floating-point numbers work.
The documentation for the math module states "It provides access to the mathematical functions defined by the C standard." It also states "Except when explicitly noted otherwise, all return values are floats."
Those together mean that the parameter to the square root function is a float value. In most systems that means a floating point value that fits into 8 bytes, which is called "double" in the C language. Your code converts your integer value into such a value before calculating the square root, then returns such a value.
However, the 8-byte floating point value can store at most 15 to 17 significant decimal digits. That is what you are getting in your results.
If you want better precision in your square roots, use a function that is guaranteed to give full precision for an integer argument. Just do a web search and you will find several. Those usually use a variation of the Newton-Raphson method to iterate and eventually arrive at the correct answer. Be aware that this is significantly slower than the math module's sqrt function.
Here is a routine that I modified from the internet. I can't cite the source right now. This version also works for non-integer arguments but just returns the integer part of the square root.
def isqrt(x):
    """Return the integer part of the square root of x, even for very
    large values."""
    if x < 0:
        raise ValueError('square root not defined for negative numbers')
    n = int(x)
    if n == 0:
        return 0
    a, b = divmod(n.bit_length(), 2)
    x = (1 << (a+b)) - 1
    while True:
        y = (x + n//x) // 2
        if y >= x:
            return x
        x = y
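On Python 3.8 and later, the standard library provides math.isqrt, which does the same job for non-negative integers. The sketch below assumes such a Python version and reuses the number from the question to contrast it with the float-based route.

```python
import math

A = 123456758365483459347856

via_float = int(math.sqrt(A ** 2))  # goes through a 53-bit double
via_isqrt = math.isqrt(A ** 2)      # exact integer arithmetic

print(via_float == A)
print(via_isqrt == A)
```

Unlike the routine above, math.isqrt accepts only integers, so a non-integer argument has to be converted with int() first.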
If you want to calculate sqrt of really large numbers and you need exact results, you can use sympy:
import sympy
num = sympy.Integer(123456758365483459347856)
print(int(num) == int(sympy.sqrt(num**2)))
The way floating-point numbers are stored in memory makes calculations with them prone to slight errors that can nevertheless be significant when exact results are needed. As mentioned in one of the comments, the decimal library can help you here:
>>> from decimal import Decimal
>>> A = Decimal(123456758365483459347856)
>>> A
Decimal('123456758365483459347856')
>>> B = A.sqrt()**2
>>> B
Decimal('123456758365483459347856.0000')
>>> A == B
True
>>> int(B)
123456758365483459347856
I use version 3.6, which has no hardcoded limit on the size of integers. I don't know if, in 2.7, casting B as an int would cause overflow, but decimal is incredibly useful regardless.

Setting the length of a float or number

I've been looking around for a way to set the length of floats or decimals to 2 places. I'm doing this for a piece of coursework. I have tried getcontext, but it does nothing.
from decimal import *
getcontext().prec = 2
price = "22.5"
# I would like this to be 22.50, but as it comes from a list and I use float a bit, I have to convert it to decimal (?)
price = Decimal(price)
print (price)
But the output is:
22.5
If anyone knows a better way to set the length of a decimal to 2 decimal places (using it in money) or where I'm going wrong, it would be helpful.
"float" is short for "floating point". Read about floating point on https://en.wikipedia.org/wiki/Floating-point_arithmetic and then never ever ever use it to represent money.
You're on the right track with Decimal. You just need to watch out for the distinction between the precision of the representation and the display.
The prec attribute of the context controls the precision of the representation of values that result from different operations. It does not control the precision of explicitly constructed values. And it does not control the precision of the display.
Consider:
>>> getcontext().prec = 2
>>> Decimal("1") / Decimal("3")
Decimal('0.33')
>>>
vs
>>> getcontext().prec = 2
>>> Decimal("1") / Decimal("2")
Decimal('0.5')
>>>
vs
>>> getcontext().prec = 2
>>> Decimal("0.12345")
Decimal('0.12345')
>>>
To specify the precision for display purposes of a Decimal, you just have to take more control over the display code. Don't rely on str(Decimal(...)).
One option is to normalize the decimal for display:
>>> getcontext().prec = 2
>>> Decimal("0.12345").normalize()
Decimal('0.12')
This respects the prec setting from the context.
Another option is to quantize it to a specific precision:
>>> Decimal("0.12345").quantize(Decimal("1.00"))
Decimal('0.12')
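This does not round to the prec setting from the context. The result must still fit within prec, though: quantize raises InvalidOperation if the quantized coefficient would need more digits than prec allows.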
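Applied to the price from the question, quantize can be wrapped in a small helper. This is a sketch only: to_money and CENT are hypothetical names, ROUND_HALF_UP is one possible rounding choice, and the default context precision (28 digits) is assumed, because a context with prec = 2 would make quantize raise for a value like 22.50.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def to_money(value):
    """Hypothetical helper: parse a string and round it to two decimal places."""
    return Decimal(value).quantize(CENT, rounding=ROUND_HALF_UP)


price = to_money("22.5")
print(price)
```

The value keeps its two decimal places when printed because a Decimal remembers its exponent, so no separate formatting step is needed.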
Decimals can also be rounded:
>>> round(Decimal("123.4567"), 2)
123.46
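Though be very careful with this: in Python 2 the result of rounding is a float, as shown above, whereas in Python 3 round(Decimal("123.4567"), 2) returns Decimal('123.46').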
You can also format a Decimal directly into a string:
>>> "{:.2f}".format(Decimal("1.234"))
'1.23'
Try this:
print("{:.2f}".format(price))
This way there is no need to set the precision globally.

Distinguishing large integers from near integers in python

I want to avoid my code mistaking a near integer for an integer. For example, 58106601358565889 has as its square root 241,053,109.00000001659385359763188, but when I used the following boolean test, 58106601358565889 fooled me into thinking it was a perfect square:
a = 58106601358565889
b = math.sqrt(a)
print(b == int(b))
The precision isn't necessarily the problem, because if I re-check, I get the proper (False) conclusion:
print(a == b**2)
What would be a better way to test for a true versus a near integer? The math.sqrt is buried in another definition in my code, and I would like to avoid having to insert a check of a squared square root, if possible. I apologize if this is not a good question; I'm new to python.
import numpy as np
import math
from decimal import *
a = 58106601358565889
b = np.sqrt(a)
c = math.sqrt(a)
d = Decimal(58106601358565889).sqrt()
print(d)
print(int(d))
print(c)
print(int(c))
print(b)
print(int(b))
Output:
241053109.0000000165938535976
241053109
241053109.0
241053109
241053109.0
241053109
I would say use decimal.
Expected code :
from decimal import *
d = Decimal(58106601358565889).sqrt()
print(d == int(d))
Output:
False
This isn't a matter of distinguishing integers from non-integers, because b really is an integer*. The precision of a Python float isn't enough to represent the square root of a to enough digits to get any of its fractional component. The second check you did:
print(a == b**2)
only prints False because while b is an integer, b**2 still isn't a.
If you want to test whether very large integers are exact squares, consider implementing a square root algorithm yourself.
*as in 0 fractional part, not as in isinstance(b, int).
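An exact perfect-square test can avoid floats entirely. The sketch below assumes Python 3.8 or later for math.isqrt; is_perfect_square is a name chosen here for illustration.

```python
import math


def is_perfect_square(n):
    """Return True if the non-negative integer n is an exact square."""
    root = math.isqrt(n)      # exact integer square root, no rounding
    return root * root == n


print(is_perfect_square(58106601358565889))
print(is_perfect_square(241053109 ** 2))
```

Since the check squares an integer root, it cannot be fooled by a fractional part that is too small for a float to hold.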
It's not the precision of the int that is the problem; it's the limited precision of floats:
>>> import math
>>> math.sqrt(58106601358565889)
241053109.0
>>> math.sqrt(58106601358565889) - 241053109
0.0
I think the double check would be the obvious solution
You could also look at the gmpy2 library. It has a function for calculating the integer square root and also the integer square root plus remainder. There are no precision constraints.
>>> import gmpy2
>>> gmpy2.isqrt(58106601358565889)
mpz(241053109)
>>> gmpy2.isqrt_rem(58106601358565889)
(mpz(241053109), mpz(8))
>>>
Disclaimer: I maintain gmpy2.
