why is np.exp(x) not equal to np.exp(1)**x - python-3.x

Why is np.exp(x) not equal to np.exp(1)**x?
For example:
>>> np.exp(400)
5.221469689764144e+173
>>> np.exp(1)**400
5.221469689764033e+173
>>> np.exp(400)-np.exp(1)**400
1.1093513018771065e+160

An optimisation in numpy causes this difference.
To see why, recall how Euler's number is defined in mathematics:
e = (1 + 1/n)**n as n tends to infinity.
I think numpy stops at a certain finite order.
The numpy exp documentation is not very clear about how Euler's number is calculated.
Because this order is finite rather than infinite, you get this small difference between the two calculations.
I believe the value np.exp(400) is calculated using something like this: (1 + 400/n)**n
>>> (1 + 400/n)**n
5.221642085428121e+173
>>> numpy.exp(400)
5.221469689764144e+173
Here n = 1000000000000, which is very small compared with infinity and gives a relative difference of about 3e-5 (only the first four digits agree).
Indeed, Euler's number has no exact decimal value. Like Pi, it is irrational, so you can only work with an approximate value.

It looks like a rounding issue. In the first case numpy internally uses a very precise value of e, while in the second you start from a less precise value (e rounded to a float). Raising it to the 400th power multiplies that small error roughly 400-fold, so the precision issue becomes more apparent.
The actual result when using the Windows calculator is 5.2214696897641439505887630066496e+173, so you can see your first outcome is fine, while the second is not.
5.2214696897641439505887630066496e+173 // calculator
5.221469689764144e+173 // exp(400)
5.221469689764033e+173 // exp(1)**400
Working back from your result, it looks like it's using a value of e with about 15 digits of precision.
2.7182818284590452353602874713527 // e
2.7182818284590450909589085441968 // 400th root of the 2nd result
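The amplification can be estimated directly. The sketch below is not how numpy computes anything. It only uses the standard decimal module to measure how far the float math.e is from e, then scales that error by 400, which is the first-order effect of raising to the 400th power. The names e_exact, e_double and rel_err are made up for this illustration.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50              # work with 50 significant digits

e_exact = Decimal(1).exp()          # e to 50 digits
e_double = Decimal(math.e)          # exact value of the float closest to e

# Relative error carried by the float, and its first-order growth
# when the float is raised to the 400th power.
rel_err = (e_double - e_exact) / e_exact
amplified = rel_err * 400

print(rel_err)
print(amplified)
```

The amplified estimate comes out at about 2e-14 in magnitude, the same relative size as the difference of 1.1e+160 against 5.2e+173 in the question.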

Related

Is there any way to make this code more efficient?

I have to write code that counts how many numbers have the maximum number of divisors among all numbers between two given numbers (A[0], A[1]), both inclusive. Each case is given as a line of space-separated values, and the first line of the input gives the number of cases. The code works fine but takes some time to execute. Can anyone please help me write it more efficiently?
import numpy as np
from sys import stdin
t=input()
for i in range(int(t)):
    if int(t)<=100 and int(t)>=1:
        divisor=[]
        A=list(map(int,stdin.readline().split(' ')))
        def divisors(n):
            count=0
            for k in range(1,int(n/2)+1):
                if n%k==0:
                    count+=1
            return count
        for j in np.arange(A[0],A[1]+1):
            divisor.append(divisors(j))
        print(divisor.count(max(divisor)))
Sample input:
2
2 9
1 10
Sample Output:
3
4
There is a way to calculate the number of divisors from the prime factorisation of a number.
Given the prime factorisation, counting divisors is faster than trial division (which is what you do here).
But the prime factorisation itself has to be fast. For small numbers, a pre-calculated list of prime numbers (easy to build) makes both the factorisation and the divisor count fast. If you know the upper limit of the numbers you test (let's call it L), then you need the prime numbers up to sqrt(L). Given the prime factorisation of a number n = p_1^e_1 * p_2^e_2 * .. * p_k^e_k, the number of divisors is simply (1+e_1) * (1+e_2) * .. * (1+e_k).
Even better, you can pre-calculate and/or memoize the number of divisors of frequently used numbers up to some limit. This saves a lot of time at the cost of more memory; otherwise you can calculate it directly (for example with the previous method).
Apart from that, you can optimise the code a bit. For example, avoid repeating the int(t) cast (and similar ones): do it once and store the result in a variable.
Numpy can be avoided altogether; it is superfluous here and I doubt it adds any speed advantage.
That should make your code faster, but you always need to measure performance with real tests.
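As a rough sketch of the prime-factorisation approach described above (the function names are hypothetical, and math.isqrt assumes Python 3.8 or later): build the primes up to sqrt(L) once with a sieve, then multiply (1 + e_i) over the exponents of each number. Note one assumption that differs from the question's code: this counts every divisor including n itself, whereas the original loop stops at n/2.

```python
from math import isqrt


def primes_up_to(limit):
    """Sieve of Eratosthenes: return all primes <= limit."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            # Cross out every multiple of p starting at p*p.
            sieve[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def count_divisors(n, primes):
    """Number of divisors of n >= 1; primes must cover up to sqrt(n)."""
    total = 1
    for p in primes:
        if p * p > n:
            break
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        total *= exponent + 1
    if n > 1:
        # One prime factor larger than sqrt(n) is left, with exponent 1.
        total *= 2
    return total


def count_with_max_divisors(lo, hi):
    """How many numbers in [lo, hi] reach the maximum divisor count."""
    primes = primes_up_to(isqrt(hi))
    counts = [count_divisors(n, primes) for n in range(lo, hi + 1)]
    return counts.count(max(counts))
```

The sieve is built once per range here; if many ranges share the same upper limit, building it once for the whole input would save more time, but that should be measured.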

Why does this n choose r python code not work?

These 2 variations of n choose r code give different answers although both follow the correct definition.
I saw that this code works:
import math
def nCr(n,r):
    f = math.factorial
    return f(n) // f(r) // f(n-r)
But mine did not:
import math
def nCr(n,r):
    f = math.factorial
    return int(f(n) / (f(r) * f(n-r)))
The test case nCr(80,20) shows the difference in the results. Please advise why they are different in Python 3, thank you!
There is no error message. The right answer should be 3535316142212174320, but mine returned 3535316142212174336.
That's because int(a / b) isn't the same as a // b.
int(a / b) evaluates a / b first, which is floating-point division. And floating-point numbers are prone to inaccuracies, roundoff errors and the like, as .1 + .2 == 0.30000000000000004. So, at some point, your code attempts to divide really big numbers, which causes roundoff errors since floating-point numbers are of fixed size, and thus cannot be infinitely precise.
a // b is integer division, which is a different thing. Python's integers can be arbitrarily huge, and their division doesn't cause roundoff errors, so you get the correct result.
On the subject of floating-point numbers being of fixed size, take a look at this:
>>> import math
>>> f = math.factorial
>>> f(20) * f(80-20)
20244146256600469630315959326642192021057078172611285900283370710785170642770591744000000000000000000
>>> f(80) / _
3.5353161422121743e+18
A float keeps only 53 significant bits (about 15 to 17 decimal digits), so 3.5353161422121743e+18 carries no information about the true digits after the last 3 in 53...43: there's nowhere to store them. But int(3.5353161422121743e+18) must put something there! What it returns is the exact value of the float, 3535316142212174336, which is the representable value nearest to the true quotient (floats of this size are spaced 512 apart), so that float(int(3.5353161422121743e+18)) == 3.5353161422121743e+18.
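The two divisions can be compared side by side. This is only an illustrative check using the values from the question; math.ulp assumes Python 3.9 or later, and the variable names are made up here.

```python
import math

f = math.factorial
denominator = f(20) * f(60)

exact = f(80) // denominator          # integer division, arbitrary precision
approx = int(f(80) / denominator)     # true division goes through a float

print(exact)                          # 3535316142212174320
print(approx)                         # 3535316142212174336
print(math.ulp(float(exact)))         # 512.0, the gap between neighbouring floats
```

The float result is off by 16, well within the 512-wide gap between floats of this magnitude, so it is the best a float can do.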

Very large float in python

I'm trying to construct a neural network for the MNIST database. When computing the softmax function I receive an error along the lines of "you can't store a float that size".
The code is as follows:
import numpy as np
def softmax(vector):  # REQUIRES a unidimensional numpy array
    adjustedVals = [0] * len(vector)
    totalExp = np.exp(vector)
    print("totalExp equals")
    print(totalExp)
    totalSum = totalExp.sum()
    for i in range(len(vector)):
        adjustedVals[i] = (np.exp(vector[i])) / totalSum
    return adjustedVals  # this throws back an error sometimes?!?!
After some searching, I found that most people recommend using the decimal module. However, when I experimented on the command line with the values in question using this module, that is:
from decimal import Decimal
import math
test = Decimal(math.exp(720))
I receive a similar error for any argument above 709, i.e. math.exp(x) with x > 709:
OverflowError: (34, 'Numerical result out of range')
My conclusion is that even decimal cannot handle this number. Does anyone know of another method I could use to represent these very large floats?
There is a technique which makes the softmax function computationally feasible for this kind of value distribution in your vector. Namely, you can subtract the maximum value in the vector (let's call it x_max) from each of its elements. If you recall the softmax formula, this operation doesn't affect the outcome, as it reduces to multiplying the result by e^(-x_max) / e^(-x_max) = 1. This way the highest intermediate value you get is e^(x_max - x_max) = 1, so you avoid the overflow.
For additional explanation I recommend the following article: https://nolanbconaway.github.io/blog/2017/softmax-numpy
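A minimal sketch of the max-subtraction trick, written with the standard math module so it runs without numpy (the same shift works with np.exp on an array). The function name stable_softmax is made up for this example.

```python
import math


def stable_softmax(values):
    """Softmax that shifts by the maximum so exp() never overflows."""
    x_max = max(values)
    # Every exponent is <= 0, so every term lies in (0, 1].
    exps = [math.exp(v - x_max) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Inputs that would overflow math.exp() if used directly.
probabilities = stable_softmax([720.0, 719.0, 0.0])
print(probabilities)
print(sum(probabilities))
```

Very negative shifted values simply underflow towards 0, which is harmless here because the largest term is always exactly 1.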
With an argument above roughly 709, math.exp exceeds the floating-point range and raises this overflow error.
If, instead of math.exp, you use numpy.exp for such large exponents, you will see that it evaluates to the special value inf (infinity).
All this apart, I wonder why you would want to produce such a big number. I'm not sure you are aware how big it is: just to give you an idea, the number of atoms in the universe is estimated to be in the range of 10 to the power of 80, and the number you are trying to produce is MUCH larger than that.
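Where the 709 threshold comes from can be checked with the standard library alone. This is just an illustrative check, assuming the usual 64-bit IEEE-754 floats; the variable names are made up here.

```python
import math
import sys

largest = sys.float_info.max      # largest finite float, about 1.8e308
limit = math.log(largest)         # largest argument math.exp() can handle

print(largest)
print(limit)                      # a little under 709.8

overflowed = False
try:
    math.exp(710)
except OverflowError as exc:
    overflowed = True
    print("overflow:", exc)
```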

Python floating point precision sum

I have the following array in python
n = [565387674.45, 321772103.48,321772103.48, 214514735.66,214514735.65,
357524559.41]
If I sum all these elements, I get this:
sum(n)
1995485912.1300004
But, this sum should be:
1995485912.13
So I know about floating-point "error". I already used the isclose() function from numpy to compare against the correct value, but
how large can this error be? Is there any way to reduce it?
The main issue is that the error propagates to other operations. For example, the assertion below should hold:
assert (sum(n) - 1995485911) ** 100 - (1995485912.13 - 1995485911) ** 100 == 0.
This is a problem inherent to floating-point numbers. One solution is to represent them as strings and use the decimal module:
n = ['565387674.45', '321772103.48', '321772103.48', '214514735.66', '214514735.65',
'357524559.41']
from decimal import Decimal
s = sum(Decimal(i) for i in n)
print(s)
Prints:
1995485912.13
You could also use the round(num, n) function, which rounds a number to the desired number of decimal places. So in your example you would use round(sum(n), 2).
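On the question of how large the error can be, one way to get a feel for it is to look at the spacing between neighbouring floats near the result. The sketch below assumes Python 3.9 or later for math.ulp, and the names total, expected and spacing are made up for this illustration.

```python
import math

n = [565387674.45, 321772103.48, 321772103.48, 214514735.66, 214514735.65,
     357524559.41]

total = sum(n)
expected = 1995485912.13

# Gap between adjacent floats of this magnitude (2**-22).
spacing = math.ulp(expected)
print(spacing)                                        # 2.384185791015625e-07

# The sum is off by at most a few such gaps, so compare with a tolerance.
print(abs(total - expected) / spacing)
print(math.isclose(total, expected, rel_tol=1e-12))   # True
```

A tolerance-based comparison like math.isclose works for checks, while the Decimal approach above is the one to use when the cents themselves must be exact.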

math.sqrt function python gives same result for two different values [duplicate]

Why does the math module return the wrong result?
First test
A = 12345678917
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)
Result
A = 12345678917
B = 12345678917
Here, the result is correct.
Second test
A = 123456758365483459347856
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)
Result
A = 123456758365483459347856
B = 123456758365483467538432
Here the result is incorrect.
Why is that the case?
Because math.sqrt(..) first casts the number to a floating-point value, and floats have a limited mantissa: it can only represent the leading part of the number exactly. So float(A**2) is not equal to A**2. Next it calculates the square root, which is also only approximately correct.
Most functions working with floating-point numbers will never agree fully with their integer counterparts. Floating-point calculations are inherently approximate.
If one calculates A**2 one gets:
>>> 12345678917**2
152415787921658292889L
Now if one converts it to a float(..), one gets:
>>> float(12345678917**2)
1.5241578792165828e+20
But if you now ask whether the two are equal:
>>> float(12345678917**2) == 12345678917**2
False
So information has been lost while converting it to a float.
You can read more about how floats work and why they are approximate in the Wikipedia article about IEEE-754, the formal definition of how floating-point numbers work.
The documentation for the math module states "It provides access to the mathematical functions defined by the C standard." It also states "Except when explicitly noted otherwise, all return values are floats."
Those together mean that the parameter to the square root function is a float value. In most systems that means a floating point value that fits into 8 bytes, which is called "double" in the C language. Your code converts your integer value into such a value before calculating the square root, then returns such a value.
However, the 8-byte floating point value can store at most 15 to 17 significant decimal digits. That is what you are getting in your results.
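These limits can be inspected from Python itself. A small illustrative check, assuming the usual 8-byte IEEE-754 double (variable names are made up here):

```python
import sys

mantissa_bits = sys.float_info.mant_dig    # 53 bits of precision
safe_digits = sys.float_info.dig           # 15 decimal digits always survive

print(mantissa_bits, safe_digits)

# Beyond 2**53, consecutive integers can no longer be told apart as floats.
collides = float(2**53) == float(2**53 + 1)
print(collides)                            # True
```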
If you want better precision in your square roots, use a function that is guaranteed to give full precision for an integer argument. Just do a web search and you will find several. Those usually use a variation of the Newton-Raphson method to iterate and eventually arrive at the correct answer. Be aware that this is significantly slower than the math module's sqrt function.
Here is a routine that I modified from the internet. I can't cite the source right now. This version also works for non-integer arguments but just returns the integer part of the square root.
def isqrt(x):
    """Return the integer part of the square root of x, even for very
    large values."""
    if x < 0:
        raise ValueError('square root not defined for negative numbers')
    n = int(x)
    if n == 0:
        return 0
    a, b = divmod(n.bit_length(), 2)
    x = (1 << (a+b)) - 1
    while True:
        y = (x + n//x) // 2
        if y >= x:
            return x
        x = y
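On Python 3.8 and later, the standard library's math.isqrt does the same job as the routine above for non-negative integers. A quick check with the value from the question (variable names are made up for this example):

```python
import math

A = 123456758365483459347856

exact_root = math.isqrt(A ** 2)        # integer arithmetic only
float_root = int(math.sqrt(A ** 2))    # goes through a 53-bit float

print(exact_root == A)                 # True
print(float_root == A)                 # False
```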
If you want to calculate sqrt of really large numbers and you need exact results, you can use sympy:
import sympy
num = sympy.Integer(123456758365483459347856)
print(int(num) == int(sympy.sqrt(num**2)))
The way floating-point numbers are stored in memory makes calculations with them prone to slight errors that can nevertheless be significant when exact results are needed. As mentioned in one of the comments, the decimal library can help you here:
>>> from decimal import Decimal
>>> A = Decimal(123456758365483459347856)
>>> A
Decimal('123456758365483459347856')
>>> B = A.sqrt()**2
>>> B
Decimal('123456758365483459347856.0000')
>>> A == B
True
>>> int(B)
123456758365483459347856
I use version 3.6, which has no hardcoded limit on the size of integers. I don't know if, in 2.7, casting B as an int would cause overflow, but decimal is incredibly useful regardless.
