Hello I am trying to understant why after this operation:
a = np.array([[1, 2], [3, 4]])
ainv = inv(a)
print(np.dot(a,ainv))
I am getting:
[[1.0000000e+00 0.0000000e+00]
[8.8817842e-16 1.0000000e+00]]
Since I am using the a's inverse matrix I think that I shoud get:
[[1,0],[0,1]]
SO I would like support to understand the result
a = np.array([[1.0, 2.0], [3.0, 4.0]])
ainv = np.linalg.inv(a) #[[-2.0, 1.0],[1.5, -0.5]]
print(np.dot(a,ainv))
Yields as you discovered:
[[1.0000000e+00 0.0000000e+00]
[8.8817842e-16 1.0000000e+00]]
Lets look at the type of the array elements
type(ainv[1][1])
Shows us that the type of the array is
numpy.float64
Lets look at the numpy precision for this type
numpy.finfo(numpy.float64).precision
Numpy says the aproximate number of decimal digits to which this kind of float is precise is 15.
15
For curiosity, we can also look at the machine epsilon for the type;
np.finfo(np.float64).eps
Which yields the smallest number n where 1 +n is indistinguishable from 1
2.220446049250313e-16
So even though the number you get is technically distinguishable from 0 for the datatype, the overall precision is 15 decimals, calculations on large matrices might compound floating point imprecision even further.
That is the identity matrix, almost. You are getting numbers very close to zero instead of zero, which is a common issue with floating point numbers since they are only a finite approximation of real numbers. For all practical purposes 8.8e-16 or 0.00000000000000088 is ~ zero.
Related
In math, you are allowed to take cubic roots of negative numbers, because a negative number multiplied by two other negative numbers results in a negative number. Raising something to a fractional power 1/n is the same as taking the nth root of it. Therefore, the cubic root of -27, or (-27)**(1.0/3.0) comes out to -3.
But in Python 2, when I type in (-27)**(1.0/3.0), it gives me an error:
Traceback (most recent call last):
File "python", line 1, in <module>
ValueError: negative number cannot be raised to a fractional power
Python 3 doesn't produce an exception, but it gives a complex number that doesn't look anything like -3:
>>> (-27)**(1.0/3.0)
(1.5000000000000004+2.598076211353316j)
Why don't I get the result that makes mathematical sense? And is there a workaround for this?
-27 has a real cube root (and two non-real cube roots), but (-27)**(1.0/3.0) does not mean "take the real cube root of -27".
First, 1.0/3.0 doesn't evaluate to exactly one third, due to the limits of floating-point representation. It evaluates to exactly
0.333333333333333314829616256247390992939472198486328125
though by default, Python won't print the exact value.
Second, ** is not a root-finding operation, whether real roots or principal roots or some other choice. It is the exponentiation operator. General exponentiation of negative numbers to arbitrary real powers is messy, and the usual definitions don't match with real nth roots; for example, the usual definition of (-27)^(1/3) would give you the principal root, a complex number, not -3.
Python 2 decides that it's probably better to raise an error for stuff like this unless you make your intentions explicit, for example by exponentiating the absolute value and then applying the sign:
def real_nth_root(x, n):
# approximate
# if n is even, x must be non-negative, and we'll pick the non-negative root.
if n % 2 == 0 and x < 0:
raise ValueError("No real root.")
return (abs(x) ** (1.0/n)) * (-1 if x < 0 else 1)
or by using complex exp and log to take the principal root:
import cmath
def principal_nth_root(x, n):
# still approximate
return cmath.exp(cmath.log(x)/n)
or by just casting to complex for complex exponentiation (equivalent to the exp-log thing up to rounding error):
>>> complex(-27)**(1.0/3.0)
(1.5000000000000004+2.598076211353316j)
Python 3 uses complex exponentiation for negative-number-to-noninteger, which gives the principal nth root for y == 1.0/n:
>>> (-27)**(1/3) # Python 3
(1.5000000000000004+2.598076211353316j)
The type coercion rules documented by builtin pow apply here, since you're using a float for the exponent.
Just make sure that either the base or the exponent is a complex instance and it works:
>>> (-27+0j)**(1.0/3.0)
(1.5000000000000004+2.598076211353316j)
>>> (-27)**(complex(1.0/3.0))
(1.5000000000000004+2.598076211353316j)
To find all three roots, consider numpy:
>>> import numpy as np
>>> np.roots([1, 0, 0, 27])
array([-3.0+0.j , 1.5+2.59807621j, 1.5-2.59807621j])
The list [1, 0, 0, 27] here refers to the coefficients of the equation 1x³ + 0x² + 0x + 27.
I do not think Python, or your version of it, supports this function. I pasted the same equation into my Python interpreter, (IDLE) and it solved it, with no errors. I am using Python 3.2.
I want my sigmoid to never print a solid 1 or 0, but to actually print the exact value
i tried using
torch.set_printoptions(precision=20)
but it didn't work. here's a sample output of the sigmoid function :
before sigmoid : tensor([[21.2955703735]])
after sigmoid : tensor([[1.]])
but i don't want it to print 1, i want it to print the exact number, how can i force this?
The difference between 1 and the exact value of sigmoid(21.2955703735) is on the order of 5e-10, which is significantly less than machine epsilon for float32 (which is about 1.19e-7). Therefore 1.0 is the best approximation that can be achieved with the default precision. You can cast your tensor to a float64 (AKA double precision) tensor to get a more precise estimate.
torch.set_printoptions(precision=20)
x = torch.tensor([21.2955703735])
result = torch.sigmoid(x.to(dtype=torch.float64))
print(result)
which results in
tensor([0.99999999943577644324], dtype=torch.float64)
Keep in mind that even with 64-bit floating point computation this is only accurate to about 6 digits past the last 9 (and will be even less precise for larger sigmoid inputs). A better way to represent numbers very close to one is to directly compute the difference between 1 and the value. In this case 1 - sigmoid(x) which is equivalent to 1 / (1 + exp(x)) or sigmoid(-x). For example,
x = torch.tensor([21.2955703735])
delta = torch.sigmoid(-x.to(dtype=torch.float64))
print(f'sigmoid({x.item()}) = 1 - {delta.item()}')
results in
sigmoid(21.295570373535156) = 1 - 5.642236648842976e-10
and is a more accurate representation of your desired result (though still not exact).
I'm calculating a covariance matrix from a 2D array using np.cov, and using it to get nearest neighbors with Mahalanobis distance.
c = np.cov(arr)
neigh = NearestNeighbors(100,metric='mahalanobis',metric_params = {'VI':np.linalg.inv(c)})
neigh.fit(dfeatures)
But for some reason, I'm getting
/lib/python3.4/site-packages/sklearn/externals/joblib/parallel.py:131: RuntimeWarning: invalid value encountered in sqrt
and the values of the distance of any query point returns nan.
Instead of passing c to NearestNeighbors, if I pass an identity matrix the NearestNeighbors works as expected. I suspected that c might actually not be positive semidefinite and therefore the values in the sqrt in Mahalanobis distance might get a negative value as input.
I checked the eigenvalue of resulting c and many of them turned out to be negative(and complex) but close to 0.
I'd a few questions:
Is this totally because of the numerical errors(or am I doing something wrong)?
If it is because of numerical errors is there a way to fix it?
Turns out this is in-fact because of numerical error. A workaround to correct this is to add a small number to diagonal element of covariance matrix. The larger this number the closer the distance will be to euclidean distance, so one must be careful while choosing this number.
Why does the math module return the wrong result?
First test
A = 12345678917
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)
Result
A = 12345678917
B = 12345678917
Here, the result is correct.
Second test
A = 123456758365483459347856
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)
Result
A = 123456758365483459347856
B = 123456758365483467538432
Here the result is incorrect.
Why is that the case?
Because math.sqrt(..) first casts the number to a floating point and floating points have a limited mantissa: it can only represent part of the number correctly. So float(A**2) is not equal to A**2. Next it calculates the math.sqrt which is also approximately correct.
Most functions working with floating points will never be fully correct to their integer counterparts. Floating point calculations are almost inherently approximative.
If one calculates A**2 one gets:
>>> 12345678917**2
152415787921658292889L
Now if one converts it to a float(..), one gets:
>>> float(12345678917**2)
1.5241578792165828e+20
But if you now ask whether the two are equal:
>>> float(12345678917**2) == 12345678917**2
False
So information has been lost while converting it to a float.
You can read more about how floats work and why these are approximative in the Wikipedia article about IEEE-754, the formal definition on how floating points work.
The documentation for the math module states "It provides access to the mathematical functions defined by the C standard." It also states "Except when explicitly noted otherwise, all return values are floats."
Those together mean that the parameter to the square root function is a float value. In most systems that means a floating point value that fits into 8 bytes, which is called "double" in the C language. Your code converts your integer value into such a value before calculating the square root, then returns such a value.
However, the 8-byte floating point value can store at most 15 to 17 significant decimal digits. That is what you are getting in your results.
If you want better precision in your square roots, use a function that is guaranteed to give full precision for an integer argument. Just do a web search and you will find several. Those usually do a variation of the Newton-Raphson method to iterate and eventually end at the correct answer. Be aware that this is significantly slower that the math module's sqrt function.
Here is a routine that I modified from the internet. I can't cite the source right now. This version also works for non-integer arguments but just returns the integer part of the square root.
def isqrt(x):
"""Return the integer part of the square root of x, even for very
large values."""
if x < 0:
raise ValueError('square root not defined for negative numbers')
n = int(x)
if n == 0:
return 0
a, b = divmod(n.bit_length(), 2)
x = (1 << (a+b)) - 1
while True:
y = (x + n//x) // 2
if y >= x:
return x
x = y
If you want to calculate sqrt of really large numbers and you need exact results, you can use sympy:
import sympy
num = sympy.Integer(123456758365483459347856)
print(int(num) == int(sympy.sqrt(num**2)))
The way floating-point numbers are stored in memory makes calculations with them prone to slight errors that can nevertheless be significant when exact results are needed. As mentioned in one of the comments, the decimal library can help you here:
>>> A = Decimal(12345678917)
>>> A
Decimal('123456758365483459347856')
>>> B = A.sqrt()**2
>>> B
Decimal('123456758365483459347856.0000')
>>> A == B
True
>>> int(B)
123456758365483459347856
I use version 3.6, which has no hardcoded limit on the size of integers. I don't know if, in 2.7, casting B as an int would cause overflow, but decimal is incredibly useful regardless.
In the ghci terminal, I was computing some equations with Haskell using the sqrt function.
I notice that I would sometimes lose precision in my sqrt result, when it was supposed to be simplified.
For example,
sqrt 4 * sqrt 4 = 4 -- This works well!
sqrt 2 * sqrt 2 = 2.0000000000000004 -- Not the exact result.
Normally, I would expect a result of 2.
Is there a way to get the right simplification result?
How does that work in Haskell?
There are usable precise number libraries in Haskell. Two that come to mind are cyclotomic and the CReal module in the numbers package. (Cyclotomic numbers don't support all the operations on complex numbers that you might like, but square roots of integers and rationals are in the domain.)
>>> import Data.Complex.Cyclotomic
>>> sqrtInteger 2
e(8) - e(8)^3
>>> toReal $ sqrtInteger 2
Just 1.414213562373095 -- Maybe Double
>>> sqrtInteger 2 * sqrtInteger 2
2
>>> toReal $ sqrtInteger 2 * sqrtInteger 2
Just 2.0
>>> rootsQuadEq 3 2 1
Just (-1/3 + 1/3*e(8) + 1/3*e(8)^3,-1/3 - 1/3*e(8) - 1/3*e(8)^3)
>>> let eq x = 3*x*x + 2*x + 1
>>> eq (-1/3 + 1/3*e(8) + 1/3*e(8)^3)
0
>>> import Data.Number.CReal
>>> sqrt 2 :: CReal
1.4142135623730950488016887242096980785697 -- Show instance cuts off at 40th place
>>> sqrt 2 * sqrt 2 :: CReal
2.0
>>> sin 3 :: CReal
0.1411200080598672221007448028081102798469
>>> sin 3*sin 3 + cos 3*cos 3 :: CReal
1.0
You do not lose precision. You have limited precision.
The square root of 2 is a real number but not a rational number, therefore it's value cannot be represented exactly by any computer (except representing it symbolically, of course).
Even if you define a very large precision type, it will not be able to represent the square root of 2 exactly. You may get more precision, but never enough to represent that value exactly (unless you have a computer with infinite memory, in which case please hire me).
The explanation for these results lies in the type of the values returned by the sqrt function:
> :t sqrt
sqrt :: Floating a => a -> a
The Floating a means that the value returned belongs to the Floating type class.
The values of all types belonging to this class are stored as floating point numbers. These sacrifice precision for the sake of covering a larger range of numbers.
Double precision floating point numbers can cover very large ranges but they have limited precision and cannot encode all possible numbers. The square root of 2 (√2) is one such number:
> sqrt 2
1.4142135623730951
> sqrt 2 + 0.000000000000000001
1.4142135623730951
As you see above, it is impossible for double precision floating point numbers to be precise enough to represent √2 + 0.000000000000000001, it is simply rounded to the closest approximation which can be expressed using floating point encoding.
As mentioned by another poster, √2 is an irrational number which can be simplified to mean that it requires an infinite number of digits to represent correctly. As such it cannot be represented faithfully using floating point numbers. This leads to errors such as the one you noticed when multiplying it with itself.
You can learn about floating points on their wikipedia page: http://en.wikipedia.org/wiki/Floating_point.
I especially recommend that you read the answer to this other Stack Overflow question: Floating Point Limitations and follow the mentioned link, it will help you understand what's going on under the hood.
Note that this is a problem in every language, not just Haskell. One way to get rid of it entirely is to use symbolic computation libraries but they are much slower than the floating point numbers offered by CPUs. For many computations the loss of precision due to floating points is not a problem.