I want my sigmoid to never print a solid 1 or 0, but to actually print the exact value
i tried using
torch.set_printoptions(precision=20)
but it didn't work. here's a sample output of the sigmoid function :
before sigmoid : tensor([[21.2955703735]])
after sigmoid : tensor([[1.]])
but i don't want it to print 1, i want it to print the exact number, how can i force this?
The difference between 1 and the exact value of sigmoid(21.2955703735) is on the order of 5e-10, which is significantly less than machine epsilon for float32 (which is about 1.19e-7). Therefore 1.0 is the best approximation that can be achieved with the default precision. You can cast your tensor to a float64 (AKA double precision) tensor to get a more precise estimate.
torch.set_printoptions(precision=20)
x = torch.tensor([21.2955703735])
result = torch.sigmoid(x.to(dtype=torch.float64))
print(result)
which results in
tensor([0.99999999943577644324], dtype=torch.float64)
Keep in mind that even with 64-bit floating point computation this is only accurate to about 6 digits past the last 9 (and will be even less precise for larger sigmoid inputs). A better way to represent numbers very close to one is to directly compute the difference between 1 and the value. In this case 1 - sigmoid(x) which is equivalent to 1 / (1 + exp(x)) or sigmoid(-x). For example,
x = torch.tensor([21.2955703735])
delta = torch.sigmoid(-x.to(dtype=torch.float64))
print(f'sigmoid({x.item()}) = 1 - {delta.item()}')
results in
sigmoid(21.295570373535156) = 1 - 5.642236648842976e-10
and is a more accurate representation of your desired result (though still not exact).
Related
Why is why is np.exp(x) not equal to np.exp(1)**x?
For example:
np.exp(400)
>>>5.221469689764144e+173
np.exp(1)**400
>>>5.221469689764033e+173
np.exp(400)-np.exp(1)**400
>>>1.1093513018771065e+160
This is optimisation of numpy that raise this diff.
Indeed, you have to understand how is calculated the Euler number in math:
e = (1/n)**n with n == inf.
I think numpy stop at a certain order:
You have in the numpy exp documentation here that is not very clear about how the Euler number is calculated.
Because of this order that is not equal to infinity, you have this small difference in the two calculations.
Indeed the value np.exp(400) is calculated using this: (1 + 400/n)**n
>>> (1 + 400/n)**n
5.221642085428121e+173
>>> numpy.exp(400)
5.221469689764144e+173
Here you have n = 1000000000000 wich is very small and raise this difference at 10e-5.
Indeed there is no exact value of the Euler number. Like Pi, you can only have an approched value.
It looks like a rounding issue. In the first case it's internally using a very precise value of e, while in the second you get a less precise value, which when multiplied 400 times the precision issues become more apparent.
The actual result when using the Windows calculator is 5.2214696897641439505887630066496e+173, so you can see your first outcome is fine, while the second is not.
5.2214696897641439505887630066496e+173 // calculator
5.221469689764144e+173 // exp(400)
5.221469689764033e+173 // exp(1)**400
Starting from your result, it looks it's using a value with 15 digits of precision.
2.7182818284590452353602874713527 // e
2.7182818284590450909589085441968 // 400th root of the 2nd result
I am writing a program where I need to delete duplicate points stored in a matrix. The problem is that when it comes to check whether those points are in the matrix, MATLAB can't recognize them in the matrix although they exist.
In the following code, intersections function gets the intersection points:
[points(:,1), points(:,2)] = intersections(...
obj.modifiedVGVertices(1,:), obj.modifiedVGVertices(2,:), ...
[vertex1(1) vertex2(1)], [vertex1(2) vertex2(2)]);
The result:
>> points
points =
12.0000 15.0000
33.0000 24.0000
33.0000 24.0000
>> vertex1
vertex1 =
12
15
>> vertex2
vertex2 =
33
24
Two points (vertex1 and vertex2) should be eliminated from the result. It should be done by the below commands:
points = points((points(:,1) ~= vertex1(1)) | (points(:,2) ~= vertex1(2)), :);
points = points((points(:,1) ~= vertex2(1)) | (points(:,2) ~= vertex2(2)), :);
After doing that, we have this unexpected outcome:
>> points
points =
33.0000 24.0000
The outcome should be an empty matrix. As you can see, the first (or second?) pair of [33.0000 24.0000] has been eliminated, but not the second one.
Then I checked these two expressions:
>> points(1) ~= vertex2(1)
ans =
0
>> points(2) ~= vertex2(2)
ans =
1 % <-- It means 24.0000 is not equal to 24.0000?
What is the problem?
More surprisingly, I made a new script that has only these commands:
points = [12.0000 15.0000
33.0000 24.0000
33.0000 24.0000];
vertex1 = [12 ; 15];
vertex2 = [33 ; 24];
points = points((points(:,1) ~= vertex1(1)) | (points(:,2) ~= vertex1(2)), :);
points = points((points(:,1) ~= vertex2(1)) | (points(:,2) ~= vertex2(2)), :);
The result as expected:
>> points
points =
Empty matrix: 0-by-2
The problem you're having relates to how floating-point numbers are represented on a computer. A more detailed discussion of floating-point representations appears towards the end of my answer (The "Floating-point representation" section). The TL;DR version: because computers have finite amounts of memory, numbers can only be represented with finite precision. Thus, the accuracy of floating-point numbers is limited to a certain number of decimal places (about 16 significant digits for double-precision values, the default used in MATLAB).
Actual vs. displayed precision
Now to address the specific example in the question... while 24.0000 and 24.0000 are displayed in the same manner, it turns out that they actually differ by very small decimal amounts in this case. You don't see it because MATLAB only displays 4 significant digits by default, keeping the overall display neat and tidy. If you want to see the full precision, you should either issue the format long command or view a hexadecimal representation of the number:
>> pi
ans =
3.1416
>> format long
>> pi
ans =
3.141592653589793
>> num2hex(pi)
ans =
400921fb54442d18
Initialized values vs. computed values
Since there are only a finite number of values that can be represented for a floating-point number, it's possible for a computation to result in a value that falls between two of these representations. In such a case, the result has to be rounded off to one of them. This introduces a small machine-precision error. This also means that initializing a value directly or by some computation can give slightly different results. For example, the value 0.1 doesn't have an exact floating-point representation (i.e. it gets slightly rounded off), and so you end up with counter-intuitive results like this due to the way round-off errors accumulate:
>> a=sum([0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1]); % Sum 10 0.1s
>> b=1; % Initialize to 1
>> a == b
ans =
logical
0 % They are unequal!
>> num2hex(a) % Let's check their hex representation to confirm
ans =
3fefffffffffffff
>> num2hex(b)
ans =
3ff0000000000000
How to correctly handle floating-point comparisons
Since floating-point values can differ by very small amounts, any comparisons should be done by checking that the values are within some range (i.e. tolerance) of one another, as opposed to exactly equal to each other. For example:
a = 24;
b = 24.000001;
tolerance = 0.001;
if abs(a-b) < tolerance, disp('Equal!'); end
will display "Equal!".
You could then change your code to something like:
points = points((abs(points(:,1)-vertex1(1)) > tolerance) | ...
(abs(points(:,2)-vertex1(2)) > tolerance),:)
Floating-point representation
A good overview of floating-point numbers (and specifically the IEEE 754 standard for floating-point arithmetic) is What Every Computer Scientist Should Know About Floating-Point Arithmetic by David Goldberg.
A binary floating-point number is actually represented by three integers: a sign bit s, a significand (or coefficient/fraction) b, and an exponent e. For double-precision floating-point format, each number is represented by 64 bits laid out in memory as follows:
The real value can then be found with the following formula:
This format allows for number representations in the range 10^-308 to 10^308. For MATLAB you can get these limits from realmin and realmax:
>> realmin
ans =
2.225073858507201e-308
>> realmax
ans =
1.797693134862316e+308
Since there are a finite number of bits used to represent a floating-point number, there are only so many finite numbers that can be represented within the above given range. Computations will often result in a value that doesn't exactly match one of these finite representations, so the values must be rounded off. These machine-precision errors make themselves evident in different ways, as discussed in the above examples.
In order to better understand these round-off errors it's useful to look at the relative floating-point accuracy provided by the function eps, which quantifies the distance from a given number to the next largest floating-point representation:
>> eps(1)
ans =
2.220446049250313e-16
>> eps(1000)
ans =
1.136868377216160e-13
Notice that the precision is relative to the size of a given number being represented; larger numbers will have larger distances between floating-point representations, and will thus have fewer digits of precision following the decimal point. This can be an important consideration with some calculations. Consider the following example:
>> format long % Display full precision
>> x = rand(1, 10); % Get 10 random values between 0 and 1
>> a = mean(x) % Take the mean
a =
0.587307428244141
>> b = mean(x+10000)-10000 % Take the mean at a different scale, then shift back
b =
0.587307428244458
Note that when we shift the values of x from the range [0 1] to the range [10000 10001], compute a mean, then subtract the mean offset for comparison, we get a value that differs for the last 3 significant digits. This illustrates how an offset or scaling of data can change the accuracy of calculations performed on it, which is something that has to be accounted for with certain problems.
Look at this article: The Perils of Floating Point. Though its examples are in FORTRAN it has sense for virtually any modern programming language, including MATLAB. Your problem (and solution for it) is described in "Safe Comparisons" section.
type
format long g
This command will show the FULL value of the number. It's likely to be something like 24.00000021321 != 24.00000123124
Try writing
0.1 + 0.1 + 0.1 == 0.3.
Warning: You might be surprised about the result!
Maybe the two numbers are really 24.0 and 24.000000001 but you're not seeing all the decimal places.
Check out the Matlab EPS function.
Matlab uses floating point math up to 16 digits of precision (only 5 are displayed).
I'm trying to construct a neural network for the Mnist database. When computing the softmax function I receive an error to the same ends as "you can't store a float that size"
code is as follows:
def softmax(vector): # REQUIRES a unidimensional numpy array
adjustedVals = [0] * len(vector)
totalExp = np.exp(vector)
print("totalExp equals")
print(totalExp)
totalSum = totalExp.sum()
for i in range(len(vector)):
adjustedVals[i] = (np.exp(vector[i])) / totalSum
return adjustedVals # this throws back an error sometimes?!?!
After inspection, most recommend using the decimal module. However when I've messed around with the values being used in the command line with this module, that is:
from decimal import Decimal
import math
test = Decimal(math.exp(720))
I receive a similar error for any values which are math.exp(>709).
OverflowError: (34, 'Numerical result out of range')
My conclusion is that even decimal cannot handle this number. Does anyone know of another method I could use to represent these very large floats.
There is a technique which makes the softmax function more feasible computationally for a certain kind of value distribution in your vector. Namely, you can subtract the maximum value in the vector (let's call it x_max) from each of its elements. If you recall the softmax formula, such operation doesn't affect the outcome as it reduced to multiplication of the result by e^(x_max) / e^(x_max) = 1. This way the highest intermediate value you get is e^(x_max - x_max) = 1 so you avoid the overflow.
For additional explanation I recommend the following article: https://nolanbconaway.github.io/blog/2017/softmax-numpy
With a value above 709 the function 'math.exp' exceeds the floating point range and throws this overflow error.
If, instead of math.exp, you use numpy.exp for such large exponents you will see that it evaluates to the special value inf (infinity).
All this apart, I wonder why you would want to produce such a big number (not sure you are aware how big it is. Just to give you an idea, the number of atoms in the universe is estimated to be in the range of 10 to the power of 80. The number you are trying to produce is MUCH larger than that).
Why does the math module return the wrong result?
First test
A = 12345678917
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)
Result
A = 12345678917
B = 12345678917
Here, the result is correct.
Second test
A = 123456758365483459347856
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)
Result
A = 123456758365483459347856
B = 123456758365483467538432
Here the result is incorrect.
Why is that the case?
Because math.sqrt(..) first casts the number to a floating point and floating points have a limited mantissa: it can only represent part of the number correctly. So float(A**2) is not equal to A**2. Next it calculates the math.sqrt which is also approximately correct.
Most functions working with floating points will never be fully correct to their integer counterparts. Floating point calculations are almost inherently approximative.
If one calculates A**2 one gets:
>>> 12345678917**2
152415787921658292889L
Now if one converts it to a float(..), one gets:
>>> float(12345678917**2)
1.5241578792165828e+20
But if you now ask whether the two are equal:
>>> float(12345678917**2) == 12345678917**2
False
So information has been lost while converting it to a float.
You can read more about how floats work and why these are approximative in the Wikipedia article about IEEE-754, the formal definition on how floating points work.
The documentation for the math module states "It provides access to the mathematical functions defined by the C standard." It also states "Except when explicitly noted otherwise, all return values are floats."
Those together mean that the parameter to the square root function is a float value. In most systems that means a floating point value that fits into 8 bytes, which is called "double" in the C language. Your code converts your integer value into such a value before calculating the square root, then returns such a value.
However, the 8-byte floating point value can store at most 15 to 17 significant decimal digits. That is what you are getting in your results.
If you want better precision in your square roots, use a function that is guaranteed to give full precision for an integer argument. Just do a web search and you will find several. Those usually do a variation of the Newton-Raphson method to iterate and eventually end at the correct answer. Be aware that this is significantly slower that the math module's sqrt function.
Here is a routine that I modified from the internet. I can't cite the source right now. This version also works for non-integer arguments but just returns the integer part of the square root.
def isqrt(x):
"""Return the integer part of the square root of x, even for very
large values."""
if x < 0:
raise ValueError('square root not defined for negative numbers')
n = int(x)
if n == 0:
return 0
a, b = divmod(n.bit_length(), 2)
x = (1 << (a+b)) - 1
while True:
y = (x + n//x) // 2
if y >= x:
return x
x = y
If you want to calculate sqrt of really large numbers and you need exact results, you can use sympy:
import sympy
num = sympy.Integer(123456758365483459347856)
print(int(num) == int(sympy.sqrt(num**2)))
The way floating-point numbers are stored in memory makes calculations with them prone to slight errors that can nevertheless be significant when exact results are needed. As mentioned in one of the comments, the decimal library can help you here:
>>> A = Decimal(12345678917)
>>> A
Decimal('123456758365483459347856')
>>> B = A.sqrt()**2
>>> B
Decimal('123456758365483459347856.0000')
>>> A == B
True
>>> int(B)
123456758365483459347856
I use version 3.6, which has no hardcoded limit on the size of integers. I don't know if, in 2.7, casting B as an int would cause overflow, but decimal is incredibly useful regardless.
So I am ultimately trying to use Horner's rule (http://mathworld.wolfram.com/HornersRule.html) to evaluate polynomials, and creating a function to evaluate the polynomial. Whatever, so my problem is with how I wrote the function; it works for easy polynomials like 3x^2 + 2x^1 + 5 and so on. But once you get to evaluating a polynomial with a floating point number (something crazy like 1.8953343e-20, etc. ) it loses it's precision.
Because I am using this function to evaluate roots of a polynomial using Newton's Method (http://tutorial.math.lamar.edu/Classes/CalcI/NewtonsMethod.aspx), I need this to be precise, so it doesn't lose it's value through a small rounding error and whatnot.
I have already troubleshooted with two other people that the problem lies within the evaluatePoly() function, and not my other functions that evaluates Newton's Method. Also, I originally evaluated the polynomial normally (multiplying x to the degree, multiplying by constant, etc.) and it pulled out the correct answer. However, the assignment requires one to use Horner's rule for easier calculation.
This is my following code:
def evaluatePoly(poly, x_):
"""Evaluates the polynomial at x = x_ and returns the result as a floating
point number using Horner's rule"""
#http://mathworld.wolfram.com/HornersRule.html
total = 0.
polyRev = poly[::-1]
for nn in polyRev:
total = total * x_
total = total + nn
return total
Note: I have already tried setting nn, x_, (total * x_) as floats using float().
This is the output I am receiving:
Polynomial: 5040x^0 + 1602x^1 + 1127x^2 - 214x^3 - 75x^4 + 4x^5 + 1x^6
Derivative: 1602x^0 + 2254x^1 - 642x^2 - 300x^3 + 20x^4 + 6x^5
(6.9027369297630505, False)
Starting at 100.00, no root found, last estimate was 6.90, giving value f(6.90) = -6.366463e-12
(6.9027369297630505, False)
Starting at 10.00, no root found, last estimate was 6.90, giving value f(6.90) = -6.366463e-12
(-2.6575456505038764, False)
Starting at 0.00, no root found, last estimate was -2.66, giving value f(-2.66) = 8.839758e+03
(-8.106973924480215, False)
Starting at -10.00, no root found, last estimate was -8.11, giving value f(-8.11) = -1.364242e-11
(-8.106973924480215, False)
Starting at -100.00, no root found, last estimate was -8.11, giving value f(-8.11) = -1.364242e-11
This is the output I need:
Polynomial: 5040x^0 + 1602x^1 + 1127x^2 - 214x^3 - 75x^4 + 4x^5 + 1x^6
Derivative: 1602x^0 + 2254x^1 - 642x^2 - 300x^3 + 20x^4 + 6x^5
(6.9027369297630505, False)
Starting at 100.00, no root found, last estimate was 6.90,giving value f(6.90) = -2.91038e-11
(6.9027369297630505, False)
Starting at 10.00, no root found, last estimate was 6.90,giving value f(6.90) = -2.91038e-11
(-2.657545650503874, False)
Starting at 0.00, no root found, last estimate was -2.66,giving value f(-2.66) = 8.83976e+03
(-8.106973924480215, True)
Starting at -10.00, root found at x = -8.11, giving value f(-8.11)= 0.000000e+00
(-8.106973924480215, True)
Starting at -100.00, root found at x = -8.11, giving value f(-8.11)= 0.000000e+00
Note: Please ignore the tuples of the errored output; That is the result of my newton's method, where the first result is the root and the second result is indicating whether it is a root or not.
Try this:
def evaluatePoly(poly, x_):
'''Evaluate the polynomial poly at x = x_ and return the result as a
floating-point number using Horner's rule'''
total= 0
degree =0
for coef in poly:
total += (x_**degree) * coef
degree += 1
Evaluation of a polynomial close to a root requires, by construction of the problem, that large terms cancel to yield a small result. The size of the intermediate terms can be bounded in a worst-case sense by the evaluation of the polynomial that has all its coefficients set to the absolute values of the coefficients of the original polynomial. For the first root that gives
In [6]: x0=6.9027369297630505
In [7]: evaluatePoly(poly,x0)
Out[7]: -6.366462912410498e-12
In [8]: evaluatePoly([abs(c) for c in poly],abs(x0))
Out[8]: 481315.82997756737
This value is a first estimate for the magnification factor of the floating point errors, multiplied with the machine epsilon 2.22e-16 this gives a bound on the accumulated floating point errors of any evaluation method of 1.07e-10 and indeed both evaluation methods give values comfortably below this bound, indicating that the root was found within the capabilities of the floating point format.
Looking at the graph of the evaluation around x0 one sees that the basic assumption of a smooth curve fails at that magnification, and the x-axis is crossed in a jump so that no better value for x0 can be found: