Runtime Warning Using power with Numpy - python-3.x

I'm using NumPy's power function and I'm getting a warning message. This is the code:
import numpy as np

def f(x, n):
    factor = n / (1. + n)
    exponent = 1. + (1. / n)
    f1_x = factor * np.power(0.5, exponent) - np.power(0.5 - x, exponent)
    f2_x = factor * np.power(0.5, exponent) - np.power(x - 0.5, exponent)
    return np.where((0 <= x) & (x <= 0.5), f1_x, f2_x)

fv = np.vectorize(f, otypes='f')
x = np.linspace(0., 1., 20)
print(fv(x, 0.23))
And this is the warning message:
E:\ProgramasPython3\LibroCientifico\partesvectorizada.py:8: RuntimeWarning: invalid value encountered in power
  f2_x = factor * np.power(0.5, exponent) - np.power(x - 0.5, exponent)
E:\ProgramasPython3\LibroCientifico\partesvectorizada.py:7: RuntimeWarning: invalid value encountered in power
  f1_x = factor * np.power(0.5, exponent) - np.power(0.5 - x, exponent)
[-0.0199636  -0.00895462 -0.0023446   0.00136486  0.003271    0.00414007
  0.00447386  0.00457215  0.00459036  0.00459162  0.00459162  0.00459036
  0.00457215  0.00447386  0.00414007  0.003271    0.00136486 -0.0023446
 -0.00895462 -0.0199636 ]
I don't know what the invalid value is, and I don't know how to tell np.where that f2_x is only valid for values > 0.5 and <= 1.0.
Thanks

The reason this happens is that you are taking a non-integer power of a negative number. NumPy cannot represent the (complex) result in a real dtype, so it returns nan and emits the warning unless you explicitly cast the value to be complex. So you will have to do something like
np.power(complex(0.5 - x), exponent).real
EDIT: Since your values will be truly complex (not a real number plus some tiny imaginary part), you would either have to work with the complex result (but then the <= comparison later on gets awkward), or catch the case where the base is negative in some other way.
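For illustration, here is a minimal sketch (my own addition, with made-up values for x and exponent) of two ways to avoid the warning: going through a complex dtype, or only evaluating the power where the base is non-negative:

import numpy as np

x = np.linspace(0., 1., 5)
exponent = 1. + 1. / 0.23
base = 0.5 - x

# Option 1: evaluate on complex input and keep the real part
vals_complex = np.power(base.astype(np.complex128), exponent).real

# Option 2: only evaluate the power where the base is non-negative, leave nan elsewhere
vals_masked = np.full_like(base, np.nan)
np.power(base, exponent, out=vals_masked, where=base >= 0)

print(vals_complex)
print(vals_masked)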

OK, thanks a lot everyone. Here's the solution using a piecewise function instead of np.where, and casting to np.complex128 as #Saullo mentioned:
import numpy as np

def f(x, n):
    factor = n / (1. + n)
    exponent = (n + 1.) / n
    f1_x = lambda x: factor * \
        np.power(2., -exponent) - np.power((1. - 2. * x) / 2., exponent)
    f2_x = lambda x: factor * \
        np.power(2., -exponent) - np.power(-(1. - 2. * x) / 2., exponent)
    conditions = [(0. <= x) & (x <= 0.5), (0.5 < x) & (x <= 1.)]
    functions = [f1_x, f2_x]
    result = np.piecewise(x, conditions, functions)
    return np.real(result)

x = np.linspace(0., 1., 20)
x = x.astype(np.complex128)
print(f(x, 0.23))
The problem is that when the base of a power is negative, np.power does not handle it for real dtypes and you get the warning message. I hope this is useful for everyone.

Related

Determining initial parameters for a 2D Gaussian fit

I'm working on some code which needs to be able to perform a 2D Gaussian fit. I mostly based my code on the following question: Fitting a 2D Gaussian function using scipy.optimize.curve_fit - ValueError and minpack.error. The problem now is that I don't really have an initial guess for the different parameters that need to be used.
I've tried this:
def twoD_Gaussian(x_data_tuple, amplitude, xo, yo, sigma_x, sigma_y, theta, offset):
    (x, y) = x_data_tuple
    xo = float(xo)
    yo = float(yo)
    a = (np.cos(theta)**2)/(2*sigma_x**2) + (np.sin(theta)**2)/(2*sigma_y**2)
    b = -(np.sin(2*theta))/(4*sigma_x**2) + (np.sin(2*theta))/(4*sigma_y**2)
    c = (np.sin(theta)**2)/(2*sigma_x**2) + (np.cos(theta)**2)/(2*sigma_y**2)
    g = offset + amplitude*np.exp(-(a*((x-xo)**2) + 2*b*(x-xo)*(y-yo)
                                    + c*((y-yo)**2)))
    return g.ravel()
The data.reshape(201,201) is just something I took from the aforementioned question.
mean_gauss_x = sum(x * data.reshape(201,201)) / sum(data.reshape(201,201))
sigma_gauss_x = np.sqrt(sum(data.reshape(201,201) * (x - mean_gauss_x)**2) / sum(data.reshape(201,201)))
mean_gauss_y = sum(y * data.reshape(201,201)) / sum(data.reshape(201,201))
sigma_gauss_y = np.sqrt(sum(data.reshape(201,201) * (y - mean_gauss_y)**2) / sum(data.reshape(201,201)))
initial_guess = (np.max(data), mean_gauss_x, mean_gauss_y, sigma_gauss_x, sigma_gauss_y,0,10)
popt, pcov = curve_fit(twoD_Gaussian, (x, y), data, p0=initial_guess)
data_fitted = twoD_Gaussian((x, y), *popt)
If I try this, I get the following error message: ValueError: setting an array element with a sequence.
Is the reasoning about the initial parameters correct?
And why do I get this error?
If I use the runnable code from the linked question and substitute your definition of initial_guess:
mean_gauss_x = sum(x * data.reshape(201,201)) / sum(data.reshape(201,201))
sigma_gauss_x = np.sqrt(sum(data.reshape(201,201) * (x - mean_gauss_x)**2) / sum(data.reshape(201,201)))
mean_gauss_y = sum(y * data.reshape(201,201)) / sum(data.reshape(201,201))
sigma_gauss_y = np.sqrt(sum(data.reshape(201,201) * (y - mean_gauss_y)**2) / sum(data.reshape(201,201)))
initial_guess = (np.max(data), mean_gauss_x, mean_gauss_y, sigma_gauss_x, sigma_gauss_y,0,10)
Then
print(initial_guess)
yields
(13.0, array([...]), array([...]), array([...]), array([...]), 0, 10)
Notice that some of the values in initial_guess are arrays. The optimize.curve_fit function expects initial_guess to be a tuple of scalars. This is the source of the problem.
The error message
ValueError: setting an array element with a sequence
often arises when an array-like is supplied when a scalar value is expected. It is a hint that the source of the problem may have to do with an array having the wrong number of dimensions. For example, it might arise if you pass a 1D array to a function that expects a scalar.
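As a tiny illustration (my own example, not from the original post), the same error appears if you try to stuff a small array into a single array slot:

import numpy as np

a = np.zeros(3)
a[0] = np.array([1.0, 2.0])   # ValueError: setting an array element with a sequence.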
Let's look at this piece of code taken from the linked question:
x = np.linspace(0, 200, 201)
y = np.linspace(0, 200, 201)
X, Y = np.meshgrid(x, y)
x and y are 1D arrays, while X and Y are 2D arrays. (I've capitalized all 2D arrays to help distinguish them from 1D arrays).
Now notice that Python sum and NumPy's sum method behave differently when applied to 2D arrays:
In [146]: sum(X)
Out[146]:
array([ 0., 201., 402., 603., 804., 1005., 1206., 1407.,
1608., 1809., 2010., 2211., 2412., 2613., 2814., 3015.,
...
38592., 38793., 38994., 39195., 39396., 39597., 39798., 39999.,
40200.])
In [147]: X.sum()
Out[147]: 4040100.0
The Python sum function is equivalent to
total = 0
for item in X:
    total += item
Since X is a 2D array, the loop for item in X is iterating over the rows of X. Each item is therefore a 1D array representing a row of X. Thus, total ends up being a 1D array.
In contrast, X.sum() sums all the elements in X and returns a scalar.
Since initial_guess should be a tuple of scalars,
everywhere you use sum you should instead use the NumPy sum method. For example, replace
mean_gauss_x = sum(x * data) / sum(data)
with
mean_gauss_x = (X * DATA).sum() / (DATA.sum())
import numpy as np
import scipy.optimize as optimize
import matplotlib.pyplot as plt

# define model function and pass independent variables x and y as a list
def twoD_Gaussian(data, amplitude, xo, yo, sigma_x, sigma_y, theta, offset):
    X, Y = data
    xo = float(xo)
    yo = float(yo)
    a = (np.cos(theta) ** 2) / (2 * sigma_x ** 2) + (np.sin(theta) ** 2) / (
        2 * sigma_y ** 2
    )
    b = -(np.sin(2 * theta)) / (4 * sigma_x ** 2) + (np.sin(2 * theta)) / (
        4 * sigma_y ** 2
    )
    c = (np.sin(theta) ** 2) / (2 * sigma_x ** 2) + (np.cos(theta) ** 2) / (
        2 * sigma_y ** 2
    )
    g = offset + amplitude * np.exp(
        -(a * ((X - xo) ** 2) + 2 * b * (X - xo) * (Y - yo) + c * ((Y - yo) ** 2))
    )
    return g.ravel()

# Create x and y indices
x = np.linspace(0, 200, 201)
y = np.linspace(0, 200, 201)
X, Y = np.meshgrid(x, y)

# create data
data = twoD_Gaussian((X, Y), 3, 100, 100, 20, 40, 0, 10)
data_noisy = data + 0.2 * np.random.normal(size=data.shape)
DATA = data.reshape(201, 201)

# add some noise to the data and try to fit the data generated beforehand
mean_gauss_x = (X * DATA).sum() / (DATA.sum())
sigma_gauss_x = np.sqrt((DATA * (X - mean_gauss_x) ** 2).sum() / (DATA.sum()))
mean_gauss_y = (Y * DATA).sum() / (DATA.sum())
sigma_gauss_y = np.sqrt((DATA * (Y - mean_gauss_y) ** 2).sum() / (DATA.sum()))
initial_guess = (
    np.max(data),
    mean_gauss_x,
    mean_gauss_y,
    sigma_gauss_x,
    sigma_gauss_y,
    0,
    10,
)
print(initial_guess)
# (13.0, 100.00000000000001, 100.00000000000001, 57.106515650488404, 57.43620227324201, 0, 10)
# initial_guess = (3,100,100,20,40,0,10)

popt, pcov = optimize.curve_fit(twoD_Gaussian, (X, Y), data_noisy, p0=initial_guess)
data_fitted = twoD_Gaussian((X, Y), *popt)

fig, ax = plt.subplots(1, 1)
ax.imshow(
    data_noisy.reshape(201, 201),
    cmap=plt.cm.jet,
    origin="lower",  # "bottom" is not accepted by current Matplotlib
    extent=(X.min(), X.max(), Y.min(), Y.max()),
)
ax.contour(X, Y, data_fitted.reshape(201, 201), 8, colors="w")
plt.show()

Numpy tensor implementation slower than loop

I have two functions that compute the same metric. One uses a list comprehension to loop over the calculation, the other uses only NumPy tensor operations. The functions take in an (N, 3) array, where N is the number of points in 3D space. When N <~ 3000 the tensor function is faster; when N >~ 3000 the list-comprehension version is faster. Both seem to have linear time complexity in N, i.e. the two time-versus-N lines cross at N ≈ 3000.
def approximate_area_loop(section, num_area_divisions):
    n_a_d = num_area_divisions
    interp_vectors = get_section_interp_(section)
    a1 = section[:-1]
    b1 = section[1:]
    a2 = interp_vectors[:-1]
    b2 = interp_vectors[1:]
    c = lambda u: (1 - u) * a1 + u * a2
    d = lambda u: (1 - u) * b1 + u * b2
    x = lambda u, v: (1 - v) * c(u) + v * d(u)
    area = np.sum([np.linalg.norm(np.cross((x((i + 1)/n_a_d, j/n_a_d) - x(i/n_a_d, j/n_a_d)),
                                           (x(i/n_a_d, (j + 1)/n_a_d) - x(i/n_a_d, j/n_a_d))), axis=1)
                   for i in range(n_a_d) for j in range(n_a_d)])
    Dt = section[-1, 0] - section[0, 0]
    return area, Dt
def approximate_area_tensor(section, num_area_divisions):
    divisors = np.linspace(0, 1, num_area_divisions + 1)
    interp_vectors = get_section_interp_(section)
    a1 = section[:-1]
    b1 = section[1:]
    a2 = interp_vectors[:-1]
    b2 = interp_vectors[1:]
    c = np.multiply.outer(a1, (1 - divisors)) + np.multiply.outer(a2, divisors)  # c_areas_vecs_divs
    d = np.multiply.outer(b1, (1 - divisors)) + np.multiply.outer(b2, divisors)  # d_areas_vecs_divs
    x = np.multiply.outer(c, (1 - divisors)) + np.multiply.outer(d, divisors)    # x_areas_vecs_Divs_divs
    u = x[:, :, 1:, :-1] - x[:, :, :-1, :-1]     # u_areas_vecs_Divs_divs
    v = x[:, :, :-1, 1:] - x[:, :, :-1, :-1]     # v_areas_vecs_Divs_divs
    sub_area_norm_vecs = np.cross(u, v, axis=1)  # areas_crosses_Divs_divs
    sub_areas = np.linalg.norm(sub_area_norm_vecs, axis=1)  # areas_Divs_divs (values are now sub areas)
    area = np.sum(sub_areas)
    Dt = section[-1, 0] - section[0, 0]
    return area, Dt
Why does the list-comprehension version become faster at large N? Surely the tensor version should be faster? I'm wondering if it's something to do with the size of the intermediate arrays, meaning they are too big to fit in cache. Please ask if I haven't included enough information; I'd really like to get to the bottom of this.
The bottleneck in the fully vectorized function was indeed in np.linalg.norm, as #hpaulj's comment suggested.
norm was used only to get the magnitude of all the vectors contained in axis 1. A much simpler and faster method is to compute it directly:
sub_areas = np.sqrt((sub_area_norm_vecs*sub_area_norm_vecs).sum(axis = 1))
This gives exactly the same results and made the code up to 25 times faster than the loop implementation (even when the loop doesn't use linalg.norm either).
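For anyone who wants to verify the equivalence, a quick sanity sketch (my own addition; the array shape is made up to mimic the areas x xyz x Divs x divs layout):

import numpy as np

vecs = np.random.default_rng(0).normal(size=(100, 3, 20, 20))
norm_a = np.linalg.norm(vecs, axis=1)                # norm over the xyz axis
norm_b = np.sqrt((vecs * vecs).sum(axis=1))          # hand-rolled equivalent
print(np.allclose(norm_a, norm_b))                   # True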

Tensor("pow:0", ...) must be from the same graph as Tensor("Cast_2:0", ...)

I am trying to model something that requires computing a definite integral. The code is shown below:
import tensorflow as tf
from numpy import pi, inf
from tensorflow import log, sqrt, exp, pow
from scipy.integrate import quad  # for integration

def risk_neutral_pdf(phi, a, S, K, r, sigma, Mt, p_dict):
    phii = tf.complex(0., phi)
    A = tf.cast(0., tf.complex64)
    B = tf.cast(0., tf.complex64)
    p_dict['gamma'] = p_dict['gamma'] + p_dict['lamda'] + .5
    p_dict['lamda'] = -.5
    for t in range(Mt - 1, -1, -1):
        temp = 1. - 2. * p_dict['alpha'] * B
        A = A + (phii + a) * r + p_dict['omega'] * B - .5 * log(temp)
        B = B * p_dict['beta'] + (phii + a) * (p_dict['lamda'] + p_dict['gamma']) - \
            .5 * p_dict['gamma']**2. + (.5 * ((phii + a) - p_dict['gamma'])**2. / temp)
    return tf.real(S**a * (S/K)**phii * exp(A + B * sigma**2.) / phii)

p_dict = {'lamda': 0.205, 'omega': 5.02e-6, 'beta': 0.589, 'gamma': 421.39, 'alpha': 1.32e-6}
S = 100.
K = 100.
r = 0.
Mt = 0
sq_ht = sqrt(.15**2 / 252.)
sigma = sq_ht

P1 = tf.py_func(lambda z: quad(risk_neutral_pdf, z, inf, args=(1., S, K, r, sigma, Mt, p_dict))[0],
                [0.], tf.float64)

with tf.Session() as sess:
    res = sess.run(P1)
    print(res)
Running this returns "InvalidArgumentError (see above for traceback): ValueError: Tensor("pow:0", shape=(), dtype=float32) must be from the same graph as Tensor("Cast_2:0", shape=(), dtype=complex64)." However, no matter how I change the code or follow the solution in "ValueError: Tensor A must be from the same graph as Tensor B", it does not work. I am wondering whether I put tf.reset_default_graph() in the wrong place at the top, or whether the code needs some other changes.
Thank you. (TensorFlow version: 1.6.0)
Update:
I found that the sigma variable is square-rooted before being passed into the risk_neutral_pdf function and then squared again in the return statement, which is unnecessary. So after modifying the return to return tf.real(S**a * (S/K)**phii * exp(A + B * sigma) / phii) and sq_ht to .15**2/252., the error changes to "TypeError: a float is required", which I think is caused by quad being handed a Tensor. Any ideas how to solve this?
Many thanks.
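For context, here is a minimal sketch (my own illustration, not from the original question; TensorFlow 1.x API) of how the "must be from the same graph" error arises in general, namely when tensors created in different graphs are combined:

import tensorflow as tf

g1 = tf.Graph()
g2 = tf.Graph()
with g1.as_default():
    a = tf.constant(1.0)
with g2.as_default():
    b = tf.constant(2.0)
    # Combining tensors from two different graphs raises:
    # ValueError: Tensor(...) must be from the same graph as Tensor(...)
    c = a + b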

Prevent rounding, maintaining certain level of accuracy

I am trying to apply a Runge-Kutta method for solving an ODE. The problem is that Python keeps rounding somewhere and I don't understand why; is something syntactically telling Python to round everything? I've tried converting everything with float() to no avail. What should I do to have Python compute everything to the accuracy I need?
import numpy as np

def fn(x, y):
    return x - y

def rk3(y0, x):
    n = len(x)
    y = np.array([y0]*n)
    for j in range(n-1):
        h = x[j+1] - x[j]
        k1 = h * fn(x[j], y[j])
        k2 = h * fn(x[j] + h / 3.0, y[j] + k1 / 3.0)
        k3 = h * fn(x[j] + 2.0*h / 3.0, y[j] + 2.0*k2 / 3.0)
        y[j+1] = y[j] + k1*1.0/4.0 + k3*3.0/4.0
    return y

v = rk3(1, np.linspace(0, 5, 500))
The mistake is passing an integer y0 in rk3(1, np.linspace(0, 5, 500)): np.array([1]*n) then creates an integer array, so every value written into y is truncated to an integer. If one changes it to 1.0, the array gets a float dtype and all further operations are carried out in floating-point arithmetic as required.
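A small demonstration of the dtype effect (my own example, not from the original answer):

import numpy as np

y_int = np.array([1] * 3)      # integer dtype: assignments get truncated
y_int[1] = 0.75
print(y_int)                   # [1 0 1]

y_float = np.array([1.0] * 3)  # float dtype: values are preserved
y_float[1] = 0.75
print(y_float)                 # [1.   0.75 1.  ]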

Calculate all compound growth rates from a list of values

I have a list of growth rates and would like to calculate all available compounded growth rates:
l = [0.3, 0.2, 0.1]
Output (as a list):
o = [0.56, 0.716]
Calculation details for the compounded growth rates:
0.56 = (1 + 0.3) * (1 + 0.2) - 1
0.716 = (1 + 0.3) * (1 + 0.2) * (1 + 0.1) - 1
The function should be flexible to the length of the input list.
You could express the computation with list comprehensions / generator expressions, using itertools.accumulate to handle the compounding:
import itertools as IT
import operator

def compound_growth_rates(l):
    result = [xy - 1 for xy in
              IT.islice(IT.accumulate((1+x for x in l), operator.mul), 1, None)]
    return result

l = [0.3, 0.2, 0.1]
print(compound_growth_rates(l))
prints
[0.56, 0.7160000000000002]
Or, equivalently, you could write this with list-comprehensions and a for-loop:
def compound_growth_rates(l):
    add_one = [1+x for x in l]
    products = [add_one[0]]
    for x1 in add_one[1:]:
        products.append(x1*products[-1])
    result = [p-1 for p in products[1:]]
    return result
I think the advantage of using itertools.accumulate is that it expresses the intent of the code better than the for-loop. But the for-loop may be more readable in the sense that it uses more commonly known syntax.
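As a side note, since the rest of this page leans heavily on NumPy, the same compounding can also be written with np.cumprod (my own addition, not from the original answer):

import numpy as np

def compound_growth_rates_np(l):
    # cumulative product of (1 + rate) gives the running compounded factor;
    # drop the first element and subtract 1, mirroring the islice(..., 1, None) above
    return list(np.cumprod([1 + x for x in l])[1:] - 1)

print(compound_growth_rates_np([0.3, 0.2, 0.1]))   # [0.56, 0.716...]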
