Numpy Vectorization for Nested 'for' loop - python-3.x

I am trying to write a program that plots a level set of any given function.
import numpy as np
rmin = -5.0
rmax = 5.0
c = 4.0
x = np.arange(rmin,rmax,0.1)
y = np.arange(rmin,rmax,0.1)
x,y = np.meshgrid(x,y)
f = lambda x,y: y**2.0 - 4*x
realplots = []
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        if abs(f(x[i,j],y[i,j])-c)< 1e-4:
            realplots.append([x[i,j],y[i,j]])
But because it is a nested for loop, it takes a lot of time. Any help in vectorizing the above code, or a new method of plotting a level set, is highly appreciated. (Note: the function f will be changed at run time, so the vectorization must not rely on the function's properties.)
I tried vectorizing with
ans = np.where(abs(f(x,y)-c)<1e-4,np.array([x,y]),[0,0])
but it raised: operands could not be broadcast together with shapes (100,100) (2,100,100) (2,)
I added [0,0] as a fallback for the else branch of np.where, which is indeed wrong.

Since you want the values rather than the indices, you don't really need np.where.
You can index x and y directly with the mask; see the "Boolean array indexing" section of the NumPy documentation.
It is straightforward:
def vectorized(x, y, c, f, threshold):
    mask = np.abs(f(x, y) - c) < threshold
    x, y = x[mask], y[mask]
    return np.stack([x, y], axis=-1)
Your function for reference:
def op(x, y, c, f, threshold):
    res = []
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            if abs(f(x[i, j], y[i, j]) - c) < threshold:
                res.append([x[i, j], y[i, j]])
    return res
Tests:
rmin, rmax = -5.0, +5.0
c = 4.0
threshold = 1e-4
x = np.arange(rmin, rmax, 0.1)
y = np.arange(rmin, rmax, 0.1)
x, y = np.meshgrid(x, y)
f = lambda x, y: y**2 - 4 * x
res_op = op(x, y, c, f, threshold)
res_vec = vectorized(x, y, c, f, threshold)
assert np.allclose(res_op, res_vec)
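To see what the boolean mask does, here is a minimal sketch on a tiny grid that is small enough to check by hand. The grid values are chosen purely for illustration and are not from the question.

```python
import numpy as np

# Tiny illustrative grid: x takes 0, 1, 2 and y takes 0, 2
x, y = np.meshgrid(np.array([0.0, 1.0, 2.0]), np.array([0.0, 2.0]))
f = lambda x, y: y**2 - 4 * x

# Boolean array with the same shape as x and y
mask = np.abs(f(x, y) - 4.0) < 1e-4

# Indexing with the mask keeps only the matching entries, flattened to 1-D
points = np.stack([x[mask], y[mask]], axis=-1)
print(points)
```

Only the grid point (x=0, y=2) satisfies f(x, y) = 4 here, so points has a single row. Because f is evaluated once on whole arrays, this works for any f built from NumPy operations.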

Related

Generate 2D distribution avoiding loops

I want to construct a 2D Gaussian-like distribution on a (Nx, Ny) array of the form:
return np.exp(-0.5*((x-xp)**2 + (y-yp)**2)/SG**2)
where (x,y), in this case, would correspond to [i, j] matrix indices.
I am doing this by looping through a np.zeros((Nx,Ny)) matrix and updating its values with the defined function.
Basically, I would like to know if there is a way to generate a similar result but avoid the for loops that I am using here. My intuition tells me that np.meshgrid or zip(x, y) should do it but I have been unable to replicate it.
(I would like to avoid the auxiliary distribution_Gp function and use the normaldist function directly.)
Here is my sample code of how I am using it all together:
import numpy as np
def normaldist(x, y, Nx, Ny, xp, yp, SG=1):
    """2D-mesh (Nx,Ny) with Gaussian distribution values."""
    z = np.exp(-0.5*((x-xp)**2 + (y-yp)**2)/SG**2)
    # /(SG*np.sqrt(np.pi*2.))) # non-normalized
    return z
def distribution_Gp(Nx, Ny, xp, yp, SG=1):
    """Fill up the C0(Nx, Ny) array for the specified values and conditions."""
    mask = np.zeros((Nx, Ny))
    for j in range(0, Ny):
        for i in range(0, Nx):
            if(i <= Nx*Ny*normaldist(i, j, Nx, Ny, xp, yp, SG)):
                mask[i, j] = normaldist(i, j, Nx, Ny, xp, yp, SG)
    return mask
Nx = 11
Ny = Nx
arr_img = distribution_Gp(Nx, Ny, Nx//2, Ny//3, SG=2)
A matrix with values sampled from a normal distribution can be generated with:
np.random.normal(mean, std, (Nx, Ny))
where Nx and Ny are shapes of the output, as in your code.
If you want to apply any custom function to a matrix then this can be accomplished by:
arr = np.zeros((Nx, Ny))
f = lambda x: x + 3
result = f(arr)
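With a two-argument lambda and meshgrid it is possible to replicate distribution_Gp without the intermediate function:
# integer coordinates, so that (x, y) correspond to the [i, j] matrix indices
x = np.arange(Nx)
y = np.arange(Ny)
# normaldist takes Nx and Ny before xp and yp; .T maps meshgrid's (row=y, column=x) layout back to [i, j]
f = lambda x, y: normaldist(x, y, Nx, Ny, Nx//2, Ny//3, SG=2).T
X, Y = np.meshgrid(x, y)
result = f(X, Y)
which produces the same values as the call below wherever the if condition inside distribution_Gp holds (entries that fail that condition stay zero there):
result = distribution_Gp(Nx, Ny, Nx//2, Ny//3, SG=2)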
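The loop in distribution_Gp, including its if condition, can also be written without loops. The following is a minimal sketch, assuming the same formula as normaldist; the name distribution_vectorized is hypothetical and not part of the question.

```python
import numpy as np

def distribution_vectorized(Nx, Ny, xp, yp, SG=1):
    """Loop-free counterpart of distribution_Gp (hypothetical name)."""
    # i[a, b] == a and j[a, b] == b, matching the [i, j] matrix indices
    i, j = np.indices((Nx, Ny))
    z = np.exp(-0.5 * ((i - xp)**2 + (j - yp)**2) / SG**2)
    # Keep z where the original condition holds, zero elsewhere
    return np.where(i <= Nx * Ny * z, z, 0.0)

arr_img = distribution_vectorized(11, 11, 11 // 2, 11 // 3, SG=2)
```

np.indices avoids the transpose that meshgrid needs with its default layout, and np.where plays the role of the if statement for every element at once.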

How to resolve value error in Scipy function fmin_tnc?

I am trying to implement the Coursera assignments in Python and am using SciPy's optimizer for logistic regression. However, I am getting the error below.
Can anyone help?
Note: the cost and gradient functions work fine.
#Sigmoid function
def sigmoid(z):
    h_of_z = np.zeros([z.shape[0]])
    h_of_z = np.divide(1,(1+(np.exp(-z))))
    return h_of_z
def cost(x,y,theta):
    m = y.shape[0]
    h_of_x = sigmoid(np.matmul(x,theta))
    term1 = sum(-1 * y.T @ np.log(h_of_x) - (1-y.T) @ np.log(1-h_of_x))
    J = 1/m * term1
    return J
def grad(x,y,theta):
    grad = np.zeros_like(theta)
    m = y.shape[0]
    h_of_x = sigmoid(x@theta)
    grad = (x.T @ (h_of_x - y)) * (1/m)
    return grad
#add intercept term for X
x = np.hstack([np.ones_like(y),X[:,0:2]])
#initialise theta
[m,n] = np.shape(x)
initial_theta = np.zeros([n,1])
#optimising theta from given theta and gradient
result = opt.fmin_tnc(func=cost, x0=initial_theta, args=(x, y))
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 99 is different from 3)
I got it!
The problem is that fmin_tnc calls the function with the parameter being optimized (theta) as the first argument, followed by the extra arguments x and y.
Since my cost function took x and y first, the values were interpreted in the wrong order, which raised the ValueError.
Below is the corrected code:
def sigmoid(x):
    return 1/(1+np.exp(-x))
def cost(theta,x,y):
    J = (-1/m) * np.sum(np.multiply(y, np.log(sigmoid(x @ theta)))
                        + np.multiply((1-y), np.log(1 - sigmoid(x @ theta))))
    return J
def gradient(theta,x,y):
    h_of_x = sigmoid(x@theta)
    grad = 1 / m * (x.T @ (h_of_x - y))
    return grad
#initialise theta
init_theta = np.zeros([n+1,1])
#optimise theta
from scipy import optimize as op
result = op.fmin_tnc(func=cost,
                     x0=init_theta.flatten(),
                     fprime=gradient,
                     args=(x,y.flatten()))
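The argument-order rule can be checked in isolation with a toy problem. This is a minimal sketch on a simple quadratic, not the logistic regression from the question; the target vector and the function bodies are made up for illustration.

```python
import numpy as np
from scipy import optimize as op

def cost(theta, target):
    # theta (the vector being optimized) comes first; the contents of args follow
    return np.sum((theta - target) ** 2)

def gradient(theta, target):
    # Gradient of the quadratic cost with respect to theta
    return 2.0 * (theta - target)

target = np.array([1.0, -2.0, 0.5])  # illustrative data only

# fmin_tnc returns the solution, the number of evaluations and a return code
theta_opt, n_evals, return_code = op.fmin_tnc(
    func=cost, x0=np.zeros(3), fprime=gradient, args=(target,), disp=0
)
print(theta_opt)
```

The optimizer evaluates cost(theta, *args) and gradient(theta, *args), so any function whose first parameter is not the optimized vector will receive its inputs in the wrong positions.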

Compute Errors on Fit Parameters In Scipy Curve fit

Using "scipy.optimize.curve_fit" we can determine the fit parameters for a curve fit on x and y using
popt, pcov = curve_fit(func, xdata, ydata)
In the documentation for this function, they state that: To compute one standard deviation errors on the parameters use
perr = np.sqrt(np.diag(pcov))
Here's a link to the documentation I was reading. https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html
What if I want something more general than 1 standard deviation errors on the parameters? In particular, what if I am looking for, say, 2 standard deviations (roughly a 95% confidence interval on the parameters)?
To be clear, I'm not looking for a 10+ line solution. I already know how to compute these errors in a "hackish" way for a linear function:
def get_slope_params(data1, data2):
    x_mean = mean(data1)
    y_mean = mean(data2)
    N = len(data1)
    sum_xy = 0
    for (x, y) in zip(data1, data2):
        sum_xy = sum_xy + x*y
    sum_xsq = 0
    for x in data1:
        sum_xsq = sum_xsq + x*x
    b = (sum_xy-N*x_mean*y_mean)/(sum_xsq-N*x_mean**2)
    a = y_mean - b*x_mean
    return (a,b)
# 95%
def get_slope_params_uncertainties(data1, data2):
    N = len(data1)
    a, b = get_slope_params(data1, data2)
    y_approx = a+b*data1
    s_eps = 0
    for (y, y_app) in zip(data2, y_approx):
        s_eps = s_eps + (y-y_app)**2
    s_eps = np.sqrt(s_eps/(N-2))
    s_x = np.sqrt(cov(data1, data1))
    delta_b = (1/np.sqrt(N-1))*(s_eps/s_x)*sp.stats.t.ppf(1-0.05/2, N-2)
    delta_a = mean(data1)*delta_b
    return delta_a, delta_b
What I'd like is a function already implemented entirely by scipy.
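One common approach is to scale the 1 standard deviation errors from pcov by a Student t quantile. The sketch below assumes roughly Gaussian residuals and that pcov is a good local approximation; the model function line and the synthetic data are made up for illustration. It is not a single built-in SciPy call, just a short combination of existing ones.

```python
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

def line(x, a, b):
    # Illustrative model: straight line with intercept a and slope b
    return a + b * x

# Synthetic data for the example only
rng = np.random.default_rng(0)
xdata = np.linspace(0.0, 10.0, 50)
ydata = line(xdata, 1.0, 2.0) + rng.normal(scale=0.5, size=xdata.size)

popt, pcov = curve_fit(line, xdata, ydata)
perr = np.sqrt(np.diag(pcov))            # 1 standard deviation errors

alpha = 0.05                             # 95% confidence level
dof = xdata.size - popt.size             # degrees of freedom
t_crit = stats.t.ppf(1 - alpha / 2, dof)
ci_half_width = t_crit * perr            # popt +/- ci_half_width

print(popt - ci_half_width, popt + ci_half_width)
```

For many data points t_crit approaches 1.96, which is why "2 standard deviations" is a common shorthand for a 95% interval. Changing alpha gives other confidence levels.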

Better way to solve simultaneous linear equations programmatically in Python

I have the following code that solves simultaneous linear equations by starting with the first equation and finding y when x=0, then putting that y into the second equation and finding x, then putting that x back into the first equation etc...
Obviously, this has the potential to reach infinity, so if it reaches +-inf then it swaps the order of the equations so the spiral/ladder goes the other way.
This seems to work, tho I'm not such a good mathematician that I can prove it will always work beyond a hunch, and of course some lines never meet (I know how to use matrices and linear algebra to check straight off whether they will never meet, but I'm not so interested in that atm).
Is there a better way to 'spiral' in on the answer? I'm not interested in using math functions or numpy for the whole solution - I want to be able to code the solution. I don't mind using libraries to improve the performance, for instance using some sort of statistical method.
This may be a very naive question from either a coding or maths point of view, but if so I'd like to know why!
My code is as follows:
# A python program to solve 2d simultaneous equations
# by iterating over coefficients in spirals
import numpy as np
def Input(coeff_or_constant, var, lower, upper):
    val = int(input("Let the {} {} be a number between {} and {}: ".format(coeff_or_constant, var, lower, upper)))
    if val >= lower and val <= upper :
        return val
    else:
        print("Invalid input")
        exit(0)
def Equation(equation_array):
    a = Input("coefficient", "a", 0, 10)
    b = Input("coefficient", "b", 0, 10)
    c = Input("constant", "c", 0, 10)
    equation_list = [a, b, c]
    equation_array.append(equation_list)
    return equation_array
def Stringify_Equations(equation_array):
    A = str(equation_array[0][0])
    B = str(equation_array[0][1])
    C = str(equation_array[0][2])
    D = str(equation_array[1][0])
    E = str(equation_array[1][1])
    F = str(equation_array[1][2])
    eq1 = str(A + "y = " + B + "x + " + C)
    eq2 = str(D + "y = " + E + "x + " + F)
    print(eq1)
    print(eq2)
def Spiral(equation_array):
    a = equation_array[0][0]
    b = equation_array[0][1]
    c = equation_array[0][2]
    d = equation_array[1][0]
    e = equation_array[1][1]
    f = equation_array[1][2]
    # start at y when x = 0
    x = 0
    infinity_flag = False
    count = 0
    coords = []
    coords.append([0, 0])
    coords.append([1, 1])
    # solve equation 2 for x when y = START
    while not (coords[0][0] == coords[1][0]):
        try:
            y = ( ( b * x ) + c ) / a
        except:
            y = 0
        print(y)
        try:
            x = ( ( d * y ) - f ) / e
        except:
            x = 0
        if x >= 100000 or x <= -100000:
            count = count + 1
            if count >= 100000:
                print("It\'s looking like these linear equations don\'t intersect!")
                break
        print(x)
        new_coords = [x, y]
        coords.append(new_coords)
        coords.pop(0)
        if not ((x == float("inf") or x == float("-inf")) and (y == float("inf") or y == float("-inf"))):
            pass
        else:
            infinity_flag if False else True
            if infinity_flag == False:
                # if the spiral is divergent this switches the equations around so it converges
                # the infinity_flag is to check if both spirals returned infinity meaning the lines do not intersect
                # I think this would mostly work for linear equations, but for other kinds of equations it might not
                x = 0
                a = equation_array[1][0]
                b = equation_array[1][1]
                c = equation_array[1][2]
                d = equation_array[0][0]
                e = equation_array[0][1]
                f = equation_array[0][2]
                infinity_flag = False
            else:
                print("These linear equations do not intersect")
                break
    y = round(y, 3)
    x = round(x, 3)
    print(x, y)
equation_array = []
print("Specify coefficients a and b, and a constant c for equation 1")
equations = Equation(equation_array)
print("Specify coefficients a and b, and a constant c for equation 2")
equations = Equation(equation_array)
print(equation_array)
Stringify_Equations(equation_array)
Spiral(equation_array)
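For two lines written as a*y = b*x + c and d*y = e*x + f, one pass of the substitution described above maps x to (b*d)/(a*e) * x plus a constant, so it converges only when |b*d| < |a*e|. That means the decision to swap the equations can be made before iterating instead of waiting for overflow. The following is a minimal sketch under that reasoning; the function name, tolerance and iteration cap are hypothetical choices, not part of the question.

```python
def solve_by_alternation(eq1, eq2, tol=1e-12, max_iter=10_000):
    """Solve a*y = b*x + c and d*y = e*x + f by alternating substitution.

    Returns (x, y), or None when the lines are parallel or identical.
    Raises RuntimeError if the iteration does not settle.
    """
    a, b, c = eq1
    d, e, f = eq2
    # Equal slopes (a*e == b*d): no unique intersection
    if a * e == b * d:
        return None
    # The error is multiplied by (b*d)/(a*e) on every pass, so swap the
    # roles of the equations when that factor is larger than 1 in size
    if abs(b * d) > abs(a * e):
        (a, b, c), (d, e, f) = (d, e, f), (a, b, c)
    x = 0.0
    for _ in range(max_iter):
        y = (b * x + c) / a        # first equation solved for y
        x_new = (d * y - f) / e    # second equation solved for x
        if abs(x_new - x) <= tol:
            return x_new, (b * x_new + c) / a
        x = x_new
    # Happens when the factor is exactly -1 and the iterates oscillate
    raise RuntimeError("alternating substitution did not converge")

# 2y = x + 4 and y = 3x + 1 intersect at x = 0.4, y = 2.2
print(solve_by_alternation((2, 1, 4), (1, 3, 1)))
```

The closer |b*d| is to |a*e|, the slower the convergence, which is one reason direct elimination is usually preferred when an iterative scheme is not a requirement.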

Smoothing values (neighbors between 1-9)

Instructions: Compute and store R=1000 random values from 0-1 as x. moving_window_average(x, n_neighbors) is pre-loaded into memory from 3a. Compute the moving window average for x for the range of n_neighbors 1-9. Store x as well as each of these averages as consecutive lists in a list called Y.
My solution:
R = 1000
n_neighbors = 9
x = [random.uniform(0,1) for i in range(R)]
Y = [moving_window_average(x, n_neighbors) for n_neighbors in range(1,n_neighbors)]
where moving_window_average(x, n_neighbors) is a function as follows:
def moving_window_average(x, n_neighbors=1):
    n = len(x)
    width = n_neighbors*2 + 1
    x = [x[0]]*n_neighbors + x + [x[-1]]*n_neighbors
    # To complete the function,
    # return a list of the mean of values from i to i+width for all values i from 0 to n-1.
    mean_values=[]
    for i in range(1,n+1):
        mean_values.append((x[i-1] + x[i] + x[i+1])/width)
    return (mean_values)
This gives me an error: "Check your usage of Y again." Even though I've tested a few values, I don't yet see what the problem is with this exercise. Did I misunderstand something?
The instruction tells you to compute moving averages for all neighbors ranging from 1 to 9. So the below code should work:
import random
random.seed(1)
R = 1000
x = []
for i in range(R):
    num = random.uniform(0,1)
    x.append(num)
Y = []
Y.append(x)
for i in range(1,10):
    mov_avg = moving_window_average(x, n_neighbors=i)
    Y.append(mov_avg)
Actually, your moving_window_average(x, n_neighbors) function will not work for n_neighbors greater than one. The interpreter won't complain, but the result is not what the exercise asks for, because it always sums three elements regardless of the window width.
I suggest something like:
def moving_window_average(x, n_neighbors=1):
    n = len(x)
    width = n_neighbors*2 + 1
    x = [x[0]]*n_neighbors + x + [x[-1]]*n_neighbors
    mean_values = []
    for i in range(n):
        temp = x[i: i+width]
        sum_= 0
        for elm in temp:
            sum_+= elm
        mean_values.append(sum_ / width)
    return mean_values
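The same edge-padded average can also be sketched with NumPy, assuming NumPy is allowed in the exercise. The name moving_window_average_np is hypothetical, and the small input list is chosen so the result can be checked by hand.

```python
import numpy as np

def moving_window_average_np(x, n_neighbors=1):
    """Edge-padded moving average (hypothetical NumPy variant)."""
    width = 2 * n_neighbors + 1
    # Repeat the first and last values n_neighbors times, like the list version
    padded = np.pad(np.asarray(x, dtype=float), n_neighbors, mode="edge")
    # "valid" keeps only windows that lie fully inside the padded array,
    # which yields exactly len(x) averages
    return np.convolve(padded, np.ones(width) / width, mode="valid")

# Padded input is [1, 1, 2, 3, 4, 5, 5]; each output is the mean of 3 values
averages = moving_window_average_np([1, 2, 3, 4, 5], n_neighbors=1)
print(averages)
```

The first value is (1 + 1 + 2) / 3 and the last is (4 + 5 + 5) / 3, which shows the effect of the edge padding.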
My solution for +100XP
import random
random.seed(1)
R=1000
Y = list()
x = [random.uniform(0, 1) for num in range(R)]
for n_neighbors in range(10):
    Y.append(moving_window_average(x, n_neighbors))
