How do i make these two code snippets run in parallel - multithreading

I would like to run these two parts of the code in parallel. Is this possible in python? How would i have to modify the code to accommodate this?
def smo(self, X, y):
iterations = 0
n_samples = X.shape[0]
# Initial coefficients
alpha = numpy.zeros(n_samples)
# Initial gradient
g = numpy.ones(n_samples)
while True:
yg = g * y
# KKT Conditions
y_pos_ind = (y == 1)
y_neg_ind = (numpy.ones(n_samples) - y_pos_ind).astype(bool)
alpha_pos_ind = (alpha >= self.C)
alpha_neg_ind = (alpha <= 0)
indices_violating_Bi_1 = y_pos_ind * alpha_pos_ind
indices_violating_Bi_2 = y_neg_ind * alpha_neg_ind
indices_violating_Bi = indices_violating_Bi_1 + indices_violating_Bi_2
yg_i = yg.copy()
yg_i[indices_violating_Bi] = float('-inf')
# First of the maximum violating pair
i = numpy.argmax(yg_i)
Kik = self.kernel_matrix(X, i)
indices_violating_Ai_1 = y_pos_ind * alpha_neg_ind
indices_violating_Ai_2 = y_neg_ind * alpha_pos_ind
indices_violating_Ai = indices_violating_Ai_1 + indices_violating_Ai_2
yg_j = yg.copy()
yg_j[indices_violating_Ai] = float('+inf')
# Second of the maximum violating pair
j = numpy.argmin(yg_j)
Kjk = self.kernel_matrix(X, j)
# Optimality criterion
if(yg_i[i] - yg_j[j]) < self.tol or (iterations >= self.max_iter):
break
min_term_1 = (y[i] == 1) * self.C - y[i] * alpha[i]
min_term_2 = y[j] * alpha[j] + (y[j] == -1) * self.C
min_term_3 = (yg_i[i] - yg_j[j]) / (Kik[i] + Kjk[j] - 2 * Kik[j])
# Direction search
lamda = numpy.min([min_term_1, min_term_2, min_term_3])
# Gradient update
g += lamda * y * (Kjk - Kik)
# Update coefficients
alpha[i] = alpha[i] + y[i] * lamda
alpha[j] = alpha[j] - y[j] * lamda
iterations += 1
print('{} iterations to arrive at the minimum'.format(iterations))
return alpha
I would like to run this line
Kik = self.kernel_matrix(X, i)
and this line
Kjk = self.kernel_matrix(X, j)
in parallel. How do i change the code to accommodate this?

Giving you a response with just the finished multi threading code probably wouldn't be that helpful to you and is hard given I don't know what the functions themselves do but check out this link: https://realpython.com/intro-to-python-threading/
The general idea is you will have to start a thread for each task you want to run in parallel like this:
thread1 = threading.Thread(target=kernel_matrix,args=(X,j))
thread1.start()
If you want to wait for a thread to finish you call thread.join()
You'll need to watch out for race conditions too good thread on that here: What is a race condition?

Related

Optimizing asymmetrically reweighted penalized least squares smoothing (from matlab to python)

I'm trying to apply the method for baselinining vibrational spectra, which is announced as an improvement over asymmetric and iterative re-weighted least-squares algorithms in the 2015 paper (doi:10.1039/c4an01061b), where the following matlab code was provided:
function z = baseline(y, lambda, ratio)
% Estimate baseline with arPLS in Matlab
N = length(y);
D = diff(speye(N), 2);
H = lambda*D'*D;
w = ones(N, 1);
while true
W = spdiags(w, 0, N, N);
% Cholesky decomposition
C = chol(W + H);
z = C \ (C' \ (w.*y) );
d = y - z;
% make d-, and get w^t with m and s
dn = d(d<0);
m = mean(d);
s = std(d);
wt = 1./ (1 + exp( 2* (d-(2*s-m))/s ) );
% check exit condition and backup
if norm(w-wt)/norm(w) < ratio, break; end
end
that I rewrote into python:
def baseline_arPLS(y, lam, ratio):
# Estimate baseline with arPLS
N = len(y)
k = [numpy.ones(N), -2*numpy.ones(N-1), numpy.ones(N-2)]
offset = [0, 1, 2]
D = diags(k, offset).toarray()
H = lam * numpy.matmul(D.T, D)
w_ = numpy.ones(N)
while True:
W = spdiags(w_, 0, N, N, format='csr')
# Cholesky decomposition
C = cholesky(W + H)
z_ = spsolve(C.T, w_ * y)
z = spsolve(C, z_)
d = y - z
# make d- and get w^t with m and s
dn = d[d<0]
m = numpy.mean(dn)
s = numpy.std(dn)
wt = 1. / (1 + numpy.exp(2 * (d - (2*s-m)) / s))
# check exit condition and backup
norm_wt, norm_w = norm(w_-wt), norm(w_)
if (norm_wt / norm_w) < ratio:
break
w_ = wt
return(z)
Except for the input vector y the method requires parameters lam and ratio and it runs ok for values lam<1.e+07 and ratio>1.e-01, but outputs poor results. When values are changed outside this range, for example lam=1e+07, ratio=1e-02 the CPU starts heating up and job never finishes (I interrupted it after 1min). Also in both cases the following warning shows up:
/usr/local/lib/python3.9/site-packages/scipy/sparse/linalg/dsolve/linsolve.py: 144: SparseEfficencyWarning: spsolve requires A to be CSC or CSR matrix format warn('spsolve requires A to be CSC or CSR format',
although I added the recommended format='csr' option to the spdiags call.
And here's some synthetic data (similar to one in the paper) for testing purposes. The noise was added along with a 3rd degree polynomial baseline The method works well for parameters bl_1 and fails to converge for bl_2:
import numpy
from matplotlib import pyplot
from scipy.sparse import spdiags, diags, identity
from scipy.sparse.linalg import spsolve
from numpy.linalg import cholesky, norm
import sys
x = numpy.arange(0, 1000)
noise = numpy.random.uniform(low=0, high = 10, size=len(x))
poly_3rd_degree = numpy.poly1d([1.2e-06, -1.23e-03, .36, -4.e-04])
poly_baseline = poly_3rd_degree(x)
y = 100 * numpy.exp(-((x-300)/15)**2)+\
200 * numpy.exp(-((x-750)/30)**2)+ \
100 * numpy.exp(-((x-800)/15)**2) + noise + poly_baseline
bl_1 = baseline_arPLS(y, 1e+07, 1e-01)
bl_2 = baseline_arPLS(y, 1e+07, 1e-02)
pyplot.figure(1)
pyplot.plot(x, y, 'C0')
pyplot.plot(x, poly_baseline, 'C1')
pyplot.plot(x, bl_1, 'k')
pyplot.show()
sys.exit(0)
All this is telling me that I'm doing something very non-optimal in my python implementation. Since I'm not knowledgeable enough about the intricacies of scipy computations I'm kindly asking for suggestions on how to achieve convergence in this calculations.
(I encountered an issue in running the "straight" matlab version of the code because the line D = diff(speye(N), 2); truncates the last two rows of the matrix, creating dimension mismatch later in the function. Following the description of matrix D's appearance I substituted this line by directly creating a tridiagonal matrix using the diags function.)
Guided by the comment #hpaulj made, and suspecting that the loop exit wasn't coded properly, I re-visited the paper and found out that the authors actually implemented an exit condition that was not featured in their matlab script. Changing the while loop condition provides an exit for any set of parameters; my understanding is that algorithm is not guaranteed to converge in all cases, which is why this condition is necessary but was omitted by error. Here's the edited version of my python code:
def baseline_arPLS(y, lam, ratio):
# Estimate baseline with arPLS
N = len(y)
k = [numpy.ones(N), -2*numpy.ones(N-1), numpy.ones(N-2)]
offset = [0, 1, 2]
D = diags(k, offset).toarray()
H = lam * numpy.matmul(D.T, D)
w_ = numpy.ones(N)
i = 0
N_iterations = 100
while i < N_iterations:
W = spdiags(w_, 0, N, N, format='csr')
# Cholesky decomposition
C = cholesky(W + H)
z_ = spsolve(C.T, w_ * y)
z = spsolve(C, z_)
d = y - z
# make d- and get w^t with m and s
dn = d[d<0]
m = numpy.mean(dn)
s = numpy.std(dn)
wt = 1. / (1 + numpy.exp(2 * (d - (2*s-m)) / s))
# check exit condition and backup
norm_wt, norm_w = norm(w_-wt), norm(w_)
if (norm_wt / norm_w) < ratio:
break
w_ = wt
i += 1
return(z)

Simpson's rule 3/8 for n intervals in Python

im trying to write a program that gives the integral approximation of e(x^2) between 0 and 1 based on this integral formula:
Formula
i've done this code so far but it keeps giving the wrong answer (Other methods gives 1.46 as an answer, this one gives 1.006).
I think that maybe there is a problem with the two for cycles that does the Riemman sum, or that there is a problem in the way i've wrote the formula. I also tried to re-write the formula in other ways but i had no success
Any kind of help is appreciated.
import math
import numpy as np
def f(x):
y = np.exp(x**2)
return y
a = float(input("¿Cual es el limite inferior? \n"))
b = float(input("¿Cual es el limite superior? \n"))
n = int(input("¿Cual es el numero de intervalos? "))
x = np.zeros([n+1])
y = np.zeros([n])
z = np.zeros([n])
h = (b-a)/n
print (h)
x[0] = a
x[n] = b
suma1 = 0
suma2 = 0
for i in np.arange(1,n):
x[i] = x[i-1] + h
suma1 = suma1 + f(x[i])
alfa = (x[i]-x[i-1])/3
for i in np.arange(0,n):
y[i] = (x[i-1]+ alfa)
suma2 = suma2 + f(y[i])
z[i] = y[i] + alfa
int3 = ((b-a)/(8*n)) * (f(x[0])+f(x[n]) + (3*(suma2+f(z[i]))) + (2*(suma1)))
print (int3)
I'm not a math major but I remember helping a friend with this rule for something about waterplane area for ships.
Here's an implementation based on Wikipedia's description of the Simpson's 3/8 rule:
# The input parameters
a, b, n = 0, 1, 10
# Divide the interval into 3*n sub-intervals
# and hence 3*n+1 endpoints
x = np.linspace(a,b,3*n+1)
y = f(x)
# The weight for each points
w = [1,3,3,1]
result = 0
for i in range(0, 3*n, 3):
# Calculate the area, 4 points at a time
result += (x[i+3] - x[i]) / 8 * (y[i:i+4] * w).sum()
# result = 1.4626525814387632
You can do it using numpy.vectorize (Based on this wikipedia post):
a, b, n = 0, 1, 10**6
h = (b-a) / n
x = np.linspace(0,n,n+1)*h + a
fv = np.vectorize(f)
(
3*h/8 * (
f(x[0]) +
3 * fv(x[np.mod(np.arange(len(x)), 3) != 0]).sum() + #skip every 3rd index
2 * fv(x[::3]).sum() + #get every 3rd index
f(x[-1])
)
)
#Output: 1.462654874404461
If you use numpy's built-in functions (which I think is always possible), performance will improve considerably:
a, b, n = 0, 1, 10**6
x = np.exp(np.square(np.linspace(0,n,n+1)*h + a))
(
3*h/8 * (
x[0] +
3 * x[np.mod(np.arange(len(x)), 3) != 0].sum()+
2 * x[::3].sum() +
x[-1]
)
)
#Output: 1.462654874404461

Monte Carlo simulation of a system of polymer chain

I want to perform Monte Carlo simulation to the particles which are interacting via Lennard-Jones potential + FENE potential. I'm getting negative values in the FENE potential which have the log value in it. The error is "RuntimeWarning: invalid value encountered in log return (-0.5 * K * R**2 * np.log(1-((np.sqrt(rij2) - r0) / R)**2))" The FENE potential is given by:
import numpy as np
def gen_chain(N, R0):
x = np.linspace(1, (N-1)*0.8*R0, num=N)
y = np.zeros(N)
z = np.zeros(N)
return np.column_stack((x, y, z))
def lj(rij2):
sig_by_r6 = np.power(sigma/rij2, 3)
sig_by_r12 = np.power(sig_by_r6, 2)
lje = 4.0 * epsilon * (sig_by_r12 - sig_by_r6)
return lje
def fene(rij2):
return (-0.5 * K * R**2 * np.log(1-((np.sqrt(rij2) - r0) / R)**2))
def total_energy(coord):
# Non-bonded
e_nb = 0
for i in range(N):
for j in range(i-1):
ri = coord[i]
rj = coord[j]
rij = ri - rj
rij2 = np.dot(rij, rij)
if (np.sqrt(rij2) < rcutoff):
e_nb += lj(rij2)
# Bonded
e_bond = 0
for i in range(1, N):
ri = coord[i]
rj = coord[i-1]
rij = ri - rj
rij2 = np.dot(rij, rij)
e_bond += fene(rij2)
return e_nb + e_bond
def move(coord):
trial = np.ndarray.copy(coord)
for i in range(N):
delta = (2.0 * np.random.rand(3) - 1) * max_delta
trial[i] += delta
return trial
def accept(delta_e):
beta = 1.0/T
if delta_e <= 0.0:
return True
random_number = np.random.rand(1)
p_acc = np.exp(-beta*delta_e)
if random_number < p_acc:
return True
return False
if __name__ == "__main__":
# FENE parameters
K = 40
R = 0.3
r0 = 0.7
# LJ parameters
sigma = r0/0.33
epsilon = 1.0
# MC parameters
N = 50 # number of particles
rcutoff = 2.5*sigma
max_delta = 0.01
n_steps = 10000000
T = 0.5
coord = gen_chain(N, R)
energy_current = total_energy(coord)
traj = open('traj.xyz', 'w')
for step in range(n_steps):
if step % 1000 == 0:
traj.write(str(N) + '\n\n')
for i in range(N):
traj.write("C %10.5f %10.5f %10.5f\n" % (coord[i][0], coord[i][1], coord[i][2]))
print(step, energy_current)
coord_trial = move(coord)
energy_trial = total_energy(coord_trial)
delta_e = energy_trial - energy_current
if accept(delta_e):
coord = coord_trial
energy_current = energy_trial
traj.close()
The problem is that calculating rij2 = np.dot(rij, rij) in total energy with the constant values you use is always a very small number. Looking at the expression inside the log used to calculate FENE, np.log(1-((np.sqrt(rij2) - r0) / R)**2), I first noticed that you're taking the square root of rij2 which is not consistent with the formula you provided.
Secondly, notice that ((rij2 - r0) / R)**2 is the same as ((r0 - rij2) / R)**2, since the sign gets lost when squaring. Because rij2 is very small (already in the first iteration -- I checked by printing the values), this will be more or less equal to ((r0 - 0.05)/R)**2 which will be a number bigger than 1. Once you subtract this value from 1 in the log expression, 1-((np.sqrt(rij2) - r0) / R)**2 will be equal to np.nan (standing for "Not A Number"). This will propagate through all the function calls (for example, calling energy_trial = total_energy(coord_trial) will effectively set energy_trial to np.nan), until an error will be raised by some function.
Maybe you could do something with np.isnan() call, documented here. Moreover, you should check how you iterate through the coord (there's some inconsistencies throughout the code) -- I suggest you check the code review community as well.

How to write cos(1)

I need to find a way to write cos(1) in python using a while loop. But i cant use any math functions. Can someone help me out?
for example I also had to write the value of exp(1) and I was able to do it by writing:
count = 1
term = 1
expTotal = 0
xx = 1
while abs(term) > 1e-20:
print("%1d %22.17e" % (count, term))
expTotal = expTotal + term
term=term * xx/(count)
count+=1
I amm completely lost as for how to do this with the cos and sin values though.
Just change your expression to compute the term to:
term = term * (-1 * x * x)/( (2*count) * ((2*count)-1) )
Multiplying the count by 2 could be changed to increment the count by 2, so here is your copypasta:
import math
def cos(x):
cosTotal = 1
count = 2
term = 1
x=float(x)
while abs(term) > 1e-20:
term *= (-x * x)/( count * (count-1) )
cosTotal += term
count += 2
print("%1d %22.17e" % (count, term))
return cosTotal
print( cos(1) )
print( math.cos(1) )
You can calculate cos(1) by using the Taylor expansion of this function:
You can find more details on Wikipedia, see an implementation below:
import math
def factorial(n):
if n == 0:
return 1
else:
return n * factorial(n-1)
def cos(order):
a = 0
for i in range(0, order):
a += ((-1)**i)/(factorial(2*i)*1.0)
return a
print cos(10)
print math.cos(1)
This gives as output:
0.540302305868
0.540302305868
EDIT: Apparently the cosine is implemented in hardware using the CORDIC algorithm that uses a lookup table to calculate atan. See below a Python implementation of the CORDIS algorithm based on this Google group question:
#atans = [math.atan(2.0**(-i)) for i in range(0,40)]
atans =[0.7853981633974483, 0.4636476090008061, 0.24497866312686414, 0.12435499454676144, 0.06241880999595735, 0.031239833430268277, 0.015623728620476831, 0.007812341060101111, 0.0039062301319669718, 0.0019531225164788188, 0.0009765621895593195, 0.0004882812111948983, 0.00024414062014936177, 0.00012207031189367021, 6.103515617420877e-05, 3.0517578115526096e-05, 1.5258789061315762e-05, 7.62939453110197e-06, 3.814697265606496e-06, 1.907348632810187e-06, 9.536743164059608e-07, 4.7683715820308884e-07, 2.3841857910155797e-07, 1.1920928955078068e-07, 5.960464477539055e-08, 2.9802322387695303e-08, 1.4901161193847655e-08, 7.450580596923828e-09, 3.725290298461914e-09, 1.862645149230957e-09, 9.313225746154785e-10, 4.656612873077393e-10, 2.3283064365386963e-10, 1.1641532182693481e-10, 5.820766091346741e-11, 2.9103830456733704e-11, 1.4551915228366852e-11, 7.275957614183426e-12, 3.637978807091713e-12, 1.8189894035458565e-12]
def cosine_sine_cordic(beta,N=40):
# in hardware, put this in a table.
def K_vals(n):
K = []
acc = 1.0
for i in range(0, n):
acc = acc * (1.0/(1 + 2.0**(-2*i))**0.5)
K.append(acc)
return K
#K = K_vals(N)
K = 0.6072529350088812561694
x = 1
y = 0
for i in range(0,N):
d = 1.0
if beta < 0:
d = -1.0
(x,y) = (x - (d*(2.0**(-i))*y), (d*(2.0**(-i))*x) + y)
# in hardware put the atan values in a table
beta = beta - (d*atans[i])
return (K*x, K*y)
if __name__ == '__main__':
beta = 1
cos_val, sin_val = cosine_sine_cordic(beta)
print "Actual cos: " + str(math.cos(beta))
print "Cordic cos: " + str(cos_val)
This gives as output:
Actual cos: 0.540302305868
Cordic cos: 0.540302305869

The program run three times when I using multiprocessing module

px = 1 + x ** 2
cx = x ** 0
fx = (-5 / 16) * (1 / x ** (3 / 4)) - (29 / 16) * x ** (5 / 4)
newmann_g0 = "none"
newmann_gl = 2.5
dirichlet_u0 = 0
dirichlet_ul = "none"
# Deciding if it is uniform or geometric and finding the Nod Point & Mesh
length = 0.1
num_element = int(1 / length)
num_nod_point = num_element + 1
degree = [3 for i in range(num_element)]
nod_point = [0]
for i in range(1, num_element + 1):
nod_point += [i * length]
h = [length for _ in range(1, num_nod_point)]
max_degree = max(degree)
# Finding the Legendre Polynomial by iteration
legendre_poly = [1, x]
for n in range(1, max_degree):
legendre_poly.append(x * legendre_poly[n] * (2 * n + 1) / (n + 1) - legendre_poly[n - 1] * n / (n + 1))
# Calculating the Shape Function by iteration
shape_function = [-0.5 * x + 0.5, 0.5 * x + 0.5] + [(legendre_poly[n - 1] - legendre_poly[n - 3]) * ((1 / (2 * (2 * n - 3))) ** 0.5) for n in range(3, max_degree + 2)]
# Calculating the Derivative of Shape Function
shape_prime = [sympy.diff(y, x) for y in shape_function]
# Defining Mapping Function
mapping = []
for i in range(0, num_nod_point - 1):
mapping += [(1 - x) * nod_point[i] / 2 + (1 + x) * nod_point[i + 1] / 2]
if __name__ == "__main__"
q = multiprocessing.Queue()
p1 = multiprocessing.Process(target=da_k, args=(num_element, degree, h, shape_prime, px, mapping))
p2 = multiprocessing.Process(target=da_m, args=(num_element, degree, h, shape_function, cx, mapping))
p3 = multiprocessing.Process(target=da_f, args=(num_element, degree, h, shape_function, fx, mapping))
p1.start()
p2.start()
p3.start()
p1.join()
p2.join()
p3.join()
global_k, total_local_k = q.get()
global_m, total_local_m = q.get()
global_f, total_local_f = q.get()
print(global_k,global_m,global_f)
File "C:\Python\WinPython-64bit-3.5.1.1\python-3.5.1.amd64\lib\multiprocessing\spawn.py", line 137, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
I'm learning multiprocess module recently,and I added the multiprocess module in my code, but the program will run 3 times when I click "run", there must be something wrong of the multiprocess part,cuz the other parts works well before I adding multiprocess module. Does anyone could help me to use multi-process module correctly? Thanks very much!
Update: I checked the multiprocess code, and it works fine, so the problem should be about the queue, thanks for your help.
The RuntimeError is because you are missing a colon at the end of
if __name__ == "__main__"
It should be
if __name__ == "__main__":
With the colon missing, all the code before that line will be run when you execute the code. Also seems you haven't included all your code, can't see you put anything into Queue anywhere.

Resources