Jupyter notebook randomly stops utilizing all CPU cores - python-3.x

I am currently doing a simple regression task (no ML libraries involved, just my own code) for a homework assignment. The problem is that Jupyter sometimes uses 95%+ of my CPU (this is good, I have an 8600K which I would like to utilize) but often decides not to use any extra threads at all and remains at a steady 20% usage. My feedback loop becomes six times longer just because of this.
I have looked around for any Jupyter-related settings that might be relevant but found none. Is there any explanation for this behavior?
EDIT:
Here's the code I'm currently using. The data passed in is a 30000x36 NumPy array. I don't know how Jupyter parallelizes this, but hey, it does it sometimes.
import numpy as np

def hyp(theta, X):
    return X.dot(theta)

def cost_function(theta, X, Y):
    return (1.0 / (2 * X.shape[0])) * np.sum((hyp(theta, X) - Y) ** 2)

def derivative_cost_function(theta, X, Y):
    e = hyp(theta, X) - Y
    return (1.0 / X.shape[0]) * X.T.dot(e)

def GradientDescent(X, Y, maxniter=400000):
    nexamples = float(X.shape[0])
    thetas = np.ones(X.shape[1],)
    alpha = 0.001
    print("Before:", cost_function(thetas, X, Y))
    print_iter = 100
    for i in range(maxniter):
        dtheta = derivative_cost_function(thetas, X, Y)
        thetas = thetas - alpha * dtheta
        if i % print_iter == 0:
            print(i, cost_function(thetas, X, Y))
    print("After:", cost_function(thetas, X, Y))
    return thetas

This looks more like a NumPy issue than a Jupyter issue. Take a look at https://roman-kh.github.io/numpy-multicore/ to make NumPy use more cores.
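As a quick diagnostic (not from the linked article, just a common first step), you can check which BLAS backend your NumPy build is linked against and control its thread count via environment variables; which variable takes effect depends on the backend:

import os

# These must be set before NumPy is imported; which one applies
# depends on the BLAS backend NumPy was built against.
os.environ["OMP_NUM_THREADS"] = "6"        # generic OpenMP-based backends
os.environ["OPENBLAS_NUM_THREADS"] = "6"   # OpenBLAS
os.environ["MKL_NUM_THREADS"] = "6"        # Intel MKL

import numpy as np
np.show_config()  # prints the BLAS/LAPACK libraries NumPy is linked against

If np.show_config() shows no optimized BLAS, the dot products in the code above will run single-threaded, which could explain the steady ~20% usage on a 6-core CPU.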

Related

How do I make a for loop run i over all values inside x, x being a defined array?

My code right now is as follows:
from math import *
import matplotlib.pyplot as plt
import numpy as np
"""
TITLE
"""
def f(x, y):
    for i in range(len(x)):
        y.append(exp(-x[i]) - sin(pi*x[i]/2))

def ddxf(x, y2):
    for i in range(len(x)):
        y2.append(-exp(-x[i]) - (pi/2)*cos(pi*x[i]/2))

y = []
y2 = []
f(x, y)
x = np.linspace(0, 4, 100)
plt.title('Graph of function x')
plt.xlabel('x')
plt.ylabel('f(x)')
plt.plot(x, y, 'g')
plt.grid(True)
plt.show()
x0 = float(input("Insert the approximate value of the first positive root: "))
intMax = 100
i = 0
epsilon = 1.e-7
while abs(f(x0, y)) > epsilon and i < intMax:
    x1 = x0 - (f(x0))/(ddxf(x0))
    x0 = x1
    i += 1
print(i)
print(x1)
I get this error when running the program. It seems that len(x) cannot be used if x isn't a string, which doesn't make sense to me. If the array has a length and isn't infinite, why can't len(x) read its length? Anyway, please help me. I hope I made myself clear.
Regarding the error: You are using x before defining it. In your code, you first use f(x, y) and only after that you actually define what x is (namely x = np.linspace(0, 4, 100)). You probably want to swap these two lines to fix the issue.
Regarding the question in the title: len(x) should be fine to get the length of a list. However, in Python you don't need to go through a list like that. You can instead use for element_name in list_name: this will essentially go through list_name element by element and make each element available to you under the name element_name.
There is also something called list comprehensions in Python - you might want to take a look at those and see whether you can apply it to your code.
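For example, applied to the code above (just a sketch, keeping the same formula), defining x before using it and building y with a list comprehension removes both the ordering problem and the explicit index loop:

import numpy as np
from math import exp, sin, pi

x = np.linspace(0, 4, 100)                     # define x before using it
y = [exp(-xi) - sin(pi * xi / 2) for xi in x]  # one element of y per entry of x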

scipy solve_ivp with adaptive solution

I am struggling to understand how scipy.integrate.solve_ivp() handles errors in a system of ODEs. Let's say I have the following simple code for a single ODE, and I think I might be doing things wrong in some way. Let's say my RHS looks something like:
from scipy.integrate import solve_ivp

def rhs_func(t, y):
    z = 1.0 / (t - y + 1j)
    return z
Suppose we call solve_ivp with the following signature:
Z_solution = ivp_adaptive.solve_ivp(fun=rhs_func,
                                    t_span=[100, 0],
                                    y0=y0,  # some initial value, 0 for example
                                    method='RK45',
                                    t_eval=None,
                                    args=some_additional_arguments_to_rhs_func,
                                    dense_output=False,
                                    rtol=1e-8,
                                    atol=1e-10
                                    )
Now, the absolute and relative tolerances are supposed to bound the error of the calculation. The problem I am having has to do with "t_eval=None" in this case. Apparently, this choice lets the integrator (in this case of type RK45) choose the time step according to whether the specified tolerances are exceeded or not, i.e., the steps are not fixed; taking a larger step in t means a solution has been found whose estimated error lies below the tolerances above (atol=1e-10, rtol=1e-8). This is particularly useful in problems with large variations of the time scale, where a uniform discretization of t is very inefficient.
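As a small illustration of that behavior (a toy problem, not the one above): with t_eval=None the solver returns only the steps it actually accepted, and their spacing varies with the local error estimate.

import numpy as np
from scipy.integrate import solve_ivp

# Toy ODE; the solver picks its own steps so as to satisfy rtol/atol.
sol = solve_ivp(lambda t, y: -50.0 * y, t_span=[0, 1], y0=[1.0],
                method='RK45', t_eval=None, rtol=1e-8, atol=1e-10)
print(len(sol.t))          # number of accepted steps, not a uniform grid
print(np.diff(sol.t)[:5])  # the step sizes vary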
My big problem has to do with the following piece of code in scipy.integrate._ivp.solve_ivp() around line 575, with the "t_eval == None" case:
while status is None:
    message = solver.step()

    if solver.status == 'finished':
        status = 0
    elif solver.status == 'failed':
        status = -1
        break

    t_old = solver.t_old
    t = solver.t
    y = solver.y

    if dense_output:
        sol = solver.dense_output()
        interpolants.append(sol)
    else:
        sol = None

    if events is not None:
        g_new = [event(t, y) for event in events]
        active_events = find_active_events(g, g_new, event_dir)
        if active_events.size > 0:
            if sol is None:
                sol = solver.dense_output()

            root_indices, roots, terminate = handle_events(
                sol, events, active_events, is_terminal, t_old, t)

            for e, te in zip(root_indices, roots):
                t_events[e].append(te)
                y_events[e].append(sol(te))

            if terminate:
                status = 1
                t = roots[-1]
                y = sol(t)

        g = g_new

    # HERE I HAVE MODIFIED THE FILE BY CALLING AN INTERPOLATION FUNCTION
    # FOR THE SOLUTION
    if t_eval is None:
        ts.append(t)
        # ys.append(y)
        # this call adapts the solution to a new set of values x over which
        # y(x, t) is defined
        interp_solution(t, y, solver, args)
        y = solver.y
        ys.append(y)
where I have defined a function:
def interp_solution(t, y, solver, args):
    import numpy as np
    from scipy import interpolate

    x_old = args.get_old_grid()      # this call just returns an array of the style
                                     # of x_new, and is where y is defined
    x_new = np.linspace(-t, t, dim)  # the new array where components of y are defined
    y_interp = interpolate.interp1d(x_old, y)
    y_new = y_interp(x_new)
    solver.y = y_new                 # update the solver y

    # finally, we change the maximum allowed step of the integrator if t is
    # below some threshold value
    if t < args.get_threshold():
        solver.max_step = #some number
    return y_new
When I look at the results, it seems that this is very sensitive to the tolerances and to the way the integration steps are performed, but somehow I fail to see where errors could come from in this approach. Can anyone explain whether this approach affects the solution and the associated errors? How can one implement a similar approach properly? Any help is greatly appreciated.

Weighted moving average in python with different width in different regions

I was trying to take a moving average of highly oscillating data. The oscillations are not uniform; there are fewer oscillations in the initial region.
x = np.linspace(0, 1000, 1000001)
y = some oscillating data, say np.sin(x**2)
(The original data file is huge, so I can't upload it)
I want to take a weighted moving average of the function and plot it. Initially the period of the function is larger, so I want to average over a large time interval, while later I can use a smaller time interval.
I have found a possible elegant solution in the following post:
Weighted moving average in python
However, I want to have a different width in different regions of x. Say when x is in (0, 100) I want width=0.6, while when x is in (101, 300) I want width=0.2, and so on.
This is what I have tried to implement (with my limited knowledge of programming!):
def weighted_moving_average(x, y, step_size=0.05):  # change the width to control the average
    bin_centers = np.arange(np.min(x), np.max(x) - 0.5*step_size, step_size) + 0.5*step_size
    bin_avg = np.zeros(len(bin_centers))

    # We're going to weight with a Gaussian function
    def gaussian(x, amp=1, mean=0, sigma=1):
        return amp * np.exp(-(x - mean)**2 / (2 * sigma**2))

    if x.any() < 100:
        for index in range(0, len(bin_centers)):
            bin_center = bin_centers[index]
            weights = gaussian(x, mean=bin_center, sigma=0.6)
            bin_avg[index] = np.average(y, weights=weights)
    else:
        for index in range(0, len(bin_centers)):
            bin_center = bin_centers[index]
            weights = gaussian(x, mean=bin_center, sigma=0.1)
            bin_avg[index] = np.average(y, weights=weights)

    return (bin_centers, bin_avg)
Needless to say, this is not working! I am getting the plot with only the first value of sigma. Please help...
The following snippet should do more or less what you tried to do. The main problem in your code is a logical one: x.any() < 100 will always be True (x.any() returns a boolean, which compares as 0 or 1, so it is always less than 100), so you'll never execute the second branch.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 1000)
y = np.sin(x**2)

def gaussian(x, amp=1, mean=0, sigma=1):
    return amp * np.exp(-(x - mean)**2 / (2 * sigma**2))

def weighted_average(x, y, step_size=0.3):
    weights = np.zeros_like(x)
    bin_centers = np.arange(np.min(x), np.max(x) - .5*step_size, step_size) + .5*step_size
    bin_avg = np.zeros_like(bin_centers)
    for i, center in enumerate(bin_centers):
        # Select the indices that should count towards that bin
        idx = ((x >= center - .5*step_size) & (x <= center + .5*step_size))
        weights = gaussian(x[idx], mean=center, sigma=step_size)
        bin_avg[i] = np.average(y[idx], weights=weights)
    return (bin_centers, bin_avg)

idx = x <= 4
plt.plot(*weighted_average(x[idx], y[idx], step_size=0.6))
idx = x >= 3
plt.plot(*weighted_average(x[idx], y[idx], step_size=0.1))
plt.plot(x, y)
plt.legend(['0.6', '0.1', 'y'])
plt.show()
However, depending on the usage, you could also implement a moving average directly:
x = np.linspace(0, 60, 1000)
y = np.sin(x**2)
z = np.zeros_like(x)
z[0] = x[0]

for i, t in enumerate(x[1:]):
    a = .2
    z[i+1] = a*y[i+1] + (1-a)*z[i]

plt.plot(x, y)
plt.plot(x, z)
plt.legend(['data', 'moving average'])
plt.show()
Of course you could then change a adaptively, e.g. depending on the local variance. Also note that a priori this has a small bias depending on a and the step size in x.
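For instance, a could be derived from a rolling variance estimate (a sketch only; the window size and the mapping from variance to a are arbitrary choices here):

# Sketch: smooth less (larger a) where the local variance of y is high.
window = 50
local_var = np.array([y[max(0, i - window):i + 1].var() for i in range(len(y))])
a_adaptive = 0.05 + 0.3 * local_var / local_var.max()

z = np.zeros_like(y)
z[0] = y[0]
for i in range(1, len(y)):
    z[i] = a_adaptive[i] * y[i] + (1 - a_adaptive[i]) * z[i - 1]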

repeating tests in multiple functions python

I have some functions for sound processing. Before, everything was single-channel, but now I am making it more or less multi-channel.
At this point I have the feeling that I repeat parts of the scripts over and over again.
In this example there are two functions (my original function is longer), but the same thing also happens in other scripts.
My two functions:
import numpy as np

# def FFT(x, fs, *args, **kwargs):
def FFT(x, fs, output='complex'):
    from scipy.fftpack import fft, fftfreq
    N = len(x)
    X = fft(x) / N
    if output == 'complex':
        F = np.linspace(0, N) / (N / fs)
        return(F, X, [])
    elif output == 'ReIm':
        F = np.linspace(0, N) / (N / fs)
        RE = np.real(X)
        IM = np.imag(X)
        return(F, RE, IM)
    elif output == 'AmPh0':
        F = np.linspace(0, (N-1)/2, N//2)
        F = F/(N/fs)
        # N should be int because of nfft
        half_spec = int(N / 2)
        AMP = abs(X[0:half_spec])
        PHI = np.arctan(np.real(X[0:half_spec]) / np.imag(X[0:half_spec]))
        return(F, AMP, PHI)
    elif output == 'AmPh':
        half_spec = int(N / 2)
        F = np.linspace(1, (N-1)/2, N//2 - 1)
        F = F/(N/fs)
        AMP = abs(X[1:half_spec])
        PHI = np.arctan(np.real(X[1:half_spec])/np.imag(X[1:half_spec]))
        return(F, AMP, PHI)
def mFFT(x, fs, spectrum='complex'):
    fft_shape = np.shape(x)

    if len(fft_shape) == 1:
        mF, mX1, mX2 = FFT(x, fs, spectrum)
    elif len(fft_shape) == 2:
        if fft_shape[0] < fft_shape[1]:
            pass
        elif fft_shape[0] > fft_shape[1]:
            x = x.T
            fft_shape = np.shape(x)
        mF = mX1 = mX2 = []
        for channel in range(fft_shape[0]):
            si_mF, si_mX1, si_mX2 = FFT(x[channel], fs, spectrum)
            if channel == 0:
                mF = np.append(mF, si_mF)
                mX1 = np.append(mX1, si_mX1)
                mX2 = np.append(mX2, si_mX2)
            else:
                mF = np.vstack((mF, si_mF))
                mX1 = np.vstack((mX1, si_mX1))
                if si_mX2 == []:
                    pass
                else:
                    mX2 = np.vstack((mX2, si_mX2))
    elif len(fft_shape) > 2:
        raise ValueError("Shape of input can't be greater than 2")
    return(mF, mX1, mX2)
The second function is where the problem is in this case.
The reason for these checks is best understood with an example:
I have recorded a sample of 1 second of audio data with 4 microphones,
so I have an ndim array of 4 x 44100 samples.
The FFT works on any even-length array, which means that I get a result in both situations (4 x 44100 and 44100 x 4).
All functions after this one also deal with 2 data types: either a complex signal or a tuple of two signals (amplitude and phase)... which creates an extra switch/check in the script:
check the type (tuple or complex data)
check the direction (and change it)
check the size/shape
run the function and append/stack the results
Are there methods to make this less repetitive? I have this situation in at least 10 functions...
Bert,
The problem as I understand it is the repetition of the checks of all sorts you're making. I don't follow all of them, but I'm guessing they are there to format your data in a way that lets you run the FFT on it.
One of the philosophies of programming in Python is "It's easier to ask forgiveness than it is to get permission." [1] This means you should probably try first and then ask forgiveness (try/except). It's much faster to do it this way than to run lots of checks on the values. Also, those who are going to use your program should be able to understand how it works fairly easily; make it easy to read without those checks by separating the business logic from the technical logic. Don't worry, it's not obvious, and the fact that you're asking is an indicator that you noticed something isn't right :).
Here is what I would propose for your case (and it's not the perfect solution!):
def mFFT(x, fs, spectrum='complex'):
    # Assume we're correctly aligned when receiving the data
    # :param: x -- assume that we're multi-channel in the format [channel X soundtrack]
    # Also, don't do this:
    # mF = mX1 = mX2 = []
    # see why: https://stackoverflow.com/questions/2402646/python-initializing-multiple-lists-line
    mF = []
    mX1 = []
    mX2 = []
    try:
        for channel in range(len(x)):
            si_mF, si_mX1, si_mX2 = FFT(x[channel], fs, spectrum)
            mF.append(si_mF)
            mX1.append(si_mX1)
            mX2.append(si_mX2)
        return (mF, mX1, mX2)
    except:
        # This is where you would try to point out why it could have failed.
        # One good check you had was for the orientation of the data; try again:
        if np.shape(x)[0] > np.shape(x)[1]:
            result = mFFT(x.T, fs, spectrum)
            return result
        else:
            if np.shape(x)[0] > 2:
                raise ValueError("Shape of input isn't supported for greater than 2")
I gave an example because I believe you expected one, but I'm not giving the perfect answer away ;). The problem you have is a design problem, and no, there is no easy solution. What I propose is to start by assuming that the order is always in the format [n-th channel X sample size] (i.e. [4 channels X 44100 samples]). That way, you try it out first like this (as in try/except), then maybe in the inverse order.
Another suggestion (and it really depends on your use case) would be to make a data structure class that manipulates the FFT data and returns the complex, ReIm, AmPh0, or AmPh form via getters (so you treat the input data as always being in the time domain, and you just give the users what they want).
import numpy as np

class FFT(object):
    def __init__(self, x, fs):
        from scipy.fftpack import fft, fftfreq
        self.N = len(x)
        self.fs = fs
        self.X = fft(x) / self.N

    def get_complex(self):
        F = np.linspace(0, self.N) / (self.N / self.fs)
        return(F, self.X, [])

    def get_ReIm(self):
        F = np.linspace(0, self.N) / (self.N / self.fs)
        RE, IM = np.real(self.X), np.imag(self.X)
        return(F, RE, IM)

    def get_AmPh0(self):
        F = np.linspace(0, (self.N-1)/2, self.N//2) / (self.N/self.fs)
        # N should be int because of nfft
        half_spec = int(self.N / 2)
        AMP = abs(self.X[:half_spec])
        PHI = np.arctan(np.real(self.X[:half_spec]) / np.imag(self.X[:half_spec]))
        return(F, AMP, PHI)
This can then be called from another class depending on the desired output, with an eval to get the desired getter (but that requires you to use the same convention across your code ;) ).
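A hypothetical usage sketch of the class above (the 440 Hz test tone is made up purely for illustration):

fs = 44100
t = np.arange(0, 1, 1 / fs)
signal = np.sin(2 * np.pi * 440 * t)  # one second of a 440 Hz test tone

spec = FFT(signal, fs)
F, AMP, PHI = spec.get_AmPh0()  # amplitude/phase half-spectrum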

How to handle a varying number of parameters on a function?

I'm trying to construct code that gives a square's area and a rectangle's area with the same function, but whatever I do, I'm either running into a missing-positional-argument error or something more exotic, and I was flabbergasted by the potential solutions out there, as I'm only a very basic-level Python coder.
The biggest question is what form the area() function should take so that it assumes y is None if it's not given.
def area(x, y):
    return x * x if y is None else x * y  # Calculate area for square and rectangle

def main():
    print("Square's area is {:.1f}".format(area(3)))        # Square
    print("Rectangle's area is {:.1f}".format(area(4, 3)))  # Rectangle

main()
Do it like so:
def area(x, y=None):
    return x * x if y is None else x * y  # Calculate area for square and rectangle
By giving y a default value, you may pass one argument fewer, and y will be set to the default.
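With the default in place, the original main() works for both calls; a quick sanity check:

print(area(3))     # 9  -- y defaults to None, so the square branch is used
print(area(4, 3))  # 12 -- both arguments given, so the rectangle branch is used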
