Keras: checking the sign of a tensor in an if statement

I have a function that I define as follows:

def NewLoss(y_true, y_pred):
    p = 0
    for i in range(3074):
        if (y_pred[i+1] - y_pred[i]) < 0:
            p += (y_true[i] - y_pred[i])**2
        elif (y_pred[i+1] - y_pred[i]) > 0:
            p += (y_true[i] - y_pred[i])**2 + (y_true[i] - y_pred[i])*(y_pred[i+1] - y_pred[i])**2
        else:
            p += (y_true[i] - y_pred[i])**2 + 0.5*(y_true[i] - y_pred[i])*(y_pred[i+1] - y_pred[i])**2
    return p
My y_true and y_pred are vectors. When I run code that calls this function, I get the following error:
"Using a tf.Tensor as a Python bool is not allowed".
I would like to know how to check the sign of (y_pred[i+1] - y_pred[i]) without triggering this error; I am using Keras.
Thank you very much for your help.

from keras import backend as K

def NewLoss(y_true, y_pred):
    true = y_true[:3074]
    pred = y_pred[:3074]
    predShifted = y_pred[1:3075]

    diff = true - pred
    diffShifted = predShifted - pred

    pLeftPart = K.square(diff)
    pRightPart = diff * K.square(diffShifted)

    greater = K.cast(K.greater(diffShifted, 0), K.floatx())
    equal = 0.5 * K.cast(K.equal(diffShifted, 0), K.floatx())
    mask = greater + equal

    return K.sum(pLeftPart + (mask * pRightPart))
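If you prefer to keep the three branches explicit rather than summing 0/1 masks, K.switch selects between two expressions element-wise; a sketch along the same lines (same slicing assumptions as above, NewLossSwitch is just an illustrative name):

from keras import backend as K

def NewLossSwitch(y_true, y_pred):
    diff = y_true[:3074] - y_pred[:3074]
    diffShifted = y_pred[1:3075] - y_pred[:3074]
    base = K.square(diff)
    extra = diff * K.square(diffShifted)
    # element-wise: base + extra where diffShifted > 0,
    # base + 0.5*extra where diffShifted == 0, base otherwise
    return K.sum(K.switch(K.greater(diffShifted, 0),
                          base + extra,
                          K.switch(K.equal(diffShifted, 0),
                                   base + 0.5 * extra,
                                   base)))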
Remarks:
1 - The first axis is the samples axis; perhaps you're trying to do this along the timesteps axis? If so, use:
true = y_true[:,:3074]
pred = y_pred[:,:3074]
predShifted = y_pred[:,1:3075]
2 - Differences exactly equal to zero are rare enough that you may not need the last branch of the if statement.
3 - If the max length of your tensors is 3075, you can simplify the selections:
true = y_true[:-1]
pred = y_pred[:-1]
predShifted = y_pred[1:]
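For completeness, a custom loss like this is passed to compile by reference; a minimal sketch (the toy model and shapes are assumptions, and per remark 1 the slicing axis must match your data layout):

from keras.models import Sequential
from keras.layers import Dense

# toy model whose output length matches the 3075-element slicing in NewLoss
model = Sequential([Dense(3075, input_shape=(3075,))])
model.compile(optimizer='adam', loss=NewLoss)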

Related

scipy solve_ivp with adaptive solution

I am struggling to understand how scipy.integrate.solve_ivp() handles errors in a system of ODEs. Let's say I have the following simple code for a single ODE, and I think I might be doing things wrong in some way. Let's say my rhs looks something like:
from scipy.integrate import solve_ivp

def rhs_func(t, y):
    z = 1.0 / (t - y + 1j)
    return z
Suppose we call solve_ivp with the following signature:

Z_solution = ivp_adaptive.solve_ivp(fun=rhs_func,
                                    t_span=[100, 0],
                                    y0=y0,  # some initial value, 0 for example
                                    method='RK45',
                                    t_eval=None,
                                    args=some_additional_arguments_to_rhs_func,
                                    dense_output=False,
                                    rtol=1e-8,
                                    atol=1e-10)
Now, the absolute and relative tolerances are supposed to bound the error of the calculation. The problem I am having has to do with t_eval=None in this case. Apparently, this choice lets the integrator (here of type RK45) choose the time step so that the specified tolerances (atol=1e-10, rtol=1e-8) are not exceeded, i.e., the steps are not fixed, and taking a larger step in t means a solution has been found that lies within the tolerances. This is particularly useful in problems with large variations of the time scale, where a uniform discretization of t is very inefficient.
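As a quick illustration of that behaviour, with t_eval=None the returned sol.t contains exactly the accepted (non-uniform) steps; a minimal sketch with a toy RHS (the equation is an assumption, not from the question):

import numpy as np
from scipy.integrate import solve_ivp

sol = solve_ivp(lambda t, y: -50.0 * y, t_span=[0, 1], y0=[1.0],
                method='RK45', rtol=1e-8, atol=1e-10)
print(np.diff(sol.t))  # the step sizes the solver chose; they are not uniform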
My big problem has to do with the following piece of code in scipy.integrate._ivp.solve_ivp(), around line 575, for the "t_eval is None" case:
while status is None:
    message = solver.step()

    if solver.status == 'finished':
        status = 0
    elif solver.status == 'failed':
        status = -1
        break

    t_old = solver.t_old
    t = solver.t
    y = solver.y

    if dense_output:
        sol = solver.dense_output()
        interpolants.append(sol)
    else:
        sol = None

    if events is not None:
        g_new = [event(t, y) for event in events]
        active_events = find_active_events(g, g_new, event_dir)
        if active_events.size > 0:
            if sol is None:
                sol = solver.dense_output()

            root_indices, roots, terminate = handle_events(
                sol, events, active_events, is_terminal, t_old, t)

            for e, te in zip(root_indices, roots):
                t_events[e].append(te)
                y_events[e].append(sol(te))

            if terminate:
                status = 1
                t = roots[-1]
                y = sol(t)

        g = g_new

    # HERE I HAVE MODIFIED THE FILE BY CALLING AN INTERPOLATION FUNCTION FOR THE SOLUTION
    if t_eval is None:
        ts.append(t)
        #ys.append(y)
        # this calls to adapt the solution to a new set of values x over which y(x,t) is
        # defined
        interp_solution(t, y, solver, args)
        y = solver.y
        ys.append(y)
where I have defined a function:

def interp_solution(t, y, solver, args):
    import numpy as np
    from scipy import interpolate

    x_old = args.get_old_grid()  # this call just returns an array of the style of
                                 # x_new, and is where y is defined
    x_new = np.linspace(-t, t, dim)  # the new array where components of y are defined

    y_interp = interpolate.interp1d(x_old, y)
    y_new = y_interp(x_new)
    solver.y = y_new  # update the solver y

    # finally, we change the maximum allowed step of the integrator if t is below
    # some threshold value
    if t < args.get_threshold():
        solver.max_step = #some number
    return y_new
When I look at the results, it seems that this is very sensitive to the tolerances and to how the integration steps are performed, but somehow I fail to see where errors could come from in this approach. Can anyone explain whether this approach affects the solution and the associated errors? How can one implement a similar approach in this fashion? Any help is greatly appreciated.
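For comparison, the non-invasive way to evaluate the solution on an arbitrary grid is to integrate with dense_output=True and interpolate afterwards, leaving solver.y untouched; a sketch with a toy RHS (the RHS and grid are assumptions):

import numpy as np
from scipy.integrate import solve_ivp

sol = solve_ivp(lambda t, y: y, t_span=[100, 0], y0=[1.0],
                method='RK45', rtol=1e-8, atol=1e-10, dense_output=True)
t_new = np.linspace(100, 0, 500)
y_new = sol.sol(t_new)  # error-controlled interpolants built by the solver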

How to calculate backpropagation through tf.while_loop to use as loss function

I want to implement a Fourier Ring Correlation loss for two images to train a GAN. To do so I'd like to loop a specific number of times and accumulate the loss. This works fine with a normal Python loop. To speed up the process I want to use tf.while_loop, but unfortunately I am not able to track the gradients through the while loop. I constructed a dummy example just to calculate gradients inside a while loop, but it doesn't work. First, the working Python loop:
import tensorflow as tf

x = tf.constant(3.0)
y = tf.constant(2.0)

for i in range(3):
    y = y * x

grad = tf.gradients(y, x)

with tf.Session() as ses:
    print("output : ", ses.run(grad))
This works and gives the output
[54]
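(That checks out: after three iterations y = 2*x**3, so dy/dx = 6*x**2 = 54 at x = 3.)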
If I do the same with a tf.while_loop, it doesn't work:

a = tf.constant(0, dtype=tf.int64)
b = tf.constant(3, dtype=tf.int64)
x = tf.constant(3.0)
y = tf.constant(2.0)

def cond(a, b, x, y):
    return tf.less(a, b)

def body(a, b, x, y):
    y = y * x
    with tf.control_dependencies([y]):
        a = a + 1
    return [a, b, x, y]

results = tf.while_loop(cond, body, [a, b, x, y], back_prop=True)
grad = tf.gradients(y, results[2])

with tf.Session() as ses:
    print("grad : ", ses.run(grad))
The output is:

TypeError: Fetch argument None has invalid type <class 'NoneType'>

So I guess that somehow TensorFlow is not able to do the backpropagation. The problem still occurs if you use tf.GradientTape() instead of tf.gradients().
I changed the code so that it now outputs the gradients:
import tensorflow as tf

a = tf.constant(0, dtype=tf.int64)
b = tf.constant(3, dtype=tf.int64)
x = tf.Variable(3.0, tf.float32)
y = tf.Variable(2.0, tf.float32)
dy = tf.Variable(0.0, tf.float32)

def cond(a, b, x, y, dy):
    return tf.less(a, b)

def body(a, b, x, y, dy):
    y = y * x
    dy = tf.gradients(y, x)[0]
    with tf.control_dependencies([y]):
        a = a + 1
    return [a, b, x, y, dy]

init = tf.global_variables_initializer()

with tf.Session() as ses:
    ses.run(init)
    results = ses.run(tf.while_loop(cond, body, [a, b, x, y, dy], back_prop=True))
    print("grad : ", results[-1])
The things I modified:
I made x and y into variables and added their initialisation init.
I added a variable called dy which will contain the gradient of y.
I moved the tf.while_loop inside the session.
I put the evaluation of the gradient inside the body function.
I think the problem before was that grad = tf.gradients(y, results[2]) differentiates the y defined before the loop, which does not depend on x, so there is no gradient and tf.gradients returns None.
Hope this helps.
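For reference, the gradient can also be recovered in pure graph mode without variables, by differentiating the loop output rather than the pre-loop y; a minimal sketch under the same TF 1.x semantics:

import tensorflow as tf

a = tf.constant(0, dtype=tf.int64)
b = tf.constant(3, dtype=tf.int64)
x = tf.constant(3.0)
y = tf.constant(2.0)

def cond(a, b, x, y):
    return tf.less(a, b)

def body(a, b, x, y):
    return [a + 1, b, x, y * x]

results = tf.while_loop(cond, body, [a, b, x, y])
grad = tf.gradients(results[3], x)  # differentiate the loop output y = 2*x**3

with tf.Session() as ses:
    print("grad : ", ses.run(grad))  # [54.0], i.e. 6*x**2 at x = 3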

Divide by Zero in Mean()?

I'm trying to write some code to compute the mean, variance, standard deviation, and FWHM, and finally evaluate the Gaussian integral. I've been running into a division-by-zero error that I can't get past, and I would like to know the solution.
Where it throws the error, Average = (sum(yvalues)) / (len(yvalues)), I've tried to add an exception handler as follows:

try:
    return (sum(yvalues) / len(yvalues))
except ZeroDivisionError:
    return 0
xvalues = []
yvalues = []

def generate():
    for i in range(0, 300):
        a = rand.uniform((float("-inf"), float("inf")))
        b = rand.uniform((float("-inf"), float("inf")))
        xvalues.append(i)
        ### Defining the variable 'y'
        y = a * (b + i)
        yvalues.append(y) + 1

def mean():
    Average = (sum(yvalues)) / (len(yvalues))
    print("The average is", Average)
    return Average

def varience():
    # This calculates the SD and the varience
    s = []
    for i in yvalues:
        z = i - mean()
        z = (np.abs(i - z))**2
        s.append(y)**2
    t = mean()
    v = numpy.sqrt(t)
    print("Answer for Varience is:", v)
    return v
Traceback (most recent call last):
  File "Tuesday.py", line 42, in <module>
    def make_gauss(sigma=varience(), mu=mean(), x = random.uniform((float("inf"))*-1, float("inf"))):
  File "Tuesday.py", line 35, in varience
    t = mean()
  File "Tuesday.py", line 25, in mean
    Average = (sum(yvalues))/(len(yvalues))
ZeroDivisionError: division by zero
There are a few things that are not quite right as people noted above.
import random
import numpy as np

def generate():
    xvalues, yvalues = [], []
    for i in range(0, 300):
        a = random.uniform(-1000, 1000)
        b = random.uniform(-1000, 1000)
        xvalues.append(i)
        ### Defining the variable 'y'
        y = a * (b + i)
        yvalues.append(y)
    return xvalues, yvalues

def mean(yvalues):
    return sum(yvalues) / len(yvalues)

def variance(yvalues):
    # This calculates the SD and the variance
    s = []
    yvalues_mean = mean(yvalues)
    for y in yvalues:
        z = (y - yvalues_mean)**2
        s.append(z)
    t = mean(s)
    return t

def variance2(yvalues):
    yvalues_mean = mean(yvalues)
    return sum((y - yvalues_mean)**2 for y in yvalues) / len(yvalues)

# Generate the xvalues and yvalues
xvalues, yvalues = generate()

# Now do the calculation, based on the passed parameters
mean_yvalues = mean(yvalues)
variance_yvalues = variance(yvalues)
variance_yvalues2 = variance2(yvalues)
print('Mean {} variance {} {}'.format(mean_yvalues, variance_yvalues, variance_yvalues2))

# Using Numpy
np_mean = np.mean(yvalues)
np_var = np.var(yvalues)
print('Numpy: Mean {} variance {}'.format(np_mean, np_var))
The way the variance was calculated isn't quite right, but given the comment about "SD and variance" you were probably going to calculate both.
The code above gives two (well, three) ways to do what I understand you were trying to do, but I changed a few of the methods to clean them up a bit. generate() now returns two lists, mean() returns the mean, and so on. The function variance2() gives an alternative way to calculate the variance using a list-comprehension style.
The last couple of lines are an example using numpy, which has all of this built in and, if available, is a great way to go.
The one part that wasn't clear was random.uniform(float("-inf"), float("inf")), which seems to be an error: a uniform distribution over an infinite interval is not defined.
You are calling mean before you call generate.
This is obvious because yvalues.append(y) + 1 (in generate) would otherwise have caused a different error (a TypeError), since .append returns None and you can't add 1 to None.
Change yvalues.append(y) + 1 to yvalues.append(y + 1) and then make sure to call generate before you call mean.
Also notice that you have the same kind of error in varience (which should be spelled variance, by the way): s.append(y)**2 should be s.append(y ** 2).
Another error you have is that the stacktrace shows make_gauss(sigma=varience(), mu=mean(), x = random.uniform((float("inf"))*-1, float("inf"))).
I'm pretty sure you don't actually want to call varience and mean on this line, just reference them. So also change that line to make_gauss(sigma=varience, mu=mean, x = random.uniform((float("inf"))*-1, float("inf")))
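Putting those fixes together, the intended flow would be roughly this sketch (with the infinite-range uniform calls replaced by finite bounds, since a uniform distribution over an infinite interval is undefined):

import random as rand

xvalues, yvalues = [], []

def generate():
    for i in range(300):
        a = rand.uniform(-1000, 1000)  # finite bounds are an assumption
        b = rand.uniform(-1000, 1000)
        xvalues.append(i)
        yvalues.append(a * (b + i) + 1)  # append(y + 1), not append(y) + 1

generate()  # populate yvalues first...
print("The average is", sum(yvalues) / len(yvalues))  # ...so len(yvalues) > 0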

Neural network numerical gradient check not working with matrices using Python-numpy

I'm trying to implement a simple numerical gradient check using Python 3 and numpy, to be used for a neural network.
It works well for simple 1D functions but fails when applied to matrices of parameters.
My guess is that either my cost function is not calculated correctly for a matrix, or that the way I do the numerical gradient check is somehow wrong.
See code below and thanks for your help!
import numpy as np
import random
import copy

def gradcheck_naive(f, x):
    """ Gradient check for a function f.
    Arguments:
    f -- a function that takes a single argument (x) and outputs the
         cost (fx) and its gradients grad
    x -- the point (numpy array) to check the gradient at
    """
    rndstate = random.getstate()
    random.setstate(rndstate)
    fx, grad = f(x)  # Evaluate function value at original point
    # fx = cost
    # grad = gradient
    h = 1e-4

    # Iterate over all indexes in x
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        ix = it.multi_index  # multi-index number

        random.setstate(rndstate)
        xp = copy.deepcopy(x)
        xp[ix] += h
        fxp, gradp = f(xp)

        random.setstate(rndstate)
        xn = copy.deepcopy(x)
        xn[ix] -= h
        fxn, gradn = f(xn)

        numgrad = (fxp - fxn) / (2 * h)

        # Compare gradients
        reldiff = abs(numgrad - grad[ix]) / max(1, abs(numgrad), abs(grad[ix]))
        if reldiff > 1e-5:
            print("Gradient check failed.")
            print("First gradient error found at index %s" % str(ix))
            print("Your gradient: %f \t Numerical gradient: %f" % (
                grad[ix], numgrad))
            return

        it.iternext()  # Step to next dimension

    print("Gradient check passed!")

# sanity check with 1D function
exp_f = lambda x: (np.sum(np.exp(x)), np.exp(x))
gradcheck_naive(exp_f, np.random.randn(4, 5))  # this works fine

# sanity check with matrices
# forward pass
W = np.random.randn(5, 10)
x = np.random.randn(10, 3)
D = W.dot(x)

# backpropagation pass
gradx = W
func_f = lambda x: (np.sum(W.dot(x)), gradx)
gradcheck_naive(func_f, np.random.randn(10, 3))  # this does not work (grad check fails)
I figured it out! (my math teacher would be so proud...)
The short answer is that I was mixing up the matrix dot product and the element-wise product.
When using an element-wise product, the gradient is:

W = np.array([[2, 4], [3, 5], [3, 1]])
x = np.array([[1, 7], [5, -1], [4, 7]])
D = W * x  # element-wise multiplication
gradx = W
func_f = lambda x: (np.sum(W * x), gradx)
gradcheck_naive(func_f, np.random.randn(3, 2))
When using the dot product, the gradient becomes:

W = np.array([[2, 4], [3, 5]])
x = np.array([[1, 7], [5, -1], [5, 1]])
D = x.dot(W)
unitary = np.array([[1, 1], [1, 1], [1, 1]])
gradx = unitary.dot(np.transpose(W))
func_f = lambda x: (np.sum(x.dot(W)), gradx)
gradcheck_naive(func_f, np.random.randn(3, 2))
I was also wondering how the element-wise product behaves with matrices of unequal dimensions, like below:

x = np.random.randn(10)
W = np.random.randn(3, 10)
D1 = x * W
D2 = W * x

It turns out that D1 = D2 (with the same dimensions as W, 3x10), and my understanding is that x is broadcast by numpy to a 3x10 matrix to allow the element-wise multiplication.
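A quick check of that broadcasting claim:

import numpy as np

x = np.random.randn(10)
W = np.random.randn(3, 10)
# x (shape (10,)) is broadcast across W's 3 rows, so both products
# have shape (3, 10) and are identical
assert (x * W).shape == (3, 10)
assert np.allclose(x * W, W * x)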
Conclusion: when in doubt, write it out with small matrices to figure out where the error is.

Wrong number of dimensions: expected 0, got 1 with shape (1,)

I am doing word-level language modelling with a vanilla RNN. I am able to train the model, but for some weird reason I am not able to get any samples/predictions from it; here is the relevant part of the code:
import theano
import theano.tensor as T
from random import randint

train_set_x, train_set_y, voc = load_data(dataset, vocab, vocab_enc)  # just load all data as shared variables
index = T.lscalar('index')
x = T.fmatrix('x')
y = T.ivector('y')
n_x = len(vocab)
n_h = 100
n_y = len(vocab)

rnn = Rnn(input=x, input_dim=n_x, hidden_dim=n_h, output_dim=n_y)
cost = rnn.negative_log_likelihood(y)
updates = get_optimizer(optimizer, cost, rnn.params, learning_rate)

train_model = theano.function(
    inputs=[index],
    outputs=cost,
    givens={
        x: train_set_x[index],
        y: train_set_y[index]
    },
    updates=updates
)

predict_model = theano.function(
    inputs=[index],
    outputs=rnn.y,
    givens={
        x: voc[index]
    }
)

sampling_freq = 2
sample_length = 10
n_train_examples = train_set_x.get_value(borrow=True).shape[0]

train_cost = 0.
for i in xrange(n_train_examples):
    train_cost += train_model(i)
    train_cost /= n_train_examples

    if i % sampling_freq == 0:
        # sample from the model
        seed = randint(0, len(vocab) - 1)
        idxes = []
        for j in xrange(sample_length):
            p = predict_model(seed)
            seed = p
            idxes.append(p)
        # sample = ''.join(ix_to_words[ix] for ix in idxes)
        # print(sample)
I get the error: "TypeError: ('Bad input argument to theano function with name "train.py:94" at index 0(0-based)', 'Wrong number of dimensions: expected 0, got 1 with shape (1,).')"
Now this corresponds to the following line (in the predict_model):
givens={ x: voc[index] }
Even after spending hours I am not able to comprehend how there could be a dimension mismatch when:

train_set_x has shape: (42, 4, 109)
voc has shape: (109, 1, 109)

When I do train_set_x[index], I get (4, 109), which the 'x' tensor of type fmatrix can hold (this is what happens in train_model); but when I do voc[index], I get (1, 109), which is also a matrix, yet 'x' cannot hold it. Why?!
Any help will be much appreciated. Thanks!
The error message refers to the definition of the whole Theano function named predict_model, not the specific line where the substitution with givens occurs.
The issue seems to be that predict_model gets called with an argument that is a vector of length 1 instead of a scalar. The initial seed sampled from randint is actually a scalar, but I would guess that the output p of predict_model(seed) is a vector and not a scalar.
In that case, you could either return rnn.y[0] in predict_model, or replace seed = p with seed = p[0] in the loop over j.
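In code, the two options would look roughly like this (a sketch, since the Rnn class isn't shown):

# Option 1: make predict_model return a scalar
predict_model = theano.function(
    inputs=[index],
    outputs=rnn.y[0],  # assuming rnn.y is a length-1 vector
    givens={x: voc[index]}
)

# Option 2: keep predict_model as-is and unwrap the result in the loop
p = predict_model(seed)
seed = p[0]
idxes.append(p[0])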
