Theano / Keras: Set the K-first values of Tensor to a value - theano

I have a custom Keras layer and i use Theano as the backend and i want to do the following operation:
Suppose we have a tensor with shape (N,). I want to set the K first values to a fixed value x (3 or whatever...). How do i do that? I assume i have to use argsort but i don't know how to implement it using Theano.
For example in a simple FF layer, how can i set the first N values of the tensor a to a value x?
def call(self, x, mask=None):
a = K.dot(x, self.W)
if self.bias:
a += self.b
return a
For example using numpy i can set the 10 first values of a to 1 like this:
a[a.argsort() <= 10] = 1
I want to do the same using Keras backend functions or Theano functions.

For a[a.argsort() <= 10] = 1 the equivalent code will be:
import numpy as np
from keras import backend as K
a = np.asarray([[3,4,2,44,22,4,5,6,77,86,3,2,3,23,44,21],
[3,4,22,44,2,4,54,6,77,8,3,2,36,23,4,2]], dtype=np.float)
a_t = K.variable(a)
a[a.argsort() <= 10] = 1
print a
arg_sort = K.T.argsort(a_t)
_cond = K.lesser_equal(arg_sort,10)
a_new = K.switch(_cond, 1, a_t)
print K.eval(a_new)
In a single statement it will be:
a_new = K.switch(K.lesser_equal(K.T.argsort(a_t),10), 1, a_t)
print K.eval(a_new)
I am not sure if this is exactly what you need.

Related

Pytorch: Custom thresholding activation function - gradient

I created an activation function class Threshold that should operate on one-hot-encoded image tensors.
The function performs min-max feature scaling on each channel followed by thresholding.
class Threshold(nn.Module):
def __init__(self, threshold=.5):
super().__init__()
if threshold < 0.0 or threshold > 1.0:
raise ValueError("Threshold value must be in [0,1]")
else:
self.threshold = threshold
def min_max_fscale(self, input):
r"""
applies min max feature scaling to input. Each channel is treated individually.
input is assumed to be N x C x H x W (one-hot-encoded prediction)
"""
for i in range(input.shape[0]):
# N
for j in range(input.shape[1]):
# C
min = torch.min(input[i][j])
max = torch.max(input[i][j])
input[i][j] = (input[i][j] - min) / (max - min)
return input
def forward(self, input):
assert (len(input.shape) == 4), f"input has wrong number of dims. Must have dim = 4 but has dim {input.shape}"
input = self.min_max_fscale(input)
return (input >= self.threshold) * 1.0
When I use the function I get the following error, since the gradients are not calculated automatically I assume.
Variable._execution_engine.run_backward(RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
I already had a look at How to properly update the weights in PyTorch? but could not get a clue how to apply it to my case.
How is it possible to calculate the gradients for this function?
Thanks for your help.
The issue is you are manipulating and overwriting elements, this time of operation can't be tracked by autograd. Instead, you should stick with built-in functions. You example is not that tricky to tackle: you are looking to retrieve the minimum and maximum values along input.shape[0] x input.shape[1]. Then you will scale your whole tensor in one go i.e. in vectorized form. No for loops involved!
One way to compute min/max along multiple axes is to flatten those:
>>> x_f = x.flatten(2)
Then, find the min-max on the flattened axis while retaining all shapes:
>>> x_min = x_f.min(axis=-1, keepdim=True).values
>>> x_max = x_f.max(axis=-1, keepdim=True).values
The resulting min_max_fscale function would look something like:
class Threshold(nn.Module):
def min_max_fscale(self, x):
r"""
Applies min max feature scaling to input. Each channel is treated individually.
Input is assumed to be N x C x H x W (one-hot-encoded prediction)
"""
x_f = x.flatten(2)
x_min, x_max = x_f.min(-1, True).values, x_f.max(-1, True).values
x_f = (x_f - x_min) / (x_max - x_min)
return x_f.reshape_as(x)
Important note:
You would notice that you can now backpropagate on min_max_fscale... but not on forward. This is because you are applying a boolean condition which is not a differentiable operation.

Errors when Building up a Custom Loss Function

I try to build up my own loss function as follows
import numpy as np
from keras import backend as K
def MyLoss(self, x_input, x_reconstruct):
a = np.copy(x_reconstruct)
a = np.asarray(a, dtype='float16')
a = np.floor(4*a)/4
return K.mean(K.square(a - x_input), axis=-1)`
In compilation, it says
ValueError: setting an array element with a sequence
Both x_input and x_reconstruct are [m, n, 1] np arrays. The last line of code is actually copied directly from Keras' built-in MSE loss function.
Also, I suppose loss is calculated per sample. If dimensions of the input and reconstructed input are both [m, n, 1], the result of Keras' built-in loss will also be a matrix sized [m, n]. So why does it work properly?
I then tried to us np's functions directly by
def MyLoss(self, x_input, x_reconstruct):
a = np.copy(x_reconstruct)
a = np.asarray(a, dtype=self.precision)
a = np.floor(4*a)/4
Diff = a - x_input
xx = np.mean(np.square(Diff), axis=-1)
yy = np.sum(xx)
return yy
yet the error persists. What mistake did I make? How should write the code?
Having borrowed the suggestion from Make a Custom loss function in Keras in detail, I tried following
def MyLoss(self, x_input, x_reconstruct):
if self.precision == 'float16':
K.set_floatx('float16')
K.set_epsilon(1e-4)
a = K.cast_to_floatx(x_input)
a = K.round(a*4.-0.5)/4.0
return K.sum(K.mean(K.square(x_input-a), axis=-1))
But the same error happens
You can not use numpy arrays in your loss. You have to use TensorFlow or Keras backend operations. Try this maybe:
import tensorflow as tf
import keras.backend as K
def MyLoss(x_input, x_reconstruct):
a = tf.cast(x_input, dtype='tf.float16')
a = tf.floor(4*a)/4
return K.mean(K.square(a - x_input), axis=-1)
I found the answer myself, and let me share it here
If I write code like this
def MyLoss(self, y_true, y_pred):
if self.precision == 'float16':
K.set_floatx('float16')
K.set_epsilon(1e-4)
return K.mean(K.square(y_true-K.round(y_pred*4.-0.5)/4.0), axis=-1)
It works. The trick is, I think, that I cannot use 'K.cast_to_floatx(y_true)'. Instead, simply use y_true directly. I still do not understand why...

Is it possible to obtain batch size in keras layer

I'm design a specific keras layer using lambda function,how can I get the dynamic batch_size in the function?
I tried many times to solve this problem,but all failed.
def minus(inputs):
x,y = inputs
batch_size=K.shape(x)[0]
e = K.get_variable_shape(x)
for k in range(e[0]):
for i in range(e[1]):
for j in range(e[2]):
if x[k][i][j]==0:
K.update(x[k][i][j], y[k][i][j])
return x
def mymodel():
inpA = keras.layers.Input(shape=(10,8),name='InputLayerA')
inpB = keras.layers.Input(shape=(10,8),name='InputLayerB')
print(inpA.shape)
middle = keras.layers.Lambda(minus,name='minus')([inpA,inpB])
ae = keras.Model([inpA,inpB],middle)
ae.summary()
return ae
When I new a model,like ae = mymodel() .I except a new x tensor,but the actual is error message:'NoneType' object cannot be interpreted as an integer.
When using tensorflow I am using tf.shape(x) to get the batch-size for model layers, so I think you are right that K.shape(x) (keras equivalent?) can be used.
If i understand what you try to do correct, wouldn't this be sufficient? With the bonus of avoiding slow python for-loops.
import keras
import keras.backend as K
def minus(inputs):
x,y = inputs
change_index = K.cast(K.equal(x, 0),'float32')
return x*(1-change_index)+y*change_index
def mymodel():
inpA = keras.layers.Input(shape=(10,8),name='InputLayerA')
inpB = keras.layers.Input(shape=(10,8),name='InputLayerB')
print(inpA.shape)
middle = keras.layers.Lambda(minus,name='minus')([inpA,inpB])
ae = keras.Model([inpA,inpB],middle)
ae.summary()
return ae
ae = mymodel()

Padding sequences in tensorflow with tf.pad

I am trying to load imdb dataset in python. I want to pad the sequences so that each sequence is of same length. I am currently doing it with numpy. What is a good way to do it in tensorflow with tf.pad. I saw the given here but I dont know how to apply it with a 2 d matrix.
Here is my current code
import tensorflow as tf
from keras.datasets import imdb
max_features = 5000
print('Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
def padSequence(dataset,max_length):
dataset_p = []
for x in dataset:
if(len(x) <=max_length):
dataset_p.append(np.pad(x,pad_width=(0,max_length-len(x)),mode='constant',constant_values=0))
else:
dataset_p.append(x[0:max_length])
return np.array(x_train_p)
max_length = max(len(x) for x in x_train)
x_train_p = padSequence(x_train,max_length)
x_test_p = padSequence(x_test,max_length)
print("input x shape: " ,x_train_p.shape)
Can someone please help ?
I am using tensorflow 1.0
In Response to the comment:
The padding dimensions are given by
# 'paddings' is [[1, 1,], [2, 2]].
I have a 2 d matrix where every row is of different length. I want to be able to pad to to make them of equal length. In my padSequence(dataset,max_length) function, I get the length of every row with len(x) function. Should I just do the same with tf ? Or is there a way to do it like Keras Function
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=maxlen)
If you want to use tf.pad, according to me you have to iterate for each row.
Code will be something like this:
max_length = 250
number_of_samples = 5
padded_data = np.ndarray(shape=[number_of_samples, max_length],dtype=np.int32)
sess = tf.InteractiveSession()
for i in range(number_of_samples):
reviewToBePadded = dataSet[i] #dataSet numpy array
paddings = [[0,0], [0, maxLength-len(reviewToBePadded)]]
data_tf = tf.convert_to_tensor(reviewToBePadded,tf.int32)
data_tf = tf.reshape(data_tf,[1,len(reviewToBePadded)])
data_tf = tf.pad(data_tf, paddings, 'CONSTANT')
padded_data[i] = data_tf.eval()
print(padded_data)
sess.close()
New to Python, possibly not the best code. But I just want to explain the concept.

How to get a theano function to return the an array of the same length as another tensor variable

I am really new to Theano, and I am just trying to figure out some basic functionality. I have a tensor variable x, and i would like the functio to return a tensor variable y of the same shape, but filled with value 0.2. I am not sure how to define y.
For example if x = [1,2,3,4,5], then I would like y = [0,2, 0,2, 0,2, 0,2, 0.2]
from theano import tensor, function
y = tensor.dmatrix('y')
masked_array = function([x],y)
There's probably a dozen different ways to do this and which is best will depend on the context: how this piece of code/functionality fits into the wider program.
Here's one approach:
import theano
import theano.tensor as tt
x = tt.vector()
y = tt.ones_like(x) * 0.2
f = theano.function([x], outputs=y)
print f([1, 2, 3, 4, 5])

Resources