Keras: how to add weights to loss evaluation

I would like to add a weight for each pattern loss in a given Keras loss function.
For example: if the error on pattern i is l_i, I would like to consider, instead, an error l_i * c_i, where c_i is an input scalar.

def customloss(y_true, y_pred):
    c_i = ...
    loss = ...  # only use tensor operations on y_true and y_pred, or use built-in Keras losses
    return c_i * loss
Now compile your model, passing the loss function:
model.compile(loss=customloss)
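To make the arithmetic concrete, here is a minimal framework-free sketch of what weighting each pattern's loss by c_i means. The helper name `weighted_mean_loss` and the sample numbers are illustrative assumptions, not part of Keras. Note that Keras also accepts per-sample weights through the `sample_weight` argument of `model.fit`, which may be simpler than a custom loss when c_i varies per sample.

```python
def weighted_mean_loss(per_sample_losses, weights):
    """Average of l_i * c_i over the batch (hypothetical helper, not a Keras API)."""
    if len(per_sample_losses) != len(weights):
        raise ValueError("need exactly one weight per sample")
    weighted = [l_i * c_i for l_i, c_i in zip(per_sample_losses, weights)]
    return sum(weighted) / len(weighted)


# Illustrative values: the second pattern is ignored, the third counts double.
losses = [0.5, 2.0, 1.0]
weights = [1.0, 0.0, 2.0]
print(weighted_mean_loss(losses, weights))
```

A weight of 0 removes a pattern from the objective entirely, while a weight above 1 makes its error count more.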

Related

Can I use keras.losses.binary_crossentropy(y_true,y_pred) without training process?

I am new to Keras. I want to know the loss of certain instances, so I got the y_true and y_pred of these data instances. I want to call the loss function to calculate the loss, but I only get Tensor("Mean_5:0",shape=(),dtype=float32). How can I evaluate the value of the tensor? Is it similar to TensorFlow, by calling loss.eval()?
y_pred is calculated by:
y_pred = self.model.predict(x, batch_size=self.batch_size)
y_true is also an available list.
How to use binary_crossentropy()?
You almost had the answer.
from keras import backend
from keras.losses import binary_crossentropy
y_true = backend.variable(y_true)
y_pred = backend.variable(y_pred)
# calculate the average cross-entropy
mean_ce = backend.eval(binary_crossentropy(y_true, y_pred))
print('Average Cross Entropy: %.3f nats' % mean_ce)
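As a cross-check on the value printed above, the same quantity can be computed by hand. This is a minimal pure-Python sketch of mean binary cross-entropy; the function name `binary_crossentropy_mean` is hypothetical, and the clipping constant `eps=1e-7` is an assumption chosen to avoid log(0), so tiny differences from the Keras result are possible.

```python
import math


def binary_crossentropy_mean(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy in nats (hypothetical helper, not a Keras API)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log() never sees 0
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(y_true)


# Illustrative labels and predicted probabilities.
print('Average Cross Entropy: %.3f nats' % binary_crossentropy_mean([1, 0], [0.9, 0.1]))
```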

How to create a custom loss function in Keras that evaluates prediction after each epoch?

I'm working on a neural network in Keras that translates English sentences into a custom language. For this, I'd like to create a custom loss function that takes the prediction for each sentence and evaluates whether it complies with the grammar rules of the custom language and if not adds value to the standard loss function.
How can I evaluate a tensor after each epoch but not during compilation?
Below is my custom loss function. As during compilation of the model there is no batch yet, y_pred has the shape (None, x, y) and can't be evaluated to get the prediction. My idea to circumvent this was to assign a standard loss function during compilation and when batches arrive calculate the custom loss. Unfortunately the custom loss is never reached.
def custom_loss(tokenizer, punishment_rate):
    def compile_loss(y_true, y_pred):
        shape = K.int_shape(y_pred)
        # standard loss function
        loss = K.sparse_categorical_crossentropy(y_true, y_pred)
        if shape[0] is not None:
            # THIS is never reached and that's the problem
            prediction = logits_to_text(K.eval(y_pred), tokenizer)
            # test if prediction complies to grammar rules
            compileable = compiles(prediction) ^ 1
            compile_error = compileable * punishment_rate
            loss = K.sparse_categorical_crossentropy(y_true, y_pred, axis=-1) * (1 + compile_error)
        return loss
    return compile_loss
Is there any workaround for evaluating a tensor only when it was filled with a batch? Or alternatively, change the loss function after compilation of the model via a callback without it having to recompile the model?
As per keras source, you can use a Loss Function Wrapper to create a Custom Loss Function class and then pass it to your model seamlessly.
As an example:
# Import the wrapper
from keras.losses import LossFunctionWrapper
# Assumed import paths for the names used below; they may differ between Keras versions
from keras.utils import losses_utils
from keras import backend as K
# Create your class extending the wrapper
class MyLossFunction(LossFunctionWrapper):
    # Implement the constructor; here you can pass extra arguments to it.
    def __init__(self,
                 tokenizer,
                 punishment_rate,
                 reduction=losses_utils.Reduction.SUM_OVER_BATCH_SIZE,
                 name='my_custom_text_function'):
        super(MyLossFunction, self).__init__(
            my_function,
            name=name,
            reduction=reduction,
            tokenizer=tokenizer,
            punishment_rate=punishment_rate)
# Now you have to define your function "my_function".
# Please notice that ALL loss functions used with a Keras model need two arguments:
# y_true (the correct result) and y_pred (the result obtained from the network).
def my_function(y_true, y_pred, tokenizer, punishment_rate):
    shape = K.int_shape(y_pred)
    if shape[0] is not None:
        prediction = logits_to_text(K.eval(y_pred), tokenizer)
        # test if prediction complies to grammar rules
        compileable = compiles(prediction) ^ 1
        compile_error = compileable * punishment_rate
        return K.sparse_categorical_crossentropy(y_true, y_pred, axis=-1) * (1 + compile_error)
    return K.sparse_categorical_crossentropy(y_true, y_pred)
You can then instantiate it and pass it to compile:
custom_loss = MyLossFunction(tokenizer=..., punishment_rate=...)
classifier.compile(optimizer=optimizer,
                   loss=custom_loss,
                   metrics=['binary_accuracy'])
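The penalty arithmetic used in both versions of the loss can be sketched in isolation. The function name `penalized_loss` and its boolean `compiles_ok` argument are hypothetical stand-ins for the result of `compiles(prediction)`. This only illustrates how the factor `(1 + compile_error)` scales the base loss; it does not address how to obtain concrete predictions inside a symbolic loss.

```python
def penalized_loss(base_loss, compiles_ok, punishment_rate):
    """Scale base_loss by (1 + punishment_rate) when the grammar check fails.

    Hypothetical helper mirroring `compiles(prediction) ^ 1` from the question.
    """
    not_compilable = int(compiles_ok) ^ 1  # 1 if the check failed, 0 if it passed
    compile_error = not_compilable * punishment_rate
    return base_loss * (1 + compile_error)


print(penalized_loss(2.0, True, 0.5))   # check passed: loss is unchanged
print(penalized_loss(2.0, False, 0.5))  # check failed: loss is scaled by 1.5
```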

How to regularize a layer's kernel weights and bias weights in a single regularization function?

The Keras documentation introduces separate classes for weight regularization and bias regularization. These can be subclassed to add a custom regularizer. An example from the Keras docs:
def my_regularizer(x):
    return 1e-3 * tf.reduce_sum(tf.square(x))
where x can be either the kernel weights or the bias weights. However, I want to regularize my layer with a function that includes both the layer weights and the layer bias. Is there a way to incorporate both into a single function?
For example, I would like to have as regularizer:
def l1_special_reg(weight_matrix, bias_vector):
    return 0.01 * K.sum(K.abs(weight_matrix) - K.abs(bias_vector))
Thanks,
You can read model.layers[idx].trainable_weights, which returns both the kernel weights and the bias. After that you can manually add the regularization loss to the model's loss function as follows:
model.layers[-1].trainable_weights
[<tf.Variable 'dense_2/kernel:0' shape=(100, 10) dtype=float32_ref>,
<tf.Variable 'dense_2/bias:0' shape=(10,) dtype=float32_ref>]
Complete example with loss function:
import keras
from keras import backend as K
# define model
def l1_reg(weight_matrix):
    return 0.01 * K.sum(K.abs(weight_matrix))
wts = model.layers[-1].trainable_weights  # -1 selects the last dense layer
reg_loss = l1_reg(wts[0]) + l1_reg(wts[1])  # kernel term + bias term
def custom_loss(reg_loss):
    def orig_loss(y_true, y_pred):
        return K.categorical_crossentropy(y_true, y_pred) + reg_loss
    return orig_loss
model.compile(loss=custom_loss(reg_loss),
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
In TensorFlow 2 this can be achieved with the model.add_loss() function. Say you have the weights and bias tensors of some layer (trainable_weights is a property, not a method):
w, b = layer.trainable_weights
Then you can regularize this layer by adding the regularization function as a loss term to the model object. Wrapping the call in a zero-argument lambda lets the term be recomputed from the current variable values:
def l1_special_reg(weight_matrix, bias_vector):
    return 0.01 * K.sum(K.abs(weight_matrix) - K.abs(bias_vector))
model.add_loss(lambda: l1_special_reg(w, b))
Naturally, you can do this for each layer independently.
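One detail of `l1_special_reg` is easy to miss: subtracting a bias vector of shape (10,) from a kernel of shape (100, 10) broadcasts the bias across every row, so each bias term is subtracted once per row before the sum. The following pure-Python sketch reproduces that arithmetic with nested lists; the name `l1_special_reg_lists` and the small example values are assumptions for illustration only.

```python
def l1_special_reg_lists(weight_matrix, bias_vector, scale=0.01):
    """List-based version of 0.01 * sum(|W| - |b|) with row-wise broadcasting.

    Hypothetical helper; weight_matrix is a list of rows, each as long as bias_vector.
    """
    total = 0.0
    for row in weight_matrix:
        for w_ij, b_j in zip(row, bias_vector):
            total += abs(w_ij) - abs(b_j)  # the bias is subtracted in every row
    return scale * total


W = [[1.0, -2.0],
     [3.0, -4.0]]
b = [0.5, -0.5]
# sum|W| = 10, and sum|b| = 1 is subtracted once per row (2 rows), giving 8 before scaling.
print(l1_special_reg_lists(W, b))
```

If the bias should be counted only once, the two sums can be taken separately, for example `K.sum(K.abs(w)) - K.sum(K.abs(b))`.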

Vector regression with Keras

Suppose, for example, a regression problem with five scalars as output, where each output has approximately the same range. In Keras, we can model this using a 5-output dense layer without activation function (vector regression):
output_layer = layers.Dense(5, activation=None)(previous_layer)
model = models.Model(input_layer, output_layer)
model.compile(optimizer='rmsprop', loss='mse', metrics=['mse'])
Is the total loss (metric) simply the sum of the individual losses (metrics)? Is this equivalent to the following multi-output model, where the outputs have the same implicit loss weights? In my experiments, I haven't observed any significant differences but want to make sure that I didn't miss anything fundamental.
output_layer_list = []
for _ in range(5):
    output_layer_list.append(layers.Dense(1, activation=None)(previous_layer))
model = models.Model(input_layer, output_layer_list)
model.compile(optimizer='rmsprop', loss='mse', metrics=['mse'])
Is there an easy way to attach weights to the outputs in the first solution similar to specifying loss_weights in case of multi-output models?
Those models are equivalent up to a constant factor. To answer your questions, let's look at the mse loss:
def mean_squared_error(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)
Is the total loss (metric) simply the sum of the individual losses (metrics)? Almost: the mse loss applies K.mean over the last axis, so for the vector output it is the mean of the per-output squared errors, that is, their sum divided by 5.
Is this equivalent to the following multi-output model, where the outputs have the same implicit loss weights? Yes, up to that constant factor: subtraction and squaring are done element-wise, so five scalar outputs produce the same squared errors as a single vector output, and a multi-output model's loss is the sum of the losses of the individual outputs (each with a default loss weight of 1).
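The relationship between the two formulations can be checked with a small framework-free sketch. The `mse` helper and the example values below are illustrative assumptions. The sketch compares the loss of one 5-element output against the sum of five single-output losses for a single sample.

```python
def mse(y_true, y_pred):
    """Mean squared error over the last axis of one sample (hypothetical helper)."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.5, 2.0, 2.0, 4.0, 7.0]

# One Dense(5) output: the mean over the five squared errors.
vector_loss = mse(y_true, y_pred)

# Five Dense(1) outputs: each loss is a single squared error, and the totals are summed.
multi_output_loss = sum(mse([t], [p]) for t, p in zip(y_true, y_pred))

print(vector_loss, multi_output_loss)
```

The two values differ by a factor of 5 (the number of outputs). That scales the gradients by a constant but leaves the location of the minimum unchanged.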
Yes, both are equivalent. To replicate the loss_weights functionality with your first model, you can define your own custom loss function. Something along these lines:
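import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K
# one weight per output; shape (1, 5) so it can be transposed for matmul
weights = K.variable(value=np.array([[0.1, 0.1, 0.1, 0.1, 0.6]]))
def custom_loss(y_true, y_pred):
    # (batch, 5) x (5, 1) -> (batch, 1): weighted sum of squared errors per sample
    return tf.matmul(K.square(y_true - y_pred), tf.transpose(weights))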
and pass this function to the loss argument upon compiling:
model.compile(optimizer='rmsprop', loss=custom_loss, metrics=['mse'])
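For a single sample, the matrix product in `custom_loss` reduces to a weighted sum of the squared errors. Here is a minimal pure-Python sketch of that computation; the helper name `weighted_squared_error` and the example values are assumptions for illustration.

```python
def weighted_squared_error(y_true, y_pred, weights):
    """Sum over outputs of w_k * (y_true_k - y_pred_k)^2 (hypothetical helper)."""
    return sum(w * (t - p) ** 2 for t, p, w in zip(y_true, y_pred, weights))


weights = [0.1, 0.1, 0.1, 0.1, 0.6]
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.5, 2.0, 2.0, 4.0, 7.0]

# The last output has the largest weight, so its error dominates the result.
print(weighted_squared_error(y_true, y_pred, weights))
```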

keras error when using custom loss

I want to use a simple BiLSTM model with my own custom loss function in Keras.
See below.
model = Sequential()
model.add(Bidirectional(LSTM(128, return_sequences=True), input_shape=(1,8)))
model.add(Bidirectional(LSTM(128)))
model.add(Dense(64, activation='relu'))
model.add(Dense(20, activation='softmax'))
def my_loss_np(y_true, y_pred):
    labels = [np.argmax(y_pred[i]) for i in range(y_pred.shape[1])]
    loss = np.mean(labels)
    return loss
import keras.backend as K
def my_loss(y_true, y_pred):
    loss = K.eval(my_loss_np(K.eval(y_true), K.eval(y_pred)))
    return loss
When I compile this model, I get an error -
model.compile(loss=my_loss, optimizer='adam')
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'dense_95_target' with dtype float and shape [?,?]
[[Node: dense_95_target = Placeholder[dtype=DT_FLOAT, shape=[?,?], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
There are several issues with your loss function:
You are using NumPy on tensors. Intuitive as that is, it doesn't work. You need to use tensor operations from the Keras backend instead; they are very similar.
To that end you are calling K.eval, but at this stage you are still constructing a symbolic computation graph that will later be run by TensorFlow or Theano. The tensors don't have a value to compute per se, so you need to keep everything symbolic; you can't read out values the way you do in NumPy.
Even if you fix the problems above, you are using argmax, a non-differentiable operation, which will not work with gradient descent algorithms.
Your model looks like a multi-class classification problem with 20 classes, since your final layer has 20 units with softmax. In this case, the literature uses the categorical cross-entropy loss to train the classifier network.
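The point about argmax can be illustrated without any framework. In this minimal sketch (the helper names and probabilities are illustrative assumptions), a small change in the predicted probabilities leaves the argmax unchanged, so a loss built on it gives gradient descent nothing to follow, whereas the cross-entropy of the true class responds to the same change.

```python
import math


def argmax(values):
    """Index of the largest entry (hypothetical helper)."""
    return max(range(len(values)), key=lambda i: values[i])


def cross_entropy(true_index, probs):
    """Categorical cross-entropy for one sample with a one-hot target."""
    return -math.log(probs[true_index])


probs = [0.2, 0.5, 0.3]
nudged = [0.21, 0.49, 0.3]  # a small change in the prediction

# argmax is piecewise constant: the nudge does not change it at all.
argmax_change = argmax(nudged) - argmax(probs)

# Cross-entropy for true class 1 does change, so it provides a usable training signal.
ce_change = cross_entropy(1, nudged) - cross_entropy(1, probs)

print(argmax_change, ce_change)
```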

Resources