How to customize a loss function in keras based on y_true

I want to customize a loss function based on the y_true values. y_true is binary. For each mini-batch, I want to treat the y_true==0 and y_true==1 cases differently. Currently, I have:
def custom_loss(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    # Build masks for the negative (0) and positive (1) entries
    zero = tf.fill(tf.shape(y_true_f), 0.0)
    one = tf.fill(tf.shape(y_true_f), 1.0)
    mask_0 = tf.equal(y_true_f, zero)
    mask_1 = tf.equal(y_true_f, one)
    # Select the predictions/targets belonging to each class
    y_pred_1 = tf.boolean_mask(y_pred_f, mask_1)
    y_pred_0 = tf.boolean_mask(y_pred_f, mask_0)
    y_true_1 = tf.boolean_mask(y_true_f, mask_1)
    y_true_0 = tf.boolean_mask(y_true_f, mask_0)
    loss1 = K.binary_crossentropy(y_true_1, y_pred_1)
    loss0 = K.binary_crossentropy(y_true_0, y_pred_0)
    loss = loss1 + a * loss0  # a is an arbitrary number
    return loss
However, I get a NaN loss. I guess it is because I am training on imbalanced data where only a few cases have y_true==1, so when a mini-batch contains no y_true==1 the loss becomes NaN. I want to add an if condition based on the shape of mask_1. How can I do that?
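One way to guard against the empty-mask case (a sketch, assuming a mean reduction per class is acceptable) is to switch each term on the size of its masked tensor, so an absent class contributes zero instead of NaN:

import tensorflow as tf
from tensorflow.keras import backend as K

def custom_loss_safe(y_true, y_pred, a=1.0):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    mask_1 = tf.equal(y_true_f, 1.0)
    mask_0 = tf.equal(y_true_f, 0.0)
    y_pred_1 = tf.boolean_mask(y_pred_f, mask_1)
    y_true_1 = tf.boolean_mask(y_true_f, mask_1)
    y_pred_0 = tf.boolean_mask(y_pred_f, mask_0)
    y_true_0 = tf.boolean_mask(y_true_f, mask_0)
    # If a class is missing from the batch, its term is 0 rather than NaN
    loss1 = tf.cond(tf.equal(tf.size(y_pred_1), 0),
                    lambda: tf.constant(0.0),
                    lambda: K.mean(K.binary_crossentropy(y_true_1, y_pred_1)))
    loss0 = tf.cond(tf.equal(tf.size(y_pred_0), 0),
                    lambda: tf.constant(0.0),
                    lambda: K.mean(K.binary_crossentropy(y_true_0, y_pred_0)))
    return loss1 + a * loss0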

You can achieve this with the same technique used in the cross-entropy loss function. Write the loss as loss = (y_true * Loss1) + ((1 - y_true) * Loss2). If y_true = 0, the first term vanishes and loss = (0 * Loss1) + ((1 - 0) * Loss2) = Loss2. If y_true = 1, the second term vanishes and loss = (1 * Loss1) + ((1 - 1) * Loss2) = Loss1.
Therefore, you can apply two different loss functions depending on whether y_true is 0 or 1, without any masking or conditionals.
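A minimal sketch of that idea, assuming the factor a from the question is the weight applied to the y_true==0 term:

import tensorflow as tf
from tensorflow.keras import backend as K

def custom_loss(y_true, y_pred, a=1.0):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    bce = K.binary_crossentropy(y_true_f, y_pred_f)
    # y_true selects the weight for each element, so no boolean_mask is needed
    # and a batch without positives cannot produce NaN
    weights = y_true_f * 1.0 + (1.0 - y_true_f) * a
    return K.mean(weights * bce)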

Related

Custom loss function in keras with class weights for each batch

I am new to deep learning and TensorFlow. I am working on a binary speech classification problem, trying to replicate a research paper. There are roughly 2700 samples in class 1 and roughly 1200 in class 2. The paper uses MFCC features for binary classification and reports 88.3% accuracy, 88% F1-score, and 82.3% recall. It also uses a custom loss function that takes a weighted average of specificity and recall, with the emphasis on specificity:
custom_loss = 1 - (0.85 * specificity + 0.15 * recall)
I have implemented all the parameters given in the paper. The only thing I have not yet addressed is the class imbalance (class 1: 2700, class 2: 1200). I tried oversampling class 2 with different oversampling methods, but nothing worked. The accuracy I achieve is at most 68%, with the model learning the majority class well and performing poorly on the minority class.
The following class-weight technique did not work with the custom loss function:
from sklearn.utils import class_weight
class_weights = class_weight.compute_class_weight('balanced', classes = [0, 1], y = y_train)
weights = {i:w for i,w in enumerate(class_weights)}
Hence I tried a weighted custom loss function, but the results were not good; accuracy was still near 68%.
I am sharing the code as follows:
from sklearn.utils import class_weight
class_weights = class_weight.compute_class_weight('balanced', classes = [0, 1], y = y_train)
weights = {i:w for i,w in enumerate(class_weights)}
def binary_recall_specificity(y_true, y_pred, recall_weight, spec_weight):
    y_true = K.clip(y_true, K.epsilon(), 1)
    y_pred = K.clip(y_pred, K.epsilon(), 1)
    ground_positives = K.sum(y_true, axis=0) + K.epsilon()         # = TP + FN
    pred_positives = K.sum(y_pred, axis=0) + K.epsilon()           # = TP + FP
    true_positives = K.sum(y_true * y_pred, axis=0) + K.epsilon()  # = TP
    neg_y_true = 1 - y_true
    neg_y_pred = 1 - y_pred
    fp = K.sum(neg_y_true * y_pred)
    tn = K.sum(neg_y_true * neg_y_pred)
    specificity = tn / (tn + fp + K.epsilon())
    recall = true_positives / ground_positives
    loss1 = 1.0 - (recall_weight * recall + spec_weight * specificity)
    return loss1 * class_weights.tolist()

def custom_loss(recall_weight, spec_weight):
    def recall_spec_loss(y_true, y_pred):
        return binary_recall_specificity(y_true, y_pred, recall_weight, spec_weight)
    # Returns the (y_true, y_pred) loss function
    return recall_spec_loss
Am I doing something wrong here? Please guide. I have read answers to other similar questions as well, but I have not been able to resolve this.
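If the goal is to weight the classes inside the custom loss itself, one possible design (a sketch of one option, not the paper's method) is to derive a per-sample weight from y_true, in the same spirit as the first answer above, and apply it to a cross-entropy term added to the recall/specificity term:

import tensorflow as tf
from tensorflow.keras import backend as K

def make_weighted_loss(w0, w1, recall_weight=0.15, spec_weight=0.85, ce_weight=1.0):
    """w0, w1: per-class weights, e.g. class_weights[0] and class_weights[1]."""
    def loss_fn(y_true, y_pred):
        # recall/specificity term following the paper's formula
        tp = K.sum(y_true * y_pred) + K.epsilon()
        fn = K.sum(y_true * (1 - y_pred))
        tn = K.sum((1 - y_true) * (1 - y_pred)) + K.epsilon()
        fp = K.sum((1 - y_true) * y_pred)
        recall = tp / (tp + fn)
        specificity = tn / (tn + fp)
        metric_term = 1.0 - (recall_weight * recall + spec_weight * specificity)
        # class-weighted cross-entropy: each sample is weighted by its class weight
        sample_w = y_true * w1 + (1.0 - y_true) * w0
        ce_term = K.mean(sample_w * K.binary_crossentropy(y_true, y_pred))
        return metric_term + ce_weight * ce_term
    return loss_fn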

Computing Jacobian and Derivative in Tensorflow is extremely slow

Is there a more efficient way to compute the Jacobian? (There must be; it does not even finish for a single batch.) I want to compute the loss given in the self-explanatory neural network paper. The input has a shape of (32, 365, 3), where 32 is the batch size. The loss I want to minimize is Equation 3 of the paper.
I believe that I am not using the GradientTape optimally.
def compute_loss_theta(tape, parameter, concept, output, x):
    b = x.shape[0]
    in_dim = (x.shape[1], x.shape[2])
    feature_dim = in_dim[0] * in_dim[1]
    J = tape.batch_jacobian(concept, x)
    grad_fx = tape.gradient(output, x)
    grad_fx = tf.reshape(grad_fx, shape=(b, feature_dim))
    J = tf.reshape(J, shape=(b, feature_dim, feature_dim))
    parameter = tf.expand_dims(parameter, axis=1)
    loss_theta_matrix = grad_fx - tf.matmul(parameter, J)
    loss_theta = tf.norm(loss_theta_matrix)
    return loss_theta
for i in range(10):
    for x, y in train_dataset:
        with tf.GradientTape(persistent=True) as tape:
            tape.watch(x)
            parameter, concept, output = model(x)
            loss_theta = compute_loss_theta(tape, parameter, concept, output, x)
            loss_y = loss_object(y_true=y, y_pred=output)
            loss_value = loss_y + eps * loss_theta
        gradients = tape.gradient(loss_value, model.trainable_weights)
        optimizer.apply_gradients(zip(gradients, model.trainable_weights))
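One common way to cut the per-batch overhead (a sketch, keeping the model and loss exactly as above) is to wrap the step in tf.function so the tape, batch_jacobian, and gradient computations run as a compiled graph instead of op by op in eager mode:

import tensorflow as tf

@tf.function  # traces the step once and reuses the compiled graph
def train_step(x, y):
    with tf.GradientTape(persistent=True) as tape:
        tape.watch(x)
        parameter, concept, output = model(x)
        loss_theta = compute_loss_theta(tape, parameter, concept, output, x)
        loss_y = loss_object(y_true=y, y_pred=output)
        loss_value = loss_y + eps * loss_theta
    gradients = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))
    return loss_value

for i in range(10):
    for x, y in train_dataset:
        train_step(x, y)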

Binary classification - BCELoss and model output size not corresponding

I'm doing binary classification, hence I used binary cross-entropy loss:
criterion = torch.nn.BCELoss()
However, I'm getting an error:
Using a target size (torch.Size([64, 1])) that is different to the input size (torch.Size([64, 2])) is deprecated. Please ensure they have the same size.
My model ends with:
x = self.wave_block6(x)
x = self.sigmoid(self.fc(x))
return x.squeeze()
I tried removing the squeeze, but to no avail. My batch size is 64. It seems like I'm doing something simple wrong here. Is my model giving 1 output and BCE loss expecting 2 inputs? Which loss should I use then?
Binary cross-entropy loss (BCELoss) is used for binary classification tasks, so the model should produce a single probability per sample. If N is your batch size, the model output should have shape [N, 1] (here [64, 1]) and the labels shape [N]. Therefore, make the final layer emit one value per sample, squeeze the output at the 2nd dimension, and pass it to the loss function.
Here is a minimal working example
import torch
a = torch.randn((64, 1))
b = torch.randn((64))
loss = torch.nn.BCELoss()
b = torch.round(torch.sigmoid(b)) # just to create some labels
a = torch.sigmoid(a).squeeze(1)
l = loss(a, b)
Update: based on the conversation in the comments, focal loss can be defined as follows:
import torch
import torch.nn as nn
import torch.nn.functional as F

class focalLoss(nn.Module):
    def __init__(self, alpha=0.25, gamma=3):
        super(focalLoss, self).__init__()
        self.alpha = alpha
        self.gamma = gamma

    def forward(self, pred_logits: torch.Tensor, target: torch.Tensor):
        batch_size = pred_logits.shape[0]
        pred_logits = pred_logits.view(batch_size, -1)
        target = target.view(batch_size, -1)
        pred = pred_logits.sigmoid()
        # element-wise cross-entropy on the probabilities
        ce = F.binary_cross_entropy(pred, target, reduction='none')
        # alpha balances the two classes, (1 - pt)**gamma down-weights easy examples
        alpha = target * self.alpha + (1. - target) * (1. - self.alpha)
        pt = torch.where(target == 1, pred, 1 - pred)
        return alpha * (1. - pt) ** self.gamma * ce
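A quick usage sketch (shapes chosen to match the [64, 1] output discussed above; the forward pass returns a per-element loss, so it still needs a reduction):

criterion = focalLoss(alpha=0.25, gamma=3)
logits = torch.randn(64, 1, requires_grad=True)   # stand-in for raw model outputs
targets = torch.randint(0, 2, (64, 1)).float()    # binary labels
loss = criterion(logits, targets).mean()          # reduce the per-element loss
loss.backward()                                   # usable in a normal training step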

Triplet loss in keras: how to get the anchor, positive, and negative from the merged vector

What I am trying to do is use the triplet loss as my loss function, but I don't know if I am getting the right values from the merged vector that is used.
So here is my loss function:
def triplet_loss(y_true, y_pred, alpha=0.2):
    """
    Implementation of the triplet loss function
    Arguments:
    y_true -- true labels, required when you define a loss in Keras, not used in this function.
    y_pred -- python list containing three objects:
        anchor: the encodings for the anchor data
        positive: the encodings for the positive data (similar to anchor)
        negative: the encodings for the negative data (different from anchor)
    Returns:
    loss -- real number, value of the loss
    """
    print("Ypred")
    print(y_pred.shape)
    anchor = y_pred[:, 0:512]
    positive = y_pred[:, 512:1024]
    negative = y_pred[:, 1024:1536]
    print(anchor.shape)
    print(positive.shape)
    print(negative.shape)
    #anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]  # Don't think this is working
    # distance between the anchor and the positive
    pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)))
    print("PosDist", pos_dist)
    # distance between the anchor and the negative
    neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)))
    print("Neg Dist", neg_dist)
    # compute loss
    basic_loss = (pos_dist - neg_dist) + alpha
    loss = tf.maximum(basic_loss, 0.0)
    return loss
Now this does work when I use the following line in the code instead of the slicing one:
anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]
But I don't think this is correct, as the shape of the merged vector is (?, 3, 3, 1536).
I think it is grabbing the wrong information, but I cannot figure out how to slice it correctly, as the uncommented code gives me this error:
Dimensions must be equal, but are 3 and 0 for 'loss_9/concatenate_10_loss/Sub' (op: 'Sub') with input shapes: [?,3,3,1536], [?,0,3,1536].
My network setup is like this:
input_dim = (7,7,2048)
anchor_in = Input(shape=input_dim)
pos_in = Input(shape=input_dim)
neg_in = Input(shape=input_dim)
base_network = create_base_network()
# Run input through base network
anchor_out = base_network(anchor_in)
pos_out = base_network(pos_in)
neg_out = base_network(neg_in)
print(anchor_out.shape)
merged_vector = Concatenate(axis=-1)([anchor_out, pos_out, neg_out])
print("Meged Vector", merged_vector.shape)
print(merged_vector)
model = Model(inputs=[anchor_in, pos_in, neg_in], outputs=merged_vector)
adam = Adam(lr=0.01, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
model.compile(optimizer=adam, loss=triplet_loss)
Update
Using this seems to be right, could anyone confirm this?
anchor = y_pred[:,:,:,0:512]
positive = y_pred[:,:,:,512:1024]
negative = y_pred[:,:,:,1024:1536]
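If each branch of the base network really outputs a (?, 3, 3, 512) tensor, then Concatenate(axis=-1) yields (?, 3, 3, 1536), and slicing the last axis as above does recover the three encodings; the distances then need to be summed over every non-batch axis, roughly as in this sketch:

anchor = y_pred[:, :, :, 0:512]
positive = y_pred[:, :, :, 512:1024]
negative = y_pred[:, :, :, 1024:1536]
# sum squared differences over the spatial and channel axes, keeping the batch axis
pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=[1, 2, 3])
neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=[1, 2, 3])
loss = tf.reduce_mean(tf.maximum(pos_dist - neg_dist + alpha, 0.0))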
You do not need to do the concatenation operation:
# change this line to this
model = Model(inputs=[anchor_in, pos_in, neg_in], outputs=[anchor_out, pos_out, neg_out])
Complete code:
input_dim = (7,7,2048)
anchor_in = Input(shape=input_dim)
pos_in = Input(shape=input_dim)
neg_in = Input(shape=input_dim)
base_network = create_base_network()
# Run input through base network
anchor_out = base_network(anchor_in)
pos_out = base_network(pos_in)
neg_out = base_network(neg_in)
print(anchor_out.shape)
# code changed here
model = Model(inputs=[anchor_in, pos_in, neg_in], outputs=[anchor_out, pos_out, neg_out])
adam = Adam(lr=0.01, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
model.compile(optimizer=adam, loss=triplet_loss)
Then you can use the following loss:
def triplet_loss(y_true, y_pred, alpha=0.3):
    '''
    Inputs:
        y_true: True values of classification. (y_train)
        y_pred: predicted values of classification.
        alpha: Distance between positive and negative sample, arbitrarily
               set to 0.3
    Returns:
        Computed loss
    Function:
        --Implements triplet loss using tensorflow commands
        --The following function follows an implementation of Triplet-Loss
          where the loss is applied to the network in the compile statement
          as usual.
    '''
    anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]
    positive_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)), -1)
    negative_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)), -1)
    loss_1 = tf.add(tf.subtract(positive_dist, negative_dist), alpha)
    loss = tf.reduce_sum(tf.maximum(loss_1, 0.0))
    return loss

Correct way to compute AUC in tensorflow

I'm calculating the area under the curve (AUC) in TensorFlow.
Here is part of my code:
with tf.name_scope("output"):
    W = tf.Variable(tf.random_normal([num_filters_total, num_classes], stddev=0.1), name="W")
    b = tf.Variable(tf.constant(0.1, shape=[num_classes]), name="b")
    l2_loss += tf.nn.l2_loss(W)
    l2_loss += tf.nn.l2_loss(b)
    self.scores = tf.nn.xw_plus_b(self.h_drop, W, b, name="scores")
    self.softmax_scores = tf.nn.softmax(self.scores)
    self.predictions = tf.argmax(self.scores, 1, name="predictions")

# Calculate mean cross-entropy loss
with tf.name_scope("loss"):
    self.losses = tf.nn.softmax_cross_entropy_with_logits(labels=self.input_y, logits=self.scores)
    self.loss = tf.reduce_mean(self.losses) + l2_reg_lambda * l2_loss

# Accuracy
with tf.name_scope("accuracy"):
    correct_predictions = tf.equal(self.predictions, tf.argmax(self.input_y, 1))
    self.accuracy = tf.reduce_mean(tf.cast(correct_predictions, "float"), name="accuracy")

# AUC
with tf.name_scope("auc"):
    self.auc = tf.metrics.auc(labels=tf.argmax(self.input_y, 1), predictions=self.predictions)
In the above piece of code, input_y is a tensor with shape (batch_size,2) and predictions has the shape (batch_size,).
Therefore the actual values passed as the labels and predictions arguments of tf.metrics.auc look like [0,1,1,1,0,0,...].
I wonder whether this is a correct way to compute AUC?
I've tried with the following command:
self.auc = tf.metrics.auc(labels = tf.argmax(self.input_y, 1), predictions = tf.reduce_max(self.softmax_scores,axis=1))
But this only gives me zeros.
Another thing I notice is that while the accuracy is quite stable at the end of training, the AUC computed by the first method keeps increasing. Is that correct?
Thanks.
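For what it's worth, a common pattern (a sketch, not a verified fix for this exact model) is to feed tf.metrics.auc the probability of the positive class rather than the hard argmax predictions, since AUC is defined over score thresholds; the metric's local variables also need initializing:

with tf.name_scope("auc"):
    # probability assigned to class 1, shape (batch_size,)
    positive_prob = self.softmax_scores[:, 1]
    self.auc, self.auc_update_op = tf.metrics.auc(
        labels=tf.argmax(self.input_y, 1),
        predictions=positive_prob)
# at session setup time: sess.run(tf.local_variables_initializer())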
