How to customize a loss function in keras based on y_true

I want to customize a loss function based on the y_true values. y_true is binary. For each mini-batch, I want to treat the y_true==0 and y_true==1 cases differently. Currently, I have:
def custom_loss(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    # Build masks for the negative (0) and positive (1) entries
    zero = tf.fill(tf.shape(y_true_f), 0.0)
    one = tf.fill(tf.shape(y_true_f), 1.0)
    mask_0 = tf.equal(y_true_f, zero)
    mask_1 = tf.equal(y_true_f, one)
    # Select the predictions/targets belonging to each class
    y_pred_1 = tf.boolean_mask(y_pred_f, mask_1)
    y_pred_0 = tf.boolean_mask(y_pred_f, mask_0)
    y_true_1 = tf.boolean_mask(y_true_f, mask_1)
    y_true_0 = tf.boolean_mask(y_true_f, mask_0)
    loss1 = K.binary_crossentropy(y_true_1, y_pred_1)
    loss0 = K.binary_crossentropy(y_true_0, y_pred_0)
    loss = loss1 + a * loss0  # a is an arbitrary number
    return loss
However, I get a NaN loss. I guess it is because I am training on imbalanced data where only a few cases have y_true==1, so when a mini-batch contains no y_true==1 the loss becomes NaN. I want to add an if condition based on the shape of mask_1. How can I do that?
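One way to guard against the empty-mask case (a sketch, assuming a mean reduction per class is acceptable) is to switch each term on the size of its masked tensor, so an absent class contributes zero instead of NaN:

import tensorflow as tf
from tensorflow.keras import backend as K

def custom_loss_safe(y_true, y_pred, a=1.0):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    mask_1 = tf.equal(y_true_f, 1.0)
    mask_0 = tf.equal(y_true_f, 0.0)
    y_pred_1 = tf.boolean_mask(y_pred_f, mask_1)
    y_true_1 = tf.boolean_mask(y_true_f, mask_1)
    y_pred_0 = tf.boolean_mask(y_pred_f, mask_0)
    y_true_0 = tf.boolean_mask(y_true_f, mask_0)
    # If a class is missing from the batch, its term is 0 rather than NaN
    loss1 = tf.cond(tf.equal(tf.size(y_pred_1), 0),
                    lambda: tf.constant(0.0),
                    lambda: K.mean(K.binary_crossentropy(y_true_1, y_pred_1)))
    loss0 = tf.cond(tf.equal(tf.size(y_pred_0), 0),
                    lambda: tf.constant(0.0),
                    lambda: K.mean(K.binary_crossentropy(y_true_0, y_pred_0)))
    return loss1 + a * loss0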

You can achieve this with the same technique used in the cross-entropy loss function. Write the loss as loss = (y_true * Loss1) + ((1 - y_true) * Loss2). If y_true = 0, the first term vanishes and loss = (0 * Loss1) + ((1 - 0) * Loss2) = Loss2. If y_true = 1, the second term vanishes and loss = (1 * Loss1) + ((1 - 1) * Loss2) = Loss1.
Therefore, you can apply two different loss functions depending on whether y_true is 0 or 1, without any masking or conditionals.
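A minimal sketch of that idea, assuming the factor a from the question is the weight applied to the y_true==0 term:

import tensorflow as tf
from tensorflow.keras import backend as K

def custom_loss(y_true, y_pred, a=1.0):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    bce = K.binary_crossentropy(y_true_f, y_pred_f)
    # y_true selects the weight for each element, so no boolean_mask is needed
    # and a batch without positives cannot produce NaN
    weights = y_true_f * 1.0 + (1.0 - y_true_f) * a
    return K.mean(weights * bce)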

Related

Custom loss function in keras with class weights for each batch

I am new to deep learning and TensorFlow. I am working on a binary speech classification problem, trying to replicate a research paper. There are roughly 2700 samples in class 1 and roughly 1200 in class 2. The paper uses MFCC features for binary classification and reports 88.3% accuracy, 88% F1-score, and 82.3% recall. It also uses a custom loss function that takes a weighted average of specificity and recall, with the emphasis on specificity:
custom_loss = 1 - (0.85 * specificity + 0.15 * recall)
I have implemented all the parameters given in the paper. The only thing I have not yet addressed is the class imbalance (class 1: 2700, class 2: 1200). I tried oversampling class 2 with different oversampling methods, but nothing worked. The accuracy I achieve is at most 68%, with the model learning the majority class well and performing poorly on the minority class.
The following class-weight technique did not work with the custom loss function:
from sklearn.utils import class_weight
class_weights = class_weight.compute_class_weight('balanced', classes = [0, 1], y = y_train)
weights = {i:w for i,w in enumerate(class_weights)}
Hence I tried a weighted custom loss function, but the results were not good; accuracy was still near 68%.
I am sharing the code as follows:
from sklearn.utils import class_weight
class_weights = class_weight.compute_class_weight('balanced', classes = [0, 1], y = y_train)
weights = {i:w for i,w in enumerate(class_weights)}
def binary_recall_specificity(y_true, y_pred, recall_weight, spec_weight):
    y_true = K.clip(y_true, K.epsilon(), 1)
    y_pred = K.clip(y_pred, K.epsilon(), 1)
    ground_positives = K.sum(y_true, axis=0) + K.epsilon()         # = TP + FN
    pred_positives = K.sum(y_pred, axis=0) + K.epsilon()           # = TP + FP
    true_positives = K.sum(y_true * y_pred, axis=0) + K.epsilon()  # = TP
    neg_y_true = 1 - y_true
    neg_y_pred = 1 - y_pred
    fp = K.sum(neg_y_true * y_pred)
    tn = K.sum(neg_y_true * neg_y_pred)
    specificity = tn / (tn + fp + K.epsilon())
    recall = true_positives / ground_positives
    loss1 = 1.0 - (recall_weight * recall + spec_weight * specificity)
    return loss1 * class_weights.tolist()

def custom_loss(recall_weight, spec_weight):
    def recall_spec_loss(y_true, y_pred):
        return binary_recall_specificity(y_true, y_pred, recall_weight, spec_weight)
    # Returns the (y_true, y_pred) loss function
    return recall_spec_loss
Am I doing something wrong here? Please guide. I have read answers to other similar questions as well, but I have not been able to resolve this.
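If the goal is to weight the classes inside the custom loss itself, one possible design (a sketch of one option, not the paper's method) is to derive a per-sample weight from y_true, in the same spirit as the first answer above, and apply it to a cross-entropy term added to the recall/specificity term:

import tensorflow as tf
from tensorflow.keras import backend as K

def make_weighted_loss(w0, w1, recall_weight=0.15, spec_weight=0.85, ce_weight=1.0):
    """w0, w1: per-class weights, e.g. class_weights[0] and class_weights[1]."""
    def loss_fn(y_true, y_pred):
        # recall/specificity term following the paper's formula
        tp = K.sum(y_true * y_pred) + K.epsilon()
        fn = K.sum(y_true * (1 - y_pred))
        tn = K.sum((1 - y_true) * (1 - y_pred)) + K.epsilon()
        fp = K.sum((1 - y_true) * y_pred)
        recall = tp / (tp + fn)
        specificity = tn / (tn + fp)
        metric_term = 1.0 - (recall_weight * recall + spec_weight * specificity)
        # class-weighted cross-entropy: each sample is weighted by its class weight
        sample_w = y_true * w1 + (1.0 - y_true) * w0
        ce_term = K.mean(sample_w * K.binary_crossentropy(y_true, y_pred))
        return metric_term + ce_weight * ce_term
    return loss_fn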

Computing Jacobian and Derivative in Tensorflow is extremely slow

Is there a more efficient way to compute the Jacobian? (There must be; it does not even finish for a single batch.) I want to compute the loss given in the self-explanatory neural network paper. The input has a shape of (32, 365, 3), where 32 is the batch size. The loss I want to minimize is Equation 3 of the paper.
I believe that I am not using the GradientTape optimally.
def compute_loss_theta(tape, parameter, concept, output, x):
    b = x.shape[0]
    in_dim = (x.shape[1], x.shape[2])
    feature_dim = in_dim[0] * in_dim[1]
    J = tape.batch_jacobian(concept, x)
    grad_fx = tape.gradient(output, x)
    grad_fx = tf.reshape(grad_fx, shape=(b, feature_dim))
    J = tf.reshape(J, shape=(b, feature_dim, feature_dim))
    parameter = tf.expand_dims(parameter, axis=1)
    loss_theta_matrix = grad_fx - tf.matmul(parameter, J)
    loss_theta = tf.norm(loss_theta_matrix)
    return loss_theta
for i in range(10):
    for x, y in train_dataset:
        with tf.GradientTape(persistent=True) as tape:
            tape.watch(x)
            parameter, concept, output = model(x)
            loss_theta = compute_loss_theta(tape, parameter, concept, output, x)
            loss_y = loss_object(y_true=y, y_pred=output)
            loss_value = loss_y + eps * loss_theta
        gradients = tape.gradient(loss_value, model.trainable_weights)
        optimizer.apply_gradients(zip(gradients, model.trainable_weights))
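One common way to cut the per-batch overhead (a sketch, keeping the model and loss exactly as above) is to wrap the step in tf.function so the tape, batch_jacobian, and gradient computations run as a compiled graph instead of op by op in eager mode:

import tensorflow as tf

@tf.function  # traces the step once and reuses the compiled graph
def train_step(x, y):
    with tf.GradientTape(persistent=True) as tape:
        tape.watch(x)
        parameter, concept, output = model(x)
        loss_theta = compute_loss_theta(tape, parameter, concept, output, x)
        loss_y = loss_object(y_true=y, y_pred=output)
        loss_value = loss_y + eps * loss_theta
    gradients = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))
    return loss_value

for i in range(10):
    for x, y in train_dataset:
        train_step(x, y)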

Binary classification - BCELoss and model output size not corresponding

I'm doing binary classification, hence I used binary cross-entropy loss:
criterion = torch.nn.BCELoss()
However, I'm getting an error:
Using a target size (torch.Size([64, 1])) that is different to the input size (torch.Size([64, 2])) is deprecated. Please ensure they have the same size.
My model ends with:
x = self.wave_block6(x)
x = self.sigmoid(self.fc(x))
return x.squeeze()
I tried removing the squeeze, but to no avail. My batch size is 64. It seems like I'm doing something simple wrong here. Is my model giving 1 output and BCE loss expecting 2 inputs? Which loss should I use then?
Binary cross-entropy loss (BCELoss) is used for binary classification tasks, so the model should produce a single probability per sample. If N is your batch size, the model output should have shape [N, 1] (here [64, 1]) and the labels shape [N]. Therefore, make the final layer emit one value per sample, squeeze the output at the 2nd dimension, and pass it to the loss function.
Here is a minimal working example
import torch
a = torch.randn((64, 1))
b = torch.randn((64))
loss = torch.nn.BCELoss()
b = torch.round(torch.sigmoid(b)) # just to create some labels
a = torch.sigmoid(a).squeeze(1)
l = loss(a, b)
Update: based on the conversation in the comments, focal loss can be defined as follows:
import torch
import torch.nn as nn
import torch.nn.functional as F

class focalLoss(nn.Module):
    def __init__(self, alpha=0.25, gamma=3):
        super(focalLoss, self).__init__()
        self.alpha = alpha
        self.gamma = gamma

    def forward(self, pred_logits: torch.Tensor, target: torch.Tensor):
        batch_size = pred_logits.shape[0]
        pred_logits = pred_logits.view(batch_size, -1)
        target = target.view(batch_size, -1)
        pred = pred_logits.sigmoid()
        # element-wise cross-entropy on the probabilities
        ce = F.binary_cross_entropy(pred, target, reduction='none')
        # alpha balances the two classes, (1 - pt)**gamma down-weights easy examples
        alpha = target * self.alpha + (1. - target) * (1. - self.alpha)
        pt = torch.where(target == 1, pred, 1 - pred)
        return alpha * (1. - pt) ** self.gamma * ce
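A quick usage sketch (shapes chosen to match the [64, 1] output discussed above; the forward pass returns a per-element loss, so it still needs a reduction):

criterion = focalLoss(alpha=0.25, gamma=3)
logits = torch.randn(64, 1, requires_grad=True)   # stand-in for raw model outputs
targets = torch.randint(0, 2, (64, 1)).float()    # binary labels
loss = criterion(logits, targets).mean()          # reduce the per-element loss
loss.backward()                                   # usable in a normal training step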

Triplet loss in keras: how to get the anchor, positive, and negative from the merged vector

What I am trying to do is use the triplet loss as my loss function, but I don't know if I am getting the right values from the merged vector that is used.
So here is my loss function:
def triplet_loss(y_true, y_pred, alpha=0.2):
    """
    Implementation of the triplet loss function
    Arguments:
    y_true -- true labels, required when you define a loss in Keras, not used in this function.
    y_pred -- python list containing three objects:
        anchor: the encodings for the anchor data
        positive: the encodings for the positive data (similar to anchor)
        negative: the encodings for the negative data (different from anchor)
    Returns:
    loss -- real number, value of the loss
    """
    print("Ypred")
    print(y_pred.shape)
    anchor = y_pred[:, 0:512]
    positive = y_pred[:, 512:1024]
    negative = y_pred[:, 1024:1536]
    print(anchor.shape)
    print(positive.shape)
    print(negative.shape)
    #anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]  # Don't think this is working
    # distance between the anchor and the positive
    pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)))
    print("PosDist", pos_dist)
    # distance between the anchor and the negative
    neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)))
    print("Neg Dist", neg_dist)
    # compute loss
    basic_loss = (pos_dist - neg_dist) + alpha
    loss = tf.maximum(basic_loss, 0.0)
    return loss
Now this does work when I use the following line in the code instead of the slicing one:
anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]
But I don't think this is correct, as the shape of the merged vector is (?, 3, 3, 1536).
I think it is grabbing the wrong information, but I cannot figure out how to slice it correctly, as the uncommented code gives me this error:
Dimensions must be equal, but are 3 and 0 for 'loss_9/concatenate_10_loss/Sub' (op: 'Sub') with input shapes: [?,3,3,1536], [?,0,3,1536].
My network setup is like this:
input_dim = (7,7,2048)
anchor_in = Input(shape=input_dim)
pos_in = Input(shape=input_dim)
neg_in = Input(shape=input_dim)
base_network = create_base_network()
# Run input through base network
anchor_out = base_network(anchor_in)
pos_out = base_network(pos_in)
neg_out = base_network(neg_in)
print(anchor_out.shape)
merged_vector = Concatenate(axis=-1)([anchor_out, pos_out, neg_out])
print("Meged Vector", merged_vector.shape)
print(merged_vector)
model = Model(inputs=[anchor_in, pos_in, neg_in], outputs=merged_vector)
adam = Adam(lr=0.01, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
model.compile(optimizer=adam, loss=triplet_loss)
Update
Using this seems to be right, could anyone confirm this?
anchor = y_pred[:,:,:,0:512]
positive = y_pred[:,:,:,512:1024]
negative = y_pred[:,:,:,1024:1536]
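If each branch of the base network really outputs a (?, 3, 3, 512) tensor, then Concatenate(axis=-1) yields (?, 3, 3, 1536), and slicing the last axis as above does recover the three encodings; the distances then need to be summed over every non-batch axis, roughly as in this sketch:

anchor = y_pred[:, :, :, 0:512]
positive = y_pred[:, :, :, 512:1024]
negative = y_pred[:, :, :, 1024:1536]
# sum squared differences over the spatial and channel axes, keeping the batch axis
pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=[1, 2, 3])
neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=[1, 2, 3])
loss = tf.reduce_mean(tf.maximum(pos_dist - neg_dist + alpha, 0.0))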
You do not need to do the concatenation operation:
# change this line to this
model = Model(inputs=[anchor_in, pos_in, neg_in], outputs=[anchor_out, pos_out, neg_out])
Complete code:
input_dim = (7,7,2048)
anchor_in = Input(shape=input_dim)
pos_in = Input(shape=input_dim)
neg_in = Input(shape=input_dim)
base_network = create_base_network()
# Run input through base network
anchor_out = base_network(anchor_in)
pos_out = base_network(pos_in)
neg_out = base_network(neg_in)
print(anchor_out.shape)
# code changed here
model = Model(inputs=[anchor_in, pos_in, neg_in], outputs=[anchor_out, pos_out, neg_out])
adam = Adam(lr=0.01, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
model.compile(optimizer=adam, loss=triplet_loss)
Then you can use the following loss:
def triplet_loss(y_true, y_pred, alpha=0.3):
    '''
    Inputs:
        y_true: True values of classification. (y_train)
        y_pred: predicted values of classification.
        alpha: Distance between positive and negative sample, arbitrarily
               set to 0.3
    Returns:
        Computed loss
    Function:
        --Implements triplet loss using tensorflow commands
        --The following function follows an implementation of Triplet-Loss
          where the loss is applied to the network in the compile statement
          as usual.
    '''
    anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]
    positive_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)), -1)
    negative_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)), -1)
    loss_1 = tf.add(tf.subtract(positive_dist, negative_dist), alpha)
    loss = tf.reduce_sum(tf.maximum(loss_1, 0.0))
    return loss

Correct way to compute AUC in tensorflow

I'm calculating the area under the curve (AUC) in TensorFlow.
Here is part of my code:
with tf.name_scope("output"):
    W = tf.Variable(tf.random_normal([num_filters_total, num_classes], stddev=0.1), name="W")
    b = tf.Variable(tf.constant(0.1, shape=[num_classes]), name="b")
    l2_loss += tf.nn.l2_loss(W)
    l2_loss += tf.nn.l2_loss(b)
    self.scores = tf.nn.xw_plus_b(self.h_drop, W, b, name="scores")
    self.softmax_scores = tf.nn.softmax(self.scores)
    self.predictions = tf.argmax(self.scores, 1, name="predictions")

# Calculate mean cross-entropy loss
with tf.name_scope("loss"):
    self.losses = tf.nn.softmax_cross_entropy_with_logits(labels=self.input_y, logits=self.scores)
    self.loss = tf.reduce_mean(self.losses) + l2_reg_lambda * l2_loss

# Accuracy
with tf.name_scope("accuracy"):
    correct_predictions = tf.equal(self.predictions, tf.argmax(self.input_y, 1))
    self.accuracy = tf.reduce_mean(tf.cast(correct_predictions, "float"), name="accuracy")

# AUC
with tf.name_scope("auc"):
    self.auc = tf.metrics.auc(labels=tf.argmax(self.input_y, 1), predictions=self.predictions)
In the above piece of code, input_y is a tensor with shape (batch_size,2) and predictions has the shape (batch_size,).
Therefore the actual values passed as the labels and predictions arguments of tf.metrics.auc look like [0,1,1,1,0,0,...].
I wonder whether this is a correct way to compute AUC?
I've tried with the following command:
self.auc = tf.metrics.auc(labels = tf.argmax(self.input_y, 1), predictions = tf.reduce_max(self.softmax_scores,axis=1))
But this only gives me zeros.
Another thing I notice is that while the accuracy is quite stable at the end of training, the AUC computed by the first method keeps increasing. Is that correct?
Thanks.
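For what it's worth, a common pattern (a sketch, not a verified fix for this exact model) is to feed tf.metrics.auc the probability of the positive class rather than the hard argmax predictions, since AUC is defined over score thresholds; the metric's local variables also need initializing:

with tf.name_scope("auc"):
    # probability assigned to class 1, shape (batch_size,)
    positive_prob = self.softmax_scores[:, 1]
    self.auc, self.auc_update_op = tf.metrics.auc(
        labels=tf.argmax(self.input_y, 1),
        predictions=positive_prob)
# at session setup time: sess.run(tf.local_variables_initializer())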
