How to use nn.CrossEntropyLoss() for a PatchGAN Discriminator output? - pytorch

I am trying to use nn.CrossEntropyLoss() to compute the cross-entropy loss between the real and fake outputs of a PatchGAN discriminator, which produces a tensor of shape (batch_size, 1, 30, 30).
I am confused by the documentation here, which asks for class indices as targets instead of tensors shaped like the input.
CE_loss = nn.CrossEntropyLoss()
real_loss = CE_loss(discriminator_real_outputs, torch.ones_like(discriminator_real_outputs))
fake_loss = CE_loss(discriminator_fake_outputs, torch.zeros_like(discriminator_fake_outputs))
From the resulting error I understood that the target must be of type long, so I converted it:
CE_loss = nn.CrossEntropyLoss()
real_loss = CE_loss(discriminator_real_outputs, torch.ones_like(discriminator_real_outputs).long())
fake_loss = CE_loss(discriminator_fake_outputs, torch.zeros_like(discriminator_fake_outputs).long())
Then it asked for the target to have shape (batch_size, 30, 30) instead of (batch_size, 1, 30, 30); I fixed that too.
After that, it raised cuda runtime error (710): device-side assert triggered, which left the GPU on Google Colab unusable until I reset the runtime.
I want to use this loss like other losses, in the form
loss = Loss(input, target), without class indices. How do I go about this?
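With an input of shape (batch_size, 1, 30, 30), nn.CrossEntropyLoss treats the single channel as scores for exactly one class, so the only valid target index is 0; a target of 1 is out of range, which is what triggers the device-side assert on CUDA. For real/fake targets of the same shape as the input, in the loss = Loss(input, target) form asked for here, nn.BCEWithLogitsLoss accepts float targets directly. A minimal sketch with stand-in logits:
import torch
import torch.nn as nn

bce_loss = nn.BCEWithLogitsLoss()

# stand-in discriminator logits of the PatchGAN output shape
discriminator_real_outputs = torch.randn(4, 1, 30, 30)
discriminator_fake_outputs = torch.randn(4, 1, 30, 30)

# float targets of the same shape as the input -- no class indices needed
real_loss = bce_loss(discriminator_real_outputs, torch.ones_like(discriminator_real_outputs))
fake_loss = bce_loss(discriminator_fake_outputs, torch.zeros_like(discriminator_fake_outputs))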

Related

torch.nn.CrossEntropyLoss over Multiple Batches

I am currently working with torch.nn.CrossEntropyLoss. As far as I know, it is common to compute the loss batch-wise. However, is there a possibility to compute the loss over multiple batches?
More concretely, assume we are given the data
import torch

no_of_batches, batch_size, feature_dim = 4, 8, 10  # example sizes
features = torch.randn(no_of_batches, batch_size, feature_dim)
targets = torch.randint(low=0, high=10, size=(no_of_batches, batch_size))
loss_function = torch.nn.CrossEntropyLoss()
Is there a way to compute in one line
loss = loss_function(features, targets) # raises RuntimeError: Expected target size [no_of_batches, feature_dim], got [no_of_batches, batch_size]
?
Thank you in advance!
You can compute multiple cross-entropy losses, but you'll need to do your own reduction. Since cross-entropy loss expects the class dimension to be the second dimension of the features tensor, you will also need to permute it first.
loss_function = torch.nn.CrossEntropyLoss(reduction='none')
loss = loss_function(features.permute(0,2,1), targets).mean(dim=1)
which will result in a loss tensor with no_of_batches entries.
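Put together with the example sizes above, a quick check of the result shape (a sketch):
loss_function = torch.nn.CrossEntropyLoss(reduction='none')
# permute to (no_of_batches, feature_dim, batch_size) so the class dim comes second
loss = loss_function(features.permute(0, 2, 1), targets).mean(dim=1)
print(loss.shape)  # torch.Size([4]) -- one scalar loss per batch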

Cross Entropy for Soft Labeling in Pytorch

I'm trying to define the loss function for a two-class classification problem. However, the target label is not a hard label 0/1, but a float between 0 and 1.
torch.nn.CrossEntropyLoss in PyTorch does not support soft labels, so I'm trying to write a cross-entropy function myself.
My function looks like this:
def cross_entropy(self, pred, target):
    loss = -torch.mean(torch.sum(target.flatten() * torch.log(pred.flatten())))
    return loss

def step(self, batch: Any):
    x, y = batch
    logits = self.forward(x)
    loss = self.criterion(logits, y)
    preds = logits  # torch.argmax(logits, dim=1)
    return loss, preds, y
However, it does not work at all.
Can anyone tell me whether there is a mistake in my loss function?
It seems that BCELoss and its numerically robust version BCEWithLogitsLoss work with fuzzy targets "out of the box". They do not expect the target to be binary: any number between zero and one is fine.
Please read the docs.
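A minimal sketch (raw logits go straight into BCEWithLogitsLoss; no sigmoid beforehand):
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

logits = torch.randn(8, 1)   # raw model outputs, no sigmoid applied
targets = torch.rand(8, 1)   # soft labels anywhere in [0, 1]
loss = criterion(logits, targets)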

MSE loss in tensorflow 2.0 mistakes y_true for a reduction key

I am using a really simple neural network with the latest version of TensorFlow 2.0, in a Jupyter notebook running Python 3.7.0.
The NN outputs Xip, a float, which I use as a parameter in my function MainGaussian_1_U, which approximates an image. When I try to compute the loss using MeanSquaredError between the real image img and the approximation mk, I get an error in which the loss function seems to take img as a reduction key. After searching, I still have no idea what this key is supposed to be and can't find a way to debug my code:
model = tf.keras.models.Sequential()
# Add the layers
model.add(tf.keras.layers.Dense(64, activation="relu"))
model.add(tf.keras.layers.Dense(32, activation="relu"))
model.add(tf.keras.layers.Dense(1, activation="relu"))

# The loss method
loss_object = tf.keras.losses.MeanSquaredError()

# The optimizer
optimizer = tf.keras.optimizers.Adam()

# This metric is used to track the progress of the training loss during the training
train_loss = tf.keras.metrics.Mean(name='train_loss')

def train_step(Data, img):
    MainGaussian_init(Data)
    for _ in range(5):
        with tf.GradientTape() as tape:
            Xip = model((sizeh**-2 * np.ones((sizeh, sizeh))).reshape(-1, 49))
            MainGaussian_1_U()
            print("img=", img)
            loss = tf.keras.losses.MeanSquaredError(img, mk)
            print("loss=", loss)
        gradients = tape.gradient(loss, model.trainable_variables)
        print(gradients)
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))
        train_loss(loss)

train_step(TestFile, TestFile[4])
The error given is:
c:\program files\python37\lib\site-packages\tensorflow_core\python\ops\losses\loss_reduction.py:67: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
if key not in cls.all():
...
ValueError: Invalid Reduction Key [[21.05224609 20.79420471 34.9659729 ... 48.09233093 68.83874512
83.10766602]
[20.93516541 17.0511322 39.00476074 ... 56.74258423 47.75274658
98.57067871]
[38.18562317 22.70791626 24.37176514 ... 64.9606781 47.65338135
67.61506653]
...
[85.76565552 79.45443726 73.64129639 ... 73.66456604 47.06422424
49.44664001]
[87.14616394 82.38183594 77.00856018 ... 66.21652222 71.32862854
58.39285278]
[36.74142456 37.27145386 34.52891541 ... 29.58699036 37.37667847
30.25990295]].
This is my first question here on Stack Overflow: please let me know if I can make it any clearer!
You correctly create the "loss object" but never use it. Instead, your code tries to create a new "loss object" with the images as parameters (which doesn't work); the images should go into the already-created loss object. You just have to change this line
loss = tf.keras.losses.MeanSquaredError(img, mk)
to
loss = loss_object(img, mk)
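In context, the inner loop of train_step then becomes (a sketch, keeping the question's own names):
with tf.GradientTape() as tape:
    Xip = model((sizeh**-2 * np.ones((sizeh, sizeh))).reshape(-1, 49))
    MainGaussian_1_U()
    # call the MeanSquaredError *instance*; it returns a loss tensor
    loss = loss_object(img, mk)
gradients = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(gradients, model.trainable_variables))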

Pytorch Categorical Cross Entropy loss function behaviour

I have a question regarding the computation performed by the categorical cross-entropy loss in PyTorch.
I made this simple code snippet, and because I use the argmax of the output tensor as the targets, I cannot understand why the loss is still high.
import torch
import torch.nn as nn

ce_loss = nn.CrossEntropyLoss()
outputs = torch.randn(3, 5, requires_grad=True)
targets = torch.argmax(outputs, dim=1)
loss = ce_loss(outputs, targets)
print(loss)
Thanks for the help understanding it.
Best regards
Jerome
So here is sample data from your code, with the output, label, and loss having the following values:
outputs = tensor([[ 0.5968, -0.8249,  1.5018,  2.7888, -0.6125],
        [-1.1534, -0.4921,  1.0688,  0.2241, -0.0257],
        [ 0.3747,  0.8957,  0.0816,  0.0745,  0.2695]], requires_grad=True)
labels = tensor([3, 2, 1])
loss = tensor(0.7354, grad_fn=<NllLossBackward>)
So let's examine the values.
If you compute the softmax of your logits (outputs) using torch.softmax(outputs, dim=1), you will get
probs = tensor([[0.0771, 0.0186, 0.1907, 0.6906, 0.0230],
        [0.0520, 0.1008, 0.4801, 0.2063, 0.1607],
        [0.1972, 0.3321, 0.1471, 0.1461, 0.1775]], grad_fn=<SoftmaxBackward>)
So these will be your prediction probabilities.
Now, cross-entropy loss is nothing but a combination of softmax and negative log-likelihood loss. Hence, your loss can simply be computed as
loss = (torch.log(1/probs[0,3]) + torch.log(1/probs[1,2]) + torch.log(1/probs[2,1])) / 3
which is the average of the negative log of the probabilities of your true labels. The expression evaluates to 0.7354, matching the value returned by the nn.CrossEntropyLoss module. This is also why the loss stays high despite using the argmax as targets: for random logits, the softmax probability of even the argmax class is well below 1, so its negative log remains sizeable.
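A self-contained check of that equivalence (a sketch with fresh random logits):
import torch
import torch.nn as nn

outputs = torch.randn(3, 5)
targets = torch.argmax(outputs, dim=1)

# built-in cross-entropy
ce = nn.CrossEntropyLoss()(outputs, targets)

# manual softmax followed by negative log-likelihood
probs = torch.softmax(outputs, dim=1)
manual = -torch.log(probs[torch.arange(3), targets]).mean()

print(ce.item(), manual.item())  # the two values agree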

keras error when using custom loss

I want to use a simple BiLSTM model with my own custom loss function in Keras.
See below.
model = Sequential()
model.add(Bidirectional(LSTM(128, return_sequences=True), input_shape=(1, 8)))
model.add(Bidirectional(LSTM(128)))
model.add(Dense(64, activation='relu'))
model.add(Dense(20, activation='softmax'))

def my_loss_np(y_true, y_pred):
    labels = [np.argmax(y_pred[i]) for i in range(y_pred.shape[1])]
    loss = np.mean(labels)
    return loss

import keras.backend as K

def my_loss(y_true, y_pred):
    loss = K.eval(my_loss_np(K.eval(y_true), K.eval(y_pred)))
    return loss
When I compile this model, I get an error:
model.compile(loss=my_loss, optimizer='adam')
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'dense_95_target' with dtype float and shape [?,?]
[[Node: dense_95_target = Placeholder[dtype=DT_FLOAT, shape=[?,?], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
There are several issues here with your loss function:
You are using NumPy operations on tensors; unfortunately, though it is intuitive, this doesn't work. You need to use the tensor operators from the Keras backend, which are very similar.
To that end, you are calling K.eval, but at this stage you are still constructing a symbolic computation graph that will later be run in TensorFlow or Theano. The tensors don't yet have values to evaluate, per se; you need to keep everything symbolic, and you cannot pull out concrete values the way you would in NumPy.
Even if you fix the problems above, you are using a non-differentiable operation, argmax, which will not work with gradient-descent algorithms.
Your model looks like a multi-class classification problem: your final layer has 20 units with softmax, i.e. 20 classes. In this case, the literature uses categorical cross-entropy loss to train the classifier network.
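If a custom loss is still needed, it must stay symbolic and differentiable. A sketch using only Keras backend ops (effectively categorical cross-entropy written by hand):
import keras.backend as K

def my_loss(y_true, y_pred):
    # clip to avoid log(0); every operation here is a differentiable backend op
    y_pred = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())
    return -K.mean(K.sum(y_true * K.log(y_pred), axis=-1))

model.compile(loss=my_loss, optimizer='adam')
# or simply use the built-in: model.compile(loss='categorical_crossentropy', optimizer='adam')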
