In one of Google's papers (Exploring Tradeoffs in Models for Low-Latency Speech Enhancement) they use this loss function to minimize the error. I have implemented the function in Python with TensorFlow:
def loss_cal(noise_source, mask, target):
    lam = 0.113  # lambda weighting between the two terms
    masked_spec = noise_source * mask
    # power-law compression (exponent 0.3) with a 1e-4 floor, then L2 normalization
    cc = K.l2_normalize(tf.abs(tf.pow(tf.maximum(1e-4, target), 0.3)) - tf.abs(tf.pow(tf.maximum(1e-4, masked_spec), 0.3)))
    cm = K.l2_normalize(tf.math.pow(tf.maximum(1e-4, target), 0.3) - tf.math.pow(tf.maximum(1e-4, masked_spec), 0.3))
    res = tf.math.square(cc) + lam * tf.math.square(cm)
    return res
But it returns a matrix, while a loss function must return a scalar. Please correct me: is my implementation wrong, or is it possible to train a model with a matrix-valued loss?
Shouldn't you be using just the norm, not l2_normalize?
l2_normalize returns the tensor divided by its L2 norm, so the result is still a matrix, not a scalar.
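A minimal sketch (not the paper's official code) of how the same idea could produce a scalar: replace l2_normalize with explicit squared norms. The 0.3 power compression, the 1e-4 floor and the 0.113 weight are kept from the code above; the reductions are an assumption:

import tensorflow as tf

def loss_cal_scalar(noise_source, mask, target, lam=0.113):
    masked_spec = noise_source * mask
    # power-law compressed spectrograms, floored to avoid pow(0)
    t = tf.pow(tf.maximum(target, 1e-4), 0.3)
    m = tf.pow(tf.maximum(masked_spec, 1e-4), 0.3)
    # squared L2 norms of the differences (magnitude term and plain term)
    cc = tf.reduce_sum(tf.square(tf.abs(t) - tf.abs(m)))
    cm = tf.reduce_sum(tf.square(t - m))
    # one scalar value for the whole batch
    return cc + lam * cm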
I am trying to write a cross-entropy loss function by myself. My loss function gives the same loss value as the official one, but when I use it in the code instead of the official cross-entropy loss function, the code does not converge; when I use the official cross-entropy loss function, it converges. Here is my code, please give me some suggestions. Thanks very much.
The input 'out' is a tensor of shape (B, C) and 'label' contains class indices (1 × B).
class MylossFunc(nn.Module):
    def __init__(self):
        super(MylossFunc, self).__init__()

    def forward(self, out, label):
        # probabilities, then log -- numerically unstable, see the answer below
        out = torch.nn.functional.softmax(out, dim=1)
        n = len(label)
        loss = torch.FloatTensor([0])
        loss = Variable(loss, requires_grad=True)
        tmp = torch.log(out)
        for i in range(n):
            # log-probability of the true class, clamped at -100, averaged over the batch
            loss = loss - torch.max(tmp[i][label[i]], torch.scalar_tensor(-100)) / n
        loss = torch.sum(loss)
        return loss
Instead of using torch.softmax and torch.log, you should use torch.log_softmax, otherwise your training will become unstable, with NaN values everywhere.
This happens because when you take the softmax of your logits using the following line:
out = torch.nn.functional.softmax(out, dim=1)
you might get a zero in one of the components of out, and applying torch.log to it then results in NaN (since log(0) is undefined). That is why torch (and other common libraries) provide a single, numerically stable operation, log_softmax, to avoid the instabilities you get when you apply torch.softmax and torch.log separately.
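A minimal sketch of the same loss rewritten around log_softmax (keeping the -100 clamp from the question); this is an illustration of the suggestion above, not the official nn.CrossEntropyLoss implementation:

import torch
import torch.nn as nn
import torch.nn.functional as F

class MyLossFunc(nn.Module):
    def forward(self, out, label):
        # one numerically stable step instead of softmax followed by log
        logp = F.log_softmax(out, dim=1)
        # log-probability of the true class for each example in the batch
        picked = logp[torch.arange(len(label)), label]
        # clamp at -100 as in the original code, then average over the batch
        return -torch.clamp(picked, min=-100).mean()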
I'm trying to define the loss function for a two-class classification problem. However, the target label is not a hard label 0/1, but a float number between 0 and 1.
torch.nn.CrossEntropyLoss in PyTorch does not support soft labels, so I'm trying to write a cross-entropy function myself.
My function looks like this:
def cross_entropy(self, pred, target):
    loss = -torch.mean(torch.sum(target.flatten() * torch.log(pred.flatten())))
    return loss

def step(self, batch: Any):
    x, y = batch
    logits = self.forward(x)
    loss = self.criterion(logits, y)
    preds = logits  # torch.argmax(logits, dim=1)
    return loss, preds, y
However, it does not work at all.
Can anyone tell me whether there is a mistake in my loss function?
It seems like BCELoss and its more robust version BCEWithLogitsLoss work with fuzzy (soft) targets out of the box. They do not expect the target to be binary: any number between zero and one is fine.
Please read the docs.
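For reference, a minimal sketch of BCEWithLogitsLoss with soft targets; the shapes and the single-logit output are assumptions about the setup:

import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

logits = torch.randn(8, 1)         # raw model outputs (no sigmoid applied)
targets = torch.rand(8, 1)         # soft labels anywhere between 0 and 1

loss = criterion(logits, targets)  # a scalar, ready for backward()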
I have 1000 classes in the network and they have multi-label outputs. For each training example the number of positive outputs is the same (i.e. 10), but they can be assigned to any of the 1000 classes. So 10 classes have output 1 and the remaining 990 have output 0.
For the multi-label classification I am using binary cross-entropy as the cost function and sigmoid as the activation function. When I applied the usual rule of 0.5 as the cut-off for 1 or 0, all predictions were 0. I understand this is a class imbalance problem. From this link I understand that I might have to create extra output labels. Unfortunately, I haven't been able to figure out how to incorporate that into a simple neural network in Keras.
nclasses = 1000

# if we wanted to maximize an imbalance problem!
# class_weight = {k: len(Y_train)/(nclasses*(Y_train==k).sum()) for k in range(nclasses)}

inp = Input(shape=[X_train.shape[1]])
x = Dense(5000, activation='relu')(inp)
x = Dense(4000, activation='relu')(x)
x = Dense(3000, activation='relu')(x)
x = Dense(2000, activation='relu')(x)
x = Dense(nclasses, activation='sigmoid')(x)
model = Model(inputs=[inp], outputs=[x])

# pass the Adam instance so the custom learning rate is actually used
adam = keras.optimizers.Adam(lr=0.00001)
model.compile(adam, 'binary_crossentropy')
history = model.fit(X_train, Y_train, batch_size=32, epochs=50, verbose=0, shuffle=False)
Could anyone help me with the code here? I would also highly appreciate it if you could suggest a good accuracy metric for this problem.
Thanks a lot :) :)
I have a similar problem and unfortunately have no answer for most of the questions, especially the class imbalance problem.
In terms of metrics there are several possibilities: in my case I use the top 1/2/3/4/5 results and check if one of them is right. Because in your case you always have the same number of labels equal to 1, you could take your top 10 results, see what percentage of them are right, and average this result over your batch. I didn't find a way to include this algorithm as a Keras metric; instead, I wrote a callback which calculates the metric at epoch end on my validation data set (see the sketch below).
Also, if you predict the top n results on a test dataset, check how many times each class is predicted. The Counter class is really convenient for this purpose.
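A minimal sketch of such a callback; the names, and running predict on the full validation set, are my assumptions rather than the original code:

import numpy as np
from keras.callbacks import Callback

class TopKMultiLabelAccuracy(Callback):
    def __init__(self, X_val, Y_val, k=10):
        super().__init__()
        self.X_val, self.Y_val, self.k = X_val, Y_val, k

    def on_epoch_end(self, epoch, logs=None):
        probs = self.model.predict(self.X_val)
        # indices of the k highest-scoring classes for each sample
        topk = np.argsort(probs, axis=1)[:, -self.k:]
        # fraction of the predicted top-k classes that are truly positive
        hits = np.take_along_axis(self.Y_val, topk, axis=1)
        print("epoch %d: top-%d multi-label accuracy = %.4f" % (epoch, self.k, hits.mean()))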
Edit: I found a method to include class weights without splitting the output.
You need a NumPy 2-D array of weights with shape [number of classes to predict, 2] (background and signal).
Such an array can be calculated with this function:
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

def calculating_class_weights(y_true):
    number_dim = np.shape(y_true)[1]
    weights = np.empty([number_dim, 2])
    for i in range(number_dim):
        # 'balanced' weights for the 0/1 labels of each output column
        weights[i] = compute_class_weight('balanced', classes=np.array([0., 1.]), y=y_true[:, i])
    return weights
The solution is now to build your own binary crossentropy loss function in which you multiply your weights yourself:
def get_weighted_loss(weights):
    def weighted_loss(y_true, y_pred):
        return K.mean((weights[:, 0] ** (1 - y_true)) * (weights[:, 1] ** y_true) * K.binary_crossentropy(y_true, y_pred), axis=-1)
    return weighted_loss
weights[:,0] is an array with all the background weights and weights[:,1] contains all the signal weights.
All that is left is to include this loss into the compile function:
model.compile(optimizer=Adam(), loss=get_weighted_loss(class_weights))
I have trained a GAN on the CelebA dataset. After that I separated G and D. Then I pick one image from the CelebA training set, say yTrue, and I want to find the closest image that G can generate, say yPred. So the loss at the output of G is ||yTrue - yPred||_2^2, and I minimize it w.r.t. the generator input (a latent variable drawn from a normal distribution). Below is code that gives good results. The problem is that I also want to add the prior loss log(1 - D(G(z))) to the first line, but I don't see how to do it: D is no longer connected to G, and if I directly add K.mean(K.log(1 - D.predict(G.output))) to the first line it returns a NumPy array, not a tensor, which is not allowed.
loss = K.mean(K.square(yTrue - gf.output))
grad = K.gradients(loss, [gf.input])[0]
fn = K.function([gf.input], [grad])

generator_input = np.random.normal(0, 1, [1, 100])
for i in range(5000):
    grads = fn([generator_input])
    generator_input -= grads[0] * .01
recovered = gf.predict(generator_input)
In Keras, you use the final output to create the loss function. Then you have to train the full network to achieve that loss (train G+D joined as a single model).
In the loss function, you will have y_true and y_pred, and you use them to compare:
PS: if the MSE is not supposed to take the output of the discriminator, please detail your question better.
import keras.backend as K

def customLoss(yTrue, yPred):
    mse = K.mean(K.square(yTrue - yPred))
    prior = K.mean(K.log(1 - yPred))
    return mse + prior
Pass this function when compiling the model
discriminator.compile(loss=customLoss,optimizer=.....)
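One way around the D.predict problem is to call the separated discriminator on the generator's symbolic output, so that D(G(z)) stays a Keras tensor inside the same graph. A minimal sketch, assuming disc is the separated discriminator and gf the generator from the question:

import keras.backend as K

# connect the discriminator to the generator's symbolic output (no .predict)
d_of_g = disc(gf.output)

# reconstruction term plus the prior term log(1 - D(G(z)))
loss = K.mean(K.square(yTrue - gf.output)) + K.mean(K.log(1.0 - d_of_g))
grad = K.gradients(loss, [gf.input])[0]
fn = K.function([gf.input], [grad])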
I have built a Keras model for image segmentation (U-Net). However, in my samples some misclassified areas are not that important, while others are crucial, so I want to assign them a higher weight in the loss function. To complicate things further, I would like some misclassifications (class 1 instead of 2) to carry a very high penalty, while the inverse (class 2 instead of 1) shouldn't be penalized that much.
The way I see it, I need a sum (across all of the pixels) of weighted categorical cross-entropy, but the best I could find is this:
def w_categorical_crossentropy(y_true, y_pred, weights):
    nb_cl = len(weights)
    final_mask = K.zeros_like(y_pred[:, 0])
    y_pred_max = K.max(y_pred, axis=1)
    y_pred_max = K.reshape(y_pred_max, (K.shape(y_pred)[0], 1))
    y_pred_max_mat = K.cast(K.equal(y_pred, y_pred_max), K.floatx())
    # accumulate weights[true_class, predicted_class] for every sample
    for c_p, c_t in product(range(nb_cl), range(nb_cl)):
        final_mask += weights[c_t, c_p] * y_pred_max_mat[:, c_p] * y_true[:, c_t]
    return K.categorical_crossentropy(y_pred, y_true) * final_mask
However, this code only works with a single prediction, and my knowledge of Keras internals is lacking (the math side of it is not much better). Does anyone know how I can adapt it, or even better, is there a ready-made loss function that would suit my case?
I would appreciate some pointers.
EDIT: my question is similar to "How to do point-wise categorical crossentropy loss in Keras?", except that I would like to use weighted categorical cross-entropy.
You can use weight maps (as proposed in the U-Net paper). In such a weight map you can give some regions more weight and others less. Here is some pseudocode:
loss = compute_categorical_crossentropy()
weighted_loss = loss * weight_map # using element-wise multiplication
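As a concrete (hypothetical) version of that pseudocode, here is a pixel-wise weighted categorical cross-entropy for Keras. It weights each pixel by its true class only, so the full true-vs-predicted weighting of the code above would still need the weights matrix; the example class weights are made up, and y_true is assumed one-hot of shape (batch, H, W, n_classes):

import numpy as np
import keras.backend as K

def weighted_pixelwise_crossentropy(class_weights):
    # class_weights: one weight per class, indexed by the true class
    cw = K.constant(class_weights)

    def loss(y_true, y_pred):
        # per-pixel categorical cross-entropy, shape (batch, H, W)
        ce = K.categorical_crossentropy(y_true, y_pred)
        # weight of each pixel's true class (y_true is one-hot on the last axis)
        pixel_w = K.sum(y_true * cw, axis=-1)
        return K.mean(ce * pixel_w)

    return loss

# e.g. mistakes on class 1 cost ten times more than on the other classes
model.compile(optimizer='adam', loss=weighted_pixelwise_crossentropy(np.array([1.0, 10.0, 1.0])))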