I am implementing the Keras Variational Autoencoder (https://keras.io/examples/generative/vae/). During training, the total loss printed is not the sum of the reconstruction loss and the KL loss terms, as it should be. Any suggestions on how to solve this problem?
I suspect the issue is related to the loss trackers, but I have no idea how to fix it.
If you use the code mentioned in your question, the loss shown during training is the total loss, because you return the total loss in the training step. However, if you have changed something, you should attach your code (in a git repo, please :) and show the losses printed during training.
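For what it's worth, the values printed during training come from `keras.metrics.Mean` trackers that are updated once per batch, and since averaging is linear, the displayed total mean should still equal the sum of the displayed reconstruction and KL means, as long as each batch's total is computed as their sum. A minimal pure-Python sketch of that tracker behaviour (the `RunningMean` class below is an illustrative stand-in for `keras.metrics.Mean`, not Keras code, and the batch losses are made up):

```python
class RunningMean:
    """Stand-in for keras.metrics.Mean: tracks a running average per epoch."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update_state(self, value):
        self.total += value
        self.count += 1

    def result(self):
        return self.total / self.count

# Per-batch losses; the total is computed as the sum, as in the Keras VAE example.
rec_tracker, kl_tracker, total_tracker = RunningMean(), RunningMean(), RunningMean()
batches = [(1.20, 0.30), (0.90, 0.25), (0.80, 0.20)]  # (reconstruction, kl) per batch

for rec, kl in batches:
    total = rec + kl  # total_loss = reconstruction_loss + kl_loss
    rec_tracker.update_state(rec)
    kl_tracker.update_state(kl)
    total_tracker.update_state(total)

# Because the mean is linear, the printed running means still sum exactly.
print(round(total_tracker.result(), 6))                          # → 1.216667
print(round(rec_tracker.result() + kl_tracker.result(), 6))      # → 1.216667
```

If the printed total in your run does not match the sum of the other two printed values, the most likely cause is that the per-batch total passed to the tracker is not actually the sum of the two per-batch terms you are tracking.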
I am using Keras 2.2.4 to train a text-based emotion recognition model; it is a three-category classification task.
I put accuracy in the model compile metrics. While training, the accuracy shown in the progress bar looked normal, around 86%, and the loss was around 0.35, so I believed it was working properly.
After training, prediction on the testing set was pretty bad: accuracy was only around 0.44. My first instinct was that the model was overfitting. However, out of curiosity, I also fed the training set into model prediction, and the accuracy was just as bad as on the testing set.
That result suggests it is not an overfitting issue, and I cannot come up with any possible reason why this happens. The difference between the accuracy reported on the training set during training and the accuracy on the same training set after training is even more confusing.
Has anyone encountered the same situation, and what could the problem be?
I need help, as I am new to Keras and was reading about dropout and how using it affects loss calculation during the training and validation phases. Dropout is only active at training time, not at validation time, so comparing the two losses can be misleading.
My questions are:
- What does the use of learning_phase_scope(1) mean?
- How does it impact validation?
- What steps should I take to correct the testing loss when dropout is used?
It's not only Dropout but BatchNormalization as well that behaves differently between phases, so setting the learning phase incorrectly will affect validation performance.
If you use Keras and just want the validation loss (and/or accuracy or other metrics), you are better off using model.evaluate(), or passing validation_data to model.fit, and not touching learning_phase_scope at all.
learning_phase_scope(1) sets the learning phase to training; 0 is for prediction/validation.
Personally, I use learning_phase_scope only when I want to do something that doesn't go through a simple model.fit (e.g. visualizing CNN filters), and I have needed it only once in the past 3 years.
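To illustrate why training and validation losses aren't directly comparable under dropout, here is a minimal pure-Python sketch of inverted dropout (the convention Keras uses): at training time units are randomly zeroed and the survivors are scaled by 1/(1-rate), while at inference time the layer is an identity. The `dropout` function below is an illustrative stand-in, not the Keras implementation:

```python
import random

def dropout(x, rate, training):
    """Inverted dropout: zero units at random during training, scale survivors
    by 1/(1-rate) so the expected activation matches inference."""
    if not training or rate == 0.0:
        return list(x)  # inference: identity, no scaling needed
    keep = 1.0 - rate
    return [v / keep if random.random() < keep else 0.0 for v in x]

random.seed(0)
x = [1.0] * 10000
train_out = dropout(x, rate=0.5, training=True)
infer_out = dropout(x, rate=0.5, training=False)

# Inference output is unchanged; training output is noisy but matches in expectation.
print(sum(infer_out) / len(infer_out))            # → 1.0 exactly
print(round(sum(train_out) / len(train_out), 1))  # ≈ 1.0 on average
```

The per-batch training loss is computed on these noisy, randomly-masked activations, while the validation loss is computed on the clean deterministic ones, which is why training loss can sit above validation loss without anything being wrong.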
I made two different convolutional neural networks for multi-class classification and tested their performance using Keras's evaluate_generator function. Both models give me comparable accuracies: one gives 55.9% and the other 54.8%. However, the model with 55.9% accuracy has a validation loss of 5.37, while the other's is 1.24.
How can these test losses be so different when the accuracies are similar? If anything, I would expect the loss for the model with 55.9% accuracy to be lower, but it's not.
Isn't loss the total sum of errors the network is making?
Any insights would be appreciated.
Isn't loss the total sum of errors the network is making?
Well, not really. A loss function (or cost function) maps an event, or the values of one or more variables, onto a real number intuitively representing some "cost" associated with the event.
For example, in regression tasks the loss function can be mean squared error; in classification, binary or categorical crossentropy. These loss functions measure how close your model's understanding of the data is to reality.
Why are both loss and accuracy high?
A high loss doesn't mean your model knows nothing. In the basic case, you can think of it this way: the smaller the loss, the more confident the model is in its choices.
So a model with a higher loss is not really sure about its answers.
You can also read this discussion about high loss and accuracy.
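To make the confidence point concrete, here is a small pure-Python sketch: two classifiers make identical predictions (so identical accuracy), but the one whose wrong answer is made with high confidence pays a much larger categorical crossentropy. The probabilities are made up purely for illustration:

```python
import math

def categorical_crossentropy(y_true_idx, probs):
    """Average negative log-probability assigned to the true class."""
    return -sum(math.log(p[t]) for t, p in zip(y_true_idx, probs)) / len(probs)

# Three samples, true classes 0, 1, 2. Both models predict classes 0, 1, 0,
# so both get the first two right and the third wrong: accuracy 2/3 for each.
y_true = [0, 1, 2]
cautious  = [[0.6, 0.2, 0.2], [0.2, 0.6, 0.2], [0.4, 0.3, 0.3]]    # wrong, but hedged
confident = [[0.6, 0.2, 0.2], [0.2, 0.6, 0.2], [0.98, 0.01, 0.01]]  # wrong and sure

print(round(categorical_crossentropy(y_true, cautious), 3))   # → 0.742
print(round(categorical_crossentropy(y_true, confident), 3))  # → 1.876
```

Same accuracy, wildly different loss: the confident model assigns probability 0.01 to the true class on the sample it misses, and -log(0.01) dominates the average. That is exactly the situation in the question, where 55.9% accuracy coexists with a loss of 5.37.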
Even though the accuracies are similar, the loss values are not correlated when comparing different models.
Accuracy measures the fraction of correctly classified samples over the total population of your samples.
With regards to the loss value, from keras documentation:
Return value
Scalar test loss (if the model has no other metrics) or list of scalars (if the model computes other metrics).
If this doesn't help in your case (I don't have a way to reproduce the issue), please check the following known issues in Keras with regards to the evaluate_generator function:
evaluate_generator
I have implemented a Keras-based Bayesian deep learning model (based on this repo).
My model's loss appears to be always negative, as does the logits_variance_loss (see screenshot below). Any idea why this is happening, or what it means for the training?
And this is after 2 epochs
It may be a stupid question but:
I noticed that the choice of the loss function modifies the accuracy obtained during evaluation.
I thought that the loss was used only during training; of course the goodness of the model's predictions depends on it, but not the accuracy, i.e. the number of right predictions over the total number of samples.
EDIT
I didn't explain myself correctly.
My question comes from the fact that I recently trained a model with the binary_crossentropy loss, and the accuracy reported by model.evaluate() was 96%.
But it wasn't correct!
I checked "manually", and the model was getting only 44% of the predictions right. Then I changed the loss to categorical_crossentropy, and the accuracy was correct.
MAYBE ANSWER
From: another question
I have found the problem. metrics=['accuracy'] calculates accuracy automatically from cost function. So using binary_crossentropy shows binary accuracy, not categorical accuracy. Using categorical_crossentropy automatically switches to categorical accuracy and now it is the same as calculated manually using model1.predict().
Keras chooses the performance metric based on your loss function. When you use binary_crossentropy, it also uses binary_accuracy, which is computed differently from categorical_accuracy. You should always use categorical_crossentropy if you have more than one output neuron.
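To show the difference, here is a pure-Python sketch of how the two metrics are computed (simplified versions of the idea, not the actual Keras implementations): binary_accuracy thresholds every output element at 0.5 and compares element-wise against the one-hot labels, while categorical_accuracy compares only the argmax. On one-hot targets the element-wise comparison is dominated by the easy zero entries, which inflates the score:

```python
def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Element-wise accuracy after thresholding, averaged over all entries."""
    hits = sum((p > threshold) == bool(t)
               for row_t, row_p in zip(y_true, y_pred)
               for t, p in zip(row_t, row_p))
    return hits / sum(len(row) for row in y_true)

def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose argmax matches the one-hot label."""
    argmax = lambda row: max(range(len(row)), key=row.__getitem__)
    hits = sum(argmax(t) == argmax(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

# One-hot labels over 3 classes; only the first prediction has the right argmax.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
y_pred = [[0.40, 0.30, 0.30],   # argmax correct (class 0)
          [0.40, 0.35, 0.25],   # argmax wrong (class 0 instead of 1)
          [0.45, 0.30, 0.25]]   # argmax wrong (class 0 instead of 2)

print(round(categorical_accuracy(y_true, y_pred), 3))  # → 0.333 (1 of 3 samples)
print(round(binary_accuracy(y_true, y_pred), 3))       # → 0.667 (6 of 9 entries)
```

This is the same shape of discrepancy reported above: a genuine 44% categorical accuracy showing up as 96% under the binary metric, because most entries of a one-hot target are zeros that any low prediction "gets right".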
The model tries to minimize the loss function chosen. It adjusts the weights to do this. A different loss function results in different weights.
Those weights determine how many correct predictions are made over the total number of samples. So it is expected behavior that the choice of loss function affects the model's accuracy.