How does Keras obtain a floating point value for loss from a vector of loss values?

I am training an ML model in TensorFlow 1.15 using a custom training loop and I want to print the loss. However, the loss is a vector whose length equals the batch size. What does Keras do when you call model.fit() to print the loss as a float? Does it reduce the vector to its mean, or does it perform some other reduction?
I am asking because I want the loss I log to be consistent across all my models, and the others did not require a custom training loop.
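As a rough illustration of the reduction being asked about, here is a minimal NumPy sketch. It assumes the default Keras behaviour as I understand it: the per-sample losses of a batch are averaged, and the number shown during model.fit() is a running average over the batches seen so far, weighted by batch size. The helper names batch_loss and epoch_loss are made up for this example, and sample weights or a non-default reduction would change the result.

```python
import numpy as np

def batch_loss(per_sample_losses):
    """Reduce a vector of per-sample losses to one float by averaging."""
    return float(np.mean(per_sample_losses))

def epoch_loss(batches):
    """Running average over batches, weighted by the size of each batch."""
    total, count = 0.0, 0
    for losses in batches:
        total += batch_loss(losses) * len(losses)
        count += len(losses)
    return total / count

# Two batches of different sizes (hypothetical loss values).
batches = [np.array([0.5, 1.5]), np.array([1.0, 2.0, 3.0])]

print(batch_loss(batches[0]))  # mean of the first loss vector
print(epoch_loss(batches))     # size-weighted mean over both batches
```

In a TF 1.15 custom loop, the matching step would be to apply tf.reduce_mean to the loss vector before logging it.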

Related

For a regression model, why does the validation set passed to model.fit give a different metric result than model.evaluate?

I have a regression model with Euclidean distance as the loss function and RMSE as the evaluation metric (lower is better). When I pass my train and test sets to model.fit, I get train_rmse and test_rmse values that make sense to me. But when I pass the test set to model.evaluate after loading the weights of the trained model, I get different results, approximately twice the values reported by model.fit. I am aware of the difference to expect between training and test evaluation, as the Keras documentation explains:
the training loss is the average of the losses over each batch of training data. Because your model is changing over time, the loss over the first batches of an epoch is generally higher than over the last batches. On the other hand, the testing loss for an epoch is computed using the model as it is at the end of the epoch, resulting in a lower loss.
But here I am talking about the result for the test set passed to model.fit, which I believed is evaluated on the final model. The Keras documentation says the following about the validation_data argument, which is where I pass the test set:
validation_data: Data on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data.
When I searched for the problem, I found several reported causes:
1- Some people, like here, report that the issue lies in the model itself if it has a batch normalization layer, or if you do transfer learning and freeze some BN layers, like here. My model has BN layers, and I did not freeze any layer. Also, I used the same model for a multi-class classification problem (not regression), and the results for the test set were the same in model.fit and model.evaluate.
2- Other people said that this is related to either the prediction or the metric calculation, like here, where they found that the difference came from mismatched dtypes of y_true and y_pred: if one is float32 and the other float64, for example, the metric calculation will differ. Once they unified the dtypes, the problem was fixed.
I believed that the last case applied to me, since in the regression task my labels are now tf.float32. My y_true labels are already cast to tf.float32 through the tfrecord, so I tried casting y_pred to tf.float32 before the RMSE calculation, but I still get the difference in the results.
So my questions are:
Why is there this difference in results?
Which result should I rely on for the test set: the one from model.fit or the one from model.evaluate?
I know that for the training loss and accuracy, Keras keeps a running average over the batches, and I know that for model.evaluate these metrics are calculated in a single pass over the whole dataset with the final model. But how are the validation loss and accuracy calculated for the validation set passed to model.fit?
UPDATE:
The problem was a shape mismatch between y_true and y_pred. I saved the y_true label in the tfrecords as a single float value, so a batch of labels has shape [batch_size], while the regression model outputs predictions of shape [batch_size, 1]. As a result, tf.subtract(y_true, y_pred) in the RMSE equation broadcasts to a matrix of shape [batch_size, batch_size]. Taking the mean of that matrix raises no error, so nothing signals the problem, but the resulting RMSE is wrong. I am still working on making the shapes consistent and have not yet found a good solution.
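The broadcasting pitfall described in the update can be reproduced with plain NumPy, which follows the same broadcasting rules as tf.subtract. This is only a sketch with made-up values. The rmse helper is hypothetical, and flattening the predictions is one possible fix, not necessarily the one the author ended up using.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error over whatever shape the difference has."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

y_true = np.array([1.0, 2.0, 3.0])        # shape (3,), like labels read from tfrecords
y_pred = np.array([[1.0], [2.0], [3.0]])  # shape (3, 1), like a Dense(1) output

diff = y_true - y_pred
print(diff.shape)  # broadcast to a (3, 3) matrix instead of 3 element-wise differences

wrong = rmse(y_true, y_pred)              # silently computed over the 3x3 matrix
right = rmse(y_true, y_pred.reshape(-1))  # shapes aligned first: both (3,)
print(wrong, right)
```

In TensorFlow the shapes could be aligned in a similar way, for example with tf.squeeze(y_pred, axis=-1) or tf.reshape(y_true, [-1, 1]) before the subtraction.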

Keras "acc" metrics - an algorithm

In Keras I often see people compile a model with the mean squared error loss function and "acc" as the metric.
model.compile(optimizer=opt, loss='mse', metrics=['acc'])
I have been reading about acc, but I cannot find the algorithm behind it.
What if I changed my loss function to binary crossentropy, for example, and used 'acc' as the metric? Would this be the same metric as in the first case, or does Keras change acc based on the loss function (binary crossentropy in this case)?
Check the source code from line 375. The metric_fn changes depending on the loss function, so Keras handles it automatically.
If you want to compare models that use different loss functions, it may be necessary to specify explicitly which accuracy metric to grade them with, so that the models are actually evaluated by the same test.

Why does model.evaluate() in Keras use the loss to calculate accuracy?

It may be a stupid question but:
I noticed that the choice of the loss function modifies the accuracy obtained during evaluation.
I thought that the loss was used only during training. Of course the quality of the model's predictions depends on it, but I did not expect it to affect how accuracy is computed, i.e. the number of right predictions over the total number of samples.
EDIT
I didn't explain myself correctly.
My question comes because I recently trained a model with binary_crossentropy loss and the accuracy coming from model.evaluate() was 96%.
But it wasn't correct!
I checked "manually" and the model was getting only 44% of the predictions right. Then I changed to categorical_crossentropy, and the accuracy was correct.
MAYBE ANSWER
From: another question
I have found the problem. metrics=['accuracy'] calculates accuracy automatically from cost function. So using binary_crossentropy shows binary accuracy, not categorical accuracy. Using categorical_crossentropy automatically switches to categorical accuracy and now it is the same as calculated manually using model1.predict().
Keras chooses the performance metric to use based on your loss function. When you use binary_crossentropy, it also uses binary_accuracy, which is computed differently from categorical_accuracy. For single-label classification with more than one output neuron, you should use categorical_crossentropy.
The model tries to minimize the loss function chosen. It adjusts the weights to do this. A different loss function results in different weights.
Those weights determine how many correct predictions are made over the total number of samples. So it is correct behavior to see that the loss function chosen will affect the model accuracy.
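To see why the two accuracy variants can disagree so much on the same predictions, here is a small NumPy sketch. The two functions are my own re-implementations of what binary accuracy and categorical accuracy compute (thresholding each output unit at 0.5 versus comparing arg-max classes). They are not the Keras source, and the data is invented.

```python
import numpy as np

def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of individual output units that match after thresholding."""
    hard = (y_pred > threshold).astype(y_true.dtype)
    return float(np.mean(hard == y_true))

def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose arg-max class matches the one-hot label."""
    return float(np.mean(np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1)))

# Four samples, three classes, one-hot labels (made-up values).
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=float)
y_pred = np.array([[0.40, 0.35, 0.25],
                   [0.40, 0.30, 0.30],
                   [0.45, 0.30, 0.25],
                   [0.20, 0.45, 0.35]])

bin_acc = binary_accuracy(y_true, y_pred)       # 8 of 12 units match: about 0.67
cat_acc = categorical_accuracy(y_true, y_pred)  # 1 of 4 samples correct: 0.25
print(bin_acc, cat_acc)
```

Most entries of a one-hot label are zero, so predicting "no class" for every unit already scores well under binary accuracy. That is consistent with the inflated 96% reported in the question.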

Keras: How is the loss evaluated during optimization for a network with multiple output layers?

I am using functional API in Keras to build a neural network model with multiple output layers.
I was wondering how the loss is evaluated when updating the weights during optimization (i.e. during back-propagation). Assuming that the same loss function is used for every output, is the average loss over all outputs minimized, or is each output evaluated separately to update the weights?
Thanks in advance!
There is always only one loss that is used to backpropagate the errors. When a model has multiple outputs, each output is associated with its own loss, and a "global" loss is then constructed as a weighted sum of the per-output losses. You can set the weight for each loss when you compile the model, using the loss_weights argument.
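The combination described above can be sketched in a few lines. This assumes the usual Keras convention that the total loss is the weighted sum (not the average) of the per-output losses, with every weight defaulting to 1. The helper total_loss and the numbers are illustrative only.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error of one output head."""
    return float(np.mean((y_true - y_pred) ** 2))

def total_loss(losses, loss_weights=None):
    """Weighted sum of per-output losses; weights default to 1.0 each."""
    if loss_weights is None:
        loss_weights = [1.0] * len(losses)
    return sum(w * l for w, l in zip(loss_weights, losses))

# Two output heads with made-up targets and predictions.
loss_a = mse(np.array([1.0, 2.0]), np.array([1.0, 4.0]))  # (0 + 4) / 2 = 2.0
loss_b = mse(np.array([0.0, 0.0]), np.array([1.0, 1.0]))  # (1 + 1) / 2 = 1.0

unweighted = total_loss([loss_a, loss_b])            # 2.0 + 1.0
weighted = total_loss([loss_a, loss_b], [1.0, 0.5])  # 2.0 + 0.5 * 1.0
print(unweighted, weighted)
```

The gradient of this single scalar is what gets backpropagated, so shared layers receive contributions from every output in proportion to its weight.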

How to write a categorization accuracy loss function for keras (deep learning library)?

Categorization accuracy loss is the percentage of predictions that are wrong, i.e. #wrong/#data points.
Is it possible to write a custom loss function for that?
Thanks.
EDIT
Although Keras allows you to use a custom loss function, I am no longer convinced that using accuracy as a loss makes sense. First, the network's last layer will typically be a softmax, so you obtain a vector of class probabilities rather than the single most likely class. Second, I fear that there will be issues with the gradient computation because accuracy is not smooth.
OLD POST
Keras offers you the possibility to use custom loss functions. To get the accuracy loss, you can take inspiration from the examples that are already implemented. For binary classification, I would suggest the following implementation:
from keras import backend as K
def mean_accuracy_error(y_true, y_pred):
    # sign(y_true - y_pred) is 0 only where the prediction equals the label exactly
    return K.mean(K.abs(K.sign(y_true - y_pred)), axis=-1)
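The smoothness concern raised in the edit can be checked numerically. The sketch below uses NumPy rather than Keras, with invented values and hypothetical helper names. It nudges the predictions slightly and compares how a thresholded error rate and binary cross-entropy respond.

```python
import numpy as np

def accuracy_loss(y_true, y_pred, threshold=0.5):
    """Fraction of wrong predictions after thresholding (piecewise constant)."""
    hard = (y_pred > threshold).astype(y_true.dtype)
    return float(np.mean(hard != y_true))

def cross_entropy(y_true, y_pred):
    """Binary cross-entropy, a smooth surrogate for the error rate."""
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1.0 - y_true) * np.log(1.0 - y_pred)))

y_true = np.array([1.0, 0.0, 1.0])
y_pred = np.array([0.6, 0.3, 0.4])
nudged = y_pred + 1e-3  # a tiny step, as a gradient update might take

acc_change = accuracy_loss(y_true, nudged) - accuracy_loss(y_true, y_pred)
ce_change = cross_entropy(y_true, nudged) - cross_entropy(y_true, y_pred)
print(acc_change, ce_change)
```

No prediction crosses the threshold, so the error rate does not move at all and gives the optimizer no gradient signal, while the cross-entropy changes by a small nonzero amount.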
