How does Keras evaluate loss on test set?

I am implementing a neural network classifier. To print the loss and accuracy of this network, I'm using:
score = model.evaluate(x_test, y_test, verbose=False)
model.metrics_names
print('Test score: ', score[0]) #Loss on test
print('Test accuracy: ', score[1])
I would like to know how Keras calculates the loss of a model, and specifically whether it is evaluated on only the first step (batch) of the test set.
I have searched keras.io, but I haven't found anything about it.

From the documentation:
evaluate
Computes the loss on some input data, batch by batch.
Returns
Scalar test loss (if the model has no metrics) or list of scalars (if the model computes other metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs.
So it returns either a single value that represents the loss, or a list of values that correspond to the different metrics that were added to your model. These values are calculated over the whole test set, i.e., all values in x_test and y_test.
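To make "batch by batch" concrete, here is a minimal pure-Python sketch. It is not Keras's actual implementation; it assumes that per-batch losses are averaged with each batch weighted by its size. The helper names `mse` and `evaluate_in_batches` and the sample values are hypothetical.

```python
# Hypothetical illustration, not Keras code: aggregate per-batch losses
# and compare the result with the loss computed over the full set.

def mse(y_true, y_pred):
    """Mean squared error over a list of samples."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def evaluate_in_batches(y_true, y_pred, batch_size):
    """Average the per-batch losses, weighting each batch by its size."""
    total, n = 0.0, 0
    for start in range(0, len(y_true), batch_size):
        yt = y_true[start:start + batch_size]
        yp = y_pred[start:start + batch_size]
        total += mse(yt, yp) * len(yt)  # the last batch may be smaller
        n += len(yt)
    return total / n

y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.5, 2.0, 2.0, 4.5, 7.0]

full_loss = mse(y_true, y_pred)
batched_loss = evaluate_in_batches(y_true, y_pred, batch_size=2)
print(full_loss, batched_loss)
```

Under that assumption the two numbers agree, which is why the reported test loss reflects every sample and not just the first batch.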

First, print model.metrics_names. Assume the output is ['loss', 'accuracy'].
Then you can do:
[test_loss,test_acc] = model.evaluate(X_test,y_test)
print('test accuracy: ',test_acc*100,'%')
print('test loss: ',test_loss)

Related

How to get a RMSE value

I already fit the equation. Now I want the RMSE value
q3_1=data1[['bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot', 'floors', 'zipcode']]
q3_2=data1[['bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot', 'floors','zipcode','condition','grade','waterfront','view','sqft_above','sqft_basement','yr_built','yr_renovated',
'lat', 'long','sqft_living15','sqft_lot15']]
reg = LinearRegression()
reg.fit(q3_1,data1.price)
reg.fit(q3_2,data1.price)
I am not able to proceed from here. I need the RMSE value in both the cases.
As far as I can tell, you are using TensorFlow on Google Colab.
I don't know exactly what your LinearRegression object is, but I suppose that it is a Keras model with a single node.
Hence, I have a question: how do you train the same model (your reg instance) on datasets with different schemas, one with 6 columns and the other with 18?
By the way, during training/fitting, Keras can report the MSE of each epoch, as well as a validation MSE if you provide a validation dataset. Finally, you can use the evaluate method, which:
Returns the loss value & metrics values for the model [...]
Just use the "mean_squared_error" metric.
Edit
As you are using scikit-learn, you have to compute the metric yourself.
You can use the predict method to get the predictions of your trained model on a dataset.
Then, there is the mean_squared_error metric, which is straightforward to use.
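import math
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
# data1.features stands for whichever feature columns you fit on
train_x, train_y = data1.features[:-100], data1.price[:-100]
test_x, test_y = data1.features[-100:], data1.price[-100:]
reg = LinearRegression()
reg.fit(train_x, train_y)
predictions = reg.predict(test_x)
mse = mean_squared_error(test_y, predictions)
print("RMSE: %s" % math.sqrt(mse))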
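For reference, the RMSE itself is simple enough to compute by hand. The sketch below uses only the standard library; the function name `rmse` and the sample prices are made up for illustration. To get the value "in both cases", one option would be to fit a separate estimator on each feature set and pass each set of predictions through the same function.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

# Hypothetical targets and predictions, for illustration only
prices = [200.0, 310.0, 150.0]
predicted = [203.0, 306.0, 150.0]

value = rmse(prices, predicted)
print("RMSE:", value)
```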

How to train a CNN model?

When trying to train the CNN model, I came across a code shown below:
def train(n_epochs, loaders, model, optimizer, criterion):
    for epoch in range(1,n_epochs):
        train_loss = 0
        valid_loss = 0
        model.train()
        for i, (data,target) in enumerate(loaders['train']):
            # zero the parameter (weight) gradients
            optimizer.zero_grad()
            # forward pass to get outputs
            output = model(data)
            # calculate the loss
            loss = criterion(output, target)
            # backward pass to calculate the parameter gradients
            loss.backward()
            # update the parameters
            optimizer.step()
Can someone please tell me why the second for loop is used?
i.e., for i, (data,target) in enumerate(loaders['train']):
And why are optimizer.zero_grad() and optimizer.step() used?
torch.utils.data.DataLoader comes in handy when you need to prepare data batches (and perhaps shuffle them before every run).
data_train_loader = DataLoader(data_train, batch_size=64, shuffle=True)
In the above code, the first for loop iterates over the epochs, while the second loop iterates over the training dataset, split into batches by the DataLoader above. For example:
for batch_idx, samples in enumerate(data_train_loader):
    # samples will be a 64 x D dimensional tensor
    # batch_idx is the index of the current batch
    pass
Learn more about torch.utils.data.DataLoader from here.
optimizer.zero_grad(): Before the backward pass, use the optimizer object to zero all of the gradients for the tensors it will update (which are the learnable weights of the model).
optimizer.step(): We generally use optimizer.step() to make the gradient descent step. Calling the step function on an Optimizer makes an update to its parameters.
Learn more about these from here.
The optimizer is first constructed with the model's parameters, like this (missing in your code):
# Adam takes betas rather than a momentum argument; use optim.SGD if you want momentum=0.9
optimizer = optim.Adam(model.parameters(), lr=0.001)
This code
loss = criterion(output, target)
calculates the loss of a single batch, where target comes from the (data, target) tuple and data is the model input that produced output.
This step:
optimizer.zero_grad()
zeroes the gradients of all parameters held by the optimizer. This matters on every iteration, not only at initialization, because PyTorch accumulates gradients across backward() calls.
The part
loss.backward()
Calculates the gradients, and the optimizer.step() updates our model weights and biases (parameters).
In PyTorch you typically use the DataLoader class to load the training and validation sets.
loaders['train']
is probably the full training set, and one pass over it represents a single epoch.
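The roles of the two loops, zero_grad() and step() can be imitated without PyTorch. The following is only a toy sketch: it fits a single weight w in y = w * x, and the names `make_batches` and `train`, the data and the learning rate are all hypothetical. The `grad` variable plays the part of a parameter's gradient buffer, which keeps accumulating unless it is reset.

```python
# Toy stand-in for a training loop with an accumulating gradient buffer.

def make_batches(samples, batch_size):
    """Split the samples into consecutive batches (what a data loader yields)."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]

def train(samples, n_epochs, lr, zero_grad=True):
    w, grad = 0.0, 0.0
    for epoch in range(n_epochs):               # outer loop: passes over the data
        for batch in make_batches(samples, 2):  # inner loop: one batch at a time
            if zero_grad:
                grad = 0.0                      # plays the role of optimizer.zero_grad()
            # "backward pass": add d/dw of the batch mean squared error to the buffer
            grad += sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= lr * grad                      # plays the role of optimizer.step()
    return w

# Points on the line y = 3 * x
samples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]
w_fit = train(samples, n_epochs=50, lr=0.01)
print(w_fit)
```

With the buffer reset before each batch, w_fit ends up very close to 3. Passing zero_grad=False makes every update use the running sum of all earlier batch gradients instead of the gradient of the current batch, which is the situation optimizer.zero_grad() prevents.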

Different loss function for validation set in Keras

I have an unbalanced training dataset, which is why I built a custom weighted categorical cross-entropy loss function. The problem is that my validation set is balanced, and I want to use the regular categorical cross-entropy loss for it. Can I pass a different loss function for the validation set within Keras? I mean the weighted one for training and the regular one for the validation set.
def weighted_loss(y_true, y_pred):
'
'
'
return loss
model.compile(loss=weighted_loss, metrics=['accuracy'])
You can try the backend function K.in_train_phase(), which is used by the Dropout and BatchNormalization layers to implement different behaviors in training and validation.
def custom_loss(y_true, y_pred):
    weighted_loss = ... # your implementation of weighted crossentropy loss
    unweighted_loss = K.sparse_categorical_crossentropy(y_true, y_pred)
    return K.in_train_phase(weighted_loss, unweighted_loss)
The first argument of K.in_train_phase() is the tensor used in training phase, and the second is the one used in test phase.
For example, if we set weighted_loss to 0 (just to verify the effect of K.in_train_phase() function):
def custom_loss(y_true, y_pred):
    weighted_loss = 0 * K.sparse_categorical_crossentropy(y_true, y_pred)
    unweighted_loss = K.sparse_categorical_crossentropy(y_true, y_pred)
    return K.in_train_phase(weighted_loss, unweighted_loss)
model = Sequential([Dense(100, activation='relu', input_shape=(100,)), Dense(1000, activation='softmax')])
model.compile(optimizer='adam', loss=custom_loss)
model.outputs[0]._uses_learning_phase = True # required if no dropout or batch norm in the model
X = np.random.rand(1000, 100)
y = np.random.randint(1000, size=1000)
model.fit(X, y, validation_split=0.1)
Epoch 1/10
900/900 [==============================] - 1s 868us/step - loss: 0.0000e+00 - val_loss: 6.9438
As you can see, the loss in training phase is indeed the one multiplied by 0.
Note that if there's no dropout or batch norm in your model, you'll need to manually "turn on" the _uses_learning_phase boolean switch, otherwise K.in_train_phase() will have no effect by default.
The validation loss is just a metric and is not actually needed for training. It is there because it makes sense to compare against the quantity that your network is actually optimizing.
So you can add any other loss function as a metric during compilation, and you'll see it during training.
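As a reminder of what separates the two losses discussed in this thread, here is a small standard-library sketch for a single one-hot sample. The function name, the class weights and the probabilities are hypothetical, and this is not the asker's elided implementation.

```python
import math

def categorical_crossentropy(y_true, y_pred, class_weights=None):
    """Cross-entropy for one sample; y_true is one-hot, y_pred holds probabilities.

    class_weights is an optional per-class multiplier (hypothetical values below).
    """
    if class_weights is None:
        class_weights = [1.0] * len(y_true)
    return -sum(w * t * math.log(p)
                for w, t, p in zip(class_weights, y_true, y_pred))

y_true = [0.0, 1.0, 0.0]   # the sample belongs to class 1
y_pred = [0.2, 0.5, 0.3]   # predicted class probabilities

plain = categorical_crossentropy(y_true, y_pred)
weighted = categorical_crossentropy(y_true, y_pred, class_weights=[1.0, 4.0, 1.0])
print(plain, weighted)
```

Because the weighted value is just the plain value scaled by the weight of the true class, the two numbers are not directly comparable, which is the reason for tracking the unweighted version on a balanced validation set.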

Vector regression with Keras

Suppose, for example, a regression problem with five scalars as output, where each output has approximately the same range. In Keras, we can model this using a 5-output dense layer without activation function (vector regression):
output_layer = layers.Dense(5, activation=None)(previous_layer)
model = models.Model(input_layer, output_layer)
model.compile(optimizer='rmsprop', loss='mse', metrics=['mse'])
Is the total loss (metric) simply the sum of the individual losses (metrics)? Is this equivalent to the following multi-output model, where the outputs have the same implicit loss weights? In my experiments, I haven't observed any significant differences but want to make sure that I didn't miss anything fundamental.
output_layer_list = []
for _ in range(5):
    output_layer_list.append(layers.Dense(1, activation=None)(previous_layer))
model = models.Model(input_layer, output_layer_list)
model.compile(optimizer='rmsprop', loss='mse', metrics=['mse'])
Is there an easy way to attach weights to the outputs in the first solution similar to specifying loss_weights in case of multi-output models?
Those models are the same. To answer your questions let's look at the mse loss:
def mean_squared_error(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)
Is the total loss (metric) simply the sum of the individual losses (metrics)? Almost: the mse loss applies K.mean over the last axis, so for the vector output it is the mean of the five squared errors, i.e. the sum of the individual losses divided by 5.
Is this equivalent to the following multi-output model, where the outputs have the same implicit loss weights? Yes, up to that constant factor, because subtraction and squaring are done elementwise, so five scalar outputs produce the same squared errors as a single vector output. And a multi-output model's loss is the sum of the losses of the individual outputs.
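The relationship between the two formulations can be checked by hand for one sample. This is a plain-Python sketch with made-up numbers; `mse_vector` is a hypothetical helper that mirrors a mean over the last axis.

```python
def mse_vector(y_true, y_pred):
    """Mean of the squared errors over the output dimension."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# One sample with five regression targets (hypothetical values)
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.5, 2.0, 2.0, 4.5, 7.0]

# Single 5-unit output: one mse over the whole vector
single_output_loss = mse_vector(y_true, y_pred)

# Five 1-unit outputs with equal loss weights: the per-output losses are summed
multi_output_loss = sum(mse_vector([t], [p]) for t, p in zip(y_true, y_pred))

print(single_output_loss, multi_output_loss)
```

The second value is five times the first, so the two setups differ only by a constant scale on the loss.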
Yes, both are equivalent. To replicate the loss_weights functionality with your first model, you can define your own custom loss function. Something along these lines:
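import numpy as np
import tensorflow as tf
from keras import backend as K  # or tensorflow.keras, depending on your setup
# One weight per output, shape (1, 5)
weights = K.variable(value=np.array([[0.1, 0.1, 0.1, 0.1, 0.6]]))
def custom_loss(y_true, y_pred):
    # (batch, 5) x (5, 1) -> (batch, 1): weighted sum of squared errors per sample
    return tf.matmul(K.square(y_true - y_pred), tf.transpose(weights))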
and pass this function to the loss argument upon compiling:
model.compile(optimizer='rmsprop', loss=custom_loss, metrics=['mse'])
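What the matrix product in such a custom loss computes for a single sample can be written out directly. The sketch below is plain Python, not Keras code; it reuses the weights from the answer, while the function name and the target and prediction values are hypothetical.

```python
weights = [0.1, 0.1, 0.1, 0.1, 0.6]

def weighted_squared_error(y_true, y_pred, weights):
    """Weighted sum of the per-output squared errors for one sample."""
    return sum(w * (t - p) ** 2 for w, t, p in zip(weights, y_true, y_pred))

# Hypothetical targets and predictions for one sample
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.5, 2.0, 2.0, 4.5, 7.0]

loss_value = weighted_squared_error(y_true, y_pred, weights)
print(loss_value)
```

Here the fifth output contributes most of the loss because it carries both the largest weight and the largest error.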

How can I calculate the loss without the weight decay in Keras?

I defined a convolutional layer and also use L2 weight decay in Keras.
When I train with model.fit(), is the weight decay loss included in the reported loss? If it is included in the total loss, how can I get the loss without the weight decay during training?
I want to monitor the loss without the weight decay while still applying the weight decay during training.
Yes, weight decay losses are included in the loss value printed on the screen.
The value you want to monitor is the total loss minus the sum of regularization losses.
The total loss is just model.total_loss.
The regularization losses are collected in the list model.losses.
The following lines can be found in the source code of model.compile():
# Add regularization penalties
# and other layer-specific losses.
for loss_tensor in self.losses:
    total_loss += loss_tensor
To get the loss without weight decay, you can reverse the above operations. I.e., the value to be monitored is model.total_loss - sum(model.losses).
Now, how to monitor this value is a bit tricky. Fortunately, the list of metrics used by a Keras model is not fixed until model.fit() is called. So you can append this value to the list, and it'll be printed on the screen during model fitting.
Here's a simple example:
input_tensor = Input(shape=(64, 64, 3))
hidden = Conv2D(32, 1, kernel_regularizer=l2(0.01))(input_tensor)
hidden = GlobalAveragePooling2D()(hidden)
out = Dense(1)(hidden)
model = Model(input_tensor, out)
model.compile(loss='mse', optimizer='adam')
loss_no_weight_decay = model.total_loss - sum(model.losses)
model.metrics_tensors.append(loss_no_weight_decay)
model.metrics_names.append('loss_no_weight_decay')
When you run model.fit(), something like this will be printed to the screen:
Epoch 1/1
100/100 [==================] - 0s - loss: 0.5764 - loss_no_weight_decay: 0.5178
You can also verify whether this value is correct by computing the L2 regularization manually:
conv_kernel = model.layers[1].get_weights()[0]
print(np.sum(0.01 * np.square(conv_kernel)))
In my case, the printed value is 0.0585, which is indeed the difference between loss and loss_no_weight_decay (with some rounding error).
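The arithmetic behind this check can be reproduced with a few made-up numbers. This is a plain-Python sketch, not Keras code; the kernel values, the data loss and the helper name `l2_penalty` are hypothetical, and only the factor 0.01 comes from the example above.

```python
def l2_penalty(weights, factor):
    """L2 regularization term: factor times the sum of squared weights."""
    return factor * sum(w * w for w in weights)

kernel = [0.5, -1.0, 2.0]   # hypothetical flattened kernel weights
data_loss = 0.40            # hypothetical loss without weight decay

penalty = l2_penalty(kernel, factor=0.01)
total_loss = data_loss + penalty             # what would be reported as "loss"
loss_no_weight_decay = total_loss - penalty  # the value to monitor

print(total_loss, penalty, loss_no_weight_decay)
```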
