Compile error on Keras sequential model with custom loss function

I am trying to compile a CNN model with ~16K parameters on a GPU in Google Colab for the MNIST dataset.
With the standard 'categorical_crossentropy' loss it works fine, but with my custom loss it gives an error.
lamda = 0.01
m = X_train.shape[0]

def reg_loss(lamda):
    model_layers = custom_model.layers  # list where each element is a Conv2D object, etc.
    reg_wts = 0
    for idx, layer in enumerate(model_layers):
        layer_wts = model_layers[idx].get_weights()  # list
        if len(layer_wts) > 0:  # activation and dropout layers do not have any weights
            layer_wts = model_layers[idx].get_weights()[0]  # ndarray, shape (3, 3, 1, 16) for layer 1
            s = np.sum(layer_wts**2)
            reg_wts += s
    print(idx, "reg_wts", reg_wts)
    return (lamda / (2 * m)) * reg_wts

reg_loss(lamda)

def custom_loss(y_true, y_pred):
    K.categorical_crossentropy(y_true, y_pred) + reg_loss(lamda)

custom_model.compile(loss=custom_loss, optimizer='adam', metrics=['accuracy'])
reg_loss prints 28 reg_wts 224.11805880069733 and returns 1.8676504900058112e-05.
On compile, it gives the error: AttributeError: 'NoneType' object has no attribute 'get_shape'

The custom_loss function did not have a return statement. A silly mistake, but the error was quite misleading, which is why it took so long to track down.
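For reference, a minimal sketch of the corrected loss, with only the missing return added:

def custom_loss(y_true, y_pred):
    # return the combined loss instead of discarding it
    return K.categorical_crossentropy(y_true, y_pred) + reg_loss(lamda)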

Related

Pytorch Resnet model error if FC layer is changed in Colab

If I simply import the ResNet model from PyTorch in Colab and use it to train on my dataset, there are no issues. However, when I try to change the last FC layer so that the output features go from 1000 to 9, which is the number of classes in my dataset, I get the following error:
RuntimeError: Tensor for 'out' is on CPU, Tensor for argument #1 'self' is on CPU, but expected them to be on GPU (while checking arguments for addmm)
Working version:
import torch
from torch.nn import CrossEntropyLoss
from torch.optim import Adam
import torchvision.models as models

# model = Net()
model = models.resnet18(pretrained=True)

# defining the optimizer
optimizer = Adam(model.parameters(), lr=0.07)
# defining the loss function
criterion = CrossEntropyLoss()

# checking if GPU is available
if torch.cuda.is_available():
    model = model.cuda()
    criterion = criterion.cuda()
Version with error:
import torch
from torch.nn import CrossEntropyLoss
from torch.optim import Adam
import torchvision.models as models

# model = Net()
model = models.resnet18(pretrained=True)

# defining the optimizer
optimizer = Adam(model.parameters(), lr=0.07)
# defining the loss function
criterion = CrossEntropyLoss()

# checking if GPU is available
if torch.cuda.is_available():
    model = model.cuda()
    criterion = criterion.cuda()

model.fc = torch.nn.Linear(512, 9)
The error occurs at the training step, i.e.
outputs = model(images)
How should I go about fixing this issue?
Simple error: the new fc layer should be assigned before moving the model to CUDA, i.e.
model = models.resnet18(pretrained=True)
model.fc = torch.nn.Linear(512, 9)

if torch.cuda.is_available():
    model = model.cuda()
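Equivalently (an alternative sketch, not from the original answer), the freshly created fc layer can be moved to the GPU after the swap, since a new nn.Linear starts out on the CPU:

model = models.resnet18(pretrained=True).cuda()
model.fc = torch.nn.Linear(512, 9).cuda()  # the replacement layer is created on the CPU, so move it too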

mse loss function not compatible with regularization loss (add_loss) on hidden layer output

I would like to code, in tf.keras, a neural network with a couple of loss functions. One is a standard mse (mean squared error) with a scaling factor, while the other is basically a regularization term on the output of a hidden layer. This second loss is added through self.add_loss() in a user-defined class inheriting from tf.keras.layers.Layer. I have a couple of questions (the first is more important, though).
1) The error I get when trying to combine the two losses together is the following:
ValueError: Shapes must be equal rank, but are 0 and 1
From merging shape 0 with other shapes. for '{{node AddN}} = AddN[N=2, T=DT_FLOAT](loss/weighted_loss/value, model/new_layer/mul_1)' with input shapes: [], [100].
So it comes from the fact that the tensors which should add up to make one unique loss value have different shapes (and ranks). Still, when I try to print the losses during the training, I clearly see that the vectors returned as losses have shape batch_size and rank 1. Could it be that when the 2 losses are summed I have to provide them (or at least the loss of add_loss) as scalar? I know the mse is usually returned as a vector where each entry is the mse from one sample in the batch, hence having batch_size as shape. I think I tried to do the same with the "regularization" loss. Do you have an explanation for this behavio(u)r?
The sample code which gives me error is the following:
import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input

def rate_mse(rate=1e5):
    # @tf.function # also needed for printing
    def loss(y_true, y_pred):
        tmp = rate*K.mean(K.square(y_pred - y_true), axis=-1)
        # tf.print('shape %s and rank %s output in mse'%(K.shape(tmp), tf.rank(tmp)))
        tf.print('shape and rank output in mse', [K.shape(tmp), tf.rank(tmp)])
        tf.print('mse loss:', tmp)  # prints when I add @tf.function
        return tmp
    return loss

class newLayer(tf.keras.layers.Layer):
    def __init__(self, rate=5e-2, **kwargs):
        super(newLayer, self).__init__(**kwargs)
        self.rate = rate

    # @tf.function # to be commented out for NN training
    def call(self, inputs):
        tmp = self.rate*K.mean(inputs*inputs, axis=-1)
        tf.print('shape and rank output in regularizer', [K.shape(tmp), tf.rank(tmp)])
        tf.print('regularizer loss:', tmp)
        self.add_loss(tmp, inputs=True)
        return inputs

tot_n = 10000
xx = np.random.rand(tot_n, 1)
yy = np.pi*xx
train_size = int(0.9*tot_n)
xx_train = xx[:train_size]; xx_val = xx[train_size:]
yy_train = yy[:train_size]; yy_val = yy[train_size:]

reg_layer = newLayer()
input_layer = Input(shape=(1,))  # input
hidden = Dense(20, activation='relu', input_shape=(2,))(input_layer)  # hidden layer
hidden = reg_layer(hidden)
output_layer = Dense(1, activation='linear')(hidden)

model = Model(inputs=[input_layer], outputs=[output_layer])
model.compile(optimizer='Adam', loss=rate_mse(), experimental_run_tf_function=False)
# model.compile(optimizer='Adam', loss=None, experimental_run_tf_function=False)

model.fit(xx_train, yy_train, epochs=100, batch_size=100,
          validation_data=(xx_val, yy_val), verbose=1)

# new_xx = np.random.rand(10,1); new_yy = np.pi*new_xx
# model.evaluate(new_xx, new_yy)
print(model.predict(np.array([[1]])))
2) I also have a secondary question related to this code. I noticed that printing with tf.print inside the function rate_mse only works with the @tf.function decorator. Similarly, the call method of newLayer is only taken into consideration if the same decorator is commented out during training. Can someone explain why this is the case or point me to a possible solution?
Thanks in advance to whoever can help. I am currently using TensorFlow 2.2.0, and the Keras version is 2.3.0-tf.
I was stuck on the same problem for a few days. The "standard" loss has already been reduced to a scalar by the time it is added to the loss from add_loss. The only way I got it working was to reduce over one more axis when calculating the mean, so that the added loss is a scalar as well:
tmp = self.rate*K.mean(inputs*inputs, axis=[0, -1])
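Plugged back into the layer from the question, the call method would then look roughly like this (a sketch; the rest of newLayer stays unchanged):

def call(self, inputs):
    # reduce over both the batch axis and the feature axis so add_loss receives a scalar,
    # matching the rank of the already-reduced compiled mse loss
    tmp = self.rate * K.mean(inputs * inputs, axis=[0, -1])
    self.add_loss(tmp, inputs=True)
    return inputs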

How can I use the LBFGS optimizer with pytorch ignite?

I started using Ignite recently and I find it very interesting.
I would like to train a model using the LBFGS algorithm from the torch.optim module as the optimizer.
This is my code:
import torch
from ignite.engine import Events, Engine, create_supervised_trainer, create_supervised_evaluator
from ignite.metrics import RootMeanSquaredError, Loss
from ignite.handlers import EarlyStopping

D_in, H, D_out = 5, 10, 1

model = simpleNN(D_in, H, D_out)  # a simple MLP with 1 hidden layer
model.double()
train_loader, val_loader = get_data_loaders(i)
optimizer = torch.optim.LBFGS(model.parameters(), lr=1)
loss_func = torch.nn.MSELoss()

# Ignite
trainer = create_supervised_trainer(model, optimizer, loss_func)
evaluator = create_supervised_evaluator(model, metrics={'RMSE': RootMeanSquaredError(), 'LOSS': Loss(loss_func)})

@trainer.on(Events.ITERATION_COMPLETED)
def log_training_loss(engine):
    print("Epoch[{}] Loss: {:.5f}".format(engine.state.epoch, len(train_loader), engine.state.output))

def score_function(engine):
    val_loss = engine.state.metrics['RMSE']
    print("VAL_LOSS: {:.5f}".format(val_loss))
    return -val_loss

handler = EarlyStopping(patience=10, score_function=score_function, trainer=trainer)
evaluator.add_event_handler(Events.COMPLETED, handler)

trainer.run(train_loader, max_epochs=100)
And the error that is raised is:
TypeError: step() missing 1 required positional argument: 'closure'
I know that LBFGS requires a closure to be defined, so my question is: how can I do that using Ignite? Or is there another approach for doing this?
The way to do it is like this:
from ignite.engine import Engine

model = ...
optimizer = torch.optim.LBFGS(model.parameters(), lr=1)
criterion = ...

def update_fn(engine, batch):
    model.train()
    x, y = batch
    # pass to device if needed as here: https://github.com/pytorch/ignite/blob/40d815930d7801b21acfecfa21cd2641a5a50249/ignite/engine/__init__.py#L45

    def closure():
        y_pred = model(x)
        loss = criterion(y_pred, y)
        optimizer.zero_grad()
        loss.backward()
        return loss

    optimizer.step(closure)

trainer = Engine(update_fn)
# everything else is the same
Source
You need to encapsulate the whole evaluation step, with zero_grad and the loss return, in a closure:
for batch in loader():
    def closure():
        ...
        return loss
    optim.step(closure)
Pytorch docs for 'closure'
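For reference, a minimal plain-PyTorch sketch of the same pattern (loader, model, and criterion are placeholders from the snippets above); LBFGS may call the closure several times per step, re-evaluating the loss each time:

for x, y in loader:
    def closure():
        optimizer.zero_grad()          # reset gradients before each (re-)evaluation
        loss = criterion(model(x), y)  # forward pass
        loss.backward()                # compute gradients
        return loss                    # LBFGS uses the returned loss internally
    optimizer.step(closure)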

is it possible to do continuous training in keras for multi-class classification problem?

I am trying to do continued training in Keras.
I built a Keras multi-class classification model, and afterwards I received new labels and values. I want to update the model without retraining it from scratch, which is why I tried continuing the training in Keras.
model.add(Dense(10, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(training_data, labels, epochs=20, batch_size=1)
model.save("keras_model.h5")
After saving the model, I want to continue training, so I tried:
model1 = load_model("keras_model.h5")
model1.fit(new_input, new_label, epochs=20, batch_size=1)
model1.save("keras_model.h5")
I tried this, but it throws an error: the model was previously trained with 10 classes, and now that I add a new class the error below occurs.
So my question is: is it possible to continue training in Keras for multi-class classification with a new class?
tensorflow.python.framework.errors_impl.InvalidArgumentError: Received a label value of 10 which is outside the valid range of [0, 9). Label values: 10
[[{{node loss/dense_7_loss/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]]
The typical approach for this type of situation is to define a common model that contains most of the inner layers and is reusable, and then a second model that defines the output layer and thus the number of classes. The inner model can be reused in subsequent outer models.
Untested example:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import *
from tensorflow.keras.models import Model

def make_inner_model():
    """ An example model that takes 42 features and outputs a
    transformation vector.
    """
    inp = Input(shape=(42,), name='in')
    h1 = Dense(80, activation='relu')(inp)
    h2 = Dense(40)(h1)
    h3 = Dense(60, activation='relu')(h2)
    out = Dense(32)(h3)
    return Model(inp, out)

def make_outer_model(inner_model, n_classes):
    inp = Input(shape=(42,), name='inp')
    hidden = inner_model(inp)
    out = Dense(n_classes, activation='softmax')(hidden)
    model = Model(inp, out)
    model.compile('adam', 'categorical_crossentropy')
    return model

inner_model = make_inner_model()
inner_model.save('inner_model_untrained.h5')

model1 = make_outer_model(inner_model, 10)
model1.summary()
# model1.fit()
# inner_model.save_weights('inner_model_weights_1.h5')

model2 = make_outer_model(inner_model, 12)
# model2.fit()
# inner_model.save_weights('inner_model_weights_2.h5')
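As a hypothetical continuation of the example above: when training is resumed in a new session, the trained inner weights can be reloaded before building the larger-output model, so only the new 12-way output layer starts from scratch (new_training_data and new_labels_12_classes are placeholders):

# assumes inner_model_weights_1.h5 was saved after training model1 (see commented lines above)
inner_model.load_weights('inner_model_weights_1.h5')
model2 = make_outer_model(inner_model, 12)
# model2.fit(new_training_data, new_labels_12_classes)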

Custom loss function in Keras based on the input data

I am trying to create a custom loss function using Keras. I want to compute the loss based on both the input and the predicted output of the neural network.
I tried writing a custom loss function in Keras. I think y_true is the target output that we provide for training and y_pred is the predicted output of the neural network. The loss function below is the same as the "mean_squared_error" loss in Keras.
def customloss(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)
I would also like to use the input to the neural network to compute the custom loss function, in addition to the mean_squared_error loss. Is there a way to pass the input to the neural network as an argument to the customloss function?
Thank you.
I have come across 2 solutions to the question you asked.
1. You can pass your input (scalar only) as an argument to the custom loss wrapper function.
def custom_loss(i):
    def loss(y_true, y_pred):
        return K.mean(K.square(y_pred - y_true), axis=-1) + something with i...
    return loss

def baseline_model():
    # create model
    i = Input(shape=(5,))
    x = Dense(5, kernel_initializer='glorot_uniform', activation='linear')(i)
    o = Dense(1, kernel_initializer='normal', activation='linear')(x)
    model = Model(i, o)
    model.compile(loss=custom_loss(i), optimizer=Adam(lr=0.0005))
    return model
This solution is also mentioned in the accepted answer here.
2. You can pad your labels with extra data columns from the input and write a custom loss. This is helpful if you just want one or a few feature columns from your input.
def custom_loss(data, y_pred):
    y_true = data[:, 0:1]  # first column: the actual label (kept 2-D so shapes match y_pred)
    i = data[:, 1:2]       # second column: the input feature padded into the label
    return K.mean(K.square(y_pred - y_true), axis=-1) + something with i...

def baseline_model():
    # create model
    i = Input(shape=(5,))
    x = Dense(5, kernel_initializer='glorot_uniform', activation='linear')(i)
    o = Dense(1, kernel_initializer='normal', activation='linear')(x)
    model = Model(i, o)
    model.compile(loss=custom_loss, optimizer=Adam(lr=0.0005))
    return model

# pad the labels with the first input feature column (sliced 2-D so np.append works along axis=1)
model.fit(X, np.append(Y_true, X[:, 0:1], axis=1), batch_size=batch_size, epochs=90, shuffle=True, verbose=1)
This solution can also be found here in this thread.
I have only used the 2nd method when I had to use input feature columns in the loss. The first method can only be used with scalar arguments, as mentioned in the comments.
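To illustrate the scalar case, a hypothetical sketch where the wrapper argument is a plain Python number (a fixed weighting factor) rather than a tensor:

def weighted_mse(alpha):
    def loss(y_true, y_pred):
        # alpha is a constant baked into the loss at compile time
        return alpha * K.mean(K.square(y_pred - y_true), axis=-1)
    return loss

model.compile(loss=weighted_mse(0.3), optimizer=Adam(lr=0.0005))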
You could wrap your custom loss with another function that takes the input tensor as an argument:
def customloss(x):
    def loss(y_true, y_pred):
        # Use x here as you wish
        err = K.mean(K.square(y_pred - y_true), axis=-1)
        return err
    return loss
And then compile your model as follows:
model.compile('sgd', customloss(x))
where x is your input tensor.
NOTE: Not tested.
