How to track weights and gradients in a Keras custom training loop - python-3.x

I have defined the following custom model and training loop in Keras:
class CustomModel(keras.Model):
def train_step(self, data):
x, y = data
with tf.GradientTape() as tape:
y_pred = self(x, training=True) # Forward pass
loss = self.compiled_loss(y, y_pred, regularization_losses=self.losses)
trainable_vars = self.trainable_variables
gradients = tape.gradient(loss, trainable_vars)
self.optimizer.apply_gradients(zip(gradients, trainable_vars))
self.compiled_metrics.update_state(y, y_pred)
return {m.name: m.result() for m in self.metrics}
And I am using the following code to train the model on a simple toy data set:
inputs = keras.layers.Input(shape=(1,))
hidden = keras.layers.Dense(1, activation='tanh')(inputs)
outputs = keras.layers.Dense(1)(hidden)
x = np.arange(0, 2*np.pi, 2*np.pi/100)
y = np.sin(x)
nnmodel = CustomModel(inputs, outputs)
nnmodel.compile(optimizer=keras.optimizers.SGD(lr=0.1), loss="mse", metrics=["mae"])
nnmodel.fit(x, y, batch_size=100, epochs=2000)
I want to be able to see the values of the gradient and the trainable_vars variables in the train_step function for each training loop, and I am not sure how to do this.
I have tried to set a break point inside the train_step function in my python IDE and expecting it to stop at the break point for each epoch of the training after I call model.fit() but this didn't happen. I also tried to have them print out the values in the log after each epoch but I am not sure how to achieve this.

Related

Weighted mean squared error pytorch with sample weights regression

I am trying to use weighted mean squared error loss function for the regression task with imbalanced dataset. Basically, I have different weight assigned to each example and I am using the weighted MSE loss function. Is there a way to sample the weight tensor using TensorDataset along with input and output batch samples?
def weighted_mse_loss(inputs, targets, weights=None):
loss = (inputs - targets) ** 2
if weights is not None:
loss *= weights.expand_as(loss)
loss = torch.mean(loss)
return loss
train_dataset = torch.utils.data.TensorDataset(x_train, y_train)
weights = torch.rand(len(train_dataset))
for x, y in train_loader:
optimizer.zero_grad()
out = model(x)
loss = weighted_mse_loss(y, out, weights)
loss.backward()
If you can get the weights before creating the train dataset:
train_dataset = TensorDataset(x_train, y_train, weights)
for x, y, w in train_dataset:
...
Otherwise:
train_dataset = TensorDataset(x_train, y_train)
for (x, y), w in zip(train_dataset, weights):
...
You can also use a DataLoader but be carefull about shuffling with the second method

Keras, how to retrieve layers outputs in a custom training loop ? (more precisely in the *train_step* function)

In the Keras documentation we see that it's possible to customize what happens during training after every batch, for instance if we want to compute our own loss, we can override the train_step function:
class CustomModel(keras.Model):
def train_step(self, data):
x, y = data
with tf.GradientTape() as tape:
y_pred = self(x, training=True) # Forward pass
# Compute our own loss
loss = keras.losses.mean_squared_error(y, y_pred)
But, what if I want to use some outputs of some layers in order to compute a custom loss after every batch ?
Is there a way to have access to some layers outputs in the train_step function ?

How do you test a custom dataset in Pytorch?

I've been following tutorials in Pytorch that use datasets from Pytorch that allow you to enable whether you'd like to train using the data or not... But now I'm using a .csv and a custom dataset.
class MyDataset(Dataset):
def __init__(self, root, n_inp):
self.df = pd.read_csv(root)
self.data = self.df.to_numpy()
self.x , self.y = (torch.from_numpy(self.data[:,:n_inp]),
torch.from_numpy(self.data[:,n_inp:]))
def __getitem__(self, idx):
return self.x[idx, :], self.y[idx,:]
def __len__(self):
return len(self.data)
How can I tell Pytorch not to train my test_dataset so I can use it as a reference of how accurate my model is?
train_dataset = MyDataset("heart.csv", input_size)
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle =True)
test_dataset = MyDataset("heart.csv", input_size)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle =True)
In pytorch, a custom dataset inherits the class Dataset. Mainly it contains two methods __len__() is to specify the length of your dataset object to iterate over and __getitem__() to return a batch of data at a time.
Once the dataloader objects are initialized (train_loader and test_loader as specified in your code), you need to write a train loop and a test loop.
def train(model, optimizer, loss_fn, dataloader):
model.train()
for i, (input, gt) in enumerate(dataloader):
if params.use_gpu: #(If training using GPU)
input, gt = input.cuda(non_blocking = True), gt.cuda(non_blocking = True)
predicted = model(input)
loss = loss_fn(predicted, gt)
optimizer.zero_grad()
loss.backward()
optimizer.step()
and your test loop should be:
def test(model,loss_fn, dataloader):
model.eval()
for i, (input, gt) in enumerate(dataloader):
if params.use_gpu: #(If training using GPU)
input, gt = input.cuda(non_blocking = True), gt.cuda(non_blocking = True)
predicted = model(input)
loss = loss_fn(predicted, gt)
In additional you can use metrics dictionary to log your predicted, loss, epochs etc,. The main difference between training and test loop is that we exclude back propagation (zero_grad(), backward(), step()) in inference stage.
Finally,
for epoch in range(1, epochs + 1):
train(model, optimizer, loss_fn, train_loader)
test(model, loss_fn, test_loader)
There are a couple of things to note when you're testing in pytorch:
Put your model into evaluation mode so that things like dropout and batch normalization aren't in training mode: model.eval()
Put a wrapper around your testing code to avoid the computation of gradients (saving memory and time): with torch.no_grad():
Normalise or standardise your data according to your training set only. This is important for min/max normalisation or z-score standardisation so that the model accurately reflects test performance.
Other than that, what you've written looks pretty fine to me, as you're not applying any transforms to your data (for example, image flipping or gaussian noise injections). To show what code should look like in test mode, see below:
for e in range(num_epochs):
for B, (dat, label) in enumerate(train_loader):
#transforms here
opt.zero_grad()
out = model(dat.to(device))
loss = criterion(out)
loss.backward()
opt.step()
with torch.no_grad():
model.eval()
global_corr = 0
for B, (dat,label) in enumerate(test_loader):
out = model(dat.to(device))
# get batch eval metrics here!

How to compute gradient of output wrt input in Tensorflow 2.0

I have a trained Tensorflow 2.0 model (from tf.keras.Sequential()) that takes an input layer with 26 columns (X) and produces an output layer with 1 column (Y).
In TF 1.x I was able to calculate the gradient of the output with respect to the input with the following:
model = load_model('mymodel.h5')
sess = K.get_session()
grad_func = tf.gradients(model.output, model.input)
gradients = sess.run(grad_func, feed_dict={model.input: X})[0]
In TF2 when I try to run tf.gradients(), I get the error:
RuntimeError: tf.gradients is not supported when eager execution is enabled. Use tf.GradientTape instead.
In the question In TensorFlow 2.0 with eager-execution, how to compute the gradients of a network output wrt a specific layer?, we see an answer on how to calculate gradients with respect to intermediate layers, but I don't see how to apply this to gradients with respect to the inputs. On the Tensorflow help for tf.GradientTape, there are examples with calculating gradients for simple functions, but not neural networks.
How can tf.GradientTape be used to calculate the gradient of the output with respect to the input?
This should work in TF2:
inp = tf.Variable(np.random.normal(size=(25, 120)), dtype=tf.float32)
with tf.GradientTape() as tape:
preds = model(inp)
grads = tape.gradient(preds, inp)
Basically you do it the same way as TF1, but using GradientTape.
I hope this is what you're looking for. This will give the gradients of the output w.r.t. the inputs.
# Whatever the input you like goes in as the initial_value
x = tf.Variable(np.random.normal(size=(25, 120)), dtype=tf.float32)
y_true = np.random.choice([0,1], size=(25,10))
print(model.output)
print(model.predict(x))
with tf.GradientTape() as tape:
pred = model.predict(x)
grads = tape.gradients(pred, x)
In the above case, we should use tape.watch()
for (x, y) in test_dataset:
with tf.GradientTape() as tape:
tape.watch(x)
pred = model(x)
grads = tape.gradient(pred, x)
but the grads will just have the grads of the inputs
The following method is better, you can use model to predict the prediction results and compute the loss, then use the loss to calculate the grads of all trainable variables
with tf.GradientTape() as tape:
predictions = model(x, training=True)
loss = loss_function(y, predictions)
grads = tape.gradient(loss, model.trainable_variables)

Custom loss function in Keras based on the input data

I am trying to create the custom loss function using Keras. I want to compute the loss function based on the input and predicted the output of the neural network.
I tried using the customloss function in Keras. I think y_true is the output that we give for training and y_pred is the predicted output of the neural network. The below loss function is same as "mean_squared_error" loss in Keras.
def customloss(y_true, y_pred):
return K.mean(K.square(y_pred - y_true), axis=-1)
I would like to use the input to the neural network also to compute the custom loss function in addition to mean_squared_error loss. Is there a way to send an input to the neural network as an argument to the customloss function.
Thank you.
I have come across 2 solutions to the question you asked.
You can pass your input (scalar only) as an argument to the custom loss wrapper function.
def custom_loss(i):
def loss(y_true, y_pred):
return K.mean(K.square(y_pred - y_true), axis=-1) + something with i...
return loss
def baseline_model():
# create model
i = Input(shape=(5,))
x = Dense(5, kernel_initializer='glorot_uniform', activation='linear')(i)
o = Dense(1, kernel_initializer='normal', activation='linear')(x)
model = Model(i, o)
model.compile(loss=custom_loss(i), optimizer=Adam(lr=0.0005))
return model
This solution is also mentioned in the accepted answer here
You can pad your label with extra data columns from input and write a custom loss. This is helpful if you just want one/few feature column(s) from your input.
def custom_loss(data, y_pred):
y_true = data[:, 0]
i = data[:, 1]
return K.mean(K.square(y_pred - y_true), axis=-1) + something with i...
def baseline_model():
# create model
i = Input(shape=(5,))
x = Dense(5, kernel_initializer='glorot_uniform', activation='linear')(i)
o = Dense(1, kernel_initializer='normal', activation='linear')(x)
model = Model(i, o)
model.compile(loss=custom_loss, optimizer=Adam(lr=0.0005))
return model
model.fit(X, np.append(Y_true, X[:, 0], axis =1), batch_size = batch_size, epochs=90, shuffle=True, verbose=1)
This solution can be found also here in this thread.
I have only used the 2nd method when I had to use input feature columns in the loss. The first method can be only used with scalar arguments as mentioned in the comments.
You could wrap your custom loss with another function that takes the input tensor as an argument:
def customloss(x):
def loss(y_true, y_pred):
# Use x here as you wish
err = K.mean(K.square(y_pred - y_true), axis=-1)
return err
return loss
And then compile your model as follows:
model.compile('sgd', customloss(x))
where x is your input tensor.
NOTE: Not tested.

Resources