Error: "One of the differentiated Tensors appears to not have been used in the graph" - pytorch

I am trying to compute a gradient of y_hat to x (y_hat is the sum of gradients of model output to x) but it gives me the error: One of the differentiated Tensors appears to not have been used in the graph. This is the code:
class Model(nn.Module):
def __init__(self,):
super(Model, self).__init__()
self.weight1 = torch.nn.Parameter(torch.tensor([[.2,.5,.9],[1.0,.3,.5],[.3,.2,.7]]))
self.weight2 = torch.nn.Parameter(torch.tensor([2.0,1.0,.4]))
def forward(self, x):
out =F.linear(x, self.weight1.T)
out =F.linear(out, self.weight2.T)
return out
model = Model()
x = torch.tensor([[0.1,0.7,0.2]])
x = x.requires_grad_()
output = model(x)
y_hat = torch.sum(torch.autograd.grad(output, x, create_graph = True)[0])
torch.autograd.grad(y_hat, x)
I think x should be in the computational graph, so I don't know why it gives me this error? Any thoughts would be appreciated!

Cause this function is actually y = x#b so after the first derivative there is no x in the result and that's why we can't do the second derivative.

Related

How do I log a confusion matrix into Wanddb?

I'm using pytorch lightning, and at the end of each epoch, I create a confusion matrix from torchmetrics.ConfusionMatrix (see code below). I would like to log this into Wandb, but the Wandb confusion matrix logger only accepts y_targets and y_predictions. Does anyone know how to extract the updated confusion matrix y_targets and y_predictions from a confusion matrix, or alternatively give Wandb my updated confusion matrix in a way that it can be processed into eg a heatmap within wandb?
class ClassificationTask(pl.LightningModule):
def __init__(self, model, lr=1e-4, augmentor=augmentor):
super().__init__()
self.model = model
self.lr = lr
self.save_hyperparameters() #not being used at the moment, good to have ther in the future
self.augmentor=augmentor
self.matrix = torchmetrics.ConfusionMatrix(num_classes=9)
self.y_trues=[]
self.y_preds=[]
def training_step(self, batch, batch_idx):
x, y = batch
x=self.augmentor(x)#.to('cuda')
y_pred = self.model(x)
loss = F.cross_entropy(y_pred, y,) #weights=class_weights_tensor
acc = accuracy(y_pred, y)
metrics = {"train_acc": acc, "train_loss": loss}
self.log_dict(metrics)
return loss
def validation_step(self, batch, batch_idx):
loss, acc = self._shared_eval_step(batch, batch_idx)
metrics = {"val_acc": acc, "val_loss": loss, }
self.log_dict(metrics)
return metrics
def _shared_eval_step(self, batch, batch_idx):
x, y = batch
y_hat = self.model(x)
loss = F.cross_entropy(y_hat, y)
acc = accuracy(y_hat, y)
self.matrix.update(y_hat,y)
return loss, acc
def validation_epoch_end(self, outputs):
confusion_matrix = self.matrix.compute()
wandb.log({"my_conf_mat_id" : confusion_matrix})
def configure_optimizers(self):
return torch.optim.Adam((self.model.parameters()), lr=self.lr)
I'm actually working on the same issue currently. I found this great PR Feature request for PyTorch lightning. Perhaps this could be of help. I think a possible solution is utilizing torch metrics confusion matrix and then incorporating that into your train/val/test steps and logging them.
https://github.com/Lightning-AI/metrics/issues/880

Computing the Hessian of a Simple NN in PyTorch wrt to Parameters

I am relatively new to PyTorch and trying to compute the Hessian of a very simple feedforward networks with respect to its weights. I am trying to get torch.autograd.functional.hessian to work. I have been digging the forums and since this is a relatively new function added to PyTorch, I am unable to find a whole lot of information on it. Here is my simple network architecture which is from some sample code on Kaggle on Mnist.
class Network(nn.Module):
def __init__(self):
super(Network, self).__init__()
self.l1 = nn.Linear(input_size, hidden_size)
self.relu = nn.ReLU()
self.l3 = nn.Linear(hidden_size, output_size)
def forward(self, x):
x = self.l1(x)
x = self.relu(x)
x = self.l3(x)
return F.log_softmax(x, dim = 1)
net = Network()
optimizer = optim.SGD(net.parameters(), lr=learning_rate, momentum=0.9)
loss_func = nn.CrossEntropyLoss()
and I am running the NN for a bunch of epochs like:
for e in range(epochs):
for i in range(0, x.shape[0], batch_size):
x_mini = x[i:i + batch_size]
y_mini = y[i:i + batch_size]
x_var = Variable(x_mini)
y_var = Variable(y_mini)
optimizer.zero_grad()
net_out = net(x_var)
loss = loss_func(net_out, y_var)
loss.backward()
optimizer.step()
if i % 100 == 0:
loss_log.append(loss.data)
Then, I add all the parameters to a list and make a tensor out of it as below:
param_list = []
for param in net.parameters():
param_list.append(param.view(-1))
param_list = torch.cat(param_list)
Finally, I am trying to compute the Hessian of the converged network by running:
hessian = torch.autograd.functional.hessian(loss_func, param_list,create_graph=True)
but it gives me this error:
TypeError: forward() missing 1 required positional argument: 'target'
Any help would be appreciated.
Computing the hessian with regard to the parameters of a model (as opposed to the inputs to the model) isn't really well-supported right now. There's some work being done on this at https://github.com/pytorch/pytorch/issues/49171 , but for the moment it's very inconvenient.
Your code has a few other problems -- where you're passing loss_func, you should be passing a function that constructs the computation graph. Also, you never specify the input to the network or the target for the loss function.
Here's some code that cheats a little bit to use the existing functional interface to compute the hessian of the model weights, and concatenates everything together to give the same form as what you were trying to do:
# Pick a random input to the network
src = torch.rand(1, 2)
# Say our target for our loss is all ones
dst = torch.ones(1, dtype=torch.long)
keys = list(net.state_dict().keys())
parameters = list(net.parameters())
sizes = [x.view(-1).shape[0] for x in parameters]
ndims = sum(sizes)
def hessian_hack(*params):
for i in range(len(keys)):
path = keys[i].split('.')
cur = net
for f in range(0, len(path)-1):
cur = net.__getattr__(path[f])
cur.__delattr__(path[-1])
cur.__setattr__(path[-1], params[i])
return loss_func(net(src), dst)
# sub_hessians[i][f] is the hessian of parameter i vs parameter f
sub_hessians = torch.autograd.functional.hessian(
hessian_hack,
tuple(parameters),
create_graph=True)
# We can combine them all into a nice big hessian.
hessian = torch.cat([
torch.cat([
sub_hessians[i][f].reshape(sizes[i], sizes[f])
for f in range(len(sub_hessians[i]))
], axis=1)
for i in range(len(sub_hessians))
], axis=0)
print(hessian)

MLP always returns same predicted result

I am using pytorch to implement a simple multilayer perceptron. The input data I have is of 450 dimensions and output is 120. I have normalized my input data. I use MSE as my loss function and the training converged.
When I testing the model, I found that no matter what input is, the output almost always remains same (for all 120 dimensions). I have tried make the model simpler (2 hidden layers) or more complex (up to 7 hidden layers), but this still happened.
Here is how I build my model:
class MLP(torch.nn.Module):
def __init__(self, D_in, D_out):
super(MLP, self).__init__()
self.linear_1 = torch.nn.Linear(D_in, 1000)
self.linear_2 = torch.nn.Linear(1000, 1500)
self.linear_3 = torch.nn.Linear(1500, 1000)
self.linear_4 = torch.nn.Linear(1000, 750)
self.linear_5 = torch.nn.Linear(750, 500)
self.linear_6 = torch.nn.Linear(500, 250)
self.linear_7 = torch.nn.Linear(250, D_out)
self.sigmoid = torch.nn.Sigmoid()
def forward(self, x):
x = self.sigmoid(self.linear_1(x))
x = self.sigmoid(self.linear_2(x))
x = self.sigmoid(self.linear_3(x))
x = self.sigmoid(self.linear_4(x))
x = self.sigmoid(self.linear_5(x))
x = self.sigmoid(self.linear_6(x))
y_pred = self.linear_7(x)
return y_pred
Can anyone give any insights into this? Thanks in advance!

Understanding TensorboardX graph

I'm inspecting my model in the tensorboardX. I'm using PyTorch. Following is the forward method:
def forward(self, x):
x = self.net(x)
x = x.view(x.shape[0], -1) # flat
x = self.ext_net(x)
x = F.log_softmax(x, dim=1)
return x
In the graph, I don't understand what these values and nodes in blue and green boxes represent. Why is there a separate dataflow from input to ext_net (FCNCustom)? Shouldn't there be only one coming through net?

How to give multiple arguments in tensorflow Model call function?

I'm trying to build a model in tensorflow by extending the 'Model' class in tensorflow.keras. I need to pass two arguments in the 'call' function of this class, input images x (224,224,3) and output label y. But I get the following error while building the model:
ValueError: Currently, you cannot build your model if it has
positional or keyword arguments that are not input to the model, but
are required for its 'call' method.
class myCNN(Model):
def __init__(self):
super(myCNN, self).__init__()
base_model = tf.keras.applications.VGG16(input_shape=(224,224,3), weights='imagenet')
layer_name = 'block5_conv3'
self.conv_1 = Model(inputs=base_model.input, outputs=base_model.get_layer(layer_name).output)
self.flatten = L.Flatten(name='flatten')
self.fc1 = L.Dense(1000, activation='relu', name='fc1')
self.final = L.Activation('softmax')
# The problem is because I need y
def call(self, x, y):
x = self.conv_1(x)
x = self.flatten(x)
x = self.fc1(x)
return self.final(x)
model = myCNN()
model.build((None, 224, 224, 3, 1))
Inputs parameter of call method can be an input tensor or list/tuple of input tensors.
You can pass two arguments like this:
def call(self, inputs):
x = inputs[0]
y = inputs[1]
x = self.conv_1(x)
x = self.flatten(x)
x = self.fc1(x)
return self.final(x)

Resources