How to attach hooks to ReLUs in Inception V3 from torchvision - pytorch

I am using Inception v3 from torchvision. I tried to find the ReLUs within the model:
def recursively_find_submodules(model, submodule_type):
module_list = []
q = [model]
while q:
child = q.pop()
if isinstance(child, submodule_type):
module_list.append(child)
q.extend(list(child.children()))
return module_list
inception = torch.hub.load('pytorch/vision:v0.6.0', 'inception_v3', pretrained=True)
l = recursively_find_submodules(inception, torch.nn.ReLU) # l is empty!
So the ReLUs are not children of any module within the torch model. Upon closer inspection I found the ReLUs in the source code of torchvision but not as modules. In inception.py I found the following:
class BasicConv2d(nn.Module):
def __init__(self, in_channels, out_channels, **kwargs):
super(BasicConv2d, self).__init__()
self.conv = nn.Conv2d(in_channels, out_channels, bias=False, **kwargs)
self.bn = nn.BatchNorm2d(out_channels, eps=0.001)
def forward(self, x):
x = self.conv(x)
x = self.bn(x)
return F.relu(x, inplace=True)
So the BasicConv2d module uses the ReLU function to clamp it's output instead of the module (torch.nn.ReLU). I guess there is no way to hook up to ReLU functions and modify their input / output without modifying the whole model to use ReLU modules or is there a way to do this?

You can hook to the batch-norm layer preceding the ReLU and attach there, taking into account you observe the inputs to the ReLU rather that the features after the activation.

Related

How to drop running stats to default value for Norm layer in pyTorch?

I trained model on some images. Now to fit similar dataset but with another colors I want to load this model but also i want to drop all running stats from Batchnorm layers (set them to default value, like totally untrained). What parameters should i reset? Simple model looks like this
import torch
import torch.nn as nn
class Net(nn.Module):
def __init__(self):
super(Net, self).__init__()
self.conv0 = nn.Conv2d(3, 3, 3, padding = 1)
self.norm = nn.BatchNorm2d(3)
self.conv = nn.Conv2d(3, 3, 3, padding = 1)
def forward(self, x):
x = self.conv0(x)
x = self.norm(x)
return self.conv(x)
net = Net()
##or for pretrained it will be
##net = torch.load('net.pth')
def drop_to_default():
for m in net.modules():
if type(m) == nn.BatchNorm2d:
####???####
drop_to_default()
Simplest way to do that is to run reset_running_stats() method on BatchNorm objects:
def drop_to_default():
for m in net.modules():
if type(m) == nn.BatchNorm2d:
m.reset_running_stats()
Below is this method's source code:
def reset_running_stats(self) -> None:
if self.track_running_stats:
# running_mean/running_var/num_batches... are registered at runtime depending
# if self.track_running_stats is on
self.running_mean.zero_() # Zero (neutral) mean
self.running_var.fill_(1) # One (neutral) variance
self.num_batches_tracked.zero_() # Number of batches tracked
You can see the source code here, _NormBase class.

Visualize the output of Vgg16 model by TSNE plot?

I need to visualize the output of Vgg16 model which classify 14 different classes.
I load the trained model and I did replace the classifier layer with the identity() layer but it doesn't categorize the output.
Here is the snippet:
the number of samples here is 1000 images.
epoch = 800
PATH = 'vgg16_epoch{}.pth'.format(epoch)
checkpoint = torch.load(PATH)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
epoch = checkpoint['epoch']
class Identity(nn.Module):
def __init__(self):
super(Identity, self).__init__()
def forward(self, x):
return x
model.classifier._modules['6'] = Identity()
model.eval()
logits_list = numpy.empty((0,4096))
targets = []
with torch.no_grad():
for step, (t_image, target, classess, image_path) in enumerate(test_loader):
t_image = t_image.cuda()
target = target.cuda()
target = target.data.cpu().numpy()
targets.append(target)
logits = model(t_image)
print(logits.shape)
logits = logits.data.cpu().numpy()
print(logits.shape)
logits_list = numpy.append(logits_list, logits, axis=0)
print(logits_list.shape)
tsne = TSNE(n_components=2, verbose=1, perplexity=10, n_iter=1000)
tsne_results = tsne.fit_transform(logits_list)
target_ids = range(len(targets))
plt.scatter(tsne_results[:,0],tsne_results[:,1],c = target_ids ,cmap=plt.cm.get_cmap("jet", 14))
plt.colorbar(ticks=range(14))
plt.legend()
plt.show()
here is what this script has been produced: I am not sure why I have all colors for each cluster!
The VGG16 outputs over 25k features to the classifier. I believe it's too much to t-SNE. It's a good idea to include a new nn.Linear layer to reduce this number. So, t-SNE may work better. In addition, I'd recommend you two different ways to get the features from the model:
The best way to get it regardless of the model is by using the register_forward_hook method. You may find a notebook here with an example.
If you don't want to use the register, I'd suggest this one. After loading your model, you may use the following class to extract the features:
class FeatNet (nn.Module):
def __init__(self, vgg):
super(FeatNet, self).__init__()
self.features = nn.Sequential(*list(vgg.children())[:-1]))
def forward(self, img):
return self.features(img)
Now, you just need to call FeatNet(img) to get the features.
To include the feature reducer, as I suggested before, you need to retrain your model doing something like:
class FeatNet (nn.Module):
def __init__(self, vgg):
super(FeatNet, self).__init__()
self.features = nn.Sequential(*list(vgg.children())[:-1]))
self.feat_reducer = nn.Sequential(
nn.Linear(25088, 1024),
nn.BatchNorm1d(1024),
nn.ReLU()
)
self.classifier = nn.Linear(1024, 14)
def forward(self, img):
x = self.features(img)
x_r = self.feat_reducer(x)
return self.classifier(x_r)
Then, you can run your model returning x_r, that is, the reduced features. As I told you, 25k features are too much for t-SNE. Another method to reduce this number is by using PCA instead of nn.Linear. In this case, you send the 25k features to PCA and then train t-SNE using the PCA's output. I prefer using nn.Linear, but you need to test to check which one you get a better result.

cannot assign 'torch.nn.modules.container.Sequential' as parameter

I was following this method
(https://discuss.pytorch.org/t/dynamic-parameter-declaration-in-forward-function/427) to dynamically assign parameters in forward function.
However, my parameter is not just one single weight tensor but it is nn.Sequential.
When I implement below:
class MyModule(nn.Module):
def __init__(self):
# you need to register the parameter names earlier
self.register_parameter('W_di', None)
def forward(self, input):
if self.W_di is None:
self.W_di = nn.Sequential(
nn.Linear(mL_n * 2, 1024),
nn.ReLU(),
nn.Linear(1024, self.hS)).to(device)
I get the following error.
TypeError: cannot assign 'torch.nn.modules.container.Sequential' as parameter 'W_di' (torch.nn.Parameter or None expected)
Is there any way that I can register nn.Sequential as a whole param? Thanks!
If you or other users still have this problem, one solution to consider is using nn.ModuleList instead of nn.Sequential.
While nn.Sequential is useful for defining a fixed sequence of layers in PyTorch, nn.ModuleList is a more flexible container that allows direct access and modification of individual layers within the list. This can be especially helpful when dealing with dynamic models or architectures that require more complex layer arrangements.
My gut feeling is that you cannot do it. Even in the static model declaration, nn.Module also specifies the parameters of every sub-modules (e.g., nn.Conv2d or nn.Linear) in a nested way. That is, every kernel or bias is registered one by one and independently.
One workaround might be to introduce dynamic sub-modules. Here is my brief implementation. One can define desired dynamic behaviors inside the function DynamicLinear.
import torch
import torch.nn as nn
class DynamicLinear(nn.Module):
def __init__(self):
super(DynamicLinear, self).__init__()
# you need to register the parameter names earlier
self.register_parameter('W_di', None)
def forward(self, x):
if self.W_di is None:
# dynamically define a linear function here
self.W_di = nn.Parameter(torch.ones(1, 1)).to(x.device)
return self.W_di # x
class MyModule(nn.Module):
def __init__(self):
super(MyModule, self).__init__()
self.net = nn.Sequential(
DynamicLinear(),
nn.ReLU(),
DynamicLinear())
def forward(self, x):
return self.net(x)
m = MyModule()
x = torch.ones(1, 1)
y = m(x)
# output: 1
print(y)

How to add BatchNormalization loss to gradient calculation in tensorflow 2.0 using keras subclass API

Using the keras subclass API it is easy enough to add a a batch normalization layer however the layer.losses list always appears empty. What is the correct method of including in the train loss when doing tape.gradient(loss, lossmodel.trainable_variables) where lossmodel is some separate keras subclass model defining a more complicated loss function that must include the gradient losses?
For example, this is minimal model with ONLY the batch norm layer. It has no loss AFAIK
class M(tf.keras.Model):
def __init__(self, axis):
super().__init__()
self.layer = tf.keras.layers.BatchNormalization(axis=axis, scale=False, center=True, virtual_batch_size=1, input_shape=(6,))
def call(self, x):
out = self.layer(x)
return out
m = M(1)
In [77]: m.layer.losses
Out[77]: []

How do I add LSTM, GRU or other recurrent layers to a Sequential in PyTorch

I like using torch.nn.Sequential as in
self.conv_layer = torch.nn.Sequential(
torch.nn.Conv1d(196, 196, kernel_size=15, stride=4),
torch.nn.Dropout()
)
But when I want to add a recurrent layer such as torch.nn.GRU it won't work because the output of recurrent layers in PyTorch is a tuple and you need to choose which part of the output you want to further process.
So is there any way to get
self.rec_layer = nn.Sequential(
torch.nn.GRU(input_size=2, hidden_size=256),
torch.nn.Linear(in_features=256, out_features=1)
)
to work? For this example, let's say I want to feed torch.nn.GRU(input_size=2, hidden_size=20)(x)[1][-1] (the last hidden state of the last layer) into the following Linear layer.
I made a module called SelectItem to pick out an element from a tuple or list
class SelectItem(nn.Module):
def __init__(self, item_index):
super(SelectItem, self).__init__()
self._name = 'selectitem'
self.item_index = item_index
def forward(self, inputs):
return inputs[self.item_index]
SelectItem can be used in Sequential to pick out the hidden state:
net = nn.Sequential(
nn.GRU(dim_in, dim_out, batch_first=True),
SelectItem(1)
)

Resources