How to freeze selected layers of a model in Pytorch? - pytorch

I am using MobileNetV2 and want to freeze only part of the model. I know I can use the following code to freeze the entire model:
MobileNet = models.mobilenet_v2(pretrained=True)
for param in MobileNet.parameters():
    param.requires_grad = False
but I want everything from (15) onward to remain unfrozen. How can I freeze only the layers before (15)?
(15): InvertedResidual(
(conv): Sequential(
(0): ConvBNReLU(
(0): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU6(inplace=True)
)
(1): ConvBNReLU(
(0): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)
(1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU6(inplace=True)
)
(2): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
(3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(16): InvertedResidual(
(conv): Sequential(
(0): ConvBNReLU(
(0): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU6(inplace=True)
)
(1): ConvBNReLU(
(0): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)
(1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU6(inplace=True)
)
(2): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
(3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(17): InvertedResidual(
(conv): Sequential(
(0): ConvBNReLU(
(0): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU6(inplace=True)
)
(1): ConvBNReLU(
(0): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)
(1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU6(inplace=True)
)
(2): Conv2d(960, 320, kernel_size=(1, 1), stride=(1, 1), bias=False)
(3): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(18): ConvBNReLU(
(0): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(1280, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU6(inplace=True)
)
)
(classifier): Sequential(
(0): Dropout(p=0.2, inplace=False)
(1): Linear(in_features=1280, out_features=1000, bias=True)
)
)

PyTorch models are built from nested modules, so after freezing everything with
for param in MobileNet.parameters():
    param.requires_grad = False
you can unfreeze the parameters in (15) afterwards with
for param in MobileNet.features[15].parameters():
    param.requires_grad = True
Loop over indices 15 to 18 to unfreeze the last several layers.
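The loop described above could look roughly like the sketch below. To keep it runnable without downloading weights, it uses a hypothetical ToyNet whose features attribute is an nn.Sequential of 19 small layers, standing in for MobileNet.features; only the freeze/unfreeze pattern is the point.

```python
from torch import nn

class ToyNet(nn.Module):
    """Hypothetical stand-in for MobileNetV2: 19 feature blocks plus a classifier."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(*[nn.Linear(4, 4) for _ in range(19)])
        self.classifier = nn.Linear(4, 2)

    def forward(self, x):
        return self.classifier(self.features(x))

model = ToyNet()

# Step 1: freeze every parameter in the model.
for param in model.parameters():
    param.requires_grad = False

# Step 2: unfreeze blocks 15, 16, 17 and 18 (range end is exclusive).
for idx in range(15, 19):
    for param in model.features[idx].parameters():
        param.requires_grad = True

# Collect the indices of feature blocks that are trainable again.
trainable_blocks = sorted({
    int(name.split(".")[1])
    for name, p in model.named_parameters()
    if p.requires_grad and name.startswith("features.")
})
print(trainable_blocks)
```

Note that with this freeze-all-then-unfreeze pattern the classifier stays frozen too, so it needs its own unfreeze step if it should be trained.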

Just adding this here for completeness. You can also freeze parameters in place, without iterating over them, using requires_grad_ (API).
For example, say you have a RetinaNet and want to fine-tune only the heads:
class RetinaNet(torch.nn.Module):
    def __init__(self, ...):
        super().__init__()
        self.backbone = ResNet(...)
        self.fpn = FPN(...)
        self.box_head = torch.nn.Sequential(...)
        self.cls_head = torch.nn.Sequential(...)
Then you could freeze the backbone and FPN like this:
# Getting the model
retinanet = RetinaNet(...)
# Freezing backbone and FPN
retinanet.backbone.requires_grad_(False)
retinanet.fpn.requires_grad_(False)
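As a runnable illustration of the same idea, here is a minimal sketch with a hypothetical ToyDetector. Its layers are made up; only the attribute names mirror the RetinaNet outline above. It counts frozen and trainable parameters after calling requires_grad_ on two submodules.

```python
from torch import nn

class ToyDetector(nn.Module):
    """Hypothetical model; only the attribute names follow the outline above."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))
        self.fpn = nn.Conv2d(8, 8, 1)
        self.box_head = nn.Sequential(nn.Conv2d(8, 4, 1))
        self.cls_head = nn.Sequential(nn.Conv2d(8, 2, 1))

model = ToyDetector()

# requires_grad_ is applied recursively to all parameters of the submodule.
model.backbone.requires_grad_(False)
model.fpn.requires_grad_(False)

n_frozen = sum(p.numel() for p in model.parameters() if not p.requires_grad)
n_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(n_frozen, n_trainable)
```

Setting requires_grad to False only stops gradient updates. BatchNorm layers in the frozen part still update their running statistics while in training mode, so calling .eval() on those submodules is needed if the statistics should stay fixed as well.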

If you want to define some layers by name and then unfreeze them, I propose a variant of @JVGD's answer:
class RetinaNet(torch.nn.Module):
    def __init__(self, ...):
        super().__init__()
        self.backbone = ResNet(...)
        self.fpn = FPN(...)
        self.box_head = torch.nn.Sequential(...)
        self.cls_head = torch.nn.Sequential(...)
# Getting the model
retinanet = RetinaNet(...)
# The param name is f'{module_name}.weight' or f'{module_name}.bias'.
# Some layers, e.g., batch norm, have additional params.
# In some circumstances, e.g., when using DataParallel(),
# the param name is prefixed by 'module.'.
# Note: for a layer inside an nn.Sequential, module_name includes the child
# index (e.g. 'cls_head.0'); print the names from named_parameters() to check.
# The two names below assume the first child of cls_head has a weight and a bias.
params_to_train = ['cls_head.0.weight', 'cls_head.0.bias']
for name, param in retinanet.named_parameters():
    # Set True only for params in the list 'params_to_train'
    param.requires_grad = name in params_to_train
...
The advantage is that you can define all layers to unfreeze in one iterable.
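A variation on this name-based selection, sketched below, matches on a name prefix instead of exact parameter names. The model is a hypothetical two-part nn.ModuleDict, and prefixes_to_train is a name introduced only for this illustration. Printing the names first shows how child indices of an nn.Sequential appear in them.

```python
from torch import nn

# Hypothetical model with a backbone and a classification head.
model = nn.ModuleDict({
    "backbone": nn.Sequential(nn.Linear(4, 4)),
    "cls_head": nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2)),
})

# Inspect the real parameter names before writing any filter.
all_names = [name for name, _ in model.named_parameters()]
print(all_names)

# str.startswith accepts a tuple, so several prefixes can be listed here.
prefixes_to_train = ("cls_head.",)
for name, param in model.named_parameters():
    param.requires_grad = name.startswith(prefixes_to_train)

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)
```

Prefix matching avoids listing every weight and bias by hand, at the cost of being less explicit about exactly which tensors are trained.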

A shorter alternative to the first answer is to freeze only the first 15 layers [0-14], because the last layers [15-18] are unfrozen by default (param.requires_grad = True).
Therefore, we only need:
MobileNet = torchvision.models.mobilenet_v2(pretrained=True)
# The slice end is exclusive, so [:15] covers indices 0 to 14.
for param in MobileNet.features[:15].parameters():
    param.requires_grad = False
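Because slice ends are exclusive, it is worth checking which blocks a slice of an nn.Sequential actually covers. The sketch below uses a hypothetical 19-layer nn.Sequential in place of MobileNet.features and lists the frozen indices after freezing the slice [:15].

```python
from torch import nn

# Hypothetical stand-in for MobileNet.features: 19 blocks, indices 0 to 18.
features = nn.Sequential(*[nn.Linear(2, 2) for _ in range(19)])

# Slicing an nn.Sequential returns a new Sequential that shares the same layers.
print(len(features[0:14]))  # 14 blocks: indices 0 to 13
print(len(features[:15]))   # 15 blocks: indices 0 to 14

for param in features[:15].parameters():
    param.requires_grad = False

# Indices of blocks whose parameters are all frozen.
frozen = [
    i for i, layer in enumerate(features)
    if not any(p.requires_grad for p in layer.parameters())
]
print(frozen)
```

With this approach nothing outside the slice is touched, so blocks 15 to 18 and any classifier keep their default requires_grad = True.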

Related

Can modifying a layer in a pretrained model break the residual connections?

I'm currently working on an image recognition task using SE-ResNet-50 as my backbone network.
The pretrained SE-ResNet-50 model was obtained from Timm (https://paperswithcode.com/model/se-resnet?variant=seresnet50).
My aim is to replace the SE blocks, also known as the attention blocks, with alternative attention blocks such as CBAM or dual attention. The script I use is:
model = timm.create_model('seresnet50', pretrained=True)
# find every SE block in seresnet
seblocks = [name.split('.') for name, _ in model.named_modules() if name.split('.')[-1] == 'se']
for *parent, k in seblocks:
    block = model.get_submodule('.'.join(parent))
    block.se = CBAM(block.se.fc1.in_channels)
before:
(2): Bottleneck(
...
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
-> (se): SEModule(
(fc1): Conv2d(2048, 128, kernel_size=(1, 1), stride=(1, 1))
(bn): Identity()
(act): ReLU(inplace=True)
(fc2): Conv2d(128, 2048, kernel_size=(1, 1), stride=(1, 1))
(gate): Sigmoid()
)
(act3): ReLU(inplace=True)
)
after:
(2): Bottleneck(
...
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
-> (se): CBAM(
(ChannelGate): ChannelGate(
(mlp): Sequential(
(0): Flatten()
(1): Linear(in_features=2048, out_features=128, bias=True)
(2): ReLU()
(3): Linear(in_features=128, out_features=2048, bias=True)
)
)
(SpatialGate): SpatialGate(
(compress): ChannelPool()
(spatial): BasicConv(
(conv): Conv2d(2, 1, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), bias=False)
(bn): BatchNorm2d(1, eps=1e-05, momentum=0.01, affine=True, track_running_stats=True)
)
)
)
(act3): ReLU(inplace=True)
)
The replacement of the SE block with the CBAM block was successful.
I have two questions:
1. Will this replacement affect the residual connection?
2. How do I modify the module name, changing (SE) to (CBAM)?
Thanks.
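One way to reason about both questions is with a toy block. The sketch below is not timm's Bottleneck: ToyBlock and ToyGate are hypothetical classes written for this illustration, and it assumes the real block's forward also calls the attention module by attribute and adds the shortcut afterwards, which should be checked against the timm source. Under that assumption, swapping in a module that preserves the tensor shape leaves the residual addition unchanged.

```python
import torch
from torch import nn

class ToyBlock(nn.Module):
    """Hypothetical residual block with an attention submodule stored as self.se."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        self.se = nn.Identity()  # placeholder for the original attention module

    def forward(self, x):
        shortcut = x
        x = self.conv(x)
        x = self.se(x)        # the attention module is looked up by attribute name
        return x + shortcut   # the residual add lives in forward, not in self.se

class ToyGate(nn.Module):
    """Hypothetical shape-preserving channel attention."""
    def __init__(self, channels):
        super().__init__()
        self.fc = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        pooled = x.mean(dim=(2, 3), keepdim=True)
        return x * torch.sigmoid(self.fc(pooled))

block = ToyBlock(8)
block.se = ToyGate(8)  # same kind of swap as block.se = CBAM(...) in the question

x = torch.randn(2, 8, 4, 4)
out = block(x)
expected = block.se(block.conv(x)) + x  # the shortcut is still added
print(out.shape)
```

The name shown in the printout, (se), is just the attribute name. In a block written like this toy one, the forward method refers to self.se, so renaming the attribute would also require changing forward, for example in a subclass.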

Fine tuning freezing weights nnUNet

Good morning,
I've followed the instructions in this github issue:
https://github.com/MIC-DKFZ/nnUNet/issues/1108
to fine-tune an nnUNet model (PyTorch) from a pre-trained one, but this method retrains all weights. I would like to freeze all weights and retrain only the last layer's weights, changing the number of segmentation classes from 3 to 1.
Do you know a way to do that?
Thank you in advance
To freeze the weights, set parameter.requires_grad = False.
Example:
from nnunet.network_architecture.generic_UNet import Generic_UNet
model = Generic_UNet(input_channels=3, base_num_features=64, num_classes=4, num_pool=3)
for name, parameter in model.named_parameters():
    if 'seg_outputs' in name:
        print(f"parameter '{name}' will not be frozen")
        parameter.requires_grad = True
    else:
        parameter.requires_grad = False
To check parameter names you can use print:
print(model)
which produces:
Generic_UNet(
(conv_blocks_localization): ModuleList(
(0): Sequential(
(0): StackedConvLayers(
(blocks): Sequential(
(0): ConvDropoutNormNonlin(
(conv): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(instnorm): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(lrelu): LeakyReLU(negative_slope=0.01, inplace=True)
)
)
)
(1): StackedConvLayers(
(blocks): Sequential(
(0): ConvDropoutNormNonlin(
(conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(instnorm): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(lrelu): LeakyReLU(negative_slope=0.01, inplace=True)
)
)
)
)
)
(conv_blocks_context): ModuleList(
(0): StackedConvLayers(
(blocks): Sequential(
(0): ConvDropoutNormNonlin(
(conv): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout2d(p=0.5, inplace=True)
(instnorm): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(lrelu): LeakyReLU(negative_slope=0.01, inplace=True)
)
(1): ConvDropoutNormNonlin(
(conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout2d(p=0.5, inplace=True)
(instnorm): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(lrelu): LeakyReLU(negative_slope=0.01, inplace=True)
)
)
)
(1): Sequential(
(0): StackedConvLayers(
(blocks): Sequential(
(0): ConvDropoutNormNonlin(
(conv): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout2d(p=0.5, inplace=True)
(instnorm): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(lrelu): LeakyReLU(negative_slope=0.01, inplace=True)
)
)
)
(1): StackedConvLayers(
(blocks): Sequential(
(0): ConvDropoutNormNonlin(
(conv): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(dropout): Dropout2d(p=0.5, inplace=True)
(instnorm): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(lrelu): LeakyReLU(negative_slope=0.01, inplace=True)
)
)
)
)
)
(td): ModuleList(
(0): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
)
(tu): ModuleList(
(0): Upsample()
)
(seg_outputs): ModuleList(
(0): Conv2d(64, 4, kernel_size=(1, 1), stride=(1, 1), bias=False)
)
)
Or you can visualize your network with netron:
https://github.com/lutzroeder/netron
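The question also asks about changing the number of classes from 3 to 1. A minimal sketch of how that could be combined with the freezing loop is shown below. ToyUNet is a hypothetical stand-in, not Generic_UNet; only the attribute name seg_outputs follows the printout above. The output convolution is replaced first, then everything outside seg_outputs is frozen.

```python
import torch
from torch import nn

class ToyUNet(nn.Module):
    """Hypothetical stand-in; only the name seg_outputs mirrors the printout."""
    def __init__(self, num_classes):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1),
            nn.LeakyReLU(0.01),
        )
        self.seg_outputs = nn.ModuleList(
            [nn.Conv2d(64, num_classes, kernel_size=1, bias=False)]
        )

    def forward(self, x):
        return self.seg_outputs[0](self.encoder(x))

model = ToyUNet(num_classes=3)  # pretend these are the pre-trained weights

# Replace the output layer with a freshly initialised one for a single class.
old_head = model.seg_outputs[0]
model.seg_outputs[0] = nn.Conv2d(old_head.in_channels, 1, kernel_size=1, bias=False)

# Freeze everything except the segmentation output layer.
for name, parameter in model.named_parameters():
    parameter.requires_grad = "seg_outputs" in name

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
out = model(torch.randn(2, 3, 8, 8))
print(trainable, out.shape)
```

The new layer starts from random weights, which is why it is the one left trainable.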

How to Extract the feature vectors and save them in Densenet121?

I'm trying to extract the feature vectors of my dataset (X-ray images) from a DenseNet121 CNN that was trained for classification using PyTorch. I want to extract the feature vectors from one of the intermediate layers.
model.eval() -->
DataParallel(
(module): DenseNet121(
(densenet121): DenseNet(
(features): Sequential(
(conv0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu0): ReLU(inplace=True)
(pool0): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(denseblock1): _DenseBlock(
(denselayer1): _DenseLayer(
(norm1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer2): _DenseLayer(
(norm1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(96, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer3): _DenseLayer(
(norm1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer4): _DenseLayer(
(norm1): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(160, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer5): _DenseLayer(
(norm1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(192, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer6): _DenseLayer(
(norm1): BatchNorm2d(224, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(224, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
)
(transition1): _Transition(
(norm): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(pool): AvgPool2d(kernel_size=2, stride=2, padding=0)
)
(denseblock2): _DenseBlock(
(denselayer1): _DenseLayer(
(norm1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer2): _DenseLayer(
(norm1): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(160, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer3): _DenseLayer(
(norm1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(192, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer4): _DenseLayer(
(norm1): BatchNorm2d(224, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(224, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer5): _DenseLayer(
(norm1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer6): _DenseLayer(
(norm1): BatchNorm2d(288, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(288, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer7): _DenseLayer(
(norm1): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(320, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer8): _DenseLayer(
(norm1): BatchNorm2d(352, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(352, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer9): _DenseLayer(
(norm1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(384, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer10): _DenseLayer(
(norm1): BatchNorm2d(416, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(416, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer11): _DenseLayer(
(norm1): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(448, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
(denselayer12): _DenseLayer(
(norm1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(480, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
)
(transition2): _Transition(
(norm): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(pool): AvgPool2d(kernel_size=2, stride=2, padding=0)
)
I think I have to do some work in the following block of code but I need help to do that.
class DenseNet121(nn.Module):
    def __init__(self, out_size):
        super(DenseNet121, self).__init__()
        self.densenet121 = torchvision.models.densenet121(pretrained=True)
        num_ftrs = self.densenet121.classifier.in_features
        self.densenet121.classifier = nn.Sequential(
            nn.Linear(num_ftrs, out_size),
            nn.Sigmoid()
        )
    def forward(self, x):
        x = self.densenet121(x)
        return x
I want to get the feature vectors and then save them in order to use them later on as an input for another function.
Thank you.
You probably want to use something like a forward hook. It is a function you register on a module, and it is executed whenever that module's forward is called. So you can register a forward hook at the points in your model where you want to log the input and/or output, and write the feature vector to a file or wherever you need it.
To find the right layer to bind the hook to, look at the description you posted and walk down the tree. So if you want to see the input and output of denseblock1.denselayer2.conv1, it should be something along these lines:
model.densenet121.features.denseblock1.denselayer2.conv1
(Your printout shows the model wrapped in DataParallel, so the path would start with model.module.)
No guarantee that it will work, and it is best to try things out in a debugger. You may also need to access elements of a Sequential by index with the [] operator.
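To make the forward-hook idea concrete, here is a minimal sketch that collects the output of one intermediate layer over a few batches and saves it to disk. The network is a small hypothetical nn.Sequential rather than the DenseNet121 above, the inputs are random tensors, and the file name features.pt is arbitrary.

```python
import os
import tempfile

import torch
from torch import nn

# Hypothetical small network standing in for the real model.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),      # index 3: its output is the 8-dimensional feature vector
    nn.Linear(8, 2),
)
model.eval()

features = []

def save_output(module, inputs, output):
    # Called after the hooked module's forward; keep a detached CPU copy.
    features.append(output.detach().cpu())

handle = model[3].register_forward_hook(save_output)

with torch.no_grad():
    for _ in range(3):                         # three random batches of four images
        model(torch.randn(4, 3, 16, 16))

handle.remove()                                # stop recording once finished

feature_matrix = torch.cat(features)           # one row per image
path = os.path.join(tempfile.mkdtemp(), "features.pt")
torch.save(feature_matrix, path)
reloaded = torch.load(path)
print(feature_matrix.shape)
```

Removing the handle afterwards prevents the list from growing during later forward passes.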

Apply hooks on inner layers of ResNet

The pytorch official implementation of resnet results in the following model:
ResNet(
(conv1): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(layer1): Sequential(
(0): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(shortcut): Sequential()
)
(1): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(shortcut): Sequential()
)
)
(layer2): Sequential(
(0): BasicBlock(
(conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(shortcut): Sequential(
(0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(shortcut): Sequential()
)
)
### Skipping layers 3 and 4
(linear): Linear(in_features=512, out_features=10, bias=True)
)
I tried applying a hook to conv1 in the first BasicBlock of layer2 using
handle = model.layer2[0][0].register_forward_hook(batchout_pre_hook)
but got the following error:
TypeError: 'BasicBlock' object does not support indexing
I am able to apply a hook to the BasicBlock itself using handle = model.layer2[0].register_forward_hook(batchout_pre_hook), but I cannot apply a hook to the modules inside the BasicBlock.
To attach a hook to conv1 in layer2's 0th block, you need to use
handle = model.layer2[0].conv1.register_forward_hook(batchout_pre_hook)
This is because, inside the 0th block, the modules are attributes named conv1, bn1, etc., not items of a list that can be accessed by index.
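The difference between indexed access on a Sequential and attribute access inside a block can be tried out on a small example. The sketch below uses hypothetical ToyBasicBlock and ToyNet classes, not the ResNet from the question, and a hook named record_shape that simply stores the output shape. It also shows that get_submodule with a dotted path reaches the same module.

```python
import torch
from torch import nn

class ToyBasicBlock(nn.Module):
    """Hypothetical block whose children are named attributes, not list items."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)

    def forward(self, x):
        return torch.relu(self.bn1(self.conv1(x)))

class ToyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer2 = nn.Sequential(ToyBasicBlock(3, 8), ToyBasicBlock(8, 8))

    def forward(self, x):
        return self.layer2(x)

model = ToyNet()
captured = {}

def record_shape(module, inputs, output):
    captured["shape"] = tuple(output.shape)

# layer2 is a Sequential, so [0] works; conv1 is an attribute of the block.
target = model.layer2[0].conv1
handle = target.register_forward_hook(record_shape)
model(torch.randn(2, 3, 5, 5))
handle.remove()

# The dotted path from the printout resolves to the same module object.
same_module = model.get_submodule("layer2.0.conv1") is target
print(captured["shape"], same_module)
```

The dotted form is convenient when the layer to hook is chosen from a string, for example a name taken from named_modules().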

Freezing all the layers but FCN head in FCN_ResNet101 Pytorch

I want to fine-tune an FCN_ResNet101. I would like to change the last layer because my dataset has a different number of classes, and to fine-tune only the FCN head.
For the former, is it enough to change only the num_classes argument when defining the model, or do I need to use something like this:
model = torchvision.models.segmentation.fcn_resnet101(pretrained=True)
model.classifier = nn.Identity()
model.Conv2d = nn.Conv2d(
    in_channels=256,
    out_channels=nb_classes,
    kernel_size=1,
    stride=1
)
I took this piece of code from another thread. I am not sure whether it is necessary to use nn.Identity(). When I do, the last layer does not change; the last layer of the second-to-last FCN head changes instead!
Also, how many layers must be changed so that my FCN head is retrained?
I wrote it this way, but I'm mostly confused about the FCN_ResNet101 architecture.
model = torchvision.models.segmentation.fcn_resnet101(pretrained=True, progress=True, num_classes=?)
#model.classifier[4] = nn.Identity()
"""
FCNHead(
(0): Conv2d(2048, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU()
(3): Dropout(p=0.1)
(4): Conv2d(512, 21, kernel_size=(1, 1), stride=(1, 1))
), FCNHead(
(0): Conv2d(1024, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU()
(3): Dropout(p=0.1)
(4): Conv2d(256, 21, kernel_size=(1, 1), stride=(1, 1))
)]
"""
#setting our own number of classes
layer_list = list(model.children())[-5:]
model_small = nn.Sequential(*list(model.children()))[-5:]
for param in model_small.parameters():
    param.requires_grad = False
model_small.Conv2d = nn.Conv2d( in_channels=1024,kernel_size=(3,3),stride=(1,1))
model_small.BatchNorm2d = nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
model_small.ReLU = nn.ReLU()
model_small.Dropout = nn.Dropout(p=0.1)
model_small.Conv2d = nn.Conv2d(
    in_channels=256,
    out_channels=nb_classes,
    kernel_size=1,
    stride=1
)
model = model_small.to(device)
Any guidance is very much appreciated!
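One possible approach, sketched here on a toy model rather than on fcn_resnet101 itself, is to freeze the whole model, replace only the last convolution of the head by index, and then re-enable gradients for the head. ToySegModel and the value of nb_classes are hypothetical. The layout of classifier imitates the first FCNHead printed above, where index 4 is the final 1x1 convolution. Whether the real model also needs its second head adjusted is left open.

```python
import torch
from torch import nn

nb_classes = 5  # hypothetical number of classes in the new dataset

class ToySegModel(nn.Module):
    """Hypothetical model: a backbone plus a head laid out like the FCNHead printout."""
    def __init__(self, num_classes=21):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.classifier = nn.Sequential(
            nn.Conv2d(16, 8, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(8),
            nn.ReLU(),
            nn.Dropout(p=0.1),
            nn.Conv2d(8, num_classes, kernel_size=1),
        )

    def forward(self, x):
        return self.classifier(self.backbone(x))

model = ToySegModel()

# 1. Freeze everything.
model.requires_grad_(False)

# 2. Replace the last conv of the head; assigning by index keeps forward working.
last = model.classifier[4]
model.classifier[4] = nn.Conv2d(last.in_channels, nb_classes, kernel_size=1)

# 3. Make the whole head trainable again.
model.classifier.requires_grad_(True)

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
optimizer = torch.optim.SGD([p for p in model.parameters() if p.requires_grad], lr=0.01)
out = model(torch.randn(2, 3, 8, 8))
print(trainable, out.shape)
```

Assigning to a new attribute such as model.Conv2d only registers an extra, unused module, because forward never calls it. Replacing the existing entry by index is what changes the output layer.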

Resources