Get the input channels for the conv2d from previous layer? - pytorch

Suppose a model has several convolutional layers in sequence (conv1 --> conv2). How can I get the input channels parameter for conv2 from the output channels of conv1?
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self, in_ch, num_features, out_ch2):
        super(MyModel, self).__init__()
        # kernel_size is a required argument of nn.Conv2d; 3 is an arbitrary choice here
        self.conv1 = nn.Conv2d(in_ch, num_features, kernel_size=3)
        # placeholder: what should be passed as in_channels here?
        self.conv2 = nn.Conv2d(in_channels_from_out_channels_of_conv1, out_ch2, kernel_size=3)
Can I get the out_channels from the conv1 layer and use it as in_channels for conv2?

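The second parameter of the nn.Conv2d constructor is the number of output channels, so pass the same value as the first argument of conv2:
        self.conv1 = nn.Conv2d(in_ch, conv1_out_channels, kernel_size=3)
        self.conv2 = nn.Conv2d(conv1_out_channels, out_ch2, kernel_size=3)
as described in the docs.
It is also available as an attribute of the layer: self.conv1.out_channels.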
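A minimal sketch of reading the channel count back from the first layer through its out_channels attribute. The kernel_size default of 3, the forward method, and the input size are assumptions added for illustration; they are not part of the question.

```python
import torch
from torch import nn


class MyModel(nn.Module):
    def __init__(self, in_ch, num_features, out_ch2, kernel_size=3):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, num_features, kernel_size)
        # Reuse conv1's output channel count as conv2's input channel count.
        self.conv2 = nn.Conv2d(self.conv1.out_channels, out_ch2, kernel_size)

    def forward(self, x):
        return self.conv2(self.conv1(x))


model = MyModel(in_ch=3, num_features=16, out_ch2=32)
y = model(torch.randn(1, 3, 8, 8))  # arbitrary 8x8 input
```

Reading the attribute keeps the two layers in sync if num_features changes later, at the cost of making the order of the assignments matter.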


Saving the model architecture with activation functions in PyTorch

I use PyTorch for training neural networks. When I save the model, the weights of the network are saved, but the activation functions are not captured. If I then reload the saved weights into a model whose activation functions have been changed, loading still does not throw an error, and the network outputs incorrect values (obviously). Is there a way to save the structure of the neural network along with the weights? An MWE is presented below.
import torch
from torch import nn

class Test(nn.Module):
    def __init__(self):
        super(Test, self).__init__()
        self.fc1 = nn.Linear(10, 25)
        self.fc2 = nn.Linear(25, 10)
        self.relu = nn.ReLU()
        self.tanh = nn.Tanh()

    def forward(self, inputs):
        return self.tanh(self.fc2(self.relu(self.fc1(inputs))))
To save
test = Test().float()
torch.save(test.state_dict(), "")
To load
import torch
from torch import nn

class Test1(nn.Module):
    def __init__(self):
        super(Test1, self).__init__()
        self.fc1 = nn.Linear(10, 25)
        self.fc2 = nn.Linear(25, 10)
        self.relu = nn.ReLU()
        self.tanh = nn.Tanh()

    def forward(self, inputs):
        return self.relu(self.fc2(self.tanh(self.fc1(inputs))))

test1 = Test1().float()
# Loads without error. However, the activation functions tanh and relu are
# interchanged, and the network outputs incorrect values.
test1.load_state_dict(torch.load(""))
Is there a way to also capture the activation functions when saving? Thanks.
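
The behaviour described above follows from what a state_dict contains: only parameter and buffer tensors keyed by attribute name, with nothing about forward. The sketch below first lists those keys, then shows one possible way to store the computation together with the weights, TorchScript (torch.jit.script, save, torch.jit.load). This is an option added here for illustration, not something proposed in the thread; the temporary file path is hypothetical.

```python
import os
import tempfile

import torch
from torch import nn


class Test(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 25)
        self.fc2 = nn.Linear(25, 10)
        self.relu = nn.ReLU()
        self.tanh = nn.Tanh()

    def forward(self, inputs):
        return self.tanh(self.fc2(self.relu(self.fc1(inputs))))


test = Test().float()

# The state_dict holds only tensors; ReLU and Tanh contribute no entries.
state_keys = list(test.state_dict().keys())

# TorchScript serializes the forward logic along with the weights.
scripted = torch.jit.script(test)
path = os.path.join(tempfile.mkdtemp(), "test_scripted.pt")  # hypothetical path
scripted.save(path)
restored = torch.jit.load(path)  # does not need the Test class definition

x = torch.randn(4, 10)
with torch.no_grad():
    same_output = torch.allclose(test(x), restored(x))
```

The restored module replays the order of activations that was scripted, so swapping relu and tanh in a later class definition cannot silently change its output.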

Use torch.square() inside a nn.Sequential layer in PyTorch

I want to square the result of a maxpool layer.
I tried the following:
class CNNClassifier(Classifier):  # nn.Module
    def __init__(self, in_channels):
        super().__init__()
        self.cnn = nn.Sequential(
            # maxpool
            nn.MaxPool2d((1, 5), stride=(1, 5)),
            torch.square(),  # raises the TypeError below
            # layer1
            nn.Conv2d(in_channels=in_channels, out_channels=32, kernel_size=5),
        )
To an experienced PyTorch user this surely makes no sense.
Indeed, the error is quite clear:
TypeError: square() missing 1 required positional arguments: "input"
How can I feed the tensor from the preceding layer into square?

You can't put a plain PyTorch function in an nn.Sequential pipeline; every element needs to be an nn.Module.
You could wrap it like this:
class Square(nn.Module):
    def forward(self, x):
        return torch.square(x)
Then use it inside your sequential layer like so:
class CNNClassifier(Classifier):  # nn.Module
    def __init__(self, in_channels):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.MaxPool2d((1, 5), stride=(1, 5)),
            Square(),
            nn.Conv2d(in_channels=in_channels, out_channels=32, kernel_size=5))
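
A self-contained sketch of the wrapper in use. The Classifier base class from the question is not available here, so the pipeline is built directly as an nn.Sequential; the single input channel and the 16x50 input size are assumptions chosen only for illustration.

```python
import torch
from torch import nn


class Square(nn.Module):
    """Wraps torch.square so it can sit inside nn.Sequential."""

    def forward(self, x):
        return torch.square(x)


pool = nn.MaxPool2d((1, 5), stride=(1, 5))
cnn = nn.Sequential(
    pool,
    Square(),
    nn.Conv2d(in_channels=1, out_channels=32, kernel_size=5),
)

x = torch.randn(2, 1, 16, 50)
y = cnn(x)

# The first two stages together compute the squared max-pool output.
pooled_squared = cnn[:2](x)
```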

why Netron render BatchNorm2d layer as bias on my model?

Below is my demo code, which simply shows that I have written a batch-norm layer. When I export the model to an ONNX file and render the network in Netron, the BN layer is missing. And although I disabled the bias on the convolutions, a bias still shows up.
After a few modifications to the code, I confirmed that the bias shown in Netron is the BN layer: when I delete the BN layer and keep the bias disabled, the B section disappears.
Netron correctly renders models I downloaded from the internet, so it can't be the app's problem. What is wrong with my code?
import torch
from torch import nn
class myModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(3, 20, 3, stride=2, bias=False),
            nn.BatchNorm2d(20),
            nn.ReLU(),
            nn.Conv2d(20, 40, 3, stride=2, bias=False),
            nn.BatchNorm2d(40),
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(1000, 8),  # 24x24x3 -> 11x11x20 -> 5x5x40 = 1000
        )
    def forward(self, x):
        return self.layers(x)

m = myModel()
torch.onnx.export(m, (torch.ones(1, 3, 24, 24),), 'test.onnx')
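Here is the capture: BatchNorm has disappeared and a bias shows up.
When I delete all the conv layers, the batch norm shows:
It's a version-specific problem. If I switch the order of the BN and ReLU layers, the BN layer renders normally. A likely explanation is that the exporter folds a BatchNorm2d that directly follows a convolution into that convolution's weight and bias, so it appears in Netron as the B input.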
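
One plausible reason a bias appears on a bias-free convolution is that, in inference mode, a batch norm directly after a convolution is an affine map per channel and can be folded into the convolution. The sketch below does that folding by hand and checks that the result matches; it illustrates the arithmetic only and is not a claim about what any particular exporter version does. The random running statistics are made up for the demonstration.

```python
import torch
from torch import nn

torch.manual_seed(0)

conv = nn.Conv2d(3, 20, 3, stride=2, bias=False)
bn = nn.BatchNorm2d(20)

with torch.no_grad():
    # Give the batch norm non-trivial statistics (arbitrary values).
    bn.running_mean.uniform_(-1.0, 1.0)
    bn.running_var.uniform_(0.5, 2.0)
    bn.weight.uniform_(0.5, 1.5)
    bn.bias.uniform_(-1.0, 1.0)

conv.eval()
bn.eval()

fused = nn.Conv2d(3, 20, 3, stride=2, bias=True)
with torch.no_grad():
    # y = gamma * (conv(x) - mean) / sqrt(var + eps) + beta
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.copy_(conv.weight * scale.view(-1, 1, 1, 1))
    fused.bias.copy_(bn.bias - bn.running_mean * scale)

    x = torch.randn(1, 3, 24, 24)
    reference = bn(conv(x))
    folded = fused(x)
    max_diff = (reference - folded).abs().max().item()
```

The folded layer has a bias even though the original convolution had none, which matches what the capture shows. With a ReLU between the convolution and the batch norm the map is no longer affine, so this folding is not possible.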

Implementing a simple ResNet block with PyTorch

I'm trying to implement the following ResNet block. A ResNet consists of blocks with two convolutional layers and a skip connection. For some reason my block does not add the output of the skip connection (or the input, when no skip convolution is applied) to the output of the convolutional layers.
The ResNet block has:
- Two convolutional layers with:
  - 3x3 kernel
  - no bias terms
  - padding with one pixel on both sides
  - 2d batch normalization after each convolutional layer
- The skip connection:
  - simply copies the input if the resolution and the number of channels do not change.
  - if either the resolution or the number of channels change, the skip connection should have one convolutional layer with:
    - 1x1 convolution without bias
    - change of the resolution with stride (optional)
    - different number of input channels and output channels (optional)
  - the 1x1 convolutional layer is followed by 2d batch normalization.
- The ReLU nonlinearity is applied after the first convolutional layer and at the end of the block.
My code:
import torch.nn as nn
import torch.nn.functional as F

class Block(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        """
        Args:
          in_channels (int):  Number of input channels.
          out_channels (int): Number of output channels.
          stride (int):       Controls the stride.
        """
        super(Block, self).__init__()
        self.skip = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.skip = nn.Sequential(
                nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels))
        else:
            self.skip = None
        self.block = nn.Sequential(
            # stride is applied here so the resolution matches the skip path
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=3, padding=1, stride=stride, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            # the second convolution receives out_channels from the first one
            nn.Conv2d(in_channels=out_channels, out_channels=out_channels, kernel_size=3, padding=1, stride=1, bias=False),
            nn.BatchNorm2d(out_channels))

    def forward(self, x):
        out = self.block(x)
        if self.skip is not None:
            out = self.skip(x)
        else:
            out = x
        out += x
        out = F.relu(out)
        return out

The problem is in the reuse of the out variable: the if/else overwrites the result of self.block(x), so the convolution output is discarded before the addition. Normally, you'd implement it like this:
    def forward(self, x):
        identity = x
        out = self.block(x)
        if self.skip is not None:
            identity = self.skip(x)
        out += identity
        out = F.relu(out)
        return out
If you like "one-liners":
    def forward(self, x):
        out = self.block(x)
        out += (x if self.skip is None else self.skip(x))
        out = F.relu(out)
        return out
If you really like one-liners (please, that is too much, do not choose this option :)):
    def forward(self, x):
        return F.relu(self.block(x) + (x if self.skip is None else self.skip(x)))
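
Putting the specification and the corrected forward together, a complete block might look like the sketch below. It assumes the stride is applied in the first 3x3 convolution so that the main path and the skip path end up at the same resolution; the input sizes in the usage lines are arbitrary.

```python
import torch
from torch import nn
import torch.nn.functional as F


class Block(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        if stride != 1 or in_channels != out_channels:
            # Projection shortcut: 1x1 conv without bias, then batch norm.
            self.skip = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )
        else:
            self.skip = None  # identity shortcut
        self.block = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1,
                      stride=stride, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1,
                      bias=False),
            nn.BatchNorm2d(out_channels),
        )

    def forward(self, x):
        identity = x if self.skip is None else self.skip(x)
        # Out-of-place add keeps the input tensor untouched.
        return F.relu(self.block(x) + identity)


x = torch.randn(2, 8, 16, 16)
same = Block(8, 8)(x)              # identity shortcut
down = Block(8, 16, stride=2)(x)   # projection shortcut, halves resolution
```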

Initialize weights and bias in torch

What is the PyTorch equivalent of the Keras code below?
Dense(64, kernel_initializer='he_normal', bias_initializer='zeros', name='uhat_digitcaps')(d5)
How do I initialize the weights and bias?

import torch.nn as nn

class Net(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.linear = nn.Linear(in_channels, 64)
        # Keras he_normal scales by fan_in, which is also kaiming_normal_'s default mode
        nn.init.kaiming_normal_(self.linear.weight, mode='fan_in')
        nn.init.constant_(self.linear.bias, 0)
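
A runnable sketch that applies the initialization and checks the result numerically. It assumes mode='fan_in', so that the weight standard deviation is sqrt(2 / fan_in) as in Keras he_normal; Keras draws from a truncated normal, so the match is in scale rather than in the exact distribution. The layer width of 256, the seed, and the forward method are arbitrary choices for the demonstration.

```python
import math

import torch
from torch import nn


class Net(nn.Module):
    def __init__(self, in_features):
        super().__init__()
        self.linear = nn.Linear(in_features, 64)
        # He (Kaiming) normal init: std = sqrt(2 / fan_in)
        nn.init.kaiming_normal_(self.linear.weight, mode='fan_in',
                                nonlinearity='relu')
        nn.init.zeros_(self.linear.bias)

    def forward(self, x):
        return self.linear(x)


torch.manual_seed(0)
net = Net(256)

expected_std = math.sqrt(2.0 / 256)
actual_std = net.linear.weight.std().item()
bias_is_zero = bool(torch.all(net.linear.bias == 0))
```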
