I'm reading Maxim Lapan's Deep Reinforcement Learning Hands-On. I came across this code in chapter 2 and I don't understand a few things. Could anybody explain why print(out) gives three values instead of the single float tensor we put in? Also, why is the super function necessary here? Finally, what is the x parameter that forward is accepting? Thank you.
import torch
import torch.nn as nn

class OurModule(nn.Module):
    def __init__(self, num_inputs, num_classes, dropout_prob=0.3):  # init
        super(OurModule, self).__init__()  # Call OurModule and pass the net instance (Why is this necessary?)
        self.pipe = nn.Sequential(  # net.pipe is the nn object now
            nn.Linear(num_inputs, 5),
            nn.ReLU(),
            nn.Linear(5, 20),
            nn.ReLU(),
            nn.Linear(20, num_classes),
            nn.Dropout(p=dropout_prob),
            nn.Softmax(dim=1)
        )

    def forward(self, x):  # override the default forward method by passing it our net instance and (return the nn object?). x is the tensor? This is called when 'net' receives a param?
        return self.pipe(x)

if __name__ == "__main__":
    net = OurModule(num_inputs=2, num_classes=3)
    print(net)
    v = torch.FloatTensor([[2, 3]])
    out = net(v)
    print(out)  # [2, 3] put through the forward method of the nn? Why did we get a third param for the output?
    print("Cuda's availability is %s" % torch.cuda.is_available())  # find out if a GPU is available
    if torch.cuda.is_available():
        print("Data from cuda: %s" % out.to('cuda'))

OurModule.__mro__
OurModule defines a PyTorch nn.Module that accepts 2 inputs (num_inputs) and produces 3 outputs (num_classes).
It consists of:
A Linear layer that accepts 2 inputs and produces 5 outputs
A ReLU
A Linear layer that accepts 5 inputs and produces 20 outputs
A ReLU
A Linear layer that accepts 20 inputs and produces 3 (num_classes) outputs
A Dropout layer
A Softmax layer
You create v, which consists of 2 inputs, and pass it through this network's forward() method when you call net(v). The result of running this network (3 outputs, one per class) is then stored in out.
In your example, x takes on the value of v, torch.FloatTensor([[2, 3]]).
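If it helps to make that concrete, here is a quick check you could add (my own sketch, not from the book):

    print(out.shape)         # torch.Size([1, 3]): one row per input sample, one column per class
    print(out.sum().item())  # approximately 1.0, because of the Softmax at the end of the pipe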
Although @JoshVarty has provided a great answer, I would like to add a little bit.
why is the super function necessary here
The class OurModule inherits from nn.Module. The super call means you want to use the parent's (nn.Module) method, namely __init__. You can refer to the source code to see what exactly the parent class does in its __init__ function.
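As a made-up illustration (not from the book): nn.Module.__init__ sets up the machinery that registers submodules and parameters, so if you skip the super call, assigning a layer to self fails.

    import torch.nn as nn

    class Broken(nn.Module):
        def __init__(self):
            # no super().__init__() call here
            self.layer = nn.Linear(2, 3)

    class Works(nn.Module):
        def __init__(self):
            super().__init__()            # nn.Module.__init__ runs first
            self.layer = nn.Linear(2, 3)  # registered as a submodule, its parameters are tracked

    # Broken() raises: AttributeError: cannot assign module before Module.__init__() call
    # Works() is fine; list(Works().parameters()) shows the Linear's weight and bias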
I built a simple PyTorch model as below. However, I receive an error message that the mat1 and mat2 shapes are not aligned. How do I tweak the code to allow the flexibility of different data dimensions?
import torch
import torch.nn as nn

class simpleNet(nn.Module):
    def __init__(self, input_dim, hidden_size, num_classes):
        """
        :param input_dim: input feature dimension
        :param hidden_size: hidden dimension
        :param num_classes: total number of classes
        """
        super(simpleNet, self).__init__()
        # hidden layer
        self.hidden = nn.Linear(input_dim, hidden_size)
        # second fully connected layer that outputs the num_classes class scores
        self.output = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out = None
        x = self.hidden(x)
        x = torch.sigmoid(x)
        x = self.output(x)
        out = x
        return out
I am just trying to build a toy neural network using PyTorch.
For your neural network to work, the output size of each layer must equal the input size of the next layer. Since this is just a snippet of your architecture without the initialization code, I cannot tell what you can simplify, but not having matching sizes at each transition is the underlying problem (and not good practice). As a brute-force method, you can use the reshape function from torch to make the output of the previous layer match the input of the next layer. Refer to: https://pytorch.org/docs/stable/generated/torch.reshape.html
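As a tiny sketch with made-up sizes (8 input features, 16 hidden units, 3 classes), just to illustrate the "output of one layer must match the input of the next" rule:

    import torch
    import torch.nn as nn

    net = nn.Sequential(nn.Linear(8, 16), nn.Sigmoid(), nn.Linear(16, 3))

    x = torch.randn(4, 8)     # batch of 4 samples with 8 features each, matching Linear(8, 16)
    y = net(x)                # works; y has shape (4, 3)

    # x = torch.randn(4, 10)  # 10 features would not match Linear(8, 16) and would raise
    #                         # the "mat1 and mat2 shapes cannot be multiplied" error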
So I am new to deep learning and started learning PyTorch. I created a classifier model with the following structure.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class model(nn.Module):
    def __init__(self):
        super(model, self).__init__()
        resnet = models.resnet34(pretrained=True)
        layers = list(resnet.children())[:8]
        self.features1 = nn.Sequential(*layers[:6])
        self.features2 = nn.Sequential(*layers[6:])
        self.classifier = nn.Sequential(nn.BatchNorm1d(512), nn.Linear(512, 3))

    def forward(self, x):
        x = self.features1(x)
        x = self.features2(x)
        x = F.relu(x)
        x = nn.AdaptiveAvgPool2d((1, 1))(x)
        x = x.view(x.shape[0], -1)
        return self.classifier(x)
So basically I wanted to classify among three classes {0, 1, 2}. While evaluating, I passed an image through the model and it returned a tensor with three values, like below:
tensor([[-0.1526, 1.3511, -1.0384]], device='cuda:0', grad_fn=<AddmmBackward>)
So my question is: what are these three numbers? Are they probabilities?
P.S. Please pardon me if I asked something too silly.
The final nn.Linear layer (fully connected layer) of your model's self.classifier produces values that we can call scores; for example, they might be [10.3, -3.5, -12.0], just like in your example: [-0.1526, 1.3511, -1.0384]. These scores are not normalized and cannot be interpreted as probabilities.
As you can see, it's just a kind of raw, unscaled network output. In other words, these values are not normalized, and it's hard to use them or interpret the results. That's why the common practice is to convert them to a normalized probability distribution by applying softmax after the final layer, as @skinny_func has already described. After that you get probabilities in the range [0, 1], which is a more intuitive representation.
So after training, what you want to do is apply softmax to the output tensor to extract the probability of each class, and then choose the class with the maximal value (highest probability).
in your case:
prob = torch.nn.functional.softmax(model(x), dim=1)
_, pred_class = torch.max(prob, dim=1)
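For the concrete scores from the question, a quick sanity check (the numbers below are my approximate calculation) would look like this:

    import torch

    scores = torch.tensor([[-0.1526, 1.3511, -1.0384]])
    prob = torch.nn.functional.softmax(scores, dim=1)
    print(prob)                    # roughly tensor([[0.17, 0.76, 0.07]]), which sums to 1
    print(torch.max(prob, dim=1))  # the maximal probability and its index, here class 1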
I'm trying to develop a layer in Keras which works with 3D tensors. To make it flexible, I would like to postpone the code that relies on the input's exact shape as much as possible.
My layer is overriding 5 methods:
from tensorflow.python.keras.layers import Layer
from tensorflow.python.keras import backend as K

class MyLayer(Layer):
    def __init__(self, **kwargs):
        pass

    def build(self, input_shape):
        pass

    def call(self, inputs, verbose=False):
        second_dim = K.int_shape(inputs)[-2]
        # Do something with the second_dim

    def compute_output_shape(self, input_shape):
        pass

    def get_config(self):
        pass
And I'm using this layer like this:
input = Input(batch_shape=(None, None, 128), name='input')
x = MyLayer(name='my_layer')(input)
model = Model(input, x)
But I'm facing an error, since second_dim is None. How can I develop a layer that relies on the dimensions of the input while accepting that those dimensions are only provided by the actual data at run time, not by the input layer?
I ended up asking the same question differently, and I've got a perfect answer:
What is the right way to manipulate the shape of a tensor when there are unknown elements in it?
The gist of it is: don't work with the dimension values directly. Use them by reference, not by value. So do not use K.int_shape; use K.shape instead, and use Keras operations to compose the new shape:
shape = K.shape(x)
newShape = K.concatenate([
    shape[0:1],
    shape[1:2] * shape[2:3],
    shape[3:4]
])
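If it helps, this is how the dynamically computed shape might then be used (assuming x is a 4D tensor, since the snippet reads four dimensions out of it):

    # collapses dimensions 1 and 2 into one, keeping the batch and last dimensions
    x_reshaped = K.reshape(x, newShape)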
I was following this method (https://discuss.pytorch.org/t/dynamic-parameter-declaration-in-forward-function/427) to dynamically assign parameters in the forward function.
However, my parameter is not just a single weight tensor but an nn.Sequential.
When I implement the code below:
class MyModule(nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        # you need to register the parameter names earlier
        self.register_parameter('W_di', None)

    def forward(self, input):
        if self.W_di is None:
            self.W_di = nn.Sequential(
                nn.Linear(mL_n * 2, 1024),
                nn.ReLU(),
                nn.Linear(1024, self.hS)).to(device)
I get the following error.
TypeError: cannot assign 'torch.nn.modules.container.Sequential' as parameter 'W_di' (torch.nn.Parameter or None expected)
Is there any way that I can register an nn.Sequential as a whole as a parameter? Thanks!
If you or other users still have this problem, one solution to consider is using nn.ModuleList instead of nn.Sequential.
While nn.Sequential is useful for defining a fixed sequence of layers in PyTorch, nn.ModuleList is a more flexible container that allows direct access and modification of individual layers within the list. This can be especially helpful when dealing with dynamic models or architectures that require more complex layer arrangements.
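A rough sketch of that idea, with made-up layer sizes (and the usual caveat that layers created inside forward are not yet known to an optimizer built beforehand):

    import torch
    import torch.nn as nn

    class LazyBlock(nn.Module):
        def __init__(self):
            super().__init__()
            self.layers = nn.ModuleList()          # starts empty, filled on the first forward pass

        def forward(self, x):
            if len(self.layers) == 0:
                self.layers.append(nn.Linear(x.shape[-1], 1024))
                self.layers.append(nn.ReLU())
                self.layers.append(nn.Linear(1024, 16))
                self.layers.to(x.device)           # keep the new layers on the input's device
            for layer in self.layers:
                x = layer(x)
            return x

    block = LazyBlock()
    print(block(torch.randn(2, 8)).shape)          # torch.Size([2, 16])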
My gut feeling is that you cannot do it. Even in a static model declaration, nn.Module registers the parameters of every sub-module (e.g., nn.Conv2d or nn.Linear) in a nested way; that is, every kernel and bias is registered one by one and independently.
One workaround might be to introduce dynamic sub-modules. Here is my brief implementation. One can define the desired dynamic behavior inside the DynamicLinear module.
import torch
import torch.nn as nn

class DynamicLinear(nn.Module):
    def __init__(self):
        super(DynamicLinear, self).__init__()
        # you need to register the parameter names earlier
        self.register_parameter('W_di', None)

    def forward(self, x):
        if self.W_di is None:
            # dynamically define a linear function here, on the same device as the input
            self.W_di = nn.Parameter(torch.ones(1, 1, device=x.device))
        return self.W_di @ x

class MyModule(nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        self.net = nn.Sequential(
            DynamicLinear(),
            nn.ReLU(),
            DynamicLinear())

    def forward(self, x):
        return self.net(x)

m = MyModule()
x = torch.ones(1, 1)
y = m(x)
# output: 1
print(y)
I like using torch.nn.Sequential as in
self.conv_layer = torch.nn.Sequential(
    torch.nn.Conv1d(196, 196, kernel_size=15, stride=4),
    torch.nn.Dropout()
)
But when I want to add a recurrent layer such as torch.nn.GRU, it won't work, because the output of recurrent layers in PyTorch is a tuple and you need to choose which part of the output you want to process further.
So is there any way to get
self.rec_layer = nn.Sequential(
    torch.nn.GRU(input_size=2, hidden_size=256),
    torch.nn.Linear(in_features=256, out_features=1)
)
to work? For this example, let's say I want to feed torch.nn.GRU(input_size=2, hidden_size=256)(x)[1][-1] (the last hidden state of the last layer) into the following Linear layer.
I made a module called SelectItem to pick out an element from a tuple or list
class SelectItem(nn.Module):
    def __init__(self, item_index):
        super(SelectItem, self).__init__()
        self._name = 'selectitem'
        self.item_index = item_index

    def forward(self, inputs):
        return inputs[self.item_index]
SelectItem can be used in Sequential to pick out the hidden state:
net = nn.Sequential(
    nn.GRU(dim_in, dim_out, batch_first=True),
    SelectItem(1)
)
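Building on that, a hedged sketch of what the question originally asked for (feeding the last layer's final hidden state into a Linear) could use a slightly more specific module:

    class SelectLastHidden(nn.Module):
        def forward(self, inputs):
            output, h_n = inputs          # the GRU returns a (output, h_n) tuple
            return h_n[-1]                # final hidden state of the last layer, shape (batch, hidden_size)

    net = nn.Sequential(
        nn.GRU(input_size=2, hidden_size=256, batch_first=True),
        SelectLastHidden(),
        nn.Linear(in_features=256, out_features=1)
    )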