Pytorch lightning validation set has different image sizes than training set - pytorch

When i try to train a cnn, I get different shapes for the same dataloader and i dont know why. This is the output of the shapes I feed into the model:
You can see that my validation shape is [batch size, 1, image height and width]. for some reason, the image size gets changed in the last step and the batch size is 1. The same happens when I use the sanity check from pytorch lightning beforehand, which ive disabled for now. This is how the pytorch lightning data module looks like which gets the dataloader:
class MRIDataModule(pl.LightningDataModule):
def __init__(self, batch_size, data_paths):
super().__init__()
self.batch_size = batch_size
self.data_paths = data_paths
self.train_set = None
self.val_set = None
def setup(self, stage=None):
loader = get_data_loader()
self.train_set = loader(self.data_paths['train_dir'], transform=None, dimension=DIMENSION, nslice=NSLICE)
self.val_set = loader(self.data_paths['val_dir'], transform=None, dimension=DIMENSION, nslice=NSLICE)
def train_dataloader(self):
return DataLoader(self.train_set, batch_size=self.batch_size, num_workers=NUM_WORKERS, shuffle=True)
def val_dataloader(self):
return DataLoader(self.val_set, batch_size=self.batch_size, num_workers=NUM_WORKERS, shuffle=False)
here is the full code and the print statements are directly from the forward function of my model:
https://colab.research.google.com/drive/1yfbCZlwNMqaW1egaTF8HHRD4Ko8iMTxr?usp=sharing

I inspected you code and found the following:
def validation_epoch_end(self, val_step_outputs):
dummy_input = torch.zeros((1, 1, 150,150), device = device)
model_filename = CONFIG['MODEL'] + "-DIM" + str(CONFIG["DIMENSION"]) + "-model_final.onnx"
torch.onnx.export(self.net.eval(), dummy_input, model_filename)
This piece of code will be called every time your validation epoch is over. Which means that you will pass your dummy_input of size (1, 1, 150,150) to the model. That is why your are seeing a different images shape for the last validation step, than your batches coming from your dataloader

Related

how to handle different size of input data using Pytorch built in neural network

I build a simple pytorch model as below. However, I receive error message that mat1 and mat2 size are not aligned. How do I tweek the code to allow the flexibility of different dimension of data?
class simpleNet(nn.Module):
def __init__(self, **input_dim, hidden_size, num_classes**):
"""
:param input_dim: input feature dimension
:param hidden_size: hidden dimension
:param num_classes: total number of classes
"""
super(TwoLayerNet, self).__init__()
# hidden layer
self.hidden = nn.Linear(input_dim, hidden_size)
# Second fully connected layer that outputs our 10 labels
self.output = nn.Linear(hidden_size, num_classes)
def forward(self, x):
out = None
x = self.hidden(x)
x = torch.sigmoid(x)
x = self.output(x)
out = x
trying to build a toy neural network using Pytorch.
For your neural network to work, your output from your previous layer should be equal to your input for next layer, since its a code snippet for just your architecture without the initializations code, I cannot tell what you can simplify, not having equals in transition is not a good practice though. However, you can use reshape function from torch to make your output of previous layer equal to your next layer to make it work as a brute force method. Refer to: https://pytorch.org/docs/stable/generated/torch.reshape.html

Question about input dimension for conv2D-LSTM implement

I am a PyTorch beginner and would like to get help applying the conv2d-LSTM model.
I have a 2D image (1 channel x Time x Frequency) that contains time and frequency information.
I’d like to extract features automatically using conv2D and then LSTM model because 2D image contains time information
According to PyTorch documents, the output shape of conv2D is (Batch size, Channel out, Height out, Width out) and the input shape of LSTM is (Batch size, sequence length, input size). From that, I thought before input features of the LSTM network there need to reshape the output features of conv2D.
I expected the cnn-lstm model to perform well because it could learn the characteristics and time information of the image, but it did not get the expected performance.
My question is when I insert data into the LSTM model, is there any idea that LSTM learns the data by each row without flattening? Should I always flatten the 2D output?
My networks code and input/output shape are as follows. (I maintained the width size in the conv layer to preserve time information.)
Thanks a lot
class CNN_LSTM(nn.Module):
def __init__(self, paramArr1, paramArr2):
super(CNN_LSTM, self).__init__()
self.input_dim = paramArr2[0]
self.hidden_dim = paramArr2[1]
self.n_layers = paramArr2[2]
self.batch_size = paramArr2[3]
self.conv = nn.Sequential(
nn.Conv2d(1, out_channels=paramArr1[0],
kernel_size=(paramArr1[1],1),
stride=(paramArr1[2],1)),
nn.BatchNorm2d(paramArr1[0]),
nn.ReLU(),
nn.MaxPool2d(kernel_size = (paramArr1[3],1),stride=(paramArr1[4],1))
)
self.lstm = nn.LSTM(input_size = paramArr2[0],
hidden_size=paramArr2[1],
num_layers=paramArr2[2],
batch_first=True)
self.linear = nn.Linear(in_features=paramArr2[1], out_features=1)
def reset_hidden_state(self):
self.hidden = (
torch.zeros(self.n_layers, self.batch_size, self.hidden_dim).to(device),
torch.zeros(self.n_layers, self.batch_size, self.hidden_dim).to(device)
)
def forward(self, x):
x = self.conv(x)
x = x.view(x.size(0), x.size(1),-1)
x = x.permute(0,2,1)
out, (hn, cn) = self.lstm(x, self.hidden)
out = out.squeeze()[-1, :]
out = self.linear(out)
return out
model input/output shape

Pytorch Problem with Custom Dataset Class

First, I made a custom dataset to load in images from my dataframe (containing the image filepath and corresponding int label):
class Dataset(torch.utils.data.Dataset):
def __init__(self, dataframe, transform=None):
self.frame = dataframe
self.transform = transform
def __len__(self):
return len(self.frame)
def __getitem__(self, idx):
if torch.is_tensor(idx):
idx = idx.tolist()
filename = self.frame.iloc[idx, 0]
image = torch.from_numpy(io.imread(filename).transpose((2, 0, 1))).float()
label = self.frame.iloc[idx, 1]
sample = {'image': image, 'label': label}
if self.transform:
sample = self.transform(sample)
return sample
Then, I use pre-existing model architecture like so:
model = models.densenet161()
num_ftrs = model.classifier.in_features
model.classifier = nn.Linear(num_ftrs, 10) # where 10 is my number of classes
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
Finally, for training, I do the following:
model.train() # switch to train mode
for epoch in range(5):
for i, sample in enumerate(train_set): # where train_set is an instance of my Dataset class
optimizer.zero_grad()
image, label = sample['image'].unsqueeze(0), torch.Tensor(sample['label']).long()
output = model(image)
loss = criterion(output, label)
loss.backward()
optimizer.step()
However, I am experiencing errors with loss = criterion(output, label). It tells me that ValueError: Expected input batch_size (1) to match target batch_size (2).. Can someone teach me how to properly use a custom dataset, especially with loading in batches of data? Also, why am I experiencing that ValueError? Thank you!
please check the following lines:
label = self.frame.iloc[idx, 1] in dataset defination, you may print this to re-check, is this return two int
image, label = sample['image'].unsqueeze(0), torch.Tensor(sample['label']).long() in training code, you need to check the shape of the tensor

Visualize the output of Vgg16 model by TSNE plot?

I need to visualize the output of Vgg16 model which classify 14 different classes.
I load the trained model and I did replace the classifier layer with the identity() layer but it doesn't categorize the output.
Here is the snippet:
the number of samples here is 1000 images.
epoch = 800
PATH = 'vgg16_epoch{}.pth'.format(epoch)
checkpoint = torch.load(PATH)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
epoch = checkpoint['epoch']
class Identity(nn.Module):
def __init__(self):
super(Identity, self).__init__()
def forward(self, x):
return x
model.classifier._modules['6'] = Identity()
model.eval()
logits_list = numpy.empty((0,4096))
targets = []
with torch.no_grad():
for step, (t_image, target, classess, image_path) in enumerate(test_loader):
t_image = t_image.cuda()
target = target.cuda()
target = target.data.cpu().numpy()
targets.append(target)
logits = model(t_image)
print(logits.shape)
logits = logits.data.cpu().numpy()
print(logits.shape)
logits_list = numpy.append(logits_list, logits, axis=0)
print(logits_list.shape)
tsne = TSNE(n_components=2, verbose=1, perplexity=10, n_iter=1000)
tsne_results = tsne.fit_transform(logits_list)
target_ids = range(len(targets))
plt.scatter(tsne_results[:,0],tsne_results[:,1],c = target_ids ,cmap=plt.cm.get_cmap("jet", 14))
plt.colorbar(ticks=range(14))
plt.legend()
plt.show()
here is what this script has been produced: I am not sure why I have all colors for each cluster!
The VGG16 outputs over 25k features to the classifier. I believe it's too much to t-SNE. It's a good idea to include a new nn.Linear layer to reduce this number. So, t-SNE may work better. In addition, I'd recommend you two different ways to get the features from the model:
The best way to get it regardless of the model is by using the register_forward_hook method. You may find a notebook here with an example.
If you don't want to use the register, I'd suggest this one. After loading your model, you may use the following class to extract the features:
class FeatNet (nn.Module):
def __init__(self, vgg):
super(FeatNet, self).__init__()
self.features = nn.Sequential(*list(vgg.children())[:-1]))
def forward(self, img):
return self.features(img)
Now, you just need to call FeatNet(img) to get the features.
To include the feature reducer, as I suggested before, you need to retrain your model doing something like:
class FeatNet (nn.Module):
def __init__(self, vgg):
super(FeatNet, self).__init__()
self.features = nn.Sequential(*list(vgg.children())[:-1]))
self.feat_reducer = nn.Sequential(
nn.Linear(25088, 1024),
nn.BatchNorm1d(1024),
nn.ReLU()
)
self.classifier = nn.Linear(1024, 14)
def forward(self, img):
x = self.features(img)
x_r = self.feat_reducer(x)
return self.classifier(x_r)
Then, you can run your model returning x_r, that is, the reduced features. As I told you, 25k features are too much for t-SNE. Another method to reduce this number is by using PCA instead of nn.Linear. In this case, you send the 25k features to PCA and then train t-SNE using the PCA's output. I prefer using nn.Linear, but you need to test to check which one you get a better result.

How to use pytorch DataLoader with a 3-D matrix for LSTM input?

I have a dataset of 3-D(time_stepinputsizetotal_num) matrix which is a .mat file. I want to use DataLoader to get a input dataset for LSTM which batch_size is 5. My code is as following:
file_path = "…/database/frameLength100/notOverlap/a.mat"
mat_data = s.loadmat(file_path)
tensor_data = torch.from_numpy(mat_data[‘a’]) #Tensor
class CustomDataset(Dataset):
def __init__(self, tensor_data):
self.tensor_data = tensor_data
def __getitem__(self, index):
data = self.tensor_data[index]
label = 1;
return data, label
def __len__(self):
return len(self.tensor_data)
custom_dataset = CustomDataset(tensor_data=tensor_data)
train_loader = DataLoader(dataset=custom_dataset, batch_size=5, shuffle=True)
I think the code is wrong but I have no idea how to correct it. What makes me confused is how how can I make DataLoader know which dimension is ‘total_num’ so that I get the dataset which batch size is 5.
If I understand correctly, you want the batching to happen along the total_num dimension, i. e. dimension 2.
You could simply use that the dimension to index your dataset, i.e. change __getitem__ to data = self.tensor_data[:, :, index], and accordingly in __len__, return self.tensor_data.size(2) instead of len(self.tensor_data). Each batch will then have size [time_step, inputsize, 5].

Resources