Convolutional Neural Network in PyTorch with custom data

I am trying to create a CNN in PyTorch to classify three dancers from skeleton data.
The dataset is split into 3000 samples, each with 50 frames of 72 joint position values.
I want to interpret each sample like an image and therefore use a CNN for classification, but I am not sure how to use the DataLoader. In this link there is an example of how to train a CNN for classification, but the DataLoader there uses a pre-configured dataset, and I am not sure how I should format my custom data for the "trainset" argument in the call:
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=False, num_workers=1)
My input data consists of 3000 data points, each with 50 frames of 72 joint positions.
My labels are a vector of length 3000, where each entry is 0, 1, or 2 for the three different dancers.
I hope someone can help.

You need to create a class that inherits from torch.utils.data.Dataset. In short, the class needs an __init__ method that sets up the necessary attributes, a __getitem__ method that returns the data point (optionally with its label) at a given index, and a __len__ method that returns the number of data points in your dataset.
Here is a template of how to create a customized dataset.
from torch.utils import data

class MyDataset(data.Dataset):
    def __init__(self, root):
        self.root = root
        self.dset = ...  # load your data from root here

    def __getitem__(self, index):
        return self.dset[index]

    def __len__(self):
        return len(self.dset)
You may also like to refer to this real implementation, where I customize a dataset that uses GTA5 images for a semantic segmentation task.
Lastly, just treat it as a pre-configured dataset. For example,
trainset = MyDataset(root)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=False, num_workers=1)
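Applied to the skeleton data in the question, a minimal sketch could look like this. The names joint_data and labels are assumptions for the 3000 x 50 x 72 array and the length-3000 label vector as NumPy arrays, and a channel dimension is added so a CNN can treat each sample as a 1-channel image:

import torch
from torch.utils import data

class DancerDataset(data.Dataset):
    def __init__(self, joint_data, labels):
        # joint_data: (3000, 50, 72) array, labels: (3000,) array of 0/1/2
        self.joint_data = torch.from_numpy(joint_data).float()
        self.labels = torch.from_numpy(labels).long()

    def __getitem__(self, index):
        # unsqueeze adds a channel dimension: each sample becomes (1, 50, 72)
        return self.joint_data[index].unsqueeze(0), self.labels[index]

    def __len__(self):
        return len(self.labels)

trainset = DancerDataset(joint_data, labels)
trainloader = data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=1)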

Related

How to handle different sizes of input data using a PyTorch built-in neural network

I built a simple PyTorch model as below. However, I receive an error message that the mat1 and mat2 sizes are not aligned. How do I tweak the code to allow for flexibility in the dimensions of the data?
class simpleNet(nn.Module):
    def __init__(self, input_dim, hidden_size, num_classes):
        """
        :param input_dim: input feature dimension
        :param hidden_size: hidden dimension
        :param num_classes: total number of classes
        """
        super(simpleNet, self).__init__()
        # hidden layer
        self.hidden = nn.Linear(input_dim, hidden_size)
        # second fully connected layer that outputs the class scores
        self.output = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        x = self.hidden(x)
        x = torch.sigmoid(x)
        x = self.output(x)
        out = x
        return out
I am trying to build a toy neural network using PyTorch.
For your neural network to work, the output size of each layer must equal the input size of the next layer. Since this is a snippet of just your architecture without the initialization code, I cannot tell what you can simplify, but mismatched sizes between consecutive layers will not work. As a brute-force fix, you can use torch's reshape function to make the output of the previous layer match the input of the next, as sketched below. Refer to: https://pytorch.org/docs/stable/generated/torch.reshape.html
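As an illustration of that brute-force approach (a sketch with made-up dimensions, not the asker's actual data), flattening the input inside forward so the feature count always matches the first nn.Linear:

import torch
import torch.nn as nn

class simpleNet(nn.Module):
    def __init__(self, input_dim, hidden_size, num_classes):
        super(simpleNet, self).__init__()
        self.hidden = nn.Linear(input_dim, hidden_size)
        self.output = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # flatten everything except the batch dimension so the
        # feature count matches input_dim of the hidden layer
        x = torch.reshape(x, (x.shape[0], -1))
        x = torch.sigmoid(self.hidden(x))
        return self.output(x)

# e.g. a batch of 4 samples of shape (5, 8), flattened to 40 features
model = simpleNet(input_dim=40, hidden_size=16, num_classes=10)
out = model(torch.randn(4, 5, 8))  # out.shape == (4, 10)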

How to create a keras model that depends dynamically on the input dimension (not batch size)?

If you use the Keras subclass API and you want to spawn a number of layers (n) depending on the input dimension x = (batch_dim, n), is there a way to do this inside the build method?
Or is the only way to pass the input dim into the model at init time so the layers can be created within the init scope?
UPDATE: pseudo-code (untested) example
class BigModel(tf.keras.models.Model):
    def __init__(self):
        super().__init__()
        self.my_submodels = list()

    def build(self, input_shape):
        for i in range(input_shape[1]):
            self.my_submodels.append(MyModel(param=i))

    def call(self, *inputs):
        stuff = list()
        for submodel in self.my_submodels:
            stuff.append(submodel(*inputs))
        # do something amazing with all the models
        fan_in = ...  # combine
        return fan_in
You could probably rewrite the whole structure in a more vectorized way, using one model with a lot of splits, but it would be harder to read and deal with, and I think the new TF 2.0 allows this kind of dynamism without any cost penalty.
Yes, instead of using batch_shape=(batch_size, input_dim), use batch_shape=(None, input_dim), which allows an arbitrary batch_size.
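As a rough runnable sketch of the build-time pattern from the question (the Dense sublayers here stand in for the hypothetical MyModel submodels): layers are created in build() based on the last input dimension, then combined in call().

import tensorflow as tf

class BigModel(tf.keras.models.Model):
    def build(self, input_shape):
        # one sublayer per input feature; input_shape is known at this point
        self.my_sublayers = [tf.keras.layers.Dense(4) for _ in range(input_shape[-1])]

    def call(self, inputs):
        outputs = [layer(inputs) for layer in self.my_sublayers]
        return tf.add_n(outputs)  # combine, e.g. by summing

model = BigModel()
y = model(tf.random.normal((2, 3)))  # build() sees input_shape == (2, 3)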

How to get confusion matrix when using model.fit_generator

I am using model.fit_generator to train my binary (two-class) model because I feed input images directly from folders. How do I get the confusion matrix (TP, TN, FP, FN) in this case? Normally I use the confusion_matrix command of sklearn.metrics, which requires the predicted and actual labels, but here I have neither. Maybe I can calculate the predicted labels from predict = model.predict_generator(validation_generator), but I don't know how my model takes the input labels from my images. The general structure of my input folder is:
train/
    class1/
        img1.jpg
        img2.jpg
        ........
    class2/
        IMG1.jpg
        IMG2.jpg
test/
    class1/
        img1.jpg
        img2.jpg
        ........
    class2/
        IMG1.jpg
        IMG2.jpg
        ........
and some blocks of my code are:
train_generator = train_datagen.flow_from_directory(
    'train',
    target_size=(50, 50), batch_size=batch_size,
    class_mode='binary', color_mode='grayscale')

validation_generator = test_datagen.flow_from_directory(
    'test',
    target_size=(50, 50), batch_size=batch_size,
    class_mode='binary', color_mode='grayscale')

model.fit_generator(
    train_generator, steps_per_epoch=250, epochs=40,
    validation_data=validation_generator,
    validation_steps=21)
So the above code automatically takes the two class inputs, but I don't know which it considers class 0 and which class 1.
I've managed it in the following way, using keras.utils.Sequence.
import numpy as np
from sklearn.metrics import confusion_matrix
from keras.utils import Sequence

class MySequence(Sequence):
    def __init__(self, *args, **kwargs):
        # initialize; see the manual on implementing the methods
        ...

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        # return the index-th complete batch as an (x, y) tuple
        ...

# create the data generator
data_gen = MySequence(evaluation_set, batch_size=10)
n_batches = len(data_gen)

confusion_matrix(
    np.concatenate([np.argmax(data_gen[i][1], axis=1) for i in range(n_batches)]),
    np.argmax(m.predict_generator(data_gen, steps=n_batches), axis=1)
)
The implemented class returns batches of data as tuples, which avoids holding all of them in RAM. Please note that this must be implemented in __getitem__, and that this method must return the same batch for the same argument.
Unfortunately this code iterates over the data twice: the first time it builds the array of true answers from the returned batches, and the second time it calls the model's predict method.
probabilities = model.predict_generator(generator=test_generator)
will give us the set of probabilities.
y_true = test_generator.classes
will give us the true labels (note that for the order of predictions to match, the generator should be created with shuffle=False).
Because this is a binary classification problem, you have to derive the predicted labels from the probabilities. To do that you can use
y_pred = probabilities > 0.5
Then we have the true labels and predicted labels for the test dataset, so the confusion matrix is given by
import matplotlib
from sklearn.metrics import confusion_matrix
from mlxtend.plotting import plot_confusion_matrix  # assumed source of the conf_mat-style helper used below

font = {
    'family': 'Times New Roman',
    'size': 12
}
matplotlib.rc('font', **font)

mat = confusion_matrix(y_true, y_pred)
plot_confusion_matrix(conf_mat=mat, figsize=(8, 8), show_normed=False)
You can view the mapping from class names to class indices through the class_indices attribute of your train_generator or validation_generator objects, as in
train_generator.class_indices
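For the folder layout in the question, this would typically print a mapping like the following (flow_from_directory assigns indices alphabetically by subdirectory name):

print(train_generator.class_indices)  # e.g. {'class1': 0, 'class2': 1}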

How to use the PyTorch DataLoader with a 3-D matrix for LSTM input?

I have a dataset stored as a 3-D (time_step * inputsize * total_num) matrix in a .mat file. I want to use the DataLoader to build an input dataset for an LSTM with a batch_size of 5. My code is as follows:
file_path = "…/database/frameLength100/notOverlap/a.mat"
mat_data = s.loadmat(file_path)
tensor_data = torch.from_numpy(mat_data['a'])  # Tensor

class CustomDataset(Dataset):
    def __init__(self, tensor_data):
        self.tensor_data = tensor_data

    def __getitem__(self, index):
        data = self.tensor_data[index]
        label = 1
        return data, label

    def __len__(self):
        return len(self.tensor_data)

custom_dataset = CustomDataset(tensor_data=tensor_data)
train_loader = DataLoader(dataset=custom_dataset, batch_size=5, shuffle=True)
I think the code is wrong but I have no idea how to correct it. What confuses me is how I can make the DataLoader understand which dimension is 'total_num' so that I get batches of size 5.
If I understand correctly, you want the batching to happen along the total_num dimension, i.e. dimension 2.
You could simply use that dimension to index your dataset, i.e. change __getitem__ to data = self.tensor_data[:, :, index] and, accordingly, make __len__ return self.tensor_data.size(2) instead of len(self.tensor_data). Each batch will then have shape [5, time_step, inputsize], since the DataLoader stacks samples along a new first dimension; see the sketch below.
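A minimal runnable sketch of the corrected dataset, with random dummy data standing in for the .mat file contents:

import torch
from torch.utils.data import Dataset, DataLoader

class CustomDataset(Dataset):
    def __init__(self, tensor_data):
        # tensor_data has shape (time_step, inputsize, total_num)
        self.tensor_data = tensor_data

    def __getitem__(self, index):
        # index along the total_num dimension
        data = self.tensor_data[:, :, index]
        label = 1
        return data, label

    def __len__(self):
        return self.tensor_data.size(2)

tensor_data = torch.randn(100, 20, 35)  # dummy (time_step, inputsize, total_num)
train_loader = DataLoader(CustomDataset(tensor_data), batch_size=5, shuffle=True)
data, labels = next(iter(train_loader))
print(data.shape)  # torch.Size([5, 100, 20])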

Why do I get the same prediction for all training samples?

I have a neural network with num_labels separate outputs where each output consists of a softmax layer with two nodes (Yes/No).
I am taking the output of a convolution_layer and feeding it as input to a simple softmax_layer, which I in turn feed into each of said outputs:
softmax_layer = Dense(num_labels, activation='softmax', name='softmax_layer')(convolution_layer)

outputs = list()
for i in range(num_labels):
    out_y = Dense(2, activation='softmax', name='out_{:d}'.format(i))(softmax_layer)
    outputs.append(out_y)
So far I was able to train the model by providing a list of training samples, but now I have noticed that I get the exact same output for completely different samples in a batch:
Please note: here, each column consists of (2,1) arrays. Each column is the prediction for one sample.
I've checked the samples; they are different. I've also tried, e.g., feeding the convolution_layer directly into the outputs. In that case the predictions are different. I only see this outcome if I do it the way shown above.
I could live with the outputs being merely "similar"; in that case I'd think the network is just not learning what I want it to learn. But since they are exactly the same, I am not quite sure what the problem is.
I've tried something similar with a simple feed forward network:
class FeedForward:
    def __init__(self, input_dim, nb_classes):
        in_x = Input(shape=(input_dim, ), name='in_x')
        h1 = Dense(14, name='h1', activation='relu')(in_x)
        h2 = Dense(8, name='h2', activation='relu')(h1)
        out = Dense(nb_classes, name='out', activation='softmax')(h2)
        self.model = Model(input=[in_x], output=[out])

    def compile_model(self, optimizer='adam', loss='binary_crossentropy'):
        self.model.compile(optimizer=optimizer, loss=loss, metrics=["accuracy"])
But it behaves similarly. I can't imagine it's due to imbalanced data. There are 13 classes, and while there is some imbalance, it's not as if one class holds 90% of the mass.
Am I doing this right?
