Issue implementing InceptionV3 with binary classifier - transfer learning with Pytorch - pytorch

I'm having an issue getting Inception V3 to work as the feature extractor with a binary classifier in Pytorch. I update the primary and auxiliary nets in Inception to have the binary class (as done in https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html)
but I'm getting an error
#Parameters for Inception V3
num_classes= 2
model_ft = models.inception_v3(pretrained=True)
# set_parameter_requires_grad(model_ft, feature_extract)
#handle auxilliary net
num_ftrs = model_ft.AuxLogits.fc.in_features
model_ft.AuxLogits.fc = nn.Linear(num_ftrs, num_classes)
#handle primary net
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs,num_classes)
# input_size = 299
#simulate data input
x = torch.rand([64, 3, 299, 299])
#create model with inception backbone
backbone = model_ft
num_filters = backbone.fc.in_features
layers = list(backbone.children())[:-1]
feature_extractor = nn.Sequential(*layers)
# use the pretrained model to classify damage 2 classes
num_target_classes = 2
classifier = nn.Linear(num_filters, num_target_classes)
feature_extractor.eval()
with torch.no_grad():
representations = feature_extractor(x).flatten(1)
x = classifier(representations)
But Im getting the error
RuntimeError Traceback (most recent call last)
<ipython-input-54-c2be64b8a99e> in <module>()
11 feature_extractor.eval()
12 with torch.no_grad():
---> 13 representations = feature_extractor(x)
14 x = classifier(representations)
9 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
442 _pair(0), self.dilation, self.groups)
443 return F.conv2d(input, weight, bias, self.stride,
--> 444 self.padding, self.dilation, self.groups)
445
446 def forward(self, input: Tensor) -> Tensor:
RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [64, 2]
before I updated the class to 2 (when it was 1000) I was getting the same error but with [64, 1000]. This method of creating a backbone and adding a classifier worked for Resnet but not here. I think it's because of the auxiliary net structure but not sure how to update it to deal with the dual output? Thanks

Inheriting feature_extracture by children function at line layers = list(backbone.children())[:-1] will bring the module from backbone to feature_extracture only, not the operation in forward function.
Let's take a look at the code below:
class Example(torch.nn.Module):
def __init__(self):
super().__init__()
self.avg = torch.nn.AdaptiveAvgPool2d((1,1))
self.linear = torch.nn.Linear(10, 1)
def forward(self, x):
out = self.avg(x)
out = out.squeeze()
out = self.linear(out)
return out
x = torch.randn(5, 10, 12, 12)
model = Example()
y = model(x) # work well
new_model = torch.nn.Sequential(*list(model.children()))
y = new_model(x) # error
Module model and new_model have the same blocks but not the same way of working. In new_module, the output from the pooling layer is not squeezed yet, so the shape of linear input is violate its assumption which causes the error.
In your case, the last two comments are redundant and that's why it returns the error, you did create a new fc in the InceptionV3 module at line model_ft.fc = nn.Linear(num_ftrs,num_classes). Therefore, replace the last one as the code below should work fine:
with torch.no_grad():
x = model_ft(x)

Related

LSTM Autoencoder set-up for multiple features using Pytorch

I am building an LSTM autoencoder to denoise signals and will take more than 1 feature as it's input.
I have setup the model Encoder part as follows which works for single feature inputs (i.e. sequences with just one feature):
class Encoder(nn.Module):
def __init__(self, seq_len, n_features, num_layers=1, embedding_dim=64):
super(Encoder, self).__init__()
self.seq_len = seq_len
self.n_features = n_features
self.num_layers = num_layers
self.embedding_dim = embedding_dim
self.hidden_dim = 2 * embedding_dim
# input: batch_size, seq_len, features
self.lstm1 = nn.LSTM(
input_size=self.n_features,
hidden_size=self.hidden_dim,
num_layers=self.num_layers,
batch_first=True
) # output: batch size, seq_len, hidden_dim
# input: batch_size, seq_len, hidden_dim
self.lstm2 = nn.LSTM(
input_size=self.hidden_dim,
hidden_size = self.embedding_dim,
num_layers = self.num_layers,
batch_first=True
) # output: batch_size, seq_len, embedding_dim
def forward(self, x):
print(x)
x = x.reshape((1, self.seq_len, self.n_features))
print(x.shape)
x, (_, _) = self.lstm1(x)
print(x.shape)
x, (hidden_n, _) = self.lstm2(x)
print(x.shape, hidden_n.shape)
print(hidden_n)
return hidden_n.reshape((self.n_features, self.embedding_dim))
When I test this setup as follows:
model = Encoder(1024, 1)
model.forward(torch.randn(1024, 1))
with the 1 representing a single feature all is well. However, when I do the following (where 2 represents a sequence of 2 features):
model = Encoder(1024, 2)
model.forward(torch.randn(1024, 2))
I get the following error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Input In [296], in <cell line: 1>()
----> 1 model.forward(torch.randn(1024, 2))
Input In [294], in Encoder.forward(self, x)
36 print(hidden_n)
37 # print(hidden_n.reshape((self.n_features, self.embedding_dim)).shape)
---> 39 return hidden_n.reshape((self.n_features, self.embedding_dim))
RuntimeError: shape '[2, 64]' is invalid for input of size 64
The hidden_n shape comes out as torch.Size([1, 1, 64]). I would like to understand that if we have more than 1 feature, e.g. 2, do we want to get that shape into the format of 1, 2, 64 such that the hidden state has weights for both features?
Can someone please explain why reshape is not liking the way I'm trying to restructure the output of the Encoder and how I should do it such that the model is able to take any feature size into account.
What am I missing/ perhaps misunderstanding here?

Use tf.TensorSpec function to save a Keras model whose inputs come multiple places

I'm working on some demo data to build a binary classifier. There are 8 categorical variables (assume those 8 categorical variables were already integer-encoded) as well as 14 numerical variables.
I separated categorical inputs from numerical inputs, and create two parts of inputs: 8 categorical inputs first go to the embedding layer. The embedding layer will then be concatenated to the 14-dimensional numeric input layer.
Therefore, the actual (one record/ one row) / input of the whole model will be something like this:
[1,2,3,4,5,6,7,8, [1,2,3,4,…,14]]
And my model structure looks like this
cat_inputs = []
embeddings = []
for col in encoded_vars:
# find the cardinality of each categorical column:
cardinality = int(np.ceil(d[col].nunique() + 2))
# set the embedding dimension:
# at least 2, at most 50, otherwise cardinality//2
embedding_dim = max(min((cardinality)//2, 50),2)
print(cardinality, embedding_dim)
col_inputs = Input(shape=(1,))
# Specify the embedding
embedding = Embedding(cardinality, embedding_dim,
input_length=1, name=col+"_embed")(col_inputs)
# Add a but of dropout to the embedding layers to regularize:
embedding = SpatialDropout1D(0.1)(embedding)
# Flatten out the embeddings:
embedding = Reshape(target_shape=(embedding_dim,))(embedding)
# Add the input shape to inputs
cat_inputs.append(col_inputs)
# add the embeddings to the embeddings layer
embeddings.append(embedding)
num_inputs = Input(shape=(len(num_vars), ))
## Concatenate the cat_merged layer to the numerical input layer
x = Concatenate(axis = 1)(embeddings + [num_inputs])
## batch-norm layer
x = Dense(128, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(0.3)(x)
x = Dense(64, activation='relu')(x)
outputs = Dense(1, activation = 'sigmoid')(x)
model = Model( [num_inputs] + cat_inputs, outputs)
model.summary()
The code was runnable and obtain desired result.
However, I was wondering if anyone can direct me how to save the model specs.
I tried this
model_inputs = [model.inputs[i].shape for i in range(len(model.inputs))]
full_model = full_model.get_concrete_function(x=tf.TensorSpec(model_inputs, model.inputs[0].dtype))
However, it gives me the error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-104-f29daca70586> in <module>
1 model_inputs = [model.inputs[i].shape for i in range(len(model.inputs))]
----> 2 full_model = full_model.get_concrete_function(x=tf.TensorSpec(model_inputs, model.inputs[0].dtype))
/opt/anaconda3/envs/tf2/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_spec.py in __init__(self, shape, dtype, name)
52 not convertible to a `tf.DType`.
53 """
---> 54 self._shape = tensor_shape.TensorShape(shape)
55 try:
56 self._shape_tuple = tuple(self.shape.as_list())
/opt/anaconda3/envs/tf2/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_shape.py in __init__(self, dims)
774 else:
775 # Got a list of dimensions
--> 776 self._dims = [as_dimension(d) for d in dims_iter]
777
778 #property
/opt/anaconda3/envs/tf2/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_shape.py in <listcomp>(.0)
774 else:
775 # Got a list of dimensions
--> 776 self._dims = [as_dimension(d) for d in dims_iter]
777
778 #property
/opt/anaconda3/envs/tf2/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_shape.py in as_dimension(value)
716 return value
717 else:
--> 718 return Dimension(value)
719
720
/opt/anaconda3/envs/tf2/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_shape.py in __init__(self, value)
191 raise TypeError("Cannot convert %s to Dimension" % value)
192 else:
--> 193 self._value = int(value)
194 if (not isinstance(value, compat.bytes_or_text_types) and
195 self._value != value):
TypeError: int() argument must be a string, a bytes-like object or a number, not 'TensorShape'
model_inputs
[TensorShape([None, 14]),
TensorShape([None, 1]),
TensorShape([None, 1]),
TensorShape([None, 1]),
TensorShape([None, 1]),
TensorShape([None, 1]),
TensorShape([None, 1]),
TensorShape([None, 1]),
TensorShape([None, 1])]
My understanding is that this is a multiple inputs case (with different input dimensional), and I'd greatly appreciate it if anyone can help.
Thank you!

(pytorch / mse) How can I change the shape of tensor?

Problem definition:
I have to use MSELoss function to define the loss to classification problem. Therefore it keeps saying the error message regarding the shape of tensor.
Entire error message:
torch.Size([32, 10]) torch.Size([32])
--------------------------------------------------------------------------- RuntimeError Traceback (most recent call
last) in
53 output = model.forward(images)
54 print(output.shape, labels.shape)
---> 55 loss = criterion(output, labels)
56 loss.backward()
57 optimizer.step()
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in
call(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/loss.py in
forward(self, input, target)
429
430 def forward(self, input, target):
--> 431 return F.mse_loss(input, target, reduction=self.reduction)
432
433
/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py in
mse_loss(input, target, size_average, reduce, reduction) 2213
ret = torch.mean(ret) if reduction == 'mean' else torch.sum(ret)
2214 else:
-> 2215 expanded_input, expanded_target = torch.broadcast_tensors(input, target) 2216 ret =
torch._C._nn.mse_loss(expanded_input, expanded_target,
_Reduction.get_enum(reduction)) 2217 return ret
/opt/conda/lib/python3.7/site-packages/torch/functional.py in
broadcast_tensors(*tensors)
50 [0, 1, 2]])
51 """
---> 52 return torch._C._VariableFunctions.broadcast_tensors(tensors)
53
54
> RuntimeError: The size of tensor a (10) must match the size of tensor
b (32) at non-singleton dimension 1
How can I reshape the tensor, and which tensor (output or labels) should I change to calculate the loss?
Entire code is attached below.
import numpy as np
import torch
# Loading the Fashion-MNIST dataset
from torchvision import datasets, transforms
# Get GPU Device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
transform = transforms.Compose([transforms.ToTensor(),
transforms.Normalize((0.5,), (0.5,))])
# Download and load the training data
trainset = datasets.FashionMNIST('MNIST_data/', download = True, train = True, transform = transform)
testset = datasets.FashionMNIST('MNIST_data/', download = True, train = False, transform = transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size = 32, shuffle = True, num_workers=4)
testloader = torch.utils.data.DataLoader(testset, batch_size = 32, shuffle = True, num_workers=4)
# Examine a sample
dataiter = iter(trainloader)
images, labels = dataiter.next()
# Define the network architecture
from torch import nn, optim
import torch.nn.functional as F
model = nn.Sequential(nn.Linear(784, 128),
nn.ReLU(),
nn.Linear(128, 10),
nn.LogSoftmax(dim = 1))
model.to(device)
# Define the loss
criterion = nn.MSELoss()
# Define the optimizer
optimizer = optim.Adam(model.parameters(), lr = 0.001)
# Define the epochs
epochs = 5
train_losses, test_losses = [], []
for e in range(epochs):
running_loss = 0
for images, labels in trainloader:
# Flatten Fashion-MNIST images into a 784 long vector
images = images.to(device)
labels = labels.to(device)
images = images.view(images.shape[0], -1)
# Training pass
optimizer.zero_grad()
output = model.forward(images)
print(output.shape, labels.shape)
loss = criterion(output, labels)
loss.backward()
optimizer.step()
running_loss += loss.item()
else:
test_loss = 0
accuracy = 0
# Turn off gradients for validation, saves memory and computation
with torch.no_grad():
# Set the model to evaluation mode
model.eval()
# Validation pass
for images, labels in testloader:
images = images.to(device)
labels = labels.to(device)
images = images.view(images.shape[0], -1)
ps = model(images)
test_loss += criterion(ps, labels)
top_p, top_class = ps.topk(1, dim = 1)
equals = top_class == labels.view(*top_class.shape)
accuracy += torch.mean(equals.type(torch.FloatTensor))
model.train()
print("Epoch: {}/{}..".format(e+1, epochs),
"Training loss: {:.3f}..".format(running_loss/len(trainloader)),
"Test loss: {:.3f}..".format(test_loss/len(testloader)),
"Test Accuracy: {:.3f}".format(accuracy/len(testloader)))
From the output you print before it error, torch.Size([32, 10]) torch.Size([32]).
The left one is what the model gives you and the right one is from trainloader, normally you use this for something like nn.CrossEntropyLoss.
And from the full error log, the error is from this line
loss = criterion(output, labels)
The way to make this work is called One-hot Encoding, if it's me for sake of my laziness I'll write it like this.
ones = torch.sparse.torch.eye(10).to(device) # number of class class
labels = ones.index_select(0, labels)
Alternatively, you can change your loss function from nn.MSELoss() to nn.CrossEntropyLoss(). Cross entropy loss is generally preferable to MSE for categorical tasks like this, and in PyTorch's implementation this loss function takes care of a lot of the shape conversion under the hood so you can provide it with a vector of class probabilities and a single class label.
Fundamentally, your model attempts to predict what class the input belongs to by calculating a score (you might call it a 'confidence score') for each possible class. So if you have 10 classes, the model's output will be a 10-dimensional list (in PyTorch, a tensor shape [10]) and the prediction would be the the index of the highest score. Often one would apply the softmax (https://en.wikipedia.org/wiki/Softmax_function) function to convert these scores to a probability distribution, so all scores will be between 0 and 1 and the elements all sum to 1.
Then cross entropy is a common choice of loss function for this task: it compares the list of predictions to the one-hot encoded label. E.g. if you have 3 classes, a label would look like [1, 0, 0] to represent the first class. This is also called the "one-hot encoding". Meanwhile a prediction might look like [0.7, 0.1, 0.2]. In PyTorch, nn.CrossEntropyLoss() expects your labels are coming as single value tensors whose value represents the class label, since there's no real need to move long, sparse vectors around memory. So this loss function accomplishes the comparison you want to do and I'm guessing is implemented more efficiently than actually creating one-hot encodings.

Kears fit_generator() with y=None

I am playing with Variational Autoencoders and would like to adapt a Keras example found on GitHub.
Basically, the example is very simple based on mnist dataset and I would like to implement on a more difficult set as it is more realistic.
Code I'm trying to modify:
vae_dfc.fit(
x_train,
epochs=epochs,
steps_per_epoch=train_size//batch_size,
validation_data=(x_val),
validation_steps=val_size//batch_size,
verbose=1
)
With more complex datasets it is nearly impossible to load everything on memory so we need to use fit_generator() to train the model. But it doesn't seem able to handle this:
image_generator = image.ImageDataGenerator(
rescale=1./255,
validation_split=0.2
)
train_generator = image_generator.flow_from_directory(
dir,
class_mode=None,
color_mode='rgb',
target_size=(ORIGINAL_SHAPE[0], ORIGINAL_SHAPE[1]),
batch_size=BATCH_SIZE,
subset='training'
)
vae.fit_generator(
train_generator,
epochs=EPOCHS,
steps_per_epoch=train_generator.samples // BATCH_SIZE,
validation_data=validation_generator,
validation_steps=validation_generator.samples // BATCH_SIZE
)
My understanding is that class_mode=None is producing an output similar to the original simple example, but the fit_generator() is unable to handle this. Are there any workarounds to deal with the fit generator error?
Configurations:
tensorflow-gpu==1.12.0
Python 3.6
Windows 10
Cuda 9.0
Full error:
File "xxx\venv\lib\site-packages\tensorflow\python\keras\engine\training.py",
line 2177, in fit_generator
initial_epoch=initial_epoch)
File "xxx\venv\lib\site-packages\tensorflow\python\keras\engine\training_generator.py",
line 162, in fit_generator
'or (x, y). Found: ' + str(generator_output)) ValueError: Output of generator should be a tuple (x, y, sample_weight) or (x, y).
Found: [[[[0.48627454 0.34901962 0.2901961 ] ....]]]
An autoencoder needs outputs = inputs. It's different from not having outputs.
I believe you can try class_mode='input'.
If this doesn't work, you can create a wrapper generator for outputting both:
class AutoencGenerator(keras.utils.Sequence):
def __init__(self, originalGenerator):
self.generator = originalGenerator
def __len__(self):
return len(self.generator)
def __getitem__(self, i):
x = self.generator[i]
return x, x
def on_epoch_end(self):
self.generator.on_epoch_end() #this only if there is an on_epoch_end in the original
train_autoenc_generator = AutoencGenerator(train_generator)
Both options will need that your model has outputs, of course. If the model was created without outputs (unusual), make it output the results and use the loss function in model.compile(loss=the_loss).
Example of VAE
inputs = Input(shape)
means, sigmas = encoder(inputs)
def encode(x):
means, sigmas = x
randomSamples = tf.random_normal(K.shape(means)) #samples
encoded = (sigmas * randomSamples) + means
return encoded
encodings = Lambda(encode)([means, sigmas])
outputs = decoder(encodings)
kl_loss = some_tensor_function(means, sigmas)
VAE = Model(inputs, outputs)
VAE.add_loss(kl_loss)
VAE.compile(loss = 'mse', optimizer='adam')
Train with the generator:
VAE.fit_generator(train_autoenc_generator, ...)

AttributeError: 'function' object has no attribute 'predict'. Keras

I am working on an RL problem and I created a class to initialize the model and other parameters. The code is as follows:
class Agent:
def __init__(self, state_size, is_eval=False, model_name=""):
self.state_size = state_size
self.action_size = 20 # measurement, CNOT, bit-flip
self.memory = deque(maxlen=1000)
self.inventory = []
self.model_name = model_name
self.is_eval = is_eval
self.done = False
self.gamma = 0.95
self.epsilon = 1.0
self.epsilon_min = 0.01
self.epsilon_decay = 0.995
def model(self):
model = Sequential()
model.add(Dense(units=16, input_dim=self.state_size, activation="relu"))
model.add(Dense(units=32, activation="relu"))
model.add(Dense(units=8, activation="relu"))
model.add(Dense(self.action_size, activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer=Adam(lr=0.003))
return model
def act(self, state):
options = self.model.predict(state)
return np.argmax(options[0]), options
I want to run it for only one iteration, hence I create an object and I pass a vector of length 16 like this:
agent = Agent(density.flatten().shape)
state = density.flatten()
action, probs = agent.act(state)
However, I get the following error:
AttributeError Traceback (most recent call last) <ipython-input-14-4f0ff0c40f49> in <module>
----> 1 action, probs = agent.act(state)
<ipython-input-10-562aaf040521> in act(self, state)
39 # return random.randrange(self.action_size)
40 # model = self.model()
---> 41 options = self.model.predict(state)
42 return np.argmax(options[0]), options
43
AttributeError: 'function' object has no attribute 'predict'
What's the issue? I checked some other people's codes as well, like this and I think mine is also very similar.
Let me know.
EDIT:
I changed the argument in Dense from input_dim to input_shape and self.model.predict(state) to self.model().predict(state).
Now when I run the NN for one input data of shape (16,1), I get the following error:
ValueError: Error when checking input: expected dense_1_input to have
3 dimensions, but got array with shape (16, 1)
And when I run it with shape (1,16), I get the following error:
ValueError: Error when checking input: expected dense_1_input to have
3 dimensions, but got array with shape (1, 16)
What should I do in this case?
in last code block,
def act(self, state):
options = self.model.predict(state)
return np.argmax(options[0]), options
self.model is a function which is returning a model, it should be self.model().predict(state)
I used np.reshape. So in this case, I did
density_test = np.reshape(density.flatten(), (1,1,16))
and the network gave the output.

Resources