I would like to load my model, which is a modified ResNet18, with ResNet18's pre-trained weights. I noticed that there is no bias in the state_dict of this model and that bias=False in the conv2d layers. My questions are:
1 - Is it possible to access a version of these pre-trained models in PyTorch that has bias values too?
2 - Why are pre-trained bias values not available, and what is the effect of having informed weight values but randomly initialized bias vectors?
I appreciate your guidance and opinions.
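For reference, here is a minimal sketch of how to inspect this (assuming torchvision >= 0.13; older versions use pretrained=True instead of the weights argument):
from torchvision import models

# Load ResNet18 with its ImageNet weights
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# List every parameter that carries a bias: only the BatchNorm layers and the
# final fully connected layer show up; the conv layers are built with bias=False.
print([name for name in model.state_dict() if name.endswith('bias')])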
I am building a multilabel image classification network. The dataset contains 70k images, and the total number of classes is 12. Relative to the entire dataset, each of the 12 classes has more than 10% of the images, and 3 of the 12 classes are above 70%. I am using the VGG16 network without its associated classifier.
I am getting at most 68% validation accuracy during training. I have tried changing the number of units per Dense layer (512, 256, 128, etc.), increasing the number of layers (5, 6 layers), adding/removing a Dropout layer (with rate 0.5), and kernel regularization (L1=0.1, L2=0.1).
As accuracy is not the appropriate metric for multilabel classification, I am trying to incorporate HammingLoss as the metric. But it is not working; here is the issue that I opened on the GitHub repo of HammingLoss.
What can be done to improve the accuracy?
What point am I missing when incorporating HammingLoss?
For classification, I am using the network as:
network.add(vggBase)
network.add(tf.keras.layers.Dense(256, activation='relu'))
network.add(tf.keras.layers.Dense(64, activation='relu'))
network.add(tf.keras.layers.Dense(12, activation='sigmoid'))
network.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss=tf.keras.losses.BinaryCrossentropy(), metrics=['accuracy'])
I recommend using Keras Tuner for hyperparameter tuning.
If HammingLoss is not working for you, you could use a different metric as a workaround, like PR AUC for instance. The metric choice depends strongly on what you want to achieve with your model. Maybe towardsdatascience/evaluating-multi-label-classifiers can help you find that out.
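Alternatively, a Hamming-distance-style metric is simple enough to implement by hand as a custom Keras metric. A minimal sketch (the 0.5 threshold and the name hamming_distance are my own illustrative choices, not from the original post):
import tensorflow as tf

def hamming_distance(y_true, y_pred):
    # Fraction of label slots where the thresholded prediction disagrees with the target
    y_pred_binary = tf.cast(y_pred >= 0.5, y_true.dtype)
    mismatches = tf.cast(tf.not_equal(y_true, y_pred_binary), tf.float32)
    return tf.reduce_mean(mismatches)

network.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                loss=tf.keras.losses.BinaryCrossentropy(),
                metrics=[hamming_distance])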
Let's suppose we want to use the pre-trained weights of VGG16 in our model, up to the layer before the third max pooling, and then add the layers of our choice. How could we make this happen?
VGG16 architecture overview
You can create a new model: say, base_model (the VGG model with the loaded weights and the unwanted layers pop()'ped off). Then add base_model and other layers of your choice to an empty Sequential model, as sketched below.
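A minimal sketch (assuming a 224x224 RGB input; in Keras's layer naming, block3_conv3 is the last layer before the third max pooling, and the head layers here are placeholders):
from tensorflow import keras

vgg = keras.applications.VGG16(weights='imagenet', include_top=False,
                               input_shape=(224, 224, 3))

# Cut the graph at the layer just before the third max pooling
base_model = keras.Model(inputs=vgg.input,
                         outputs=vgg.get_layer('block3_conv3').output)
base_model.trainable = False  # keep the pre-trained weights frozen

# Stack your own layers on top of the truncated base
model = keras.Sequential([
    base_model,
    keras.layers.Flatten(),
    keras.layers.Dense(256, activation='relu'),
    keras.layers.Dense(10, activation='softmax'),  # example output size
])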
I loaded PyTorch's nn.Embedding module with a pre-trained embedding matrix and set it to trainable as follows.
self.embedding_layer = nn.Embedding(self.vocab_size, self.embed_size, padding_idx=self.padding_idx)
self.embedding_layer.weight = nn.Parameter(self.embedding)
self.embedding_layer.weight.requires_grad = True
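For reference, an equivalent way to build this layer (assuming a recent PyTorch version and that self.embedding is a 2-D FloatTensor) is nn.Embedding.from_pretrained:
self.embedding_layer = nn.Embedding.from_pretrained(self.embedding,
                                                    freeze=False,
                                                    padding_idx=self.padding_idx)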
I processed this output with a bidirectional Gated Recurrent Unit (GRU) network.
After the model was trained, I checked whether the weights of nn.Embedding were updated. They were not: model.embedding_layer.weight and self.embedding are still identical.
I checked the gradients of model.embedding_layer. They are all zeros.
Could you please help me? Thank you.
I have implemented an autoencoder using Keras. I understand that I can add accuracy performance metric as follows:
autoencoder.compile(optimizer='adam',
loss='mean_squared_error',
metrics=['accuracy'])
My question is:
Is the accuracy metric applied to the last layer of the decoder by default? If so, how can I set it up so that it uses the representations from the middle (hidden) layer to compute accuracy? Do I need to define a custom metric? How would that work?
It seems that what you really want is a multiple output network.
So on top of your middle layer that defines your embedding, add a layer (or more) to do your classification.
Then have a look at Multiple outputs in Keras to create your global cost.
You may also want to start by training the autoencoder only, then the classifier's additional layers only, to see the performance. You can also balance the reconstruction loss of the autoencoder against the classification loss, training both networks at the same time, as in the sketch below.
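Here is a minimal sketch of such a multiple-output setup (the 784/32/10 layer sizes and the loss weights are placeholder assumptions):
from tensorflow import keras

inputs = keras.Input(shape=(784,))
# Encoder down to the middle (hidden) representation
embedding = keras.layers.Dense(32, activation='relu', name='embedding')(inputs)
# Decoder head: reconstructs the input
reconstruction = keras.layers.Dense(784, activation='sigmoid', name='reconstruction')(embedding)
# Classifier head on top of the embedding
classification = keras.layers.Dense(10, activation='softmax', name='classifier')(embedding)

autoencoder = keras.Model(inputs, [reconstruction, classification])
autoencoder.compile(optimizer='adam',
                    loss={'reconstruction': 'mean_squared_error',
                          'classifier': 'categorical_crossentropy'},
                    loss_weights={'reconstruction': 1.0, 'classifier': 0.5},
                    metrics={'classifier': ['accuracy']})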
Keras Applications provide implementations of some of the most popular model architectures with weights pretrained on some of the most popular datasets. These predefined models are very handy for transfer learning of problems which are similar to the datasets the models were trained on.
But what if I have a very different problem and want to train the models entirely on a new dataset? How can I use the models in Applications for training from scratch on my own dataset if I don't have pre-trained weights?
You can assign None to the weights argument, for instance with the Inception V3 architecture:
keras.applications.inception_v3.InceptionV3(include_top=False, weights=None, input_shape=(img_width, img_height, 3))
include_top=False removes the fully connected classification layers at the top, so you can attach and train your own custom top network.
weights=None means that we are training from randomly initialized weights; if you want to start from the ImageNet pre-trained weights instead, set weights='imagenet'.
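For completeness, a minimal from-scratch setup might look like this (the 150x150 input size, the head layers, and num_classes are placeholder assumptions):
from tensorflow import keras

img_width, img_height, num_classes = 150, 150, 10  # placeholder values

base = keras.applications.inception_v3.InceptionV3(
    include_top=False, weights=None,
    input_shape=(img_width, img_height, 3))

# Attach a custom top and train everything from random initialization
model = keras.Sequential([
    base,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(256, activation='relu'),
    keras.layers.Dense(num_classes, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])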