CNN with CTC loss

I want to extract features using a pretrained CNN model (ResNet50, VGG, etc.) and use those features with a CTC loss function.
I want to build it as a text recognition model.
Any ideas on how I can achieve this?

I'm not sure if you are looking to fine-tune the pretrained models or to use them purely for feature extraction. To do the latter, freeze the pretrained model's weights by setting requires_grad = False on its parameters (note that calling .eval() alone only switches layers such as dropout and batch norm to inference mode; it does not stop the weights from being updated), and feed the features from the last layer of the model to your new output head. See the PyTorch tutorial here for a more in-depth guide.
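A minimal sketch of the overall wiring (my own illustration, not a tested recipe): take a pretrained ResNet50, drop its pooling and classifier layers, treat the width axis of the resulting feature map as the time axis, and feed per-timestep class scores to nn.CTCLoss. The input size, number of classes, and the mean-pooling over height below are all assumptions to adapt to your data.

import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 37  # example: 26 letters + 10 digits + 1 CTC blank (index 0)

backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)  # torchvision >= 0.13
features = nn.Sequential(*list(backbone.children())[:-2])  # drop avgpool and fc
for p in features.parameters():
    p.requires_grad = False  # frozen feature extractor
features.eval()  # also freeze batch-norm running statistics

head = nn.Linear(2048, NUM_CLASSES)  # 2048 = ResNet50's final channel count
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)

x = torch.randn(4, 3, 32, 256)            # (batch, channels, height, width)
fmap = features(x)                         # (4, 2048, 1, 8) for this input size
seq = fmap.mean(dim=2).permute(2, 0, 1)    # (time=8, batch=4, features=2048)
log_probs = head(seq).log_softmax(dim=2)   # (T, N, C), the layout nn.CTCLoss expects

targets = torch.randint(1, NUM_CLASSES, (4, 5))  # dummy label sequences
input_lengths = torch.full((4,), log_probs.size(0), dtype=torch.long)
target_lengths = torch.full((4,), 5, dtype=torch.long)
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)

In practice, most text-recognition models (e.g. CRNN) insert a recurrent layer between the CNN features and the CTC head; the linear head above is just the bare minimum needed to get the loss wired up.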

Related

How to use BERT pre-trained model in Keras Embedding layer

How do I use a pre-trained BERT model like bert-base-uncased as weights in the Embedding layer in Keras?
Currently, I am generating word embeddings using the BERT model, and it takes a lot of time. I am assigning those weights as in the code shown below:
model.add(Embedding(307200, 1536, input_length=1536, weights=[embeddings]))
I searched on the internet, but the method I found is given for PyTorch. I need to do it in Keras. Please help.
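One possible direction (a sketch of my own, assuming the HuggingFace transformers library; it is not an answer from this thread): instead of copying precomputed vectors into an Embedding layer, wrap the pretrained model itself as a layer of the Keras graph, so embeddings are computed on the fly.

import tensorflow as tf
from transformers import TFBertModel

# The sequence length of 128 and the binary classification head are
# illustrative choices, not requirements.
bert = TFBertModel.from_pretrained("bert-base-uncased")
bert.trainable = False  # freeze if you only want fixed embeddings

input_ids = tf.keras.Input(shape=(128,), dtype=tf.int32, name="input_ids")
attention_mask = tf.keras.Input(shape=(128,), dtype=tf.int32, name="attention_mask")
hidden = bert(input_ids, attention_mask=attention_mask)[0]  # (batch, 128, 768)

pooled = tf.keras.layers.GlobalAveragePooling1D()(hidden)
output = tf.keras.layers.Dense(1, activation="sigmoid")(pooled)
model = tf.keras.Model([input_ids, attention_mask], output)
model.compile(optimizer="adam", loss="binary_crossentropy")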

Is there a possibility to visualize intermediate layers in Keras?

I am using the DenseNet121 CNN from the Keras library and I would like to visualize the feature maps when I predict images. I know this is possible with a CNN we have built ourselves.
Is it the same thing for models available in Keras like DenseNet?
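A small sketch of one way to do it (my own illustration; the layer name below is an example, list the real ones with model.summary() first): build a second Model whose outputs are the activations of the layers you want to inspect. This works the same for built-in models like DenseNet121 as for your own.

import numpy as np
from tensorflow.keras.applications import DenseNet121
from tensorflow.keras.models import Model

base = DenseNet121(weights="imagenet", include_top=False)
layer_name = "conv2_block1_concat"  # pick any name printed by base.summary()
viz_model = Model(inputs=base.input, outputs=base.get_layer(layer_name).output)

img = np.random.rand(1, 224, 224, 3).astype("float32")  # stand-in for a real, preprocessed image
feature_maps = viz_model.predict(img)  # shape (1, H, W, channels)
print(feature_maps.shape)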

Using pretrained models in Pytorch for Semantic Segmentation, then training only the fully connected layers with our own dataset

I am learning PyTorch and trying to understand how the library works for semantic segmentation.
What I've understood so far is that we can use a pre-trained model in PyTorch. I've found an article that was using such a model in .eval() mode, but I have not been able to find any tutorial on using one for training on our own dataset. I have a very small dataset and I need transfer learning to get results. My goal is to train only the FC layers with my own data. How is that achievable in PyTorch without complicating the code with OOP or many .py files? I have been having a hard time figuring out such repos on GitHub, as I am not the most proficient person when it comes to OOP. I have been using Keras for deep learning until recently, and there everything is easy and straightforward. Do I have the same options in PyTorch?
I appreciate any guidance on this. I need to run a piece of code that does the semantic segmentation and I am really confused about many of the steps I need to take.
Assume you start with a pretrained model called model. All of this occurs before you pass the model any data.
First, find the layers you want to train by listing all of them with model.children(). Running this command will show you all of the blocks and layers:
list(model.children())
Suppose you have now found the layers that you want to fine-tune (your FC layers, as you describe), and they are the last 5. The plan is: rebuild the model without those 5 layers, set requires_grad to False on everything that remains so it doesn't train, then reattach fresh versions of the removed layers. You can inspect just the last 5 layers with:
list(model.children())[-5:]
Set those layers aside (the actual removal happens when we rebuild below):
layer_list = list(model.children())[-5:]
Rebuild the model with nn.Sequential, leaving those layers out:
import torch
import torch.nn as nn

model_small = nn.Sequential(*list(model.children())[:-5])
Set requires_grad params to False:
for param in model_small.parameters():
    param.requires_grad = False
Now you have a model called model_small that has all of the layers except the ones you want to train. Now you can reattach the layers you removed; freshly constructed layers have requires_grad set to True by default, so when you train the model it will only update the weights of those layers.
model_small.avgpool_1 = nn.AdaptiveAvgPool2d(output_size=(1, 1))
model_small.flatten = nn.Flatten()  # so the Linear layers receive a 2-D tensor
model_small.lin1 = nn.Linear(2048, 512)  # 2048 matches ResNet50's last conv block; 512 is an example
model_small.logits = nn.Linear(512, 10)  # 10 = your number of classes (example)
model_small.softmax = nn.Softmax(dim=1)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model_small.to(device)
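One small addition (my own suggestion, not part of the original answer): when you build the optimizer, you can hand it only the parameters that still require gradients, so the frozen layers are skipped entirely. The learning rate is just an example value.

import torch.optim as optim

trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.Adam(trainable_params, lr=1e-4)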

How to use a NN in a Keras generator?

I am setting up fit_generator to train a DNN with Keras, but I don't know how to use a CNN inside this generator.
Basically, I have a pre-trained image generator built as a fully convolutional network (let's call it GEN-NET). Now I want to use this fully convolutional network in my fit_generator to generate an unlimited number of images to train another classifier (called CLASS-NET) in Keras. But it always crashes my training, and the error message is:
ValueError: Tensor Tensor("decoder/transform_output/mul:0", shape=(?, 128, 128, 1), dtype=float32) is not an element of this graph.
This "decoder/transform_output/mul:0" is the output of my CNN GEN-NET.
So my question is: can I use the CNN-based GEN-NET in my fit_generator to train CLASS-NET, or is this not permitted in Keras?
Keras does not really like running two separate models in a single session. You could use K.clear_session() after using the model but this would produce a lot of overhead!
Best way to do this, IMHO, is by pre-generating these images and then loading them using a generator. Basically splitting your program into two separate programs.
Otherwise, if you are using TensorFlow as the backend, there might be a way to do it by switching the default graph on the tf.Session; you could Google that, but I would not recommend it! :)
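A rough sketch of that two-program split (my own illustration; gen_net, class_net, the file paths, and the 128x128x1 shape are all assumptions based on the error message): program one dumps GEN-NET's outputs to disk, program two trains CLASS-NET from those files with a standard generator.

# --- program 1: pre-generate images with GEN-NET and save them to disk ---
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import save_img

gen_net = load_model("gen_net.h5")        # hypothetical saved generator
for i in range(10000):
    z = np.random.rand(1, 128, 128, 1)    # stand-in for GEN-NET's real input
    img = gen_net.predict(z)[0]           # (128, 128, 1), matching the error message
    save_img(f"generated/class_0/{i:05d}.png", img)

# --- program 2: train CLASS-NET from the saved images ---
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

class_net = load_model("class_net.h5")    # hypothetical classifier
flow = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "generated", target_size=(128, 128), color_mode="grayscale", batch_size=32)
class_net.fit_generator(flow, steps_per_epoch=100, epochs=10)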
Seems like you might have things a bit mixed up! The CNN (convolutional neural network) needs to be trained to your data, unless you're using a pretrained network for predictions. If you're going to train the CNN, you can do that with either the fit() or the fit_generator() function. Use fit() if you're feeding data directly, and use fit_generator() if your data is handled by Image Data Generators. If you've loaded a pre-trained model/weights only to make predictions, you don't need to use any fit function, since no training needs to be done.

Autoencoder with Transfer Learning?

Is there a way I can train an autoencoder model using a pre-trained model like ResNet?
I'm trying to train an autoencoder model with input as an image and output as a masked version of that image.
Is it possible to use weights from a pretrained model here?
Yes! You can definitely do transfer learning using a pre-trained network, e.g. ResNet50, as the encoder in an autoencoder. For reference, check out the following link: https://github.com/hsinyilin19/ResNetVAE
From what I know, there is no proven method to do this. I'd train the autoencoder from scratch.
In theory, if you find a pre-trained CNN which does not use max pooling, you can use those weights and architecture for the encoder stage in your autoencoder. You can also extract features from a pre-trained model and concatenate/merge them to your autoencoder. But the value add is not clear, and the architecture might become overly complex.
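For concreteness, here is a rough PyTorch sketch of that idea (my own illustration, not a tested recipe): a pretrained ResNet50 with its pooling and classifier removed serves as the encoder, and a small transposed-convolution decoder upsamples back to the input resolution to predict a one-channel mask. All decoder sizes are assumptions chosen for 224x224 inputs.

import torch
import torch.nn as nn
from torchvision import models

resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
encoder = nn.Sequential(*list(resnet.children())[:-2])  # output: (N, 2048, 7, 7)

# Each ConvTranspose2d(kernel=4, stride=2, padding=1) doubles the spatial size.
decoder = nn.Sequential(
    nn.ConvTranspose2d(2048, 512, kernel_size=4, stride=2, padding=1),  # 14x14
    nn.ReLU(inplace=True),
    nn.ConvTranspose2d(512, 128, kernel_size=4, stride=2, padding=1),   # 28x28
    nn.ReLU(inplace=True),
    nn.ConvTranspose2d(128, 32, kernel_size=4, stride=2, padding=1),    # 56x56
    nn.ReLU(inplace=True),
    nn.ConvTranspose2d(32, 8, kernel_size=4, stride=2, padding=1),      # 112x112
    nn.ReLU(inplace=True),
    nn.ConvTranspose2d(8, 1, kernel_size=4, stride=2, padding=1),       # 224x224
    nn.Sigmoid(),  # one-channel mask in [0, 1]
)

x = torch.randn(2, 3, 224, 224)
mask = decoder(encoder(x))  # (2, 1, 224, 224)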
