Replicating TF model in Keras from pbtxt and ckpts

Replicating TF model in Keras from pbtxt and ckpts - python-3.x

I'm not sure if this is actually possible with the information I have, but please let me know if that's the case.
From a previously trained Tensorflow model I have the following files:
graph.pbtxt, checkpoint, model.ckpt-10000.data-00000-of-00001, model.ckpt-10000.index, and model.ckpt-10000.meta
I was told that the input size of this model was a Dense layer of size 5000 and the output was a Dense sigmoid binary classification, but I don't know how many/what size layers were in between. (I'm also not 100% positive that the input size is correct).
From this information and associated files, is there a way to replicate the TF model with trained weights into a Keras functional model?
(The idea was that this small dense network was added onto the last FC layer of VGG-16, so that'll be my end goal.)

Related

PyTorch Lightening: load model from pre-trained; values of weights remains same

I have a pre-trained model, by changing final fc layers I have created model for downstream task. Now, I want to load weights from pertained weights. I tried self.model.load_from_checkpoint (self.pretrained_model_path). But, when I print weight values from model layers, they are exactly same which indicates weights were not loaded/updated. Note that is not giving me any warning/error.
Edit:
self.model.backbone = self.model.load_from_checkpoint (self.pretrained_model_path).backbone
updates the parameters with pre_trained weights. There might be optimal way but I found this fix.

How to fine tune InceptionV3 in Keras

I am trying to train a classifier based on the InceptionV3 architecture in Keras.
For this I loaded the pre-trained InceptionV3 model, without top, and added a final fully connected layer for the classes of my classification problem. In the first training I froze the InceptionV3 base model and only trained the final fully connected layer.
In the second step I want to "fine tune" the network by unfreezing a part of the InceptionV3 model.
Now I know that the InceptionV3 model makes extensive use of BatchNorm layers. It is recommended (link to documentation), when BatchNorm layers are "unfrozen" for fine tuning when transfer learning, to keep the mean and variances as computed by the BatchNorm layers fixed. This should be done by setting the BatchNorm layers to inference mode instead of training mode.
Please also see: What's the difference between the training argument in call() and the trainable attribute?
Now my main question is: how to set ONLY the BatchNorm layers of the InceptionV3 model to inference mode?
Currently I set the whole InceptionV3 base model to inference mode by setting the "training" argument when assembling the network:
inputs = keras.Input(shape=input_shape)
# Scale the 0-255 RGB values to 0.0-1.0 RGB values
x = layers.experimental.preprocessing.Rescaling(1./255)(inputs)
# Set include_top to False so that the final fully connected (with pre-loaded weights) layer is not included.
# We will add our own fully connected layer for our own set of classes to the network.
base_model = keras.applications.InceptionV3(input_shape=input_shape, weights='imagenet', include_top=False)
x = base_model(x, training=False)
# Classification block
x = layers.GlobalAveragePooling2D(name='avg_pool')(x)
x = layers.Dense(num_classes, activation='softmax', name='predictions')(x)
model = keras.Model(inputs=inputs, outputs=x)
What I don't like about this, is that in this way I set the whole model to inference mode which may set some layers to inference mode which should not be.
Here is the part of the code that loads the weights from the initial training that I did and the code that freezes the first 150 layers and unfreezes the remaining layers of the InceptionV3 part:
model.load_weights(load_model_weight_file_name)
for layer in base_model.layers[: 150]:
layer.trainable = False
for layer in base_model.layers[ 150:]:
layer.trainable = True
The rest of my code (not shown here) are the usual compile and fit calls.
Running this code seems to result a network that doesn't really learn (loss and accuracy remain approximately the same). I tried different orders of magnitude for the optimization step size, but that doesn't seem to help.
Another thing that I observed it that when I make the whole InceptionV3 part trainable
base_model.trainable = True
that the training starts with an accuracy server orders of magnitude smaller than were my first training round finished (and of course a much higher loss). Can someone explain this to me? I would at least expect the training to continue were it left off in terms of accuracy and loss.

You could do something like:
for layer in base_model.layers:
if isinstance(layer ,tf.keras.layers.BatchNormalization):
layer.trainable=False
This will iterate over each layer and check the type, setting to inference mode if the layer is BatchNorm.
As for the low starting accuracy during transfer learning, you're only loading the weights and not the optimiser state (as would occur with a full model.load() which loads architecture, weights, optimiser state etc).
This doesn't mean there's an error, but if you must load weights only just let it train, the optimiser will configure eventually and you should see progress. Also as you're potentially over-writing the pre-trained weights in your second run, make sure you use a lower learning rate so the updates are small in comparison i.e. fine-tune the weights rather than blast them to pieces.

Keras Embedding layer activation function?

In the fully connected hidden layer of Keras embedding, what is the activation function leveraged? I'm either misunderstanding the concept of this class or unable to find documentation. I understand that it is encoding from word to real-valued vector of dimension d via answers like the below on stackoverflow:
Embedding layers in Keras are trained just like any other layer in your network architecture: they are tuned to minimize the loss function by using the selected optimization method. The major difference with other layers, is that their output is not a mathematical function of the input. Instead the input to the layer is used to index a table with the embedding vectors [1]. However, the underlying automatic differentiation engine has no problem to optimize these vectors to minimize the loss function...
In my network, I have a word embedding portion that is then linked to a larger network that is predicting a binary outcome (e.g., click yes/no). I understand that this Keras embedding is not operating like word2vec because here my embedding is being trained and updated against my end cross-entropy function. But, there is no mention of how the embedding fully-connected layer is activated. Thanks!

How to use a pre-trained object detection in tensorflow?

How can I use the weights of a pre-trained network in my tensorflow project?
I know some theory information about this but no information about coding in tensorflow.

As been pointed out by #Matias Valdenegro in the comments, your first question does not make sense. For your second question however, there are multiple ways to do so. The term that you're searching for is Transfer Learning (TL). TL means transferring the "knowledge" (basically it's just the weights) from a pre-trained model into your model. Now there are several types of TL.
1) You transfer the entire weights from a pre-trained model into your model and use that as a starting point to train your network.
This is done in a situation where you now have extra data to train your model but you don't want to start over the training again. Therefore you just load the weights from your previous model and resume the training.
2) You transfer only some of the weights from a pre-trained model into your new model.
This is done in a situation where you have a model trained to classify between, say, 5 classes of objects. Now, you want to add/remove a class. You don't have to re-train the whole network from the start if the new class that you're adding has somewhat similar features with (an) existing class(es). Therefore, you build another model with the same exact architecture as your previous model except the fully-connected layers where now you have different output size. In this case, you'll want to load the weights of the convolutional layers from the previous model and freeze them while only re-train the fully-connected layers.
To perform these in Tensorflow,
1) The first type of TL can be performed by creating a model with the same exact architecture as the previous model and simply loading the model using tf.train.Saver().restore() module and continue the training.
2) The second type of TL can be performed by creating a model with the same exact architecture for the parts where you want to retain the weights and then specify the name of the weights in which you want to load from the previous pre-trained weights. You can use the parameter "trainable=False" to prevent Tensorflow from updating them.
I hope this helps.

loading weights keras LSTM not working

I am trying to load the weights from a Keras 1.0 Model into a Keras 2.0 model I created. I am sure the model architecture is exactly the same. The issues I am having is the load_weights() function is loading all the weights.
When I print the weights to a text file from the original model (loaded via load_model) and from the new model with load_weights() the later is missing many entry and are actually different. This also shows itself when making predictions as the accuracy is lower.
This problem only occurs in my LSTM layers. The embedding layers is fine and the Dense layer is also fine.
Any thoughts? I can not use load_model() as the original saved model was done in keras 1.0 and I need to use keras 2.0
EDIT MORE:
I should note I think the issue is the internal states not being loaded. Let me explain though. When I use get_weights() on each layer and I print it too terminal or a file the original model outputs a much larger matrix.
After using load_weights and then get_weights and print the weight matrix is missing many elements. I'm thinking it's the internal states.

The problem was that there was parameters for a compiled graph that were saved. I think it's safe to just port over the weights and continue training to let it catch up (maybe 1-2 epochs) if you can.
Gl

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string