Skip some layers in Keras model during "Evaluation/Validation Phase" - keras

Hi Stackoverflow community,
I am solving a problem here where I have to skip some layers of a Keras model during the Evaluation/Validation phase. "Cutom_VGG_Model" output should directly go to "Classifier" (since output and input shapes for both are the same, respectively). This would be a new evaluation strategy. Training is performed on the same model without any change. Kindly suggest some ways to solve it.
I found some solution on Keras documentation where they create Custom Model and override the "test_step" method. But I am not able to formulate the code. Below is the link to the documentation I am referring to :
https://keras.io/guides/customizing_what_happens_in_fit/
I am using Tensorflow Keras.
My requirement is to calculate the validation loss on this NEW modified Evaluation strategy after each epoch. Also, the final evaluation will also be performed with this new modified evaluation strategy.
Thanks in advance!

Related

AllenNLP Multi-Task Model: Keep encoder weights for new heads

I have trained a (AllenNLP) multi-task model. I would like to keep the encoder/backbone weights and continue training with new heads on new datasets. How can I do that with AllenNLP?
I have two basic ideas for how to do that:
I followed this AllenNLP tutorial to load the trained model and then instead of just making predictions I wanted to change the configuration and the model-heads to continue training on the new datasets...but I am kinda lost in how to do that.
I guess it should be possible to (a) save the state-dict of the previously trained encoder in a file and then (b) point to those weights in the configuration file for the new model (instead of pointing to "bert-base-cased"-weights for example). But looking at the PretrainedTransformerEmbedder-class I don't see how I could pass my own model-weights to that class.
As an additional question: Is it also possible to save the weights of the heads separately and initialize new heads with those weights?
Any help is appreciated :)
Your second idea is the preferred way, which you can accomplish by using a PretrainedModelInitializer. See the CopyNet model for an example of how to add this to your model.

extracting gradients for a model, reversing them and updating the weights in Keras

I'm trying to do domain adversarial training using gradient-reversal procedure. I have a deep-learning model architecture consisting of 5 dense layers. Now I want to extract the gradient, reverse them and then update the weights. I am not sure how to extract the gradients and finally how to add them to update the weights. I have gone through some example codes but I am still pretty doubtful regarding using Keras backend. Any help with some toy code or example with explanation is really appreciated.
Thank you,

Extending Keras TensborBoard callback to visualize model predictions

When training deep semantic segmentation models, it is often convenient to visualize a sample of predictions on the validation set during the training. Right now I'm simply saving some predictions to disk on my training server. I'm looking to migrate this task to TensorBoard. Simply put, I wan't to visualize a set of predictions (say 5) over each epoch.
I know there is a simple way to do it in pure TensorFlow like tf.summary.image(..) but I don't see any easy way to incorporate this into the Keras TensorBoard callback.
Any guidance would be much appreciated.
Fabio Perez provided an answer which should do exactly what you're looking for here How to display custom images in TensorBoard using Keras?

Keras: better way to implement layer-wise training model?

I'm currently learning implementing layer-wise training model with Keras. My solution is complicated and time-costing, could someone give me some suggestions to do it in a easy way? Also could someone explain the topology of Keras especially the relations among nodes.outbound_layer, nodes.inbound_layer and how did they associated with tensors: input_tensors and output_tensors? From the topology source codes on github, I'm quite confused about:
input_tensors[i] == inbound_layers[i].inbound_nodes[node_indices[i]].output_tensors[tensor_indices[i]]
Why the inbound_nodes contain output_tensors, I'm not clear about the relations among them....If I wanna remove layers in certain positions of the API model, what should I firstly remove? Also, when adding layers to some certain places, what shall I do first?
Here is my solution to a layerwise training model. I can do it on Sequential model and now trying to implement in on the API model:
To do it, I'm simply add a new layer after finish previous training and re-compile (model.compile()) and re-fit (model.fit()).
Since Keras model requires output layer, I would always add an output layer. As a result, each time when I wanna add a new layer, I have to remove the output layer then add it back. This can be done using model.pop(), in this case model has to be a keras.Sequential() model.
The Sequential() model supports many useful functions including model.add(layer). But for customised model using model API: model=Model(input=...., output=....), those pop() or add() functions are not supported and implement them takes some time and maybe not convenient.

How do I use a trained Theano artificial neural network on single examples?

I have been following the http://deeplearning.net/tutorial/ tutorial on how to train an ANN to classify the MNIST numbers. I am now at the "Convolutional Neural Networks" chapter. I want to use the trained network on single examples (MNIST images) and get the predictions. Is there a way to do that?
I have looked ahead in the tutorial and on google but can't find anything.
Thanks a lot in advance for any kind of help!
The material in the Theano tutorial in the earlier chapters, before reaching the Convolutional Neural Networks (CNN) chapter, give a good overview of how Theano works and some of the components the CNN sample code uses. It might be reasonable to assume that students reaching this point have developed their understanding of Theano sufficiently to figure out how to modify the code to extract the model's predictions. Here's a few hints.
The CNN's output layer, called layer3, is an instance of the LogisticRegression class, introduced in an earlier chapter.
The LogisticRegression class has an attribute called y_pred. The comments next to the code which assigns that attribute's values says
symbolic description of how to compute prediction as class whose
probability is maximal
Looking for places where y_pred is used in the logistic regression sample will highlight a function called predict(). This does for the logistic regression sample what is desired of the CNN example.
If one follows the same approach, using layer3.y_pred as the output of a new Theano function, the model's predictions will become apparent.

Resources