I am trying to examine the layers of my pytorch model in a way that allows me to keep track of which layers feed input to others. I've been able to get a list of the layers by using model.modules(), but this list doesn't preserve any information about which layers feed into others in the transformer network I'm analyzing. Is there a way to access each layer and its weights while keeping track of what feeds into where?
Related
How can I use the weights of a pre-trained network in my tensorflow project?
I know some theory information about this but no information about coding in tensorflow.
As been pointed out by #Matias Valdenegro in the comments, your first question does not make sense. For your second question however, there are multiple ways to do so. The term that you're searching for is Transfer Learning (TL). TL means transferring the "knowledge" (basically it's just the weights) from a pre-trained model into your model. Now there are several types of TL.
1) You transfer the entire weights from a pre-trained model into your model and use that as a starting point to train your network.
This is done in a situation where you now have extra data to train your model but you don't want to start over the training again. Therefore you just load the weights from your previous model and resume the training.
2) You transfer only some of the weights from a pre-trained model into your new model.
This is done in a situation where you have a model trained to classify between, say, 5 classes of objects. Now, you want to add/remove a class. You don't have to re-train the whole network from the start if the new class that you're adding has somewhat similar features with (an) existing class(es). Therefore, you build another model with the same exact architecture as your previous model except the fully-connected layers where now you have different output size. In this case, you'll want to load the weights of the convolutional layers from the previous model and freeze them while only re-train the fully-connected layers.
To perform these in Tensorflow,
1) The first type of TL can be performed by creating a model with the same exact architecture as the previous model and simply loading the model using tf.train.Saver().restore() module and continue the training.
2) The second type of TL can be performed by creating a model with the same exact architecture for the parts where you want to retain the weights and then specify the name of the weights in which you want to load from the previous pre-trained weights. You can use the parameter "trainable=False" to prevent Tensorflow from updating them.
I hope this helps.
I am working on a project in which I need to edit individual weights and biases.
Is there any way to actually get access to the layers weights and biases so I can edit them manually?
from tf.layers.dense()
So far I have created my own model and stored the biases outside like so:
for _ in range(population_size):
hidden_layer.append(tf.Variable(tf.truncated_normal([11, 20])))
output_layer.append(tf.Variable(tf.truncated_normal([20, 9])))
population.append([hidden_layer, output_layer])
I am then trying to feed population into the model using feed dict. It's turning out to be a real hell because I cannot feed them into the models because of the shapes of the Variables are not the same.
Is there any native support for getting the weights from the dense layer?
From the keras doc:
All Keras layers have a number of methods in common:
layer.get_weights(): returns the weights of the layer as a list of Numpy arrays.
layer.set_weights(weights): sets the weights of the layer from a list of Numpy arrays (with the same shapes as the output of get_weights).
You can easily access all layers inside your model with yourmodel.layers.
I would like to understand how Keras sets up weights to be shared. Specifically, I would like to use a convolutional 1D layer for processing a time-frequency representation of an audio signal and feed it into an RNN (perhaps a GRU layer) that has:
local support (e.g. like the Conv1D layer with a specified kernel size). Things that are far away in frequency from an output are unlikely to affect the output.
Shared weights, that is I train only a single set of weights across all of the neurons in the RNN layer. Similar inferences should work at lower or higher frequencies.
Essentially, I'm looking for many of the properties that we find in the 2D RNN layers. I've been looking at some of the Keras source code for the convnets to try to understand how weight sharing is implemented, but when I see the weight allocation code in the layer build methods (e.g. in the _Conv class), it's not clear to me how the code is specifying that the weights for each filter are shared. Is this buried in the backend? I see that the backend call is to a specific 1D, 2D, or 3D convolution.
Any pointers in the right direction would be appreciated.
Thank you - Marie
In this article, I've come across the following network structure:
Figure 1(b). https://wx4.sinaimg.cn/mw690/5396ee05ly1fg9vi5phcbj20vj0kb0ty.jpg
Each layer is a fully connected one.
The weights shared by the two parts are denoted by Wc.
The pairs of the top fully connected layers of dimension 500 are concatenated to create a layer of dimension 1000 which is then used directly to reconstruct the input of size 784.
I want to implement it with keras, however I am not skilled with keras.
any ideas on how to implement this?
thank you very much!
I'm currently learning implementing layer-wise training model with Keras. My solution is complicated and time-costing, could someone give me some suggestions to do it in a easy way? Also could someone explain the topology of Keras especially the relations among nodes.outbound_layer, nodes.inbound_layer and how did they associated with tensors: input_tensors and output_tensors? From the topology source codes on github, I'm quite confused about:
input_tensors[i] == inbound_layers[i].inbound_nodes[node_indices[i]].output_tensors[tensor_indices[i]]
Why the inbound_nodes contain output_tensors, I'm not clear about the relations among them....If I wanna remove layers in certain positions of the API model, what should I firstly remove? Also, when adding layers to some certain places, what shall I do first?
Here is my solution to a layerwise training model. I can do it on Sequential model and now trying to implement in on the API model:
To do it, I'm simply add a new layer after finish previous training and re-compile (model.compile()) and re-fit (model.fit()).
Since Keras model requires output layer, I would always add an output layer. As a result, each time when I wanna add a new layer, I have to remove the output layer then add it back. This can be done using model.pop(), in this case model has to be a keras.Sequential() model.
The Sequential() model supports many useful functions including model.add(layer). But for customised model using model API: model=Model(input=...., output=....), those pop() or add() functions are not supported and implement them takes some time and maybe not convenient.