I'm trying to take the last layer of a model (the old model) and make a new model of only one layer (the new model) that has the exact same parameters as the last layer of the old model. I want to do this in a way that's agnostic to whatever the last layer of the old model happens to be. I'm trying to do it with this code, but am getting an error.
newModel = Sequential()
newModel.add(type(oldModel.layers[-1])(oldModel.layers[-1].output_shape,
                                       activation=oldModel.layers[-1].activation,
                                       input_shape=oldModel.layers[-1].input_shape))
That yields the following error:
TypeError: __init__() missing 1 required positional argument: 'output_dim'
If I check the last layer in oldModel, it shows me this:
full_model.model.layers[-1]
>>>> <keras.layers.core.Dense at 0x7fe22010e128>
I tried adding output_dim to the list of parameters I'm copying, but that didn't help; it gave me this error instead:
Exception: Input 0 is incompatible with layer dense_8: expected ndim=2, found ndim=3
Any idea what I'm doing wrong here?
Found the answer myself. Instead of setting input_shape to the input_shape of the last layer of the old model, I set it to the output_shape of the penultimate layer, keeping only [1:] of that shape tuple (dropping the batch dimension). The code that works is as follows:
newModel.add(type(oldModel.layers[-1])(oldModel.layers[-1].output_shape,
                                       activation=oldModel.layers[-1].activation,
                                       input_shape=oldModel.layers[-2].output_shape[1:]))
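For completeness, a more generic sketch of the same idea (assuming the standard Keras layer API: get_config, from_config, get_weights, set_weights) that rebuilds the layer from its own config and also copies the trained weights, so the new layer's parameters really match the old one's:

from keras.models import Sequential
from keras.layers import InputLayer

last = oldModel.layers[-1]
newModel = Sequential()
# Declare the input explicitly: the penultimate layer's output shape,
# without the leading batch dimension
newModel.add(InputLayer(input_shape=oldModel.layers[-2].output_shape[1:]))
# Rebuild the last layer from its own config (works for any layer type)...
newModel.add(type(last).from_config(last.get_config()))
# ...and copy the trained weights across
newModel.layers[-1].set_weights(last.get_weights())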
I wanted to use the multilingual-codesearch model, but the code below doesn't work; it outputs the following error, which suggests the model cannot be loaded from weights alone:
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("ncoop57/multilingual-codesearch")
model = AutoModel.from_pretrained("ncoop57/multilingual-codesearch")
ValueError: Unrecognized model in ncoop57/multilingual-codesearch. Should have a `model_type` key in its config.json, or contain one of the following strings in its name: gpt_neo, big_bird, speech_to_text, vit, wav2vec2, m2m_100, convbert, led, blenderbot-small, retribert, ibert, mt5, t5, mobilebert, distilbert, albert, bert-generation, camembert, xlm-roberta, pegasus, marian, mbart, mpnet, bart, blenderbot, reformer, longformer, roberta, deberta-v2, deberta, flaubert, fsmt, squeezebert, bert, openai-gpt, gpt2, transfo-xl, xlnet, xlm-prophetnet, prophetnet, xlm, ctrl, electra, encoder-decoder, funnel, lxmert, dpr, layoutlm, rag, tapas
Then I downloaded the PyTorch bin file, but it only contains the weight dictionary (the state dictionary, as mentioned here), which means that if I want to use the model I have to initialize the right architecture and then load the weights.
But how am I supposed to find the architecture that fits the weights of a model this complex? I saw that some methods can recover the model from the weight dictionary, but I didn't manage to make them work.
How can one recover the architecture from a weight dictionary in order to make the model work? Is it even possible?
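In case it helps, a minimal sketch (plain PyTorch, assuming the downloaded pytorch_model.bin is an ordinary state dict) for inspecting what the weights contain; the parameter names and tensor shapes usually hint at the layer structure:

import torch

# Load the weights file on CPU and list every parameter with its shape
state_dict = torch.load("pytorch_model.bin", map_location="cpu")
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))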
In Android Studio, I'm not able to view the model metadata, even though I added the metadata manually in Python. I get the error:
This model is not supported: input tensor 0 does not have a name.
My attempts at fixing:
I added the layer name to the TensorFlow input layer, using:
img_input = Input(shape=input_shape, batch_size=1, name="input_image")
I even checked it in Netron, and the input layer showed up as input_image as expected.
I fixed it by copying one line from the documentation: I needed to set the name property on the TensorMetadataT object (the docs example uses input_meta.name = "image"). This does not come from the model's layer name; it has to be added manually:
input_meta.name = "image_input"
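For context, a minimal sketch of where that line sits, following the tflite-support metadata docs pattern (the description string here is just an illustrative example):

from tflite_support import metadata_schema_py_generated as _metadata_fb

# The tensor metadata carries its own name, which is what Android Studio
# checks; the Keras layer name is not used here
input_meta = _metadata_fb.TensorMetadataT()
input_meta.name = "image_input"
input_meta.description = "Input image to be classified."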
Weirdly, this wasn't a problem before; at some point my previous code broke and I needed this change.
I am following this notebook:
One of its methods:
def init_hidden(self, batch_size):
    ''' Initializes hidden state '''
    # Create two new tensors with sizes n_layers x batch_size x n_hidden,
    # initialized to zero, for the hidden state and cell state of the LSTM
    weight = next(self.parameters()).data
    if train_on_gpu:
        hidden = (weight.new(self.n_layers, batch_size, self.n_hidden).zero_().cuda(),
                  weight.new(self.n_layers, batch_size, self.n_hidden).zero_().cuda())
    else:
        hidden = (weight.new(self.n_layers, batch_size, self.n_hidden).zero_(),
                  weight.new(self.n_layers, batch_size, self.n_hidden).zero_())
    return hidden
I would like to see what type weight is and how the new() method works, so I was trying to find the parameters() method, since the data attribute comes from parameters().
Surprisingly, I cannot find where it is defined, even after reading the source code of the nn module in PyTorch.
How do you figure out where to find the definitions of methods you see in PyTorch?
so I was trying to find the parameters() method, since the data attribute comes from parameters(). Surprisingly, I cannot find where it is defined, even after reading the source code of the nn module in PyTorch.
You can see the module definition under torch/nn/modules/module.py here at line 178.
You can then easily spot the parameters() method here.
How do you figure out where to find the definitions of methods you see in PyTorch?
The easiest way, and the one I always use myself, is VS Code's Go to Definition or its Peek -> Peek Definition feature.
I believe PyCharm has similar functionality as well.
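If you prefer doing it from a Python session, the standard-library inspect module works too (a small sketch):

import inspect
import torch.nn as nn

# Locate the file a method is defined in, then print its source
print(inspect.getsourcefile(nn.Module.parameters))
print(inspect.getsource(nn.Module.parameters))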
You can also check source code directly from PyTorch documentation.
See here for the torch.nn.Module.parameters function (just click the orange "Source" link, which takes you here).
The source is linked whenever the function isn't written in C/C++ or other low-level code; in that case you have to go through GitHub and find your way around the project.
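As a quick illustration of what the two calls in init_hidden actually do, here is a small sketch with a throwaway nn.LSTM (the sizes are arbitrary):

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, num_layers=2)
# parameters() yields the module's learnable tensors; .data unwraps the
# raw tensor from the first one
weight = next(lstm.parameters()).data
print(type(weight))  # <class 'torch.Tensor'>
# Tensor.new(...) creates an uninitialized tensor with the same dtype and
# device as weight; zero_() then fills it with zeros in place
hidden = weight.new(2, 4, 16).zero_()
print(hidden.shape)  # torch.Size([2, 4, 16])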
I'm trying to visualize my model graph on TensorBoard. I'm using Keras 2.1.5 and tensorflow-gpu 1.13.1. My model is a stack of convolutional layers with, at the end, a custom layer where I perform some operations on tensors.
Everything works fine, although I added some prints at the beginning and end of my custom layer and noticed that Python enters my code twice when I only call it once.
While running some training and checking the model graph on TensorBoard, I found something I haven't seen in any other example on the web:
Graph of custom layer
There is a connected graph of my custom layer (trans2img), and another, unconnected one with empty placeholders as inputs. I don't understand why.
Here is a simple example along the lines of my code:
import tensorflow as tf

def custom_layer(inputs):
    with tf.name_scope('trans2img'):
        a = inputs[0]

        def some_operation(a):
            with tf.name_scope('op1'):
                b = 2 * a
                return b

        def some_other_op(b, c):
            with tf.name_scope('op2'):
                d = b / c
                return d

        b = some_operation(a)
        d = some_other_op(b, inputs[1])
        return d
After that, in my network definition file, I import this custom layer with from custom_layer import custom_layer and then use it in a Lambda layer:
net = Lambda(custom_layer)([branch1, branch2])
I don't know if it's because of the way I define the inner operations in custom_layer or the way I call them. I would like to know how to interpret this second, unconnected graph, and whether it's an indicator of inefficient code. I would appreciate any clue.
This issue happens when you use multiple tf name scopes.
For each scope, TensorFlow creates a new op named "op_{integer}".
You need to make use of absolute_name_scope(scope) to resolve the issue.
Please refer to the links below for how to use it.
https://github.com/tensorflow/tensorflow/issues/9545
https://github.com/tensorflow/tensorflow/pull/23250/commits/1169eaca048b660905fc5776e617cc824a19b125
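To see the scope uniquification this refers to, here is a standalone TF 1.x sketch (not the asker's model): building ops under the same name scope twice makes TensorFlow rename the second scope, so the graph ends up with two copies of the ops.

import tensorflow as tf  # TF 1.x graph mode, matching the question's setup

g = tf.Graph()
with g.as_default():
    for _ in range(2):
        with tf.name_scope('trans2img'):
            a = tf.placeholder(tf.float32, name='a')
            b = tf.multiply(a, 2.0, name='double')
# The second pass gets a uniquified scope: trans2img_1
print([op.name for op in g.get_operations()])
# ['trans2img/a', 'trans2img/double', 'trans2img_1/a', 'trans2img_1/double']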
I want to use the Xception model to classify images, but I am getting a ValueError.
xception = keras.applications.xception.Xception(include_top=False, input_shape=(71, 71, 3))
classifier = Sequential()
for layer in xception.layers:
    classifier.add(layer)
I am getting this error:
ValueError: Input 0 is incompatible with layer conv2d_1: expected axis -1 of input shape to have value 64 but got shape (None, 33, 33, 128)
I also get this error when using ResNet, but I don't get it when using VGG16 or VGG19. Can anyone say how to use it?
Xception, like ResNet, is not a linear stack of layers: it contains branches and residual connections, so its layers cannot be copied one by one into a Sequential model (VGG16 and VGG19 are purely sequential, which is why those work). You can use the functional API instead. Here is one possible example of a classifier:
# Base model: Xception
xception = keras.applications.xception.Xception(include_top=False, input_shape=(71, 71, 3))
# Input of your model
inputs = Input(shape=(71, 71, 3))
# Add the Xception base model to your model
y = xception(inputs)
# ... other layers, each called on the previous output ...
y = Dense(...)(y)
# Define the model
model = Model(inputs, y)
Docs
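A filled-in version of that sketch (the pooling layer, the 256-unit hidden layer, and the 10-class softmax head are illustrative assumptions, not from the original post):

import keras
from keras.layers import Input, GlobalAveragePooling2D, Dense
from keras.models import Model

# Convolutional base without the classification head
xception = keras.applications.xception.Xception(include_top=False,
                                                input_shape=(71, 71, 3))
inputs = Input(shape=(71, 71, 3))
y = xception(inputs)
y = GlobalAveragePooling2D()(y)  # collapse spatial dims to a feature vector
y = Dense(256, activation='relu')(y)
outputs = Dense(10, activation='softmax')(y)

model = Model(inputs, outputs)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()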