I'm trying to visualize my model graph on TensorBoard. I'm using Keras 2.1.5 and tensorflow-gpu 1.13.1. My model is a concatenation of convolutional layers and, at the end, a custom layer where I perform some operations on tensors.
Everything works fine, although I added some prints at the beginning and at the end of my custom layer and noticed that Python enters my code twice, even though I only call it once.
While running some trainings I checked the model graph on TensorBoard and found something I haven't seen in any other example on the web:
[Image: graph of the custom layer]
There is a connected graph of my custom layer (trans2img), and a second, unconnected one with empty placeholders as inputs. I don't understand why.
Here is a simplified example of my code:
import tensorflow as tf

def custom_layer(inputs):
    with tf.name_scope('trans2img'):
        a = inputs[0]

        def some_operation(a):
            with tf.name_scope('op1'):
                b = 2 * a
            return b

        def some_other_op(b, c):
            with tf.name_scope('op2'):
                d = b / c
            return d

        b = some_operation(a)
        d = some_other_op(b, inputs[1])
    return d
After that, in my network definition file, I load this custom layer as from custom_layer import custom_layer, and then use it as a Lambda layer:
net = Lambda(custom_layer)([branch1, branch2])
I don't know if it is because of the way I define the inner operations in my custom_layer or the way I call them. I would like to know how to interpret this second, unconnected graph and whether it's an indicator of inefficient code. I would appreciate any clue and help.
This issue happens when you are using multiple tf name scopes.
For each re-entered scope it creates a new op named "op_{integer}".
You need to use "absolute_name_scope(scope)" to resolve this.
Please refer to the links below on how to use it.
https://github.com/tensorflow/tensorflow/issues/9545
https://github.com/tensorflow/tensorflow/pull/23250/commits/1169eaca048b660905fc5776e617cc824a19b125
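For illustration, here is a minimal TF 1.x sketch (my own, not from the linked PR) of the trailing-slash trick discussed in the issue above: a scope name ending in '/' is treated as an absolute, already-existing scope, so TensorFlow re-enters it instead of creating a uniquified copy.

import tensorflow as tf

with tf.name_scope('trans2img') as scope:  # scope == 'trans2img/'
    a = tf.constant(1.0)

# Re-entering by the bare string would create 'trans2img_1/'; passing
# the captured name with its trailing '/' reuses the existing scope.
with tf.name_scope(scope):
    b = tf.constant(2.0)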
I'm using the pre-trained model:
import fasttext.util
fasttext.util.download_model('en', if_exists='ignore') # English
ft = fasttext.load_model('cc.en.300.bin')
Where can I find an exhaustive list of the values of the hyperparameters used to train the model?
https://fasttext.cc/docs/en/options.html lists the default values, which differ from the ones actually used: for example, the dimension of the word vectors is 300, not 100 (citing https://fasttext.cc/docs/en/crawl-vectors.html, which doesn't list them all).
From looking at the _FastText Python model class in Facebook's source...
https://github.com/facebookresearch/fastText/blob/a20c0d27cd0ee88a25ea0433b7f03038cd728459/python/fasttext_module/fasttext/FastText.py#L99
...it looks like, at least when creating a model, all the hyperparameters are added as attributes on the object.
Have you checked if that's the case on your loaded model? For example, does ft.dim report 300, and do other parameters like ft.minCount report anything interesting?
Update: As that didn't seem to work, it also looks like the _FastText model wraps an internal instance of a native (not-in-Python) FastText model in its .f attribute. (See a few lines up from the source code I pointed to earlier.)
And that native instance is set up by the module defined in fasttext_pybind.cc. That code looks like it specifies a bunch of read-write class variables associated with the metaparameters - see for example starting at:
https://github.com/facebookresearch/fastText/blob/a20c0d27cd0ee88a25ea0433b7f03038cd728459/python/fasttext_module/fasttext/pybind/fasttext_pybind.cc#L88
So: does ft.f.minCount or ft.f.dim return anything useful from a post-loaded model ft?
Citing NVS Abhilash from https://github.com/facebookresearch/fastText/issues/887#issuecomment-649018188, the right code is:
args_obj = ft.f.getArgs()
for hparam in dir(args_obj):
    if not hparam.startswith('__'):
        print(f"{hparam} -> {getattr(args_obj, hparam)}")
This will print all the hyperparameters of the trained model!
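If you only need a single value, you can also read it straight off the same args object (assuming it exposes the attribute, as the loop above suggests), e.g.:

print(ft.f.getArgs().dim)  # should report 300 for cc.en.300.bin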
I wanted to use the multilingual-codesearch model, but the code doesn't work and outputs the following error, which suggests that it cannot load from weights alone:
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("ncoop57/multilingual-codesearch")
model = AutoModel.from_pretrained("ncoop57/multilingual-codesearch")
ValueError: Unrecognized model in ncoop57/multilingual-codesearch. Should have a `model_type` key in its config.json, or contain one of the following strings in its name: gpt_neo, big_bird, speech_to_text, vit, wav2vec2, m2m_100, convbert, led, blenderbot-small, retribert, ibert, mt5, t5, mobilebert, distilbert, albert, bert-generation, camembert, xlm-roberta, pegasus, marian, mbart, mpnet, bart, blenderbot, reformer, longformer, roberta, deberta-v2, deberta, flaubert, fsmt, squeezebert, bert, openai-gpt, gpt2, transfo-xl, xlnet, xlm-prophetnet, prophetnet, xlm, ctrl, electra, encoder-decoder, funnel, lxmert, dpr, layoutlm, rag, tapas
Then I downloaded the PyTorch bin file, but it only contains the weight dictionary (the state dictionary, as mentioned here), which means that if I want to use the model I have to initialize the right architecture and then load the weights.
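To make it concrete, this is the kind of loading I mean (a rough sketch; RobertaConfig/RobertaModel are placeholder guesses, not the actual architecture):

import torch
from transformers import RobertaConfig, RobertaModel

# Placeholder guesses: this only works if the class and config
# actually match the saved weights.
config = RobertaConfig()   # the original hyperparameters would go here
model = RobertaModel(config)
state_dict = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(state_dict)  # raises if the architecture doesn't match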
But how am I supposed to find the architecture that fits the weights of a model that complex? I saw that some methods can supposedly recover the model from the weight dictionary, but I didn't manage to make them work (I'm thinking of enter link description here).
How can one recover the architecture from a weight dictionary in order to make the model work? Is it even possible?
I just started (self) learning TensorFlow and I decided to follow the book "Learning TensorFlow" which I found in my local library.
Unfortunately in the book they are using TensorFlow 1.x, while I want to use the 2.4 version.
I'm having some trouble replicating an example from Chapter 3. The point of the code is to create a new, empty computation graph, create a node (in this case, a constant), and then figure out whether the node belongs to the default graph or to the newly created one.
Here is the code from the book, which should work fine with TensorFlow 1:
import tensorflow as tf
print(tf.get_default_graph())
g = tf.Graph() # This creates a new empty graph
a = tf.constant(5) # This creates a node
print(a.graph is g)
print(a.graph is tf.get_default_graph())
I did realize that get_default_graph() is no longer available in TensorFlow 2, and I substituted it with tf.compat.v1.get_default_graph(), but I still get the following error:
AttributeError: Tensor.graph is meaningless when eager execution is enabled.
Any help will be very much appreciated! Thanks in advance!
After importing TensorFlow you need to disable eager execution, like below:
import tensorflow as tf
tf.compat.v1.disable_eager_execution()
print(tf.compat.v1.get_default_graph())
g = tf.Graph() # This creates a new empty graph
a = tf.constant(5) # This creates a node
print(a.graph is g)
print(a.graph is tf.compat.v1.get_default_graph())
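Alternatively, here is a sketch of the same membership check using g.as_default(), which works in TF 2.x without disabling eager execution globally:

import tensorflow as tf

g = tf.Graph()
with g.as_default():    # ops created in this block are added to g
    a = tf.constant(5)  # a symbolic graph tensor, not an eager one

print(a.graph is g)                                  # True
print(a.graph is tf.compat.v1.get_default_graph())   # False outside the block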
I am following this notebook:
One of the methods is:
def init_hidden(self, batch_size):
    ''' Initializes hidden state '''
    # Create two new tensors with sizes n_layers x batch_size x n_hidden,
    # initialized to zero, for hidden state and cell state of LSTM
    weight = next(self.parameters()).data
    if (train_on_gpu):
        hidden = (weight.new(self.n_layers, batch_size, self.n_hidden).zero_().cuda(),
                  weight.new(self.n_layers, batch_size, self.n_hidden).zero_().cuda())
    else:
        hidden = (weight.new(self.n_layers, batch_size, self.n_hidden).zero_(),
                  weight.new(self.n_layers, batch_size, self.n_hidden).zero_())
    return hidden
I would like to see what type weight is and how the new() method is used, so I was trying to find the parameters() method, since the data attribute comes from it.
Surprisingly, I cannot find where it comes from after reading the source code of the nn module in PyTorch.
How do you figure out where to find the definition of methods you see in PyTorch?
so I was trying to find the parameters() method, since the data attribute comes from it. Surprisingly, I cannot find where it comes from after reading the source code of the nn module in PyTorch.
You can see the module definition under torch/nn/modules/module.py here at line 178.
You can then easily spot the parameters() method here.
How do you figure out where to find the definition of methods you see in PyTorch?
The easiest way, which I myself always use, is VSCode's Go to Definition or its Peek -> Peek Definition feature.
I believe PyCharm has similar functionality as well.
You can also check source code directly from PyTorch documentation.
See here for torch.nn.Module.parameters function (just click orange "Source", which gets you here).
The source is linked whenever the function isn't written in C/C++/low-level code; in that case you have to go through GitHub and find your way around the project.
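Another option, a small sketch using Python's standard inspect module (nothing PyTorch-specific), is to locate the definition programmatically:

import inspect
import torch.nn as nn

# Print the file where parameters() is defined, then its source code.
print(inspect.getsourcefile(nn.Module.parameters))
print(inspect.getsource(nn.Module.parameters))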
I have an onnx graph/model that has big constants in it, so it is taking a lot of time to load and parse. Can I "strip" the data from the graph, so that I can inspect the graph nodes without their data?
initializer is one of the fields in GraphProto. You should be able to clear the initializer field with a simple Python script. I haven't tested the following code, but it should be something like this:
import onnx

def clear_initializer(model_path, output_path):
    model = onnx.load_model(model_path)
    # Drop the stored weights, keeping only the graph structure.
    model.graph.ClearField('initializer')
    onnx.save_model(model, output_path)
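For example (file names here are just placeholders), you could then load the stripped model and walk its nodes quickly:

clear_initializer('model.onnx', 'model_stripped.onnx')

stripped = onnx.load_model('model_stripped.onnx')
for node in stripped.graph.node:
    print(node.op_type, node.name)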
references:
https://developers.google.com/protocol-buffers/docs/reference/python/google.protobuf.message.Message-class
https://github.com/onnx/onnx/blob/2e7099ee7c37b196c197c9a084a97698a41da232/onnx/__init__.py