I tried converting a PyTorch model for deployment on iOS, so I'm converting the model to either Caffe or ONNX (https://discuss.pytorch.org/t/typeerror-forward-missing-2-required-positional-arguments-cap-lens-and-hidden/20010/8). My initial error was that the LSTM only had one layer, but after addressing that I encountered another problem.
I tried implementing two layers (nlayers=2) by instantiating a new RNN object called text_encoder.
Yet I'm given an error that some key(s) in the state_dict are missing. This error doesn't occur with one layer, but with one layer I instead get an error during the conversion (https://discuss.pytorch.org/t/typeerror-forward-missing-2-required-positional-arguments-cap-lens-and-hidden/20010/8).
I'm not sure if this is happening because the model I loaded only had one layer, or because of some other problem. How can I recover the missing keys while adding a new layer? Or is that impossible? Original post: (https://discuss.pytorch.org/t/missing-key-s-in-state-dict/20154) and (https://discuss.pytorch.org/t/typeerror-forward-missing-2-required-positional-arguments-cap-lens-and-hidden/20010/11)
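There is no way to recover weights a checkpoint never contained, but you can load the one-layer weights into the two-layer model and let the new layer train from scratch. A hedged sketch, assuming an RNN_ENCODER class like the one in the linked thread (ntokens, the constructor arguments, and the checkpoint path are placeholders):

import torch

# Assumption: only nlayers differs from the saved one-layer model.
text_encoder = RNN_ENCODER(ntokens, nhidden=256, nlayers=2)

state_dict = torch.load('text_encoder.pth', map_location='cpu')
# strict=False loads every key that matches and skips the rest; the
# second layer's weights keep their fresh random initialisation.
text_encoder.load_state_dict(state_dict, strict=False)

# The "missing" keys are simply parameters the checkpoint never had:
missing = set(text_encoder.state_dict()) - set(state_dict)
print(missing)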
I have created a PyTorch object from the class Sequential (see the official page).
As they suggest, I am saving it using the command torch.save(model.state_dict(), PATH).
It seems that everything has worked fine, since when I use torch.load(PATH) in another file I get
an OrderedDict like
'0.weight': tensor([[0.1202, ...]]),
'0.bias': tensor([0.1422, ...]),
...
with all the shapes of the tensors being correct. However, when doing
model = Sequential()
model.load_state_dict(torch.load(PATH))
I get the error
RuntimeError: Error(s) in loading state_dict for Sequential:
Unexpected key(s) in state_dict: "0.weight", "0.bias", "2.weight", "2.bias", "4.weight", "4.bias".
The model you are trying to load into (model = Sequential()) is an empty Sequential object with no layers at all. The error message, on the other hand, shows that the saved state_dict comes from a model with at least five submodules, where the ones at indices 0, 2, and 4 each carry a weight and a bias parameter (the parameterless modules in between were presumably activations). Every key is therefore "unexpected", since none of the corresponding layers exist in model.
To fix this, model must have exactly the same architecture as the model you saved. Since you say you saved the original model yourself, recreate model with the identical layer construction before calling load_state_dict, as in the sketch below.
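A minimal sketch, assuming the original was three Linear layers with ReLUs in between (the sizes 10, 64, and 1 are placeholders; use whatever the saved model actually used):

import torch
from torch import nn

# Indices 0, 2, 4 of this Sequential match the keys in the error message:
# "0.weight", "0.bias", "2.weight", "2.bias", "4.weight", "4.bias".
model = nn.Sequential(
    nn.Linear(10, 64),
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, 1),
)
model.load_state_dict(torch.load(PATH))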
I am struggling with restoring a Keras model from a .pb file. I have seen a couple of posts explaining how to run inference using a model saved in .pb format, but what I need is to load the weights into a keras.Model class.
I have a function that returns an instance of the said original model, but untrained. I want to restore the weights of that model from the .pb file. My goal is to then truncate the model and use the truncated model for other purposes.
So far the furthest I've gotten is using the tf.saved_model.load(session, ['serving'], export_dir) function to get a tensorflow.core.protobuf.meta_graph_pb2.MetaGraphDef object. From here I can access the graph_def attribute, and from there the nodes.
How can I go from that to getting the weights and then loading those into the instance of the untrained keras Model?
If that's not doable, maybe there is a way to "truncate" the graph_def somehow and then run inference with that?
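One possible approach, as a hedged sketch only: load the SavedModel into a TF1-style session, read out every variable's trained value, and copy the values into the untrained Keras model by matching variable names. Here build_model and export_dir are placeholders, the untrained model is assumed to already be built, and matching by variable name is an assumption that may need adjusting for your graph:

import tensorflow as tf

keras_model = build_model()  # assumption: returns the untrained model

graph = tf.Graph()
with graph.as_default():
    with tf.compat.v1.Session() as sess:
        # The tag may be 'serve' or 'serving' depending on how the
        # model was exported; adjust to match your SavedModel.
        tf.compat.v1.saved_model.loader.load(sess, ['serve'], export_dir)
        # Read the trained value of every variable in the loaded graph.
        trained = {v.name: sess.run(v) for v in tf.compat.v1.global_variables()}

# Copy the values into the Keras model wherever variable names line up.
for layer in keras_model.layers:
    names = [w.name for w in layer.weights]
    if names and all(n in trained for n in names):
        layer.set_weights([trained[n] for n in names])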
I am playing around with code from this GitHub repository: https://github.com/jindongwang/Pytorch-CapsuleNet.
After training the model for 5 epochs, I got an accuracy of 99.2% on the test dataset, so I saved the model using the following code:
torch.save(capsule_net.state_dict(), "capsnet_mnist_state.pt")
I then tried loading the model on another machine with the code below:
capsnet = CapsNet(Config())
capsnet.load_state_dict(torch.load('capsnet_mnist_state.pt'))
capsnet.eval()
Now the model predicts 0 as the output for every input. Is there anything wrong with the way I saved or loaded the model?
I don't think there is anything wrong with the way you saved or loaded your model.
But I happened to see the same problem because I hadn't initialised some parameters, so they were stuck at None even after loading the state dict.
I would suggest inspecting the loaded state_dict to check whether any parameter values make no sense.
Hope this helps.
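A quick way to run that kind of sanity check (a minimal sketch; capsnet is the model loaded above):

# Summarise every parameter; all-zero, NaN, or otherwise nonsensical
# values point to weights that were never initialised or never loaded.
for name, param in capsnet.named_parameters():
    print(name, tuple(param.shape), param.abs().mean().item())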
I am using the class API to subclass a model based on keras.models.Model.
Is there some trick to getting save_weights working?
I am seeing errors like
ValueError: Layer #0 (named "dense_S") expects 0 weight(s), but the saved weights have 2 element(s).
I have tried by_name=True and False.
EDIT: it seems that calling predict once, with ANY data, is needed to build the layers for some reason. It would be interesting to hear a proper explanation from anyone who knows more.
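That matches how subclassed models behave: their weights are created lazily on the first call, so until the model has been called (or built) once, layer #0 really does own 0 weights, and load_weights has nothing to match against. A hedged sketch, with MySubclassedModel, the input shape, and the weights path as placeholders:

import numpy as np

model = MySubclassedModel()  # assumption: your keras.models.Model subclass
# Either build explicitly with the expected input shape...
model.build(input_shape=(None, 10))
# ...or run one forward pass with dummy data to create the weights.
_ = model.predict(np.zeros((1, 10), dtype="float32"))
model.load_weights("weights.h5")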
I'm trying to create a small bidirectional recurrent NN. The model itself compiles without error, but when I try to fit it I get an error stating that I should compile first. Please see the code snippet below:
from keras.models import Sequential
from keras.layers import Bidirectional, GRU, Dense

# fourth recurrent model, bidirectional
bidirectional_recurrent = Sequential()
bidirectional_recurrent.add(Bidirectional(GRU(32, input_shape=(int(lookback/steps), data_scaled.shape[-1]))))
bidirectional_recurrent.add(Dense(1))
bidirectional_recurrent.compile(optimizer='rmsprop', loss='mae')
bidirectional_recurrent_history = bidirectional_recurrent.fit_generator(train_gen, steps_per_epoch=500, epochs=40,
                                                                        validation_data=val_gen, validation_steps=val_steps)
RuntimeError: You must compile your model before using it.
I've used the same setup to train unidirectional RNNs, which worked fine. Any tips to help solve the runtime error are appreciated. (Restarting the kernel did not help.)
Maybe I did not instantiate Bidirectional correctly?
Please note: this question is different from the "Do I need to compile before X?" type of questions.
Note 2: R examples of the same code can be found here.
Found it!
Bidirectional is itself a layer, so the input_shape argument belongs on the Bidirectional() wrapper rather than on the GRU() it wraps. Sequential only sees the outermost layer, so an input_shape buried inside the GRU never reaches the model; the model is never built, and fit_generator complains that it was not compiled. Moving the argument solved the problem,
so
bidirectional_recurrent.add(Bidirectional(GRU(32, input_shape=(int(lookback/steps), data_scaled.shape[-1]))))
becomes
bidirectional_recurrent.add(Bidirectional(GRU(32), input_shape=(int(lookback/steps), data_scaled.shape[-1])))
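For completeness, the whole corrected model in one self-contained block (lookback, steps, and data_scaled come from the question's setup):

from keras.models import Sequential
from keras.layers import Bidirectional, GRU, Dense

bidirectional_recurrent = Sequential()
# input_shape sits on the Bidirectional wrapper, the layer Sequential sees.
bidirectional_recurrent.add(Bidirectional(GRU(32), input_shape=(int(lookback/steps), data_scaled.shape[-1])))
bidirectional_recurrent.add(Dense(1))
bidirectional_recurrent.compile(optimizer='rmsprop', loss='mae')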