XLNetForSequenceClassification Pretrained model unable to load - nlp

I tried loading the pretrained XLNet model, but the error below occurred. This worked before, but now it doesn't. Any suggestions on how to fix this problem?
model = XLNetForSequenceClassification.from_pretrained("xlnet-large-cased", num_labels=2)
model.to(device)
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-55-d6f698a3714b> in <module>()
----> 1 model = XLNetForSequenceClassification.from_pretrained("xlnet-large-cased", num_labels = 2)
2 model.to(device)
3 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/sparse.py in __init__(self, num_embeddings, embedding_dim, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse, _weight)
95 self.scale_grad_by_freq = scale_grad_by_freq
96 if _weight is None:
---> 97 self.weight = Parameter(torch.Tensor(num_embeddings, embedding_dim))
98 self.reset_parameters()
99 else:
RuntimeError: Trying to create tensor with negative dimension -1: [-1, 1024]

You should import XLNetForSequenceClassification from transformers, not from pytorch-transformers. First, make sure transformers is installed:
> pip install transformers
Then, in your code:
from transformers import XLNetForSequenceClassification
model = XLNetForSequenceClassification.from_pretrained("xlnet-large-cased", num_labels=2)
This should work.

If you haven't changed anything internally, it's most likely a version mismatch. Have you upgraded any relevant modules? If so, going back to the previous version should solve it.
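To confirm the version-mismatch theory, you can check the installed version in Python:
import transformers
print(transformers.__version__)
and pin a known-good release if needed (3.5.1 below is only a placeholder; substitute whichever version worked for you):
> pip install transformers==3.5.1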

Dealing with infs in Seq2Seq Trainer

I am trying to fine-tune a Hugging Face model on a shellcode dataset (https://huggingface.co/datasets/SoLID/shellcode_i_a32).
The training code is a basic Hugging Face Trainer setup, but we keep running into nan/inf issues:
from transformers import (DataCollatorForSeq2Seq, PreTrainedTokenizerFast,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments)

tokenizer = PreTrainedTokenizerFast(tokenizer_file="tkn1.json", padding_side="right")
special_tokens = {'pad_token': "[PAD]"}
tokenizer.add_special_tokens(special_tokens)
# token_wrap = PreTrainedTokenizer()
data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=model)
training_args = Seq2SeqTrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    lr_scheduler_type="cosine",
    weight_decay=0.01,
    save_total_limit=3,
    per_device_train_batch_size=128,
    num_train_epochs=5,
    warmup_ratio=0.06,
    learning_rate=1.0e-04,
    # fp16=True,
    debug=["underflow_overflow"],
)
trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["test"],
    eval_dataset=tokenized_datasets["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
)
# trainer.train()
# print(tokenizer.)
trainer.train()
# eval_loss = trainer.evaluate()
# print(f">>> Perplexity: {math.exp(eval_loss['eval_loss']):.2f}")
The output looks like this:
You're using a PreTrainedTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
Detected inf/nan during batch_number=0
Last 1 forward frames:
abs min abs max metadata
shared Embedding
5.42e-06 2.04e+04 weight
0.00e+00 1.46e+03 input[0]
1.56e-03 2.04e+04 output
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-120-ff4a54906908> in <module>
33 # trainer.train()
34 # print(tokenizer.)
---> 35 trainer.train()
36 # eval_loss = trainer.evaluate()
37 # print(f">>> Perplexity: {math.exp(eval_loss['eval_loss']):.2f}")
9 frames
/usr/local/lib/python3.8/dist-packages/transformers/debug_utils.py in forward_hook(self, module, input, output)
278
279 # now we can abort, as it's pointless to continue running
--> 280 raise ValueError(
281 "DebugUnderflowOverflow: inf/nan detected, aborting as there is no point running further. "
282 "Please scroll up above this traceback to see the activation values prior to this event."
ValueError: DebugUnderflowOverflow: inf/nan detected, aborting as there is no point running further. Please scroll up above this traceback to see the activation values prior to this event.
The very first layer starts throwing inf/nans as soon as training begins, and the run doesn't get much further than that.
We have tried tweaking our training arguments but have hit a brick wall here. Any help appreciated!
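No answer was posted in this thread, but as a general sketch (not specific to this dataset or model): early inf/nan during training is often tamed by lowering the learning rate and tightening gradient clipping, both of which are ordinary Seq2SeqTrainingArguments parameters. The values below are illustrative assumptions, not known-good settings for this run:
training_args = Seq2SeqTrainingArguments(
    output_dir="./results",
    learning_rate=1e-5,        # assumption: 10x lower than the original 1.0e-04
    max_grad_norm=0.5,         # tighter gradient clipping (the Trainer default is 1.0)
    warmup_ratio=0.06,
    lr_scheduler_type="cosine",
    per_device_train_batch_size=128,
    num_train_epochs=5,
)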

Issue implementing InceptionV3 with binary classifier - transfer learning with Pytorch

I'm having an issue getting Inception V3 to work as a feature extractor with a binary classifier in PyTorch. I updated the primary and auxiliary nets in Inception to have two output classes (as done in https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html),
but I'm getting an error:
# Parameters for Inception V3
num_classes = 2
model_ft = models.inception_v3(pretrained=True)
# set_parameter_requires_grad(model_ft, feature_extract)
# handle auxiliary net
num_ftrs = model_ft.AuxLogits.fc.in_features
model_ft.AuxLogits.fc = nn.Linear(num_ftrs, num_classes)
# handle primary net
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, num_classes)
# input_size = 299
# simulate data input
x = torch.rand([64, 3, 299, 299])
# create model with inception backbone
backbone = model_ft
num_filters = backbone.fc.in_features
layers = list(backbone.children())[:-1]
feature_extractor = nn.Sequential(*layers)
# use the pretrained model to classify damage (2 classes)
num_target_classes = 2
classifier = nn.Linear(num_filters, num_target_classes)
feature_extractor.eval()
with torch.no_grad():
    representations = feature_extractor(x).flatten(1)
x = classifier(representations)
But I'm getting the error:
RuntimeError Traceback (most recent call last)
<ipython-input-54-c2be64b8a99e> in <module>()
11 feature_extractor.eval()
12 with torch.no_grad():
---> 13 representations = feature_extractor(x)
14 x = classifier(representations)
9 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
442 _pair(0), self.dilation, self.groups)
443 return F.conv2d(input, weight, bias, self.stride,
--> 444 self.padding, self.dilation, self.groups)
445
446 def forward(self, input: Tensor) -> Tensor:
RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [64, 2]
Before I updated the number of classes to 2 (when it was 1000), I was getting the same error but with [64, 1000]. This method of creating a backbone and adding a classifier worked for ResNet but not here. I think it's because of the auxiliary net structure, but I'm not sure how to update it to deal with the dual output. Thanks!
Creating feature_extractor via the children() function at the line layers = list(backbone.children())[:-1] only copies the modules from backbone into feature_extractor; it does not carry over the operations in backbone's forward function.
Let's take a look at the code below:
class Example(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.avg = torch.nn.AdaptiveAvgPool2d((1, 1))
        self.linear = torch.nn.Linear(10, 1)

    def forward(self, x):
        out = self.avg(x)
        out = out.squeeze()
        out = self.linear(out)
        return out

x = torch.randn(5, 10, 12, 12)
model = Example()
y = model(x)  # works well

new_model = torch.nn.Sequential(*list(model.children()))
y = new_model(x)  # error
The modules model and new_model have the same blocks but do not work the same way: in new_model, the output from the pooling layer is never squeezed, so the shape of the linear layer's input violates its assumption, which causes the error.
In your case, the feature_extractor/classifier construction at the end is redundant, and that's why it raises the error: you already created a new fc inside the InceptionV3 module at the line model_ft.fc = nn.Linear(num_ftrs, num_classes). Therefore, replacing those last lines with the code below should work fine:
with torch.no_grad():
    x = model_ft(x)

Classification Metrics for Sequential tagging in NLP

I am writing sequential tagging code in NLP using Python on Google Colab. As I have a CRF layer in my model, I used the sklearn_crfsuite metrics to get the results. I executed the following code:
pred_cat = model.predict(X_te)
pred = np.argmax(pred_cat, axis=-1)
y_te_true = np.argmax(y_te, axis=-1)
tags = ['O', 'BOC', 'IOC', '<pad>']
from sklearn_crfsuite import metrics as crf_metrics
print(crf_metrics.flat_classification_report(y_true=y_te_true, y_pred=pred, labels=tags))
I am getting the following error:
TypeError Traceback (most recent call last)
<ipython-input-21-510856efb26d> in <module>()
1 from sklearn_crfsuite import metrics as crf_metrics
----> 2 print(crf_metrics.flat_classification_report(y_true=y_te_true,y_pred=pred,labels=tags))
1 frames
/usr/local/lib/python3.7/dist-packages/sklearn_crfsuite/metrics.py in flat_classification_report(y_true, y_pred, labels, **kwargs)
66 """
67 from sklearn import metrics
---> 68 return metrics.classification_report(y_true, y_pred, labels, **kwargs)
69
70
TypeError: classification_report() takes 2 positional arguments but 3 were given
Can anyone please shed some light on this?
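No answer was posted in this thread, but the TypeError itself points at the cause: sklearn_crfsuite passes labels positionally to sklearn.metrics.classification_report, whose arguments after y_true and y_pred are keyword-only in recent scikit-learn releases. A hedged workaround, assuming y_te_true and pred are per-sentence tag sequences using the same label encoding as tags, is to flatten them and call scikit-learn directly:
from itertools import chain
from sklearn import metrics

# flatten the per-sentence sequences ourselves, then pass labels by keyword
y_true_flat = list(chain.from_iterable(y_te_true))
y_pred_flat = list(chain.from_iterable(pred))
print(metrics.classification_report(y_true_flat, y_pred_flat, labels=tags))
Alternatively, pinning an older scikit-learn release that still accepts positional arguments should also make the original call work.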

save and load fine-tuned bert classification model using tensorflow 2.0

I am trying to save a fine-tuned binary classification model based on the pretrained BERT module 'uncased_L-12_H-768_A-12'. I'm using TF 2.
This code sets up the model structure:
bert_classifier, bert_encoder = bert.bert_models.classifier_model(bert_config, num_labels=2)
Then:
# import the pre-trained model structure from the checkpoint file
checkpoint = tf.train.Checkpoint(model=bert_encoder)
checkpoint.restore(
    os.path.join(gs_folder_bert, 'bert_model.ckpt')).assert_consumed()
Then I compiled and fit the model:
bert_classifier.compile(
    optimizer=optimizer,
    loss=loss,
    metrics=metrics)
bert_classifier.fit(
    Text_train, Label_train,
    validation_data=(Text_val, Label_val),
    batch_size=32,
    epochs=1)
Lastly, I saved the model to the model folder, which automatically generates a saved_model.pb file inside:
bert_classifier.save('/content/drive/My Drive/model')
I also tried this:
tf.saved_model.save(bert_classifier, export_dir='/content/drive/My Drive/omg')
Now I try to load the model and apply it to the test data:
from tensorflow import keras
ttt = keras.models.load_model('/content/drive/My Drive/model')
I got:
KeyError Traceback (most recent call last)
<ipython-input-77-93f80aa585da> in <module>()
----> 1 tf.keras.models.load_model(filepath='/content/drive/My Drive/omg', custom_objects={'Transformera':bert_classifier})
9 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/saved_model/load.py in _revive_graph_network(self, metadata, node_id)
392 else:
393 model = models_lib.Functional(
--> 394 inputs=[], outputs=[], name=config['name'])
395
396 # Record this model and its layers. This will later be used to reconstruct
KeyError: 'name'
This error message doesn't help me figure out what to do... please kindly advise.
I also tried to save the model in h5 format, but when I load it:
ttt = keras.models.load_model('/content/drive/My Drive/model.h5')
I got this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-36-12f76139ec24> in <module>()
----> 1 ttt = keras.models.load_model('/content/drive/My Drive/model.h5')
5 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/generic_utils.py in class_and_config_for_serialized_keras_object(config, module_objects, custom_objects, printable_module_name)
294 cls = get_registered_object(class_name, custom_objects, module_objects)
295 if cls is None:
--> 296 raise ValueError('Unknown ' + printable_module_name + ': ' + class_name)
297
298 cls_config = config['config']
ValueError: Unknown layer: BertClassifier
It seems as if you have the answer right in the question: '/content/drive/My Drive/model' will fail due to the whitespace character.
You could try escaping the space: '/content/drive/My\ Drive/model'.
Another option, after I had exactly the same problem with saving and loading: what helped was to save just the weights of the trained model instead of the whole model.
Take a look here: https://keras.io/api/models/model_saving_apis/, especially at the save_weights() and load_weights() methods.
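A minimal sketch of that weights-only round trip, assuming the architecture is rebuilt with the same classifier_model() call from the question before loading (the path is a placeholder):
# after training: save only the weights
bert_classifier.save_weights('/content/drive/My Drive/bert_weights/ckpt')

# later: rebuild the identical architecture, then load the weights into it
bert_classifier, bert_encoder = bert.bert_models.classifier_model(bert_config, num_labels=2)
bert_classifier.load_weights('/content/drive/My Drive/bert_weights/ckpt')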

tf 2.0 AttributeError: module 'tensorflow' has no attribute 'get_default_session'

I am converting my MNIST CNN code to TF 2.0; it ran well in TF 1.13.
After switching to TF 2.0 and modifying the code, an error occurred at the model-fitting step.
Code:
annealer = LearningRateScheduler(lambda x: 1e-3 * 0.95 ** x)
batch_size = 100
epochs = 30
history = model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
                              epochs=epochs,
                              validation_data=(X_val, Y_val),
                              verbose=1,
                              steps_per_epoch=X_train.shape[0] // batch_size,
                              callbacks=[annealer])
Error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-39-d1e7a6160362> in <module>()
7 verbose = 1,
8 steps_per_epoch=X_train.shape[0] // batch_size,
----> 9 callbacks=[annealer])
5 frames
/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py in get_session()
188 global _SESSION
189
--> 190 default_session = tf.get_default_session()
191
192 if default_session is not None:
AttributeError: module 'tensorflow' has no attribute 'get_default_session'
If I remove the callback, or switch to model.fit without the callbacks option, everything runs well, so I assume this is an incompatibility issue.
Any suggestions on how to implement the callback correctly so that I can use a variable learning rate?
Thanks.
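No answer was posted here, but the traceback gives a strong hint: the callback is resolved through keras/backend/tensorflow_backend.py, i.e. the standalone keras package, which still calls tf.get_default_session and does not work under TF 2.x. The usual fix, sketched below under the assumption that the model itself is also built from tensorflow.keras, is to import the callback from tensorflow.keras instead:
from tensorflow.keras.callbacks import LearningRateScheduler

annealer = LearningRateScheduler(lambda x: 1e-3 * 0.95 ** x)
# in TF 2, model.fit accepts generators directly, so fit_generator is not needed
history = model.fit(datagen.flow(X_train, Y_train, batch_size=batch_size),
                    epochs=epochs,
                    validation_data=(X_val, Y_val),
                    verbose=1,
                    steps_per_epoch=X_train.shape[0] // batch_size,
                    callbacks=[annealer])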
