ModelCheckpoint doesn't save the model - python-3.x

I am trying to build a speech recognition model following this tutorial
https://www.analyticsvidhya.com/blog/2019/07/learn-build-first-speech-to-text-model-python/
There are two parts: the first is a training model whose output is the input of the second part (the testing model).
At the end of the training model there is this part, which should save the result of the training:
model = Model(inputs, outputs)
model.summary()
model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['acc'])
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=10, min_delta=0.0001)
mc = ModelCheckpoint('best_model.hdf5', monitor='val_acc', verbose=1, save_best_only=True, mode='max')
So the result should be saved in the file "best_model.hdf5".
The model runs without any error, but I didn't find any file created.
When I tried to load the model in the testing part, I got an error message saying this file wasn't found.
Any help, please?
Keras version installed: 2.3.1
update 1:
I tried to find the location at which my code is running using:
print(os.getcwd())
I got the same directory as the model file. I tried to put this location in the code, to save to it and to load from it, but there is still no file created and I get the same error message.
update 2:
I added
print(os.listdir())
after the ModelCheckpoint call, and the file was not listed there either.

How about defining checkpoint_file_path separately and using that variable in the call? This mostly happens because of the "/" before the file path name, so you can try "/best_model.hdf5".
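Also worth noting: ModelCheckpoint only writes a file during training, so the callbacks must actually be passed to model.fit; without that call no file is ever created. A minimal sketch combining both points (x_tr, y_tr, x_val, y_val are placeholder names for the training and validation arrays from the tutorial):

import os

# Build an absolute checkpoint path so there is no ambiguity about where the file lands.
checkpoint_file_path = os.path.join(os.getcwd(), 'best_model.hdf5')
mc = ModelCheckpoint(checkpoint_file_path, monitor='val_acc', verbose=1, save_best_only=True, mode='max')

# The callbacks only fire if they are handed to fit().
history = model.fit(x_tr, y_tr, epochs=100, batch_size=32,
                    callbacks=[es, mc],
                    validation_data=(x_val, y_val))

With save_best_only=True the file also only appears once val_acc improves for the first time, i.e. after at least one completed epoch with validation data.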

Related

How to save and load the trained LSTM model?

I am trying to save and load with the following code, but it is not working: it shows me an error saying it cannot find the model. Am I missing something? I'm using Google Colab. Thank you.
import keras
from tensorflow.keras.losses import MeanAbsoluteError
from tensorflow.keras.metrics import RootMeanSquaredError
from tensorflow.keras.models import load_model

callbacks_list = [
    keras.callbacks.EarlyStopping(monitor='val_loss', patience=6),
    keras.callbacks.ModelCheckpoint(filepath='my_model.h5', monitor='val_loss',
                                    mode='min', save_freq='epoch', save_best_only=True),
]
model.compile(loss=MeanAbsoluteError(), optimizer='Adam', metrics=[RootMeanSquaredError()])
history = model.fit(X_train, y_train, batch_size=512, epochs=100,
                    callbacks=callbacks_list, validation_data=(X_val, y_val))

# save model to a single file
model.save('my_model.h5')
# to load the model
model = load_model('my_model.h5')
Since you are using Google Colab, you must mount your drive to access your data on Colab. Assuming the notebook you are executing is in the directory my_dir (update the path to YOUR particular path), add the following code to a cell before your save-and-load code:
from google.colab import drive
drive.mount('/content/drive') # mounts the drive
%cd /content/drive/MyDrive/my_dir/ # moves your position inside the directory where you are executing the code
# ... your code to save and your code to load
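After mounting, you can sanity-check that the file really ends up in that directory, for example (my_dir again stands in for your own path):

import os
# 'my_model.h5' should appear in this listing after model.save() runs
print(os.listdir('/content/drive/MyDrive/my_dir'))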

Load pytorch model with correct args from files

Having followed Chris McCormick's tutorial for creating a BERT Fake News Detector (link here), at the end he saves the PyTorch model using the following code:
import os

output_dir = './model_save/'
if not os.path.exists(output_dir):
    os.makedirs(output_dir)

# Save a trained model, configuration and tokenizer using `save_pretrained()`.
# They can then be reloaded using `from_pretrained()`.
model_to_save = model.module if hasattr(model, 'module') else model
model_to_save.save_pretrained(output_dir)
tokenizer.save_pretrained(output_dir)
As he says himself, it can be reloaded using from_pretrained(). Currently, the code creates an output directory with six files:
config.json
merges.txt
pytorch_model.bin
special_tokens_map.json
tokenizer_config.json
vocab.json
So how can I use the from_pretrained() method to load the model with all of its arguments and respective weights, and which of the six files do I use?
I understand that a model can be loaded as such (from PyTorch documentation):
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()
but how can I make use of the files in the output directory to do this?
Any help is appreciated!
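For what it's worth, a directory written by save_pretrained() can normally be reloaded by pointing from_pretrained() at that same directory: config.json and pytorch_model.bin are read for the model, and the remaining files for the tokenizer. A minimal sketch (the Auto* classes here are an assumption; substitute the concrete model and tokenizer classes used in the tutorial if you know them):

from transformers import AutoModelForSequenceClassification, AutoTokenizer

output_dir = './model_save/'
# reads config.json + pytorch_model.bin from the directory
model = AutoModelForSequenceClassification.from_pretrained(output_dir)
# reads vocab.json, merges.txt and the tokenizer config files
tokenizer = AutoTokenizer.from_pretrained(output_dir)
model.eval()  # switch to inference mode

This replaces the TheModelClass(*args, **kwargs) / load_state_dict pattern: from_pretrained() reconstructs the architecture from config.json and then loads the weights for you.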

In spaCy custom trained model: Config validation error: ner -> incorrect_spans_key extra fields not permitted

I run into a problem whenever I try to load a custom trained spaCy NER model inside a Docker container.
Note:
I am using the latest spaCy version 3.0 and trained the NER model using spaCy's CLI commands, first converting the training data into the .spacy format.
The error is thrown as follows (you can check the error in the hyperlinked image):
config validation error
My trained model file structure looks like this:
custom ner model structure
But when I run the model without Docker, it works perfectly. What have I done wrong in this process? Please help me resolve the error.
Thank you in advance.
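One thing worth verifying, as a hunch rather than a confirmed diagnosis: incorrect_spans_key is a setting that newer spaCy releases add to the ner component's config, so a model trained with a newer spaCy and loaded by an older spaCy inside the container can fail config validation exactly like this. A quick check to run both inside and outside Docker:

import spacy
# Compare this version (inside and outside the container) with the
# spacy_version field in the trained model's meta.json; the loading
# environment should not be older than the training environment.
print(spacy.__version__)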

Object detection in Detectron2 using PyTorch on Google Colab: reuse an already trained model, or import an existing trained model and predict the objects

1. First I downloaded the output folder of the trained model and imported it into a new project on the Google Colab server.
2. In the new project, without training the model, I set the path of model_final.pth from the existing output folder: cfg.MODEL.WEIGHTS = "/content/output/model_final.pth", but it goes into an infinite loop.
3. I changed the model weights to cfg.MODEL.WEIGHTS = "detectron2://COCO-Detection/faster_rcnn_R_101_FPN_3x/137851257/model_final_f6e8b1.pkl", but it still doesn't predict objects.
4. I changed the model weights path to the previously trained model's metrics JSON file, cfg.MODEL.WEIGHTS = "/content/output/metrics.json", and it still doesn't work.
5. Using DetectionCheckpointer(model).load("/content/output/model_final.pth") or DetectionCheckpointer(model).load("detectron2://COCO-Detection/faster_rcnn_R_101_FPN_3x/137851257/model_final_f6e8b1.pkl") gives an error that model is not defined.
What is this model_final.pkl file, and where do we get it?
What should we do to import the existing trained model and predict objects in a new project?
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.DATASETS.TEST = ("microcontroller_test", )
predictor = DefaultPredictor(cfg)
The above code goes into an infinite loop.
DetectionCheckpointer(model).load("/content/output/model_final.pth")
DetectionCheckpointer(model).load("detectron2://COCO-Detection/faster_rcnn_R_101_FPN_3x/137851257/model_final_f6e8b1.pkl")
Error:
NameError Traceback (most recent call last)
<ipython-input-12-69f2a7846756> in <module>()
----> 1 DetectionCheckpointer(model).load("/content/output/model_final.pth")
2
3 DetectionCheckpointer(model).load("detectron2://COCO-Detection/faster_rcnn_R_101_FPN_3x/137851257/model_final_f6e8b1.pkl")
NameError: name 'model' is not defined
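Regarding the NameError specifically: DetectionCheckpointer expects an already constructed model object. DefaultPredictor builds one internally, but standalone code has to build it explicitly. A minimal sketch, assuming cfg is already configured as above:

from detectron2.modeling import build_model
from detectron2.checkpoint import DetectionCheckpointer

model = build_model(cfg)  # instantiate the architecture described by cfg
DetectionCheckpointer(model).load("/content/output/model_final.pth")  # load the trained weights into it
model.eval()  # switch to inference mode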

Operator translate error occurs when I try to convert onnx file to caffe2

I trained an object detection model in PyTorch, and I have exported it to an ONNX file.
Now I want to convert it to a Caffe2 model:
import onnx
import numpy as np
import caffe2.python.onnx.backend as onnx_caffe2_backend

# Load the ONNX ModelProto object. model is a standard Python protobuf object.
model = onnx.load("CPU4export.onnx")

# Prepare the Caffe2 backend for executing the model. This converts the ONNX
# model into a Caffe2 NetDef that can execute it. Other ONNX backends, like
# one for CNTK, will be available soon.
prepared_backend = onnx_caffe2_backend.prepare(model)

# Run the model in Caffe2.
# Construct a map from input names to Tensor data. The graph of the model
# itself contains inputs for all weight parameters, after the input image.
# Since the weights are already embedded, we just need to pass the input
# image (x and torch_out come from the earlier PyTorch export step).
W = {model.graph.input[0].name: x.data.numpy()}

# Run the Caffe2 net:
c2_out = prepared_backend.run(W)[0]

# Verify the numerical correctness up to 3 decimal places.
np.testing.assert_almost_equal(torch_out.data.cpu().numpy(), c2_out, decimal=3)
print("Exported model has been executed on Caffe2 backend, and the result looks good!")
I always get this error:
RuntimeError: ONNX conversion failed, encountered 1 errors:
Error while processing node: input: "90"
input: "91"
output: "92"
op_type: "Resize"
attribute {
  name: "mode"
  s: "nearest"
  type: STRING
}
. Exception: Don't know how to translate op Resize
How can I solve it?
The problem is that the Caffe2 ONNX backend does not yet support the Resize operator.
Please raise an issue on the Caffe2/PyTorch GitHub; there's an active community of developers who should be able to address this use case.
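Until that is fixed, one workaround that may be worth trying (an untested suggestion for this particular model): re-export the ONNX file with an older opset. Resize replaced Upsample starting with ONNX opset 10, so exporting at opset 9 makes PyTorch emit Upsample nodes, which the Caffe2 backend does understand. Here model and dummy_input stand in for your trained network and a sample input of the right shape:

import torch

# Exporting at opset 9 emits Upsample instead of the unsupported Resize op.
torch.onnx.export(model, dummy_input, "CPU4export.onnx", opset_version=9)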
