Unable to save keras model in databricks - keras

I am saving a Keras model with
model.save('model.h5')
in Databricks, but the model is not being saved. I have also tried saving to /tmp/model.h5 as mentioned here, with no luck: the saving cell executes, but when I try to load the model it reports that no model.h5 file is available. Whether I copy the file
dbfs_model_path = 'dbfs:/FileStore/models/model.h5'
dbutils.fs.cp('file:/tmp/model.h5', dbfs_model_path)
or try to load it
tf.keras.models.load_model("file:/tmp/model.h5")
I get the error java.io.FileNotFoundException: File file:/tmp/model.h5 does not exist

The problem is that Keras is designed to work only with local files, so it doesn't understand URIs such as dbfs:/ or file:/. You therefore need to use local paths for the save and load operations, and then copy the files to/from DBFS (unfortunately, the /dbfs FUSE mount doesn't play well with Keras because of the way it handles writes).
The following code works just fine. Note that dbfs:/ and file:/ appear only in the dbutils.fs calls; the Keras calls use plain local file names.
Create the model and save it locally as /tmp/model-full.h5:
from tensorflow.keras.applications import InceptionV3
model = InceptionV3(weights="imagenet")
model.save('/tmp/model-full.h5')
Copy the file to DBFS as dbfs:/tmp/model-full.h5 and check it:
dbutils.fs.cp("file:/tmp/model-full.h5", "dbfs:/tmp/model-full.h5")
display(dbutils.fs.ls("/tmp/model-full.h5"))
Copy the file back from DBFS as /tmp/model-full2.h5 and load it:
dbutils.fs.cp("dbfs:/tmp/model-full.h5", "file:/tmp/model-full2.h5")
from tensorflow import keras
model2 = keras.models.load_model("/tmp/model-full2.h5")
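If you do this round trip often, it can be wrapped in a small helper. A minimal sketch, assuming a Databricks notebook where dbutils is available (the function names are mine, not part of any API):
import os
from tensorflow import keras

def save_to_dbfs(model, dbfs_path, local_tmp='/tmp/_model.h5'):
    model.save(local_tmp)                          # Keras needs a local path
    dbutils.fs.cp(f'file:{local_tmp}', dbfs_path)  # mirror it to DBFS
    os.remove(local_tmp)

def load_from_dbfs(dbfs_path, local_tmp='/tmp/_model.h5'):
    dbutils.fs.cp(dbfs_path, f'file:{local_tmp}')  # pull it back to local disk
    return keras.models.load_model(local_tmp)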

Related

How to save and load the trained LSTM model?

I am trying to save and load the model with the following code, but it is not working: I get an error saying the model cannot be found. Am I missing something? I'm using Google Colab. Thank you
import keras
from tensorflow.keras.losses import MeanAbsoluteError
from tensorflow.keras.metrics import RootMeanSquaredError
from tensorflow.keras.models import load_model

callbacks_list = [
    keras.callbacks.EarlyStopping(monitor='val_loss', patience=6),
    keras.callbacks.ModelCheckpoint(filepath='my_model.h5', monitor='val_loss',
                                    mode='min', save_freq='epoch', save_best_only=True),
]
model.compile(loss=MeanAbsoluteError(), optimizer='Adam', metrics=[RootMeanSquaredError()])
history = model.fit(X_train, y_train, batch_size=512, epochs=100,
                    callbacks=callbacks_list, validation_data=(X_val, y_val))

# save model to a single file
model.save('my_model.h5')
# load the model back
model = load_model('my_model.h5')
Since you are using Google Colab, you must mount your Google Drive to access your files there. Assuming the notebook you are executing is in the directory my_dir (update the path to match YOUR particular path), add the following code to a cell before your save and load code:
from google.colab import drive
drive.mount('/content/drive') # mounts the drive
%cd /content/drive/MyDrive/my_dir/ # moves your position inside the directory where you are executing the code
# ... your code to save and your code to load
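Alternatively, skip %cd and use absolute Drive paths. Note that the ModelCheckpoint filepath above is relative too, so point it at Drive as well if you want the best checkpoint to survive the session. A minimal sketch (my_dir is a placeholder for your folder):
drive_path = '/content/drive/MyDrive/my_dir/my_model.h5'
model.save(drive_path)          # save directly to Drive
model = load_model(drive_path)  # load it back from Drive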

Load trained model on another machine - fastai, torch, huggingface

I am using fastai with PyTorch to fine-tune XLMRoberta from huggingface.
I've trained the model, and everything works fine on the machine where I trained it.
But when I try to load the model on another machine, I get an OSError - Not Found - No such file or directory pointing to .cache/torch/transformers/. The issue is the path of a vocab_file.
I've used fastai's Learner.export to export the model to a .pkl file, but I don't believe the issue is related to fastai, since I found the same issue appearing in flairNLP.
It appears that the path to the cache folder where the vocab_file is stored during training is embedded in the .pkl file.
The error comes from transformers' XLMRobertaTokenizer.__setstate__:
def __setstate__(self, d):
    self.__dict__ = d
    self.sp_model = spm.SentencePieceProcessor()
    self.sp_model.Load(self.vocab_file)
which tries to load the vocab_file using the path from the file.
I've tried patching this method using:
from types import MethodType
import sentencepiece as spm
from transformers import XLMRobertaTokenizer

pretrained_model_name = "xlm-roberta-base"
vocab_file = XLMRobertaTokenizer.from_pretrained(pretrained_model_name).vocab_file

def _setstate(self, d):
    self.__dict__ = d
    self.sp_model = spm.SentencePieceProcessor()
    self.sp_model.Load(vocab_file)

XLMRobertaTokenizer.__setstate__ = MethodType(_setstate, XLMRobertaTokenizer(vocab_file))
That successfully loaded the model, but it caused other problems, such as missing model attributes and other unwanted issues.
Can someone please explain why the path is embedded inside the file, whether there is a way to configure it without re-exporting the model, and, if it has to be re-exported, how to configure it dynamically using fastai, torch, and huggingface?
I faced the same error. I had fine-tuned XLMRoberta on a downstream classification task with fastai version 1.0.61, and I'm loading the model inside Docker.
I'm not sure why the path is embedded, but I found a workaround. Posting it for future readers who might be looking for one, since retraining is usually not possible.
I created /home/<username>/.cache/torch/transformers/ inside the Docker image:
RUN mkdir -p /home/<username>/.cache/torch/transformers
Then I copied the files (which were not found in Docker) from my local ~/.cache/torch/transformers/ into the image at the same path:
COPY filename /home/<username>/.cache/torch/transformers/filename
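For readers who prefer the asker's monkeypatch route instead, the replacement can be bound at the class level with a plain function, which avoids constructing a throwaway tokenizer for MethodType. This is a hedged sketch only, and it may still run into the attribute issues the asker mentions:
import sentencepiece as spm
from transformers import XLMRobertaTokenizer

# Resolve the vocab file from the local cache instead of the path
# baked into the .pkl (assumes the base model is downloadable or cached).
local_vocab = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base").vocab_file

def _setstate(self, d):
    self.__dict__ = d
    self.sp_model = spm.SentencePieceProcessor()
    self.sp_model.Load(local_vocab)  # ignore the embedded path

XLMRobertaTokenizer.__setstate__ = _setstate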

Loading pretrained FastAI models in Kaggle kernels without using internet

I am trying to load a densenet121 model in a Kaggle kernel without switching on the internet.
I have done the required steps, such as adding the pre-trained weights to my input directory and moving them to '.cache/torch/checkpoints/'. It still does not work and throws a gaierror.
The following is the code snippet:
!mkdir -p /tmp/.cache/torch/checkpoints
!cp ../input/fastai-pretrained-models/densenet121-a639ec97.pth /tmp/.cache/torch/checkpoints/densenet121-a639ec97.pth
learn_cd = create_cnn(data_cd, models.densenet121,
                      metrics=[error_rate, accuracy],
                      model_dir=Path('../kaggle/working/models'),
                      path=Path('.')).to_fp16()
I have been struggling with this for a long time. Any help would be immensely appreciated.
The input path "../input/" in a Kaggle kernel is read-only. Create a folder under "/kaggle/working" instead and copy the model weights there. Example below:
import os
if not os.path.exists('/root/.cache/torch/hub/checkpoints/'):
    os.makedirs('/root/.cache/torch/hub/checkpoints/')
!mkdir '/kaggle/working/resnet34'
!cp '/root/.cache/torch/hub/checkpoints/resnet34-333f7ec4.pth' '/kaggle/working/resnet34/resnet34.pth'
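For the asker's densenet121 case specifically, the direction that usually matters is the opposite one: copy the weights from the read-only dataset into the cache directory that torch checks when downloads are disabled. A hedged sketch using the paths from the question (the hub/checkpoints location varies across torch versions):
import os, shutil

# Assumed paths from the question; adjust to your dataset and torch version.
cache_dir = '/root/.cache/torch/hub/checkpoints/'
os.makedirs(cache_dir, exist_ok=True)  # create the cache dir torch looks in
shutil.copy('../input/fastai-pretrained-models/densenet121-a639ec97.pth',
            os.path.join(cache_dir, 'densenet121-a639ec97.pth'))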

pbtxt missing after saving a trained model

What I am trying to do is convert my trained CNN to TFLite and use it in my Android app. AFAIK I need the .pbtxt in order to freeze the parameters and do the conversion.
However when I save my network using this standard code:
saver = tf.train.Saver(max_to_keep=4)
saver.save(sess=session, save_path="some_path", global_step=step)
I only get the .data, .index, .meta, and checkpoint files, but no .pbtxt.
Is there a way to convert the trained network to tflite without a pbtxt or can I obtain the pbtxt from those files?
Thank you
Simply execute:
tf.train.write_graph(session.graph.as_graph_def(),
                     "path",
                     'model.pb',
                     as_text=False)
to get a .pb, or
tf.train.write_graph(session.graph.as_graph_def(),
                     "path",
                     'model.pbtxt',
                     as_text=True)
to get the text version.
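Depending on your TF 1.x version, you may not need a .pb/.pbtxt at all for the TFLite conversion: tf.lite.TFLiteConverter can convert straight from the live session. A hedged sketch, where input_t and output_t are placeholders for your graph's real input and output tensors:
# TF 1.x only: convert the current session's graph directly to TFLite.
converter = tf.lite.TFLiteConverter.from_session(session, [input_t], [output_t])
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)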

How to change data input source in Tensorflow models repository CIFAR-10 tutorial

I'm trying to modify the CIFAR-10 tutorial from the models repository. I'm adding code in cifar10_train.py to change the input, but I don't know how to edit the read_cifar10 function to read my .npy file. Can someone tell me how to change the input source to a .npy file by changing the cifar10_input file?
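No answer was recorded for this one, but as a hedged sketch of the usual approach: rather than editing read_cifar10 (which parses the CIFAR binary format), the whole input pipeline can be replaced with one that reads the .npy arrays directly. File names and shapes below are hypothetical:
import numpy as np
import tensorflow as tf

# Hypothetical files: images as (N, 32, 32, 3) uint8, labels as (N,) ints.
images = np.load('my_images.npy')
labels = np.load('my_labels.npy')

dataset = (tf.data.Dataset.from_tensor_slices((images, labels))
           .shuffle(buffer_size=len(images))  # mimic the tutorial's shuffling
           .batch(128)
           .repeat())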
