How to save and load the trained LSTM model? - keras

I am trying to save and load the model with the following code, but it is not working: I get an error saying the model cannot be found. Am I missing something? I'm using Google Colab. Thank you
import keras
from tensorflow.keras.losses import MeanAbsoluteError
from tensorflow.keras.metrics import RootMeanSquaredError

callbacks_list = [
    keras.callbacks.EarlyStopping(monitor='val_loss', patience=6),
    keras.callbacks.ModelCheckpoint(filepath='my_model.h5', monitor='val_loss', mode='min',
                                    save_freq='epoch', save_best_only=True),
]
model.compile(loss=MeanAbsoluteError(), optimizer='Adam', metrics=[RootMeanSquaredError()])
history = model.fit(X_train, y_train, batch_size=512, epochs=100,
                    callbacks=callbacks_list, validation_data=(X_val, y_val))

from tensorflow.keras.models import load_model

# save model to single file
model.save('my_model.h5')
# To load model
model = load_model('my_model.h5')

Since you are using Google Colab, you must mount your drive to access the data on Colab. Assuming that the notebook you are executing is in the directory my_dir (update the path according to YOUR particular path) you can add the following code to a cell before your save and load code:
from google.colab import drive
drive.mount('/content/drive') # mounts the drive
%cd /content/drive/MyDrive/my_dir/ # moves your position inside the directory where you are executing the code
# ... your code to save and your code to load
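Alternatively, you can skip the %cd and point Keras at an absolute Drive path. A minimal sketch, assuming the drive is mounted at /content/drive and my_dir is the folder you want to use (adjust both to your setup):
from tensorflow.keras.models import load_model

# Assumed location inside the mounted Drive; change my_dir to your folder.
model_path = '/content/drive/MyDrive/my_dir/my_model.h5'

# Save the trained model to Drive so it survives a runtime reset...
model.save(model_path)

# ...and load it back later.
model = load_model(model_path)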

Related

How do I import the saved model in Google Colab?

I built an NER model using spaCy in Google Colab. I saved it to disk using the nlp.to_disk() function.
nlp.to_disk("RCM.model")
This model is saved under Files. How should I import the RCM model for testing purposes?
I have tried the code below, but it didn't work.
from google.colab import drive
my_module = drive.mount('/content/RCM.model', force_remount=True)
If you save a model you can load it using spacy.load.
import spacy
spacy.load("RCM.model") # the argument should be the path to the directory
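If the saved directory lives in your Google Drive rather than the local Colab filesystem, a minimal sketch (assuming the drive is mounted and RCM.model sits directly under My Drive; adjust the path if you saved it elsewhere):
import spacy
from google.colab import drive

drive.mount('/content/drive')

# Assumed location of the saved pipeline directory inside Drive.
nlp = spacy.load('/content/drive/MyDrive/RCM.model')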

Unable to save keras model in databricks

I am saving a Keras model with
model.save('model.h5')
in Databricks, but the model is not being saved.
I have also tried saving to /tmp/model.h5 as mentioned here,
but the model is still not saved.
The saving cell executes, but when I try to load the model it says no model.h5 file is available.
When I do this:
dbfs_model_path = 'dbfs:/FileStore/models/model.h5'
dbutils.fs.cp('file:/tmp/model.h5', dbfs_model_path)
or try loading the model:
tf.keras.models.load_model("file:/tmp/model.h5")
I get the error message java.io.FileNotFoundException: File file:/tmp/model.h5 does not exist.
The problem is that Keras is designed to work only with local files, so it doesn't understand URIs such as dbfs:/ or file:/. You therefore need to use local paths for saving and loading, and then copy the files to/from DBFS (unfortunately /dbfs doesn't play well with Keras because of the way it works).
The following code works just fine. Note that dbfs:/ and file:/ appear only in the dbutils.fs calls; the Keras calls use plain local file names.
create model & save locally as /tmp/model-full.h5:
from tensorflow.keras.applications import InceptionV3
model = InceptionV3(weights="imagenet")
model.save('/tmp/model-full.h5')
copy data to DBFS as dbfs:/tmp/model-full.h5 and check it:
dbutils.fs.cp("file:/tmp/model-full.h5", "dbfs:/tmp/model-full.h5")
display(dbutils.fs.ls("/tmp/model-full.h5"))
copy file from DBFS as /tmp/model-full2.h5 & load it:
dbutils.fs.cp("dbfs:/tmp/model-full.h5", "file:/tmp/model-full2.h5")
from tensorflow import keras
model2 = keras.models.load_model("/tmp/model-full2.h5")

Google COLAB free version saving Keras trained model

I saved a trained Keras model in the Google Colab free version with
model.save("my_model.h5")
I tried to retrieve the model using the method below:
from keras.models import load_model
model = load_model('my_model.h5')
But it throws this error:
OSError: Unable to open file (unable to open file: name = 'my_model.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
Will I be able to retrieve the saved model from the free Google Colab version? Can anyone help with this?
I checked similar questions on Stack Overflow; I think those answers apply to the Colab Pro version.
Otherwise, do I have to save the model to a specific path on my local drive while training?
What is the problem?
You are storing your model in the Colab runtime, not in your Google Drive. After about 12 hours the runtime is deleted along with its data, so you have to save the model to Google Drive.
How to save to Google Drive
First, connect to Google Drive:
from google.colab import drive
drive.mount('/content/drive')
Now you will find a file explorer on the left side that contains a drive directory. Going into that directory takes you to your Google Drive.
Suppose I want to put my data in My Drive; then:
from keras.models import load_model
MODEL_PATH = './drive/My Drive/model.h5'
# Now save model in drive
model.save(MODEL_PATH)
# Load Model
model = load_model(MODEL_PATH)
When you open your Drive, you will find the file model.h5 there.
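The same idea works for checkpoints written during training: point ModelCheckpoint at a Drive path so the best weights persist even if the runtime is recycled. A minimal sketch (the path is an assumption; adjust it to your folder):
import keras

checkpoint = keras.callbacks.ModelCheckpoint(
    filepath='/content/drive/MyDrive/my_model.h5',  # assumed Drive location
    monitor='val_loss',
    mode='min',
    save_best_only=True)

# model and the training data are assumed to be defined as in your training code
model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          callbacks=[checkpoint])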

How to correctly save keras model to be able to load with hub.Module()?

I am attempting to retrain inception v3 on a new image set.
When I try to save the model I receive an error.
I have tried:
tf.keras.models.save_model(model, filename)
and
model.save(filename)
and
tf.contrib.saved_model.save_keras_model(model, filename)
All give me a similar error: Module has no 'name'.
I have attached my code relevant to the problem.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import argparse
import os
import sys
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
import matplotlib.pyplot as plt
FLAGS = None
def create_model(m, img_data):
    # load feature extractor (inception_v3)
    features_extractor_layer = tf.keras.layers.Lambda(m, input_shape=img_data.image_shape)
    # make pre-trained layers un-trainable
    features_extractor_layer.trainable = False
    print(features_extractor_layer.name)
    # add new activation layer to train to our classes
    model = tf.keras.Sequential([
        features_extractor_layer,
        tf.keras.layers.Dense(img_data.num_classes, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

def get_and_gen_images(module):
    """
    get images from image directory or url
    :param module: module (to get required image size info)
    :return: batched image data
    """
    data_name = os.path.splitext(os.path.basename(FLAGS.image_dir_or_url))[0]
    print("data: ", data_name)
    # download images to cache if not already
    if FLAGS.image_dir_or_url.startswith('https://'):
        data_root = tf.keras.utils.get_file(data_name,
                                            FLAGS.image_dir_or_url,
                                            untar=True,
                                            cache_dir=os.getcwd())
    else:  # specify directory with images
        data_root = tf.keras.utils.get_file(data_name,
                                            FLAGS.image_dir_or_url)
    # get image size for specific module
    image_size = hub.get_expected_image_size(module)
    # TODO: this is where to add noise, rotations, shifts, etc.
    image_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1. / 255, validation_split=0.2)
    # create image stream
    train_image_data = image_generator.flow_from_directory(str(data_root),
                                                           target_size=image_size,
                                                           batch_size=FLAGS.batch_size,
                                                           subset='training')
    validation_image_data = image_generator.flow_from_directory(str(data_root),
                                                                target_size=image_size,
                                                                batch_size=FLAGS.batch_size,
                                                                subset='validation')
    return train_image_data, validation_image_data

# load module (will download from url or directory)
module = hub.Module(FLAGS.tfhub_module)
# generate image stream
train_image_data, validation_image_data = get_and_gen_images(module)
model = create_model(module, train_image_data)
model.summary()
file = FLAGS.saved_model_dir + "/modelname.h5"
model.save(file)
This should save a ".h5" model file, but I receive a naming error:
Traceback (most recent call last):
  File "/home/raphy/projects/vmi/tf_cpu/retrain.py", line 305, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/home/raphy/.local/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/home/raphy/projects/vmi/tf_cpu/retrain.py", line 205, in main
    model.save(file)
  File "/home/raphy/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/sequential.py", line 319, in save
    save_model(self, filepath, overwrite, include_optimizer)
  File "/home/raphy/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/saving.py", line 105, in save_model
    'config': model.get_config()
  File "/home/raphy/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/sequential.py", line 326, in get_config
    'config': layer.get_config()
  File "/home/raphy/.local/lib/python3.6/site-packages/tensorflow/python/keras/layers/core.py", line 756, in get_config
    function = self.function.__name__
AttributeError: 'Module' object has no attribute '__name__'
I want to save the model in the format of the tf_hub models.
As specified in the TensorFlow Hub hosting documentation on tensorflow.org:
If you are interested in hosting your own repository of models that
are loadable with the tensorflow_hub library, your HTTP distribution
service should follow the following protocol.
In other words, you cannot load just any model using TF Hub; you can only load the modules published on the TF Hub modules site.
If you want to load your own saved model, you can do it using tf.saved_model.load.
But if you want to do it using TF Hub, please refer to this link.
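For the tf.saved_model.load route, here is a minimal TF2-style sketch (the export directory name is an assumption; with the TF1 hub.Module API used above, the export mechanics differ):
import tensorflow as tf

# Hypothetical export directory; replace with your own path.
export_dir = '/tmp/exported_model'

# Export the trained Keras model as a SavedModel directory.
tf.saved_model.save(model, export_dir)

# Later, restore it without going through tensorflow_hub.
restored = tf.saved_model.load(export_dir)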
The instructions are also reproduced below in case the link doesn't work:
Hosting your own models:
TensorFlow Hub provides an open repository of trained models at tfhub.dev. The tensorflow_hub library can load models from this repository and from other HTTP-based repositories of machine learning models. In particular, the protocol allows the URL that identifies a model to be used both for the model's documentation and as the endpoint to fetch the model.
If you are interested in hosting your own repository of models that are loadable with the tensorflow_hub library, your HTTP distribution service should follow the following protocol.
Protocol:
When a URL such as https://example.com/model is used to identify a model to load or instantiate, the model resolver will attempt to download a compressed tarball from the URL after appending a query parameter ?tf-hub-format=compressed.
The query param is to be interpreted as a comma separated list of the model formats that the client is interested in. For now only the "compressed" format is defined.
The compressed format indicates that the client expects a tar.gz archive with the model contents. The root of the archive is the root of the model directory and should contain a SavedModel, as in this example:
# Create a compressed model from a SavedModel directory.
$ tar -cz -f model.tar.gz --owner=0 --group=0 -C /tmp/export-model/ .
# Inspect files inside a compressed model
$ tar -tf model.tar.gz
./
./variables/
./variables/variables.data-00000-of-00001
./variables/variables.index
./assets/
./saved_model.pb
Tarballs for use with the deprecated hub.Module() API from TF1 will also contain a ./tfhub_module.pb file. The hub.load() API for TF2 SavedModels ignores such a file.
The tensorflow_hub library expects that model URLs are versioned and that the model content of a given version is immutable, so that it can be cached indefinitely.
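Once such an archive is served under this protocol, a minimal consumption sketch looks like the following (the URL is the placeholder from the example above; your server must answer the ?tf-hub-format=compressed request with the tar.gz):
import tensorflow_hub as hub

# Placeholder URL from the protocol description above; the server must return
# the compressed SavedModel when ?tf-hub-format=compressed is appended.
loaded_model = hub.load("https://example.com/model")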

Google Colab can't access drive content

Even though I mounted my Google Drive (with my dataset in it) in Google Colab, when I run my code I get this error: FileNotFoundError: [Errno 2] No such file or directory: 'content/drive/My Drive/....
I already mounted Google Drive in Google Colab and I can access it there, but when I run my code I still get this error.
from keras.models import Sequential
from keras.layers import Convolution2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense

model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=(64, 64, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(32, 3, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(output_dim=128, activation='relu'))
model.add(Dense(output_dim=1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)

training_set = train_datagen.flow_from_directory(
    directory='content/drive/My Drive/Convolutional_Neural_Networks/dataset/training_set',
    target_size=(64, 64),
    batch_size=32,
    class_mode='binary')
test_set = test_datagen.flow_from_directory(
    directory='content/drive/My Drive/Convolutional_Neural_Networks/dataset/test_set',
    target_size=(64, 64),
    batch_size=32,
    class_mode='binary')

# train
model.fit_generator(
    training_set,
    samples_per_epoch=8000,
    nb_epoch=2,
    validation_data=test_set,
    nb_val_samples=1000)

import numpy as np
from keras.preprocessing import image

test_image = image.load_img('sunn.jpg', target_size=(64, 64))
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis=0)
result = model.predict(test_image)
training_set.class_indices
if result[0][0] >= 0.5:
    prediction = 'dog'
else:
    prediction = 'cat'
print(prediction)
After mounting, move into the dataset folder.
cd content/drive/My Drive/Convolutional_Neural_Networks/dataset/
Don't use the !.
Then set your directory to ./training_set
I think you are missing a leading / in your /content/drive... path.
It's typical to mount your Drive files via
from google.colab import drive
drive.mount('/content/drive')
https://colab.research.google.com/notebooks/io.ipynb#scrollTo=u22w3BFiOveA
I have been trying, and for those curious, it has not been possible for me to use flow_from_directory with a folder inside Google Drive. The Colab file environment does not read the path and gives a "Folder does not exist" error. I have been trying to solve the problem and searching Stack Overflow; similar questions have been posted here: Google collaborative and Deep learning on Google Colab: loading large image dataset is very long, how to accelerate the process?, with no effective solution and, for some reason, many downvotes for those who ask.
The only solution I found for reading 20k images in Google Colab was uploading them and then processing them, wasting two sad hours to do so. It makes sense: Google identifies things inside Drive with IDs, while flow_from_directory requires both the dataset and the classes to be identified with absolute folder paths, which is not compatible with Google Drive's identification method. An alternative might be paying for a Google Cloud environment instead, I suppose; we are getting quite a lot for free as it is. This is my novice understanding of the situation, please correct me if I am wrong.
Edit 1: I was able to use flow_from_directory on Google Colab. Google does identify things by path as well; the issue is that os.getcwd() does not work as expected. It will tell you that the current working directory is "/content", when in truth it is "/content/drive/My Drive/foldersinsideyourdrive/...../folderthathasyourcolabnotebook/". If you change the path in the train generator so that it includes this, and ignore os.getcwd(), it works. I still had problems with RAM even when using flow_from_directory and could not train my CNN anyway, but that might be something that just happens to me.
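In practice that means spelling out the full mounted-Drive path yourself instead of relying on os.getcwd(). A minimal sketch reusing the generator from the question (the folder names come from the question; adjust them to your own Drive):
import os

# Colab typically reports /content here, not the folder that holds your notebook.
print(os.getcwd())

# Pass the absolute path under the mounted Drive to the generator instead.
train_dir = '/content/drive/My Drive/Convolutional_Neural_Networks/dataset/training_set'
training_set = train_datagen.flow_from_directory(
    train_dir,
    target_size=(64, 64),
    batch_size=32,
    class_mode='binary')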
from google.colab import drive
drive.mount('/content/drive')
Using the code above you can mount your Drive in Colab.
When loading images, use:
directory='drive/My Drive/Convolutional_Neural_Networks/dataset/test_set',
not this:
directory='content/drive/My Drive/Convolutional_Neural_Networks/dataset/test_set',
following the dataset structure that the Keras ImageDataGenerator expects (one subfolder per class).
So, I started with the default Colab commands:
from google.colab import drive
drive.mount('/gdrive', force_remount=True)
The main changes I made were here:
img_width, img_height = 64, 64

train_data_dir = '/gdrive/My Drive/Colab Notebooks/dataset/training_set'
validation_data_dir = '/gdrive/My Drive/Colab Notebooks/dataset/test_set'

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(64, 64),
    batch_size=32,
    class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(64, 64),
    batch_size=32,
    class_mode='binary')

classifier.fit_generator(
    train_generator,
    steps_per_epoch=8000,  # Number of images in train set
    epochs=25,
    validation_data=validation_generator,
    validation_steps=2000)
This worked for me and I hope it helps someone.
For some reason you have to %cd into your Google Drive folder and then execute your code in order to access files from your Drive or write files there.
First, mount your Google Drive:
from google.colab import drive
drive.mount('/gdrive', force_remount=True)
then cd into your Google Drive and run your code:
%cd /content/drive/My\ Drive/
directory='./Convolutional_Neural_Networks/dataset/training_set'
Try removing "content"; it worked for me after an hour of troubleshooting here:
cd drive/My Drive/dissertation
After mounting at /content/drive:
from google.colab import drive
drive.mount('/content/drive')
Change the working directory to the folder created previously:
cd '/content/drive/My Drive/PLANT DISEASE RECOGNITION'
This gave me an error saying the directory could not be changed.
To solve this error, use:
%cd /content/drive/My\ Drive/PLANT DISEASE RECOGNITION
After following the mount drive advice:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)
I realised that referencing the dataset directly by its file name didn't work, but loading the parent path of my dataset did work.
This didn't work:
dataset = load_dataset("/content/drive/MyDrive/my_filename.json")
This did work:
dataset = load_dataset("/content/drive/MyDrive")
