How do I import the saved model in Google Colab? - nlp

I built an NER model using spaCy in Google Colab and saved it to disk with the nlp.to_disk() function.
nlp.to_disk("RCM.model")
The model is saved under Files. How should I import the RCM model for testing purposes?
I have tried the code below, but it didn't work.
from google.colab import drive
my_module = drive.mount('/content/RCM.model', force_remount=True)

If you saved the model with nlp.to_disk, you can load it back using spacy.load:
import spacy
spacy.load("RCM.model") # the argument should be the path to the directory

Related

How to save and load the trained LSTM model?

I am trying to save and load the model with the following code, but it is not working: I get an error saying it cannot find the model. Am I missing something? I'm using Google Colab. Thank you.
from tensorflow import keras
from tensorflow.keras.losses import MeanAbsoluteError
from tensorflow.keras.metrics import RootMeanSquaredError
from tensorflow.keras.models import load_model

callbacks_list = [
    keras.callbacks.EarlyStopping(monitor='val_loss', patience=6),
    keras.callbacks.ModelCheckpoint(filepath='my_model.h5', monitor='val_loss',
                                    mode='min', save_freq='epoch', save_best_only=True),
]
model.compile(loss=MeanAbsoluteError(), optimizer='Adam', metrics=[RootMeanSquaredError()])
history = model.fit(X_train, y_train, batch_size=512, epochs=100,
                    callbacks=callbacks_list, validation_data=(X_val, y_val))

# save the model to a single file
model.save('my_model.h5')
# load the model back
model = load_model('my_model.h5')
Since you are using Google Colab, you must mount your Drive to access the data stored there. Assuming the notebook you are executing lives in the directory my_dir (update the path to match your own setup), add the following code to a cell before your save and load code:
from google.colab import drive
drive.mount('/content/drive') # mounts the drive
%cd /content/drive/MyDrive/my_dir/ # moves your position inside the directory where you are executing the code
# ... your code to save and your code to load
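Alternatively, a small sketch (my_model.h5 and my_dir are just the placeholder names from above) that skips the %cd and passes the absolute Drive path directly:
from google.colab import drive
from tensorflow.keras.models import load_model

drive.mount('/content/drive')
model.save('/content/drive/MyDrive/my_dir/my_model.h5')           # save straight to Drive
model = load_model('/content/drive/MyDrive/my_dir/my_model.h5')   # load it back later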

How to list all downloaded datasets from nltk

I downloaded some of the datasets from nltk using
import nltk
import nltk.corpus
nltk.download()
Now I want to list all the downloaded datasets, but I don't know how.
You need to find the path where the downloads are stored; it is listed in nltk.data.path.
Then try nltk.data.find to locate the corpora directory and list its contents:
import os
import nltk
print(os.listdir(nltk.data.find("corpora")))
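To see everything that has been downloaded, not just the corpora, you can also walk the data directories themselves (a sketch; the exact paths printed will vary by machine):
import os
import nltk

# nltk.data.path lists every directory NLTK searches for downloaded data
for path in nltk.data.path:
    if os.path.isdir(path):
        print(path, os.listdir(path))  # e.g. ['corpora', 'tokenizers', 'taggers']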

ModuleNotFoundError: No module named 'google.cloud.automl_v1beta1.proto'

I am trying to follow this tutorial on Google Cloud Platform,
https://github.com/GoogleCloudPlatform/ai-platform-samples/blob/master/notebooks/samples/tables/census_income_prediction/getting_started_notebook.ipynb, but I am running into issues when I try to import the AutoML module, specifically the two lines below.
# AutoML library.
from google.cloud import automl_v1beta1 as automl
import google.cloud.automl_v1beta1.proto.data_types_pb2 as data_types
The first line works, but for the second one I get the error: ModuleNotFoundError: No module named 'google.cloud.automl_v1beta1.proto'. It seems there is no module called proto, and I cannot figure out how to resolve this. There are a couple of posts about not being able to find the module google.cloud, but in my case I can import automl_v1beta1 from google.cloud; it is proto.data_types_pb2 from google.cloud.automl_v1beta1 that fails.
I think you can:
from google.cloud import automl_v1beta1 as automl
import google.cloud.automl_v1beta1.types as data_types
Or:
import google.cloud.automl_v1beta1 as automl
import google.cloud.automl_v1beta1.types as data_types
But (!) given the import errors, there may be other SDK changes that affect the code that follows.
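For instance, a rough sketch of how the replacement import might be used (the FLOAT64 value is only an illustration, and it assumes a recent google-cloud-automl release where DataType and TypeCode are exposed on the types module):
from google.cloud import automl_v1beta1 as automl
import google.cloud.automl_v1beta1.types as data_types

# Build a DataType the way the tutorial's column specs do
col_type = data_types.DataType(type_code=data_types.TypeCode.FLOAT64)
print(col_type)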

Loading an 8.9 GB dataset from Google Drive to Google Colab?

I am working with a huge laboratory dataset and want to know how to load an 8.9 GB dataset from my Google Drive into my Google Colab notebook. The error it shows is that the runtime stopped and is restarting.
I've already tried chunksize, nrows, na_filter, and dask, although there might be a problem with how I implemented them; if so, please explain how to use them properly. My original code is attached below.
import pandas as pd
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate and create the PyDrive client
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Download the file from Google Drive by its file id
id = '1M4tregypJ_HpXaQCIykyG2lQtAMR9nPe'
downloaded = drive.CreateFile({'id': id})
downloaded.GetContentFile('Filename.csv')

# Load the entire CSV into memory
df = pd.read_csv('Filename.csv')
df.head()
If you suggest any of the methods I've already tried, please do so with appropriate, working code.
The problem is most likely pd.read_csv('Filename.csv').
An 8.9 GB CSV file will take more than 13 GB of RAM once parsed, which is more than a standard Colab runtime provides. You should not load the whole file into memory; process it incrementally instead.
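A minimal sketch of incremental processing (the chunk size and the row-count aggregation are placeholders; it assumes the file has already been downloaded to Filename.csv as in your code):
import pandas as pd

row_count = 0
# Read roughly one million rows at a time instead of the whole 8.9 GB file
for chunk in pd.read_csv('Filename.csv', chunksize=1_000_000):
    row_count += len(chunk)  # replace with your real per-chunk processing

print('Total rows:', row_count)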

Loading a TensorFlow frozen graph (.pb) in Node.js

I saved a TensorFlow model as a frozen .pb file, which is suitable for use with TensorFlow Lite.
This file can be loaded on Android and works well with the following code:
import org.tensorflow.contrib.android.TensorFlowInferenceInterface;
…
TensorFlowInferenceInterface inferenceInterface;
inferenceInterface = new TensorFlowInferenceInterface(context.getAssets(), "MODEL_FILE.pb");
Is there any way to load the frozen graph in Node.js?
I found the solution here:
1. The model must first be converted to a web-friendly format, which is a JSON file.
2. It can then be loaded using '@tensorflow/tfjs' in Node.js.
