I am using Keras newsgroup example code for text classification. I have saved the trained model using the h5py library. Will the embedding layer also get saved or should I write some extra code when loading the model to use the embedding layer?
Embedding layer is part of the model so it will be saved with the model. Check out this on saving the model.
Also one important addition, the Keras Embedding layer will be initialized with the random values at the very start of the training process, and the parameters will be learned in the training phase.
Related
I am new in Pytorch. My question is: How do I apply transfer learning to a custom dataset? I am doing image segmentation on brain tumors. I can find examples which use U-net structure but I could not find examples using weights of the pre-trained models for a U-net image segmentation?
You could obtain pre-trained models in two ways:
Model weights or complete models shared in formats such .pt or .pth:
In this case, Saving and Loading Models is a good starting point. Copying from the tutorial there, you could load a model as
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
The other way is to load the model from torchvision. A list is available models is available at Torchvision Models. U-Net is not available yet. However, it is possible to load a pre-trained model as the encoder and write a separate decoder to form a U-Net with a pre-trained encoder.
In this case, the model object returned from the function calls shown in the API are already loaded with pretrained weights when pretrained=True.
For writing a custom dataloader, PyTorch data loaders may be a useful guide.
I've got a basic Keras model with a GRU layer where stateful=True. I want to convert my model to a TFLite model and make predictions on data one element at a time, i.e a sequence will be fed to the model in batches of size 1. Looking at the TensorFlow Docs, there isn't a way to convert a Stateful GRU to a TFLite Model. However it does say (https://www.tensorflow.org/lite/convert/rnn):
It is still possible to model a stateful Keras LSTM layer using the underlying stateless Keras LSTM layer and managing the state explicitly in the user program. Such a TensorFlow program can still be converted to TensorFlow Lite using the feature being described here.
I don't what is meant by this. If I set the GRU to not be stateful, how can I prevent its state from being reset after each prediction (batch)? Is there a way to keep the state from resetting?
How can I use the weights of a pre-trained network in my tensorflow project?
I know some theory information about this but no information about coding in tensorflow.
As been pointed out by #Matias Valdenegro in the comments, your first question does not make sense. For your second question however, there are multiple ways to do so. The term that you're searching for is Transfer Learning (TL). TL means transferring the "knowledge" (basically it's just the weights) from a pre-trained model into your model. Now there are several types of TL.
1) You transfer the entire weights from a pre-trained model into your model and use that as a starting point to train your network.
This is done in a situation where you now have extra data to train your model but you don't want to start over the training again. Therefore you just load the weights from your previous model and resume the training.
2) You transfer only some of the weights from a pre-trained model into your new model.
This is done in a situation where you have a model trained to classify between, say, 5 classes of objects. Now, you want to add/remove a class. You don't have to re-train the whole network from the start if the new class that you're adding has somewhat similar features with (an) existing class(es). Therefore, you build another model with the same exact architecture as your previous model except the fully-connected layers where now you have different output size. In this case, you'll want to load the weights of the convolutional layers from the previous model and freeze them while only re-train the fully-connected layers.
To perform these in Tensorflow,
1) The first type of TL can be performed by creating a model with the same exact architecture as the previous model and simply loading the model using tf.train.Saver().restore() module and continue the training.
2) The second type of TL can be performed by creating a model with the same exact architecture for the parts where you want to retain the weights and then specify the name of the weights in which you want to load from the previous pre-trained weights. You can use the parameter "trainable=False" to prevent Tensorflow from updating them.
I hope this helps.
I am interested in training a model in tf.keras and then loading it with keras. I know this is not highly-advised, but I am interested in using tf.keras to train the model because
tf.keras is easier to build input pipelines
I want to take advantage of the tf.dataset API
and I am interested in loading it with keras because
I want to use coreml to deploy the model to ios.
I want to use coremltools to convert my model to ios, and coreml tools only works with keras, not tf.keras.
I have run into a few road-blocks, because not all of the tf.keras layers can be loaded as keras layers. For instance, I've had no trouble with a simple DNN, since all of the Dense layer parameters are the same between tf.keras and keras. However, I have had trouble with RNN layers, because tf.keras has an argument time_major that keras does not have. My RNN layers have time_major=False, which is the same behavior as keras, but keras sequential layers do not have this argument.
My solution right now is to save the tf.keras model in a json file (for the model structure) and delete the parts of the layers that keras does not support, and also save an h5 file (for the weights), like so:
model = # model trained with tf.keras
# save json
model_json = model.to_json()
with open('path_to_model_json.json', 'w') as json_file:
json_ = json.loads(model_json)
layers = json_['config']['layers']
for layer in layers:
if layer['class_name'] == 'SimpleRNN':
del layer['config']['time_major']
json.dump(json_, json_file)
# save weights
model.save_weights('path_to_my_weights.h5')
Then, I use the coremlconverter tool to convert from keras to coreml, like so:
with CustomObjectScope({'GlorotUniform': glorot_uniform()}):
coreml_model = coremltools.converters.keras.convert(
model=('path_to_model_json','path_to_my_weights.h5'),
input_names=#inputs,
output_names=#outputs,
class_labels = #labels,
custom_conversion_functions = { "GlorotUniform": tf.keras.initializers.glorot_uniform
}
)
coreml_model.save('my_core_ml_model.mlmodel')
My solution appears to be working, but I am wondering if there is a better approach? Or, is there imminent danger in this approach? For instance, is there a better way to convert tf.keras models to coreml? Or is there a better way to convert tf.keras models to keras? Or is there a better approach that I haven't thought of?
Any advice on the matter would be greatly appreciated :)
Your approach seems good to me!
In the past, when I had to convert tf.keras model to keras model, I did following:
Train model in tf.keras
Save only the weights tf_model.save_weights("tf_model.hdf5")
Make Keras model architecture using all layers in keras (same as the tf keras one)
load weights by layer names in keras: keras_model.load_weights(by_name=True)
This seemed to work for me. Since, I was using out of box architecture (DenseNet169), I had to very less work to replicate tf.keras network to keras.
I built a LSTM model for text classification using Keras. Now I have new data to be trained. instead of appending to the original data and retrain the model, I thought of training the data using the model weights. i.e. making the weights to get trained with the new data.
However, irrespective of the volume i train, the model is not predicting the correct classification (even if i give the same sentence for prediction). What could be the reason?
Kindly help me.
Are you using the following to save the trained model?
model.save('model.h5')
model.save_weights('model_weights.h5')
And the following to load it?
from keras.models import load_model
model = load_model('model.h5') # Load the architecture
model = model.load_weights('model_weights.h5') # Set the weights
# train on new data
model.compile...
model.fit...
The model loaded is the exact same as the model being saved here. If you are doing this, then there must be something different in the data (in comparison with what it is trained on).