What is the difference between torchvision.models.resnet and torch.hub.load? - pytorch

There are two methods for using ResNet in PyTorch.
Method 1:
import torch
model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)
model.eval()
Method 2:
import torch
from torchvision import models
net = models.resnet50(pretrained=True)
Do they load the same model? If not, what is the difference?

The only difference between your models, if you load them that way, is the number of layers, since you're loading resnet18 with Torch Hub and resnet50 with Models (and thus also different pretrained weights). They behave differently; you can read more about that in the original ResNet paper.
Torch Hub also lets you publish pretrained models from your own repository, but since you're loading from 'pytorch/vision:v0.10.0' (the same repository from which Models loads its networks), there should be no difference between:
model = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)
and
model = models.resnet18(pretrained=True)
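If you want to convince yourself, a quick sanity check (a sketch, assuming both downloads succeed) is to compare the state dicts of the two models:
import torch
from torchvision import models

hub_model = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)
tv_model = models.resnet18(pretrained=True)

# Compare every weight tensor pairwise; this should print True,
# since both calls load the same checkpoint.
same = all(torch.equal(a, b)
           for a, b in zip(hub_model.state_dict().values(),
                           tv_model.state_dict().values()))
print(same)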

Related

Loading a GPU trained BERTopic model on CPU?

I trained a BERTopic model on a GPU, and now for visualization purposes I want to load it on a CPU.
But when I tried to do that I got:
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
When I tried the suggested fix,
topic_model = torch.load(args.model, map_location=torch.device('cpu'))
I got the same problem.
I saw a fix that suggests saving the model without its embedding model, but I don't want to retrain and resave unless it's the last option. I would also love it if someone could explain what this embedding model is and what's going on under the hood.
When you want to save the BERTopic model without the embedding model, you can run the following:
from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups
from sentence_transformers import SentenceTransformer
docs = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))['data']
# Train the model
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
topic_model = BERTopic(embedding_model=embedding_model)
topics, probs = topic_model.fit_transform(docs)
# Save the model without the embedding model
topic_model.save("my_model", save_embedding_model=False)
This should prevent any issues with GPU/CPU if you are not using any of the cuML sub-models in BERTopic.
I saw a fix that suggests saving the model without its embedding model, but I don't want to retrain and resave unless it's the last option. I would also love it if someone could explain what this embedding model is and what's going on under the hood.
The embedding model is typically a pre-trained model that does not actually learn from the input data. There are options to make it learn during training, but that requires a custom component in BERTopic. In other words, when you use a pre-trained model, there is no problem removing it when saving the topic model, as there is no need to re-train it.
In practice, we would first save our topic model in our GPU environment without the embedding model:
topic_model.save("my_model", save_embedding_model=False)
Then, we load in our saved BERTopic model in our CPU environment and then pass the pre-trained embedding model:
from sentence_transformers import SentenceTransformer
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
topic_model = BERTopic.load("my_model", embedding_model=embedding_model)
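With the embedding model passed back in, inference on new documents works as usual on the CPU; a small usage sketch (the document below is made up):
# .transform embeds the new documents with the SentenceTransformer we just
# passed in and assigns each one to an existing topic
new_docs = ["An example document to assign to a topic."]
topics, probs = topic_model.transform(new_docs)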
You can learn more about the role of the embedding model in the BERTopic documentation.

What are the differences between AutoModelForSequenceClassification and AutoModel?

We can create a model with the AutoModel (TFAutoModel) class:
from transformers import AutoModel
model = AutoModel.from_pretrained('distilbert-base-uncased')
On the other hand, a model can be created with AutoModelForSequenceClassification (TFAutoModelForSequenceClassification):
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained('distilbert-base-uncased')
As far as I know, both models use the distilbert-base-uncased checkpoint.
Judging by the names, the second class (AutoModelForSequenceClassification) is meant for sequence classification.
But what are the real differences between the two classes? And how do I use them correctly?
(I searched the Hugging Face docs but it is not clear.)
The difference between AutoModel and AutoModelForSequenceClassification is that AutoModelForSequenceClassification adds a classification head on top of the base model's outputs, which can easily be trained together with the base model.
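A minimal sketch contrasting the two outputs (num_labels=2 is an assumption here; the classification head starts out randomly initialized until you fine-tune it):
import torch
from transformers import AutoModel, AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('distilbert-base-uncased')
inputs = tokenizer("This movie was great!", return_tensors="pt")

base = AutoModel.from_pretrained('distilbert-base-uncased')
clf = AutoModelForSequenceClassification.from_pretrained(
    'distilbert-base-uncased', num_labels=2)

with torch.no_grad():
    hidden = base(**inputs).last_hidden_state  # (batch, seq_len, hidden_size)
    logits = clf(**inputs).logits              # (batch, num_labels)

print(hidden.shape)  # e.g. torch.Size([1, 7, 768])
print(logits.shape)  # torch.Size([1, 2])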

Compatibility between keras and tf.keras models

I am interested in training a model in tf.keras and then loading it with keras. I know this is not highly advised, but I am interested in using tf.keras to train the model because
it is easier to build input pipelines in tf.keras
I want to take advantage of the tf.data API
and I am interested in loading it with keras because
I want to use Core ML to deploy the model to iOS.
I want to use coremltools to convert my model, and coremltools only works with keras, not tf.keras.
I have run into a few roadblocks, because not all tf.keras layers can be loaded as keras layers. For instance, I've had no trouble with a simple DNN, since all of the Dense layer parameters are the same between tf.keras and keras. However, I have had trouble with RNN layers, because tf.keras has an argument time_major that keras does not have. My RNN layers have time_major=False, which is the same behavior as keras, but keras' recurrent layers do not accept this argument.
My solution right now is to save the tf.keras model to a JSON file (for the model structure), deleting the parts of the layers that keras does not support, and to save an h5 file (for the weights), like so:
import json

model = ...  # model trained with tf.keras

# save the architecture as JSON, stripping the time_major argument
# that keras does not support
model_json = model.to_json()
with open('path_to_model_json.json', 'w') as json_file:
    json_ = json.loads(model_json)
    layers = json_['config']['layers']
    for layer in layers:
        if layer['class_name'] == 'SimpleRNN':
            del layer['config']['time_major']
    json.dump(json_, json_file)

# save the weights
model.save_weights('path_to_my_weights.h5')
Then, I use the coremltools keras converter to convert from keras to Core ML, like so:
import coremltools
import tensorflow as tf
from keras.initializers import glorot_uniform
from keras.utils import CustomObjectScope

with CustomObjectScope({'GlorotUniform': glorot_uniform()}):
    coreml_model = coremltools.converters.keras.convert(
        model=('path_to_model_json.json', 'path_to_my_weights.h5'),
        input_names=...,   # inputs
        output_names=...,  # outputs
        class_labels=...,  # labels
        custom_conversion_functions={
            "GlorotUniform": tf.keras.initializers.glorot_uniform
        },
    )
coreml_model.save('my_core_ml_model.mlmodel')
My solution appears to be working, but I am wondering if there is a better approach? Or, is there imminent danger in this approach? For instance, is there a better way to convert tf.keras models to coreml? Or is there a better way to convert tf.keras models to keras? Or is there a better approach that I haven't thought of?
Any advice on the matter would be greatly appreciated :)
Your approach seems good to me!
In the past, when I had to convert a tf.keras model to a keras model, I did the following:
Train the model in tf.keras.
Save only the weights: tf_model.save_weights("tf_model.hdf5")
Build the same model architecture using keras layers (matching the tf.keras one).
Load the weights by layer name in keras: keras_model.load_weights("tf_model.hdf5", by_name=True)
This seemed to work for me. Since I was using an out-of-the-box architecture (DenseNet169), very little work was needed to replicate the tf.keras network in keras.
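For completeness, a minimal sketch of those four steps, assuming the stock DenseNet169 and a made-up class count:
import tensorflow as tf
import keras

# 1-2. Train in tf.keras, then save only the weights
tf_model = tf.keras.applications.DenseNet169(weights=None, classes=10)
# ... compile and fit tf_model here ...
tf_model.save_weights("tf_model.hdf5")

# 3. Rebuild the same architecture with standalone keras
keras_model = keras.applications.DenseNet169(weights=None, classes=10)

# 4. Load the weights by layer name so each keras layer picks up
#    its tf.keras counterpart
keras_model.load_weights("tf_model.hdf5", by_name=True)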

Keras Init Sequential model layers by Model layers

I'm trying to build an app using transfer learning. I want to use VGG16, so I've done something like this:
vgg16_model = keras.applications.vgg16.VGG16()
but I want to transfer the layers from VGG16 into my own model:
model = Sequential(layers=vgg16_model.layers)
(an approach I've seen suggested elsewhere), but it leads to this error:
TypeError: The added layer must be an instance of class Layer. Found:
tensorflow.python.keras.engine.input_layer.InputLayer
How can I initialize my Sequential model with the VGG16 layers?
Thanks in advance.
Try this:
from keras.applications.vgg16 import VGG16
from keras.models import Sequential

vgg = VGG16()
model = Sequential()
model.add(vgg)  # nests the whole VGG16 model as a single layer
model.add(...)  # add additional layers
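If you need the VGG16 layers inside the Sequential model itself (rather than nested as one layer), a common workaround is to copy them one by one, skipping the InputLayer that triggers the error; a sketch:
from keras.applications.vgg16 import VGG16
from keras.models import Sequential

vgg16_model = VGG16()
model = Sequential()
# layers[0] is the InputLayer that Sequential refuses to accept, so start at 1
for layer in vgg16_model.layers[1:]:
    model.add(layer)
model.summary()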

Fine-Tune pre-trained InceptionResnetV2

Steps for fine-tuning a network are as follows:
Add your custom network on top of an already trained base network.
Freeze the base network.
Train the part you added.
Unfreeze some layers in the base network.
Jointly train both these layers and the part you added.
Now, if the network architecture is as simple as VGG16's, we can simply unfreeze the base network from block5_conv1 (Conv2D) onward and re-train it.
[figure: VGG16 architecture]
But when the architecture is as complex as InceptionResnetV2's, where should we start? Does anyone have any practical experience? Run the following code in Python to see the model:
from keras.applications import InceptionResNetV2
from keras.utils import plot_model

conv_base = InceptionResNetV2(weights='imagenet',
                              include_top=False,
                              input_shape=(299, 299, 3))
conv_base.summary()
plot_model(conv_base, to_file='model.png')
A very basic fine-tuning of a model with InceptionResNetV2 looks like this:
from inception_resnet_v2 import InceptionResNetV2
from keras.layers import Dense
from keras.models import Model

# ImageNet classification
model = InceptionResNetV2()
model.predict(...)

# Fine-tuning on another 100-class dataset
base_model = InceptionResNetV2(include_top=False, pooling='avg')
# The first argument of Dense is the number of classes
outputs = Dense(100, activation='softmax')(base_model.output)
model = Model(base_model.inputs, outputs)
model.compile(...)
model.fit(...)
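The snippet above trains the whole network at once. A minimal sketch of the freeze-then-unfreeze recipe from the question, using keras' built-in InceptionResNetV2 (unfreezing the last 20 layers is an arbitrary assumption, not a recommendation):
from keras.applications import InceptionResNetV2
from keras.layers import Dense
from keras.models import Model
from keras.optimizers import Adam

base_model = InceptionResNetV2(weights='imagenet', include_top=False, pooling='avg')
outputs = Dense(100, activation='softmax')(base_model.output)
model = Model(base_model.inputs, outputs)

# Steps 1-3: freeze the whole base and train only the new head
for layer in base_model.layers:
    layer.trainable = False
model.compile(optimizer=Adam(), loss='categorical_crossentropy')
# model.fit(...)

# Steps 4-5: unfreeze the top of the base and train everything jointly
# at a low learning rate
for layer in base_model.layers[-20:]:
    layer.trainable = True
model.compile(optimizer=Adam(1e-5), loss='categorical_crossentropy')
# model.fit(...)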
This is a good place to start: github.com/yuyang-huang/keras-inception-resnet-v2
