Quantitative analysis of neural networks in TensorFlow requires loading all variables first. So I imported the SSDLite MobileNet v2 model's checkpoint and restored the weights. But I cannot get its weights or variables using functions such as get_all_collection_keys or get_collection. The code snippet is as follows:
import tensorflow as tf
import numpy as np
model_dir = 'ssdlite_mobilenet_v2_coco_2018_05_09'
checkpoint = tf.train.get_checkpoint_state(model_dir)
input_checkpoint = checkpoint.model_checkpoint_path # ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt
sess = tf.Session()
tf.reset_default_graph()
saver = tf.train.import_meta_graph(input_checkpoint + '.meta')
graph = tf.get_default_graph()
saver.restore(sess, input_checkpoint)
# INFO:tensorflow:Restoring parameters from ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt
graph_collections_keys = graph.get_all_collection_keys()
print(graph_collections_keys) # []
hyperparameters = tf.get_collection('hyperparameters')
print(len(hyperparameters)) # 0
model_variables = tf.get_collection(tf.GraphKeys.MODEL_VARIABLES)
print(len(model_variables)) # 0
So, how can I access a pretrained model's weights (names and values) in TensorFlow?
You can't restore variable names; the weights are as far as you can go. Also, in TensorFlow, or in your code in general, the only option you have for restoring is the saver. It also depends on what kind of data you want to restore: if it's a .npy file, then np.load will do.
In my torch model, the last layer is a torch.nn.Sigmoid() and the loss is torch.nn.BCELoss.
In the training step, the following error occurred:
RuntimeError: torch.nn.functional.binary_cross_entropy and torch.nn.BCELoss are unsafe to autocast.
Many models use a sigmoid layer right before the binary cross entropy layer.
In this case, combine the two layers using torch.nn.functional.binary_cross_entropy_with_logits
or torch.nn.BCEWithLogitsLoss. binary_cross_entropy_with_logits and BCEWithLogits are
safe to autocast.
However, when I try to reproduce this error by computing the loss and backpropagating, everything runs without error:
import torch
from torch import nn
# last layer
sigmoid = nn.Sigmoid()
# loss
bce_loss = nn.BCELoss()
# the true classes
true_cls = torch.tensor([
    [0.],
    [1.]])
# model prediction classes
pred_cls = sigmoid(
    torch.tensor([
        [0.4949],
        [0.4824]], requires_grad=True)
)
pred_cls
# tensor([[0.6213],
# [0.6183]], grad_fn=<SigmoidBackward>)
out = bce_loss(pred_cls, true_cls)
out
# tensor(0.7258, grad_fn=<BinaryCrossEntropyBackward>)
out.backward()
What am I missing?
I appreciate any help you can provide.
You have to move everything to CUDA first and enable autocast, like this:
import torch
from torch import nn
from torch.cuda.amp import autocast
# last layer
sigmoid = nn.Sigmoid().cuda()
# loss
bce_loss = nn.BCELoss().cuda()
# the true classes
true_cls = torch.tensor([
    [0.],
    [1.]]).cuda()
with autocast():
    # model prediction classes
    pred_cls = sigmoid(
        torch.tensor([
            [0.4949],
            [0.4824]], requires_grad=True
        ).cuda()
    )
    # calling BCELoss inside the autocast region raises the RuntimeError shown below
    out = bce_loss(pred_cls, true_cls)
out.backward()
RuntimeError: torch.nn.functional.binary_cross_entropy and torch.nn.BCELoss are unsafe to autocast.
Many models use a sigmoid layer right before the binary cross entropy layer.
In this case, combine the two layers using torch.nn.functional.binary_cross_entropy_with_logits
or torch.nn.BCEWithLogitsLoss. binary_cross_entropy_with_logits and BCEWithLogits are
safe to autocast.
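The error message's suggestion can be sanity-checked without a GPU. The following is a plain-Python sketch, not PyTorch's implementation: the helper names bce_from_probs and bce_from_logits are made up for illustration, and the fused expression max(z, 0) - z*t + log(1 + exp(-|z|)) is the usual numerically stable way to write binary cross entropy on logits. On the numbers from the question, applying the loss to the raw logits gives the same value as sigmoid followed by BCE.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_from_probs(probs, targets):
    """Mean binary cross entropy on probabilities (sigmoid already applied)."""
    losses = [-(t * math.log(p) + (1 - t) * math.log(1 - p))
              for p, t in zip(probs, targets)]
    return sum(losses) / len(losses)


def bce_from_logits(logits, targets):
    """Mean binary cross entropy computed directly on raw logits."""
    losses = [max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
              for z, t in zip(logits, targets)]
    return sum(losses) / len(losses)


logits = [0.4949, 0.4824]   # pre-sigmoid values from the question
targets = [0.0, 1.0]

two_step = bce_from_probs([sigmoid(z) for z in logits], targets)
fused = bce_from_logits(logits, targets)
print(round(two_step, 4), round(fused, 4))

# The fused form stays finite for a large logit, where sigmoid(z) rounds to 1.0
# and log(1 - p) in the two-step version would no longer be computable.
large = bce_from_logits([40.0], [0.0])
print(large)
```

In a model this corresponds to dropping the final nn.Sigmoid() and passing the raw outputs to torch.nn.BCEWithLogitsLoss, as the error text recommends; the sigmoid is then applied only where probabilities are needed, for example at inference time.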
The model is saved with:
net = Net()
model = torch.nn.DataParallel(net)
############################
# Training
############################
torch.save(model,'./model_shear_pre.pkl')
The model is loaded with:
net = Net()
model = torch.nn.DataParallel(net, device_ids=[0,1])
model = torch.load('./model_shear_finish.pkl', map_location={'cuda:0':'cuda:0', 'cuda:1':'cuda:0', 'cuda:2':'cuda:1', 'cuda:3':'cuda:1'})
The problem is that I trained on a machine with 4 GPUs, and after saving the model I would like to test it on a new machine with only 2 GPUs.
After loading the saved model, I expected the model's device_ids to be [0,1], but they are still [0,1,2,3], which is the old setting. Is there anything wrong with how I save or load the model?
You should save the weights instead of the whole model.
net = Net()
model = torch.nn.DataParallel(net)
############################
# Training
############################
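# save the wrapped network's weights; model.state_dict() would prefix every key with 'module.'
torch.save(model.module.state_dict(),'./model_shear_pre.pkl')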
Then load the weights on the CPU before moving the model to the GPUs:
net = Net()
weights = torch.load('./model_shear_finish.pkl', map_location='cpu')
net.load_state_dict(weights)
model = torch.nn.DataParallel(net, device_ids=[0,1])
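One thing to watch for with this approach: a state dict taken from the DataParallel wrapper itself (model.state_dict()) has every key prefixed with module., and a bare Net will not accept those keys. Below is a minimal sketch of stripping the prefix, using a plain dict as a stand-in for the real state dict. strip_module_prefix is a hypothetical helper, not a PyTorch function, and the key names are invented.

```python
def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading prefix removed from each key."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Stand-in for weights saved from a DataParallel wrapper (values would be tensors).
wrapped = {"module.fc.weight": [0.1, 0.2], "module.fc.bias": [0.0]}
plain = strip_module_prefix(wrapped)
print(list(plain))
```

The result of such a helper could then be passed to net.load_state_dict in place of the raw loaded weights. Keys without the prefix are left alone, so it is harmless to apply it to a file that was saved from the unwrapped network.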
But if you already have a trained model that was saved as a whole model instead of just weights, this might also work:
# the pickled object is the DataParallel wrapper itself, so unwrap it via .module before re-wrapping
saved = torch.load('./model_shear_finish.pkl', map_location='cpu')
model = torch.nn.DataParallel(saved.module, device_ids=[0,1])
I still recommend saving only the weights, though. Saving and loading the whole model can really trip you up, because the model class has to be importable in exactly the same way when loading as when saving, and a lot of the time that's a tricky thing to guarantee. For example:
train.py
import torch
from nets import Net
net = Net()
torch.save(net, './model_shear_finish.pkl')
inference.py
import torch
# the pickle stores only a reference to nets.Net, not the class code, so
# loading works only if the nets module is importable under that same name here
from nets import Net
torch.load('./model_shear_finish.pkl', map_location='cpu')
This question has been answered for TensorFlow 1 (e.g., How to Properly Combine TensorFlow's Dataset API and Keras?), but that answer hasn't helped for my use case.
Below is an example of a model with three float32 inputs and one float32 output. I have a large amount of data that doesn't all fit into memory at once, so it's split into separate files. I'm trying to use the Dataset API to train a model by bringing in a portion of the training data at once.
import tensorflow as tf
import tensorflow.keras.layers as layers
import numpy as np
# Create TF model of a given architecture (number of hidden layers, layersize, #outputs, activation function)
def create_model(h=2, l=64, activation='relu'):
    model = tf.keras.Sequential([
        layers.Dense(l, activation=activation, input_shape=(3,), name='input_layer'),
        *[layers.Dense(l, activation=activation) for _ in range(h)],
        layers.Dense(1, activation='linear', name='output_layer')])
    return model
# Load data (3 X variables, 1 Y variable) split into 5 files
# (for this example, just create a list of 5 numpy arrays)
list_of_training_datasets = [np.random.rand(10,4).astype(np.float32) for _ in range(5)]
validation_dataset = np.random.rand(30,4).astype(np.float32)
def data_generator():
    for data in list_of_training_datasets:
        x_data = data[:, 0:3]
        y_data = data[:, 3:4]
        yield((x_data,y_data))
# prepare model
model = create_model(h=2,l=64,activation='relu')
model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam())
# load dataset
dataset = tf.data.Dataset.from_generator(data_generator,(np.float32,np.float32))
# fit model
model.fit(dataset, epochs=100, validation_data=(validation_dataset[:,0:3],validation_dataset[:,3:4]))
Running this, I get the error:
ValueError: Cannot take the length of shape with unknown rank.
Does anyone know how to get this working? I would also like to be able to use the batch dimension, to load two data files at a time, for example.
You need to specify the shapes of your dataset along with the return data types, like this:
dataset = tf.data.Dataset.from_generator(data_generator,
(np.float32,np.float32),
((None, 3), (None, 1)))
The following works, but I don't know if it is the most efficient approach.
As far as I understand, if your training dataset is split into 10 pieces, then you should set steps_per_epoch=10, which ensures that each epoch steps through all the data once. dataset.repeat() is needed because the dataset iterator is "used up" after the first epoch; .repeat() ensures that the iterator gets created again after being used up.
import numpy as np
import tensorflow.keras.layers as layers
import tensorflow as tf
# Create TF model of a given architecture (number of hidden layers, layersize, #outputs, activation function)
def create_model(h=2, l=64, activation='relu'):
    model = tf.keras.Sequential([
        layers.Dense(l, activation=activation, input_shape=(3,), name='input_layer'),
        *[layers.Dense(l, activation=activation) for _ in range(h)],
        layers.Dense(1, activation='linear', name='output_layer')])
    return model
# Load data (3 X variables, 1 Y variable) split into 5 files
# (for this example, just create a list of 5 numpy arrays)
list_of_training_datasets = [np.random.rand(10,4).astype(np.float32) for _ in range(5)]
steps_per_epoch = len(list_of_training_datasets)
validation_dataset = np.random.rand(30,4).astype(np.float32)
def data_generator():
    for data in list_of_training_datasets:
        x_data = data[:, 0:3]
        y_data = data[:, 3:4]
        yield((x_data,y_data))
# prepare model
model = create_model(h=2,l=64,activation='relu')
model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam())
# load dataset
dataset = tf.data.Dataset.from_generator(data_generator,output_types=(np.float32,np.float32),
output_shapes=(tf.TensorShape([None,3]), tf.TensorShape([None,1]))).repeat()
# fit model
model.fit(dataset.as_numpy_iterator(), epochs=10,steps_per_epoch=steps_per_epoch,
validation_data=(validation_dataset[:,0:3],validation_dataset[:,3:4]))
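The point about the iterator being "used up" can be seen in plain Python, without TensorFlow. This is only an analogy under stated assumptions: run_epochs and repeated are hypothetical helpers that mimic a training loop pulling steps_per_epoch items per epoch and the effect of .repeat(), and the file names are invented stand-ins for the data files.

```python
import itertools

files = ["part-0", "part-1", "part-2"]  # stand-ins for the separate data files


def data_generator():
    for name in files:
        yield name


def run_epochs(batches, epochs, steps_per_epoch):
    """Pull steps_per_epoch items per epoch; stop once the iterator runs dry."""
    completed = []
    for _ in range(epochs):
        epoch = list(itertools.islice(batches, steps_per_epoch))
        if len(epoch) < steps_per_epoch:
            break
        completed.append(epoch)
    return completed


def repeated(make_iter):
    """Restart the generator every time it is exhausted, similar in spirit to .repeat()."""
    while True:
        yield from make_iter()


# A single pass over the generator only feeds one epoch.
once = run_epochs(data_generator(), epochs=3, steps_per_epoch=len(files))
# The repeating wrapper feeds every epoch, each one seeing every file once.
forever = run_epochs(repeated(data_generator), epochs=3, steps_per_epoch=len(files))
print(len(once), len(forever))
```

Because the repeating iterator never ends on its own, something has to say where an epoch stops, which is the role steps_per_epoch plays in the fit call.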
Briefly, I set up a data input pipeline using the TensorFlow Dataset API. Then I implemented a CNN model for classification using Keras, which I converted to an estimator. I fed my estimator's Train and Eval Specs with my input_fn, which provides input data for training and evaluation. As a final step, I launched the model training with tf.estimator.train_and_evaluate:
def my_input_fn(tfrecords_path):
    dataset = (...)
    return batch_fbanks, batch_labels
def build_model():
    model = tf.keras.models.Sequential()
    model.add(...)
    model.compile(...)
    return model
model = build_model()
run_config=tf.estimator.RunConfig(model_dir,save_summary_steps=100,save_checkpoints_steps=1000)
estimator = tf.keras.estimator.model_to_estimator(model,config=run_config)
def serving_input_receiver_fn():
    inputs = {'Conv1_input': tf.compat.v1.placeholder(shape=[None, 11,120,1], dtype=tf.float32)}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)
exporter = tf.estimator.BestExporter(serving_input_receiver_fn, name="best_exporter", exports_to_keep=5)
train_spec_dnn = tf.estimator.TrainSpec(input_fn = lambda: my_input_fn(train_data_path),hooks=[hook])
eval_spec_dnn = tf.estimator.EvalSpec(input_fn = lambda: my_eval_input_fn(eval_data_path),exporters=exporter,start_delay_secs=0,throttle_secs=15)
tf.estimator.train_and_evaluate(estimator, train_spec_dnn, eval_spec_dnn)
I save the 5 best checkpoints using tf.estimator.BestExporter as shown above. Once training has finished, I want to reload the best model and convert it to an estimator to re-evaluate the model and predict on a new dataset. However, my issue is restoring the checkpoint to an estimator. I tried several solutions, but each time I don't get the estimator object I need to run its evaluate and predict methods.
To be more specific, each of the best checkpoint directories is organised as follows:
./
  variables/
    variables.data-00000-of-00002
    variables.data-00001-of-00002
    variables.index
  saved_model.pb
So the question is: how can I get an estimator object from the best checkpoint so that I can use it to evaluate my model and predict on new data?
Note: I found some proposed solutions relying on TensorFlow v1 features, which cannot solve my problem because I work with TF v2.
Thanks a lot, any help is appreciated.
You can use the class below, which is derived from tf.estimator.BestExporter.
Besides saving the best model (.pb files etc.), it also saves
the best exported model's checkpoint to a separate folder.
Below is the class:
import shutil, glob, os
import tensorflow as tf
# import tensorflow.logging as logging
## the path where all the checkpoints reside
BEST_CHECKPOINTS_PATH_FROM = 'PATH TO ALL CHECKPOINT FILES'
## the path where the best exporter checkpoint files will be saved
BEST_CHECKPOINTS_PATH_TO = 'PATH TO BEST EXPORTER CHECKPOINT FILES TO BE SAVE'
class BestCheckpointsExporter(tf.estimator.BestExporter):
    def export(self, estimator, export_path, checkpoint_path, eval_result, is_the_final_export):
        if self._best_eval_result is None or \
                self._compare_fn(self._best_eval_result, eval_result):
            #print('Exporting a better model ({} instead of {})...'.format(eval_result, self._best_eval_result))
            for name in glob.glob(checkpoint_path + '.*'):
                print(name)
                print(os.path.join(BEST_CHECKPOINTS_PATH_TO, os.path.basename(name)))
                shutil.copy(name, os.path.join(BEST_CHECKPOINTS_PATH_TO, os.path.basename(name)))
            # also save the text file used by the estimator api to find the best checkpoint
            with open(os.path.join(BEST_CHECKPOINTS_PATH_TO, "checkpoint"), 'w') as f:
                f.write("model_checkpoint_path: \"{}\"".format(os.path.basename(checkpoint_path)))
            self._best_eval_result = eval_result
        else:
            print('Keeping the current best model ({} instead of {}).'.format(self._best_eval_result, eval_result))
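The copying step inside export can be tried on its own, without TensorFlow. The sketch below is a stand-in under stated assumptions, not part of the class above: copy_checkpoint is a hypothetical helper, the file names are invented, and unlike the class it also creates the destination folder if it is missing.

```python
import glob
import os
import shutil
import tempfile


def copy_checkpoint(checkpoint_path, dest_dir):
    """Copy every file that belongs to one checkpoint prefix into dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    # glob.escape protects against special characters in the directory name
    for src in sorted(glob.glob(glob.escape(checkpoint_path) + ".*")):
        dst = os.path.join(dest_dir, os.path.basename(src))
        shutil.copy(src, dst)
        copied.append(dst)
    # text file that names the checkpoint prefix stored in this folder
    with open(os.path.join(dest_dir, "checkpoint"), "w") as f:
        f.write('model_checkpoint_path: "{}"'.format(os.path.basename(checkpoint_path)))
    return copied


# Demo with empty placeholder files (names are invented for illustration).
src_dir = tempfile.mkdtemp()
dest_dir = os.path.join(tempfile.mkdtemp(), "best")
for name in ("model.ckpt-5.index",
             "model.ckpt-5.data-00000-of-00001",
             "model.ckpt-50.index"):
    open(os.path.join(src_dir, name), "w").close()

copied = copy_checkpoint(os.path.join(src_dir, "model.ckpt-5"), dest_dir)
print(sorted(os.listdir(dest_dir)))
```

Matching on the prefix followed by a dot is what keeps the files of model.ckpt-50 out when model.ckpt-5 is the one being copied.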
Example usage of the class
Just replace the exporter with an instance of the class, passing it the serving_input_receiver_fn:
def serving_input_receiver_fn():
    inputs = {'my_dense_input': tf.compat.v1.placeholder(shape=[None, 4], dtype=tf.float32)}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)
exporter = BestCheckpointsExporter(serving_input_receiver_fn=serving_input_receiver_fn)
train_spec_dnn = tf.estimator.TrainSpec(input_fn = input_fn, max_steps=5)
eval_spec_dnn = tf.estimator.EvalSpec(input_fn=input_fn,exporters=exporter,start_delay_secs=0,throttle_secs=15)
(x, y) = tf.estimator.train_and_evaluate(keras_estimator, train_spec_dnn, eval_spec_dnn)
At this point, it will save the best exported model's checkpoint files in the folder you have specified.
For loading the checkpoint files you need to do the following steps:
Step 1: Rebuild your model instance
def build_model():
    model = tf.keras.models.Sequential()
    model.add(...)
    model.compile(...)
    return model
model = build_model()
Step 2: Use the model's load_weights API
Reference URL: https://www.tensorflow.org/tutorials/keras/save_and_load
ck_path = tf.train.latest_checkpoint('PATH TO BEST EXPORTER CHECKPOINT FILES')
model.load_weights(ck_path)
## From there you will be able to call the predict and evaluate functionality of the trained model
## PREDICT
prediction = model.predict(x)
## EVALUATE
for features_batch, labels_batch in input_fn().take(1):
    model.evaluate(features_batch, labels_batch)
Note: all of this was run on Google Colab.
I am using the Keras functional API to build a classifier, and I am using the training flag in the dropout layer to enable dropout when predicting new instances (in order to get an estimate of the uncertainty). To get the expected response, one needs to repeat this prediction several times, with Keras randomly activating links in the dense layer, which is of course computationally expensive. Therefore, I would also like to have the option of not using dropout at the prediction phase, i.e., using all the network links. Does anyone know how I can do this? Below is sample code for what I am doing. I checked whether predict has any relevant parameter, but it does not seem to. I could technically train the same model without the training flag in the dropout layer, but I do not want to do this (or rather, I would like a cleaner solution than having 2 different models).
from sklearn.datasets import make_circles
from keras.models import Sequential
from keras.utils import to_categorical
from keras.layers import Dense
from keras.layers import Dropout
import numpy as np
import keras
# generate a 2d classification sample dataset
X, y = make_circles(n_samples=100, noise=0.1, random_state=1)
n_train = 30
trainX, testX = X[:n_train, :], X[n_train:, :]
trainy, testy = y[:n_train], y[n_train:]
trainy = to_categorical(trainy)
testy = to_categorical(testy)
inputlayer = keras.layers.Input((2,))
d = keras.layers.Dense(500, activation = 'relu')(inputlayer)
d1 = keras.layers.Dropout(rate = .3)(d,training = True)
out = keras.layers.Dense(2, activation = 'softmax')(d1)
model = keras.Model(inputs = inputlayer, outputs = out)
model.compile(loss = 'categorical_crossentropy',metrics = ['accuracy'],optimizer='adam')
model.fit(x = trainX, y = trainy, validation_data=(testX, testy),epochs=1000, verbose=1)
# another prediction on a specific sample
print(model.predict(testX[0:1,:]))
# another prediction on the same sample
print(model.predict(testX[0:1,:]))
Running the above example I get the following output:
[[0.9230819 0.07691813]]
[[0.8222245 0.17777553]]
which is as expected, different class probabilities for the same input, since there is a random (de)activation of the links from the dropout layer.
Any suggestions on how I can enable/disable dropout at the prediction phase with the functional API?
Sure, you do not need to set the training flag when building the Dropout layer. After training your model you define this function:
from keras import backend as K
mc_func = K.function([model.input, K.learning_phase()],
                     [model.output])
Then you call mc_func with your input and flag 1 to enable dropout, or 0 to disable it:
stochastic_pred = mc_func([some_input, 1])
deterministic_pred = mc_func([some_input, 0])
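What the flag changes can be illustrated with a toy model in plain Python. This is a sketch and not Keras code: dropout and predict are hypothetical helpers, the weights are made up, and the dropout shown is the common "inverted" variant that rescales the surviving values. With the flag off the prediction is deterministic; with it on, repeated predictions differ and their average approaches the deterministic value.

```python
import random


def dropout(values, rate, training, rng):
    """Inverted dropout: zero each value with probability rate, rescale the rest."""
    if not training:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def predict(x, weights, rate, training, rng):
    """Toy one-layer model: dropout on the inputs followed by a weighted sum."""
    dropped = dropout(x, rate, training, rng)
    return sum(w * v for w, v in zip(weights, dropped))


rng = random.Random(0)
x = [1.0, 2.0, 3.0, 4.0]
weights = [0.5, -0.25, 0.1, 0.2]

# Flag off: every call returns the same number.
deterministic = predict(x, weights, 0.3, training=False, rng=rng)
# Flag on: repeated stochastic passes, averaged as in Monte Carlo dropout.
samples = [predict(x, weights, 0.3, training=True, rng=rng) for _ in range(2000)]
mc_mean = sum(samples) / len(samples)
print(deterministic, mc_mean)
```

In more recent Keras versions the same per-call switch is usually written by calling the model directly, as in model(some_input, training=True) or model(some_input, training=False), which likewise requires that the Dropout layer was not built with training hard-coded.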