Running Tensorflow model test data after closing session - python-3.x

I am trying to replicate a ConvNet (not my original code). I can run the test dataset through the trained model only when I train and test in the same sitting. I changed only a few lines so that the test data could be run in a later sitting, so I am not sure what is going wrong. I noticed in TensorBoard that "logits_out" shows up as a dataflow edge rather than a node. Is it that edges aren't saved in checkpoints automatically and, since the original code never intentionally saves it as a node or in any other form, it can't be called after the first sitting closes?
This is the general structure of the training phase:
tf.reset_default_graph()
graph = tf.Graph()
with graph.as_default():
    with tf.name_scope('1st_pool'):
        #first layer
    #subsequent layers
with graph.as_default():
    #flattening, dropout, optimization, etc...
    #some summary.scalar for loss analyses
    logits_out = tf.layers.dense(flat, 1) #flat is the flattened array
    saved_1 = tf.train.Saver()
    trained_event = tf.summary.FileWriter('./CNN/train', graph=graph)
    test_event = tf.summary.FileWriter('./CNN/test', graph=graph)
    merged = tf.summary.merge_all()
with tf.Session(graph=graph) as sess:
    #training and "validating"
    sess.run(tf.global_variables_initializer())
    #running train summaries
    if step == test_round:
        #running test summaries
    saved_1.save(sess, './CNN/model_1.ckpt')
(EDITED:code pasted incorrectly)
This code ran successfully during the continuous sitting with graph still open:
with tf.Session(graph=graph) as sess:
    saved_1.restore(sess, tf.train.latest_checkpoint('./CNN'))
    #
    pred = sess.run(logits_out, feed_dict={some inputs for placeholders})
    #
I changed essentially only two lines (shown below) to load the meta file into a new graph the next day, but I get the error "name 'logits_out' is not defined" when I run this in a separate sitting (in fact, other variables I tried to sess.run gave the same error):
with tf.Session(graph=tf.get_default_graph()) as sess:
    saved_1 = tf.train.import_meta_graph('./CNN/model_1.ckpt.meta')
    saved_1.restore(sess, tf.train.latest_checkpoint('./CNN'))
    pred = sess.run(logits_out, feed_dict={some inputs for placeholders})
    #
EDITED: I think it might be because I am missing a scope, or because I misunderstand how TensorFlow names things, after restoring the session/graph the next day. I can't see how, though: the only thing that had been named was the pool scope.

I was able to run data through the model by just creating the graph by running this section of code today:
tf.reset_default_graph()
graph = tf.Graph()
with graph.as_default():
    with tf.name_scope('1st_pool'):
        #first layer
    #subsequent layers
with graph.as_default():
    #flattening, dropout, optimization, etc...
    #some summary.scalar for loss analyses
    logits_out = tf.layers.dense(flat, 1) #flat is the flattened array
    saved_1 = tf.train.Saver()
    trained_event = tf.summary.FileWriter('./CNN/train', graph=graph)
    test_event = tf.summary.FileWriter('./CNN/test', graph=graph)
    merged = tf.summary.merge_all()
with tf.Session(graph=graph) as sess:
    #training and "validating"
    sess.run(tf.global_variables_initializer())
    #running train summaries
    if step == test_round:
        #running test summaries
    saved_1.save(sess, './CNN/model_1.ckpt')
and then running the code without the two edited lines:
with tf.Session(graph=graph) as sess:
    saved_1.restore(sess, tf.train.latest_checkpoint('./CNN'))
    #
    pred = sess.run(logits_out, feed_dict={some inputs for placeholders})
    #
So the gist of this entire post is that I did not have to use tf.train.import_meta_graph. What I don't understand is: what is the use of tf.train.import_meta_graph? I thought it imports the graph and its metadata saved in the ".meta" file so that I could avoid having to rebuild the graph from the source code.
(note: I will remove this postscript question once I figure it out)
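On the postscript question, here is a minimal sketch of what tf.train.import_meta_graph is for, assuming the TensorFlow 1.x-style API is available (here through tensorflow.compat.v1). The toy graph, the temporary checkpoint directory, and the tensor names "x" and "logits_out" are assumptions made for illustration. In the code above the dense layer was not given a name, so its output tensor would have to be looked up under whatever name TensorFlow assigned to it (for example by listing graph.get_operations()). The point of the sketch is that the .meta file brings back the graph, not the Python variables that used to point into it, so tensors have to be fetched by name:

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

ckpt_dir = tempfile.mkdtemp()                       # hypothetical checkpoint folder
ckpt_path = os.path.join(ckpt_dir, "model_1.ckpt")

# "Day 1": build a toy graph from source code and save a checkpoint.
graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.float32, [None, 3], name="x")
    w = tf.Variable(tf.ones([3, 1]), name="w")
    logits_out = tf.matmul(x, w, name="logits_out")  # explicit name for later lookup
    saver = tf.train.Saver()
    with tf.Session(graph=graph) as sess:
        sess.run(tf.global_variables_initializer())
        saver.save(sess, ckpt_path)

# "Day 2": no graph-building code is run. The Python names x and logits_out
# from day 1 would not exist in a fresh process, so look the tensors up by name.
new_graph = tf.Graph()
with new_graph.as_default():
    with tf.Session(graph=new_graph) as sess:
        restorer = tf.train.import_meta_graph(ckpt_path + ".meta")
        restorer.restore(sess, tf.train.latest_checkpoint(ckpt_dir))
        x_restored = new_graph.get_tensor_by_name("x:0")
        logits_restored = new_graph.get_tensor_by_name("logits_out:0")
        pred = sess.run(logits_restored, feed_dict={x_restored: [[1.0, 2.0, 3.0]]})

print(pred)
```

Rebuilding the graph from source, as in the code above, works just as well; import_meta_graph is only needed when the graph-building code is not run again.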

Related

Data Normalization in Tensorflow Model

I have a TensorFlow regression model that I have been working with. The model is tuned well and gets good results during training. However, when I evaluate it, the results are horrible. I did some research and found that I am not normalizing my test features and labels, so I suspect that is where the problem is. My thought is to normalize the whole dataset before splitting it into train and test sets, but I am getting an attribute error that has me stumped.
Here is the code sample. Please help :)
#concatenate the surface data and single_downhole_col into a single dataframe
training_Data =[]
training_Data = pd.concat([surface_Data, single_downhole_col], axis=1)
#print('training data shape:',training_Data.shape)
#print(training_Data.head())
#normalize the data using keras
model_normalizer_layer = tf.keras.layers.Normalization(axis=-1)
model_normalizer_layer.adapt(training_Data)
normalized_training_Data = model_normalizer_layer(training_Data)
#convert the data frame to array
dataset = normalized_training_Data.copy()
dataset.tail()
#create a training and test set
train_dataset = dataset.sample(frac=0.8, random_state=0)
test_dataset = dataset.drop(train_dataset.index)
#check the data
train_dataset.describe().transpose()
#split features from labels
train_features = train_dataset.copy()
test_features = test_dataset.copy()
If there is any interest in how the normalizer layer is used in the model, please see below:
def build_and_compile_model(data):
    model = keras.Sequential([
        model_normalizer_layer,
        layers.Dense(260, input_dim=401,activation='relu'),
        layers.Dense(80, activation='relu'),
        #layers.Dense(40, activation='relu'),
        layers.Dense(1)
    ])
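A likely reason for the attribute error is that calling the Normalization layer returns a tensor rather than a DataFrame, so DataFrame methods such as tail(), sample() and drop() are no longer available on the result. Independently of that, a common alternative is to split first and compute the normalization statistics from the training rows only. The following is a minimal pure-Python sketch of that order of operations; the helper names and the toy rows are hypothetical and it does not use Keras or pandas:

```python
import random
import statistics


def train_test_split_rows(rows, train_frac=0.8, seed=0):
    """Shuffle row indices and split them into train and test rows."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    cut = int(len(rows) * train_frac)
    train = [rows[i] for i in indices[:cut]]
    test = [rows[i] for i in indices[cut:]]
    return train, test


def fit_standardizer(rows):
    """Compute per-column mean and standard deviation from the given rows."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def standardize(rows, means, stds):
    """Apply (value - mean) / std column by column."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical toy data: one feature column and one label column.
rows = [[float(i), 10.0 * i + 5.0] for i in range(10)]

train_rows, test_rows = train_test_split_rows(rows)
means, stds = fit_standardizer(train_rows)            # statistics from train only
train_scaled = standardize(train_rows, means, stds)
test_scaled = standardize(test_rows, means, stds)     # same statistics reused
```

The test rows are transformed with the training statistics, so they never influence the normalization.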
I found that quasimodo's suggestion of normalizing the dataset before passing it to my model was the ideal solution. It scaled all columns to the range 0 to 1 as expected and allowed me to display the data prior to training to validate that it was correct.
For whatever reason, the Keras Normalization layer was not working in my case.
x = training_Data.values
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(x)
training_Data = pd.DataFrame(x_scaled)
# normalize the data using keras
model_normalizer_layer = tf.keras.layers.Normalization(axis=-1)
model_normalizer_layer.adapt(training_Data)
normalized_training_Data = model_normalizer_layer(training_Data)
The only part that I have yet to figure out is how to scale the model's predictions back to the original range of the column. I'm sure it's simple, but I'm stumped.
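Min-max scaling maps each column with (x - min) / (max - min), so a prediction can be mapped back with scaled * (max - min) + min using the minimum and maximum of the label column. The sketch below shows this in plain Python; the toy rows and the assumption that the label is the last column are hypothetical. With scikit-learn, MinMaxScaler.inverse_transform performs the same inversion, but it expects an array with the same number of columns the scaler was fitted on, so for a scaler fitted on features and label together it can be simpler to use the label column's own minimum and maximum (stored in the scaler's data_min_ and data_max_ attributes).

```python
def fit_min_max(rows):
    """Return a (min, max) pair per column, like the fit step of a min-max scaler."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def scale(value, lo, hi):
    """Map a value from [lo, hi] to [0, 1]."""
    return (value - lo) / (hi - lo)


def unscale(scaled, lo, hi):
    """Invert scale(): map a value in [0, 1] back to [lo, hi]."""
    return scaled * (hi - lo) + lo


# Hypothetical data: two feature columns, label in the last column.
rows = [
    [1.0, 200.0, 50.0],
    [2.0, 400.0, 70.0],
    [3.0, 300.0, 90.0],
]
ranges = fit_min_max(rows)
label_lo, label_hi = ranges[-1]

# Pretend the model produced these predictions in the scaled label space.
scaled_predictions = [0.0, 0.25, 1.0]
predictions = [unscale(p, label_lo, label_hi) for p in scaled_predictions]
print(predictions)  # [50.0, 60.0, 90.0]
```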

keras training on big datasets separately keras

I am working on a Keras denoising neural network that denoises high-dimensional X-ray images. The idea is to train on some datasets, e.g. 1, 2, 3, and then start a new training run on other datasets, e.g. 4, 5, 6, with the weights initialized from the previous run. Implementation-wise it works; however, the weights from the last rotation perform well only on the datasets that were used in that rotation. The same goes for the other rotations.
In other words, the weights from training on datasets 4, 5, 6 do not give results on an image from dataset 1 that are as good as those from the weights trained on datasets 1, 2, 3, which is not what I intend.
The idea is that the weights should be tuned to work with all datasets effectively, since the whole dataset does not fit into memory.
I tried other solutions, such as a custom generator that reads images from disk and trains in batches, but that is very slow because it depends on disk I/O and on the time complexity of the processing functions inside the custom Keras generator.
Below is code that shows what I am doing. I have 12 datasets, separated into 4 checkpoints. The data is loaded, training runs, and the path of the final saved model is appended to a list; the next rotation loads the model from the previous rotation and continues.
EPOCHES = 150
NUM_CHKPTS = 4
weights = []
for chk in range(1,NUM_CHKPTS+1):
    log_dir = os.path.join(os.getcwd(), 'resnet_checkpts_' + str(EPOCHES) + "_tl2_chkpt" + str(chk))
    if not os.path.isdir(log_dir):
        os.makedirs(log_dir)
    else:
        print('Training log directory already exists # {}.'.format(log_dir))
    tb_output = TensorBoard(log_dir=log_dir, histogram_freq=1)
    print("Loading Data From CHKPT #" + str(chk))
    h5f = h5py.File('C:\\autoencoder\\datasets\\mix\\chk' + str(chk) + '.h5','r')
    org_patch = h5f['train_data'][:]
    noisy_patch = h5f['train_noisy'][:]
    h5f.close()
    input_patch, test_patch, noisy_patch, test_noisy_patch = train_test_split(org_patch, noisy_patch, train_size=0.8, shuffle=True)
    print("Reshaping")
    train_data = np.array([np.reshape(input_patch[i], (52, 52, 1)) for i in range(input_patch.shape[0])], dtype = np.float32)
    train_noisy_data = np.array([np.reshape(noisy_patch[i], (52, 52, 1)) for i in range(noisy_patch.shape[0])], dtype = np.float32)
    test_data = np.array([np.reshape(test_patch[i], (52, 52, 1)) for i in range(test_patch.shape[0])], dtype = np.float32)
    test_noisy_data = np.array([np.reshape(test_noisy_patch[i], (52, 52, 1)) for i in range(test_noisy_patch.shape[0])], dtype = np.float32)
    print('Number of training samples are:', train_data.shape[0])
    print('Number of test samples are:', test_data.shape[0])
    # IN = np.ones((len(XTRAINFILES), 52, 52, 1 ))
    if chk == 1:
        print("Generating the Model For The First Time..")
        autoencoder_model = model_autoencoder(train_noisy_data)
        print("Done!")
    else:
        autoencoder_model=load_model(weights[chk-2])
    checkpt_path = log_dir + r"\\cp-{epoch:04d}.ckpt"
    checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpt_path, verbose=0, save_weights_only=True, save_freq='epoch')
    optimizer = tf.keras.optimizers.Adam(lr=0.0001)
    autoencoder_model.compile(loss='mse',optimizer=optimizer)
    autoencoder_model.fit(train_noisy_data, train_data,
                          batch_size=128,
                          epochs=EPOCHES, shuffle=True, verbose=1,
                          validation_data=(test_noisy_data, test_data),
                          callbacks=[tb_output, checkpoint_callback])
    weight_dir = log_dir+'\\model_resnet_new_OL' + str(EPOCHES) + 'epochs.h5'
    weights.append(weight_dir)
    autoencoder_model.save(weight_dir) # Defined saved model name by number of epochs.
TensorBoard graphs (rotations 1, 2, 3, 4 from top to bottom):
Your model will forget the previous datasets as you train on a new dataset.
In deep reinforcement learning (DRL), when games are used for training, you create a replay memory that collects data from different rounds of the game, because each round produces different data. Some of that data is then chosen at random to train the model. That way the DRL model can learn to play different rounds of the game without forgetting previous rounds.
You can try to create a single dataset by taking some random samples from each dataset.
When you train the model on a new dataset, make sure that data from all previous rotations is included in the current rotation.
Also, in transfer learning, when you train a model on a new dataset, you freeze the earlier layers so that the model doesn't forget its previous training. You are not using transfer learning, but when you start training on the second dataset, the first dataset will still slowly be forgotten by the weights.
You can try freezing the initial layers of the decoder so that they are not updated when extracting features, assuming all of the datasets contain similar images. That way your model will not forget its previous training, as in transfer learning. Even so, when you train on a new dataset, the previous ones will still tend to be forgotten.
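The suggestion of mixing samples from earlier rotations into the current one can be sketched as follows. This is a minimal illustration in plain Python, not the asker's pipeline: the function name build_mixed_rotation, the replay_per_dataset parameter and the tuple stand-ins for (noisy, clean) patch pairs are all hypothetical.

```python
import random


def build_mixed_rotation(current, previous_datasets, replay_per_dataset, seed=0):
    """Return the current rotation plus a random sample from each earlier dataset."""
    rng = random.Random(seed)
    mixed = list(current)
    for old in previous_datasets:
        k = min(replay_per_dataset, len(old))
        mixed.extend(rng.sample(old, k))   # replay a few old examples
    rng.shuffle(mixed)                     # interleave old and new examples
    return mixed


# Hypothetical stand-ins for (noisy_patch, clean_patch) training pairs.
rotation_1 = [("r1", i) for i in range(100)]
rotation_2 = [("r2", i) for i in range(100)]
rotation_3 = [("r3", i) for i in range(100)]

mixed = build_mixed_rotation(rotation_3, [rotation_1, rotation_2],
                             replay_per_dataset=20)
counts = {tag: sum(1 for t, _ in mixed if t == tag) for tag in ("r1", "r2", "r3")}
print(counts)  # {'r1': 20, 'r2': 20, 'r3': 100}
```

Only the replayed subset of each earlier dataset has to be kept in memory alongside the current rotation, which is the reason for sampling rather than concatenating everything.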

Restore best checkpoint to an estimator tensorflow 2.x

Briefly, I put in place a data input pipeline using the TensorFlow Dataset API. Then I implemented a CNN model for classification using Keras, which I converted to an estimator. I fed my estimator's TrainSpec and EvalSpec with my input_fn, which provides the input data for training and evaluation. As a final step, I launched the model training with tf.estimator.train_and_evaluate:
def my_input_fn(tfrecords_path):
    dataset = (...)
    return batch_fbanks, batch_labels
def build_model():
    model = tf.keras.models.Sequential()
    model.add(...)
    model.compile(...)
    return model
model = build_model()
run_config=tf.estimator.RunConfig(model_dir,save_summary_steps=100,save_checkpoints_steps=1000)
estimator = tf.keras.estimator.model_to_estimator(model,config=run_config)
def serving_input_receiver_fn():
    inputs = {'Conv1_input': tf.compat.v1.placeholder(shape=[None, 11,120,1], dtype=tf.float32)}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)
exporter = tf.estimator.BestExporter(serving_input_receiver_fn, name="best_exporter", exports_to_keep=5)
train_spec_dnn = tf.estimator.TrainSpec(input_fn = lambda: my_input_fn(train_data_path),hooks=[hook])
eval_spec_dnn = tf.estimator.EvalSpec(input_fn = lambda: my_eval_input_fn(eval_data_path),exporters=exporter,start_delay_secs=0,throttle_secs=15)
tf.estimator.train_and_evaluate(estimator, train_spec_dnn, eval_spec_dnn)
I save the 5 best checkpoints using tf.estimator.BestExporter as shown above. Once training is finished, I want to reload the best model and convert it to an estimator to re-evaluate the model and predict on a new dataset. However, my issue is restoring the checkpoint to an estimator. I tried several solutions, but each time I don't get the estimator object I need to run its evaluate and predict methods.
To be more specific, each of the best-checkpoint directories is organized as follows:
./
    variables/
        variables.data-00000-of-00002
        variables.data-00001-of-00002
        variables.index
    saved_model.pb
So the question is: how can I get an estimator object from the best checkpoint so that I can use it to evaluate my model and predict on new data?
Note: I found some proposed solutions relying on TensorFlow v1 features, which cannot solve my problem because I work with TF v2.
Thanks a lot, any help is appreciated.
You can use the class below, which is derived from tf.estimator.BestExporter.
Whenever an evaluation gives a better result than the best one so far, it copies the
corresponding checkpoint files to a separate folder.
Below is the class:
import shutil, glob, os
import tensorflow as tf
# import tensorflow.logging as logging
## the path where all the checkpoint reside
BEST_CHECKPOINTS_PATH_FROM = 'PATH TO ALL CHECKPOINT FILES'
## the path it will save the best exporter checkpoint files
BEST_CHECKPOINTS_PATH_TO = 'PATH TO BEST EXPORTER CHECKPOINT FILES TO BE SAVE'
class BestCheckpointsExporter(tf.estimator.BestExporter):
    # Note: this override relies on the private attributes _best_eval_result and
    # _compare_fn of BestExporter and does not call the parent's export().
    def export(self, estimator, export_path, checkpoint_path, eval_result,is_the_final_export):
        if self._best_eval_result is None or \
                self._compare_fn(self._best_eval_result, eval_result):
            #print('Exporting a better model ({} instead of {})...'.format(eval_result, self._best_eval_result))
            # make sure the destination folder exists before copying into it
            os.makedirs(BEST_CHECKPOINTS_PATH_TO, exist_ok=True)
            for name in glob.glob(checkpoint_path + '.*'):
                print(name)
                print(os.path.join(BEST_CHECKPOINTS_PATH_TO, os.path.basename(name)))
                shutil.copy(name, os.path.join(BEST_CHECKPOINTS_PATH_TO, os.path.basename(name)))
            # also save the text file used by the estimator api to find the best checkpoint
            with open(os.path.join(BEST_CHECKPOINTS_PATH_TO, "checkpoint"), 'w') as f:
                f.write("model_checkpoint_path: \"{}\"".format(os.path.basename(checkpoint_path)))
            self._best_eval_result = eval_result
        else:
            print('Keeping the current best model ({} instead of {}).'.format(self._best_eval_result, eval_result))
Example usage of the class
Simply replace the exporter with an instance of this class, passing it the serving_input_receiver_fn.
def serving_input_receiver_fn():
    inputs = {'my_dense_input': tf.compat.v1.placeholder(shape=[None, 4], dtype=tf.float32)}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)
exporter = BestCheckpointsExporter(serving_input_receiver_fn=serving_input_receiver_fn)
train_spec_dnn = tf.estimator.TrainSpec(input_fn = input_fn, max_steps=5)
eval_spec_dnn = tf.estimator.EvalSpec(input_fn=input_fn,exporters=exporter,start_delay_secs=0,throttle_secs=15)
(x, y) = tf.estimator.train_and_evaluate(keras_estimator, train_spec_dnn, eval_spec_dnn)
At this point, it will save the checkpoint files of the best exported model in the folder you have specified.
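To make the copy step concrete, here is a small standalone sketch that uses only the standard library and fake checkpoint files. It mirrors the file handling of the export method above: copy every file that shares the checkpoint prefix, then write the small "checkpoint" text file that tf.train.latest_checkpoint reads. The helper name copy_checkpoint, the temporary folders and the prefix model.ckpt-1000 are assumptions made for illustration.

```python
import glob
import os
import shutil
import tempfile


def copy_checkpoint(checkpoint_path, dest_dir):
    """Copy all files of one checkpoint and write the 'checkpoint' index file."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for name in glob.glob(checkpoint_path + ".*"):
        target = os.path.join(dest_dir, os.path.basename(name))
        shutil.copy(name, target)
        copied.append(target)
    # Text file that records which checkpoint prefix is the latest in dest_dir.
    with open(os.path.join(dest_dir, "checkpoint"), "w") as f:
        f.write('model_checkpoint_path: "{}"'.format(os.path.basename(checkpoint_path)))
    return sorted(copied)


# Create fake checkpoint files in a temporary "model_dir" (contents are placeholders).
model_dir = tempfile.mkdtemp()
best_dir = os.path.join(tempfile.mkdtemp(), "best")
prefix = os.path.join(model_dir, "model.ckpt-1000")
for suffix in (".index", ".meta", ".data-00000-of-00001"):
    with open(prefix + suffix, "w") as f:
        f.write("placeholder")

copied = copy_checkpoint(prefix, best_dir)
print(sorted(os.listdir(best_dir)))
```

After the copy, the destination folder holds the three checkpoint files plus the "checkpoint" index file.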
For loading the checkpoint files you need to do the following steps:
Step 1: Rebuild your model instance
def build_model():
    model = tf.keras.models.Sequential()
    model.add(...)
    model.compile(...)
    return model
model = build_model()
Step 2: use the model load_weights API
Reference URL: https://www.tensorflow.org/tutorials/keras/save_and_load
ck_path = tf.train.latest_checkpoint('PATH TO BEST EXPORTER CHECKPOINT FILES')
model.load_weights(ck_path)
## From there you will be able to call the predict & evaluate the functionality of the trained model
##PREDICT
prediction = model.predict(x)
##EVALUATE
for features_batch, labels_batch in input_fn().take(1):
    model.evaluate(features_batch, labels_batch)
Note: All of these have been simulated on google colab.

retraining last layer of inception-v3 significantly slows the classification

In an attempt for transfer learning over inception-v3 with TF and PY3.5, I've tested two approaches:
1- retraining the last layer, as shown here: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/image_retraining
2- Apply linear SVM on top of inception-V3 bottlenecks as demonstrated here: https://www.kernix.com/blog/image-classification-with-a-pre-trained-deep-neural-network_p11
I expected them to have a similar runtime for the classification phase, since the critical part, the bottleneck extraction, is identical. In practice, though, the retrained network is about 8X slower when running classification.
My question is whether anyone has an idea of the reason for this.
Some code snippets:
SVM on top (the faster):
def getTensors():
    graph_def = tf.GraphDef()
    # use a context manager so the file handle is closed after reading
    with open('classify_image_graph_def.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())
    tensorBottleneck, tensorsResizedImage = tf.import_graph_def(graph_def, name='', return_elements=['pool_3/_reshape:0', 'Mul:0'])
    return tensorBottleneck, tensorsResizedImage
def calc_bottlenecks(imgFile, tensorBottleneck, tensorsResizedImage):
    """ - read, decode and resize to get <resizedImage> - """
    bottleneckValues = sess.run(tensorBottleneck, {tensorsResizedImage : resizedImage})
    return np.squeeze(bottleneckValues)
This takes about 0.5 sec on my (Windows) laptop while the SVM part takes no time.
Retraining last layer - (this is harder to summarize since longer code)
def loadGraph(pbFile):
    with tf.gfile.FastGFile(pbFile, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')
    with tf.Session() as sess:
        softmaxTensor = sess.graph.get_tensor_by_name('final_result:0')
def labelImage(imageFile, softmaxTensor):
    with tf.Session() as sess:
        input_layer_name = 'DecodeJpeg/contents:0'
        predictions, = sess.run(softmaxTensor, {input_layer_name: image_data})
'pbFile' is the file saved by the retrainer, which is supposed to have the same topology and weights as 'classify_image_graph_def.pb', excluding the classification layer. This takes about 4 sec to run (on the same laptop, not counting the loading).
Any idea for the performance gap?
Thanks!
Solved. The problem was creating a new tf.Session() for every image. Storing the session when reading the graph and reusing it brought the runtime back to the expected level.
def loadGraph(pbFile):
    ...
    # create the session without a "with" block so that it stays open after
    # this function returns; the caller should close it when done
    sessToStore = tf.Session()
    softmaxTensor = sessToStore.graph.get_tensor_by_name('final_result:0')
    return softmaxTensor, sessToStore
def labelImage(imageFile, softmaxTensor, sessToStore):
    input_layer_name = 'DecodeJpeg/contents:0'
    predictions, = sessToStore.run(softmaxTensor, {input_layer_name: image_data})

Reporting accuracy and loss issues with MonitoredTrainingSession

I am performing transfer learning on InceptionV3 for a dataset of 5 types of flowers. All layers are frozen except the output layer. My implementation is heavily based on the Cifar10 tutorial from TensorFlow, and the input dataset is formatted in the same way as Cifar10.
I have added a MonitoredTrainingSession (like in the tutorial) to report the accuracy and loss after a certain number of steps. Below is the section of the code for the MonitoredTrainingSession (almost identical to the tutorial):
class _LoggerHook(tf.train.SessionRunHook):
    def begin(self):
        self._step = -1
        self._start_time = time.time()
    def before_run(self,run_context):
        self._step+=1
        return tf.train.SessionRunArgs([loss,accuracy])
    def after_run(self,run_context,run_values):
        if self._step % LOG_FREQUENCY ==0:
            current_time = time.time()
            duration = current_time - self._start_time
            self._start_time = current_time
            loss_value = run_values.results[0]
            acc = run_values.results[1]
            examples_per_sec = LOG_FREQUENCY/duration
            sec_per_batch = duration / LOG_FREQUENCY
            format_str = ('%s: step %d, loss = %.2f, acc = %.2f (%.1f examples/sec; %.3f sec/batch)')
            print(format_str %(datetime.now(),self._step,loss_value,acc,
                               examples_per_sec,sec_per_batch))
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
if MODE == 'train':
    file_writer = tf.summary.FileWriter(LOGDIR,tf.get_default_graph())
    with tf.train.MonitoredTrainingSession(
            save_checkpoint_secs=70,
            checkpoint_dir=LOGDIR,
            hooks=[tf.train.StopAtStepHook(last_step=NUM_EPOCHS*NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN),
                   tf.train.NanTensorHook(loss),
                   _LoggerHook()],
            config=config) as mon_sess:
        original_saver.restore(mon_sess,INCEPTION_V3_CHECKPOINT)
        print("Proceeding to training stage")
        while not mon_sess.should_stop():
            mon_sess.run(train_op,feed_dict={training:True})
            print('acc: %f' %mon_sess.run(accuracy,feed_dict={training:False}))
            print('loss: %f' %mon_sess.run(loss,feed_dict={training:False}))
When the two lines printing the accuracy and loss under mon_sess.run(train_op... are removed, the loss and accuracy printed from after_run report, after surprisingly only 20 minutes of training, that the model is performing very well on the training set and that the loss is decreasing. Even the moving average loss reported great results. The accuracy eventually exceeds 90% for multiple random batches.
After the training session had been reporting high accuracy for a while, I stopped it, restored the model, and ran it on random batches from the same training set. It performed poorly, achieving only between 50% and 85% accuracy. I confirmed that it was restored properly because it did perform better than a model with an untrained output layer.
I then resumed training from the last checkpoint. The accuracy was initially low, but after about 10 mini-batch runs it went back above 90%. I then repeated the process, but this time added the two lines for evaluating the loss and accuracy after the training operation. Those two evaluations reported that the model was having trouble converging and was performing poorly, while the evaluations via before_run and after_run now showed high accuracy and low loss only occasionally (the results jumped around). Still, after_run sometimes reported 100% accuracy. (I think it is no longer consistent because after_run is also called for mon_sess.run(accuracy...) and mon_sess.run(loss...).)
Why would the results reported from MonitoredTrainingSession indicate that the model is performing well when it really isn't? Aren't the two operations in SessionRunArgs fed with the same mini-batch as train_op, indicating model performance on the batch before the gradient update?
Here is the code I used for restoring and testing the model (based on the cifar10 tutorial):
elif MODE == 'test':
    init = tf.global_variables_initializer()
    ckpt = tf.train.get_checkpoint_state(LOGDIR)
    if ckpt and ckpt.model_checkpoint_path:
        with tf.Session(config=config) as sess:
            init.run()
            saver = tf.train.Saver()
            print(ckpt.model_checkpoint_path)
            saver.restore(sess,ckpt.model_checkpoint_path)
            global_step = tf.contrib.framework.get_or_create_global_step()
            coord = tf.train.Coordinator()
            threads =[]
            try:
                for qr in tf.get_collection(tf.GraphKeys.QUEUE_RUNNERS):
                    threads.extend(qr.create_threads(sess, coord=coord, daemon=True,start=True))
                print('model restored')
                i =0
                num_iter = 4*NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN/BATCH_SIZE
                print(num_iter)
                while not coord.should_stop() and i < num_iter:
                    print("loss: %.2f," %loss.eval(feed_dict={training:False}),end="")
                    print("acc: %.2f" %accuracy.eval(feed_dict={training:False}))
                    i+=1
            except Exception as e:
                print(e)
                coord.request_stop(e)
            coord.request_stop()
            coord.join(threads,stop_grace_period_secs=10)
Update:
So I was able to fix the issue. However, I am not sure why it worked. In the arg_scope for the inception model I was passing in an is_training Boolean placeholder for the batch norm and dropout used by inception. However, when I removed the placeholder and just set the is_training keyword to True, the accuracy on the training set when the model was restored was extremely high. This was the same model checkpoint that previously performed poorly. When I trained it, I always had the is_training placeholder set to True. Having is_training set to True while testing would mean batch norm is now using the sample mean and variance.
Why would telling Batch Norm to now use the sample average and sample standard deviation like it does during training increase the accuracy?
This would also mean that the dropout layer is dropping units and that the model's accuracy during testing on both the training set and test set is higher with the dropout layer enabled.
Update 2
I went through the tensorflow slim inceptionv3 model code that the arg_scope in the code above is referencing. I removed the final dropout layer after the Avg pool 8x8 and the accuracy remained at around 99%. However, when I set is_training to False only for the batch norm layers, the accuracy dropped back to around 70%. Here is the arg_scope from slim\nets\inception_v3.py and my modification.
with variable_scope.variable_scope(
        scope, 'InceptionV3', [inputs, num_classes], reuse=reuse) as scope:
    with arg_scope(
            [layers_lib.batch_norm],is_training=False): #layers_lib.dropout], is_training=is_training):
        net, end_points = inception_v3_base(
            inputs,
            scope=scope,
            min_depth=min_depth,
            depth_multiplier=depth_multiplier)
I tried this with both the dropout layer removed and the dropout layer kept with passing in is_training=True to the dropout layer.
(Summarizing from dylan7's debugging in the question's comments)
Batch norm relies on variables to save the summary statistics it normalizes with. These are only updated when is_training is True through an UPDATE_OPS collection (see the batch_norm documentation). If these update ops don't get run (or the variables are overwritten), there may be transient "reasonable" statistics based on each batch which get lost when is_training is False (testing data is not, and should not be, used to inform batch_norm summary statistics).
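The effect described above can be reproduced with a toy, scalar batch norm in plain Python. This is only a sketch under simplifying assumptions, not the slim implementation: the class ToyBatchNorm, its momentum value and the run_update_ops flag (a stand-in for whether the ops in the UPDATE_OPS collection are actually run during training) are hypothetical.

```python
import statistics


class ToyBatchNorm:
    """Scalar batch norm without learned scale/offset, for illustration only."""

    def __init__(self, momentum=0.9, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.moving_mean = 0.0   # typical initial values of the saved statistics
        self.moving_var = 1.0

    def __call__(self, batch, is_training, run_update_ops=True):
        if is_training:
            # Training mode: normalize with the statistics of the current batch.
            mean = statistics.fmean(batch)
            var = statistics.pvariance(batch)
            if run_update_ops:
                m = self.momentum
                self.moving_mean = m * self.moving_mean + (1 - m) * mean
                self.moving_var = m * self.moving_var + (1 - m) * var
        else:
            # Inference mode: normalize with the saved moving statistics.
            mean, var = self.moving_mean, self.moving_var
        return [(x - mean) / (var + self.eps) ** 0.5 for x in batch]


batch = [98.0, 100.0, 102.0]

# Case 1: the update ops never run, so the moving statistics stay at their
# initial values and inference-mode outputs are far from normalized.
stale = ToyBatchNorm()
for _ in range(200):
    stale(batch, is_training=True, run_update_ops=False)
stale_out = stale(batch, is_training=False)

# Case 2: the update ops run on every training step, so the moving statistics
# converge to the batch statistics and inference matches training behavior.
updated = ToyBatchNorm()
for _ in range(200):
    updated(batch, is_training=True, run_update_ops=True)
updated_out = updated(batch, is_training=False)

print(stale_out)
print(updated_out)
```

In case 1 the layer still looks fine whenever is_training is True, because it then uses the statistics of whatever batch it is given, which matches the behavior reported in the question.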
