Saving Learning Progress of Neural Network in TensorFlow

I trained and applied a neural network using TensorFlow in Python. Now I need to save its progress so that I can train it further at a later point in time.
I went through a lot of configurations using
saver = tf.train.Saver()
saver.restore()
tf.train.import_meta_graph('model/model.ckpt.meta')
tf.train.export_meta_graph('model/model.ckpt.meta')
etc...
but it always produces an error.
Here is my code. It is similar to the MNIST example code, but it uses custom generated input and has a single, continuous output neuron.
x = tf.placeholder('float', [None, 4000]) # 4000 is my structure, just an example
y = tf.placeholder('float') # And I need a single, continuous output
def train_neural_network(x):
    testdata_images, testdata_labels = generate_training_data_batch(size = 10)
    #Generates test data
    data=[]
    for i in range(how_many_batches):
        data.append(generate_training_data_batch(size = 10))
    #Generates training data
    prediction = neural_network_model(x)
    # neural_network_model() is defined as a 4000x15x15x10x1 neural network
    cost = tf.reduce_mean( tf.square( tf.subtract(y, prediction) ) )
    optimizer = tf.train.AdamOptimizer(0.01).minimize(cost)
    hm_epochs = 10
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for epoch in range(hm_epochs):
            epoch_loss = 0
            for i in range(how_many_batches):
                epoch_x, epoch_y = data[i]
                _, c = sess.run([optimizer, cost], feed_dict = {x: epoch_x, y: epoch_y})
                epoch_loss += c
            accuracy1 = tf.subtract(y, prediction)
            result = sess.run(accuracy1, feed_dict={x: epoch_x, y: epoch_y})
            print(result)
            # This is just here so I can see what is going on
        saver.save(sess, 'model/model.ckpt')
        tf.train.export_meta_graph('model/model.ckpt.meta')
    tf.reset_default_graph()
Later in the same file I want to use the saved neural network to make some predictions with it:
train_neural_network(x)
X, Y = generate_training_data_batch(size = 1)
prediction = neural_network_model(x)
with tf.Session() as sess:
    tf.train.import_meta_graph('model/model.ckpt.meta')
    sess.run(tf.global_variables_initializer())
    thought = sess.run(prediction, feed_dict={x: X})
    print(Y, thought)
With this version I get the error message
ValueError: Tensor("Variable:0", shape=(4000, 15), dtype=float32_ref) must be from the same graph as Tensor("Placeholder_35:0", shape=(?, 4000), dtype=float32).
I also got error messages like
ValueError: At least two variables have the same name: Variable/Adam
I have been looking for a solution to this for a few weeks now, so I would be very relieved to finally get this sorted out.
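No answer to this question is recorded here. As a rough, framework-free illustration of the save-then-resume pattern the question is after, the sketch below trains a tiny linear model, writes its parameters to disk, and later loads them to continue training. Everything in it (build_model, train, save, restore, the JSON file) is hypothetical and stands in for the real model; it is not the tf.train.Saver API.

```python
import json
import os
import tempfile


def build_model():
    """Create fresh parameters for a tiny linear model y = w * x + b."""
    return {"w": 0.0, "b": 0.0}


def loss(params, data):
    """Mean squared error of the model on (x, y) pairs."""
    return sum((params["w"] * x + params["b"] - y) ** 2 for x, y in data) / len(data)


def train(params, data, epochs, lr=0.05):
    """Plain gradient descent on the mean squared error; updates params in place."""
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (params["w"] * x + params["b"] - y) * x for x, y in data) / n
        grad_b = sum(2 * (params["w"] * x + params["b"] - y) for x, y in data) / n
        params["w"] -= lr * grad_w
        params["b"] -= lr * grad_b
    return params


def save(params, path):
    with open(path, "w") as f:
        json.dump(params, f)


def restore(path):
    with open(path) as f:
        return json.load(f)


# Hypothetical training data drawn from y = 3x + 1.
data = [(x, 3.0 * x + 1.0) for x in (0.0, 0.5, 1.0, 1.5, 2.0)]
ckpt = os.path.join(tempfile.mkdtemp(), "model.json")

# First run: train a little, then save.
params = train(build_model(), data, epochs=20)
loss_before = loss(params, data)
save(params, ckpt)

# Later run: load the stored values instead of re-initializing, then keep training.
resumed = restore(ckpt)
loss_restored = loss(resumed, data)
train(resumed, data, epochs=200)
loss_after = loss(resumed, data)
```

The point of the sketch is that the later run never initializes the parameters again; it only loads them. In the TensorFlow code above, the analogous step would presumably be calling saver.restore(sess, 'model/model.ckpt') on a single, freshly built graph in place of sess.run(tf.global_variables_initializer()); treat that as a suggestion to try rather than a verified fix.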

Related

How to save and restore the convolution autoencoder neural network model

I use a convolutional autoencoder neural network to train my model and then save it. When I restore the model to reconstruct an image that is similar to the training images, the reconstruction is very bad and the loss is large. I am not sure whether I am saving and loading the files incorrectly.
Training the model and saving it:
#--------------------------------------------------------------------------
x = tf.placeholder(tf.float32, [None, dim], name = "X")
y = tf.placeholder(tf.float32, [None, dim], name = "Y")
keepprob = tf.placeholder(tf.float32, name = "K")
pred = cae(x, weights, biases, keepprob, imgsize)["out"]
cost = tf.reduce_sum(tf.square(cae(x, weights, biases, keepprob,imgsize)["out"] - tf.reshape(y, shape=[-1, imgsize, imgsize, 1])))
learning_rate = 0.01
optm = tf.train.AdamOptimizer(learning_rate).minimize(cost)
#--------------------------------------------------------------------------
sess = tf.Session()
save_model = os.path.join(PATH,'temp_saved_model')
saver = tf.train.Saver()
tf.add_to_collection("COST", cost)
tf.add_to_collection("PRED", pred)
sess.run(tf.global_variables_initializer())
mean_img = np.zeros((dim))
batch_size = 100
n_epochs = 1000
for epoch_i in range(n_epochs):
    for batch_i in range(ntrain // batch_size):
        trainbatch = np.array(train)
        trainbatch = np.array([img - mean_img for img in trainbatch])
        sess.run(optm, feed_dict={x: trainbatch, y: trainbatch, keepprob: 1.})
save_path = saver.save(sess, save_model)
print('Model saved in file: %s' %save_path)
sess.close()
Restoring the model and trying to reconstruct the image:
tf.reset_default_graph()
save_model = os.path.join(PATH + 'SaveModel/','temp_saved_model.meta')
imgsize = 64
dim = imgsize * imgsize
mean_img = np.zeros((dim))
with tf.Session() as sess:
    saver = tf.train.import_meta_graph(save_model)
    saver.restore(sess, tf.train.latest_checkpoint(PATH + 'SaveModel/'))
    cost = tf.get_collection("COST")[0]
    pred = tf.get_collection("PRED")[0]
    graph = tf.get_default_graph()
    x = graph.get_tensor_by_name("X:0")
    y = graph.get_tensor_by_name("Y:0")
    k = graph.get_tensor_by_name("K:0")
    for i in range(10):
        test_xs = np.array(data)
        test = load_image(test_xs, imgsize)
        test = np.array([img - mean_img for img in test])
        print ("[%02d/%02d] cost: %.4f" % (i, 10, sess.run(cost, feed_dict={x: test, y: test, K: 1.})))
The loss value in the training process is 1.321..., but the reconstruction loss is 16545.10441... Is there something wrong in my code?
First, make sure that your restore and save functions are in different files.
There are a few problems that I have found so far:
keepprob changes from 'K' to 'k' when the graph is rebuilt after the restore.
You are passing the same images as both logits and labels (this doesn't make sense unless you are trying to learn an identity function).
You are calculating the training cost before saving the model and the validation/test cost after restoring the model.
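One further thing that may be worth checking when comparing the two cost values, offered as an assumption rather than a confirmed diagnosis: the cost in the question is built with tf.reduce_sum over squared errors, so its magnitude grows with the number of images and pixels fed in. The plain-Python sketch below uses hypothetical data and helper names to show how a summed error differs between a small and a large batch even when the per-pixel error is identical, while a mean stays comparable.

```python
def sum_squared_error(preds, targets):
    """Sum of squared differences over every value in every row."""
    return sum(
        (p - t) ** 2
        for pred_row, target_row in zip(preds, targets)
        for p, t in zip(pred_row, target_row)
    )


def mean_squared_error(preds, targets):
    """The same error, divided by the total number of values."""
    n_values = sum(len(row) for row in targets)
    return sum_squared_error(preds, targets) / n_values


# Hypothetical batches: every "pixel" is off by the same amount (0.1).
small_targets = [[0.5] * 4 for _ in range(2)]
small_preds = [[0.6] * 4 for _ in range(2)]
large_targets = [[0.5] * 4 for _ in range(200)]
large_preds = [[0.6] * 4 for _ in range(200)]

sse_small = sum_squared_error(small_preds, small_targets)
sse_large = sum_squared_error(large_preds, large_targets)
mse_small = mean_squared_error(small_preds, small_targets)
mse_large = mean_squared_error(large_preds, large_targets)

print(sse_small, sse_large)  # the summed error scales with the batch size
print(mse_small, mse_large)  # the mean error does not
```

If the training and test batches differ in size, comparing means (or sums over equally sized batches) gives a fairer picture of whether the restored model really performs worse.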
Your code in the save function:
recon = sess.run(pred, feed_dict={x: testbatch, keepprob: 1.})
fig, axs = plt.subplots(2, n_examples, figsize=(15, 4))
for example_i in range(5):
    axs[0][example_i].matshow(np.reshape(testbatch[example_i, :], (imgsize, imgsize)), cmap=plt.get_cmap('gray'))
    axs[1][example_i].matshow(np.reshape(np.reshape(recon[example_i, ...], (dim,)) + mean_img, (imgsize, imgsize)), cmap=plt.get_cmap('gray'))
plt.show()
Your code in the restore function:
recon = sess.run(pred, feed_dict={x: test, k: 1.})
cost = sess.run(cost, feed_dict={x: test, y: test, k: 1.})
if (i % 2) == 0:
    fig, axs = plt.subplots(2, n_examples, figsize=(15, 4))
    for example_i in range(n_examples):
        axs[0][example_i].matshow(np.reshape(test[example_i, :], (imgsize, imgsize)), cmap=plt.get_cmap('gray'))
        axs[1][example_i].matshow(np.reshape(np.reshape(recon[example_i, ...], (dim,)) + mean_img, (imgsize, imgsize)), cmap=plt.get_cmap('gray'))
    plt.show()
Also, nowhere in your code are you printing or plotting the cost; even in your restore module you are plotting the recon variable.
If you are trying to test an encoder-decoder pair that regenerates the original image, your model is a bit too small (shallow). If that makes sense, try implementing it; if you are confused, check out this link: https://pgaleone.eu/neural-networks/deep-learning/2016/12/13/convolutional-autoencoders-in-tensorflow/
And in any case, feel free to add comments for further clarification.

TF | How to predict from CNN after training is done

I am trying to work with the framework provided in the Stanford cs231n course, given the code below.
I can see the accuracy improving and the net is trained. However, after training and checking the results on the validation set, how would I feed a single image into the model and see its prediction?
I have searched around and couldn't find a built-in predict function in TensorFlow like the one in Keras.
Initializing the net and its parameters:
# clear old variables
tf.reset_default_graph()
# setup input (e.g. the data that changes every batch)
# The first dim is None, and gets set automatically based on batch size fed in
X = tf.placeholder(tf.float32, [None, 30, 30, 1])
y = tf.placeholder(tf.int64, [None])
is_training = tf.placeholder(tf.bool)
def simple_model(X,y):
    # define our weights (e.g. init_two_layer_convnet)
    # setup variables
    Wconv1 = tf.get_variable("Wconv1", shape=[7, 7, 1, 32]) # Filter of size 7x7 with depth of 1. No. of filters is 32
    bconv1 = tf.get_variable("bconv1", shape=[32])
    W1 = tf.get_variable("W1", shape=[4608, 360]) # 4608 is 12x12x32 where 12x12 is the output of a 7x7 filter with stride 2 on a 30x30 image (VALID padding).
    b1 = tf.get_variable("b1", shape=[360])
    # define our graph (e.g. two_layer_convnet)
    a1 = tf.nn.conv2d(X, Wconv1, strides=[1,2,2,1], padding='VALID') + bconv1
    h1 = tf.nn.relu(a1)
    h1_flat = tf.reshape(h1,[-1,4608])
    y_out = tf.matmul(h1_flat,W1) + b1
    return y_out
y_out = simple_model(X,y)
# define our loss
total_loss = tf.losses.hinge_loss(tf.one_hot(y,360),logits=y_out)
mean_loss = tf.reduce_mean(total_loss)
# define our optimizer
optimizer = tf.train.AdamOptimizer(5e-4) # select optimizer and set learning rate
train_step = optimizer.minimize(mean_loss)
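The flattened size 4608 used for W1 and the reshape follows from the usual output-size formula for a convolution without padding, floor((input - filter) / stride) + 1. The helper below is a hypothetical sketch (conv_output_size is not a TensorFlow function) that reproduces the number for the 30x30 input, 7x7 filter, stride 2 and 32 filters used in the code.

```python
def conv_output_size(input_size, filter_size, stride, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * padding - filter_size) // stride + 1


# Values taken from the model above: 30x30 input, 7x7 filter, stride 2, no padding.
side = conv_output_size(30, 7, 2)
flat_features = side * side * 32  # 32 filters

print(side, flat_features)  # 12 4608
```

If the input size, filter size or stride changes, the first dimension of W1 and the reshape have to change with it.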
Function that evaluates the model, for either training or validation, and plots the results:
def run_model(session, predict, loss_val, Xd, yd,
              epochs=1, batch_size=64, print_every=100,
              training=None, plot_losses=False):
    # Have tensorflow compute accuracy
    correct_prediction = tf.equal(tf.argmax(predict,1), y)
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    # shuffle indices
    train_indicies = np.arange(Xd.shape[0])
    np.random.shuffle(train_indicies)
    training_now = training is not None
    # setting up variables we want to compute and optimize
    # if we have a training function, add that to things we compute
    variables = [mean_loss,correct_prediction,accuracy]
    if training_now:
        variables[-1] = training
    # counter
    iter_cnt = 0
    for e in range(epochs):
        # keep track of losses and accuracy
        correct = 0
        losses = []
        # make sure we iterate over the dataset once
        for i in range(int(math.ceil(Xd.shape[0]/batch_size))):
            # generate indices for the batch
            start_idx = (i*batch_size)%Xd.shape[0]
            idx = train_indicies[start_idx:start_idx+batch_size]
            # create a feed dictionary for this batch
            feed_dict = {X: Xd[idx,:],
                         y: yd[idx],
                         is_training: training_now }
            # get batch size
            actual_batch_size = yd[idx].shape[0]
            # have tensorflow compute loss and correct predictions
            # and (if given) perform a training step
            loss, corr, _ = session.run(variables,feed_dict=feed_dict)
            # aggregate performance stats
            losses.append(loss*actual_batch_size)
            correct += np.sum(corr)
            # print every now and then
            if training_now and (iter_cnt % print_every) == 0:
                print("Iteration {0}: with minibatch training loss = {1:.3g} and accuracy of {2:.2g}"\
                      .format(iter_cnt,loss,np.sum(corr)/actual_batch_size))
            iter_cnt += 1
        total_correct = correct/Xd.shape[0]
        total_loss = np.sum(losses)/Xd.shape[0]
        print("Epoch {2}, Overall loss = {0:.3g} and accuracy of {1:.3g}"\
              .format(total_loss,total_correct,e+1))
        if plot_losses:
            plt.plot(losses)
            plt.grid(True)
            plt.title('Epoch {} Loss'.format(e+1))
            plt.xlabel('minibatch number')
            plt.ylabel('minibatch loss')
            plt.show()
    return total_loss,total_correct
The function calls that train the model:
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print('Training')
    run_model(sess,y_out,mean_loss,x_train,y_train,1,64,100,train_step,True)
    print('Validation')
    run_model(sess,y_out,mean_loss,x_val,y_val,1,64)
You do not need to go far: simply pass your new (test) feature matrix X_test into the network and perform a forward pass; the output layer is the prediction. The code is something like this:
session.run(y_out, feed_dict={X: X_test})
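The forward pass returns one row of class scores per image, so a class label still has to be picked from each row; run_model above does this with tf.argmax(predict, 1). As a minimal sketch of that last step in plain Python, assuming the scores have already been fetched into a nested list (predict_classes and the sample scores are hypothetical):

```python
def predict_classes(scores):
    """Return the index of the largest score in each row (argmax along axis 1)."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Hypothetical output of the forward pass for two images and three classes.
scores = [
    [0.1, 2.5, -1.0],
    [3.0, 0.2, 0.4],
]
labels = predict_classes(scores)
print(labels)  # [1, 0]
```

For a single image, X_test still needs a leading batch dimension of 1 (shape [1, 30, 30, 1] for the placeholder above), and the call has to run in a session where the trained variables are still live.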

TensorFlow for one-hot classification, cost is always 0

This follows on from this post (not mine): TensorFlow for binary classification
I had a similar issue and converted my data to use one-hot encoding. However, I'm still getting a cost of 0. Interestingly, the accuracy is correct (90%) when I feed my training data back into it.
Code below:
# Set parameters
learning_rate = 0.02
training_iteration = 2
batch_size = int(np.size(y_vals)/300)
display_step = 1
numOfFeatures = 20 # 784 if MNIST
numOfClasses = 2 #10 if MNIST dataset
# TF graph input
x = tf.placeholder("float", [None, numOfFeatures])
y = tf.placeholder("float", [None, numOfClasses])
# Create a model
# Set model weights to random numbers: https://www.tensorflow.org/api_docs/python/tf/random_normal
W = tf.Variable(tf.random_normal(shape=[numOfFeatures,1])) # Weight vector
b = tf.Variable(tf.random_normal(shape=[1,1])) # Constant
# Construct a linear model
model = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax
# Minimize error using cross entropy
# Cross entropy
cost_function = -tf.reduce_sum(y*tf.log(model))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost_function)
# Initializing the variables
init = tf.global_variables_initializer()
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for iteration in range(training_iteration):
        avg_cost = 0.
        total_batch = int(len(x_vals)/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs = x_vals[i*batch_size:(i*batch_size)+batch_size]
            batch_ys = y_vals_onehot[i*batch_size:(i*batch_size)+batch_size]
            # Fit training using batch data
            sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})
            # Compute average loss
            avg_cost += sess.run(cost_function, feed_dict={x: batch_xs, y: batch_ys})/total_batch
        # Display logs per iteration step
        if iteration % display_step == 0:
            print ("Iteration:", '%04d' % (iteration + 1), "cost=", "{:.9f}".format(avg_cost))
    print ("Tuning completed!")
    # Evaluation function
    correct_prediction = tf.equal(tf.argmax(model, 1), tf.argmax(y, 1))
    #correct_prediction = tf.equal(model, y)
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    # Test the model
    print ("Accuracy:", accuracy.eval({x: x_vals_test, y: y_vals_test_onehot}))
Your cost output uses:
"{:.9f}".format(avg_cost)
so maybe you can replace 9 with a bigger number.
OK, here is what I found in the end.
Replace:
b = tf.Variable(tf.random_normal(shape=[1,1]))
with:
b = tf.Variable(tf.zeros([1]))
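Separately from the fix reported above, one property of the posted graph may be worth noting, as an observation rather than something the posts state: W has shape [numOfFeatures, 1], so the model produces a single logit per example, and a softmax over a single value is always 1.0. Since log(1.0) is 0, a cross entropy of the form -sum(y * log(model)) then comes out as zero whatever the weights are. The plain-Python sketch below (hypothetical helper names and values) shows the effect and contrasts it with two logits.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


one_hot = [0.0, 1.0]  # hypothetical label for a two-class problem

# One logit per example: softmax is always [1.0], regardless of the logit value.
single = softmax([-3.7])
# The single probability is broadcast against both label columns.
cost_single = -sum(t * math.log(single[0]) for t in one_hot)

# Two logits per example: the probabilities differ and the cost is positive.
double = softmax([0.3, -1.2])
cost_double = -sum(t * math.log(p) for t, p in zip(one_hot, double))

print(single, cost_single)
print(double, cost_double)
```

If this is what happens in the posted code, giving W the shape [numOfFeatures, numOfClasses] (and b a matching shape) would be the change to try; that suggestion is an assumption, not part of the original posts.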

TensorFlow cannot feed value error

I am implementing a logistic regression function. It is quite simple and works properly up until I get to the part where I want to calculate its accuracy. Here is my logistic regression...
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
# tf Graph Input
x = tf.get_variable("input_image", shape=[100,784], dtype=tf.float32)
x_placeholder = tf.placeholder(tf.float32, shape=[100, 784])
assign_x_op = x.assign(x_placeholder).op
y = tf.placeholder(shape=[100,10], name='input_label', dtype=tf.float32) # 0-9 digits recognition => 10 classes
# set model weights
W = tf.get_variable("weights", shape=[784, 10], dtype=tf.float32, initializer=tf.random_normal_initializer())
b = tf.get_variable("biases", shape=[1, 10], dtype=tf.float32, initializer=tf.zeros_initializer())
# construct model
logits = tf.matmul(x, W) + b
pred = tf.nn.softmax(logits) # Softmax
# minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate).minimize(cost)
# initializing the variables
init = tf.global_variables_initializer()
saver = tf.train.Saver()
# launch the graph
with tf.Session() as sess:
    sess.run(init)
    # training cycle
    for epoch in range(FLAGS.training_epochs):
        avg_cost = 0
        total_batch = int(mnist.train.num_examples/FLAGS.batch_size)
        # loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(FLAGS.batch_size)
            # Assign the contents of `batch_xs` to variable `x`.
            sess.run(assign_x_op, feed_dict={x_placeholder: batch_xs})
            _, c = sess.run([optimizer, cost], feed_dict={y: batch_ys})
            # compute average loss
            avg_cost += c / total_batch
        # display logs per epoch step
        if (epoch + 1) % FLAGS.display_step == 0:
            print("Epoch:", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(avg_cost))
    save_path = saver.save(sess, "/tmp/model.ckpt")
    print("Model saved in file: %s" % save_path)
    print("Optimization Finished!")
As you can see, it is a basic logistic regression function and it works perfectly.
It is important to note that batch_size is 100.
Now, after the code snippet above, I try the following...
# list of booleans to determine the correct predictions
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
print(correct_prediction.eval({x_placeholder:mnist.test.images, y:mnist.test.labels}))
# calculate total accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print("Accuracy:", accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))
However the code fails on correct_prediction. I get the following error...
% (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (10000, 784) for Tensor 'Placeholder:0', which has shape '(100, 784)'
I believe I get this error because of the value I am trying to assign the placeholder for x. How can I fix this? Do I need to reshape the array?
In
x_placeholder = tf.placeholder(tf.float32, shape=[100, 784])
y = tf.placeholder(shape=[100,10], name='input_label', dtype=tf.float32) # 0-9
avoid fixing the first dimension as 100, since it prohibits you from using any other batch size (so if the number of images in mnist.test.images is different from 100, you'll get an error). Instead specify them as None:
x_placeholder = tf.placeholder(tf.float32, shape=[None, 784])
y = tf.placeholder(shape=[None,10], name='input_label', dtype=tf.float32) #
Then you can use any batch size.
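The rule behind the error message can be sketched in a few lines of plain Python: a fed value is accepted only if every dimension either matches the placeholder's declared size or the declared size is None. The helper below is hypothetical and only mimics the check; it is not TensorFlow's actual implementation.

```python
def is_compatible(value_shape, placeholder_shape):
    """True if a value of value_shape could be fed to a placeholder of placeholder_shape.

    A None entry in placeholder_shape accepts any size in that position.
    """
    if len(value_shape) != len(placeholder_shape):
        return False
    return all(p is None or p == v for v, p in zip(value_shape, placeholder_shape))


test_images_shape = (10000, 784)  # shape reported in the error message

print(is_compatible(test_images_shape, (100, 784)))   # False: first dimension fixed at 100
print(is_compatible(test_images_shape, (None, 784)))  # True: batch dimension left open
print(is_compatible((100, 784), (None, 784)))         # True: training batches still fit
```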

TensorFlow variable configuration

I successfully implemented a feed-forward algorithm in TensorFlow that looked as follows...
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder(tf.float32, [None, 10]) # 0-9 digits recognition => 10 classes
# set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
# construct model
logits = tf.matmul(x, W) + b
pred = tf.nn.softmax(logits) # Softmax
# minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate).minimize(cost)
# initializing the variables
init = tf.global_variables_initializer()
...and the training cycle was as follows...
# launch the graph
with tf.Session() as sess:
    sess.run(init)
    # training cycle
    for epoch in range(FLAGS.training_epochs):
        avg_cost = 0
        total_batch = int(mnist.train.num_examples/FLAGS.batch_size)
        # loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(FLAGS.batch_size)
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs, y: batch_ys})
...the rest of the code is not necessary. Up to this point the code works perfectly. It is important to note that my batch_size is 100. The problem is that I am using tf.placeholder for my values, but in fact I need to change them to use tf.get_variable. The first thing I did was change the following...
# tf Graph Input
x = tf.get_variable("input_image", shape=[100,784], dtype=tf.float32)
y = tf.placeholder(shape=[100,10], name='input_label', dtype=tf.float32) # 0-9 digits recognition => 10 classes
# set model weights
W = tf.get_variable("weights", shape=[784, 10], dtype=tf.float32, initializer=tf.random_normal_initializer())
b = tf.get_variable("biases", shape=[1, 10], dtype=tf.float32, initializer=tf.zeros_initializer())
# construct model
logits = tf.matmul(x, W) + b
pred = tf.nn.softmax(logits) # Softmax
# minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate).minimize(cost)
# initializing the variables
init = tf.global_variables_initializer()
...so far so good. But now I am trying to implement the training cycle and this is where I run into issues. I run the exact same training cycle as above with batch_size = 100 and I get the following errors...
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 0 of node GradientDescent/update_input_image/ApplyGradientDescent was passed float from _recv_input_image_0:0 incompatible with expected float_ref.
How can I fix this issue? The error is coming from the following line...
_, c = sess.run([optimizer, cost], feed_dict={x: batch_xs, y: batch_ys})
It's unclear to me why you needed to change x to a tf.Variable when you are continuing to feed a value for it. There are two workarounds (not counting the case where you could just revert x to being tf.placeholder() as in the working code):
The error is being raised because the optimizer is attempting to apply an SGD update to the value that you're feeding (which leads to a confusing runtime type error). You could prevent optimizer from doing this by passing trainable=False when constructing x:
x = tf.get_variable("input_image", shape=[100, 784], dtype=tf.float32,
                    trainable=False)
Since x is a variable, you could assign the image to the variable in a separate step before running the optimizer.
x = tf.get_variable("input_image", shape=[100, 784], dtype=tf.float32)
x_placeholder = tf.placeholder(tf.float32, shape=[100, 784])
assign_x_op = x.assign(x_placeholder).op
# ...
for i in range(total_batch):
    batch_xs, batch_ys = mnist.train.next_batch(FLAGS.batch_size)
    # Assign the contents of `batch_xs` to variable `x`.
    sess.run(assign_x_op, feed_dict={x_placeholder: batch_xs})
    # N.B. Now you do not need to feed `x`.
    _, c = sess.run([optimizer, cost], feed_dict={y: batch_ys})
This latter version would allow you to perform gradient descent on the contents of the image (which might be why you'd want to store it in a variable in the first place).
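To make the role of trainable=False concrete, here is a toy stand-in for an optimizer step, written in plain Python. The Var class and sgd_step function are hypothetical and much simpler than TensorFlow's optimizer; the sketch only shows the idea that an update is applied to trainable variables and skipped for the rest.

```python
class Var:
    """Toy variable: a float value plus a flag saying whether it may be updated."""

    def __init__(self, value, trainable=True):
        self.value = value
        self.trainable = trainable


def sgd_step(variables, grads, lr):
    """Apply one gradient-descent update, skipping non-trainable variables."""
    for var, grad in zip(variables, grads):
        if var.trainable:
            var.value -= lr * grad


# x plays the role of the input image, w the role of a weight.
x = Var(2.0, trainable=False)
w = Var(0.5)

# Loss = (w * x - 3)^2, with gradients computed by hand.
err = w.value * x.value - 3.0
grad_x = 2 * err * w.value
grad_w = 2 * err * x.value

sgd_step([x, w], [grad_x, grad_w], lr=0.1)
print(x.value, w.value)  # x is unchanged, w has moved toward a lower loss
```

With trainable left at True for x, the same step would also change the input, which is the behaviour the second workaround above deliberately allows.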

Resources