I am trying to write a neural network that recognizes the XOR function from scratch. The full code is here (in Python 3).
I am currently getting the error:
ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients
I am new to TensorFlow and I don't understand why this is happening. Can anyone help me out in correcting my code? Thanks in advance.
P.S. If more details are required in the question, do let me know before downvoting. Thanks again!
Edit: here is the relevant part of the code:
def initialize_parameters():
# Create Weights and Biases for Hidden Layer and Output Layer
W1 = tf.get_variable("W1", [2, 2], initializer = tf.contrib.layers.xavier_initializer())
b1 = tf.get_variable("b1", [2, 1], initializer = tf.zeros_initializer())
W2 = tf.get_variable("W2", [1, 2], initializer = tf.contrib.layers.xavier_initializer())
b2 = tf.get_variable("b2", [1, 1], initializer = tf.zeros_initializer())
parameters = {
"W1" : W1,
"b1" : b1,
"W2" : W2,
"b2" : b2
}
return parameters
def forward_propogation(X, parameters):
threshold = tf.constant(0.5, name = "threshold")
W1, b1 = parameters["W1"], parameters["b1"]
W2, b2 = parameters["W2"], parameters["b2"]
Z1 = tf.add(tf.matmul(W1, X), b1)
A1 = tf.nn.relu(Z1)
tf.squeeze(A1)
Z2 = tf.add(tf.matmul(W2, A1), b2)
A2 = tf.round(tf.sigmoid(Z2))
print(A2.shape)
tf.squeeze(A2)
A2 = tf.reshape(A2, [1, 1])
print(A2.shape)
return A2
def compute_cost(A, Y):
logits = tf.transpose(A)
labels = tf.transpose(Y)
cost = tf.nn.sigmoid_cross_entropy_with_logits(logits = logits, labels = labels)
return cost
def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001, num_epochs = 1500):
ops.reset_default_graph()
(n_x, m) = X_train.shape
n_y = Y_train.shape[0]
costs = []
X, Y = create_placeholders(n_x, n_y)
parameters = initialize_parameters()
A2 = forward_propogation(X, parameters)
cost = compute_cost(A2, Y)
optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate).minimize(cost)
init = tf.global_variables_initializer()
with tf.Session() as session:
session.run(init)
for epoch in range(num_epochs):
epoch_cost = 0
_, epoch_cost = session.run([optimizer, cost], feed_dict = {X : X_train, Y : Y_train})
parameters = session.run(parameters)
correct_prediction = tf.equal(tf.argmax(A2), tf.argmax(Y))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print("Training Accuracy is {0} %...".format(accuracy.eval({X : X_train, Y : Y_train})))
print("Test Accuracy is {0} %...".format(accuracy.eval({X : X_test, Y : Y_test})))
return parameters
The error is caused by the use of tf.round when you define A2 (a known issue, by the way).
In this particular task, the solution is simply not to use tf.round at all. Remember that the output of tf.sigmoid is a value between 0 and 1, which can be interpreted as the probability of the result being 1. The cross-entropy loss function measures the distance to the target, 0 or 1, and computes the needed updates to the weights based on this distance. Calling tf.round before the cross-entropy squashes the probability to either 0 or 1, and that makes the cross-entropy pretty meaningless.
By the way, tf.losses.softmax_cross_entropy should work better, because you've already applied the sigmoid yourself in the second layer.
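As a minimal sketch (keeping the question's function names and its 2-2-1 architecture, and additionally reducing the cost to a scalar, which the original compute_cost did not do), the forward pass can return the raw logits and leave any rounding to evaluation time only:
import tensorflow as tf

def forward_propogation(X, parameters):
    W1, b1 = parameters["W1"], parameters["b1"]
    W2, b2 = parameters["W2"], parameters["b2"]
    Z1 = tf.add(tf.matmul(W1, X), b1)
    A1 = tf.nn.relu(Z1)
    Z2 = tf.add(tf.matmul(W2, A1), b2)  # raw logits: no sigmoid, no round
    return Z2

def compute_cost(Z2, Y):
    # sigmoid_cross_entropy_with_logits applies the sigmoid internally,
    # so it must receive the pre-sigmoid logits and stays differentiable
    return tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(logits=Z2, labels=Y))

# hard 0/1 predictions, used only for evaluation, never inside the loss:
# predictions = tf.round(tf.sigmoid(Z2))
Rounding (or thresholding at 0.5) is then done only when you report accuracy, outside of anything the optimizer differentiates.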
Related
I am trying to pass 3 values through the same network at one time, since I need the values of all 3 vectors for calculating the triplet loss. But it gives an error when I pass the second value.
The code snippet is:
# runs the siamese network
def forward_prop(x):
w1 = tf.get_variable("w1", [n1, 2048], initializer=tf.contrib.layers.xavier_initializer()) * 0.01
b1 = tf.get_variable("b1", [n1, 1], initializer=tf.zeros_initializer())*0.01
z1 = tf.add(tf.matmul(w1, x), b1) # n1*2048 x 2048*batch_size = n1*batch_size
a1 = tf.nn.relu(z1) # n1*batch_size
w2 = tf.get_variable("w2", [n2, n1], initializer=tf.contrib.layers.xavier_initializer()) * 0.01
b2 = tf.get_variable("b2", [n2, 1], initializer=tf.zeros_initializer()) * 0.01
z2 = tf.add(tf.matmul(w2, a1), b2) # n2*n1 x n1*batch_size = n2*batch_size
a2 = tf.nn.relu(z2) # n2*batch_size
w3 = tf.get_variable("w3", [n3, n2], initializer=tf.contrib.layers.xavier_initializer()) * 0.01
b3 = tf.get_variable("b3", [n3, 1], initializer=tf.zeros_initializer()) * 0.01
z3 = tf.add(tf.matmul(w3, a2), b3) # n3*n2 x n2*batch_size = n3*batch_size
a3 = tf.nn.relu(z3) # n3*batch_size
w4 = tf.get_variable("w4", [n4, n3], initializer=tf.contrib.layers.xavier_initializer()) * 0.01
b4 = tf.get_variable("b4", [n4, 1], initializer=tf.zeros_initializer()) * 0.01
z4 = tf.add(tf.matmul(w4, a3), b4) # n4*n3 x n3*batch_size = n4*batch_size
a4 = tf.nn.relu(z4) # n4*batch_size = 128*batch_size (128 feature vectors for all training examples)
return a4
def back_prop():
anchor_embeddings = forward_prop(x1)
positive_embeddings = forward_prop(x2)
negative_embeddings = forward_prop(x3)
# finding sum of squares of distances
distance_positive = tf.reduce_sum(tf.square(anchor_embeddings - positive_embeddings), 0)
distance_negative = tf.reduce_sum(tf.square(anchor_embeddings - negative_embeddings), 0)
# applying the triplet loss equation
triplet_loss = tf.maximum(0., distance_positive - distance_negative + margin)
triplet_loss = tf.reduce_mean(triplet_loss)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(triplet_loss)
with tf.Session as sess:
sess.run(tf.global_variables_initializer())
feed_dict = {
x1: anchors,
x2: positives,
x3: negatives
}
print("Starting the Siamese network...")
for epoch in range(total_epochs_net_1):
for _ in range(len(anchors)):
_, triplet_loss = sess.run([optimizer, triplet_loss], feed_dict=feed_dict)
print("Epoch", epoch, "completed out of", total_epochs_net_1)
saver = tf.train.Saver()
saver.save(sess, 'face_recognition_model')
I am getting the error in the following line:
positive_embeddings = forward_prop(x2)
The tf.get_variable call in the forward_prop() function throws the error.
The error says:
ValueError: Variable w1 already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope?
I think it's because the variable w1 gets defined in the first call of the forward_prop() function, in the following line:
anchor_embeddings = forward_prop(x1)
How do I resolve this? I cannot pass the three values separately, since I will need all three values for computing the triplet loss. Any help will be appreciated. Thanks!
You're misconfiguring your network here:
def back_prop():
anchor_embeddings = forward_prop(x1)
positive_embeddings = forward_prop(x2)
negative_embeddings = forward_prop(x3)
You should only define one network. You're erroneously defining a separate set of variables for each of the 3 inputs, so effectively 3 neural networks are being built here.
For triplet loss what you want to do is feed the 3 inputs in as a batch to a single network (all 3 get processed by the same network), not as individual variables. For this discussion, I'll assume your inputs are images and you're training on a single set of 3 inputs on each training step.
If your images are 256x256x1 (grayscale) in size, then a single triplet batch would be of shape [3 x 256 x 256 x 1]. Now your output will be of shape [3 x size_of_your_output_layer]. Your loss function should now be written with the understanding that the first axis there represents your 3 values: anchor, positive, negative. Compute the loss appropriately.
You can, of course, pass in multiple anchors, positives, and negatives; you'll just have to deal with this in more detail in the loss function, but it is perfectly doable. My triplet loss functions have gotten pretty complex, though, so I suggest keeping it simple to start. A rough sketch of the single-network idea follows below.
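This is only a sketch of the approach, not the poster's exact code: it assumes flattened 2048-dimensional inputs in a batch-first layout, an illustrative margin value, and tf.layers.dense in place of the hand-rolled weight matrices. The three inputs are stacked along the batch axis, run through one set of variables, and split back apart for the loss:
import tensorflow as tf

margin = 0.2  # illustrative value

def embed(x):
    # one network, one set of variables, reused for everything fed through it
    with tf.variable_scope("siamese", reuse=tf.AUTO_REUSE):
        h = tf.layers.dense(x, 256, activation=tf.nn.relu)
        return tf.layers.dense(h, 128)  # 128-dimensional embedding

# batch-first placeholders: [batch_size, 2048]
x1 = tf.placeholder(tf.float32, [None, 2048])  # anchors
x2 = tf.placeholder(tf.float32, [None, 2048])  # positives
x3 = tf.placeholder(tf.float32, [None, 2048])  # negatives

# stack the triplet along the batch axis and run the network once
stacked = tf.concat([x1, x2, x3], axis=0)      # [3 * batch_size, 2048]
embeddings = embed(stacked)                    # [3 * batch_size, 128]
anchor, positive, negative = tf.split(embeddings, 3, axis=0)

d_pos = tf.reduce_sum(tf.square(anchor - positive), axis=1)
d_neg = tf.reduce_sum(tf.square(anchor - negative), axis=1)
triplet_loss = tf.reduce_mean(tf.maximum(0.0, d_pos - d_neg + margin))
optimizer = tf.train.AdamOptimizer(1e-4).minimize(triplet_loss)
Note that reuse=tf.AUTO_REUSE on the variable scope would also, on its own, silence the "Variable w1 already exists" error if you prefer to keep three separate forward_prop calls; the batched version just keeps the graph smaller.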
I have a simple TensorFlow model whose accuracy is 1. But when I try to predict some new inputs it always returns zero (0).
import numpy as np
import tensorflow as tf
sess = tf.InteractiveSession()
# generate data
np.random.seed(10)
#inputs = np.random.uniform(low=1.2, high=1.5, size=[5000, 150]).astype('float32')
inputs = np.random.randint(low=50, high=500, size=[5000, 150])
label = np.random.uniform(low=1.3, high=1.4, size=[5000, 1])
# reverse_label = 1 - label
reverse_label = np.random.uniform(
low=1.3, high=1.4, size=[5000, 1])
reverse_label1 = np.random.randint(
low=80, high=140, size=[5000, 1])
#labels = np.append(label, reverse_label, 1)
#labels = np.append(labels, reverse_label1, 1)
labels = reverse_label1
print(inputs)
print(labels)
# parameters
learn_rate = 0.001
epochs = 100
n_input = 150
n_hidden = 15
n_output = 1
# set weights/biases
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_output])
b0 = tf.Variable(tf.truncated_normal([n_hidden], stddev=0.2, seed=0))
b1 = tf.Variable(tf.truncated_normal([n_output], stddev=0.2, seed=0))
w0 = tf.Variable(tf.truncated_normal([n_input, n_hidden], stddev=0.2, seed=0))
w1 = tf.Variable(tf.truncated_normal([n_hidden, n_output], stddev=0.2, seed=0))
# step function
def returnPred(x, w0, w1, b0, b1):
z1 = tf.add(tf.matmul(x, w0), b0)
a2 = tf.nn.relu(z1)
z2 = tf.add(tf.matmul(a2, w1), b1)
h = tf.nn.relu(z2)
return h # return the first response vector from the
y_ = returnPred(x, w0, w1, b0, b1) # predict operation
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
logits=y_, labels=y)) # calculate loss between prediction and actual
model = tf.train.AdamOptimizer(learning_rate=learn_rate).minimize(
loss) # apply gradient descent based on loss
init = tf.global_variables_initializer()
tf.Session = sess
sess.run(init) # initialize graph
for step in range(0, epochs):
sess.run([model, loss], feed_dict={x: inputs, y: labels}) # train model
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: inputs, y: labels})) # print accuracy
inp = np.random.randint(low=50, high=500, size=[5, 150])
print(sess.run(tf.argmax(y_, 1), feed_dict={x: inp})) # predict some new inputs
All the functions run properly, and my problem is with the last line of code. I tried just "y_" instead of "tf.argmax(y_, 1)", but that did not work either.
How can I fix that?
Regards,
There are multiple mistakes in your code.
Starting with these lines of code:
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: inputs, y: labels})) # print accuracy
You are performing linear regression, but you are checking accuracy with a logistic-regression (classification) methodology. If you want to see how your regression network is performing, print the loss and make sure it is decreasing after each epoch of training.
To see why that accuracy code is misleading, run the following:
print(y_.get_shape()) # Outputs (?, 1)
There is only one output unit, so both tf.argmax(y, 1) and tf.argmax(y_, 1) will always return [0, 0, ...]. As a result your accuracy will always be 1.0. Delete those three lines of code.
Next, to get the outputs, just run the following code:
print(sess.run(y_, feed_dict={x: inp}))
But since your data is random, don't expect a good set of outputs.
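Putting that together, a small sketch of the training loop with loss monitoring and prediction, reusing the names already defined in the question's script (sess, model, loss, y_, x, y, inputs, labels, epochs):
# train and watch the loss instead of the (meaningless) accuracy
for step in range(epochs):
    _, l = sess.run([model, loss], feed_dict={x: inputs, y: labels})
    if step % 10 == 0:
        print("epoch", step, "loss", l)

# for a regression-style target like this, a squared-error loss such as
# tf.reduce_mean(tf.square(y_ - y)) would arguably be a more natural choice
# than softmax cross-entropy over a single output unit

# predictions for new inputs: run the output tensor itself, no argmax needed
inp = np.random.randint(low=50, high=500, size=[5, 150])
print(sess.run(y_, feed_dict={x: inp}))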
I created a TensorFlow neural network that has 2 hidden layers with 10 units each, using ReLU activations and Xavier initialization for the weights. The output layer has 1 unit outputting a binary classification (0 or 1) using the sigmoid activation function, to classify whether it believes a passenger on the Titanic survived based on the input features.
(The only code omitted is the load_data function which populates the variables X_train, Y_train, X_test, Y_test used later in the program)
Parameters
# Hyperparams
learning_rate = 0.001
lay_dims = [10,10, 1]
# Other params
m = X_train.shape[1]
n_x = X_train.shape[0]
n_y = Y_train.shape[0]
Inputs
X = tf.placeholder(tf.float32, shape=[X_train.shape[0], None], name="X")
norm = tf.nn.l2_normalize(X, 0) # normalize inputs
Y = tf.placeholder(tf.float32, shape=[Y_train.shape[0], None], name="Y")
Initialize Weights & Biases
W1 = tf.get_variable("W1", [lay_dims[0],n_x], initializer=tf.contrib.layers.xavier_initializer())
b1 = tf.get_variable("b1", [lay_dims[0],1], initializer=tf.zeros_initializer())
W2 = tf.get_variable("W2", [lay_dims[1],lay_dims[0]], initializer=tf.contrib.layers.xavier_initializer())
b2 = tf.get_variable("b2", [lay_dims[1],1], initializer=tf.zeros_initializer())
W3 = tf.get_variable("W3", [lay_dims[2],lay_dims[1]], initializer=tf.contrib.layers.xavier_initializer())
b3 = tf.get_variable("b3", [lay_dims[2],1], initializer=tf.zeros_initializer())
Forward Prop
Z1 = tf.add(tf.matmul(W1,X), b1)
A1 = tf.nn.relu(Z1)
Z2 = tf.add(tf.matmul(W2,A1), b2)
A2 = tf.nn.relu(Z2)
Y_hat = tf.add(tf.matmul(W3,A2), b3)
BackProp
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=tf.transpose(Y_hat), labels=tf.transpose(Y)))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
Session
# Initialize
init = tf.global_variables_initializer()
with tf.Session() as sess:
# Initialize
sess.run(init)
# Normalize Inputs
sess.run(norm, feed_dict={X:X_train, Y:Y_train})
# Forward/Backprob and update weights
for i in range(10000):
c, _ = sess.run([cost, optimizer], feed_dict={X:X_train, Y:Y_train})
if i % 100 == 0:
print(c)
correct_prediction = tf.equal(tf.argmax(Y_hat), tf.argmax(Y))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print("Training Set:", sess.run(accuracy, feed_dict={X: X_train, Y: Y_train}))
print("Testing Set:", sess.run(accuracy, feed_dict={X: X_test, Y: Y_test}))
After running 10,000 epochs of training, the cost goes down each time, which suggests that the learning_rate is okay and that the cost function appears normal. However, after training, all of my Y_hat values (predictions on the training set) are 1 (predicting the passenger survived). So basically the prediction just outputs y=1 for every training example.
Also, when I run tf.argmax on Y_hat, the result is a matrix of all 0's. The same thing happens when tf.argmax is applied to Y (the ground-truth labels), which is odd because Y consists of all the correct labels for the training examples.
Any help is greatly appreciated. Thanks.
I assume your Y_hat is a (1, m) matrix, where m is the number of training examples. Then tf.argmax(Y_hat) will give all zeros. According to the TensorFlow documentation, argmax
Returns the index with the largest value across axes of a tensor.
If you do not pass in an axis, it defaults to 0. Because axis 0 has only one entry here (the single output row), the returned index is always 0.
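For a single sigmoid output, the usual fix is to threshold rather than argmax. A sketch using the question's Y_hat (raw logits of shape (1, m)) and Y:
# probability that each passenger survived, shape (1, m)
probs = tf.sigmoid(Y_hat)
# hard 0/1 predictions via a 0.5 threshold instead of argmax over a size-1 axis
predicted_class = tf.cast(tf.greater(probs, 0.5), tf.float32)
correct_prediction = tf.equal(predicted_class, Y)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))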
I know I am late, but I would also point out that since your label matrix is of shape (n, 1), i.e. there is only 1 class to predict, cross-entropy as used here doesn't make much sense. In such cases you should use something different for calculating the cost (maybe mean squared error or something similar).
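If you wanted to try that suggestion, a minimal variant (using the question's Y_hat, Y, and learning_rate, and treating the sigmoid output as a score in [0, 1]) could look like:
# regress the sigmoid output onto the 0/1 labels with a squared-error cost
cost = tf.reduce_mean(tf.square(tf.sigmoid(Y_hat) - Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)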
I had a similar problem recently while working on my college project, and I found a workaround: I turned the binary output into 2 classes, such as present and absent, so if it's present the label is [1, 0]. I know this is not the best way to do it, but it can be helpful when you need something working instantly.
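A rough sketch of that two-class workaround, built on top of the question's A2 and lay_dims (the names W3_2c, b3_2c, and the placeholder Y2 are made up here to avoid clashing with the variables already defined above):
# labels become one-hot columns: survived -> [1, 0], did not survive -> [0, 1]
Y2 = tf.placeholder(tf.float32, shape=[2, None], name="Y2")

# output layer now has 2 units instead of 1
W3_2c = tf.get_variable("W3_2c", [2, lay_dims[1]],
                        initializer=tf.contrib.layers.xavier_initializer())
b3_2c = tf.get_variable("b3_2c", [2, 1], initializer=tf.zeros_initializer())
logits2 = tf.add(tf.matmul(W3_2c, A2), b3_2c)          # shape (2, m)

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    labels=tf.transpose(Y2), logits=tf.transpose(logits2)))

# argmax is now meaningful because there are two rows to choose between
correct = tf.equal(tf.argmax(logits2, axis=0), tf.argmax(Y2, axis=0))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))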
So I am training a network to classify images in TensorFlow. After I trained the network, I began working on using it to classify other images. The goal is to import an image, feed it to the classifier, and have it print the result. I am having some trouble getting that part off the ground, though. Here is what I have so far. I found that having tf.argmax(y, 1) gave an error; changing it to 0 fixed that error. However, I am not convinced that it is actually working. I tossed 2 images through the classifier and they both got the same class, even though they are vastly different. I just need some perspective here. Is this valid? Or is there something wrong here that will always feed me the same class (in this case I got class 0 for both of the images I tried)?
Is this even the right way to approach making predictions in TensorFlow? This is just the culmination of my debugging; I am not sure if it is what should be done or not.
from sklearn.model_selection import train_test_split
from sklearn.utils import shuffle
X_train,X_validation,y_train,y_validation=train_test_split(X_train,y_train, test_size=20,random_state=0)
X_train, y_train = shuffle(X_train, y_train)
def LeNet(x):
# Arguments used for tf.truncated_normal, randomly defines variables
# for the weights and biases for each layer
mu = 0
sigma = 0.1
# SOLUTION: Layer 1: Convolutional. Input = 32x32x3. Output = 28x28x6.
conv1_W = tf.Variable(tf.truncated_normal(shape=(5, 5, 3, 6), mean = mu, stddev = sigma))
conv1_b = tf.Variable(tf.zeros(6))
conv1 = tf.nn.conv2d(x, conv1_W, strides=[1, 1, 1, 1], padding='VALID') + conv1_b
# SOLUTION: Activation.
conv1 = tf.nn.relu(conv1)
# SOLUTION: Pooling. Input = 28x28x6. Output = 14x14x6.
conv1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
# SOLUTION: Layer 2: Convolutional. Output = 10x10x16.
conv2_W = tf.Variable(tf.truncated_normal(shape=(5, 5, 6, 16), mean = mu, stddev = sigma))
conv2_b = tf.Variable(tf.zeros(16))
conv2 = tf.nn.conv2d(conv1, conv2_W, strides=[1, 1, 1, 1], padding='VALID') + conv2_b
# SOLUTION: Activation.
conv2 = tf.nn.relu(conv2)
# SOLUTION: Pooling. Input = 10x10x16. Output = 5x5x16.
conv2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
# SOLUTION: Flatten. Input = 5x5x16. Output = 400.
fc0 = flatten(conv2)
# SOLUTION: Layer 3: Fully Connected. Input = 400. Output = 120.
fc1_W = tf.Variable(tf.truncated_normal(shape=(400, 120), mean = mu, stddev = sigma))
fc1_b = tf.Variable(tf.zeros(120))
fc1 = tf.matmul(fc0, fc1_W) + fc1_b
# SOLUTION: Activation.
fc1 = tf.nn.relu(fc1)
# SOLUTION: Layer 4: Fully Connected. Input = 120. Output = 84.
fc2_W = tf.Variable(tf.truncated_normal(shape=(120, 84), mean = mu, stddev = sigma))
fc2_b = tf.Variable(tf.zeros(84))
fc2 = tf.matmul(fc1, fc2_W) + fc2_b
# SOLUTION: Activation.
fc2 = tf.nn.relu(fc2)
# SOLUTION: Layer 5: Fully Connected. Input = 84. Output = 43.
fc3_W = tf.Variable(tf.truncated_normal(shape=(84, 43), mean = mu, stddev = sigma))
fc3_b = tf.Variable(tf.zeros(43))
logits = tf.matmul(fc2, fc3_W) + fc3_b
return logits
import tensorflow as tf
x = tf.placeholder(tf.float32, (None, 32, 32, 3))
y = tf.placeholder(tf.int32, (None))
one_hot_y = tf.one_hot(y, 43)
EPOCHS=10
BATCH_SIZE=128
rate = 0.001
logits = LeNet(x)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, one_hot_y)
loss_operation = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer(learning_rate = rate)
training_operation = optimizer.minimize(loss_operation)
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(one_hot_y, 1))
accuracy_operation = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
saver = tf.train.Saver()
def evaluate(X_data, y_data):
num_examples = len(X_data)
total_accuracy = 0
sess = tf.get_default_session()
for offset in range(0, num_examples, BATCH_SIZE):
batch_x, batch_y = X_data[offset:offset+BATCH_SIZE], y_data[offset:offset+BATCH_SIZE]
accuracy = sess.run(accuracy_operation, feed_dict={x: batch_x, y: batch_y})
total_accuracy += (accuracy * len(batch_x))
return total_accuracy / num_examples
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
num_examples = len(X_train)
print("Training...")
print()
for i in range(EPOCHS):
X_train, y_train = shuffle(X_train, y_train)
for offset in range(0, num_examples, BATCH_SIZE):
end = offset + BATCH_SIZE
batch_x, batch_y = X_train[offset:end], y_train[offset:end]
sess.run(training_operation, feed_dict={x: batch_x, y: batch_y})
validation_accuracy = evaluate(X_validation, y_validation)
print("EPOCH {} ...".format(i+1))
print("Validation Accuracy = {:.3f}".format(validation_accuracy))
print()
saver.save(sess, './lenet')
print("Model saved")
import cv2
image=cv2.imread('File path')
image=cv2.resize(image,(32,32)) #classifier takes 32X32 images
image=np.array(image)
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
saver3 = tf.train.import_meta_graph('./lenet.meta')
saver3.restore(sess, "./lenet")
pred = tf.nn.softmax(logits)
predictions = sess.run(tf.argmax(y,0), feed_dict={x: image})
print (predictions)
So what had to happen here was to first clear the kernel and outputs. Somewhere along the way my placeholders got muddled up, and clearing the kernel fixed that right up. Then I had to realize what really had to get done here: I had to call the softmax function on my new data.
Like this:
pred = tf.nn.softmax(logits)
classification = sess.run(pred, feed_dict={x: image_array})
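For completeness, a hedged sketch of the whole prediction step, assuming the graph built above (x and logits) still exists in the Python session and the checkpoint was saved as './lenet'; the image gets a batch dimension before being fed, and argmax is taken over the prediction rather than over the label placeholder y:
import cv2
import numpy as np
import tensorflow as tf

image = cv2.imread('File path')            # path elided as in the question
image = cv2.resize(image, (32, 32))        # classifier takes 32x32 images
image = np.expand_dims(image, 0)           # shape (1, 32, 32, 3): a batch of one
image = image.astype(np.float32)

pred = tf.nn.softmax(logits)               # class probabilities for the batch

with tf.Session() as sess:
    saver = tf.train.Saver()
    saver.restore(sess, './lenet')         # restore trained weights instead of re-initializing
    class_id = sess.run(tf.argmax(pred, axis=1), feed_dict={x: image})
    print(class_id)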
I am trying to use a TensorFlow DNN for a Kaggle competition. The data is about 100 columns of categorical data, 29 columns of numerical data, and 1 column for the output. What I did was split it into training and testing sets for X and y using scikit-learn's train_test_split function, where X is a list of each row without the "id" or the value that needs to be predicted, and y is the value that needs to be predicted. I then built the model, shown below:
import tensorflow as tf
import numpy as np
import time
import pickle
with open('pickle.pickle', 'rb') as f:
trainX, trainy, testX, testy = pickle.load(f)
trainX = np.array(trainX)
trainy = np.array(trainy)
trainy = trainy.reshape(trainy.shape[0], 1)
testX = np.array(testX)
testy = np.array(testy)
print (trainX.shape)
print (trainy.shape)
testX = testX.reshape(testX.shape[0], 130)
testy = testy.reshape(testy.shape[0], 1)
print (testX.shape)
print (testy.shape)
n_nodes_hl1 = 256
n_nodes_hl2 = 256
n_nodes_hl3 = 256
n_classes = 1
batch_size = 100
# Matrix = h X w
X = tf.placeholder('float', [None, len(trainX[0])])
y = tf.placeholder('float')
def model(data):
hidden_1_layer = {'weights':tf.Variable(tf.random_normal([trainX.shape[1], n_nodes_hl1])),
'biases':tf.Variable(tf.random_normal([n_nodes_hl1]))}
hidden_2_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2])),
'biases':tf.Variable(tf.random_normal([n_nodes_hl2]))}
hidden_3_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl2, n_nodes_hl3])),
'biases':tf.Variable(tf.random_normal([n_nodes_hl3]))}
output_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl3, n_classes])),
'biases':tf.Variable(tf.random_normal([n_classes]))}
# (input_data * weights) + biases
l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['biases'])
l1 = tf.nn.sigmoid(l1)
l2 = tf.add(tf.matmul(l1, hidden_2_layer['weights']), hidden_2_layer['biases'])
l2 = tf.nn.sigmoid(l2)
l3 = tf.add(tf.matmul(l2, hidden_3_layer['weights']), hidden_3_layer['biases'])
l3 = tf.nn.sigmoid(l3)
output = tf.matmul(l3, output_layer['weights']) + output_layer['biases']
return output
def train(x):
pred = model(x)
#loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(pred, y))
loss = tf.reduce_mean(tf.square(pred - y))
optimizer = tf.train.AdamOptimizer(0.01).minimize(loss)
epochs = 1
with tf.Session() as sess:
sess.run(tf.initialize_all_variables())
print ('Beginning Training \n')
for e in range(epochs):
timeS = time.time()
epoch_loss = 0
i = 0
while i < len(trainX):
start = i
end = i + batch_size
batch_x = np.array(trainX[start:end])
batch_y = np.array(trainy[start:end])
_, c = sess.run([optimizer, loss], feed_dict = {x: batch_x, y: batch_y})
epoch_loss += c
i += batch_size
done = time.time() - timeS
print ('Epoch', e + 1, 'completed out of', epochs, 'loss:', epoch_loss, "\nTime:", done, 'seconds\n')
correct = tf.equal(tf.arg_max(pred, 1), tf.arg_max(y, 1))
acc = tf.reduce_mean(tf.cast(correct, 'float'))
print("Accuracy:", acc.eval({x:testX, y:testy}))
train(X)
Output for 1 epoch:
Epoch 1 completed out of 1 loss: 1498498282.5
Time: 1.3765859603881836 seconds
Accuracy: 1.0
I do realize that the loss is very high, and I am using 1 epoch just for testing purposes, and yes, I know my code is quite messy. But all I want to do is print out a prediction. How would I do that? I know that I need to feed a list of features for X, but I just don't understand how to do it. I also don't quite understand why my accuracy is at 1.0, so if you have any suggestions for that, or any ways to change my code, I would be more than happy to listen to any ideas.
Thanks in advance
To get a prediction you just have to evaluate pred, which is the operation that defines the output of the model.
How do you do it? With pred.eval(). But you need an input to evaluate its prediction, so you have to provide a feed_dict dictionary to eval() with the sample (or samples) you want to process.
The resulting code looks like:
predictions = pred.eval(feed_dict = {x:testX})
Notice how this is very similar to acc.eval({x:testX, y:testy}), because the idea is the same. You have an operation (acc in this case) which needs some input to be evaluated, and you can evaluate it either by calling acc.eval() or sess.run(acc) with the corresponding feed_dict with the necessary inputs.
The simplest way would be to use the existing session while training (between iterations):
print(sess.run(pred, {x: X_example}))
where X_example is some NumPy example array.
The line below will give you probability scores for every class; for example, if you have 3 classes, it will give you an array of shape 1x3.
Considering you want a prediction for a single data point X_test, you can do the following:
output = sess.run(pred, {x:X_test})
The index of the maximum number in the output variable above will be your prediction, so we will modify the above statement:
output = sess.run(tf.argmax(pred, 1), {x:X_test})
print("your prediction for X_test is :", output[0])
Another thing you can do is:
output = sess.run(pred, {x:X_test})
output = np.argmax(output)
print("your prediction for X_test is :", output)