Correct way to compute AUC in tensorflow - python-3.x

I'm calculating the area under the curve (AUC) in TensorFlow.
Here is part of my code:
with tf.name_scope("output"):
    W = tf.Variable(tf.random_normal([num_filters_total, num_classes], stddev=0.1), name="W")
    b = tf.Variable(tf.constant(0.1, shape=[num_classes]), name="b")
    l2_loss += tf.nn.l2_loss(W)
    l2_loss += tf.nn.l2_loss(b)
    self.scores = tf.nn.xw_plus_b(self.h_drop, W, b, name="scores")
    self.softmax_scores = tf.nn.softmax(self.scores)
    self.predictions = tf.argmax(self.scores, 1, name="predictions")
# Calculate mean cross-entropy loss
with tf.name_scope("loss"):
    self.losses = tf.nn.softmax_cross_entropy_with_logits(labels=self.input_y, logits=self.scores)
    self.loss = tf.reduce_mean(self.losses) + l2_reg_lambda * l2_loss
# Accuracy
with tf.name_scope("accuracy"):
    correct_predictions = tf.equal(self.predictions, tf.argmax(self.input_y, 1))
    self.accuracy = tf.reduce_mean(tf.cast(correct_predictions, "float"), name="accuracy")
# AUC
with tf.name_scope("auc"):
    self.auc = tf.metrics.auc(labels=tf.argmax(self.input_y, 1), predictions=self.predictions)
In the code above, input_y is a tensor with shape (batch_size, 2) and predictions has shape (batch_size,).
So the actual values passed as labels and predictions to tf.metrics.auc are 0/1 vectors such as [0,1,1,1,0,0,...].
Is this a correct way to compute AUC?
I also tried the following:
self.auc = tf.metrics.auc(labels = tf.argmax(self.input_y, 1), predictions = tf.reduce_max(self.softmax_scores,axis=1))
But this only gives me zeros.
Another thing I noticed is that while the accuracy is quite stable at the end of the training process, the AUC computed by the first method keeps increasing. Is that expected?
Thanks.
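For what it's worth, here is a minimal sketch (plain Python, not the TensorFlow API) of why the choice of predictions matters. AUC measures how well a score ranks positives above negatives, so it is usually fed the score of the positive class, which here would presumably be self.softmax_scores[:, 1]. The hard argmax output throws most of the ranking away, and tf.reduce_max over the softmax gives the confidence in whichever class won rather than a score for class 1. The helper roc_auc and the toy numbers are made up for illustration. Separately, in TF 1.x tf.metrics.auc returns a (value, update_op) pair backed by local variables and accumulates over every batch it sees, which may be why the reported value keeps drifting during training.

```python
def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs where the positive has the higher score.

    Ties count as half a win. This is the pairwise definition of ROC AUC.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical batch: true classes and two-column softmax outputs.
labels = [0, 1, 1, 1, 0, 0]
softmax = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8],
           [0.55, 0.45], [0.7, 0.3], [0.35, 0.65]]

pos_scores = [row[1] for row in softmax]            # probability of class 1
hard_preds = [1 if s > 0.5 else 0 for s in pos_scores]  # what argmax would give
max_scores = [max(row) for row in softmax]          # what reduce_max would give

auc_pos = roc_auc(labels, pos_scores)    # ranking by positive-class probability
auc_hard = roc_auc(labels, hard_preds)   # ranking by 0/1 predictions (many ties)
auc_max = roc_auc(labels, max_scores)    # ranking by "confidence in the winner"

print(auc_pos, auc_hard, auc_max)
```

On this toy batch the three inputs give three different AUC values for the same model outputs, and only the first one is the quantity people normally mean by AUC.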

Related

3-layer feedfoward neural network not predicting regression values accurately

I'm pretty new to Tensorflow. Currently, I'm doing a 3-layer network, with 10 neurons in the hidden layer with ReLU, mini-batch gradient descent size of 8, L2 regularisation weight decay parameter (beta) of 0.001. The Tensorflow version I'm using is 1.14 and I'm on Python 3.6.
The issue that boggles my mind is that my predicted values and testing errors are absolutely off the charts.
For example, I plotted out the test errors and the predicted vs target values for a sample size of 50, and this is what came out.
As you can see, both plots are way off, and I haven't had the slightest clue as to why.
Here's roughly what the dataset looks like. The first column is discarded as it is just a counter value, and the last column is the target.
My code:
NUM_FEATURES = 7
num_neuron = 10
batch_size = 8
beta = 0.001
learning_rate = 0.001
epochs = 4000
seed = 10
np.random.seed(seed)
# read and divide data into test and train sets
total_dataset= np.genfromtxt('dataset_excel.csv', delimiter=',')
X_data, Y_data = total_dataset[1:, 1:8], total_dataset[1:, -1]
Y_data = Y_data.reshape(Y_data.shape[0], 1)
# shuffle input, ensure both are shuffled with the same order
shufflestate = np.random.get_state()
np.random.shuffle(X_data)
np.random.set_state(shufflestate)
np.random.shuffle(Y_data)
# 70% used for training, 30% used for testing
trainX = X_data[:280]
trainY = Y_data[:280]
testX = X_data[280:]
testY = Y_data[280:]
trainX = (trainX - np.mean(trainX, axis=0)) / np.std(trainX, axis=0)
# Create the model
x = tf.placeholder(tf.float32, [None, NUM_FEATURES])
y_ = tf.placeholder(tf.float32, [None, 1])
# get 50 samples for plotting of predicted vs target values
limited50testX = testX[:50]
limited50testY = testY[:50]
# Hidden
with tf.name_scope('hidden'):
    weight1 = tf.Variable(tf.truncated_normal([NUM_FEATURES, num_neuron],stddev=1.0,name='weight1'))
    bias1 = tf.Variable(tf.zeros([num_neuron]),name='bias1')
    hidden = tf.nn.relu(tf.matmul(x, weight1) + bias1)
# output
with tf.name_scope('linear'):
    weight2 = tf.Variable(tf.truncated_normal([num_neuron, 1],stddev=1.0 / np.sqrt(float(num_neuron))),name='weight2')
    bias2 = tf.Variable(tf.zeros([1]),name='bias2')
    logits = tf.matmul(hidden, weight2) + bias2
ridgeLoss = tf.square(y_ - logits)
regularisation = tf.nn.l2_loss(weight1) + tf.nn.l2_loss(weight2)
loss = tf.reduce_mean(ridgeLoss + beta * regularisation)
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
train_op = optimizer.minimize(loss)
error = tf.reduce_mean(tf.square(y_ - logits))
N = len(trainX)
idx = np.arange(N)
predicted=[]
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    train_err = []
    test_err_ = []
    for i in range(epochs):
        for batchStart, batchEnd in zip(range(0, trainX.shape[0], batch_size),range(batch_size, trainX.shape[0], batch_size)):
            train_op.run(feed_dict={x: trainX[batchStart:batchEnd], y_: trainY[batchStart:batchEnd]})
        err = error.eval(feed_dict={x: trainX, y_: trainY})
        train_err.append(err)
        if i % 100 == 0:
            print('iter %d: train error %g' % (i, train_err[i]))
        test_err = error.eval(feed_dict={x: testX, y_: testY})
        test_err_.append(test_err)
    predicted = sess.run(logits, feed_dict={x:limited50testX})
print("predicted values: ", predicted)
print("size of predicted values is", len(predicted))
print("targets: ", limited50testY)
print("size of target values is", len(limited50testY))
#plot predictions vs targets
numberList=np.arange(0, 50, 1).tolist()
predplot = plt.figure(1)
plt.plot(numberList, predicted, label='Predictions')
plt.plot(numberList, limited50testY, label='Targets')
plt.xlabel('50 samples')
plt.ylabel('Value')
plt.legend(loc='lower right')
predplot.show()
# plot training error
trainplot = plt.figure(2)
plt.plot(range(epochs), train_err)
plt.xlabel(str(epochs) + ' iterations')
plt.ylabel('Train Error')
trainplot.show()
#plot testing error
testplot = plt.figure(3)
plt.plot(range(epochs), test_err_)
plt.xlabel(str(epochs) + ' iterations')
plt.ylabel('Test Error')
testplot.show()
Not sure if that's it, but trainX is normalized whereas testX is not. You might want to use the same normalization on testX before predicting.
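A minimal sketch of that suggestion, assuming NumPy arrays split 280/120 as in the question. The random data is a hypothetical stand-in for dataset_excel.csv. The point is that the mean and standard deviation are computed from the training rows only and then reused unchanged on the test rows.

```python
import numpy as np

# Hypothetical stand-in for the 7 feature columns of the real dataset.
rng = np.random.default_rng(10)
X_data = rng.normal(loc=50.0, scale=10.0, size=(400, 7))

trainX, testX = X_data[:280], X_data[280:]

# Statistics come from the training set only, so no test information leaks in.
train_mean = np.mean(trainX, axis=0)
train_std = np.std(trainX, axis=0)

# Apply the same transformation to both splits.
trainX_norm = (trainX - train_mean) / train_std
testX_norm = (testX - train_mean) / train_std

print(trainX_norm.mean(axis=0).round(3))
print(testX_norm.mean(axis=0).round(3))
```

With this in place the network sees test inputs on the same scale it was trained on. In the question's code that would mean feeding the normalized test array wherever testX and limited50testX are used.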

LSTM Time-Series produces shifted forecast?

I am doing a time-series forecast with a LSTM NN and Keras. As input features there are two variables (precipitation and temperature) and the one target to be predicted is groundwater-level.
It seems to be working quite all right, though there is a serious offset between the actual data and the output (see image).
Now I've read that this can be a classic sign of the network not working, as it seems to be mimicking the output:
"what the model is actually doing is that when predicting the value at time “t+1”, it simply uses the value at time “t” as its prediction" (https://towardsdatascience.com/how-not-to-use-machine-learning-for-time-series-forecasting-avoiding-the-pitfalls-19f9d7adf424)
However, this is not actually possible in my case, as the target values are not used as an input variable. I am using a multivariate time series with two features, independent of the output feature.
Also, the predicted values are not offset into the future (t+1) but rather seem to lag behind (t-1).
Does anyone know what could cause this problem?
This is the complete code of my network:
# Split in Input and Output Data
x_1 = data[['MeanT']].values
x_2 = data[['Precip']].values
y = data[['Z_424A_6857']].values
# Scale Data
x = np.hstack([x_1, x_2])
scaler = MinMaxScaler(feature_range=(0, 1))
x = scaler.fit_transform(x)
scaler_out = MinMaxScaler(feature_range=(0, 1))
y = scaler_out.fit_transform(y)
# Reshape Data
x_1, x_2, y = H.create2feature_data(x_1, x_2, y, window)
train_size = int(len(x_1) * .8)
test_size = int(len(x_1)) # * .5
x_1 = np.expand_dims(x_1, 2) # 3D tensor with shape (batch_size, timesteps, input_dim) // (nr. of samples, nr. of timesteps, nr. of features)
x_2 = np.expand_dims(x_2, 2)
y = np.expand_dims(y, 1)
# Split Training Data
x_1_train = x_1[:train_size]
x_2_train = x_2[:train_size]
y_train = y[:train_size]
# Split Test Data
x_1_test = x_1[train_size:test_size]
x_2_test = x_2[train_size:test_size]
y_test = y[train_size:test_size]
# Define Model Input Sets
inputA = Input(shape=(window, 1))
inputB = Input(shape=(window, 1))
# Build Model Branch 1
branch_1 = layers.GRU(16, activation=act, dropout=0, return_sequences=False, stateful=False, batch_input_shape=(batch, 30, 1))(inputA)
branch_1 = layers.Dense(8, activation=act)(branch_1)
#branch_1 = layers.Dropout(0.2)(branch_1)
branch_1 = Model(inputs=inputA, outputs=branch_1)
# Build Model Branch 2
branch_2 = layers.GRU(16, activation=act, dropout=0, return_sequences=False, stateful=False, batch_input_shape=(batch, 30, 1))(inputB)
branch_2 = layers.Dense(8, activation=act)(branch_2)
#branch_2 = layers.Dropout(0.2)(branch_2)
branch_2 = Model(inputs=inputB, outputs=branch_2)
# Combine Model Branches
combined = layers.concatenate([branch_1.output, branch_2.output])
# apply a FC layer and then a regression prediction on the combined outputs
comb = layers.Dense(6, activation=act)(combined)
comb = layers.Dense(1, activation="linear")(comb)
# Accept the inputs of the two branches and then output a single value
model = Model(inputs=[branch_1.input, branch_2.input], outputs=comb)
model.compile(loss='mse', optimizer='adam', metrics=['mse', H.r2_score])
model.summary()
# Training
model.fit([x_1_train, x_2_train], y_train, epochs=epoch, batch_size=batch, validation_split=0.2, callbacks=[tensorboard])
model.reset_states()
# Evaluation
print('Train evaluation')
print(model.evaluate([x_1_train, x_2_train], y_train))
print('Test evaluation')
print(model.evaluate([x_1_test, x_2_test], y_test))
# Predictions
predictions_train = model.predict([x_1_train, x_2_train])
predictions_test = model.predict([x_1_test, x_2_test])
predictions_train = np.reshape(predictions_train, (-1,1))
predictions_test = np.reshape(predictions_test, (-1,1))
# Reverse Scaling
predictions_train = scaler_out.inverse_transform(predictions_train)
predictions_test = scaler_out.inverse_transform(predictions_test)
# Plot results
plt.figure(figsize=(15, 6))
plt.plot(orig_data, color='blue', label='True GWL')
plt.plot(range(train_size), predictions_train, color='red', label='Predicted GWL (Training)')
plt.plot(range(train_size, test_size), predictions_test, color='green', label='Predicted GWL (Test)')
plt.title('GWL Prediction')
plt.xlabel('Day')
plt.ylabel('GWL')
plt.legend()
plt.show()
I am using a batch size of 30 timesteps, a lookback of 90 timesteps, with a total data size of around 7500 time steps.
Any help would be greatly appreciated :-) Thank you!
Probably my answer is not relevant two years later, but I had a similar issue when experimenting with LSTM encoder-decoder model. I solved my problem by scaling the input data in the range -1 .. 1 instead of 0 .. 1 as in your example.

How to inverse transform the predicted values in a multivariate time series LSTM Model

I'm setting up a multivariate time series LSTM model where I use the historical data of 9 variables as my input and 3 timesteps. Dimensions of my inputs are as follows:
X_train_reshape = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 9))
X_test_reshape = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 9))
print(X_train.shape,y_train3.shape, X_test.shape, y_test3.shape)
(1744, 3, 9) (1744, 1) (434, 3, 9) (434, 1)
I scaled my input to be between (0,1).
scaler = MinMaxScaler(feature_range=(0, 1))
scaler = scaler.fit(train)
train = scaler.transform(train)
test = scaler.transform(test)
It seems like my model is working and successfully predicting the target variables. However, I receive the following error when I try to inverse transform my target variables.
yhat_inv = scaler.inverse_transform(model.predict(X_train)).flatten()
"ValueError: non-broadcastable output operand with shape (1744,1) doesn't match the broadcast shape (1744,9)"
How can I inverse transform the predicted values?
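The following code predicts on the test set and inverts the scaling. It pads the single prediction column with the remaining feature columns so the array has the shape the scaler was fitted on:
from math import sqrt
from numpy import concatenate
from sklearn.metrics import mean_squared_error
yhat = model.predict(test_X)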
test_X = test_X.reshape((test_X.shape[0], test_X.shape[2]))
# invert scaling for forecast
inv_yhat = concatenate((yhat, test_X[:, 1:]), axis=1)
inv_yhat = scaler.inverse_transform(inv_yhat)
inv_yhat = inv_yhat[:,0]
# invert scaling for actual
test_y = test_y.reshape((len(test_y), 1))
inv_y = concatenate((test_y, test_X[:, 1:]), axis=1)
inv_y = scaler.inverse_transform(inv_y)
inv_y = inv_y[:,0]
# calculate RMSE
rmse = sqrt(mean_squared_error(inv_y, inv_yhat))
print('Test RMSE: %.3f' % rmse)
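The error in the question comes from the scaler having been fitted on 9 columns while the prediction has only 1. Below is a rough NumPy-only sketch of two ways around that. It re-implements min-max scaling by hand so it runs without scikit-learn, and it assumes the target sits in column 0 (target_col is a hypothetical name, and the data is random). With a real MinMaxScaler the same two ideas apply: either pad the predictions to 9 columns before calling inverse_transform, or fit a second scaler on the target column alone, as the scaler_out in the GRU question above does.

```python
import numpy as np

# Hypothetical unscaled training table: 9 variables, target assumed in column 0.
rng = np.random.default_rng(0)
train = rng.uniform(10.0, 200.0, size=(100, 9))
target_col = 0

# Hand-rolled equivalent of min-max scaling to (0, 1), fitted on all 9 columns.
col_min = train.min(axis=0)
col_max = train.max(axis=0)
train_scaled = (train - col_min) / (col_max - col_min)

# Stand-in for model.predict(...): scaled predictions with shape (n, 1).
yhat_scaled = train_scaled[:5, [target_col]]

# Option 1: pad to the fitted width, invert everything, keep the target column.
padded = np.zeros((yhat_scaled.shape[0], train.shape[1]))
padded[:, target_col] = yhat_scaled[:, 0]
inv_padded = (padded * (col_max - col_min) + col_min)[:, target_col]

# Option 2: invert using only the target column's own min and max.
target_range = col_max[target_col] - col_min[target_col]
inv_direct = yhat_scaled[:, 0] * target_range + col_min[target_col]

print(inv_padded)
print(inv_direct)
```

Both options give the same numbers. The second avoids building a padded array at all, which is why keeping a separate scaler for the target is often the more convenient design.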

TensorFlow Trained Model Predicts Always Zero

I have a simple TensorFlow model whose accuracy is 1, but when I try to predict on new inputs it always returns zero (0).
import numpy as np
import tensorflow as tf
sess = tf.InteractiveSession()
# generate data
np.random.seed(10)
#inputs = np.random.uniform(low=1.2, high=1.5, size=[5000, 150]).astype('float32')
inputs = np.random.randint(low=50, high=500, size=[5000, 150])
label = np.random.uniform(low=1.3, high=1.4, size=[5000, 1])
# reverse_label = 1 - label
reverse_label = np.random.uniform(
low=1.3, high=1.4, size=[5000, 1])
reverse_label1 = np.random.randint(
low=80, high=140, size=[5000, 1])
#labels = np.append(label, reverse_label, 1)
#labels = np.append(labels, reverse_label1, 1)
labels = reverse_label1
print(inputs)
print(labels)
# parameters
learn_rate = 0.001
epochs = 100
n_input = 150
n_hidden = 15
n_output = 1
# set weights/biases
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_output])
b0 = tf.Variable(tf.truncated_normal([n_hidden], stddev=0.2, seed=0))
b1 = tf.Variable(tf.truncated_normal([n_output], stddev=0.2, seed=0))
w0 = tf.Variable(tf.truncated_normal([n_input, n_hidden], stddev=0.2, seed=0))
w1 = tf.Variable(tf.truncated_normal([n_hidden, n_output], stddev=0.2, seed=0))
# step function
def returnPred(x, w0, w1, b0, b1):
    z1 = tf.add(tf.matmul(x, w0), b0)
    a2 = tf.nn.relu(z1)
    z2 = tf.add(tf.matmul(a2, w1), b1)
    h = tf.nn.relu(z2)
    return h # return the first response vector from the
y_ = returnPred(x, w0, w1, b0, b1) # predict operation
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=y_, labels=y)) # calculate loss between prediction and actual
model = tf.train.AdamOptimizer(learning_rate=learn_rate).minimize(
    loss) # apply gradient descent based on loss
init = tf.global_variables_initializer()
tf.Session = sess
sess.run(init) # initialize graph
for step in range(0, epochs):
    sess.run([model, loss], feed_dict={x: inputs, y: labels}) # train model
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: inputs, y: labels})) # print accuracy
inp = np.random.randint(low=50, high=500, size=[5, 150])
print(sess.run(tf.argmax(y_, 1), feed_dict={x: inp})) # predict some new inputs
All functions are working properly and my problem is with the last line of code. I also tried evaluating just "y_" instead of "tf.argmax(y_, 1)", but that did not work either.
How can I fix that?
Regards,
There are multiple mistakes in your code.
Starting with these lines of code:
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: inputs, y: labels})) # print accuracy
You are performing regression, but you are measuring accuracy the way you would for a classifier. If you want to see how your regression network is performing, print the loss and make sure it decreases after each epoch of training.
To see what is wrong with the accuracy code, run the following:
print(y_.get_shape()) # Outputs (?, 1)
There is only one output column, so both tf.argmax(y, 1) and tf.argmax(y_, 1) will always return [0, 0, ...]. As a result your accuracy will always be 1.0. Delete those three lines of code.
Next, to get the outputs, just run the following code:
print(sess.run(y_, feed_dict={x: inp}))
But since your data is random, don't expect a good set of outputs.
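The argmax point can be checked without TensorFlow. The sketch below uses NumPy and made-up numbers: with a single output column, argmax along axis 1 has only index 0 to choose from, so the "accuracy" is 1.0 no matter how bad the predictions are, while a regression metric such as mean squared error still shows the gap.

```python
import numpy as np

# Hypothetical targets and predictions, both with shape (n, 1) as in the question.
y_true = np.array([[87.0], [120.0], [101.0], [133.0]])
y_pred = np.array([[0.0], [3.2], [0.0], [15.7]])

# With one column, the index of the maximum along axis 1 is always 0.
true_idx = np.argmax(y_true, axis=1)
pred_idx = np.argmax(y_pred, axis=1)

# Comparing two all-zero index vectors always yields a perfect score.
accuracy = np.mean(true_idx == pred_idx)

# A regression metric reflects how far off the predictions really are.
mse = np.mean((y_true - y_pred) ** 2)

print(true_idx, pred_idx, accuracy, mse)
```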

How to output a prediction in Tensorflow?

I am trying to use a Tensorflow DNN for a Kaggle competition. The data is about 100 columns of categorical data, 29 columns of numerical data, and 1 column for the output. I split it into training and testing sets with X and y using scikit-learn's train/test split function, where X is a list of rows without the "id" or the value that needs to be predicted, and y is the value that needs to be predicted. I then built the model, shown below:
import tensorflow as tf
import numpy as np
import time
import pickle
with open('pickle.pickle', 'rb') as f:
    trainX, trainy, testX, testy = pickle.load(f)
trainX = np.array(trainX)
trainy = np.array(trainy)
trainy = trainy.reshape(trainy.shape[0], 1)
testX = np.array(testX)
testy = np.array(testy)
print (trainX.shape)
print (trainy.shape)
testX = testX.reshape(testX.shape[0], 130)
testy = testy.reshape(testy.shape[0], 1)
print (testX.shape)
print (testy.shape)
n_nodes_hl1 = 256
n_nodes_hl2 = 256
n_nodes_hl3 = 256
n_classes = 1
batch_size = 100
# Matrix = h X w
X = tf.placeholder('float', [None, len(trainX[0])])
y = tf.placeholder('float')
def model(data):
    hidden_1_layer = {'weights':tf.Variable(tf.random_normal([trainX.shape[1], n_nodes_hl1])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl1]))}
    hidden_2_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl2]))}
    hidden_3_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl2, n_nodes_hl3])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl3]))}
    output_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl3, n_classes])),
                    'biases':tf.Variable(tf.random_normal([n_classes]))}
    # (input_data * weights) + biases
    l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['biases'])
    l1 = tf.nn.sigmoid(l1)
    l2 = tf.add(tf.matmul(l1, hidden_2_layer['weights']), hidden_2_layer['biases'])
    l2 = tf.nn.sigmoid(l2)
    l3 = tf.add(tf.matmul(l2, hidden_3_layer['weights']), hidden_3_layer['biases'])
    l3 = tf.nn.sigmoid(l3)
    output = tf.matmul(l3, output_layer['weights']) + output_layer['biases']
    return output
def train(x):
    pred = model(x)
    #loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(pred, y))
    loss = tf.reduce_mean(tf.square(pred - y))
    optimizer = tf.train.AdamOptimizer(0.01).minimize(loss)
    epochs = 1
    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())
        print ('Beginning Training \n')
        for e in range(epochs):
            timeS = time.time()
            epoch_loss = 0
            i = 0
            while i < len(trainX):
                start = i
                end = i + batch_size
                batch_x = np.array(trainX[start:end])
                batch_y = np.array(trainy[start:end])
                _, c = sess.run([optimizer, loss], feed_dict = {x: batch_x, y: batch_y})
                epoch_loss += c
                i += batch_size
            done = time.time() - timeS
            print ('Epoch', e + 1, 'completed out of', epochs, 'loss:', epoch_loss, "\nTime:", done, 'seconds\n')
        correct = tf.equal(tf.arg_max(pred, 1), tf.arg_max(y, 1))
        acc = tf.reduce_mean(tf.cast(correct, 'float'))
        print("Accuracy:", acc.eval({x:testX, y:testy}))
train(X)
Output for 1 epoch:
Epoch 1 completed out of 1 loss: 1498498282.5
Time: 1.3765859603881836 seconds
Accuracy: 1.0
I do realize that the loss is very high, and I am using 1 epoch just for testing purposes, and yes, I know my code is quite messy. But all I want to do is print out a prediction. How would I do that? I know that I need to feed a list of features for X, but I just don't understand how to do it. I also don't quite understand why my accuracy is at 1.0, so if you have any suggestions for that, or any ways to change my code, I would be more than happy to listen to any ideas.
Thanks in advance
To get a prediction you just have to evaluate pred, which is the operation that defines the output of the model.
How to do it? With pred.eval(). But you need an input to evaluate its prediction, so you have to provide a feed_dict dictionary to eval() with the sample (or samples) you want to process.
The resulting code looks like:
predictions = pred.eval(feed_dict = {x:testX})
Notice how this is very similar to acc.eval({x:testX, y:testy}), because the idea is the same. You have an operation (acc in this case) which needs some input to be evaluated, and you can evaluate it either by calling acc.eval() or sess.run(acc) with the corresponding feed_dict with the necessary inputs.
The simplest way would be to use the existing session while training (between iterations) and run the output tensor pred, since model itself is a Python function rather than a tensor:
print (sess.run(pred, {x:X_example}))
where X_example is some numpy example array.
The line below gives you a score for every class; for example, if you have 3 classes, it returns an array of shape 1x3.
Assuming you want the prediction for a single data point X_test, you can do the following:
output = sess.run(pred, {x:X_test})
The position of the maximum value in output is your prediction, so we modify the statement above to take the argmax:
output = sess.run(tf.argmax(pred, 1), {x:X_test})
print("your prediction for X_test is :", output[0])
Another thing you can do is:
output = sess.run(pred, {x:X_test})
output = np.argmax(output)
print("your prediction for X_test is :", output)
