ValueError: Cannot feed value of shape (100, 1) for Tensor 'Placeholder_1:0', which has shape '(?, 10)'

I was learning TensorFlow from the TensorFlow documentation and was trying to implement MNIST, but I keep getting this error:
ValueError: Cannot feed value of shape (100, 1) for Tensor 'Placeholder_1:0', which has shape '(?, 10)'
Here's the code:
# placeholders for the data
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
# weights and biases
w = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
# softmax model
activation = tf.nn.softmax_cross_entropy_with_logits(logits = tf.matmul(x, w) + b, labels=y)
# backpropagation
train = tf.train.GradientDescentOptimizer(0.5).minimize(activation)
# creating tensorflow session
s = tf.InteractiveSession()
# i have already initialised the variables
# gradient descent
for i in range(100):
    x_bat, y_bat = create_batch(x_train, y_train, size=100)
    train_step = s.run(train, feed_dict={x: x_bat, y: y_bat})

The problem is with the create_batch function, which outputs y_bat with the wrong shape. Most probably, you forgot to one-hot encode the labels.
That is, the current y_bat is a [100, 1] array of integers 0..9, but it should be a [100, 10] array of 0s and 1s.
If you load the data with the input_data.read_data_sets function, simply pass one_hot=True.
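If the data does not come from input_data.read_data_sets, the labels can also be converted by hand. The following is a minimal NumPy sketch, assuming y_bat arrives as an integer array of shape (n, 1) or (n,); the helper name one_hot is made up for illustration and is not part of TensorFlow or of the asker's code.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Turn integer labels of shape (n,) or (n, 1) into an (n, num_classes) float array."""
    # Flatten so that a (n, 1) column and a (n,) vector are handled the same way.
    flat = np.asarray(labels, dtype=np.int64).reshape(-1)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[flat]

# Hypothetical mini-batch of three labels, shaped like the (100, 1) batch in the error.
y_bat = np.array([[3], [0], [9]])
encoded = one_hot(y_bat)
print(encoded.shape)
```

The encoded array has one row per example and one column per class, so it can be fed to the y placeholder of shape [None, 10].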

Related

CrossEntropyLoss on sequences

I need to compute the torch.nn.CrossEntropyLoss on sequences.
The output tensor y_est has shape: [batch_size, sequence_length, embedding_dim]. The values are embedded as one-hot vectors with embedding_dim dimensions (y_est is not binary however).
The target tensor y has shape: [batch_size, sequence_length] and contains the integer index of the correct class in the range [0, embedding_dim).
If I compute the loss on the two tensors with the shapes described above, I get the error shown at [1].
What I would like to do is described by the loop at [2]. For each sequence in the batch, I would like the sum of the losses computed on each element in the sequence.
After reading the documentation of torch.nn.CrossEntropyLoss I came up with the solution at [3], which seems to compute exactly what I want: the losses computed at [2] and [3] are equal.
However, since .permute(.) returns a view of the original tensor, I am afraid it might mess up the backward propagation on the loss. Somewhere (I do not remember where, sorry) I have read that views should not be used in computing the loss.
Is my solution correct?
import torch
batch_size = 5
seq_len = 10
emb_dim = 100
y_est = torch.randn( (batch_size, seq_len, emb_dim))
y = torch.randint(0, emb_dim, (batch_size, seq_len) )
print("y_est, batch x seq x emb:", y_est.shape)
print("y, batch x seq", y.shape)
loss_fn = torch.nn.CrossEntropyLoss(reduction="none")
# [1]
# loss = loss_fn(y_est, y)
# error:
# RuntimeError: Expected target size [5, 100], got [5, 10]
# [2]
loss = 0
for i in range(y_est.shape[1]):
    loss += loss_fn(y_est[:, i, :], y[:, i]).sum()
print(loss)
# [3]
y_est_2 = torch.permute( y_est, (0, 2, 1))
print("y_est_2", y_est_2.shape)
loss2 = loss_fn(y_est_2, y).sum()
print(loss2)
whose output is:
y_est, batch x seq x emb: torch.Size([5, 10, 100])
y, batch x seq torch.Size([5, 10])
tensor(253.9994)
y_est_2 torch.Size([5, 100, 10])
tensor(253.9994)
Is the solution correct (also for what concerns the backward pass)? Is there a better way?
If y_est holds probabilities and you really want to compute the loss of a categorical output at each timestep (element) of a sequence, then y and y_est have to have the same shape. To do so, the classes in y can be expanded to the same dimensions as y_est with one-hot encoding:
import torch
batch_size = 5
seq_len = 10
emb_dim = 100
y_est = torch.randn( (batch_size, seq_len, emb_dim))
y = torch.randint(0, emb_dim, (batch_size, seq_len) )
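y = torch.nn.functional.one_hot(y, num_classes=emb_dim).type(torch.float)
loss_fn = torch.nn.CrossEntropyLoss()
# CrossEntropyLoss expects the class dimension at index 1, so move emb_dim there in both tensors
loss = loss_fn(y_est.permute(0, 2, 1), y.permute(0, 2, 1))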
print(loss)
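On the layout question: torch.nn.CrossEntropyLoss expects the class dimension at index 1, which is why the permuted call at [3] matches the loop at [2]. permute is a regular differentiable operation, so using the resulting view in the loss does not disturb the backward pass. As an independent check of the arithmetic only, here is a NumPy sketch (no autograd; the helper log_softmax is written here for illustration) that compares the per-timestep loop with a single vectorized computation over all timesteps.

```python
import numpy as np

def log_softmax(logits, axis=-1):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

rng = np.random.default_rng(0)
batch_size, seq_len, emb_dim = 5, 10, 100
y_est = rng.standard_normal((batch_size, seq_len, emb_dim))   # logits
y = rng.integers(0, emb_dim, size=(batch_size, seq_len))      # class indices

# Loop version: cross-entropy per timestep, summed over batch and time.
loop_loss = 0.0
for t in range(seq_len):
    logp_t = log_softmax(y_est[:, t, :])
    loop_loss += -logp_t[np.arange(batch_size), y[:, t]].sum()

# Vectorized version: log-softmax over the class axis, then pick the target entries.
logp = log_softmax(y_est, axis=-1)
picked = np.take_along_axis(logp, y[..., None], axis=-1).squeeze(-1)
vectorized_loss = -picked.sum()

print(np.isclose(loop_loss, vectorized_loss))
```

Both versions sum the same negative log-probabilities, so they agree up to floating-point rounding.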

Numpy and tensorflow RNN shape representation mismatch

I'm building my first RNN in TensorFlow. After understanding all the concepts regarding the 3D input shape, I came across this issue.
In my numpy version (1.15.4), the shape representation of 3D arrays is the following: (panel, row, column). I will make each dimension different so that it is clearer:
In [1]: import numpy as np
In [2]: arr = np.arange(30).reshape((2,3,5))
In [3]: arr
Out[3]:
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]],

       [[15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29]]])
In [4]: arr.shape
Out[4]: (2, 3, 5)
In [5]: np.__version__
Out[5]: '1.15.4'
Here my understanding is: I have two timesteps with each timestep having 3 observations with 5 features in each observation.
However, in TensorFlow "theory" (which I believe is strongly based on numpy), RNN cells expect tensors (i.e. just n-dimensional arrays) of shape [batch_size, timesteps, features], which could be translated to (row, panel, column) in numpy "jargon".
As can be seen, the representations don't match, leading to errors when feeding numpy data into a placeholder, which in most of the examples and theory is defined like:
x = tf.placeholder(tf.float32, shape=[None, N_TIMESTEPS_X, N_FEATURES], name='XPlaceholder')
np.reshape() doesn't solve the issue because it just rearranges the dimensions but messes up the data.
I'm using the Dataset API for the first time, but I encounter the problems once inside the session, not in the Dataset API ops.
I'm using the static_rnn method, and everything works well until I have to feed the data into the placeholder, which obviously results in a shape error.
I have tried changing the placeholder shape to shape=[N_TIMESTEPS_X, None, N_FEATURES]. However, I'm using the Dataset API, and I get errors when making the initializer if I change the X placeholder to that shape.
So, to summarize:
First problem: Shape errors with different shape representations.
Second problem: Dataset error when equating the shape representations (I think that either static_rnn or dynamic_rnn would function if this is resolved).
My question is:
Is there anything I'm missing in regard to this different representation logic which makes the practice confusing?
Could the solution be attained by switching to dynamic_rnn? (Although the shape problems I encounter are related to the Dataset API initializer being fed with shape [N_TIMESTEPS_X, None, N_FEATURES], not to the RNN cell itself.)
Thank you very much for your time.
Full code:
'''The idea is to create xt, yt, xval and yval. My numpy arrays to
be fed are of the following shapes:
The 3D xt array has a shape of: (11, 69579, 74)
The 3D xval array has a shape of: (11, 7732, 74)
The yt array has a shape of: (69579, 3)
The yval array has a shape of: (7732, 3)
'''
N_TIMESTEPS_X = xt.shape[0] ## The stack number
BATCH_SIZE = 256
#N_OBSERVATIONS = xt.shape[1]
N_FEATURES = xt.shape[2]
N_OUTPUTS = yt.shape[1]
N_NEURONS_LSTM = 128 ## Number of units in the LSTMCell
N_NEURONS_DENSE = 64 ## Number of units in the Dense layer
N_EPOCHS = 600
LEARNING_RATE = 0.1
### Define the placeholders and gather the data.
train_data = (xt, yt)
validation_data = (xval, yval)
## We define the placeholders as a trick so that we do not break into memory problems, associated with feeding the data directly.
'''As an alternative, you can define the Dataset in terms of tf.placeholder() tensors, and feed the NumPy arrays when you initialize an Iterator over the dataset.'''
batch_size = tf.placeholder(tf.int64)
x = tf.placeholder(tf.float32, shape=[None, N_TIMESTEPS_X, N_FEATURES], name='XPlaceholder')
y = tf.placeholder(tf.float32, shape=[None, N_OUTPUTS], name='YPlaceholder')
# Creating the two different dataset objects.
train_dataset = tf.data.Dataset.from_tensor_slices((x,y)).batch(BATCH_SIZE).repeat()
val_dataset = tf.data.Dataset.from_tensor_slices((x,y)).batch(BATCH_SIZE)
# Creating the Iterator type that permits to switch between datasets.
itr = tf.data.Iterator.from_structure(train_dataset.output_types, train_dataset.output_shapes)
train_init_op = itr.make_initializer(train_dataset)
validation_init_op = itr.make_initializer(val_dataset)
next_features, next_labels = itr.get_next()
### Create the graph
cellType = tf.nn.rnn_cell.LSTMCell(num_units=N_NEURONS_LSTM, name='LSTMCell')
inputs = tf.unstack(next_features, N_TIMESTEPS_X, axis=0)
'''inputs: A length T list of inputs, each a Tensor of shape [batch_size, input_size]'''
RNNOutputs, _ = tf.nn.static_rnn(cell=cellType, inputs=inputs, dtype=tf.float32)
predictionsLayer = tf.layers.dense(inputs=tf.layers.batch_normalization(RNNOutputs[-1]), units=N_NEURONS_DENSE, activation=None, name='Dense_Layer')
### Define the cost function, that will be optimized by the optimizer.
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=predictionsLayer, labels=next_labels, name='Softmax_plus_Cross_Entropy'))
optimizer_type = tf.train.AdamOptimizer(learning_rate=LEARNING_RATE, name='AdamOptimizer')
optimizer = optimizer_type.minimize(cost)
### Model evaluation
correctPrediction = tf.equal(tf.argmax(predictionsLayer,1), tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correctPrediction,tf.float32))
#confusionMatrix = tf.confusion_matrix(next_labels, predictionsLayer, num_classes=3, name='ConfMatrix')
N_BATCHES = train_data[0].shape[0] // BATCH_SIZE
## Saving variables so that we can restore them afterwards.
saver = tf.train.Saver()
save_dir = '/home/zmlaptop/Desktop/tfModels/{}_{}'.format(cellType.__class__.__name__, datetime.now().strftime("%Y%m%d%H%M%S"))
os.mkdir(save_dir)
varDict = {'nTimeSteps':N_TIMESTEPS_X, 'BatchSize': BATCH_SIZE, 'nFeatures':N_FEATURES,
           'nNeuronsLSTM':N_NEURONS_LSTM, 'nNeuronsDense':N_NEURONS_DENSE, 'nEpochs':N_EPOCHS,
           'learningRate':LEARNING_RATE, 'optimizerType': optimizer_type.__class__.__name__}
varDicSavingTxt = save_dir + '/varDict.txt'
modelFilesDir = save_dir + '/modelFiles'
os.mkdir(modelFilesDir)
logDir = save_dir + '/TBoardLogs'
os.mkdir(logDir)
acc_summary = tf.summary.scalar('Accuracy', accuracy)
loss_summary = tf.summary.scalar('Cost_CrossEntropy', cost)
summary_merged = tf.summary.merge_all()
with open(varDicSavingTxt, 'w') as outfile:
    outfile.write(repr(varDict))
with tf.Session() as sess:
    tf.set_random_seed(2)
    sess.run(tf.global_variables_initializer())
    train_writer = tf.summary.FileWriter(logDir + '/train', sess.graph)
    validation_writer = tf.summary.FileWriter(logDir + '/validation')
    # initialise iterator with train data
    sess.run(train_init_op, feed_dict = {x : train_data[0], y: train_data[1], batch_size: BATCH_SIZE})
    print('¡Training starts!')
    for epoch in range(N_EPOCHS):
        batchAccList = []
        tot_loss = 0
        for batch in range(N_BATCHES):
            optimizer_output, loss_value, summary = sess.run([optimizer, cost, summary_merged])
            accBatch = sess.run(accuracy)
            tot_loss += loss_value
            batchAccList.append(accBatch)
            if batch % 10 == 0:
                train_writer.add_summary(summary, batch)
        epochAcc = tf.reduce_mean(batchAccList)
        if epoch%10 == 0:
            print("Epoch: {}, Loss: {:.4f}, Accuracy: {}".format(epoch, tot_loss / N_BATCHES, epochAcc))
    #confM = sess.run(confusionMatrix)
    #confDic = {'confMatrix': confM}
    #confTxt = save_dir + '/confMDict.txt'
    #with open(confTxt, 'w') as outfile:
    #    outfile.write(repr(confDic))
    #print(confM)
    # initialise iterator with validation data
    sess.run(validation_init_op, feed_dict = {x : validation_data[0], y: validation_data[1], batch_size:len(validation_data[0])})
    print('Validation Loss: {:4f}, Validation Accuracy: {}'.format(sess.run(cost), sess.run(accuracy)))
    summary_val = sess.run(summary_merged)
    validation_writer.add_summary(summary_val)
    saver.save(sess, modelFilesDir)
Is there anything I'm missing in regard to this different representation logic which makes the practice confusing?
In fact, you made a mistake about the input shapes of static_rnn and dynamic_rnn. The input of static_rnn is a list of 2D tensors, one per timestep, each of shape [batch_size, features], which corresponds to [timesteps, batch_size, features] overall. The input shape of dynamic_rnn is either [timesteps, batch_size, features] or [batch_size, timesteps, features], depending on whether time_major is True or False.
Could the solution be attained by switching to dynamic_rnn?
The key is not whether you use static_rnn or dynamic_rnn, but that your data shape matches the required shape. The usual placeholder format, as in your code, is [None, N_TIMESTEPS_X, N_FEATURES], which is also convenient for the Dataset API.
You can use transpose() instead of reshape(). transpose() permutes the dimensions of an array and won't mess up the data.
So your code needs to be modified as follows:
# permute the dimensions
xt = xt.transpose([1,0,2])
xval = xval.transpose([1,0,2])
# unstack along axis=1, which now represents timesteps
inputs = tf.unstack(next_features, axis=1)
The other errors should have nothing to do with the RNN shape.
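To see why transpose() and reshape() differ here, the following NumPy-only sketch uses a tiny time-major array (2 timesteps, 3 samples, 5 features) as a stand-in for xt. The sizes and variable names are illustrative assumptions, and the final list comprehension only mimics what tf.unstack(..., axis=1) would produce.

```python
import numpy as np

timesteps, batch, features = 2, 3, 5
# Time-major data, laid out like xt in the question: (timesteps, batch, features)
xt = np.arange(timesteps * batch * features).reshape(timesteps, batch, features)

transposed = xt.transpose(1, 0, 2)                   # (batch, timesteps, features)
reshaped = xt.reshape(batch, timesteps, features)    # same shape, different content

# transpose keeps each feature vector attached to its (timestep, sample) pair.
print(np.array_equal(transposed[1, 0], xt[0, 1]))    # True
# reshape only re-chunks the flat buffer, so samples and timesteps get mixed up.
print(np.array_equal(reshaped, transposed))          # False

# NumPy analogue of unstacking along axis=1: one (batch, features) array per timestep.
inputs = [transposed[:, t, :] for t in range(timesteps)]
print(len(inputs), inputs[0].shape)                  # 2 (3, 5)
```

Each element of inputs holds all samples for one timestep, which is the list-of-2D-tensors layout that static_rnn expects.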

Tensor Objects of different size

I am attempting to implement a CNN, but I have run into a minor issue.
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
# 0-9 digits recognition => 10 classes.
y = tf.placeholder(tf.float32, [None, 10])
...code for layers...
...etc....
# Output has a shape of [batch_size, 10]
logits = tf.layers.dense(inputs=dropout, units=10)
# Softmax layer for deriving probabilities.
pred = tf.nn.softmax(logits, name="softmax_tensor")
# Convert labels to a one-hot encoding.
onehot_labels = tf.one_hot(indices=tf.cast(y, tf.int32), depth=10)
loss = tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels, logits=logits)
As is visible, the loss function will not run properly because logits and onehot_labels have different ranks: logits has rank 2 whereas onehot_labels has rank 3, because it depends on the y placeholder, which is [batch_size, 10].
I am not sure how to fix this. I need to change the shape of either of these variables, but I am not sure which one. Does the CNN require y, which are the labels, to have batch_size as an argument? Where am I going wrong?
Some extra info: I intend to run the CNN within a session like so:
# Assign the contents of `batch_xs` to variable `x`.
_, c = sess.run([train_op, loss], feed_dict={x:sess.run(batch_xs), y:batch_ys})
If your label data are the actual class indices, then the code should be:
# y holds one integer class label per example; tf.one_hot then yields shape [batch_size, 10].
y = tf.placeholder(tf.float32, [None])
...
onehot_labels = tf.one_hot(indices=tf.cast(y, tf.int32), depth=10)
loss = tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels, logits=logits)
Otherwise, if your labels are already one-hot encoded, the code should be:
# y is already one-hot label data.
y = tf.placeholder(tf.float32, [None, 10])
...
loss = tf.losses.softmax_cross_entropy(onehot_labels=y, logits=logits)
Please refer to the MNIST tutorial for an example.
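The rank mismatch in the question can be reproduced without TensorFlow. The sketch below uses a small NumPy stand-in for one-hot encoding (the helper one_hot is hypothetical and only imitates the "append an axis of length depth" behaviour) to show that encoding integer class labels gives a [batch_size, 10] array, while encoding labels that are already one-hot adds one axis too many.

```python
import numpy as np

def one_hot(indices, depth):
    """NumPy stand-in for one-hot encoding: appends a new last axis of length `depth`."""
    return np.eye(depth, dtype=np.float32)[np.asarray(indices, dtype=np.int64)]

class_ids = np.array([3, 0, 9])               # integer labels, shape (3,)
labels = one_hot(class_ids, depth=10)          # shape (3, 10), same rank as the logits
double_encoded = one_hot(labels, depth=10)     # shape (3, 10, 10), one axis too many

print(labels.shape, double_encoded.shape)
```

This is why labels that are already one-hot should be passed to the loss directly rather than encoded a second time.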

EEG data classification in Tensorflow

I have EEG data files which I want to classify into 2 classes using a CNN in TensorFlow. My data is 3D (91, 2500, 39): 91 is the number of electrodes, 2500 the number of samples, and 39 the number of chunks. The last dimension (39) varies between files (38 to 41), so after reducing the dimensions to 2D I resize all the files to (91, 97500) and append them to an initially empty list with the following code:
for file in os.listdir(dataDir):
    data = scipy.io.loadmat(file)
    x = data['eegdata']
    x = x.reshape(91, -1)
    x = cv2.resize(x, (91, 97500))
    # x = x.reshape(8872500)
    # print(x.shape)
    X.append(x)
X = np.array(X)
I then created a tensor placeholder for my raw input data with dimension [None, 8872500]; the value 8872500 comes from 91*97500. I am supposed to classify the data into two classes:
x = tf.placeholder(tf.float32, shape=[None, 8872500])
y = tf.placeholder(tf.float32, shape=[None, 2])
keep_prob = tf.placeholder(tf.float32)
I reshaped my input before feeding it to my convolution layer:
x = tf.reshape(x, shape=[-1, 91, 97500, 1])
My batch size is 6. When I run my code I get an error:
ValueError: Cannot feed value of shape (6, 97500, 91) for Tensor 'Placeholder:0', which has shape '(?, 8872500)'
I tried to reshape my input to 8872500 (as can be seen in the commented part of the for loop), but when I do that my computer gets stuck. Which shapes should I use for my program (tensor and input)?
My session looks like this:
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    step = 1
    # Keep training until reach max iterations
    while step * batch_size < training_iters:
        offset = (step * batch_size) % (ytrain.shape[0] - batch_size)
        batch_x = xtrain[offset:(offset + batch_size)]
        batch_y = ytrain[offset:(offset + batch_size)]
        # Run optimization op (backprop)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y, keep_prob: dropout})
Thank you.
What is the shape of batch_x? The problem is with your placeholder.
Try this:
x = tf.placeholder(tf.float32, shape=[None, 91, 97500])
Also, it would be better to call the variable after reshaping x something like reshaped_x to prevent confusion.
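Whatever placeholder shape is chosen, the fed batch has to match it axis for axis. The error message reports batches of shape (6, 97500, 91), which is consistent with cv2.resize taking its size argument as (width, height) and therefore returning arrays of shape (97500, 91). The NumPy sketch below uses scaled-down, made-up dimensions (975 stands in for 97500) and zero-filled arrays to show two ways of bringing such a batch into a declared shape; it is an illustration of the shape bookkeeping, not the asker's pipeline.

```python
import numpy as np

n_files, n_electrodes, n_samples = 6, 91, 975   # 975 is a scaled-down stand-in for 97500

# Per-file arrays shaped like the ones in the error message: (samples, electrodes)
X = np.stack([np.zeros((n_samples, n_electrodes), dtype=np.float32)
              for _ in range(n_files)])
print(X.shape)                  # (6, 975, 91)

# Option 1: put electrodes first so the batch matches a [None, 91, 975] placeholder.
electrodes_first = X.transpose(0, 2, 1)
print(electrodes_first.shape)   # (6, 91, 975)

# Option 2: flatten each file so the batch matches a [None, 91 * 975] placeholder.
flat = electrodes_first.reshape(n_files, -1)
print(flat.shape)               # (6, 88725)
```

Transposing before flattening keeps the electrode axis first, so a later reshape back to [-1, 91, 975, 1] recovers the intended layout.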

TensorFlow variable configuration

I successfully implemented a feed-forward algorithm in TensorFlow that looked as follows...
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder(tf.float32, [None, 10]) # 0-9 digits recognition => 10 classes
# set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
# construct model
logits = tf.matmul(x, W) + b
pred = tf.nn.softmax(logits) # Softmax
# minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate).minimize(cost)
# initializing the variables
init = tf.global_variables_initializer()
...and the training cycle was as follows...
# launch the graph
with tf.Session() as sess:
    sess.run(init)
    # training cycle
    for epoch in range(FLAGS.training_epochs):
        avg_cost = 0
        total_batch = int(mnist.train.num_examples/FLAGS.batch_size)
        # loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(FLAGS.batch_size)
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs, y: batch_ys})
...the rest of the code is not necessary. Up to this point the code works perfectly. It is important to note that my batch_size is 100. The problem is that I am using tf.placeholder for my values, but in fact I need to change them to use tf.get_variable. The first thing I did was change the following...
# tf Graph Input
x = tf.get_variable("input_image", shape=[100,784], dtype=tf.float32)
y = tf.placeholder(shape=[100,10], name='input_label', dtype=tf.float32) # 0-9 digits recognition => 10 classes
# set model weights
W = tf.get_variable("weights", shape=[784, 10], dtype=tf.float32, initializer=tf.random_normal_initializer())
b = tf.get_variable("biases", shape=[1, 10], dtype=tf.float32, initializer=tf.zeros_initializer())
# construct model
logits = tf.matmul(x, W) + b
pred = tf.nn.softmax(logits) # Softmax
# minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate).minimize(cost)
# initializing the variables
init = tf.global_variables_initializer()
...so far so good. But now I am trying to implement the training cycle and this is where I run into issues. I run the exact same training cycle as above with batch_size = 100 and I get the following errors...
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 0 of node GradientDescent/update_input_image/ApplyGradientDescent was passed float from _recv_input_image_0:0 incompatible with expected float_ref.
How can I fix this issue? The error is coming from the following line...
_, c = sess.run([optimizer, cost], feed_dict={x: batch_xs, y: batch_ys})
It's unclear to me why you needed to change x to a tf.Variable when you are continuing to feed a value for it. There are two workarounds (not counting the case where you could just revert x to being tf.placeholder() as in the working code):
1. The error is being raised because the optimizer is attempting to apply an SGD update to the value that you're feeding (which leads to a confusing runtime type error). You could prevent optimizer from doing this by passing trainable=False when constructing x:
x = tf.get_variable("input_image", shape=[100, 784], dtype=tf.float32,
                    trainable=False)
2. Since x is a variable, you could assign the image to the variable in a separate step before running the optimizer.
x = tf.get_variable("input_image", shape=[100, 784], dtype=tf.float32)
x_placeholder = tf.placeholder(tf.float32, shape=[100, 784])
assign_x_op = x.assign(x_placeholder).op
# ...
for i in range(total_batch):
    batch_xs, batch_ys = mnist.train.next_batch(FLAGS.batch_size)
    # Assign the contents of `batch_xs` to variable `x`.
    sess.run(assign_x_op, feed_dict={x_placeholder: batch_xs})
    # N.B. Now you do not need to feed `x`.
    _, c = sess.run([optimizer, cost], feed_dict={y: batch_ys})
This latter version would allow you to perform gradient descent on the contents of the image (which might be why you'd want to store it in a variable in the first place).
