Tensor Objects of different size - python-3.x

I am attempting to implement a CNN, but I have run into a minor issue.
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
# 0-9 digits recognition => 10 classes.
y = tf.placeholder(tf.float32, [None, 10])
...code for layers...
...etc....
# Output has a shape of [batch_size, 10]
logits = tf.layers.dense(inputs=dropout, units=10)
# Softmax layer for deriving probabilities.
pred = tf.nn.softmax(logits, name="softmax_tensor")
# Convert labels to a one-hot encoding.
onehot_labels = tf.one_hot(indices=tf.cast(y, tf.int32), depth=10)
loss = tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels, logits=logits)
As is visible, the loss function will not run properly because logits and onehot_labels have different shapes. logits is reported as shape=(2,) (rank 2), whereas onehot_labels is reported as shape=(3,) (rank 3). This is because onehot_labels is built from the y placeholder, which already has shape [batch_size, 10].
I am not sure how to fix this. I need to change the shape of either of these variables, but I am not sure which one. Does the CNN require y, which are the labels, to have batch_size as an argument? Where am I going wrong?
Some extra info: I intend to run the CNN within a session like so:
# Assign the contents of `batch_xs` to variable `x`.
_, c = sess.run([train_op, loss], feed_dict={x:sess.run(batch_xs), y:batch_ys})

If your label data are the actual class indices, then the code should be:
# y holds one integer class index per example, so one_hot yields [batch_size, 10].
y = tf.placeholder(tf.int32, [None])
...
onehot_labels = tf.one_hot(indices=tf.cast(y, tf.int32), depth=10)
loss = tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels, logits=logits)
Otherwise, if your labels are already one-hot encoded, the code should be:
# y is already one-hot label data.
y = tf.placeholder(tf.float32, [None, 10])
...
loss = tf.losses.softmax_cross_entropy(onehot_labels=y, logits=logits)
Please refer to the MNIST tutorial for an example.
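The shape mismatch from the question can be reproduced without TensorFlow. The following is a minimal NumPy sketch, not the asker's code: `one_hot` is a hypothetical helper that mimics how one-hot encoding appends a new axis of size `depth`, and the batch of four labels is made up for illustration.

```python
import numpy as np

def one_hot(indices, depth):
    """Hypothetical stand-in for one-hot encoding: appends an axis of size `depth`."""
    indices = np.asarray(indices).astype(np.int64)
    return np.eye(depth, dtype=np.float32)[indices]

num_classes = 10
class_ids = np.array([3, 0, 9, 1])              # integer labels, shape (4,)

labels = one_hot(class_ids, num_classes)        # shape (4, 10): same shape as the logits
double_encoded = one_hot(labels, num_classes)   # shape (4, 10, 10): rank 3, as in the question

print(labels.shape, double_encoded.shape)
```

Encoding labels that are already one-hot adds a third axis, which is why the loss sees a rank-3 tensor next to rank-2 logits.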

Related

CrossEntropyLoss on sequences

I need to compute the torch.nn.CrossEntropyLoss on sequences.
The output tensor y_est has shape: [batch_size, sequence_length, embedding_dim]. The values are embedded as one-hot vectors with embedding_dim dimensions (y_est is not binary however).
The target tensor y has shape: [batch_size, sequence_length] and contains the integer index of the correct class in the range [0, embedding_dim).
If I compute the loss on the two inputs, with the shapes described above, I get the error shown at [1].
What I would like to do is described by the loop at [2]. For each sequence in the batch, I would like the sum of the losses computed on each element in the sequence.
After reading the documentation of torch.nn.CrossEntropyLoss I came up with the solution [3], which seems to compute exactly what I want: the losses computed at points [2] and [3] are equal.
However, since .permute(.) returns a view of the original tensor, I am afraid it might mess up the backward propagation on the loss. Somewhere (I do not remember where, sorry) I have read that views should not be used in computing the loss.
Is my solution correct?
import torch
batch_size = 5
seq_len = 10
emb_dim = 100
y_est = torch.randn( (batch_size, seq_len, emb_dim))
y = torch.randint(0, emb_dim, (batch_size, seq_len) )
print("y_est, batch x seq x emb:", y_est.shape)
print("y, batch x seq", y.shape)
loss_fn = torch.nn.CrossEntropyLoss(reduction="none")
# [1]
# loss = loss_fn(y_est, y)
# error:
# RuntimeError: Expected target size [5, 100], got [5, 10]
# [2]
loss = 0
for i in range(y_est.shape[1]):
    loss += loss_fn(y_est[:, i, :], y[:, i]).sum()
print(loss)
# [3]
y_est_2 = torch.permute( y_est, (0, 2, 1))
print("y_est_2", y_est_2.shape)
loss2 = loss_fn(y_est_2, y).sum()
print(loss2)
whose output is:
y_est, batch x seq x emb: torch.Size([5, 10, 100])
y, batch x seq torch.Size([5, 10])
tensor(253.9994)
y_est_2 torch.Size([5, 100, 10])
tensor(253.9994)
Is the solution correct (also for what concerns the backward pass)? Is there a better way?
If y_est are probabilities and you really want to compute the error/loss of a categorical output at each timestep/element of a sequence, then y and y_est have to have the same shape. To do so, the categories/classes of y can be expanded to the same dimensions as y_est with one-hot encoding:
import torch
batch_size = 5
seq_len = 10
emb_dim = 100
y_est = torch.randn( (batch_size, seq_len, emb_dim))
y = torch.randint(0, emb_dim, (batch_size, seq_len) )
y = torch.nn.functional.one_hot(y, num_classes=emb_dim).type(torch.float)
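loss_fn = torch.nn.CrossEntropyLoss()
# CrossEntropyLoss expects the class dimension at dim 1, so move it there for both tensors.
loss = loss_fn(y_est.permute(0, 2, 1), y.permute(0, 2, 1))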
print(loss)
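As a cross-check on the question's approach, the per-timestep loop can be compared with a single call on flattened tensors. The sketch below uses NumPy only, so it says nothing about autograd by itself; `cross_entropy` is a hypothetical helper written for this illustration, not a PyTorch function. In PyTorch the analogous flattening would be `y_est.reshape(-1, emb_dim)` and `y.reshape(-1)`. Both `permute` and `reshape` are ordinary differentiable operations, so gradients flow through them in the backward pass.

```python
import numpy as np

def cross_entropy(logits, targets):
    """Per-example cross-entropy for logits [n, classes] and integer targets [n]."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets]

rng = np.random.default_rng(0)
batch_size, seq_len, emb_dim = 5, 10, 100
y_est = rng.standard_normal((batch_size, seq_len, emb_dim))
y = rng.integers(0, emb_dim, size=(batch_size, seq_len))

# Loop over timesteps, as in [2] of the question.
loop_loss = sum(cross_entropy(y_est[:, i, :], y[:, i]).sum() for i in range(seq_len))

# One call on flattened tensors: every (sequence, timestep) pair becomes one row.
flat = cross_entropy(y_est.reshape(-1, emb_dim), y.reshape(-1))
flat_loss = flat.sum()

# Sum of the losses within each sequence, one value per batch element.
per_sequence = flat.reshape(batch_size, seq_len).sum(axis=1)

print(loop_loss, flat_loss, per_sequence.shape)
```

Reshaping the flat losses back to [batch_size, seq_len] gives the per-sequence sums that the question asks for.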

Recurrent neural network architecture

I'm working on an RNN architecture which does speech enhancement. The input has shape [XX, X, 1024], where XX is the batch size and X is the variable sequence length.
The input to the network is positive-valued data and the output is a binary mask (IBM), which is later used to construct the enhanced signal.
For instance, if the input to the network is [10, 65, 1024], the output will be a [10, 65, 1024] tensor with binary values. I'm using TensorFlow with mean squared error as the loss function, but I'm not sure which activation function to use here (one that keeps the outputs either zero or one). Following is the code I've come up with so far:
tf.reset_default_graph()
num_units = 10 #
num_layers = 3 #
dropout = tf.placeholder(tf.float32)
cells = []
for _ in range(num_layers):
    cell = tf.contrib.rnn.LSTMCell(num_units)
    cell = tf.contrib.rnn.DropoutWrapper(cell, output_keep_prob = dropout)
    cells.append(cell)
cell = tf.contrib.rnn.MultiRNNCell(cells)
X = tf.placeholder(tf.float32, [None, None, 1024])
Y = tf.placeholder(tf.float32, [None, None, 1024])
output, state = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)
out_size = Y.get_shape()[2].value
logit = tf.contrib.layers.fully_connected(output, out_size)
prediction = (logit)
flat_Y = tf.reshape(Y, [-1] + Y.shape.as_list()[2:])
flat_logit = tf.reshape(logit, [-1] + logit.shape.as_list()[2:])
loss_op = tf.losses.mean_squared_error(labels=flat_Y, predictions=flat_logit)
# Adam optimizer as the optimization function
optimizer = tf.train.AdamOptimizer(learning_rate=0.001) #
train_op = optimizer.minimize(loss_op)
#extract the correct predictions and compute the accuracy
correct_pred = tf.equal(tf.argmax(prediction, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
Also, my reconstruction isn't good. Can someone suggest how to improve the model?
If you want your outputs to be either 0 or 1, to me it seems a good idea to turn this into a classification problem. To this end, I would use a sigmoidal activation and cross entropy:
...
prediction = tf.nn.sigmoid(logit)
loss_op = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=Y, logits=logit))
...
In addition, from my point of view the hidden dimensionality (10) of your stacked RNNs seems quite small for such a big input dimensionality (1024). However this is just a guess, and it is something that needs to be tuned.
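To make the suggested loss concrete, here is a NumPy sketch of element-wise sigmoid cross-entropy computed from logits, using the numerically stable form max(x, 0) - x * z + log(1 + exp(-|x|)). The helper names and the tensor sizes are made up for this illustration. Thresholding the sigmoid output at 0.5 is one possible way to obtain a hard 0/1 mask at inference time; it is an assumption, not something the answer specifies.

```python
import numpy as np

def stable_sigmoid_xent(labels, logits):
    """Element-wise sigmoid cross-entropy: max(x, 0) - x * z + log(1 + exp(-|x|))."""
    return np.maximum(logits, 0) - logits * labels + np.log1p(np.exp(-np.abs(logits)))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
logits = rng.standard_normal((2, 5, 8))                      # [batch, time, features], made-up sizes
labels = (rng.random((2, 5, 8)) > 0.5).astype(np.float64)    # binary mask targets

loss = stable_sigmoid_xent(labels, logits).mean()            # scalar training loss
binary_mask = (sigmoid(logits) > 0.5).astype(np.float64)     # hard 0/1 mask (assumed threshold)

print(loss, binary_mask.shape)
```

Training on the logits while thresholding only at inference keeps the loss differentiable, which a hard 0/1 activation inside the network would not be.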

ValueError: Cannot feed value of shape (100, 1) for Tensor 'Placeholder_1:0', which has shape '(?, 10)'

I was learning TensorFlow from the TensorFlow documentation and was trying to implement MNIST, but I keep getting this error:
ValueError: Cannot feed value of shape (100, 1) for Tensor 'Placeholder_1:0', which has shape '(?, 10)'
Here's the code:
# placeholders for the data
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
# weights and biases
w = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
# softmax model
activation = tf.nn.softmax_cross_entropy_with_logits(logits = tf.matmul(x, w) + b, labels=y)
# backpropagation
train = tf.train.GradientDescentOptimizer(0.5).minimize(activation)
# creating tensorflow session
s = tf.InteractiveSession()
# i have already initialised the variables
# gradient descent
for i in range(100):
    x_bat, y_bat = create_batch(x_train, y_train, size=100)
    train_step = s.run(train, feed_dict={x: x_bat, y: y_bat})
The problem is with the create_batch function, which outputs a y_bat of the wrong shape. Most probably, you forgot to do one-hot encoding.
That is, the current y_bat is a [100, 1] array of integers 0..9, but it should be a [100, 10] array of 0s and 1s.
If you get the data with the input_data.read_data_sets function, then simply add one_hot=True.
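If the data does not come from `input_data.read_data_sets`, the conversion can be done by hand. The following is a sketch under assumptions: `to_one_hot` is a hypothetical helper, and the random `y_bat` merely stands in for whatever `create_batch` returns.

```python
import numpy as np

def to_one_hot(labels, num_classes=10):
    """Hypothetical helper: integer labels of shape [n] or [n, 1] -> one-hot [n, num_classes]."""
    flat = np.asarray(labels, dtype=np.int64).reshape(-1)
    encoded = np.zeros((flat.size, num_classes), dtype=np.float32)
    encoded[np.arange(flat.size), flat] = 1.0   # set one entry per row
    return encoded

y_bat = np.random.randint(0, 10, size=(100, 1))   # stand-in for the labels from create_batch
y_bat_one_hot = to_one_hot(y_bat)

print(y_bat.shape, "->", y_bat_one_hot.shape)
```

The resulting array has shape (100, 10), which matches the `[None, 10]` placeholder.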

EEG data classification in Tensorflow

I have EEG data files which I want to classify into 2 classes using a CNN in TensorFlow. My data is 3D with shape (91, 2500, 39): 91 is the number of electrodes, 2500 the number of samples, and 39 the number of chunks. The last dimension varies between files (38 to 41), so after reducing the data to 2D I resize every file to (91, 97500) and append it to an initially empty list with the following code:
for file in os.listdir(dataDir):
    data = scipy.io.loadmat(file)
    x = data['eegdata']
    x = x.reshape(91, -1)
    x = cv2.resize(x, (91, 97500))
    # x=x.reshape(8872500)
    # print(x.shape)
    X.append(x)
X = np.array(X)
I then created a placeholder for my raw input data with shape [None, 8872500], where 8872500 comes from 91*97500. I am supposed to classify the data into two classes:
x = tf.placeholder(tf.float32, shape=[None, 8872500])
y = tf.placeholder(tf.float32, shape=[None, 2])
keep_prob = tf.placeholder(tf.float32)
I reshaped my input before feeding it to my convolution layer:
x = tf.reshape(x, shape=[-1, 91, 97500, 1])
My batch size is 6. When I run my code I get an error:
ValueError: Cannot feed value of shape (6, 97500, 91) for Tensor 'Placeholder:0', which has shape '(?, 8872500)'
I tried to reshape my input to 8872500 (as can be seen in the commented-out part of the for loop), but when I do that my computer gets stuck. Which shapes should I use in my program (for the tensor and the input)?
My Session looks like this
init=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    step = 1
    # Keep training until reach max iterations
    while step * batch_size < training_iters:
        offset = (step * batch_size) % (ytrain.shape[0] - batch_size)
        batch_x = xtrain[offset:(offset + batch_size)]
        batch_y = ytrain[offset:(offset + batch_size)]
        # Run optimization op (backprop)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y, keep_prob: dropout})
Thank you.
What is the shape of batch_x? The problem is with your placeholder.
Try this:
x = tf.placeholder(tf.float32, shape=[None, 91, 97500])
Also, it would be better to call the variable after reshaping x something like reshaped_x to prevent confusion.
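Whatever placeholder shape is chosen, the array that is fed must have exactly that shape per example. Note also that cv2.resize takes its target size as (width, height), so passing (91, 97500) produces arrays of shape (97500, 91), which matches the (6, 97500, 91) in the error message. The sketch below is an illustration under assumptions, not the asker's pipeline: it swaps interpolation for plain cropping or zero-padding, `fix_length` is a hypothetical helper, and the sizes are scaled down so it runs quickly.

```python
import numpy as np

def fix_length(recording, target_len):
    """Crop or zero-pad the last axis of a [electrodes, samples] array to target_len."""
    n_electrodes, n_samples = recording.shape
    if n_samples >= target_len:
        return recording[:, :target_len]
    padded = np.zeros((n_electrodes, target_len), dtype=recording.dtype)
    padded[:, :n_samples] = recording
    return padded

# Scaled-down, made-up sizes: 25 samples per chunk instead of 2500.
n_electrodes, chunk_len, target_len = 91, 25, 39 * 25
recordings = [np.random.rand(n_electrodes, chunk_len, n) for n in (38, 39, 41)]

# Each file becomes [electrodes, target_len]; stacking gives [files, electrodes, target_len].
batch = np.stack([fix_length(r.reshape(n_electrodes, -1), target_len) for r in recordings])

# Flatten per example if the placeholder is declared as [None, 91 * target_len].
flat_batch = batch.reshape(len(batch), -1)

print(batch.shape, flat_batch.shape)
```

Either `batch` (for a `[None, 91, target_len]` placeholder) or `flat_batch` (for a flat placeholder) can be fed, as long as the declared shape and the fed shape agree.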

TensorFlow variable configuration

I successfully implemented a feed-forward algorithm in TensorFlow that looked as follows...
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder(tf.float32, [None, 10]) # 0-9 digits recognition => 10 classes
# set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
# construct model
logits = tf.matmul(x, W) + b
pred = tf.nn.softmax(logits) # Softmax
# minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate).minimize(cost)
# initializing the variables
init = tf.global_variables_initializer()
...and the training cycle was as follows...
# launch the graph
with tf.Session() as sess:
    sess.run(init)
    # training cycle
    for epoch in range(FLAGS.training_epochs):
        avg_cost = 0
        total_batch = int(mnist.train.num_examples/FLAGS.batch_size)
        # loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(FLAGS.batch_size)
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs, y: batch_ys})
...the rest of the code is not necessary. Up until this point the code works perfectly. It is important to note that my batch_size is 100. The problem is that I am using tf.placeholder for my values, but I actually need to change them to use tf.get_variable. The first thing I did was change the following...
# tf Graph Input
x = tf.get_variable("input_image", shape=[100,784], dtype=tf.float32)
y = tf.placeholder(shape=[100,10], name='input_label', dtype=tf.float32) # 0-9 digits recognition => 10 classes
# set model weights
W = tf.get_variable("weights", shape=[784, 10], dtype=tf.float32, initializer=tf.random_normal_initializer())
b = tf.get_variable("biases", shape=[1, 10], dtype=tf.float32, initializer=tf.zeros_initializer())
# construct model
logits = tf.matmul(x, W) + b
pred = tf.nn.softmax(logits) # Softmax
# minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate).minimize(cost)
# initializing the variables
init = tf.global_variables_initializer()
...so far so good. But now I am trying to implement the training cycle and this is where I run into issues. I run the exact same training cycle as above with batch_size = 100 and I get the following error...
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 0 of node GradientDescent/update_input_image/ApplyGradientDescent was passed float from _recv_input_image_0:0 incompatible with expected float_ref.
How can I fix this issue? The error is coming from the following line...
_, c = sess.run([optimizer, cost], feed_dict={x: batch_xs, y: batch_ys})
It's unclear to me why you needed to change x to a tf.Variable when you are continuing to feed a value for it. There are two workarounds (not counting the case where you could just revert x to being tf.placeholder() as in the working code):
1. The error is being raised because the optimizer is attempting to apply an SGD update to the value that you're feeding (which leads to a confusing runtime type error). You could prevent optimizer from doing this by passing trainable=False when constructing x:
x = tf.get_variable("input_image", shape=[100, 784], dtype=tf.float32,
                    trainable=False)
2. Since x is a variable, you could assign the image to the variable in a separate step before running the optimizer:
x = tf.get_variable("input_image", shape=[100, 784], dtype=tf.float32)
x_placeholder = tf.placeholder(tf.float32, shape=[100, 784])
assign_x_op = x.assign(x_placeholder).op
# ...
for i in range(total_batch):
    batch_xs, batch_ys = mnist.train.next_batch(FLAGS.batch_size)
    # Assign the contents of `batch_xs` to variable `x`.
    sess.run(assign_x_op, feed_dict={x_placeholder: batch_xs})
    # N.B. Now you do not need to feed `x`.
    _, c = sess.run([optimizer, cost], feed_dict={y: batch_ys})
This latter version would allow you to perform gradient descent on the contents of the image (which might be why you'd want to store it in a variable in the first place).
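What "gradient descent on the contents of the image" means can be sketched in NumPy for the same softmax model. This is an illustration under assumptions, not TensorFlow code: the sizes are made up, W and b are held fixed so that only x changes (in the real graph the optimizer would also update the other trainable variables), and the gradient of the mean cross-entropy with respect to x is written out by hand as (softmax(xW + b) - y) W^T / n.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cost(x, W, b, y):
    """Mean cross-entropy, mirroring the `cost` expression in the question."""
    return -np.mean(np.sum(y * np.log(softmax(x @ W + b)), axis=1))

rng = np.random.default_rng(0)
n, d, k = 4, 8, 3                        # made-up sizes instead of 100, 784, 10
x = rng.standard_normal((n, d))          # plays the role of the `input_image` variable
W = 0.1 * rng.standard_normal((d, k))    # fixed weights
b = np.zeros((1, k))                     # fixed biases
y = np.eye(k)[rng.integers(0, k, size=n)]

cost_before = cost(x, W, b, y)
for _ in range(50):
    grad_x = (softmax(x @ W + b) - y) @ W.T / n   # d(cost)/dx with W and b fixed
    x = x - 0.5 * grad_x                          # plain gradient descent step on the input
cost_after = cost(x, W, b, y)

print(cost_before, cost_after)
```

The input itself is moved toward values that the fixed model classifies with lower loss, which is the behaviour you get when x is a trainable variable.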

Resources