Pre-trained embedding layer: tf.constant with unsupported shape - keras

I am going to use pre-trained word embeddings in a Keras model. My weight matrix is stored in matrix.w2v.wv.vectors.npy and has shape (150854, 100).
Now when I add the embedding layer in the Keras model with different parameters as follows:
model.add(Embedding(5000, 100,
embeddings_initializer=keras.initializers.Constant(emb_matrix),
input_length=875, trainable=False))
I get the following error:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-61-8731e904e60a> in <module>()
      1 model = Sequential()
      2
----> 3 model.add(Embedding(5000, 100,
                  embeddings_initializer=keras.initializers.Constant(emb_matrix),
                  input_length=875, trainable=False))
      4 model.add(Conv1D(128, 10, padding='same', activation='relu'))
      5 model.add(MaxPooling1D(10))

22 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/constant_op.py in _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
    323     raise TypeError("Eager execution of tf.constant with unsupported shape "
    324                     "(value has %d elements, shape is %s with %d elements)." %
--> 325                     (num_t, shape, shape.num_elements()))
    326
    327

TypeError: Eager execution of tf.constant with unsupported shape (value has 15085400 elements, shape is (5000, 100) with 500000 elements).
Kindly tell me where I am making a mistake.

Your embedding layer expects a vocabulary of 5,000 words and initializes an embedding matrix of shape 5000 × 100. However, the word2vec model that you are trying to load has a vocabulary of 150,854 words.
You either need to increase the capacity of the embedding layer to match, or truncate the embedding matrix so that it covers only the most frequent words.
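A minimal sketch of both options, assuming the matrix is loaded from the .npy file named in the question and that gensim stored the vectors in descending frequency order (its default behaviour), so slicing the first rows keeps the most frequent words:
import numpy as np
from tensorflow import keras
from tensorflow.keras.layers import Embedding

emb_matrix = np.load("matrix.w2v.wv.vectors.npy")   # shape (150854, 100)

# Option 1: size the layer to the full word2vec vocabulary.
emb_full = Embedding(emb_matrix.shape[0], emb_matrix.shape[1],
                     embeddings_initializer=keras.initializers.Constant(emb_matrix),
                     input_length=875, trainable=False)

# Option 2: keep only the 5,000 most frequent words, so the initializer
# matches the (5000, 100) weights the layer allocates.
emb_small = Embedding(5000, 100,
                      embeddings_initializer=keras.initializers.Constant(emb_matrix[:5000]),
                      input_length=875, trainable=False)
If you truncate, make sure the tokenized inputs are also restricted to indices below 5,000, otherwise out-of-range ids will be looked up.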

Related

How to solve "logits and labels must have the same first dimension" error

I'm trying out different neural network architectures for a word-based NLP task.
So far I've used bidirectional, embedding-based, and GRU models, guided by this tutorial: https://towardsdatascience.com/language-translation-with-rnns-d84d43b40571, and it all worked out well.
When I try using LSTMs, however, I get an error saying:
logits and labels must have the same first dimension, got logits shape [32,186] and labels shape [4704]
How can I solve this?
My source and target datasets consist of 7200 sample sentences. They are integer-tokenized and embedded. The source dataset is post-padded to match the length of the target dataset.
Here is my model and the relevant code:
lstm_model = Sequential()
lstm_model.add(Embedding(src_vocab_size, 128, input_length=X.shape[1], input_shape=X.shape[1:]))
lstm_model.add(LSTM(128, return_sequences=False, dropout=0.1, recurrent_dropout=0.1))
lstm_model.add(Dense(128, activation='relu'))
lstm_model.add(Dropout(0.5))
lstm_model.add((Dense(target_vocab_size, activation='softmax')))
lstm_model.compile(optimizer=Adam(0.002), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
history = lstm_model.fit(X, Y, batch_size = 32, callbacks=CALLBACK, epochs = 100, validation_split = 0.25) #At this line the error is raised!
With the shapes:
X.shape = (7200, 147)
Y.shape = (7200, 147, 1)
src_vocab_size = 188
target_vocab_size = 186
I've looked at similar questions on here already and tried adding a Reshape layer
simple_lstm_model.add(Reshape((-1,)))
but this only causes the following error:
"TypeError: __int__ returned non-int (type NoneType)"
It's really strange, as I preprocess the dataset the same way for all models and it works just fine for everything except the model above.
You should pass return_sequences=True (and leave return_state=False) when calling the LSTM constructor.
In your snippet, the LSTM returns only its last state instead of the sequence of outputs for every input embedding. In theory, you could have spotted this from the error message:
logits and labels must have the same first dimension, got logits shape [32,186] and labels shape [4704]
The logits should be three-dimensional: batch size × sequence length × number of classes. The sequence length is 147, and indeed 32 × 147 = 4704, the number of your labels. This tells you that the sequence-length dimension has disappeared from the logits.
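A minimal sketch of the corrected model, reusing the names and hyperparameters from the question (X, Y, src_vocab_size, target_vocab_size, CALLBACK):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam

lstm_model = Sequential()
lstm_model.add(Embedding(src_vocab_size, 128, input_length=X.shape[1]))
# return_sequences=True keeps one output per time step, so the logits
# get shape (batch, 147, target_vocab_size) instead of (batch, target_vocab_size).
lstm_model.add(LSTM(128, return_sequences=True, dropout=0.1, recurrent_dropout=0.1))
lstm_model.add(Dense(128, activation='relu'))
lstm_model.add(Dropout(0.5))
lstm_model.add(Dense(target_vocab_size, activation='softmax'))
lstm_model.compile(optimizer=Adam(0.002),
                   loss='sparse_categorical_crossentropy',
                   metrics=['accuracy'])
history = lstm_model.fit(X, Y, batch_size=32, callbacks=CALLBACK,
                         epochs=100, validation_split=0.25)
With sparse_categorical_crossentropy, the target of shape (7200, 147, 1) then lines up with the per-time-step softmax outputs.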

How can we give a categorical variable as input to an embedding layer in Keras and train that embedding layer?

Let's say we have a data frame with a categorical column that has 7 categories: Monday, Tuesday, Wednesday, Thursday, Friday, Saturday and Sunday. Say we have 100 data points and we want to feed the categorical data to the embedding layer and train that layer using Keras. How do we actually achieve this? Can you share some intuition with code examples?
I have tried the code below, but it gives me an error: "ValueError: "input_length" is 1, but received input has shape (None, 26)". I have referred to this blog post https://medium.com/#satnalikamayank12/on-learning-embeddings-for-categorical-data-using-keras-165ff2773fc9, but I couldn't work out how to apply it to my particular case.
from sklearn.preprocessing import LabelEncoder
l_encoder=LabelEncoder()
l_encoder.fit(X_train["Weekdays"])
encoded_weekdays_train=l_encoder.transform(X_train["Weekdays"])
encoded_weekdays_test=l_encoder.transform(X_test["Weekdays"])
no_of_unique_cat=len(X_train.school_state.unique())
embedding_size = min(np.ceil((no_of_unique_cat)/2),50)
embedding_size = int(embedding_size)
vocab = no_of_unique_cat+1
#Get the flattened LSTM output for categorical text
input_layer2 = Input(shape=(embedding_size,))
embedding = Embedding(input_dim=vocab, output_dim=embedding_size, input_length=1, trainable=True)(input_layer2)
flatten_school_state = Flatten()(embedding)
In the case of 7 categories, what should the shape of input_layer2 be? What should the vocab size, output dim and input_length be? Can anyone explain, or correct my code? Your insights would be really helpful.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-46-e28d41acae85> in <module>
1 #Get the flattened LSTM output for input text
2 input_layer2 = Input(shape=(embedding_size,))
----> 3 embedding = Embedding(input_dim=vocab, output_dim=embedding_size, input_length=1, trainable=True)(input_layer2)
4 flatten_school_state = Flatten()(embedding)
~/anaconda3/lib/python3.7/site-packages/keras/engine/base_layer.py in __call__(self, inputs, **kwargs)
472 if all([s is not None
473 for s in to_list(input_shape)]):
--> 474 output_shape = self.compute_output_shape(input_shape)
475 else:
476 if isinstance(input_shape, list):
~/anaconda3/lib/python3.7/site-packages/keras/layers/embeddings.py in compute_output_shape(self, input_shape)
131 raise ValueError(
132 '"input_length" is %s, but received input has shape %s' %
--> 133 (str(self.input_length), str(input_shape)))
134 elif s1 is None:
135 in_lens[i] = s2
ValueError: "input_length" is 1, but received input has shape (None, 26)
embedding_size can never be the input size.
A Keras embedding takes "integers" as input. You should have your data as numbers from 0 to 6.
If your 100 data points form a sequence of days, you cannot restrict the length of the sequences in the embedding to 1.
Your input shape should be (length_of_sequence,), which means your training data should have shape (any, length_of_sequence), probably (1, 100) going by your description.
All the rest is automatic.
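A minimal sketch along those lines, assuming the 100 label-encoded weekdays (integers 0 to 6) form one sequence, so the training data has shape (1, 100):
import numpy as np
from keras.layers import Input, Embedding, Flatten

no_of_unique_cat = 7                      # Monday .. Sunday
vocab = no_of_unique_cat + 1              # reserve index 0, as in the original code
embedding_size = int(min(np.ceil(no_of_unique_cat / 2), 50))   # 4
length_of_sequence = 100                  # the 100 data points as one sequence of days

input_layer2 = Input(shape=(length_of_sequence,))   # integer ids, not embedding_size
embedding = Embedding(input_dim=vocab, output_dim=embedding_size,
                      input_length=length_of_sequence, trainable=True)(input_layer2)
flatten_weekdays = Flatten()(embedding)   # shape (None, length_of_sequence * embedding_size)

# encoded_weekdays_train reshaped to (1, 100) can then be fed as this input.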

strides should be of length 1, 1 or 3 but was 2

I have been trying to stack convolutional neural networks with GRUs for an image-to-text problem.
Here's my model :
model = Sequential()
model.add(TimeDistributed(Conv2D(16, kernel_size=(3,3), data_format="channels_last",
                                 input_shape=(129,80,564,3), padding='SAME', strides=(1,1))))
model.add(TimeDistributed(Activation("relu")))
model.add(TimeDistributed(Conv2D(16, kernel_size=(3,3), strides=(1,1))))
model.add(TimeDistributed(Activation("relu")))
model.add(TimeDistributed(MaxPooling2D(pool_size=2, strides=(1,1))))
model.add(TimeDistributed(Reshape((280*38*16,))))
model.add(TimeDistributed(Dense(32)))
model.add(GRU(512))
model.add(Dense(50))
model.add(Activation("softmax"))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
When I try to fit my model I get the following error :
-------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-125-c6a3c418689c> in <module>()
      1 nb_epoch = 100
----> 2 model.fit(X2, L2, epochs=100)

10 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py in _get_sequence(value, n, channel_index, name)
     71   else:
     72     raise ValueError("{} should be of length 1, {} or {} but was {}".format(
---> 73         name, n, n + 2, current_n))
     74
     75   if channel_index == 1:

ValueError: strides should be of length 1, 1 or 3 but was 2
I cannot even begin to wrap my head around why this message appears. I have specified the strides parameter for all layers. Any help will be deeply appreciated.
P.S. I did not have any problems when I fitted a model without TimeDistributed layers, so maybe that has something to do with this error.
You have made several mistakes in your code.
In the first layer you should specify the input_shape on the TimeDistributed wrapper, not on the Conv2D layer.
MaxPooling2D is used for down-sampling the spatial size of the images, but with strides=(1,1) the image size is essentially not reduced.
Using padding='SAME' in the first layer adds zero-padding during the convolution and results in a shape mismatch error at the Reshape layer. Alternatively, you can use a Flatten layer instead of Reshape.
The default stride of a Conv2D layer is strides=(1,1), so it is optional to specify it.
Finally, the working code should look something like the following:
model=keras.models.Sequential()
model.add(keras.layers.TimeDistributed(keras.layers.Conv2D(16, kernel_size=(3,3), data_format="channels_last"),input_shape=(129,80,564,3)))
model.add(keras.layers.TimeDistributed(keras.layers.Activation("relu")))
model.add(keras.layers.TimeDistributed(keras.layers.Conv2D(16, kernel_size =(3,3))))
model.add(keras.layers.TimeDistributed(keras.layers.Activation("relu")))
model.add(keras.layers.TimeDistributed(keras.layers.MaxPooling2D(pool_size=2)))
# model.add(keras.layers.TimeDistributed(keras.layers.Flatten()))
model.add(keras.layers.TimeDistributed(keras.layers.Reshape((280*38*16,))))
model.add(keras.layers.TimeDistributed(keras.layers.Dense(32)))
model.add(keras.layers.GRU(512))
model.add(keras.layers.Dense(50))
model.add(keras.layers.Activation("softmax"))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
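As a quick sanity check, here is a hypothetical smoke test, assuming X2 holds sequences of 129 RGB frames of size 80 × 564 as the shapes in the question suggest; it confirms the model ends in a 50-class softmax before you fit on the real X2 / L2:
import numpy as np

dummy_X = np.zeros((1, 129, 80, 564, 3), dtype="float32")
print(model.predict(dummy_X).shape)   # expected: (1, 50)
model.summary()                       # each frame should flatten to 280*38*16 before Dense(32)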

Error on executing sess.run() ValueError: setting an array element with a sequence

This error is generated when I try to run the training step. The dataset is the MNIST dataset from Kaggle. I'm using a neural network to predict the handwritten digits:
Input Data : [33600, 784] reshaped into [784, 33600]
Neural network architecture:
Layer 1 has W1 1000 by 784 relu
Layer 2 has W2 1000 by 1000 relu
Layer 3 has W3 500 by 1000 relu
Layer 4 has W4 200 by 500 relu
Layer 5 has W5 10 by 200 with softmax
No biases used
Code:
print(X_train[:, 0].reshape(-1, 1).shape, " ", y_train[:, 0].reshape(-1, 1).shape)
Output: (784, 1) (10, 1)
Code:
X, Y = tf.placeholder(tf.float32, [784, None]), tf.placeholder(tf.float32, [10, None])
logits = forward_propagation(X, parameters)
cost = compute_cost(logits, Y)
optimizer = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    _, c = sess.run([optimizer, cost], feed_dict={X: X_train[:, 0].reshape(-1, 1),
                                                  Y: y_train[:, 0].reshape(-1, 1)})
    print(c)
Output:
ValueError                                Traceback (most recent call last)
<ipython-input-41-f78f499b0606> in <module>()
      8 with tf.Session() as sess:
      9     sess.run(tf.global_variables_initializer())
---> 10     _,c = sess.run([optimizer,cost], feed_dict={X:np.asarray(X_train), Y:np.asarray(y_train)})
     11     print(c)
.......
.......
ValueError: setting an array element with a sequence.
Please correct the code if you can.
I found the solution. As mentioned in the answers to many other similar questions, the problem is generally with the shape and type of the arrays provided to the feed_dict.
My main focus was on X: X_train[:, 0].reshape(-1, 1), but it had the correct shape and type. The error was in Y: y_train[:, 0].reshape(-1, 1). I could not spot this because I applied one-hot encoding to y_train but forgot to call the .toarray() method after transforming, so the shape of y_train appeared to be correct although it was actually wrong.
As a general suggestion after going through many similar questions, I would say: thoroughly check the shapes, types and contents of the arrays being fed to the feed_dict.
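A minimal sketch of the fix described above, assuming raw_labels is a hypothetical 1-D array of digit labels and scikit-learn's OneHotEncoder was used for the encoding:
import numpy as np
from sklearn.preprocessing import OneHotEncoder

encoder = OneHotEncoder(categories='auto')
y_sparse = encoder.fit_transform(raw_labels.reshape(-1, 1))  # scipy sparse matrix
y_train = y_sparse.toarray().T            # dense array of shape (10, num_samples)

# Feeding the sparse matrix itself into feed_dict is what raises
# "ValueError: setting an array element with a sequence."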

How to fit a 3D matrix in keras model?

I am creating a regression model with Keras. I have ten 145 × 5 matrices of digits, and I am having trouble fitting these ten 145 × 5 matrices into the Keras model.
X is input matrix
In: X.shape
Out: (10, 145, 5)
y is target matrix
In: y.shape
Out: (10,)
For each 145 * 5 matrix there will be one value in the target matrix
Making the model
In: model = Sequential([
Dense(32, input_dim=145),
Activation('sigmoid'),
Dense(output_dim=10)
])
Although the previous line does not throw any error or warning, I am quite sure that it is not the correct way to set up the model in this case.
In: model.compile(optimizer='sgd',loss='mse')
No problem so far. But when I try to fit the matrices
In: model.fit(X, y.reshape(-1, 1))
After this line I get a long traceback which ultimately says
ValueError: Error when checking model input: expected dense_input_1 to have 2 dimensions, but got array with shape (10, 145, 5)
Please help me to correctly fit the matrices in the model. Thanks!
Use input_shape instead of input_dim. Also, since the number of dimensions has to drop from the 3D input to the 2D output, you need to use a Flatten or Reshape layer somewhere in the model.
from keras.layers import Flatten
model = Sequential([
Dense(32, input_shape=(145,5)),
Flatten(),
Activation('sigmoid'),
Dense(output_dim=10)
])
model.summary()
Use model.summary() to check the structure of your model for better understanding.
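For example, a hypothetical shape check with random data of the question's dimensions confirms that each 145 × 5 matrix now maps to one 10-dimensional output:
import numpy as np

X_dummy = np.random.rand(10, 145, 5)
print(model.predict(X_dummy).shape)   # (10, 10): one output vector per matrix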
