Invalid argument: indices[0,0] = -4 is not in [0, 40405) - keras

I have a model that was kinda working on some data. I've added in some tokenized word data in the dataset (somewhat truncated for brevity):
comment_texts = df.comment_text.values
tokenizer = Tokenizer(num_words=num_words)
tokenizer.fit_on_texts(comment_texts)
comment_seq = tokenizer.texts_to_sequences(comment_texts)
maxtrainlen = max_length(comment_seq)
comment_train = pad_sequences(comment_seq, maxlen=maxtrainlen, padding='post')
vocab_size = len(tokenizer.word_index) + 1
df.comment_text = comment_train
x = df.drop('label', axis=1) # the thing I'm training
labels = df['label'].values # Also known as Y
x_train, x_test, y_train, y_test = train_test_split(
    x, labels, test_size=0.2, random_state=1337)
n_cols = x_train.shape[1]
embedding_dim = 100 # TODO: why?
model = Sequential([
    Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_shape=(n_cols,)),
    LSTM(32),
    Dense(32, activation='relu'),
    Dense(512, activation='relu'),
    Dense(12, activation='softmax'), # for an unknown type, we don't account for that while training
])
model.summary()
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['acc'])
# convert the y_train to a one hot encoded variable
encoder = LabelEncoder()
encoder.fit(labels) # fit on all the labels
encoded_Y = encoder.transform(y_train) # encode on y_train
one_hot_y = np_utils.to_categorical(encoded_Y)
model.fit(x_train, one_hot_y, epochs=10, batch_size=16)
Now, I get this error:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 12, 100) 4040500
_________________________________________________________________
lstm (LSTM) (None, 32) 17024
_________________________________________________________________
dense (Dense) (None, 32) 1056
_________________________________________________________________
dense_1 (Dense) (None, 512) 16896
_________________________________________________________________
dense_2 (Dense) (None, 12) 6156
=================================================================
Total params: 4,081,632
Trainable params: 4,081,632
Non-trainable params: 0
_________________________________________________________________
Train on 4702 samples
Epoch 1/10
2020-03-04 22:37:59.499238: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: indices[0,0] = -4 is not in [0, 40405)
I think this must be coming from my comment_text column since that is the only thing I added.
Here is what comment_text looks like before I make the substitution:
And here is after:
My full code (before I made the change) is here:
https://colab.research.google.com/drive/1y8Lhxa_DROZg0at3VR98fi5WCcunUhyc#scrollTo=hpEoqR4ne9TO

You should train on comment_train, not on x. x holds every remaining column of df, so any value in those columns that falls outside [0, vocab_size), such as the -4 in the error, is passed to the Embedding layer as if it were a word index.
You are free to choose embedding_dim=100. It plays the same role as the number of units in a hidden layer, and you can tune it the same way to find what works best for your model.
In your case, you will need a model with two or more inputs:
One input for the comments, passing through the embedding and the text-processing layers.
Another input for the rest of the data, probably passing through a standard network.
At some point you concatenate these two branches and continue with the rest of the model.
This link has a good tutorial about the functional API models and shows a model that has two text inputs and an extra input: https://www.tensorflow.org/guide/keras/functional
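To confirm where an out-of-range index comes from before calling fit(), a quick scan of the input can help. The following is a minimal sketch in plain Python; the helper name find_invalid_indices and the toy batch are hypothetical, and only the vocabulary size 40405 is taken from the error message.

```python
def find_invalid_indices(rows, vocab_size):
    """Return (row, col, value) for every entry outside [0, vocab_size)."""
    bad = []
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if not 0 <= value < vocab_size:
                bad.append((r, c, value))
    return bad


# Hypothetical toy batch: the first row starts with a non-text feature (-4),
# the remaining entries are valid token ids.
batch = [
    [-4, 17, 256, 0],
    [3, 40404, 12, 0],
]
vocab_size = 40405  # from the error message: valid indices are in [0, 40405)

print(find_invalid_indices(batch, vocab_size))
```

If the scan reports entries in columns that are not tokenized text, those columns are reaching the Embedding layer and should be fed through a separate input instead.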

Related

Keras LSTM Layer ValueError: Dimensions must be equal, but are 17 and 2

I'm working on a basic RNN model for a multiclass task and I'm facing some issues with output dimensions.
This is my input/output shapes:
input.shape = (50000, 2, 5) # (samples, features, feature_len)
output.shape = (50000, 17, 185) # (samples, features, feature_len) <-- one hot encoded
input[0].shape = (2, 5)
output[0].shape = (17, 185)
This is my model, using Keras functional API:
inp = tf.keras.Input(shape=(2, 5,))
x = tf.keras.layers.LSTM(128, input_shape=(2, 5,), return_sequences=True, activation='relu')(inp)
out = tf.keras.layers.Dense(185, activation='softmax')(x)
model = tf.keras.models.Model(inputs=inp, outputs=out)
This is my model.summary():
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 2, 5)] 0
_________________________________________________________________
lstm (LSTM) (None, 2, 128) 68608
_________________________________________________________________
dense (Dense) (None, 2, 185) 23865
=================================================================
Total params: 92,473
Trainable params: 92,473
Non-trainable params: 0
_________________________________________________________________
Then I compile the model and run fit():
model.compile(optimizer='adam',
              loss=tf.nn.softmax_cross_entropy_with_logits,
              metrics='accuracy')
model.fit(x=input, y=output, epochs=5)
And I'm getting a dimension error:
ValueError: Dimensions must be equal, but are 17 and 2 for '{{node Equal}} = Equal[T=DT_INT64, incompatible_shape_error=true](ArgMax, ArgMax_1)' with input shapes: [?,17], [?,2].
The error is clear: the model outputs a dimension of 2 and my target has a dimension of 17. Although I understand the issue, I can't find a way to fix it. Any ideas?
I think your model's output shape is not output[0].shape = (17, 185) but (None, 2, 185), as the summary shows for dense (Dense).
You need to change either the shape of your targets or your layer structure.
When you specify return_sequences=True, the LSTM returns its output for every timestep rather than only for the last one. Hence, I suggest using just the last timestep's output as the input to your Dense layer. The examples section of the documentation may help.
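The effect of return_sequences on the output shape can be traced by hand. This is a small sketch in plain Python, not Keras code; the helpers lstm_output_shape and dense_output_shape are hypothetical names that only mirror the shape rules of the corresponding layers.

```python
def lstm_output_shape(input_shape, units, return_sequences=False):
    """Shape produced by an LSTM for input (batch, timesteps, features)."""
    batch, timesteps, _features = input_shape
    if return_sequences:
        # One output vector per timestep.
        return (batch, timesteps, units)
    # Only the last timestep's output is kept.
    return (batch, units)


def dense_output_shape(input_shape, units):
    """Dense acts on the last axis and leaves the other axes unchanged."""
    return tuple(input_shape[:-1]) + (units,)


inp = (None, 2, 5)  # input shape from the question
with_sequences = dense_output_shape(
    lstm_output_shape(inp, 128, return_sequences=True), 185)
last_only = dense_output_shape(
    lstm_output_shape(inp, 128, return_sequences=False), 185)

print(with_sequences)  # (None, 2, 185)
print(last_only)       # (None, 185)
```

Neither shape matches targets of shape (17, 185), which is why the targets or the layer structure still have to change.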

ValueError: expected dense_22 to have shape (None, 37) but got array with shape (1000, 2)

I am currently working on a question answering system. I create a synthetic dataset that contains multiple words in the answers. But, the answers are not a span of the given context.
Initially, I am planning to test it using a deep learning-based model. But I have some problems building the model.
This is how I vectorized data.
def vectorize(data, word2idx, story_maxlen, question_maxlen, answer_maxlen):
    """ Create the story and question vectors and the label """
    Xs, Xq, Y = [], [], []
    for story, question, answer in data:
        xs = [word2idx[word] for word in story]
        xq = [word2idx[word] for word in question]
        y = [word2idx[word] for word in answer]
        #y = np.zeros(len(word2idx) + 1)
        #y[word2idx[answer]] = 1
        Xs.append(xs)
        Xq.append(xq)
        Y.append(y)
    return (pad_sequences(Xs, maxlen=story_maxlen),
            pad_sequences(Xq, maxlen=question_maxlen),
            pad_sequences(Y, maxlen=answer_maxlen))
            #np.array(Y))
Below is how I create the model.
# story encoder. Output dim: (None, story_maxlen, EMBED_HIDDEN_SIZE)
story_encoder = Sequential()
story_encoder.add(Embedding(input_dim=vocab_size,
                            output_dim=EMBED_HIDDEN_SIZE,
                            input_length=story_maxlen))
story_encoder.add(Dropout(0.3))
# question encoder. Output dim: (None, question_maxlen, EMBED_HIDDEN_SIZE)
question_encoder = Sequential()
question_encoder.add(Embedding(input_dim=vocab_size,
                               output_dim=EMBED_HIDDEN_SIZE,
                               input_length=question_maxlen))
question_encoder.add(Dropout(0.3))
# episodic memory (facts): story * question
# Output dim: (None, question_maxlen, story_maxlen)
facts_encoder = Sequential()
facts_encoder.add(Merge([story_encoder, question_encoder],
                        mode="dot", dot_axes=[2, 2]))
facts_encoder.add(Permute((2, 1)))
## combine response and question vectors and do logistic regression
answer = Sequential()
answer.add(Merge([facts_encoder, question_encoder],
                 mode="concat", concat_axis=-1))
answer.add(LSTM(LSTM_OUTPUT_SIZE, return_sequences=True))
answer.add(Dropout(0.3))
answer.add(Flatten())
answer.add(Dense(vocab_size,activation= "softmax"))
answer.compile(optimizer="rmsprop", loss="categorical_crossentropy",
               metrics=["accuracy"])
answer.fit([Xs_train, Xq_train], Y_train,
           batch_size=BATCH_SIZE, nb_epoch=NBR_EPOCHS,
           validation_data=([Xs_test, Xq_test], Y_test))
and this is the summary of the model
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
merge_46 (Merge) (None, 5, 616) 0
_________________________________________________________________
lstm_23 (LSTM) (None, 5, 32) 83072
_________________________________________________________________
dropout_69 (Dropout) (None, 5, 32) 0
_________________________________________________________________
flatten_9 (Flatten) (None, 160) 0
_________________________________________________________________
dense_22 (Dense) (None, 37) 5957
=================================================================
Total params: 93,765.0
Trainable params: 93,765.0
Non-trainable params: 0.0
_________________________________________________________________
It gives the following error.
ValueError: Error when checking model target: expected dense_22 to have shape (None, 37) but got array with shape (1000, 2)
I think the error is related to Y_train and Y_test. I probably need to encode them as categorical values, but the answers are sequences of words rather than spans of the context, and I don't know how to do that.
How can I fix it? Any ideas?
EDIT:
When I use sparse_categorical_crossentropy as the loss and Reshape((2, -1)):
answer.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
merge_94 (Merge) (None, 5, 616) 0
_________________________________________________________________
lstm_65 (LSTM) (None, 5, 32) 83072
_________________________________________________________________
dropout_139 (Dropout) (None, 5, 32) 0
_________________________________________________________________
reshape_22 (Reshape) (None, 2, 80) 0
_________________________________________________________________
dense_44 (Dense) (None, 2, 37) 2997
=================================================================
Total params: 90,805.0
Trainable params: 90,805.0
Non-trainable params: 0.0
_________________________________________________________________
EDIT2:
The model after modifications
# story encoder. Output dim: (None, story_maxlen, EMBED_HIDDEN_SIZE)
story_encoder = Sequential()
story_encoder.add(Embedding(input_dim=vocab_size,
                            output_dim=EMBED_HIDDEN_SIZE,
                            input_length=story_maxlen))
story_encoder.add(Dropout(0.3))
# question encoder. Output dim: (None, question_maxlen, EMBED_HIDDEN_SIZE)
question_encoder = Sequential()
question_encoder.add(Embedding(input_dim=vocab_size,
                               output_dim=EMBED_HIDDEN_SIZE,
                               input_length=question_maxlen))
question_encoder.add(Dropout(0.3))
# episodic memory (facts): story * question
# Output dim: (None, question_maxlen, story_maxlen)
facts_encoder = Sequential()
facts_encoder.add(Merge([story_encoder, question_encoder],
                        mode="dot", dot_axes=[2, 2]))
facts_encoder.add(Permute((2, 1)))
## combine response and question vectors and do logistic regression
answer = Sequential()
answer.add(Merge([facts_encoder, question_encoder],
                 mode="concat", concat_axis=-1))
answer.add(LSTM(LSTM_OUTPUT_SIZE, return_sequences=True))
answer.add(Dropout(0.3))
#answer.add(Flatten())
answer.add(keras.layers.Reshape((2, -1)))
answer.add(Dense(vocab_size,activation= "softmax"))
answer.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy",
               metrics=["accuracy"])
answer.fit([Xs_train, Xq_train], Y_train,
           batch_size=BATCH_SIZE, nb_epoch=NBR_EPOCHS,
           validation_data=([Xs_test, Xq_test], Y_test))
It still gives
ValueError: Error when checking model target: expected dense_46 to have 3 dimensions, but got array with shape (1000, 2)
As far as I understand, Y_train and Y_test consist of indexes (not one-hot vectors). If so, change the loss to sparse_categorical_crossentropy:
answer.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy",
               metrics=["accuracy"])
As far as I understand, Y_train and Y_test also have a sequence dimension, and the length of the questions (5) does not equal the length of the answers (2). Flatten() removes this dimension. Try replacing Flatten() with Reshape():
# answer.add(Flatten())
answer.add(tf.keras.layers.Reshape((2, -1)))
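The shape that Reshape((2, -1)) produces can be worked out by hand: the -1 entry is filled in so that the total number of elements per sample is preserved. The sketch below shows that arithmetic in plain Python; resolve_reshape is a hypothetical helper, not a Keras function.

```python
from math import prod


def resolve_reshape(input_shape, target_shape):
    """Fill in a single -1 in target_shape so the element count is preserved."""
    if list(target_shape).count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    total = prod(input_shape)
    known = prod(d for d in target_shape if d != -1)
    if -1 in target_shape:
        if total % known != 0:
            raise ValueError("input size is not divisible by the known dimensions")
        return tuple(total // known if d == -1 else d for d in target_shape)
    if known != total:
        raise ValueError("total size must stay the same")
    return tuple(target_shape)


# Per-sample shape after the LSTM/Dropout layers in the question: (5, 32).
print(resolve_reshape((5, 32), (2, -1)))  # (2, 80)
```

This matches the (None, 2, 80) row in the summary from the first edit, so the following Dense layer yields one vocab_size-way prediction for each of the 2 answer positions.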

Keras ValueError: Shapes (None, 1) and (None, 16) are incompatible

I'm trying to train a multilabel classification from text input.
I first tokenize the text
tokenizer = Tokenizer(num_words=max_words)
tokenizer.fit_on_texts(df['text'])
sequences = tokenizer.texts_to_sequences(df['text'])
data = pad_sequences(sequences, maxlen=maxlen)
getting the following shape:
Shape of data tensor: (1333, 100) Shape of label tensor: (1333,)
Then I split in train and validations
x_train = data[:training_samples]
y_train = labels[:training_samples]
x_val = data[training_samples: training_samples + validation_samples]
y_val = labels[training_samples: training_samples + validation_samples]
I use Glove for word representations
embeddings_index = {}
f = open(os.path.join(glove_dir, 'glove.6B.100d.txt'))
for line in f:
    values = line.split()
    word = values[0]
    coefs = np.asarray(values[1:], dtype='float32')
    embeddings_index[word] = coefs
f.close()
embedding_dim = 100
embedding_matrix = np.zeros((max_words, embedding_dim))
for word, i in word_index.items():
    if i < max_words:
        embedding_vector = embeddings_index.get(word)
        if embedding_vector is not None:
            embedding_matrix[i] = embedding_vector
I build the Keras model
model = Sequential()
model.add(Embedding(max_words, embedding_dim, input_length=maxlen))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(16, activation='softmax'))
model.summary()
Ending up with
Model: "sequential_32"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_27 (Embedding) (None, 100, 100) 1000000
_________________________________________________________________
flatten_21 (Flatten) (None, 10000) 0
_________________________________________________________________
dense_56 (Dense) (None, 64) 640064
_________________________________________________________________
dense_57 (Dense) (None, 16) 1040
=================================================================
Total params: 1,641,104
Trainable params: 1,641,104
Non-trainable params: 0
I set the weights of the embedding layer:
model.layers[0].set_weights([embedding_matrix])
model.layers[0].trainable = False
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['categorical_accuracy'])
history = model.fit(x_train, y_train, epochs=10, batch_size=32, validation_data=(x_val, y_val))
But I end up with the error
ValueError: Shapes (None, 1) and (None, 16) are incompatible
Everything works right if I do a single-label classification (using Dense(1) as last layer and sigmoid activation), but I can't understand why this is happening.
If you use categorical_crossentropy, you should encode your labels in one-hot format.
Otherwise, you can use sparse_categorical_crossentropy as the loss function, which accepts integer labels in your current format. More information:
https://stats.stackexchange.com/questions/326065/cross-entropy-vs-sparse-cross-entropy-when-to-use-one-over-the-other
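The difference between the two label formats can be illustrated without Keras. The sketch below is plain Python with hypothetical helper names and made-up probabilities; it uses 4 classes instead of the question's 16 to keep the output short. It shows that integer labels of shape (n,) become one-hot rows of shape (n, num_classes), and that both loss variants give the same value for a single sample.

```python
import math


def one_hot(labels, num_classes):
    """Turn integer class ids into one-hot rows."""
    return [[1.0 if k == label else 0.0 for k in range(num_classes)]
            for label in labels]


def categorical_crossentropy(y_true_one_hot, y_pred):
    """Cross-entropy for a one-hot target and a predicted distribution."""
    return -sum(t * math.log(p) for t, p in zip(y_true_one_hot, y_pred))


def sparse_categorical_crossentropy(label, y_pred):
    """Cross-entropy for an integer target and a predicted distribution."""
    return -math.log(y_pred[label])


labels = [2, 0]          # integer labels, shape (2,)
num_classes = 4          # assumption for the example; the question has 16
encoded = one_hot(labels, num_classes)   # shape (2, 4)
y_pred = [0.1, 0.2, 0.6, 0.1]            # made-up softmax output for sample 0

dense_loss = categorical_crossentropy(encoded[0], y_pred)
sparse_loss = sparse_categorical_crossentropy(labels[0], y_pred)
print(encoded[0])
print(dense_loss, sparse_loss)
```

In Keras itself, the one-hot conversion is usually done with keras.utils.to_categorical rather than a hand-written helper.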

Error when checking input: expected dense_28_input to have 2 dimensions, but got array with shape (3084, 32, 32)

I'm trying to make a prediction with my model, where the shape of the input array is (3084, 32, 32).
I'm getting a ValueError (the message quoted in the title).
Here is my model
model.add(Dense(1028, input_shape = (3084,), activation = "sigmoid"))
model.add(Dense(514, activation="sigmoid"))
model.add(Dense(len(lb.classes_), activation="softmax"))
summary
Model: "sequential_21"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_57 (Dense) (None, 1028) 3171380
_________________________________________________________________
dense_58 (Dense) (None, 514) 528906
_________________________________________________________________
dense_59 (Dense) (None, 4) 2060
=================================================================
Total params: 3,702,346
Trainable params: 3,702,346
Non-trainable params: 0
_________________________________________________________________
trying to fit using
opt = SGD(lr = 0.01)
model.compile(loss = "categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
H = model.fit(train_X, train_Y, validation_data = (test_X, test_Y), epochs = 75, batch_size = 32)
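You need to specify the input shape correctly: input_shape describes a single sample, so it must not include the number of samples (3084). The following model should work.
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential
model = Sequential()
# Each sample is a 32x32 array; Dense is applied along the last axis
model.add(Dense(1028, input_shape = (32,32), activation = "sigmoid"))
# Flatten the (32, 1028) output into one vector per sample
model.add(Flatten())
model.add(Dense(514, activation="sigmoid"))
model.add(Dense(4, activation="softmax"))
model.summary()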
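The rule behind the error can be stated as a small check: the array's shape without its first (sample) axis has to equal the declared input_shape. This is a simplified sketch in plain Python; is_compatible is a hypothetical helper and it ignores details such as unknown (None) dimensions.

```python
def is_compatible(array_shape, input_shape):
    """True if an array can be fed to a first layer declared with input_shape.

    input_shape describes one sample, so the leading sample axis of the
    array is dropped before comparing.
    """
    return tuple(array_shape[1:]) == tuple(input_shape)


data_shape = (3084, 32, 32)  # shape from the question: 3084 samples of 32x32

print(is_compatible(data_shape, (3084,)))   # False: the question's declaration
print(is_compatible(data_shape, (32, 32)))  # True: the declaration used above
```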

How to see keras.engine.sequential.Sequential

I am new to Keras and deep learning and was working with MNIST on Keras. When I created a model using
model = models.Sequential()
model.add(layers.Dense(512,activation = 'relu',input_shape=(28*28,)))
model.add(layers.Dense(32,activation ='relu'))
model.add(layers.Dense(10,activation='softmax'))
and then I printed it
print(model)
output is
<keras.engine.sequential.Sequential at 0x7f3d554f6710>
Is there a way to get a more informative printout of a Keras model? For example, when I print the model I would like to see that it has 3 layers, with the first layer having 784 input units and 512 hidden units, the second having 512 input units and 32 hidden units, and so on.
You can also try plot_model(), which draws the model as a diagram (it needs the pydot and graphviz packages to be installed):
import tensorflow as tf
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(512,activation = 'relu',input_shape=(28*28,)))
model.add(tf.keras.layers.Dense(32,activation ='relu'))
model.add(tf.keras.layers.Dense(10,activation='softmax'))
model.summary()
# Render the architecture, including each layer's input and output shapes
tf.keras.utils.plot_model(model, show_shapes=True, show_layer_names=True)
model.summary() will print the entire model for you.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
model = Sequential()
model.add(Dense(512,activation = 'relu',input_shape=(28*28,)))
model.add(Dense(32,activation ='relu'))
model.add(Dense(10,activation='softmax'))
model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 512) 401920
_________________________________________________________________
dense_1 (Dense) (None, 32) 16416
_________________________________________________________________
dense_2 (Dense) (None, 10) 330
=================================================================
Total params: 418,666
Trainable params: 418,666
Non-trainable params: 0
____________________________
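The Param # column of this summary can be reproduced by hand, which is a useful way to check that each layer has the input and output sizes you expect. A minimal sketch in plain Python follows; dense_params is a hypothetical helper that counts one weight per input-unit pair plus one bias per unit.

```python
def dense_params(n_inputs, units):
    """Weights (n_inputs * units) plus biases (units) of a Dense layer."""
    return n_inputs * units + units


layer_units = [512, 32, 10]  # units of the three Dense layers above
n_inputs = 28 * 28           # 784 input features

counts = []
for units in layer_units:
    counts.append(dense_params(n_inputs, units))
    n_inputs = units  # the next layer takes this layer's output as input

print(counts)       # [401920, 16416, 330]
print(sum(counts))  # 418666
```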
