I want to build a deep RNN from my X_train and y_train data. When I execute the code below:
print(X_train_fea.shape, y_train_fea.shape)
X_train_res = np.reshape(X_train_fea,(10510,10,1))
y_train_res = np.reshape(y_train_fea.to_numpy(),(-1,1))
print(X_train_res.shape, y_train_res.shape)
result:
(10510, 10) (10510,)
(10510, 10, 1) (10510, 1)
and
model = Sequential([
LSTM(90, input_shape=(10,1)),
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
When I fit the model
history = model.fit(X_train_res, y_train_res,epochs=5)
I got
ValueError: Shapes (None, 1) and (None, 90) are incompatible
It looks like y_train_res consists of integer class indices rather than one-hot vectors. If so, you have to use sparse_categorical_crossentropy:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
and change its shape to 1D:
y_train_res = np.reshape(y_train_fea.to_numpy(),(-1,))
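To make the difference between the two losses concrete, here is a minimal NumPy sketch. It is not Keras's implementation; the helper functions and the toy 3-class probabilities are made up for illustration. It shows that integer labels of shape (N,) and one-hot labels of shape (N, C) give the same loss value as long as each is paired with the matching loss function.

```python
import numpy as np

def categorical_crossentropy(y_onehot, probs):
    """Mean cross-entropy for one-hot targets of shape (N, C)."""
    return float(np.mean(-np.sum(y_onehot * np.log(probs), axis=1)))

def sparse_categorical_crossentropy(y_index, probs):
    """Mean cross-entropy for integer targets of shape (N,)."""
    rows = np.arange(len(y_index))
    return float(np.mean(-np.log(probs[rows, y_index])))

# Toy predictions for 4 samples and 3 classes (each row sums to 1).
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
])
y_index = np.array([0, 1, 2, 1])   # integer class indices, shape (4,)
y_onehot = np.eye(3)[y_index]      # one-hot version, shape (4, 3)

loss_sparse = sparse_categorical_crossentropy(y_index, probs)
loss_onehot = categorical_crossentropy(y_onehot, probs)
print(loss_sparse, loss_onehot)    # the two values agree
```

Note that with integer labels, every label has to fall in the range [0, number of output units), so with a 90-unit output layer the labels would need to lie in [0, 90).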
Related
I'm trying to train a multilabel classification model on text input.
I first tokenize the text
tokenizer = Tokenizer(num_words=max_words)
tokenizer.fit_on_texts(df['text'])
sequences = tokenizer.texts_to_sequences(df['text'])
data = pad_sequences(sequences, maxlen=maxlen)
getting the following shape:
Shape of data tensor: (1333, 100) Shape of label tensor: (1333,)
Then I split the data into training and validation sets
x_train = data[:training_samples]
y_train = labels[:training_samples]
x_val = data[training_samples: training_samples + validation_samples]
y_val = labels[training_samples: training_samples + validation_samples]
I use GloVe for the word representations
embeddings_index = {}
f = open(os.path.join(glove_dir, 'glove.6B.100d.txt'))
for line in f:
    values = line.split()
    word = values[0]
    coefs = np.asarray(values[1:], dtype='float32')
    embeddings_index[word] = coefs
f.close()
embedding_dim = 100
embedding_matrix = np.zeros((max_words, embedding_dim))
for word, i in word_index.items():
    if i < max_words:
        embedding_vector = embeddings_index.get(word)
        if embedding_vector is not None:
            embedding_matrix[i] = embedding_vector
I build the Keras model
model = Sequential()
model.add(Embedding(max_words, embedding_dim, input_length=maxlen))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(16, activation='softmax'))
model.summary()
Ending up with
Model: "sequential_32"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_27 (Embedding) (None, 100, 100) 1000000
_________________________________________________________________
flatten_21 (Flatten) (None, 10000) 0
_________________________________________________________________
dense_56 (Dense) (None, 64) 640064
_________________________________________________________________
dense_57 (Dense) (None, 16) 1040
=================================================================
Total params: 1,641,104
Trainable params: 1,641,104
Non-trainable params: 0
I set the weights of the embedding layer:
model.layers[0].set_weights([embedding_matrix])
model.layers[0].trainable = False
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['categorical_accuracy'])
history = model.fit(x_train, y_train, epochs=10, batch_size=32, validation_data=(x_val, y_val))
But I end up with the error
ValueError: Shapes (None, 1) and (None, 16) are incompatible
Everything works right if I do a single-label classification (using Dense(1) as last layer and sigmoid activation), but I can't understand why this is happening.
You should encode your labels in one-hot format if you use categorical_crossentropy.
Otherwise, you can use sparse_categorical_crossentropy as the loss function, which accepts integer labels in your current format (more info):
https://stats.stackexchange.com/questions/326065/cross-entropy-vs-sparse-cross-entropy-when-to-use-one-over-the-other
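As a rough illustration of the first option, the sketch below one-hot encodes integer labels with plain NumPy so that their shape matches a 16-unit softmax output. The random labels are a stand-in for the real ones, and to_one_hot is a hypothetical helper written for this example; Keras ships its own to_categorical utility that does the same job.

```python
import numpy as np

num_classes = 16   # matches Dense(16) in the question
rng = np.random.default_rng(0)
labels = rng.integers(0, num_classes, size=1333)   # stand-in for the real labels

def to_one_hot(indices, num_classes):
    """Turn integer labels of shape (N,) into one-hot rows of shape (N, num_classes)."""
    one_hot = np.zeros((indices.shape[0], num_classes), dtype="float32")
    one_hot[np.arange(indices.shape[0]), indices] = 1.0
    return one_hot

y_one_hot = to_one_hot(labels, num_classes)
print(labels.shape, y_one_hot.shape)   # (1333,) (1333, 16)
```

With targets of shape (1333, 16), the target and output shapes agree, which is what categorical_crossentropy expects.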
I have a model that was kinda working on some data. I've added in some tokenized word data in the dataset (somewhat truncated for brevity):
vocab_size = len(tokenizer.word_index) + 1
comment_texts = df.comment_text.values
tokenizer = Tokenizer(num_words=num_words)
tokenizer.fit_on_texts(comment_texts)
comment_seq = tokenizer.texts_to_sequences(comment_texts)
maxtrainlen = max_length(comment_seq)
comment_train = pad_sequences(comment_seq, maxlen=maxtrainlen, padding='post')
vocab_size = len(tokenizer.word_index) + 1
df.comment_text = comment_train
x = df.drop('label', 1) # the thing I'm training
labels = df['label'].values # Also known as Y
x_train, x_test, y_train, y_test = train_test_split(
x, labels, test_size=0.2, random_state=1337)
n_cols = x_train.shape[1]
embedding_dim = 100 # TODO: why?
model = Sequential([
Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_shape=(n_cols,)),
LSTM(32),
Dense(32, activation='relu'),
Dense(512, activation='relu'),
Dense(12, activation='softmax'), # for an unknown type, we don't account for that while training
])
model.summary()
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['acc'])
# convert the y_train to a one hot encoded variable
encoder = LabelEncoder()
encoder.fit(labels) # fit on all the labels
encoded_Y = encoder.transform(y_train) # encode on y_train
one_hot_y = np_utils.to_categorical(encoded_Y)
model.fit(x_train, one_hot_y, epochs=10, batch_size=16)
Now, I get this error:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 12, 100) 4040500
_________________________________________________________________
lstm (LSTM) (None, 32) 17024
_________________________________________________________________
dense (Dense) (None, 32) 1056
_________________________________________________________________
dense_1 (Dense) (None, 512) 16896
_________________________________________________________________
dense_2 (Dense) (None, 12) 6156
=================================================================
Total params: 4,081,632
Trainable params: 4,081,632
Non-trainable params: 0
_________________________________________________________________
Train on 4702 samples
Epoch 1/10
2020-03-04 22:37:59.499238: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: indices[0,0] = -4 is not in [0, 40405)
I think this must be coming from my comment_text column since that is the only thing I added.
Here is what comment_text looks like before I make the substitution:
And here is after:
My full code (before I made the change) is here:
https://colab.research.google.com/drive/1y8Lhxa_DROZg0at3VR98fi5WCcunUhyc#scrollTo=hpEoqR4ne9TO
You should train on comment_train, not on x, which contains whatever else happens to be in df (its other columns are unknown to us).
The value embedding_dim=100 is yours to choose. It works like the number of units in a hidden layer: you can tune it to find what works best for your model, just as you can tune the number of units in the hidden layers.
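To see why the contents of the input matter, it can help to think of an embedding layer as a table lookup: each integer token id selects one row of a (vocab_size, embedding_dim) matrix. The NumPy sketch below is an illustration only, with toy sizes and a hypothetical embed helper. It raises an error for ids outside [0, vocab_size), which is the same kind of failure as the "indices[0,0] = -4 is not in [0, 40405)" message in the question.

```python
import numpy as np

vocab_size, embedding_dim = 50, 8   # toy sizes chosen for illustration
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(vocab_size, embedding_dim))

def embed(token_ids, table):
    """Look up one row of `table` per token id, after validating the ids."""
    token_ids = np.asarray(token_ids)
    bad = (token_ids < 0) | (token_ids >= table.shape[0])
    if bad.any():
        raise ValueError(
            f"token id {token_ids[bad][0]} is not in [0, {table.shape[0]})"
        )
    return table[token_ids]

padded = np.array([[4, 17, 9, 0], [3, 0, 0, 0]])   # two padded sequences
vectors = embed(padded, embedding_table)            # shape (2, 4, 8)
print(vectors.shape)
```

Any column that is not a tokenized sequence (for example, a numeric feature that can be negative) would trip the range check here.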
In your case, you will need a model with two or more inputs:
One input for the comments, passing through the embedding and processing text
Another input for the rest of the data, probably passing through a standard network.
At some point you will concatenate these two branches and keep on going.
This link has a good tutorial about the functional API models and shows a model that has two text inputs and an extra input: https://www.tensorflow.org/guide/keras/functional
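The two-branch idea can be sketched at the level of array shapes with NumPy alone. This is not a Keras model: the sizes are invented, the mean over time is only a stand-in for the LSTM, and the single ReLU layer is a stand-in for the "standard network" branch.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, seq_len, vocab_size, embedding_dim, n_other = 4, 12, 50, 8, 5

# Branch 1 input: padded token ids. Branch 2 input: the remaining numeric columns.
comment_ids = rng.integers(0, vocab_size, size=(batch, seq_len))
other_features = rng.normal(size=(batch, n_other))

# Branch 1: embedding lookup, then a mean over time as a stand-in for the LSTM.
embedding_table = rng.normal(size=(vocab_size, embedding_dim))
text_branch = embedding_table[comment_ids].mean(axis=1)        # (batch, embedding_dim)

# Branch 2: one dense layer with ReLU.
w = rng.normal(size=(n_other, 16))
other_branch = np.maximum(other_features @ w, 0.0)             # (batch, 16)

# Join the branches; later layers would consume this combined array.
merged = np.concatenate([text_branch, other_branch], axis=1)   # (batch, 24)
print(merged.shape)
```

The key point is that the token ids and the other columns enter the model separately and are only combined once both branches produce 2D outputs.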
I have one column of categorical data with 1003 different categories, and many columns of regular integer data. I want to embed the categorical column and feed the embedded output, together with all the other columns, into my model. I am unsure how to do this, but I have tried the following code using Merge. Unfortunately, it raises a ValueError: '"concat" mode can only merge layers with matching output shapes except for the concat axis. Layer shapes: [(None, 1, 11), (None, 53)]'.
Any help would be greatly appreciated.
hidden_layers = [1000,500,500]
embedding = Sequential()
embedding.add(Embedding(1003, 11, input_length = 1))
model1 = Sequential()
model1.add(Dense(53, input_dim=53, activation='relu'))
model = Sequential()
model.add(Merge([embedding, model1], mode = 'concat'))
for i, layer_size in enumerate(hidden_layers):
    model.add(Dense(layer_size, activation='relu'))
model.add(Dense(self.output_layers, activation='linear'))
model.compile(optimizer = 'adam', loss = 'mse')
The Embedding layer produces a 3D tensor as you see in the error message (None, 1, 11) where 1 is the sequence length you are embedding. In order to merge with a 2D tensor you would have to Flatten it:
embedding = Sequential()
embedding.add(Embedding(1003, 11, input_length = 1))
embedding.add(Flatten())
which will give (None, 11) and can be merged with (None, 53).
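The shape reasoning can be checked with a small NumPy sketch. It only mimics the tensor shapes involved, not the Keras layers themselves; the batch size of 4 and the random values are arbitrary.

```python
import numpy as np

batch = 4
rng = np.random.default_rng(0)

embedding_table = rng.normal(size=(1003, 11))            # 1003 categories, 11 dims
category_ids = rng.integers(0, 1003, size=(batch, 1))    # input_length = 1
embedded = embedding_table[category_ids]                 # shape (batch, 1, 11)
dense_out = rng.normal(size=(batch, 53))                 # shape (batch, 53)

# Concatenating a 3D and a 2D array fails, mirroring the Keras error.
try:
    np.concatenate([embedded, dense_out], axis=-1)
    mismatch = False
except ValueError:
    mismatch = True

flat = embedded.reshape(batch, -1)                       # shape (batch, 11)
merged = np.concatenate([flat, dense_out], axis=-1)      # shape (batch, 64)
print(mismatch, flat.shape, merged.shape)
```

After flattening, both branches are 2D, so the concatenation along the last axis yields 11 + 53 = 64 features per sample.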
I'm trying to train a model that takes in two inputs, concatenates them, and feeds the result into an LSTM. The last layer is a Dense() call, and the targets are binary vectors (with more than one 1). The task is classification.
My input sequences are 50 rows of 23 timesteps with 5625 features (x_train), and my supplementary input (not really a sequence) is 50 one-hot rows of length 23 (total_hours)
The error I'm getting is:
ValueError: Error when checking target: expected dense_1 to have shape (1, 5625)
but got array with shape (5625, 1)
And my code is:
import numpy as np
from keras.layers import LSTM, Dense, Input, Concatenate
from keras.models import Model
#CREATING DUMMY INPUT
hours_input_1 = np.eye(23)
hours_input_2 = np.eye(23)
hours_input_3 = np.pad(np.eye(4), pad_width=((0, 19), (0, 19)), mode='constant')
hours_input_3 = hours_input_3[:4,]
total_hours = np.vstack((hours_input_1, hours_input_2, hours_input_3))
seq_input = np.random.normal(size=(50, 24, 5625))
y_train = np.array([seq_input[i, -1, :] for i in range(50)])
x_train = np.array([seq_input[i, :-1, :] for i in range(50)])
#print 'total_hours', total_hours.shape #(50, 23)
#print 'x_train', x_train.shape #(50, 23, 5625)
#print 'y_train shape', y_train.shape #(50, 5625)
#MODEL DEFINITION
seq_model_in = Input(shape=(1,), batch_shape=(1, 1, 5625))
hours_model_in = Input(shape=(1,), batch_shape=(1, 1, 1))
merged = Concatenate(axis=-1)([seq_model_in, hours_model_in])
#print merged.shape #(1, 1, 5626) = added the 'hour' on as an extra feature
merged_lstm = LSTM(10, batch_input_shape=(1, 1, 5625), return_sequences=False, stateful=True)(merged)
merged_dense = Dense(5625, activation='sigmoid')(merged_lstm)
model = Model(inputs=[seq_model_in, hours_model_in], outputs=merged_dense)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
#TRAINING
for epoch in range(10):
    for i in range(50):
        y_true = y_train[i,:]
        for j in range(23):
            input_1 = np.expand_dims(np.expand_dims(x_train[i][j], axis=1), axis=1)
            input_1 = np.reshape(input_1, (1, 1, x_train.shape[2]))
            input_2 = np.expand_dims(np.expand_dims(np.array([total_hours[i][j]]), axis=1), axis=1)
            tr_loss, tr_acc = model.train_on_batch([input_1, input_2], y_true)#np.array([y_true]))
        model.reset_states()
My model.summary() looks like this:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (1, 1, 5625) 0
__________________________________________________________________________________________________
input_2 (InputLayer) (1, 1, 1) 0
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (1, 1, 5626) 0 input_1[0][0]
input_2[0][0]
__________________________________________________________________________________________________
lstm_1 (LSTM) (1, 10) 225480 concatenate_1[0][0]
__________________________________________________________________________________________________
dense_1 (Dense) (1, 5625) 61875 lstm_1[0][0]
==================================================================================================
Total params: 287,355
Trainable params: 287,355
Non-trainable params: 0
__________________________________________________________________________________________________
I am working with Keras version 2.1.2 with the TensorFlow backend (TensorFlow version 1.4.0). How can I resolve the ValueError?
It turns out I needed to address the target, as the ValueError implied.
If you replace:
y_true = y_train[i,:]
with:
y_true_1 = np.expand_dims(y_train[i,:], axis=1)
y_true = np.swapaxes(y_true_1, 0, 1)
The code runs.
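The effect of those two calls on the target's shape can be traced with NumPy. The random y_train below is a stand-in with the same shape as in the question, and the two shorter variants at the end are equivalent alternatives rather than part of the original fix.

```python
import numpy as np

y_train = np.random.default_rng(0).normal(size=(50, 5625))
i = 0

y_row = y_train[i, :]                    # shape (5625,)
y_col = np.expand_dims(y_row, axis=1)    # shape (5625, 1)
y_true = np.swapaxes(y_col, 0, 1)        # shape (1, 5625)

# Shorter equivalents that give the same (1, 5625) array.
y_slice = y_train[i:i + 1]
y_reshaped = y_row.reshape(1, -1)
print(y_row.shape, y_true.shape, y_slice.shape, y_reshaped.shape)
```

A 1D target of length 5625 appears to be read as 5625 samples of one value each, whereas the model with a batch size of 1 expects a single sample with 5625 values, hence the need for a leading batch axis.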
I ran into exactly the same problem. I also followed the guide from #td2014, but I still get an error. My input shape is (24443, 124, 30), and my LSTM layer is set up as follows:
model = Sequential()
model.add(LSTM(4, input_shape = (1, 30), return_sequences = True))
model.add(Dense(1))
model.add(Activation('softmax'))
model.compile(loss = 'sparse_categorical_crossentropy', optimizer = 'adam')
model.fit(X_train, y_train, epochs = 1, batch_size = 124, verbose = 2)
The error I get is "Error when checking input : expected lstm_4_input have shape (None, 1, 30) but got array with shape (24443, 124, 30)"
Do you have some suggestions for that?