I'm trying to train a model that takes in two inputs, concatenates them, and feeds the result into an LSTM. The last layer is a Dense() call, and the targets are binary vectors (with more than one 1). The task is classification.
My input sequences are 50 rows of 23 timesteps with 5625 features (x_train), and my supplementary input (not really a sequence) is 50 one-hot rows of length 23 (total_hours)
The error I'm getting is:
ValueError: Error when checking target: expected dense_1 to have shape (1, 5625)
but got array with shape (5625, 1)
And my code is:
import numpy as np
from keras.layers import LSTM, Dense, Input, Concatenate
from keras.models import Model
#CREATING DUMMY INPUT
hours_input_1 = np.eye(23)
hours_input_2 = np.eye(23)
hours_input_3 = np.pad(np.eye(4), pad_width=((0, 19), (0, 19)), mode='constant')
hours_input_3 = hours_input_3[:4,]
total_hours = np.vstack((hours_input_1, hours_input_2, hours_input_3))
seq_input = np.random.normal(size=(50, 24, 5625))
y_train = np.array([seq_input[i, -1, :] for i in range(50)])
x_train = np.array([seq_input[i, :-1, :] for i in range(50)])
#print 'total_hours', total_hours.shape #(50, 23)
#print 'x_train', x_train.shape #(50, 23, 5625)
#print 'y_train shape', y_train.shape #(50, 5625)
#MODEL DEFINITION
seq_model_in = Input(shape=(1,), batch_shape=(1, 1, 5625))
hours_model_in = Input(shape=(1,), batch_shape=(1, 1, 1))
merged = Concatenate(axis=-1)([seq_model_in, hours_model_in])
#print merged.shape #(1, 1, 5626) = added the 'hour' on as an extra feature
merged_lstm = LSTM(10, batch_input_shape=(1, 1, 5625), return_sequences=False, stateful=True)(merged)
merged_dense = Dense(5625, activation='sigmoid')(merged_lstm)
model = Model(inputs=[seq_model_in, hours_model_in], outputs=merged_dense)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
#TRAINING
for epoch in range(10):
for i in range(50):
y_true = y_train[i,:]
for j in range(23):
input_1 = np.expand_dims(np.expand_dims(x_train[i][j], axis=1), axis=1)
input_1 = np.reshape(input_1, (1, 1, x_train.shape[2]))
input_2 = np.expand_dims(np.expand_dims(np.array([total_hours[i][j]]), axis=1), axis=1)
tr_loss, tr_acc = model.train_on_batch([input_1, input_2], y_true)#np.array([y_true]))
model.reset_states()
My model.summary() looks like this:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (1, 1, 5625) 0
__________________________________________________________________________________________________
input_2 (InputLayer) (1, 1, 1) 0
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (1, 1, 5626) 0 input_1[0][0]
input_2[0][0]
__________________________________________________________________________________________________
lstm_1 (LSTM) (1, 10) 225480 concatenate_1[0][0]
__________________________________________________________________________________________________
dense_1 (Dense) (1, 5625) 61875 lstm_1[0][0]
==================================================================================================
Total params: 287,355
Trainable params: 287,355
Non-trainable params: 0
__________________________________________________________________________________________________
I am working with Keras version 2.1.2 with the TensorFlow backend (TensorFlow version 1.4.0. How can I resolve the ValueError?
It turns out I needed to address the target, as the ValueError implied.
If you replace:
y_true = y_train[i,:]
with:
y_true_1 = np.expand_dims(y_train[i,:], axis=1)
y_true = np.swapaxes(y_true_1, 0, 1)
The code runs.
Related
I am trying to create a Scaled Dot Attention model using keras utility layer collection found at https://github.com/zimmerrol/keras-utility-layer-collection. I am using the default model (code below) provided in the examples notebook in the github repo. My problem is I get the following error :
ValueError: Input 0 is incompatible with layer model_1: expected
shape=(None, 500, 3), found shape=(500, 3)
The code is below
# input: time series with 500 steps
# each step has a 3dim valuethe output sequences
# of a LSTM, RNN, etc.
net_input = Input(shape=(500, 3))
net = TimeDistributed(Dense(3))(net_input)
# queries
net_q = TimeDistributed(Dense(3))(net_input)
# values
net_v = TimeDistributed(Dense(3))(net_input)
# keys
net_k = TimeDistributed(Dense(3))(net_input)
# add one ScaledDotProductAttention layer
net = attention.ScaledDotProductAttention(name="attention", return_attention=False)([net_q, net_v, net_k])
net_output = TimeDistributed(Dense(2))(net)
model = Model(inputs=net_input, outputs=net_output)
model.summary()
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics='accuracy')
# dummy data
# x = np.random.rand(64, 500, 3)
# y = np.random.rand(64, 500, 2)
# X, y = make_multilabel_classification(n_samples=500, n_features=3, n_classes=3, n_labels=2, random_state=1)
X, y = make_classification(n_samples=500, n_features=3, n_redundant=0, n_repeated=0, n_informative=3,
n_classes=3, random_state=1)
print(X.shape, y.shape)
model.fit(X, y, batch_size=500, epochs=1)
EDIT :
The summary of the model is also added below
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_2 (InputLayer) [(None, 500, 3)] 0
__________________________________________________________________________________________________
time_distributed_5 (TimeDistrib (None, 500, 3) 12 input_2[0][0]
__________________________________________________________________________________________________
time_distributed_6 (TimeDistrib (None, 500, 3) 12 input_2[0][0]
__________________________________________________________________________________________________
time_distributed_7 (TimeDistrib (None, 500, 3) 12 input_2[0][0]
__________________________________________________________________________________________________
attention (ScaledDotProductAtte (None, 500, 3) 0 time_distributed_5[0][0]
time_distributed_6[0][0]
time_distributed_7[0][0]
__________________________________________________________________________________________________
time_distributed_8 (TimeDistrib (None, 500, 2) 8 attention[0][0]
==================================================================================================
Total params: 44
Trainable params: 44
Non-trainable params: 0
__________________________________________________________________________________________________
Any help would be appreciated. Thanks!
I'm working on a basic RNN model for a multiclass task and I'm facing some issues with output dimensions.
This is my input/output shapes:
input.shape = (50000, 2, 5) # (samples, features, feature_len)
output.shape = (50000, 17, 185) # (samples, features, feature_len) <-- one hot encoded
input[0].shape = (2, 5)
output[0].shape = (17, 185)
This is my model, using Keras functional API:
inp = tf.keras.Input(shape=(2, 5,))
x = tf.keras.layers.LSTM(128, input_shape=(2, 5,), return_sequences=True, activation='relu')(inp)
out = tf.keras.layers.Dense(185, activation='softmax')(x)
model = tf.keras.models.Model(inputs=inp, outputs=out)
This is my model.summary():
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 2, 5)] 0
_________________________________________________________________
lstm (LSTM) (None, 2, 128) 68608
_________________________________________________________________
dense (Dense) (None, 2, 185) 23865
=================================================================
Total params: 92,473
Trainable params: 92,473
Non-trainable params: 0
_________________________________________________________________
Then I compile the model and run fit():
model.compile(optimizer='adam',
loss=tf.nn.softmax_cross_entropy_with_logits,
metrics='accuracy')
model.fit(x=input, y=output, epochs=5)
And I'm getting a dimension error:
ValueError: Dimensions must be equal, but are 17 and 2 for '{{node Equal}} = Equal[T=DT_INT64, incompatible_shape_error=true](ArgMax, ArgMax_1)' with input shapes: [?,17], [?,2].
The error is clear, the model output a dimension 2 and my output has dimension 17, although I understand the issue, I can't find a way of fixing it, any ideas?
I think your output shape is not "output[0].shape = (17, 185)" but "dense (Dense) (None, 2, 185) ".
You need to change your output shape or change your layer structure.
LSTM output is a list of encoder_outputs, when you specify return_sequences=True. hence; I suggest just using the last item of encoder_outputs as the input of your Dense layer. you can see the example section of this link to the documentation. It may help you.
I'm trying to train a multilabel classification from text input.
I first tokenize the text
tokenizer = Tokenizer(num_words=max_words)
tokenizer.fit_on_texts(df['text'])
sequences = tokenizer.texts_to_sequences(df['text'])
data = pad_sequences(sequences, maxlen=maxlen)
getting the following shape:
Shape of data tensor: (1333, 100) Shape of label tensor: (1333,)
Then I split in train and validations
x_train = data[:training_samples]
y_train = labels[:training_samples]
x_val = data[training_samples: training_samples + validation_samples]
y_val = labels[training_samples: training_samples + validation_samples]
I use Glove for word representations
embeddings_index = {}
f = open(os.path.join(glove_dir, 'glove.6B.100d.txt'))
for line in f:
values = line.split()
word = values[0]
coefs = np.asarray(values[1:], dtype='float32')
embeddings_index[word] = coefs
f.close()
embedding_dim = 100
embedding_matrix = np.zeros((max_words, embedding_dim))
for word, i in word_index.items():
if i < max_words:
embedding_vector = embeddings_index.get(word)
if embedding_vector is not None:
embedding_matrix[i] = embedding_vector
I build the Keras model
model = Sequential()
model.add(Embedding(max_words, embedding_dim, input_length=maxlen))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(16, activation='softmax'))
model.summary()
Ending up with
Model: "sequential_32"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_27 (Embedding) (None, 100, 100) 1000000
_________________________________________________________________
flatten_21 (Flatten) (None, 10000) 0
_________________________________________________________________
dense_56 (Dense) (None, 64) 640064
_________________________________________________________________
dense_57 (Dense) (None, 16) 1040
=================================================================
Total params: 1,641,104
Trainable params: 1,641,104
Non-trainable params: 0
I set the weigth of the emedding layer:
model.layers[0].set_weights([embedding_matrix])
model.layers[0].trainable = False
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['categorical_accuracy'])
history = model.fit(x_train, y_train, epochs=10, batch_size=32, validation_data=(x_val, y_val))
But I end up with the error
ValueError: Shapes (None, 1) and (None, 16) are incompatible
Everything works right if I do a single-label classification (using Dense(1) as last layer and sigmoid activation), but I can't understand why this is happening.
You should encode your labels into one-hot format if you use categorical_crossentropy.
Otherwise you can try with sparse_categorical_crossentropy as loss function which accept your format of labels (info).
https://stats.stackexchange.com/questions/326065/cross-entropy-vs-sparse-cross-entropy-when-to-use-one-over-the-other
I have a model that was kinda working on some data. I've added in some tokenized word data in the dataset (somewhat truncated for brevity):
vocab_size = len(tokenizer.word_index) + 1
comment_texts = df.comment_text.values
tokenizer = Tokenizer(num_words=num_words)
tokenizer.fit_on_texts(comment_texts)
comment_seq = tokenizer.texts_to_sequences(comment_texts)
maxtrainlen = max_length(comment_seq)
comment_train = pad_sequences(comment_seq, maxlen=maxtrainlen, padding='post')
vocab_size = len(tokenizer.word_index) + 1
df.comment_text = comment_train
x = df.drop('label', 1) # the thing I'm training
labels = df['label'].values # Also known as Y
x_train, x_test, y_train, y_test = train_test_split(
x, labels, test_size=0.2, random_state=1337)
n_cols = x_train.shape[1]
embedding_dim = 100 # TODO: why?
model = Sequential([
Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_shape=(n_cols,)),
LSTM(32),
Dense(32, activation='relu'),
Dense(512, activation='relu'),
Dense(12, activation='softmax'), # for an unknown type, we don't account for that while training
])
model.summary()
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['acc'])
# convert the y_train to a one hot encoded variable
encoder = LabelEncoder()
encoder.fit(labels) # fit on all the labels
encoded_Y = encoder.transform(y_train) # encode on y_train
one_hot_y = np_utils.to_categorical(encoded_Y)
model.fit(x_train, one_hot_y, epochs=10, batch_size=16)
Now, I get this error:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 12, 100) 4040500
_________________________________________________________________
lstm (LSTM) (None, 32) 17024
_________________________________________________________________
dense (Dense) (None, 32) 1056
_________________________________________________________________
dense_1 (Dense) (None, 512) 16896
_________________________________________________________________
dense_2 (Dense) (None, 12) 6156
=================================================================
Total params: 4,081,632
Trainable params: 4,081,632
Non-trainable params: 0
_________________________________________________________________
Train on 4702 samples
Epoch 1/10
2020-03-04 22:37:59.499238: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: indices[0,0] = -4 is not in [0, 40405)
I think this must be coming from my comment_text column since that is the only thing I added.
Here is what comment_text looks like before I make the substitution:
And here is after:
My full code (before I made the change) is here:
https://colab.research.google.com/drive/1y8Lhxa_DROZg0at3VR98fi5WCcunUhyc#scrollTo=hpEoqR4ne9TO
You should be training with comment_train, not with x which is taking whatever is in the unknown df.
The embedding_dim=100 is free to choose. It's like the number of units in a hidden layer. You can tune this parameter to find which is best for your model as well as you can tune the number of units in hidden layers.
In your case, you will need a model with two or more inputs:
One input for the comments, passing through the embedding and processing text
Another input for the rest of the data, passing probably through a standard netork.
At some point you will concatenate these two branches and keep on going.
This link has a good tutorial about the functional API models and shows a model that has two text inputs and an extra input: https://www.tensorflow.org/guide/keras/functional
I have a list of dense layers with same output shapes [batch, 1]. If I combine the outputs of these layers with keras.layers.concatenate(), what would the shape be?
dense_layers = [Dense(1), Dense(1), Dense(1)] #some dense layers
merged_output = keras.layers.concatenate([dense_layers])
Would the shape of merged_output be (batch, 3) or(3, 1)?
The answer is (batch,3). To see this, you can build a model and print model.summary():
from keras.layers import Input, Dense
from keras.models import Model
from keras.layers import concatenate
batch = 30
# define three sets of inputs
input1 = Input(shape=(batch,1))
input2 = Input(shape=(batch,1))
input3 = Input(shape=(batch,1))
# define three dense layers
layer1 = Dense(1)(input1)
layer2 = Dense(1)(input2)
layer3 = Dense(1)(input3)
# concatenate layers
dense_layers = [layer1, layer2, layer3]
merged_output = concatenate(dense_layers)
# create a model and check for output shape
model = Model(inputs=[input1, input2, input3], outputs=merged_output)
model.summary()
Layer (type) Output Shape Param # Connected to
=============================================================================
input_1 (InputLayer) (None, 30, 1) 0
_______________________________________________________________________________
input_2 (InputLayer) (None, 30, 1) 0
_______________________________________________________________________________
input_3 (InputLayer) (None, 30, 1) 0
_______________________________________________________________________________
dense_1 (Dense) (None, 30, 1) 2 input_1[0][0]
_______________________________________________________________________________
dense_2 (Dense) (None, 30, 1) 2 input_2[0][0]
_______________________________________________________________________________
dense_3 (Dense) (None, 30, 1) 2 input_3[0][0]
_______________________________________________________________________________
concatenate_1 (Concatenate) (None, 30, 3) 0 dense_1[0][0]
dense_2[0][0]
dense_3[0][0]
==============================================================================
Total params: 6
Trainable params: 6
Non-trainable params: 0
______________________________________________________________________________