I need outputs at every recurrent layer and my setup is as follows:
100 training examples, 3 time steps per example, and 20-d feature vector for each individual element.
x_train: (100,3,20)
y_train: (100,20)
LSTM architecture:
model = Sequential()
model.add(LSTM(20, input_shape=(3,20), return_sequences=True))
model.compile(loss='mean_absolute_error', optimizer='adam', metrics=['accuracy'])
model.summary()
Training:
history = model.fit(x_train, y_train, epochs=50, validation_data=(x_test, y_test))
Error:
ValueError: Dimensions must be equal, but are 20 and 3 for '{{node Equal}} = Equal[T=DT_FLOAT, incompatible_shape_error=true](IteratorGetNext:1, Cast_1)' with input shapes: [?,20], [?,3].
Please help me with the correct input/output LSTM dimensions.
Thanks
LSTM(20, input_shape=(3,20), return_sequences=True) takes an input of shape (100,3,20) and returns an output of shape (100,3,20), one 20-d vector per time step. Your target, however, has shape (100,20).
From the dimensions, I assume you want to map each sequence to a single vector, in which case you can do:
LSTM(20, input_shape=(3,20), return_sequences=False)
This returns only the final hidden state, with shape (100,20), which matches your target output.
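To see the two shapes side by side, here is a minimal sketch. It assumes TensorFlow's bundled Keras and uses random stand-in data with the shapes from the question; the names seq_model and last_model are mine, not from the original post.

```python
import numpy as np
from tensorflow import keras

# Random stand-in data with the shapes from the question (not the real dataset).
x_train = np.random.rand(100, 3, 20).astype("float32")

# return_sequences=True: one 20-d output per time step.
seq_model = keras.Sequential([
    keras.Input(shape=(3, 20)),
    keras.layers.LSTM(20, return_sequences=True),
])

# return_sequences=False: only the output at the last time step.
last_model = keras.Sequential([
    keras.Input(shape=(3, 20)),
    keras.layers.LSTM(20, return_sequences=False),
])

seq_shape = seq_model.predict(x_train, verbose=0).shape
last_shape = last_model.predict(x_train, verbose=0).shape
print(seq_shape, last_shape)
```

Here seq_shape is (100, 3, 20) and last_shape is (100, 20); only the second lines up with a y_train of shape (100, 20).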
I am trying to create a simple 3 class deep learning classifier using keras as follows:
clf = Sequential()
clf.add(Dense(20, activation='relu', input_dim=NUM_OF_FEATURES))
clf.add(Dense(10, activation='relu'))
clf.add(Dense(3, activation='relu'))
clf.add(Dense(1, activation='softmax'))
# Model Compilation
clf.compile(optimizer = 'adam',
loss = 'categorical_crossentropy',
metrics = ['accuracy'])
# Training the model
clf.fit(X_train,
y_train,
epochs=10,
batch_size=16,
validation_data=(X_val, y_val))
However, after training, the model ALWAYS predicts the same class (class 1) out of the 3 classes.
Is my network architecture not correct?
I am new to deep learning and AI.
If you want a network to classify three classes, your last dense layer should have three output nodes. In the example, the last dense layer has one output node.
clf = Sequential()
clf.add(Dense(20, activation='relu', input_dim=NUM_OF_FEATURES))
clf.add(Dense(10, activation='relu'))
clf.add(Dense(3, activation='relu'))
clf.add(Dense(3, activation='softmax'))
For each input sample, the output will be three values, all of which sum to one. These represent the probabilities that the input belongs to each class.
Regarding the loss function, if you want to use cross entropy, you have a choice between sparse categorical cross entropy and categorical cross entropy. The latter expects ground truth labels to be one-hot encoded (you can use tf.one_hot for this). In other words, the shape of the labels is the same as the shape of the network's output. Sparse categorical cross entropy, on the other hand, expects labels with rank N-1, where N is the rank of the neural network's output. In other words, these are the labels before one-hot encoding.
When the model is used for inference, the predicted class values can be retrieved with argmax of the last dimension of the predictions.
predictions = clf.predict(x)
classes = predictions.argmax(-1)
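As a rough illustration of the two label formats and of the argmax step, here is a NumPy-only sketch. The labels and the softmax outputs are made up for the example, and the losses are computed by hand rather than through Keras.

```python
import numpy as np

# Hypothetical integer labels for 4 samples and 3 classes (not from the original post).
sparse_labels = np.array([0, 2, 1, 2])        # shape (4,), rank N-1
one_hot_labels = np.eye(3)[sparse_labels]     # shape (4, 3), same shape as the network output

# Made-up softmax outputs; each row sums to one.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.4],
])

# Categorical cross entropy multiplies the one-hot rows with the log-probabilities.
cce = -np.sum(one_hot_labels * np.log(probs), axis=-1)

# Sparse categorical cross entropy looks up the probability of the true class directly.
scce = -np.log(probs[np.arange(len(sparse_labels)), sparse_labels])

# Both formulations give the same per-sample loss.
assert np.allclose(cce, scce)

# argmax over the last axis recovers the predicted class index per sample.
predicted = probs.argmax(-1)
```

The two losses differ only in how the labels are encoded, so the choice comes down to which format your y_train already has.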
"ValueError: Error when checking target: expected activation_81 to have shape (1,) but got array with shape (7,)"
I am performing a multiclass classification of 7 classes for speech emotion classification using a neural network, but it fails at this point:
cnnhistory=model.fit(x_traincnn,
y_train,
batch_size=16,
epochs=700,
validation_data=(x_testcnn, y_test),
callbacks=[mcp_save, lr_reduce])
at the line callbacks=[mcp_save, lr_reduce]
mcp_save being
mcp_save = ModelCheckpoint('model/aug_noiseNshift_2class2_np.h5',
save_best_only=True, monitor='val_loss', mode='min')
and lr_reduce being
lr_reduce = ReduceLROnPlateau(monitor='val_loss', factor=0.9, patience=20, min_lr=0.000001)
Final layer of NN
Dense(7) for 7 classes
model.add(Dense(7))
model.add(Activation('softmax'))
opt = keras.optimizers.SGD(lr=0.0001, momentum=0.0, decay=0.0, nesterov=False)
compiled model using
model.compile(loss='sparse_categorical_crossentropy', optimizer=opt, metrics=['accuracy', fscore])
I have already transformed the dataset to normalised values and changed the loss function from 'categorical_crossentropy' to 'sparse_categorical_crossentropy'. Nothing has worked; it has just pushed the error from activation_9 to activation_18 to activation_45 to activation_54 and now to activation_81. But the error is still there.
Any help would be highly appreciated!
I am new to neural networks.
TIA
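If your labels are integer class indices, y_train has shape (samples,) or (samples, 1) and you should use 'sparse_categorical_crossentropy'.
If your labels are one-hot encoded, y_train has shape (samples, 7) and you should use 'categorical_crossentropy'. The error message reports a target of shape (7,), which suggests your labels are one-hot encoded while the loss is the sparse variant.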
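A small sketch of that rule, using NumPy only. The helper suggest_loss is hypothetical (it is not part of Keras), and the labels are made up; it just inspects the label shape and returns the matching loss name.

```python
import numpy as np

def suggest_loss(y, num_classes):
    """Hypothetical helper: guess which Keras loss name fits the label array."""
    y = np.asarray(y)
    if y.ndim == 2 and y.shape[1] == num_classes:
        return "categorical_crossentropy"         # one-hot rows, shape (samples, num_classes)
    if y.ndim == 1 or (y.ndim == 2 and y.shape[1] == 1):
        return "sparse_categorical_crossentropy"  # integer class indices
    raise ValueError(f"unexpected label shape {y.shape}")

# Made-up labels for 5 samples and 7 classes.
y_int = np.array([3, 0, 6, 2, 2])
y_onehot = np.eye(7)[y_int]

print(suggest_loss(y_int, 7))
print(suggest_loss(y_onehot, 7))

# One-hot labels can be converted back to integers if the sparse loss is preferred.
y_back = y_onehot.argmax(axis=1)
```

Either changing the loss to match the labels or converting the labels to match the loss should remove the shape mismatch.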
I have a text dataset that contains 6 classes. For each sample, I have a percent value per class, and the 6 percent values sum to 100% (the values are related to each other). For example:
{A:16, B:35, C:7, D:0, E:3, F:40}
How can I feed a deep learning algorithm with this dataset?
I actually want the prediction to be exactly in the shape of training data.
Here is what you can do:
First of all, normalize your labels to the range 0-1 so that each row sums to 1 (for example, divide each row of percentages by its sum).
Use a softmax layer for prediction.
Here is some code in Keras for intuition:
model = Sequential()
model.add(Dense(100, input_dim = x.shape[1], activation='relu'))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
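The normalization step could look like the following NumPy sketch. The first row is the example from the question, the second row and the pred array are made up, and the variable names are assumptions.

```python
import numpy as np

# Percent labels; the first row is the example from the question, the second is made up.
y_percent = np.array([
    [16, 35, 7, 0, 3, 40],
    [10, 10, 20, 30, 5, 25],
], dtype="float32")

# Divide each row by its own sum so it becomes a probability distribution.
y = y_percent / y_percent.sum(axis=1, keepdims=True)

# Hypothetical softmax output for one sample, converted back to percentages.
pred = np.array([0.15, 0.35, 0.08, 0.01, 0.03, 0.38])
pred_percent = 100 * pred
```

Dividing by the row sum rather than by a fixed 100 keeps every row a valid distribution even when rounding is off; the example row adds up to 101.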
I am trying to find the label associated with each word in annotated text. I am using a bidirectional LSTM. I have X_train with shape (1676, 39) and Y_train with the same shape (1676, 39).
input = Input(shape=(sequence_length,))
model = Embedding(input_dim=n_words, output_dim=20,
input_length=sequence_length, mask_zero=True)(input)
model = Bidirectional(LSTM(units=50, return_sequences=True,
recurrent_dropout=0.1))(model)
out_model = TimeDistributed(Dense(50, activation="softmax"))(model)
model = Model(input, out_model)
model.compile(optimizer="rmsprop", loss= "categorical_crossentropy", metrics=["accuracy"])
model.fit(X_train, Y_train, batch_size=32, epochs= 10,
validation_split=0.1)
When I run this, I get the following error:
ValueError: Error when checking target: expected time_distributed_5 to have 3 dimensions, but got array with shape (1676, 39).
I cannot figure out how to feed targets with the dimensions the Keras LSTM model expects.
In the LSTM you set return_sequences=True, so the layer outputs a tensor of shape (batch_size, 39, 100): one vector per time step, with 100 features because Bidirectional concatenates the 50 forward and 50 backward units by default. TimeDistributed then applies the Dense layer at each time step, giving an output of shape (batch_size, 39, 50). So the model predicts a 3-dimensional tensor, while your ground truth is 2-dimensional, (1676, 39).
How to fix the issue?
1) Remove return_sequences=True from LSTM args.
2) Remove TimeDistributed layer and apply Dense layer directly.
inps = keras.layers.Input(shape=(39,))
embedding = keras.layers.Embedding(vocab_size, 16)(inps)
rnn = keras.layers.LSTM(50)(embedding)
dense = keras.layers.Dense(50, activation="softmax")(rnn)
prediction = keras.layers.Dense(39, activation='softmax')(dense)
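If one label per word is what is wanted, as the question's Y_train of shape (1676, 39) suggests, another option is to keep return_sequences=True and TimeDistributed and reshape the targets instead. This is only a sketch under assumptions: n_tags = 50 is taken from the Dense(50) layer in the question, and the tag ids are random stand-ins for the real Y_train.

```python
import numpy as np

n_samples, seq_len, n_tags = 1676, 39, 50   # n_tags = 50 is an assumption taken from Dense(50)

# Random integer tag ids standing in for the real Y_train of shape (1676, 39).
rng = np.random.default_rng(0)
Y_train = rng.integers(0, n_tags, size=(n_samples, seq_len))

# Option A: one-hot encode each tag -> (1676, 39, 50), for 'categorical_crossentropy'.
Y_onehot = np.eye(n_tags, dtype="float32")[Y_train]

# Option B: add a trailing axis -> (1676, 39, 1), for 'sparse_categorical_crossentropy'.
Y_sparse = Y_train[..., np.newaxis]

print(Y_onehot.shape, Y_sparse.shape)
```

Either target array has 3 dimensions, which is what the error message says the time-distributed output layer expects.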
I have multiple sequences of varying length. Each has about 9 features. I want to predict the values of all the continuous features at time t+1. The data is in a list of length 2000 (so, 2000 total sequences). How could one do this in Keras?
model = Sequential()
model.add(LSTM(100, input_shape=(None,9)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X, y, epochs=1, batch_size=1, verbose=1)
This is all I really have, but I'm getting some size mismatches. Any suggestions?
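One likely source of the mismatch is that Dense(1) produces a single value per sequence, while predicting all features at t+1 needs a target of 9 values; another is that a list of sequences with different lengths cannot be stacked into one array as it is. The NumPy sketch below shows one way to prepare the data. The sequences are random stand-ins, and the length range and front zero-padding are my assumptions, not something stated in the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the real data: 2000 sequences of varying length, 9 features each.
sequences = [rng.random((rng.integers(5, 15), 9)) for _ in range(2000)]

# For each sequence, use all steps but the last as input and the last step as target.
inputs = [seq[:-1] for seq in sequences]             # each has shape (length - 1, 9)
targets = np.stack([seq[-1] for seq in sequences])   # shape (2000, 9)

# Zero-pad the inputs at the front to a common length so they form one array.
max_len = max(len(x) for x in inputs)
X = np.zeros((len(inputs), max_len, 9), dtype="float32")
for i, x in enumerate(inputs):
    X[i, max_len - len(x):] = x

print(X.shape, targets.shape)
```

With targets of shape (2000, 9), the final layer would need to be Dense(9) rather than Dense(1). A Masking layer in front of the LSTM is one way to have it skip the padded steps.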