I am developing an LSTM model for an NLP problem.
The shape of both my data and my labels is (10, 20, 1).
My model code looks like this:
model.add(Embedding(18, 17, input_length=20, weights=[embedding_weights]))  # embedding_weights has shape (18, 17)
# encoder layer
model.add(LSTM(100, activation='relu', input_shape=(20, 1)))
# repeat vector
model.add(RepeatVector(20))
# decoder layer
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
I am getting the following error:
"input_length" is 20, but received input has shape (None, 20, 1)
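One possible reading of this error, sketched under the assumption that the inputs are integer token ids in the range [0, 18): an Embedding layer expects a 2D batch of shape (samples, 20) and adds the feature axis itself, so the trailing axis of the (10, 20, 1) input would have to be dropped before it reaches the model. The arrays `embedding_weights`, `X`, and `y` below are random placeholders, not the original data, and `embeddings_initializer` is used here instead of `weights=` only as a version-tolerant way to load the matrix.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

vocab_size, embed_dim, timesteps = 18, 17, 20

# Placeholder data standing in for the original arrays (assumption).
rng = np.random.default_rng(0)
embedding_weights = rng.normal(size=(vocab_size, embed_dim)).astype("float32")
X = rng.integers(0, vocab_size, size=(10, timesteps, 1))
y = rng.normal(size=(10, timesteps, 1)).astype("float32")

# Embedding wants (samples, timesteps): drop the trailing axis of size 1.
X_ids = X.reshape(X.shape[0], X.shape[1])

model = keras.Sequential([
    keras.Input(shape=(timesteps,)),
    layers.Embedding(
        vocab_size,
        embed_dim,
        embeddings_initializer=keras.initializers.Constant(embedding_weights),
    ),
    layers.LSTM(100, activation="relu"),                         # encoder: (samples, 100)
    layers.RepeatVector(timesteps),                              # (samples, 20, 100)
    layers.LSTM(100, activation="relu", return_sequences=True),  # decoder
    layers.TimeDistributed(layers.Dense(1)),                     # (samples, 20, 1)
])
model.compile(optimizer="adam", loss="mse")

pred_shape = model.predict(X_ids, verbose=0).shape
```

The labels can stay in their (10, 20, 1) shape, since that matches the TimeDistributed output.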
I'm trying to construct an encoder to get the latent space in order to plot it. I don't really know if I can get it from the RepeatVector or if I have to add a Dense layer.
Here is my code:
model = Sequential()
model.add(LSTM(16, activation='relu', return_sequences=True, input_shape=(X_train.shape[1], 1)))
model.add(LSTM(4, activation='relu', return_sequences=False)) #Encoder
model.add(RepeatVector(X_train.shape[1])) #Latent
model.add(LSTM(4, activation='relu', return_sequences=True)) #Decoder
model.add(LSTM(16, activation='relu', return_sequences=False)) #Decoder
model.add(TimeDistributed(Dense(X_train.shape[2]))) #Decoder
You need to separate the model into two parts (encoder and decoder).
Then build the encoder as its own model from the input and the output of the encoder part.
The encoder output is the output of the last layer before RepeatVector.
encoder = Model(inputs, output_from_encoder)
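A minimal sketch of this split using the Keras functional API, assuming illustrative sizes (`timesteps`, `n_features`, `latent_dim`) and random demo data rather than the asker's dataset. Note that the last decoder LSTM uses return_sequences=True here so that TimeDistributed receives a sequence; that differs from the code in the question.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

timesteps, n_features, latent_dim = 30, 1, 4  # illustrative sizes (assumption)

inputs = keras.Input(shape=(timesteps, n_features))
x = layers.LSTM(16, activation="relu", return_sequences=True)(inputs)
latent = layers.LSTM(latent_dim, activation="relu")(x)  # encoder output
x = layers.RepeatVector(timesteps)(latent)
x = layers.LSTM(latent_dim, activation="relu", return_sequences=True)(x)
x = layers.LSTM(16, activation="relu", return_sequences=True)(x)
outputs = layers.TimeDistributed(layers.Dense(n_features))(x)

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")

# The encoder reuses the autoencoder's layers, so it shares their weights.
encoder = keras.Model(inputs, latent)

X_demo = np.random.rand(8, timesteps, n_features).astype("float32")
codes = encoder.predict(X_demo, verbose=0)  # one latent vector per sequence
reconstruction = autoencoder.predict(X_demo, verbose=0)
```

After training the autoencoder, `encoder.predict` returns the latent vectors, which can then be plotted directly or after a projection to two dimensions.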
I have a pre-trained sequential CNN model which I trained on images of 224x224x3. The following is the architecture:
model = Sequential()
model.add(Conv2D(filters = 64, kernel_size = (5, 5), strides = 1, activation = 'relu', input_shape = (224, 224, 3)))
model.add(MaxPool2D(pool_size = (3, 3)))
model.add(Dropout(0.2))
model.add(Conv2D(filters = 128, kernel_size = (3, 3), strides = 1, activation = 'relu'))
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(0.2))
model.add(Conv2D(filters = 256, kernel_size = (2, 2), strides = 1, activation = 'relu'))
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation = 'relu', use_bias=False))
model.add(Dense(num_classes, activation = 'softmax'))
model.summary()
For reference, here is the model summary: [image: model summary]
I want to retrain this model on images of size 40x40x3. However, I am facing the following error: "ValueError: Input 0 of layer dense_12 is incompatible with the layer: expected axis -1 of input shape to have value 200704 but received input with shape (None, 256)".
What should I do to resolve this error?
Note: I am using TensorFlow version 2.4.1.
The problem is that in your pre-trained model the Flatten layer (fourth line from the end) produces a vector of size 200704, which is the input to the Dense layer with 128 units (third line from the end). If you now feed the same pre-trained model 40x40 images, it will not work. The reasons are:
1- Your model depends on the input image shape. It is not a fully convolutional model: the dense layers after Flatten have a fixed input size, which ties the model to one image size.
2- The flattened size for a 40x40 image after all the conv layers is 256, not 200704.
Solution
1- Either replace the flatten part with an adaptive average pooling layer (in Keras, a global pooling layer such as GlobalAveragePooling2D), after which your last dense layer with softmax is fine. Then retrain your old model on the 224x224 images; following that, you can train on your 40x40 images.
2- Or, the easiest way, use only the subset of your pre-trained model up to the flatten part (excluding the flatten part), and then add a new flatten part with a dense layer and a classification layer (the layer with softmax). For this method you have to write a custom model: the first part is the subset of the pre-trained model, and the flatten and classification parts are new. You can then train the whole model on the new dataset. This method also lets you benefit from transfer learning, by letting gradients flow only through the newly created dense layers and not through the pre-trained layers.
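The second option could look roughly like the sketch below. It assumes a placeholder `num_classes` and rebuilds an untrained stand-in for the pre-trained network; in practice `pretrained` would be the loaded model. Freezing the reused layers is optional and corresponds to the transfer-learning variant mentioned above.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

num_classes = 5  # placeholder value (assumption)

# Stand-in for the pre-trained 224x224 network (untrained here).
pretrained = keras.Sequential([
    keras.Input(shape=(224, 224, 3)),
    layers.Conv2D(64, (5, 5), activation="relu"),
    layers.MaxPool2D((3, 3)),
    layers.Dropout(0.2),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPool2D((2, 2)),
    layers.Dropout(0.2),
    layers.Conv2D(256, (2, 2), activation="relu"),
    layers.MaxPool2D((2, 2)),
    layers.Dropout(0.2),
    layers.Flatten(),
    layers.Dense(128, activation="relu", use_bias=False),
    layers.Dense(num_classes, activation="softmax"),
])

# Reuse every layer before Flatten on a new 40x40 input.
new_inputs = keras.Input(shape=(40, 40, 3))
x = new_inputs
for layer in pretrained.layers:
    if isinstance(layer, layers.Flatten):
        break
    layer.trainable = False  # optional: keep the pre-trained features fixed
    x = layer(x)

# New head whose input size is derived from the 40x40 feature maps.
x = layers.Flatten()(x)
x = layers.Dense(128, activation="relu")(x)
outputs = layers.Dense(num_classes, activation="softmax")(x)

new_model = keras.Model(new_inputs, outputs)
new_model.compile(optimizer="adam", loss="categorical_crossentropy")

probs = new_model.predict(np.zeros((2, 40, 40, 3), dtype="float32"), verbose=0)
```

Swapping the new Flatten for GlobalAveragePooling2D would make the head independent of the image size, which is the idea behind the first option.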
I am trying to train an LSTM model. I have reshaped my input data to (1000, 96, 1) and my output data to (1000, 24, 1), which means I want to predict the next 24 values from the previous 96.
When I add a TimeDistributed Dense layer as the last layer, I get an error:
ValueError: Error when checking target: expected time_distributed_1 to have shape (96, 1) but got array with shape (24, 1)
So what's wrong?
Here is my code:
modelA.add(LSTM(units=64, return_sequences=True,
                input_shape=[xProTrain_3D.shape[1], xProTrain_3D.shape[2]]))
modelA.add(LSTM(units=128, return_sequences=True))
modelA.add(Dropout(0.25))
modelA.add(Dropout(0.25))
modelA.add(LSTM(units=256, return_sequences=True))
modelA.add(Dropout(0.25))
modelA.add(LSTM(units=128, return_sequences=True))
modelA.add(Dropout(0.25))
modelA.add(LSTM(units=64, return_sequences=True))
modelA.add(Dropout(0.25))
modelA.add(TimeDistributed(Dense(units=1, activation='relu', input_shape=(24, 1))))
modelA.compile(optimizer='Adam',
               loss='mse',
               metrics=['mse'])
modelA.summary()
modelA.fit(x=xProTrain_3D, y=yProTrain_3D, epochs=epoch, batch_size=batch_size)
By the way, the input shape is (1000, 96, 1) and output shape is (1000, 24, 1)
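A likely explanation, offered as a sketch rather than a definitive diagnosis: every LSTM above uses return_sequences=True, so all 96 time steps are carried through to the TimeDistributed layer, whose output is therefore (96, 1) while the target is (24, 1). One common way to change the sequence length is an encoder-decoder layout with RepeatVector. The layer sizes and random demo arrays below are assumptions, not the original setup.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_in, n_out, n_features = 96, 24, 1

model = keras.Sequential([
    keras.Input(shape=(n_in, n_features)),
    layers.LSTM(64),                          # encoder: (samples, 64)
    layers.RepeatVector(n_out),               # (samples, 24, 64)
    layers.LSTM(64, return_sequences=True),   # decoder over 24 steps
    layers.TimeDistributed(layers.Dense(1)),  # (samples, 24, 1)
])
model.compile(optimizer="adam", loss="mse", metrics=["mse"])

# Random stand-ins for xProTrain_3D and yProTrain_3D (assumption).
x_demo = np.random.rand(16, n_in, n_features).astype("float32")
y_demo = np.random.rand(16, n_out, 1).astype("float32")

model.fit(x_demo, y_demo, epochs=1, batch_size=8, verbose=0)
pred = model.predict(x_demo, verbose=0)
```

The encoder compresses the 96 input steps into one vector, and RepeatVector(24) feeds that vector to the decoder once per output step, so the output length no longer depends on the input length.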
The code below implements an architecture, run inside a grid search, that finds appropriate values for the input length, number of nodes, epochs, batch size, and differencing order of the time series input.
The challenge is to convert the neural network from having a single LSTM hidden layer to having multiple LSTM hidden layers.
At the moment, the code only runs without errors when the extra hidden layers are Dense; with LSTM hidden layers I get dimension errors, tuple errors, and so on.
The problem is confined to the neural network architecture section.
Original code that works:
def model_fit(train, config):
    # unpack config
    n_input, n_nodes, n_epochs, n_batch, n_diff = config
    # Data
    if n_diff > 0:
        train = difference(train, n_diff)
    # Time series to supervised format
    data = series_to_supervised(train, n_in=n_input)
    train_x, train_y = data[:, :-1], data[:, -1]
    # Reshaping input data into [samples, timesteps, features]
    n_features = 1
    train_x = train_x.reshape((train_x.shape[0], train_x.shape[1], n_features))
    # Define model for (Grid search architecture)
    model = Sequential()
    model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features)))
    model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(1))
    # Compile model (Grid search architecture)
    model.compile(loss='mse', optimizer='adam')
    # fit model
    model.fit(train_x, train_y, epochs=n_epochs, batch_size=n_batch, verbose=0)
    return model
Modified LSTM-hidden layer code, that fails to run:
# Define model for (Grid search architecture)
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features), return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
Another variant that also threw an error - ValueError: Error when checking target: expected time_distributed_4 to have 3 dimensions, but got array with shape (34844, 1)
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features), return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=False))
model.add(RepeatVector(n_input))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))
Could anyone please suggest a fix?
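Try setting return_sequences=False on the last LSTM layer, and follow it with a plain Dense(1) instead of TimeDistributed(Dense(1)), so that the model output is 2D like your target.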
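A short sketch of that suggestion, assuming illustrative values for `n_input`, `n_features`, and `n_nodes` and random arrays in place of the real series: the intermediate LSTM layers keep return_sequences=True so the next LSTM receives a sequence, while the last one returns only its final state.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_input, n_features, n_nodes = 12, 1, 8  # illustrative values (assumption)

model = keras.Sequential([
    keras.Input(shape=(n_input, n_features)),
    layers.LSTM(n_nodes, activation="relu", return_sequences=True),
    layers.LSTM(n_nodes, activation="relu", return_sequences=True),
    layers.LSTM(n_nodes, activation="relu"),  # last LSTM: return_sequences=False
    layers.Dense(1),                          # one value per sample
])
model.compile(loss="mse", optimizer="adam")

# Random stand-ins for train_x and train_y (assumption).
train_x = np.random.rand(32, n_input, n_features).astype("float32")
train_y = np.random.rand(32, 1).astype("float32")

model.fit(train_x, train_y, epochs=1, batch_size=16, verbose=0)
pred = model.predict(train_x, verbose=0)
```

With this layout the output has shape (samples, 1), which matches a target built from `data[:, -1]`.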
I'm trying to use max pooling as the first layer in Keras, and I have a problem with the input and output dimensions.
print(x_train.shape)
print(y_train.shape)
(15662, 6)
(15662,)
x_train = np.reshape(x_train, (-1,15662, 6))
y_train = label_array.reshape(1, -1)
model = Sequential()
model.add(MaxPooling1D(pool_size = 2 , strides=1, input_shape = (15662,6)))
model.add(Dense(5, activation='relu'))
model.add(Flatten())
model.add(Dense(1, activation='softmax'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=
['accuracy'])
model.fit(x_train, y_train, batch_size= 32, epochs=1)
After running the model, I get the following error:
ValueError: Error when checking target: expected dense_622 (last layer)
to have shape (1,) but got array with shape (15662,)
I'm doing classification and my target is binary (0,1)
Thank you
Your target should have shape (batch_size, 1) but you are passing an array of shape (1, 15662). It seems like 15662 should be the batch size, in which case x_train should have shape (15662, 6) and y_train should have shape (15662, 1). In this case however, it doesn't make any sense to have a MaxPooling1D layer as the first layer of your model since max pooling requires a 3D input (i.e. shape (batch_size, time_steps, features)). You probably want to leave out the max pooling layer (and the Flatten layer). The following code should work:
# x_train: (15662, 6)
# y_train: (15662,)
model = Sequential()
model.add(Dense(5, activation='relu', input_shape=(6,))) # Note: don't specify the batch size in input_shape
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size= 32, epochs=1)
Of course, whether this is the right architecture depends on what kind of data you have.