I am trying to find out label associated with word from annotated text. I am using a bidirectional LSTM. I have X_train which is having shape (1676, 39) and Y_train with the same shape (1676, 39).
input = Input(shape=(sequence_length,))
model = Embedding(input_dim=n_words, output_dim=20,
input_length=sequence_length, mask_zero=True)(input)
model = Bidirectional(LSTM(units=50, return_sequences=True,
recurrent_dropout=0.1))(model)
out_model = TimeDistributed(Dense(50, activation="softmax"))(model)
model = Model(input, out_model)
model.compile(optimizer="rmsprop", loss= "categorical_crossentropy", metrics=["accuracy"])
model.fit(X_train, Y_train, batch_size=32, epochs= 10,
validation_split=0.1)
While executing this, I am getting error:
ValueError: Error when checking target: expected time_distributed_5 to have 3 dimensions, but got array with shape (1676, 39).
I am not able to find out how to feed proper dimension which is needed by the Keras LSTM model.
In the LSTM you set return_sequences=True, as a result, the outputs of the layer is a Tensor with shape of [batch_size * 39 * 50]. Then you pass this Tensor to TimeDistributed layer. TimeDistributed apply Dense layer on the each time stamp. The outputs of the layer, again is [batch_size * 39 * 50]. As you see, you pass 3 dimension Tensor for prediction, while your ground truth is 2 dimension (1676, 39).
How to fix the issue?
1) Remove return_sequences=True from LSTM args.
2) Remove TimeDistributed layer and apply Dense layer directly.
inps = keras.layers.Input(shape=(39,))
embedding = keras.layers.Embedding(vocab_size, 16)(inps)
rnn = keras.layers.LSTM(50)(embedding)
dense = keras.layers.Dense(50, activation="softmax")(rnn)
prediction = keras.layers.Dense(39, activation='softmax')(dense)
Related
I'm trying to develop as classifier for two classes. I've implemented the model as follows:
model = keras.models.Sequential() #using tensorflow's version of keras
model.add(keras.layers.InputLayer(input_shape = X_train_scaled[:,1].shape))
model.add(keras.layers.Dense(250,activation="relu"))
model.add(keras.layers.Dense(50,activation="relu"))
model.add(keras.layers.Dense(2,activation="softmax"))
model.summary()
# Compile the model
model.compile(loss = 'sparse_categorical_crossentropy',
optimizer = "sgd",
metrics = ["accuracy"])
The size of the inputs are
X_train_scaled[:,1].shape, y_train.shape
((552,), (552,))
The entire error message is:
ValueError: Input 0 of layer sequential_9 is incompatible with the layer:
expected axis -1 of input shape to have value 552 but received input with shape (None, 1)
What am I doing wrong here?
The error message says that you defined a model which expects as input a shape of (batch_size, 552) and you are trying to feed it an array with a shape of (batch_size, 1).
The issue is most likely with
input_shape = X_train_scaled[:,1].shape)
This should most likely be:
input_shape = X_train_scaled.shape[1:]
i.e. you want to define the shape of your model to the the shape of the features (without the number of examples). The model is then fed in mini-batches... e.g. if you do call model.fit(X_train_scaled, ...) keras will create mini-batches (of 32 examples by default) and update the model weights for each mini batch.
Also, please be aware that you defined the model to return a shape of (batch_size, 2). So y_train must have a shape of (X_train.shape[0], 2).
The question was answered by Pedro
I need outputs at every recurrent layer and my setup is as follows:
100 training examples, 3 time steps per example, and 20-d feature vector for each individual element.
x_train: (100,3,20)
y_train: (100,20)
LSTM architecture:
model.add(LSTM(20, input_shape=(3,20), return_sequences=True))
model.compile(loss='mean_absolute_error', optimizer='adam', metrics=['accuracy'])
model.summary()
Training:
history = model.fit(x_train, y_train, epochs=50, validation_data=(x_test, y_test))
Error:
ValueError: Dimensions must be equal, but are 20 and 3 for '{{node Equal}} = Equal[T=DT_FLOAT, incompatible_shape_error=true](IteratorGetNext:1, Cast_1)' with input shapes: [?,20], [?,3].
Please help me with the correct input/output LSTM dimensions.
Thanks
LSTM(20, input_shape=(3,20), return_sequences=True) takes as input shape (100,3,20) and returns (100,3,20). Your target output is however encoded as (100,20).
From the dimensions, I assume you want to map each sequence to a non-sequence, i.e. you can do:
LSTM(20, input_shape=(3,20), return_sequences=False)
This will return the final hidden state, i.e. a shape of (100,20) which matches your target output.
I am loading VGG19 model and tring to apply 1d conv to decrease depth but I am getting the following error:
"ValueError: Input 0 is incompatible with layer conv1d_1: expected ndim=3, found ndim=4"
This is the function I am using:
def getModel():
base_model = VGG19(weights='imagenet')
model = Model(inputs=base_model.input, outputs=base_model.get_layer('block4_conv4').output)
#model.output=(None, 28, 28, 512)
layer=keras.layers.Conv1D(96, (512), padding='same')
model.summary()
out=(layer)(model.output)
model = Model(inputs=base_model.input, outputs=out)
model.summary()
return model
The problem is that the last output is an output produced by a Conv2D layer which has the shape [batch size, height, width, channels]. This cannot be consumed by a Conv1D. Conv1D consumes [batch size, width, channels] inputs. In your case, since you're interested in cutting down the number of filters, all you need to do is convert your Conv1D to Conv2D.
Simply change your function to,
def getModel():
base_model = VGG19(weights='imagenet')
model = Model(inputs=base_model.input, outputs=base_model.get_layer('block4_conv4').output)
# Here we are using Conv2D not Conv1D which gives 96 filters at the end.
layer=keras.layers.Conv2D(96, (3,3), padding='same')
model.summary()
out=(layer)(model.output)
model = Model(inputs=base_model.input, outputs=out)
model.summary()
return model
Here,
96 - Is the number of filters
(3,3) - Is the kernel height and width of the convolution layer.
I am trying to apply a softmax activation layer to the output of the Add() layer. I am trying to make this layer the output of my model and I am running into a few problems.
It seems Add() layer doesn't allow the usage of activations and if I do something like this:
predictions = Add()([x,y])
predictions = softmax(predictions)
model = Model(inputs = model.input, outputs = predictions)
I get:
ValueError: Output tensors to a Model must be the output of a Keras `Layer` (thus holding past layer metadata). Found: Tensor("Softmax:0", shape=(?, 6), dtype=float32)
It has nothing to do with the Add layer, you are using K.softmax directly on Keras tensors and this won't work, you need an actual layer. You can use the Activation layer for this:
from keras.layers import Activation
predictions = Add()([x,y])
predictions = Activation("softmax")(predictions)
model = Model(inputs = model.input, outputs = predictions)
I have a single training batch of 600 sequential points (x(t), y(t)) with x(t) being a 25 dimensional vector and y(t) being my target (1 dim). I would like to train an LSTM to predict how the series would continue given a few additional x(t) [t> 600]. I tried the following model:
model = Sequential()
model.add(LSTM(128, input_shape = (600,25), batch_size = 1, activation= 'tanh', return_sequences = True))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=20 ,verbose=2) prediction
prediction = model.predict(testX, batch_size = 1)
Fitting works fine, but I keep getting the following error at the prediction step:
Error when checking : expected lstm_46_input to have shape (1, 600, 25) but got array with shape (1, 10, 25)
What am I missing?
Here are my shapes:
trainX.shape = (1,600,25)
trainY.shape = (1,600,1)
testX.shape = (1,10,25)
According to Keras documentation input of LSTM (or any RNN) layers should be of shape (batch_size, timesteps, input_dim) where your input shape is
trainX.shape = (1,600,25)
So it means for training you are passing only one data with 600 timesteps and 25 features per timestep. But I got a feeling that you actually have 600 training data each having 25 timesteps and 1 feature per timestep. I guess your input shape (trainX) should be 600 x 25 x 1. Train target (trainY) should be 600 x 1 If my assumption is right then your test data should be of shape 10 x 25 x 1. First LSTM layer should be written as
model.add(LSTM(128, input_shape = (25,1), batch_size = 1, activation= 'tanh', return_sequences = False))
If your training data is in fact (1,600,25) what this means is you are unrolling the LSTM feedback 600 times. The first input has an impact on the 600th input. If this is what you want, you can use the Keras function "pad_sequences" to add append zeros to the test matrix so it has the shape (1,600,25). The network should predict zeros and you will need to add 590 zeros to your testY.
If you only want say 10 previous timesteps to affect your current Y prediction, then you will want to turn your trainX into shape (590,10,25). The input line will be something like:
model.add(LSTM(n_hid, stateful=True, return_sequences=False, batch_input_shape=(1,nTS,x_train.shape[2])))
The processing to get it in the form you want could be something like this:
def formatTS(XX, yy, window_length):
x_train = np.zeros((XX.shape[0]-window_length,window_length,XX.shape[1]))
for i in range(x_train.shape[0]):
x_train[i] = XX[i:i+window_length,:]
y_train = yy[window_length:]
return x_train, y_train
Then your testing will work just fine since it is already in the shape (1,10,25).