Dimension error for convolution2d in keras for text classification

My input is a 10000x500 array of text documents: 10000 is the number of documents and 500 is the number of words per document.
What I am trying to do is feed the text to Keras's Embedding layer, followed by a BLSTM, then Conv2D and 2D pooling, then Flatten, and finally a fully connected Dense layer.
The architecture is shown below:
inp = Input(shape=(500,))
x = Embedding(input_dim=10000, output_dim=100)(inp)
x = Bidirectional(CuDNNLSTM(50, return_sequences=True))(x)
x = Conv2D(filters=128, kernel_size=(3, 3), input_shape=(100,500,1))(x)
x = MaxPooling2D()(x)
x = Flatten()(x)
x = Dense(1, activation="sigmoid")(x)
The output shape from the embedding would be (None, 500, 100)
The output shape from BLSTM's hidden state would be (None, 500, 100).
I would like a Conv2D to extract local features over the hidden states from the BLSTM. However, I'm getting a dimension mismatch error:
ValueError: Input 0 is incompatible with layer conv2d_8: expected ndim=4, found ndim=3
I have tried the solution from "When bulding a CNN, I am getting complaints from Keras that do not make sense to me.", but I still get the error.

You have two options:
a) Use Conv2D with rows=500, cols=100 and channels=1 by adding a trailing dimension to x. The BLSTM output is (None, 500, 100), so it becomes (None, 500, 100, 1). The input_shape argument is not needed, because the layer is called on an existing tensor:
x = Lambda(lambda t: t[..., None])(x)
x = Conv2D(filters=128, kernel_size=(3, 3))(x)
b) Use Conv1D with steps=500 and input_dim=100, and use MaxPooling1D:
x = Conv1D(filters=128, kernel_size=3)(x)
x = MaxPooling1D()(x)
x = Flatten()(x)
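To compare the two options, it can help to work out the shapes by hand. The sketch below is plain Python, not Keras; conv_out is a hypothetical helper, and it assumes the Keras defaults ('valid' padding, stride 1 for the convolutions, pool size 2 with stride 2 for the pooling layers) and a BLSTM output of (None, 500, 100).

```python
def conv_out(n, kernel, stride=1):
    """Output length along one axis for a 'valid' convolution or pooling."""
    return (n - kernel) // stride + 1

steps, features, filters = 500, 100, 128  # BLSTM output is (None, 500, 100)

# Option a: add a channel axis -> (500, 100, 1), then Conv2D 3x3 and 2x2 max pooling
conv2d = (conv_out(steps, 3), conv_out(features, 3), filters)
pool2d = (conv_out(conv2d[0], 2, 2), conv_out(conv2d[1], 2, 2), filters)
flat2d = pool2d[0] * pool2d[1] * pool2d[2]

# Option b: Conv1D with kernel size 3 over the 500 steps, then max pooling of size 2
conv1d = (conv_out(steps, 3), filters)
pool1d = (conv_out(conv1d[0], 2, 2), filters)
flat1d = pool1d[0] * pool1d[1]

print(conv2d, pool2d, flat2d)  # (498, 98, 128) (249, 49, 128) 1561728
print(conv1d, pool1d, flat1d)  # (498, 128) (249, 128) 31872
```

Under these assumptions the Conv2D route leaves a much wider Flatten output than the Conv1D route, which directly affects the size of the final Dense layer.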


ValueError in multiple input model

I'm creating a multi-input model where I concatenate a CNN model and an LSTM model. The LSTM model receives the last 5 events and the CNN receives a picture of the last event. Both are organized so that element k of each numpy array matches the same 5 events and the corresponding picture, as do the output labels, which are the 'next' event that the model should predict.
chanDim = -1
inputs = Input(shape=inputShape)
x = inputs
x = Dense(128)(x)
x = Activation("relu")(x)
x = BatchNormalization(axis=chanDim)(x)
x = Dropout(0.3)(x)
x = Flatten()(x)
x = Activation("relu")(x)
x = BatchNormalization(axis=chanDim)(x)
x = Dropout(0.1)(x)
x = Activation("relu")(x)
model_cnn = Model(inputs, x)
This creates the CNN model, and the following code creates the LSTM model:
hidden1 = LSTM(128)(visible)
hidden2 = Dense(64, activation='relu')(hidden1)
output = Dense(10, activation='relu')(hidden2)
model_lstm = Model(inputs=visible, outputs=output)
Now, when I combine these models and extend them using a simple dense layer to make the multiclass prediction of 14 classes, all the inputs match and I can concat the (none, 10) and (none, 10) into a (none, 20) for the MLP:
x = Dense(14, activation="softmax")(x)
model_mlp = Model(inputs=[model_lstm.input, model_cnn.input], outputs=x)
This all works fine until I try to train the model. Then it gives me an error concerning the last dense layer of the MLP model:
ValueError: Error when checking target: expected dense_121 to have shape (14,) but got array with shape (1,)
Do you know how this is possible? If you need more information I'm happy to provide that

Your target must have shape (None, 14). With this softmax output (and a loss such as categorical_crossentropy), the labels have to be one-hot encoded.
Try this:
y = pd.get_dummies(np.concatenate([y_train, y_test])).values
y_train = y[:len(y_train)]
y_test = y[len(y_train):]
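As an alternative to get_dummies, here is a minimal sketch in plain NumPy. It assumes the labels are integers from 0 to 13 stored with shape (n, 1), which is what the error message suggests; the sample labels and the name y_train_onehot are made up for illustration. Unlike get_dummies, this always yields 14 columns, even if some class never appears in the data.

```python
import numpy as np

num_classes = 14  # size of the softmax layer

# Hypothetical integer labels with shape (n, 1)
y_train = np.array([[0], [3], [13], [3]])

# Row i of the identity matrix is the one-hot vector for class i
y_train_onehot = np.eye(num_classes, dtype="float32")[y_train.ravel()]

print(y_train_onehot.shape)  # (4, 14)
```

Another option is to keep the integer labels and use Keras's sparse_categorical_crossentropy loss, which accepts class indices directly.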

How to correctly concatenate a Flatten layer and a feature vector in Keras

I just need to concatenate a flatten layer and a feature vector in Keras. This is the code:
#custom parameters
n_features = 38
vgg_model = VGGFace(include_top=False, input_shape=(224, 224, 3))
last_layer = vgg_model.get_layer('pool5').output
x = Flatten(name='flatten')(last_layer)
# feature vector
feature_vector = Input(shape = (n_features,))
conc = concatenate(([x, feature_vector]), axis=1)
layer_intermediate = Dense(128, activation='relu', name='fc6')(conc)
layer_intermediate1 = Dense(32, activation='relu', name='fc7')(layer_intermediate)
out = Dense(5, activation='softmax', name='fc8')(layer_intermediate1)
custom_vgg_model = Model(vgg_model.input, out)
But I'm getting this error:
---> 20 custom_vgg_model = Model(vgg_model.input, out)
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_88:0", shape=(?, 38), dtype=float32) at layer "input_88". The following previous layers were accessed without issue: ['input_87', 'conv1_1', 'conv1_2', 'pool1', 'conv2_1', 'conv2_2', 'pool2', 'conv3_1', 'conv3_2', 'conv3_3', 'pool3', 'conv4_1', 'conv4_2', 'conv4_3', 'pool4', 'conv5_1', 'conv5_2', 'conv5_3', 'pool5', 'flatten']
Btw the shape of the flatten layer is (None, 25088)

Your feature_vector is also an Input, so add it to the inputs when you define the Model:
custom_vgg_model = Model([vgg_model.input,feature_vector], out)
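To see why both inputs are needed, here is a shape-only sketch in NumPy (not Keras). The batch size of 4 is arbitrary; 25088 is the Flatten size quoted in the question (7 x 7 x 512) and 38 is n_features.

```python
import numpy as np

batch = 4  # arbitrary batch size for illustration

flat = np.zeros((batch, 7 * 7 * 512), dtype="float32")  # output of 'flatten'
features = np.zeros((batch, 38), dtype="float32")       # the extra feature vector

# concatenate(..., axis=1) joins the two along the feature axis,
# so both arrays must be supplied and must share the batch dimension
conc = np.concatenate([flat, features], axis=1)

print(conc.shape)  # (4, 25126)
```

Once the model is defined with two inputs, fit and predict also expect two arrays, in the same order as in the inputs list.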

Autoencoder with 3D convolutions and convolutional LSTMs

I have implemented a variational autoencoder with CNN layers in the encoder and decoder. The code is shown below. My training data (train_X) consists of 40'000 images with size 64 x 80 x 1 and my validation data (valid_X) consists of 4500 images of size 64 x 80 x 1.
I would like to adapt my network in the following two ways:
1. Instead of using 2D convolutions (Conv2D and Conv2DTranspose) I would like to use 3D convolutions to take time into account (as the third dimension). For that I would like to use slices of 10 images, i.e. I will have images of size 64 x 80 x 1 x 10. Can I just use Conv3D and Conv3DTranspose or are other changes necessary?
2. I would like to try out convolutional LSTMs (ConvLSTM2D) in the encoder and decoder instead of plain 2D convolutions. Again, the input size of the images would be 64 x 80 x 1 x 10 (i.e. time series of 10 images). How can I adapt my network to work with ConvLSTM2D?
import keras
from keras import backend as K
from keras.layers import (Dense, Input, Flatten)
from keras.layers import Lambda, Conv2D
from keras.models import Model
from keras.layers import Reshape, Conv2DTranspose
from keras.losses import mse
def sampling(args):
    z_mean, z_log_var = args
    batch = K.shape(z_mean)[0]
    dim = K.int_shape(z_mean)[1]
    epsilon = K.random_normal(shape=(batch, dim))
    return z_mean + K.exp(0.5 * z_log_var) * epsilon
inner_dim = 16
latent_dim = 6
image_size = (64, 80, 1)
inputs = Input(shape=image_size, name='encoder_input')
x = inputs
x = Conv2D(32, 3, strides=2, activation='relu', padding='same')(x)
x = Conv2D(64, 3, strides=2, activation='relu', padding='same')(x)
# shape info needed to build decoder model
shape = K.int_shape(x)
# generate latent vector Q(z|X)
x = Flatten()(x)
x = Dense(inner_dim, activation='relu')(x)
z_mean = Dense(latent_dim, name='z_mean')(x)
z_log_var = Dense(latent_dim, name='z_log_var')(x)
z = Lambda(sampling, output_shape=(latent_dim,), name='z')([z_mean, z_log_var])
# instantiate encoder model
encoder = Model(inputs, [z_mean, z_log_var, z], name='encoder')
# build decoder model
latent_inputs = Input(shape=(latent_dim,), name='z_sampling')
x = Dense(inner_dim, activation='relu')(latent_inputs)
x = Dense(shape[1] * shape[2] * shape[3], activation='relu')(x)
x = Reshape((shape[1], shape[2], shape[3]))(x)
x = Conv2DTranspose(64, 3, strides=2, activation='relu', padding='same')(x)
x = Conv2DTranspose(32, 3, strides=2, activation='relu', padding='same')(x)
outputs = Conv2DTranspose(filters=1, kernel_size=3, activation='sigmoid', padding='same', name='decoder_output')(x)
# instantiate decoder model
decoder = Model(latent_inputs, outputs, name='decoder')
# instantiate VAE model
outputs = decoder(encoder(inputs)[2])
vae = Model(inputs, outputs, name='vae')
def vae_loss(x, x_decoded_mean):
    reconstruction_loss = mse(K.flatten(x), K.flatten(x_decoded_mean))
    reconstruction_loss *= image_size[0] * image_size[1]
    kl_loss = 1 + z_log_var - K.square(z_mean) - K.exp(z_log_var)
    kl_loss = K.sum(kl_loss, axis=-1)
    kl_loss *= -0.5
    vae_loss = K.mean(reconstruction_loss + kl_loss)
    return vae_loss
optimizer = keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.000)
vae.compile(loss=vae_loss, optimizer=optimizer)
vae.fit(train_X, train_X,
validation_data=(valid_X, valid_X))
Thank you very much for the help. I really appreciate it.

Use an input shape of (10, 64, 80, 1) and simply replace the layers.
The tedious part is organizing the input data: either use sliding windows, or just reshape from (images, 64, 80, 1) to (images//10, 10, 64, 80, 1).
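The two ways of grouping the frames mentioned above can be sketched with NumPy. This is only an illustration: images is a small zero-filled stand-in for train_X (45 frames rather than 40,000), and the names blocks and windows are made up here.

```python
import numpy as np

frames = 10

# Small stand-in for train_X: 45 images of 64 x 80 x 1
images = np.zeros((45, 64, 80, 1), dtype="float32")

# Non-overlapping: drop the leftover frames, then reshape
usable = (len(images) // frames) * frames
blocks = images[:usable].reshape(-1, frames, 64, 80, 1)

# Overlapping: one window starting at every frame (step of 1)
starts = range(len(images) - frames + 1)
windows = np.stack([images[s:s + frames] for s in starts])

print(blocks.shape)   # (4, 10, 64, 80, 1)
print(windows.shape)  # (36, 10, 64, 80, 1)
```

The overlapping variant repeats each frame in up to 10 windows, so its memory use grows accordingly; a larger step between window starts is a possible compromise.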
Sliding windows (Overlapping) or not?
1 - If you want your model to understand individual segments of 10 images, you may overlap the windows or not; it's your choice. Performance may be better with overlapping, but not necessarily.
The order of the segments doesn't really matter, as long as the 10 frames inside each segment are in order.
This is supported by Conv3D and by LSTM with stateful=False.
2 - But if you want your model to understand the entire sequence, and you are dividing the sequences only because of memory, only LSTM with stateful=True can support this.
(A Conv3D with kernel size = (frames, w, h) will work, but it is limited to that number of frames and never understands sequences longer than that. It may still be able to detect isolated events, but not long-range relationships in the sequence.)
In this case, for the LSTM you will need to:
set shuffle = False in training
use a fixed batch size of sequences
not overlap images
create a manual training loop where you do model.reset_states() every time you are giving "new sequences" for training AND predicting
The loop structure would be:
for epoch in range(epochs):
    for group_of_sequences in range(groups):
        model.reset_states()  # new sequences are coming, so forget the previous states
        sequences = getAGroupOfCompleteSequences() #shape (sequences, total_length, ....)
        for step in range(slide_divisions):
            # consecutive, non-overlapping slices of 10 frames
            batch = sequences[:, 10*step : 10*(step+1)]
            model.train_on_batch(batch, ....)

Dimension errors in neural network in Keras

I am trying to implement a neural network where I merge/concatenate a fully connected neural network with a convolutional neural network. But when I fit the model, I get the following error:
ValueError: All input arrays (x) should have the same number of
samples. Got array shapes: [(1, 100, 60, 4500), (100, 4500)]
I have two different inputs:
image (dimensions: 1,100,60,4500), where 1 is the number of channels, 100 is the number of samples, and 60x4500 is the size of each image. This goes to my convolutional neural network.
positions (dimensions: 100,4500), where 100 is the number of samples.
The dimension of my output is 100,2.
The code for my neural network is:
###Convolution neural network
b1 = Sequential()
b1.add(Conv2D(128*2, kernel_size=3,activation='relu',data_format='channels_first',
b1.add(Conv2D(128*2, kernel_size=3, activation='relu'))
b1.add(Conv2D(128*2, kernel_size=4, activation='relu'))
###Fully connected feed forward neural network
b2 = Sequential()
b2.add(Dense(64, input_shape = (4500,), activation='relu'))
model = Sequential()
###Concatenating the two networks
concat = concatenate([b1.output, b2.output], axis=-1)
x = Dense(256, activation='relu', kernel_initializer='normal')(concat)
x = Dropout(0.25)(x)
output = Dense(2, activation='softmax')(x)
model = Model([b1.input, b2.input], [output])
history = model.fit([image, positions], Ytest, batch_size=10,
Also, the reason why my 'image' array is 4 dimensional is because in the beginning it was just (100,60,4500) but then I ran into the following error:
ValueError: Error when checking input: expected conv2d_10_input to
have 4 dimensions, but got array with shape (100, 60, 4500)
And upon googling I found out that it expects # of channels as an input too. And after I added the # of channel, this error went away but then I ran into the other error that I mentioned in the beginning.
So can someone tell me how to solve for the error (the one I specified in the beginning)? Help would be appreciated.

It is not good practice to mix the Sequential and Functional APIs.
You can implement the model like this:
i1 = Input(shape=(1, 60, 4500))
c1 = Conv2D(128*2, kernel_size=3, activation='relu', data_format='channels_first')(i1)
# keep data_format consistent, otherwise later layers fall back to the default (usually channels_last)
c1 = Conv2D(128*2, kernel_size=3, activation='relu', data_format='channels_first')(c1)
c1 = Dropout(0.2)(c1)
c1 = Conv2D(128*2, kernel_size=4, activation='relu', data_format='channels_first')(c1)
c1 = Dropout(0.2)(c1)
c1 = Flatten()(c1)
i2 = Input(shape=(4500, ))
c2 = Dense(64, activation='relu')(i2)
c2 = Dropout(0.2)(c2)
c = concatenate([c1, c2])
x = Dense(256, activation='relu', kernel_initializer='normal')(c)
x = Dropout(0.25)(x)
output = Dense(2, activation='softmax')(x)
model = Model([i1, i2], [output])
Note that the shape of i1 is (1, 60, 4500): you have set data_format='channels_first', so the channel dimension (1) has to come first.
I then compiled the model and fitted it on placeholder data:
import numpy as np
X_img = np.zeros((100, 1, 60, 4500))
X_pos = np.ones((100, 4500))
Y = np.zeros((100, 2))
history = model.fit([X_img, X_pos], Y, batch_size=1,

Your number of samples should always be the first dimension. So your data should have shape (100, 1, 60, 4500) for image and (100, 4500) for positions. The argument channels_first for the Conv2D layer means that the channel axis is the first non-batch dimension.
You also need to change the input shape to (1, 60, 4500) in the first Conv2D layer.
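Getting the arrays into that layout can be sketched with NumPy. The width is reduced from 4500 to 45 here only to keep the example light, and the array contents are placeholders; raw is a made-up name for the original 3-dimensional array.

```python
import numpy as np

# Stand-in data laid out as in the question: (channels, samples, height, width)
image = np.zeros((1, 100, 60, 45), dtype="float32")
positions = np.ones((100, 45), dtype="float32")

# Move the channel axis behind the sample axis:
# (1, 100, 60, 45) -> (100, 1, 60, 45)
image = np.moveaxis(image, 0, 1)

# Starting from the original 3-D array, insert the channel axis at position 1 instead
raw = np.zeros((100, 60, 45), dtype="float32")
image_from_raw = np.expand_dims(raw, axis=1)

print(image.shape)           # (100, 1, 60, 45)
print(image_from_raw.shape)  # (100, 1, 60, 45)
```

Both arrays passed to fit now have 100 as their first dimension, which is what the "same number of samples" check looks at.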

1D Convolutional network with keras, error on input size

I'm trying to build a convolutional neural network for my dataset. My training dataset has 1209 examples of 800 features each.
Here's what part of the code looks like:
model = Sequential()
model.add(Conv1D(64, 3, activation='linear', input_shape=(1209, 800)))
model.add(Dense(1, activation='linear'))
model.compile(loss=loss_type, optimizer=optimizer_type, metrics=[metrics_type])
model.fit(X, Y, validation_data=(X2,Y2),epochs = nb_epochs,
batch_size = batch_size,shuffle=True)
When I run this code, I get the following error:
Error when checking input: expected conv1d_25_input to have 3 dimensions,
but got array with shape (1209, 800)
So I add a dimension. Here's what I do:
X = np.expand_dims(X, axis=0)
X2 = np.expand_dims(X2, axis=0)
And then I get this error:
ValueError: Input arrays should have the same number of samples as target arrays.
Found 1 input samples and 1209 target samples.
My training data now has the shape (1, 1209, 800). Should it be something else?
Thanks a lot for reading this.

Instead of expanding the dimensions of X at axis 0, you should expand at axis 2. Thus, rather than X = np.expand_dims(X, axis=0), you need X = np.expand_dims(X, axis=2). The same applies to X2.
Afterwards, the shape of X should be (1209, 800, 1), and you should then specify input_shape=(800, 1) in your first layer.
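A quick check of the resulting shapes, as a sketch with placeholder arrays. The 300 validation examples are an assumption; the question does not give the size of X2.

```python
import numpy as np

X = np.zeros((1209, 800), dtype="float32")  # placeholder training data
X2 = np.zeros((300, 800), dtype="float32")  # placeholder validation data (size assumed)

# Add a trailing axis so that each example is 800 steps with 1 feature per step
X = np.expand_dims(X, axis=2)
X2 = np.expand_dims(X2, axis=2)

input_shape = X.shape[1:]  # what to pass as input_shape to the first layer

print(X.shape)      # (1209, 800, 1)
print(X2.shape)     # (300, 800, 1)
print(input_shape)  # (800, 1)
```

The sample count stays in the first dimension, so it matches the 1209 targets again.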
