Keras TimeDistributed Dense drops vector dimension by 1 - keras

I am not able to figure out how the tensor dimension got reduced by 1 in the TimeDistributed Dense step in the following
model = Sequential()
model.add(Embedding(vocab_size +1, 128, input_length=unravel_len)) # embedding shape: (99, 15, 128)
model.add(Bidirectional(LSTM(64, return_sequences=True))) # (99, 15, 128)
model.add(Dropout(0.5))
model.add(TimeDistributed(Dense(categories, activation='softmax'))) # (99, 15, 127)
I labeled the tensor shape along each step. You can see the last dimension dropped from 128 to 127. Can someone explain why that is. Thanks

Related

Keras LSTM Layer ValueError: Dimensions must be equal, but are 17 and 2

I'm working on a basic RNN model for a multiclass task and I'm facing some issues with output dimensions.
This is my input/output shapes:
input.shape = (50000, 2, 5) # (samples, features, feature_len)
output.shape = (50000, 17, 185) # (samples, features, feature_len) <-- one hot encoded
input[0].shape = (2, 5)
output[0].shape = (17, 185)
This is my model, using Keras functional API:
inp = tf.keras.Input(shape=(2, 5,))
x = tf.keras.layers.LSTM(128, input_shape=(2, 5,), return_sequences=True, activation='relu')(inp)
out = tf.keras.layers.Dense(185, activation='softmax')(x)
model = tf.keras.models.Model(inputs=inp, outputs=out)
This is my model.summary():
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 2, 5)] 0
_________________________________________________________________
lstm (LSTM) (None, 2, 128) 68608
_________________________________________________________________
dense (Dense) (None, 2, 185) 23865
=================================================================
Total params: 92,473
Trainable params: 92,473
Non-trainable params: 0
_________________________________________________________________
Then I compile the model and run fit():
model.compile(optimizer='adam',
loss=tf.nn.softmax_cross_entropy_with_logits,
metrics='accuracy')
model.fit(x=input, y=output, epochs=5)
And I'm getting a dimension error:
ValueError: Dimensions must be equal, but are 17 and 2 for '{{node Equal}} = Equal[T=DT_INT64, incompatible_shape_error=true](ArgMax, ArgMax_1)' with input shapes: [?,17], [?,2].
The error is clear, the model output a dimension 2 and my output has dimension 17, although I understand the issue, I can't find a way of fixing it, any ideas?
I think your output shape is not "output[0].shape = (17, 185)" but "dense (Dense) (None, 2, 185) ".
You need to change your output shape or change your layer structure.
LSTM output is a list of encoder_outputs, when you specify return_sequences=True. hence; I suggest just using the last item of encoder_outputs as the input of your Dense layer. you can see the example section of this link to the documentation. It may help you.

keras pre-trained ResNet50 target shape

I am trying to use ResNet50 Pretrained network for segmentation problem.
I remove the last layer and add my desired layer. But when I try to fit, I get the following error:
ValueError: Error when checking target: expected conv2d_1 to have shape (16, 16, 1) but got array with shape (512, 512, 1)
I have two folders: images and masks. images are RGB and masks are in grayscale.
The shape is 512x512 for all images.
I can not figure in which part am I doing wrong.
Any help will be appreciated.
from keras.applications.resnet50 import ResNet50
image_input=Input(shape=(512, 512, 3))
model = ResNet50(input_tensor=image_input,weights='imagenet',include_top=False)
x = model.output
x = Conv2D(1, (1,1), padding="same", activation="sigmoid")(x)
model = Model(inputs=model.input, outputs=x)
model.summary()
conv2d_1 (Conv2D) (None, 16, 16, 1) 2049 activation_49[0][0]
for layer in model.layers[:-1]:
layer.trainable = False
for layer in model.layers[-1:]:
layer.trainable = True
model.compile(optimizer=Adam(), loss='binary_crossentropy', metrics=['accuracy'])
Your network gives an output of shape (16, 16, 1) but your y (target) has shape (512, 512, 1)
Run the following to see this.
from keras.applications.resnet50 import ResNet50
from keras.layers import Input
image_input=Input(shape=(512, 512, 3))
model = ResNet50(input_tensor=image_input,weights='imagenet',include_top=False)
model.summary()
# Output shows that the ResNet50 network has output of shape (16,16,2048)
from keras.layers import Conv2D
conv2d = Conv2D(1, (1,1), padding="same", activation="sigmoid")
conv2d.compute_output_shape((None, 16, 16, 2048))
# Output shows the shape your network's output will have.
Either your y or the way you use ResNet50 has to change. Read about ResNet50 to see what you are missing.

How to model Convolutional recurrent network ( CRNN ) in Keras

I was trying to port CRNN model to Keras.
But, I got stuck while connecting output of Conv2D layer to LSTM layer.
Output from CNN layer will have a shape of ( batch_size, 512, 1, width_dash) where first one depends on batch_size, and last one depends on input width of input ( this model can accept variable width input )
For eg: an input with shape [2, 1, 32, 829] was resulting output with shape of (2, 512, 1, 208)
Now, as per Pytorch model, we have to do squeeze(2) followed by permute(2, 0, 1)
it will result a tensor with shape [208, 2, 512 ]
I was trying to implement this is Keras, but I was not able to do that because, in Keras we can not alter batch_size dimension in a keras.models.Sequential model
Can someone please guide me how to port above part of this model to Keras?
Current state of ported CNN layer
You don't need to permute the batch axis in Keras. In a pytorch model you need to do it because a pytorch LSTM expects an input shape (seq_len, batch, input_size). However in Keras, the LSTM layer expects (batch, seq_len, input_size).
So after defining the CNN and squeezing out axis 2, you just need to permute the last two axes. As a simple example (in 'channels_first' Keras image format),
model = Sequential()
model.add(Conv2D(512, 3, strides=(32, 4), padding='same', input_shape=(1, 32, None)))
model.add(Reshape((512, -1)))
model.add(Permute((2, 1)))
model.add(LSTM(32))
You can verify the shapes with model.summary():
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_4 (Conv2D) (None, 512, 1, None) 5120
_________________________________________________________________
reshape_3 (Reshape) (None, 512, None) 0
_________________________________________________________________
permute_4 (Permute) (None, None, 512) 0
_________________________________________________________________
lstm_3 (LSTM) (None, 32) 69760
=================================================================
Total params: 74,880
Trainable params: 74,880
Non-trainable params: 0
_________________________________________________________________

Concatenation of Keras parallel layers changes wanted target shape

I'm a bit new to Keras and deep learning. I'm currently trying to replicate this paper but when I'm compiling the first model (without the LSTMs) I get the following error:
"ValueError: Error when checking target: expected dense_3 to have shape (None, 120, 40) but got array with shape (8, 40, 1)"
The description of the model is this:
Input (length T is appliance specific window size)
Parallel 1D convolution with filter size 3, 5, and 7
respectively, stride=1, number of filters=32,
activation type=linear, border mode=same
Merge layer which concatenates the output of
parallel 1D convolutions
Dense layer, output_dim=128, activation type=ReLU
Dense layer, output_dim=128, activation type=ReLU
Dense layer, output_dim=T , activation type=linear
My code is this:
from keras import layers, Input
from keras.models import Model
# the window sizes (seq_length?) are 40, 1075, 465, 72 and 1246 for the kettle, dish washer,
# fridge, microwave, oven and washing machine, respectively.
def ae_net(T):
input_layer = Input(shape= (T,))
branch_a = layers.Conv1D(32, 3, activation= 'linear', padding='same', strides=1)(input_layer)
branch_b = layers.Conv1D(32, 5, activation= 'linear', padding='same', strides=1)(input_layer)
branch_c = layers.Conv1D(32, 7, activation= 'linear', padding='same', strides=1)(input_layer)
merge_layer = layers.concatenate([branch_a, branch_b, branch_c], axis=1)
dense_1 = layers.Dense(128, activation='relu')(merge_layer)
dense_2 =layers.Dense(128, activation='relu')(dense_1)
output_dense = layers.Dense(T, activation='linear')(dense_2)
model = Model(input_layer, output_dense)
return model
model = ae_net(40)
model.compile(loss= 'mean_absolute_error', optimizer='rmsprop')
model.fit(X, y, batch_size= 8)
where X and y are numpy arrays of 8 sequences of a length of 40 values. So X.shape and y.shape are (8, 40, 1). It's actually one batch of data. The thing is I cannot understand how the output would be of shape (None, 120, 40) and what these sizes would mean.
As you noted, your shapes contain batch_size, length and channels: (8,40,1)
Your three convolutions are, each one, creating a tensor like (8,40,32).
Your concatenation in the axis=1 creates a tensor like (8,120,32), where 120 = 3*40.
Now, the dense layers only work on the last dimension (the channels in this case), leaving the length (now 120) untouched.
Solution
Now, it seems you do want to keep the length at the end. So you won't need any flatten or reshape layers. But you will need to keep the length 40, though.
You're probably doing the concatenation in the wrong axis. Instead of the length axis (1), you should concatenate in the channels axis (2 or -1).
So, this should be your concatenate layer:
merge_layer = layers.Concatenate()([branch_a, branch_b, branch_c])
#or layers.Concatenate(axis=-1)([branch_a, branch_b, branch_c])
This will output (8, 40, 96), and the dense layers will transform the 96 in something else.

Deep Net with keras for image segmentation

I am pretty new to deep learning; I want to train a network on image patches of size (256, 256, 3) to predict three labels of pixel-wise segmentation. As a start I want to provide one convolutional layer:
model = Sequential()
model.add(Convolution2d(32, 16, 16, input_shape=(3, 256, 256))
The model output so far is an image with 32 channels. Now, I want to add a dense layer which merges all of these 32 channels into three channels, each one predicting the probability of a class for one pixel.
How can I do that?
The simplest method to merge your 32 channels back to 3 would be to add another convolution, this time with three filters (I arbitrarily set the filter sizes to be 1x1):
model = Sequential()
model.add(Convolution2d(32, 16, 16, input_shape=(3, 256, 256))
model.add(Convolution2d(3, 1, 1))
And then finally add an activation function for segmentation
model.add(Activation("tanh"))
Or you could add it all at once if you want to with activation parameter (arbitrarily chosen to be tanh)
model = Sequential()
model.add(Convolution2d(32, 16, 16, input_shape=(3, 256, 256))
model.add(Convolution2d(3, 1, 1,activation="tanh"))
https://keras.io/layers/convolutional/
You have to use flatten between the convolution layers and the dense layer:
model = Sequential()
model.add(Convolution2d(32, 16, 16, input_shape=(3, 256, 256))
# Do not forget to add an activation layer after your convolution layer, so here.
model.add(Flatten())
model.add(Dense(3))
model.add(Activation("sigmoid")) # whatever activation you want.

Resources