I am pretty new to deep learning; I want to train a network on image patches of size (256, 256, 3) to predict a pixel-wise segmentation with three labels. As a start, I want to use a single convolutional layer:
model = Sequential()
model.add(Convolution2D(32, 16, 16, input_shape=(3, 256, 256)))
The model output so far is an image with 32 channels. Now, I want to add a dense layer which merges all of these 32 channels into three channels, each one predicting the probability of a class for one pixel.
How can I do that?
The simplest method to merge your 32 channels back to 3 is to add another convolution, this time with three filters (I arbitrarily set the filter size to 1x1):
model = Sequential()
model.add(Convolution2D(32, 16, 16, input_shape=(3, 256, 256)))
model.add(Convolution2D(3, 1, 1))
And then finally add an activation function for the segmentation output:
model.add(Activation("tanh"))
Or you can do it all at once by passing the activation parameter (again arbitrarily chosen to be tanh):
model = Sequential()
model.add(Convolution2D(32, 16, 16, input_shape=(3, 256, 256)))
model.add(Convolution2D(3, 1, 1, activation="tanh"))
https://keras.io/layers/convolutional/
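For reference, the same idea in the current Keras API (Conv2D with a kernel-size tuple and channels-last input, matching your (256, 256, 3) patches rather than the older channels-first (3, 256, 256) ordering) might look like the sketch below; padding="same" is my addition so that the output keeps the full 256x256 resolution needed for per-pixel predictions:

from keras.models import Sequential
from keras.layers import Conv2D

model = Sequential()
# 32 filters of size 16x16; "same" padding preserves the 256x256 spatial size
model.add(Conv2D(32, (16, 16), padding="same", input_shape=(256, 256, 3)))
# a 1x1 convolution merges the 32 channels into 3 per-pixel class scores
model.add(Conv2D(3, (1, 1), padding="same", activation="tanh"))
model.summary()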
You have to use a Flatten layer between the convolution layers and the dense layer:
model = Sequential()
model.add(Convolution2D(32, 16, 16, input_shape=(3, 256, 256)))
# Do not forget to add an activation layer after your convolution layer, i.e. here.
model.add(Flatten())
model.add(Dense(3))
model.add(Activation("sigmoid")) # whatever activation you want.
I have a pre-trained sequential CNN model which I trained on images of 224x224x3. The following is the architecture:
model = Sequential()
model.add(Conv2D(filters = 64, kernel_size = (5, 5), strides = 1, activation = 'relu', input_shape = (224, 224, 3)))
model.add(MaxPool2D(pool_size = (3, 3)))
model.add(Dropout(0.2))
model.add(Conv2D(filters = 128, kernel_size = (3, 3), strides = 1, activation = 'relu'))
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(0.2))
model.add(Conv2D(filters = 256, kernel_size = (2, 2), strides = 1, activation = 'relu'))
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation = 'relu', use_bias=False))
model.add(Dense(num_classes, activation = 'softmax'))
model.summary()
I want to retrain this model on images of size 40x40x3. However, I am facing the following error: "ValueError: Input 0 of layer dense_12 is incompatible with the layer: expected axis -1 of input shape to have value 200704 but received input with shape (None, 256)".
What should I do to resolve this error?
Note: I am using Tensorflow version 2.4.1
The problem is that in your pre-trained model the Flatten layer produces a vector of size 200704 (fourth code line from the end), which is the input to the Dense layer with 128 units (third line from the end). If you now feed the same pre-trained model 40x40 images, it will not work. The reasons are:
1- Your model is input-image-size dependent. It is not an end-to-end convolutional model: the dense layers in the middle tie it to a fixed input size.
2- The flattened size of a 40x40 image after all the conv layers is 256, not 200704.
Solution
1- Either replace the Flatten part with an adaptive (global) average pooling layer, after which your last dense layer with softmax is fine, and retrain your old model on the 224x224 images. After that you can train on your 40x40 images.
2- Or, the easiest way, use only a subset of your pre-trained model up to (but excluding) the Flatten part, and then add a new Flatten part with a dense layer and a classification layer (the layer with softmax). For this method you have to write a custom model whose first part is the subset of the pre-trained model and whose Flatten and classification parts are new, and then train the whole model on the new dataset. You can also take advantage of transfer learning with this method by allowing the backward gradient to flow only through the newly created layers and not through the pre-trained ones.
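A minimal sketch of the second option, assuming the trained model above is available as a variable named pretrained (the name is illustrative) and that its convolution/pooling/dropout layers can simply be reused in a new Sequential model:

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Flatten, Dense

new_model = Sequential()
new_model.add(tf.keras.Input(shape=(40, 40, 3)))    # new, smaller input size
for layer in pretrained.layers[:-3]:                # everything before Flatten/Dense/Dense
    layer.trainable = False                         # transfer learning: freeze the conv blocks
    new_model.add(layer)
new_model.add(Flatten())                            # flattens to a much smaller vector than 200704
new_model.add(Dense(128, activation='relu'))
new_model.add(Dense(num_classes, activation='softmax'))  # num_classes as in your original model
new_model.summary()

This keeps the pre-trained convolution weights (which do not depend on the spatial input size) and only trains the new Flatten/Dense head.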
I am trying to use ResNet50 Pretrained network for segmentation problem.
I removed the last layer and added my desired layer, but when I try to fit, I get the following error:
ValueError: Error when checking target: expected conv2d_1 to have shape (16, 16, 1) but got array with shape (512, 512, 1)
I have two folders: images and masks. images are RGB and masks are in grayscale.
The shape is 512x512 for all images.
I cannot figure out where I am going wrong.
Any help will be appreciated.
from keras.applications.resnet50 import ResNet50
from keras.layers import Input, Conv2D
from keras.models import Model
from keras.optimizers import Adam
image_input=Input(shape=(512, 512, 3))
model = ResNet50(input_tensor=image_input,weights='imagenet',include_top=False)
x = model.output
x = Conv2D(1, (1,1), padding="same", activation="sigmoid")(x)
model = Model(inputs=model.input, outputs=x)
model.summary()
The last line of the summary shows the output shape of the new layer:
conv2d_1 (Conv2D) (None, 16, 16, 1) 2049 activation_49[0][0]
for layer in model.layers[:-1]:
    layer.trainable = False
for layer in model.layers[-1:]:
    layer.trainable = True
model.compile(optimizer=Adam(), loss='binary_crossentropy', metrics=['accuracy'])
Your network gives an output of shape (16, 16, 1), but your y (target) has shape (512, 512, 1).
Run the following to see this.
from keras.applications.resnet50 import ResNet50
from keras.layers import Input
image_input=Input(shape=(512, 512, 3))
model = ResNet50(input_tensor=image_input,weights='imagenet',include_top=False)
model.summary()
# Output shows that the ResNet50 network has output of shape (16,16,2048)
from keras.layers import Conv2D
conv2d = Conv2D(1, (1,1), padding="same", activation="sigmoid")
conv2d.compute_output_shape((None, 16, 16, 2048))
# Output shows the shape your network's output will have.
Either your y or the way you use ResNet50 has to change: with include_top=False, ResNet50 downsamples a 512x512 input to a 16x16 feature map, so a single 1x1 convolution on top of it cannot produce a 512x512 mask. Read about ResNet50 to see what you are missing.
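One possible fix on the model side, sketched under the assumption that simply upsampling the 16x16 prediction back to 512x512 is acceptable as a first attempt (a real segmentation model would normally use a proper decoder, e.g. U-Net-style upsampling with skip connections):

from keras.applications.resnet50 import ResNet50
from keras.layers import Input, Conv2D, UpSampling2D
from keras.models import Model

image_input = Input(shape=(512, 512, 3))
backbone = ResNet50(input_tensor=image_input, weights='imagenet', include_top=False)

x = backbone.output                                              # (None, 16, 16, 2048)
x = Conv2D(1, (1, 1), padding="same", activation="sigmoid")(x)   # (None, 16, 16, 1)
x = UpSampling2D(size=(32, 32))(x)                               # (None, 512, 512, 1), matches the masks
model = Model(inputs=backbone.input, outputs=x)

The alternative is to leave the model as it is and instead downsample your masks to 16x16.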
I am playing with a model which should take an 8x8 chess board as input, encoded as a 224x224 grey image, and then output a 64x13 one-hot-encoded target, i.e. the probabilities of the pieces on the 64 squares.
Now, after the Convolutional layers I don't quite know, how to proceed to get a 2D-Dense layer as a result/target.
I tried adding a Dense(64, 13) layer to my Sequential model, but I get the error "Dense can accept only 1 positional arguments ('units',)".
Is it even possible to train for 2D-targets?
EDIT1:
Here is the relevant part of my code, simplified:
# X.shape = (10000, 224, 224, 1)
# Y.shape = (10000, 64, 13)
model = Sequential([
    Conv2D(8, (3,3), activation='relu', input_shape=(224, 224, 1)),
    Conv2D(8, (3,3), activation='relu'),
    # some more repetitive Conv + Pooling Layers here
    Flatten(),
    Dense(64, 13)
])
TypeError: Dense can accept only 1 positional arguments ('units',), but you passed the following positional arguments: [64, 13]
EDIT2: As Anand V. Singh suggested, I changed Dense(64, 13) to Dense(832), which works fine. Loss = mse.
Wouldn't it be better to use "sparse_categorical_crossentropy" as the loss and a 64x1 encoding (instead of 64x13)?
In Dense you only pass the number of units you expect as output. If you want (64x13) as output, set the layer dimension to Dense(832) (64x13 = 832) and then reshape later. You will also need to reshape Y so as to accurately calculate the loss, which will be used for backpropagation.
# X.shape = (10000, 224, 224, 1)
# Y.shape = (10000, 64, 13)
Y = Y.reshape(10000, 64*13)
model = Sequential([
    Conv2D(8, (3,3), activation='relu', input_shape=(224, 224, 1)),
    Conv2D(8, (3,3), activation='relu'),
    # some more repetitive Conv + Pooling Layers here
    Flatten(),
    Dense(64*13)
])
That should get the job done; if it doesn't, post where it fails and we can proceed further.
A Reshape layer allows you to control the output shape:
Flatten(),
Dense(64*13),
Reshape((64, 13))  # 2D output
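Putting the two suggestions together, a sketch of the full model (same architecture as in the question, with the Conv/Pooling stack abbreviated; the optimizer is illustrative, the mse loss is as in EDIT2) could look like this:

from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense, Reshape

model = Sequential([
    Conv2D(8, (3,3), activation='relu', input_shape=(224, 224, 1)),
    Conv2D(8, (3,3), activation='relu'),
    # some more repetitive Conv + Pooling Layers here
    Flatten(),
    Dense(64*13),
    Reshape((64, 13))        # output shape (None, 64, 13)
])
model.compile(optimizer='rmsprop', loss='mse')
print(model.output_shape)    # (None, 64, 13)

With the Reshape at the end, Y can stay in its original (10000, 64, 13) shape and the Y.reshape step is no longer needed.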
I'm a bit new to Keras and deep learning. I'm currently trying to replicate this paper but when I'm compiling the first model (without the LSTMs) I get the following error:
"ValueError: Error when checking target: expected dense_3 to have shape (None, 120, 40) but got array with shape (8, 40, 1)"
The description of the model is this:
Input (length T is appliance specific window size)
Parallel 1D convolution with filter size 3, 5, and 7
respectively, stride=1, number of filters=32,
activation type=linear, border mode=same
Merge layer which concatenates the output of
parallel 1D convolutions
Dense layer, output_dim=128, activation type=ReLU
Dense layer, output_dim=128, activation type=ReLU
Dense layer, output_dim=T , activation type=linear
My code is this:
from keras import layers, Input
from keras.models import Model
# the window sizes (seq_length?) are 40, 1075, 465, 72 and 1246 for the kettle, dish washer,
# fridge, microwave, oven and washing machine, respectively.
def ae_net(T):
    input_layer = Input(shape=(T, 1))
    branch_a = layers.Conv1D(32, 3, activation='linear', padding='same', strides=1)(input_layer)
    branch_b = layers.Conv1D(32, 5, activation='linear', padding='same', strides=1)(input_layer)
    branch_c = layers.Conv1D(32, 7, activation='linear', padding='same', strides=1)(input_layer)
    merge_layer = layers.concatenate([branch_a, branch_b, branch_c], axis=1)
    dense_1 = layers.Dense(128, activation='relu')(merge_layer)
    dense_2 = layers.Dense(128, activation='relu')(dense_1)
    output_dense = layers.Dense(T, activation='linear')(dense_2)
    model = Model(input_layer, output_dense)
    return model
model = ae_net(40)
model.compile(loss= 'mean_absolute_error', optimizer='rmsprop')
model.fit(X, y, batch_size= 8)
where X and y are numpy arrays of 8 sequences, each 40 values long, so X.shape and y.shape are (8, 40, 1); it's actually one batch of data. The thing is, I cannot understand how the output could have shape (None, 120, 40) and what these sizes mean.
As you noted, your shapes contain batch_size, length and channels: (8, 40, 1).
Your three convolutions are, each one, creating a tensor like (8,40,32).
Your concatenation in the axis=1 creates a tensor like (8,120,32), where 120 = 3*40.
Now, the dense layers only work on the last dimension (the channels in this case), leaving the length (now 120) untouched.
Solution
Now, it seems you do want to keep the length dimension at the end, so you won't need any Flatten or Reshape layers. But you do need to keep the length at 40.
You're probably doing the concatenation in the wrong axis. Instead of the length axis (1), you should concatenate in the channels axis (2 or -1).
So, this should be your concatenate layer:
merge_layer = layers.Concatenate()([branch_a, branch_b, branch_c])
#or layers.Concatenate(axis=-1)([branch_a, branch_b, branch_c])
This will output (8, 40, 96), and the dense layers will transform the 96 in something else.
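Assuming the input layer is defined with an explicit channels dimension, i.e. Input(shape=(T, 1)), a quick shape check of the corrected branch-and-merge part might look like this (the printed shapes are what the explanation above predicts):

from keras import layers, Input

T = 40
input_layer = Input(shape=(T, 1))
branch_a = layers.Conv1D(32, 3, activation='linear', padding='same', strides=1)(input_layer)
branch_b = layers.Conv1D(32, 5, activation='linear', padding='same', strides=1)(input_layer)
branch_c = layers.Conv1D(32, 7, activation='linear', padding='same', strides=1)(input_layer)

# concatenate along the channels axis, keeping the length T untouched
merge_layer = layers.Concatenate(axis=-1)([branch_a, branch_b, branch_c])
print(merge_layer.shape)                        # (None, 40, 96)

# Dense acts only on the last axis, so the length 40 is preserved
dense_1 = layers.Dense(128, activation='relu')(merge_layer)
print(dense_1.shape)                            # (None, 40, 128)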
I am not able to figure out how the tensor dimension got reduced by 1 in the TimeDistributed Dense step in the following:
model = Sequential()
model.add(Embedding(vocab_size +1, 128, input_length=unravel_len)) # embedding shape: (99, 15, 128)
model.add(Bidirectional(LSTM(64, return_sequences=True))) # (99, 15, 128)
model.add(Dropout(0.5))
model.add(TimeDistributed(Dense(categories, activation='softmax'))) # (99, 15, 127)
I labeled the tensor shape at each step. You can see that the last dimension dropped from 128 to 127. Can someone explain why that is? Thanks.