A VGG-19 network has 25 layers, as shown here. But when I check the number of layers in the Keras implementation, it shows 26 layers. Why?
from keras.applications.vgg19 import VGG19
model = VGG19()
len(model.layers)
gives output
26
If you are confused, you can print the structure of VGG19 directly with model.summary(). It shows a layer input_1 (InputLayer) as the input layer, and that layer is counted in model.layers.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 224, 224, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 224, 224, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 112, 112, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 112, 112, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 112, 112, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 56, 56, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 56, 56, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_conv4 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 28, 28, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_conv4 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 14, 14, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv4 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 7, 7, 512) 0
_________________________________________________________________
flatten (Flatten) (None, 25088) 0
_________________________________________________________________
fc1 (Dense) (None, 4096) 102764544
_________________________________________________________________
fc2 (Dense) (None, 4096) 16781312
_________________________________________________________________
predictions (Dense) (None, 1000) 4097000
=================================================================
Total params: 143,667,240
Trainable params: 143,667,240
Non-trainable params: 0
_________________________________________________________________
If you want the output of the first FC layer, you should use model.layers[23] instead of model.layers[22], because the InputLayer occupies index 0. You can also print the output shapes directly and compare them with the output of model.summary().
print(model.layers[22].output.shape)
print(model.layers[23].output.shape)
print(model.layers[24].output.shape)
print(model.layers[25].output.shape)
(?, ?) # flatten (Flatten)
(?, 4096) # fc1 (Dense)
(?, 4096) # fc2 (Dense)
(?, 1000) # predictions (Dense)
In addition, you can get the first FC layer directly by using its layer name, 'fc1'.
print(model.get_layer('fc1').output.shape)
(?, 4096)
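To see where the shifted index comes from without loading the model, here is a minimal sketch that rebuilds the list of layer names printed by model.summary(). The names conv_per_block and index_of are illustrative helpers, not part of Keras, and the layer names are simply copied from the summary above.

```python
# Number of Conv2D layers in each of the five VGG19 blocks (read off the summary).
conv_per_block = [2, 2, 4, 4, 4]

# Rebuild the layer names in the order model.layers would list them.
names = ["input_1"]
for block, n_conv in enumerate(conv_per_block, start=1):
    names += [f"block{block}_conv{i}" for i in range(1, n_conv + 1)]
    names.append(f"block{block}_pool")
names += ["flatten", "fc1", "fc2", "predictions"]

# Map each layer name to its position, as model.layers[i] would see it.
index_of = {name: i for i, name in enumerate(names)}

print(len(names))           # 26
print(index_of["flatten"])  # 22
print(index_of["fc1"])      # 23
```

Because input_1 sits at index 0, every later layer is one position further along than a count that starts at the first convolution, which is why looking layers up by name tends to be less error-prone than using a hard-coded index.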
The 19 in VGG-19 refers to the layers with learnable weights. If you print the model summary, you get the following:
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 224, 224, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 224, 224, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 112, 112, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 112, 112, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 112, 112, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 56, 56, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 56, 56, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_conv4 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 28, 28, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_conv4 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 14, 14, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv4 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 7, 7, 512) 0
_________________________________________________________________
flatten (Flatten) (None, 25088) 0
_________________________________________________________________
fc1 (Dense) (None, 4096) 102764544
_________________________________________________________________
fc2 (Dense) (None, 4096) 16781312
_________________________________________________________________
predictions (Dense) (None, 1000) 4097000
=================================================================
Total params: 143,667,240
Trainable params: 143,667,240
Non-trainable params: 0
Here you have 7 layers that don't have any learnable weights: one InputLayer, five MaxPooling2D layers and one Flatten layer. The remaining 19 layers (16 Conv2D and 3 Dense) are the ones the name refers to. This is how you get 26 layers (19+1+5+1).
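The 19+1+5+1 arithmetic can be checked with a short sketch that tallies the layer types listed in the summary. It uses only the standard library; the WEIGHTED set encodes the assumption that, in this model, only Conv2D and Dense layers carry parameters, which matches the Param # column above.

```python
from collections import Counter

# Layer types in the order they appear in the VGG19 summary.
conv_per_block = [2, 2, 4, 4, 4]
layer_types = ["InputLayer"]
for n_conv in conv_per_block:
    layer_types += ["Conv2D"] * n_conv + ["MaxPooling2D"]
layer_types += ["Flatten", "Dense", "Dense", "Dense"]

counts = Counter(layer_types)

# Layer types that have a non-zero Param # in the summary.
WEIGHTED = {"Conv2D", "Dense"}
with_weights = sum(n for t, n in counts.items() if t in WEIGHTED)
without_weights = sum(n for t, n in counts.items() if t not in WEIGHTED)

print(counts["Conv2D"], counts["Dense"])  # 16 3
print(with_weights)                       # 19
print(without_weights)                    # 7
print(len(layer_types))                   # 26
```

With Keras available, the same check can be run on the real model by counting the layers whose layer.weights list is non-empty.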
Related
I am trying to train a model that predicts a 128-d vector for face recognition. The input of the model is an image and the output is a 128-d vector (regression), which I get from the "face_recognition" library.
When I pass the 128-d targets to training, I get this error:
ValueError: Error when checking target: expected dense_24 to have shape (1,) but got array with shape (128,)
But when I use only one output, the fit function works.
The strange part is that the prediction shape is (1, 128), yet I can't train with 128-d targets.
Here is my model:
from keras.applications.vgg16 import VGG16
from keras.layers import Flatten, Dense
from keras import models
import keras

def build_facereg_disc():
    # load the VGG16 convolutional base without its classifier
    model = VGG16(include_top=False, input_shape=(64, 64, 3))
    # add new classifier layers
    flat1 = Flatten()(model.outputs)
    class1 = Dense(2048, activation='relu')(flat1)
    output = Dense(128, activation='relu')(class1)
    # define new model
    model = models.Model(inputs=model.inputs, outputs=output)
    return model

facereg_disc = build_facereg_disc()
facereg_disc.compile(optimizer=keras.optimizers.Adam(),  # Optimizer
                     # Loss function to minimize
                     loss=keras.losses.SparseCategoricalCrossentropy(),
                     # List of metrics to monitor
                     metrics=['binary_crossentropy'])
And summary:
Model: "model_27"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_20 (InputLayer) (None, 64, 64, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 64, 64, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 64, 64, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 32, 32, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 32, 32, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 32, 32, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 16, 16, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 16, 16, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 16, 16, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 16, 16, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 8, 8, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 8, 8, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 8, 8, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 8, 8, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 4, 4, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 4, 4, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 4, 4, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 4, 4, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 2, 2, 512) 0
_________________________________________________________________
flatten_10 (Flatten) (None, 2048) 0
_________________________________________________________________
dense_23 (Dense) (None, 2048) 4196352
_________________________________________________________________
dense_24 (Dense) (None, 128) 262272
=================================================================
Total params: 19,173,312
Trainable params: 19,173,312
Non-trainable params: 0
Here is the preprocessing:
import os
import numpy as np
from keras.preprocessing.image import load_img, img_to_array

dir_data = "data_faces/img_align_celeba/"
Ntrain = 2000
Ntest = 100
nm_imgs = np.sort(os.listdir(dir_data))
## name of the jpg files for training set
nm_imgs_train = nm_imgs[:Ntrain]
## name of the jpg files for the testing data
nm_imgs_test = nm_imgs[Ntrain:Ntrain + Ntest]
img_shape = (64, 64, 3)

def get_npdata(nm_imgs_train):
    # load each image, resize it to 64x64 and scale pixel values to [0, 1]
    X_train = []
    for i, myid in enumerate(nm_imgs_train):
        image = load_img(dir_data + "/" + myid,
                         target_size=img_shape[:2])
        image = img_to_array(image) / 255.0
        X_train.append(image)
    X_train = np.array(X_train)
    return X_train

X_train = get_npdata(nm_imgs_train)
# X_train.shape is (2000, 64, 64, 3)
# y_train.shape is (2000, 128)
I sample a batch like this:
idx = np.random.randint(0, X_train.shape[0], half_batch)
imgs = X_train[idx]
labels = y_train[idx]
reg_d_loss_real = facereg_disc.train_on_batch(imgs, labels)
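Your issue comes from your loss function. As explained in the docs, SparseCategoricalCrossentropy expects each sample in y_true to be a single integer encoding the class (hence the expected shape (1,)), whereas CategoricalCrossentropy expects one vector per sample with the same length as the model output, typically a one-hot encoding. Your targets have shape (128,), which matches the second form.
So, switch to CategoricalCrossentropy and the shape error should go away. Note, however, that your targets are real-valued embeddings rather than class labels, so a regression loss such as mean squared error is probably a better fit for this task.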
However, to reproduce, I had to change:
flat1 = Flatten()(model.outputs)
To:
flat1 = Flatten()(model.outputs[0])
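The difference between the target formats can be illustrated with simplified, single-sample re-implementations of the losses. This is only a sketch in plain Python to show what each loss expects as y_true; it is not the Keras code, and the function names and sample values are made up for the illustration. Mean squared error is included as the usual choice for real-valued targets such as embeddings.

```python
import math

def sparse_categorical_crossentropy(class_index, probs):
    # y_true is ONE integer per sample: the index of the correct class.
    return -math.log(probs[class_index])

def categorical_crossentropy(one_hot, probs):
    # y_true is a vector with the same length as the prediction.
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def mean_squared_error(target, pred):
    # y_true is a real-valued vector, e.g. a (shortened) face embedding.
    return sum((t - p) ** 2 for t, p in zip(target, pred)) / len(target)

# Classification: both cross-entropies agree when the targets describe the same class.
probs = [0.7, 0.2, 0.1]
loss_sparse = sparse_categorical_crossentropy(0, probs)
loss_onehot = categorical_crossentropy([1, 0, 0], probs)
print(math.isclose(loss_sparse, loss_onehot))  # True

# Regression: the target is a vector of arbitrary real numbers, not a class.
embedding_true = [0.1, -0.3, 0.5]
embedding_pred = [0.0, -0.1, 0.4]
loss_mse = mean_squared_error(embedding_true, embedding_pred)
```

In this picture, passing a 128-value vector where a single class index is expected is exactly the mismatch the error message reports. In Keras, a regression loss can be selected at compile time with loss='mse'.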
I have a pretrained model with this summary:
Layer (type) Output Shape Param #
=================================================================
vgg19 (Model) (None, 4, 4, 512) 20024384
_________________________________________________________________
flatten_1 (Flatten) (None, 8192) 0
_________________________________________________________________
dense_1 (Dense) (None, 1024) 8389632
_________________________________________________________________
dropout_1 (Dropout) (None, 1024) 0
_________________________________________________________________
dense_2 (Dense) (None, 1024) 1049600
_________________________________________________________________
dense_3 (Dense) (None, 5) 5125
=================================================================
I need a version with vgg19 expanded into its individual layers rather than shown as a single layer. Something like this:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 128, 128, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 128, 128, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 128, 128, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 64, 64, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 64, 64, 128) 73856
.
.
.
** end of vgg19 **
_________________________________________________________________
flatten_1 (Flatten) (None, 8192) 0
_________________________________________________________________
dense_1 (Dense) (None, 1024) 8389632
_________________________________________________________________
dropout_1 (Dropout) (None, 1024) 0
_________________________________________________________________
dense_2 (Dense) (None, 1024) 1049600
_________________________________________________________________
dense_3 (Dense) (None, 5) 5125
=================================================================
I have tried copying it layer by layer, but I ran into lots of problems. Is there a way to accomplish this that also copies the weights?
I don't know how you implemented it, but here is how I did it. I hope it helps.
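from keras.applications.vgg19 import VGG19
from keras.models import Model
from keras.layers import Flatten, Dense, Dropout

# VGG19 convolutional base with ImageNet weights, without the classifier
model = VGG19(weights='imagenet', include_top=False, input_shape=(128, 128, 3))
# attach the new head to the base's output tensor so that the VGG19 layers
# appear individually in the summary instead of as one nested model
flatten_1 = Flatten()(model.output)
dense_1 = Dense(1024)(flatten_1)
dropout_1 = Dropout(0.2)(dense_1)
dense_2 = Dense(1024)(dropout_1)
dense_3 = Dense(5)(dense_2)
model = Model(inputs=model.input, outputs=dense_3)
model.summary()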
Result.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 128, 128, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 128, 128, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 128, 128, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 64, 64, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 64, 64, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 64, 64, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 32, 32, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 32, 32, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 32, 32, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 32, 32, 256) 590080
_________________________________________________________________
block3_conv4 (Conv2D) (None, 32, 32, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 16, 16, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 16, 16, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 16, 16, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 16, 16, 512) 2359808
_________________________________________________________________
block4_conv4 (Conv2D) (None, 16, 16, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 8, 8, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 8, 8, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 8, 8, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 8, 8, 512) 2359808
_________________________________________________________________
block5_conv4 (Conv2D) (None, 8, 8, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 4, 4, 512) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 8192) 0
_________________________________________________________________
dense_1 (Dense) (None, 1024) 8389632
_________________________________________________________________
dropout_1 (Dropout) (None, 1024) 0
_________________________________________________________________
dense_2 (Dense) (None, 1024) 1049600
_________________________________________________________________
dense_3 (Dense) (None, 5) 5125
=================================================================
Total params: 29,468,741
Trainable params: 29,468,741
Non-trainable params: 0
_________________________________________________________________
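As a sanity check that the expanded model holds the same parameters as the nested one, the totals can be recomputed by hand. The sketch below assumes 3x3 convolution kernels with one bias per filter, which is consistent with the Param # column (for example (3*3*3 + 1) * 64 = 1792 for block1_conv1); the helper names are illustrative.

```python
def conv2d_params(kernel, in_ch, out_ch):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_ch + 1) * out_ch

def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return (in_units + 1) * out_units

conv_per_block = [2, 2, 4, 4, 4]
filters_per_block = [64, 128, 256, 512, 512]

# Convolutional base: the pooling layers add no parameters.
vgg19_base = 0
in_ch = 3  # RGB input
for n_conv, out_ch in zip(conv_per_block, filters_per_block):
    for _ in range(n_conv):
        vgg19_base += conv2d_params(3, in_ch, out_ch)
        in_ch = out_ch

# Classifier head: Flatten (4*4*512 = 8192) -> 1024 -> 1024 -> 5.
flat_units = 4 * 4 * 512
head = (dense_params(flat_units, 1024)
        + dense_params(1024, 1024)
        + dense_params(1024, 5))

print(vgg19_base)         # 20024384
print(head)               # 9444357
print(vgg19_base + head)  # 29468741
```

The base total matches the single vgg19 (Model) row of the nested summary, and the grand total matches the expanded one. Matching counts only confirm that the architectures agree; the trained weights of the original dense layers would still need to be transferred, for example layer by layer with get_weights() and set_weights().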
Could anyone help me with this multi-class semantic segmentation problem? I have modified some code to accept RGB images and RGB labels as masks. I am using the following model:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
img (InputLayer) (None, 128, 128, 3) 0
__________________________________________________________________________________________________
conv2d_170 (Conv2D) (None, 128, 128, 16) 448 img[0][0]
__________________________________________________________________________________________________
batch_normalization_163 (BatchN (None, 128, 128, 16) 64 conv2d_170[0][0]
__________________________________________________________________________________________________
activation_163 (Activation) (None, 128, 128, 16) 0 batch_normalization_163[0][0]
__________________________________________________________________________________________________
conv2d_171 (Conv2D) (None, 128, 128, 16) 2320 activation_163[0][0]
__________________________________________________________________________________________________
batch_normalization_164 (BatchN (None, 128, 128, 16) 64 conv2d_171[0][0]
__________________________________________________________________________________________________
activation_164 (Activation) (None, 128, 128, 16) 0 batch_normalization_164[0][0]
__________________________________________________________________________________________________
max_pooling2d_37 (MaxPooling2D) (None, 64, 64, 16) 0 activation_164[0][0]
__________________________________________________________________________________________________
dropout_73 (Dropout) (None, 64, 64, 16) 0 max_pooling2d_37[0][0]
__________________________________________________________________________________________________
conv2d_172 (Conv2D) (None, 64, 64, 32) 4640 dropout_73[0][0]
__________________________________________________________________________________________________
batch_normalization_165 (BatchN (None, 64, 64, 32) 128 conv2d_172[0][0]
__________________________________________________________________________________________________
activation_165 (Activation) (None, 64, 64, 32) 0 batch_normalization_165[0][0]
__________________________________________________________________________________________________
conv2d_173 (Conv2D) (None, 64, 64, 32) 9248 activation_165[0][0]
__________________________________________________________________________________________________
batch_normalization_166 (BatchN (None, 64, 64, 32) 128 conv2d_173[0][0]
__________________________________________________________________________________________________
activation_166 (Activation) (None, 64, 64, 32) 0 batch_normalization_166[0][0]
__________________________________________________________________________________________________
max_pooling2d_38 (MaxPooling2D) (None, 32, 32, 32) 0 activation_166[0][0]
__________________________________________________________________________________________________
dropout_74 (Dropout) (None, 32, 32, 32) 0 max_pooling2d_38[0][0]
__________________________________________________________________________________________________
conv2d_174 (Conv2D) (None, 32, 32, 64) 18496 dropout_74[0][0]
__________________________________________________________________________________________________
batch_normalization_167 (BatchN (None, 32, 32, 64) 256 conv2d_174[0][0]
__________________________________________________________________________________________________
activation_167 (Activation) (None, 32, 32, 64) 0 batch_normalization_167[0][0]
__________________________________________________________________________________________________
conv2d_175 (Conv2D) (None, 32, 32, 64) 36928 activation_167[0][0]
__________________________________________________________________________________________________
batch_normalization_168 (BatchN (None, 32, 32, 64) 256 conv2d_175[0][0]
__________________________________________________________________________________________________
activation_168 (Activation) (None, 32, 32, 64) 0 batch_normalization_168[0][0]
__________________________________________________________________________________________________
max_pooling2d_39 (MaxPooling2D) (None, 16, 16, 64) 0 activation_168[0][0]
__________________________________________________________________________________________________
dropout_75 (Dropout) (None, 16, 16, 64) 0 max_pooling2d_39[0][0]
__________________________________________________________________________________________________
conv2d_176 (Conv2D) (None, 16, 16, 128) 73856 dropout_75[0][0]
__________________________________________________________________________________________________
batch_normalization_169 (BatchN (None, 16, 16, 128) 512 conv2d_176[0][0]
__________________________________________________________________________________________________
activation_169 (Activation) (None, 16, 16, 128) 0 batch_normalization_169[0][0]
__________________________________________________________________________________________________
conv2d_177 (Conv2D) (None, 16, 16, 128) 147584 activation_169[0][0]
__________________________________________________________________________________________________
batch_normalization_170 (BatchN (None, 16, 16, 128) 512 conv2d_177[0][0]
__________________________________________________________________________________________________
activation_170 (Activation) (None, 16, 16, 128) 0 batch_normalization_170[0][0]
__________________________________________________________________________________________________
max_pooling2d_40 (MaxPooling2D) (None, 8, 8, 128) 0 activation_170[0][0]
__________________________________________________________________________________________________
dropout_76 (Dropout) (None, 8, 8, 128) 0 max_pooling2d_40[0][0]
__________________________________________________________________________________________________
conv2d_178 (Conv2D) (None, 8, 8, 256) 295168 dropout_76[0][0]
__________________________________________________________________________________________________
batch_normalization_171 (BatchN (None, 8, 8, 256) 1024 conv2d_178[0][0]
__________________________________________________________________________________________________
activation_171 (Activation) (None, 8, 8, 256) 0 batch_normalization_171[0][0]
__________________________________________________________________________________________________
conv2d_179 (Conv2D) (None, 8, 8, 256) 590080 activation_171[0][0]
__________________________________________________________________________________________________
batch_normalization_172 (BatchN (None, 8, 8, 256) 1024 conv2d_179[0][0]
__________________________________________________________________________________________________
activation_172 (Activation) (None, 8, 8, 256) 0 batch_normalization_172[0][0]
__________________________________________________________________________________________________
conv2d_transpose_37 (Conv2DTran (None, 16, 16, 128) 295040 activation_172[0][0]
__________________________________________________________________________________________________
concatenate_37 (Concatenate) (None, 16, 16, 256) 0 conv2d_transpose_37[0][0]
activation_170[0][0]
__________________________________________________________________________________________________
dropout_77 (Dropout) (None, 16, 16, 256) 0 concatenate_37[0][0]
__________________________________________________________________________________________________
conv2d_180 (Conv2D) (None, 16, 16, 128) 295040 dropout_77[0][0]
__________________________________________________________________________________________________
batch_normalization_173 (BatchN (None, 16, 16, 128) 512 conv2d_180[0][0]
__________________________________________________________________________________________________
activation_173 (Activation) (None, 16, 16, 128) 0 batch_normalization_173[0][0]
__________________________________________________________________________________________________
conv2d_181 (Conv2D) (None, 16, 16, 128) 147584 activation_173[0][0]
__________________________________________________________________________________________________
batch_normalization_174 (BatchN (None, 16, 16, 128) 512 conv2d_181[0][0]
__________________________________________________________________________________________________
activation_174 (Activation) (None, 16, 16, 128) 0 batch_normalization_174[0][0]
__________________________________________________________________________________________________
conv2d_transpose_38 (Conv2DTran (None, 32, 32, 64) 73792 activation_174[0][0]
__________________________________________________________________________________________________
concatenate_38 (Concatenate) (None, 32, 32, 128) 0 conv2d_transpose_38[0][0]
activation_168[0][0]
__________________________________________________________________________________________________
dropout_78 (Dropout) (None, 32, 32, 128) 0 concatenate_38[0][0]
__________________________________________________________________________________________________
conv2d_182 (Conv2D) (None, 32, 32, 64) 73792 dropout_78[0][0]
__________________________________________________________________________________________________
batch_normalization_175 (BatchN (None, 32, 32, 64) 256 conv2d_182[0][0]
__________________________________________________________________________________________________
activation_175 (Activation) (None, 32, 32, 64) 0 batch_normalization_175[0][0]
__________________________________________________________________________________________________
conv2d_183 (Conv2D) (None, 32, 32, 64) 36928 activation_175[0][0]
__________________________________________________________________________________________________
batch_normalization_176 (BatchN (None, 32, 32, 64) 256 conv2d_183[0][0]
__________________________________________________________________________________________________
activation_176 (Activation) (None, 32, 32, 64) 0 batch_normalization_176[0][0]
__________________________________________________________________________________________________
conv2d_transpose_39 (Conv2DTran (None, 64, 64, 32) 18464 activation_176[0][0]
__________________________________________________________________________________________________
concatenate_39 (Concatenate) (None, 64, 64, 64) 0 conv2d_transpose_39[0][0]
activation_166[0][0]
__________________________________________________________________________________________________
dropout_79 (Dropout) (None, 64, 64, 64) 0 concatenate_39[0][0]
__________________________________________________________________________________________________
conv2d_184 (Conv2D) (None, 64, 64, 32) 18464 dropout_79[0][0]
__________________________________________________________________________________________________
batch_normalization_177 (BatchN (None, 64, 64, 32) 128 conv2d_184[0][0]
__________________________________________________________________________________________________
activation_177 (Activation) (None, 64, 64, 32) 0 batch_normalization_177[0][0]
__________________________________________________________________________________________________
conv2d_185 (Conv2D) (None, 64, 64, 32) 9248 activation_177[0][0]
__________________________________________________________________________________________________
batch_normalization_178 (BatchN (None, 64, 64, 32) 128 conv2d_185[0][0]
__________________________________________________________________________________________________
activation_178 (Activation) (None, 64, 64, 32) 0 batch_normalization_178[0][0]
__________________________________________________________________________________________________
conv2d_transpose_40 (Conv2DTran (None, 128, 128, 16) 4624 activation_178[0][0]
__________________________________________________________________________________________________
concatenate_40 (Concatenate) (None, 128, 128, 32) 0 conv2d_transpose_40[0][0]
activation_164[0][0]
__________________________________________________________________________________________________
dropout_80 (Dropout) (None, 128, 128, 32) 0 concatenate_40[0][0]
__________________________________________________________________________________________________
conv2d_186 (Conv2D) (None, 128, 128, 16) 4624 dropout_80[0][0]
__________________________________________________________________________________________________
batch_normalization_179 (BatchN (None, 128, 128, 16) 64 conv2d_186[0][0]
__________________________________________________________________________________________________
activation_179 (Activation) (None, 128, 128, 16) 0 batch_normalization_179[0][0]
__________________________________________________________________________________________________
conv2d_187 (Conv2D) (None, 128, 128, 16) 2320 activation_179[0][0]
__________________________________________________________________________________________________
batch_normalization_180 (BatchN (None, 128, 128, 16) 64 conv2d_187[0][0]
__________________________________________________________________________________________________
activation_180 (Activation) (None, 128, 128, 16) 0 batch_normalization_180[0][0]
__________________________________________________________________________________________________
conv2d_188 (Conv2D) (None, 128, 128, 1) 17 activation_180[0][0]
==================================================================================================
Total params: 2,164,593
Trainable params: 2,161,649
Non-trainable params: 2,944
__________________________________________________________________________________________________
As you can see, the input has 3 channels. Should the last layer have 1 channel or 11 channels? The dataset I am using has 11 classes, which are denoted by different RGB value combinations in the image.
Thanks.
The last layer should have 11 channels, one for each of the 11 classes at every pixel location. It is just like doing multi-class classification for each pixel location.
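As a minimal sketch of what that implies for the labels, the RGB-coded masks can be converted into an 11-channel one-hot target before training. The palette below is hypothetical (the question does not list the actual colors), and rgb_mask_to_onehot and conv2d_params are illustrative helpers, not Keras functions. The parameter check uses the 16 input channels and 17 parameters of conv2d_188 from the summary, which are consistent with a 1x1 kernel.

```python
import numpy as np

# Hypothetical palette: one RGB triple per class (the real dataset's colors are not given).
PALETTE = np.array([
    (0, 0, 0), (128, 0, 0), (0, 128, 0), (128, 128, 0), (0, 0, 128), (128, 0, 128),
    (0, 128, 128), (128, 128, 128), (64, 0, 0), (192, 0, 0), (64, 128, 0),
], dtype=np.uint8)  # shape (11, 3)


def rgb_mask_to_onehot(mask, palette=PALETTE):
    """Convert an (H, W, 3) RGB mask into an (H, W, num_classes) one-hot array."""
    # Compare every pixel with every palette color: result has shape (H, W, num_classes).
    matches = (mask[:, :, None, :] == palette[None, None, :, :]).all(axis=-1)
    if not matches.any(axis=-1).all():
        raise ValueError("mask contains a color that is not in the palette")
    return matches.astype(np.float32)


def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Number of weights plus biases in a Conv2D layer."""
    return kernel_h * kernel_w * in_channels * filters + filters


# Fake (128, 128, 3) mask built from random class indices.
mask = PALETTE[np.random.randint(0, len(PALETTE), size=(128, 128))]
target = rgb_mask_to_onehot(mask)
print(target.shape)                 # (128, 128, 11)
print(conv2d_params(1, 1, 16, 1))   # 17, matches conv2d_188 above
print(conv2d_params(1, 1, 16, 11))  # 187 for an 11-filter output layer
```

With a target of this shape, the final layer would typically be a 1x1 Conv2D with 11 filters and a softmax activation, trained with categorical cross-entropy.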
The following is the architecture of a fine-tuned network with VGG16 as the base model.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
model_1 (Model) (None, 25088) 14714688
_________________________________________________________________
dense_1 (Dense) (None, 512) 12845568
_________________________________________________________________
dropout_1 (Dropout) (None, 512) 0
_________________________________________________________________
dense_2 (Dense) (None, 512) 262656
_________________________________________________________________
dropout_2 (Dropout) (None, 512) 0
_________________________________________________________________
dense_3 (Dense) (None, 1) 513
=================================================================
Total params: 27,823,425
Trainable params: 26,087,937
Non-trainable params: 1,735,488
_________________________________________________________________
I am trying to visualize the gradients of the loss with respect to the input, and the gradients of the output with respect to 'block5_conv3', using the following code:
from keras import backend as K

def build_backprop(model, loss):
    # Gradient of the loss function with respect to the input image
    gradients = K.gradients(loss, model.input)[0]
    # Normalize the gradients
    gradients /= (K.sqrt(K.mean(K.square(gradients))) + 1e-5)
    # Keras function to calculate the gradients and loss
    return K.function([model.input], [loss, gradients])

# Loss wrt input
# Loss function that optimizes one class
loss_function = K.mean(model.get_layer('dense_3').output)
# Backprop function
backprop = build_backprop(model.get_layer('model_1').get_layer('input_1'), loss_function)

# Output wrt block5_conv3
K.gradients(model.get_layer("dense_3").output, model.get_layer("model_1").get_layer("block5_conv3").output)[0]
Both above return AttributeError: 'NoneType' object has no attribute 'dtype' implying that in both cases K.gradients output is None.
What could cause the gradients to be None?
Is there any way to resolve this error?
Update
The None issue is resolved only if we convert the model from the Sequential API to the Functional API.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) (None, 224, 224, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 224, 224, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 112, 112, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 112, 112, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 112, 112, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 56, 56, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 56, 56, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 28, 28, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 14, 14, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 7, 7, 512) 0
_________________________________________________________________
flatten_2 (Flatten) (None, 25088) 0
_________________________________________________________________
dense_10 (Dense) (None, 512) 12845568
_________________________________________________________________
dropout_7 (Dropout) (None, 512) 0
_________________________________________________________________
dense_11 (Dense) (None, 512) 262656
_________________________________________________________________
dropout_8 (Dropout) (None, 512) 0
_________________________________________________________________
dense_12 (Dense) (None, 2) 1026
=================================================================
Total params: 27,823,938
Trainable params: 20,188,674
Non-trainable params: 7,635,264
_________________________________________________________________
This is the new architecture after the change. Now the problem is that all the gradients come out as zeros.
For example:
preds = model.predict(x)
class_idx = np.argmax(preds[0])
class_output = model.output[:, class_idx]
last_conv_layer = model.get_layer("block5_conv3")
grads = K.gradients(class_output, last_conv_layer.output)[0]
pooled_grads = K.mean(grads, axis=(0, 1, 2))
iterate = K.function([model.input], [pooled_grads, last_conv_layer.output[0]])
pooled_grads_value, conv_layer_output_value = iterate([x])
for i in range(512):
    conv_layer_output_value[:, :, i] *= pooled_grads_value[i]
The output of pooled_grads_value and conv_layer_output_value are all zeros.
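For reference, the channel-weighting loop above, plus the averaging and normalization steps that usually follow in a Grad-CAM style heatmap, can be sketched in NumPy. The arrays are random stand-ins for conv_layer_output_value and pooled_grads_value, and gradcam_heatmap is a hypothetical helper name. The post does not show its own post-processing, so the last steps are an assumption. The sketch also shows that all-zero pooled gradients necessarily give an empty heatmap.

```python
import numpy as np


def gradcam_heatmap(conv_output, pooled_grads):
    """conv_output: (H, W, C) activations; pooled_grads: (C,) averaged gradients."""
    # Broadcasting scales every channel by its pooled gradient (same as the loop above).
    weighted = conv_output * pooled_grads
    # Average over channels and keep only positive evidence.
    heatmap = np.maximum(weighted.mean(axis=-1), 0)
    peak = heatmap.max()
    # Avoid dividing by zero when every gradient is zero.
    return heatmap / peak if peak > 0 else heatmap


rng = np.random.RandomState(0)
conv_output = rng.rand(14, 14, 512).astype(np.float32)  # stand-in for block5_conv3 output
pooled_grads = rng.randn(512).astype(np.float32)        # stand-in for pooled gradients

heatmap = gradcam_heatmap(conv_output, pooled_grads)
zero_heatmap = gradcam_heatmap(conv_output, np.zeros(512, dtype=np.float32))
print(heatmap.shape)       # (14, 14)
print(zero_heatmap.max())  # 0.0
```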
I was able to solve both problems.
Question 1: Both above return AttributeError: 'NoneType' object has no attribute 'dtype' implying that in both cases K.gradients output is None.
The problem here was that the model was built with the Sequential API. After converting it from Sequential to Functional, this problem vanished and a new one appeared.
Question 2: The output of pooled_grads_value and conv_layer_output_value are all zeros.
I resolved this problem by replacing the softmax activation of the last layer with a linear activation.
Here is the code:
from vis.utils import utils
from keras import activations
# Utility to search for layer index by name.
# Alternatively we can specify this as -1 since it corresponds to the last layer.
layer_idx = utils.find_layer_idx(model, 'dense_12')
# Swap softmax with linear
model.layers[layer_idx].activation = activations.linear
model = utils.apply_modifications(model)
This swap worked perfectly and I obtained the desired results.
The only part I still don't understand is why it doesn't work with softmax. Would it work if we replaced the final softmax layer with a single-output sigmoid?
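One plausible explanation for the softmax behaviour, offered as a sketch rather than a diagnosis of this particular model: the derivative of a softmax output with respect to its own logit is p(1 - p), which shrinks towards zero as the prediction becomes confident, and every gradient further back is multiplied by it. A sigmoid output has a derivative of the same form, so a single sigmoid unit can saturate in the same way, whereas a linear output always has derivative 1. The logits below are made up for illustration, and float32 is used because it is the Keras default.

```python
import numpy as np


def softmax(z):
    e = np.exp(z - np.max(z))  # shift for numerical stability
    return e / e.sum()


def softmax_self_grad(z, idx):
    """d softmax(z)[idx] / d z[idx] = p * (1 - p)."""
    p = softmax(z)[idx]
    return p * (1.0 - p)


# The larger the margin between the two logits, the smaller the gradient.
for margin in (1.0, 10.0, 50.0):
    logits = np.array([margin, 0.0], dtype=np.float32)  # hypothetical 2-class logits
    print(margin, softmax_self_grad(logits, 0))
```

In float32 the probability for the largest margin rounds to exactly 1, so this derivative becomes exactly 0.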
I'm running some tutorial code for a binary image classification problem. It's a very simple architecture (3 convolution/ReLU/pooling blocks plus a fully connected part), but the final training step of each epoch takes ~130 seconds, whereas the first 127 steps take 20 seconds in total. Could anyone explain this, and can I somehow speed it up? I'm running on a GPU with 2 GB of VRAM.
rmsprop = optimizers.RMSprop(lr=0.001, rho=0.9, epsilon=1e-08, decay=0.0)
model.compile(loss='binary_crossentropy',
              optimizer=rmsprop,
              metrics=['accuracy'])
nb_epoch = 30
nb_train_samples = 2048
nb_validation_samples = 832
model.summary()
model.fit_generator(
    train_generator,
    samples_per_epoch=nb_train_samples,
    nb_epoch=nb_epoch,
    validation_data=validation_generator,
    nb_val_samples=nb_validation_samples)
127/128 [============================>.] - ETA: 0s - loss: 0.7302 - acc: 0.5266
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 148, 148, 32) 896
_________________________________________________________________
activation_1 (Activation) (None, 148, 148, 32) 0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 74, 74, 32) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 72, 72, 32) 9248
_________________________________________________________________
activation_2 (Activation) (None, 72, 72, 32) 0
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 36, 36, 32) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 34, 34, 64) 18496
_________________________________________________________________
activation_3 (Activation) (None, 34, 34, 64) 0
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 17, 17, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 18496) 0
_________________________________________________________________
dense_1 (Dense) (None, 64) 1183808
_________________________________________________________________
activation_4 (Activation) (None, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 64) 0
_________________________________________________________________
dense_2 (Dense) (None, 1) 65
_________________________________________________________________
activation_5 (Activation) (None, 1) 0
=================================================================
Total params: 1,212,513.0
Trainable params: 1,212,513.0
Non-trainable params: 0.0
_________________________________________________________________
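Your script probably validates (or saves the model) at the end of each epoch, and that time is not accounted for in the progress bar's ETA. Since validation_data is passed to fit_generator, the 832 validation samples are evaluated after the last training batch, so the whole validation pass shows up as part of the final step.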
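One way to check this is to time the gap between the last training batch and the end of the epoch. The sketch below is plain Python with an injectable clock so that it runs without Keras. The method names mirror the hooks of keras.callbacks.Callback (on_epoch_begin, on_batch_end, on_epoch_end), but the class itself, EpochPhaseTimer, is hypothetical and would need to subclass Callback before it could be passed to fit_generator. It assumes that Keras runs validation after the last batch and before on_epoch_end.

```python
import time


class EpochPhaseTimer(object):
    """Splits each epoch into time spent in training batches and time spent afterwards."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.history = []  # one (train_seconds, post_train_seconds) tuple per epoch

    def on_epoch_begin(self, epoch, logs=None):
        self.epoch_start = self.clock()
        self.last_batch_end = self.epoch_start

    def on_batch_end(self, batch, logs=None):
        self.last_batch_end = self.clock()

    def on_epoch_end(self, epoch, logs=None):
        now = self.clock()
        train = self.last_batch_end - self.epoch_start
        # Validation (and anything else run at epoch end) happens in this gap.
        post_train = now - self.last_batch_end
        self.history.append((train, post_train))


# Simulated epoch with a fake clock: 20 s of batches, then 130 s before the epoch ends.
ticks = iter([0.0, 10.0, 20.0, 150.0])
timer = EpochPhaseTimer(clock=lambda: next(ticks))
timer.on_epoch_begin(0)
timer.on_batch_end(0)
timer.on_batch_end(1)
timer.on_epoch_end(0)
print(timer.history)  # [(20.0, 130.0)]
```

If the second number dominates in a real run, the validation pass (its size, or the cost of loading its images) is the place to look for a speed-up.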