I am trying to train a model that predicts a 128-d vector for face recognition. The input of the model is an image and the output is a 128-d vector (regression), which I get from the "face_recognition" library.
When I train with all 128 outputs, I get this error:
ValueError: Error when checking target: expected dense_24 to have shape (1,) but got array with shape (128,)
But when I try only one output, the fit function works.
The strange part is that the prediction shape is (1, 128), yet I can't train with 128-d targets.
Here is my model:
from keras.applications.vgg16 import VGG16
from keras.layers import Flatten, Dense
from keras import models
import keras
def build_facereg_disc():
    # load model
    model = VGG16(include_top=False, input_shape=(64, 64, 3))
    # add new classifier layers
    flat1 = Flatten()(model.outputs)
    class1 = Dense(2048, activation='relu')(flat1)
    output = Dense(128, activation='relu')(class1)
    # define new model
    model = models.Model(inputs=model.inputs, outputs=output)
    # return the assembled model
    return model
facereg_disc = build_facereg_disc()
facereg_disc.compile(optimizer=keras.optimizers.Adam(),  # Optimizer
                     # Loss function to minimize
                     loss=keras.losses.SparseCategoricalCrossentropy(),
                     # List of metrics to monitor
                     metrics=['binary_crossentropy'])
And summary:
Model: "model_27"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_20 (InputLayer) (None, 64, 64, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 64, 64, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 64, 64, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 32, 32, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 32, 32, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 32, 32, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 16, 16, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 16, 16, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 16, 16, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 16, 16, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 8, 8, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 8, 8, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 8, 8, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 8, 8, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 4, 4, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 4, 4, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 4, 4, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 4, 4, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 2, 2, 512) 0
_________________________________________________________________
flatten_10 (Flatten) (None, 2048) 0
_________________________________________________________________
dense_23 (Dense) (None, 2048) 4196352
_________________________________________________________________
dense_24 (Dense) (None, 128) 262272
=================================================================
Total params: 19,173,312
Trainable params: 19,173,312
Non-trainable params: 0
Here is preprocessing:
# imports assumed from the names used below
import os
import numpy as np
from keras.preprocessing.image import load_img, img_to_array
dir_data = "data_faces/img_align_celeba/"
Ntrain = 2000
Ntest = 100
nm_imgs = np.sort(os.listdir(dir_data))
## name of the jpg files for training set
nm_imgs_train = nm_imgs[:Ntrain]
## name of the jpg files for the testing data
nm_imgs_test = nm_imgs[Ntrain:Ntrain + Ntest]
img_shape = (64, 64, 3)
def get_npdata(nm_imgs_train):
    X_train = []
    for i, myid in enumerate(nm_imgs_train):
        image = load_img(dir_data + "/" + myid,
                         target_size=img_shape[:2])
        image = img_to_array(image)/255.0
        X_train.append(image)
    X_train = np.array(X_train)
    return(X_train)
X_train = get_npdata(nm_imgs_train)
X_train.shape is (2000, 64, 64, 3)
y_train.shape is (2000, 128)
I sample each training batch like this:
idx = np.random.randint(0, X_train.shape[0], half_batch)
imgs = X_train[idx]
labels = y_train[idx]
reg_d_loss_real = facereg_disc.train_on_batch(imgs, labels)
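Your issue comes from your loss function. As explained in the docs, SparseCategoricalCrossentropy expects each sample in y_true to be a single integer encoding the class, which is why Keras expects a target of shape (1,). CategoricalCrossentropy instead expects one vector per sample (normally a one-hot encoding), so it accepts your (128,) targets.
So, switch to CategoricalCrossentropy and the shape error should go away. Note, though, that since your targets are real-valued embeddings rather than one-hot vectors, a regression loss such as mean squared error may be a better fit.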
However, to reproduce, I had to change:
flat1 = Flatten()(model.outputs)
To:
flat1 = Flatten()(model.outputs[0])
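To make the target-shape difference between these kinds of loss concrete, here is a minimal NumPy sketch. The functions sparse_cce, cce and mse are illustrative stand-ins written for this example (they are not the Keras implementations), and the random data is made up.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, dim = 4, 128

# Fake model output: one 128-d vector per sample.
y_pred = rng.random((batch, dim))

# Sparse categorical cross-entropy: ONE integer class index per sample.
y_sparse = rng.integers(0, dim, size=(batch,))
# Categorical cross-entropy: one one-hot (or probability) vector per sample.
y_onehot = np.eye(dim)[y_sparse]
# Regression loss such as MSE: one real-valued vector per sample.
y_embed = rng.normal(size=(batch, dim))

def _to_probs(y_pred):
    # Normalise each row so it sums to 1, and avoid log(0).
    probs = y_pred / y_pred.sum(axis=1, keepdims=True)
    return np.clip(probs, 1e-7, 1.0)

def sparse_cce(y_true, y_pred):
    probs = _to_probs(y_pred)
    return -np.log(probs[np.arange(len(y_true)), y_true]).mean()

def cce(y_true, y_pred):
    probs = _to_probs(y_pred)
    return -(y_true * np.log(probs)).sum(axis=1).mean()

def mse(y_true, y_pred):
    return ((y_true - y_pred) ** 2).mean()

print(y_sparse.shape, y_onehot.shape, y_embed.shape)
print(sparse_cce(y_sparse, y_pred), cce(y_onehot, y_pred), mse(y_embed, y_pred))
```

The two cross-entropy variants compute the same quantity and differ only in how the label is encoded. For real-valued 128-d targets, passing something like keras.losses.MeanSquaredError() as the loss in compile is one option to try. Also, if the embeddings contain negative values, an output layer with 'relu' activation cannot reach them, so a linear output may be worth testing.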
I am experimenting/fiddling/learning with some small ML problems.
I have a loaded model based on a pre-trained convolution base with some self-trained dense layers (for model details see below).
I wanted to try to apply some visualizations like activations and the Grad CAM Visualization (https://www.statworx.com/de/blog/erklaerbbarkeit-von-deep-learning-modellen-mit-grad-cam/) on the model. But I was not able to do so.
I tried to create a new model based on mine (like in the article) with
grad_model = tf.keras.models.Model(model.inputs,
[model.get_layer('vgg16').output,
model.output])
but this already fails with the error:
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_5_12:0", shape=(None, None, None, 3), dtype=float32) at layer "block1_conv1". The following previous layers were accessed without issue: []
I do not understand what this means. The model surely works (I can evaluate it and make predictions with it).
The call does not fail if I omit model.get_layer('vgg16').output from the outputs list, but of course this is required for the visualization.
What am I doing wrong?
In a model that I constructed and trained from scratch, I was able to create a similar model with the activations as outputs, but here I get these errors.
My model's details
The model was created with the following code and then trained and saved.
from tensorflow import keras
from tensorflow.keras import models
from tensorflow.keras import layers
from tensorflow.keras import optimizers
conv_base = keras.applications.vgg16.VGG16(
    weights="vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5",
    include_top=False)
conv_base.trainable = False
data_augmentation = keras.Sequential(
    [
        layers.experimental.preprocessing.RandomFlip("horizontal"),
        layers.experimental.preprocessing.RandomRotation(0.1),
        layers.experimental.preprocessing.RandomZoom(0.2),
    ]
)
inputs = keras.Input(shape=(180, 180, 3))
x = data_augmentation(inputs)
x = conv_base(x)
x = layers.Flatten()(x)
x = layers.Dense(256)(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
model.compile(loss="binary_crossentropy",
optimizer="rmsprop",
metrics=["accuracy"])
Later, it was loaded:
model = keras.models.load_model("myModel.keras")
print(model.summary())
print(model.get_layer('sequential').summary())
print(model.get_layer('vgg16').summary())
output:
Model: "functional_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_6 (InputLayer) [(None, 180, 180, 3)] 0
_________________________________________________________________
sequential (Sequential) (None, 180, 180, 3) 0
_________________________________________________________________
vgg16 (Functional) (None, None, None, 512) 14714688
_________________________________________________________________
flatten_1 (Flatten) (None, 12800) 0
_________________________________________________________________
dense_2 (Dense) (None, 256) 3277056
_________________________________________________________________
dropout_1 (Dropout) (None, 256) 0
_________________________________________________________________
dense_3 (Dense) (None, 1) 257
=================================================================
Total params: 17,992,001
Trainable params: 10,356,737
Non-trainable params: 7,635,264
_________________________________________________________________
None
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
random_flip (RandomFlip) (None, 180, 180, 3) 0
_________________________________________________________________
random_rotation (RandomRotat (None, 180, 180, 3) 0
_________________________________________________________________
random_zoom (RandomZoom) (None, 180, 180, 3) 0
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________
None
Model: "vgg16"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_5 (InputLayer) [(None, None, None, 3)] 0
_________________________________________________________________
block1_conv1 (Conv2D) multiple 1792
_________________________________________________________________
block1_conv2 (Conv2D) multiple 36928
_________________________________________________________________
block1_pool (MaxPooling2D) multiple 0
_________________________________________________________________
block2_conv1 (Conv2D) multiple 73856
_________________________________________________________________
block2_conv2 (Conv2D) multiple 147584
_________________________________________________________________
block2_pool (MaxPooling2D) multiple 0
_________________________________________________________________
block3_conv1 (Conv2D) multiple 295168
_________________________________________________________________
block3_conv2 (Conv2D) multiple 590080
_________________________________________________________________
block3_conv3 (Conv2D) multiple 590080
_________________________________________________________________
block3_pool (MaxPooling2D) multiple 0
_________________________________________________________________
block4_conv1 (Conv2D) multiple 1180160
_________________________________________________________________
block4_conv2 (Conv2D) multiple 2359808
_________________________________________________________________
block4_conv3 (Conv2D) multiple 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) multiple 0
_________________________________________________________________
block5_conv1 (Conv2D) multiple 2359808
_________________________________________________________________
block5_conv2 (Conv2D) multiple 2359808
_________________________________________________________________
block5_conv3 (Conv2D) multiple 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) multiple 0
=================================================================
Total params: 14,714,688
Trainable params: 7,079,424
Non-trainable params: 7,635,264
You can achieve what you want in the following way. First, define your model as follows:
inputs = tf.keras.Input(shape=(180, 180, 3))
x = data_augmentation(inputs, training=True)
x = keras.applications.VGG16(input_tensor=x,
                             include_top=False,
                             weights=None)
x.trainable = False
x = layers.Flatten()(x.output)
x = layers.Dense(256)(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(1, activation='sigmoid')(x)
model = keras.Model(inputs, x)
for i, layer in enumerate(model.layers):
    print(i, layer.name, layer.output_shape, layer.trainable)
...
17 block5_conv2 (None, 11, 11, 512) False
18 block5_conv3 (None, 11, 11, 512) False
19 block5_pool (None, 5, 5, 512) False
20 flatten_2 (None, 12800) True
21 dense_4 (None, 256) True
22 dropout_2 (None, 256) True
23 dense_5 (None, 1) True
Now, build the grad-cam model with desired output layer as follows:
grad_model = keras.models.Model(
[model.inputs],
[model.get_layer('block5_pool').output,
model.output]
)
Test
image = np.random.rand(1, 180, 180, 3).astype(np.float32)
with tf.GradientTape() as tape:
    convOutputs, predictions = grad_model(tf.cast(image, tf.float32))
    loss = predictions[:, tf.argmax(predictions[0])]
grads = tape.gradient(loss, convOutputs)
print(grads)
tf.Tensor(
[[[[ 9.8454033e-04 3.6991197e-03 ... -1.2012678e-02
-1.7934230e-03 2.2925171e-03]
[ 1.6165405e-03 -1.9513096e-03 ... -2.5789393e-03
1.2443252e-03 -1.3931725e-03]
[-2.0554627e-04 1.2232144e-03 ... 5.2324748e-03
3.1955825e-04 3.4566019e-03]
[ 2.3650150e-03 -2.5699558e-03 ... -2.4103196e-03
5.8940407e-03 5.3285398e-03]
...
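Once grads and convOutputs are available, the usual Grad-CAM step is to average the gradients over the spatial positions to get one weight per channel, and then take the ReLU of the weighted sum of the feature maps. Here is a minimal NumPy sketch of that step, assuming a single image with the batch dimension already dropped (for example convOutputs[0].numpy() and grads[0].numpy()). The helper name grad_cam_heatmap is hypothetical, and the random arrays only stand in for real values.

```python
import numpy as np

def grad_cam_heatmap(conv_outputs, grads):
    """Combine feature maps and their gradients into a Grad-CAM heatmap.

    Both arguments are arrays of shape (H, W, C) for a single image.
    """
    # One importance weight per channel: mean gradient over H and W.
    weights = grads.mean(axis=(0, 1))                # shape (C,)
    # Weighted sum of the feature maps over the channel axis.
    cam = (conv_outputs * weights).sum(axis=-1)      # shape (H, W)
    # Keep only positive evidence for the chosen output.
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1] unless the map is entirely zero.
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Stand-in data with the shape of block5_pool for a 180x180 input.
rng = np.random.default_rng(0)
conv_outputs = rng.random((5, 5, 512)).astype(np.float32)
grads = rng.normal(size=(5, 5, 512)).astype(np.float32)

heatmap = grad_cam_heatmap(conv_outputs, grads)
print(heatmap.shape)
```

The resulting 5x5 map would then typically be resized to the input resolution and overlaid on the image.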
As the title describes, I want model.summary() to display the individual layers of a pretrained model instead of a single entry (please see the vgg19 (Functional) entry below).
Here is a sample model that is implemented using the Keras Sequential API:
base_model = VGG16(include_top=False, weights=None, input_shape=(32, 32, 3), pooling='max', classes=10)
model = Sequential()
model.add(base_model)
model.add(Flatten())
model.add(Dense(1_000, activation='relu'))
model.add(Dense(10, activation='softmax'))
And here is the output of the model.summary() function call:
Model: "sequential_15"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
vgg19 (Functional) (None, 512) 20024384
_________________________________________________________________
flatten_15 (Flatten) (None, 512) 0
_________________________________________________________________
dense_21 (Dense) (None, 1000) 513000
_________________________________________________________________
dense_22 (Dense) (None, 10) 10010
=================================================================
Total params: 20,547,394
Trainable params: 523,010
Non-trainable params: 20,024,384
Edit: Here is the Functional API equivalent of the implemented Sequential API model - the result is the same:
base_model = VGG16(include_top=False, weights='imagenet', input_shape=(32, 32, 3), pooling='max', classes=10)
m_inputs = Input(shape=(32, 32, 3))
base_out = base_model(m_inputs)
x = Flatten()(base_out)
x = Dense(1_000, activation='relu')(x)
m_outputs = Dense(10, activation='softmax')(x)
model = Model(inputs=m_inputs, outputs=m_outputs)
Instead of the Sequential API, I tried the Functional API, i.e. the tf.keras.models.Model class, like this:
import tensorflow as tf
base_model = tf.keras.applications.VGG16(include_top=False, weights=None, input_shape=(32, 32, 3), pooling='max', classes=10)
x = tf.keras.layers.Flatten()( base_model.output )
x = tf.keras.layers.Dense(1_000, activation='relu')( x )
outputs = tf.keras.layers.Dense(10, activation='softmax')( x )
model = tf.keras.models.Model( base_model.input , outputs )
model.summary()
The output of the above snippet,
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_3 (InputLayer) [(None, 32, 32, 3)] 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 32, 32, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 32, 32, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 16, 16, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 16, 16, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 16, 16, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 8, 8, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 8, 8, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 8, 8, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 8, 8, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 4, 4, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 4, 4, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 4, 4, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 4, 4, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 2, 2, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 2, 2, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 2, 2, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 2, 2, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 1, 1, 512) 0
_________________________________________________________________
global_max_pooling2d_2 (Glob (None, 512) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 512) 0
_________________________________________________________________
dense_2 (Dense) (None, 1000) 513000
_________________________________________________________________
dense_3 (Dense) (None, 10) 10010
=================================================================
Total params: 15,237,698
Trainable params: 15,237,698
Non-trainable params: 0
_________________________________________________________________
My understanding after going through the docs and running a few tests (via TF 2.5.0) is that when such a model is included in another model, Keras treats it as a "black box". It is neither a simple layer nor a tensor, but an object of the complex type tensorflow.python.keras.engine.functional.Functional.
I reckon this is the underlying reason that you cannot print it in detail as part of the model summary.
Now, if you'd just like to review the pre-trained model, have a sneak peek, etc., you can simply run:
base_model.summary()
or after constructing your model (sequential or functional, doesn't matter at this point):
model.layers[i].summary() # i: the index of your pre-trained model
If you need to access the pre-trained model's layers, e.g. to use its weights separately, you can access them this way as well.
If you'd like to print the layers of your model as a whole, then you need to trick Keras into believing the "black box" is no stranger but just another KerasTensor. To do that, build directly on the pre-trained model's output, in other words connect the two via the Functional API, as suggested above; this worked fine for me.
x = tf.keras.layers.Flatten()( base_model.output )
I don't know if there is any specific reason that you'd like to pursue the new input route as in...
m_inputs = Input(shape=(32, 32, 3))
base_out = base_model(m_inputs)
Whenever you place the pre-trained model in the middle of your new model, either after a new Input layer or by adding it to a Sequential model, its inner layers disappear from the summary output.
Generating a new Input layer or feeding the pre-trained model's output directly as input to the current model didn't make any difference for me in this case.
Hope this clarifies the topic a wee bit more, and helps.
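If the goal is only to list the inner layers without rebuilding the model, one option is to walk the .layers attribute recursively. Below is a minimal sketch; iter_layers is a hypothetical helper, and it assumes that nested models expose .layers while ordinary layers do not. The SimpleNamespace objects are stand-ins that mimic the .name and .layers attributes so the sketch runs without TensorFlow installed.

```python
from types import SimpleNamespace

def iter_layers(model, prefix=""):
    """Yield (qualified_name, layer), descending into anything that has `.layers`."""
    for layer in model.layers:
        name = prefix + layer.name
        if hasattr(layer, "layers"):
            # Nested model: recurse and prefix the inner names.
            yield from iter_layers(layer, prefix=name + "/")
        else:
            yield name, layer

# Hypothetical stand-ins for a pre-trained base nested inside a larger model.
base = SimpleNamespace(name="vgg16", layers=[
    SimpleNamespace(name="block1_conv1"),
    SimpleNamespace(name="block1_pool"),
])
model = SimpleNamespace(name="sequential", layers=[
    base,
    SimpleNamespace(name="flatten"),
    SimpleNamespace(name="dense"),
])

names = [name for name, _ in iter_layers(model)]
print(names)
```

With a real Keras model you would pass the model itself to iter_layers. Newer Keras releases also accept model.summary(expand_nested=True), which may be simpler if your version supports it.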
This should do what you want to do
base_model = VGG16(include_top=False, weights=None, input_shape=(32, 32, 3), pooling='max', classes=10)
model = Sequential()
for layer in base_model.layers:
    layer.trainable = False
    model.add(layer)
model.add(Flatten())
model.add(Dense(1_000, activation='relu'))
model.add(Dense(10, activation='softmax'))
I have a pretrained model with this summary:
Layer (type) Output Shape Param #
=================================================================
vgg19 (Model) (None, 4, 4, 512) 20024384
_________________________________________________________________
flatten_1 (Flatten) (None, 8192) 0
_________________________________________________________________
dense_1 (Dense) (None, 1024) 8389632
_________________________________________________________________
dropout_1 (Dropout) (None, 1024) 0
_________________________________________________________________
dense_2 (Dense) (None, 1024) 1049600
_________________________________________________________________
dense_3 (Dense) (None, 5) 5125
=================================================================
I need the version with vgg19 expanded rather than shown as a single layer, something like this:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 128, 128, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 128, 128, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 128, 128, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 64, 64, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 64, 64, 128) 73856
.
.
.
** end of vgg16 **
_________________________________________________________________
flatten_1 (Flatten) (None, 8192) 0
_________________________________________________________________
dense_1 (Dense) (None, 1024) 8389632
_________________________________________________________________
dropout_1 (Dropout) (None, 1024) 0
_________________________________________________________________
dense_2 (Dense) (None, 1024) 1049600
_________________________________________________________________
dense_3 (Dense) (None, 5) 5125
=================================================================
I have tried copying it layer by layer, but I ran into lots of problems. Is there a way to accomplish this that also copies the weights?
I don't know how you implemented it, but here is how I did it. I hope it helps.
from keras.applications.vgg19 import VGG19
from keras.models import Model
from keras.layers import *
model = VGG19(weights='imagenet', include_top=False, input_shape=(128,128,3))
flatten_1 = Flatten()(model.output)
dense_1 = Dense(1024)(flatten_1)
dropout_1 = Dropout(0.2)(dense_1)
dense_2 = Dense(1024)(dropout_1)
dense_3 = Dense(5)(dense_2)
model = Model(inputs=model.input, outputs=dense_3)
print(model.summary())
Result.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 128, 128, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 128, 128, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 128, 128, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 64, 64, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 64, 64, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 64, 64, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 32, 32, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 32, 32, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 32, 32, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 32, 32, 256) 590080
_________________________________________________________________
block3_conv4 (Conv2D) (None, 32, 32, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 16, 16, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 16, 16, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 16, 16, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 16, 16, 512) 2359808
_________________________________________________________________
block4_conv4 (Conv2D) (None, 16, 16, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 8, 8, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 8, 8, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 8, 8, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 8, 8, 512) 2359808
_________________________________________________________________
block5_conv4 (Conv2D) (None, 8, 8, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 4, 4, 512) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 8192) 0
_________________________________________________________________
dense_1 (Dense) (None, 1024) 8389632
_________________________________________________________________
dropout_1 (Dropout) (None, 1024) 0
_________________________________________________________________
dense_2 (Dense) (None, 1024) 1049600
_________________________________________________________________
dense_3 (Dense) (None, 5) 5125
=================================================================
Total params: 29,468,741
Trainable params: 29,468,741
Non-trainable params: 0
_________________________________________________________________
The following is the architecture of fine-tuned network with VGG16 as Base Model.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
model_1 (Model) (None, 25088) 14714688
_________________________________________________________________
dense_1 (Dense) (None, 512) 12845568
_________________________________________________________________
dropout_1 (Dropout) (None, 512) 0
_________________________________________________________________
dense_2 (Dense) (None, 512) 262656
_________________________________________________________________
dropout_2 (Dropout) (None, 512) 0
_________________________________________________________________
dense_3 (Dense) (None, 1) 513
=================================================================
Total params: 27,823,425
Trainable params: 26,087,937
Non-trainable params: 1,735,488
_________________________________________________________________
I am trying to visualize the gradients of the loss with respect to the input, and of the output with respect to 'block5_conv3', using the code below:
from keras import backend as K  # assumed import for the K alias
def build_backprop(model, loss):
    # Gradient of the loss with respect to the input image
    gradients = K.gradients(loss, model.input)[0]
    # Normalize the gradients
    gradients /= (K.sqrt(K.mean(K.square(gradients))) + 1e-5)
    # Keras function to calculate the gradients and loss
    return K.function([model.input], [loss, gradients])
# Input wrt to loss
# Loss function that optimizes one class
loss_function = K.mean(model.get_layer('dense_3').output)
# Backprop function
backprop = build_backprop(model.get_layer('model_1').get_layer('input_1'), loss_function)
# block5_conv3 wrt to output
K.gradients(model.get_layer("dense_3").output, model.get_layer("model_1").get_layer("block5_conv3").output)[0]
Both of the above raise AttributeError: 'NoneType' object has no attribute 'dtype', implying that in both cases the K.gradients output is None.
What could cause the gradients to be None?
Is there any way to resolve this error?
Update
The None issue is resolved only if we convert the model from the Sequential API to the Functional API.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) (None, 224, 224, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 224, 224, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 112, 112, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 112, 112, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 112, 112, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 56, 56, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 56, 56, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 28, 28, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 14, 14, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 7, 7, 512) 0
_________________________________________________________________
flatten_2 (Flatten) (None, 25088) 0
_________________________________________________________________
dense_10 (Dense) (None, 512) 12845568
_________________________________________________________________
dropout_7 (Dropout) (None, 512) 0
_________________________________________________________________
dense_11 (Dense) (None, 512) 262656
_________________________________________________________________
dropout_8 (Dropout) (None, 512) 0
_________________________________________________________________
dense_12 (Dense) (None, 2) 1026
=================================================================
Total params: 27,823,938
Trainable params: 20,188,674
Non-trainable params: 7,635,264
_________________________________________________________________
The above is the new architecture after the change. Now the problem is that all the gradients come out as zeros.
For example:
preds = model.predict(x)
class_idx = np.argmax(preds[0])
class_output = model.output[:, class_idx]
last_conv_layer = model.get_layer("block5_conv3")
grads = K.gradients(class_output, last_conv_layer.output)[0]
pooled_grads = K.mean(grads, axis=(0, 1, 2))
iterate = K.function([model.input], [pooled_grads, last_conv_layer.output[0]])
pooled_grads_value, conv_layer_output_value = iterate([x])
for i in range(512):
    conv_layer_output_value[:, :, i] *= pooled_grads_value[i]
Both pooled_grads_value and conv_layer_output_value come out as all zeros.
I was able to solve both questions.
Question 1: Both of the above return AttributeError: 'NoneType' object has no attribute 'dtype', implying that in both cases the output of K.gradients is None.
The problem here was that the model was Sequential; after converting it from Sequential to Functional, this problem vanished and a new one appeared.
Question 2: pooled_grads_value and conv_layer_output_value are all zeros.
I resolved this problem by converting the last softmax layer to a linear layer.
Here is the code:
from vis.utils import utils
from keras import activations
# Utility to search for layer index by name.
# Alternatively we can specify this as -1 since it corresponds to the last layer.
layer_idx = utils.find_layer_idx(model, 'dense_12')
# Swap softmax with linear
model.layers[layer_idx].activation = activations.linear
model = utils.apply_modifications(model)
This swap worked perfectly and I obtained the desired results.
The only part I still don't understand is why it doesn't work with softmax. Will it work if we replace the last layer's softmax with a single-output sigmoid?
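A likely explanation, offered as an assumption since the post does not confirm it: once the network is confident, softmax saturates. The derivative of a softmax output p_i with respect to its own logit is p_i * (1 - p_i), which is close to zero when p_i is close to 1, and every gradient flowing back to block5_conv3 is scaled by factors of that size. A linear activation has derivative 1, so nothing is squashed. The pure-Python sketch below uses made-up logits for a 2-class head to show the effect. A single sigmoid output has the same s * (1 - s) factor, so it would probably saturate in the same way.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_self_grad(logits, i):
    """Derivative of softmax output i w.r.t. logit i: p_i * (1 - p_i)."""
    p = softmax(logits)[i]
    return p * (1.0 - p)

def sigmoid_grad(z):
    """Derivative of the logistic sigmoid: s * (1 - s)."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

# Hypothetical logits (not taken from the post) for a 2-class head.
confident_logits = [20.0, 0.0]   # network is almost certain of class 0
uncertain_logits = [0.5, 0.0]    # network is unsure

grad_softmax_confident = softmax_self_grad(confident_logits, 0)
grad_softmax_uncertain = softmax_self_grad(uncertain_logits, 0)
grad_sigmoid_confident = sigmoid_grad(20.0)
grad_linear = 1.0  # d(logit)/d(logit) for a linear activation

print("softmax, confident:", grad_softmax_confident)
print("softmax, uncertain:", grad_softmax_uncertain)
print("sigmoid, confident:", grad_sigmoid_confident)
print("linear            :", grad_linear)
```

With the confident logits the softmax and sigmoid factors are vanishingly small compared with the linear case, which is consistent with the heatmap inputs collapsing to zeros.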
I want to combine a pretrained VGG16 model with a special input block, which is an input layer and a convolutional layer. The goal is to use a pre-trained RGB VGG16 imagenet model on grayscale images:
from keras.applications.vgg16 import VGG16
from keras.layers.convolutional import Conv2D
from keras.layers import Input
from keras.models import Model
img_height = 299
img_width = 299
def input_block(img_height = 299, img_width = 299):
    input_shape = (img_height, img_width, 1)
    img_input = Input(shape=input_shape, name = 'grayscale_input_layer')
    x = Conv2D(3, (3,3), padding= 'same', name = 'grayscale_RGB_layer')(img_input)
    return x
pretrained_model = VGG16(weights = 'imagenet', include_top=False, input_tensor = input_block(img_height, img_width))
When I set the weight initialization of VGG16() to 'None', the model builds correctly, with the following desired structure:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
grayscale_input_layer (Input (None, 299, 299, 1) 0
_________________________________________________________________
grayscale_RGB_layer (Conv2D) (None, 299, 299, 3) 30
_________________________________________________________________
block1_conv1 (Conv2D) (None, 299, 299, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 299, 299, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 149, 149, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 149, 149, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 149, 149, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 74, 74, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 74, 74, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 74, 74, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 74, 74, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 37, 37, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 37, 37, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 37, 37, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 37, 37, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 18, 18, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 18, 18, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 18, 18, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 18, 18, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 9, 9, 512) 0
=================================================================
Total params: 14,714,718
Trainable params: 14,714,718
Non-trainable params: 0
_________________________________________________________________
None
However, when I set the weight initialization to 'imagenet',
I get the following error:
ValueError: You are trying to load a weight file containing 13 layers into a model with 14 layers.
This error makes sense, since I have added two layers in front of the VGG16 model instead of a single layer.
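A possible reading of the 13-versus-14 count, assuming (this is an understanding of Keras, not something stated in the post) that the weight loader only counts layers that own weights: VGG16 without the top has 13 Conv2D layers, and grayscale_RGB_layer adds a 14th, while the Input and pooling layers carry no weights. The sketch below rebuilds that count and the parameter totals with the standard Conv2D formula. The channel list is transcribed from the summary above.

```python
# (layer name, input channels, output channels); every kernel is 3x3.
VGG16_CONVS = [
    ("block1_conv1", 3, 64), ("block1_conv2", 64, 64),
    ("block2_conv1", 64, 128), ("block2_conv2", 128, 128),
    ("block3_conv1", 128, 256), ("block3_conv2", 256, 256),
    ("block3_conv3", 256, 256),
    ("block4_conv1", 256, 512), ("block4_conv2", 512, 512),
    ("block4_conv3", 512, 512),
    ("block5_conv1", 512, 512), ("block5_conv2", 512, 512),
    ("block5_conv3", 512, 512),
]

def conv2d_params(kernel, c_in, c_out):
    """Kernel weights (kernel * kernel * c_in * c_out) plus one bias per filter."""
    return kernel * kernel * c_in * c_out + c_out

# Layers with weights in the ImageNet file vs. in the extended model.
layers_in_weight_file = len(VGG16_CONVS)
layers_in_model = layers_in_weight_file + 1  # plus grayscale_RGB_layer

vgg16_params = sum(conv2d_params(3, c_in, c_out) for _, c_in, c_out in VGG16_CONVS)
input_block_params = conv2d_params(3, 1, 3)  # grayscale_RGB_layer: 1 -> 3 channels
total_params = vgg16_params + input_block_params

print(layers_in_weight_file, layers_in_model)
print(vgg16_params, input_block_params, total_params)
```

The totals line up with the summaries in the question: 30 parameters for grayscale_RGB_layer and 14,714,718 overall.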
As a workaround, I have tried the following:
def input_block_model(img_height = 299, img_width = 299):
    input_shape = (img_height, img_width, 1)
    img_input = Input(shape=input_shape, name = 'grayscale_input_layer')
    x = Conv2D(3, (3,3), padding= 'same', name = 'grayscale_RGB_layer')(img_input)
    model = Model(img_input, x, name='input_block_model')
    return model
input_model = input_block_model(299,299)
pretrained_model = VGG16(weights = "imagenet", include_top=False)
combined_model = Model(input_model.input,
pretrained_model(input_model.output))
print(combined_model.summary())
Then, the model structure is:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
grayscale_input_layer (Input (None, 299, 299, 1) 0
_________________________________________________________________
grayscale_RGB_layer (Conv2D) (None, 299, 299, 3) 30
_________________________________________________________________
vgg16 (Model) multiple 14714688
=================================================================
Total params: 14,714,718
Trainable params: 14,714,718
Non-trainable params: 0
_________________________________________________________________
None
The disadvantage of this structure is that I cannot set properties of the layers inside the VGG16 model. For example, I want to freeze certain layers in this model, but I cannot access them via combined_model.layers. Does anyone have a working solution that gives me the model structure from the 'None' initialization, but with pretrained ImageNet weights?
You can freeze or train layers using combined_model.layers[2].layers, as mentioned in the comment above. You could perhaps simplify the model as follows:
```
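from keras.applications.vgg16 import VGG16
from keras.layers import Conv2D, Input
from keras.models import Model
img_height, img_width = 299, 299
img_input = Input(shape=(img_height, img_width, 1), name='grayscale_input_layer')
x = Conv2D(3, (3, 3), padding='same', name='grayscale_RGB_layer')(img_input)
# weights='imagenet' loads the pretrained weights into the nested VGG16 model
x = VGG16(weights='imagenet', include_top=False)(x)
model = Model(img_input, x)
model.summary()
# model.layers[2] is the nested VGG16 model; freeze all of its layers
for layer in model.layers[2].layers:
    layer.trainable = False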
```
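If only some of the VGG16 blocks should stay frozen, the loop above can filter on layer names. This is a minimal sketch, assuming the nested layers keep the default VGG16 names shown in the summaries (block1_conv1 and so on). freeze_by_prefix is a hypothetical helper, not a Keras function, and the SimpleNamespace objects are stand-ins so the snippet runs without Keras installed.

```python
from types import SimpleNamespace

def freeze_by_prefix(layers, prefixes):
    """Set trainable=False on every layer whose name starts with a given prefix.

    Works on any objects exposing .name and .trainable, which Keras layers do.
    Returns the names of the layers that were frozen.
    """
    prefixes = tuple(prefixes)
    frozen = []
    for layer in layers:
        if layer.name.startswith(prefixes):
            layer.trainable = False
            frozen.append(layer.name)
    return frozen

# Stand-ins for a few nested VGG16 layers; real code would pass
# model.layers[2].layers instead.
demo_layers = [
    SimpleNamespace(name=n, trainable=True)
    for n in ("block1_conv1", "block1_pool", "block4_conv1", "block5_conv3")
]

frozen_names = freeze_by_prefix(demo_layers, ["block1_", "block4_"])
print(frozen_names)
print([(layer.name, layer.trainable) for layer in demo_layers])
```

With the real model the call would look like freeze_by_prefix(model.layers[2].layers, ["block1_", "block2_"]). Compile the model after changing trainable so the change takes effect.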