Convert from Sequential to Functional API model - python-3.x

I have the following code snippet:
model = Sequential()
model.add(Conv2D(16, (5, 5), input_shape=(256, 256, 1)))
x = model.layers[0].output
model.add(Lambda(lambda x: tf.abs(x)))
model.add(Activation(activation='tanh'))
My question is how to convert these steps into a Functional API Keras model. What confuses me is how to insert the abs layer into a Functional API model.

Let's take a look at your model in both the Sequential and the Functional API implementations:
Here are some imports:
import tensorflow as tf
from tensorflow.keras.layers import Lambda, Conv2D, Activation, Input
from tensorflow.keras import Model, Sequential
Here is your implementation using the Sequential Model:
model = Sequential()
model.add(Conv2D(16, (5, 5), input_shape=(256, 256, 1)))
x = model.layers[0].output
model.add(Lambda(lambda x: tf.abs(x)))
model.add(Activation(activation='tanh'))
model.summary()
The summary output:
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_6 (Conv2D) (None, 252, 252, 16) 416
_________________________________________________________________
lambda_6 (Lambda) (None, 252, 252, 16) 0
_________________________________________________________________
activation_5 (Activation) (None, 252, 252, 16) 0
=================================================================
Total params: 416
Trainable params: 416
Non-trainable params: 0
_________________________________________________________________
Now the implementation with Functional API:
First, define your function:
def arbitrary_functionality(tensor):
    return tf.abs(tensor)
And:
input_layer = Input(shape=(256, 256, 1))
conv1 = Conv2D(16, (5, 5))(input_layer)
lambda_layer = Lambda(arbitrary_functionality)(conv1)
output_layer = Activation(activation='tanh')(lambda_layer)
model_2 = Model(inputs=input_layer, outputs=output_layer)
model_2.summary()
The summary output:
Model: "model_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_7 (InputLayer) [(None, 256, 256, 1)] 0
_________________________________________________________________
conv2d_9 (Conv2D) (None, 252, 252, 16) 416
_________________________________________________________________
lambda_9 (Lambda) (None, 252, 252, 16) 0
_________________________________________________________________
activation_8 (Activation) (None, 252, 252, 16) 0
=================================================================
Total params: 416
Trainable params: 416
Non-trainable params: 0
_________________________________________________________________
Note: according to the TensorFlow documentation, a better way is to subclass the Layer class; a minimal sketch of that approach follows.
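For reference, here is what that subclassing approach might look like (a sketch; the class name AbsLayer is my own invention, not from the documentation):
class AbsLayer(tf.keras.layers.Layer):
    # A stateless layer that applies tf.abs to its input; it has no weights.
    def call(self, inputs):
        return tf.abs(inputs)

input_layer = Input(shape=(256, 256, 1))
conv1 = Conv2D(16, (5, 5))(input_layer)
abs_out = AbsLayer()(conv1)
output_layer = Activation(activation='tanh')(abs_out)
model_3 = Model(inputs=input_layer, outputs=output_layer)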

Related

tensorflow model gives "graph disconnected" error

I am experimenting/fiddling/learning with some small ML problems.
I have a loaded model based on a pre-trained convolution base with some self-trained dense layers (for model details see below).
I wanted to try to apply some visualizations like activations and the Grad CAM Visualization (https://www.statworx.com/de/blog/erklaerbbarkeit-von-deep-learning-modellen-mit-grad-cam/) on the model. But I was not able to do so.
I tried to create a new model based on mine (like in the article) with
grad_model = tf.keras.models.Model(model.inputs,
                                   [model.get_layer('vgg16').output,
                                    model.output])
but this already fails with the error:
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_5_12:0", shape=(None, None, None, 3), dtype=float32) at layer "block1_conv1". The following previous layers were accessed without issue: []
I do not understand what this means. The model surely works (I can evaluate it and make predictions with it).
The call does not fail if I omit model.get_layer('vgg16').output from the outputs list, but of course that output is required for the visualization.
What am I doing wrong?
In a model that I constructed and trained from scratch, I was able to create a similar model with the activations as outputs, but here I get these errors.
My model's details
The model was created with the following code and then trained and saved.
from tensorflow import keras
from tensorflow.keras import models
from tensorflow.keras import layers
from tensorflow.keras import optimizers
conv_base = keras.applications.vgg16.VGG16(
    weights="vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5",
    include_top=False)
conv_base.trainable = False
data_augmentation = keras.Sequential(
    [
        layers.experimental.preprocessing.RandomFlip("horizontal"),
        layers.experimental.preprocessing.RandomRotation(0.1),
        layers.experimental.preprocessing.RandomZoom(0.2),
    ]
)
inputs = keras.Input(shape=(180, 180, 3))
x = data_augmentation(inputs)
x = conv_base(x)
x = layers.Flatten()(x)
x = layers.Dense(256)(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
model.compile(loss="binary_crossentropy",
              optimizer="rmsprop",
              metrics=["accuracy"])
Later it was loaded:
model = keras.models.load_model("myModel.keras")
print(model.summary())
print(model.get_layer('sequential').summary())
print(model.get_layer('vgg16').summary())
output:
Model: "functional_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_6 (InputLayer) [(None, 180, 180, 3)] 0
_________________________________________________________________
sequential (Sequential) (None, 180, 180, 3) 0
_________________________________________________________________
vgg16 (Functional) (None, None, None, 512) 14714688
_________________________________________________________________
flatten_1 (Flatten) (None, 12800) 0
_________________________________________________________________
dense_2 (Dense) (None, 256) 3277056
_________________________________________________________________
dropout_1 (Dropout) (None, 256) 0
_________________________________________________________________
dense_3 (Dense) (None, 1) 257
=================================================================
Total params: 17,992,001
Trainable params: 10,356,737
Non-trainable params: 7,635,264
_________________________________________________________________
None
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
random_flip (RandomFlip) (None, 180, 180, 3) 0
_________________________________________________________________
random_rotation (RandomRotat (None, 180, 180, 3) 0
_________________________________________________________________
random_zoom (RandomZoom) (None, 180, 180, 3) 0
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________
None
Model: "vgg16"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_5 (InputLayer) [(None, None, None, 3)] 0
_________________________________________________________________
block1_conv1 (Conv2D) multiple 1792
_________________________________________________________________
block1_conv2 (Conv2D) multiple 36928
_________________________________________________________________
block1_pool (MaxPooling2D) multiple 0
_________________________________________________________________
block2_conv1 (Conv2D) multiple 73856
_________________________________________________________________
block2_conv2 (Conv2D) multiple 147584
_________________________________________________________________
block2_pool (MaxPooling2D) multiple 0
_________________________________________________________________
block3_conv1 (Conv2D) multiple 295168
_________________________________________________________________
block3_conv2 (Conv2D) multiple 590080
_________________________________________________________________
block3_conv3 (Conv2D) multiple 590080
_________________________________________________________________
block3_pool (MaxPooling2D) multiple 0
_________________________________________________________________
block4_conv1 (Conv2D) multiple 1180160
_________________________________________________________________
block4_conv2 (Conv2D) multiple 2359808
_________________________________________________________________
block4_conv3 (Conv2D) multiple 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) multiple 0
_________________________________________________________________
block5_conv1 (Conv2D) multiple 2359808
_________________________________________________________________
block5_conv2 (Conv2D) multiple 2359808
_________________________________________________________________
block5_conv3 (Conv2D) multiple 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) multiple 0
=================================================================
Total params: 14,714,688
Trainable params: 7,079,424
Non-trainable params: 7,635,264
You can achieve what you want in the following way. First, define your model as follows:
inputs = tf.keras.Input(shape=(180, 180, 3))
x = data_augmentation(inputs, training=True)
# Passing input_tensor inlines the VGG16 layers into this graph instead of
# nesting them as a sub-model, which keeps every layer output reachable.
vgg = keras.applications.VGG16(input_tensor=x,
                               include_top=False,
                               weights=None)
vgg.trainable = False
x = layers.Flatten()(vgg.output)
x = layers.Dense(256)(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(1, activation='sigmoid')(x)
model = keras.Model(inputs, x)

for i, layer in enumerate(model.layers):
    print(i, layer.name, layer.output_shape, layer.trainable)
...
17 block5_conv2 (None, 11, 11, 512) False
18 block5_conv3 (None, 11, 11, 512) False
19 block5_pool (None, 5, 5, 512) False
20 flatten_2 (None, 12800) True
21 dense_4 (None, 256) True
22 dropout_2 (None, 256) True
23 dense_5 (None, 1) True
Now, build the grad-cam model with desired output layer as follows:
grad_model = keras.models.Model(
    [model.inputs],
    [model.get_layer('block5_pool').output,
     model.output]
)
Test:
import numpy as np

image = np.random.rand(1, 180, 180, 3).astype(np.float32)
with tf.GradientTape() as tape:
    convOutputs, predictions = grad_model(tf.cast(image, tf.float32))
    loss = predictions[:, tf.argmax(predictions[0])]
grads = tape.gradient(loss, convOutputs)
print(grads)
tf.Tensor(
[[[[ 9.8454033e-04 3.6991197e-03 ... -1.2012678e-02
-1.7934230e-03 2.2925171e-03]
[ 1.6165405e-03 -1.9513096e-03 ... -2.5789393e-03
1.2443252e-03 -1.3931725e-03]
[-2.0554627e-04 1.2232144e-03 ... 5.2324748e-03
3.1955825e-04 3.4566019e-03]
[ 2.3650150e-03 -2.5699558e-03 ... -2.4103196e-03
5.8940407e-03 5.3285398e-03]
...
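From here, to turn these gradients into an actual Grad-CAM heatmap, the usual recipe (a sketch following the standard Grad-CAM formulation, not part of the original answer) is to average the gradients per channel and weight the feature maps with them:
# Average the gradients over the batch and spatial axes -> one weight per channel.
pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2))
# Weight the convolutional feature maps by those gradients and sum over channels.
heatmap = tf.reduce_sum(convOutputs[0] * pooled_grads, axis=-1)
# Keep only the positive influence and normalize to [0, 1] for display.
heatmap = tf.maximum(heatmap, 0) / (tf.reduce_max(heatmap) + 1e-8)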

How to access immediate activations of custom model containing a pretrained-model?

I have a custom network: a Keras Xception base with an added regression head:
pretrained_model = tf.keras.applications.Xception(input_shape=[224, 224, 3],
                                                  include_top=False,
                                                  weights='imagenet')
pretrained_model.trainable = True
model = tf.keras.Sequential([
    pretrained_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation='tanh')
])
The model summary:
Layer (type) Output Shape Param #
=================================================================
xception (Model) (None, 7, 7, 2048) 20861480
_________________________________________________________________
global_average_pooling2d_3 ( (None, 2048) 0
_________________________________________________________________
dropout_4 (Dropout) (None, 2048) 0
_________________________________________________________________
dense_6 (Dense) (None, 32) 65568
_________________________________________________________________
dropout_5 (Dropout) (None, 32) 0
_________________________________________________________________
dense_7 (Dense) (None, 1) 33
=================================================================
Total params: 20,927,081
Trainable params: 20,872,553
Non-trainable params: 54,528
I want to get the last activations from the xception (Model) layer.
The details of xception:
Model: "xception"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_4 (InputLayer) [(None, 224, 224, 3) 0
__________________________________________________________________________________________________
block1_conv1 (Conv2D) (None, 111, 111, 32) 864 input_4[0][0]
__________________________________________________________________________________________________
...
__________________________________________________________________________________________________
block14_sepconv2 (SeparableConv (None, 7, 7, 2048) 3159552 block14_sepconv1_act[0][0]
__________________________________________________________________________________________________
block14_sepconv2_bn (BatchNorma (None, 7, 7, 2048) 8192 block14_sepconv2[0][0]
__________________________________________________________________________________________________
block14_sepconv2_act (Activatio (None, 7, 7, 2048) 0 block14_sepconv2_bn[0][0]
==================================================================================================
Total params: 20,861,480
Trainable params: 20,806,952
Non-trainable params: 54,528
To reference the last activation layer I have to use:
model.layers[0].get_layer('block14_sepconv2_act').output
since my model does not directly contain the 'block14_sepconv2_act' layer.
To access activation I want to use the code below:
activations = tf.keras.Model(model.inputs, model.layers[0].get_layer('block14_sepconv2_act').output)
activations(sample)
but I get the error:
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_4_1:0", shape=(None, 224, 224, 3), dtype=float32) at layer "input_4". The following previous layers were accessed without issue: []
My question is how can I access the intermediate layer outputs of a pretrained model if added in this way to the custom model?
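One common workaround (offered here as a sketch, not a verified answer to this exact setup) is to build the activation model from the nested model's own input tensor, so the graph stays connected; you then feed it the images directly:
base = model.layers[0]  # the nested Xception model
activation_model = tf.keras.Model(
    inputs=base.input,
    outputs=base.get_layer('block14_sepconv2_act').output)
# sample must match the Xception input shape, e.g. (1, 224, 224, 3)
activations = activation_model(sample)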

Can not convert Resnet model trained with Keras to CoreML model

I need to convert a resnet50 model to a CoreML model.
The trained Keras model works correctly. I tried to convert it to CoreML, but here is the error I get using coremltools:
ValueError: Keras layer '<class 'keras.layers.core.Lambda'>' not supported.
It seems that I have Lambda layers in my model and CoreML does not support them... but what I do not understand is where those Lambda layers are coming from, as I just used a standard ResNet50 network for transfer learning. I only replaced the final 1000-unit Dense layer with a 4-unit Dense layer. Here is my code:
from keras.applications.resnet50 import ResNet50, preprocess_input
from keras.models import Model, Sequential
from keras.layers import Dense
from keras.optimizers import Adam

full_imagenet_model = ResNet50(weights='imagenet')
output = full_imagenet_model.layers[-2].output
base_model = Model(full_imagenet_model.input, output)
top_model = Sequential()
top_model.add(Dense(4, input_dim=2048, activation='softmax'))
top_model.compile(optimizer=Adam(lr=1e-4),
                  loss='categorical_crossentropy', metrics=['accuracy'])
model = Model(base_model.input, top_model(base_model.output))
Here is the beginning and end of model summary:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 224, 224, 3) 0
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D) (None, 230, 230, 3) 0 input_1[0][0]
__________________________________________________________________________________________________
conv1 (Conv2D) (None, 112, 112, 64) 9472 conv1_pad[0][0]
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization) (None, 112, 112, 64) 256 conv1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation) (None, 112, 112, 64) 0 bn_conv1[0][0]
__________________________________________________________________________________________________
pool1_pad (ZeroPadding2D) (None, 114, 114, 64) 0 activation_1[0][0]
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D) (None, 56, 56, 64) 0 pool1_pad[0][0]
__________________________________________________________________________________________________
res2a_branch2a (Conv2D) (None, 56, 56, 64) 4160 max_pooling2d_1[0][0]
________________________________________________________________________________________________
(...)
__________________________________________________________________________________
add_16 (Add) (None, 7, 7, 2048) 0 bn5c_branch2c[0][0]
activation_46[0][0]
__________________________________________________________________________________________________
activation_49 (Activation) (None, 7, 7, 2048) 0 add_16[0][0]
__________________________________________________________________________________________________
avg_pool (GlobalAveragePooling2 (None, 2048) 0 activation_49[0][0]
__________________________________________________________________________________________________
sequential_1 (Sequential) (None, 4) 8196 avg_pool[0][0]
==================================================================================================
Total params: 23,595,908
Trainable params: 23,542,788
Non-trainable params: 53,120
What is weird is when I load the model I trained and call the summary, here is what I get:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_3 (InputLayer) (None, 224, 224, 3) 0
__________________________________________________________________________________________________
lambda_3 (Lambda) (None, 224, 224, 3) 0 input_3[0][0]
__________________________________________________________________________________________________
lambda_4 (Lambda) (None, 224, 224, 3) 0 input_3[0][0]
__________________________________________________________________________________________________
model_5 (Model) (None, 4) 23595908 lambda_3[0][0]
lambda_4[0][0]
__________________________________________________________________________________________________
sequential_2 (Concatenate) (None, 4) 0 model_5[1][0]
model_5[2][0]
==================================================================================================
Total params: 23,595,908
Trainable params: 23,542,788
Non-trainable params: 53,120
I have no idea where the Lambda layers come from... any ideas?
For information, here is how the training is done:
opt = Adam(lr=1e-3)
parallel_model = multi_gpu_model(model, gpus=2)
parallel_model.compile(optimizer=opt, loss='categorical_crossentropy',
                       metrics=['accuracy'])
history = parallel_model.fit_generator(train_flow, train_flow.n // train_flow.batch_size,
                                       epochs=200,
                                       validation_data=val_flow,
                                       validation_steps=val_flow.n,
                                       callbacks=[clr, tensorboard, cb_checkpointer, cb_early_stopper])
Thanks for the help
Edit1
And here is how I save the model:
from tensorflow.python.keras.callbacks import EarlyStopping, ModelCheckpoint
from pyimagesearch.clr_callback import CyclicLR
cb_early_stopper = EarlyStopping(monitor = 'val_acc', patience = 30)
cb_checkpointer = ModelCheckpoint(filepath = 'SAVED_MODELS/EPOCH:50_DataAug:Yes_Monitor:val-acc_DB2.hdf5', monitor = 'val_acc', save_best_only = True, mode = 'auto')
tensorboard = TensorBoard(log_dir="logs/{}".format('model_EPOCH:50_DataAug:Yes_Monitor:val-acc_DB2'))
clr = CyclicLR(
    mode=CLR_METHOD,
    base_lr=MIN_LR,
    max_lr=MAX_LR,
    step_size=STEP_SIZE * (train_flow.n // train_flow.batch_size))
OK, I think I found the issue. The code I posted was missing the part about multi_gpu_model used for training.
The problem is that multi_gpu_model inserts Lambda layers into the model to split the data across the GPUs in parallel.
I am now looking into how to handle these Lambda layers in coremltools.
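For what it's worth, the workaround I have seen most often (a sketch, assuming the Lambda layers really do come from multi_gpu_model) is to train with the parallel wrapper but save and convert the original single-GPU template, which shares its weights with the wrapper:
parallel_model = multi_gpu_model(model, gpus=2)
parallel_model.compile(optimizer=opt, loss='categorical_crossentropy',
                       metrics=['accuracy'])
parallel_model.fit_generator(train_flow, train_flow.n // train_flow.batch_size,
                             epochs=200)  # callbacks omitted for brevity
# 'model' shares its weights with 'parallel_model', so saving it captures the
# trained weights without the multi-GPU Lambda/Concatenate layers.
model.save('resnet50_single_gpu.h5')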

Keras functional API slower than Sequential / Not improving

SOLVED! (Had to set trainable=True on the Embedding layer in the functional model; see the summaries below.)
I am currently converting my Keras model from the Sequential to the Functional API. While the Sequential model improves to an accuracy of 1 after about 10 epochs, the Functional API model does not even reach 0.7 and does not improve further. Apart from the Input layer, both nets should be the same.
Sequential:
model = Sequential()
model.add(Embedding(20000, 256, input_length=30))
model.add(SpatialDropout1D(0.4))
model.add(LSTM(256, dropout=0.3, recurrent_dropout=0.3))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer=Adam(lr=0.0001), metrics=['accuracy'])
print(model.summary())
Output is:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_6 (Embedding) (None, 30, 256) 5120000
_________________________________________________________________
spatial_dropout1d_5 (Spatial (None, 30, 256) 0
_________________________________________________________________
lstm_5 (LSTM) (None, 256) 525312
_________________________________________________________________
dense_6 (Dense) (None, 1) 257
=================================================================
Total params: 5,645,569
Trainable params: 5,645,569
Non-trainable params: 0
_________________________________________________________________
None
For the functional API:
inputs = Input(shape=(31,))
embed = Embedding(20000, 256, trainable=False)(inputs)
drop = SpatialDropout1D(0.4)(embed)
lstm = LSTM(256, dropout=0.3, recurrent_dropout=0.3)(drop)
acti = Dense(1, activation='sigmoid')(lstm)
model = Model(inputs=inputs, outputs=acti)
model.compile(loss='binary_crossentropy', optimizer=Adam(lr=0.0001), metrics=['accuracy'])
print(model.summary())
Result
Model: "model_5"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_8 (InputLayer) (None, 31) 0
_________________________________________________________________
embedding_7 (Embedding) (None, 31, 256) 5120000
_________________________________________________________________
spatial_dropout1d_6 (Spatial (None, 31, 256) 0
_________________________________________________________________
lstm_6 (LSTM) (None, 256) 525312
_________________________________________________________________
dense_7 (Dense) (None, 1) 257
=================================================================
Total params: 5,645,569
Trainable params: 525,569
Non-trainable params: 5,120,000
_________________________________________________________________
None
Have I overlooked something, or can someone explain my results?
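For reference, here is a functional version that matches the Sequential defaults (my sketch of the fix hinted at in the SOLVED note: drop trainable=False and match the sequence length):
inputs = Input(shape=(30,))            # match input_length=30 of the Sequential model
embed = Embedding(20000, 256)(inputs)  # trainable by default, as in the Sequential version
drop = SpatialDropout1D(0.4)(embed)
lstm = LSTM(256, dropout=0.3, recurrent_dropout=0.3)(drop)
acti = Dense(1, activation='sigmoid')(lstm)
model = Model(inputs=inputs, outputs=acti)
model.compile(loss='binary_crossentropy', optimizer=Adam(lr=0.0001), metrics=['accuracy'])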

Prepending Downsample layer to Resnet50 Pretrained Model

I am using keras 1.1.1 in windows 7 with tensorflow backend.
I am trying to prepend the stock ResNet50 pretrained model with an image downsampler. Below is my code.
from keras.applications.resnet50 import ResNet50
import keras.layers
# this could also be the output a different Keras model or layer
input = keras.layers.Input(shape=(400, 400, 1)) # this assumes K.image_dim_ordering() == 'tf'
x1 = keras.layers.AveragePooling2D(pool_size=(2,2))(input)
x2 = keras.layers.Flatten()(x1)
x3 = keras.layers.RepeatVector(3)(x2)
x4 = keras.layers.Reshape((200, 200, 3))(x3)
x5 = keras.layers.ZeroPadding2D(padding=(12,12))(x4)
m = keras.models.Model(input, x5)
model = ResNet50(input_tensor=m.output, weights='imagenet', include_top=False)
but I get an error which I am unsure how to fix.
builtins.Exception: Graph disconnected: cannot obtain value for tensor Output("input_2:0", shape=(?, 400, 400, 1), dtype=float32) at layer "input_2". The following previous layers were accessed without issue: []
You can use both the Functional API and Sequential approaches to solve this. See working examples for both approaches below:
from keras.applications.resnet50 import ResNet50
from keras.models import Sequential, Model
from keras.layers import AveragePooling2D, Flatten, RepeatVector, Reshape, ZeroPadding2D, Input, Dense
pretrained = ResNet50(input_shape=(224, 224, 3), weights='imagenet', include_top=False)
# Sequential method
model_1 = Sequential()
model_1.add(AveragePooling2D(pool_size=(2, 2), input_shape=(400, 400, 1)))
model_1.add(Flatten())
model_1.add(RepeatVector(3))
model_1.add(Reshape((200, 200, 3)))
model_1.add(ZeroPadding2D(padding=(12, 12)))
model_1.add(pretrained)
model_1.add(Dense(1))
# functional API method
input = Input(shape=(400, 400, 1))
x = AveragePooling2D(pool_size=(2, 2))(input)
x = Flatten()(x)
x = RepeatVector(3)(x)
x = Reshape((200, 200, 3))(x)
x = ZeroPadding2D(padding=(12, 12))(x)
x = pretrained(x)
preds = Dense(1)(x)
model_2 = Model(input, preds)
model_1.summary()
model_2.summary()
The summaries (note: these were generated with Xception standing in for ResNet50, so read the xception rows accordingly):
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
average_pooling2d_1 (Average (None, 200, 200, 1) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 40000) 0
_________________________________________________________________
repeat_vector_1 (RepeatVecto (None, 3, 40000) 0
_________________________________________________________________
reshape_1 (Reshape) (None, 200, 200, 3) 0
_________________________________________________________________
zero_padding2d_1 (ZeroPaddin (None, 224, 224, 3) 0
_________________________________________________________________
xception (Model) (None, 7, 7, 2048) 20861480
_________________________________________________________________
dense_1 (Dense) (None, 7, 7, 1) 2049
=================================================================
Total params: 20,863,529
Trainable params: 20,809,001
Non-trainable params: 54,528
_________________________________________________________________
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) (None, 400, 400, 1) 0
_________________________________________________________________
average_pooling2d_2 (Average (None, 200, 200, 1) 0
_________________________________________________________________
flatten_2 (Flatten) (None, 40000) 0
_________________________________________________________________
repeat_vector_2 (RepeatVecto (None, 3, 40000) 0
_________________________________________________________________
reshape_2 (Reshape) (None, 200, 200, 3) 0
_________________________________________________________________
zero_padding2d_2 (ZeroPaddin (None, 224, 224, 3) 0
_________________________________________________________________
xception (Model) (None, 7, 7, 2048) 20861480
_________________________________________________________________
dense_2 (Dense) (None, 7, 7, 1) 2049
=================================================================
Total params: 20,863,529
Trainable params: 20,809,001
Non-trainable params: 54,528
_________________________________________________________________
Both approaches work fine. If you plan on freezing the pretrained model and letting only the pre/post layers learn, and afterward fine-tuning the whole model, the approach I found to work goes like so:
# given the same resnet model as before...
model = load_model('modelname.h5')
# pull out the nested pretrained model
nested_model = model.layers[5]  # assuming the pretrained model is the 6th layer
# loop over the nested model's layers to allow training
for l in nested_model.layers:
    l.trainable = True
# nested_model is the same object as model.layers[5], so the change applies in place
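One caveat worth adding (my note, based on how Keras applies the trainable flag): the change only takes effect when the model is compiled, so recompile before fine-tuning:
# Recompile so the new trainable flags take effect (optimizer/loss are placeholders).
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])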
