Modify ResNet50 output layer for regression - keras

I am trying to create a ResNet50 model for a regression problem, with an output value ranging from -1 to 1.
I omitted the classes argument, and in my preprocessing step I resize my images to 224,224,3.
I try to create the model with
def create_resnet(load_pretrained=False):
if load_pretrained:
weights = 'imagenet'
else:
weights = None
# Get base model
base_model = ResNet50(weights=weights)
optimizer = Adam(lr=1e-3)
base_model.compile(loss='mse', optimizer=optimizer)
return base_model
and then create the model, print the summary and use the fit_generator to train
history = model.fit_generator(batch_generator(X_train, y_train, 100, 1),
steps_per_epoch=300,
epochs=10,
validation_data=batch_generator(X_valid, y_valid, 100, 0),
validation_steps=200,
verbose=1,
shuffle = 1)
I get an error though that says
ValueError: Error when checking target: expected fc1000 to have shape (1000,) but got array with shape (1,)
Looking at the model summary, this makes sense, since the final Dense layer has an output shape of (None, 1000)
fc1000 (Dense) (None, 1000) 2049000 avg_pool[0][0]
But I can't figure out how to modify the model. I've read through the Keras documentation and looked at several examples, but pretty much everything I see is for a classification model.
How can I modify the model so it is formatted properly for regression?

Your code is throwing the error because you're using the original fully-connected top layer that was trained to classify images into one of 1000 classes. To make the network working, you need to replace this top layer with your own which should have the shape compatible with your dataset and task.
Here is a small snippet I was using to create an ImageNet pre-trained model for the regression task (face landmarks prediction) with Keras:
NUM_OF_LANDMARKS = 136
def create_model(input_shape, top='flatten'):
if top not in ('flatten', 'avg', 'max'):
raise ValueError('unexpected top layer type: %s' % top)
# connects base model with new "head"
BottleneckLayer = {
'flatten': Flatten(),
'avg': GlobalAvgPooling2D(),
'max': GlobalMaxPooling2D()
}[top]
base = InceptionResNetV2(input_shape=input_shape,
include_top=False,
weights='imagenet')
x = BottleneckLayer(base.output)
x = Dense(NUM_OF_LANDMARKS, activation='linear')(x)
model = Model(inputs=base.inputs, outputs=x)
return model
In your case, I guess you only need to replace InceptionResNetV2 with ResNet50. Essentially, you are creating a pre-trained model without top layers:
base = ResNet50(input_shape=input_shape, include_top=False)
And then attaching your custom layer on top of it:
x = Flatten()(base.output)
x = Dense(NUM_OF_LANDMARKS, activation='sigmoid')(x)
model = Model(inputs=base.inputs, outputs=x)
That's it.
You also can check this link from the Keras repository that shows how ResNet50 is constructed internally. I believe it will give you some insights about the functional API and layers replacement.
Also, I would say that both regression and classification tasks are not that different if we're talking about fine-tuning pre-trained ImageNet models. The type of task mostly depends on your loss function and the top layer's activation function. Otherwise, you still have a fully-connected layer with N outputs but they are interpreted in a different way.

Related

Errors while fine tuning InceptionV3 in Keras

I am going to fine-tune InceptionV3 model using my self-defined dataset. Unfortunately, when using model.fit to train, here comes the error below:
ValueError: Error when checking target: expected dense_6 to have shape (4,) but got array with shape (1,)
Firstly, I load my own dataset as training_data which contains a pair of image and corresponding label. Then, I use the code below to convert them into specific array-type(img_new and label_new) so that it's compatible to Keras's inputs of both data and labels.
for img, label in training_data:
img_new[i,:,:,:] = img
label_new[i,:] = label
i=i+1
Second, I fine tune the Inception Model below.
InceptionV3_model=keras.applications.inception_v3.InceptionV3(include_top=False,
weights='imagenet',
input_tensor=None,
input_shape=None,
pooling=None,
classes=1000)
#InceptionV3_model.summary()
# add a global spatial average pooling layer
x = InceptionV3_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 4 classes
predictions = Dense(4, activation='softmax')(x)
# this is the model we will train
model = Model(inputs=InceptionV3_model.input, outputs=predictions)
# Transfer Learning
for layer in model.layers[:311]:
layer.trainable = False
for layer in model.layers[311:]:
layer.trainable = True
from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.001, momentum=0.9), loss='categorical_crossentropy')
model.fit(x=X_train, y=y_train, batch_size=3, epochs=3, validation_split=0.2)
model.save_weights('first_try.h5')
Does anyone have ideas of what is wrong while training using model.fit?
Sincerely thanks for your kind help.
The error is caused because my labels r integers, I gotta compile it by sparse_categorical_crossentropy which is set for integer labels instead of categorical_crossentropy which is used for one-hot encoding.
Sincerely thank for the help by #Amir very much. :-)

how to obtain the runtime batch size of a Keras model

Based on this post. I need some basic implementation help. Below you see my model using a Dropout layer. When using the noise_shape parameter, it happens that the last batch does not fit into the batch size creating an error (see other post).
Original model:
def LSTM_model(X_train,Y_train,dropout,hidden_units,MaskWert,batchsize):
model = Sequential()
model.add(Masking(mask_value=MaskWert, input_shape=(X_train.shape[1],X_train.shape[2]) ))
model.add(Dropout(dropout, noise_shape=(batchsize, 1, X_train.shape[2]) ))
model.add(Dense(hidden_units, activation='sigmoid', kernel_constraint=max_norm(max_value=4.) ))
model.add(LSTM(hidden_units, return_sequences=True, dropout=dropout, recurrent_dropout=dropout))
Now Alexandre Passos suggested to get the runtime batchsize with tf.shape. I tried to implement the runtime batchsize idea it into Keras in different ways but never working.
import Keras.backend as K
def backend_shape(x):
return K.shape(x)
def LSTM_model(X_train,Y_train,dropout,hidden_units,MaskWert,batchsize):
batchsize=backend_shape(X_train)
model = Sequential()
...
model.add(Dropout(dropout, noise_shape=(batchsize[0], 1, X_train.shape[2]) ))
...
But that did just give me the input tensor shape but not the runtime input tensor shape.
I also tried to use a Lambda Layer
def output_of_lambda(input_shape):
return (input_shape)
def LSTM_model_2(X_train,Y_train,dropout,hidden_units,MaskWert,batchsize):
model = Sequential()
model.add(Lambda(output_of_lambda, outputshape=output_of_lambda))
...
model.add(Dropout(dropout, noise_shape=(outputshape[0], 1, X_train.shape[2]) ))
And different variants. But as you already guessed, that did not work at all.
Is the model definition actually the correct place?
Could you give me a tip or better just tell me how to obtain the running batch size of a Keras model? Thanks so much.
The current implementation does adjust the according to the runtime batch size. From the Dropout layer implementation code:
symbolic_shape = K.shape(inputs)
noise_shape = [symbolic_shape[axis] if shape is None else shape
for axis, shape in enumerate(self.noise_shape)]
So if you give noise_shape=(None, 1, features) the shape will be (runtime_batchsize, 1, features) following the code above.

Modify layers in resnet model

I am trying to train resnet50 model for image classification problem. I have loaded the pretrained 'imagenet' weights before training the model on the dataset I have. I want to insert a layer (mean subtraction layer) in-between the input layer and the first convolutiuon layer.
model = ResNet50(weights='imagenet')
def mean_subtract(img):
img = T.set_subtensor(img[:,0,:,:],img[:,0,:,:] - 123.68)
img = T.set_subtensor(img[:,1,:,:],img[:,1,:,:] - 116.779)
img = T.set_subtensor(img[:,2,:,:],img[:,2,:,:] - 103.939)
return img / 255.0
I want to insert inputs = Lambda(mean_subtract, name='mean_subtraction')(inputs) next to the input layer and connect this to the first convolution layer of resnet model without losing the weights saved.
How do I do that?
Thanks!
Quick answer (Seems better than adding the function to the model)
Use the preprocessing function as described here: preprocessing images generated using keras function ImageDataGenerator() to train resnet50 model
Long answer
Since your function doesn't change shapes, you can put it in an outer model without changing the Resnet model (changing models may not be so simple, I always try to mount new models with parts from other models if needed).
resnet_model = ResNet50(weights='imagenet')
inputs = Input((None,None,3))
#it seems you're using (3,None,None) instead.
#choose based on your "data_format", which by default is channels_last
outputs = Lambda(mean_subtract,output_shape=not_necessary_with_tensorflow)(inputs)
outputs = resnet_model(outputs)
model = Model(inputs, outputs)

Is it possible to train using same model with two inputs?

Hello I have a some question for keras.
currently i want implement some network
using same cnn model, and use two images as input of cnn model
and use two result of cnn model, provide to Dense model
for example
def cnn_model():
input = Input(shape=(None, None, 3))
x = Conv2D(8, (3, 3), strides=(1, 1))(input)
x = GlobalAvgPool2D()(x)
model = Model(input, x)
return model
def fc_model(cnn1, cnn2):
input_1 = cnn1.output
input_2 = cnn2.output
input = concatenate([input_1, input_2])
x = Dense(1, input_shape=(None, 16))(input)
x = Activation('sigmoid')(x)
model = Model([cnn1.input, cnn2.input], x)
return model
def main():
cnn1 = cnn_model()
cnn2 = cnn_model()
model = fc_model(cnn1, cnn2)
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(x=[image1, image2], y=[1.0, 1.0], batch_size=1, ecpochs=1)
i want to implement model something like this, and train models
but i got error message like below :
'All layer names should be unique'
Actually i want use only one CNN model as feature extractor and finally use two features to predict one float value as 0.0 ~ 1.0
so whole system -->>
using two images and extract features from same CNN model, and features are provided to Dense model to get one floating value
Please, help me implement this system and how to train..
Thank you
See the section of the Keras documentation on shared layers:
https://keras.io/getting-started/functional-api-guide/
A code snippet from the documentation above demonstrating this:
# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)
# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)
# We can then concatenate the two vectors:
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)
# And add a logistic regression on top
predictions = Dense(1, activation='sigmoid')(merged_vector)
# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=predictions)
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
model.fit([data_a, data_b], labels, epochs=10)

Dimensions not matching in keras LSTM model

I want to use an LSTM neural Network with keras to forecast groups of time series and I am having troubles in making the model match what I want. The dimensions of my data are:
input tensor: (data length, number of series to train, time steps to look back)
output tensor: (data length, number of series to forecast, time steps to look ahead)
Note: I want to keep the dimensions exactly like that, no
transposition.
A dummy data code that reproduces the problem is:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, TimeDistributed, LSTM
epoch_number = 100
batch_size = 20
input_dim = 4
output_dim = 3
look_back = 24
look_ahead = 24
n = 100
trainX = np.random.rand(n, input_dim, look_back)
trainY = np.random.rand(n, output_dim, look_ahead)
print('test X:', trainX.shape)
print('test Y:', trainY.shape)
model = Sequential()
# Add the first LSTM layer (The intermediate layers need to pass the sequences to the next layer)
model.add(LSTM(10, batch_input_shape=(None, input_dim, look_back), return_sequences=True))
# add the first LSTM layer (the dimensions are only needed in the first layer)
model.add(LSTM(10, return_sequences=True))
# the TimeDistributed object allows a 3D output
model.add(TimeDistributed(Dense(look_ahead)))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
model.fit(trainX, trainY, nb_epoch=epoch_number, batch_size=batch_size, verbose=1)
This trows:
Exception: Error when checking model target: expected
timedistributed_1 to have shape (None, 4, 24) but got array with shape
(100, 3, 24)
The problem seems to be when defining the TimeDistributed layer.
How do I define the TimeDistributed layer so that it compiles and trains?
The error message is a bit misleading in your case. Your output node of the network is called timedistributed_1 because that's the last node in your sequential model. What the error message is trying to tell you is that the output of this node does not match the target your model is fitting to, i.e. your labels trainY.
Your trainY has a shape of (n, output_dim, look_ahead), so (100, 3, 24) but the network is producing an output shape of (batch_size, input_dim, look_ahead). The problem in this case is that output_dim != input_dim. If your time dimension changes you may need padding or a network node that removes said timestep.
I think the problem is that you expect output_dim (!= input_dim) at the output of TimeDistributed, while it's not possible. This dimension is what it considers as the time dimension: it is preserved.
The input should be at least 3D, and the dimension of index one will
be considered to be the temporal dimension.
The purpose of TimeDistributed is to apply the same layer to each time step. You can only end up with the same number of time steps as you started with.
If you really need to bring down this dimension from 4 to 3, I think you will need to either add another layer at the end, or use something different from TimeDistributed.
PS: one hint towards finding this issue was that output_dim is never used when creating the model, it only appears in the validation data. While it's only a code smell (there might not be anything wrong with this observation), it's something worth checking.

Resources