Training only one output of a network in Keras

I have a network in Keras with many outputs; however, my training data only provides information for a single output at a time.
At the moment, my method for training is to run a prediction on the input in question, change the value of the particular output that I am training, and then do a single batch update. If I'm right, this is the same as setting the loss for all outputs to zero except the one that I'm trying to train.
Is there a better way? I've tried class weights, setting a zero weight for all but the output I'm training, but that doesn't give me the results I expect.
I'm using the Theano backend.
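A minimal sketch of the idea described above, assuming a mean squared error loss (the question does not say which loss is used). If the targets for the untrained outputs are copied from the network's own prediction, the loss gradient for those outputs is zero, so only the chosen output drives the update. The output values and the helper name mse_output_gradients are hypothetical.

```python
def mse_output_gradients(predictions, targets):
    """Gradient of the mean squared error with respect to each output value."""
    n = len(predictions)
    return [2.0 * (p - t) / n for p, t in zip(predictions, targets)]

# Hypothetical network outputs for one sample (three outputs).
predictions = [0.2, 0.7, 0.1]

# Only output 1 has a known training value; the others reuse the prediction.
trained_index = 1
targets = list(predictions)
targets[trained_index] = 1.0

grads = mse_output_gradients(predictions, targets)
print(grads)  # non-zero only at trained_index
```

This only covers the gradient of the loss itself. Optimizers that keep state, such as momentum, can still move weights on a step where the gradient is zero.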

Outputting multiple results and optimizing only one of them
Let's say you want to return output from multiple layers, maybe from some intermediate layers, but you need to optimize only one target output. Here's how you can do it:
Let's start with this model:
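from keras.layers import Input, Dense
from keras.models import Model
inputs = Input(shape=(784,))
x = Dense(64, activation='relu')(inputs)
# you want to extract these values
useful_info = Dense(32, activation='relu', name='useful_info')(x)
# final output, used for loss calculation and optimization
# (num_classes is an assumed value; a softmax over a single unit always outputs 1.0)
num_classes = 10
result = Dense(num_classes, activation='softmax', name='result')(useful_info)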
Compile with multiple outputs, setting the loss to None for the extra outputs:
# Give None for outputs that you don't want to use for loss calculation and optimization
model = Model(inputs=inputs, outputs=[result, useful_info])
model.compile(optimizer='rmsprop',
              loss=['categorical_crossentropy', None],
              metrics=['accuracy'])
Provide only the target outputs when training, skipping the extra outputs:
model.fit(my_inputs, {'result': train_labels}, epochs=.., batch_size=...)
# this also works:
#model.fit(my_inputs, [train_labels], epochs=.., batch_size=...)
One predict to get them all
With a single model, you can run predict once to get all the outputs you need:
predicted_labels, useful_info = model.predict(new_x)
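The effect of a None loss can be pictured with a small conceptual sketch. This is not Keras's implementation, only an illustration under the assumption that outputs without a loss are skipped when the total loss is summed, so they need no target and contribute no gradient. The names total_loss and squared_error are hypothetical.

```python
def squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def total_loss(loss_fns, targets, outputs):
    """Sum the losses of the outputs that have a loss function; skip None."""
    total = 0.0
    for name, loss_fn in loss_fns.items():
        if loss_fn is None:
            continue  # no target needed, no contribution to the loss
        total += loss_fn(targets[name], outputs[name])
    return total

loss_fns = {'result': squared_error, 'useful_info': None}
outputs = {'result': [0.5, 0.5], 'useful_info': [3.0, -1.0, 2.0]}
targets = {'result': [1.0, 0.0]}  # nothing supplied for 'useful_info'

loss = total_loss(loss_fns, targets, outputs)
print(loss)
```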

To achieve this, I ended up using the functional API. You basically create multiple models that share the same input and hidden layers but have different output layers.
For example:
https://keras.io/getting-started/functional-api-guide/
from keras.layers import Input, Dense
from keras.models import Model
# This returns a tensor
inputs = Input(shape=(784,))
# a layer instance is callable on a tensor, and returns a tensor
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
# num_classes is an assumed value; a softmax over a single unit always outputs 1.0
num_classes = 10
predictions_A = Dense(num_classes, activation='softmax')(x)
predictions_B = Dense(num_classes, activation='softmax')(x)
# This creates a model that includes
# the Input layer and three Dense layers
modelA = Model(inputs=inputs, outputs=predictions_A)
modelA.compile(optimizer='rmsprop',
               loss='categorical_crossentropy',
               metrics=['accuracy'])
modelB = Model(inputs=inputs, outputs=predictions_B)
modelB.compile(optimizer='rmsprop',
               loss='categorical_crossentropy',
               metrics=['accuracy'])
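Why this works can be seen in a toy sketch with one-weight "layers" (hypothetical classes, not Keras code): both models hold a reference to the same hidden layer object, so a training step through model A changes the hidden weights that model B also uses, while model B's own output layer stays untouched.

```python
class Scale:
    """A one-weight 'layer': output = weight * input."""
    def __init__(self, weight):
        self.weight = weight

    def __call__(self, x):
        return self.weight * x


class TwoLayerModel:
    def __init__(self, hidden, head):
        self.hidden = hidden
        self.head = head

    def predict(self, x):
        return self.head(self.hidden(x))

    def train_step(self, x, target, lr=0.1):
        # One gradient step on the loss 0.5 * (prediction - target) ** 2.
        h = self.hidden(x)
        error = self.head(h) - target
        grad_head = error * h
        grad_hidden = error * self.head.weight * x
        self.head.weight -= lr * grad_head
        self.hidden.weight -= lr * grad_hidden


shared_hidden = Scale(1.0)
head_a, head_b = Scale(1.0), Scale(1.0)
model_a = TwoLayerModel(shared_hidden, head_a)
model_b = TwoLayerModel(shared_hidden, head_b)

before = model_b.predict(2.0)
model_a.train_step(2.0, target=4.0)   # trains head A and the shared layer
after = model_b.predict(2.0)
print(before, after, head_b.weight)
```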

Related

tensorflow sequential model outputting nan

Why is my code outputting nan? I'm using a sequential model with a 30x1 input vector and a single value output. I'm using tensorflow and python. This is one of my firs
While True:
# Define a simple sequential model
def create_model():
    model = tf.keras.Sequential([
        keras.layers.Dense(30, activation='relu',input_shape=(30,)),
        keras.layers.Dense(12, activation='relu'),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(7, activation='relu'),
        keras.layers.Dense(1, activation = 'sigmoid')
    ])
    model.compile(optimizer='adam',
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])
    return model
# Create a basic model instance
model = create_model()
# Display the model's architecture
model.summary()
train_labels=[1]
test_labels=[1]
train_images= [[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30]]
test_images=[[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30]]
model.fit(train_images,
          train_labels,
          epochs=10,
          validation_data=(test_images, test_labels),
          verbose=1)
print('predicted:',model.predict(train_images))
You are using SparseCategoricalCrossentropy, which expects labels to be integers starting from 0. Your only label is 1, which implies at least two categories, 0 and 1. So you need at least two neurons in the last layer:
keras.layers.Dense(2, activation = 'sigmoid')
(If your goal is classification, consider using softmax instead of sigmoid, without from_logits=True.)
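The mismatch can be illustrated with a pure-Python sketch of sparse categorical cross-entropy (a simplified stand-in, not TensorFlow's implementation). The integer label is used as an index into the output units, so label 1 needs at least two units. Here the out-of-range case is reported as a ValueError, which is a choice made for this sketch.

```python
import math

def softmax(logits):
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(label, logits):
    """Cross-entropy for an integer class label, computed from raw logits."""
    if not 0 <= label < len(logits):
        raise ValueError(
            f"label {label} is out of range for {len(logits)} output unit(s)")
    return -math.log(softmax(logits)[label])

# Two output units: label 1 is a valid class index.
loss_two_units = sparse_categorical_crossentropy(1, [0.0, 0.0])

# One output unit: label 1 has no matching unit.
try:
    sparse_categorical_crossentropy(1, [0.3])
    single_unit_error = None
except ValueError as err:
    single_unit_error = str(err)

print(loss_two_units, single_unit_error)
```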
You're using the wrong loss function for those labels. You need to use BinaryCrossentropy.
Change:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])
To:
model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=[tf.keras.metrics.BinaryAccuracy()])
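For reference, a minimal sketch of what binary cross-entropy computes for a single sigmoid output and a 0/1 label. The clipping constant eps is an assumption of this sketch, added so the logarithm never receives 0.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Cross-entropy for one 0/1 label and one predicted probability."""
    p = min(max(y_prob, eps), 1.0 - eps)  # keep log() away from 0
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

loss_uncertain = binary_crossentropy(1, 0.5)
loss_confident = binary_crossentropy(1, 0.99)
loss_wrong = binary_crossentropy(1, 0.01)
print(loss_confident, loss_uncertain, loss_wrong)
```

The loss stays finite even for a prediction of exactly 0.0 or 1.0 because of the clipping.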

Is it possible to train a CNN starting at an intermediate layer (in general and in Keras)?

I'm using mobilenet v2 to train a model on my images. I've frozen all but a few layers and then added additional layers for training. I'd like to be able to train from an intermediate layer rather than from the beginning. My questions:
Is it possible to provide the output of the last frozen layer as the input for training (it would be a tensor of (?, 7,7,1280))?
How does one specify training to start from that first trainable (non-frozen) layer? In this case, mbnetv2_conv.layer[153].
What is y_train in this case? I don't quite understand how y_train is being used during the training process. In general, when does the CNN refer back to y_train?
# Load mobilenet v2
image_size = 224
mbnetv2_conv = MobileNetV2(weights='imagenet', include_top=False, input_shape=(image_size, image_size, 3))
# Freeze all layers except the last 3 layers
for layer in mbnetv2_conv.layers[:-3]:
    layer.trainable = False
# Create the model
model = models.Sequential()
model.add(mbnetv2_conv)
model.add(layers.Flatten())
model.add(layers.Dense(16, activation='relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(3, activation='softmax'))
model.summary()
# Build an array (?,224,224,3) from images
x_train = np.array(all_images)
# Get layer output
from keras import backend as K
get_last_frozen_layer_output = K.function([mbnetv2_conv.layers[0].input],
                                          [mbnetv2_conv.layers[152].output])
last_frozen_layer_output = get_last_frozen_layer_output([x_train])[0]
# Compile the model
from keras.optimizers import SGD
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['acc'])
# how to train from a specific layer and what should y_train be?
model.fit(last_frozen_layer_output, y_train, batch_size=2, epochs=10)
Yes, you can. Two different ways.
First, the hard way makes you build two new models, one with all your frozen layers, one with all your trainable layers. Add a Flatten() layer to the frozen-layers-only model. And you will copy the weights from mobilenet v2 layer by layer to populate the weights of the frozen-layers-only model. Then you will run your input images through the frozen-layers-only model, saving the output to disk in CSV or pickle form. This is now the input for your trainable-layers model, which you train with the model.fit() command as you did above. Save the weights when you're done training. Then you will have to build the original model with both sets of layers, and load the weights into each layer, and save the whole thing. You're done!
However, the easier way is to save the weights of your model separately from the architecture with:
model.save_weights(filename)
then modify the layer.trainable property of the layers in MobileNetV2 before you add it into a new empty model:
mbnetv2_conv = MobileNetV2(weights='imagenet', include_top=False, input_shape=(image_size, image_size, 3))
for layer in mbnetv2_conv.layers[:153]:
    layer.trainable = False
model = models.Sequential()
model.add(mbnetv2_conv)
then reload the weights with
model.load_weights(filename)
This lets you adjust which layers in your mbnetv2_conv model you will be training on the fly, and then just call model.fit() to continue training.
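The first ("hard") approach boils down to two steps: run the frozen part once and cache its outputs, then train only the trainable part on the cached features. A toy sketch of that pattern, with a fixed function standing in for the frozen layers and a two-weight linear head standing in for the trainable layers (all names and values here are hypothetical):

```python
def frozen_base(x):
    """Stand-in for the frozen layers: a fixed, non-trainable feature map."""
    return [x, x * x]

def head(weights, features):
    return sum(w * f for w, f in zip(weights, features))

def mean_squared_error(weights, cached_features, targets):
    errors = [head(weights, f) - t for f, t in zip(cached_features, targets)]
    return sum(e * e for e in errors) / len(errors)

inputs = [0.0, 1.0, 2.0, 3.0]
targets = [1.0 * x + 0.5 * x * x for x in inputs]  # toy targets

# Step 1: run the frozen part once and keep its outputs.
cached_features = [frozen_base(x) for x in inputs]

# Step 2: train only the head, using the cached features as its input.
weights = [0.0, 0.0]
initial_loss = mean_squared_error(weights, cached_features, targets)
learning_rate = 0.01
for _ in range(500):
    for features, target in zip(cached_features, targets):
        error = head(weights, features) - target
        weights = [w - learning_rate * error * f
                   for w, f in zip(weights, features)]
final_loss = mean_squared_error(weights, cached_features, targets)
print(initial_loss, final_loss)
```

The frozen part is never evaluated inside the training loop, which is the point of caching its outputs.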

Errors while fine tuning InceptionV3 in Keras

I am trying to fine-tune the InceptionV3 model on my own dataset. Unfortunately, when I call model.fit to train, I get the error below:
ValueError: Error when checking target: expected dense_6 to have shape (4,) but got array with shape (1,)
First, I load my own dataset as training_data, which contains pairs of an image and its corresponding label. Then I use the code below to convert them into arrays (img_new and label_new) so that they are compatible with the data and label inputs Keras expects.
for img, label in training_data:
    img_new[i,:,:,:] = img
    label_new[i,:] = label
    i=i+1
Second, I fine-tune the Inception model below.
InceptionV3_model=keras.applications.inception_v3.InceptionV3(include_top=False,
                                                              weights='imagenet',
                                                              input_tensor=None,
                                                              input_shape=None,
                                                              pooling=None,
                                                              classes=1000)
#InceptionV3_model.summary()
# add a global spatial average pooling layer
x = InceptionV3_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 4 classes
predictions = Dense(4, activation='softmax')(x)
# this is the model we will train
model = Model(inputs=InceptionV3_model.input, outputs=predictions)
# Transfer Learning
for layer in model.layers[:311]:
    layer.trainable = False
for layer in model.layers[311:]:
    layer.trainable = True
from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.001, momentum=0.9), loss='categorical_crossentropy')
model.fit(x=X_train, y=y_train, batch_size=3, epochs=3, validation_split=0.2)
model.save_weights('first_try.h5')
Does anyone have an idea of what is going wrong when training with model.fit?
Thank you sincerely for your kind help.
The error occurs because my labels are integers. I have to compile the model with sparse_categorical_crossentropy, which is meant for integer labels, instead of categorical_crossentropy, which is used for one-hot encoded labels.
Many thanks to @Amir for the help. :-)
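The difference between the two label formats can be sketched in plain Python (simplified stand-ins for the Keras losses, with a hypothetical softmax output). An integer label has shape (1,) per sample, while its one-hot encoding for 4 classes has shape (4,), which matches the shapes named in the error message. Both losses give the same value once the label format matches.

```python
import math

def one_hot(label, num_classes):
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def categorical_crossentropy(one_hot_target, probs):
    """Expects a one-hot target with one entry per class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs))

def sparse_categorical_crossentropy(label, probs):
    """Expects a single integer class index."""
    return -math.log(probs[label])

probs = [0.1, 0.2, 0.6, 0.1]         # hypothetical softmax output, 4 classes
integer_label = 2                    # one value per sample
target = one_hot(integer_label, 4)   # four values per sample

dense_loss = categorical_crossentropy(target, probs)
sparse_loss = sparse_categorical_crossentropy(integer_label, probs)
print(target, dense_loss, sparse_loss)
```

So the alternative to switching the loss is to keep categorical_crossentropy and one-hot encode the integer labels first.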

Is it possible to train using same model with two inputs?

Hello, I have a question about Keras.
I currently want to implement a network that uses the same CNN model on two input images and feeds the two CNN outputs to a Dense model.
For example:
def cnn_model():
    input = Input(shape=(None, None, 3))
    x = Conv2D(8, (3, 3), strides=(1, 1))(input)
    x = GlobalAvgPool2D()(x)
    model = Model(input, x)
    return model
def fc_model(cnn1, cnn2):
    input_1 = cnn1.output
    input_2 = cnn2.output
    input = concatenate([input_1, input_2])
    x = Dense(1, input_shape=(None, 16))(input)
    x = Activation('sigmoid')(x)
    model = Model([cnn1.input, cnn2.input], x)
    return model
def main():
    cnn1 = cnn_model()
    cnn2 = cnn_model()
    model = fc_model(cnn1, cnn2)
    model.compile(optimizer='adam', loss='mean_squared_error')
    model.fit(x=[image1, image2], y=[1.0, 1.0], batch_size=1, epochs=1)
I want to implement a model like this and train it, but I got the error message below:
'All layer names should be unique'
Actually, I want to use only one CNN model as the feature extractor and then use the two feature vectors to predict a single float value between 0.0 and 1.0.
So the whole system takes two images, extracts features from each with the same CNN model, and passes the features to a Dense model to get one floating-point value.
Please help me implement this system and explain how to train it.
Thank you.
See the section of the Keras documentation on shared layers:
https://keras.io/getting-started/functional-api-guide/
A code snippet from the documentation above demonstrating this:
# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)
# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)
# We can then concatenate the two vectors:
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)
# And add a logistic regression on top
predictions = Dense(1, activation='sigmoid')(merged_vector)
# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=predictions)
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit([data_a, data_b], labels, epochs=10)
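Applied to the two-image case from the question, the same pattern looks roughly like the toy sketch below (plain Python, hypothetical names, with a trivial "mean pixel" extractor in place of a CNN). There is a single extractor object with one set of weights. It is called once per image, and the two feature vectors are concatenated before the final sigmoid unit.

```python
import math

class SharedExtractor:
    """Toy feature extractor with one set of weights, reused for every input."""
    def __init__(self, weights):
        self.weights = weights

    def __call__(self, image):
        # One feature per weight: the weight times the mean pixel value.
        mean = sum(image) / len(image)
        return [w * mean for w in self.weights]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(extractor, dense_weights, image_1, image_2):
    # The same extractor instance handles both images.
    features = extractor(image_1) + extractor(image_2)  # list concatenation
    z = sum(w * f for w, f in zip(dense_weights, features))
    return sigmoid(z)

extractor = SharedExtractor([0.5, -0.25])
dense_weights = [0.1, 0.2, 0.3, 0.4]
image_1 = [0.0, 1.0, 1.0, 0.0]   # hypothetical flattened images
image_2 = [1.0, 1.0, 1.0, 1.0]

score = predict(extractor, dense_weights, image_1, image_2)
print(score)
```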

ValueError when Fine-tuning Inception_v3 in Keras

I am trying to fine-tune pre-trained Inceptionv3 in Keras for a multi-label (17) prediction problem.
Here's the code:
# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)
# add a new top layer
x = base_model.output
predictions = Dense(17, activation='sigmoid')(x)
# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
# we need to recompile the model for these modifications to take effect
# we use SGD with a low learning rate
from keras.optimizers import SGD
model.compile(loss='binary_crossentropy', # We NEED binary here, since categorical_crossentropy l1 norms the output before calculating loss.
              optimizer=SGD(lr=0.0001, momentum=0.9))
# Fit the model (Add history so that the history may be saved)
history = model.fit(x_train, y_train,
                    batch_size=128,
                    epochs=1,
                    verbose=1,
                    callbacks=callbacks_list,
                    validation_data=(x_valid, y_valid))
But I got the following error message and had trouble deciphering what it is saying:
ValueError: Error when checking target: expected dense_1 to have 4 dimensions, but got array with shape (1024, 17)
It seems to have something to do with it not liking my one-hot encoding for the labels as the target. But how do I get a 4-dimensional target?
It turns out that the code copied from https://keras.io/applications/ would not run out of the box.
The following post helped me:
Keras VGG16 fine tuning
The changes I needed to make are the following:
Add the input shape to the model definition:
base_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(299,299,3))
Add a Flatten() layer to flatten the tensor output:
x = base_model.output
x = Flatten()(x)
predictions = Dense(17, activation='sigmoid')(x)
Then the model works for me!
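The shape arithmetic behind the error can be sketched in plain Python. The two helper functions are hypothetical and only mimic how Dense and Flatten transform shapes. The (None, 8, 8, 2048) feature map is the shape assumed here for InceptionV3 with include_top=False and a 299x299x3 input.

```python
def dense_output_shape(input_shape, units):
    """A Dense layer acts on the last axis and keeps every other axis."""
    return input_shape[:-1] + (units,)

def flatten_output_shape(input_shape):
    """Flatten keeps the batch axis and merges all remaining axes."""
    size = 1
    for dim in input_shape[1:]:
        if dim is None:
            return (input_shape[0], None)  # unknown size, cannot be resolved
        size *= dim
    return (input_shape[0], size)

# Assumed feature map shape; None is the batch axis.
feature_map = (None, 8, 8, 2048)

without_flatten = dense_output_shape(feature_map, 17)
with_flatten = dense_output_shape(flatten_output_shape(feature_map), 17)

# Without input_shape, the spatial axes are unknown as well.
unknown_spatial = flatten_output_shape((None, None, None, 2048))

print(without_flatten)   # 4-D output, hence the "4 dimensions" in the error
print(with_flatten)      # 2-D output that matches (1024, 17) targets
print(unknown_spatial)
```

This is also why the input shape has to be given: with unknown spatial axes, the flattened size that the final Dense layer needs is undefined.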
