I am using a pre-trained ResNet50 model for simple feature extraction from images, but it gives me this error:
Error when checking input: expected input_9 to have the shape (224, 224, 3) but got array with shape (244, 244, 3)
I thought I had set the shape correctly and added a batch dimension the way this tutorial says to: https://www.kaggle.com/kelexu/extract-resnet-feature-using-keras
But it still gives me the above error.
What am I doing wrong here?
# load pre-trained resnet50
base_model = ResNet50(weights='imagenet', include_top=False, pooling='max')
x = base_model.output
input = Input(shape=(224,224,3))
x = Flatten()(input)
model = Model(inputs=input, outputs=x)
# Load in image
img = image.load_img("001.png", target_size=(244, 244))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
print(x.shape) # This produces (1, 244, 244, 3)
features = model.predict(x)
features_reduce = features.squeeze()
Change
img = image.load_img("001.png", target_size=(244, 244))
to
img = image.load_img("001.png", target_size=(224, 224))
I am trying to create image embeddings for deep ranking using a triplet loss function. The idea is to take a pretrained CNN (e.g. ResNet50 or VGG16), remove the FC layers and add an L2 normalization, so that we obtain unit vectors which can then be compared via a distance metric (e.g. cosine similarity). As far as I understand, the vectors that come out of a pretrained CNN are not optimal but are a good start; by adding the triplet loss we can re-train the network to keep similar pictures 'close' to each other and different pictures 'far' apart in the feature space.
Inspired by this notebook, I tried to set up the following code, but I get the error ValueError: The name "conv1_pad" is used 3 times in the model. All layer names should be unique.
# Anchor, Positive and Negative are numpy arrays of size (200, 256, 256, 3); same for the test images
pic_size = 256

def shared_dnn(inp):
    base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(3, pic_size, pic_size),
                          input_tensor=inp)
    x = base_model.output
    x = Flatten()(x)
    x = Lambda(lambda x: K.l2_normalize(x, axis=1))(x)
    for layer in base_model.layers[15:]:
        layer.trainable = False
    return x

anchor_input = Input((3, pic_size, pic_size), name='anchor_input')
positive_input = Input((3, pic_size, pic_size), name='positive_input')
negative_input = Input((3, pic_size, pic_size), name='negative_input')

encoded_anchor = shared_dnn(anchor_input)
encoded_positive = shared_dnn(positive_input)
encoded_negative = shared_dnn(negative_input)

merged_vector = concatenate([encoded_anchor, encoded_positive, encoded_negative], axis=-1, name='merged_layer')
model = Model(inputs=[anchor_input, positive_input, negative_input], outputs=merged_vector)
# ValueError: The name "conv1_pad" is used 3 times in the model. All layer names should be unique.
model.compile(loss=triplet_loss, optimizer=adam_optim)
model.fit([Anchor, Positive, Negative],
          y=Y_dummy,
          validation_data=([Anchor_test, Positive_test, Negative_test], Y_dummy2),
          batch_size=512, epochs=500)
I am new to Keras and I am not quite sure how to solve this. The author of the notebook linked above builds his own CNN from scratch, but I would like to build on ResNet50 (or VGG16). How can I configure ResNet50 to train with a triplet loss function? (The linked notebook also contains the source code for the triplet loss function.)
In your ResNet50 definition, you've written
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(3, pic_size, pic_size), input_tensor=inp)
Remove the input_tensor argument and pass the shape via input_shape instead.
Since you are using the TensorFlow backend, as you mentioned, the input is channels-last, so the shape should be (pic_size, pic_size, 3) rather than (3, pic_size, pic_size).
def shared_dnn(inp):
    base_model = ResNet50(weights='imagenet', include_top=False, input_shape=inp)
    x = base_model.output
    x = Flatten()(x)
    x = Lambda(lambda x: K.l2_normalize(x, axis=1))(x)
    for layer in base_model.layers[15:]:
        layer.trainable = False
    return x

img_shape = (256, 256, 3)

anchor_input = Input(img_shape, name='anchor_input')
positive_input = Input(img_shape, name='positive_input')
negative_input = Input(img_shape, name='negative_input')

encoded_anchor = shared_dnn(anchor_input)
encoded_positive = shared_dnn(positive_input)
encoded_negative = shared_dnn(negative_input)

merged_vector = concatenate([encoded_anchor, encoded_positive, encoded_negative], axis=-1, name='merged_layer')
model = Model(inputs=[anchor_input, positive_input, negative_input], outputs=merged_vector)
model.compile(loss=triplet_loss, optimizer=adam_optim)
model.fit([Anchor, Positive, Negative],
          y=Y_dummy,
          validation_data=([Anchor_test, Positive_test, Negative_test], Y_dummy2),
          batch_size=512, epochs=500)
The model plot is as follows: (model_plot image)
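As a side note, the duplicate-name error comes from the fact that each call to shared_dnn builds a fresh ResNet50 whose layers all carry the same default names, which clashes once the three branches are combined into one Model. A common alternative (a hedged sketch, not part of the answer above) is to instantiate the base network once and reuse it for all three inputs, which also shares the weights between the branches:
from keras import backend as K
from keras.applications.resnet50 import ResNet50
from keras.layers import Input, Flatten, Lambda, concatenate
from keras.models import Model

pic_size = 256

# Build the base network once so every layer name exists only once in the final graph.
base_model = ResNet50(weights='imagenet', include_top=False,
                      input_shape=(pic_size, pic_size, 3))
for layer in base_model.layers[15:]:   # same freezing choice as in the question
    layer.trainable = False

def encode(tensor):
    # Reusing the same base_model instance shares its weights across all branches.
    x = base_model(tensor)
    x = Flatten()(x)
    return Lambda(lambda t: K.l2_normalize(t, axis=1))(x)

anchor_input = Input((pic_size, pic_size, 3), name='anchor_input')
positive_input = Input((pic_size, pic_size, 3), name='positive_input')
negative_input = Input((pic_size, pic_size, 3), name='negative_input')

merged_vector = concatenate([encode(anchor_input),
                             encode(positive_input),
                             encode(negative_input)],
                            axis=-1, name='merged_layer')
model = Model(inputs=[anchor_input, positive_input, negative_input],
              outputs=merged_vector)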
I want to predict single images with the functional API (Keras version 2.2.2, TensorFlow backend v1.7). I load my model:
# Loading Model
base_model = VGG16(include_top=False, weights=None,
                   input_shape=(224, 224, 3))
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dropout(0.7)(x)
x = Dense(1020, activation='relu')(x)
predictions = Dense(1, activation='sigmoid')(x)
model = Model(inputs=base_model.input, outputs=predictions)
model.load_weights("model.h5")
Then I load an image, transform it into the input format, and try to predict:
import cv2
import numpy as np
from keras.preprocessing.image import img_to_array, load_img

img = load_img("data/my_image.png")  # this is a PIL image
array = img_to_array(img)  # this is a NumPy array of shape (height, width, 3)
arrayresized = cv2.resize(array, (244, 244)) * 1./255
inputarray = np.expand_dims(arrayresized, axis=0)
# Predicting
prediction = model.predict(inputarray, batch_size = 1)
Then I get this error back:
ValueError: Error when checking input: expected input_4 to have shape
(224, 224, 3) but got array with shape (244, 244, 3)
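This is the same 244-vs-224 mix-up as in the first question: the model was built with input_shape=(224, 224, 3), but cv2.resize is given (244, 244). A sketch of the corrected preprocessing, keeping the rest of the pipeline unchanged:
import cv2
import numpy as np

# Resize to the shape the model was built with: (224, 224, 3)
arrayresized = cv2.resize(array, (224, 224)) * 1. / 255
inputarray = np.expand_dims(arrayresized, axis=0)  # shape (1, 224, 224, 3)

prediction = model.predict(inputarray, batch_size=1)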
I was trying to retrain a ResNet50 model to classify images of animals into 30 different classes. To do this, I built a list of preprocessed image arrays, each of shape (1, 224, 224, 3) after expanding dimensions and preprocessing, so after converting the list to a NumPy array its shape was (300, 1, 224, 224, 3) (I initially took only 300 images). For Ytrain, I label-encoded the classes and then one-hot encoded them; for 30 classes that gives a NumPy array of shape (300, 30). Then I used a DataGenerator with model.fit_generator, yielding Xtrain of shape (1, 224, 224, 3) and Ytrain of shape (30,), but got the error:
ValueError: Error when checking target: expected fc1000 to have shape (30,) but got array with shape (1,)
Here is my code:
inputShape = (224, 224)
preprocess = imagenet_utils.preprocess_input

df = pd.read_csv('DLBeginner/meta-data/train.csv')
df = df.head(300)

imagesData, target = [], []
c = 0
for images in df['Image_id']:
    filename = args["target"] + '/' + images
    image = load_img(filename, target_size=inputShape)
    image = img_to_array(image)
    image = np.expand_dims(image, axis=0)
    image = preprocess(image)
    imagesData.append(image)
    c += 1
    print('Count = {}, Image > {}'.format(c, images))

imagesData = np.array(imagesData)

labelEncoder = LabelEncoder()
series = df['Animal'][0:300]
integerEncoded = labelEncoder.fit_transform(series)
Hot = OneHotEncoder(sparse=False)
integerEncoded = integerEncoded.reshape(len(integerEncoded), 1)
oneHot = Hot.fit_transform(integerEncoded)

model = ResNet50(include_top=True, classes=30, weights=None)
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])
l = len(imagesData)
def DataGenerator(Xtrain, Ytrain):
    while True:
        for i in range(l):
            arr1 = Xtrain[i]
            arr2 = Ytrain[i]
            print("arr1.shape : {}".format(arr1.shape))
            print("arr2.shape : {}".format(arr2.shape))
            yield (arr1, arr2)
And here is the fitting part:
generator = DataGenerator(imagesData, oneHot)
model.fit_generator(generator = generator, epochs = 5, steps_per_epoch=l)
Where am I going wrong?
Thanks in advance.
Switching from 'categorical_crossentropy' to 'sparse_categorical_crossentropy' solved it for me.
Just want to add a little more detail.
When you have a multi-class classification problem:
(1) if your targets are one-hot encoded, use categorical_crossentropy;
(2) if your targets are integer class indices, as in the MNIST example, use sparse_categorical_crossentropy. When you use this, TensorFlow handles the integer labels under the hood, so you do not need to one-hot encode them yourself.
Hope that helps. Thanks!
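A minimal illustration of the two target formats (toy labels, not the asker's data):
import numpy as np
from keras.utils import to_categorical

# Integer class indices -> pair with 'sparse_categorical_crossentropy'
y_sparse = np.array([0, 2, 1])                       # shape (3,)

# One-hot encoded targets -> pair with 'categorical_crossentropy'
y_onehot = to_categorical(y_sparse, num_classes=3)   # shape (3, 3)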
I'm trying to do an image segmentation problem where I want to segment 5 objects in an image. I'm using a U-net architecture. My final layer looks like this:
conv_final = Conv2D(OUTPUT_MASK_CHANNELS, (1, 1))(up_conv_224)
conv_final = Activation('sigmoid')(conv_final)
model = Model(inputs, conv_final, name="ZF_UNET_224")
However I get an error saying:
ValueError: Error when checking target: expected conv2d_24 to have shape (224, 224, 5) but got array with shape (224, 224, 3)
This is the generator that I'm using:
image_generator = train_datagen.flow_from_directory(
    'data/train',            # this is the target directory
    target_size=(224, 224),  # all images will be resized to 224x224
    color_mode='rgb',
    batch_size=batch_size,
    class_mode=None,
    seed=1)

# this is a similar generator, for the masks
mask_generator = mask_datagen.flow_from_directory(
    'data/train',
    target_size=(224, 224),
    color_mode='rgb',
    batch_size=batch_size,
    class_mode=None,
    seed=1)

train_generator = zip(image_generator, mask_generator)
What can I do to fix this? Any help appreciated!
You have to convert the masks into one-hot encoded format, with one channel per class.
Use to_categorical from keras.utils.
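A hedged sketch of that conversion, assuming each mask pixel stores a class index 0-4 in its first channel (if your masks are colour-coded you first need to map colours to indices):
import numpy as np
from keras.utils import to_categorical

def to_onehot_masks(mask_batch, num_classes=5):
    # mask_batch: (batch, 224, 224, 3) RGB masks; assumes channel 0 holds the class index
    class_idx = mask_batch[..., 0].astype('int32')   # (batch, 224, 224)
    return to_categorical(class_idx, num_classes)    # (batch, 224, 224, 5)

# Wrap the paired generators so the targets match the expected (224, 224, 5) shape
train_generator = ((imgs, to_onehot_masks(masks))
                   for imgs, masks in zip(image_generator, mask_generator))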
I am using the Keras functional API to build a model with multiple (five) outputs and the same input, in order to simultaneously predict different properties of the data (images in my case).
The summary of that model is the following (the layers written in capitals are the ones added on top of the pre-trained VGG16):
The shape of the data being fed to the CNN are the following:
# input images
('x_train shape:', (23706, 224, 224, 3))
('Head_1 shape:', (23706, 26))
('Head_2 shape:', (23706,))
('Head_3 shape:', (23706,))
('Head_4 shape:', (23706,))
('Head_5 shape:', (23706,))
When I give my network only a single output, training runs without problems, but when all the outputs (or even two of them) are present, I receive the following error:
Traceback (most recent call last):
  history = model.fit_generator(datagen.flow(x_train, train_targets_list, batch_size=batch_size)
  ...
ValueError: could not broadcast input array from shape (23706,26) into shape (23706)
Any idea what I am doing wrong?
Also is there any working example in the documentation that describes a similar case for multi-output models?
# dimensions of our images
img_width, img_height = 224, 224

if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

input_tensor = Input(shape=input_shape, name='IMAGES')
base_model = VGG16(weights='imagenet', include_top=False, input_tensor=input_tensor)

x = base_model.output
x = GlobalAveragePooling2D(name='GAP')(x)
x = Dense(256, activation='relu', name='FC1')(x)
x = Dropout(0.5, name='DROPOUT')(x)

head_1 = Dense(26, activation='sigmoid', name='PREDICTION1')(x)
head_2 = Dense(1, name='PREDICTION2')(x)
head_3 = Dense(1, name='PREDICTION3')(x)
head_4 = Dense(1, name='PREDICTION4')(x)
head_5 = Dense(1, name='PREDICTION5')(x)

outputs_list = [head_1, head_2, head_3, head_4, head_5]

model = Model(inputs=input_tensor, outputs=outputs_list)

for layer in base_model.layers:
    layer.trainable = False

losses_list = ['binary_crossentropy', 'mse', 'mse', 'mse', 'mse']

model.compile(optimizer=SGD(lr=0.0001, momentum=0.9),
              loss=losses_list,
              metrics=['accuracy'])
print x_train.shape  # -> (23706, 224, 224, 3)

for y in train_targets_list:
    print len(y)
# output:
# 23706
# 23706
# 23706
# 23706
# 23706
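For reference, when the targets are passed straight to fit (no ImageDataGenerator.flow in between), a Keras multi-output model accepts them as a plain list whose order matches outputs_list; a hedged sketch reusing the names above, with hypothetical epoch and validation settings:
# Each element of train_targets_list is a NumPy array with 23706 entries,
# ordered to match outputs_list (PREDICTION1 ... PREDICTION5).
history = model.fit(x_train,
                    train_targets_list,
                    batch_size=batch_size,
                    epochs=10,             # hypothetical value
                    validation_split=0.1)  # hypothetical value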