I have a situation where the input is an image plus a group of three numeric fields, and the output is an image mask. I am not sure how to do that in Keras.
My architecture is somewhat like the attachment. I am familiar with CNN and Dense architectures; I am just not sure how to pass the inputs to the corresponding networks and do the concat operation. Suggestions for a better architecture would also be great!
Please advise, preferably with example code.
Thanks in advance, Utpal.
I would advise trying a U-Net model for this problem. A typical U-Net consists of several convolution and max-pooling layers followed by several convolution and upsampling layers.
For the current problem you can mix in the non-spatial data (the image annotation) in the middle of the network, where the feature maps are smallest.
It may also be a good idea to start from a pre-trained VGG-16 (see vgg.load_weights(VGG_Weights_path) in the code below).
See the code below (based on Divam Gupta's repo):
from keras.models import *
from keras.layers import *

# Assumptions (neither name is defined in the original snippet):
# channels-first layout, which the input shape and the axis=1 concatenations imply,
# and a local copy of the weights file downloaded from the URL below.
IMAGE_ORDERING = 'channels_first'
# https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels.h5
VGG_Weights_path = 'vgg16_weights_th_dim_ordering_th_kernels.h5'

def VGGUnet(n_classes, input_height=416, input_width=608, data_length=128, vgg_level=3):
    assert input_height % 32 == 0
    assert input_width % 32 == 0
    img_input = Input(shape=(3, input_height, input_width))
    data_input = Input(shape=(data_length,))
    # Block 1
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1', data_format=IMAGE_ORDERING)(img_input)
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2', data_format=IMAGE_ORDERING)(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool', data_format=IMAGE_ORDERING)(x)
    f1 = x
    # Block 2
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1', data_format=IMAGE_ORDERING)(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2', data_format=IMAGE_ORDERING)(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool', data_format=IMAGE_ORDERING)(x)
    f2 = x
    # Block 3
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1', data_format=IMAGE_ORDERING)(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2', data_format=IMAGE_ORDERING)(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3', data_format=IMAGE_ORDERING)(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool', data_format=IMAGE_ORDERING)(x)
    f3 = x
    # Block 4
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1', data_format=IMAGE_ORDERING)(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2', data_format=IMAGE_ORDERING)(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3', data_format=IMAGE_ORDERING)(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool', data_format=IMAGE_ORDERING)(x)
    f4 = x
    # Block 5
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1', data_format=IMAGE_ORDERING)(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2', data_format=IMAGE_ORDERING)(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3', data_format=IMAGE_ORDERING)(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool', data_format=IMAGE_ORDERING)(x)
    f5 = x
    # Classifier head, built only so that the pre-trained VGG-16 weights can be loaded
    x = Flatten(name='flatten')(x)
    x = Dense(4096, activation='relu', name='fc1')(x)
    x = Dense(4096, activation='relu', name='fc2')(x)
    x = Dense(1000, activation='softmax', name='predictions')(x)
    vgg = Model(img_input, x)
    vgg.load_weights(VGG_Weights_path)
    levels = [f1, f2, f3, f4, f5]
    # Several dense layers for image annotation processing
    data_layer = Dense(1024, activation='relu', name='data1')(data_input)
    # Integer division: layer sizes and shapes must be ints in Python 3
    data_layer = Dense(input_height * input_width // 256, activation='relu', name='data2')(data_layer)
    data_layer = Reshape((1, input_height // 16, input_width // 16))(data_layer)
    # Mix image annotations here: one extra channel on top of f4
    o = (concatenate([f4, data_layer], axis=1))
    o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o)
    o = (Conv2D(512, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o)
    o = (BatchNormalization())(o)
    o = (UpSampling2D((2, 2), data_format=IMAGE_ORDERING))(o)
    o = (concatenate([o, f3], axis=1))
    o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o)
    o = (Conv2D(256, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o)
    o = (BatchNormalization())(o)
    o = (UpSampling2D((2, 2), data_format=IMAGE_ORDERING))(o)
    o = (concatenate([o, f2], axis=1))
    o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o)
    o = (Conv2D(128, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o)
    o = (BatchNormalization())(o)
    o = (UpSampling2D((2, 2), data_format=IMAGE_ORDERING))(o)
    o = (concatenate([o, f1], axis=1))
    o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o)
    o = (Conv2D(64, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o)
    o = (BatchNormalization())(o)
    o = Conv2D(n_classes, (3, 3), padding='same', data_format=IMAGE_ORDERING)(o)
    # o depends on both inputs, so both must be passed to Model here
    o_shape = Model([img_input, data_input], o).output_shape
    output_height = o_shape[2]
    output_width = o_shape[3]
    o = (Reshape((n_classes, output_height * output_width)))(o)
    o = (Permute((2, 1)))(o)
    o = (Activation('softmax'))(o)
    model = Model([img_input, data_input], o)
    model.outputWidth = output_width
    model.outputHeight = output_height
    return model
To train and evaluate a Keras model with several inputs, prepare a separate array for each input layer (here image_train and annotation_train), keeping the samples in the same order along the first axis, and call:
model.fit([image_train, annotation_train], result_segmentation_train, batch_size=..., epochs=...)
test_loss, test_acc = model.evaluate([image_test, annotation_test], result_segmentation_test)
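The annotation branch in the code above only fits together because the second Dense layer emits exactly one value per cell of the f4 feature map, which is 16 times smaller than the input along each spatial axis. Here is a quick sanity check of that arithmetic in plain Python. It is only a sketch: the helper name annotation_map_shape is made up for illustration and is not part of Keras or of the repo.

```python
def annotation_map_shape(input_height, input_width, downsample=16):
    """Shape of the one-channel map the annotation vector is reshaped into.

    f4 is taken after four 2x2 poolings, so it is 16 times smaller than the
    input along each spatial axis (channels-first layout assumed).
    """
    if input_height % downsample or input_width % downsample:
        raise ValueError("input size must be divisible by %d" % downsample)
    return (1, input_height // downsample, input_width // downsample)


shape = annotation_map_shape(416, 608)
units = shape[1] * shape[2]      # size of the 'data2' Dense layer
print(shape, units)              # (1, 26, 38) 988
assert units == 416 * 608 // 256
```

If the image size changes, the Dense layer size and the Reshape target have to change with it, which is why both are derived from input_height and input_width.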
Good luck!
I am trying to write a VGG19 neural network for single-channel images, where everything is essentially the same as in a three-channel network except for the input layer.
def model(self, inputShape=(64, 64, 1)):
inputLayer = Input(shape=inputShape)
After applying the Flatten layer to the convolution tensor I use the same dense layer parameters as in classic VGG19 but I get an error when compiling the model
ValueError: Shapes (None, 64, 64, 1) and (None, 1000) are incompatible
As far as I understand the number of neurons in dense layer should correspond to the dimensionality of the input data. That is 64x64 image, after applying the Flatten layer, the dense layer should receive a vector with 4096 neurons. As described in the classical model
layerSet = Flatten()(layerSet)
layerSet = Dense(4096, activation='relu')(layerSet)
layerSet = Dropout(0.5)(layerSet)
layerSet = Dense(4096, activation='relu')(layerSet)
layerSet = Dropout(0.5)(layerSet)
outputLayer = Dense(1000, activation='relu')(layerSet)
The last dense layer gets 1000 neurons, each corresponding to some recognizable class.
In my case, I need a set of features for SRGAN, so I doubt that my problem needs a classification vector at all. The features derived from the VGG19 model, together with the features derived from the discriminator model, should be passed as the output layer of the generative adversarial model.
Next I give you the full code example where I give the model itself and the training method. I expect to eventually get the required features from the model
class VGG19DeepConvolutionNetwork:
__model = None
def __init__(self):
self.model()
def model(self, inputShape=(64, 64, 1)):
inputLayer = Input(shape=inputShape)
layerSet = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(inputLayer)
layerSet = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(layerSet)
layerSet = MaxPooling2D(strides=(2,2), padding='same')(layerSet)
layerSet = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(layerSet)
layerSet = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(layerSet)
layerSet = MaxPooling2D(strides=(2,2), padding='same')(layerSet)
layerSet = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(layerSet)
layerSet = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(layerSet)
layerSet = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(layerSet)
layerSet = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv4')(layerSet)
layerSet = MaxPooling2D(strides=(2,2), padding='same')(layerSet)
layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(layerSet)
layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(layerSet)
layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(layerSet)
layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv4')(layerSet)
layerSet = MaxPooling2D(strides=(2,2), padding='same')(layerSet)
layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(layerSet)
layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(layerSet)
layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(layerSet)
layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv4')(layerSet)
layerSet = MaxPooling2D(strides=(2,2), padding='same')(layerSet)
layerSet = Flatten()(layerSet)
layerSet = Dense(4096, activation='relu')(layerSet)
layerSet = Dropout(0.5)(layerSet)
layerSet = Dense(4096, activation='relu')(layerSet)
layerSet = Dropout(0.5)(layerSet)
outputLayer = Dense(1000, activation='relu')(layerSet)
self.__model = Model(inputs=[inputLayer], outputs=[outputLayer])
self.__model.compile(optimizer='adam', loss='categorical_crossentropy')
print(self.__model.summary())
def train(self, imageDataPath:string='srgangImageData.h5', weightsPath:string='vgg19Weights.h5', sliceSize=32, epochsNumber=100):
if self.__model is None:
self.model((sliceSize, sliceSize, 1))
imageData = ImageDataProcessing()
sourceTrain, targetTrain, sourceTest, targetTest = imageData.readImageData(imageDataPath)
del imageData
print( 'train source', sourceTrain.shape )
print( 'train target', targetTrain.shape )
print( 'test source', sourceTest.shape )
print( 'test target', targetTest.shape )
checkpoint = ModelCheckpoint(weightsPath, verbose=1, save_best_only=True, save_weights_only=False, mode='min')
callbacks_list = [checkpoint]
history = self.__model.fit(sourceTrain, targetTrain, batch_size=128, steps_per_epoch=len(sourceTrain)//128, validation_data=(sourceTest, targetTest),
callbacks=callbacks_list, shuffle=True, epochs=epochsNumber, verbose=1)
Some corrections:
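The Flatten layer should produce 2 x 2 x 512 = 2048 values, since that is the size of the output of the last convolutional block (64 is halved five times by the pooling layers). The first Dense layer does not need its input to have 4096 values; TensorFlow/Keras infers the input size for you.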
The reason the last layer gets 1000 neurons is because the model was originally trained on a dataset with 1000 classes (1 neuron per class).
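The 2 x 2 x 512 figure can be checked by hand. The sketch below assumes 2x2 pooling with stride 2 and padding='same', as in the question's code, so each pooling halves the size and rounds up. flatten_size is a hypothetical helper, not a Keras function.

```python
import math


def flatten_size(height, width, channels, n_pools):
    """Number of values Flatten() produces after n_pools 2x2 poolings
    with strides of 2 and padding='same' (size is halved, rounded up)."""
    for _ in range(n_pools):
        height = math.ceil(height / 2)
        width = math.ceil(width / 2)
    return height * width * channels


print(flatten_size(64, 64, 512, n_pools=5))   # 2048
print(flatten_size(32, 32, 512, n_pools=5))   # 512
```

The second call corresponds to the sliceSize=32 default of train() in the question: the feature map shrinks to 1 x 1 x 512 there.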
What version of tensorflow are you using? Are you sure it is failing at the compile step? I tried to compile your model with tensorflow 2.10.0 (Python 3.10.4) and everything worked fine. I tried to do a forward pass with an input of (10,64,64,1) and that worked fine too.
Here is the code I tried both locally and in Google Colab:
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.callbacks import ModelCheckpoint  # needed by train()
from tensorflow.keras import Model
import tensorflow as tf

class VGG19DeepConvolutionNetwork:
    __model = None

    def __init__(self):
        self.model()

    def model(self, inputShape=(64, 64, 1)):
        inputLayer = Input(shape=inputShape)
        layerSet = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(inputLayer)
        layerSet = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(layerSet)
        layerSet = MaxPooling2D(strides=(2,2), padding='same')(layerSet)
        layerSet = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(layerSet)
        layerSet = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(layerSet)
        layerSet = MaxPooling2D(strides=(2,2), padding='same')(layerSet)
        layerSet = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(layerSet)
        layerSet = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(layerSet)
        layerSet = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(layerSet)
        layerSet = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv4')(layerSet)
        layerSet = MaxPooling2D(strides=(2,2), padding='same')(layerSet)
        layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(layerSet)
        layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(layerSet)
        layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(layerSet)
        layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv4')(layerSet)
        layerSet = MaxPooling2D(strides=(2,2), padding='same')(layerSet)
        layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(layerSet)
        layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(layerSet)
        layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(layerSet)
        layerSet = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv4')(layerSet)
        layerSet = MaxPooling2D(strides=(2,2), padding='same')(layerSet)
        layerSet = Flatten()(layerSet)
        layerSet = Dense(4096, activation='relu')(layerSet)
        layerSet = Dropout(0.5)(layerSet)
        layerSet = Dense(4096, activation='relu')(layerSet)
        layerSet = Dropout(0.5)(layerSet)
        outputLayer = Dense(1000, activation='relu')(layerSet)
        self.__model = Model(inputs=[inputLayer], outputs=[outputLayer])
        self.__model.compile(optimizer='adam', loss='categorical_crossentropy')
        print(self.__model.summary())

    def getModel(self):
        return self.__model

    def train(self, imageDataPath: str='srgangImageData.h5', weightsPath: str='vgg19Weights.h5', sliceSize=32, epochsNumber=100):
        if self.__model is None:
            self.model((sliceSize, sliceSize, 1))
        # ImageDataProcessing is the asker's own class; it is not defined in this snippet
        imageData = ImageDataProcessing()
        sourceTrain, targetTrain, sourceTest, targetTest = imageData.readImageData(imageDataPath)
        del imageData
        print( 'train source', sourceTrain.shape )
        print( 'train target', targetTrain.shape )
        print( 'test source', sourceTest.shape )
        print( 'test target', targetTest.shape )
        checkpoint = ModelCheckpoint(weightsPath, verbose=1, save_best_only=True, save_weights_only=False, mode='min')
        callbacks_list = [checkpoint]
        history = self.__model.fit(sourceTrain, targetTrain, batch_size=128, steps_per_epoch=len(sourceTrain)//128, validation_data=(sourceTest, targetTest),
                                   callbacks=callbacks_list, shuffle=True, epochs=epochsNumber, verbose=1)

modelWrapper = VGG19DeepConvolutionNetwork()
model = modelWrapper.getModel()
# Forward pass on a random batch of ten 64x64 single-channel images
X = tf.random.uniform((10,64,64,1))
output = model(X)
print(output)
# modelWrapper.train()
I would like to train a deep learning model where the input image shape is (224, 224, 3), and I would like to feed the images into a U-Net model.
After training I get the error: Error when checking target: expected conv2d_29 to have 4 dimensions, but got array with shape (1255, 12)
I'm confused since I'm sure the image array and label has no issue. Is the issue within the model? How should I resolve this?
The model is as below:
#def unet(pretrained_weights = None, input_size = (224,224,3)):
concat_axis = 3
input_size= Input((224,224,3))
conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(input_size)
conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv1)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
#flat1 = Flatten()(pool1)
conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool1)
conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv2)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool2)
conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv3)
pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool3)
conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv4)
drop4 = Dropout(0.5)(conv4)
pool4 = MaxPooling2D(pool_size=(2, 2))(drop4)
conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool4)
conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv5)
drop5 = Dropout(0.5)(conv5)
up_conv5 = UpSampling2D(size=(2, 2), data_format="channels_last")(conv5)
ch, cw = get_crop_shape(conv4, up_conv5)
crop_conv4 = Cropping2D(cropping=(ch,cw), data_format="channels_last")(conv4)
up6 = concatenate([up_conv5, crop_conv4], axis=concat_axis)
conv6 = Conv2D(256, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(up6)
conv6 = Conv2D(256, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv6)
up_conv6 = UpSampling2D(size=(2, 2), data_format="channels_last")(conv6)
ch, cw = get_crop_shape(conv3, up_conv6)
crop_conv3 = Cropping2D(cropping=(ch,cw), data_format="channels_last")(conv3)
up7 = concatenate([up_conv6, crop_conv3], axis=concat_axis)
conv7 = Conv2D(128, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(up7)
conv7 = Conv2D(128, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv7)
up_conv7 = UpSampling2D(size=(2, 2), data_format="channels_last")(conv7)
ch, cw = get_crop_shape(conv2, up_conv7)
crop_conv2 = Cropping2D(cropping=(ch,cw), data_format="channels_last")(conv2)
up8 = concatenate([up_conv7, crop_conv2], axis=concat_axis)
conv8 = Conv2D(64, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(up8)
conv8 = Conv2D(64, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv8)
up_conv8 = UpSampling2D(size=(2, 2), data_format="channels_last")(conv8)
ch, cw = get_crop_shape(conv1, up_conv8)
crop_conv1 = Cropping2D(cropping=(ch,cw), data_format="channels_last")(conv1)
up9 = concatenate([up_conv8, crop_conv1], axis=concat_axis)
conv9 = Conv2D(32, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(up9)
conv9 = Conv2D(32, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv9)
model = Model(inputs = input_size, outputs = conv9)
Since the model's output layer is a conv layer, the output has 4 dimensions: (batch_size, height, width, channels). But you are feeding a target array of shape (1255, 12). If the target labels have shape (batch_size, num_features), then the last layer's output should have shape (None, 12), i.e. (batch_size, 12).
You have two options to deal with this situation:
1. Use a Dense layer after flattening the output of the conv layer.
2. Reshape the output of the conv layer to the desired shape.
The choice depends on the problem you're dealing with. If the problem is classification, option 1 lets you add a softmax activation. With option 1 the modification to the code would be:
conv9 = Conv2D(32, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv9)
flatten1 = Flatten()(conv9)
dense1 = Dense(12, activation="softmax")(flatten1) # The choice of the activation depends on the problem you are dealing with.
model = Model(inputs = input_size, outputs = dense1)
With option 2, the modification would be
conv9 = Conv2D(32, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv9)
reshape1 = Reshape((12,))(conv9) # Only valid if conv9 holds exactly 12 values per sample.
model = Model(inputs = input_size, outputs = reshape1)
N.B.: A Reshape layer cannot change the number of elements. Reshape((12,)) only works if the previous layer outputs exactly 12 values per sample; with an inferred dimension, e.g. Reshape((-1, 12)), the number of values per sample must be divisible by 12.
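Whether a Reshape is possible comes down to counting elements, and that can be checked before building the model. A minimal sketch in plain Python follows; can_reshape is a hypothetical helper, not a Keras API, and the (224, 224, 32) shape is the conv9 output from the question's model.

```python
from math import prod


def can_reshape(source_shape, target_shape):
    """True if a per-sample tensor of source_shape can be reshaped to
    target_shape; one target dimension may be -1 (inferred)."""
    total = prod(source_shape)
    known = prod(d for d in target_shape if d != -1)
    if -1 in target_shape:
        return known > 0 and total % known == 0
    return total == known


conv9_shape = (224, 224, 32)                # output of conv9 in the question
print(prod(conv9_shape))                    # 1605632
print(can_reshape(conv9_shape, (12,)))      # False
print(can_reshape(conv9_shape, (-1, 12)))   # False
print(can_reshape((2, 2, 3), (12,)))        # True
```

For this particular model neither reshape is valid, so flattening followed by a Dense(12) layer is the option that actually matches a (1255, 12) target.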
I am using the u-net code from this Kaggle notebook that I've also pasted below:
inputs = Input((IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS))
s = Lambda(lambda x: x / 255) (inputs)
c1 = Conv2D(8, (3, 3), activation='relu', padding='same') (s)
c1 = Conv2D(8, (3, 3), activation='relu', padding='same') (c1)
p1 = MaxPooling2D((2, 2)) (c1)
c2 = Conv2D(16, (3, 3), activation='relu', padding='same') (p1)
c2 = Conv2D(16, (3, 3), activation='relu', padding='same') (c2)
p2 = MaxPooling2D((2, 2)) (c2)
c3 = Conv2D(32, (3, 3), activation='relu', padding='same') (p2)
c3 = Conv2D(32, (3, 3), activation='relu', padding='same') (c3)
p3 = MaxPooling2D((2, 2)) (c3)
c4 = Conv2D(64, (3, 3), activation='relu', padding='same') (p3)
c4 = Conv2D(64, (3, 3), activation='relu', padding='same') (c4)
p4 = MaxPooling2D(pool_size=(2, 2)) (c4)
c5 = Conv2D(128, (3, 3), activation='relu', padding='same') (p4)
c5 = Conv2D(128, (3, 3), activation='relu', padding='same') (c5)
u6 = Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same') (c5)
u6 = concatenate([u6, c4])
c6 = Conv2D(64, (3, 3), activation='relu', padding='same') (u6)
c6 = Conv2D(64, (3, 3), activation='relu', padding='same') (c6)
u7 = Conv2DTranspose(32, (2, 2), strides=(2, 2), padding='same') (c6)
u7 = concatenate([u7, c3])
c7 = Conv2D(32, (3, 3), activation='relu', padding='same') (u7)
c7 = Conv2D(32, (3, 3), activation='relu', padding='same') (c7)
u8 = Conv2DTranspose(16, (2, 2), strides=(2, 2), padding='same') (c7)
u8 = concatenate([u8, c2])
c8 = Conv2D(16, (3, 3), activation='relu', padding='same') (u8)
c8 = Conv2D(16, (3, 3), activation='relu', padding='same') (c8)
u9 = Conv2DTranspose(8, (2, 2), strides=(2, 2), padding='same') (c8)
u9 = concatenate([u9, c1], axis=3)
c9 = Conv2D(8, (3, 3), activation='relu', padding='same') (u9)
c9 = Conv2D(8, (3, 3), activation='relu', padding='same') (c9)
outputs = Conv2D(1, (1, 1), activation='sigmoid') (c9)
model = Model(inputs=[inputs], outputs=[outputs])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=[mean_iou])
My question is where to properly add a kernel_regularizer (L2 regularization). I've looked at countless repos and notebooks, but I'm not able to find any source where L2 regularization was used successfully. Although I know how L2 regularization works, I have no intuition about which layers to add it to.
Hence, some intuition on where to add the kernel regularizer and what to set the parameter to would be helpful.
I went over the Kaggle notebook you linked. It appears that no weight regularization is used anywhere in the model (so the code you pasted is correct).
This is quite peculiar and uncommon. In most models, L2 weight regularization (the same penalty used in ridge regression) is applied to every layer, perhaps just with different weight-decay coefficients.
I suggest adding weight regularization to all the layers, but starting with a very small weight-decay coefficient:
c1 = Conv2D(8, (3, 3), activation='relu', padding='same', kernel_regularizer=regularizers.l2(w_decay)) (s)
c1 = Conv2D(8, (3, 3), activation='relu', padding='same', kernel_regularizer=regularizers.l2(w_decay)) (c1)
p1 = MaxPooling2D((2, 2)) (c1)
...
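For intuition about how large w_decay should be: an L2 kernel regularizer adds w_decay times the sum of the squared kernel weights to the loss, once for each layer it is attached to. The sketch below reproduces that term in plain Python. l2_penalty is a hypothetical helper and the toy weights are made up for illustration.

```python
def l2_penalty(weights, w_decay):
    """Term an L2 regularizer adds to the loss: w_decay * sum(w ** 2)."""
    return w_decay * sum(w * w for w in weights)


kernel = [0.5, -1.0, 2.0]    # toy weights, made up for illustration
for w_decay in (1e-5, 1e-4, 1e-2):
    # Compare this number with the size of the data loss (e.g. binary
    # cross-entropy): if the penalty dominates, the coefficient is too large.
    print(w_decay, l2_penalty(kernel, w_decay))
```

Starting with a small coefficient and raising it only while the validation loss keeps improving is in line with the suggestion above.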
Hi, I am building an image classifier for one-class classification, for which I use an autoencoder. When I run the model I get this error: ValueError: Layer conv2d_3 was called with an input that isn't a symbolic tensor. Received type: . Full input: [(128, 128, 3)]. All inputs to the layer should be tensors.
num_of_samples = img_data.shape[0]
labels = np.ones((num_of_samples,),dtype='int64')
labels[0:376]=0
names = ['cat']
Y = np_utils.to_categorical(labels, num_class)
input_shape=img_data[0].shape
x,y = shuffle(img_data,Y, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=2)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_shape)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (4, 4, 8) i.e. 128-dimensional
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_shape, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder.fit(X_train, X_train,
epochs=50,
batch_size=32,
shuffle=True,
validation_data=(X_test, X_test),
callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])
Here:
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_shape)
A shape is not a tensor.
Do this:
from keras.layers import *
inputTensor = Input(input_shape)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(inputTensor)
Hint about autoencoders
You should separate the encoder and decoder as individual models. Later you will probably want to work with only one of them.
Encoder:
inputTensor = Input(input_shape)
x = ....
encodedData = MaxPooling2D((2, 2), padding='same')(x)
encoderModel = Model(inputTensor,encodedData)
Decoder:
encodedInput = Input((4,4,8))  # must match the encoder's output shape; (4, 4, 8) is taken from the comment in the question
x = ....
decodedData = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
decoderModel = Model(encodedInput,decodedData)
Autoencoder:
autoencoderInput = Input(input_shape)
encoded = encoderModel(autoencoderInput)
decoded = decoderModel(encoded)
autoencoderModel = Model(autoencoderInput,decoded)
I have images of shape 391 x 400. I attempted to use the autoencoder as described here.
Specifically, I have used the following code:
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras import backend as K
input_img = Input(shape=(391, 400, 1)) # adapt this if using `channels_first` image data format
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (4, 4, 8) i.e. 128-dimensional
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
I am getting the following:
ValueError: Error when checking target: expected conv2d_37 to have shape (None, 392, 400, 1) but got array with shape (500, 391, 400, 1)
What I need: a layer that would drop/crop/reshape the last layer from 392 x 400 to 391 x 400.
Thank you for any help.
There's a layer called Cropping2D. To crop the last layer from 392 x 400 to 391 x 400, you can use it like this:
cropped = Cropping2D(cropping=((1, 0), (0, 0)))(decoded)
autoencoder = Model(input_img, cropped)
The tuple ((1, 0), (0, 0)) means to crop 1 row from the top. If you want to crop from bottom, use ((0, 1), (0, 0)) instead. You can see the documentation for more detailed description about the cropping argument.
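The extra row appears because each 'same'-padded pooling rounds an odd size up, while each upsampling exactly doubles it. Here is a small sketch of that arithmetic in plain Python, assuming three pooling and three upsampling stages as in the question's code. autoencoder_output_size is a hypothetical helper, not part of Keras.

```python
import math


def autoencoder_output_size(size, n_levels=3):
    """Size of one spatial axis after n_levels of 2x2 'same' max pooling
    followed by n_levels of 2x upsampling."""
    for _ in range(n_levels):
        size = math.ceil(size / 2)      # pooling with padding='same' rounds up
    return size * 2 ** n_levels         # each UpSampling2D doubles the size


for name, size in (("height", 391), ("width", 400)):
    out = autoencoder_output_size(size)
    print(name, size, "->", out, "crop", out - size)
# height 391 -> 392 crop 1
# width 400 -> 400 crop 0
```

The crop value is what has to be removed per axis with Cropping2D; for other image sizes it can be larger than one row or column.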