I am getting an error when trying to build a CNN denoising autoencoder in Keras. My Keras backend is TensorFlow.
My input data is a NumPy array built from grayscale images. I split it using sklearn's train_test_split. I have resized the data and get an error at the output layer.
/opt/anaconda/anaconda3/lib/python3.6/site-packages/keras/engine/training.py in _standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    151                             ' to have shape ' + str(shapes[i]) +
    152                             ' but got array with shape ' +
--> 153                             str(array.shape))
    154     return arrays
    155
ValueError: Error when checking target: expected conv2d_transpose_36 to have shape (None, 279, 559, 1) but got array with shape (129, 258, 540, 1)
train_images = os.listdir('data/train')
test_images = os.listdir('data/test')
clean_images = os.listdir('data/train_cleaned')
def set_common_size(img_link, std_size=(258, 540)):
    '''Take a path to an image and return the image data at the standard size, scaled to [0, 1].'''
    img = Image.open(img_link)
    img = image.img_to_array(img)
    img = np.resize(img, std_size)
    return img / 255
train_data = []
test_data = []
cleaned_data = []
for img in train_images:
    img_file = set_common_size('data/train/' + img)
    train_data.append(img_file)
for img in test_images:
    img_file = set_common_size('data/test/' + img)
    test_data.append(img_file)
for img in clean_images:
    img_file = set_common_size('data/train_cleaned/' + img)
    cleaned_data.append(img_file)
train_data = np.asarray(train_data)
test_data = np.asarray(test_data)
cleaned_data = np.asarray(cleaned_data)
x_train, x_test, y_train, y_test = train_test_split(train_data, cleaned_data, test_size=0.1, random_state=42)
input_shape = x_train[0].shape
input_layer = Input(input_shape)
#Layer 1
layer1 = Conv2D(64, 3, activation='relu', padding='same')(input_layer)
layer1 = MaxPooling2D(2)(layer1)
#Layer 2
layer2 = Conv2D(128, 3, activation='relu', padding='same')(layer1)
layer2 = MaxPooling2D(2)(layer2)
#Layer 3
layer3 = Conv2D(256, 3, activation='relu', padding='same')(layer2)
layer3 = MaxPooling2D(2)(layer3)
#Bottleneck
encoded = layer3
#Layer1
uplayer1 = UpSampling2D(2)(layer3)
uplayer1 = Conv2DTranspose(256, 2, activation='relu')(uplayer1)
#Layer2
uplayer2 = UpSampling2D(2)(uplayer1)
uplayer2 = Conv2DTranspose(128, 5, activation='relu')(uplayer2)
#Layer3
uplayer3 = UpSampling2D(2)(uplayer2)
uplayer3 = Conv2DTranspose(64, 10, activation='relu')(uplayer3)
output = Conv2DTranspose(1, 3, activation='sigmoid')(uplayer3)
model = Model(inputs=input_layer, outputs=output)
print(model.summary())
model.compile(optimizer='adadelta', loss='binary_crossentropy')
model.fit(
    x_train, y_train,
    epochs=100,
    shuffle=True,
    validation_data=(x_test, y_test)
)
Here are the Model Summary results:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 258, 540, 1) 0
_________________________________________________________________
conv2d_60 (Conv2D) (None, 258, 540, 64) 640
_________________________________________________________________
max_pooling2d_37 (MaxPooling (None, 129, 270, 64) 0
_________________________________________________________________
conv2d_61 (Conv2D) (None, 129, 270, 128) 73856
_________________________________________________________________
max_pooling2d_38 (MaxPooling (None, 64, 135, 128) 0
_________________________________________________________________
conv2d_62 (Conv2D) (None, 64, 135, 256) 295168
_________________________________________________________________
max_pooling2d_39 (MaxPooling (None, 32, 67, 256) 0
_________________________________________________________________
up_sampling2d_37 (UpSampling (None, 64, 134, 256) 0
_________________________________________________________________
conv2d_transpose_29 (Conv2DT (None, 65, 135, 256) 262400
_________________________________________________________________
up_sampling2d_38 (UpSampling (None, 130, 270, 256) 0
_________________________________________________________________
conv2d_transpose_30 (Conv2DT (None, 134, 274, 128) 819328
_________________________________________________________________
up_sampling2d_39 (UpSampling (None, 268, 548, 128) 0
_________________________________________________________________
conv2d_transpose_31 (Conv2DT (None, 277, 557, 64)      819264
_________________________________________________________________
conv2d_transpose_32 (Conv2DT (None, 279, 559, 1)       577
=================================================================
Total params: 2,271,233
Trainable params: 2,271,233
Non-trainable params: 0
There is a mismatch between:
the shape of the outputs your model creates: (279, 559, 1)
(shown in the summary line for conv2d_transpose_32, which is the last layer)
and
the shape of the outputs you are expecting: (258, 540, 1) (the shape of the entries in y_train that you feed in).
Use the model summary to see where the deviations from the expected shapes start, and use the proper padding values to get the expected output shapes.
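For instance, here is a minimal sketch of one way to line the shapes up (my own suggestion, not necessarily the asker's final fix): resize both the noisy and the clean arrays to a standard size divisible by 2**3 = 8, e.g. a hypothetical std_size = (264, 544), and use padding='same' in every convolution so that only the pooling/upsampling layers change the spatial dimensions; three MaxPooling2D(2) steps then invert exactly into three UpSampling2D(2) steps:
```
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

inp = Input((264, 544, 1))
x = Conv2D(64, 3, activation='relu', padding='same')(inp)
x = MaxPooling2D(2)(x)                                   # 132 x 272
x = Conv2D(128, 3, activation='relu', padding='same')(x)
x = MaxPooling2D(2)(x)                                   # 66 x 136
x = Conv2D(256, 3, activation='relu', padding='same')(x)
encoded = MaxPooling2D(2)(x)                             # 33 x 68

x = UpSampling2D(2)(encoded)                             # 66 x 136
x = Conv2D(256, 3, activation='relu', padding='same')(x)
x = UpSampling2D(2)(x)                                   # 132 x 272
x = Conv2D(128, 3, activation='relu', padding='same')(x)
x = UpSampling2D(2)(x)                                   # 264 x 544
x = Conv2D(64, 3, activation='relu', padding='same')(x)
out = Conv2D(1, 3, activation='sigmoid', padding='same')(x)  # (264, 544, 1) matches the target shape

Model(inp, out).summary()
```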
Can anyone please help me convert this model to PyTorch? I already tried to convert from Keras to PyTorch as in How can I convert this keras cnn model to pytorch version, but the training results were different. Thank you.
input_3d = (1, 64, 96, 96)
pool_3d = (2, 2, 2)
model = Sequential()
model.add(Convolution3D(8, 3, 3, 3, name='conv1', input_shape=input_3d,
                        data_format='channels_first'))
model.add(MaxPooling3D(pool_size=pool_3d, name='pool1'))
model.add(Convolution3D(8, 3, 3, 3, name='conv2',data_format='channels_first'))
model.add(MaxPooling3D(pool_size=pool_3d, name='pool2'))
model.add(Convolution3D(8, 3, 3, 3, name='conv3',data_format='channels_first'))
model.add(MaxPooling3D(pool_size=pool_3d, name='pool3'))
model.add(Flatten())
model.add(Dense(2000, activation='relu', name='dense1'))
model.add(Dropout(0.5, name='dropout1'))
model.add(Dense(500, activation='relu', name='dense2'))
model.add(Dropout(0.5, name='dropout2'))
model.add(Dense(3, activation='softmax', name='softmax'))
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1 (Conv3D) (None, 8, 60, 94, 94) 224
_________________________________________________________________
pool1 (MaxPooling3D) (None, 8, 30, 47, 47) 0
_________________________________________________________________
conv2 (Conv3D) (None, 8, 28, 45, 45) 1736
_________________________________________________________________
pool2 (MaxPooling3D) (None, 8, 14, 22, 22) 0
_________________________________________________________________
conv3 (Conv3D) (None, 8, 12, 20, 20) 1736
_________________________________________________________________
pool3 (MaxPooling3D) (None, 8, 6, 10, 10) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 4800) 0
_________________________________________________________________
dense1 (Dense) (None, 2000) 9602000
_________________________________________________________________
dropout1 (Dropout) (None, 2000) 0
_________________________________________________________________
dense2 (Dense) (None, 500) 1000500
_________________________________________________________________
dropout2 (Dropout) (None, 500) 0
_________________________________________________________________
softmax (Dense) (None, 3) 1503
=================================================================
Your PyTorch equivalent of the Keras model would look like this:
import torch
import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.maxpool = nn.MaxPool3d((2, 2, 2))
        self.conv1 = nn.Conv3d(in_channels=1, out_channels=8, kernel_size=3)
        self.conv2 = nn.Conv3d(in_channels=8, out_channels=8, kernel_size=3)
        self.conv3 = nn.Conv3d(in_channels=8, out_channels=8, kernel_size=3)
        self.linear1 = nn.Linear(4800, 2000)
        self.dropout1 = nn.Dropout(0.5)  # plain Dropout: the input is flattened to 2-D at this point
        self.linear2 = nn.Linear(2000, 500)
        self.dropout2 = nn.Dropout(0.5)
        self.linear3 = nn.Linear(500, 3)

    def forward(self, x):
        out = self.maxpool(self.conv1(x))
        out = self.maxpool(self.conv2(out))
        out = self.maxpool(self.conv3(out))
        # Flattening process
        b, c, d, h, w = out.size()  # batch_size, channels, depth, height, width
        out = out.view(-1, c * d * h * w)
        # the ReLUs match Dense(..., activation='relu') in the Keras model
        out = self.dropout1(torch.relu(self.linear1(out)))
        out = self.dropout2(torch.relu(self.linear2(out)))
        out = self.linear3(out)
        out = torch.softmax(out, 1)
        return out
A driver program to test the model:
inputs = torch.randn(8, 1, 64, 96, 96)
model = CNN()
outputs = model(inputs)
print(outputs.shape) # torch.Size([8, 3])
You can save the Keras weights and reload them in PyTorch.
The steps are:
Step 0: Train a model in Keras. ...
Step 1: Recreate & initialize your model architecture in PyTorch. ...
Step 2: Import your Keras model and copy the weights. ...
Step 3: Load those weights onto your PyTorch model. ...
Step 4: Test and save your PyTorch model.
You can follow the example here: https://gereshes.com/2019/06/24/how-to-transfer-a-simple-keras-model-to-pytorch-the-hard-way/
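As a minimal sketch of steps 2-3 (my own illustration, not taken from the linked post): a single Keras Dense layer copied into an equivalent PyTorch nn.Linear. Keras stores Dense kernels as (in_features, out_features) while PyTorch expects (out_features, in_features), hence the transpose; for conv layers the kernel axes likewise need permuting (Keras (kD, kH, kW, in, out) to PyTorch (out, in, kD, kH, kW)). The names keras_model and 'dense1' are placeholders.
```
import numpy as np
import torch
import torch.nn as nn

# kernel: (in_features, out_features), bias: (out_features,)
kernel, bias = keras_model.get_layer('dense1').get_weights()
torch_layer = nn.Linear(kernel.shape[0], kernel.shape[1])
with torch.no_grad():
    torch_layer.weight.copy_(torch.from_numpy(np.transpose(kernel)))
    torch_layer.bias.copy_(torch.from_numpy(bias))
```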
I want to combine a pretrained VGG16 model with a special input block, which is an input layer and a convolutional layer. The goal is to use a pretrained RGB VGG16 ImageNet model on grayscale images:
from keras.applications.vgg16 import VGG16
from keras.layers.convolutional import Conv2D
from keras.layers import Input
from keras.models import Model
img_height = 299
img_width = 299
def input_block(img_height=299, img_width=299):
    input_shape = (img_height, img_width, 1)
    img_input = Input(shape=input_shape, name='grayscale_input_layer')
    x = Conv2D(3, (3, 3), padding='same', name='grayscale_RGB_layer')(img_input)
    return x
pretrained_model = VGG16(weights = 'imagenet', include_top=False, input_tensor = input_block(img_height, img_width))
When I set the weight initialization of VGG16() to None, the model builds correctly, with the following desired structure:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
grayscale_input_layer (Input (None, 299, 299, 1) 0
_________________________________________________________________
grayscale_RGB_layer (Conv2D) (None, 299, 299, 3) 30
_________________________________________________________________
block1_conv1 (Conv2D) (None, 299, 299, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 299, 299, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 149, 149, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 149, 149, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 149, 149, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 74, 74, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 74, 74, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 74, 74, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 74, 74, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 37, 37, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 37, 37, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 37, 37, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 37, 37, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 18, 18, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 18, 18, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 18, 18, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 18, 18, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 9, 9, 512) 0
=================================================================
Total params: 14,714,718
Trainable params: 14,714,718
Non-trainable params: 0
_________________________________________________________________
None
However, when I set the weight initialization to 'imagenet',
I get the following error:
ValueError: You are trying to load a weight file containing 13 layers into a model with 14 layers.
This error makes sense, since I have added two layers in front of the VGG16 model instead of a single layer.
As a workaround, I have tried the following:
def input_block_model(img_height=299, img_width=299):
    input_shape = (img_height, img_width, 1)
    img_input = Input(shape=input_shape, name='grayscale_input_layer')
    x = Conv2D(3, (3, 3), padding='same', name='grayscale_RGB_layer')(img_input)
    model = Model(img_input, x, name='input_block_model')
    return model
input_model = input_block_model(299,299)
pretrained_model = VGG16(weights = "imagenet", include_top=False)
combined_model = Model(input_model.input,
                       pretrained_model(input_model.output))
print(combined_model.summary())
Then, the model structure is:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
grayscale_input_layer (Input (None, 299, 299, 1) 0
_________________________________________________________________
grayscale_RGB_layer (Conv2D) (None, 299, 299, 3) 30
_________________________________________________________________
vgg16 (Model) multiple 14714688
=================================================================
Total params: 14,714,718
Trainable params: 14,714,718
Non-trainable params: 0
_________________________________________________________________
None
The disadvantage of this structure is that I cannot set properties of layers within the VGG16 model. For example, I want to freeze certain layers in this model, which I cannot access via combined_model.layers. Does anyone have a working solution that gives the model structure from the None initialization, but with pretrained ImageNet weights?
You can freeze or train layers using combined_model.layers[2].layers, as mentioned in the comment above. You may be able to simplify the model as follows:
```
img_input = Input(shape=(img_height, img_width, 1), name='grayscale_input_layer')
x = Conv2D(3, (3, 3), padding='same', name='grayscale_RGB_layer')(img_input)
x = VGG16(weights='imagenet', include_top=False)(x)  # loads the pretrained ImageNet weights
model = Model(img_input, x)
model.summary()
for layer in model.layers[2].layers:
    layer.trainable = False
```
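One detail worth noting (my addition): in Keras, changes to layer.trainable only take effect when the model is compiled, so recompile after the loop; the optimizer and loss below are placeholders.
```
model.compile(optimizer='adam', loss='categorical_crossentropy')  # placeholder settings; recompile so the frozen flags take effect
model.summary()  # the VGG16 weights should now appear under "Non-trainable params"
```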
I have a dataset of 1-D vectors, each 3001 values long. I have used a simple convolutional network to perform binary classification on these sequences:
shape=train_X.shape[1:]
model = Sequential()
model.add(Conv1D(75,3,strides=1, input_shape=shape, activation='relu'))
model.add(MaxPooling1D(3))
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
The network achieves ~60% accuracy.
I now would like to create an autoencoder to discover the regular pattern that distinguishes samples where the label is '1' from those where it is '0', i.e., to generate an exemplary sequence that is representative of the '1'-labeled samples.
Based on previous blogs and posts I have tried to put together an autoencoder that can achieve this:
input_sig = Input(batch_shape=(None,3001,1))
x = Conv1D(64,3, activation='relu', padding='same')(input_sig)
x1 = MaxPooling1D(2)(x)
x2 = Conv1D(32,3, activation='relu', padding='same')(x1)
x3 = MaxPooling1D(2)(x2)
flat = Flatten()(x3)
encoded = Dense(1,activation = 'relu')(flat)
x2_ = Conv1D(32, 3, activation='relu', padding='same')(x3)
x1_ = UpSampling1D(2)(x2_)
x_ = Conv1D(64, 3, activation='relu', padding='same')(x1_)
upsamp = UpSampling1D(2)(x_)
decoded = Conv1D(1, 3, activation='sigmoid', padding='same')(upsamp)
autoencoder = Model(input_sig, decoded)
autoencoder.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
This looks as follows:
autoencoder.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_57 (InputLayer) (None, 3001, 1) 0
_________________________________________________________________
conv1d_233 (Conv1D) (None, 3001, 64) 256
_________________________________________________________________
max_pooling1d_115 (MaxPoolin (None, 1500, 64) 0
_________________________________________________________________
conv1d_234 (Conv1D) (None, 1500, 32) 6176
_________________________________________________________________
max_pooling1d_116 (MaxPoolin (None, 750, 32) 0
_________________________________________________________________
conv1d_235 (Conv1D) (None, 750, 32) 3104
_________________________________________________________________
up_sampling1d_106 (UpSamplin (None, 1500, 32) 0
_________________________________________________________________
conv1d_236 (Conv1D) (None, 1500, 64) 6208
_________________________________________________________________
up_sampling1d_107 (UpSamplin (None, 3000, 64) 0
_________________________________________________________________
conv1d_237 (Conv1D) (None, 3000, 64) 12352
=================================================================
Total params: 28,096
Trainable params: 28,096
Non-trainable params: 0
Hence everything seems to be going smoothly until I train the network:
autoencoder.fit(train_X,train_y,epochs=3,batch_size=100,validation_data=(test_X, test_y))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/bsxcto/miniconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1630, in fit
batch_size=batch_size)
File "/home/bsxcto/miniconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1480, in _standardize_user_data
exception_prefix='target')
File "/home/bsxcto/miniconda3/lib/python3.6/site-packages/keras/engine/training.py", line 113, in _standardize_input_data
'with shape ' + str(data_shape))
ValueError: Error when checking target: expected conv1d_237 to have 3 dimensions, but got array with shape (32318, 1)
Hence I have tried adding a 'Reshape' layer before the last one.
upsamp = UpSampling1D(2)(x_)
flat = Flatten()(upsamp)
reshaped = Reshape((3000,64))(flat)
decoded = Conv1D(1, 3, activation='sigmoid', padding='same')(reshaped)
in which case the network looks as follows:
autoencoder.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_59 (InputLayer) (None, 3001, 1) 0
_________________________________________________________________
conv1d_243 (Conv1D) (None, 3001, 64) 256
_________________________________________________________________
max_pooling1d_119 (MaxPoolin (None, 1500, 64) 0
_________________________________________________________________
conv1d_244 (Conv1D) (None, 1500, 32) 6176
_________________________________________________________________
max_pooling1d_120 (MaxPoolin (None, 750, 32) 0
_________________________________________________________________
conv1d_245 (Conv1D) (None, 750, 32) 3104
_________________________________________________________________
up_sampling1d_110 (UpSamplin (None, 1500, 32) 0
_________________________________________________________________
conv1d_246 (Conv1D) (None, 1500, 64) 6208
_________________________________________________________________
up_sampling1d_111 (UpSamplin (None, 3000, 64) 0
_________________________________________________________________
flatten_111 (Flatten) (None, 192000) 0
_________________________________________________________________
reshape_45 (Reshape) (None, 3000, 64) 0
_________________________________________________________________
conv1d_247 (Conv1D) (None, 3000, 1) 193
=================================================================
Total params: 15,937
Trainable params: 15,937
Non-trainable params: 0
But the same error results:
Error when checking target: expected conv1d_247 to have 3 dimensions, but got array with shape (32318, 1)
My questions are:
1) Is this a feasible way of finding the pattern that distinguishes the samples labeled '1' from those labeled '0'?
2) How can I make the final layer accept the final output of the last upsampling layer?
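What follows appears to be the working follow-up: the autoencoder is fitted with train_X as both input and target, since an autoencoder's target is its own input, not the (32318, 1) label array that caused the dimension error. It reuses the trained classifier's first two layers with frozen weights, then averages the decoded outputs to obtain an exemplary sequence: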
original = Sequential()
original.add(Conv1D(75, repeat_length, strides=stride, input_shape=shape, activation='relu', padding='same'))
original.add(MaxPooling1D(repeat_length))
original.add(Flatten())
original.add(Dense(1, activation='sigmoid'))
original.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
calculate_roc(original......)
mod = Sequential()
mod.add(original.layers[0])
mod.add(original.layers[1])
mod.add(Conv1D(75, window, activation='relu', padding='same'))
mod.add(UpSampling1D(window))
mod.add(Conv1D(1, 1, activation='sigmoid', padding='same'))
mod.layers[0].trainable = False  # freeze the pretrained layers before compiling,
mod.layers[1].trainable = False  # so the flags are picked up by compile()
mod.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
mod.fit(train_X, train_X, epochs=1, batch_size=100)
decoded_imgs = mod.predict(test_X)
x = decoded_imgs.mean(axis=0)
plt.plot(x)
My CNN model contains convolution layers and dense layers. I am able to visualize the images and filters of the convolution layers with the code below, but I am unable to see the output images after the dense layers (only images, because there are no filters). When I tried using the code below I got this error:
File "<ipython-input-25-e8e4d4494672>", line 35, in <module>
num_of_featuremaps=feature_maps.shape[2]
IndexError: tuple index out of range
(and after that some blank space)
The code is the following:
def get_featuremaps(model, layer_idx, X_batch):
    get_activations = K.function([model.layers[0].input, K.learning_phase()], [model.layers[layer_idx].output,])
    activations = get_activations([X_batch, 0])
    return activations
layer_num=11
filter_num=0
test_image=x[0]
test_image_show=test_image[:,:,0]
plt.axis('off')
test_image= np.expand_dims(test_image, axis=0)
print (test_image.shape)
activations = get_featuremaps(model, int(layer_num),test_image)
print (np.shape(activations))
feature_maps = activations[0][0]
print (np.shape(feature_maps))
if K.image_dim_ordering() == 'th':
    feature_maps = np.rollaxis((np.rollaxis(feature_maps, 2, 0)), 2, 0)
print(feature_maps.shape)
fig=plt.figure(figsize=(16,16))
#plt.imshow(feature_maps[:,:,filter_num],cmap='gray')
#plt.savefig("featuremaps-layer-{}".format(layer_num) + "-filternum-{}".format(filter_num)+'.jpg')
num_of_featuremaps=feature_maps.shape[2]
fig=plt.figure(figsize=(16,16))
plt.title("featuremaps-layer-{}".format(layer_num))
subplot_num=int(np.ceil(np.sqrt(num_of_featuremaps)))
for i in range(int(num_of_featuremaps)):
    ax = fig.add_subplot(subplot_num, subplot_num, i+1)
    ax.imshow(feature_maps[:,:,i], cmap='gray')
    plt.xticks([])
    plt.yticks([])
    plt.tight_layout()
plt.show()
from mpl_toolkits.axes_grid1 import make_axes_locatable
def nice_imshow(ax, data, vmin=None, vmax=None, cmap=None):
    """Wrapper around pl.imshow"""
    if cmap is None:
        cmap = cm.jet
    if vmin is None:
        vmin = data.min()
    if vmax is None:
        vmax = data.max()
    divider = make_axes_locatable(ax)
    cax = divider.append_axes("right", size="5%", pad=0.05)
    im = ax.imshow(data, vmin=vmin, vmax=vmax, interpolation='nearest', cmap=cmap)
    pl.colorbar(im, cax=cax)
The model looks like the following:
Layer (type) Output Shape Param #
=================================================================
conv2d_37 (Conv2D) (None, 49, 49, 32) 160
_________________________________________________________________
conv2d_38 (Conv2D) (None, 48, 48, 32) 4128
_________________________________________________________________
max_pooling2d_19 (MaxPooling (None, 24, 24, 32) 0
_________________________________________________________________
dropout_28 (Dropout) (None, 24, 24, 32) 0
_________________________________________________________________
conv2d_39 (Conv2D) (None, 23, 23, 64) 8256
_________________________________________________________________
conv2d_40 (Conv2D) (None, 22, 22, 64) 16448
_________________________________________________________________
max_pooling2d_20 (MaxPooling (None, 11, 11, 64) 0
_________________________________________________________________
dropout_29 (Dropout) (None, 11, 11, 64) 0
_________________________________________________________________
flatten_10 (Flatten) (None, 7744) 0
_________________________________________________________________
dense_19 (Dense) (None, 256) 1982720
_________________________________________________________________
dropout_30 (Dropout) (None, 256) 0
_________________________________________________________________
dense_20 (Dense) (None, 2) 514
=================================================================
Total params: 2,012,226
Trainable params: 2,012,226
Non-trainable params: 0
_________________________________________________________________
where pl is pylab and plt is matplotlib.
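For what it's worth, here is a minimal sketch of a guard (my addition, reusing activations, layer_num, and plt from the snippets above): the output of a Dense layer is a 1-D activation vector, so feature_maps.shape[2] does not exist, which is what raises the IndexError; 1-D outputs have to be drawn as a curve rather than a grid of 2-D maps.
```
feature_maps = activations[0][0]
if feature_maps.ndim == 1:
    # Dense/Flatten output: a single activation vector, no spatial feature maps
    plt.figure(figsize=(16, 4))
    plt.title("activations-layer-{}".format(layer_num))
    plt.plot(feature_maps)
    plt.show()
else:
    num_of_featuremaps = feature_maps.shape[2]
    # ... continue with the subplot grid from the code above
```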
I am getting an error while running the following code in Keras:
Traceback (most recent call last):
File "my_conv_ae.py", line 74, in <module>
validation_steps = nb_validation_samples // batch_size)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\keras\legacy\interfaces.py", line 88, in wrapper
return func(*args, **kwargs)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\keras\engine\training.py", line 1890, in fit_generator
class_weight=class_weight)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\keras\engine\training.py", line 1627, in train_on_batch
check_batch_axis=True)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\keras\engine\training.py", line 1309, in _standardize_user_data
exception_prefix='target')
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\keras\engine\training.py", line 127, in _standardize_input_data
str(array.shape))
ValueError: Error when checking target: expected conv2d_transpose_8 to have 4 dimensions, but got array with shape (20, 1)
The code is:
import keras
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D, Conv2DTranspose
from keras.models import Model
from keras import backend as K
from keras.preprocessing.image import ImageDataGenerator
import numpy as np
input_img = Input(shape=(512, 512, 1))
nb_train_samples = 1700
nb_validation_samples = 420
epochs = 10
batch_size = 20
x = Conv2D(64, (11, 11), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(input_img)
x = Conv2D(64, (11, 11), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(128, (7, 7), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = Conv2D(128, (5, 5), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(256, (5, 5), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = Conv2D(256, (3, 3), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(512, (3, 3), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = Conv2D(512, (3, 3), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
encoded = MaxPooling2D((2, 2))(x)
print (K.int_shape(encoded))
# at this point the representation is (26, 26, 512)
x = UpSampling2D((2, 2))(encoded)
x = Conv2DTranspose(512, (3, 3), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = Conv2DTranspose(512, (3, 3), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2DTranspose(256, (3, 3), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = Conv2DTranspose(256, (5, 5), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2DTranspose(128, (5, 5), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = Conv2DTranspose(128, (7, 7), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2DTranspose(64, (11, 11), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
decoded = Conv2DTranspose(1, (11, 11), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
print (K.int_shape(decoded))
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer = 'adadelta', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
x_train = train_datagen.flow_from_directory(
    'data/train',
    target_size=(512, 512), color_mode='grayscale',
    batch_size=batch_size,
    class_mode='binary')
x_test = test_datagen.flow_from_directory(
    'data/validation',
    target_size=(512, 512), color_mode='grayscale',
    batch_size=batch_size,
    class_mode='binary')
autoencoder.fit_generator(
    x_train,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=x_test,
    validation_steps=nb_validation_samples // batch_size)
decoded_imgs = autoencoder.predict(x_test)
Summary of model is as follows:
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 512, 512, 1) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 502, 502, 64) 7808
_________________________________________________________________
conv2d_2 (Conv2D) (None, 492, 492, 64) 495680
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 246, 246, 64) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 240, 240, 128) 401536
_________________________________________________________________
conv2d_4 (Conv2D) (None, 236, 236, 128) 409728
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 118, 118, 128) 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, 114, 114, 256) 819456
_________________________________________________________________
conv2d_6 (Conv2D) (None, 112, 112, 256) 590080
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 56, 56, 256) 0
_________________________________________________________________
conv2d_7 (Conv2D) (None, 54, 54, 512) 1180160
_________________________________________________________________
conv2d_8 (Conv2D) (None, 52, 52, 512) 2359808
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 26, 26, 512) 0
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 52, 52, 512) 0
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 54, 54, 512) 2359808
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 56, 56, 512) 2359808
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 112, 112, 512) 0
_________________________________________________________________
conv2d_transpose_3 (Conv2DTr (None, 114, 114, 256) 1179904
_________________________________________________________________
conv2d_transpose_4 (Conv2DTr (None, 118, 118, 256) 1638656
_________________________________________________________________
up_sampling2d_3 (UpSampling2 (None, 236, 236, 256) 0
_________________________________________________________________
conv2d_transpose_5 (Conv2DTr (None, 240, 240, 128) 819328
_________________________________________________________________
conv2d_transpose_6 (Conv2DTr (None, 246, 246, 128) 802944
_________________________________________________________________
up_sampling2d_4 (UpSampling2 (None, 492, 492, 128) 0
_________________________________________________________________
conv2d_transpose_7 (Conv2DTr (None, 502, 502, 64) 991296
_________________________________________________________________
conv2d_transpose_8 (Conv2DTr (None, 512, 512, 1) 7745
=================================================================
Total params: 16,423,745
Trainable params: 16,423,745
Non-trainable params: 0
_________________________________________________________________
Please help me. Is this because of the Conv2DTranspose() layers I have used for decoding?
It's definitely not a problem with the model architecture itself (it works on my side). It seems to be a problem with your ground truth data: the target must have the same dimensions as your input image, but flow_from_directory with class_mode='binary' yields class labels, not images, as ground truth. I guess you need to use your own custom data generator.
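As one possible approach (a sketch of my own, reusing the directory names and sizes from the question): a wrapper generator that drops the class labels and yields each batch as both input and target, which is what an autoencoder expects. The loss would likely also need to change from sparse_categorical_crossentropy to a reconstruction loss such as 'mse' or 'binary_crossentropy'.
```
def autoencoder_flow(datagen, directory, batch_size):
    gen = datagen.flow_from_directory(
        directory,
        target_size=(512, 512), color_mode='grayscale',
        batch_size=batch_size,
        class_mode=None)  # yield only the images, not the (20, 1) label array
    for batch in gen:
        yield batch, batch  # ground truth is the input image itself

autoencoder.fit_generator(
    autoencoder_flow(train_datagen, 'data/train', batch_size),
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=autoencoder_flow(test_datagen, 'data/validation', batch_size),
    validation_steps=nb_validation_samples // batch_size)
```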