Keras - How to remove useless dimension without hurting the computation graph?

While building a deep learning model, I used the K.squeeze function to remove a useless dimension when the first two dimensions had None shape.
import keras.backend as K
>>> K.int_shape(user_input_for_TD)
(None, None, 1, 32)
>>> K.int_shape(K.squeeze(user_input_for_TD, axis=-2))
(None, None, 32)
However, this gives the error below. It seems like the K.squeeze function breaks the computation graph; is there any way around this issue? Maybe that function does not support gradient calculation, i.e., it isn't differentiable.
File "/home/sundong/anaconda3/envs/py36/lib/python3.6/site-packages/keras/engine/network.py", line 1325, in build_map
node = layer._inbound_nodes[node_index]
AttributeError: 'NoneType' object has no attribute '_inbound_nodes'
The code block below is the full snippet that causes the error.
user_embedding_layer = Embedding(
    input_dim=len(self.data.visit_embedding),
    output_dim=32,
    weights=[np.array(list(self.data.visit_embedding.values()))],
    input_length=1,
    trainable=False)
...
all_areas_lstm = LSTM(1024, return_sequences=True)(all_areas_rslt) # (None, None, 1024)
user_input_for_TD = Lambda(lambda x: x[:, :, 0:1])(multiple_inputs) # (None, None, 1)
user_input_for_TD = TimeDistributed(user_embedding_layer)(user_input_for_TD) # (None, None, 1, 32)
user_input_for_TD = K.squeeze(user_input_for_TD, axis=-2) # (None, None, 32)
aggre_threeway_inputs = Concatenate()([user_input_for_TD, all_areas_lstm]) # should be (None, None, 1056)
threeway_encoder = TimeDistributed(ThreeWay(output_dim=512))
three_way_rslt = threeway_encoder(aggre_threeway_inputs) # should be (None, None, 512)
logits = Dense(365, activation='softmax')(three_way_rslt) # should be (None, None, 365)
self.model = keras.Model(inputs=multiple_inputs, outputs=logits)
By removing the two lines below (i.e., not passing the input through the embedding layer), the code works without any issues. In this case, the dimension of aggre_threeway_inputs = Concatenate()([user_input_for_TD, all_areas_lstm]) is (None, None, 1025).
user_input_for_TD = TimeDistributed(user_embedding_layer)(user_input_for_TD)
user_input_for_TD = K.squeeze(user_input_for_TD, axis=-2)

I solved it by using a Lambda layer with indexing instead of the K.squeeze function. K.squeeze is a raw backend operation, not a Keras layer, so the tensor it returns is not tracked as a graph node (it has no _inbound_nodes), which breaks the model build; wrapping the operation in a Lambda layer keeps the computation graph intact.
from keras.layers import Lambda
>>> K.int_shape(user_input_for_TD)
(None, None, 1, 32)
>>> K.int_shape(Lambda(lambda x: x[:, :, 0, :])(user_input_for_TD))
(None, None, 32)
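For completeness, here is how the fix slots back into the original graph (a sketch using the same layer names as above; the indexing is wrapped in a Lambda layer so Keras records it as a proper node with inbound nodes):
from keras.layers import Lambda, TimeDistributed, Concatenate

user_input_for_TD = TimeDistributed(user_embedding_layer)(user_input_for_TD)  # (None, None, 1, 32)
user_input_for_TD = Lambda(lambda x: x[:, :, 0, :])(user_input_for_TD)        # (None, None, 32)
aggre_threeway_inputs = Concatenate()([user_input_for_TD, all_areas_lstm])    # (None, None, 1056)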

Related

Tensorflow tape.gradient is NONE when applying Grad-CAM on transfer learning model in model

I am using transfer learning to wrap my inner regression model with an outer model for classification.
The inner model is trained on the MNIST dataset, where it has to predict ones from zeros. (Note that this is just an example to test the method before moving on to more complex models and data.)
I then freeze the inner model to prevent its regression output from being changed while training the outer model.
The outer model classifies which instance the input is, so the first image of my batch should become the first class; with TensorFlow's Keras this works fine.
Then I remove the outer model's softmax activation function and apply Grad-CAM.
Grad-CAM is applied to the final layer of the inner model (a 2D Conv layer).
But the grads = tape.gradient(class_channel, last_conv_layer_output) line in the code produces only None.
Any help would be appreciated (I hope enough code is provided to reproduce the error ;-))
I have tried the steps from Grad-CAM Transfer learning error: Attempt to convert a value (None) with an unsupported type (<class 'NoneType'>) to a Tensor, but to no avail. In addition, allowing the inner model to be retrained produces the same problem. I am aware of the import problems between Keras and TF with regard to tape.gradient.
Libraries
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
Preprocess dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train / 255
x_test = x_test / 255
# Select only ones and zeros.
x_train_zeros = x_train[y_train == 0]
x_train_ones = x_train[y_train == 1]
x_test_zeros = x_test[y_test == 0]
x_test_ones = x_test[y_test == 1]
# X and Y need to have same length
y_train_ones = x_train_ones[:len(x_train_zeros)]
y_test_ones = x_test_ones[:len(x_test_zeros)]
Model: "inner"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 28, 28, 1)] 0 []
conv2d (Conv2D) (None, 28, 28, 32) 320 ['input_1[0][0]']
add (Add) (None, 28, 28, 32) 0 ['input_1[0][0]',
'conv2d[0][0]']
conv2d_1 (Conv2D) (None, 28, 28, 32) 9248 ['add[0][0]']
add_1 (Add) (None, 28, 28, 32) 0 ['add[0][0]',
'conv2d_1[0][0]']
conv2d_2 (Conv2D) (None, 28, 28, 32) 9248 ['add_1[0][0]']
add_2 (Add) (None, 28, 28, 32) 0 ['add_1[0][0]',
'conv2d_2[0][0]']
conv2d_3 (Conv2D) (None, 28, 28, 32) 9248 ['add_2[0][0]']
add_3 (Add) (None, 28, 28, 32) 0 ['add_2[0][0]',
'conv2d_3[0][0]']
inner_final (Conv2D) (None, 28, 28, 1) 289 ['add_3[0][0]']
==================================================================================================
Total params: 28,353
Trainable params: 28,353
Non-trainable params: 0
__________________________________________________________________________________________________
history = inner_model.fit(x_train_zeros, y_train_ones, batch_size = batch_size, epochs = epochs, validation_data = (x_test_zeros, y_test_ones))
One hot encode the output for the classification
batch_size_classify = 40 # I want to "classify" the first 40 images.
X_train_class = x_train_zeros[: batch_size_classify]
# Produce "class labels"
Y_test = np.arange(0, batch_size_classify, 1, dtype=int)
from numpy import array
from numpy import argmax
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder
# define example
values = array(Y_test)
# integer encode
label_encoder = LabelEncoder()
integer_encoded = label_encoder.fit_transform(values)
# binary encode
onehot_encoder = OneHotEncoder(sparse=False)
integer_encoded = integer_encoded.reshape(len(integer_encoded), 1)
onehot_encoded = onehot_encoder.fit_transform(integer_encoded)
# invert first example
#inverted = label_encoder.inverse_transform([argmax(onehot_encoded[0, :])])
X_train_class = tf.convert_to_tensor(X_train_class)
onehot_encoded = tf.convert_to_tensor(onehot_encoded)
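As an aside, the same one-hot labels can be produced more compactly with a Keras utility; a minimal sketch, equivalent to the sklearn pipeline above:
import tensorflow as tf

# One-hot encode the integer class labels 0..39 directly
onehot_encoded = tf.keras.utils.to_categorical(Y_test, num_classes=batch_size_classify)
onehot_encoded = tf.convert_to_tensor(onehot_encoded)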
Build the outer model
# prevent changing the inner model
inner_model.trainable = False
inp = tf.keras.layers.Input(shape=(28, 28, 1))
x = inner_model(inp)
x = tf.keras.layers.Flatten()(x)
outputs = tf.keras.layers.Dense(batch_size_classify, activation = 'softmax')(x)
Outer_model = tf.keras.Model(inp, outputs)
Outer_model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
Model: "Outer"
__________________________________________________________________________________________________
Layer (type) Output Shape Param #
==================================================================================================
input_1 (InputLayer) [(None, 28, 28, 1)] 0
inner (Functional) (None, 28, 28, 1) 28353
|¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯|
| input_1 (InputLayer) [(None, 28, 28, 1)] 0 |
| |
| conv2d (Conv2D) (None, 28, 28, 32) 320 |
| |
| add (Add) (None, 28, 28, 32) 0 |
| |
| conv2d_1 (Conv2D) (None, 28, 28, 32) 9248 |
| |
| add_1 (Add) (None, 28, 28, 32) 0 |
| |
| conv2d_2 (Conv2D) (None, 28, 28, 32) 9248 |
| |
| add_2 (Add) (None, 28, 28, 32) 0 |
| |
| conv2d_3 (Conv2D) (None, 28, 28, 32) 9248 |
| |
| add_3 (Add) (None, 28, 28, 32) 0 |
| |
| inner_final (Conv2D) (None, 28, 28, 1) 289 |
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
flatten_2 (Flatten) (None, 784) 0
dense_2 (Dense) (None, 40) 31400
==================================================================================================
Total params: 59,753
Trainable params: 31,400
Non-trainable params: 28,353
__________________________________________________________________________________________________
epochs = 20
# Fit the model to the training data.
Outer_model.fit(X_train_class, onehot_encoded, batch_size = 1, epochs = epochs)
Grad-CAM, where the grads = tape.gradient(class_channel, last_conv_layer_output) line produces None.
def make_gradcam_heatmap(img_array, model, last_conv_layer_name, pred_index=None):
    # First, we create a model that maps the input image to the activations of the last conv layer as well as the output predictions
    grad_model = tf.keras.models.Model([model.inputs], [model.get_layer(last_conv_layer_name).output, model.output])
    # Then, we compute the gradient of the top predicted class for our input
    # image with respect to the activations of the last conv layer
    with tf.GradientTape() as tape:
        last_conv_layer_output, preds = grad_model(img_array)
        if pred_index is None:
            pred_index = tf.argmax(preds[0])
        class_channel = preds[:, pred_index]
    # This is the gradient of the output neuron (top predicted or chosen)
    # with regard to the output feature map of the last conv layer
    # The line in Grad-CAM where the code produces None...
    grads = tape.gradient(class_channel, last_conv_layer_output)
    # This is a vector where each entry is the mean intensity
    # of the gradient over a specific feature map channel
    pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2))
    # We multiply each channel in the feature map array
    # by "how important this channel is" with regard to the top predicted class
    # then sum all the channels to obtain the heatmap class activation
    last_conv_layer_output = last_conv_layer_output[0]
    heatmap = last_conv_layer_output @ pooled_grads[..., tf.newaxis]
    heatmap = tf.squeeze(heatmap)
    # For visualization purposes, we also normalize the heatmap between 0 & 1
    heatmap = tf.maximum(heatmap, 0) / tf.math.reduce_max(heatmap)
    return heatmap.numpy()
last_conv_layer_name = "inner"
# select first image to test Grad-CAM onto.
img_array = X_train_class[:1]
# Remove softmax activation function.
Outer_model.layers[-1].activation = None
img_array = tf.expand_dims(img_array, axis = 3)
heatmap = make_gradcam_heatmap(img_array, Outer_model, last_conv_layer_name)
Cell output:
ValueError Traceback (most recent call last)
<ipython-input-44-4dea4cd32c74> in <module>
7 img_array = tf.expand_dims(img_array, axis = 3)
8
----> 9 heatmap = make_gradcam_heatmap(img_array, U_model, last_conv_layer_name)
10
11 # Display heatmap
2 frames
<ipython-input-19-f9b93747c0ae> in make_gradcam_heatmap(img_array, model, last_conv_layer_name, pred_index)
17
18 # This is a vector where each entry is the mean intensity of the gradient over a specific feature map channel
---> 19 pooled_grads = tf.reduce_mean(grads, axis = (0, 1, 2))
20
21 # We multiply each channel in the feature map array
/usr/local/lib/python3.8/dist-packages/tensorflow/python/util/traceback_utils.py in error_handler(*args, **kwargs)
151 except Exception as e:
152 filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153 raise e.with_traceback(filtered_tb) from None
154 finally:
155 del filtered_tb
/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/constant_op.py in convert_to_eager_tensor(value, ctx, dtype)
100 dtype = dtypes.as_dtype(dtype).as_datatype_enum
101 ctx.ensure_initialized()
--> 102 return ops.EagerTensor(value, ctx.device_name, dtype)
103
104
ValueError: Attempt to convert a value (None) with an unsupported type (<class 'NoneType'>) to a Tensor.

Specific options missing in keras layer class

I would like to implement operations on the results of two Keras Conv2D layers (Ix, Iy) in a deep learning architecture for a computer vision task. The operations look as follows:
G = np.hypot(Ix, Iy)
G = G / G.max() * 255
theta = np.arctan2(Iy, Ix)
I've spent some time looking for operations provided by Keras but have had no success so far. Among a few others, there's an Add layer that lets the user add the results of two Conv2D layers (tf.keras.layers.Add()([Ix, Iy])). However, I would like a Pythagorean addition (first line) followed by an arctan2 operation (third line).
So ideally, if already implemented by keras it would look as follows:
tf.keras.layers.Hypot(Ix,Iy)
tf.keras.layers.Arctan2(Ix,Iy)
Does anyone know if it is possible to implement those functionalities within my deep learning architecture? Is it possible to write custom layers that meet my needs?
You could probably use simple Lambda layers for your use case, although they are not absolutely necessary:
import tensorflow as tf
inputs = tf.keras.layers.Input((16, 16, 1))
x = tf.keras.layers.Conv2D(32, (3, 3), padding='same')(inputs)
y = tf.keras.layers.Conv2D(32, (2, 2), padding='same')(inputs)
hypot = tf.keras.layers.Lambda(lambda z: tf.math.sqrt(tf.math.square(z[0]) + tf.math.square(z[1])))([x, y])
hypot = tf.keras.layers.Lambda(lambda z: z / tf.reduce_max(z) * 255)(hypot)
atan2 = tf.keras.layers.Lambda(lambda z: tf.math.atan2(z[0], z[1]))([x, y])
model = tf.keras.Model(inputs, [hypot, atan2])
print(model.summary())
model.compile(optimizer='adam', loss='mse')
model.fit(tf.random.normal((64, 16, 16, 1)), [tf.random.normal((64, 16, 16, 32)), tf.random.normal((64, 16, 16, 32))])
Model: "model_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_3 (InputLayer) [(None, 16, 16, 1)] 0 []
conv2d_2 (Conv2D) (None, 16, 16, 32) 320 ['input_3[0][0]']
conv2d_3 (Conv2D) (None, 16, 16, 32) 160 ['input_3[0][0]']
lambda_2 (Lambda) (None, 16, 16, 32) 0 ['conv2d_2[0][0]',
'conv2d_3[0][0]']
lambda_3 (Lambda) (None, 16, 16, 32) 0 ['lambda_2[0][0]']
lambda_4 (Lambda) (None, 16, 16, 32) 0 ['conv2d_2[0][0]',
'conv2d_3[0][0]']
==================================================================================================
Total params: 480
Trainable params: 480
Non-trainable params: 0
__________________________________________________________________________________________________
None
2/2 [==============================] - 1s 71ms/step - loss: 3006.0469 - lambda_3_loss: 3001.7981 - lambda_4_loss: 4.2489
<keras.callbacks.History at 0x7ffa93dc2890>
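To address the second part of the question (custom layers): yes, the same operations can also be packaged as custom layer subclasses. A minimal sketch under that assumption; the class names here are my own, not part of tf.keras:
import tensorflow as tf

class ScaledHypot(tf.keras.layers.Layer):
    # Pythagorean addition of two feature maps, rescaled to [0, 255]
    def call(self, inputs):
        ix, iy = inputs
        g = tf.math.sqrt(tf.math.square(ix) + tf.math.square(iy))
        return g / tf.reduce_max(g) * 255.0

class Arctan2(tf.keras.layers.Layer):
    # Element-wise arctan2(Iy, Ix)
    def call(self, inputs):
        iy, ix = inputs
        return tf.math.atan2(iy, ix)

# Usage mirrors the Lambda version above:
# hypot = ScaledHypot()([x, y])
# atan2 = Arctan2()([y, x])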

Convert a simple cnn from keras to pytorch

Can anyone please help me convert this model to PyTorch? I already tried to convert from Keras to PyTorch following How can I convert this keras cnn model to pytorch version, but the training results were different. Thank you.
input_3d = (1, 64, 96, 96)
pool_3d = (2, 2, 2)
model = Sequential()
model.add(Convolution3D(8, 3, 3, 3, name='conv1', input_shape=input_3d,
                        data_format='channels_first'))
model.add(MaxPooling3D(pool_size=pool_3d, name='pool1'))
model.add(Convolution3D(8, 3, 3, 3, name='conv2',data_format='channels_first'))
model.add(MaxPooling3D(pool_size=pool_3d, name='pool2'))
model.add(Convolution3D(8, 3, 3, 3, name='conv3',data_format='channels_first'))
model.add(MaxPooling3D(pool_size=pool_3d, name='pool3'))
model.add(Flatten())
model.add(Dense(2000, activation='relu', name='dense1'))
model.add(Dropout(0.5, name='dropout1'))
model.add(Dense(500, activation='relu', name='dense2'))
model.add(Dropout(0.5, name='dropout2'))
model.add(Dense(3, activation='softmax', name='softmax'))
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1 (Conv3D) (None, 8, 60, 94, 94) 224
_________________________________________________________________
pool1 (MaxPooling3D) (None, 8, 30, 47, 47) 0
_________________________________________________________________
conv2 (Conv3D) (None, 8, 28, 45, 45) 1736
_________________________________________________________________
pool2 (MaxPooling3D) (None, 8, 14, 22, 22) 0
_________________________________________________________________
conv3 (Conv3D) (None, 8, 12, 20, 20) 1736
_________________________________________________________________
pool3 (MaxPooling3D) (None, 8, 6, 10, 10) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 4800) 0
_________________________________________________________________
dense1 (Dense) (None, 2000) 9602000
_________________________________________________________________
dropout1 (Dropout) (None, 2000) 0
_________________________________________________________________
dense2 (Dense) (None, 500) 1000500
_________________________________________________________________
dropout2 (Dropout) (None, 500) 0
_________________________________________________________________
softmax (Dense) (None, 3) 1503
=================================================================
Your PyTorch equivalent of the Keras model would look like this:
import torch
import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.maxpool = nn.MaxPool3d((2, 2, 2))
        self.conv1 = nn.Conv3d(in_channels=1, out_channels=8, kernel_size=3)
        self.conv2 = nn.Conv3d(in_channels=8, out_channels=8, kernel_size=3)
        self.conv3 = nn.Conv3d(in_channels=8, out_channels=8, kernel_size=3)
        self.linear1 = nn.Linear(4800, 2000)
        self.dropout1 = nn.Dropout(0.5)  # plain dropout, matching Keras Dropout on dense outputs
        self.linear2 = nn.Linear(2000, 500)
        self.dropout2 = nn.Dropout(0.5)
        self.linear3 = nn.Linear(500, 3)

    def forward(self, x):
        out = self.maxpool(self.conv1(x))
        out = self.maxpool(self.conv2(out))
        out = self.maxpool(self.conv3(out))
        # Flattening process
        b, c, d, h, w = out.size()  # batch_size, channels, depth, height, width
        out = out.view(-1, c * d * h * w)
        # dense1 and dense2 in the Keras model use relu activations
        out = self.dropout1(torch.relu(self.linear1(out)))
        out = self.dropout2(torch.relu(self.linear2(out)))
        out = self.linear3(out)
        out = torch.softmax(out, 1)
        return out
A driver program to test the model:
inputs = torch.randn(8, 1, 64, 96, 96)
model = CNN()
outputs = model(inputs)
print(outputs.shape) # torch.Size([8, 3])
You can save the Keras weights and reload them in PyTorch.
The steps are (a sketch of steps 2 and 3 follows the link below):
Step 0: Train a Model in Keras. ...
Step 1: Recreate & Initialize Your Model Architecture in PyTorch. ...
Step 2: Import Your Keras Model and Copy the Weights. ...
Step 3: Load Those Weights onto Your PyTorch Model. ...
Step 4: Test and Save Your Pytorch Model.
You can follow the example here: https://gereshes.com/2019/06/24/how-to-transfer-a-simple-keras-model-to-pytorch-the-hard-way/
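A minimal sketch of steps 2 and 3 for the models above (assuming the Keras Sequential model from the question and the CNN class from the first answer; the transfer_weights helper is my own, not a library function). Keras stores Conv3D kernels as (kD, kH, kW, in, out) and Dense kernels as (in, out), while PyTorch expects (out, in, kD, kH, kW) and (out, in), so the arrays have to be transposed:
import numpy as np
import torch

def transfer_weights(keras_model, torch_model):
    # Map Keras layer names to the corresponding PyTorch modules
    mapping = {
        'conv1': torch_model.conv1, 'conv2': torch_model.conv2, 'conv3': torch_model.conv3,
        'dense1': torch_model.linear1, 'dense2': torch_model.linear2, 'softmax': torch_model.linear3,
    }
    for name, module in mapping.items():
        kernel, bias = keras_model.get_layer(name).get_weights()
        if kernel.ndim == 5:                        # Conv3D: (kD, kH, kW, in, out) -> (out, in, kD, kH, kW)
            kernel = kernel.transpose(4, 3, 0, 1, 2)
        else:                                       # Dense: (in, out) -> (out, in)
            kernel = kernel.T
        module.weight.data = torch.from_numpy(np.ascontiguousarray(kernel)).float()
        module.bias.data = torch.from_numpy(bias).float()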

Convolution1D to Convolution2D

Summarize the Problem
I have a raw signal from a sensor that is 76,000 data points long. I want to
process this data with a CNN. To do that, I thought I could use a Lambda layer to compute a short-time Fourier transform (STFT) of the raw signal, such as
x = Lambda(lambda v: tf.abs(tf.signal.stft(v,frame_length=frame_length,frame_step=frame_step)))(x)
which works fine. But I want to go one step further and process the raw data beforehand, in the hope that a Conv1D layer acts as a filter that lets some frequencies pass and blocks others.
What I tried
I have the two separate models (a Conv1D example for raw data processing and a Conv2D example where I process the STFT "image") up and running, but I want to combine them.
Conv1D, where the input is: input = Input(shape = (76000,))
x = Lambda(lambda v: tf.expand_dims(v,-1))(input)
x = layers.Conv1D(filters =10,kernel_size=100,activation = 'relu')(x)
x = Flatten()(x)
output = Model(input, x)
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 76000)] 0
_________________________________________________________________
lambda_2 (Lambda) (None, 76000, 1) 0
_________________________________________________________________
conv1d (Conv1D) (None, 75901, 10) 1010
________________________________________________________________
Conv2D same input
x = Lambda(lambda v:tf.expand_dims(tf.abs(tf.signal.stft(v,frame_length=frame_length,frame_step=frame_step)),-1))(input)
x = BatchNormalization()(x)
Model: "model_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_6 (InputLayer) [(None, 76000)] 0
_________________________________________________________________
lambda_8 (Lambda) (None, 751, 513, 1) 0
_________________________________________________________________
batch_normalization_3 (Batch (None, 751, 513, 1) 4
_________________________________________________________________
. . .
. . .
flatten_4 (Flatten) (None, 1360) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 1360) 0
_________________________________________________________________
dense_2 (Dense) (None, 1) 1361
I'm looking for a way to connect the start of the "conv1d" model to the "lambda_8" layer. If I put them together, I get:
x = Lambda(lambda v: tf.expand_dims(v,-1))(input)
x = layers.Conv1D(filters =10,kernel_size=100,activation = 'relu')(x)
#x = Flatten()(x)
x = Lambda(lambda v:tf.expand_dims(tf.abs(tf.signal.stft(v,frame_length=frame_length,frame_step=frame_step)),-1))(x)
Layer (type) Output Shape Param #
=================================================================
input_6 (InputLayer) [(None, 76000)] 0
_________________________________________________________________
lambda_17 (Lambda) (None, 76000, 1) 0
_________________________________________________________________
conv1d_6 (Conv1D) (None, 75901, 10) 1010
_________________________________________________________________
lambda_18 (Lambda) (None, 75901, 0, 513, 1) 0 <-- Wrong
=================================================================
This is not what I am looking for; it should look more like (None, 751, 513, 10, 1).
So far I could not find a suitable solution.
Can someone help me?
Thanks in advance!
From the documentation, it seems the stft only accepts (..., length) inputs; it doesn't accept (..., length, channels).
Thus, the first suggestion is to move the channels to another dimension first, to keep the length at the last index and make the function work.
Now, of course, you will need matching lengths; you can't match 76000 with 75901. Thus the second suggestion is to use padding='same' in the 1D convolutions to keep the lengths equal.
And lastly, since you will already have 10 channels in the result of the stft, you don't need to expand dims in the last lambda.
Summarizing:
1D part
inputs = Input((76000,)) #(batch, 76000)
c1Out = Lambda(lambda x: K.expand_dims(x, axis=-1))(inputs) #(batch, 76000, 1)
c1Out = Conv1D(10, 100, activation = 'relu', padding='same')(c1Out) #(batch, 76000, 10)
#permute for putting length last, apply stft, put the channels back to their position
c1Stft = Permute((2,1))(c1Out) #(batch, 10, 76000)
c1Stft = Lambda(lambda v: tf.abs(tf.signal.stft(v,
                                                frame_length=frame_length,
                                                frame_step=frame_step)
                                 )
                )(c1Stft) #(batch, 10, probably 751, probably 513)
c1Stft = Permute((2,3,1))(c1Stft) #(batch, 751, 513, 10)
2D part, your code seems ok:
c2Out = Lambda(lambda v: tf.expand_dims(tf.abs(tf.signal.stft(v,
                                                              frame_length=frame_length,
                                                              frame_step=frame_step)
                                               ),
                                        -1))(inputs) #(batch, 751, 513, 1)
Now that everything has compatible dimensions
#maybe
#c2Out = Conv2D(10, ..., padding='same')(c2Out)
joined = Concatenate()([c1Stft, c2Out]) #(batch, 751, 513, 11) #maybe (batch, 751, 513, 20)
further = BatchNormalization()(joined)
further = Conv2D(...)(further)
Warning: I don't know if they made stft differentiable or not, the Conv1D part will only work if the gradients are defined.
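For what it's worth, this can be checked directly with a GradientTape; a quick sketch (TF 2.x assumed, and the frame_length/frame_step values are placeholders):
import tensorflow as tf

v = tf.Variable(tf.random.normal((2, 76000)))
with tf.GradientTape() as tape:
    spec = tf.abs(tf.signal.stft(v, frame_length=1024, frame_step=256))
    loss = tf.reduce_sum(spec)

print(tape.gradient(loss, v) is not None)  # True means gradients flow through the STFT block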

Keras Error for CNN Output

I am getting an error when trying to build a CNN denoising autoencoder in Keras. My Keras backend is TensorFlow.
My input data is a numpy array. The numpy arrays were taken from grayscale images. I split them using sklearn's train_test_split. I resized the data and get an error at the output layer.
/opt/anaconda/anaconda3/lib/python3.6/site-
packages/keras/engine/training.py in _standardize_input_data(data,
names, shapes, check_batch_axis, exception_prefix)
151 ' to have shape ' + str(shapes[i]) +
152 ' but got array with shape ' +
--> 153 str(array.shape))
154 return arrays
155
ValueError: Error when checking target: expected conv2d_transpose_36
to have shape (None, 279, 559, 1) but got array with shape (129, 258,
540, 1)
train_images = os.listdir('data/train')
test_images = os.listdir('data/test')
clean_images = os.listdir('data/train_cleaned')
def set_common_size(img_link, std_size = (258, 540)):
    '''Function will take in the argument of a link to an image and return the image
    with the standard size.'''
    img = Image.open(img_link)
    img = image.img_to_array(img)
    img = np.resize(img, std_size)
    return img / 255
train_data = []
test_data = []
cleaned_data = []
for img in train_images:
    img_file = set_common_size('data/train/' + img)
    train_data.append(img_file)
for img in test_images:
    img_file = set_common_size('data/test/' + img)
    test_data.append(img_file)
for img in clean_images:
    img_file = set_common_size('data/train_cleaned/' + img)
    cleaned_data.append(img_file)
train_data = np.asarray(train_data)
test_data = np.asarray(test_data)
cleaned_data = np.asarray(cleaned_data)
x_train, x_test, y_train, y_test = train_test_split(train_data, cleaned_data, test_size=0.1, random_state=42)
input_shape = x_train[0].shape
input_layer = Input(input_shape)
#Layer 1
layer1 = Conv2D(64, 3, activation='relu', padding='same')(input_layer)
layer1 = MaxPooling2D(2)(layer1)
#Layer 2
layer2 = Conv2D(128, 3, activation='relu', padding='same')(layer1)
layer2 = MaxPooling2D(2)(layer2)
#Layer 3
layer3 = Conv2D(256, 3, activation='relu', padding='same')(layer2)
layer3 = MaxPooling2D(2)(layer3)
#Bottleneck
encoded = layer3
#Layer1
uplayer1 = UpSampling2D(2)(layer3)
uplayer1 = Conv2DTranspose(256, 2, activation='relu')(uplayer1)
#Layer2
uplayer2 = UpSampling2D(2)(uplayer1)
uplayer2 = Conv2DTranspose(128, 5, activation='relu')(uplayer2)
#Layer3
uplayer3 = UpSampling2D(2)(uplayer2)
uplayer3 = Conv2DTranspose(64, 10, activation='relu')(uplayer3)
output = Conv2DTranspose(1, 3, activation='sigmoid')(uplayer3)
model = Model(input=input_layer, output=output)
print(model.summary())
model.compile(optimizer='adadelta', loss='binary_crossentropy')
model.fit(
    x_train, y_train,
    epochs=100,
    shuffle=True,
    validation_data=(x_test, y_test)
)
Here are the Model Summary results:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 258, 540, 1) 0
_________________________________________________________________
conv2d_60 (Conv2D) (None, 258, 540, 64) 640
_________________________________________________________________
max_pooling2d_37 (MaxPooling (None, 129, 270, 64) 0
_________________________________________________________________
conv2d_61 (Conv2D) (None, 129, 270, 128) 73856
_________________________________________________________________
max_pooling2d_38 (MaxPooling (None, 64, 135, 128) 0
_________________________________________________________________
conv2d_62 (Conv2D) (None, 64, 135, 256) 295168
_________________________________________________________________
max_pooling2d_39 (MaxPooling (None, 32, 67, 256) 0
_________________________________________________________________
up_sampling2d_37 (UpSampling (None, 64, 134, 256) 0
_________________________________________________________________
conv2d_transpose_29 (Conv2DT (None, 65, 135, 256) 262400
_________________________________________________________________
up_sampling2d_38 (UpSampling (None, 130, 270, 256) 0
_________________________________________________________________
conv2d_transpose_30 (Conv2DT (None, 134, 274, 128) 819328
_________________________________________________________________
up_sampling2d_39 (UpSampling (None, 268, 548, 128) 0
_________________________________________________________________
conv2d_transpose_31 (Conv2DT (None, 277, 557, 64) 819264
_________________________________________________________________
conv2d_transpose_32 (Conv2DT (None, 279, 559, 1) 577
=================================================================
Total params: 2,271,233
Trainable params: 2,271,233
Non-trainable params: 0
_________________________________________________________________
There is a mismatch between:
the shape of the outputs your model creates: (279, 559, 1)
(shown in the summary line conv2d_transpose_32 (Conv2DT (None, 279, 559, 1) which is actually the last layer, but your console output looks a bit messed up)
and
the shape of the outputs you are expecting: (258, 540, 1) (the shape of the entries in y_train that you feed in)
Use the model summary to see where the deviations from the expected shapes start and use the proper padding values to get the expected output shapes.
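One concrete way to act on that advice (a sketch, not the only possible fix): pad the input so both spatial dimensions are divisible by 8, use padding='same' everywhere so each UpSampling2D exactly undoes a MaxPooling2D, and crop back to the original size at the end. The padding amounts below assume the (258, 540, 1) input shape from the question, and this variant uses UpSampling2D + Conv2D instead of Conv2DTranspose.
from keras.layers import (Input, Conv2D, MaxPooling2D, UpSampling2D,
                          ZeroPadding2D, Cropping2D)
from keras.models import Model

input_layer = Input((258, 540, 1))
x = ZeroPadding2D(((3, 3), (2, 2)))(input_layer)           # (264, 544, 1), both divisible by 8
x = Conv2D(64, 3, activation='relu', padding='same')(x)
x = MaxPooling2D(2)(x)                                     # (132, 272)
x = Conv2D(128, 3, activation='relu', padding='same')(x)
x = MaxPooling2D(2)(x)                                     # (66, 136)
x = Conv2D(256, 3, activation='relu', padding='same')(x)
encoded = MaxPooling2D(2)(x)                               # (33, 68) bottleneck
x = UpSampling2D(2)(encoded)
x = Conv2D(256, 3, activation='relu', padding='same')(x)
x = UpSampling2D(2)(x)
x = Conv2D(128, 3, activation='relu', padding='same')(x)
x = UpSampling2D(2)(x)                                     # (264, 544)
x = Conv2D(64, 3, activation='relu', padding='same')(x)
x = Conv2D(1, 3, activation='sigmoid', padding='same')(x)
output = Cropping2D(((3, 3), (2, 2)))(x)                   # back to (258, 540, 1), matching y_train
model = Model(input_layer, output)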
