I am trying to add an attention mechanism to the model below.
Is there really a need for CTC loss with an attention model?
How could I implement a BLSTM with an attention mechanism for an image OCR problem?
def ctc_lambda_func(args):
    y_pred, labels, input_length, label_length = args
    # the 2 is critical here since the first couple outputs of the RNN
    # tend to be garbage:
    y_pred = y_pred[:, 2:, :]
    return tf.keras.backend.ctc_batch_cost(labels, y_pred, input_length, label_length)
def get_Model(training):
    input_shape = (img_w, img_h, 1)  # (128, 64, 1)

    labels = Input(name='the_labels', shape=[max_text_len], dtype='float32')  # (None, 8)
    input_length = Input(name='input_length', shape=[1], dtype='int64')  # (None, 1)
    label_length = Input(name='label_length', shape=[1], dtype='int64')  # (None, 1)

    # Make Network
    inputs = Input(name='the_input', shape=input_shape, dtype='float32')  # (None, 128, 64, 1)

    # Convolution layer (VGG)
    inner = Conv2D(64, (3, 3), padding='same', name='conv1', kernel_initializer='he_normal')(inputs)  # (None, 128, 64, 64)
    inner = BatchNormalization()(inner)
    inner = Activation('relu')(inner)
    inner = MaxPooling2D(pool_size=(2, 2), name='max1')(inner)  # (None, 64, 32, 64)

    inner = Conv2D(128, (3, 3), padding='same', name='conv2', kernel_initializer='he_normal')(inner)  # (None, 64, 32, 128)
    inner = BatchNormalization()(inner)
    inner = Activation('relu')(inner)
    inner = MaxPooling2D(pool_size=(2, 2), name='max2')(inner)  # (None, 32, 16, 128)

    inner = Conv2D(256, (3, 3), padding='same', name='conv3', kernel_initializer='he_normal')(inner)  # (None, 32, 16, 256)
    inner = BatchNormalization()(inner)
    inner = Activation('relu')(inner)

    inner = Conv2D(256, (3, 3), padding='same', name='conv4', kernel_initializer='he_normal')(inner)  # (None, 32, 16, 256)
    inner = BatchNormalization()(inner)
    inner = Activation('relu')(inner)
    inner = MaxPooling2D(pool_size=(1, 2), name='max3')(inner)  # (None, 32, 8, 256)

    inner = Conv2D(512, (3, 3), padding='same', name='conv5', kernel_initializer='he_normal')(inner)  # (None, 32, 8, 512)
    inner = BatchNormalization()(inner)
    inner = Activation('relu')(inner)

    inner = Conv2D(512, (3, 3), padding='same', name='conv6')(inner)  # (None, 32, 8, 512)
    inner = BatchNormalization()(inner)
    inner = Activation('relu')(inner)
    inner = MaxPooling2D(pool_size=(1, 2), name='max4')(inner)  # (None, 32, 4, 512)

    inner = Conv2D(512, (2, 2), padding='same', kernel_initializer='he_normal', name='conv7')(inner)  # (None, 32, 4, 512)
    inner = BatchNormalization()(inner)
    inner = Activation('relu')(inner)

    # CNN to RNN
    inner = Reshape(target_shape=(32, 2048), name='reshape')(inner)  # (None, 32, 2048)
    inner = Dense(64, activation='relu', kernel_initializer='he_normal', name='dense1')(inner)  # (None, 32, 64)

    # RNN layer
    lstm1 = Bidirectional(LSTM(512, return_sequences=True, kernel_initializer='he_normal'), name='biLSTM1')(inner)
    lstm1_norm = BatchNormalization()(lstm1)
    lstm2 = Bidirectional(LSTM(512, return_sequences=True, kernel_initializer='he_normal'), name='biLSTM2')(lstm1_norm)
    lstm2_norm = BatchNormalization()(lstm2)

    # transforms RNN output to character activations:
    inner = Dense(num_classes, kernel_initializer='he_normal', name='dense2')(lstm2_norm)  # (None, 32, 63)
    y_pred = Activation('softmax', name='softmax')(inner)

    # Keras doesn't currently support loss functions with extra parameters,
    # so the CTC loss is implemented in a Lambda layer
    loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')([y_pred, labels, input_length, label_length])  # (None, 1)

    if training:
        return Model(inputs=[inputs, labels, input_length, label_length], outputs=loss_out)
    else:
        return Model(inputs=[inputs], outputs=y_pred)
Is lstm1 the encoder and lstm2 the decoder?
I haven't found any attention implementation using the Keras functional API, and it also seems there is no built-in Keras attention layer.
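As an aside: recent tf.keras releases do ship attention layers (tf.keras.layers.Attention for dot-product attention and tf.keras.layers.AdditiveAttention for Bahdanau-style). Attention and CTC are not mutually exclusive; one option is to keep the CTC head and add self-attention over the BLSTM outputs. A hedged sketch (assuming tf.keras >= 2.1 and the lstm2_norm tensor from the model above):

from tensorflow.keras.layers import AdditiveAttention, Concatenate

# lstm2_norm has shape (None, 32, 1024) in the model above
context = AdditiveAttention(name='attention')([lstm2_norm, lstm2_norm])  # self-attention, (None, 32, 1024)
inner = Concatenate(name='concat_attn')([lstm2_norm, context])           # (None, 32, 2048)
# feed `inner` into the existing Dense(num_classes) / softmax / CTC head

A full encoder-decoder attention model (with a character-level decoder that replaces CTC entirely) is a much larger rewrite than this.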
I am using a convolutional autoencoder for the MNIST image data (dimensions 28×28); here is my code:
from keras.layers import Input, Convolution2D, MaxPooling2D, UpSampling2D

input_img = Input(shape=(28, 28, 1))
x = Convolution2D(16, (5, 5), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(16, (5, 5), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Convolution2D(1, (3, 3), activation='sigmoid', padding='same')(x)
I get an error message (with padding='same' at each layer):
ValueError: Error when checking target: expected conv2d_148 to have shape (32, 32, 1) but got array with shape (28, 28, 1)
Here is my model summary
Layer (type) Output Shape Param #
input_20 (InputLayer) (None, 28, 28, 1) 0
conv2d_142 (Conv2D) (None, 28, 28, 16) 416
max_pooling2d_64 (MaxPooling (None, 14, 14, 16) 0
conv2d_143 (Conv2D) (None, 14, 14, 8) 1160
max_pooling2d_65 (MaxPooling (None, 7, 7, 8) 0
conv2d_144 (Conv2D) (None, 7, 7, 8) 584
max_pooling2d_66 (MaxPooling (None, 4, 4, 8) 0
conv2d_145 (Conv2D) (None, 4, 4, 8) 584
up_sampling2d_64 (UpSampling (None, 8, 8, 8) 0
conv2d_146 (Conv2D) (None, 8, 8, 8) 584
up_sampling2d_65 (UpSampling (None, 16, 16, 8) 0
conv2d_147 (Conv2D) (None, 16, 16, 16) 3216
up_sampling2d_66 (UpSampling (None, 32, 32, 16) 0
conv2d_148 (Conv2D) (None, 32, 32, 1) 145
Total params: 6,689
Trainable params: 6,689
Non-trainable params: 0
I know if I change the first layer to
x = Convolution2D(16, (3, 3), activation='relu', padding='same')(input_img)
it works, but I want to use a 5×5 convolution.
Why does this happen?
The mismatch comes from the pooling arithmetic, not from the first kernel: with padding='same', the encoder maps 28 → 14 → 7 → 4 (7/2 rounds up), and the decoder doubles back 4 → 8 → 16 → 32, so the network always outputs 32×32. You can change the last convolution to a (5, 5) kernel with padding='valid' to crop 32 back down to 28 (32 − 5 + 1 = 28):
from tensorflow.keras.layers import *
from tensorflow.keras import Model, Input
import numpy as np
input_img = Input(shape=(28, 28, 1))
x = Conv2D(16, (5, 5), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (5, 5), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (5, 5), activation='sigmoid', padding='valid')(x)
auto = Model(input_img, decoded)
auto.build(input_shape=(1, 28, 28, 1))
auto(np.random.rand(1, 28, 28, 1)).shape
TensorShape([1, 28, 28, 1])
Alternatively, use tf.keras.layers.Conv2DTranspose in the decoder.
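A minimal sketch of the transposed-convolution route (an assumption-laden simplification: it uses two pooling levels instead of three so that 28 divides evenly, 28 → 14 → 7, and strided Conv2DTranspose layers double the size back):

from tensorflow.keras.layers import Input, Conv2D, Conv2DTranspose, MaxPooling2D
from tensorflow.keras import Model

input_img = Input(shape=(28, 28, 1))
x = Conv2D(16, (5, 5), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)                                             # (14, 14, 16)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)                                       # (7, 7, 8)
x = Conv2DTranspose(8, (3, 3), strides=2, activation='relu', padding='same')(encoded)   # (14, 14, 8)
x = Conv2DTranspose(16, (3, 3), strides=2, activation='relu', padding='same')(x)        # (28, 28, 16)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)                    # (28, 28, 1)
auto = Model(input_img, decoded)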
I am using PyTorch 1.7 and Python 3.8 with the CIFAR-10 dataset. I am trying to create a block with: conv -> conv -> pool -> fc. The fully connected layer (fc) has 256 neurons. The code for this is as follows:
# Testing-
conv1 = nn.Conv2d(
    in_channels=3, out_channels=64,
    kernel_size=3, stride=1,
    padding=1, bias=True
)
conv2 = nn.Conv2d(
    in_channels=64, out_channels=64,
    kernel_size=3, stride=1,
    padding=1, bias=True
)
pool = nn.MaxPool2d(
    kernel_size=2, stride=2
)
fc1 = nn.Linear(
    in_features=64 * 16 * 16, out_features=256,
    bias=True
)
images.shape
# torch.Size([32, 3, 32, 32])
x = conv1(images)
x.shape
# torch.Size([32, 64, 32, 32])
x = conv2(x)
x.shape
# torch.Size([32, 64, 32, 32])
x = pool(x)
x.shape
# torch.Size([32, 64, 16, 16])
# This line of code gives error-
x = fc1(x)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (32768x16 and 16384x256)
What is going wrong?
You are nearly there! As you noticed, nn.MaxPool2d returns a tensor of shape (32, 64, 16, 16), which is incompatible with nn.Linear's expected input: a 2D tensor of shape (batch, in_features). You need to reshape it to (32, 64*16*16).
I would recommend using an nn.Flatten layer rather than reshaping yourself. It acts like x.view(x.size(0), -1) but is clearer. By default it keeps the first (batch) dimension and flattens the rest:
conv1 = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1)
conv2 = nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3, stride=1, padding=1)
pool = nn.MaxPool2d(kernel_size=2, stride=2)
flatten = nn.Flatten()
fc1 = nn.Linear(in_features=64*16*16, out_features=256)
x = conv1(images)
x = conv2(x)
x = pool(x)
x = flatten(x)
x = fc1(x)
Alternatively, you could use the functional alternative torch.flatten, where you will have to provide the start_dim as 1: x = torch.flatten(x, start_dim=1).
When you're done debugging, you could assemble your layers with nn.Sequential:
model = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1),
    nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3, stride=1, padding=1),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Flatten(),
    nn.Linear(in_features=64*16*16, out_features=256)
)
x = model(images)
You need to flatten the output of the nn.MaxPool2d layer before feeding it to the nn.Linear layer.
Try x = x.view(x.size(0), -1) before the fc layer to flatten the tensor, as shown below.
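In context, using the shapes from the question:

x = pool(x)                # torch.Size([32, 64, 16, 16])
x = x.view(x.size(0), -1)  # torch.Size([32, 16384])
x = fc1(x)                 # torch.Size([32, 256])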
Can anyone please help me convert this model to PyTorch? I already tried converting from Keras to PyTorch following How can I convert this keras cnn model to pytorch version, but the training results were different. Thank you.
input_3d = (1, 64, 96, 96)
pool_3d = (2, 2, 2)
model = Sequential()
model.add(Convolution3D(8, 3, 3, 3, name='conv1', input_shape=input_3d,
                        data_format='channels_first'))
model.add(MaxPooling3D(pool_size=pool_3d, name='pool1'))
model.add(Convolution3D(8, 3, 3, 3, name='conv2',data_format='channels_first'))
model.add(MaxPooling3D(pool_size=pool_3d, name='pool2'))
model.add(Convolution3D(8, 3, 3, 3, name='conv3',data_format='channels_first'))
model.add(MaxPooling3D(pool_size=pool_3d, name='pool3'))
model.add(Flatten())
model.add(Dense(2000, activation='relu', name='dense1'))
model.add(Dropout(0.5, name='dropout1'))
model.add(Dense(500, activation='relu', name='dense2'))
model.add(Dropout(0.5, name='dropout2'))
model.add(Dense(3, activation='softmax', name='softmax'))
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1 (Conv3D) (None, 8, 60, 94, 94) 224
_________________________________________________________________
pool1 (MaxPooling3D) (None, 8, 30, 47, 47) 0
_________________________________________________________________
conv2 (Conv3D) (None, 8, 28, 45, 45) 1736
_________________________________________________________________
pool2 (MaxPooling3D) (None, 8, 14, 22, 22) 0
_________________________________________________________________
conv3 (Conv3D) (None, 8, 12, 20, 20) 1736
_________________________________________________________________
pool3 (MaxPooling3D) (None, 8, 6, 10, 10) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 4800) 0
_________________________________________________________________
dense1 (Dense) (None, 2000) 9602000
_________________________________________________________________
dropout1 (Dropout) (None, 2000) 0
_________________________________________________________________
dense2 (Dense) (None, 500) 1000500
_________________________________________________________________
dropout2 (Dropout) (None, 500) 0
_________________________________________________________________
softmax (Dense) (None, 3) 1503
=================================================================
Your PyTorch equivalent of the Keras model would look like this:
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.maxpool = nn.MaxPool3d((2, 2, 2))
        self.conv1 = nn.Conv3d(in_channels=1, out_channels=8, kernel_size=3)
        self.conv2 = nn.Conv3d(in_channels=8, out_channels=8, kernel_size=3)
        self.conv3 = nn.Conv3d(in_channels=8, out_channels=8, kernel_size=3)
        self.linear1 = nn.Linear(4800, 2000)
        self.dropout1 = nn.Dropout(0.5)  # plain Dropout: the inputs here are 2D (batch, features)
        self.linear2 = nn.Linear(2000, 500)
        self.dropout2 = nn.Dropout(0.5)
        self.linear3 = nn.Linear(500, 3)

    def forward(self, x):
        out = self.maxpool(self.conv1(x))
        out = self.maxpool(self.conv2(out))
        out = self.maxpool(self.conv3(out))

        # Flattening process
        b, c, d, h, w = out.size()  # batch_size, channels, depth, height, width
        out = out.view(-1, c * d * h * w)

        # the Keras Dense layers use activation='relu', so apply it here as well
        out = self.dropout1(torch.relu(self.linear1(out)))
        out = self.dropout2(torch.relu(self.linear2(out)))
        out = self.linear3(out)
        out = torch.softmax(out, 1)
        return out
A driver program to test the model:
inputs = torch.randn(8, 1, 64, 96, 96)
model = CNN()
outputs = model(inputs)
print(outputs.shape) # torch.Size([8, 3])
You can save the Keras weights and reload them in PyTorch.
The steps are:
Step 0: Train a Model in Keras. ...
Step 1: Recreate & Initialize Your Model Architecture in PyTorch. ...
Step 2: Import Your Keras Model and Copy the Weights. ...
Step 3: Load Those Weights onto Your PyTorch Model. ...
Step 4: Test and Save Your PyTorch Model.
A sketch of steps 2 and 3 follows. You can follow the full example here: https://gereshes.com/2019/06/24/how-to-transfer-a-simple-keras-model-to-pytorch-the-hard-way/
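A minimal sketch of steps 2 and 3, assuming keras_model and pytorch_model are the two models above (the transpose converts the Keras (kd, kh, kw, in_ch, out_ch) Conv3D kernel layout to PyTorch's (out_ch, in_ch, kd, kh, kw)):

import numpy as np
import torch

# Step 2: pull the weights out of Keras, in layer order
keras_weights = keras_model.get_weights()  # list of numpy arrays

# Step 3: copy, e.g., the first Conv3D kernel and bias into conv1
w, b = keras_weights[0], keras_weights[1]
with torch.no_grad():
    pytorch_model.conv1.weight.copy_(torch.from_numpy(np.transpose(w, (4, 3, 0, 1, 2))))
    pytorch_model.conv1.bias.copy_(torch.from_numpy(b))

Repeat for the remaining conv and linear layers (Keras Dense kernels are (in, out), so they need a plain transpose before being copied into nn.Linear's (out, in) weight).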
I am following a Keras tutorial and want to shadow it in PyTorch, so I am translating. I'm not strongly familiar with either and am coming unstuck, especially on the input size parameter, but also on the final layer: do I need another Linear layer? Can anyone translate the following to a PyTorch sequential definition?
visible = Input(shape=(64,64,1))
conv1 = Conv2D(32, kernel_size=4, activation='relu')(visible)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
conv2 = Conv2D(16, kernel_size=4, activation='relu')(pool1)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
hidden1 = Dense(10, activation='relu')(pool2)
output = Dense(1, activation='sigmoid')(hidden1)
model = Model(inputs=visible, outputs=output)
This is the output of the model:
Layer (type) Output Shape Param #
_________________________________________________________________
input_1 (InputLayer) (None, 64, 64, 1) 0
conv2d_1 (Conv2D) (None, 61, 61, 32) 544
max_pooling2d_1 (MaxPooling2 (None, 30, 30, 32) 0
conv2d_2 (Conv2D) (None, 27, 27, 16) 8208
max_pooling2d_2 (MaxPooling2 (None, 13, 13, 16) 0
dense_1 (Dense) (None, 13, 13, 10) 170
dense_2 (Dense) (None, 13, 13, 1) 11
Total params: 8,933
Trainable params: 8,933
Non-trainable params: 0
What I have worked out lacks a specification for the input shape, and I am also a bit perplexed by the translation of stride in the specified Keras model: it uses stride 2 in the MaxPooling2D but doesn't specify a stride elsewhere; it is perhaps a toy example. A corrected sketch follows the code.
model = nn.Sequential(
    nn.Conv2d(1, 32, 4),
    nn.ReLU(),
    nn.MaxPool2d(2, 2),
    nn.Conv2d(1, 16, 4),
    nn.ReLU(),
    nn.MaxPool2d(2, 2),
    nn.Linear(10, 1),
    nn.Sigmoid(),
)
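A hedged correction of the attempt above: the second Conv2d needs in_channels=32 (it receives the 32 maps the first conv produces), and in the Keras summary the Dense layers act on the channel axis of the (13, 13, 16) map (170 and 11 parameters), which corresponds to 1×1 convolutions in PyTorch rather than an nn.Linear on a flattened vector:

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=4),   # (N, 32, 61, 61)
    nn.ReLU(),
    nn.MaxPool2d(2, 2),                # (N, 32, 30, 30)
    nn.Conv2d(32, 16, kernel_size=4),  # (N, 16, 27, 27)
    nn.ReLU(),
    nn.MaxPool2d(2, 2),                # (N, 16, 13, 13)
    nn.Conv2d(16, 10, kernel_size=1),  # Dense(10) per position: 170 params
    nn.ReLU(),
    nn.Conv2d(10, 1, kernel_size=1),   # Dense(1) per position: 11 params
    nn.Sigmoid(),
)

print(model(torch.randn(1, 1, 64, 64)).shape)  # torch.Size([1, 1, 13, 13])

If a single scalar output per image is what you actually want, insert nn.Flatten() after the second pooling and use nn.Linear(16 * 13 * 13, 10) followed by nn.Linear(10, 1), but note that this no longer matches the Keras parameter counts.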
I am getting an error while running the following code in Keras:
Traceback (most recent call last):
  File "my_conv_ae.py", line 74, in <module>
    validation_steps = nb_validation_samples // batch_size)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\keras\legacy\interfaces.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\keras\engine\training.py", line 1890, in fit_generator
    class_weight=class_weight)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\keras\engine\training.py", line 1627, in train_on_batch
    check_batch_axis=True)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\keras\engine\training.py", line 1309, in _standardize_user_data
    exception_prefix='target')
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\site-packages\keras\engine\training.py", line 127, in _standardize_input_data
    str(array.shape))
ValueError: Error when checking target: expected conv2d_transpose_8 to have 4 dimensions, but got array with shape (20, 1)
The code is:
import keras
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D, Conv2DTranspose
from keras.models import Model
from keras import backend as K
from keras.preprocessing.image import ImageDataGenerator
import numpy as np
input_img = Input(shape=(512, 512, 1))
nb_train_samples = 1700
nb_validation_samples = 420
epochs = 10
batch_size = 20
x = Conv2D(64, (11, 11), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(input_img)
x = Conv2D(64, (11, 11), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(128, (7, 7), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = Conv2D(128, (5, 5), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(256, (5, 5), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = Conv2D(256, (3, 3), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(512, (3, 3), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = Conv2D(512, (3, 3), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
encoded = MaxPooling2D((2, 2))(x)
print (K.int_shape(encoded))
# at this point the representation is (26, 26, 512)
x = UpSampling2D((2, 2))(encoded)
x = Conv2DTranspose(512, (3, 3), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = Conv2DTranspose(512, (3, 3), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2DTranspose(256, (3, 3), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = Conv2DTranspose(256, (5, 5), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2DTranspose(128, (5, 5), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = Conv2DTranspose(128, (7, 7), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2DTranspose(64, (11, 11), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
decoded = Conv2DTranspose(1, (11, 11), activation='relu', strides= 1, padding='valid', kernel_initializer='glorot_uniform')(x)
print (K.int_shape(decoded))
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer = 'adadelta', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1./255)

x_train = train_datagen.flow_from_directory(
    'data/train',
    target_size=(512, 512), color_mode='grayscale',
    batch_size=batch_size,
    class_mode='binary')

x_test = test_datagen.flow_from_directory(
    'data/validation',
    target_size=(512, 512), color_mode='grayscale',
    batch_size=batch_size,
    class_mode='binary')

autoencoder.fit_generator(
    x_train,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=x_test,
    validation_steps=nb_validation_samples // batch_size)
decoded_imgs = autoencoder.predict(x_test)
The summary of the model is as follows:
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 512, 512, 1) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 502, 502, 64) 7808
_________________________________________________________________
conv2d_2 (Conv2D) (None, 492, 492, 64) 495680
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 246, 246, 64) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 240, 240, 128) 401536
_________________________________________________________________
conv2d_4 (Conv2D) (None, 236, 236, 128) 409728
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 118, 118, 128) 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, 114, 114, 256) 819456
_________________________________________________________________
conv2d_6 (Conv2D) (None, 112, 112, 256) 590080
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 56, 56, 256) 0
_________________________________________________________________
conv2d_7 (Conv2D) (None, 54, 54, 512) 1180160
_________________________________________________________________
conv2d_8 (Conv2D) (None, 52, 52, 512) 2359808
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 26, 26, 512) 0
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 52, 52, 512) 0
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 54, 54, 512) 2359808
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 56, 56, 512) 2359808
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 112, 112, 512) 0
_________________________________________________________________
conv2d_transpose_3 (Conv2DTr (None, 114, 114, 256) 1179904
_________________________________________________________________
conv2d_transpose_4 (Conv2DTr (None, 118, 118, 256) 1638656
_________________________________________________________________
up_sampling2d_3 (UpSampling2 (None, 236, 236, 256) 0
_________________________________________________________________
conv2d_transpose_5 (Conv2DTr (None, 240, 240, 128) 819328
_________________________________________________________________
conv2d_transpose_6 (Conv2DTr (None, 246, 246, 128) 802944
_________________________________________________________________
up_sampling2d_4 (UpSampling2 (None, 492, 492, 128) 0
_________________________________________________________________
conv2d_transpose_7 (Conv2DTr (None, 502, 502, 64) 991296
_________________________________________________________________
conv2d_transpose_8 (Conv2DTr (None, 512, 512, 1) 7745
=================================================================
Total params: 16,423,745
Trainable params: 16,423,745
Non-trainable params: 0
_________________________________________________________________
Please help me. Is this because of the Conv2DTranspose() layers which I have used for decoding?
It's definitely not a problem with the model architecture itself (it works on my side). It seems to be a problem with your ground-truth data: for an autoencoder, the target must have the same dimensions as the input image, but flow_from_directory with class_mode='binary' yields class labels of shape (batch_size, 1) instead, which is where the (20, 1) in the error comes from. I guess you need to use your own custom data generator.
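A minimal sketch of such a generator (class_mode=None makes flow_from_directory yield batches of images with no labels; newer Keras versions also offer class_mode='input', which pairs each image with itself directly):

def autoencoder_flow(datagen, directory, batch_size):
    gen = datagen.flow_from_directory(
        directory,
        target_size=(512, 512), color_mode='grayscale',
        batch_size=batch_size,
        class_mode=None)  # images only, no class labels
    for batch in gen:
        yield batch, batch  # the target is the input image itself

autoencoder.fit_generator(
    autoencoder_flow(train_datagen, 'data/train', batch_size),
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=autoencoder_flow(test_datagen, 'data/validation', batch_size),
    validation_steps=nb_validation_samples // batch_size)

Note also that 'sparse_categorical_crossentropy' is an odd loss for image reconstruction; 'mse' or 'binary_crossentropy' (with a sigmoid output layer) would be more usual.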