How does the 1D convolution calculation actually work? - keras

I tried to implement a 1D convolution with dilation:
#keras.layers.Conv1D(filters, kernel_size, strides=1, padding='valid', data_format='channels_last', dilation_rate=1, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)
# valid , causal , same
conv = layers.Conv1D(1, 3, padding='same',
I want to understand how this dilated 1D convolution actually produces its output.
Let's say we have an input of
array([0. , 0.32380696, 0.61272254, 0.83561502, 0.96846692])
and a convolution filter of
array([-0.56509803, 0.89481053, 0.6975754 ])
When we pass the input through the convolution, the output is
output = conv(sequence)
array([0. , 0.22587977, 0.71716606, 0.94819239, 1.07704752])
I am trying to implement the WaveNet 1D convolution with dilation.
I want to know how these output values are calculated,
and what happens if the number of filters and kernel_size change to different numbers?
conv = layers.Conv1D(2, 3, padding='causal',
conv = layers.Conv1D(3, 3, padding='causal',
conv = layers.Conv1D(1, 3, padding='causal',
conv = layers.Conv1D(2, 3, padding='same',
conv = layers.Conv1D(3, 3, padding='same',
conv = layers.Conv1D(1, 3, padding='same',
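For readers who want to reproduce the numbers by hand, here is a minimal pure-Python sketch of a single-channel, single-filter 1D convolution with stride 1 and no bias, computed as a cross-correlation (the kernel is not flipped), which is how Keras applies it. The function name conv1d_1ch and its arguments are illustrative and not part of Keras. Under these assumptions, the output quoted above appears to match padding='causal' with dilation_rate=1 rather than padding='same'.

```python
def conv1d_1ch(x, w, dilation=1, padding="causal"):
    """Single-channel 1D cross-correlation with stride 1 and no bias."""
    k = len(w)
    span = dilation * (k - 1)            # extra samples covered by the dilated kernel
    if padding == "causal":
        left, right = span, 0            # pad only on the left, so no look-ahead
    elif padding == "same":
        left = span // 2
        right = span - left
    elif padding == "valid":
        left, right = 0, 0
    else:
        raise ValueError("unknown padding: %r" % padding)
    xp = [0.0] * left + list(x) + [0.0] * right
    n_out = len(xp) - span
    # Each output is a weighted sum of k inputs spaced `dilation` apart.
    return [sum(w[j] * xp[t + j * dilation] for j in range(k))
            for t in range(n_out)]


x = [0.0, 0.32380696, 0.61272254, 0.83561502, 0.96846692]
w = [-0.56509803, 0.89481053, 0.6975754]

causal = conv1d_1ch(x, w, dilation=1, padding="causal")
same = conv1d_1ch(x, w, dilation=1, padding="same")
dilated = conv1d_1ch(x, w, dilation=2, padding="causal")

print([round(v, 8) for v in causal])
print([round(v, 8) for v in same])
print([round(v, 8) for v in dilated])
```

With causal padding and dilation d, each output is y[t] = w[0]*x[t-2d] + w[1]*x[t-d] + w[2]*x[t], with zeros for indices before the start. Changing kernel_size changes the length of w. Changing the number of filters to F means Keras holds F such kernels (its kernel tensor has shape (kernel_size, in_channels, filters)) and the output gets F channels, each computed the same way.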


Resnet autoencoder in keras I/O issues

I am trying to code a deep autoencoder in Keras. My images have shape (4575,32,32,3) and my targets have shape (4575,1).
Here's the function:
def build_deep_autoencoder(img_shape, code_size):
    H,W,C = img_shape
    # encoder
    encoder = Sequential()
    encoder.add(Dense(512, activation='relu'))
    encoder.add(Dense(256, activation='relu'))
    # decoder
    decoder = Sequential()
    decoder.add(Reshape((2, 2, 256)))
    decoder.add(Conv2DTranspose(filters=128, kernel_size=(3, 3), strides=2, activation='elu', padding='same'))
    decoder.add(Conv2DTranspose(filters=64, kernel_size=(3, 3), strides=2, activation='elu', padding='same'))
    decoder.add(Conv2DTranspose(filters=32, kernel_size=(3, 3), strides=2, activation='elu', padding='same'))
    decoder.add(Conv2DTranspose(filters=3, kernel_size=(3, 3), strides=2, activation=None, padding='same'))
    return encoder, decoder
encoder,decoder = build_deep_autoencoder(img_shape,code_size=2)
inp = L.Input(img_shape)
code = encoder(inp)
reconstruction = decoder(code)
autoencoder = tensorflow.keras.models.Model(inp,reconstruction)
I am getting an error:
InvalidArgumentError: Incompatible shapes: [31,32,32,3] vs. [31,1]
[[{{node training_18/Nadam/gradients/loss_12/sequential_28_loss/MeanSquaredError/sub_grad/BroadcastGradientArgs}}]]
I am using tensorflow.python.keras
Any help would be appreciated.
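The error message says the loss tried to combine a batch of reconstructions of shape [31,32,32,3] with a batch of targets of shape [31,1]. A likely cause, assuming the (4575,1) target array was passed as y during training, is that an autoencoder's reconstruction loss expects the images themselves as targets. The sketch below applies the usual right-aligned broadcasting rule to both pairs of shapes; broadcast_shape is a hypothetical helper written for illustration, not a Keras or TensorFlow function.

```python
def broadcast_shape(a, b):
    """Return the broadcast result of two shapes, or None if they are incompatible."""
    result = []
    # Compare dimensions from the right; missing leading dimensions count as 1.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            return None
        result.append(max(da, db))
    return tuple(reversed(result))


reconstruction = (31, 32, 32, 3)   # decoder output for one batch
labels = (31, 1)                   # class targets for the same batch
images = (31, 32, 32, 3)           # the input images themselves

print(broadcast_shape(reconstruction, labels))   # incompatible shapes
print(broadcast_shape(reconstruction, images))   # shapes agree
```

If that assumption holds, passing the image array as both input and target (for example fit(x=images, y=images)) would give the mean squared error two tensors of the same shape.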

Adding CTC Loss and CTC decode to a Keras model

I am trying to solve a handwritten text recognition use case. I have used a CNN and LSTMs to build a network, and its output needs to be fed to a CTC layer. I could find some code to do this in native TensorFlow. Is there an easier option for this in Keras?
model = Sequential()
model.add(Conv2D(64, kernel_size=(5,5),activation = 'relu', input_shape=(128,32,1), padding='same', data_format='channels_last'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Conv2D(128, kernel_size=(5,5),activation = 'relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Conv2D(256, kernel_size=(5,5),activation = 'relu', padding='same'))
model.add(Conv2D(256, kernel_size=(5,5),activation = 'relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(1,2),padding='same'))
model.add(Conv2D(512, kernel_size=(5,5),activation = 'relu', padding='same'))
model.add(Conv2D(512, kernel_size=(5,5),activation = 'relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(1,2),padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(1,1)))
model.add(Conv2D(512, kernel_size=(5,5),activation = 'relu', padding='same'))
model.add(Lambda(lambda x: x[:, :, 0, :], output_shape=(None,31,512), mask=None, arguments=None))
#model.add(Bidirectional(LSTM(256, return_sequences=True), input_shape=(31, 256)))
model.add(Bidirectional(LSTM(128, return_sequences=True)))
model.add(Bidirectional(LSTM(128, return_sequences=True)))
model.add(Dense(75, activation = 'softmax'))
Any help on how we can easily add CTC loss and decode layers to this would be great.
A CTC loss function requires four arguments to compute the loss: the predicted outputs, the ground-truth labels, the input sequence length fed to the LSTM, and the ground-truth label length. To provide these, we need to create a custom loss function and pass it to the model. To make it compatible with the model you defined, we create a model that takes these four inputs and outputs the loss. That model is used for training; for testing, the model you created earlier can be used.
Let's build the Keras model slightly differently, so that we can create two versions of it: one for training time and one for test time.
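# Imports assumed by the code below; char_list and max_label_len come from your dataset
from keras import backend as K
from keras.layers import (Input, Conv2D, MaxPool2D, BatchNormalization,
                          Lambda, Bidirectional, LSTM, Dense)
from keras.models import Model
# input with shape of height=32 and width=128
inputs = Input(shape=(32, 128, 1))
# convolution layer with kernel size (3,3)
conv_1 = Conv2D(64, (3, 3), activation='relu', padding='same')(inputs)
# pooling layer with kernel size (2,2)
pool_1 = MaxPool2D(pool_size=(2, 2), strides=2)(conv_1)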
conv_2 = Conv2D(128, (3, 3), activation='relu', padding='same')(pool_1)
pool_2 = MaxPool2D(pool_size=(2, 2), strides=2)(conv_2)
conv_3 = Conv2D(256, (3, 3), activation='relu', padding='same')(pool_2)
conv_4 = Conv2D(256, (3, 3), activation='relu', padding='same')(conv_3)
# pooling layer with kernel size (2,1)
pool_4 = MaxPool2D(pool_size=(2, 1))(conv_4)
conv_5 = Conv2D(512, (3, 3), activation='relu', padding='same')(pool_4)
# Batch normalization layer
batch_norm_5 = BatchNormalization()(conv_5)
conv_6 = Conv2D(512, (3, 3), activation='relu', padding='same')(batch_norm_5)
batch_norm_6 = BatchNormalization()(conv_6)
pool_6 = MaxPool2D(pool_size=(2, 1))(batch_norm_6)
conv_7 = Conv2D(512, (2, 2), activation='relu')(pool_6)
squeezed = Lambda(lambda x: K.squeeze(x, 1))(conv_7)
# bidirectional LSTM layers with units=128
blstm_1 = Bidirectional(LSTM(128, return_sequences=True, dropout=0.2))(squeezed)
blstm_2 = Bidirectional(LSTM(128, return_sequences=True, dropout=0.2))(blstm_1)
outputs = Dense(len(char_list) + 1, activation='softmax')(blstm_2)
# model to be used at test time
test_model = Model(inputs, outputs)
We will use a CTC loss function during training, so let's implement it and create a training model around it:
labels = Input(name='the_labels', shape=[max_label_len], dtype='float32')
input_length = Input(name='input_length', shape=[1], dtype='int64')
label_length = Input(name='label_length', shape=[1], dtype='int64')
def ctc_lambda_func(args):
    y_pred, labels, input_length, label_length = args
    return K.ctc_batch_cost(labels, y_pred, input_length, label_length)

loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')(
    [outputs, labels, input_length, label_length])
# model to be used at training time
training_model = Model(inputs=[inputs, labels, input_length, label_length], outputs=loss_out)
Train this model and save the weights to an .h5 file.
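The training model takes three inputs besides the images: padded labels, the input length, and the label length. The sketch below shows one way these might be assembled for a batch in plain Python. The helper name make_ctc_inputs, the example alphabet, and the choice of padding value are assumptions for illustration. The value time_steps=31 follows from the layer shapes in the model above (width 128 becomes 64, then 32 after the two (2,2) pools, then 31 after the final (2,2) convolution without padding).

```python
def make_ctc_inputs(texts, char_list, max_label_len, time_steps):
    """Encode texts as padded index lists plus the two length inputs."""
    pad_value = len(char_list)          # assumption: pad with the extra (blank) index
    labels, input_length, label_length = [], [], []
    for text in texts:
        encoded = [char_list.index(ch) for ch in text]
        if len(encoded) > max_label_len:
            raise ValueError("label longer than max_label_len: %r" % text)
        labels.append(encoded + [pad_value] * (max_label_len - len(encoded)))
        input_length.append([time_steps])       # same for every sample
        label_length.append([len(encoded)])     # number of real entries in the label
    return labels, input_length, label_length


char_list = "abcdefghijklmnopqrstuvwxyz"        # hypothetical alphabet
labels, input_length, label_length = make_ctc_inputs(
    ["cat", "hello"], char_list, max_label_len=6, time_steps=31)

print(labels)
print(input_length)
print(label_length)
```

Because the training model's output is already the loss value, a common pattern is to compile it with a loss function that simply returns y_pred and to pass a dummy target array of zeros when fitting.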
Then use the test model and load the saved weights of the training model with the argument by_name=True, so that weights are loaded only for the matching layers.
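At test time the softmax output still has to be decoded into text. Keras offers K.ctc_decode for this; the sketch below only illustrates the logic of best-path (greedy) decoding in plain Python: take the most likely class at each time step, collapse consecutive repeats, and drop blanks. It assumes the blank is the last class index, matching the len(char_list) + 1 outputs of the model above. The function name and the toy alphabet are hypothetical.

```python
def ctc_greedy_decode(probs, char_list):
    """Best-path CTC decoding for one sequence of per-step class probabilities."""
    blank = len(char_list)              # assumption: blank is the last class index
    best_path = [max(range(len(step)), key=step.__getitem__) for step in probs]
    decoded = []
    previous = None
    for idx in best_path:
        # Keep a symbol only when it differs from the previous step and is not blank.
        if idx != previous and idx != blank:
            decoded.append(char_list[idx])
        previous = idx
    return "".join(decoded)


def one_hot(index, size):
    return [1.0 if j == index else 0.0 for j in range(size)]


char_list = "abc"                        # toy alphabet; index 3 is the blank
path = [0, 0, 3, 0, 1, 1, 3, 3, 2]       # a a _ a b b _ _ c
probs = [one_hot(i, len(char_list) + 1) for i in path]

decoded_text = ctc_greedy_decode(probs, char_list)
print(decoded_text)
```

The blank between the second and third "a" is what allows a repeated character to survive the collapse step.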

Imbalanced data for semantic segmentation in Keras?

I am new to Keras and have been learning it for about 3 weeks now. I apologize if my question sounds a bit stupid.
I am currently doing semantic segmentation of 512x512 medical images. I'm using UNet from this link . Basically, I want to segment a brain from an image (so two-class segmentation, background vs foreground).
I have made a few modifications to the network and I'm getting some results which I am happy with. But I think I can improve the segmentation results by imposing more weight on the foreground, because the number of brain pixels is much smaller than the number of background pixels. In some cases the brain does not appear in the image at all, especially in the bottom slices.
I don't know which part of the code I need to modify in
I would really appreciate it if anyone can help me with this. Thanks a lot in advance!
import numpy as np
import os
import as io
import skimage.transform as trans
import numpy as np
from keras.models import *
from keras.layers import *
from keras.optimizers import *
from keras.callbacks import ModelCheckpoint, LearningRateScheduler
from keras import backend as keras
def unet(pretrained_weights=None, input_size=(256, 256, 1)):
    inputs = Input(input_size)
    conv1 = Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(inputs)
    conv1 = BatchNormalization()(conv1)
    conv1 = Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv1)
    conv1 = BatchNormalization()(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool1)
    conv2 = BatchNormalization()(conv2)
    conv2 = Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv2)
    conv2 = BatchNormalization()(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    conv3 = Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool2)
    conv3 = BatchNormalization()(conv3)
    conv3 = Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv3)
    conv3 = BatchNormalization()(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
    conv4 = Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool3)
    conv4 = BatchNormalization()(conv4)
    conv4 = Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv4)
    conv4 = BatchNormalization()(conv4)
    drop4 = Dropout(0.5)(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(drop4)
    conv5 = Conv2D(1024, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool4)
    conv5 = BatchNormalization()(conv5)
    conv5 = Conv2D(1024, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv5)
    conv5 = BatchNormalization()(conv5)
    drop5 = Dropout(0.5)(conv5)
    up6 = Conv2D(512, 2, activation='relu', padding='same', kernel_initializer='he_normal')(
        UpSampling2D(size=(2, 2))(drop5))
    merge6 = concatenate([drop4, up6], axis=3)
    conv6 = Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge6)
    conv6 = BatchNormalization()(conv6)
    conv6 = Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv6)
    conv6 = BatchNormalization()(conv6)
    up7 = Conv2D(256, 2, activation='relu', padding='same', kernel_initializer='he_normal')(UpSampling2D(size=(2, 2))(conv6))
    merge7 = concatenate([conv3, up7], axis=3)
    conv7 = Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge7)
    conv7 = BatchNormalization()(conv7)
    conv7 = Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv7)
    conv7 = BatchNormalization()(conv7)
    up8 = Conv2D(128, 2, activation='relu', padding='same', kernel_initializer='he_normal')(UpSampling2D(size=(2, 2))(conv7))
    merge8 = concatenate([conv2, up8], axis=3)
    conv8 = Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge8)
    conv8 = BatchNormalization()(conv8)
    conv8 = Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv8)
    conv8 = BatchNormalization()(conv8)
    up9 = Conv2D(64, 2, activation='relu', padding='same', kernel_initializer='he_normal')(UpSampling2D(size=(2, 2))(conv8))
    merge9 = concatenate([conv1, up9], axis=3)
    conv9 = Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(merge9)
    conv9 = BatchNormalization()(conv9)
    conv9 = Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv9)
    conv9 = BatchNormalization()(conv9)
    conv9 = Conv2D(2, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv9)
    conv9 = BatchNormalization()(conv9)
    conv10 = Conv2D(1, 1, activation='sigmoid')(conv9)
    model = Model(inputs=inputs, outputs=conv10)
    model.compile(optimizer=Adam(lr=1e-4), loss='binary_crossentropy', metrics=['accuracy'])
    # model.summary()
    if (pretrained_weights):
        # Load pretrained weights when a path is given
        model.load_weights(pretrained_weights)
    return model
Here's the
from model2 import *
from data2 import *
from keras.models import load_model
class_weight= {0:0.10, 1:0.90}
myGene = trainGenerator(2,'data/brainTIF/trainNew','image','label',save_to_dir = None)
model = unet()
model_checkpoint = ModelCheckpoint('unet_brainTest_e10_s5.hdf5',
model.fit_generator(myGene,steps_per_epoch=5,epochs=10,callbacks = [model_checkpoint])
testGene = testGenerator("data/brainTIF/test3")
results = model.predict_generator(testGene,18,verbose=1)
As an alternative to class_weight for binary classes, you can also handle imbalanced classes with the Synthetic Minority Oversampling Technique (SMOTE), which increases the size of the minority group:
from imblearn.over_sampling import SMOTE
sm = SMOTE()
# fit_resample replaces fit_sample, which newer imbalanced-learn releases no longer provide
x, y = sm.fit_resample(X_train, Y_train)
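Since the question asks about putting more weight on foreground pixels, a per-pixel weighted binary cross-entropy is another option besides resampling. This is a different technique from the SMOTE suggestion above, shown here only as plain-Python arithmetic. The function name weighted_bce is hypothetical and not a Keras API, and the 0.10 / 0.90 weights are taken from the class_weight dictionary in the question. In Keras the same formula could be written with backend operations and passed as a custom loss.

```python
import math


def weighted_bce(y_true, y_pred, w_background=0.10, w_foreground=0.90, eps=1e-7):
    """Mean binary cross-entropy with a separate weight for each class."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)          # avoid log(0)
        total += -(w_foreground * t * math.log(p)
                   + w_background * (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


# Four pixels: three background (0) and one foreground (1).
y_true = [0, 0, 0, 1]

# Prediction A misses the foreground pixel; prediction B misses one background pixel.
miss_foreground = weighted_bce(y_true, [0.1, 0.1, 0.1, 0.1])
miss_background = weighted_bce(y_true, [0.9, 0.1, 0.1, 0.9])

print(miss_foreground, miss_background)
```

With these weights, missing the single foreground pixel costs more than missing a background pixel, which is the effect the question is after.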

Error when checking target: expected conv2d_29 to have 4 dimensions, but got array with shape (1255, 12)

I would like to train a deep learning model whose input images have shape (224,224,3), and I would like to feed them into a U-Net model.
When I train it, I get the error: Error when checking target: expected conv2d_29 to have 4 dimensions, but got array with shape (1255, 12)
I'm confused, since I'm sure the image array and the labels have no issues. Is the issue within the model? How should I resolve this?
The model is as below:
#def unet(pretrained_weights = None, input_size = (224,224,3)):
concat_axis = 3
input_size= Input((224,224,3))
conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(input_size)
conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv1)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
#flat1 = Flatten()(pool1)
conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool1)
conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv2)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool2)
conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv3)
pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool3)
conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv4)
drop4 = Dropout(0.5)(conv4)
pool4 = MaxPooling2D(pool_size=(2, 2))(drop4)
conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool4)
conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv5)
drop5 = Dropout(0.5)(conv5)
up_conv5 = UpSampling2D(size=(2, 2), data_format="channels_last")(conv5)
ch, cw = get_crop_shape(conv4, up_conv5)
crop_conv4 = Cropping2D(cropping=(ch,cw), data_format="channels_last")(conv4)
up6 = concatenate([up_conv5, crop_conv4], axis=concat_axis)
conv6 = Conv2D(256, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(up6)
conv6 = Conv2D(256, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv6)
up_conv6 = UpSampling2D(size=(2, 2), data_format="channels_last")(conv6)
ch, cw = get_crop_shape(conv3, up_conv6)
crop_conv3 = Cropping2D(cropping=(ch,cw), data_format="channels_last")(conv3)
up7 = concatenate([up_conv6, crop_conv3], axis=concat_axis)
conv7 = Conv2D(128, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(up7)
conv7 = Conv2D(128, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv7)
up_conv7 = UpSampling2D(size=(2, 2), data_format="channels_last")(conv7)
ch, cw = get_crop_shape(conv2, up_conv7)
crop_conv2 = Cropping2D(cropping=(ch,cw), data_format="channels_last")(conv2)
up8 = concatenate([up_conv7, crop_conv2], axis=concat_axis)
conv8 = Conv2D(64, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(up8)
conv8 = Conv2D(64, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv8)
up_conv8 = UpSampling2D(size=(2, 2), data_format="channels_last")(conv8)
ch, cw = get_crop_shape(conv1, up_conv8)
crop_conv1 = Cropping2D(cropping=(ch,cw), data_format="channels_last")(conv1)
up9 = concatenate([up_conv8, crop_conv1], axis=concat_axis)
conv9 = Conv2D(32, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(up9)
conv9 = Conv2D(32, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv9)
model = Model(inputs = input_size, outputs = conv9)
Since the model's output layer is a conv layer, its output has 4 dimensions: (batch_size, height, width, channels). But you are feeding a target array of shape (1255, 12). If the target labels have shape (batch_size, num_features), then the last layer's output should have shape (None, 12), i.e. (batch_size, 12).
You have two options to deal with this situation:
1. Use a Dense layer after flattening the output of the conv layer.
2. Reshape the output of the conv layer to the desired shape.
The choice depends on the problem you're dealing with. If the problem is classification, option 1 lets you add a softmax activation. With option 1, the modification to the code would be:
conv9 = Conv2D(32, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv9)
flatten1 = Flatten()(conv9)
dense1 = Dense(12, activation="softmax")(flatten1) # The choice of the activation depends on the problem you are dealing with.
model = Model(inputs = input_size, outputs = dense1)
With option 2, the modification would be:
conv9 = Conv2D(32, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv9)
reshape1 = Reshape((12,))(conv9) # Only valid if conv9 holds exactly 12 values per sample.
model = Model(inputs = input_size, outputs = reshape1)
N.B.: Reshape cannot discard values. When the Reshape layer is used to produce a (None, 12) tensor, the number of elements per sample in the previous layer's output must equal 12; if one target dimension is left as -1, that number must be divisible by 12.
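The element-count rule for Reshape can be checked before building the model. The sketch below is a plain-Python check; reshape_is_valid is a hypothetical helper, not a Keras function, and the per-sample shape (224, 224, 32) is what the model above should produce at conv9 if the four pooling and four upsampling steps cancel out.

```python
from math import prod  # available in Python 3.8+


def reshape_is_valid(source_shape, target_shape):
    """Check whether a per-sample shape can be reshaped to target_shape.

    At most one target dimension may be -1 (inferred from the element count).
    """
    if target_shape.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    n_elements = prod(source_shape)
    if -1 in target_shape:
        known = prod(d for d in target_shape if d != -1)
        return known > 0 and n_elements % known == 0
    return n_elements == prod(target_shape)


conv9_shape = (224, 224, 32)    # per-sample output shape of the last conv layer

print(reshape_is_valid(conv9_shape, (12,)))      # far more than 12 elements
print(reshape_is_valid(conv9_shape, (-1, 12)))   # 224 * 224 * 32 has no factor of 3
print(reshape_is_valid((2, 2, 3), (12,)))        # exactly 12 elements
```

For this particular model neither reshape applies, which suggests that flattening followed by a Dense(12) layer (option 1) is the practical route when the targets really have shape (batch_size, 12).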

Where is Keras 2 's channels?

I once used Keras 1 (maybe 1.0.5) for multi-class classification. My CNN input has shape (n, 1, 24, 113), where 113 is the number of channels, and the kernel size is (1, 5).
The code looked like this:
X_train = X_train.reshape((-1, 1, SLIDING_WINDOW_LENGTH, NUM_SENSOR_CHANNELS))
X_test = X_test.reshape((-1, 1, SLIDING_WINDOW_LENGTH, NUM_SENSOR_CHANNELS))
# network
conv1 = ELU()(Convolution2D(NUM_FILTERS, FILTER_SIZE, 1, border_mode='valid', init='normal', activation='relu')(inputs))
conv2 = ELU()(Convolution2D(NUM_FILTERS, FILTER_SIZE, 1, border_mode='valid', init='normal', activation='relu')(conv1))
conv3 = ELU()(Convolution2D(NUM_FILTERS, FILTER_SIZE, 1, border_mode='valid', init='normal', activation='relu')(conv2))
conv4 = ELU()(Convolution2D(NUM_FILTERS, FILTER_SIZE, 1, border_mode='valid', init='normal', activation='relu')(conv3))
reshape1 = Reshape((8, NUM_FILTERS * NUM_SENSOR_CHANNELS))(conv4)
gru1 = GRU(NUM_UNITS_LSTM, return_sequences=True, consume_less='mem')(reshape1)
gru2 = GRU(NUM_UNITS_LSTM, return_sequences=False, consume_less='mem')(gru1)
outputs = Dense(NUM_CLASSES, activation='softmax')(gru2)
# Hardcoded number of sensor channels employed in the OPPORTUNITY challenge
# Hardcoded number of classes in the gesture recognition problem
# Hardcoded length of the sliding window mechanism employed to segment the data
# Length of the input sequence after convolutional operations
# Hardcoded step of the sliding window mechanism employed to segment the data
# Batch Size
# Number filters convolutional layers
# Size filters convolutional layers
# Number of unit in the long short-term recurrent layers
Recently I switched from Keras 1 to Keras 2 without changing the network. My code now looks like this:
X_train = X_train.reshape((-1, 1, SLIDING_WINDOW_LENGTH, NUM_SENSOR_CHANNELS))
X_test = X_test.reshape((-1, 1, SLIDING_WINDOW_LENGTH, NUM_SENSOR_CHANNELS))
# network
conv1 = ELU()(
Conv2D(filters=NUM_FILTERS, kernel_size=(1, FILTER_SIZE), strides=(1, 1), padding='valid', activation='relu',
kernel_initializer='normal', data_format='channels_last')(inputs))
conv2 = ELU()(
Conv2D(filters=NUM_FILTERS, kernel_size=(1, FILTER_SIZE), strides=(1, 1), padding='valid', activation='relu',
kernel_initializer='normal', data_format='channels_last')(conv1))
conv3 = ELU()(
Conv2D(filters=NUM_FILTERS, kernel_size=(1, FILTER_SIZE), strides=(1, 1), padding='valid', activation='relu',
kernel_initializer='normal', data_format='channels_last')(conv2))
conv4 = ELU()(
Conv2D(filters=NUM_FILTERS, kernel_size=(1, FILTER_SIZE), strides=(1, 1), padding='valid', activation='relu',
kernel_initializer='normal', data_format='channels_last')(conv3))
# permute1 = Permute((2, 1, 3))(conv4)
reshape1 = Reshape((SLIDING_WINDOW_LENGTH - (FILTER_SIZE - 1) * 4, NUM_FILTERS * 1))(conv4) # 4 for 4 convs
gru1 = GRU(NUM_UNITS_LSTM, return_sequences=True, implementation=0)(reshape1)
gru2 = GRU(NUM_UNITS_LSTM, return_sequences=False, implementation=0)(gru1) # implementation=2 for GPU
outputs = Dense(NUM_CLASSES, activation='softmax')(gru2)
The speed seems faster, but the shapes are strange, and I don't know where my channels went.
Is there anything wrong with my code? Could someone help? Thanks.
It seems that Keras handles the channel dimension itself.
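To see where the 113 sensor channels end up, it can help to trace the shapes through the four convolutions under both data formats. The sketch below is plain Python. conv2d_valid_shape is a hypothetical helper, NUM_FILTERS = 64 is an assumed value (the original constant is not shown in the question), and the channels-first reading of the Keras 1 code is an assumption, although it is consistent with the Reshape((8, NUM_FILTERS * NUM_SENSOR_CHANNELS)) in that code.

```python
def conv2d_valid_shape(shape, kernel, filters, data_format):
    """Per-sample output shape of a stride-1, padding='valid' 2D convolution."""
    kh, kw = kernel
    if data_format == "channels_last":
        h, w, _ = shape                 # the last axis is consumed by the filters
        return (h - kh + 1, w - kw + 1, filters)
    if data_format == "channels_first":
        _, h, w = shape                 # the first axis is consumed by the filters
        return (filters, h - kh + 1, w - kw + 1)
    raise ValueError("unknown data_format: %r" % data_format)


NUM_FILTERS = 64        # assumed value, not given in the question
FILTER_SIZE = 5

# Keras 2 code: data_format='channels_last', kernel (1, FILTER_SIZE).
shape_last = (1, 24, 113)
for _ in range(4):
    shape_last = conv2d_valid_shape(shape_last, (1, FILTER_SIZE), NUM_FILTERS, "channels_last")

# Keras 1 code, read as channels-first, kernel (FILTER_SIZE, 1).
shape_first = (1, 24, 113)
for _ in range(4):
    shape_first = conv2d_valid_shape(shape_first, (FILTER_SIZE, 1), NUM_FILTERS, "channels_first")

print(shape_last)    # sensors were summed over as conv channels
print(shape_first)   # sensors remain a separate axis
```

Under these assumptions, the Keras 2 version treats the 113 sensors as the convolution's channel axis, so every filter sums over all of them in the first layer and they no longer appear as their own dimension afterwards. That would explain both the smaller Reshape target and the faster training.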
