Fractional max-pooling in Keras

Keras already provides pooling layers such as max pooling and average pooling.
However, I would like to implement fractional max-pooling in Keras based on the paper.
My current implementation is as follows:
model = Sequential()
model.add(Conv2D(32, (3, 3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
So, instead of model.add(MaxPooling2D(pool_size=(2, 2))), I would like to implement something like the following:
Is it possible?
I am currently using Keras with TensorFlow as the backend.
I would appreciate it if someone could provide the algorithm or code.
I am quite new to this and have not written a custom layer before, so could anyone kindly help out? Thanks!

In my opinion, you can do that by implementing a custom layer:
from keras.layers import Layer

class FractionalMaxpool2D(Layer):
    def __init__(self, output_dim):
        super(FractionalMaxpool2D, self).__init__()
        self.output_dim = output_dim

    def build(self, input_shape):
        # A layer would normally create its trainable weights here.
        # This kind of layer doesn't have any variables.
        super(FractionalMaxpool2D, self).build(input_shape)

    def call(self, x):
        # Handle your algorithm here
        return ....

    def compute_output_shape(self, input_shape):
        # return the output shape
        return (input_shape[0], self.output_dim)
The problem is that it is difficult to implement the core operation of fractional max pooling so that it runs on the GPU.
Please check this discussion on Keras's GitHub.

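You can use a Keras Lambda layer to wrap tf.nn.fractional_max_pool, like this:
FMP = Lambda(lambda img: tf.nn.fractional_max_pool(img, pooling_ratio)[0])
Now you can use FMP in your Keras code like any other layer. The two values involved are:
img: the input tensor, with dimensions [batch, height, width, channels]
pooling_ratio: [1.0, ratio_you_want, ratio_you_want, 1.0]
The first and last entries are 1.0 because TensorFlow does not pool over the batch and channel dimensions; it pools only over height and width. A Lambda function receives only the layer input, so pooling_ratio is captured from the enclosing scope, and the [0] is needed because tf.nn.fractional_max_pool returns a tuple (output, row_pooling_sequence, col_pooling_sequence).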
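For intuition about what the pooling ratio does, here is a minimal NumPy sketch of fractional max pooling on a single 2-D image. It is not the TensorFlow kernel: the helper names (pooling_boundaries, fractional_max_pool_2d) are made up for this illustration, the regions are disjoint, the ratio is assumed to be at least 1, and the output size is assumed to be floor(input_size / ratio). Each axis is split into regions of size K or K + 1 in random order, and the maximum is taken over every region.

```python
import math

import numpy as np


def pooling_boundaries(in_len, out_len, rng):
    """Split in_len cells into out_len regions of size k or k + 1, in random order."""
    k, n_big = divmod(in_len, out_len)
    sizes = np.array([k + 1] * n_big + [k] * (out_len - n_big))
    rng.shuffle(sizes)
    # Prepend 0 so that region i spans boundaries[i]:boundaries[i + 1].
    return np.concatenate(([0], np.cumsum(sizes)))


def fractional_max_pool_2d(img, ratio, seed=0):
    """Disjoint fractional max pooling of a 2-D array (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    height, width = img.shape
    out_h = math.floor(height / ratio)
    out_w = math.floor(width / ratio)
    rows = pooling_boundaries(height, out_h, rng)
    cols = pooling_boundaries(width, out_w, rng)
    out = np.empty((out_h, out_w), dtype=img.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = img[rows[i]:rows[i + 1], cols[j]:cols[j + 1]].max()
    return out


img = np.arange(100).reshape(10, 10)
pooled = fractional_max_pool_2d(img, ratio=1.44)
print(pooled.shape)  # (6, 6), since floor(10 / 1.44) == 6
```

With a ratio of 1.44, a 10x10 input shrinks to 6x6, which an integer pool size cannot produce.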


Converting .npz model from ChainerRL to Keras model, or alternative methods?

I have a DQN reinforcement learning model which was trained using ChainerRL's built-in DQN experiment on the Ms Pacman Atari game environment, let's call this file model.npz. I have some analysis software written in Keras, which uses a Keras network and loads into that network a model.
I am having trouble getting the .npz exported from ChainerRL to play nice with the Keras network.
I have figured out how to load the weights from the .npz file. I think I figured out how to make sure the Keras model matches the Chainer RL model in terms of kernel size, stride, and activation.
Here is the code which calls the function that builds the network in ChainerRL:
return links.Sequence(
L.Linear(512, n_actions),
And the code which gets called by this, and builds a Chainer DQN network, is:
class NatureDQNHead(chainer.ChainList):
    """DQN's head (Nature version)"""

    def __init__(self, n_input_channels=4, n_output_channels=512,
                 activation=F.relu, bias=0.1):
        self.n_input_channels = n_input_channels
        self.activation = activation
        self.n_output_channels = n_output_channels
        layers = [
            #L.Convolution2D(n_input_channels, out_channel=32, ksize=8, stride=4, pad=0, nobias=False, initialW=None, initial_bias=bias, *, dilate=1, groups=1),
            L.Convolution2D(n_input_channels, 32, 8, stride=4,
                            initial_bias=bias),
            #L.Convolution2D(n_input_channels=32, out_channel=64, ksize=4, stride=2, pad=0, nobias=False, initialW=None, initial_bias=bias, *, dilate=1, groups=1),
            L.Convolution2D(32, 64, 4, stride=2, initial_bias=bias),
            #L.Convolution2D(n_input_channels=64, out_channel=64, ksize=3, stride=1, pad=0, nobias=False, initialW=None, initial_bias=bias, *, dilate=1, groups=1),
            L.Convolution2D(64, 64, 3, stride=1, initial_bias=bias),
            #L.Linear(in_size=3136, out_size=n_output_channels, nobias=False, initialW=None, initial_bias=bias),
            L.Linear(3136, n_output_channels, initial_bias=bias),
        ]
        super(NatureDQNHead, self).__init__(*layers)

    def __call__(self, state):
        h = state
        for layer in self:
            h = self.activation(layer(h))
        return h
So I wrote the following Keras code to build an equivalent network in Keras:
# Keras Model
hidden = 512
#bias initializer to match the chainerRL one
initial_bias = tf.keras.initializers.Constant(0.1)
#matches default "channels_last" data format for Keras layers
inputs = Input(shape=(84, 84, 4))
#First call to Conv2D including all defaults for easy reference
x = Conv2D(filters=32, kernel_size=(8, 8), strides=4, padding='valid', data_format=None, dilation_rate=(1, 1), activation='relu', use_bias=True, kernel_initializer='glorot_uniform', bias_initializer=initial_bias, kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None, name='deepq/q_func/convnet/Conv')(inputs)
x1 = Conv2D(filters=64, kernel_size=(4, 4), strides=2, activation='relu', padding='valid', bias_initializer=initial_bias, name='deepq/q_func/convnet/Conv_1')(x)
x2 = Conv2D(filters=64, kernel_size=(3, 3), strides=1, activation='relu', padding='valid', bias_initializer=initial_bias, name='deepq/q_func/convnet/Conv_2')(x1)
#Flatten for move to linear layers
conv_out = Flatten()(x2)
action_out = Dense(hidden, activation='relu', name='deepq/q_func/action_value/fully_connected')(conv_out)
action_scores = Dense(units = 9, name='deepq/q_func/action_value/fully_connected_1', activation='linear', use_bias=True, kernel_initializer="glorot_uniform", bias_initializer=initial_bias, kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None,)(action_out) # num_actions in {4, .., 18}
#Now create model using the above-defined layers
modelArchitecture = Model(inputs, action_scores)
I have examined the structure of the initial weights for the Keras model and found them to be as follows:
Layer 0: no weights
Layer 1: (8,8,4,32)
Layer 2: (4,4,32,64)
Layer 3: (3,3,64,64)
Layer 4: no weights
Layer 5: (3136,512)
Layer 6: (512,9)
Then, I examined the weights in the .npz model which I am trying to import and found them to be as follows:
Layer 0: (32,4,8,8)
Layer 1: (64,32,4,4)
Layer 2: (64,64,3,3)
Layer 3: (512,3136)
Layer 4: (9,512)
So, I reshaped the weights from Layer 0 of model.npz with numpy.reshape and applied them to Layer 1 of the Keras network. I did the same with the model.npz weights for Layer 1, and applied them to Layer 2 of the Keras network. Then, I reshaped the weights from Layer 2 of model.npz, and applied them to Layer 3 of the Keras network. I transposed the weights of Layer 3 from model.npz, and applied them to Layer 5 of the Keras model. Finally, I transposed the weights of Layer 4 of model.npz and applied them to Layer 6 of the Keras model.
I saved the model in .h5 format and then ran it through the evaluation code in the Ms Pacman Atari environment, which produces a video. When I do this, Pacman always follows the exact same short path, runs face-first into a wall, and then keeps trying to walk through the wall until a ghost kills it.
It seems, therefore, that I am doing something wrong in my translation between the Chainer DQN network and the Keras DQN network. I am not sure whether they perhaps process color channels in a different order, or something similar.
I also attempted to export the ChainerRL model.npz file to ONNX, but got several errors to the point where it didn't seem possible without rewriting a lot of the ChainerRL code base.
Any help would be appreciated.
I am the author of ChainerRL. I have no experience with Keras, but the formats of the weight parameters apparently differ between Chainer and Keras. You should check the meaning of each dimension of the weight parameters in each deep learning framework. In Chainer, as you can find in the documentation, the weight parameter of Convolution2D is stored as (c_O, c_I, h_K, w_K).
Once you know the meaning of each dimension, I guess what you need is always numpy.transpose, not numpy.reshape, to reorder the dimensions to match Keras. numpy.reshape keeps the elements in the same flat order and only regroups them, so it mixes up kernel positions and channels even though the resulting shape looks right.
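The difference between the two calls can be sketched with NumPy alone. The example below uses random arrays as stand-ins for the real weights and assumes the Keras defaults: a Conv2D kernel laid out as (h_K, w_K, c_I, c_O), a Dense kernel laid out as (inputs, outputs), and channels_last data. Under those assumptions the first Linear layer after the convolutions also needs its input axis reordered, because Chainer flattens feature maps as (C, H, W) while Keras flattens them as (H, W, C). The helper name linear_after_flatten_to_keras is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Chainer Convolution2D weight: (c_O, c_I, h_K, w_K)
w_chainer = rng.normal(size=(32, 4, 8, 8))

# Keras Conv2D kernel (assumed layout): (h_K, w_K, c_I, c_O)
w_keras = np.transpose(w_chainer, (2, 3, 1, 0))
w_wrong = np.reshape(w_chainer, (8, 8, 4, 32))  # right shape, scrambled values

print(w_keras.shape == w_wrong.shape)      # True
print(np.array_equal(w_keras, w_wrong))    # False


def linear_after_flatten_to_keras(w, channels, height, width):
    """Convert a Chainer Linear weight (out, C*H*W) to a Keras Dense kernel (H*W*C, out)."""
    out_units = w.shape[0]
    w = w.reshape(out_units, channels, height, width)  # undo Chainer's (C, H, W) flatten
    w = np.transpose(w, (2, 3, 1, 0))                   # reorder to (H, W, C, out)
    return w.reshape(height * width * channels, out_units)


# Check on random data that both layouts give the same layer output.
w_fc = rng.normal(size=(512, 3136))
x_chw = rng.normal(size=(64, 7, 7))          # Chainer-style feature map
x_hwc = np.transpose(x_chw, (1, 2, 0))       # the same data in channels_last order
y_chainer = w_fc @ x_chw.reshape(-1)
y_keras = x_hwc.reshape(-1) @ linear_after_flatten_to_keras(w_fc, 64, 7, 7)
print(np.allclose(y_chainer, y_keras))       # True
```

The last Linear layer has no spatial structure on its input, so a plain transpose of its (9, 512) weight is enough there.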

How to pass weights to mean squared error in keras

I am trying to approach a regression problem with 8 output labels (multi-label), for which I am using mean squared error loss. However, the dataset is imbalanced and I want to pass weights to the loss function. Currently I am compiling the model this way:
model.compile(loss='mse', optimizer=Adam(lr=0.0001), metrics=['mse', 'acc'])
Could someone please suggest whether it is possible to add weights to mean squared error and, if so, how I could do it?
Thanks in advance
The labels look like so
model = Sequential()
model.add(Dense(8,name = 'nelu', activation=elu))
model.compile(loss='mse', optimizer=Adam(lr=0.0001), metrics=['mse', 'acc'])
Here is a small implementation of a custom loss function based on your problem statement:
import keras
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense, Conv1D, LSTM, TimeDistributed
import keras.backend as K

# custom loss function
def custom_mse(class_weights):
    def loss_fixed(y_true, y_pred):
        """
        :param y_true: A tensor of the same shape as `y_pred`
        :param y_pred: A tensor resulting from a sigmoid
        :return: Output tensor.
        """
        # print('y_pred:', K.int_shape(y_pred))
        # print('y_true:', K.int_shape(y_true))
        # scale the squared error of each of the 8 outputs by its weight,
        # then average over the outputs to get the weighted mean squared error
        mse = K.mean(class_weights * K.square(y_pred - y_true), axis=-1)
        # print('mse:', K.int_shape(mse))
        return mse
    return loss_fixed

model = Sequential()
model.add(Conv1D(8, (1), input_shape=(28, 28)))
# custom class weights
class_weights = K.variable([[0.25, 1., 2., 3., 2., 0.6, 0.5, 0.15]])
# print('class_weights:', K.int_shape(class_weights))
model.compile(optimizer='adam', loss=custom_mse(class_weights), metrics=['accuracy'])
You can find more information about Keras loss functions in the official documentation.
Keras does not handle low-level operations such as tensor products, convolutions and so on itself. Instead, it relies on a specialized, well-optimized tensor manipulation library to do so, serving as the "backend engine" of Keras. More information about the Keras backend can be found in its official documentation.
Use K.int_shape(tensor_name) to find the dimensions of a tensor.
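The arithmetic behind a per-output weighted MSE can be checked outside Keras. The following is a minimal NumPy sketch, assuming one weight per output column that broadcasts over the batch; the function name weighted_mse and the toy data are only for illustration.

```python
import numpy as np


def weighted_mse(y_true, y_pred, weights):
    """Mean over the last axis of weight * squared error, one weight per output."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    weights = np.asarray(weights, dtype=float)  # shape (8,) broadcasts over the batch
    return np.mean(weights * np.square(y_pred - y_true), axis=-1)


weights = [0.25, 1.0, 2.0, 3.0, 2.0, 0.6, 0.5, 0.15]
y_true = np.zeros((2, 8))
y_pred = np.ones((2, 8))
# Every squared error is 1, so each sample's loss is sum(weights) / 8 = 9.5 / 8.
print(weighted_mse(y_true, y_pred, weights))  # approximately [1.1875 1.1875]
```

An output with a larger weight contributes proportionally more to the loss, which is the effect the custom loss is meant to have.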
First create a dictionary of how much you want to weight each class, for example:
class_weights = {0: 1,
1: 1,
2: 1,
3: 9,
4: 1...} # Do this for all eight classes
Then pass them into model.fit(X, y, class_weight=class_weights)

How to properly implement custom squash function in TF2.0 (custom layer or other)

I'm trying to implement a simple capsnet model in TF2.0.
So far, I have added a few conv2d layers, and a reshape layer, but I need to add a squash function now. The issue is that tf.norm() will send me to NaN land since I'm squashing entire vectors, so I have to use a custom squash function. I've never written a custom layer before, and I basically just used the template from the tutorial and added the math function under the call().
Since I am doing this all inside of a keras.models.Sequential model, I wasn't sure how to get the output after the first couple of layers so I just decided to make the squash function its own layer in the model. I feel like this is probably completely and totally wrong, so I'm looking for some input on the best way to go about this.
Should I be using a keras.Model for this at all, or should I use the new eager execution feature to just pass the tensors through the layers manually? If it's okay to use the SquashLayer() that I have implemented, then what do I pass through as an argument so that I get the proper output to pass into the next layer?
class SquashLayer(tf.keras.layers.Layer):
    def __init__(self, output_units):
        super(SquashLayer, self).__init__()
        self.output_units = output_units

    def build(self, input_shape):
        self.kernel = self.add_variable(
            'kernel', [input_shape[-1], self.output_units])

    def call(self, input):
        squared_norm = tf.reduce_sum(tf.square(input), axis=-1, keepdims=True)
        safe_norm = tf.sqrt(squared_norm + 1e-7)
        squash_factor = squared_norm / (1. + squared_norm)
        unit_vector = input / safe_norm
        return squash_factor * unit_vector
model = keras.models.Sequential([
keras.layers.InputLayer(input_shape=(28, 28, 1)),
keras.layers.Conv2D(filters=256, kernel_size=9, strides=1, padding='valid', activation=tf.nn.relu, name='conv1'),
keras.layers.Conv2D(filters=256, kernel_size=9, strides=2, padding='valid', activation=tf.nn.relu, name='conv2'),
keras.layers.Reshape((-1, caps1_n_caps, caps1_n_dims)),
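The math inside call() can be checked in isolation. Below is a minimal NumPy sketch of the same safe-norm squash, assuming the capsule vectors lie along the last axis; the function name squash and the toy input are illustrative. It shows that an all-zero vector produces zeros rather than NaN, and that the computation itself does not use any trainable kernel.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-7):
    """Scale each vector to length ||s||^2 / (1 + ||s||^2) without dividing by zero."""
    squared_norm = np.sum(np.square(s), axis=axis, keepdims=True)
    safe_norm = np.sqrt(squared_norm + eps)  # eps keeps the division finite for zero vectors
    squash_factor = squared_norm / (1.0 + squared_norm)
    return squash_factor * s / safe_norm


caps = np.array([[3.0, 4.0],    # norm 5, squashed norm is about 25 / 26
                 [0.0, 0.0]])   # zero vector, stays zero
out = squash(caps)
print(np.isfinite(out).all())          # True
print(np.linalg.norm(out, axis=-1))    # roughly [0.9615, 0.0]
```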

Changing input dimension for AlexNet

I am a beginner and I am trying to implement AlexNet for image classification. The PyTorch implementation of AlexNet is as follows:
import torch.nn as nn

class AlexNet(nn.Module):
    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.avgpool = nn.AdaptiveAvgPool2d((6, 6))
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = x.view(x.size(0), 256 * 6 * 6)
        x = self.classifier(x)
        return x
However, I am trying to use the network with an input size of (3, 448, 224) and 8 classes.
I have no idea how to change x.view in the forward method, or how many layers I should drop to get optimum performance. Please help.
Most of the pretrained models provided in the newest version of torchvision already include self.avgpool = nn.AdaptiveAvgPool2d((size, size)), which resolves the incompatibility with the input size, so you don't have to worry about it much.
The code is very short:
import torchvision
import torch.nn as nn
num_classes = 8
model = torchvision.models.alexnet(pretrained=True)
# replace the last classifier
model.classifier[6] = nn.Linear(4096, num_classes)
# now you can train it with your dataset of size (3, 448, 224)
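To see why x.view does not need to change, it helps to trace the spatial size through the convolution and pooling layers by hand. The sketch below is plain Python and uses the standard output-size formula floor((n + 2 * padding - kernel) / stride) + 1 with the kernel, stride and padding values from the AlexNet definition in the question; the names conv_out and features_output_size are illustrative.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a conv or pooling layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


# (kernel, stride, padding) of each size-changing layer in AlexNet.features
ALEXNET_FEATURES = [
    (11, 4, 2),  # Conv2d(3, 64, kernel_size=11, stride=4, padding=2)
    (3, 2, 0),   # MaxPool2d(kernel_size=3, stride=2)
    (5, 1, 2),   # Conv2d(64, 192, kernel_size=5, padding=2)
    (3, 2, 0),   # MaxPool2d(kernel_size=3, stride=2)
    (3, 1, 1),   # Conv2d(192, 384, kernel_size=3, padding=1)
    (3, 1, 1),   # Conv2d(384, 256, kernel_size=3, padding=1)
    (3, 1, 1),   # Conv2d(256, 256, kernel_size=3, padding=1)
    (3, 2, 0),   # MaxPool2d(kernel_size=3, stride=2)
]


def features_output_size(height, width):
    for kernel, stride, padding in ALEXNET_FEATURES:
        height = conv_out(height, kernel, stride, padding)
        width = conv_out(width, kernel, stride, padding)
    return height, width


print(features_output_size(224, 224))  # (6, 6)
print(features_output_size(448, 224))  # (13, 6)
```

For a (3, 448, 224) input the feature map is 256 x 13 x 6, and AdaptiveAvgPool2d((6, 6)) reduces it to 256 x 6 x 6, so the existing x.view(x.size(0), 256 * 6 * 6) still matches.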
Transfer learning
There are two popular ways to do transfer learning. Suppose that we trained a model M on a very large dataset D_large, and now we would like to transfer the "knowledge" learned by M to a new model, M', on another dataset D_other (which is smaller than D_large).
1. Use (most) parts of M as the architecture of the new model M' and initialize those parts with the weights trained on D_large. We then train M' on D_other and let it update the weights of those parts, starting from the values in M, to find the optimal weights for the new dataset. This is usually referred to as fine-tuning the model M'.
2. Same as the above method, except that before training M' we freeze all the parameters of those parts and only then train M' on D_other.
In both cases, the parts taken from M are mostly the first components of M' (the base). In the second case, however, those parts of M act as a feature extractor for the input data. The accuracy obtained from the two methods may differ slightly. Freezing the base reduces the risk of overfitting on the small dataset, because far fewer parameters are trained. In addition, when we freeze the weights of M, we don't need to store the intermediate values (the hidden outputs of each frozen layer) during the forward pass, and we don't need to compute their gradients during the backward pass. This speeds up training and reduces the memory it requires.
The implementation
Along with AlexNet, many models pretrained on ImageNet, such as ResNet and VGG, are already provided in torchvision by the Facebook team.
To best fit your requirements in terms of model size, it would be nice to use VGG11 or the smallest ResNet, which have the fewest parameters in their model families.
I just pick VGG11 as an example:
1. Obtain a pretrained model from torchvision.
2. Freeze all the parameters of this model.
3. Replace the last layer of the model with your new Linear layer to perform your classification. This means that you can reuse almost everything from M in M'.
import torchvision
import torch.nn as nn
# obtain the pretrained model
model = torchvision.models.vgg11(pretrained=True)
# freeze the params
for param in model.parameters():
    param.requires_grad = False
# replace with your classifier (the new layer's parameters are trainable by default)
num_classes = 8
model.classifier[6] = nn.Linear(in_features=4096, out_features=num_classes)
# start training with your dataset
In older versions of torchvision there is no self.avgpool = nn.AdaptiveAvgPool2d((size, size)), which makes it harder to train on an input size different from the [3, 224, 224] used for ImageNet training. With a little effort you can add it yourself, as below:
class OurVGG11(nn.Module):
    def __init__(self, num_classes=8):
        super(OurVGG11, self).__init__()
        self.vgg11 = torchvision.models.vgg11(pretrained=True)
        for param in self.vgg11.parameters():
            param.requires_grad = False
        # Add an avgpool here
        self.avgpool = nn.AdaptiveAvgPool2d((7, 7))
        # Replace the classifier layer
        self.vgg11.classifier[-1] = nn.Linear(4096, num_classes)

    def forward(self, x):
        x = self.vgg11.features(x)
        x = self.avgpool(x)
        x = x.view(x.size(0), 512 * 7 * 7)
        x = self.vgg11.classifier(x)
        return x

model = OurVGG11()
# now start training `model` on our dataset.
Try out with different models in torchvision.models.

How to obtain the runtime batch size of a Keras model

Based on this post, I need some basic implementation help. Below you see my model, which uses a Dropout layer. When using the noise_shape parameter, the last batch may not match the batch size, which causes an error (see the other post).
Original model:
def LSTM_model(X_train,Y_train,dropout,hidden_units,MaskWert,batchsize):
    model = Sequential()
    model.add(Masking(mask_value=MaskWert, input_shape=(X_train.shape[1],X_train.shape[2]) ))
    model.add(Dropout(dropout, noise_shape=(batchsize, 1, X_train.shape[2]) ))
    model.add(Dense(hidden_units, activation='sigmoid', kernel_constraint=max_norm(max_value=4.) ))
    model.add(LSTM(hidden_units, return_sequences=True, dropout=dropout, recurrent_dropout=dropout))
Alexandre Passos then suggested getting the runtime batch size with tf.shape. I tried to implement the runtime batch size idea in Keras in different ways, but it never worked.
import keras.backend as K

def backend_shape(x):
    return K.shape(x)

def LSTM_model(X_train,Y_train,dropout,hidden_units,MaskWert,batchsize):
    model = Sequential()
    model.add(Dropout(dropout, noise_shape=(batchsize[0], 1, X_train.shape[2]) ))
But that only gave me the input tensor shape, not the runtime input tensor shape.
I also tried to use a Lambda Layer
def output_of_lambda(input_shape):
    return (input_shape)

def LSTM_model_2(X_train,Y_train,dropout,hidden_units,MaskWert,batchsize):
    model = Sequential()
    model.add(Lambda(output_of_lambda, outputshape=output_of_lambda))
    model.add(Dropout(dropout, noise_shape=(outputshape[0], 1, X_train.shape[2]) ))
And different variants. But as you already guessed, that did not work at all.
Is the model definition actually the correct place?
Could you give me a tip or better just tell me how to obtain the running batch size of a Keras model? Thanks so much.
The current implementation does adjust the noise shape according to the runtime batch size. From the Dropout layer implementation code:
symbolic_shape = K.shape(inputs)
noise_shape = [symbolic_shape[axis] if shape is None else shape
               for axis, shape in enumerate(self.noise_shape)]
So if you give noise_shape=(None, 1, features) the shape will be (runtime_batchsize, 1, features) following the code above.
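The same substitution rule can be reproduced in a few lines of NumPy to see what a None entry does. This is a simplified sketch, not the Keras implementation: the function names resolve_noise_shape and dropout_with_noise_shape are hypothetical, and the mask is drawn with NumPy's random generator.

```python
import numpy as np


def resolve_noise_shape(input_shape, noise_shape):
    """Replace every None in noise_shape with the actual size of that axis."""
    return tuple(input_shape[axis] if dim is None else dim
                 for axis, dim in enumerate(noise_shape))


def dropout_with_noise_shape(x, rate, noise_shape, rng):
    """Inverted dropout whose keep-mask is broadcast from the resolved noise shape."""
    mask_shape = resolve_noise_shape(x.shape, noise_shape)
    keep = rng.random(mask_shape) >= rate
    return x * keep / (1.0 - rate)


rng = np.random.default_rng(0)
for batch in (32, 7):  # a full batch and a smaller final batch
    x = np.ones((batch, 5, 3))
    print(resolve_noise_shape(x.shape, (None, 1, 3)))  # (32, 1, 3) then (7, 1, 3)
    out = dropout_with_noise_shape(x, 0.5, (None, 1, 3), rng)
    # The size-1 time axis means every timestep of a sample shares one mask.
    print(np.array_equal(out[:, 0, :], out[:, 4, :]))  # True
```

Because the batch axis is resolved from the actual input, a smaller final batch gets a mask of matching size instead of raising a shape error.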
