Adaptive Activation Function in TensorFlow 2: trained variable for multiple calls - keras

So I want to try out an adaptive activation function for my neural network. This means I want to have a custom activation function that is similar to a standard one (like tanh or relu), but with some added trainable parameters.
Currently, I am trying to add this trainable parameter by creating the activation function as a custom layer:
class AdaptiveActivation(keras.layers.Layer):
    """
    Adaptive activation function that is changed in training process.
    """
    def __init__(self, act="tanh"):
        super(AdaptiveActivation, self).__init__()
        self.a = tf.Variable(0.1, dtype=tf.float32, trainable=True)
        self.n = tf.constant(10.0, dtype=tf.float32)
        self.act = act
    def call(self, x):
        if self.act == "tanh":
            return keras.activations.tanh(self.a*self.n*x)
        elif self.act == "relu":
            return keras.activations.relu(self.a*self.n*x)
However, if I understood some test outputs correctly, every instance of this layer creates its own parameter a, so each hidden layer gets a different a. What I want is one single a shared by all my activation functions: instead of, say, 9 different values of a per epoch, there should always be just one a, which can change between epochs.
Furthermore, is there an easy way to obtain the a from this layer for output during training?

OK, the solution was stupidly easy: I can just pass a trainable TensorFlow variable to the layer from outside and assign it to self.a there.
class AdaptiveActivation(keras.layers.Layer):
    """
    Adaptive activation function that is changed in training process.
    """
    def __init__(self, a, act="tanh"):
        super(AdaptiveActivation, self).__init__()
        self.a = a
        self.n = tf.constant(5.0, dtype=tf.float32)
        self.act = act
    def call(self, x):
        if self.act == "tanh":
            return keras.activations.tanh(self.a*self.n*x)
        elif self.act == "relu":
            return keras.activations.relu(self.a*self.n*x)
This also solves the "issue" of tracking it, since the variable lives outside the layer and can be read at any time.
It does feel very unnecessary though: why couldn't I just have done this without having to implement a new layer first?
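For anyone who wants to see the shared-variable idea end to end, here is a minimal sketch. It assumes TensorFlow 2 with eager execution; the names shared_a, act1 and act2, the input x, and the hand-written gradient step are made up for illustration and are not part of the original solution. Whether Keras also lists an externally created tf.Variable in model.trainable_variables may depend on the Keras version, so the sketch takes the gradient with respect to the variable explicitly.

```python
import tensorflow as tf
from tensorflow import keras


class AdaptiveActivation(keras.layers.Layer):
    """Activation tanh(a * n * x) or relu(a * n * x) with an externally owned `a`."""

    def __init__(self, a, act="tanh"):
        super().__init__()
        self.a = a                  # shared tf.Variable, created outside the layer
        self.n = tf.constant(5.0)   # fixed scaling factor, as in the snippet above
        self.act = act

    def call(self, x):
        if self.act == "tanh":
            return keras.activations.tanh(self.a * self.n * x)
        if self.act == "relu":
            return keras.activations.relu(self.a * self.n * x)
        raise ValueError(f"unsupported activation: {self.act}")


# One variable, reused by every activation layer (hypothetical names).
shared_a = tf.Variable(0.1, dtype=tf.float32, trainable=True)
act1 = AdaptiveActivation(shared_a)
act2 = AdaptiveActivation(shared_a)

x = tf.ones((4, 3))
with tf.GradientTape() as tape:
    loss = tf.reduce_mean(act2(act1(x)))

# Both layers contribute to the gradient of the single shared variable.
grad = tape.gradient(loss, shared_a)
shared_a.assign_sub(0.01 * grad)  # plain gradient step, for illustration only

# Reading the current value during training is just a variable read.
print("a after one step:", float(shared_a.numpy()))
```

Because the variable is owned by the calling code, printing it after each epoch (for example from a callback) only requires keeping a reference to shared_a.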

Related

How to handle different sizes of input data using a PyTorch built-in neural network

I built a simple PyTorch model as below. However, I receive an error message that the mat1 and mat2 sizes are not aligned. How do I tweak the code to allow for input data of different dimensions?
class simpleNet(nn.Module):
    def __init__(self, input_dim, hidden_size, num_classes):
        """
        :param input_dim: input feature dimension
        :param hidden_size: hidden dimension
        :param num_classes: total number of classes
        """
        super(simpleNet, self).__init__()
        # hidden layer
        self.hidden = nn.Linear(input_dim, hidden_size)
        # Second fully connected layer that outputs our 10 labels
        self.output = nn.Linear(hidden_size, num_classes)
    def forward(self, x):
        out = None
        x = self.hidden(x)
        x = torch.sigmoid(x)
        x = self.output(x)
        out = x
        return out
I am trying to build a toy neural network using PyTorch.
For your neural network to work, the output size of each layer must equal the input size of the next layer. Since this snippet shows only the architecture, without the initialization code, I cannot tell what you can simplify; having mismatched sizes between layers is not good practice, though. As a brute-force method, you can use the reshape function from torch to make the output of the previous layer match the input of the next layer. Refer to: https://pytorch.org/docs/stable/generated/torch.reshape.html
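As an alternative to reshaping by hand, here is a rough sketch that flattens each sample and lets the first layer infer its input size on the first forward pass. This is not from the answer above: it assumes a PyTorch version that ships nn.LazyLinear (1.8 or later), and the class name FlexibleNet and the batch shape are made up for illustration. Note that the size is fixed after the first call, so later batches must have the same number of features per sample.

```python
import torch
import torch.nn as nn


class FlexibleNet(nn.Module):
    """Two-layer net whose first layer infers its input size on the first call."""

    def __init__(self, hidden_size, num_classes):
        super().__init__()
        self.flatten = nn.Flatten()               # (batch, ...) -> (batch, features)
        self.hidden = nn.LazyLinear(hidden_size)  # in_features is inferred lazily
        self.output = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        x = self.flatten(x)
        x = torch.sigmoid(self.hidden(x))
        return self.output(x)


model = FlexibleNet(hidden_size=32, num_classes=10)
batch = torch.randn(4, 3, 8, 8)   # hypothetical image-like batch
logits = model(batch)
print(logits.shape)               # torch.Size([4, 10])
print(model.hidden.in_features)   # 192, i.e. 3 * 8 * 8
```

With lazy layers it is safest to run one dummy batch through the model before creating the optimizer, so that the parameters already have their final shapes.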

How to implement trainable parameters in a Keras model, like nn.Parameter() in PyTorch?

I just want to implement some trainable parameters in my model with Keras. In PyTorch, we can do it by using torch.nn.Parameter() like below:
self.a = nn.Parameter(torch.ones(8))
self.b = nn.Parameter(torch.zeros(16,8))
I think doing this in PyTorch adds trainable parameters to the model. Now I want to know how to achieve something similar in Keras.
Any suggestions or advice are welcome!
Thanks! :)
P.S. I just wrote a custom layer in Keras as below:
class Mylayer(Layer):
    def __init__(self,input_dim,output_dim,**kwargs):
        # call the base constructor before setting attributes
        super(Mylayer,self).__init__(**kwargs)
        self.input_dim = input_dim
        self.output_dim = output_dim
    def build(self, input_shape):
        self.kernel = self.add_weight(name='pi',
                                      shape=(self.input_dim,self.output_dim),
                                      initializer='zeros',
                                      trainable=True)
        self.kernel_2 = self.add_weight(name='mean',
                                        shape=(self.input_dim,self.output_dim),
                                        initializer='ones',
                                        trainable=True)
        super(Mylayer,self).build(input_shape)
    def call(self,x):
        return x,self.kernel,self.kernel_2
I also want to know: if I haven't changed the tensor that passes through the layer, is it necessary to write compute_output_shape()?
You need to create the trainable weights in a custom layer:
class MyLayer(Layer):
    def __init__(self, my_args, **kwargs):
        #do whatever you need with my_args
        super(MyLayer, self).__init__(**kwargs)
    #you create the weights in build:
    def build(self, input_shape):
        #use the input_shape to infer the necessary shapes for weights
        #use self.whatever_you_registered_in_init to help you, like units, etc.
        self.kernel = self.add_weight(name='kernel',
                                      shape=the_shape_you_calculated,
                                      initializer='uniform',
                                      trainable=True)
        #create as many weights as necessary for this layer
        #build the layer - equivalent to self.built=True
        super(MyLayer, self).build(input_shape)
    #create the layer operation here
    def call(self, inputs):
        #do whatever operations are needed
        #example:
        return inputs * self.kernel #make sure the shapes are compatible
    #tell keras about the output shape of your layer
    def compute_output_shape(self, input_shape):
        #calculate the output shape based on the input shape and your layer's rules
        return calculated_output_shape
Now use your layer in the model.
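To make the template concrete, here is a small filled-in version. It is only a sketch: the layer name ScaleLayer, the per-feature weight shape, and the 'ones' initializer are choices made for this illustration, not something the question asked for. Since the operation is element-wise, compute_output_shape simply returns the input shape.

```python
import tensorflow as tf
from tensorflow import keras


class ScaleLayer(keras.layers.Layer):
    """Multiplies each input feature by a trainable per-feature weight."""

    def build(self, input_shape):
        # One weight per feature on the last axis, initialised to ones.
        self.kernel = self.add_weight(
            name="kernel",
            shape=(input_shape[-1],),
            initializer="ones",
            trainable=True,
        )
        super().build(input_shape)

    def call(self, inputs):
        return inputs * self.kernel  # broadcasts over the batch axis

    def compute_output_shape(self, input_shape):
        return input_shape  # element-wise op: the shape is unchanged


layer = ScaleLayer()
out = layer(tf.ones((2, 3)))
print(out.shape)                     # (2, 3)
print(len(layer.trainable_weights))  # 1
```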
If you are using eager execution with TensorFlow and writing a custom training loop, you can work pretty much the same way as in PyTorch: create weights outside layers with tf.Variable and pass them to the gradient calculation methods.
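A minimal sketch of that second approach, assuming TensorFlow 2 in eager mode: two free-standing tf.Variable objects play the role of nn.Parameter, and tf.GradientTape computes their gradients. The toy data, the learning rate, and the manual assign_sub update are made up for illustration; in practice an optimizer would usually apply the gradients.

```python
import tensorflow as tf

# Free-standing trainable parameters, analogous to nn.Parameter in PyTorch.
w = tf.Variable(tf.zeros((1,)), name="w")
b = tf.Variable(tf.zeros((1,)), name="b")

# Toy data for y = 2x + 1 (made up for this illustration).
x = tf.constant([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

learning_rate = 0.05
for _ in range(500):
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean((w * x + b - y) ** 2)
    # Pass the variables explicitly to the gradient computation.
    grad_w, grad_b = tape.gradient(loss, [w, b])
    w.assign_sub(learning_rate * grad_w)
    b.assign_sub(learning_rate * grad_b)

# After training, w should be close to 2 and b close to 1.
print(w.numpy(), b.numpy())
```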

How to develop a layer that works with arbitrary size input

I'm trying to develop a layer in Keras which works with 3D tensors. To make it flexible, I would like to postpone the code that relies on the input's exact shape as much as possible.
My layer is overriding 5 methods:
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Layer
class MyLayer(Layer):
    def __init__(self, **kwargs):
        pass
    def build(self, input_shape):
        pass
    def call(self, inputs, verbose=False):
        second_dim = K.int_shape(inputs)[-2]
        # Do something with the second_dim
    def compute_output_shape(self, input_shape):
        pass
    def get_config(self):
        pass
And I'm using this layer like this:
input = Input(batch_shape=(None, None, 128), name='input')
x = MyLayer(name='my_layer')(input)
model = Model(input, x)
But I'm facing an error since the second_dim is None. How can I develop a layer that relies on the dimensions of the input but it's ok with it being provided by the actual data and not the input layer?
I ended up asking the same question differently, and I've got a perfect answer:
What is the right way to manipulate the shape of a tensor when there are unknown elements in it?
The gist of it is, don't treat the dimensions directly. Use them by reference and not by value. So, do not use K.int_shape and instead use K.shape. And use Keras operations to compose and come up with a new shape:
shape = K.shape(x)
newShape = K.concatenate([
    shape[0:1],
    shape[1:2] * shape[2:3],
    shape[3:4]
])
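Here is a small runnable sketch of the same idea, written with tf.shape and tf.concat, which I am assuming are the TensorFlow 2 counterparts of K.shape and K.concatenate. The function name merge_middle_dims and the input signature are made up for illustration. The first two dimensions are unknown when the function is traced, yet the reshape works because the shape is read from the actual data at run time.

```python
import tensorflow as tf


@tf.function(input_signature=[tf.TensorSpec(shape=[None, None, 4, 8], dtype=tf.float32)])
def merge_middle_dims(x):
    # x.shape is (None, None, 4, 8) here; tf.shape(x) is a tensor filled in at run time.
    shape = tf.shape(x)
    new_shape = tf.concat(
        [shape[0:1], shape[1:2] * shape[2:3], shape[3:4]],
        axis=0,
    )
    return tf.reshape(x, new_shape)


out = merge_middle_dims(tf.zeros((2, 3, 4, 8)))
print(out.shape)  # (2, 12, 8)
```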

cannot assign 'torch.nn.modules.container.Sequential' as parameter

I was following this method (https://discuss.pytorch.org/t/dynamic-parameter-declaration-in-forward-function/427) to dynamically assign parameters in the forward function.
However, my parameter is not just a single weight tensor but an nn.Sequential.
When I implement the following:
class MyModule(nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        # you need to register the parameter names earlier
        self.register_parameter('W_di', None)
    def forward(self, input):
        if self.W_di is None:
            self.W_di = nn.Sequential(
                nn.Linear(mL_n * 2, 1024),
                nn.ReLU(),
                nn.Linear(1024, self.hS)).to(device)
I get the following error.
TypeError: cannot assign 'torch.nn.modules.container.Sequential' as parameter 'W_di' (torch.nn.Parameter or None expected)
Is there any way that I can register nn.Sequential as a whole param? Thanks!
If you or other users still have this problem, one solution to consider is using nn.ModuleList instead of nn.Sequential.
While nn.Sequential is useful for defining a fixed sequence of layers in PyTorch, nn.ModuleList is a more flexible container that allows direct access and modification of individual layers within the list. This can be especially helpful when dealing with dynamic models or architectures that require more complex layer arrangements.
My gut feeling is that you cannot do it. Even in a static model declaration, nn.Module registers the parameters of every sub-module (e.g., nn.Conv2d or nn.Linear) in a nested way. That is, every kernel or bias is registered one by one and independently.
One workaround might be to introduce dynamic sub-modules. Here is my brief implementation. One can define the desired dynamic behavior inside the DynamicLinear module.
import torch
import torch.nn as nn
class DynamicLinear(nn.Module):
    def __init__(self):
        super(DynamicLinear, self).__init__()
        # you need to register the parameter names earlier
        self.register_parameter('W_di', None)
    def forward(self, x):
        if self.W_di is None:
            # dynamically define a linear function here
            # create the tensor on x's device directly, so the result stays an nn.Parameter
            self.W_di = nn.Parameter(torch.ones(1, 1, device=x.device))
        return self.W_di @ x
class MyModule(nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        self.net = nn.Sequential(
            DynamicLinear(),
            nn.ReLU(),
            DynamicLinear())
    def forward(self, x):
        return self.net(x)
m = MyModule()
x = torch.ones(1, 1)
y = m(x)
# output: a 1x1 tensor with value 1
print(y)
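Another option, sketched here under my own assumptions rather than taken from the answers above: the TypeError comes from registering W_di as a parameter name, and an nn.Sequential is a module, not a parameter. If W_di starts as a plain None attribute instead, assigning a module to it later registers it as a sub-module. The class name LazyHead and the sizes are made up for illustration.

```python
import torch
import torch.nn as nn


class LazyHead(nn.Module):
    """Builds its nn.Sequential on the first forward pass."""

    def __init__(self, hidden_size, out_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.out_size = out_size
        self.W_di = None  # plain attribute, NOT register_parameter

    def forward(self, x):
        if self.W_di is None:
            in_features = x.shape[-1]  # only known once data arrives
            # Assigning an nn.Module to an attribute registers it as a sub-module.
            self.W_di = nn.Sequential(
                nn.Linear(in_features, self.hidden_size),
                nn.ReLU(),
                nn.Linear(self.hidden_size, self.out_size),
            ).to(x.device)
        return self.W_di(x)


m = LazyHead(hidden_size=16, out_size=3)
y = m(torch.ones(4, 10))
print(y.shape)                    # torch.Size([4, 3])
print(len(list(m.parameters())))  # 4: two weights and two biases
```

One caveat: parameters created inside forward do not exist when an optimizer is constructed beforehand, so run one batch through the module first or add the new parameters to the optimizer afterwards.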

How do I add LSTM, GRU or other recurrent layers to a Sequential in PyTorch

I like using torch.nn.Sequential as in
self.conv_layer = torch.nn.Sequential(
    torch.nn.Conv1d(196, 196, kernel_size=15, stride=4),
    torch.nn.Dropout()
)
But when I want to add a recurrent layer such as torch.nn.GRU it won't work because the output of recurrent layers in PyTorch is a tuple and you need to choose which part of the output you want to further process.
So is there any way to get
self.rec_layer = nn.Sequential(
    torch.nn.GRU(input_size=2, hidden_size=256),
    torch.nn.Linear(in_features=256, out_features=1)
)
to work? For this example, let's say I want to feed torch.nn.GRU(input_size=2, hidden_size=20)(x)[1][-1] (the last hidden state of the last layer) into the following Linear layer.
I made a module called SelectItem to pick out an element from a tuple or list:
class SelectItem(nn.Module):
    def __init__(self, item_index):
        super(SelectItem, self).__init__()
        self._name = 'selectitem'
        self.item_index = item_index
    def forward(self, inputs):
        return inputs[self.item_index]
SelectItem can be used in Sequential to pick out the hidden state:
net = nn.Sequential(
    nn.GRU(dim_in, dim_out, batch_first=True),
    SelectItem(1)
)
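To get all the way to the Linear layer from the question, one way (a sketch, with the batch size and sequence length made up for illustration) is to chain two SelectItem modules: the first picks h_n out of the (output, h_n) tuple, the second picks the last layer's hidden state out of h_n.

```python
import torch
import torch.nn as nn


class SelectItem(nn.Module):
    """Picks one element out of a tuple, list, or tensor by index."""

    def __init__(self, item_index):
        super().__init__()
        self.item_index = item_index

    def forward(self, inputs):
        return inputs[self.item_index]


rec_layer = nn.Sequential(
    nn.GRU(input_size=2, hidden_size=256, batch_first=True),
    SelectItem(1),   # h_n, shape (num_layers, batch, hidden_size)
    SelectItem(-1),  # hidden state of the last layer, shape (batch, hidden_size)
    nn.Linear(in_features=256, out_features=1),
)

x = torch.randn(8, 5, 2)  # (batch, seq_len, features), hypothetical sizes
y = rec_layer(x)
print(y.shape)            # torch.Size([8, 1])
```

Note that h_n keeps the layout (num_layers, batch, hidden_size) regardless of batch_first, so indexing with -1 selects the last layer rather than the last sample.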

Resources