When using nn.Sequential, I pass in a list of modules (the layers of a neural network). When nn.Sequential runs, it calls the forward function of each module. However, each of these modules also has another function that I would like to call while nn.Sequential runs. How can I access and run this function when running nn.Sequential?
You can use a hook for that. Let's consider the following example demonstrated on VGG16:
This is the network architecture:
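(Excerpt of the printed model, truncated to the first few feature layers; the exact formatting depends on your torchvision version:)
VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    ...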
Say we want to monitor the input and output of layer (2) in the features Sequential (the Conv2d layer you see above).
To do that, we register a forward hook, named my_hook, which will be called on every forward pass through that layer:
import torch
from torchvision.models import vgg16
# The forward hook receives the module, its input (a tuple of tensors) and its output
def my_hook(module, input, output):
    print('my_hook\'s output')
    print('input: ', input)
    print('output: ', output)
# Sample net:
net = vgg16()
#Register forward hook:
net.features[2].register_forward_hook(my_hook)
# Test:
img = torch.randn(1,3,512,512)
out = net(img) # Will trigger my_hook and the data you are looking for will be printed
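register_forward_hook also returns a handle, which is useful if you later want to detach the hook; a small follow-up sketch:
handle = net.features[2].register_forward_hook(my_hook)
out = net(img)   # triggers my_hook
handle.remove()  # the hook no longer fires on later forward passes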
I've got a nodejs AWS Lambda Layer (let's call it dbUtil) with some low-level database access code (stuff like opening connections, executing prepared statements, etc.).
Now I want to create another layer (let's call it modelUtil) with higher level, data model-specific code (stuff like data transfer objects, and model-specific transformations).
I would very much like to be able to leverage the code in the dbUtil layer within the higher-level modelUtil layer, while still being able to import dbUtil into a lambda function independently.
Importing a layer into a lambda function is easy, since SAM plops the layer code into /opt/nodejs/. But as far as I know, nothing analogous exists for layers: AWS doesn't let you import one layer into another in the same way. Additionally, each layer is self-contained, so I couldn't just put const dbUtil = require('./dbUtil') in the modelUtil.js file unless both were in the same directory when I built the layer, which would force them to be the same layer.
Is there a way I can have a dependency from one layer (modelUtil) on another layer (dbUtil) while still allowing them to be treated as independent layers?
I just tested this on Lambda and I can testify that a Layer can import functions and dependencies from another Layer. Even the merge order does not matter.
For your case, for modelUtil Layer to import functions from dbUtil Layer:
(Inside modelUtil)
const func1 = require('/opt/<the location of func1 in dbUtil>')
For modelUtil Layer to import npm dependencies from dbUtil Layer:
(Inside modelUtil)
const dependency = require('<npm package name installed in dbUtil>')
It is as simple as that!
I want to separate model structure authoring and training. The model author designs the model structure, saves the untrained model to a file, and then sends it to a training service, which loads the model structure and trains the model.
Keras has the ability to save the model config and then load it.
How can the same be accomplished with PyTorch?
You can write your own functions to do that in PyTorch. Saving the weights is straightforward: you simply call torch.save(model.state_dict(), 'weightsAndBiases.pth').
For saving the model structure, you can do this:
(Assume you have a model class named Network, and you instantiate yourModel = Network())
model_structure = {'input_size': 784,
                   'output_size': 10,
                   'hidden_layers': [each.out_features for each in yourModel.hidden_layers],
                   'state_dict': yourModel.state_dict()  # if you want to save the weights as well
                  }

torch.save(model_structure, 'model_structure.pth')
Similarly, we can write a function to load the structure.
def load_structure(filepath):
    structure = torch.load(filepath)
    model = Network(structure['input_size'],
                    structure['output_size'],
                    structure['hidden_layers'])
    # model.load_state_dict(structure['state_dict'])  # if you had saved the weights as well
    return model
model = load_structure('model_structure.pth')
print(model)
Edit:
Okay, the above works when you have access to the source code for your class, or when the class is simple enough that you can define a generic class like this:
import torch.nn as nn
import torch.nn.functional as F

class Network(nn.Module):
    def __init__(self, input_size, output_size, hidden_layers, drop_p=0.5):
        ''' Builds a feedforward network with arbitrary hidden layers.

            Arguments
            ---------
            input_size: integer, size of the input layer
            output_size: integer, size of the output layer
            hidden_layers: list of integers, the sizes of the hidden layers
        '''
        super().__init__()
        # Input to a hidden layer
        self.hidden_layers = nn.ModuleList([nn.Linear(input_size, hidden_layers[0])])

        # Add a variable number of more hidden layers
        layer_sizes = zip(hidden_layers[:-1], hidden_layers[1:])
        self.hidden_layers.extend([nn.Linear(h1, h2) for h1, h2 in layer_sizes])

        self.output = nn.Linear(hidden_layers[-1], output_size)
        self.dropout = nn.Dropout(p=drop_p)

    def forward(self, x):
        ''' Forward pass through the network, returns the output logits '''
        for each in self.hidden_layers:
            x = F.relu(each(x))
            x = self.dropout(x)
        x = self.output(x)
        return F.log_softmax(x, dim=1)
However, that will only work for simple cases, so I suppose that's not what you intended.
One option is to define the model architecture in a separate .py file and import it along with the other necessities (if the model architecture is complex), or to define the model right then and there.
Another option is converting your PyTorch model to ONNX and saving it.
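For instance, a minimal hedged sketch of the ONNX route (assuming a model instance yourModel whose forward takes a single 784-dimensional input; the file name is arbitrary):
import torch

dummy_input = torch.randn(1, 784)                         # an example input with the expected shape
torch.onnx.export(yourModel, dummy_input, 'model.onnx')   # saves both architecture and weights in ONNX format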
The third option: in TensorFlow you can create a .pb file that defines both the architecture and the weights of the model, and in PyTorch you would do something like this:
torch.save(model, filepath)
This will save the model object itself, as torch.save() is just a pickle-based save at the end of the day.
model = torch.load(filepath)
This, however, has limitations: your model class definition might not be picklable (possible with some complicated models).
Because this is such an iffy workaround, the answer you'll usually get is - no, you have to declare the class definition before loading the trained model, i.e. you need to have access to the model class source code.
Side notes:
An official answer by one of the core PyTorch devs on the limitations of loading a PyTorch model without code:
We only save the source code of the class definition. We do not save beyond that (like the package sources that the class is referring to).
import foo

class MyModel(...):
    def forward(input):
        foo.bar(input)
Here the package foo is not saved in the model checkpoint.
There are limitations on robustly serializing python constructs. For example the default picklers cannot serialize lambdas. There are helper packages that can serialize more python constructs than the standard, but they still have limitations. Dill is one such package.
Given these limitations, there is no robust way to have torch.load work without having the original source files.
This is a part of the code for a Deep Convolutional Generative Adversarial Network (DC-GAN):
discriminator.trainable = False
ganInput = Input(shape=(100,))
# getting the output of the generator
# and then feeding it to the discriminator
# new model = D(G(input))
x = generator(ganInput)
ganOutput = discriminator(x)
gan = Model(input=ganInput, output=ganOutput)
gan.compile(loss='binary_crossentropy', optimizer=Adam())
I do not understand what the line ganInput = Input(shape=(100,)) does.
Clearly ganInput is a variable, but what is Input? Is it a function?
If Input is a function, then what will ganInput contain?
Then ganInput is fed into the generator; since it is (I assume) an empty variable, that will not matter. Next, ganOutput catches the output of the discriminator.
Then comes the problem. I read about the Model API but I do not fully understand what it does.
To summarise, these are my problems: what is the role of ganInput, and what is Input in the second line? And what is Model, and what is it doing?
Using Keras with TensorFlow backend
COMPLETE SOURCE CODE : https://github.com/yashk2810/DCGAN-Keras/blob/master/DCGAN.ipynb
Please ask for any more clarification / details required. If you know the answer to even one of my queries, please answer it; it will be a huge help. Thanks.
What is input: Notice the wildcard import of keras.layers. In context, Input is keras.layers.Input. Generally, if you see a function or class that wasn't defined or explicitly imported in Python, it got there via a wildcard import, i.e.
from keras.layers import *
That means import everything from keras.layers directly into the workspace.
What is model: The model object is essentially the interface for making neural networks with Keras.
You can read about model and keras.layers.Input at the model docs or at this model guide since I'm not very familiar with Keras.
What's going on in the example is that they define generator and discriminator as Sequentials. But the GAN model is a little more complex than a standard old Sequential. The authors deal with that by marking the data that needs to be fed in at every iteration (in this case, just the random noise for the generator - ganInput) as a keras.layers.Input. Then, like you said, ganOutput catches the output of the discriminator. Since we have two distinct Sequentials that need to be wrapped together, the authors use the Model API.
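As a self-contained sketch of the same pattern (the layer sizes here are hypothetical stand-ins, and newer Keras spells the arguments inputs=/outputs=):
from keras.layers import Input, Dense
from keras.models import Model, Sequential

generator = Sequential([Dense(784, activation='tanh', input_dim=100)])        # toy stand-in for the real generator
discriminator = Sequential([Dense(1, activation='sigmoid', input_dim=784)])   # toy stand-in for the real discriminator

ganInput = Input(shape=(100,))                   # a symbolic tensor describing what to feed in, not actual data
ganOutput = discriminator(generator(ganInput))   # D(G(input))
gan = Model(inputs=ganInput, outputs=ganOutput)  # wraps the two Sequentials into a single trainable model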
I'd like to set up a Keras layer in which each node simply computes the logarithm of the corresponding node in the preceding layer. I see from the Keras documentation that there is a "log" function in its backend module. But somehow I'm not understanding how to use this.
Thanks in advance for any hints you can offer!
You can use any backend function inside a Lambda layer:
from keras.layers import Lambda
import keras.backend as K
Define just any function taking the input tensor:
def logFunc(x):
    return K.log(x)
And create a lambda layer with it:
#add to the model the way you're used to:
model.add(Lambda(logFunc,output_shape=(necessaryWithTheano)))
And if a function is already defined, takes only one argument, and returns a tensor, you don't need to create your own function; just use Lambda(K.log), for instance.
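A minimal end-to-end sketch (hypothetical layer sizes, assuming Keras 2 with the TensorFlow backend, so output_shape can be omitted):
from keras.models import Sequential
from keras.layers import Dense, Lambda
import keras.backend as K

model = Sequential()
model.add(Dense(32, activation='softplus', input_shape=(10,)))  # softplus keeps activations positive, so the log is defined
model.add(Lambda(K.log))                                        # element-wise log of the previous layer's output
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')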
How is it possible to use leaky ReLUs in the newest version of keras?
The function relu() accepts an optional parameter 'alpha' that controls the negative slope, but I cannot figure out how to pass this parameter when constructing a layer.
This line is how I tried to do it,
model.add(Activation(relu(alpha=0.1)))
but then I get the error
TypeError: relu() missing 1 required positional argument: 'x'
How can I use a leaky ReLU, or any other activation function with some parameter?
relu is a function, not a class, and it takes the input to the activation function as the parameter x. The Activation layer takes a function as its argument, so you can wrap it in a lambda over the input x, for example:
model.add(Activation(lambda x: relu(x, alpha=0.1)))
Well, from this source (the Keras docs) and this GitHub question, you use a linear activation and then put the leaky ReLU as another layer right after:
from keras.layers.advanced_activations import LeakyReLU
model.add(Dense(512, input_dim=512, activation='linear'))  # any layer with an identity/linear activation (i.e. no squashing)
model.add(LeakyReLU(alpha=.001)) # add an advanced activation
does that help?
You can build a wrapper for parameterized activations functions. I've found this useful and more intuitive.
class activation_wrapper(object):
    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        def _func(x):
            return self.func(x, *args, **kwargs)
        return _func
Of course I could have used a lambda expression in __call__.
Then
wrapped_relu = activation_wrapper(relu)
Then use it as you have above:
model.add(Activation(wrapped_relu(alpha=0.1)))
You can also use it as part of a layer:
model.add(Dense(64, activation=wrapped_relu(alpha=0.1)))
While this solution is a little more complicated than the one offered by @Thomas Jungblut, the wrapper class can be reused for any parameterized activation function. In fact, I use it whenever I have a family of activation functions that are parameterized.
Keras defines separate activation layers for the most common use cases, including LeakyReLU, ThresholdedReLU, and ReLU (a generic version that supports all ReLU parameters), among others. See the full documentation here: https://keras.io/api/layers/activation_layers
Example usage with the Sequential model:
import tensorflow as tf
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.InputLayer(input_shape=(10,)))
model.add(tf.keras.layers.Dense(16))
model.add(tf.keras.layers.LeakyReLU(0.2))
model.add(tf.keras.layers.Dense(1))
model.add(tf.keras.layers.Activation(tf.keras.activations.sigmoid))
model.compile('adam', 'binary_crossentropy')
If the activation parameter you want to use is unavailable as a predefined class, you could use a plain lambda expression as suggested by @Thomas Jungblut:
from tensorflow.keras.layers import Activation
model.add(Activation(lambda x: tf.keras.activations.relu(x, alpha=0.2)))
However, as noted by @leenremm in the comments, this fails when trying to save or load the model. As suggested, you could use the Lambda layer as follows:
from tensorflow.keras.layers import Activation, Lambda
model.add(Activation(Lambda(lambda x: tf.keras.activations.relu(x, alpha=0.2))))
However, the Lambda documentation includes the following warning:
WARNING: tf.keras.layers.Lambda layers have (de)serialization limitations!
The main reason to subclass tf.keras.layers.Layer instead of using a Lambda layer is saving and inspecting a Model. Lambda layers are saved by serializing the Python bytecode, which is fundamentally non-portable. They should only be loaded in the same environment where they were saved. Subclassed layers can be saved in a more portable way by overriding their get_config method. Models that rely on subclassed Layers are also often easier to visualize and reason about.
As such, the best method for activations not already provided by a layer is to subclass tf.keras.layers.Layer instead. This should not be confused with subclassing object and overriding __call__, as done in @Anonymous Geometer's answer, which is the same as using a lambda without the Lambda layer.
Since my use case is covered by the provided layer classes, I'll leave it up to the reader to implement this method. I am making this answer a community wiki in the event anyone would like to provide an example below.
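That said, here is one possible sketch of such a subclassed layer (a hypothetical parameterized leaky ReLU; the class name and default alpha are my own choices), with get_config overridden so a model using it can be saved and reloaded:
import tensorflow as tf

class ParamLeakyReLU(tf.keras.layers.Layer):
    def __init__(self, alpha=0.2, **kwargs):
        super().__init__(**kwargs)
        self.alpha = alpha                        # negative slope of the activation

    def call(self, inputs):
        return tf.keras.activations.relu(inputs, alpha=self.alpha)

    def get_config(self):
        config = super().get_config()
        config.update({'alpha': self.alpha})      # lets Keras (de)serialize the layer
        return config

It can then be used in place of the LeakyReLU layer in the example above, e.g. model.add(ParamLeakyReLU(alpha=0.2)).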