PyTorch has ModuleList and ModuleDict. How do I create equivalents in PyTorch Lightning to get all the nice properties of LightningModules?
You can instantiate the ModuleDict/ModuleList inside the LightningModule. Both ModuleDict and ModuleList inherit from torch.nn.Module, so they work exactly as they do in plain PyTorch without you needing to do anything extra.
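As a rough illustration of the point above, here is a minimal sketch that registers layers through ModuleList and ModuleDict. The class name `Encoder`, the layer sizes, and the `head` argument are made up for the example. It subclasses torch.nn.Module so it runs with PyTorch alone; the assumption is that swapping the base class for LightningModule changes nothing about how the containers register their parameters.

```python
import torch
from torch import nn


class Encoder(nn.Module):  # hypothetical; the base class could be LightningModule instead
    def __init__(self):
        super().__init__()
        # Containers register every child module, so parameters are tracked.
        self.blocks = nn.ModuleList([nn.Linear(4, 4) for _ in range(3)])
        self.heads = nn.ModuleDict({"cls": nn.Linear(4, 2), "reg": nn.Linear(4, 1)})

    def forward(self, x, head="cls"):
        for block in self.blocks:
            x = torch.relu(block(x))
        return self.heads[head](x)


model = Encoder()
# Counts the parameters of all layers held in both containers.
num_params = sum(p.numel() for p in model.parameters())
out = model(torch.randn(5, 4), head="reg")
```

A plain Python list or dict of layers would not be registered this way, which is why the container classes are used.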
I'm trying to reproduce the functionality of PyTorch's LSTM using PyTorch's LSTMCell.
I've read this: Pytorch LSTM vs LSTMCell
but I'm not sure how it can be done.
That is, I need to keep the same API as LSTM, so maybe I need to create a method or class for it?
Can you assist?
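One possible way to approach this, sketched under assumptions: wrap nn.LSTMCell in a module that loops over the time steps and returns values in the same layout as nn.LSTM. The class name `CellLSTM` is hypothetical, and the sketch only covers a single-layer, unidirectional LSTM with the default (seq_len, batch, features) input layout. Copying the weights from an nn.LSTM lets the two be compared directly.

```python
import torch
from torch import nn


class CellLSTM(nn.Module):
    """Single-layer, unidirectional LSTM built on nn.LSTMCell (hypothetical helper)."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.cell = nn.LSTMCell(input_size, hidden_size)

    def forward(self, x, state=None):
        # x has shape (seq_len, batch, input_size), the default nn.LSTM layout.
        if state is None:
            h = x.new_zeros(x.size(1), self.hidden_size)
            c = x.new_zeros(x.size(1), self.hidden_size)
        else:
            h, c = state[0][0], state[1][0]  # drop the num_layers dimension
        outputs = []
        for x_t in x:  # one LSTMCell call per time step
            h, c = self.cell(x_t, (h, c))
            outputs.append(h)
        # Same return structure as nn.LSTM: output, (h_n, c_n)
        return torch.stack(outputs), (h.unsqueeze(0), c.unsqueeze(0))


torch.manual_seed(0)
reference = nn.LSTM(input_size=5, hidden_size=7)
wrapped = CellLSTM(5, 7)

# Copy the reference weights so both modules compute the same function.
with torch.no_grad():
    wrapped.cell.weight_ih.copy_(reference.weight_ih_l0)
    wrapped.cell.weight_hh.copy_(reference.weight_hh_l0)
    wrapped.cell.bias_ih.copy_(reference.bias_ih_l0)
    wrapped.cell.bias_hh.copy_(reference.bias_hh_l0)

x = torch.randn(6, 3, 5)
out_ref, (h_ref, c_ref) = reference(x)
out, (h, c) = wrapped(x)
```

Multiple layers, bidirectionality, dropout, and batch_first would each need extra handling that this sketch leaves out.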
I trained a model using PyTorch Lightning and especially appreciated how easy it is to use multiple GPUs. Now, after training, how can I still make use of Lightning's GPU features to run inference on a test set and store/export the predictions?
The documentation on inference does not cover that.
Thanks in advance.
You can implement the validation_epoch_end on your LightningModule which is called "at the end of the validation epoch with the outputs of all validation steps". For this to work you also need to define validation_step on that same module.
Once this is done, you can run validation using your trainer and a given dataloader by calling:
trainer.validate(pl_module, dataloaders=validation_dataloader)
I currently train my model on GPUs using PyTorch Lightning:
trainer = pl.Trainer(gpus=[0, 1],
                     distributed_backend='ddp',
                     resume_from_checkpoint=hparams["resume_from_checkpoint"])
trainer.fit(model, train_dataloader=train_loader, val_dataloaders=val_loader)
The instructions are also clear on how to run test samples with a trainer configured to use the GPU
trainer.test(test_dataloader=test_dataloader)
and also how to load a model and use it interactively
model = transformer.Model.load_from_checkpoint('/checkpoints/run_300_epoch_217.ckpt')
results = model(in_data,
I use the latter to interface with an interactive system via sockets in a Docker container.
Is there a proper way to make this PyTorch Lightning model run on GPU?
The Lightning instructions say not to use model.to(device), but it appears to work just as it does in PyTorch. Is the reason for the instructions to avoid a side effect?
I started reading about ONNX, but would rather just have an easy way to specify GPU since the interactive setup works perfectly with cpu.
My understanding is that "Remove any .cuda() or .to(device) calls" applies only when you use the Lightning trainer, because the trainer handles device placement itself.
If you don't use the trainer, a LightningModule is basically just a regular PyTorch model with some naming conventions, so model.to(device) is how you run it on GPU.
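A minimal sketch of that approach follows. The class `Model` and its layer sizes are stand-ins invented for the example; in the question, the model would instead come from load_from_checkpoint. The sketch uses a plain nn.Module so it runs without Lightning installed, on the assumption that a LightningModule behaves the same way here, and it falls back to CPU when no GPU is present.

```python
import torch
from torch import nn


class Model(nn.Module):  # stand-in for a LightningModule restored from a checkpoint
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(16, 4)

    def forward(self, x):
        return self.net(x)


# Pick the GPU when one is available, otherwise stay on CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = Model().to(device).eval()

in_data = torch.randn(2, 16)
with torch.no_grad():
    # Inputs must live on the same device as the model's parameters.
    results = model(in_data.to(device)).cpu()
```

Keeping the device in one variable means the interactive code path stays identical on CPU-only and GPU machines.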
I have an existing model where I load some pre-trained weights and then do prediction (one image at a time) in pytorch. I am trying to basically convert it to a pytorch lightning module and am confused about a few things.
So currently, my __init__ method for the model looks like this:
self._load_config_file(cfg_file)
# just creates the pytorch network
self.create_network()
self.load_weights(weights_file)
self.cuda(device=0) # assumes GPU and uses one. This is probably suboptimal
self.eval() # prediction mode
From what I can gather from the Lightning docs, I can do pretty much the same, except without the cuda() call. So something like:
self.create_network()
self.load_weights(weights_file)
self.freeze() # prediction mode
So, my first question is whether this is the correct way to use lightning? How would lightning know if it needs to use the GPU? I am guessing this needs to be specified somewhere.
Now, for the prediction, I have the following setup:
def infer(frame):
    img = transform(frame) # apply some transformation to the input
    img = torch.from_numpy(img).float().unsqueeze(0).cuda(device=0)
    with torch.no_grad():
        output = self.__call__(Variable(img)).data.cpu().numpy()
    return output
This is the bit that has me confused. Which functions do I need to override to make a lightning compatible prediction?
Also, at the moment, the input comes as a numpy array. Is that something that would be possible from the lightning module or do things always have to use some sort of a dataloader?
At some point, I want to extend this model implementation to do training as well, so I want to make sure I do it right. Most examples focus on training models, though, so a simple example of just doing prediction at production time on a single image/data point would be useful.
I am using 0.7.5 with pytorch 1.4.0 on GPU with cuda 10.1
LightningModule is a subclass of torch.nn.Module so the same model class will work for both inference and training. For that reason, you should probably call the cuda() and eval() methods outside of __init__.
Since it's just an nn.Module under the hood, once you've loaded your weights you don't need to override any methods to perform inference; simply call the model instance. Here's a toy example you can use:
import torch
import torchvision.models as models
from pytorch_lightning.core import LightningModule

class MyModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.resnet = models.resnet18(pretrained=True, progress=False)

    def forward(self, x):
        return self.resnet(x)

# Put the model in eval mode and move it to the first GPU
model = MyModel().eval().cuda(device=0)
And then to actually run inference you don't need a method, just do something like:
with torch.no_grad():  # no gradients are needed for inference
    for frame in video:
        img = transform(frame)
        img = torch.from_numpy(img).float().unsqueeze(0).cuda(0)
        output = model(img).cpu().numpy()
        # Do something with the output
The main benefit of PyTorch Lightning is that you can also use the same class for training by implementing training_step(), configure_optimizers() and train_dataloader() on that class. You can find a simple example of that in the PyTorch Lightning docs.
Even though the above answer suffices, take note of the following line:
img = torch.from_numpy(img).float().unsqueeze(0).cuda(0)
Both the model and the image have to be put on the right GPU. On a multi-GPU inference machine, this becomes a hassle.
To solve this, .predict was also recently introduced; see more at https://pytorch-lightning.readthedocs.io/en/stable/deploy/production_basic.html
Yes, I've read everywhere that keras and tf.keras aren't compatible. But you can pass tf.keras.layers into a keras model, and it does work. When I try to do that with my own models, it does not work!
If you examine the ResNet source code in Resnet50.py, they build models like
input = layers.Input(shape=input_shape)
x = layers.Dense()(x)
model = Model(input,x)
and it works fine whether you pass in layers=tf.keras.layers or layers=keras.layers
demonstration code:
import tensorflow as tf
import keras

# THIS WORKS!
input_shape = (224, 224, 3)
base_model = keras.applications.ResNet50(layers=tf.keras.layers,
                                         weights='imagenet', include_top=False, pooling=None,
                                         input_shape=input_shape,
                                         classes=1000)

# this fails!!
input = tf.keras.layers.Input(shape=input_shape)
x = tf.keras.layers.Dense(1000, activation='relu')(input)
model = keras.Model(input, x)
My code produces this error:
TypeError: object of type 'Dense' has no len()
How can I make this work? Apparently there is a way, because the keras.applications prebuilt models do seem to support it and work fine.
I want to use tf.keras.layers because their batch normalization layer works differently. This is potentially the easiest way to drop it into our massive existing code base.
I do see this related Stack Overflow post with the same error: Object of Type 'Dense' has no len()
They correctly mention it's due to tf.keras and keras not being compatible. But again, I've confirmed that passing tf.keras.layers into keras.applications.resnet50 does return a keras model with the correct layers. Somehow.
You've drawn the wrong conclusion. keras.applications is a module that supports both the keras and tf.keras packages. Because keras.applications uses models.Model, it detects whether you use tf.keras or keras and gets the corresponding modules, so the code is agnostic to the actual Keras implementation.
keras.applications is not mixing keras and tf.keras; it just supports both.