After using a GPU for some time to train a PyTorch model, can I use the saved weights to continue training my model on a TPU?
After using a GPU for some time, can I use the saved weights to train my model on a TPU?
Yes. If you saved your GPU-trained model with, say,
torch.save(model.state_dict(), 'model.pt')
you can load it again for use on a TPU (using https://github.com/pytorch/xla) in a separate program run with
import torch_xla.utils.serialization as xser
model.load_state_dict(xser.load('model.pt'))
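The device-independent part of this round trip can be sketched with plain PyTorch. This is a minimal sketch, not the answerer's code: build_model and the nn.Linear layer are hypothetical stand-ins for your own architecture, and the example runs on CPU so that it works without a GPU or TPU. The assumption is that the weights are mapped to CPU when loaded and moved to the target device afterwards.

```python
import os
import tempfile

import torch
import torch.nn as nn


def build_model():
    # Hypothetical stand-in for the real architecture; the same class
    # must be constructed in both the saving and the loading program.
    return nn.Linear(4, 2)


# Program 1 (GPU run in the question; CPU here): save only the weights.
trained = build_model()
path = os.path.join(tempfile.mkdtemp(), "model.pt")
torch.save(trained.state_dict(), path)

# Program 2: rebuild the model, then load the weights onto CPU first.
# map_location="cpu" avoids needing the original GPU to be present.
restored = build_model()
state = torch.load(path, map_location="cpu")
restored.load_state_dict(state)

# On a TPU host, the model would then be moved to the XLA device, e.g.
#   import torch_xla.core.xla_model as xm
#   restored = restored.to(xm.xla_device())
weights_match = all(
    torch.equal(trained.state_dict()[name], restored.state_dict()[name])
    for name in state
)
print(weights_match)
```

Saving the state_dict rather than the whole model object keeps the file independent of the device the model was trained on.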
Related
I am new to PyTorch. My question is: how do I apply transfer learning to a custom dataset? I am doing image segmentation on brain tumors. I can find examples that use the U-Net structure, but I could not find examples that use the weights of pre-trained models for U-Net image segmentation.
You could obtain pre-trained models in two ways:
Model weights or complete models shared in formats such as .pt or .pth:
In this case, Saving and Loading Models is a good starting point. Copying from the tutorial there, you could load a model as
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
The other way is to load the model from torchvision. A list of available models is given at Torchvision Models. U-Net is not available yet. However, it is possible to load a pre-trained model as the encoder and write a separate decoder to form a U-Net with a pre-trained encoder.
In this case, the model object returned from the function calls shown in the API is already loaded with pre-trained weights when pretrained=True.
For writing a custom dataloader, PyTorch data loaders may be a useful guide.
# define the model
model = MaskRCNN(mode='training', model_dir='./', config=config)
# load weights (mscoco) and exclude the output layers
model.load_weights('mask_rcnn_coco.h5', by_name=True, exclude=["mrcnn_class_logits", "mrcnn_bbox_fc", "mrcnn_bbox", "mrcnn_mask"])
# train weights (output layers or 'heads')
model.train(train_set, test_set, learning_rate=config.LEARNING_RATE, epochs=2, layers='heads')
I have certain medical images containing fibroids.
I wish to apply instance segmentation or object detection.
I may have to use Mask R-CNN for instance segmentation and object detection. Is it possible to train the network from scratch instead of using transfer learning?
What I mean here is random initialization of the weights for my data, instead of using weights derived from ImageNet or COCO data.
From the command line, instead of training a model starting from pre-trained COCO weights like this
python my_model.py train --dataset=/path/dataset --weights=coco
execute the following line.
python my_model.py train --dataset=/path/dataset
To train all layers, starting from the first one, execute the following code.
model.train(dataset_train, dataset_val, learning_rate=config.LEARNING_RATE, epochs=10, layers='all')
Can't you just run the training without the model.load_weights() line? It seems to run fine for me when I do that. I assume it then starts from randomly initialized weights. The results were not quite as good as when starting from COCO weights, but I'm sure that's expected behavior for some datasets.
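The difference between the two starting points can be illustrated with a small PyTorch analogue. This is a sketch of the idea only, not the Mask R-CNN library's code: TinyNet and its backbone and head layers are hypothetical, and the 80-class model merely stands in for a COCO-trained network. Skipping the load step leaves the random initialization in place, while loading with the head excluded mirrors the exclude=[...] argument used earlier.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical model with a feature extractor and an output head."""

    def __init__(self, num_classes):
        super().__init__()
        self.backbone = nn.Linear(8, 16)
        self.head = nn.Linear(16, num_classes)

    def forward(self, x):
        return self.head(torch.relu(self.backbone(x)))


# Stand-in for a network pre-trained on a large dataset with 80 classes.
pretrained = TinyNet(num_classes=80)

# Training from scratch: construct the model and never load weights.
# All parameters keep their random initialization.
scratch = TinyNet(num_classes=2)

# Transfer learning: copy everything except the output head, whose
# shape depends on the number of classes in the new dataset.
finetune = TinyNet(num_classes=2)
kept = {
    name: tensor
    for name, tensor in pretrained.state_dict().items()
    if not name.startswith("head.")
}
# strict=False lets the excluded head parameters stay at their random init.
result = finetune.load_state_dict(kept, strict=False)
print(sorted(result.missing_keys))
```

In both cases the head starts from random values; the only difference is whether the feature extractor does too.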
I have a Keras LSTM model and want to run it on multiple GPUs to improve speed, but some points are unclear to me:
1- I found that to really get the great speed on GPU I should define my network using CuDNNLSTM layer and not normal LSTM layer. To use multiple GPUs, I looked at Keras documentation and wanted to use multi_gpu_model() function to make distributed model. However, in the sample scripts they recommend to define the model on CPU for easy weight sharing, but my CuDNNLSTM model is not deployable on CPU and LSTM model will not benefit from the enhancements provided by GPU. What is the correct approach?
2- So I tried many configurations, including:
Group 1 (using the normal, non-fast LSTM layers): placing the model on the CPU with no copying to a GPU; placing the model on the CPU and then using multi_gpu_model to create GPU copies; placing the model on the default GPU with no copying to another GPU; placing the model on the default GPU and then using multi_gpu_model to create two GPU copies.
Group 2 (using the CuDNNLSTM layer, so the model cannot be placed on the CPU): defining a single model (which TensorFlow places on the default GPU); using multi_gpu_model to create two GPU copies.
In all cases, data parallelism (using multi_gpu_model) resulted in slower execution. I didn't change anything else in my code, input data pipeline, or batch sizes. What am I doing wrong?
3- In general, should I only use CuDNN-type layers to get high speed computation with GPUs when I am programming at high level of keras API?
I'm using faster_rcnn_resnet50 to train a model that will detect corrosion in images, and I want to train the model from scratch instead of using transfer learning.
I don't know if this is right, but the reason I want to do this is that the existing weights (which were trained on COCO) will affect my model trained on corrosion images.
One way I would like to do this is to randomize or unfreeze the weights of the ResNet-50 feature extractor and then train the model on my images.
But there's no function or option in the resnet50 config file to randomize or unfreeze weights.
I've made a new labelmap with a single label and tried it with transfer learning. It works, but I would like to have a model that is trained only on my images, so that the previous weights don't affect my predictions.
This is the first time I'm working with object detection and transfer learning. Will the weights of the pre-trained model on COCO affect my model which is trained on custom images of corrosion? How do you use tensorflow object-detection API without transfer learning?
I am following this to train a GAN model on the MNIST dataset, and I want to save the model and restore it later for prediction. After gaining some understanding of Saving and Importing a Tensorflow Model, I am able to save and restore some input and output variables, but for this network I can only save the model after certain iterations and cannot predict any output.
Did you refer to this guide? It explains very clearly how to load and save tensorflow models in all possible formats.
If you are new to ML, I'd recommend trying Keras first, which is much easier to use. See https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model. In short, you can use
model.save('my_model.h5')
to save your model to disk, and
from keras.models import load_model
model = load_model('my_model.h5')
to load your model again and make predictions.