Torchvision normalize - how it operates on tuple of means/sds? - pytorch

I don't understand how this transform works from torchvision. Ultimately I want to build a custom normalize class so I need to figure out how this works first.
Here in the docs it describes the init like this:
def __init__(self, mean, std, inplace=False):
    self.mean = mean
    self.std = std
    self.inplace = inplace
And when I use the transform normally (not in a custom class), I pass these parameters as a list or tuple with one value per channel:
transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
But if I look at the call:
return F.normalize(tensor, self.mean, self.std, self.inplace)
All this does is pass the tuple to F.normalize(), which only accepts a single value for the p parameter.
The class must somehow handle each channel to make this work, but how does it do this, and how can I implement it in a custom class?
Based on this tutorial, I would describe it like this:
class Normalize(object):
    """Convert ndarrays in sample to Tensors."""

    def __init__(self, mean, std, inplace=False):
        self.mean = mean
        self.std = std
        self.inplace = inplace

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']
        return {'image': F.normalize(image, self.mean, self.std, self.inplace),
                'landmarks': landmarks}
But this does not work because it does not go through each channel.

The normalize function called in there is this one https://github.com/pytorch/vision/blob/master/torchvision/transforms/functional.py#L191
The input is a tensor of shape (C, H, W), and mean and std can be sequences that are internally converted to tensors. The normalization is done through broadcasting, in this way:
tensor.sub_(mean[:, None, None]).div_(std[:, None, None])
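So a custom class only needs to reproduce that broadcasting; no explicit loop over channels is required. Here is a minimal sketch of the sample-dict version from the question, assuming image is a (C, H, W) tensor and mean/std are per-channel sequences:

import torch

class Normalize(object):
    """Normalize a (C, H, W) image tensor channel-wise, keeping landmarks untouched."""

    def __init__(self, mean, std, inplace=False):
        self.mean = mean
        self.std = std
        self.inplace = inplace

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']
        if not self.inplace:
            image = image.clone()
        # Convert the per-channel sequences to tensors and broadcast over H and W.
        mean = torch.as_tensor(self.mean, dtype=image.dtype, device=image.device)
        std = torch.as_tensor(self.std, dtype=image.dtype, device=image.device)
        image.sub_(mean[:, None, None]).div_(std[:, None, None])
        return {'image': image, 'landmarks': landmarks}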

Related

Sklearn Pipeline: One feature automatically missed out

I created a custom classifier (a dummy classifier). Below is its definition. I also added some print statements and global variables to capture values:
class FeaturePassThroughClassifier(ClassifierMixin):
    def __init__(self):
        pass

    def fit(self, X, y):
        global test_arr1
        self.classes_ = np.unique(y)
        test_arr1 = X
        print("1:", X.shape)
        return self

    def predict(self, X):
        global test_arr2
        test_arr2 = X
        print("2:", X.shape)
        return X

    def predict_proba(self, X):
        global test_arr3
        test_arr3 = X
        print("3:", X.shape)
        return X
Below is the StackingClassifier definition, where the custom classifier defined above is one of the base classifiers. There are 3 more base classifiers (these are fitted estimators). The goal is to get the input training-set variables as-is (which come out of the custom classifier) plus the predictions from base_classifier2, base_classifier3, and base_classifier4. These features then act as input to the meta classifier.
model = StackingClassifier(
    estimators=[
        ('select_features', Pipeline(steps=[
            ("model_feature_selector", ColumnTransformer([('feature_list', 'passthrough', X_train.columns)])),
            ('base(dummy)_classifier1', FeaturePassThroughClassifier())
        ])),
        ('base_classifier2', base_classifier2),
        ('base_classifier3', base_classifier3),
        ('base_classifier4', base_classifier4)
    ],
    final_estimator=Pipeline(memory=None,
        steps=[
            ('save_base_estimator_output_data', FunctionTransformer(save_base_estimator_output_data, validate=False)),
            ('final_model', RandomForestClassifier())
        ],
        verbose=True),
    passthrough=False,
    stack_method='predict_proba')
Below is the output on fitting the model. There are 230 variables:
Here is the problem: there are 230 variables, but the CustomClassifier output shows only 229, which is strange. We can clearly see from the print statements above that 230 variables get passed through CustomClassifier.
I need to use stack_method = "predict_proba". I am not sure what's going wrong here. The code works fine when stack_method = "predict".
Since this is a binary classification problem, the classifier is expected to return two probability columns in the output matrix: one for the probability of class label "1" and another for "0".
In the output, one of these has been dropped, since both are not required, hence the 230 columns get reduced to 229. Add a dummy column to solve your problem.
In the Notes section of the documentation:
When predict_proba is used by each estimator (i.e. most of the time for stack_method='auto' or specifically for stack_method='predict_proba'), the first column predicted by each estimator will be dropped in the case of a binary classification problem.
Here's the code that eliminates the first column.
You could add a sacrificial first column in your custom estimator's predict_proba, or switch to decision_function (which will cause differences depending on your real base estimators), or use the passthrough option instead of the custom estimator (doing feature selection in the final_estimator object instead).
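For the first option, here is a minimal sketch of what the custom estimator's predict_proba could look like, mirroring the class from the question; the column of zeros is the sacrificial one, so treat this as an illustration rather than a drop-in replacement:

import numpy as np
from sklearn.base import ClassifierMixin

class FeaturePassThroughClassifier(ClassifierMixin):
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        return self

    def predict(self, X):
        return X

    def predict_proba(self, X):
        X = np.asarray(X)
        # Prepend a dummy column of zeros: StackingClassifier drops the first
        # predict_proba column for binary problems, so the real features survive.
        return np.hstack([np.zeros((X.shape[0], 1)), X])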
Both the above solutions are on point. This is how I implemented the workaround with dummy column:
Declare a custom transformer whose output is the column that gets dropped, due to the reasons explained above:
class add_dummy_column(BaseEstimator, TransformerMixin):
    def __init__(self, key):
        self.key = key

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        print(type(X))
        return X[[self.key]]
Do a feature union where the above custom transformer and the column transformer are combined to create the final dataframe. This duplicates the column that gets dropped. Below is the altered StackingClassifier definition with FeatureUnion:
model = StackingClassifier(
    estimators=[
        ('select_features', Pipeline(steps=[
            ('featureunion', FeatureUnion([
                ('add_dummy_column_to_input_dataframe', add_dummy_column(key='FEATURE_THAT_GETS_DROPPED')),
                ("model_feature_selector", ColumnTransformer([('feature_list', 'passthrough', X_train.columns)]))
            ])),
            ('base(dummy)_classifier1', FeaturePassThroughClassifier())
        ])),
        ('base_classifier2', base_classifier2),
        ('base_classifier3', base_classifier3),
        ('base_classifier4', base_classifier4)
    ],
    final_estimator=Pipeline(memory=None,
        steps=[
            ('save_base_estimator_output_data', FunctionTransformer(save_base_estimator_output_data, validate=False)),
            ('final_model', RandomForestClassifier())
        ],
        verbose=True),
    passthrough=False,
    stack_method='predict_proba')

Subclass of PyTorch DataLoader for changing batch output

I'm interested in a way of applying a transform to a batch generated by a PyTorch DataLoader class. My minimal example is something like this:
class CustomLoader(torch.utils.data.DataLoader):
    def __iter__(self):
        result = super().__iter__()
        return some_function(result)
But this errors, since DataLoader.__iter__() returns a _MultiProcessingDataLoaderIter or _SingleProcessingDataLoaderIter. Weirdly though, directly returning the output does return a Tensor, so any explanation there would be greatly appreciated!
I understand that, in general, transforms on the data should be done in the subclassed Dataset class. However, in my case the data is tabular and the transform is via numpy, and doing it on a sample-wise basis is much slower (5x) than doing it on an entire batch, since these operations are surely vectorized under the hood.
I know I can do something simple like
for X, y in loader:
    X = some_function(X)
But I'd also like to use the DataLoader with pytorch-lightning, so this isn't an option.
What is the proper way to subclass PyTorch Dataloaders?
__iter__() should be a generator: you will need to yield the result instead of returning it. You can read more about generators here.
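As a rough sketch of that idea (some_function below is just a placeholder standing in for your vectorized batch transform), overriding __iter__ as a generator could look like this:

import torch
from torch.utils.data import DataLoader

def some_function(batch):
    # placeholder for the vectorized numpy/torch transform from the question
    return batch

class CustomLoader(DataLoader):
    def __iter__(self):
        # A generator: yield each already-collated batch after transforming it,
        # instead of returning the parent iterator object.
        for batch in super().__iter__():
            yield some_function(batch)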
Regarding your problem of applying a transform to a batch, you can also create a custom Dataset instead of a custom DataLoader and apply the transforms there.
class MyDataset(Dataset):
    def __init__(self, transforms=None):
        super().__init__()
        self.data = ...  # define your data here
        self.transforms = transforms

    def __getitem__(self, idx):
        x = self.data[idx]
        if self.transforms:
            x = self.transforms(x)
        return x

# use your `MyDataset` class for creating your dataloader
dataloader = DataLoader(MyDataset(transforms=CustomTransforms()), batch_size=4)
You can use this dataloader with PyTorch Lightning Trainer as well.
If you are using PyTorch Lightning, I would suggest you join our Slack channel and ask questions on GitHub Discussions as well.
Thanks :)
EDIT: (Add transforms to Batch)
If you are using PyTorch Lightning, then I would recommend using LightningDataModule, which provides the on_before_batch_transfer hook that can be used to apply transforms to a batch ;)
Here is an example:
def on_before_batch_transfer(self, batch, dataloader_idx):
    batch['x'] = transforms(batch['x'])
    return batch
Check out the documentation for more.

How augmentation increase number of images [duplicate]

I am a little bit confused about the data augmentation performed in PyTorch. Now, as far as I know, when we are performing data augmentation, we are KEEPING our original dataset, and then adding other versions of it (Flipping, Cropping...etc). But that doesn't seem like happening in PyTorch. As far as I understood from the references, when we use data.transforms in PyTorch, then it applies them one by one. So for example:
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}
Here, for the training, we are first randomly cropping the image and resizing it to shape (224, 224). Then we are taking these (224, 224) images and horizontally flipping them. Therefore, our dataset now contains ONLY the horizontally flipped images, so our original images are lost in this case.
Am I right? Is this understanding correct? If not, then where do we tell PyTorch in this code above (taken from Official Documentation) to keep the original images and resize them to the expected shape (224,224)?
Thanks
I assume you are asking whether these data augmentation transforms (e.g. RandomHorizontalFlip) actually increase the size of the dataset, or whether they are applied to each item in the dataset one by one without adding to its size.
Running the following simple code snippet, we can observe that the latter is true: if you have a dataset of 8 images and create a PyTorch Dataset object for it, then when you iterate through the dataset the transforms are called on each data point, and the transformed data point is returned. So, for example, if you have random flipping, some of the data points are returned as the original and some are returned flipped (e.g. 4 flipped and 4 original). In other words, one iteration through the dataset yields 8 data points, some flipped and some not, which is at odds with the conventional understanding of augmenting the dataset (e.g. ending up with 16 data points in this case).
import torch
from torch.utils.data import Dataset
from torchvision import transforms

class experimental_dataset(Dataset):
    def __init__(self, data, transform):
        self.data = data
        self.transform = transform

    def __len__(self):
        return self.data.shape[0]

    def __getitem__(self, idx):
        item = self.data[idx]
        item = self.transform(item)
        return item

transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor()
])

x = torch.rand(8, 1, 2, 2)
print(x)

dataset = experimental_dataset(x, transform)

for item in dataset:
    print(item)
Results (the small floating-point differences are caused by converting to a PIL image and back):
Original dummy dataset:
tensor([[[[0.1872, 0.5518],
[0.5733, 0.6593]]],
[[[0.6570, 0.6487],
[0.4415, 0.5883]]],
[[[0.5682, 0.3294],
[0.9346, 0.1243]]],
[[[0.1829, 0.5607],
[0.3661, 0.6277]]],
[[[0.1201, 0.1574],
[0.4224, 0.6146]]],
[[[0.9301, 0.3369],
[0.9210, 0.9616]]],
[[[0.8567, 0.2297],
[0.1789, 0.8954]]],
[[[0.0068, 0.8932],
[0.9971, 0.3548]]]])
transformed dataset:
tensor([[[0.1843, 0.5490],
[0.5725, 0.6588]]])
tensor([[[0.6549, 0.6471],
[0.4392, 0.5882]]])
tensor([[[0.5647, 0.3255],
[0.9333, 0.1216]]])
tensor([[[0.5569, 0.1804],
[0.6275, 0.3647]]])
tensor([[[0.1569, 0.1176],
[0.6118, 0.4196]]])
tensor([[[0.9294, 0.3333],
[0.9176, 0.9608]]])
tensor([[[0.8549, 0.2275],
[0.1765, 0.8941]]])
tensor([[[0.8902, 0.0039],
[0.3529, 0.9961]]])
The transforms operations are applied to your original images at every batch generation. So your dataset is left unchanged, only the batch images are copied and transformed every iteration.
The confusion may come from the fact that often, like in your example, transforms are used both for data preparation (resizing/cropping to expected dimensions, normalizing values, etc.) and for data augmentation (randomizing the resizing/cropping, randomly flipping the images, etc.).
What your data_transforms['train'] does is:
Randomly resize the provided image and randomly crop it to obtain a (224, 224) patch
Apply or not a random horizontal flip to this patch, with a 50/50 chance
Convert it to a Tensor
Normalize the resulting Tensor, given the mean and deviation values you provided
What your data_transforms['val'] does is:
Resize your image to (256, 256)
Center crop the resized image to obtain a (224, 224) patch
Convert it to a Tensor
Normalize the resulting Tensor, given the mean and deviation values you provided
(i.e. the random resizing/cropping for the training data is replaced by a fixed operation for the validation one, to have reliable validation results)
If you don't want your training images to be horizontally flipped with a 50/50 chance, just remove the transforms.RandomHorizontalFlip() line.
Similarly, if you want your images to always be center-cropped, replace transforms.RandomResizedCrop by transforms.Resize and transforms.CenterCrop, as done for data_transforms['val'].
Yes, the dataset size does not change after the transformations. Every image is passed through the transform and returned, so the size remains the same.
If you wish to use the original dataset together with the transformed one, concatenate them, e.g.
increased_dataset = torch.utils.data.ConcatDataset([transformed_dataset, original])
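A small sketch of that, reusing the experimental_dataset class from the snippet above (assumed to be in scope); the always-flip transform is only there to make the two copies visibly different:

import torch
from torch.utils.data import ConcatDataset
from torchvision import transforms

identity = transforms.Compose([])  # keeps samples unchanged
always_flip = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomHorizontalFlip(p=1.0),
    transforms.ToTensor(),
])

x = torch.rand(8, 1, 2, 2)
original = experimental_dataset(x, identity)                  # 8 original samples
transformed_dataset = experimental_dataset(x, always_flip)    # 8 flipped samples

increased_dataset = ConcatDataset([transformed_dataset, original])
print(len(increased_dataset))  # 16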
The purpose of data augmentation is to increase the diversity of the training dataset.
Even though data.transforms doesn't change the size of the dataset, every epoch we revisit the dataset, the transform operations are executed again, and we get different data.
I changed @Ashkan372's code slightly to output data for multiple epochs:
import torch
from torchvision import transforms
from torch.utils.data import TensorDataset as Dataset
from torch.utils.data import DataLoader

class experimental_dataset(Dataset):
    def __init__(self, data, transform):
        self.data = data
        self.transform = transform

    def __len__(self):
        return self.data.shape[0]

    def __getitem__(self, idx):
        item = self.data[idx]
        item = self.transform(item)
        return item

transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor()
])

x = torch.rand(8, 1, 2, 2)
print('the original data: \n', x)

epoch_size = 3
batch_size = 4

dataset = experimental_dataset(x, transform)
for i in range(epoch_size):
    print('----------------------------------------------')
    print('the epoch', i, 'data: \n')
    for item in DataLoader(dataset, batch_size, shuffle=False):
        print(item)
The output is:
the original data:
tensor([[[[0.5993, 0.5898],
[0.7365, 0.5472]]],
[[[0.1878, 0.3546],
[0.2124, 0.8324]]],
[[[0.9321, 0.0795],
[0.4090, 0.9513]]],
[[[0.2825, 0.6954],
[0.3737, 0.0869]]],
[[[0.2123, 0.7024],
[0.6270, 0.5923]]],
[[[0.9997, 0.9825],
[0.0267, 0.2910]]],
[[[0.2323, 0.1768],
[0.4646, 0.4487]]],
[[[0.2368, 0.0262],
[0.2423, 0.9593]]]])
----------------------------------------------
the epoch 0 data:
tensor([[[[0.5882, 0.5961],
[0.5451, 0.7333]]],
[[[0.3529, 0.1843],
[0.8314, 0.2118]]],
[[[0.9294, 0.0784],
[0.4078, 0.9490]]],
[[[0.6941, 0.2824],
[0.0863, 0.3725]]]])
tensor([[[[0.7020, 0.2118],
[0.5922, 0.6235]]],
[[[0.9804, 0.9961],
[0.2902, 0.0235]]],
[[[0.2314, 0.1765],
[0.4627, 0.4471]]],
[[[0.0235, 0.2353],
[0.9569, 0.2392]]]])
----------------------------------------------
the epoch 1 data:
tensor([[[[0.5882, 0.5961],
[0.5451, 0.7333]]],
[[[0.1843, 0.3529],
[0.2118, 0.8314]]],
[[[0.0784, 0.9294],
[0.9490, 0.4078]]],
[[[0.2824, 0.6941],
[0.3725, 0.0863]]]])
tensor([[[[0.2118, 0.7020],
[0.6235, 0.5922]]],
[[[0.9804, 0.9961],
[0.2902, 0.0235]]],
[[[0.2314, 0.1765],
[0.4627, 0.4471]]],
[[[0.0235, 0.2353],
[0.9569, 0.2392]]]])
----------------------------------------------
the epoch 2 data:
tensor([[[[0.5882, 0.5961],
[0.5451, 0.7333]]],
[[[0.3529, 0.1843],
[0.8314, 0.2118]]],
[[[0.0784, 0.9294],
[0.9490, 0.4078]]],
[[[0.6941, 0.2824],
[0.0863, 0.3725]]]])
tensor([[[[0.2118, 0.7020],
[0.6235, 0.5922]]],
[[[0.9961, 0.9804],
[0.0235, 0.2902]]],
[[[0.2314, 0.1765],
[0.4627, 0.4471]]],
[[[0.0235, 0.2353],
[0.9569, 0.2392]]]])
Different epoch we get different outputs!
TL;DR:
The transform operation applies a bunch of transforms, each with a certain probability, to the input batches that come in the loop. So the model is exposed to more varied examples over the course of multiple epochs.
Personally, when I was training an audio classification model on my own dataset, before augmentation my model always seemed to converge at 72% accuracy. Using augmentation along with an increased number of training epochs boosted the validation accuracy to 89%.
In PyTorch, there are types of cropping that DO change the size of the dataset. These are FiveCrop and TenCrop:
CLASS torchvision.transforms.FiveCrop(size)
Crop the given image into four corners and the central crop.
This transform returns a tuple of images and there may be a mismatch
in the number of inputs and targets your Dataset returns. See below
for an example of how to deal with this.
Example:
>>> transform = Compose([
>>> TenCrop(size), # this is a list of PIL Images
>>> Lambda(lambda crops: torch.stack([ToTensor()(crop) for crop in crops])) # returns a 4D tensor
>>> ])
>>> #In your test loop you can do the following:
>>> input, target = batch # input is a 5d tensor, target is 2d
>>> bs, ncrops, c, h, w = input.size()
>>> result = model(input.view(-1, c, h, w)) # fuse batch size and ncrops
>>> result_avg = result.view(bs, ncrops, -1).mean(1) # avg over crops
TenCrop is the same plus the flipped version of the five patches (horizontal flipping is used by default).

Display image in a PIL format from torch.Tensor

I’m quite new to Pytorch. I was wondering how I could convert my tensor of size torch.Size([1, 3, 224, 224]) to display in an image format on a Jupyter notebook. A PIL format or a CV2 format should be fine.
I tried using transforms.ToPILImage(x) but it resulted in a different format like this: ToPILImage(mode=ToPILImage(mode=tensor([[[[1.3034e-16, 1.3034e-16, 1.3034e-16, ..., 1.4475e-16,.
Maybe I’m doing something wrong :no_mouth:
Since your image is normalized, you need to unnormalize it. You have to do the reverse operations that you did during normalization. One way is
class UnNormalize(object):
    def __init__(self, mean, std):
        self.mean = mean
        self.std = std

    def __call__(self, tensor):
        """
        Args:
            tensor (Tensor): Normalized Tensor image of size (C, H, W).
        Returns:
            Tensor: Un-normalized image.
        """
        for t, m, s in zip(tensor, self.mean, self.std):
            t.mul_(s).add_(m)
            # The normalize code -> t.sub_(m).div_(s)
        return tensor
To use this, you'll need the mean and standard deviation (which you used to normalize the image). Then,
unorm = UnNormalize(mean = [0.35675976, 0.37380189, 0.3764753], std = [0.32064945, 0.32098866, 0.32325324])
image = unorm(normalized_image)
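After un-normalizing, converting the [1, 3, 224, 224] tensor into a displayable PIL image could look like the sketch below. Note that ToPILImage has to be instantiated before being called (that was the error in the original attempt), and the mean/std here are placeholders for whatever values you actually normalized with:

from torchvision import transforms

unorm = UnNormalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])  # use your own values

img = unorm(tensor.squeeze(0).clone())    # [1, 3, 224, 224] -> [3, 224, 224]; clone to keep the original
pil_image = transforms.ToPILImage()(img)  # instantiate the transform, then call it on the tensor
pil_image                                 # last expression in a Jupyter cell renders the image inline
# or explicitly: pil_image.show()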

How to use multiprocessing in PyTorch?

I'm trying to use PyTorch with a complex loss function. In order to accelerate the code, I hope that I can use the PyTorch multiprocessing package.
In the first trial, I put 10x1 features into the NN and get a 10x4 output.
After that, I want to pass the 10x4 parameters into a function to do some calculation. (The calculation will be complex in the future.)
After calculating, the function will return a 10x1 array in total. This array will be set as NN_energy and used to calculate the loss function.
Besides, I also want to know if there is another method to create a backward-able array to store the NN_energy array, instead of using
NN_energy = net(Data_in)[0:10,0]
Thanks a lot.
Full Code:
import torch
import numpy as np
from torch.autograd import Variable
from torch import multiprocessing

def func(msg, BOP):
    ans = (BOP[msg][0] + BOP[msg][1] / BOP[msg][2]) * BOP[msg][3]
    return ans

class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden_1, n_hidden_2, n_output):
        super(Net, self).__init__()
        self.hidden_1 = torch.nn.Linear(n_feature, n_hidden_1)   # hidden layer
        self.hidden_2 = torch.nn.Linear(n_hidden_1, n_hidden_2)  # hidden layer
        self.predict = torch.nn.Linear(n_hidden_2, n_output)     # output layer

    def forward(self, x):
        x = torch.tanh(self.hidden_1(x))  # activation function for hidden layer
        x = torch.tanh(self.hidden_2(x))  # activation function for hidden layer
        x = self.predict(x)               # linear output
        return x

if __name__ == '__main__':  # apply_async
    Data_in      = Variable(torch.from_numpy(np.asarray(list(range(0, 10))).reshape(10, 1)).float())
    Ground_truth = Variable(torch.from_numpy(np.asarray(list(range(20, 30))).reshape(10, 1)).float())

    net = Net(n_feature=1, n_hidden_1=15, n_hidden_2=15, n_output=4)  # define the network
    optimizer = torch.optim.Rprop(net.parameters())
    loss_func = torch.nn.MSELoss()  # this is for regression mean squared loss

    NN_output = net(Data_in)

    args = range(0, 10)
    pool = multiprocessing.Pool()
    return_data = pool.map(func, zip(args, NN_output))
    pool.close()
    pool.join()

    NN_energy = net(Data_in)[0:10, 0]
    for i in range(0, 10):
        NN_energy[i] = return_data[i]

    loss = torch.sqrt(loss_func(NN_energy, Ground_truth))  # must be (1. nn output, 2. target)
    print(loss)
Error messages:
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\multiprocessing\reductions.py", line 126, in reduce_tensor
    raise RuntimeError("Cowardly refusing to serialize non-leaf tensor which requires_grad, "
RuntimeError: Cowardly refusing to serialize non-leaf tensor which requires_grad, since autograd does not support crossing process boundaries. If you just want to transfer the data, call detach() on the tensor before serializing (e.g., putting it on the queue).
First of all, the torch Variable API has been deprecated for a very long time; just don't use it.
Next, torch.from_numpy( np.asarray(list(range( 0,10))).reshape(10,1) ).float() is wrong on many levels: np.asarray of a list is pointless since a copy will be performed anyway, and np.array takes a list as input by design. Then, np.arange is available to return a range as a numpy array, and it is also available in Torch. Finally, specifying both dimensions for reshape is needless and error prone; you could simply do reshape((-1, 1)), or even better unsqueeze(-1).
Here is the simplified expression: torch.arange(10, dtype=torch.float32, requires_grad=True).unsqueeze(-1).
Using a multiprocessing pool is bad practice when batch processing is possible; batching will be both more efficient and more readable. Indeed, performing N small algebraic operations in parallel is always slower than a single larger algebraic operation, and even more so on GPU. More importantly, computing the gradient is not supported by multiprocessing, hence the error that you get. (This is only partially true, as it is supported for tensors on CPU since 1.6.0; have a look at the official release changelog.)
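As a sketch of that advice: the posted func can be expressed as one batched tensor operation over the whole (10, 4) network output, which keeps the computation inside autograd and removes the pool entirely.

import torch

def func_batched(nn_output):
    # Same arithmetic as `func`, applied to every row at once:
    # (col0 + col1 / col2) * col3
    return (nn_output[:, 0] + nn_output[:, 1] / nn_output[:, 2]) * nn_output[:, 3]

# Hypothetical usage inside the training code from the question:
# NN_output = net(Data_in)               # shape (10, 4)
# NN_energy = func_batched(NN_output)    # shape (10,), still differentiable
# loss = torch.sqrt(loss_func(NN_energy.unsqueeze(-1), Ground_truth))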
Could you post a more representative example of what the func method could be, to make sure you really need it?
NB: distributed autograd, which is what you are looking for, is now available in PyTorch as an experimental feature, in beta since 1.6.0. Have a look at the official documentation.
