I need to use ImageFolder together with the albumentations library to do the augmentations in PyTorch; a custom dataloader is not an option.
I am stumped and cannot get ImageFolder to work with albumentations. I have tried something along these lines:
class Transforms:
    def __init__(self, transforms: A.Compose):
        self.transforms = transforms
    def __call__(self, img, *args, **kwargs):
        return self.transforms(image=np.array(img))['image']
and then:
trainset = datasets.ImageFolder(traindir,transform=Transforms(transforms=A.Resize(32 , 32)))
where traindir is a directory of images. However, I get a weird error:
RuntimeError: Given groups=1, weight of size [16, 3, 3, 3], expected input[1024, 32, 32, 3] to have 3 channels, but got 32 channels instead
and I can't seem to find a reproducible example of a simple augmentation pipeline that works with ImageFolder.
UPDATE
On the recommendation of #Shai, I have done this now:
class Transforms:
    def __init__(self):
        self.transforms = A.Compose([A.Resize(224,224),ToTensorV2()])
    def __call__(self, img, *args, **kwargs):
        return self.transforms(image=np.array(img))['image']
trainset = datasets.ImageFolder(traindir,transform=Transforms())
but I get thrown:
self.padding, self.dilation, self.groups)
RuntimeError: Input type (torch.cuda.ByteTensor) and weight type (torch.cuda.FloatTensor) should be the same
You need to use the ToTensorV2 transformation as the final one:
trainset = datasets.ImageFolder(traindir,transform=Transforms(transforms=A.Compose([A.Resize(32 , 32), ToTensorV2()])))
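ToTensorV2 only moves the channel axis (HWC to CHW); it keeps the uint8 dtype, which would explain the ByteTensor error in the UPDATE. Below is a minimal sketch of the two conversions in plain NumPy/PyTorch, assuming a 32x32 RGB image (the random array is a stand-in, not real data). In an albumentations pipeline, placing A.Normalize() or A.ToFloat(max_value=255) before ToTensorV2() should have the same effect on the dtype; that placement is a suggestion and has not been tested against the asker's data.

```python
import numpy as np
import torch

# Stand-in for np.array(pil_image): height x width x channels, uint8.
hwc_uint8 = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)

# Layout fix only (channels-last to channels-first): the dtype is still uint8.
chw_uint8 = torch.from_numpy(np.ascontiguousarray(hwc_uint8.transpose(2, 0, 1)))

# Dtype fix: convert to float32 and scale to [0, 1] so it matches float weights.
chw_float = chw_uint8.float() / 255.0

# A conv layer with the same weight shape as in the error message: [16, 3, 3, 3].
conv = torch.nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)
batch = chw_float.unsqueeze(0).repeat(4, 1, 1, 1)  # batch of 4 identical images
out = conv(batch)
print(batch.shape, batch.dtype, out.shape)
```

Feeding chw_uint8 to the same layer instead would hit a dtype mismatch like the one in the UPDATE.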
After looking at the ImageFolder implementation in PyTorch [link] and some related work on Kaggle [link], I propose the following solution (which I have tested successfully on my side):
import numpy as np
from typing import Any, Callable, Optional, Tuple
from torchvision.datasets.folder import DatasetFolder, default_loader, IMG_EXTENSIONS
class CustomImageFolder(DatasetFolder):
    def __init__(
        self,
        root: str,
        transform: Optional[Callable] = None,
        target_transform: Optional[Callable] = None,
        loader: Callable[[str], Any] = default_loader,
        is_valid_file: Optional[Callable[[str], bool]] = None,
    ):
        super().__init__(
            root,
            loader,
            IMG_EXTENSIONS if is_valid_file is None else None,
            transform=transform,
            target_transform=target_transform,
            is_valid_file=is_valid_file,
        )
        self.imgs = self.samples
    def __getitem__(self, index: int) -> Tuple[Any, Any]:
        """
        Args:
            index (int): Index
        Returns:
            tuple: (sample, target) where target is class_index of the target class.
        """
        path, target = self.samples[index]
        sample = self.loader(path)
        if self.transform is not None:
            try:
                sample = self.transform(sample)
            except Exception:
                sample = self.transform(image=np.array(sample))["image"]
        if self.target_transform is not None:
            target = self.target_transform(target)
        return sample, target
    def __len__(self) -> int:
        return len(self.samples)
Now you can run the code as follows:
trainset = CustomImageFolder(traindir,transform=Transforms(transforms=A.Resize(32 , 32)))
For example, I'm trying to view the implementation of RoI Pooling in pytorch.
Here is a code fragment showing how to use RoIPool in pytorch
import torch
from torchvision.ops.roi_pool import RoIPool
device = torch.device('cuda')
# create feature layer, proposals and targets
num_proposals = 10
feature_map = torch.randn(1, 64, 32, 32)
proposals = torch.zeros((num_proposals, 4))
proposals[:, 0] = torch.randint(0, 16, (num_proposals,))
proposals[:, 1] = torch.randint(0, 16, (num_proposals,))
proposals[:, 2] = torch.randint(16, 32, (num_proposals,))
proposals[:, 3] = torch.randint(16, 32, (num_proposals,))
roi_pool_obj = RoIPool(3, 2**-1)
roi_pool = roi_pool_obj(feature_map, [proposals])
I'm using PyCharm, so when I follow RoIPool from the second line, it opens the file at ~/anaconda3/envs/CV/lib/python3.8/site-packages/torchvision/ops/roi_pool.py, which is exactly the same as the code in the documentation.
I pasted the code below without the docstrings.
from typing import List, Union
import torch
from torch import nn, Tensor
from torch.jit.annotations import BroadcastingList2
from torch.nn.modules.utils import _pair
from torchvision.extension import _assert_has_ops
from ..utils import _log_api_usage_once
from ._utils import convert_boxes_to_roi_format, check_roi_boxes_shape
def roi_pool(
    input: Tensor,
    boxes: Union[Tensor, List[Tensor]],
    output_size: BroadcastingList2[int],
    spatial_scale: float = 1.0,
) -> Tensor:
    if not torch.jit.is_scripting() and not torch.jit.is_tracing():
        _log_api_usage_once(roi_pool)
    _assert_has_ops()
    check_roi_boxes_shape(boxes)
    rois = boxes
    output_size = _pair(output_size)
    if not isinstance(rois, torch.Tensor):
        rois = convert_boxes_to_roi_format(rois)
    output, _ = torch.ops.torchvision.roi_pool(input, rois, spatial_scale, output_size[0], output_size[1])
    return output
class RoIPool(nn.Module):
    def __init__(self, output_size: BroadcastingList2[int], spatial_scale: float):
        super().__init__()
        _log_api_usage_once(self)
        self.output_size = output_size
        self.spatial_scale = spatial_scale
    def forward(self, input: Tensor, rois: Tensor) -> Tensor:
        return roi_pool(input, rois, self.output_size, self.spatial_scale)
    def __repr__(self) -> str:
        s = f"{self.__class__.__name__}(output_size={self.output_size}, spatial_scale={self.spatial_scale})"
        return s
So, in the code example:
When running roi_pool_obj = RoIPool(3, 2**-1), it creates an instance of RoIPool by calling its __init__ method, which only initializes two instance variables;
When running roi_pool = roi_pool_obj(feature_map, [proposals]), it must call the forward() method (but I don't know how), which then calls the roi_pool() function above;
When running the roi_pool() function, it does some checks first and then computes the output with the line output, _ = torch.ops.torchvision.roi_pool(input, rois, spatial_scale, output_size[0], output_size[1]).
But this doesn't show the details of how roi_pool is implemented, and PyCharm shows "Cannot find declaration to go to" when I try to follow torch.ops.torchvision.roi_pool.
To summarize, I have two questions:
How is forward() called when running roi_pool = roi_pool_obj(feature_map, [proposals])?
How can I view the source code of torch.ops.torchvision.roi_pool, or where is the file containing its implementation located?
Last but not least, I've just started reading source code, which is pretty difficult for me. I'd appreciate it if you could also provide some advice or tutorials.
RoIPool is a subclass of torch.nn.Module. Source code:
https://github.com/pytorch/vision/blob/07ae61bf9c21ddd1d5f65d326aa9636849b383ca/torchvision/ops/roi_pool.py#L56
nn.Module defines a __call__ method, which in turn calls the forward method. Source code:
https://github.com/pytorch/pytorch/blob/b2311192e6c4745aac3fdd774ac9d56a36b396d4/torch/nn/modules/module.py#L1234
When you execute the statement roi_pool = roi_pool_obj(feature_map, [proposals]), the __call__ method uses the forward() of RoIPool. Source code:
https://github.com/pytorch/vision/blob/07ae61bf9c21ddd1d5f65d326aa9636849b383ca/torchvision/ops/roi_pool.py#L67
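The dispatch itself is ordinary Python: an object is callable when its class defines __call__. A toy sketch of the idea, with MiniModule and Doubler as made-up names; the real nn.Module.__call__ additionally runs registered hooks around forward:

```python
class MiniModule:
    """Toy stand-in for torch.nn.Module: calling the object runs forward()."""

    def __call__(self, *args, **kwargs):
        # Python invokes __call__ for obj(...); it simply forwards the arguments.
        return self.forward(*args, **kwargs)

    def forward(self, *args, **kwargs):
        raise NotImplementedError("subclasses implement forward()")


class Doubler(MiniModule):
    def forward(self, x):
        return 2 * x


doubler = Doubler()
result = doubler(21)  # same as doubler.__call__(21), which calls doubler.forward(21)
print(result)
```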
RoIPool.forward calls the roi_pool function, which in turn calls torch.ops.torchvision.roi_pool.
https://github.com/pytorch/vision/blob/07ae61bf9c21ddd1d5f65d326aa9636849b383ca/torchvision/ops/roi_pool.py#L52
ops is an object which loads native libraries implemented in C++:
https://github.com/pytorch/pytorch/blob/b2311192e6c4745aac3fdd774ac9d56a36b396d4/torch/_ops.py#L537
so when you call torch.ops.torchvision, it uses the torchvision library.
Here the roi_pool function is registered:
https://github.com/pytorch/vision/blob/7947fc8fb38b1d3a2aca03f22a2e6a3caa63f2a0/torchvision/csrc/ops/roi_pool.cpp#L53
Here you can find the actual implementation of roi_pool:
CPU:
https://github.com/pytorch/vision/blob/7947fc8fb38b1d3a2aca03f22a2e6a3caa63f2a0/torchvision/csrc/ops/cpu/roi_pool_kernel.cpp
GPU:
https://github.com/pytorch/vision/blob/7947fc8fb38b1d3a2aca03f22a2e6a3caa63f2a0/torchvision/csrc/ops/cuda/roi_pool_kernel.cu
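The kernels linked above are C++/CUDA, which can be hard going when you are new to reading source. As a rough orientation, here is a simplified NumPy sketch of RoI max pooling for a single box. It is an illustration only, not a port of torchvision's kernel: roi_max_pool and its arguments are made-up names, it handles one RoI without a batch index or gradients, and its rounding and edge handling may differ from the real implementation.

```python
import math

import numpy as np


def roi_max_pool(feature_map, roi, output_size, spatial_scale=1.0):
    """Max-pool one RoI of a (C, H, W) feature map into (C, out_h, out_w).

    roi is (x1, y1, x2, y2) in input-image coordinates (assumed non-negative).
    """
    _, height, width = feature_map.shape
    out_h, out_w = output_size
    # Scale the box to feature-map coordinates and round half up.
    x1, y1, x2, y2 = (int(math.floor(v * spatial_scale + 0.5)) for v in roi)
    roi_w = max(x2 - x1 + 1, 1)
    roi_h = max(y2 - y1 + 1, 1)
    bin_h, bin_w = roi_h / out_h, roi_w / out_w

    out = np.zeros((feature_map.shape[0], out_h, out_w), dtype=feature_map.dtype)
    for ph in range(out_h):
        # Row range of this output bin, clipped to the feature map.
        h0 = min(max(y1 + math.floor(ph * bin_h), 0), height)
        h1 = min(max(y1 + math.ceil((ph + 1) * bin_h), 0), height)
        for pw in range(out_w):
            w0 = min(max(x1 + math.floor(pw * bin_w), 0), width)
            w1 = min(max(x1 + math.ceil((pw + 1) * bin_w), 0), width)
            if h1 > h0 and w1 > w0:  # empty bins keep the value 0
                out[:, ph, pw] = feature_map[:, h0:h1, w0:w1].max(axis=(1, 2))
    return out


feature_map = np.arange(16, dtype=np.float32).reshape(1, 4, 4)
pooled = roi_max_pool(feature_map, roi=(0, 0, 3, 3), output_size=(2, 2))
print(pooled)
```

Each output cell is the maximum over one sub-window of the box, which is why differently sized boxes all end up with the same output shape.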
I'm wondering how to create a DataLoader that supports multiple types of labels in Pytorch. How do I do this?
You can return a dict of labels for each item in the dataset, and DataLoader is smart enough to collate them for you. i.e. if you provide a dict for each item, the DataLoader will return a dict, where the keys are the label types. Accessing a key of that label type returns a collated tensor of that label type.
See below:
import torch
from torch.utils.data import Dataset, DataLoader
import numpy as np
class M(Dataset):
    def __init__(self):
        super().__init__()
        self.data = np.random.randn(10, 2)
        print(self.data)
    def __getitem__(self, i):
        # the same row doubles as both labels, purely for demonstration
        return self.data[i], {'label_1':self.data[i], 'label_2':self.data[i]}
    def __len__(self):
        return len(self.data)
ds = M()
dl = DataLoader(ds, batch_size=3)
for x, y in dl:
    print(x, '\n', y)
    print(type(x), type(y))
[[-0.33029911 0.36632142]
[-0.25303721 -0.11872778]
[-0.35955625 -1.41633132]
[ 1.28814629 0.38238357]
[ 0.72908184 -0.09222787]
[-0.01777293 -1.81824167]
[-0.85346074 -1.0319562 ]
[-0.4144832 0.12125039]
[-1.29546792 -1.56314292]
[ 1.22566887 -0.71523568]]
tensor([[-0.3303, 0.3663],
[-0.2530, -0.1187],
[-0.3596, -1.4163]], dtype=torch.float64)
{'label_1': tensor([[-0.3303, 0.3663],
[-0.2530, -0.1187],
[-0.3596, -1.4163]], dtype=torch.float64), 'label_2': tensor([[-0.3303, 0.3663],
[-0.2530, -0.1187],
[-0.3596, -1.4163]], dtype=torch.float64)}
<class 'torch.Tensor'> <class 'dict'>
...
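The same mechanism works when the labels have different types. Below is a minimal sketch, assuming one integer class label and one float score per item (MultiLabelDataset, class_id and score are made-up names). The default collate function turns Python ints into an int64 tensor and Python floats into a float64 tensor, key by key.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class MultiLabelDataset(Dataset):
    """Toy dataset returning (features, dict of labels) per item."""

    def __init__(self, n=8):
        self.features = torch.arange(n * 2, dtype=torch.float32).reshape(n, 2)

    def __len__(self):
        return len(self.features)

    def __getitem__(self, i):
        # Two label types per sample: an int class id and a float score.
        labels = {"class_id": i % 3, "score": float(i) / 10}
        return self.features[i], labels


loader = DataLoader(MultiLabelDataset(), batch_size=4)
x, y = next(iter(loader))
print(x.shape, y["class_id"], y["score"])
```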
I'd like to binarize an image before passing it to the dataloader. I have created a dataset class which works well, but in the __getitem__() method I'd like to threshold the image:
def __getitem__(self, idx):
    # Open image, apply transforms and return with label
    img_path = os.path.join(self.dir, self.filelist[filename"])
    image = Image.open(img_path)
    label = self.x_data.iloc[idx]["label"]
    # Applying transformation to the image
    if self.transforms is not None:
        image = self.transforms(image)
    # applying threshold here:
    my_threshold = 240
    image = image.point(lambda p: p < my_threshold and 255)
    image = torch.tensor(image)
    return image, label
And then I tried to invoke the dataset:
data_transformer = transforms.Compose([
    transforms.Resize((10, 10)),
    transforms.Grayscale()
    # transforms.ToTensor()
])
train_set = MyNewDataset(data_path, data_transformer, rows_train)
Since I applied the threshold to a PIL object, I need to convert it to a tensor afterwards, but for some reason it crashes. Can somebody please assist me?
Why not apply the binarization after the conversion from PIL.Image to torch.Tensor?
class ThresholdTransform(object):
    def __init__(self, thr_255):
        self.thr = thr_255 / 255.  # input threshold for [0..255] gray level, convert to [0..1]
    def __call__(self, x):
        return (x > self.thr).to(x.dtype)  # do not change the data type
Once you have this transformation, you simply add it:
data_transformer = transforms.Compose([
    transforms.Resize((10, 10)),
    transforms.Grayscale(),
    transforms.ToTensor(),
    ThresholdTransform(thr_255=240)
])
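Note that the question's point() call sets pixels below the threshold to 255, while the transform above marks pixels above it. Below is a sketch of a variant with an invert flag (the flag is an addition for illustration, not part of the answer above), applied to a tiny hand-made tensor in the [0, 1] range that ToTensor produces:

```python
import torch


class ThresholdTransform:
    """Binarize a [0, 1] tensor against a threshold given on the 0..255 scale."""

    def __init__(self, thr_255, invert=False):
        self.thr = thr_255 / 255.0  # convert the gray level to the [0, 1] range
        self.invert = invert        # hypothetical flag: mark dark pixels instead

    def __call__(self, x):
        mask = (x < self.thr) if self.invert else (x > self.thr)
        return mask.to(x.dtype)  # keep the input dtype


# One-channel "image" with four pixels; 240 / 255 is roughly 0.941.
gray = torch.tensor([[[0.0, 0.5, 0.95, 1.0]]])
bright = ThresholdTransform(240)(gray)
dark = ThresholdTransform(240, invert=True)(gray)
print(bright, dark)
```

The result holds 0.0 and 1.0 rather than 0 and 255; multiply by 255 if the original scale is needed.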
I need to use a BatchSampler within a PyTorch DataLoader instead of calling the dataset's __getitem__ multiple times (the dataset is remote, and each query is pricey). I cannot work out how to use the batch sampler with a given dataset.
e.g.
class MyDataset(Dataset):
    def __init__(self, remote_ddf, ):
        self.ddf = remote_ddf
    def __len__(self):
        return len(self.ddf)
    def __getitem__(self, idx):
        return self.ddf[idx]  # <-- this is as expensive as a batch call
    def get_batch(self, batch_idx):
        return self.ddf[batch_idx]
my_loader = DataLoader(MyDataset(remote_ddf),
                       batch_sampler=BatchSampler(Sampler(), batch_size=3))
What I do not understand, and could not find an example of online or in the torch docs, is how to use my get_batch function instead of the __getitem__ function.
Edit:
Following the answer of Szymon Maszke, this is what I tried, and yet __getitem__ gets one index per call instead of a list of size batch_size:
class Dataset(Dataset):
    def __init__(self):
        ...
    def __len__(self):
        ...
    def __getitem__(self, batch_idx):  # <-- here I get only one index
        return self.wiki_df.loc[batch_idx]
loader = DataLoader(
    dataset=dataset,
    batch_sampler=BatchSampler(
        SequentialSampler(dataset), batch_size=self.hparams.batch_size, drop_last=False),
    num_workers=self.hparams.num_data_workers,
)
You can't use get_batch instead of __getitem__, and I don't see the point of doing it that way.
torch.utils.data.BatchSampler takes indices from your Sampler() instance (in this case 3 of them) and returns them as a list, so they can be used in your MyDataset __getitem__ method (check the source code if you need to; most of the samplers and data-related utilities are easy to follow).
I assume your self.ddf supports indexing with a list (e.g. self.ddf[[25, 44, 115]] returns the values correctly and uses only one expensive call). In this case, simply turn get_batch into __getitem__ and you are good to go.
class MyDataset(Dataset):
    def __init__(self, remote_ddf, ):
        self.ddf = remote_ddf
    def __len__(self):
        return len(self.ddf)
    def __getitem__(self, batch_idx):
        return self.ddf[batch_idx]  # batch_idx is a list
EDIT: You have to pass the BatchSampler via the sampler argument (not batch_sampler); otherwise each batch is split back into single indices. This should be fine:
loader = DataLoader(
    dataset=dataset,
    # This line below!
    sampler=BatchSampler(
        SequentialSampler(dataset), batch_size=self.hparams.batch_size, drop_last=False
    ),
    num_workers=self.hparams.num_data_workers,
)
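One caveat when passing the BatchSampler as sampler: DataLoader's default batch_size=1 still wraps every returned batch in an extra leading dimension. Passing batch_size=None disables automatic batching, so each list of indices goes to __getitem__ once and the result comes back as-is. Below is a small sketch, assuming an in-memory tensor as a stand-in for the remote store (BatchedDataset and its calls log are hypothetical names):

```python
import torch
from torch.utils.data import BatchSampler, DataLoader, Dataset, SequentialSampler


class BatchedDataset(Dataset):
    """Dataset whose __getitem__ expects a list of indices (one call per batch)."""

    def __init__(self, values):
        self.values = torch.as_tensor(values)
        self.calls = []  # log of index lists, to show one call per batch

    def __len__(self):
        return len(self.values)

    def __getitem__(self, batch_idx):
        self.calls.append(list(batch_idx))
        return self.values[batch_idx]  # a single (expensive) lookup per batch


dataset = BatchedDataset(list(range(10)))
loader = DataLoader(
    dataset,
    sampler=BatchSampler(SequentialSampler(dataset), batch_size=4, drop_last=False),
    batch_size=None,  # disable automatic batching; the sampler already yields batches
)
batches = [batch.tolist() for batch in loader]
print(batches)
print(dataset.calls)
```

With the default num_workers=0 the calls log lives in the main process; with worker processes each worker would log into its own copy of the dataset.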
Hi, I have a function called
tfnet.return_predict()
which, when run on an image, outputs a set of values such as the object class, the confidence, and the coordinates of the bounding box. What I want to do is make a wrapper which returns only the confidence value.
My code is as follows. I am using Darkflow to predict classes on images.
#Initialise Libraries
# Load the YOLO Neural Network
tfnet = TFNet(options) #call the YOLO network
image = cv2.imread('C:/darkflow/Car.jpg', cv2.IMREAD_COLOR) #Load image
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
print(tfnet.return_predict(image)) #function to run predictions
The output of print is
[{'label': 'Car', 'confidence': 0.32647023, 'topleft': {'x': 98, 'y': 249}, 'bottomright': {'x': 311, 'y': 455}}]
So from this I want to create a wrapper which returns just the 'confidence' value.
I know how to create wrappers and define functions for them, but how do I do it for an already defined function?
Any suggestion would be a great help to me.
EDIT: I tried:
def log_calls(tfnet.return_predict):
    def wrapper(*args, **kwargs):
        #name = func.__name__
        print('before {name} was called')
        r = func(*args, **kwargs)
        print('after {name} was called')
        return r
    return wrapper
But 'tfnet.return_predict' raises an error:
SyntaxError: invalid syntax
Do you need to redefine the tfnet.return_predict function to only return confidence? Or is having a separate function okay? If it's the latter, then it seems like you can just do this:
def conf_only(*args, **kwargs):
    out = tfnet.return_predict(*args, **kwargs)
    return out[0]["confidence"]
and calling conf_only returns just that part of the dict.
If you need to have tfnet.return_predict redefined and want that to only return confidence, then you can make a decorator:
def conf_deco(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)[0]["confidence"]
    return wrapper
For example, pretending dummy_function is already predefined
def dummy_function(*args, **kwargs):
    print(args, kwargs)
    return [{"confidence": .32, "other": "asdf"}]
In [4]: dummy_function("something", kw='else')
('something',) {'kw': 'else'}
Out[4]: [{'confidence': 0.32, 'other': 'asdf'}]
Now redefine it with:
In [6]: dummy_function = conf_deco(dummy_function)
and it'll only return the confidence value
In [7]: dummy_function("something", kw='else')
('something',) {'kw': 'else'}
Out[7]: 0.32
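The SyntaxError in the question's EDIT comes from def log_calls(tfnet.return_predict): a parameter name must be a plain identifier such as func, and the function to wrap is passed in when the decorator is called. Building on the decorator above, here is a sketch that keeps the wrapped function's name via functools.wraps and returns the confidence of every detection, so an empty result gives an empty list instead of an IndexError. fake_return_predict is a hypothetical stand-in for tfnet.return_predict, and the second detection is invented for illustration.

```python
import functools


def confidences_only(func):
    """Wrap a predictor so it returns only the confidence of each detection."""

    @functools.wraps(func)  # keep the wrapped function's __name__ and docstring
    def wrapper(*args, **kwargs):
        return [detection["confidence"] for detection in func(*args, **kwargs)]

    return wrapper


def fake_return_predict(image):
    """Hypothetical stand-in for tfnet.return_predict; ignores its input."""
    if image is None:
        return []  # nothing detected
    return [
        {"label": "Car", "confidence": 0.33, "topleft": {"x": 98, "y": 249}},
        {"label": "Person", "confidence": 0.8, "topleft": {"x": 10, "y": 20}},
    ]


predict_conf = confidences_only(fake_return_predict)
result = predict_conf("some image")
print(result, predict_conf.__name__)
```

With the real network this would presumably be used as tfnet.return_predict = confidences_only(tfnet.return_predict), assuming the attribute can be reassigned on the instance.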