Order of rotated images by using a custom generator - python-3.x

I use a custom image data generator for my project. It receives batches of images and returns versions rotated by 0, 90, 180 and 270 degrees, labeled with the corresponding class indices {0: 0, 1: 90, 2: 180, 3: 270}. Let's assume we have images A, B and C in a batch and images A to Z in the whole data set. All the images are originally in the 0 degree orientation. Initially I returned all the rotated images at the same time. Here is a sample of a returned batch: [A0,B0,C0,A1,B1,C1,...,A3,B3,C3]. But this gave me useless results. For comparison, I trained the same model using my generator and using the built-in Keras ImageDataGenerator with flow_from_directory. For the built-in function I manually rotated the original images and stored them in separate folders. Here are the accuracy plots for comparison:
I used only a few images just to see if there is any difference. From the plots it is obvious that the custom generator is not correct. Hence I think it must return the images as [[A0,B0,C0],[D0,E0,F0]...[...,Z0]], then [[A1,B1,C1],[D1,E1,F1]...[...,Z1]] and so on. To do this I must call the following function multiple times (in my case 4).
def next(self):
    with self.lock:
        # get input data index and size of the current batch
        index_array = next(self.index_generator)
    # create array to hold the images
    return self._get_batches_of_transformed_samples(index_array)
This function iterates through the directory and returns batches of images. When it reaches the last image it finishes and the next epoch starts. In my case, I want to run this 4 times in one epoch, passing the rotation angle as an argument like this: self._get_batches_of_transformed_samples(index_array, rotation_angle). Is this possible? If not, what could be the solution? Here is the current data generator code:
def _get_batches_of_transformed_samples(self, index_array):
    # create list to hold the images and labels
    batch_x = []
    batch_y = []
    # create angle categories corresponding to number of rotation angles
    angle_categories = list(range(0, len(self.target_angles)))
    # generate rotated images and corresponding labels
    for rotation_angle, angle_indice in zip(self.target_angles, angle_categories):
        for i, j in enumerate(index_array):
            if self.filenames is None:
                image = self.images[j]
                if len(image.shape) == 2: image = cv2.cvtColor(image,cv2.COLOR_GRAY2RGB)
            else:
                is_color = int(self.color_mode == 'rgb')
                image = cv2.imread(self.filenames[j], is_color)
                if is_color:
                    if image is not None:
                        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            # do nothing if the image is none
            if image is not None:
                rotated_im = rotate(image, rotation_angle, self.target_size[:2])
                if self.preprocess_func: rotated_im = self.preprocess_func(rotated_im)
                # add dimension to account for the channels if the image is greyscale
                if rotated_im.ndim == 2: rotated_im = np.expand_dims(rotated_im, axis=2)
                batch_x.append(rotated_im)
                batch_y.append(angle_indice)
    # convert lists to numpy arrays
    batch_x = np.asarray(batch_x)
    batch_y = np.asarray(batch_y)
    batch_y = to_categorical(batch_y, len(self.target_angles))
    return batch_x, batch_y

def next(self):
    with self.lock:
        # get input data index and size of the current batch
        index_array = next(self.index_generator)
    # create array to hold the images
    return self._get_batches_of_transformed_samples(index_array)
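
One possible way to get the ordering described in the question (all batches at rotation 0, then all batches at rotation 1, and so on) is to let the index generator yield the rotation together with the batch indices. The following is only a minimal sketch in plain Python; the name rotation_index_generator and its parameters are hypothetical and not part of Keras. With such a generator, _get_batches_of_transformed_samples would take the rotation as a second argument and drop its outer loop over self.target_angles.

```python
import random

def rotation_index_generator(n_samples, batch_size, n_rotations, shuffle=True, seed=None):
    """Yield (index_batch, rotation_index) pairs.

    Hypothetical helper: within one epoch the whole data set is traversed
    once per rotation, so a batch never mixes different rotations.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    while True:  # one iteration of this loop corresponds to one epoch
        for rotation_index in range(n_rotations):
            if shuffle:
                rng.shuffle(indices)
            for start in range(0, n_samples, batch_size):
                yield indices[start:start + batch_size], rotation_index

# 7 images, batches of 3, 4 rotations: 3 batches per rotation, 12 steps per epoch
gen = rotation_index_generator(7, 3, 4, shuffle=False)
first_epoch = [next(gen) for _ in range(12)]
print(first_epoch[0])  # ([0, 1, 2], 0)
print(first_epoch[3])  # ([0, 1, 2], 1)
```

Under this scheme the number of steps per epoch becomes the number of rotations times the number of batches, which has to be reflected in steps_per_epoch.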

Hmm, I would probably do this through keras.utils.Sequence:
import cv2
import numpy as np
from keras.utils import Sequence

class RotationSequence(Sequence):
    def __init__(self, x_set, y_set, batch_size, rotations=(0, 90, 180, 270)):
        self.rotations = rotations
        # store numpy arrays so on_epoch_end can index them with a permutation
        self.x, self.y = np.asarray(x_set), np.asarray(y_set)
        self.batch_size = batch_size

    def __len__(self):
        return int(np.ceil(len(self.x) / float(self.batch_size)))

    def __getitem__(self, idx):
        batch_x = self.x[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.y[idx * self.batch_size:(idx + 1) * self.batch_size]
        x, y = [], []
        for rot in self.rotations:
            # rotate() is not defined here; you have to supply it
            x += [rotate(cv2.imread(file_name), rot) for file_name in batch_x]
            y += list(batch_y)
        return np.array(x), np.array(y)

    def on_epoch_end(self):
        shuffle_idx = np.random.permutation(len(self.x))
        self.x, self.y = self.x[shuffle_idx], self.y[shuffle_idx]
And then just pass the batcher to model.fit_generator() (recent Keras versions accept a Sequence directly in model.fit()):
rotation_batcher = RotationSequence(...)
model.fit_generator(rotation_batcher,
                    steps_per_epoch=len(rotation_batcher),
                    validation_data=validation_batcher,
                    epochs=epochs)
This allows you to have more control over the batches being fed into your model. This implementation will almost run. You just need to implement the rotate() function used in __getitem__. Also, each returned batch will contain 4 times batch_size samples, because every batch is duplicated once per rotation. Hope this is helpful to you.
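
The answer leaves rotate() undefined. For angles that are multiples of 90 degrees, a minimal version could be built on np.rot90, as sketched below. This is an assumption about what is needed, not the answerer's code: it works on in-memory arrays instead of file names, it assumes square images (for non-square images a 90 or 270 degree rotation changes the shape, so a resize would be required), and the helper rotated_batch is hypothetical. It also labels each sample with the index of its rotation, as the question's generator does, rather than repeating the original labels.

```python
import numpy as np

def rotate(image, angle):
    """Rotate an H x W (x C) array by a multiple of 90 degrees.

    np.rot90 rotates counter-clockwise as NumPy defines it.
    """
    if angle % 90 != 0:
        raise ValueError("this sketch only handles multiples of 90 degrees")
    return np.rot90(image, k=(angle // 90) % 4)

def rotated_batch(images, rotations=(0, 90, 180, 270)):
    """Return every image at every rotation, with the rotation index as label."""
    x, y = [], []
    for class_index, angle in enumerate(rotations):
        x += [rotate(img, angle) for img in images]
        y += [class_index] * len(images)
    return np.array(x), np.array(y)

# two tiny square "images" with 3 channels
images = [np.arange(12).reshape(2, 2, 3), np.ones((2, 2, 3), dtype=int)]
batch_x, batch_y = rotated_batch(images)
print(batch_x.shape)  # (8, 2, 2, 3)
print(batch_y)        # [0 0 1 1 2 2 3 3]
```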

Related

Pytorch freezes when checking dataloader

I am running this block of code in PyTorch and it seems to run forever/freeze in my notebook. I suspect it has something to do with my dataloader, but I can't figure out what is wrong. I am running this on a GPU environment; I previously ran the CNN model with TensorFlow v2 Keras and it worked.
In addition, I have also tried model.train() and it was also stuck at the first epoch.
Code I am running
import time
start_time = time.time()
for data, label in train_dataloader:
    print(data.size())
    print(label.size())
    break
print("Time taken: ", time.time() - start_time)
The dataloader is implemented with these lines of code
train_dataset = ChestXrayDataset("dataset/CheXpert-v1.0-small/train/train", train_data, IMAGE_SIZE, True)
train_dataloader = DataLoader(dataset=train_dataset, batch_size=BATCH_SIZE, shuffle=True, num_workers=2, pin_memory=True)
These are the parameters
IMAGE_SIZE = 224 # Image size (224x224)
IMAGENET_MEAN = [0.485, 0.456, 0.406] # Mean of ImageNet dataset (used for normalization)
IMAGENET_STD = [0.229, 0.224, 0.225] # Std of ImageNet dataset (used for normalization)
BATCH_SIZE = 96
LEARNING_RATE = 0.001
LEARNING_RATE_SCHEDULE_FACTOR = 0.1 # Parameter used for reducing learning rate
LEARNING_RATE_SCHEDULE_PATIENCE = 5 # Parameter used for reducing learning rate
MAX_EPOCHS = 100 # Maximum number of training epochs
I have checked the dataloader and this is what I got
<torch.utils.data.dataloader.DataLoader at 0x1f96cd5f6a0>
The class for ChestXrayDataset is shown here
class ChestXrayDataset(Dataset):
    def __init__(self, folder_dir, dataframe, image_size, normalization):
        """
        Init Dataset
        Parameters
        ----------
        folder_dir: str
            folder contains all images
        dataframe: pandas.DataFrame
            dataframe contains all information of images
        image_size: int
            image size to rescale
        normalization: bool
            whether applying normalization with mean and std from ImageNet or not
        """
        self.image_paths = [] # List of image paths
        self.image_labels = [] # List of image labels
        # Define list of image transformations
        image_transformation = [
            transforms.Resize((image_size, image_size)),
            transforms.ToTensor()
        ]
        if normalization:
            # Normalization with mean and std from ImageNet
            image_transformation.append(transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD))
        self.image_transformation = transforms.Compose(image_transformation)
        # Get all image paths and image labels from dataframe
        for index, row in dataframe.iterrows():
            image_path = os.path.join(folder_dir, row.Path)
            self.image_paths.append(image_path)
            if len(row) < 14:
                labels = [0] * 14
            else:
                labels = []
                for col in row[5:]:
                    if col == 1:
                        labels.append(1)
                    else:
                        labels.append(0)
            self.image_labels.append(labels)

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, index):
        """
        Read image at index and convert to torch Tensor
        """
        # Read image
        image_path = self.image_paths[index]
        image_data = Image.open(image_path).convert("RGB") # Convert image to RGB channels
        # TODO: Image augmentation code would be placed here
        # Resize and convert image to torch tensor
        image_data = self.image_transformation(image_data)
        return image_data, torch.FloatTensor(self.image_labels[index])
Checking the length of dataframe.iterrows() and row[5:] would help.
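
The check suggested above can be sketched without PyTorch. The snippet mirrors the label logic of the question on plain Python lists and reports rows whose label vector does not have the expected 14 entries; such rows could not be stacked into one batch tensor. The helper names and the example column values are hypothetical, and this is only a diagnostic sketch: it does not by itself explain the freeze.

```python
def build_labels(row_values, n_labels=14, first_label_col=5):
    """Mirror the question's label logic for one row given as a plain list."""
    if len(row_values) < n_labels:
        return [0] * n_labels
    return [1 if value == 1 else 0 for value in row_values[first_label_col:]]

def rows_with_bad_label_length(rows, n_labels=14):
    """Return the indices of rows whose label vector is not n_labels long."""
    return [i for i, row in enumerate(rows)
            if len(build_labels(row, n_labels)) != n_labels]

# made-up rows: 5 metadata columns followed by label columns
good_row = ["path", "Male", 60, "Frontal", "AP"] + [1, 0, -1, None] + [0] * 10  # 5 + 14 columns
short_row = ["path", "Male", 60, "Frontal", "AP"] + [1] * 10                    # 5 + 10 columns
print(rows_with_bad_label_length([good_row, short_row]))  # [1]
```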

Map function within each batch when parsing tensorflow records

Basically, this code lets me randomly apply image augmentation to my training samples stored in tfrecords. The following code treats every image in a batch (32 pics) the same way: either flip/rotation, cutout or nothing. But I would like to apply image_aug within each batch, so that each batch contains 25%, 25% and 50% of the above-mentioned transformed images.
Here are my image augmentation and parse functions for tfrecords:
def decode_image(image_data, shape):
    image = tf.io.decode_png(image_data, channels=shape[-1])
    image = tf.cast(image, tf.float32) / 255.0 # convert image to floats in [0, 1] range
    image = tf.reshape(image, shape) # explicit size needed for TPU
    return image

def image_aug(image):
    random_num = np.random.rand()
    if random_num < 0.25:
        data_augmentation = keras.Sequential([
            keras.layers.RandomFlip("horizontal_and_vertical"),
            keras.layers.RandomRotation(0.2),
        ])
        image = data_augmentation(tf.expand_dims(image, axis=0))
    elif random_num < 0.5:
        image = tfa.image.random_cutout(
            # image,
            tf.expand_dims(image, axis=0),
            mask_size = (100, 100),
            constant_values = 1
        )
    return tf.squeeze(image)

def parse_example(serialized, shape, data_aug=False):
    features = {'image': tf.io.FixedLenFeature([], tf.string),
                'label': tf.io.FixedLenFeature([], tf.int64)
                }
    # Parse the serialized data so we get a dict with our data.
    parsed_example = tf.io.parse_single_example(serialized=serialized, features=features)
    image_raw = parsed_example['image'] # Get the image as raw bytes.
    image = decode_image(image_raw, shape) # Decode the raw bytes so it becomes a tensor with type.
    # label = tf.io.decode_raw(parsed_example['label'], tf.uint8)
    # label = tf.cast(parsed_example['label'], tf.int64)
    if data_aug:
        # image = image.numpy()
        # for i in range(len(image)):
        image = image_aug(image)
    return image, tf.cast(parsed_example['label'], tf.float32)
The call of tf.dataset looks like this:
train_dataset = tf.data.TFRecordDataset(np.asarray(train_val)[tr_idx])
train_dataset = train_dataset.map(partial(parse_example, data_aug = True, shape=IMAGE_SIZE)).cache().shuffle(2048).prefetch(AUTOTUNE).batch(BATCH_SIZE).repeat(NUM_EPOCHS)
Any ideas will be appreciated!
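
To make the goal concrete, here is a minimal NumPy sketch of a per-sample assignment that gives every batch exactly 25% flipped images, 25% cutout images and 50% untouched images. It is an illustration of the idea only, not a tf.data implementation: the helper names are hypothetical, the flip stands in for the flip/rotation branch, and the cutout simply writes a constant square. As far as I can tell, a likely reason the original code behaves uniformly is that np.random.rand() inside a function passed to Dataset.map is evaluated once when the function is traced, so inside tf.data the draw would have to come from TensorFlow random ops instead.

```python
import numpy as np

def assign_augmentations(batch_size, rng):
    """Return one tag per sample with an exact 25% / 25% / 50% split."""
    n_quarter = batch_size // 4
    kinds = np.array(["flip"] * n_quarter
                     + ["cutout"] * n_quarter
                     + ["none"] * (batch_size - 2 * n_quarter))
    rng.shuffle(kinds)  # random position of each tag within the batch
    return kinds

def augment_batch(images, rng, mask_size=2):
    """Apply the tag of each sample to that sample only."""
    kinds = assign_augmentations(len(images), rng)
    out = images.copy()
    height, width = images.shape[1:3]
    for i, kind in enumerate(kinds):
        if kind == "flip":
            out[i] = images[i, ::-1, ::-1]  # flip vertically and horizontally
        elif kind == "cutout":
            top = rng.integers(0, height - mask_size + 1)
            left = rng.integers(0, width - mask_size + 1)
            out[i, top:top + mask_size, left:left + mask_size] = 1.0
    return out, kinds

rng = np.random.default_rng(0)
batch = np.zeros((32, 8, 8, 3), dtype=np.float32)
augmented, kinds = augment_batch(batch, rng)
print(augmented.shape)             # (32, 8, 8, 3)
print(int((kinds == "none").sum()))  # 16
```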

Create custom datagenerator in Keras using my own dataset

I want to create my own custom DataGenerator on my own dataset. I have read all the images and stored the locations and their labels in two variables named images and labels. I have written this custom generator:
def data_gen(img_folder, y, batch_size):
    c = 0
    n_image = list(np.arange(0,len(img_folder),1)) #List of training images
    random.shuffle(n_image)
    while (True):
        img = np.zeros((batch_size, 224, 224, 3)).astype('float') #Create zero arrays to store the batches of training images
        label = np.zeros((batch_size)).astype('float') #Create zero arrays to store the batches of label images
        for i in range(c, c+batch_size): #initially from 0 to 16, c = 0.
            train_img = imread(img_folder[n_image[i]])
            # row,col= train_img.shape
            train_img = cv2.resize(train_img, (224,224), interpolation = cv2.INTER_LANCZOS4)
            train_img = train_img.reshape(224, 224, 3)
            # binary_img = binary_img[:,:128//2]
            img[i-c] = train_img #add to array - img[0], img[1], and so on.
            label[i-c] = y[n_image[i]]
        c+=batch_size
        if(c+batch_size>=len((img_folder))):
            c=0
            random.shuffle(n_image)
            # print "randomizing again"
        yield img, label
What I want to know is how can I add other augmentations like flip, crop, rotate to this generator? Moreover, how should I yield these augmentations so that they are linked with the correct label.
Please let me know.
You can apply flip, crop and rotate to train_img before putting it into img. That is:
# ....
while True:
    # ....
        # add your data augmentation function here
        train_img = data_augmentor(train_img)
        img[i-c] = train_img
    # ....
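
The answer does not define data_augmentor. A minimal sketch using only NumPy could look like the following; the function body, the rng and pad parameters and the choice of operations are assumptions for illustration. Because the augmentation happens on train_img in the same loop iteration in which label[i-c] is set, and none of these operations changes the class, image and label stay linked. The output keeps the input shape, so it fits the preallocated (batch_size, 224, 224, 3) array.

```python
import numpy as np

def data_augmentor(img, rng, pad=8):
    """Hypothetical augmentor: random flip, 90-degree rotation, shifted crop.

    Assumes a square H x W x C array and returns an array of the same shape.
    """
    if rng.random() < 0.5:
        img = np.fliplr(img)                   # horizontal flip
    img = np.rot90(img, k=rng.integers(0, 4))  # rotate by 0, 90, 180 or 270 degrees
    # pad by reflection, then crop a window of the original size at a random offset
    height, width = img.shape[:2]
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    top = rng.integers(0, 2 * pad + 1)
    left = rng.integers(0, 2 * pad + 1)
    return padded[top:top + height, left:left + width]

rng = np.random.default_rng(0)
sample = np.arange(224 * 224 * 3, dtype=float).reshape(224, 224, 3)
augmented = data_augmentor(sample, rng)
print(augmented.shape)  # (224, 224, 3)
```

In the generator this would be called as train_img = data_augmentor(train_img, rng) with one rng created before the while loop.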

Keras : using generators to output trainingset batches and targets but also auxiliary data not used for training

I need to use generators (because of too large datasets) to yield training data and targets to a CNN for training. However, each data sample is normalized (/maxVal) and I need to un-normalize/de-normalize it just before the loss function. I don't know how to output this auxiliary data at the same time as a batch of (X,Y) from the generator?
It is something very similar to https://towardsdatascience.com/keras-data-generators-and-how-to-use-them-b69129ed779c :
import numpy as np
import cv2
from tensorflow.keras.utils import Sequence

class DataGenerator(Sequence):
    """Generates data for Keras
    Sequence based data generator. Suitable for building data generator for training and prediction.
    """
    def __init__(self, list_IDs, labels, image_path, mask_path,
                 to_fit=True, batch_size=32, dim=(256, 256),
                 n_channels=1, n_classes=10, shuffle=True):
        """Initialization
        :param list_IDs: list of all 'label' ids to use in the generator
        :param labels: list of image labels (file names)
        :param image_path: path to images location
        :param mask_path: path to masks location
        :param to_fit: True to return X and y, False to return X only
        :param batch_size: batch size at each iteration
        :param dim: tuple indicating image dimension
        :param n_channels: number of image channels
        :param n_classes: number of output masks
        :param shuffle: True to shuffle label indexes after every epoch
        """
        self.list_IDs = list_IDs
        self.labels = labels
        self.image_path = image_path
        self.mask_path = mask_path
        self.to_fit = to_fit
        self.batch_size = batch_size
        self.dim = dim
        self.n_channels = n_channels
        self.n_classes = n_classes
        self.shuffle = shuffle
        self.on_epoch_end()

    def __len__(self):
        """Denotes the number of batches per epoch
        :return: number of batches per epoch
        """
        return int(np.floor(len(self.list_IDs) / self.batch_size))

    def __getitem__(self, index):
        """Generate one batch of data
        :param index: index of the batch
        :return: X and y when fitting. X only when predicting
        """
        # Generate indexes of the batch
        indexes = self.indexes[index * self.batch_size:(index + 1) * self.batch_size]
        # Find list of IDs
        list_IDs_temp = [self.list_IDs[k] for k in indexes]
        # Generate data
        X = self._generate_X(list_IDs_temp)
        if self.to_fit:
            y = self._generate_y(list_IDs_temp)
            return X/np.max(X), y/np.max(y)
        else:
            return X

    def on_epoch_end(self):
        """Updates indexes after each epoch
        """
        self.indexes = np.arange(len(self.list_IDs))
        if self.shuffle == True:
            np.random.shuffle(self.indexes)

    def _generate_X(self, list_IDs_temp):
        """Generates data containing batch_size images
        :param list_IDs_temp: list of label ids to load
        :return: batch of images
        """
        # Initialization
        X = np.empty((self.batch_size, *self.dim, self.n_channels))
        # Generate data
        for i, ID in enumerate(list_IDs_temp):
            # Store sample
            X[i,] = self._load_grayscale_image(self.image_path + self.labels[ID])
        return X

    def _generate_y(self, list_IDs_temp):
        """Generates data containing batch_size masks
        :param list_IDs_temp: list of label ids to load
        :return: batch if masks
        """
        y = np.empty((self.batch_size, *self.dim), dtype=int)
        # Generate data
        for i, ID in enumerate(list_IDs_temp):
            # Store sample
            y[i,] = self._load_grayscale_image(self.mask_path + self.labels[ID])
        return y

    def _load_grayscale_image(self, image_path):
        """Load grayscale image
        :param image_path: path to image to load
        :return: loaded image
        """
        img = cv2.imread(image_path)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        img = img / 255
        return img
So, if I have understood your need correctly, this is what you need to do:
1. Fit a MinMaxScaler on your whole target (y) dataset (if possible)
2. For each batch:
   - Scale your batch's targets
   - Yield your batch's targets
3. Create a custom loss function that takes your scaler as an argument
4. Call your scaler's inverse_transform on your y_true and y_pred in your custom loss
5. Call your favorite loss function on your de-normalized y_true and y_pred and return its value
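
The steps above can be sketched in plain NumPy. The MinMax class below is a hypothetical stand-in for a fitted min-max scaler, and make_denormalized_mse is a hypothetical factory for the custom loss; inside a real Keras loss the same arithmetic would presumably have to be expressed with tensor operations on the stored minimum and maximum, since the loss receives tensors rather than NumPy arrays.

```python
import numpy as np

class MinMax:
    """Hypothetical stand-in for a min-max scaler fitted on all targets."""
    def __init__(self, y):
        self.lo, self.hi = float(np.min(y)), float(np.max(y))

    def transform(self, y):
        return (y - self.lo) / (self.hi - self.lo)

    def inverse_transform(self, y_scaled):
        return y_scaled * (self.hi - self.lo) + self.lo

def make_denormalized_mse(scaler):
    """Build a loss that compares targets and predictions in original units."""
    def loss(y_true_scaled, y_pred_scaled):
        y_true = scaler.inverse_transform(y_true_scaled)
        y_pred = scaler.inverse_transform(y_pred_scaled)
        return float(np.mean((y_true - y_pred) ** 2))
    return loss

y_all = np.array([10.0, 20.0, 30.0, 50.0])  # step 1: fit on the whole target set
scaler = MinMax(y_all)
loss_fn = make_denormalized_mse(scaler)     # step 3: loss that knows the scaler
y_batch = scaler.transform(y_all[:2])       # step 2: scaled targets of one batch
y_pred = y_batch + 0.1                      # pretend predictions, off by 0.1 in scaled units
print(loss_fn(y_batch, y_pred))             # close to 16, since 0.1 * (50 - 10) = 4
```

Because the scaler is fitted once on the whole target set, no per-batch auxiliary value has to be returned by the generator.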

Correct way of doing data augmentation in TensorFlow with the dataset api?

So, I've been playing around with the TensorFlow dataset API for loading images and segmentation masks (for a semantic segmentation project). I would like to be able to generate batches of images and masks, with each image having randomly gone through any combination of pre-processing functions like brightness changes, contrast changes, cropping, saturation changes etc. So, the first image in my batch may have no pre-processing, the second may have saturation changes, the third may have brightness and saturation changes, and so on.
I tried the following:
import tensorflow as tf
from tensorflow.contrib.data import Dataset, Iterator
import random

def _resize_image(image, mask):
    image = tf.image.resize_bicubic(image, [480, 640], True)
    mask = tf.image.resize_bicubic(mask, [480, 640], True)
    return image, mask

def _corrupt_contrast(image, mask):
    image = tf.image.random_contrast(image, 0, 5)
    return image, mask

def _corrupt_saturation(image, mask):
    image = tf.image.random_saturation(image, 0, 5)
    return image, mask

def _corrupt_brightness(image, mask):
    image = tf.image.random_brightness(image, 5)
    return image, mask

def _random_crop(image, mask):
    seed = random.random()
    image = tf.random_crop(image, [240, 320, 3], seed=seed)
    mask = tf.random_crop(mask, [240, 320, 1], seed=seed)
    return image, mask

def _flip_image_horizontally(image, mask):
    seed = random.random()
    image = tf.image.random_flip_left_right(image, seed=seed)
    mask = tf.image.random_flip_left_right(mask, seed=seed)
    return image, mask

def _flip_image_vertically(image, mask):
    seed = random.random()
    image = tf.image.random_flip_up_down(image, seed=seed)
    mask = tf.image.random_flip_up_down(mask, seed=seed)
    return image, mask

def _normalize_data(image, mask):
    image = tf.cast(image, tf.float32)
    image = image / 255.0
    mask = tf.cast(mask, tf.float32)
    mask = mask / 255.0
    return image, mask

def _parse_data(image_paths, mask_paths):
    image_content = tf.read_file(image_paths)
    mask_content = tf.read_file(mask_paths)
    images = tf.image.decode_png(image_content, channels=3)
    masks = tf.image.decode_png(mask_content, channels=1)
    return images, masks

def data_batch(image_paths, mask_paths, params, batch_size=4, num_threads=2):
    # Convert lists of paths to tensors for tensorflow
    images_name_tensor = tf.constant(image_paths)
    mask_name_tensor = tf.constant(mask_paths)
    # Create dataset out of the 2 files:
    data = Dataset.from_tensor_slices(
        (images_name_tensor, mask_name_tensor))
    # Parse images and labels
    data = data.map(
        _parse_data, num_threads=num_threads, output_buffer_size=6 * batch_size)
    # Normalize images and masks for vals. between 0 and 1
    data = data.map(_normalize_data, num_threads=num_threads, output_buffer_size=6 * batch_size)
    if params['crop'] and not random.randint(0, 1):
        data = data.map(_random_crop, num_threads=num_threads,
                        output_buffer_size=6 * batch_size)
    if params['brightness'] and not random.randint(0, 1):
        data = data.map(_corrupt_brightness, num_threads=num_threads,
                        output_buffer_size=6 * batch_size)
    if params['contrast'] and not random.randint(0, 1):
        data = data.map(_corrupt_contrast, num_threads=num_threads,
                        output_buffer_size=6 * batch_size)
    if params['saturation'] and not random.randint(0, 1):
        data = data.map(_corrupt_saturation, num_threads=num_threads,
                        output_buffer_size=6 * batch_size)
    if params['flip_horizontally'] and not random.randint(0, 1):
        data = data.map(_flip_image_horizontally,
                        num_threads=num_threads, output_buffer_size=6 * batch_size)
    if params['flip_vertically'] and not random.randint(0, 1):
        data = data.map(_flip_image_vertically, num_threads=num_threads,
                        output_buffer_size=6 * batch_size)
    # Shuffle the data queue
    data = data.shuffle(len(image_paths))
    # Create a batch of data
    data = data.batch(batch_size)
    data = data.map(_resize_image, num_threads=num_threads,
                    output_buffer_size=6 * batch_size)
    # Create iterator
    iterator = Iterator.from_structure(data.output_types, data.output_shapes)
    # Next element Op
    next_element = iterator.get_next()
    # Data set init. op
    init_op = iterator.make_initializer(data)
    return next_element, init_op
But all batches returned by this have the same transformations applied to them, not different combinations. My guess is that the result of random.randint persists and is not actually re-evaluated for each batch. If so, how do I fix this to get the desired result?
An example of how I plan to use it (I feel that's irrelevant to the problem, but people might still want to know) can be found here
So the problem was indeed that the control flow in the if statements uses Python variables, and these are evaluated only once, when the graph is created. To do what I want, I had to define a placeholder that contains the boolean values deciding whether to apply each function (and feed in a new boolean tensor per iteration to change the augmentation), with the control flow handled by tf.cond. I pushed the new code to the GitHub link I posted in the question above if anyone is interested.
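
The difference between a decision taken once while the pipeline is built and a decision taken for every element can be illustrated without TensorFlow. The sketch below is an analogy in plain Python, not the code from the GitHub link: the two builder functions and the stand-in operations are hypothetical. The first builder mirrors the Python if statements around data.map, the second mirrors the role that tf.cond plays inside the graph.

```python
import random

def build_fixed_pipeline(ops, rng):
    # Coins are flipped once, while the pipeline is being built
    chosen = [op for op in ops if rng.random() < 0.5]
    def run(sample):
        for op in chosen:
            sample = op(sample)
        return sample
    return run

def build_per_element_pipeline(ops, rng):
    # Coins are flipped inside the pipeline, once per element and operation
    def run(sample):
        for op in ops:
            if rng.random() < 0.5:
                sample = op(sample)
        return sample
    return run

# Hypothetical stand-ins for the augmentation functions: each one records its name
ops = [lambda s, name=name: s + (name,) for name in ("crop", "brightness", "contrast")]

rng = random.Random(0)
fixed = build_fixed_pipeline(ops, rng)
per_element = build_per_element_pipeline(ops, rng)
fixed_results = {fixed(()) for _ in range(50)}
varied_results = {per_element(()) for _ in range(50)}
print(len(fixed_results))   # 1: every element gets the same combination
print(len(varied_results))  # normally several different combinations
```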

Resources