Are the input images resized to fixed width and height during the training in detectron2? - conv-neural-network

Are the input images resized to a fixed width and height during training in Detectron2? If yes, please explain why. Thanks!

Yes, they are. The reason is that large images cannot fit into memory; even with an 8 GB GPU it is not possible to train on high-resolution images.
Besides, there is a trade-off between image size and batch size: the larger the image, the smaller the batch size.
In Detectron2, you can change the minimum image size in the configuration like this:
from detectron2.config import get_cfg
cfg = get_cfg()
# minimum image size for the train set
cfg.INPUT.MIN_SIZE_TRAIN = (800,)
# maximum image size for the train set
cfg.INPUT.MAX_SIZE_TRAIN = 1333
# minimum image size for the test set
cfg.INPUT.MIN_SIZE_TEST = 800
# maximum image size for the test set
cfg.INPUT.MAX_SIZE_TEST = 1333
The size applies to the shorter edge of the image, and the other edge is scaled correspondingly (up to MAX_SIZE) to preserve the aspect ratio, so images are not distorted.
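As a rough sketch of that logic (my own illustration, not Detectron2's actual code), the resulting dimensions can be computed like this:
def resized_shape(h, w, min_size=800, max_size=1333):
    # scale so that the shorter edge becomes min_size
    scale = min_size / min(h, w)
    # but never let the longer edge exceed max_size
    if max(h, w) * scale > max_size:
        scale = max_size / max(h, w)
    return int(round(h * scale)), int(round(w * scale))

print(resized_shape(480, 640))   # -> (800, 1067): shorter edge scaled up to 800
print(resized_shape(480, 1920))  # -> (333, 1333): longer edge capped at 1333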
Besides, you can set multiple image sizes, so that during training one is selected randomly per image, like this:
cfg.INPUT.MIN_SIZE_TRAIN = (600, 900, 900)
cfg.INPUT.MIN_SIZE_TRAIN_SAMPLING = "choice"
For more information, see the Detectron2 configuration documentation.

Related

Splitfolders without shuffling data - How to split a dataset among train, test and validation folders on disk without shuffling the data (Python)?

I am working on a deep learning model that uses a large amount of time-series data. As the data is too big to be loaded into RAM at once, I will use Keras train_on_batch to train the model, reading data from disk.
I am looking for a simple and fast way to split the data among train, validation and test folders.
I've tried the splitfolders function, but could not deactivate the data shuffling (which is inappropriate for time-series data). The arguments in the function's documentation do not include an option to turn shuffling on/off.
Code I've tried:
import splitfolders
input_folder = r"E:\Doutorado\apagar"
splitfolders.ratio(input_folder, output=r'E:\Doutorado\apagardivididos',
                   ratio=(0.7, 0.2, 0.1), group_prefix=None)
The resulting split data is shuffled, and this shuffling is a problem for my time-series analysis...
source: https://pypi.org/project/split-folders/
splitfolders.ratio("input_folder", output="output",
seed=1337, ratio=(.8, .1, .1), group_prefix=None, move=False) # default values
Usage:
splitfolders [--output] [--ratio] [--fixed] [--seed] [--oversample] [--group_prefix] [--move] folder_with_images
Options:
--output path to the output folder. defaults to output. Get created if non-existent.
--ratio the ratio to split. e.g. for train/val/test .8 .1 .1 -- or for train/val .8 .2 --.
--fixed set the absolute number of items per validation/test set. The remaining items constitute
the training set. e.g. for train/val/test 100 100 or for train/val 100.
Set 3 values, e.g. 300 100 100, to limit the number of training values.
--seed set seed value for shuffling the items. defaults to 1337.
--oversample enable oversampling of imbalanced datasets, works only with --fixed.
--group_prefix split files into equally-sized groups based on their prefix
--move move the files instead of copying
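If splitfolders always shuffles (its options only expose a seed), one workaround is to split in sorted order yourself. Below is a minimal sketch under my own assumptions: one subfolder per class, filenames that sort chronologically, and split_in_order is a made-up helper, not part of split-folders.
import shutil
from pathlib import Path

def split_in_order(input_folder, output_folder, ratios=(0.7, 0.2, 0.1)):
    # Assumes input_folder contains one subfolder per class and that
    # sorting filenames preserves chronological order.
    input_folder, output_folder = Path(input_folder), Path(output_folder)
    for class_dir in sorted(p for p in input_folder.iterdir() if p.is_dir()):
        files = sorted(class_dir.iterdir())
        n_train = int(len(files) * ratios[0])
        n_val = int(len(files) * ratios[1])
        splits = {"train": files[:n_train],
                  "val": files[n_train:n_train + n_val],
                  "test": files[n_train + n_val:]}
        for name, split_files in splits.items():
            dest = output_folder / name / class_dir.name
            dest.mkdir(parents=True, exist_ok=True)
            for f in split_files:
                shutil.copy2(f, dest / f.name)

split_in_order(r"E:\Doutorado\apagar", r"E:\Doutorado\apagardivididos")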

Max pooling with certain size output

I'm resizing (downsampling) an image of unknown size to a certain size.
for example:
The first image size is 45X45
The second image size is 57x57
The output must have a specified, adjustable size, and each output value should be the maximum of its input region (max pooling).
I've tried skimage.measure.block_reduce, which works well, but I can't get the exact output size I need.
I've tried cv2.resize, but it doesn't have a max pooling option.
Maybe there is some function/library to do it?
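If PyTorch is an acceptable dependency (an assumption on my part), adaptive max pooling gives exactly this: a fixed output size computed by max pooling, regardless of the input size. A small sketch:
import numpy as np
import torch
import torch.nn.functional as F

def max_pool_to_size(image, out_h, out_w):
    # image: 2-D numpy array of arbitrary size (e.g. 45x45 or 57x57)
    t = torch.from_numpy(image.astype(np.float32))[None, None]   # shape 1x1xHxW
    pooled = F.adaptive_max_pool2d(t, (out_h, out_w))             # max over adaptive windows
    return pooled[0, 0].numpy()

print(max_pool_to_size(np.random.rand(45, 45), 16, 16).shape)  # (16, 16)
print(max_pool_to_size(np.random.rand(57, 57), 16, 16).shape)  # (16, 16)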

Flow huge amount of images from memory to Keras generator

I am trying to train a Keras model with a very large number of images and labels. I want to use model.fit_generator and somehow flow the input images and labels from memory, because we prepare all the data in memory after each image is loaded. The problem is that we have plenty of large images that we then clip into smaller pieces and feed to the model in that form. We need a for loop inside a while loop.
Something like this:
while True:
    for file in files:  # let's say there are 500 files (images)
        image = ReadImage(file)
        X = prepareImage(image)  # here the image is cut and prepared into the required shape
        Y = labels
        for batch_start in range(0, len(X), batch_size):
            batch_end = batch_start + batch_size
            yield X[batch_start:batch_end], Y[batch_start:batch_end]
After it yields the last batch for the first image, we need to load the next image in the for loop, prepare the data, and yield again within the same epoch. For the second epoch we need all the images again. The problem here is that we prepare everything in memory: from one image we create millions of training samples and then move on to the next image. We cannot write all the data to disk and use flow_from_directory, since it would require plenty of disk space.
Any hint?
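One possible hint, sketched below under my own assumptions (model, files, data_generator and total_batches_per_epoch are placeholders, and fit_generator is assumed to be available in the Keras version used): wrap the nested loops above in a generator function and pass it to fit_generator; the only extra piece Keras needs is steps_per_epoch, which must count every batch from every image so that one Keras epoch corresponds to one pass over all files.
def data_generator(files, batch_size):
    # body is exactly the while/for generator loop shown above
    ...

model.fit_generator(data_generator(files, batch_size=64),
                    steps_per_epoch=total_batches_per_epoch,  # batches over ALL images
                    epochs=20)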

When we implement YOLOv2 in darknet, after every 10 epochs, the image size is changed. How is this happening?

In YOLOv2, after every 10 epochs, the network randomly chooses a new size. How is this happening in darknet? I'm using Ubuntu 18.04.
I think the network size is changed every 10 iterations (not epochs).
In your cfg file, check the random flag.
random = 1 means YOLO changes the network size every 10 iterations; it is useful for increasing precision by training the network at different resolutions.
According to the YOLO paper:
However, since our model only uses convolutional and pooling layers it
can be resized on the fly. We want YOLOv2 to be robust to running on
images of different sizes so we train this into the model. Instead of
fixing the input image size we change the network every few
iterations. Every 10 batches our network randomly chooses a new image
dimension size. Since our model downsamples by a factor of 32, we pull
from the following multiples of 32: {320, 352, ..., 608}. Thus the
smallest option is 320 × 320 and the largest is 608 × 608. We resize
the network to that dimension and continue training.
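A rough Python sketch of that schedule (darknet itself does this in C; the function below is only an illustration):
import random

SIZES = list(range(320, 608 + 1, 32))   # multiples of 32: 320, 352, ..., 608

def network_size_for_batch(batch_index, current_size):
    # every 10 batches, draw a new square input resolution at random
    if batch_index % 10 == 0:
        return random.choice(SIZES)
    return current_size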

How to increase resolution of output Images using tensor flow object detection API?

I have trained my own model using TensorFlow (https://github.com/tensorflow/models/tree/master/research/object_detection) to identify objects in images. I am testing this model using the Google object detection API.
My question is that the IPython notebook as Google wrote it outputs images of only 200 kB to 300 kB; the notebook is at https://github.com/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb
How do I output images at their original size (which is 15 MB)? (I am running this code on my local machine.) I've tried changing the Helper Code section of the notebook but it didn't work. Anything that I am missing here?
def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)
In the detection part of the IPython notebook I changed the image size to
IMAGE_SIZE = (120, 80)
and it did the trick.
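For context, IMAGE_SIZE in that notebook is passed to matplotlib's figsize (in inches), so enlarging it only enlarges the rendered figure. An alternative sketch (my own, not part of the tutorial) is to save the annotated numpy array directly with PIL, which keeps the image's native pixel resolution:
from PIL import Image

def save_full_resolution(image_np, path):
    # image_np: the numpy array after boxes/labels have been drawn on it
    # (i.e. after the notebook's visualization step); saving it directly keeps
    # the original width and height instead of a resized matplotlib render
    Image.fromarray(image_np).save(path)

save_full_resolution(image_np, 'detections_full_res.png')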
