Faster R-CNN model - images with no labels in the training data - PyTorch

How can I modify the example in the "Object Detection FasterRCNN Tutorial" (https://www.kaggle.com/code/pdochannel/object-detection-fasterrcnn-tutorial) to include empty-labeled images in the training data? I want to be able to handle cases where no classes of interest appear in certain images.
Additionally, I have noticed that even when I include images of underwater scenes (where there are no objects of interest), the model still tends to detect something.
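One way to do this, sketched below, assumes a recent torchvision (0.6 or later), whose detection models accept "negative" samples: images whose target simply contains zero boxes. The helper name and the target fields beyond boxes/labels are illustrative, not taken from the linked tutorial.

```python
import torch

# For a background-only image, return a target with empty tensors of the
# expected shapes from your Dataset's __getitem__; torchvision's
# FasterRCNN then treats every proposal on that image as background.
def empty_target(image_id):
    return {
        "boxes": torch.zeros((0, 4), dtype=torch.float32),  # no boxes
        "labels": torch.zeros((0,), dtype=torch.int64),     # no labels
        "image_id": torch.tensor([image_id]),
        "area": torch.zeros((0,), dtype=torch.float32),
        "iscrowd": torch.zeros((0,), dtype=torch.int64),
    }
```

Training on such explicit negatives also addresses the second issue: a model that never sees object-free underwater scenes during training has little incentive to stay silent on them.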

Related

How should I classify images with different orientations?

I have been working on a dataset in which the goal is to determine which orientation each image shows. It is a classification problem in which for each record (for most of them) I have 4 images: front-facing, left-facing, right-facing, and back-facing product images.
I want to classify these images into the above 4 categories.
I have downloaded the images and put them in different folders according to their classes.
Methods I have applied:
So far I have applied two methods to classify these images.
1) I tried VGG16 directly to classify the images, but it did not give me even 50% accuracy.
2) I converted the images into edge images with a black background using Canny edge detection. I did this because the results contained images with similar-colored dresses, similar dress designs, etc.
On top of these edge images I again applied VGG16, ResNet50, and Inception models, but nothing seemed to work.
Can you suggest some ideas that might work in my case and classify the images in a better way?
First of all, your dataset has to be split into train and test sets, for instance 80% train and 20% test. After that, you have to keep the class balance the same across these sets (if the train set is 60% class A images and 40% class B images, the test set should have exactly the same proportions).
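A minimal sketch of such a stratified split, assuming scikit-learn and hypothetical `paths`/`labels` lists:

```python
from sklearn.model_selection import train_test_split

# Hypothetical data: image file paths and their orientation classes.
paths = [f"img_{i}.jpg" for i in range(100)]
labels = ["front", "left", "right", "back"] * 25

# 80/20 split that preserves the class proportions in both sets.
train_paths, test_paths, train_labels, test_labels = train_test_split(
    paths, labels, test_size=0.2, stratify=labels, random_state=42
)
```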

Object detection - Ignore specific image zones during training

I'm trying to use the Faster R-CNN algorithm for vehicle detection using Keras.
I have a dataset containing different folders, with every folder containing multiple images. I've managed to transform the annotation files for the images into a CSV file for the training process. The annotation files contain extra information about regions to be ignored in the images during training (black zones in the attached image). The image shows the bounding boxes for the vehicles along with the ignored zones in a test example, based on the information obtained from the annotation file for this image.
Is there a way to specify zones to focus on, or zones to ignore, during training of the algorithm?
"The loss function is modified to not penalize predictions inside these ignore annotation areas as false positives during training."
This phrase is from the article below (Sec. 3.2). It describes a good approach, but I don't know how to implement it.
Vandersteegen, Maarten, Kristof Van Beeck, and Toon Goedemé. "Super accurate low latency object detection on a surveillance UAV." 2019 16th International Conference on Machine Vision Applications (MVA). IEEE, 2019.
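For what it's worth, here is a hypothetical PyTorch-flavored sketch of that idea (the question uses Keras, but the principle transfers): before reducing the per-prediction loss, zero out the contribution of predictions that lie mostly inside an ignore region. All names and shapes here are assumptions, not the paper's code.

```python
import torch

def mask_ignored_predictions(pred_boxes, obj_loss, ignore_boxes, thresh=0.5):
    # pred_boxes:   (N, 4) predicted boxes as (x1, y1, x2, y2)
    # obj_loss:     (N,)   per-prediction objectness/false-positive loss
    # ignore_boxes: (M, 4) ignore regions from the annotation file
    if ignore_boxes.numel() == 0:
        return obj_loss                               # nothing to mask
    # Pairwise intersection between each prediction and each ignore box.
    wh = (torch.min(pred_boxes[:, None, 2:], ignore_boxes[None, :, 2:])
          - torch.max(pred_boxes[:, None, :2], ignore_boxes[None, :, :2])).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]                   # (N, M) intersection areas
    area = ((pred_boxes[:, 2] - pred_boxes[:, 0])
            * (pred_boxes[:, 3] - pred_boxes[:, 1])).clamp(min=1e-6)
    frac_inside = inter / area[:, None]               # fraction of each prediction covered
    in_ignore = frac_inside.max(dim=1).values > thresh
    return obj_loss * (~in_ignore).float()            # don't penalize those predictions
```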

Model unable to identify distant objects

I have made an object recognition and detection model using TensorFlow. It identifies objects that are clearly visible, but it is unable to identify the same object at a large distance. I am using the Faster R-CNN model; it identifies the object when it is close but not when it is far away, even though it has already been trained on that object. How can I make the model identify objects at a distance?
You can resize and pad images containing clearly visible objects, using data augmentation, so that the objects look like they are at a large distance, and then train your model further with those images.
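A minimal sketch of that augmentation, assuming OpenCV and boxes given as an (N, 4) array of (x1, y1, x2, y2) pixel coordinates:

```python
import cv2
import numpy as np

# Shrink the image and pad it back to its original size with black
# borders, so a close-up object looks like a distant one; the bounding
# boxes get the same scale and shift applied.
def shrink_and_pad(image, boxes, scale=0.3):
    h, w = image.shape[:2]
    small = cv2.resize(image, (int(w * scale), int(h * scale)))
    canvas = np.zeros_like(image)                 # black padding
    top = (h - small.shape[0]) // 2
    left = (w - small.shape[1]) // 2
    canvas[top:top + small.shape[0], left:left + small.shape[1]] = small
    new_boxes = boxes * scale + np.array([left, top, left, top])
    return canvas, new_boxes
```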

Can I train YOLO on small already segmented out images and test it on a large image for detection?

I have been thinking about building a YOLO model for detecting parking lot occupancy, and I have the small segmented-out images for every parking space. Can I train YOLO on these small images, already divided into separate empty and occupied classes, and then test it on an image such as the aerial view of a parking lot with, say, 28 parking spots, so that the model detects the occupied and empty spaces?
If yes, can someone guide me on how to approach the problem? I will be using YOLO implemented in Keras.
YOLO is an object detection model. During training, it takes the coordinates of bounding boxes in an image as input and learns to identify the objects inside those bounding boxes. As per your problem statement, if you have an aerial view of the parking lot, then draw the bounding boxes, generate the XML files (as per your training requirements), and start training. This should ideally give you the desired model for prediction.
Free tool to label images - https://github.com/tzutalin/labelImg
Github project to get an idea of how to train Yolo in Keras on custom dataset - https://github.com/experiencor/keras-yolo2
By no means is this a perfect tailor-made solution for your problem, given that you haven't provided any code or images, but it is a good place to start.
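As one concrete starting point, here is a minimal sketch for parsing the Pascal VOC XML files that labelImg saves by default into (class, box) pairs; the class names are hypothetical:

```python
import xml.etree.ElementTree as ET

# Read one Pascal VOC annotation file (labelImg's default output) and
# return a list of (class_name, [xmin, ymin, xmax, ymax]) tuples.
def parse_voc(xml_path):
    root = ET.parse(xml_path).getroot()
    objects = []
    for obj in root.iter("object"):
        name = obj.find("name").text              # e.g. "empty" or "occupied"
        bb = obj.find("bndbox")
        box = [int(float(bb.find(k).text))
               for k in ("xmin", "ymin", "xmax", "ymax")]
        objects.append((name, box))
    return objects
```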

How many images (minimum) should there be in each class for training YOLO?

I am trying to implement YOLOv2 on my custom dataset. Is there any minimum number of images required for each class?
There is no minimum number of images per class for training. Of course, the fewer images you have, the more slowly the model will converge and the lower the accuracy will be.
What is important, according to Alexey (maintainer of the popular darknet fork and creator of YOLOv4) and his notes on how to improve object detection, is:
For each object which you want to detect - there must be at least 1 similar object in the training dataset with about the same: shape, side of object, relative size, angle of rotation, tilt, illumination. So desirable that your training dataset include images with objects at different: scales, rotations, lightings, from different sides, on different backgrounds - you should preferably have 2000 different images for each class or more, and you should train 2000*classes iterations or more.
https://github.com/AlexeyAB/darknet
So I think you should have a minimum of 2000 images per class if you want optimal accuracy, though 1000 per class is not bad either. Even with hundreds of images per class you can still get decent (not optimal) results. Just collect as many images as you can.
It depends.
There is an objective minimum of one image per class. That may work with some accuracy, in principle, if using data-augmentation strategies and fine-tuning a pretrained YOLO network.
The objective reality, however, is that you may need as many as 1000 images per class, depending on your problem.
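If you are stuck with a small dataset, a box-aware augmentation pipeline is the usual way to stretch it. A minimal sketch, assuming the albumentations library (any equivalent that transforms boxes together with pixels works):

```python
import numpy as np
import albumentations as A

# Flips, brightness jitter, and rescaling, with bounding boxes kept in
# sync with the transformed pixels.
transform = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.RandomBrightnessContrast(p=0.3),
        A.RandomScale(scale_limit=0.3, p=0.5),
    ],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["class_labels"]),
)

image = np.zeros((416, 416, 3), dtype=np.uint8)   # dummy image
out = transform(image=image, bboxes=[(50, 60, 200, 220)], class_labels=["car"])
```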
