I'm trying to use the Faster R-CNN algorithm for vehicle detection using Keras.
I have a dataset containing different folders, with every folder containing multiple images. I've managed to transform the annotation files for the images into a CSV file for the training process. The annotation files contain extra information about regions to be ignored in the images during training (the black zones in the attached image). The image shows the bounding boxes for the vehicles along with the ignored zones for a test example, based on the information obtained from the annotation file for this image.
Is there a way to specify particular zones to focus on or to ignore during training of the algorithm?
The loss function is modified to not penalize predictions inside these ignore annotation areas as false positives during training.
This phrase is from the article below (Sec. 3.2); it describes a good approach, but I don't know how to implement it.
Vandersteegen, Maarten, Kristof Van Beeck, and Toon Goedemé. "Super accurate low latency object detection on a surveillance UAV." 2019 16th International Conference on Machine Vision Applications (MVA). IEEE, 2019.
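From what I understand, one common way to implement what that sentence describes is to mark anchors whose centre falls inside an ignore region as "don't care" during target assignment, so they count neither as positives nor as negatives in the loss. Below is a minimal NumPy sketch of that idea; the helper name, the (x1, y1, x2, y2) box format, and the -1 convention are my own assumptions, not code from the paper.

    import numpy as np

    def ignore_mask(anchor_boxes, ignore_regions):
        """Boolean mask: True for anchors whose centre lies inside an ignore region.
        Boxes are (x1, y1, x2, y2) in pixels. Hypothetical helper, for illustration only."""
        cx = (anchor_boxes[:, 0] + anchor_boxes[:, 2]) / 2.0
        cy = (anchor_boxes[:, 1] + anchor_boxes[:, 3]) / 2.0
        mask = np.zeros(len(anchor_boxes), dtype=bool)
        for x1, y1, x2, y2 in ignore_regions:
            mask |= (cx >= x1) & (cx <= x2) & (cy >= y1) & (cy <= y2)
        return mask

    # During RPN target assignment, anchors flagged by this mask would be given a
    # "don't care" label (e.g. -1) so they are excluded from the objectness and
    # classification losses instead of being penalised as false positives.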
Related
How can I modify the example in the "Object Detection FasterRCNN Tutorial" (https://www.kaggle.com/code/pdochannel/object-detection-fasterrcnn-tutorial) to include images with no labeled objects in the training data? I want to be able to handle cases where there are no classes of interest in certain images.
Additionally, I have noticed that even when I include images of underwater scenes (where there are no objects of interest), the model still tends to detect something.
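Assuming the tutorial's training loop uses torchvision's Faster R-CNN (recent versions accept negative samples), a dataset item for an image with no objects can simply return zero-length box and label tensors; including such background-only images (e.g. the underwater scenes) as negatives is the usual way to teach the model not to fire on them. A minimal sketch, with the helper name being my own:

    import torch

    def empty_target():
        # Target dict for an image that contains no objects of interest.
        # Recent torchvision Faster R-CNN versions accept these zero-length
        # tensors and treat the whole image as background during training.
        return {
            "boxes": torch.zeros((0, 4), dtype=torch.float32),
            "labels": torch.zeros((0,), dtype=torch.int64),
        }

    # Inside the dataset's __getitem__, something along these lines:
    #     if len(records) == 0:        # no annotations for this image
    #         return image, empty_target()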
In darknet, the YOLOv3 model has a coco.names file for labels, which includes 80 classes.
Now I want to train a custom model with two labels only, where one label is already in coco.names and the other is not.
For example, I want to train a model to detect cell phones and DSLR cameras; the cell phone class already exists in coco.names, whereas DSLR camera is not in its labels file.
So can I train a custom model with the two classes (cell phone and DSLR camera) while providing training data only for the DSLR camera, and have it predict both classes? Or should I train with data for both cell phone and DSLR images? Or is there another way out?
I am a bit new to ML, so any help would be great
Thanks
So you want to fine-tune a pre-trained model.
Think of classes as just a set of end nodes of the network; the labels (phone, camera) are simply a naming convention for them, there to give us a human-readable reference.
These nodes are fully connected (with associated weights) to the previous layer of the network; the total number of these connections varies with the number of end nodes (classes) you have.
With a fully trained model, you can't just keep the nodes you want, take out the rest, and add a few more, because the previous layer (and the full network) was trained to produce predictions for a specific set of final nodes.
So basically you need to fully reset the last layer (the head) and re-initialize it with the desired number of classes. The idea is that you take advantage of the previous training effort on a broader dataset and fine-tune it to your own data.
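As a rough illustration of that head reset, here is a minimal Keras sketch of a generic classifier (not darknet-specific; the backbone and layer choices are placeholders I picked for the example): keep the pretrained backbone, drop its original head, and attach a fresh one sized for the new number of classes.

    import tensorflow as tf

    NUM_CLASSES = 2  # e.g. cell phone, DSLR camera

    # Pretrained backbone without its original classification head.
    backbone = tf.keras.applications.MobileNetV2(
        include_top=False, weights="imagenet", pooling="avg")
    backbone.trainable = False  # optionally freeze it while the new head trains

    # Fresh head sized for the new number of classes, randomly initialised.
    model = tf.keras.Sequential([
        backbone,
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

For darknet YOLOv3 the same idea is usually applied by editing the cfg: set classes=2 in each [yolo] layer and adjust the filters= of the convolutional layer just before each [yolo] layer accordingly, so the detection head is rebuilt for the new class count.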
Short answer: you need data for both classes, and you need to change the model to output only 2 classes.
To configure that specific model for the new number of classes and data, I believe you can find some guidance and instructions here.
I am implementing YOLOv3 and have trained the model on my custom class (tomato). I used the darknet53 weights (https://pjreddie.com/media/files/darknet53.conv.74) to start my training, following the instructions provided by many sites on training and object detection with YOLOv3, so I haven't listed the steps here.
One of the object images used for training is shown below (with bounding boxes drawn using LabelImg):
The .txt file with the bounding boxes for the above image contains the following coordinates, as created by LabelImg:
0 0.152807 0.696655 0.300640 0.557093
0 0.468728 0.705306 0.341862 0.539792
0 0.819652 0.695213 0.337242 0.543829
0 0.317164 0.271626 0.324449 0.501730
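(For reference, each line is LabelImg's YOLO format: class x_center y_center width height, all normalized by the image dimensions. A quick sketch like the one below can draw the labels back onto the image to verify them visually; the file names are placeholders.)

    import cv2

    img = cv2.imread("tomato.jpg")            # placeholder file names
    h, w = img.shape[:2]

    with open("tomato.txt") as f:
        for line in f:
            _cls, xc, yc, bw, bh = line.split()
            xc, yc = float(xc) * w, float(yc) * h
            bw, bh = float(bw) * w, float(bh) * h
            x1, y1 = int(xc - bw / 2), int(yc - bh / 2)
            x2, y2 = int(xc + bw / 2), int(yc + bh / 2)
            cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)

    cv2.imwrite("tomato_check.jpg", img)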
Now when I use the same image for testing to determine the detection accuracy, it is unable to detect all the tomatoes, and moreover the bounding boxes are way off from the objects, as shown below:
I am not sure what is going on.
I cloned the repository https://github.com/AlexeyAB/darknet, did a local make, and trained the model on the custom object. Nothing fancy.
The pictures above were taken with my phone. I trained darknet using a combination of downloaded images and custom tomato pictures taken with my phone. I have 290 images for training.
Maybe your model can't generalize well. Maybe you are training for too long, which can cause over-fitting, or your dataset is too small.
You can try testing on data the model has never seen (a new tomato picture) and see if it does well.
Double-check your config files in case something is incorrect there, for example using a YOLOv4 cfg with a YOLOv3 model.
I also recommend reading this article, which can help you better understand how neural networks work:
https://towardsdatascience.com/understand-neural-networks-model-generalization-7baddf1c48ca
I have been working on a dataset where the goal is to determine the orientation of a product. It is a classification problem in which, for each record (for most of them), I have 4 images: front-facing, left-facing, right-facing, and back-facing product images.
I want to classify these images into the above 4 categories.
The dataset looks like this:
I have downloaded the images and put them in different folders according to their classes.
Methods I have applied:
So far I have applied two methods to classify these images.
1) I tried VGG16 directly to classify the images, but it did not even give me 50% accuracy.
2) I converted the images into edge images with a black background, like this:
This was done using Canny edge detection (see the sketch below). I did it because in the results I was getting confusion between dresses of similar colors, similar designs, etc.
On top of these edge images I again applied VGG16, ResNet50, and Inception models, but nothing seemed to work.
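Roughly, the conversion to edge images was done like this (a simplified sketch; the file name and Canny thresholds are placeholders, not the exact values I used):

    import cv2

    # Convert an image to its Canny edge map: white edges on a black background.
    img = cv2.imread("dress.jpg")                     # placeholder file name
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)                 # thresholds to be tuned
    cv2.imwrite("dress_edges.jpg", edges)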
Can you suggest some ideas that could work in my case and classify these images in a better way?
First of all, your dataset has to be split properly, for instance 80% train and 20% test. After that you have to balance these sets so that both have the same class proportions (e.g. if the train set has 60% class A images and 40% class B images, the test set should have exactly the same proportions).
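A minimal sketch of such a stratified split with scikit-learn (the file names and labels here are placeholders; stratify keeps the class proportions identical in both sets):

    from sklearn.model_selection import train_test_split

    # Placeholder data: image file names and their orientation labels.
    labels = ["front", "back", "left", "right"] * 5           # 20 balanced samples
    image_paths = [f"img_{i:03d}.jpg" for i in range(len(labels))]

    # 80% train / 20% test, with the same class proportions in both sets.
    train_paths, test_paths, y_train, y_test = train_test_split(
        image_paths, labels, test_size=0.2, stratify=labels, random_state=42)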
I have been thinking about building a YOLO model for detecting parking lot occupancy. I have small, segmented-out images for every parking space. Can I train YOLO on these small images, already divided into separate empty and occupied classes, and then test it on an image like an aerial view of a parking lot with, say, 28 parking spots, so that the model detects the occupied and empty spaces?
If yes, can someone guide me on how to approach the problem? I will be using YOLO implemented in Keras.
YOLO is an object detection model. During training, it takes the coordinates of bounding boxes in an image as input and learns to identify the objects inside those bounding boxes. As per your problem statement, if you have an aerial view of the parking lot, then draw the bounding boxes, generate the XML annotation files (as required by your training setup), and start training. This should ideally give you the desired model for prediction.
Free tool to label images - https://github.com/tzutalin/labelImg
Github project to get an idea of how to train Yolo in Keras on custom dataset - https://github.com/experiencor/keras-yolo2
By no means is this a perfect, tailor-made solution to your problem, given that you haven't provided any code or images, but it is a good place to start.