Face Detection Using Own Custom Model From Scratch - Need Guidance - python-3.x

Goal: I want to build a model that can detect the number of faces present in a picture.
What I have:
24,533 images for training, and one CSV file which includes
7 columns - [Image_ID, width, height, xmin, ymin, xmax, ymax]
I want to build my own model by fitting this data to Keras Dense layers, so that when I pass any image (or a batch of images) it returns the Image_ID and the number of faces present in each photo.
I have gone through a lot of documents, including how to use YOLO, ImageAI, and the Haar cascade XML library, but in every case they address how to use these libraries, not how to build your own model.
Please guide me on how to use the information I already have to build my own model without using an existing one.
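For reference, here is a minimal sketch of how that CSV could be turned into a per-image face count, assuming the column names listed above and that each row describes one face box (pandas is used here, and the file name annotations.csv is a placeholder):

import pandas as pd

# each row of the CSV is assumed to describe one face bounding box
df = pd.read_csv('annotations.csv')  # placeholder file name

# number of faces per image = number of box rows sharing that Image_ID
face_counts = df.groupby('Image_ID').size().rename('num_faces')
print(face_counts.head())

Note that a stack of Dense layers alone has no spatial awareness; reliably counting faces in unseen images usually requires a convolutional detector trained on those bounding boxes.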

Related

Faster R-CNN model - images with no labels in the training data

How can I modify the example in the "Object Detection FasterRCNN Tutorial" (https://www.kaggle.com/code/pdochannel/object-detection-fasterrcnn-tutorial) to include empty-labeled images in the training data? I want to be able to handle cases where there are no classes of interest in certain images.
Additionally, I have noticed that even when I include images of underwater scenes (where there are no objects of interest), the model still tends to detect something.
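One way to handle this (a sketch, not the tutorial's own code): torchvision's detection models accept negative samples as long as the target contains empty boxes and labels tensors, so a dataset's __getitem__ can return something like the following for a background-only image:

import torch

def background_only_target(image_id):
    # a 'negative' sample: zero boxes of shape [0, 4] and zero labels
    return {
        'boxes': torch.zeros((0, 4), dtype=torch.float32),
        'labels': torch.zeros((0,), dtype=torch.int64),
        'image_id': torch.tensor([image_id]),
        'area': torch.zeros((0,), dtype=torch.float32),
        'iscrowd': torch.zeros((0,), dtype=torch.int64),
    }

Including such empty underwater frames in training is also the usual way to reduce the false positives described above.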

How to place the dataset for training YOLOv5?

I'm currently working on object detection using YOLOv5. I trained a model with a custom dataset which has 3 classes = ['Car', 'Motorcycle', 'Person'].
I have many questions related to YOLOv5.
All the custom images are labelled using Roboflow.
Question 1: As you can see from the table, my dataset has a mix of images with different sizes. Will this be a problem in training? Also, assume I've trained the model and got 'best.pt'. Will that model work efficiently on images/videos of any dimensions?
Question 2:
Is this directory structure correct for training? I also have a 'test' directory, but it seems that directory is not used at all; the images in the 'test' folder appear useless. (I know I'm asking dumb questions, please bear with me.)
Is it OK if I place all my images like this?
And do I need a 'test' folder?
Question 3: What is 'imgsz' in detect.py? Does it downsample the input source?
I've spent more than 3 weeks on YOLO. I love it, but I find some parts difficult to grasp. Kindly provide suggestions for these questions. Thanks in advance.
"question1 : As you can see from the table that my dataset has mix of images with different sizes. Will this be a problem in training? And also assume that i’ve trained the model and got ‘best.pt’. Will that model work efficiently in any dimensions of images/videos."
As long as you've resized/normalized all of your images to be the same square size, then you should be fine. YOLO trains on square images. You can use a platform like Roboflow to process your images so they not only come out in the right structure (for your images and annotation files) but also resize them while generating your dataset so they are all the same size. http://roboflow.com/ - you just need to make a public workspace to upload your images to and you can use the platform free. Here's a video that covers custom training with YOLOv5: https://www.youtube.com/watch?v=x0ThXHbtqCQ
Roboflow's python package can also be used to extract your images programmatically: https://docs.roboflow.com/python
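A minimal sketch of that programmatic export (the API key, workspace, and project names below are placeholders):

from roboflow import Roboflow

rf = Roboflow(api_key='YOUR_API_KEY')  # placeholder key
project = rf.workspace('your-workspace').project('your-project')  # placeholder names
dataset = project.version(1).download('yolov5')  # downloads the dataset in YOLOv5 format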
"Is this directory model correct for training. Even i have ‘test’ directory but it seems that the directory is not at all used. The images in the ‘test’ folder is useless. ( I know that i’m asking dumb questions, please bare with me.)"
Yes that directory model is correct from training. Its what I have whenever I run YOLOv5 training too.
You do need a test folder if you want to run inference against the test folder images to learn more about your model's performance.
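For reference, the layout YOLOv5 expects looks roughly like this (the test split is optional), together with a data.yaml pointing at it; the folder names below are just the common convention, not taken from the question:

dataset/
  images/
    train/
    val/
    test/
  labels/
    train/
    val/
    test/

and in data.yaml:

train: dataset/images/train
val: dataset/images/val
test: dataset/images/test
nc: 3
names: ['Car', 'Motorcycle', 'Person']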
The 'imgsz' parameter in detect.py sets the height/width of the images for inference. Set it to the value you used for --img when you ran train.py.
For example: did you resize images to 640 by 640 when generating your dataset? Then use (640, 640) for the 'imgsz' parameter (that is the default value), which also means you would have set --img to 640 when you ran train.py.
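Concretely, the two runs might look like this (the paths and the yolov5s.pt starting weights are common defaults, not something given in the question):

python train.py --img 640 --batch 16 --epochs 100 --data data.yaml --weights yolov5s.pt
python detect.py --weights runs/train/exp/weights/best.pt --img 640 --source path/to/test/images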
detect.py parameters (YOLOv5 Github repo)
train.py parameters (YOLOv5 Github repo)
YOLOv5's Github: Tips for Best Training Results https://github.com/ultralytics/yolov5/wiki/Tips-for-Best-Training-Results
Roboflow's Model Production Tips: https://docs.roboflow.com/model-tips

Keras: Load dataset and autocrop relevant area of image

I'm working on signature verification, and there are a bunch of things I wanted to do using Keras/OpenCV/PIL but couldn't find relevant information on. I have loaded the dataset folder using keras.preprocessing.image_dataset_from_directory and now need to:
Crop the signature from the image stored in the dataset. There may be rectangular borders (or a side of a border), and the border pixels aren't the same in all images.
Resize the image and also take care of augmentation of the signature.
Example Images:
Since I'm working in Keras, I thought of using its functions but couldn't find any. How can I auto-crop/extract a signature from the dataset I've loaded? As for image augmentation, should I do it in this image-preprocessing stage, or implement it in the CNN model I am using? I am new to image processing and Keras.
Also, because I loaded the entire training folder as one dataset, the labels are "Genuine" and "Forged". However, there are multiple genuine and forged signatures per person, and there are multiple people. How do I divide the data?
Organize your directories as follows
main_dir
-train_dir
--person1_fake_dir
---person1 fake image
---person1 fake image
---etc
--person1_real_dir
---person1 real image
---person1 real image
---etc
--person2_fake_dir
---person2 fake image
---person2 fake image
---etc
--person2_real_dir
---person2 real image
---person2 real image
---etc
.
.
.
--personN_fake_dir
---personN fake image
---personN fake image
---etc
--personN_real_dir
---personN real image
---personN real image
---etc
-test_dir
--same structure as train_dir but put test images here
-valid_dir
--same structure as train_dir but put validation images here
If you have N persons, then you will have 2 x N classes.
You can then use tf.keras.preprocessing.image.ImageDataGenerator().flow_from_directory() to input your data. Documentation is here. You don't have to worry about cropping the images; just set the target_size in flow_from_directory to something like (256, 256).
The code below shows the rest of what you need (it assumes you have already defined a Keras model named model):

import tensorflow as tf

data_gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
train_gen = data_gen.flow_from_directory(train_dir, target_size=(224, 224), color_mode='grayscale')
valid_gen = data_gen.flow_from_directory(valid_dir, target_size=(224, 224), color_mode='grayscale', shuffle=False)
test_gen = data_gen.flow_from_directory(test_dir, target_size=(224, 224), color_mode='grayscale', shuffle=False)
model.compile(optimizer=tf.keras.optimizers.Adam(), loss=tf.keras.losses.CategoricalCrossentropy(), metrics=['accuracy'])
history = model.fit(train_gen, validation_data=valid_gen, epochs=20, verbose=1)
accuracy = model.evaluate(test_gen)[1] * 100
print('Model accuracy is', accuracy)
Note that your model will not be able to tell fake from real in the general case; it should only work for persons 1 through N. You could try putting all the fake images in one class directory and all the real images in another and training on that, but I suspect it will not work well at telling real from fake in the general case.
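On the auto-cropping part of the question, a rough OpenCV sketch is below. It assumes dark ink on a light background, and the fixed border margin is a guess you would need to tune per dataset:

import cv2
import numpy as np

def autocrop_signature(path, margin=10):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # binarize with Otsu: ink becomes white (255), paper black (0)
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # zero out a fixed margin to suppress rectangular border lines at the edges
    mask[:margin, :] = 0
    mask[-margin:, :] = 0
    mask[:, :margin] = 0
    mask[:, -margin:] = 0
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return gray  # nothing detected; return the image unchanged
    return gray[ys.min():ys.max() + 1, xs.min():xs.max() + 1]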

YOLOv3 object detection not detecting the object and bounding boxes are not bounding the objects

I am implementing YOLOv3 and have trained the model on my custom class (which is tomato). I used the darknet53 weights (https://pjreddie.com/media/files/darknet53.conv.74) to start my training, as per the instructions provided by many sites on training and object detection using YOLOv3. I thought it was not necessary to list the steps.
One of my object images used for training is shown below (with bounding boxes drawn using labelImg):
The txt file for the above image contains the following bounding-box coordinates, as created by labelImg (each line is class x_center y_center width height, normalized to the image dimensions):
0 0.152807 0.696655 0.300640 0.557093
0 0.468728 0.705306 0.341862 0.539792
0 0.819652 0.695213 0.337242 0.543829
0 0.317164 0.271626 0.324449 0.501730
Now when I use the same image for testing, to determine the accuracy of detection, it is unable to detect all the tomatoes, and moreover the bounding boxes are way off from the objects, as shown below:
I am not sure what is going on.
I have cloned the git repository https://github.com/AlexeyAB/darknet, did a local make, and trained the model on the custom object. Nothing fancy.
The pictures above were taken with my phone. I trained darknet using a combination of downloaded images and custom tomato pictures I had taken with my phone. I have 290 images for training.
Maybe your model can't generalize well. Maybe you are training too much, which can cause over-fitting, or your dataset is too small.
You can try testing on never-seen data (a new tomato picture) and see if it does well.
Double-check your config files; something may be incorrect there, like using a yolov4 cfg with a yolov3 model.
I also recommend that you read this article, which can help you better understand how neural networks work:
https://towardsdatascience.com/understand-neural-networks-model-generalization-7baddf1c48ca
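As a quick sanity check for the "boxes are way off" symptom, it can also help to convert a label line back to pixel coordinates and draw it on the training image; if the drawn box does not sit on a tomato, the labels (not the model) are the problem. A sketch, where img_w and img_h are the image's width and height:

def yolo_to_pixels(xc, yc, w, h, img_w, img_h):
    # labelImg's YOLO format is normalized: class x_center y_center width height
    x1 = (xc - w / 2) * img_w
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h
    return x1, y1, x2, y2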

Can I train YOLO on small already segmented out images and test it on a large image for detection?

I have been thinking about building a YOLO model for detecting parking-lot occupancy, and I have all the small segmented-out images for every parking space. Can I train YOLO on these small images, already divided into separate empty and occupied classes, and then test it on an image like the aerial view of a parking lot with, say, 28 parking spots, with the model detecting the occupied and empty spaces?
If yes, can someone guide me on how to approach the problem? I will be using YOLO implemented in Keras.
YOLO is an object detection model. During training, it takes an image together with the coordinates of bounding boxes in that image as input and learns to identify the objects inside those bounding boxes. For your problem statement, if you have an aerial view of the parking lot, draw the bounding boxes, generate the XML files (as per your training requirement), and start training. This ideally should give you the desired model.
Free tool to label images - https://github.com/tzutalin/labelImg
Github project to get an idea of how to train Yolo in Keras on custom dataset - https://github.com/experiencor/keras-yolo2
By no means is this a perfect, tailor-made solution for your problem, given you haven't provided any code or images, but it is a good place to start.
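For reference, a labelImg Pascal VOC annotation for one occupied spot in an aerial shot would look roughly like this (the file name, class name, and coordinates are made up for illustration):

<annotation>
  <filename>lot_aerial_001.jpg</filename>
  <size>
    <width>1280</width>
    <height>720</height>
    <depth>3</depth>
  </size>
  <object>
    <name>occupied</name>
    <bndbox>
      <xmin>412</xmin>
      <ymin>233</ymin>
      <xmax>468</xmax>
      <ymax>301</ymax>
    </bndbox>
  </object>
</annotation>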
