I’ve used an InceptionV3 network as the base for my model to classify car views (back, front, side, interior, other).
Now, one of the classes (interior) has only 350 images, while the largest (other) has more than 4000. The other three are about the same size, with approximately 1000 images per class.
The issue is that accuracy on the training and test data reaches ~75% during training, but when I then run the model on all images from the same data, nearly everything gets classified as ‘other’.
I thought that weighting the classes with class_weights would help, but the same issue occurs.
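For reference, this is roughly how I computed and passed the weights (a minimal sketch; the counts are approximate, and `model`/`train_generator` stand for my actual pipeline):

```python
# Approximate class counts, with keys sorted alphabetically to match the
# index order Keras' flow_from_directory assigns to class folders.
counts = {'back': 1000, 'front': 1000, 'interior': 350,
          'other': 4000, 'side': 1000}

# Inverse-frequency weighting: total / (n_classes * count_per_class).
total = sum(counts.values())
class_weight = {i: total / (len(counts) * counts[k])
                for i, k in enumerate(sorted(counts))}

# model.fit(train_generator, epochs=..., class_weight=class_weight)
```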
Can someone suggest what might be the reason for this behaviour?
I am trying to implement an image classification task for grayscale images that were converted from sensor readings. That is, I initially had time series data (e.g., acceleration or displacement), which I then transformed into images. Before the transformation, I applied normalization across the data. Each image is 1000x9, where 1000 is the number of time steps and 9 is the number of data channels. The split ratio is 70%/15%/15% for the training, validation, and test sets. There are 10 different labels with 100 images each, so it's a multi-class classification task.
An example of my array before image conversion is:
As you can see above, the values are very close together, differing only at high decimal precision. When I convert them into images, I can still make out the dark and light parts of the image:
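The conversion itself is roughly of this kind (a sketch of min-max scaling to 8-bit grayscale; my actual normalization details are simplified here):

```python
import numpy as np
from PIL import Image

def series_to_image(arr):
    """Min-max scale a (1000, 9) sensor array to 0-255 grayscale."""
    a = (arr - arr.min()) / (arr.max() - arr.min() + 1e-12)
    return Image.fromarray((a * 255).astype(np.uint8), mode='L')

# e.g. series_to_image(acceleration_array).save('D1/sample_0001.png')
```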
Imagine that I have directories from D1 to D9 (damaged cases) and UN (healthy case), and there are many images like this in each.
Then I have a CNN where my goal is classification, but there is a significant overfitting issue, and whatever I do doesn't work. One of the architectures I've been working on:
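Roughly, it is two convolutional blocks followed by a fully connected head (a sketch; the exact filter counts and dropout rate are illustrative, and my real runs varied them):

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    # Block 1: convolve over the (time, channel) image
    layers.Conv2D(32, (3, 3), activation='relu', padding='same',
                  input_shape=(1000, 9, 1)),
    layers.MaxPooling2D((2, 1)),   # pool along time only; width is just 9
    # Block 2
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 1)),
    # Fully connected head
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax'),  # 10 labels (D1-D9 + UN)
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```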
Model summary:
I also augment the data.
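The augmentation is mild on purpose (a sketch; the exact ranges are illustrative), since flips or rotations would scramble the physical meaning of the time axis and the 9 sensor channels:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Mild augmentation only: flips/rotations would break the physical
# ordering of the time axis and the sensor channels.
datagen = ImageDataGenerator(
    height_shift_range=0.1,   # small shifts along the 1000-step time axis
    width_shift_range=0.05,   # tiny shifts across the 9 channels
    zoom_range=0.05,
)
# train_flow = datagen.flow(x_train, y_train, batch_size=32)
```

After 250 epochs, this is what I get: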
So, here is my question: I have tried applying regularization and augmentation, but they do not give me solid results. I have experimented with changing the number of hidden units, the number of layers, etc. Do you think I need to change the architecture entirely? I basically use two blocks of convolutional layers with fully connected layers at the end. This is not the first time I've worked on images like this, but I cannot mitigate this overfitting issue. I know there are other overfitting topics on the forum, but since these images come from numerical arrays and I could not make it work, I wanted to create a new topic. I would appreciate any solid suggestions so I can get smooth results. Thank you!

I was also thinking of using some pre-trained models for transfer learning, but the image dimensions cause problems: do you know if any of those pre-trained models can be used with 1000x9 images?
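One workaround I'm considering is to stack the single channel to three and resize, so that an ImageNet backbone accepts the input (a sketch, untested on my data; ResNet50 is just an example, and the heavy stretch from 9 pixels wide to 224 may itself hurt):

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

inputs = layers.Input(shape=(1000, 9, 1))
# ImageNet weights expect 3 channels and at least ~32 px per side,
# so stack the channel and resize (Resizing requires TF 2.6+).
x = layers.Concatenate()([inputs, inputs, inputs])
x = layers.Resizing(224, 224)(x)  # ImageNet-style preprocessing omitted

backbone = ResNet50(include_top=False, weights='imagenet', pooling='avg')
backbone.trainable = False        # freeze first; fine-tune later if it helps
x = backbone(x)
outputs = layers.Dense(10, activation='softmax')(x)
model = models.Model(inputs, outputs)
```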
I have built a model using Keras. My main aim is to classify animals belonging to 10 different categories, and the model performs well on those. However, when I test it with non-animal images, the model still tries to fit the image into one of the 10 categories: given a non-animal image, I get an animal as output with a high confidence score. I know that to solve this issue I should use a threshold, but I have not found any resource on how to do this. Can someone help, please?
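A minimal sketch of the usual thresholding approach (reject a prediction when the top softmax probability is below a cutoff; the 0.8 value is arbitrary and should be tuned on held-out data that includes non-animal images, and note that softmax scores can still be overconfident on out-of-distribution inputs):

```python
THRESHOLD = 0.8  # tune on held-out data that includes non-animal images

def classify(model, images, class_names):
    """Return a class per image, or 'unknown' when confidence is below THRESHOLD."""
    probs = model.predict(images)        # softmax outputs, shape (N, 10)
    best = probs.argmax(axis=1)
    top = probs.max(axis=1)
    return [class_names[i] if p >= THRESHOLD else 'unknown'
            for i, p in zip(best, top)]
```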
I have been working on a dataset in which the goal is to determine the orientation of a product. It is a classification problem in which, for most records, I have 4 images: front-facing, left-facing, right-facing, and back-facing product images.
I want to classify these images into the above 4 categories.
The dataset looks like this:
I have downloaded the images and put them in different folders according to their classes.
Methods I have applied:
So far I have applied two methods to classify these images.
1) I tried VGG16 directly to classify the images, but it did not even reach 50% accuracy.
2) I converted those images into edge images with a black background:
This was done using Canny edge detection, because with the raw images the results were dominated by dresses of similar colors, similar designs, etc. (a sketch of this step appears below the list).
On top of these edge images I again applied VGG16, ResNet50, and Inception models, but nothing seemed to work.
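For reference, the edge conversion in step 2) is of this kind (a minimal OpenCV sketch; the file name and Canny thresholds are placeholders):

```python
import cv2

img = cv2.imread('product.jpg', cv2.IMREAD_GRAYSCALE)   # placeholder path
img = cv2.GaussianBlur(img, (5, 5), 0)    # suppress print/texture noise first
edges = cv2.Canny(img, 100, 200)          # thresholds are dataset-dependent
cv2.imwrite('product_edges.png', edges)   # white edges on a black background
```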
Can you suggest some ideas that could work in my case and classify the images in a better way?
First of all, your data set has to be split consistently, for instance 80% train and 20% test. After that, you have to balance the class proportions across these sets: if the train set is 60% class A images and 40% class B, the test set should be exactly the same.
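A stratified split does both in one call (a sketch, assuming the image paths and labels have already been collected into lists):

```python
from sklearn.model_selection import train_test_split

# 80/20 split in which every class keeps the same proportion in both sets.
train_paths, test_paths, train_labels, test_labels = train_test_split(
    paths, labels, test_size=0.2, stratify=labels, random_state=42)
```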
I have been thinking about building a YOLO model for detecting parking lot occupancy, and I have all the small segmented-out images for every parking space. Can I train YOLO on these small images, already divided into separate 'empty' and 'occupied' classes, and then test it on an image such as an aerial view of a parking lot with, say, 28 parking spots, with the model detecting the occupied and empty spaces?
If yes, can someone guide me on how to approach the problem? I will be using YOLO implemented in Keras.
YOLO is an object detection model. During training, it takes the coordinates of bounding boxes in an image as input and learns to identify the objects inside those bounding boxes. As per your problem statement, if you have an aerial view of the parking lot, then draw the bounding boxes, generate the XML files (as per your training requirements), and start training. This should, ideally, give you the model you need for prediction.
Free tool to label images - https://github.com/tzutalin/labelImg
GitHub project to get an idea of how to train YOLO in Keras on a custom dataset - https://github.com/experiencor/keras-yolo2
By no means is this a perfect, tailor-made solution for your problem, given you haven't provided any code or images, but it is a good place to start.
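As a concrete starting point, labelImg saves annotations in Pascal VOC XML, which you can read back with the standard library (a sketch; the 'occupied'/'empty' class names are just examples for this problem):

```python
import xml.etree.ElementTree as ET

def read_voc_boxes(xml_path):
    """Return (class_name, xmin, ymin, xmax, ymax) tuples from a labelImg XML file."""
    root = ET.parse(xml_path).getroot()
    boxes = []
    for obj in root.findall('object'):
        name = obj.find('name').text  # e.g. 'occupied' or 'empty'
        bb = obj.find('bndbox')
        coords = [int(float(bb.find(t).text))
                  for t in ('xmin', 'ymin', 'xmax', 'ymax')]
        boxes.append((name, *coords))
    return boxes
```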
I am trying to implement YOLOv2 on my custom dataset. Is there any minimum number of images required for each class?
There is no minimum number of images per class for training. Of course, the fewer images you have, the more slowly the model will converge and the lower its accuracy will be.
What matters, according to Alexey (maintainer of the popular darknet fork and creator of YOLOv4) in his notes on how to improve object detection, is:
For each object which you want to detect - there must be at least 1 similar object in the training dataset with about the same: shape, side of object, relative size, angle of rotation, tilt, illumination. So it is desirable that your training dataset include images with objects at different: scales, rotations, lightings, from different sides, on different backgrounds - you should preferably have 2000 different images for each class or more, and you should train for 2000*classes iterations or more.
https://github.com/AlexeyAB/darknet
So I think you should have a minimum of 2000 images per class if you want optimum accuracy, but 1000 per class is not bad either. Even with hundreds of images per class you can still get a decent (though not optimum) result. Just collect as many images as you can.
It depends.
There is an objective minimum of one image per class. That may work with some accuracy, in principle, if using data-augmentation strategies and fine-tuning a pretrained YOLO network.
The objective reality, however, is that you may need as many as 1000 images per class, depending on your problem.