Azure Custom Vision model precision

I'm having trouble identifying the best way to optimize a Custom Vision model.
When using image classification, will labeling the same image with different labels at different times affect the precision of the model?
// For example, labeling and training image A today with the label 't-shirt', then labeling and training image A tomorrow with the label 'blue'.
// We are basically trying to input one classification at a time for the same image (five classifications in total, such as style and color) and want to know whether this approach will affect the prediction precision.
Will labeling a larger amount of data at once when inputting images for training increase the precision of the model? (For example, is there any difference between 50 labels and 100 labels for an image to learn at a time?)
Is there any way to teach the model object recognition using the results gained from image classification? Otherwise, can I train image classification and object recognition separately with the same type of image?
Will running the training process longer (for example, 10 hours instead of 1 hour) always give better results?
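To make the first question concrete, here is a minimal plain-Python sketch (not the Custom Vision API) of how tags applied to the same image on different days could be viewed as one multi-label set, and how precision for a single tag is computed as TP / (TP + FP). The function names `merge_tag_sessions` and `precision` and the ID `image_A` are hypothetical, and the sketch assumes a multilabel setup where tags accumulate on an image instead of replacing each other.

```python
from collections import defaultdict


def merge_tag_sessions(sessions):
    """Combine tags given to the same image in separate labeling sessions."""
    merged = defaultdict(set)
    for session in sessions:
        for image_id, tags in session.items():
            merged[image_id].update(tags)
    return dict(merged)


def precision(true_tags, predicted_tags, tag):
    """Precision for one tag: true positives / all positive predictions."""
    tp = fp = 0
    for image_id, predicted in predicted_tags.items():
        if tag in predicted:
            if tag in true_tags.get(image_id, set()):
                tp += 1
            else:
                fp += 1
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical sessions: 't-shirt' is added today, 'blue' tomorrow.
day_one = {"image_A": {"t-shirt"}}
day_two = {"image_A": {"blue"}}
labels = merge_tag_sessions([day_one, day_two])

# Hypothetical predictions used only to exercise the precision function.
predictions = {"image_A": {"blue"}, "image_B": {"blue"}}
blue_precision = precision(labels, predictions, "blue")
```

In this toy case `image_A` ends up with both tags, and one of the two 'blue' predictions is correct.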

Related

Can't overcome Overfitting - GrayScale Images from Numerical Arrays and CNN with PyTorch

I am trying to implement an image classification task for grayscale images that were converted from sensor readings. That is, I initially had time series data (e.g. acceleration or displacement) and then transformed it into images. Before the transformation, I applied normalization across the data. Each image is 1000x9, where 1000 is the total number of time steps and 9 is the number of data points. The split ratio is 70%, 15%, and 15% for the training, validation, and test sets. There are 10 different labels, each with 100 images, so it's a multi-class classification task.
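For reference, a 70/15/15 split like the one described can be done per label so that every class keeps the same proportions. This is only a sketch under assumptions: the file names are made up, and `stratified_split` is a hypothetical helper, not the code actually used in the question.

```python
import random


def stratified_split(items_by_label, train=0.70, val=0.15, seed=0):
    """Split each label's items into train/val/test with the same ratios."""
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for label, items in items_by_label.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        n_train = round(len(shuffled) * train)
        n_val = round(len(shuffled) * val)
        splits["train"] += [(x, label) for x in shuffled[:n_train]]
        splits["val"] += [(x, label) for x in shuffled[n_train:n_train + n_val]]
        # Whatever remains after train and val goes to the test set.
        splits["test"] += [(x, label) for x in shuffled[n_train + n_val:]]
    return splits


# 10 labels (D1..D9 and UN) with 100 hypothetical image names each.
labels = [f"D{i}" for i in range(1, 10)] + ["UN"]
dataset = {label: [f"{label}_{k:03d}.png" for k in range(100)] for label in labels}
splits = stratified_split(dataset)
```

With 100 images per label this gives 70/15/15 images per class, i.e. 700/150/150 overall.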
An example of my array before image conversion is:
As you can see above, the values are highly sensitive to numerical precision. When I convert them into images, I can see the dark and white parts of the image:
Imagine that I have directories D1 to D9 (damaged cases) and UN (healthy case), each containing many images like this.
Then I have a CNN whose goal is classification. However, there is a significant overfitting issue, and nothing I do resolves it. One of the architectures I've been working on:
Model summary:
I also augment the data. After 250 epochs, this is what I get:
I have tried regularization and augmentation, but neither gives solid results. I have also experimented with the number of hidden units, layers, and so on. Do you think I need to change my architecture completely? It basically consists of two CNN blocks with FC layers at the end. This is not the first time I've worked on images like this, but I cannot mitigate the overfitting. I would appreciate any solid suggestions that would give me smoother results. I was also thinking of using a pre-trained model for transfer learning, but the image dimensions cause problems. Do you know whether any pre-trained model can be used with a 1000x9 image? I know there are other overfitting topics in the forum, but since these images come from numerical arrays and I could not make it work, I wanted to open a new one. Thank you!

Can HSV images be used for CNN training

I am currently working on a finger-counting deep learning problem. In the dataset, the images in the training and validation sets are very basic and almost identical, so the network achieves high training and validation accuracy. But when it comes to predicting on real-life images, it performs very badly (because the model has been trained only on very basic images).
To overcome this, I converted the training and validation images to HSV (Hue-Saturation-Value) and trained the model on the new HSV images. An example of one such image from the new training set is:
I then convert my real-life image to HSV and pass it to the model for prediction. But the model still cannot predict correctly. I assumed that since the training images and the image to predict look almost the same after the HSV conversion, the model should predict well. Is there something wrong in my reasoning? Can HSV images actually be used for training a CNN?
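The conversion step described above can be sketched with Python's standard `colorsys` module. This is only an illustration on a tiny hand-made image, not the code used in the question; `rgb_image_to_hsv` is a hypothetical helper. The point it shows is that the exact same conversion (including value ranges) has to be applied to both the training images and the image passed in for prediction.

```python
import colorsys


def rgb_image_to_hsv(pixels):
    """Convert rows of 8-bit (R, G, B) tuples to (H, S, V) floats in [0, 1]."""
    return [
        [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0) for (r, g, b) in row]
        for row in pixels
    ]


# A 2x2 toy image: red, green, blue, white.
tiny_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
hsv = rgb_image_to_hsv(tiny_image)
```

Here pure red maps to hue 0 with full saturation and value, while white has zero saturation.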
It seems you have an overfitting issue: your model only memorizes the simple samples in the training set and cannot generalize to more complex and diverse data.
In deep learning there are various methods to avoid overfitting, and I don't think you necessarily need to transform your input to HSV. First of all, you can apply data augmentation methods such as random cropping or rotation to create varied versions of your data. If that does not work, you can use a smaller model or apply techniques such as dropout or regularization.
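As a rough idea of what random crop and rotation augmentation look like, here is a small NumPy sketch. It is an assumption-laden illustration: the rotation is restricted to multiples of 90 degrees (via `np.rot90`) to keep the code short, the image is random noise standing in for a real sample, and the helper names are hypothetical. Real pipelines usually rely on the augmentation utilities of the training framework.

```python
import numpy as np


def random_crop(image, crop_h, crop_w, rng):
    """Cut a random crop_h x crop_w window out of an (H, W, C) image."""
    h, w = image.shape[:2]
    top = int(rng.integers(0, h - crop_h + 1))
    left = int(rng.integers(0, w - crop_w + 1))
    return image[top:top + crop_h, left:left + crop_w]


def random_rot90(image, rng):
    """Rotate by a random multiple of 90 degrees in the image plane."""
    k = int(rng.integers(0, 4))
    return np.rot90(image, k=k, axes=(0, 1))


def augment(image, crop_size, rng):
    """Random square crop followed by a random 90-degree rotation."""
    return random_rot90(random_crop(image, crop_size, crop_size, rng), rng)


rng = np.random.default_rng(0)
# Placeholder 64x64 RGB image; a real training sample would go here.
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
augmented = [augment(image, 56, rng) for _ in range(4)]
```

Each call produces a different 56x56 view of the same source image, so the network sees more variety per epoch.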
Here is a good tutorial from TensorFlow.

how to make ground-truth or training classes for hyperspectral image classification

I have a hyperspectral image with 186 bands. What is an appropriate way to generate ground truth so that I can build training classes for a machine learning model? The image is shown below:
You need to manually create masks, or assign classes to regions of interest, on a 2D image that serves as the ground truth data (you may need to convert it to the same type as the hyperspectral image data, containing only a single band of information).
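A minimal sketch of that idea, assuming the hyperspectral cube is available as a NumPy array of shape (height, width, bands) and the hand-drawn mask is a 2D integer array where 0 means "unlabeled". The cube below is random data and the rectangular regions are made up; `training_pixels_from_mask` is a hypothetical helper.

```python
import numpy as np


def training_pixels_from_mask(cube, mask, background=0):
    """Return per-pixel spectra and class ids for all labeled pixels.

    cube: (H, W, B) hyperspectral data; mask: (H, W) integer class map.
    """
    labeled = mask != background
    spectra = cube[labeled]   # shape (N, B): one spectrum per labeled pixel
    classes = mask[labeled]   # shape (N,): class id of each labeled pixel
    return spectra, classes


rng = np.random.default_rng(0)
cube = rng.random((20, 30, 186))           # placeholder 186-band image
mask = np.zeros((20, 30), dtype=np.int64)  # single-band ground truth map
mask[2:6, 3:8] = 1       # hypothetical region of interest for class 1
mask[10:15, 20:25] = 2   # hypothetical region of interest for class 2

X, y = training_pixels_from_mask(cube, mask)
```

`X` and `y` can then be fed to any per-pixel classifier; unlabeled pixels are simply left out of training.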

Training a CNN on temporal image data

I am working on a project where I have 1024x1024 brain images over time depicting blood flow. A blood flow parameter image is computed from the brain images over time and is of the same dimensions (1024x1024). My goal is to train a CNN to learn the mapping between the brain images over time and the blood flow parameter image.
I've looked into current CNN architectures, but it seems like most research on CNNs is either done for classification on single images (not images over time) or action recognition on video data, which I'm not sure my problem falls under. If anyone can provide me with any insight or papers I can read on how to train a model on temporal data, with the output being an image (rather than a classification score), that would be immensely helpful.
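To make the input/output shapes of this problem concrete, here is a small NumPy sketch of one possible way to arrange a training pair: the time frames are stacked along a leading axis (which an image-to-image network could treat as channels) and the parameter image is the regression target. This arrangement is an assumption for illustration, not something stated in the question, and the sizes are shrunk from 1024x1024 to keep the example light.

```python
import numpy as np


def make_sample(frames, parameter_map):
    """Build one (input, target) pair for image-to-image regression.

    frames: sequence of T arrays of shape (H, W); parameter_map: (H, W).
    Returns x of shape (T, H, W) and y of shape (1, H, W).
    """
    x = np.stack(frames, axis=0).astype(np.float32)
    y = np.asarray(parameter_map, dtype=np.float32)[np.newaxis]
    if x.shape[1:] != y.shape[1:]:
        raise ValueError("frames and parameter map must share height and width")
    return x, y


rng = np.random.default_rng(0)
# Placeholder data: 8 time frames of 16x16 instead of 1024x1024.
frames = [rng.random((16, 16)) for _ in range(8)]
parameter_map = rng.random((16, 16))
x, y = make_sample(frames, parameter_map)
```

The output has the same spatial size as each input frame, which is what distinguishes this setup from classification.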

Can I train YOLO on small already segmented out images and test it on a large image for detection?

I have been thinking about building a YOLO model for detecting parking lot occupancy. I have small, already segmented images of every parking space. Can I train YOLO on these small images, already divided into separate empty and occupied classes, and then test it on an image such as an aerial view of a parking lot with, say, 28 parking spots, so that the model detects the occupied and empty spaces?
If yes, can someone guide me on how to approach the problem? I will be using YOLO implemented in Keras.
YOLO is an object detection model. During training, it takes the coordinates of bounding boxes in an image as input and learns to identify the objects inside those bounding boxes. For your problem, if you have an aerial view of the parking lot, draw the bounding boxes on it, generate the XML files (as required by your training setup), and start training. Ideally this should give you the model you need for prediction.
Free tool to label images - https://github.com/tzutalin/labelImg
Github project to get an idea of how to train Yolo in Keras on custom dataset - https://github.com/experiencor/keras-yolo2
This is by no means a perfectly tailored solution for your problem, given that you haven't provided any code or images, but it is a good place to start.
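For a sense of what the XML files mentioned above contain, here is a sketch that writes a Pascal VOC-style annotation (the XML format labelImg can save) using only the standard library. The file name, image size, and box coordinates are invented, and only a subset of the usual fields is written, so treat it as an outline and check it against what your training code expects.

```python
import xml.etree.ElementTree as ET


def voc_annotation(filename, width, height, boxes):
    """Build a Pascal VOC-style XML string.

    boxes: list of (label, xmin, ymin, xmax, ymax) in pixel coordinates.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    for tag, value in (("width", width), ("height", height), ("depth", 3)):
        ET.SubElement(size, tag).text = str(value)
    for label, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in (("xmin", xmin), ("ymin", ymin),
                           ("xmax", xmax), ("ymax", ymax)):
            ET.SubElement(bndbox, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical aerial image with two annotated parking spots.
xml_text = voc_annotation(
    "parking_lot.jpg", 1280, 720,
    [("occupied", 100, 200, 180, 320), ("empty", 190, 200, 270, 320)],
)
```

One such file per aerial image, with one `object` entry per parking spot, gives the detector both the location and the class of each space.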

Resources