I am new to PyTorch and deep learning. I am trying to do image segmentation,
but I am stuck on how to label the training set images.
Can anyone please help me?
This is one of my training images:
I have two kinds of plants here: one is a weed and the other is a good crop. I need to label them.
Can anyone tell me how I can do this?
I am going to use deep neural network models (like ResNet) on the labelled data.
There are discussions here about segmentation tools for image labeling. You may find them useful.
Try https://oclavi.com, a web-based object annotation tool.
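Whichever tool is used, a segmentation label usually ends up as a per-pixel class map with the same height and width as the image. Here is a minimal sketch of that idea; the class ids and the tiny mask size are made up for illustration and do not come from any particular annotation tool.

```python
import numpy as np

# Hypothetical class ids, chosen only for this illustration.
BACKGROUND, CROP, WEED = 0, 1, 2

# A tiny 4x6 mask; a real mask has the same height and width as its image.
mask = np.zeros((4, 6), dtype=np.uint8)
mask[0:2, 0:3] = CROP   # pixels annotated as crop
mask[2:4, 3:6] = WEED   # pixels annotated as weed

# Pixel counts per class, handy for checking class balance.
counts = np.bincount(mask.ravel(), minlength=3)
print(counts)
```

Masks like this are what a segmentation network is trained to predict, one class id per pixel.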
Related
I would like to clear up a couple of points of confusion. I want to work on a medical neuroimaging dataset of MRI scans from the ADNI database.
Each Alzheimer's Disease (AD) MRI scan has multiple slices.
Do I have to separate the slices of each scan and label each of them as AD, or should I treat all slices of a scan as one image and label that for classification?
Most medical neuroimages are in DICOM, NIfTI (.nii), etc. format. Is it mandatory to convert them to PNG or JPG for a CNN model, or can I keep them in NIfTI format?
I have read several existing papers on neuroimaging for Alzheimer's disease but did not find an answer to the questions above. I even emailed the author of one paper; the reply was that they could not help because they are very busy, with a sincere apology.
It would be very helpful if anyone could answer and clear up my confusion.
Thank you.
You can train with NIfTI directly, using, for example, TorchIO. There's no need to separate each slice; you can use the 3D image as is.
You can find some examples in the documentation.
Disclaimer: I'm the main developer of TorchIO.
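To make the two options from the question concrete, here is a rough sketch in plain NumPy (this is not TorchIO's API). The random array stands in for a loaded scan, and the volume size and label encoding are assumptions made for the example.

```python
import numpy as np

# Stand-in for a loaded MRI scan: depth x height x width (sizes are made up).
volume = np.random.default_rng(0).normal(size=(8, 16, 16)).astype(np.float32)
scan_label = 1  # hypothetical encoding: 1 = AD, 0 = control

# Option A: keep the scan whole and add a channel axis -> one 3D sample per scan.
sample_3d = volume[np.newaxis, ...]  # shape (1, 8, 16, 16)

# Option B: split into 2D slices; every slice inherits the scan-level label.
slices_2d = [
    (volume[i][np.newaxis, ...], scan_label)  # each slice has shape (1, 16, 16)
    for i in range(volume.shape[0])
]
```

Option A needs a 3D network; option B works with ordinary 2D CNNs, but slices that show no visible pathology still carry the scan's label.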
I want to train a facial recognition CNN from scratch. I can write a Keras Sequential() model following popular architectures and copying their networks.
I wish to use the LFW dataset; however, I am confused about the technical methodology. Do I have to crop each face to a tight-fitting box? That seems impractical, as the dataset has 13000+ faces.
Lastly, I know it's a stupid question, but is all I have to do preprocess the images (of course) and then fit the model to them? What's the exact procedure?
Your question is very open-ended. Before preprocessing and fitting the model, you need to understand object detection. Once you understand what object detection is, you will have the answer to your first question, whether you need to manually crop all 13000 images. The answer is no. However, you will have to draw bounding boxes around the faces and assign labels to the images if these are not already available in the training data.
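Once bounding boxes exist (from a detector or from annotations), cropping can be automated instead of done by hand. A minimal sketch, assuming boxes are given as (x, y, width, height) in pixels; the helper name `crop_box`, the box format, and the placeholder image are all illustrative choices.

```python
import numpy as np

def crop_box(image, box):
    """Return the part of `image` (H x W x C) inside box = (x, y, width, height)."""
    x, y, w, h = box
    img_h, img_w = image.shape[:2]
    # Clamp the box to the image so a box that overshoots the border still works.
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(img_w, x + w), min(img_h, y + h)
    return image[y0:y1, x0:x1]

image = np.zeros((250, 250, 3), dtype=np.uint8)  # placeholder image
face = crop_box(image, (80, 70, 100, 120))       # array of shape (120, 100, 3)
```

Running this over every image and its box yields the cropped training set without any manual work.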
Your second question is very vague. What do you mean by exact procedure? Do you mean the steps you need to follow, or how to do the preprocessing and model fitting in Python or another language? There are lots of references available on the internet about preprocessing and model training for each specific problem. There are no universal steps that can be applied to every problem.
I'm new here, so please be kind and tell me if I did not provide all the information you need :)
I would like to compare the Edge TPU with other edge devices such as Myriad. I would like to select one object detection model and one image segmentation model. Looking at the following link, which lists the supported operations, I noticed that YOLOv3 cannot be compiled for the Edge TPU because it includes LeakyRelu.
https://coral.withgoogle.com/docs/edgetpu/models-intro/
For image segmentation, I'd like to use DeepLab. But I still don't know whether the operations included in DeepLab v3+, such as atrous convolution or feature pyramid network, are supported.
I'd appreciate it if someone could tell me which models are usable on the Edge TPU. Are there any image segmentation models?
Have you already seen the page below?
https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/quantize.md
"mobilenetv2_coco_voc_trainaug_8bit":
deeplabv3_mnv2_pascal_train_aug_8bit/frozen_inference_graph.pb
This model can be converted to a TFLite FlatBuffer,
and it can also be compiled for the Edge TPU with edgetpu_compiler.
Note: the edgetpu_api environment has been updated.
You can find the details below.
https://coral.withgoogle.com/news/updates-07-2019/
Yes. There are prepackaged segmentation models and code examples showing how to use them.
Here they are: https://coral.ai/models/
Please share if you know where to find something similar for Movidius-based VPU devices.
Here you can find all supported layers for the Edge TPU: https://coral.ai/docs/edgetpu/models-intro/#supported-operations
For Conv2D it says "Must use the same dilation in x and y dimensions." Atrous convolution is a Conv2D with a dilation rate, and it normally uses the same rate in both dimensions, so implementing a version of DeepLab v3+ for the Edge TPU is possible.
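As a small illustration of that constraint, here is a sketch in plain Python. Both helper names are hypothetical (they are not part of any Edge TPU tooling), and the dilation rates in the loop are only example values.

```python
def has_symmetric_dilation(dilation_rate):
    """True if a (rate_y, rate_x) pair uses the same dilation in both dimensions."""
    rate_y, rate_x = dilation_rate
    return rate_y == rate_x

def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

# A 3x3 kernel with a few example dilation rates.
for rate in (1, 2, 6):
    print(rate, has_symmetric_dilation((rate, rate)), effective_kernel_size(3, rate))
```

A layer with something like dilation (1, 2) would fail the first check and would therefore not meet the documented Conv2D requirement.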
The primary objective (my assigned work) is to perform image segmentation on underwater images using a convolutional neural network. The camera shots taken of the underwater structure have poor image quality due to severe noise and bad light exposure. To achieve higher classification accuracy, I want to apply automatic image enhancement to the images (see the attached file). So I want to know which CNN architecture would be best for both tasks. Please suggest any possible solutions to achieve this objective.
What do you need to segment? It'd be nice to see some labels for the segmentation.
You may not need to enhance the images; if your whole dataset has the same amount of noise, the network should still generalize properly.
Regarding CNN architectures, it depends on your constraints on processing power and accuracy. If those are not a constraint, go with something like Mask R-CNN; check that repo as a good starting point. Some results look like this:
Be mindful that it's a fairly complex architecture, so inference times might be a bit too high (but real time is doable, depending on your GPU).
Other, simpler architectures are FCNs (Fully Convolutional Networks), which are basically your CNN, but instead of fully connected layers:
you use fully convolutional layers:
Images taken from HERE.
The advantage of FCNs is that they are really easy to implement and modify, since you can go from simple architectures (FCN-AlexNet) to more complex and more accurate ones (FCN-VGG, FCN-ResNet).
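The swap from fully connected to convolutional layers can be illustrated with a small NumPy sketch. It is a toy with random weights and made-up sizes, not code from any of the FCN variants mentioned: a fully connected layer applied to a flattened feature map gives the same numbers as a convolution whose kernel covers the whole map, and that convolution also accepts larger inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W, num_classes = 4, 3, 3, 5  # toy sizes, chosen for illustration

features = rng.normal(size=(C, H, W))
fc_weights = rng.normal(size=(num_classes, C * H * W))

# Fully connected layer: flatten the feature map, then matrix-multiply.
fc_out = fc_weights @ features.ravel()

# The same weights viewed as num_classes kernels of size C x H x W.
conv_kernels = fc_weights.reshape(num_classes, C, H, W)

def conv_valid(x, kernels):
    """Naive 'valid' cross-correlation of x (C,H,W) with kernels (K,C,kh,kw)."""
    k, _, kh, kw = kernels.shape
    out_h, out_w = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    out = np.empty((k, out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[:, i:i + kh, j:j + kw]
            out[:, i, j] = np.tensordot(kernels, patch, axes=3)
    return out

conv_out = conv_valid(features, conv_kernels)   # shape (5, 1, 1), matches fc_out
bigger = rng.normal(size=(C, 6, 7))
score_map = conv_valid(bigger, conv_kernels)    # shape (5, 4, 5): a map of class scores
```

On the larger input the convolutional version produces a spatial grid of class scores rather than a single vector, which is what makes dense, per-location prediction possible.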
Also, I think you didn't mention a framework. There are many to choose from, and the choice depends on your familiarity with the languages; most of them can be used from Python:
TensorFlow
PyTorch
MXNet
But if you are a beginner, try starting with a GUI-based one. NVIDIA DIGITS is a great starting point and really easy to configure; it's based on Caffe, so it's fairly fast when deploying and can easily be integrated with accelerators like TensorRT.
I want to make a mushroom classifier with TensorFlow using a CNN,
but I am unsure about image data pre-processing.
Should I remove the background of the picture (make it black) or just use the raw picture?
Also, if there is any pre-processing step I should do before the CNN, please let me know.
The question is a little bit too broad, but I'll give you a hint.
Should I remove the background of the picture (make it black) or just use the raw picture?
If you can do this, you can achieve higher accuracy with data augmentation, because you can generate training images with various backgrounds, which helps generalization.
Note, however, that if you just remove the background, the neural network will likely "get used" to the black background, so you would need to transform your test images in the same way, which in turn requires image segmentation.
Since image segmentation is even harder than classification, the background is usually left unchanged.
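For the case where an object mask is available, the background-swapping augmentation could look roughly like this. It is a sketch with tiny synthetic arrays; the helper name `composite`, the stand-in image, and the random backgrounds are assumptions made for illustration.

```python
import numpy as np

def composite(foreground, mask, background):
    """Keep foreground pixels where mask is True, background elsewhere (H x W x 3 arrays)."""
    return np.where(mask[..., np.newaxis], foreground, background)

rng = np.random.default_rng(0)
foreground = np.full((4, 4, 3), 200, dtype=np.uint8)  # stand-in mushroom image
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True                                 # stand-in object mask

# Three random backgrounds; real ones would be photos of varied scenes.
backgrounds = [rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8) for _ in range(3)]
augmented = [composite(foreground, mask, bg) for bg in backgrounds]
```

Each augmented image keeps the same object pixels but sees a different background, so the network has less reason to latch onto any one background.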
Also, if there is any pre-processing step I should do before the CNN, please let me know.
The one pre-processing step that works consistently for all image-related tasks is zero-centering: compute the mean value across the training set and subtract it from the images. Be careful not to use test images when computing the mean.
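A minimal NumPy sketch of zero-centering, using random arrays in place of real images. It uses a per-channel mean, which is one common variant; a single global mean or a per-pixel mean image would follow the same pattern.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.uniform(0, 255, size=(10, 8, 8, 3))  # stand-in training images (N, H, W, C)
test = rng.uniform(0, 255, size=(4, 8, 8, 3))    # stand-in test images

# Per-channel mean computed from the training images only.
channel_mean = train.mean(axis=(0, 1, 2))

train_centered = train - channel_mean
test_centered = test - channel_mean  # reuse the training mean; never recompute it on test data
```

The centered training set then has zero mean in every channel, while the test set is shifted by the same amount and will generally be close to, but not exactly, zero mean.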