Loss function for segmentation of a single connected component - pytorch

I am training a model for the segmentation of a medical image.
There is a condition: it needs to cover a single whole region of the image.
However, my model (U-Net) sometimes shows more than one region even though all labels in the dataset contain one region each and one only. We want to train the model and eliminate those small regions without using any post-processing.
Is the idea of giving a penalty for outputting more than one region wrong? I know that the model might be overfitting, but I would like to try to solve this problem with the above ideas.
Label:
Incorrect output:
Settings:
model: U-Net
loss: 0.5 * cross-entropy + 0.5 * dice-score
additional considerations: data augmentation and early stopping.

Related

Can't overcome Overfitting - GrayScale Images from Numerical Arrays and CNN with PyTorch

I am trying to implement an image classification task for the grayscale images, which were converted from some sensor readings. It means that I had initially time series data e.g. acceleration or displacement, then I transformed them into images. Before I do the transformation, I did apply normalization across the data. I have a 1000x9 image dimension where 1000 represents the total time step and 9 is the number of data points. The split ratio is 70%, 15%, and 15% for training, validation, and test data sets. There are 10 different labels, each label has 100 images, it's a multi-class classification task.
An example of my array before image conversion is:
As you see above, the precisions are so sensitive. When I convert them into images, I am able to see the darkness and white part of the image;
Imagine that I have a directory from D1 to D9 (damaged cases) and UN (health case) and there are so many images like this.
Then, I have a CNN-network where my goal is to make a classification. But, there is a significant overfitting issue and whatever I do it's not working out. One of the architecture I've been working on;
Model summary;
I also augment the data. After 250 epochs, this is what I get;
So, what I wonder is that I tried to apply some regularization or augmentation but they do not give me kind of solid results. I experimented it by changing the number of hidden units, layers, etc. Would you think that I need to fully change my architecture? I basically consider two blocks of CNN and FC layers at the end. This is not the first time I've been working on images like this, but I cannot mitigate this overfitting issue. I appreciate it if any of you give me some solid suggestions so I can get smooth results. i was thinking to use some pre-trained models for transfer learning but the image dimension causes some problems, do you know if I can use any of those pre-trained models with 1000x9 image dimension? I know there are some overfiting topics in the forum, but since those images are coming from numerical arrays and I could not make it work, I wanted to create a new title. Thank you!

Training loss for Faster-RCNN either becoming Nan or infinity

I want to implement Pytorch Faster-RCNN module on a custom dataset that I curated and labelled. The implementation detail looks straightforward, there was a demo that showed training and inference on a custom dataset (a person detection problem). Just like the person detection dataset, where there is only one class (person) along with background class, my personal dataset also has only one class. I therefore saw no need to make any changes to the hyper-parameters. It is unclear what is causing the training loss to either become Nan or infinity. This happens on the first epoch. Grateful for any suggestions on this issue.
Edit:
I found that my masks were sometimes out of bounds, which was blowing up the loss, so the cumulative loss was either going positive infinity or negative infinity. It probably would not have mattered but for the fact that train_one_epoch function provided by the PyTorch Detection Colab Notebook example with PennFudan dataset added a math condition that sys.exit() -ed when the loss was going math.isfinite().

Data augmentation affects convergence speed

Data augmentation is surely a great regularization method, and it improves my accuracy on the unseen test set. However, I do not understand why it reduces the convergence speed of the network? I know each epoch takes a longer time to train since image transformations are applied on the fly. But why does it affect the convergence? For my current setup, the network hits a 100% training accuracy after 5 epochs without data augmentation (and clearly overfits) - with data augmentation, it takes 23 epochs to hit 95% training accuracy and never seems to hit 100%.
Any links to research papers or comments on the reasonings behind this?
I guess you are evaluating accuracy on the train set, right? And it is a mistake...
Without augmentation your network simply overfits. You have a predefined number of images, for instance, 1000, and your network during training can easily memorize dataset labels. And you are evaluating the model on the fixed (not augmented) dataset.
When you are training your network with data augmentation, basically, you are training a model on a dataset of infinite size. You are doing augmentation on the fly, which means that the model "sees" new images every time, and it cannot memorize them perfectly with 100% accuracy. And you are evaluating the model on the augmented (infinite) dataset.
When you train your model with and without augmentation, you evaluate it on the different datasets, so it is not correct to compare their accuracy.
Piece of advice:
Do not look at train set accuracy, it is simply misleading when you use augmentations. Instead - evaluate your model on the test set (or validation set), which is not augmented. By doing this - you'll see the real accuracy increase for your model.
P.S. If you want to find out more about image augmentaitons, I really recommend you to check this guide - https://notrocketscience.blog/complete-guide-to-data-augmentation-for-computer-vision/

How to improve validation accuracy in training convolutional neural network?

I am training a CNN model(made using Keras). Input image data has around 10200 images. There are 120 classes to be classified. Plotting the data frequency, I can see that sample data for every class is more or less uniform in terms of distribution.
Problem I am facing is loss plot for training data goes down with epochs but for validation data it first falls and then goes on increasing. Accuracy plot reflects this. Accuracy for training data finally settles down at .94 but for validation data its around 0.08.
Basically its case of over fitting.
I am using learning rate of 0.005 and dropout of .25.
What measures can I take to get better accuracy for validation? Is it possible that sample size for each class is too small and I may need data augmentation to have more data points?
Hard to say what could be the reason. First you can try classical regularization techniques like reducing the size of your model, adding dropout or l2/l1-regularizers to the layers. But this is more like randomly guessing the models hyperparameters and hoping for the best.
The scientific approach would be to look at the outputs for your model and try to understand why it produces these outputs and obviously checking your pipeline. Did you had a look at the outputs (are they all the same)? Did you preprocess the validation data the same way as the training data? Did you made a stratified train/test-split, i.e. keeping the class distribution the same in both sets? Is the data shuffles when you feed it to your model?
In the end you have about ~85 images per class which is really not a lot, compare CIFAR-10 resp. CIFAR-100 with 6000/600 images per class or ImageNet with 20k classes and 14M images (~500 images per class). So data augmentation could be beneficial as well.

ResUNet Segmentation output is bad although precision and recall values are higher on training and validation

I recently have implemented the RESUNET for a parasite segmentation on blood sample images. The model is described in this papaer, https://arxiv.org/pdf/1711.10684.pdf and here is the code https://github.com/DuFanXin/deep_residual_unet/blob/master/res_unet.py. The segmentation output is a binary image. I trained the model with the weighted Binary cross-entropy Loss, given more weight to the parasite class since there is an imbalance of classes in my images. The last ouput layer has a sigmoid activation.
I calculate precision, recall, and Dice Coefficient value to verify how good is the segmentation on trainning. On training and validation I got good numerical results:
Training
dice_coeff: .6895, f2: 0.8611, precision: 0.6320, recall: 0.9563
Validation
val_dice_coeff: .6433, val_f2: 0.7752, val_precision: 0.6052, val_recall: 0.8499
However, when I try to visually see the segmentations of the validation set my algorithm outputs all black. After analyzing the predictions returned by the model, almost all values are close to zero, so it cannot correctly differenciate between background and foreground. The problems is: Why my metrics shows good numerical values but the segmentation ouput not?
I mean, the metrics are not giving me good information? Why the recall value is higher even if the output is all black?
I trained for about 50 epochs, and my training curves shows constantly learning. Is this because the vanishing gradient problem?
No, you do not have a vanishing gradient issue.
I am almost 100% sure that the problem is related to the way in which you test.
The numbers in your training/validation do not lie.
Ensure that you use the exact same preprocessing on your test dataset, exactly the same preprocessing that is applied during the training.
E.g. : If you use "rescale = 1/255.0" parameter in Keras ImageDataGenerator(), ensure that when you load the test image, divide it by 255.0 before predicting on it.
Note that the aforementioned is a pure example; your inconsistency in train/test preprocessing may stem from other reasons.

Resources