I am using Yolov3 in Pytorch and have got 5 classes. Each class consists of approx. 500 images and after 2.500 iterations I get an mAP of 0.80%. However, the training takes approx. 2 days and I am strongly wondering how to decrease the training time.
One idea is to take a pretrained yolov3 network weights file, concatenate the three output layers and use these layers for training.
What do you think of this idea? Or will this not work?
Thanks!
Related
I'm trying to train a multilabel text classification model using BERT. Each piece of text can belong to 0 or more of a total of 485 classes. My model consists of a dropout layer and a linear layer added on top of the pooled output from the bert-base-uncased model from Hugging Face. The loss function I'm using is the BCEWithLogitsLoss in PyTorch.
I have millions of labeled observations to train on. But the training data are highly unbalanced, with some labels appearing in less than 10 observations and others appearing in more than 100K observations! I'd like to get a "good" recall.
My first attempt at training without adjusting for data imbalance produced a micro recall rate of 70% (good enough) but a macro recall rate of 45% (not good enough). These numbers indicate that the model isn't performing well on underrepresented classes.
How can I effectively adjust for the data imbalance during training to improve the macro recall rate? I see we can provide label weights to BCEWithLogitsLoss loss function. But given the very high imbalance in my data leading to weights in the range of 1 to 1M, can I actually get the model to converge? My initial experiments show that a weighted loss function is going up and down during training.
Alternatively, is there a better approach than using BERT + dropout + linear layer for this type of task?
In your case it might be helpful to balance the labels in the training data. You have a lot of data, so you could afford to loose a part of it by balancing. But before you do this, I recommend to read this answer about balancing classes in traing data.
If you really only care about recall, you could try to tune your model maximizing recall.
I am doing a brain MRI segmentation task using Unet model. My problem is it does no matter what transformation i use during training my val dice score after few epochs stucks around 77% score.
I already have a successfuly trained model with 93% dice score on my test set, but i wanted to improve it by adding some augmentation to my training. I used transformations from the albumentation library. I only experimented with transformations that are reveleant to my train case. (Horizontal, VerticalFlip, etc ..)
Things that i already tried:
waiting for more epochs
using different transformations, using only one
using different encoders to make the model more deep, (resnet34, resnet152 etc)
I am trying to classify around 400K data with 13 attributes. I have used python sklearn's SVM package, but it didn't work, and then I learned that SVM's are not suitable for large dataset classification. Then I used the (sklearn) ANN using the following MLPClassifier:
MLPClassifier(solver='adam', alpha=1e-5, random_state=1,activation='relu', max_iter=500)
and trained the system using 200K samples, and tested the model on the remaining ones. The classification worked well. However, my concern is that the system is over trained or overfit. Can you please guide me on the number of hidden layers and node sizes to make sure that there is no overfit? (I have learned that the default implementation has 100 hidden neurons. Is it ok to use the default implementation as is?)
To know if your are overfitting you have to compute:
Training set accuracy
Test set accuracy
Once you have calculated this scores, compare it. If training set score is much better than your test set score, then you are overfitting. This means that your model is "memorizing" your data, instead of learning from it to make future predictions.
If you are overfitting with Neuronal Networks you probably have to reduce the number of layers and reduce the number of neurons per layer. There isn't any strict rule that says the number of layer or neurons you need depending on you dataset size. Every dataset can behaves completely different with the same dataset size.
So, to conclude, if you are overfitting, you would have to evaluate your model accuracy using different parameters of layers and number of neurons, and, then, observe with which values you obtain the best results. There are some methods you can use to find the best parameters, is like gridsearchCV.
I am training a CNN model(made using Keras). Input image data has around 10200 images. There are 120 classes to be classified. Plotting the data frequency, I can see that sample data for every class is more or less uniform in terms of distribution.
Problem I am facing is loss plot for training data goes down with epochs but for validation data it first falls and then goes on increasing. Accuracy plot reflects this. Accuracy for training data finally settles down at .94 but for validation data its around 0.08.
Basically its case of over fitting.
I am using learning rate of 0.005 and dropout of .25.
What measures can I take to get better accuracy for validation? Is it possible that sample size for each class is too small and I may need data augmentation to have more data points?
Hard to say what could be the reason. First you can try classical regularization techniques like reducing the size of your model, adding dropout or l2/l1-regularizers to the layers. But this is more like randomly guessing the models hyperparameters and hoping for the best.
The scientific approach would be to look at the outputs for your model and try to understand why it produces these outputs and obviously checking your pipeline. Did you had a look at the outputs (are they all the same)? Did you preprocess the validation data the same way as the training data? Did you made a stratified train/test-split, i.e. keeping the class distribution the same in both sets? Is the data shuffles when you feed it to your model?
In the end you have about ~85 images per class which is really not a lot, compare CIFAR-10 resp. CIFAR-100 with 6000/600 images per class or ImageNet with 20k classes and 14M images (~500 images per class). So data augmentation could be beneficial as well.
I have a training dataset of 600 images with (512*512*1) resolution categorized into 2 classes(300 images per class). Using some augmentation techniques I have increased the dataset to 10000 images. After having following preprocessing steps
all_images=np.array(all_images)/255.0
all_images=all_images.astype('float16')
all_images=all_images.reshape(-1,512,512,1)
saved these images to H5 file.
I am using an AlexNet architecture for classification purpose with 3 convolutional, 3 overlap max-pool layers.
I want to know which of the following cases will be best for training using Google Colab where memory size is limited to 12GB.
1. model.fit(x,y,validation_split=0.2)
# For this I have to load all data into memory and then applying an AlexNet to data will simply cause Resource-Exhaust error.
2. model.train_on_batch(x,y)
# For this I have written a script which randomly loads the data batch-wise from H5 file into the memory and train on that data. I am confused by the property of train_on_batch() i.e single gradient update. Do this will affect my training procedure or will it be same as model.fit().
3. model.fit_generator()
# giving the original directory of images to its data_generator function which automatically augments the data and then train using model.fit_generator(). I haven't tried this yet.
Please guide me which will be the best among these methods in my case. I have read many answers Here, Here, and Here about model.fit(), model.train_on_batch() and model.fit_generator() but I am still confused.
model.fit - suitable if you load the data as numpy-array and train without augmentation.
model.fit_generator - if your dataset is too big to fit in the memory or\and you want to apply augmentation on the fly.
model.train_on_batch - less common, usually used when training more than one model at a time (GAN for example)