Why is there a sudden drop during CNN model training?

Is there a reason why there is a short drop during training between epochs 150 and 250? [Training graph]

Related

Why is my validation loss always in a certain range of values?

I have been working with a convolutional autoencoder model. My encoder works fine, but I am facing a problem with my decoder. I have tried two different models.
For Model 1, the training loss starts at 0.0436 and then, after some epochs, settles in a range between 0.0280 and 0.0320. The validation loss starts at 0.0306 and then, after some epochs, stays between 0.0275 and 0.0285.
For Model 2, the training loss decreased nicely, but the validation loss started at 0.2702 and, after some epochs, settled in a range between 0.1450 and 0.1550.
I have used 15,000 images from MS COCO, 'mse' as the loss function, and 'Adam' as the optimizer with a 0.00001 learning rate.
I have tried adding a dropout layer and regularization, but nothing is working. I first tried with 3,000 images, but later increased the dataset to 15,000 and am still getting the same problem. The total number of parameters in my model is 221,449,088.
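For reference, a minimal sketch of the setup described, assuming Keras (the layer sizes and input shape are illustrative assumptions, not the asker's 221M-parameter architecture): a convolutional autoencoder compiled with 'mse' and Adam at the stated 1e-5 learning rate.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Encoder: downsample with strided convolutions.
inputs = tf.keras.Input(shape=(128, 128, 3))
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(inputs)
x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)

# Decoder: upsample back to the input resolution with transposed convolutions.
x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
outputs = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)

autoencoder = tf.keras.Model(inputs, outputs)
# The question's stated configuration: MSE loss, Adam at lr = 1e-5.
autoencoder.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
                    loss="mse")
```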

Is the imagenet_1k dataset enough to train a ResNet50 model?

I used Nervana Distiller to train a ResNet50 baseline model on the imagenet_1k dataset. I've observed that after 100 epochs, the Top-5 accuracy is only about 10%, and the validation accuracy remains at zero for a long stretch of training.

Would training a BERT Multi-Label Classifier for 100 labels decrease accuracy a lot?

I am trying to train a text classifier that can classify a sentence as being of a certain query type. I have used the BERT model and trained a multi-label classifier that does the job with 90% accuracy for about 20 labels.
My question is: if I have to train the model for 100/200 labels, would the accuracy be impacted severely?
If your class distributions do not overlap heavily and you have a good amount of training data representing each class, your accuracy should not be severely impacted. For a data-hungry model like BERT, it is all about data: if you have a large amount of data representing each of your 100/200 classes, you are good to go.
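To make the scaling point concrete, here is a short sketch assuming the Hugging Face transformers API (model name and label count are placeholders): going from 20 to 100 labels only changes the size of the classification head via num_labels; the bottleneck is representative data per label, not model capacity.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# A multi-label head over 100 labels; problem_type switches the loss to
# per-label binary cross-entropy (BCEWithLogitsLoss) instead of softmax.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=100,
    problem_type="multi_label_classification",
)

# Example forward pass on a single sentence.
batch = tokenizer("what is my account balance?", return_tensors="pt")
logits = model(**batch).logits  # shape: (1, 100), one logit per label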

Keras: A loaded checkpoint model to resume a training could decrease the accuracy?

My Keras model generates a checkpoint every time training reaches a new best epoch.
However, my internet connection dropped, and when I loaded my last checkpoint and restarted training from the last epoch (using initial_epoch), the accuracy dropped from 89.1 (the loaded model's value) to 83.6 in the first epoch of the new run. Is this normal behavior when resuming (restarting) a training run? When my connection failed, the run was already at the 30th epoch; there had been no drop in accuracy, but also no significant improvement, so no new checkpoint had been generated, forcing me to go back a few epochs.
Thanks in advance for the help.
The problem with saving and retraining is that when you restart training from a model trained up to epoch N, the checkpoint callback at epoch N+1 does not retain the history of the earlier run.
Scenario:
You are training a model for 30 epochs. At epoch 15, you have an accuracy of 88% (say you save your model according to the best validation accuracy). Unfortunately, something happens and your training crashes. However, since you trained with checkpoints, you still have the model saved at epoch 15, before your program crashed.
If you resume from epoch 15, the previous validation accuracies will not be 'remembered' anywhere, since the callback starts again 'from scratch'. If epoch 16 yields a validation accuracy of 84%, your 'best' model (the one with 88% accuracy) will be overwritten by the epoch 16 model, because there is no saved internal history of the prior training/validation accuracies. Under the hood, on a fresh run, 84% is compared to -inf, so the epoch 16 model is saved.
The solution is either to retrain from scratch, or to initialise the resumed run with the best validation accuracy from the previous training (set manually or obtained from a Callback). That way, the maximum accuracy Keras compares against under the hood at the end of each epoch would be 88% (in the scenario), not -inf.
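A minimal sketch of the second option, assuming tf.keras (file names, metric values, and the toy model are placeholders): recent versions of ModelCheckpoint accept an initial_value_threshold argument that seeds the "best so far" comparison with the previous run's value.

```python
import numpy as np
import tensorflow as tf

# Toy model standing in for the real one; on a real resume you would
# instead load the checkpoint: model = tf.keras.models.load_model("best_model.h5")
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

checkpoint = tf.keras.callbacks.ModelCheckpoint(
    "best_model.h5",
    monitor="val_accuracy",
    mode="max",
    save_best_only=True,
    # Seed the comparison with the prior best (88% in the scenario above),
    # so a first resumed epoch at 84% is compared against 0.88, not -inf.
    initial_value_threshold=0.88,
)
# On older Keras versions without this argument, setting
# `checkpoint.best = 0.88` after construction has the same effect.

x = np.random.rand(64, 4).astype("float32")
y = np.random.randint(0, 2, size=(64, 1))
model.fit(x, y, validation_split=0.25,
          initial_epoch=15, epochs=30,  # resume from epoch 15
          callbacks=[checkpoint])
```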

Why doesn't training a VGG16 CNN on images from scratch converge?

I'm training the VGG16 CNN model with the "ImageNet" weights. There is no convergence when training the model from scratch, but when I take a saved model from another machine (after one epoch) and continue training on top of it, it converges. What could be the issue?
Learning rate: 0.0001 / 0.001 (tried both)
Training images: 100,000 RGB images
When training the model with the ImageNet pretrained weights from scratch, there is no convergence: the training accuracy stands still at 70% after several epochs and the loss (0.88) is not decreasing.
When training the model with the ImageNet pretrained weights starting from a .h5 model (obtained from another machine after its first epoch with the same configuration), there is convergence: the accuracy rises epoch by epoch to 90% and above, and the loss decreases gradually.
What could be the actual reason?
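The phrase "training with ImageNet weights from scratch" conflates two different setups, so it may help to spell them out. A short sketch assuming Keras (since a .h5 checkpoint is mentioned); the input shape and class count are illustrative assumptions:

```python
import tensorflow as tf

# Truly "from scratch": every weight is randomly initialised.
scratch = tf.keras.applications.VGG16(
    weights=None, classes=10, input_shape=(224, 224, 3))

# Transfer learning: convolutional base initialised from ImageNet,
# with a fresh classification head stacked on top.
base = tf.keras.applications.VGG16(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
```

Which of the two was actually run matters for diagnosing the convergence difference: a randomly initialised VGG16 is notoriously hard to train without careful learning-rate and initialisation choices, while a pretrained base usually converges quickly.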
