Sklearn stop on loss plateau during manual training [closed] - python-3.x

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
Working with sklearn, the fit function of MLPClassifier is a nice one-size-fits-all solution: you call it once and it trains until it hits the maximum number of iterations or the training loss plateaus, all without any interaction. However, I had to change my code to accommodate some other features, and the standard fit function isn't configurable enough for what I want to do. I reconfigured my code to use partial_fit instead, manually running each iteration one at a time, but I can't figure out how to get my code to recognize when the loss plateaus, like the fit function does. I can't seem to find any properties or methods of MLPClassifier that let me access the loss value calculated by partial_fit, so that I can judge whether the loss has plateaued. It seems the only way to judge loss across iterations would be to calculate it myself, despite the fact that partial_fit already calculates it and even prints it to the console in verbose mode.
Edit: Running partial_fit manually does still cause the training algorithm to recognize when the training loss stops improving; it prints the message Training loss did not improve more than tol=0.000100 for 10 consecutive epochs. Stopping. after each iteration once the training loss plateaus. However, because I'm controlling the iterations manually, it doesn't actually stop, and I have no way of finding out in my code whether this message has been printed so that I can stop it myself.

I would recommend manually logging the loss in a list:
from sklearn.neural_network import MLPClassifier

loss_list = []
clf = MLPClassifier()

# call clf.partial_fit(X_batch, y_batch, classes=...) here, once per iteration
print(clf.loss_)             # loss computed during the last partial_fit call
loss_list.append(clf.loss_)  # keep a history so a plateau can be detected
I can provide you with a stopping criterion if this code is helpful.
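For example, here is a minimal sketch of such a criterion, loosely modelled on the tol / n_iter_no_change behaviour the question describes; the helper name loss_plateaued and the tol and patience values are only illustrative:

def loss_plateaued(loss_history, tol=1e-4, patience=10):
    """Return True if the best loss of the last `patience` epochs improved by less than `tol`."""
    if len(loss_history) <= patience:
        return False
    best_before = min(loss_history[:-patience])
    recent_best = min(loss_history[-patience:])
    return recent_best > best_before - tol

# inside the manual training loop:
# clf.partial_fit(X_batch, y_batch, classes=classes)
# loss_list.append(clf.loss_)
# if loss_plateaued(loss_list):
#     break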

Related

How can I implement an early stopping criteria - Tensorflow Object Detection API [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
Fairly new to the Object Detection API here, using tf-gpu==1.15 for training and 2.2.0 for evaluation as well as python 3.7.
I am able to utilize data augmentation as well as adjust the decay of the learning rate in the ssd_mobilenet_v1.config file, but I am not sure how to go about implementing a way for the model to stop training if I am confident the loss will not get below a certain value no matter how many more steps it trains.
How or where do I configure / implement early stopping criteria?
You can make use of the EarlyStopping callback from TensorFlow; refer to the TensorFlow documentation.
Create a callback using:
from tensorflow.keras.callbacks import EarlyStopping

es = EarlyStopping(
    monitor='val_loss', min_delta=0, patience=3, verbose=0, mode='auto',
    baseline=None, restore_best_weights=False
)
monitor - Change the monitor param to loss if you don't have validation data and want to stop training when the training loss stops decreasing for 3 consecutive epochs. You can also change it to accuracy if you want to stop training based on accuracy instead of loss.
patience - The number of epochs to wait before stopping the training if the loss doesn't decrease or the accuracy doesn't improve.
Then pass it to the fit function while training:
history = model.fit(X, y, epochs=10, callbacks=[es])
There are many examples online that you can check; a complete minimal sketch follows below.
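For completeness, here is a minimal, self-contained sketch using a plain Keras model rather than the Object Detection API pipeline; the toy model and random data are placeholders, and monitor='loss' is used because there is no validation set:

import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping

# placeholder data: 200 samples, 8 features, binary labels
X = np.random.rand(200, 8)
y = np.random.randint(0, 2, size=200)

model = Sequential([
    Dense(16, activation='relu', input_shape=(8,)),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# no validation data, so monitor the training loss instead of val_loss
es = EarlyStopping(monitor='loss', min_delta=0, patience=3, restore_best_weights=True)

history = model.fit(X, y, epochs=100, callbacks=[es], verbose=0)
print('stopped after', len(history.history['loss']), 'epochs')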

How do I properly tune parameters for a neural network? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 2 years ago.
How do I tune the parameters for a neural network, such as the number of layers, the types of layers, the width, etc.? Right now I simply guess good parameters. This becomes very expensive and time-consuming for me, as I will tune a network and then find out that it didn't do any better than the previous model. Is there a better way to tune the model to get a good test and validation score?
It is largely trial and error; you have to experiment, as there is no single method for this. Use a GPU instead of a CPU to compute faster, for example on Google Colab. My suggestion is to note down all the parameters that can be tuned,
e.g.:
Optimizer: try different optimizers such as Adam, SGD, and others.
Learning rate: this is a very crucial parameter; try changing it from 0.0001 to 0.001 in steps of 0.0001.
Number of hidden layers: try increasing the number of hidden layers.
Try batch normalization or dropout, or both if required.
Use the correct loss function.
Change the batch size and the number of epochs.
Hidden layers, epochs, batch size: try different numbers.
Optimizers: Adam (tends to give better results), RMSprop.
Dropout: 0.2 works well in most cases.
As a plus, you should also try different activation functions (for example, ReLU in the hidden layers, and for the output layer sigmoid for binary classification or softmax for multiclass classification).
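To make this less of a guessing game, the search can be run systematically. Below is a minimal sketch, assuming an sklearn MLPClassifier and toy data; the grid values are only illustrative:

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# placeholder data: 300 samples, 20 features, binary labels
X = np.random.rand(300, 20)
y = np.random.randint(0, 2, size=300)

# a small, illustrative grid over a few of the parameters discussed above
param_grid = {
    'hidden_layer_sizes': [(32,), (64,), (64, 32)],
    'learning_rate_init': [0.0001, 0.001],
    'solver': ['adam', 'sgd'],
}

search = GridSearchCV(MLPClassifier(max_iter=500), param_grid, cv=3, scoring='accuracy')
search.fit(X, y)
print(search.best_params_, search.best_score_)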

Tensorflow Object Detection: Using new detected images to retrain model [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I have searched this forum for similar questions but found no answer (Updating Tensorflow Object detection model with new images). I have managed to create my custom trained model (let's name it model1). I was wondering: can I use new images that are processed by model1 to further train model1? Will it improve the accuracy of the model?
Accuracy will depend on the number of correctly classified images and not only on the total number of training images. https://developers.google.com/machine-learning/crash-course/classification/accuracy. If you consider that the new images are to be used for training (have correct labels), then you should consider re-training the model. Take a look at this post https://datascience.stackexchange.com/questions/12761/should-a-model-be-re-trained-if-new-observations-are-available
You can use your current model (model1) in a number of ways:
on new images to detect bad results (hard examples) for new training
on new images to detect good results for evaluation
on the images in the existing dataset to detect bad images (wrong label etc.)
Some of the bad results from new images will be non-objects (adversarial) and not directly usable for training (but see this: https://github.com/tensorflow/models/issues/3578#issuecomment-375267920).
Removal of bad images from the existing dataset requires retraining from scratch unless there is some funky way of "untraining" images from a model.
Eventually one would end up approaching a perfect dataset that makes best use of the capacity of the chosen model architecture, although the domain may evolve over time.
I think the reason this is not much discussed is because most researchers have to work with common datasets so they can compare their approaches (brilliant read: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5697567/).
It might improve the model, but it is tricky and can easily lead to overfitting. Improving the dataset would actually help, but not with images detected by its own model: such images are detected because the model already performs well on them, so they do not add much.
What you actually need is quite the opposite: you need to teach the model to recognize the images that it didn't recognize before.
The main problem of machine learning (which is the approach you are using for object detection here) is generalization. In your case, it is the ability to recognize objects of the same type as the images you used for training, in images that were not used during training.
Obviously, if you were able to use all possible images during training, your system would be perfect (actually, it would be a simple exact image matching problem). In a more realistic setup, the more training images you use, the higher the chance of obtaining a better object detector.
It is usually more valuable, however, to add hard examples to your training set. Hence, if your application allows it (in terms of computation time in particular), you can indeed add all the images that are wrongly detected to your dataset (with the correct labels), and it will probably help you get a better model, able to detect objects in harder conditions in new images.
However, it really depends on what you are doing. If you want to compare your system to another one, you need to use the same (training and) test images to be fair. For benchmarking, you are not allowed to include test images in the training dataset! When you compute the accuracy (on a validation/test dataset) to compare several settings, be sure you are fair in this comparison.

'normalize=True' parameter needed in Lasso and Ridge Regressions, if features already scaled? [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 4 years ago.
I have already standardized my data with the help of StandardScaler() in Python. While applying Lasso regression, do I need to set the normalize parameter to True or not, and why?
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
x_new = scaler.fit_transform(x)
Now I want to use Lasso regression:
from sklearn.linear_model import Lasso

lreg = Lasso(alpha=0.1, max_iter=100, normalize=True)
I want to know whether normalize=True is still needed or not.
Standardizing and normalizing are two different actions. If you do both without knowing what they do and why you do it, you'll end up losing accuracy.
Standardization removes the mean and divides by the standard deviation. Normalization rescales everything to the range 0 to 1.
Depending on the penalization (lasso, ridge, elastic net) you'll prefer one over the other, but it's not recommended to do both.
So no, it's not needed.
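A minimal sketch of the recommended setup (toy data, illustrative values): do the standardization once inside a Pipeline and leave Lasso's normalize at its default (off):

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Lasso

# placeholder data
x = np.random.rand(100, 5)
y = np.random.rand(100)

model = Pipeline([
    ('scaler', StandardScaler()),               # standardization happens here
    ('lasso', Lasso(alpha=0.1, max_iter=100)),  # normalize left at its default
])
model.fit(x, y)
print(model.named_steps['lasso'].coef_)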

Multi layer perceptron for OCR [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I intend to use a multi-layer perceptron network trained with backpropagation (one hidden layer, inputs fed as 8x8 bit matrices containing the B/W pixels from the image). The following questions arise:
which type of learning should I use: batch or on-line?
how could I estimate the right number of nodes in the hidden layer? I intend to process the 26 letters of the English alphabet.
how could I stop the training process, to avoid overfitting?
(not quite related) is there another NN proven to perform better than an MLP? I know about MLPs getting stuck in local minima, overfitting and so on, so is there a better (soft computing-based) approach?
Thanks
Most of these questions are things that you need to try different options to see what works best. That is the problem with ANNs. There is no "best" way to do almost anything. You need to find out what works for your specific problem. Nevertheless, I will give my advice for your questions.
1) I prefer incremental learning. I think it is important for the network weights to be updated after each pattern.
2) This is a tough question. It really depends on the complexity of your network: how many input nodes, output nodes, and training patterns there are. For your problem, I might start with 100 and try ranges up and down from 100 to see if there is improvement.
3) I usually calculate the total error of the network when applied to the test set (not the training set) after each epoch. If that error increases for about 5 epochs, I stop training and then use the network that was created before the increase occurred. It is important not to use the error of the training set when deciding to stop training; that is what will cause overfitting. (A sketch of this check appears after this list.)
4) You could also try a probabilistic neural network if you are representing your output as 26 nodes, each representing a letter of the alphabet. This network architecture is good for classification problems. Again, it may be a good idea just to try a few different architectures to see what works best for your problem.
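As mentioned in point 3 above, here is a minimal sketch of stopping on validation error, using sklearn's MLPClassifier with partial_fit purely for illustration; the data, the patience value, and the layer size are placeholders:

import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# placeholder data: 8x8 pixel inputs flattened to 64 features, 26 letter classes
X = np.random.rand(500, 64)
y = np.random.randint(0, 26, size=500)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)

clf = MLPClassifier(hidden_layer_sizes=(100,))
classes = np.unique(y)

best_err, best_epoch, patience = np.inf, 0, 5
for epoch in range(200):
    clf.partial_fit(X_train, y_train, classes=classes)
    val_err = 1.0 - clf.score(X_val, y_val)  # validation error, not training error
    if val_err < best_err:
        best_err, best_epoch = val_err, epoch
    elif epoch - best_epoch >= patience:
        # validation error stopped improving; a full version would also
        # restore the weights from the best epoch before stopping
        break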
Regarding number 3, one way to find out when your ANN starts to overfit is by graphing the accuracy of the net on your training data and your test data vs the number of epochs performed. At some point, as your training accuracy continues to increase (tending towards 100%), your test accuracy will probably start to actually decrease because the ANN is overfitting to the training data. See what epoch that starts to happen and make sure not to train past that.
If your data is very regular and consistent, then it might not overfit until very late in the game, or not at all. And if your data is highly irregular, then your ANN will start to overfit much earlier.
Also, a way to test how regular your data is would be to do something like k-fold cross-validation.
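As a brief sketch of such a check (the data are placeholders; sklearn's cross_val_score is used here as one convenient way to run k-fold cross-validation):

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# placeholder data
X = np.random.rand(500, 64)
y = np.random.randint(0, 26, size=500)

scores = cross_val_score(MLPClassifier(max_iter=300), X, y, cv=5)
print(scores.mean(), scores.std())  # a large spread across folds suggests irregular data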
