Hi,
Can someone please help me understand the results from the model (image) below? I am new to ML and want to check whether my understanding is correct that the model is 66% accurate, not 83%, in terms of prediction.
The metrics have different meanings; they are both correct. But if you wonder which one is more useful for evaluation, you should understand the difference between overall accuracy and average accuracy.
Overall accuracy: number of correctly predicted items / total number of items to predict.
Average accuracy: the average of the per-class accuracies (sum of the accuracy for each predicted class / number of classes).
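As a rough illustration of the difference, here is a toy Python sketch with made-up labels (not your experiment):

import numpy as np

# made-up 3-class predictions
y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 0, 0, 1, 1, 0, 2, 2, 2, 0])

# overall accuracy: correctly predicted items / total items
overall_acc = np.mean(y_true == y_pred)                        # 0.70

# average accuracy: mean of the per-class accuracies
classes = np.unique(y_true)
per_class = [np.mean(y_pred[y_true == c] == c) for c in classes]
average_acc = np.mean(per_class)                               # (0.75 + 0.50 + 0.75) / 3 ≈ 0.67

print(overall_acc, average_acc)

When the classes are imbalanced, the two numbers can differ noticeably, which is likely what you are seeing.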
You could refer to these two articles, 1 and 2; they will be helpful.
You can also read my article, which explains the parameters of classification algorithms:
https://social.technet.microsoft.com/wiki/contents/articles/33879.classification-algorithms-parameters-in-azure-ml.aspx
Related
I have a multilabel classification problem, which I am trying to solve with CNNs in PyTorch. I have 80,000 training examples and 7,900 classes; every example can belong to multiple classes at the same time, and the mean number of classes per example is 130.
The problem is that my dataset is very imbalanced. For some classes I have only ~900 examples, which is around 1%. For "overrepresented" classes I have ~12,000 examples (15%). When I train the model I use BCEWithLogitsLoss from PyTorch with a positive weights parameter. I calculate the weights the same way as described in the documentation: the number of negative examples divided by the number of positives.
As a result, my model overestimates almost every class… For both minor and major classes I get almost twice as many predictions as true labels. And my AUPRC is just 0.18. Even so, it's much better than no weighting at all, since in that case the model predicts everything as zero.
So my question is, how do I improve the performance? Is there anything else I can do? I tried different batch sampling techniques (to oversample the minority classes), but they don't seem to work.
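For reference, a minimal sketch of the pos_weight setup described above, with tiny made-up sizes (the real problem has 80,000 examples and 7,900 classes):

import torch
import torch.nn as nn

num_examples, num_classes = 1000, 10                              # made-up sizes
targets = (torch.rand(num_examples, num_classes) < 0.3).float()   # multi-hot labels

# per-class pos_weight as in the BCEWithLogitsLoss docs:
# number of negative examples / number of positive examples
pos_counts = targets.sum(dim=0).clamp(min=1)
pos_weight = (num_examples - pos_counts) / pos_counts

criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
logits = torch.randn(32, num_classes)                             # stand-in for model outputs
loss = criterion(logits, targets[:32])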
I would suggest one of these strategies:
Focal Loss
A very interesting approach for dealing with unbalanced training data through tweaking of the loss function was introduced in
Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He and Piotr Dollar Focal Loss for Dense Object Detection (ICCV 2017).
They propose to modify the binary cross-entropy loss in a way that decreases the loss and gradient of easily classified examples while "focusing the effort" on examples where the model makes gross errors.
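A minimal PyTorch sketch of this idea for a multi-label / binary setting (gamma and alpha follow the paper's defaults, but tune them for your data):

import torch
import torch.nn.functional as F

def binary_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    # standard BCE, kept per-element
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)                 # probability of the true label
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    # (1 - p_t) ** gamma shrinks the loss of easy, well-classified examples
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()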
Hard Negative Mining
Another popular approach is to do "hard negative mining"; that is, propagate gradients only for part of the training examples - the "hard" ones.
See, e.g.:
Abhinav Shrivastava, Abhinav Gupta and Ross Girshick Training Region-based Object Detectors with Online Hard Example Mining (CVPR 2016)
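A rough sketch of the idea, assuming per-example losses over a multi-label batch (the keep ratio is an arbitrary choice):

import torch
import torch.nn.functional as F

def hard_example_loss(logits, targets, keep_ratio=0.25):
    # one loss value per example in the batch
    per_example = F.binary_cross_entropy_with_logits(
        logits, targets, reduction="none").mean(dim=1)
    # back-propagate only through the hardest (highest-loss) fraction
    k = max(1, int(keep_ratio * per_example.numel()))
    hard, _ = torch.topk(per_example, k)
    return hard.mean()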
@Shai has provided two strategies developed in the deep learning era. I would like to offer some additional traditional machine learning options: over-sampling and under-sampling.
The main idea of both is to produce a more balanced dataset by resampling before you start training. Note that you will probably face some problems, such as losing data diversity (under-sampling) or overfitting the training data (over-sampling), but it might be a good starting point.
See the wiki link for more information.
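If you want to try these quickly, here is a sketch using the imbalanced-learn package (my choice of library, not something the question requires) on made-up single-label data:

from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import RandomOverSampler
from imblearn.under_sampling import RandomUnderSampler

# made-up imbalanced toy data (roughly 90% / 10%)
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

X_over, y_over = RandomOverSampler(random_state=0).fit_resample(X, y)     # duplicate minority examples
X_under, y_under = RandomUnderSampler(random_state=0).fit_resample(X, y)  # drop majority examples

print(Counter(y), Counter(y_over), Counter(y_under))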
I am working on an NLP project where I want to do text classification using a neural network.
I am getting very nice accuracy on the test set: 98%.
But when I compute the accuracy from the confusion matrix (the accuracy score derived from the confusion matrix), it's just 52%.
How is it possible? What am I missing here?
Question
What is the difference between the two accuracies? Which one should be considered the actual accuracy, and why?
Code on test set
loss, acc = model.evaluate(Xtest, y_test_array)
It looks like your dataset has class imbalance, and the metric calculated from the confusion matrix (it is NOT accuracy; it is probably something like an F1 score) is low because the minority class is recognized poorly. At the same time, accuracy is high because the majority class is recognized well.
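To see how the two numbers can diverge, here is a toy sketch with made-up labels (not your data): a model that always predicts the majority class gets high accuracy but a useless F1 score.

from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

# 90 negatives, 10 positives; the model predicts the majority class everywhere
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100

print(accuracy_score(y_true, y_pred))             # 0.90 -- looks great
print(f1_score(y_true, y_pred, zero_division=0))  # 0.00 -- minority class never found
print(confusion_matrix(y_true, y_pred))           # all predictions land in one column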
I'm working on a multi-class classification problem and want to make predictions with high precision for a single class only (i.e. to predict less but correctly).
I've highlighted the total number of predictions and true positive cases for class-1. Any suggestions on how to tune the model for high precision?
PS: Results for the other classes don't matter; we are only focusing on the precision of class-1. Please find the results below.
One of the possible issues here could be class imbalance. Due to the unequal number of samples per class in your dataset, the model you have developed might be biased toward a particular class. Keeping a similar sample size for all the classes may increase your precision/accuracy. Hope this helps.
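For reference, the precision of a single class (the quantity highlighted in the question: true positives divided by all predictions for that class) can be computed directly; a toy sketch with made-up labels:

from sklearn.metrics import precision_score

y_true = [0, 1, 1, 2, 1, 0, 2, 1]   # made-up labels
y_pred = [0, 1, 2, 2, 1, 1, 2, 1]

# precision for class-1 only: 3 correct out of 4 predicted as class-1 -> 0.75
prec_class1 = precision_score(y_true, y_pred, labels=[1], average=None)[0]
print(prec_class1)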
I am using a Naive Bayes algorithm to predict movie ratings as positive or negative. I have been able to rate movies with 81% accuracy. I am, however, trying to assign a 'confidence level' for each of the ratings as well.
I am trying to work out how I can tell the user something like "we think that the review is positive with 80% confidence". Can someone help me understand how I can calculate a confidence level for our classification result?
You can report the probability P(y|x) that Naive Bayes calculates. (But note that Naive Bayes isn't a very good probability model, even if it's not too bad a classifier.)
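For example, with scikit-learn's Naive Bayes (a minimal sketch on made-up reviews; your own pipeline will differ), predict_proba gives exactly that per-class probability:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

reviews = ["great movie loved it", "terrible boring film",
           "loved the acting", "boring and terrible plot"]   # made-up data
labels = ["pos", "neg", "pos", "neg"]

vec = CountVectorizer()
clf = MultinomialNB().fit(vec.fit_transform(reviews), labels)

# P(y|x) for a new review, reported as the "confidence" of the prediction
probs = clf.predict_proba(vec.transform(["loved this movie"]))[0]
for cls, p in zip(clf.classes_, probs):
    print(cls, round(p, 2))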