I'm currently evaluating a bag-of-words texture classifier which outputs binary results:
true positives (TP)
true negatives (TN)
false positives (FP)
false negatives (FN)
I'm looking to calculate the accuracy but am not sure I'm assigning true negatives correctly.
I'm currently working with 8 classes and assign 7 true negatives each time there is a true positive, and 6 true negatives and a false negative each time there is a false positive.
I wasn't sure whether I should instead add to the true negatives only when there is a true positive.
This still seems to give overly high values; for example, with these counts:
TP: 20
FP: 10
TN: 20
FN: 10
Accuracy: 0.66
When assigning true negatives the way I originally did, it's even higher. Shouldn't accuracy be 50% when only half the results are correct, or is this normal?
Also, do you think accuracy is the best metric for evaluating the classifier, or is there something more advanced?
thanks
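For reference, here is a minimal sketch of the one-vs-rest counting described above, assuming the usual convention of a confusion matrix with rows as true classes and columns as predicted classes (the function name is just illustrative):

    import numpy as np

    def one_vs_rest_counts(conf_mat):
        # Expand an 8x8 (or any multi-class) confusion matrix into per-class
        # TP/FP/FN/TN counts. A correct prediction contributes 1 TP for its class
        # and 1 TN for each of the other 7 classes; a wrong prediction contributes
        # 1 FP (predicted class), 1 FN (true class) and 1 TN for the remaining 6.
        conf_mat = np.asarray(conf_mat)
        tp = np.diag(conf_mat)
        fp = conf_mat.sum(axis=0) - tp   # predicted as this class, actually another
        fn = conf_mat.sum(axis=1) - tp   # actually this class, predicted as another
        tn = conf_mat.sum() - tp - fp - fn
        return tp, fp, fn, tn

Summing these per-class counts and computing (TP + TN) / total tends to give the inflated figure described in the question, because every sample contributes several true negatives.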
From what I've read, the method I was using at first was correct, although standard Accuracy (Overall Accuracy) is not necessarily the best way to evaluate a classifier.
Precision and Recall are widely used as they capture both type 1 and type 2 errors. For a single combined metric, the F1 score (F-measure) is typically used; it is the harmonic mean of precision and recall and can be calculated with this formula:
F1 = 2 * (precision * recall) / (precision + recall)
Other options such as ROC curves (generated from the True Positive Rate (TPR) and False Positive Rate (FPR)) are also used, although not necessarily for multi-class systems. To get a single metric from a ROC curve, the Area Under the Curve (AUC) is taken, which largely represents the classifier's predictive ability. This again, however, is not widely used for multi-class systems.
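Plugging the counts quoted in the question (TP=20, FP=10, TN=20, FN=10) into these definitions gives a quick sketch of all the metrics above:

    # Counts quoted in the question above.
    TP, FP, TN, FN = 20, 10, 20, 10

    accuracy  = (TP + TN) / (TP + TN + FP + FN)                 # 40/60 ~ 0.66
    precision = TP / (TP + FP)                                  # 20/30 ~ 0.67
    recall    = TP / (TP + FN)                                  # 20/30 ~ 0.67 (TPR / sensitivity)
    f1        = 2 * precision * recall / (precision + recall)   # harmonic mean ~ 0.67

    print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
          f"recall={recall:.2f} f1={f1:.2f}")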
Related
I use cross validation on a binary dataset.
Right, now when I enter the following lines:
clf = cross_val_score(cl_mnb, transcriptsVectorized, y=Labels, cv=6, scoring='precision')
print(clf.round(2))
I do get precision scores, but only for the negatives; I want the precision of the positive cases.
When I look up the available metrics with sklearn.metrics.SCORERS, I can't find that option; does it exist?
I feel the notation you are using in the question may not be correct, so please let me know if this answer does not solve your problem.
First of all, let's take a look at the confusion matrix reference.
Precision, or positive predictive value: precision = TP / (TP + FP)
Scikit-learn follows this formula and interprets 1 (or True) as the positive label. The condition positive and predicted condition positive counts are used to obtain the precision.
By positive cases, do you mean evaluation of the condition positive cases? In that case you can use the recall metric, which is also offered by sklearn...
Recall evaluates how well your model predicts the real positives: recall = TP / (TP + FN). Some of the questions it tries to answer are:
Is the model predicting the anomalies? Is the model identifying the cats in the data?
Then again, maybe you need the metric for the 0 (False) class instead. Following the confusion matrix and understanding it is the best way to improve!
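If what you want is the precision (or recall) of one specific class, one way to do it with cross_val_score is to build an explicit scorer. A sketch, assuming the cl_mnb, transcriptsVectorized and Labels objects from the question:

    from sklearn.metrics import make_scorer, precision_score, recall_score
    from sklearn.model_selection import cross_val_score

    # Precision of the class labelled 1 (scikit-learn's default positive class).
    precision_pos = make_scorer(precision_score, pos_label=1)
    # Precision of the class labelled 0, in case that is the one you are after.
    precision_neg = make_scorer(precision_score, pos_label=0)
    # Recall of the positive class, as suggested above.
    recall_pos = make_scorer(recall_score, pos_label=1)

    scores = cross_val_score(cl_mnb, transcriptsVectorized, y=Labels, cv=6,
                             scoring=precision_pos)
    print(scores.round(2))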
I've been using R-squared (coefficient of determination) and mean absolute percentage error (MAPE) to see the difference between the true output value (scalar) and the predicted output value (also scalar) that come out of a regression model.
Now I want to see, in an intuitive way, how close the regressed output (vector) is to my true output (vector). MSE is used for training the regression model, but it is hard to tell from it whether the model is doing OK or not. For example, if the true output values themselves are very small (close to zero), the MSE will still be very small even when the prediction is twice as large as the true output.
I've been searching for a while, and I found terms like Wilks' lambda test, ANOVA, MANOVA, p-value, and adjusted R-squared, but I have not figured out which one I can and should use.
I just decided to use MAPE, but with the Euclidean distance between the vectors instead of the absolute value of the difference between scalars (predicted vs. true value).
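A small numpy sketch of that idea, assuming the outputs are stacked as (n_samples, n_dims) arrays (the function name is just illustrative): for each sample, the absolute scalar error is replaced by the Euclidean distance to the true vector, normalised by the norm of the true vector, and the results are averaged.

    import numpy as np

    def vector_mape(y_true, y_pred):
        # Per-sample Euclidean error divided by the Euclidean norm of the true
        # vector, averaged over samples. Like ordinary MAPE, this is undefined
        # when a true vector is exactly zero.
        y_true = np.asarray(y_true, dtype=float)
        y_pred = np.asarray(y_pred, dtype=float)
        errors = np.linalg.norm(y_pred - y_true, axis=1)
        scales = np.linalg.norm(y_true, axis=1)
        return np.mean(errors / scales)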
I am trying to build a model on a class-imbalanced dataset (binary: 25% 1's and 75% 0's). I have tried classification algorithms and ensemble techniques. I am a bit confused about the concepts below, as I am mostly interested in predicting more 1's.
1. Should I give preference to Sensitivity or Positive Predictive Value?
Some ensemble techniques give at most 45% sensitivity with a low Positive Predictive Value, and some give 62% Positive Predictive Value with low sensitivity.
2. My dataset has around 450K observations and 250 features. After a power test I took 10K observations by simple random sampling. When selecting variable importance using ensemble techniques, the important features differ from the ones I get when I try with 150K observations. With my intuition and domain knowledge I feel the features that came up as important in the 150K-observation sample are more relevant. What is the best practice?
3. Last, can i use the variable importance generated by RF in other ensemple
techniques to predict the accuracy?
Can you please help me out, as I am a bit confused about which way to go.
The preference between Sensitivity and Positive Predictive Value depends on the ultimate goal of your analysis. The difference between these two values is nicely explained here: https://onlinecourses.science.psu.edu/stat507/node/71/
Altogether, these are two measures that look at the results from two different perspectives. Sensitivity gives you the probability that the test will find the "condition" among those who actually have it. Positive Predictive Value looks at how prevalent the "condition" is among those who test positive.
Accuracy depends on the outcome of your classification: it is defined as (true positive + true negative)/(total), not on the variable importances generated by RF.
Also, it is possible to compensate for the imbalances in the dataset, see https://stats.stackexchange.com/questions/264798/random-forest-unbalanced-dataset-for-training-test
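To make the distinction concrete, here is a small sketch computing sensitivity, Positive Predictive Value and accuracy from a binary confusion matrix with scikit-learn (the toy labels are for illustration only):

    import numpy as np
    from sklearn.metrics import confusion_matrix

    # Toy labels for illustration only.
    y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
    y_pred = np.array([1, 1, 0, 0, 0, 0, 0, 0, 1, 0])

    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

    sensitivity = tp / (tp + fn)                   # recall / true positive rate
    ppv         = tp / (tp + fp)                   # positive predictive value (precision)
    accuracy    = (tp + tn) / (tp + tn + fp + fn)

    print(f"sensitivity={sensitivity:.2f} PPV={ppv:.2f} accuracy={accuracy:.2f}")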
In the scikit-learn version of Multinomial Naive Bayes there is a fit_prior parameter.
I have found that for unbalanced datasets setting this to False is usually desired.
For my particular use case, setting it to False raised my AUC from 0.52 to 0.61.
However, in pyspark.ml.classification.NaiveBayes there is no such setting, which I think means it fits priors regardless.
I "think" I can counteract this with the thresholds param to essentially give more weight to the minority class.
In my case the breakdown is 87% negative and 13% positive.
If I can indeed use the thresholds to, in effect, set fit_prior to False, what value should I use?
Would it be 13/87 ~ 0.15, or ...?
i.e. I would create it with NaiveBayes(thresholds=[1, .15])
Or am I completely off base with this?
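For reference, a minimal sketch of the thresholds idea from the question. As I understand the Spark docs, the predicted class is the one with the largest p/t, where p is the class probability and t is that class's threshold, so a smaller threshold boosts that class; whether [1.0, 0.15] is the right correction for an 87/13 split is the question's own assumption, not something derived here.

    from pyspark.ml.classification import NaiveBayes

    # thresholds has one entry per class: [negative, positive]. A smaller
    # threshold for the positive class makes it easier to predict, roughly
    # compensating for the 87% / 13% class imbalance described above.
    nb = NaiveBayes(modelType="multinomial", thresholds=[1.0, 0.15])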
I need to train a Logistic Regression in Sklearn with different profit-loss weights between classes.
The positive class is a loss, meaning that each time a positive case occurs, it costs the company, say, 1,000$. This obviously applies to both True Positive and False Negative cases.
On the other hand, each negative case (both True Negative and False Positive) makes the company gain 50$.
The question is: how do I train, say, a Logistic Regression classifier in SkLearn to maximize the profit?
A further complication is that the positive and negative classes are unbalanced: the Positives represent 5% of the overall sample size while the Negatives represent 95%.
Thanks for helping
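One common way to fold the stated 1,000$-vs-50$ asymmetry into training is scikit-learn's class_weight argument. A minimal sketch, using a toy dataset with the stated 5%/95% split; this biases the training loss by the cost ratio rather than literally maximizing profit:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # Toy data mirroring the stated imbalance: ~95% negatives, ~5% positives.
    X, y = make_classification(n_samples=5000, n_features=20,
                               weights=[0.95, 0.05], random_state=0)

    # Weight errors on the positive (loss) class by its 1,000$ cost and errors
    # on the negative (gain) class by its 50$ value, i.e. a 20:1 ratio.
    clf = LogisticRegression(class_weight={1: 1000, 0: 50}, max_iter=1000)
    clf.fit(X, y)

    # A complementary option is to tune the decision threshold on
    # clf.predict_proba against the expected profit directly.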