calculate precision and recall in a confusion matrix - python-3.x

Suppose I have a confusion matrix like the one below. How can I calculate precision and recall?

First, your matrix is arranged upside down.
You want to arrange your labels so that the true positives sit on the diagonal [(0,0), (1,1), (2,2)]. This is the arrangement you will find in confusion matrices generated by sklearn and other packages.
Once things are sorted in the right direction, we can take a page from this answer and say that:
True positives are the values on the diagonal.
False positives are the column-wise sums, without the diagonal.
False negatives are the row-wise sums, without the diagonal.
Then we take the formulas for precision and recall from the sklearn docs.
And put it all into code:
import numpy as np
cm = np.array([[2,1,0], [3,4,5], [6,7,8]])
true_pos = np.diag(cm)
false_pos = np.sum(cm, axis=0) - true_pos
false_neg = np.sum(cm, axis=1) - true_pos
precision = np.sum(true_pos / (true_pos + false_pos))
recall = np.sum(true_pos / (true_pos + false_neg))
Since we remove the true positives to define false_positives/negatives only to add them back... we can simplify further by skipping a couple of steps:
true_pos = np.diag(cm)
precision = np.sum(true_pos / np.sum(cm, axis=0))
recall = np.sum(true_pos / np.sum(cm, axis=1))

I don't think you need the summation at the end. Without the summation, your method is correct; it gives precision and recall for each class.
If you intend to calculate average precision and recall, then you have two options: micro and macro-average.
Read more here http://scikit-learn.org/stable/auto_examples/model_selection/plot_precision_recall.html
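As a rough sketch of the two averaging options, assuming the same layout as above (rows are true labels, columns are predictions). The variable names are illustrative and not taken from the original answers:

```python
import numpy as np

cm = np.array([[2, 1, 0], [3, 4, 5], [6, 7, 8]])

true_pos = np.diag(cm)
# Per-class values: no outer np.sum, one entry per class.
precision = true_pos / cm.sum(axis=0)
recall = true_pos / cm.sum(axis=1)

# Macro average: unweighted mean of the per-class values.
macro_precision = precision.mean()
macro_recall = recall.mean()

# Micro average: pool the counts over all classes before dividing.
# In a single-label confusion matrix the total FP equals the total FN,
# so micro precision and micro recall are the same number.
micro_precision = true_pos.sum() / cm.sum()
micro_recall = true_pos.sum() / cm.sum()

print(precision, recall)
print(macro_precision, macro_recall, micro_precision)
```

Macro averaging treats every class equally, while micro averaging is dominated by the frequent classes.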

For the sake of completeness and future reference: given a list of ground-truth labels (gt) and predictions (pd), the following code snippet computes the confusion matrix and then calculates precision and recall.
from sklearn.metrics import confusion_matrix
gt = [1,1,2,2,1,0]
pd = [1,1,1,1,2,0]
cm = confusion_matrix(gt, pd)
#rows = gt, col = pred
#compute tp, tp_and_fn and tp_and_fp w.r.t all classes
tp_and_fn = cm.sum(1)
tp_and_fp = cm.sum(0)
tp = cm.diagonal()
precision = tp / tp_and_fp
recall = tp / tp_and_fn
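One way to sanity-check the manual computation is to compare it with sklearn's own per-class scores. This is a sketch that assumes scikit-learn is installed; the prediction list is renamed to pred here only to avoid clashing with the usual pandas alias:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

gt = [1, 1, 2, 2, 1, 0]
pred = [1, 1, 1, 1, 2, 0]

cm = confusion_matrix(gt, pred)  # rows = ground truth, columns = predictions
manual_precision = cm.diagonal() / cm.sum(axis=0)
manual_recall = cm.diagonal() / cm.sum(axis=1)

# average=None returns one value per class instead of a single average.
sk_precision = precision_score(gt, pred, average=None)
sk_recall = recall_score(gt, pred, average=None)

print(manual_precision, sk_precision)
print(manual_recall, sk_recall)
```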

Given:
hypothetical confusion matrix (cm)
cm =
[[ 970 1 2 1 1 6 10 0 5 0]
[ 0 1105 7 3 1 6 0 3 16 0]
[ 9 14 924 19 18 3 13 12 24 4]
[ 3 10 35 875 2 34 2 14 19 19]
[ 0 3 6 0 903 0 9 5 4 32]
[ 9 6 4 28 10 751 17 5 24 9]
[ 7 2 6 0 9 13 944 1 7 0]
[ 3 11 17 3 16 3 0 975 2 34]
[ 5 38 10 16 7 28 5 4 830 20]
[ 5 3 5 13 39 10 2 34 5 853]]
Goal:
precision and recall for each class using map() to calculate list division.
from operator import truediv
import numpy as np
tp = np.diag(cm)
prec = list(map(truediv, tp, np.sum(cm, axis=0)))
rec = list(map(truediv, tp, np.sum(cm, axis=1)))
print ('Precision: {}\nRecall: {}'.format(prec, rec))
Result:
Precision: [0.959, 0.926, 0.909, 0.913, 0.896, 0.880, 0.941, 0.925, 0.886, 0.877]
Recall: [0.972, 0.968, 0.888, 0.863, 0.937, 0.870, 0.954, 0.916, 0.861, 0.880]
Please note: 10 classes, 10 precisions and 10 recalls.

Agreeing with gruangly and EuWern, I modified PabTorre's solution accordingly to generate precision and recall per class.
Also, given my use case (NER) where a model could:
Never predict a class that is present in the input text (i.e. a column of zeros, i.e. TP:0, FP:0, FN: all), causing a nan in the precision array, or
Predict a class that is completely absent in the input text (i.e. a row of zeros, i.e. TP:0, FN:0, FP: all), causing a nan in the recall array...
I wrap the array in numpy.nan_to_num() to convert any nan to zero. This is not a mathematical decision but a functional, per-use-case decision about how to handle never-predicted or never-occurring classes.
import numpy
confusion_matrix = numpy.array([
[ 5, 0, 0, 0, 0, 3],
[ 0, 2, 0, 1, 0, 5],
[ 0, 0, 0, 3, 5, 7],
[ 0, 0, 0, 9, 0, 0],
[ 0, 0, 0, 9, 32, 3],
[ 0, 0, 0, 0, 0, 0]
])
true_positives = numpy.diag(confusion_matrix)
false_positives = numpy.sum(confusion_matrix, axis=0) - true_positives
false_negatives = numpy.sum(confusion_matrix, axis=1) - true_positives
precision = numpy.nan_to_num(numpy.divide(true_positives, (true_positives + false_positives)))
recall = numpy.nan_to_num(numpy.divide(true_positives, (true_positives + false_negatives)))
print(true_positives) # [ 5 2 0 9 32 0 ]
print(false_positives) # [ 0 0 0 13 5 18 ]
print(false_negatives) # [ 3 6 15 0 12 0 ]
print(precision) # [1. 1. 0. 0.40909091 0.86486486 0. ]
print(recall) # [0.625 0.25 0. 1. 0.72727273 0. ]
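An alternative to nan_to_num, sketched here under the same "undefined means zero" assumption, is to skip the zero-denominator entries during the division itself, which also avoids the runtime warning. safe_divide is a hypothetical helper name, not part of numpy:

```python
import numpy as np

def safe_divide(numerator, denominator):
    """Element-wise division that yields 0.0 wherever the denominator is 0."""
    numerator = np.asarray(numerator, dtype=float)
    denominator = np.asarray(denominator, dtype=float)
    out = np.zeros_like(numerator)
    # Entries where the mask is False keep the value already stored in `out`.
    return np.divide(numerator, denominator, out=out, where=denominator != 0)

cm = np.array([
    [5, 0, 0, 0, 0, 3],
    [0, 2, 0, 1, 0, 5],
    [0, 0, 0, 3, 5, 7],
    [0, 0, 0, 9, 0, 0],
    [0, 0, 0, 9, 32, 3],
    [0, 0, 0, 0, 0, 0],
])

tp = np.diag(cm)
precision = safe_divide(tp, cm.sum(axis=0))  # column sums = TP + FP
recall = safe_divide(tp, cm.sum(axis=1))     # row sums = TP + FN
print(precision)
print(recall)
```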

import numpy as np
n_classes=3
cm = np.array([[0,1,2],
[5,4,3],
[8,7,6]])
sp = []
f1 = []
gm = []
sens = []
acc= []
for c in range(n_classes):
    tp = cm[c,c]
    fp = sum(cm[:,c]) - cm[c,c]
    fn = sum(cm[c,:]) - cm[c,c]
    # everything outside row c and column c
    tn = sum(np.delete(sum(cm)-cm[c,:],c))
    recall = tp/(tp+fn)
    precision = tp/(tp+fp)
    accuracy = (tp+tn)/(tp+fp+fn+tn)
    specificity = tn/(tn+fp)
    f1_score = 2*((precision*recall)/(precision+recall))
    g_mean = np.sqrt(recall * specificity)
    sp.append(specificity)
    f1.append(f1_score)
    gm.append(g_mean)
    sens.append(recall)
    acc.append(tp)
    print("for class {}: recall {}, specificity {}, "
          "precision {}, f1 {}, gmean {}".format(c, round(recall,4), round(specificity,4), round(precision,4), round(f1_score,4), round(g_mean,4)))
print("sp: ", np.average(sp))
print("f1: ", np.average(f1))
print("gm: ", np.average(gm))
print("sens: ", np.average(sens))
print("accuracy: ", np.sum(acc)/np.sum(cm))
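The same per-class counts can also be obtained without a Python loop. The following is a sketch using the same example matrix; it relies on the identity TN = total - TP - FP - FN for each class:

```python
import numpy as np

cm = np.array([[0, 1, 2],
               [5, 4, 3],
               [8, 7, 6]])

tp = np.diag(cm)
fp = cm.sum(axis=0) - tp          # column sums without the diagonal
fn = cm.sum(axis=1) - tp          # row sums without the diagonal
tn = cm.sum() - (tp + fp + fn)    # everything outside row c and column c

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = tp.sum() / cm.sum()

print(tn)
print(sensitivity, specificity, accuracy)
```

Division by zero is not handled here; a class with an empty row or column would need the same kind of guard as any other per-class ratio.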

Related

How to calculate sensitivity, specificity and positive predictivity for each class in multi-class classification

I have checked all the SO questions that generate a confusion matrix and calculate TP, TN, FP, FN.
Scikit-learn: How to obtain True Positive, True Negative, False Positive and False Negative
They mainly use
from sklearn.metrics import confusion_matrix
For two classes it's easy:
from sklearn.metrics import confusion_matrix
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
For multiclass there is one solution, but it only handles the first class, not all classes.
def perf_measure(y_actual, y_pred):
    class_id = set(y_actual).union(set(y_pred))
    TP = []
    FP = []
    TN = []
    FN = []
    for index ,_id in enumerate(class_id):
        TP.append(0)
        FP.append(0)
        TN.append(0)
        FN.append(0)
        for i in range(len(y_pred)):
            if y_actual[i] == y_pred[i] == _id:
                TP[index] += 1
            if y_pred[i] == _id and y_actual[i] != y_pred[i]:
                FP[index] += 1
            if y_actual[i] == y_pred[i] != _id:
                TN[index] += 1
            if y_pred[i] != _id and y_actual[i] != y_pred[i]:
                FN[index] += 1
    return class_id,TP, FP, TN, FN
But this by default calculates the values for only one class.
But I want to calculate the values for each class, given 4 classes, for the data at https://extendsclass.com/csv-editor.html#0697f61
I have done it using Excel like this.
Then calculate the results for each class.
I have automated it in an Excel sheet, but is there any programmatic solution in Python or sklearn to do this?
This is way easier with multilabel_confusion_matrix. For your example, you can also pass labels=["A", "N", "O", "~"] as an argument to get the labels in the preferred order.
from sklearn.metrics import multilabel_confusion_matrix
import numpy as np
mcm = multilabel_confusion_matrix(y_true, y_pred)
tps = mcm[:, 1, 1]
tns = mcm[:, 0, 0]
recall = tps / (tps + mcm[:, 1, 0]) # Sensitivity
specificity = tns / (tns + mcm[:, 0, 1]) # Specificity
precision = tps / (tps + mcm[:, 0, 1]) # PPV
Which results in an array for each metric:
[[0.83333333 0.94285714 0.64 0.25 ] # Sensitivity / Recall
[0.99029126 0.74509804 0.91666667 1. ] # Specificity
[0.9375 0.83544304 0.66666667 1. ]] # Precision / PPV
Alternatively, you may view class-dependent precision and recall in classification_report. You could get the same lists with output_dict=True and each class label.
>>> print(classification_report(y_true, y_pred))
              precision    recall  f1-score   support

           A       0.94      0.83      0.88        18
           N       0.84      0.94      0.89        70
           O       0.67      0.64      0.65        25
           ~       1.00      0.25      0.40         8

    accuracy                           0.82       121
   macro avg       0.86      0.67      0.71       121
weighted avg       0.83      0.82      0.81       121
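To illustrate the output_dict=True route, here is a small sketch. The labels below are hypothetical toy data, not the data from the question, and zero_division=0 is an assumption about how an undefined precision should be reported:

```python
from sklearn.metrics import classification_report

# Hypothetical toy labels; "~" is never predicted, so its precision is undefined.
y_true = ["A", "A", "N", "N", "O", "~"]
y_pred = ["A", "N", "N", "N", "O", "O"]

report = classification_report(y_true, y_pred, output_dict=True, zero_division=0)

labels = ["A", "N", "O", "~"]
precision = [report[label]["precision"] for label in labels]
recall = [report[label]["recall"] for label in labels]

print(precision)
print(recall)
```

The dictionary also contains the "accuracy", "macro avg" and "weighted avg" entries that appear at the bottom of the printed report.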

How can I perform GridSearchCV but cross validate using multiple validation sets?

I have a training set, training_set, of m observations and n features, and I have three different validation sets val_a, val_b, and val_c which don't leak information to one another.
I would like to perform hyperparameter tuning via HalvingGridSearchCV, where I fit models on training_set, validate on all three validation sets separately, and then take the score to be the average score over all three (or the lowest score).
The reason is that the three validation sets are observations of the samples at three distinct time points (A, B, C), and the training set contains observations from only time point A. Thus, a model trained on training_set and evaluated on val_a would not necessarily be best for val_b and val_c.
Also, concatenating all of the sets via training_set = pd.concat([training_set, val_a, val_b, val_c]), and then performing a variant of GroupShuffleSplit is non-ideal, as this results in leaking information from different time points to the model.
Thus far here's what I've tried:
import pandas as pd
from sklearn.model_selection import PredefinedSplit
# Assume each dataset has 4 observations.
tf = [-1] * len(training_set)
training_set = pd.concat([training_set, val_a, val_b, val_c])
tf += [0] * len(val_a) + [1] * len(val_b) + [2] * len(val_c)
print("Test fold:", tf)
pds = PredefinedSplit(test_fold = tf)
# gs = HalvingGridSearchCV(estimator = LGBMRegressor(), param_grid = param_grid, cv = pds, scoring = 'r2', refit = False, min_resources = 'exhaust')
for train_index, test_index in pds.split():
    print("TRAIN:", train_index, "TEST:", test_index)
Output:
Test fold: [-1, -1, -1, -1, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
TRAIN: [ 0 1 2 3 8 9 10 11 12 13 14 15] TEST: [4 5 6 7]
TRAIN: [ 0 1 2 3 4 5 6 7 12 13 14 15] TEST: [ 8 9 10 11]
TRAIN: [ 0 1 2 3 4 5 6 7 8 9 10 11] TEST: [12 13 14 15]
As you can see, this would generate a 3 fold cross-validation, where each validation set is left out once, and included in the training set all of the other times. I know -1 will leave the observations out of any test set, but there is no value to leave the observations out of any train set. ):
Thank you!
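One possible approach, offered as a sketch rather than a confirmed answer: the cv argument of sklearn's search classes also accepts an iterable of (train, test) index pairs, so the same training rows can be paired with each validation block in turn. Ridge, the random data, and the helper make_set are stand-ins for illustration. Plain GridSearchCV is used here because how HalvingGridSearchCV's resource subsampling interacts with such fixed splits has not been checked:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)

def make_set(n_rows):
    """Hypothetical stand-in data: 3 features and a noisy linear target."""
    X = rng.normal(size=(n_rows, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=n_rows)
    return X, y

(X_train, y_train), (X_a, y_a), (X_b, y_b), (X_c, y_c) = (
    make_set(n) for n in (40, 10, 10, 10)
)

X_all = np.concatenate([X_train, X_a, X_b, X_c])
y_all = np.concatenate([y_train, y_a, y_b, y_c])

# Row positions of each block inside the concatenated arrays.
bounds = np.cumsum([0, len(X_train), len(X_a), len(X_b), len(X_c)])
train_idx = np.arange(bounds[0], bounds[1])
# Every split trains on the training block only and tests on one validation block.
cv_splits = [(train_idx, np.arange(bounds[k], bounds[k + 1])) for k in (1, 2, 3)]

search = GridSearchCV(Ridge(), param_grid={"alpha": [0.1, 10.0]},
                      cv=cv_splits, scoring="r2", refit=False)
search.fit(X_all, y_all)

# Average score over the three validation sets, and the lowest of the three.
mean_scores = search.cv_results_["mean_test_score"]
worst_scores = np.min(
    [search.cv_results_[f"split{k}_test_score"] for k in range(3)], axis=0
)
print(mean_scores, worst_scores)
```

refit=False matters in this setup: a refit would train on every row passed to fit, including the validation rows.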

Squaring multi-dimensional array, including cross term, without for loop

I'm trying to square a particular axis of a multi-dimensional array without using a loop in Python.
Here I will present the code with loop.
First, let's define a simple array
x = np.random.randint(1, size=(2, 3))
Since the size of the second axis is 3, we have $x_1, x_2, x_3$. The square terms of this array are $x_1^2, x_2^2, x_3^2, 2x_1x_2, 2x_1x_3, 2x_2x_3$. Together with the three original terms, we have 9 terms in total.
Here is the full code:
import numpy as np
import time
x = np.random.randint(low=20, size=(2, 3))
print(x)
a, b = x.shape
for i in range(b):
    XiXj = np.einsum('i, ij->ij', x[:, i], x[:, i:b])
    x = np.concatenate((x, XiXj) , axis=1)
print(x)
Print:
[[ 3 12 18]
[12 10 10]]
[[ 3 12 18 9 36 54 144 216 324]
[ 12 10 10 144 120 120 100 100 100]]
Of course, this won't take long to compute. However, the array may have a shape of [2000, 5000], and that will take a while to compute.
How would you do it without the for loop?
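A loop-free sketch, assuming the goal is exactly the column order the loop produces: np.triu_indices lists every index pair (i, j) with i <= j, and fancy indexing then forms all products at once. The fixed input below reuses the printed example so the result can be compared directly:

```python
import numpy as np

x = np.array([[3, 12, 18],
              [12, 10, 10]])

n_cols = x.shape[1]
# Index pairs (i, j) with i <= j, in the same order the loop visits them:
# (0,0), (0,1), (0,2), (1,1), (1,2), (2,2)
i_idx, j_idx = np.triu_indices(n_cols)
products = x[:, i_idx] * x[:, j_idx]
result = np.concatenate((x, products), axis=1)
print(result)
```

Note that the number of product columns grows quadratically: with 5000 input columns there are about 12.5 million pairs per row, so memory rather than loop overhead is likely to become the limiting factor.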

Difference in use of ** and pow function

While attempting to write a cost function for linear regression, an error arises when replacing ** with the pow function in cost_function:
Original cost function
def cost_function(x,y,theta):
    m = np.size(y)
    j = (1/(2*m))*np.sum(np.power(np.matmul(x,theta)-y, 2))
    return j
Cost function giving the error:
def cost_function(x,y,theta):
    m = np.size(y)
    j = (1/(2*m))*np.sum((np.matmul(x,theta)-y)**2)
    return j
Gradient Descent
def gradient_descent(x,y,theta,learn_rate,iters):
    x = np.mat(x);y = np.mat(y); theta= np.mat(theta);
    m = np.size(y)
    j_hist = np.zeros(iters)
    for i in range(0,iters):
        temp = theta - (learn_rate/m)*(x.T*(x*theta-y))
        theta = temp
        j_hist[i] = cost_function(x,y,theta)
    return (theta),j_hist
Variable values
theta = np.zeros((2,1))
learn_rate = 0.01
iters = 1000
x is (97,2) matrix
y is (97,1) matrix
The cost function is calculated fine, with a value of 32.0727.
The error arises when using the same function in gradient descent.
The error I am getting is LinAlgError: Last 2 dimensions of the array must be square
First, let's distinguish between pow, ** and np.power. pow is the built-in Python function that, according to the docs, is equivalent to ** when used with 2 arguments.
Second, you apply np.mat to the arrays, making np.matrix objects. According to its docs:
It has certain special operators, such as *
(matrix multiplication) and ** (matrix power).
matrix power:
In [475]: np.mat([[1,2],[3,4]])**2
Out[475]:
matrix([[ 7, 10],
[15, 22]])
Elementwise square:
In [476]: np.array([[1,2],[3,4]])**2
Out[476]:
array([[ 1, 4],
[ 9, 16]])
In [477]: np.power(np.mat([[1,2],[3,4]]),2)
Out[477]:
matrix([[ 1, 4],
[ 9, 16]])
Matrix power:
In [478]: arr = np.array([[1,2],[3,4]])
In [479]: arr@arr # np.matmul
Out[479]:
array([[ 7, 10],
[15, 22]])
With a non-square matrix:
In [480]: np.power(np.mat([[1,2]]),2)
Out[480]: matrix([[1, 4]]) # elementwise
Attempting to do matrix_power on a non-square matrix:
In [481]: np.mat([[1,2]])**2
---------------------------------------------------------------------------
LinAlgError Traceback (most recent call last)
<ipython-input-481-18e19d5a9d6c> in <module>()
----> 1 np.mat([[1,2]])**2
/usr/local/lib/python3.6/dist-packages/numpy/matrixlib/defmatrix.py in __pow__(self, other)
226
227 def __pow__(self, other):
--> 228 return matrix_power(self, other)
229
230 def __ipow__(self, other):
/usr/local/lib/python3.6/dist-packages/numpy/linalg/linalg.py in matrix_power(a, n)
600 a = asanyarray(a)
601 _assertRankAtLeast2(a)
--> 602 _assertNdSquareness(a)
603
604 try:
/usr/local/lib/python3.6/dist-packages/numpy/linalg/linalg.py in _assertNdSquareness(*arrays)
213 m, n = a.shape[-2:]
214 if m != n:
--> 215 raise LinAlgError('Last 2 dimensions of the array must be square')
216
217 def _assertFinite(*arrays):
LinAlgError: Last 2 dimensions of the array must be square
Note that the whole traceback lists matrix_power. That's why we often ask to see the whole traceback.
Why are you setting x, y and theta to np.mat? The cost_function uses matmul. With that function, and its @ operator, there are few(er) good reasons for using np.matrix.
Despite the subject line, you did not try to use pow. That confused me and at least one other commentator. I tried to find a np.pow or a scipy version.
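To make the suggestion concrete, here is a sketch of the same two functions written with plain ndarrays and the @ operator, so that ** stays element-wise. The toy data is hypothetical and much smaller than the (97,2) set from the question:

```python
import numpy as np

def cost_function(x, y, theta):
    """Squared-error cost for linear regression, using plain ndarrays."""
    m = np.size(y)
    residual = x @ theta - y                  # (m, 1) column of errors
    return np.sum(residual ** 2) / (2 * m)    # ** is element-wise on ndarrays

def gradient_descent(x, y, theta, learn_rate, iters):
    # np.asarray also turns np.matrix inputs back into ordinary ndarrays.
    x, y, theta = (np.asarray(a, dtype=float) for a in (x, y, theta))
    m = np.size(y)
    j_hist = np.zeros(iters)
    for i in range(iters):
        theta = theta - (learn_rate / m) * (x.T @ (x @ theta - y))
        j_hist[i] = cost_function(x, y, theta)
    return theta, j_hist

# Hypothetical toy data: an intercept column plus one feature, y = feature.
x = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([[1.0], [2.0], [3.0]])

initial_cost = cost_function(x, y, np.zeros((2, 1)))
theta, j_hist = gradient_descent(x, y, np.zeros((2, 1)), 0.01, 1000)
print(initial_cost, j_hist[-1])
```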

How to compute correlation ratio or Eta in Python?

According to the answer to this post,
The most classic "correlation" measure between a nominal and an interval ("numeric") variable is Eta, also called correlation ratio, and equal to the root R-square of the one-way ANOVA (with p-value = that of the ANOVA). Eta can be seen as a symmetric association measure, like correlation, because Eta of ANOVA (with the nominal as independent, numeric as dependent) is equal to Pillai's trace of multivariate regression (with the numeric as independent, set of dummy variables corresponding to the nominal as dependent).
I would appreciate it if you could let me know how to compute Eta in Python.
In fact, I have a dataframe with some numeric and some nominal variables.
Besides, how can I plot a heatmap-like plot for it?
The answer above is missing root extraction, so as a result, you will receive an eta-squared. However, in the main article (used by User777) that issue has been fixed.
So, there is an article on Wikipedia about what the correlation ratio is and how to calculate it. I've created a simpler version of the calculations and will use the example from the wiki:
import pandas as pd
import numpy as np
data = {'subjects': ['algebra'] * 5 + ['geometry'] * 4 + ['statistics'] * 6,
'scores': [45, 70, 29, 15, 21, 40, 20, 30, 42, 65, 95, 80, 70, 85, 73]}
df = pd.DataFrame(data=data)
print(df.head(10))
>>> subjects scores
0 algebra 45
1 algebra 70
2 algebra 29
3 algebra 15
4 algebra 21
5 geometry 40
6 geometry 20
7 geometry 30
8 geometry 42
9 statistics 65
def correlation_ratio(categories, values):
    categories = np.array(categories)
    values = np.array(values)
    ssw = 0
    ssb = 0
    for category in set(categories):
        subgroup = values[np.where(categories == category)[0]]
        ssw += sum((subgroup-np.mean(subgroup))**2)
        ssb += len(subgroup)*(np.mean(subgroup)-np.mean(values))**2
    return (ssb / (ssb + ssw))**.5
coef = correlation_ratio(df['subjects'], df['scores'])
print('Eta_squared: {:.4f}\nEta: {:.4f}'.format(coef**2, coef))
>>> Eta_squared: 0.7033
Eta: 0.8386
The answer is provided here:
def correlation_ratio(categories, measurements):
    fcat, _ = pd.factorize(categories)
    cat_num = np.max(fcat)+1
    y_avg_array = np.zeros(cat_num)
    n_array = np.zeros(cat_num)
    for i in range(0,cat_num):
        cat_measures = measurements[np.argwhere(fcat == i).flatten()]
        n_array[i] = len(cat_measures)
        y_avg_array[i] = np.average(cat_measures)
    y_total_avg = np.sum(np.multiply(y_avg_array,n_array))/np.sum(n_array)
    numerator = np.sum(np.multiply(n_array,np.power(np.subtract(y_avg_array,y_total_avg),2)))
    denominator = np.sum(np.power(np.subtract(measurements,y_total_avg),2))
    if numerator == 0:
        eta = 0.0
    else:
        # no square root is taken here, so this value is eta squared
        eta = numerator/denominator
    return eta
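For larger data, the per-category loop can be replaced by np.bincount. The following is a sketch using only numpy; correlation_ratio_vectorized is a hypothetical name, and it returns Eta itself (with the square root), using the Wikipedia example data from above as a check:

```python
import numpy as np

def correlation_ratio_vectorized(categories, values):
    """Eta: square root of between-group over total sum of squares."""
    values = np.asarray(values, dtype=float)
    # Map each category label to an integer code 0..k-1.
    _, codes = np.unique(np.ravel(categories), return_inverse=True)
    counts = np.bincount(codes)
    group_means = np.bincount(codes, weights=values) / counts
    grand_mean = values.mean()
    ss_between = np.sum(counts * (group_means - grand_mean) ** 2)
    ss_total = np.sum((values - grand_mean) ** 2)
    if ss_total == 0:
        return 0.0  # all values identical: treated as no association
    return float(np.sqrt(ss_between / ss_total))

subjects = ['algebra'] * 5 + ['geometry'] * 4 + ['statistics'] * 6
scores = [45, 70, 29, 15, 21, 40, 20, 30, 42, 65, 95, 80, 70, 85, 73]

eta = correlation_ratio_vectorized(subjects, scores)
print('Eta_squared: {:.4f}\nEta: {:.4f}'.format(eta ** 2, eta))
```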

Resources