Getting the maximum accuracy for a binary probabilistic classifier in scikit-learn
Question
Is there any built-in function to get the maximum accuracy for a binary probabilistic classifier in scikit-learn?
E.g. to get the maximum F1-score I do:
import sklearn.metrics

# Area under the precision-recall curve (AUPRC)
precision, recall, thresholds = sklearn.metrics.precision_recall_curve(y_true, y_score)
auprc = sklearn.metrics.auc(recall, precision)

# Scan the curve for the threshold with the highest F1-score
max_f1 = 0
max_f1_threshold = None
for r, p, t in zip(recall, precision, thresholds):
    if p + r == 0:
        continue
    f1 = (2 * p * r) / (p + r)
    if f1 > max_f1:
        max_f1 = f1
        max_f1_threshold = t
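The same F1 search can also be written without an explicit loop. The following is only a sketch: the helper name `max_f1_from_scores` and the toy data are made up for illustration, and it assumes labels in {0, 1} with 1 as the positive class.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve


def max_f1_from_scores(y_true, y_score):
    """Return (best F1, threshold) over the thresholds of the PR curve."""
    precision, recall, thresholds = precision_recall_curve(y_true, y_score)
    # The final (precision=1, recall=0) point has no threshold, so drop it.
    p, r = precision[:-1], recall[:-1]
    denom = p + r
    # Treat F1 as 0 wherever precision + recall == 0.
    f1 = np.divide(2 * p * r, denom, out=np.zeros_like(denom), where=denom > 0)
    best = int(np.argmax(f1))
    return float(f1[best]), float(thresholds[best])


# Toy data, purely illustrative.
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])
best_f1, best_threshold = max_f1_from_scores(y_true, y_score)
```

Here a score is counted as positive when it is greater than or equal to the returned threshold, which is the convention `precision_recall_curve` uses.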
I could compute the maximum accuracy in a similar fashion:
import numpy as np
import sklearn.metrics

accuracies = []
thresholds = np.arange(0, 1, 0.1)  # coarse grid: 0.0, 0.1, ..., 0.9
for threshold in thresholds:
    y_pred = np.greater(y_score, threshold).astype(int)
    accuracy = sklearn.metrics.accuracy_score(y_true, y_pred)
    accuracies.append(accuracy)
accuracies = np.array(accuracies)
max_accuracy = accuracies.max()
max_accuracy_threshold = thresholds[accuracies.argmax()]
but I wonder whether there is any built-in function.
Recommended answer
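import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.metrics import roc_curve

# roc_curve returns the candidate thresholds at which the predictions change
fpr, tpr, thresholds = roc_curve(y_true, probs)
probs = np.asarray(probs)
accuracy_scores = []
for thresh in thresholds:
    # roc_curve counts a score >= threshold as positive, so use >= here too
    y_pred = (probs >= thresh).astype(int)
    accuracy_scores.append(accuracy_score(y_true, y_pred))
accuracies = np.array(accuracy_scores)
max_accuracy = accuracies.max()
max_accuracy_threshold = thresholds[accuracies.argmax()]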
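The loop over thresholds can be avoided, because the accuracy at each threshold follows directly from the ROC output: with P positives and N negatives, accuracy = (tpr * P + (1 - fpr) * N) / (P + N). The sketch below applies that identity. The function name `max_accuracy_from_scores` and the toy data are hypothetical, and it assumes labels in {0, 1} with 1 as the positive class.

```python
import numpy as np
from sklearn.metrics import roc_curve


def max_accuracy_from_scores(y_true, y_score):
    """Return (best accuracy, threshold) over the thresholds from roc_curve."""
    y_true = np.asarray(y_true)
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    n_pos = np.sum(y_true == 1)
    n_neg = y_true.size - n_pos
    # TP = tpr * n_pos and TN = (1 - fpr) * n_neg at each threshold.
    accuracies = (tpr * n_pos + (1 - fpr) * n_neg) / y_true.size
    best = int(np.argmax(accuracies))
    return float(accuracies[best]), float(thresholds[best])


# Toy data, purely illustrative.
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])
best_accuracy, best_threshold = max_accuracy_from_scores(y_true, y_score)
```

The returned threshold is meant to be applied as `y_score >= threshold`. Note that the first threshold reported by `roc_curve` is a sentinel above every score (no sample predicted positive), so it can be returned when predicting all negatives is the most accurate choice.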