Getting the maximum accuracy for a binary probabilistic classifier in scikit-learn

Question

Is there a built-in function in scikit-learn that returns the maximum accuracy of a binary probabilistic classifier?

For example, to get the maximum F1 score I do:

import sklearn.metrics

# Area under the precision-recall curve (AUPRC)
precision, recall, thresholds = sklearn.metrics.precision_recall_curve(y_true, y_score)
auprc = sklearn.metrics.auc(recall, precision)

# Scan the thresholds for the one with the highest F1 score
max_f1, max_f1_threshold = 0.0, None
for r, p, t in zip(recall, precision, thresholds):
    if p + r == 0:
        continue
    f1 = 2 * p * r / (p + r)
    if f1 > max_f1:
        max_f1, max_f1_threshold = f1, t

I could compute the maximum accuracy in a similar fashion:

import numpy as np
import sklearn.metrics

# Evaluate accuracy on a fixed grid of thresholds: 0.0, 0.1, ..., 0.9
accuracies = []
thresholds = np.arange(0, 1, 0.1)
for threshold in thresholds:
    y_pred = np.greater(y_score, threshold).astype(int)
    accuracy = sklearn.metrics.accuracy_score(y_true, y_pred)
    accuracies.append(accuracy)

accuracies = np.array(accuracies)
max_accuracy = accuracies.max()
max_accuracy_threshold = thresholds[accuracies.argmax()]

But I wonder whether there is a built-in function for this.
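
A note on the 0.1-step grid used above: accuracy only changes at observed score values, so a fixed grid can miss the best cut-off. Below is a minimal NumPy-only sketch that tries every distinct score as a threshold. It assumes `y_true` holds 0/1 labels and that a sample is predicted positive when its score is strictly greater than the threshold, as in the loop above. The function name `max_accuracy_over_scores` is hypothetical, not part of scikit-learn.

```python
import numpy as np

def max_accuracy_over_scores(y_true, y_score):
    """Return (best accuracy, threshold) over all distinct score cut-offs.

    Hypothetical helper: predicts positive when score > threshold.
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    # Candidates: -inf (everything predicted positive) plus each distinct score
    candidates = np.concatenate(([-np.inf], np.unique(y_score)))
    # Rows are thresholds, columns are samples
    y_pred = (y_score[None, :] > candidates[:, None]).astype(int)
    accuracies = (y_pred == y_true[None, :]).mean(axis=1)
    best = accuracies.argmax()  # first maximum if there are ties
    return accuracies[best], candidates[best]
```

This builds a (thresholds x samples) matrix, so memory grows with the product of the two; for large datasets a loop or a sort-based approach would likely be preferable.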

Recommended answer

import numpy as np
from sklearn.metrics import accuracy_score, roc_curve

# roc_curve returns the candidate thresholds; it treats a sample as
# positive when its score is >= the threshold, so the same rule is used here
fpr, tpr, thresholds = roc_curve(y_true, probs)
probs = np.asarray(probs)
accuracy_scores = []
for thresh in thresholds:
    accuracy_scores.append(accuracy_score(y_true, (probs >= thresh).astype(int)))

accuracies = np.array(accuracy_scores)
max_accuracy = accuracies.max()
max_accuracy_threshold = thresholds[accuracies.argmax()]
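
The loop can also be replaced by arithmetic on the ROC curve itself, because the accuracy at each threshold equals (TP + TN) / n, with TP = tpr * P and TN = (1 - fpr) * N. The following is a sketch, assuming `y_true` contains 0/1 labels with both classes present. The helper name `max_accuracy_from_roc` is hypothetical, and `drop_intermediate=False` is passed so that no threshold is skipped.

```python
import numpy as np
from sklearn.metrics import roc_curve

def max_accuracy_from_roc(y_true, probs):
    """Hypothetical helper: best accuracy and threshold from the ROC curve."""
    y_true = np.asarray(y_true)
    n_pos = np.sum(y_true == 1)   # number of positive samples, P
    n_neg = y_true.size - n_pos   # number of negative samples, N
    fpr, tpr, thresholds = roc_curve(y_true, probs, drop_intermediate=False)
    # accuracy = (TP + TN) / n, evaluated for every threshold at once
    accuracies = (tpr * n_pos + (1 - fpr) * n_neg) / y_true.size
    best = accuracies.argmax()    # first maximum if there are ties
    return accuracies[best], thresholds[best]
```

The returned threshold follows the `roc_curve` convention: a sample counts as positive when its score is greater than or equal to the threshold.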
