How to use prediction scores when creating a ROC curve with Scikit-Learn
Question
I have the following code:
from sklearn.metrics import roc_curve, auc
actual = [1,1,1,0,0,1]
prediction_scores = [0.9,0.9,0.9,0.1,0.1,0.1]
false_positive_rate, true_positive_rate, thresholds = roc_curve(actual, prediction_scores, pos_label=1)
roc_auc = auc(false_positive_rate, true_positive_rate)
roc_auc
# 0.875
In this example, the interpretation of prediction_scores is straightforward: the higher the score, the more confident the prediction.
Now I have another set of prediction scores. They are not fractions between 0 and 1, and the interpretation is reversed: the lower the score, the more confident the prediction.
prediction_scores_v2 = [10.3,10.3,10.2,10.5,2000.34,2000.34]
# so this is equivalent
My question is: how can I rescale prediction_scores_v2 so that it gives an AUC score similar to the first one?
To put it another way, Scikit-Learn's roc_curve requires y_score to be probability estimates of the positive class. How should I treat the values if the y_score I have is a probability estimate of the wrong class?
Recommended answer
For AUC, you really only care about the order of your predictions. As long as the order is preserved, you can transform your predictions into whatever format the AUC functions will accept.
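To make the "only the order matters" point concrete, here is a minimal sketch that computes AUC directly as the fraction of (positive, negative) pairs in which the positive example receives the higher score, counting ties as half. The helper pairwise_auc and the transform 100 * s + 5 are illustrative choices, not part of Scikit-Learn; the data is the actual / prediction_scores pair from the question.

```python
def pairwise_auc(labels, scores, pos_label=1):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == pos_label]
    neg = [s for y, s in zip(labels, scores) if y != pos_label]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


actual = [1, 1, 1, 0, 0, 1]
prediction_scores = [0.9, 0.9, 0.9, 0.1, 0.1, 0.1]

auc_original = pairwise_auc(actual, prediction_scores)

# Hypothetical strictly increasing transform: values change, ranking does not.
stretched = [100 * s + 5 for s in prediction_scores]
auc_stretched = pairwise_auc(actual, stretched)

print(auc_original, auc_stretched)  # 0.875 0.875
```

Any strictly increasing transform leaves the result unchanged, which is why the scores do not need to be probabilities for the AUC to be meaningful.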
Divide by the maximum to bring your predictions into the range 0 to 1, then subtract the result from 1, since lower is better in your case:
# Rescale to [0, 1] and flip, so that a higher value means a more confident prediction.
# Assumes all scores are non-negative and max_pred > 0.
max_pred = max(prediction_scores_v2)
flipped_scores = [1 - x / max_pred for x in prediction_scores_v2]
false_positive_rate, true_positive_rate, thresholds = roc_curve(actual, flipped_scores, pos_label=1)
roc_auc = auc(false_positive_rate, true_positive_rate)
# 0.8125
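Since only the ranking matters, a simpler alternative to rescaling is to negate the scores, which reverses the order without assuming anything about their range. The sketch below is not part of the original answer; it uses roc_auc_score from sklearn.metrics with the data from the question, and the variable names negated_scores and scaled_scores are illustrative.

```python
from sklearn.metrics import roc_auc_score

actual = [1, 1, 1, 0, 0, 1]
prediction_scores_v2 = [10.3, 10.3, 10.2, 10.5, 2000.34, 2000.34]

# Lower is better, so negating makes "higher" mean "more confident".
negated_scores = [-x for x in prediction_scores_v2]
auc_negated = roc_auc_score(actual, negated_scores)

# The divide-by-max approach gives the same ranking, hence the same AUC.
max_pred = max(prediction_scores_v2)
scaled_scores = [1 - x / max_pred for x in prediction_scores_v2]
auc_scaled = roc_auc_score(actual, scaled_scores)

print(auc_negated, auc_scaled)  # 0.8125 0.8125
```

Negation also works when some scores are negative or the maximum is zero, cases where dividing by the maximum would not produce values between 0 and 1.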