How to use prediction score in creating ROC curve with Scikit-Learn


Question

我有以下代码:

from sklearn.metrics import roc_curve, auc

actual      = [1,1,1,0,0,1]
prediction_scores = [0.9,0.9,0.9,0.1,0.1,0.1]
false_positive_rate, true_positive_rate, thresholds = roc_curve(actual, prediction_scores, pos_label=1)
roc_auc = auc(false_positive_rate, true_positive_rate)
roc_auc
# 0.875

In this example the interpretation of prediction_scores is straightforward: the higher the score, the more confident the prediction.

Now I have another set of prediction scores. It is non-fractional, and the interpretation is reversed: the lower the score, the more confident the prediction.

prediction_scores_v2 = [10.3,10.3,10.2,10.5,2000.34,2000.34]
# so this is intended to be equivalent (lower = more confident)

My question is: how can I rescale prediction_scores_v2 so that it gives an AUC score similar to the first one?

To put it another way, Scikit-learn's roc_curve requires y_score to be probability estimates of the positive class. How should I treat the values if the y_score I have is effectively a probability estimate of the wrong class?

Answer

For AUC, you really only care about the order of your predictions. So as long as the ordering is preserved, you can transform your predictions into any format that AUC will accept.
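This rank-invariance is easy to check directly (a quick sketch, not part of the original answer): applying any strictly increasing transform to the scores leaves the AUC unchanged, since only the relative ordering of scores matters.

```python
# Check that AUC depends only on the ordering of the scores:
# a strictly increasing transform (here 100*s + 7) leaves AUC unchanged.
from sklearn.metrics import roc_auc_score

actual = [1, 1, 1, 0, 0, 1]
scores = [0.9, 0.9, 0.9, 0.1, 0.1, 0.1]

auc_raw = roc_auc_score(actual, scores)
auc_transformed = roc_auc_score(actual, [100 * s + 7 for s in scores])
print(auc_raw, auc_transformed)  # both 0.875
```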

You'll want to divide by the max to bring your predictions between 0 and 1, and then subtract from 1 since lower is better in your case:

# rescale in place so that higher now means more confident positive
max_pred = max(prediction_scores_v2)
prediction_scores_v2[:] = (1 - x/max_pred for x in prediction_scores_v2)

false_positive_rate, true_positive_rate, thresholds = roc_curve(actual, prediction_scores_v2, pos_label=1)
roc_auc = auc(false_positive_rate, true_positive_rate)
# 0.8125
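Equivalently (my own aside, not part of the original answer), because AUC only depends on the ordering, any strictly decreasing transform works: simply negating the scores gives the same AUC as the divide-and-subtract rescaling above.

```python
# Negating the scores is a strictly decreasing transform, so it flips
# "lower is better" into "higher is better" without changing the ranking.
from sklearn.metrics import roc_auc_score

actual = [1, 1, 1, 0, 0, 1]
prediction_scores_v2 = [10.3, 10.3, 10.2, 10.5, 2000.34, 2000.34]

roc_auc = roc_auc_score(actual, [-x for x in prediction_scores_v2])
print(roc_auc)  # 0.8125, matching the rescaled version
```

Note the AUC is 0.8125 rather than 0.875 here because the two 2000.34 values tie a positive with a negative, which the first score set did not.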

