ROC curve with sklearn [python]
Question
I have trouble understanding how to use the ROC functions in sklearn.
I want to plot a ROC curve with Python: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html
I am writing a program that evaluates detectors (Haar cascade, neural networks). I already have the data saved in a file in the following format:
0.5 TP
0.43 FP
0.72 FN
0.82 TN
...
where TP means True Positive, FP False Positive, FN False Negative, and TN True Negative.
I parse it and fill 4 arrays with this data set.
Then I want to pass it to
fpr, tpr = sklearn.metrics.roc_curve(y_true, y_score, average='macro', sample_weight=None)
But how do I do this? What are y_true and y_score in my case? Afterwards, I pass fpr and tpr to
auc = sklearn.metrics.auc(fpr, tpr)
Recommended answer
Quoting Wikipedia:
The ROC is created by plotting the FPR (false positive rate) vs the TPR (true positive rate) at various threshold settings.
In order to compute FPR and TPR, you must provide the true binary labels (y_true) and the target scores (y_score) to the function sklearn.metrics.roc_curve.
So in your case, I would do something like this:
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve
from sklearn.metrics import auc
# Compute fpr, tpr, thresholds and roc auc
fpr, tpr, thresholds = roc_curve(y_true, y_score)
# auc() integrates the curve, so it takes the curve points, not the raw labels and scores
roc_auc = auc(fpr, tpr)
# Plot ROC curve
plt.plot(fpr, tpr, label='ROC curve (area = %0.3f)' % roc_auc)
plt.plot([0, 1], [0, 1], 'k--') # random predictions curve
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.0])
plt.xlabel('False Positive Rate or (1 - Specificity)')
plt.ylabel('True Positive Rate or (Sensitivity)')
plt.title('Receiver Operating Characteristic')
plt.legend(loc="lower right")
plt.show()
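The snippet above takes y_true and y_score as given. One possible way to build them from the file format in the question is sketched below. It rests on two assumptions that the question does not confirm: the number on each line is the detector's confidence for the positive class, and TP/FN lines are actual positives while FP/TN lines are actual negatives. The sample data, the `ACTUAL_CLASS` mapping, and the `load_scores` helper are hypothetical names used only for illustration.

```python
import io

from sklearn.metrics import roc_curve, auc

# Hypothetical sample in the question's "score label" format.
sample = io.StringIO("0.5 TP\n0.43 FP\n0.72 FN\n0.82 TN\n")

# Assumption: TP and FN samples are actual positives (1),
# FP and TN samples are actual negatives (0).
ACTUAL_CLASS = {"TP": 1, "FN": 1, "FP": 0, "TN": 0}


def load_scores(lines):
    """Return (y_true, y_score) from lines of the form '<score> <label>'."""
    y_true, y_score = [], []
    for line in lines:
        parts = line.split()
        # Skip blank or malformed lines instead of guessing their meaning.
        if len(parts) != 2 or parts[1] not in ACTUAL_CLASS:
            continue
        y_score.append(float(parts[0]))
        y_true.append(ACTUAL_CLASS[parts[1]])
    return y_true, y_score


y_true, y_score = load_scores(sample)
fpr, tpr, thresholds = roc_curve(y_true, y_score)
roc_auc = auc(fpr, tpr)
print(y_true, y_score, roc_auc)
```

With this mapping a single pair of arrays is enough, so the four separate arrays from the question are not needed. If the scores in the file mean something else (for example, confidence in the predicted class rather than the positive class), they would have to be converted first.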
If you want a deeper understanding of how the false positive rate and the true positive rate are computed for all possible threshold values, I suggest reading this article.
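As a rough illustration of that computation, the sketch below sweeps every distinct score as a threshold and counts true and false positives at each one. It is a simplified teaching version written with NumPy, not sklearn's actual implementation, and `manual_roc` is a hypothetical helper name. It assumes binary labels coded as 0 and 1, with at least one sample of each class.

```python
import numpy as np


def manual_roc(y_true, y_score):
    """Compute FPR and TPR at every distinct score threshold (highest first)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    n_pos = np.sum(y_true == 1)
    n_neg = np.sum(y_true == 0)

    # Each distinct score is tried as a threshold, from strictest to loosest.
    thresholds = np.unique(y_score)[::-1]

    # Start at (0, 0): a threshold above every score predicts no positives.
    fpr, tpr = [0.0], [0.0]
    for t in thresholds:
        predicted_pos = y_score >= t
        tp = np.sum(predicted_pos & (y_true == 1))
        fp = np.sum(predicted_pos & (y_true == 0))
        tpr.append(tp / n_pos)  # TPR = TP / (TP + FN)
        fpr.append(fp / n_neg)  # FPR = FP / (FP + TN)
    return np.array(fpr), np.array(tpr), thresholds


fpr, tpr, thresholds = manual_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(fpr)
print(tpr)
```

Lowering the threshold can only add predicted positives, so both rates move from 0 toward 1 as the sweep proceeds, which is what traces out the curve.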