How to calculate AUC for One Class SVM in Python?

Problem description

I am having difficulty plotting OneClassSVM's AUC plot in Python. I am using sklearn, which generates a confusion matrix like [[tp, fp], [fn, tn]] with fn = tn = 0.

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

fpr, tpr, thresholds = roc_curve(y_test, y_nb_predicted)
roc_auc = auc(fpr, tpr)  # this generates ValueError [1]
print("Area under the ROC curve : %f" % roc_auc)
plt.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)

I want to handle error [1] and plot AUC for OneClassSVM.

[1] ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

Solution

Please see my answer on a similar question. The gist is:

  • OneClassSVM fundamentally doesn't support converting a decision into a probability score, so you cannot pass the necessary scores into functions that require varying a score threshold, such as for ROC or Precision-Recall curves and scores.

  • You can approximate this type of score by computing the max value of your OneClassSVM's decision function across your input data, call it MAX, and then score the prediction for a given observation y by computing y_score = MAX - decision_function(y).

  • Pass these scores as y_score to functions such as average_precision_score, which accept non-thresholded scores instead of probabilities.

  • Finally, keep in mind that ROC makes less physical sense for OneClassSVM, because OneClassSVM is intended for situations with an expected and large class imbalance (outliers vs. non-outliers), and ROC will not accurately up-weight the relative success on the small number of outliers.
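
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic and hypothetical, outliers are labeled as the positive class (1), and MAX is taken over the test set's decision values. Since sklearn's decision_function is larger for inliers, MAX - decision_function is larger for more outlier-like observations.

```python
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.metrics import roc_curve, auc, average_precision_score

rng = np.random.RandomState(0)

# Hypothetical synthetic data: inliers near the origin, outliers far away.
X_train = 0.3 * rng.randn(200, 2)
X_inliers = 0.3 * rng.randn(50, 2)
X_outliers = rng.uniform(low=4, high=6, size=(10, 2))
X_test = np.vstack([X_inliers, X_outliers])

# Assumption: outliers are the positive class (1), inliers are 0.
y_test = np.concatenate([
    np.zeros(len(X_inliers), dtype=int),
    np.ones(len(X_outliers), dtype=int),
])

clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(X_train)

# Continuous, non-thresholded scores: MAX - decision_function(y).
decision = clf.decision_function(X_test)
y_score = decision.max() - decision

# Pass the scores (not the +1/-1 predictions) to the metric functions.
fpr, tpr, thresholds = roc_curve(y_test, y_score)
roc_auc = auc(fpr, tpr)
avg_precision = average_precision_score(y_test, y_score)

print("Area under the ROC curve: %f" % roc_auc)
print("Average precision: %f" % avg_precision)
```

The resulting fpr and tpr arrays can be passed to plt.plot as in the question. Given the class imbalance noted above, the average precision is likely the more informative of the two numbers.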
