How to plot ROC curve and precision-recall curve from BinaryClassificationMetrics


Question

I am trying to plot a ROC curve and a precision-recall curve. The points are generated by Spark MLlib's BinaryClassificationMetrics, following the Spark guide at https://spark.apache.org/docs/latest/mllib-evaluation-metrics.html. This is the output I get:

    [(1.0,1.0), (0.0,0.4444444444444444)] - Precision
    [(1.0,1.0), (0.0,1.0)] - Recall
    [(1.0,1.0), (0.0,0.6153846153846153)] - F1Measure
    [(0.0,1.0), (1.0,1.0), (1.0,0.4444444444444444)] - Precision-Recall curve
    [(0.0,0.0), (0.0,1.0), (1.0,1.0), (1.0,1.0)] - ROC curve

Recommended answer

It looks like you have hit a problem similar to one I ran into. You need to either flip the parameters you pass to the metrics constructor or pass in the probability instead of the prediction. For example, if you are using BinaryClassificationMetrics with a RandomForestClassifier, then according to the classifier's documentation page (under outputs) the model produces both a "prediction" column and a "probability" column.

Then initialize your Metrics thus:

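    import org.apache.spark.ml.linalg.DenseVector // Spark 2.x+; on 1.x use org.apache.spark.mllib.linalg.DenseVector
    import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
    import org.apache.spark.sql.functions.col
    val metrics = new BinaryClassificationMetrics(predictionsWithResponse
      .select(col("probability"), col("myLabel"))
      .rdd.map(r => (r.getAs[DenseVector](0)(1), r.getDouble(1)))) // (P(class 1), label)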

The DenseVector call extracts the probability of class 1 (index 1 of the probability vector).
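To illustrate why the score matters, here is a minimal plain-Scala sketch (no Spark needed) that builds ROC points with one point per distinct score threshold, bracketed by (0,0) and (1,1). The object `RocSketch`, the function `rocPoints`, and the sample data are hypothetical names made up for this illustration; this is not Spark's implementation, only a small imitation of the idea. With hard 0/1 predictions there are only two distinct thresholds, which reproduces the four-point ROC curve from the question; with probabilities each distinct value contributes its own point.

```scala
// Hypothetical helper (not part of Spark): ROC points for (score, label) pairs.
object RocSketch {
  def rocPoints(scoreAndLabels: Seq[(Double, Double)]): Seq[(Double, Double)] = {
    val positives = scoreAndLabels.count(_._2 == 1.0).toDouble
    val negatives = scoreAndLabels.size - positives
    // Thresholds in descending order; a row counts as predicted positive
    // when its score is >= the threshold.
    val thresholds = scoreAndLabels.map(_._1).distinct.sorted.reverse
    val inner = thresholds.map { t =>
      val predictedPositive = scoreAndLabels.filter(_._1 >= t)
      val tp = predictedPositive.count(_._2 == 1.0)
      val fp = predictedPositive.size - tp
      (fp / negatives, tp / positives) // (false positive rate, true positive rate)
    }
    ((0.0, 0.0) +: inner) :+ (1.0, 1.0)
  }

  // Made-up sample data: five rows, three of them positive.
  val labels = Seq(1.0, 1.0, 0.0, 0.0, 1.0)
  // Hard predictions as the score: only two distinct thresholds (1.0 and 0.0).
  val hardPredictions = Seq(1.0, 1.0, 0.0, 0.0, 1.0).zip(labels)
  // Class-1 probabilities as the score: five distinct thresholds.
  val probabilities = Seq(0.9, 0.6, 0.4, 0.2, 0.7).zip(labels)
}
```

Calling `RocSketch.rocPoints(RocSketch.hardPredictions)` yields four points, while `RocSketch.rocPoints(RocSketch.probabilities)` yields seven, which is the "more than 1 point on your curve" effect in miniature.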

As for the actual plotting, that's up to you (there are many fine tools for that), but at least you will now get more than one point on your curve besides the endpoints.

In case it isn't clear:

metrics.roc().collect() will give you the data for the ROC curve: Tuples of: (false positive rate, true positive rate).
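Since the plotting tool is left open, one simple option is to dump the collected points to a CSV file and plot that with whatever tool you prefer. The sketch below is plain Scala and only an illustration: `CurveCsv`, `toCsv`, `write`, and the column names `fpr`/`tpr` are hypothetical names, and the sample points are the ROC values copied from the question.

```scala
import java.io.PrintWriter

// Hypothetical helper (not part of Spark): turn curve points into CSV text.
object CurveCsv {
  // One header row, then one "x,y" row per point.
  def toCsv(xName: String, yName: String, points: Seq[(Double, Double)]): String =
    (s"$xName,$yName" +: points.map { case (x, y) => s"$x,$y" }).mkString("\n")

  // Write the CSV text to a local file, closing the writer even on failure.
  def write(path: String, csv: String): Unit = {
    val out = new PrintWriter(path)
    try out.write(csv) finally out.close()
  }

  // ROC points copied from the question; with a live `metrics` object these
  // would come from metrics.roc().collect() instead.
  val rocFromQuestion = Seq((0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 1.0))
}
```

With Spark available, `CurveCsv.toCsv("fpr", "tpr", metrics.roc().collect().toSeq)` would export the ROC curve. The precision-recall curve works the same way with `metrics.pr().collect()`, whose tuples are (recall, precision), so recall goes on the x-axis.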
