How to plot ROC curve and precision-recall curve from BinaryClassificationMetrics

Question

I am trying to plot the ROC curve and the precision-recall curve on a graph. The points are generated by Spark MLlib's BinaryClassificationMetrics. Following the Spark guide at https://spark.apache.org/docs/latest/mllib-evaluation-metrics.html, I get this output:

[(1.0,1.0), (0.0,0.4444444444444444)] - Precision
[(1.0,1.0), (0.0,1.0)] - Recall
[(1.0,1.0), (0.0,0.6153846153846153)] - F1Measure
[(0.0,1.0), (1.0,1.0), (1.0,0.4444444444444444)] - Precision-Recall curve
[(0.0,0.0), (0.0,1.0), (1.0,1.0), (1.0,1.0)] - ROC curve

Recommended answer

It looks like you have hit a problem similar to one I ran into. You need to either flip the parameters you pass to the Metrics constructor, or pass in the probability instead of the prediction. For example, if you are using BinaryClassificationMetrics with a RandomForestClassifier, then according to this page (under outputs) the model produces both a "prediction" column and a "probability" column.

Then initialize your Metrics like this:

    // Pair each row's class-1 probability (the score) with its label.
    val metrics = new BinaryClassificationMetrics(predictionsWithResponse
      .select(col("probability"), col("myLabel"))
      .rdd.map(r => (r.getAs[DenseVector](0)(1), r.getDouble(1))))

The DenseVector call extracts the probability of class 1 from the "probability" column.
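To see why hard predictions collapse the curve, here is a minimal plain-Scala sketch (no Spark needed) that builds ROC points with one point per distinct score threshold. The object name `RocSketch`, the helper `rocPoints`, and the sample labels and scores are all made up for illustration; this is not Spark's implementation, and it assumes both classes are present in the data.

```scala
object RocSketch {
  // Hypothetical sample: true labels, model probabilities for class 1,
  // and the 0/1 predictions obtained by thresholding those probabilities at 0.5.
  val labels: Seq[Double] = Seq(1.0, 1.0, 0.0, 1.0, 0.0, 0.0)
  val probabilities: Seq[Double] = Seq(0.9, 0.8, 0.7, 0.4, 0.3, 0.1)
  val hardPredictions: Seq[Double] = Seq(1.0, 1.0, 1.0, 0.0, 0.0, 0.0)

  // Returns (false positive rate, true positive rate) pairs, one per distinct
  // score used as a threshold, with (0,0) prepended and (1,1) appended.
  def rocPoints(scoreAndLabels: Seq[(Double, Double)]): Seq[(Double, Double)] = {
    val positives = scoreAndLabels.count(_._2 > 0.5).toDouble
    val negatives = scoreAndLabels.size - positives
    // Highest threshold first, so the curve runs from (0,0) toward (1,1).
    val thresholds = scoreAndLabels.map(_._1).distinct.sorted.reverse
    val interior = thresholds.map { t =>
      val predictedPositive = scoreAndLabels.filter(_._1 >= t)
      val tp = predictedPositive.count(_._2 > 0.5)
      val fp = predictedPositive.size - tp
      (fp / negatives, tp / positives)
    }
    (0.0, 0.0) +: interior :+ (1.0, 1.0)
  }

  val fromPredictions: Seq[(Double, Double)] = rocPoints(hardPredictions.zip(labels))
  val fromProbabilities: Seq[(Double, Double)] = rocPoints(probabilities.zip(labels))
}
```

With 0/1 predictions there are only two distinct thresholds, so `fromPredictions` holds (0,0), one real point, and (1,1) twice, which is the same shape as the four-point ROC output in the question. With probabilities, every distinct score contributes its own point.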

As for the actual plotting, that is up to you (there are many fine tools for it), but at least you will now get more than one point on your curve besides the endpoints.
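One tool-agnostic way to get the points into a plotting program is to dump them as CSV. The sketch below is only an illustration: `CurveCsv`, `toCsv`, `write`, and the column names are hypothetical helpers, not part of Spark.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

object CurveCsv {
  // Renders (x, y) points as CSV text with a header row.
  def toCsv(xName: String, yName: String, points: Seq[(Double, Double)]): String =
    (s"$xName,$yName" +: points.map { case (x, y) => s"$x,$y" }).mkString("\n")

  // Writes the CSV text to the given path (overwrites an existing file).
  def write(path: String, csv: String): Unit =
    Files.write(Paths.get(path), csv.getBytes(StandardCharsets.UTF_8))
}

// Assumed usage with the metrics object from the answer:
//   CurveCsv.write("roc.csv", CurveCsv.toCsv("fpr", "tpr", metrics.roc().collect()))
//   CurveCsv.write("pr.csv", CurveCsv.toCsv("recall", "precision", metrics.pr().collect()))
```

Note the column order for the second file: `pr()` yields (recall, precision) pairs, so recall goes on the x-axis.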

In case it isn't clear:

metrics.roc().collect() will give you the data for the ROC curve as tuples of (false positive rate, true positive rate).
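As a quick sanity check on collected points, the area under the curve can be approximated with the trapezoid rule. This is a rough sketch with a hypothetical helper name (`TrapezoidArea.area`); Spark's own `metrics.areaUnderROC()` is the value to rely on in practice.

```scala
object TrapezoidArea {
  // Sums the trapezoids between consecutive (x, y) points.
  // Assumes the points are already ordered by x, as in the collected ROC output.
  def area(points: Seq[(Double, Double)]): Double =
    points.zip(points.drop(1)).map { case ((x1, y1), (x2, y2)) =>
      (x2 - x1) * (y1 + y2) / 2.0
    }.sum

  // The four ROC points printed in the question.
  val questionRoc: Seq[(Double, Double)] =
    Seq((0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 1.0))
}
```

For the question's points this yields an area of 1.0, a sign that the 0/1 predictions were treated as perfectly separating scores and that the curve carries no real threshold information.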
