Spark: regression model threshold and precision

Question

I have a logistic regression model, where I explicitly set the threshold to 0.5.

model.setThreshold(0.5)

I train the model and then I want to get basic stats -- precision, recall etc.

This is what I do when I evaluate the model:

val metrics = new BinaryClassificationMetrics(predictionAndLabels)
val precision = metrics.precisionByThreshold

precision.foreach { case (t, p) =>
  println(s"Threshold is: $t, Precision is: $p")
}

I get results with only 0.0 and 1.0 as values of threshold and 0.5 is completely ignored.

Here is the output of the above loop:

Threshold is: 1.0, Precision is: 0.8571428571428571

Threshold is: 0.0, Precision is: 0.3005181347150259

When I call metrics.thresholds() it also returns only two values, 0.0 and 1.0.

How do I get the precision and recall values with threshold as 0.5?

Recommended answer

You need to clear the model threshold before you make predictions. Clearing the threshold makes your predictions return a score rather than the classified label. If you do not, you will only have two thresholds, i.e. your labels 0.0 and 1.0.

model.clearThreshold()

A tuple from predictionAndLabels should then look like (0.6753421, 1.0) rather than (1.0, 1.0).
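To see why hard labels collapse the curve to two points, here is a minimal plain-Scala sketch that does not use Spark at all. It assumes, as a simplification, that candidate thresholds are the distinct prediction values and that a prediction counts as positive when it is greater than or equal to the threshold. The object name, the helper names, and the sample pairs are all made up for illustration.

```scala
object ThresholdDemo {
  // (prediction, label) pairs; the values are invented for this sketch.
  // With a threshold set on the model, predictions are hard labels:
  val hardLabels: Seq[(Double, Double)] =
    Seq((1.0, 1.0), (0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
  // After clearThreshold(), predictions are raw scores:
  val rawScores: Seq[(Double, Double)] =
    Seq((0.9, 1.0), (0.2, 0.0), (0.6, 0.0), (0.4, 1.0))

  // Candidate thresholds: the distinct prediction values.
  def thresholds(pairs: Seq[(Double, Double)]): Set[Double] =
    pairs.map(_._1).toSet

  // Precision when every prediction >= t is treated as positive.
  // Returning 1.0 when nothing is predicted positive is a convention
  // chosen for this sketch.
  def precisionAt(pairs: Seq[(Double, Double)], t: Double): Double = {
    val predictedPositive = pairs.filter(_._1 >= t)
    if (predictedPositive.isEmpty) 1.0
    else predictedPositive.count(_._2 == 1.0).toDouble / predictedPositive.size
  }

  // Recall when every prediction >= t is treated as positive.
  def recallAt(pairs: Seq[(Double, Double)], t: Double): Double = {
    val actualPositive = pairs.filter(_._2 == 1.0)
    if (actualPositive.isEmpty) 0.0
    else actualPositive.count(_._1 >= t).toDouble / actualPositive.size
  }
}
```

With hard labels the only candidate thresholds are 0.0 and 1.0, which matches the output in the question. With raw scores every distinct score becomes a candidate, and precision or recall at 0.5 can be read off directly.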

Take a look at https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/mllib/BinaryClassificationMetricsExample.scala

You probably still want to set numBins to control the number of points if the input is large.
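Putting the pieces together, the following is a rough end-to-end sketch rather than a definitive implementation. It assumes the RDD-based spark-mllib API is on the classpath and that a local master is acceptable. The toy training data, the object name ThresholdSketch, and the numBins value of 100 are assumptions chosen for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

object ThresholdSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ThresholdSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Toy, partly overlapping data (label, single feature); invented values.
    val data = sc.parallelize(Seq(
      LabeledPoint(0.0, Vectors.dense(0.1)),
      LabeledPoint(0.0, Vectors.dense(0.2)),
      LabeledPoint(0.0, Vectors.dense(0.4)),
      LabeledPoint(1.0, Vectors.dense(0.35)),
      LabeledPoint(0.0, Vectors.dense(0.6)),
      LabeledPoint(1.0, Vectors.dense(0.7)),
      LabeledPoint(1.0, Vectors.dense(0.8)),
      LabeledPoint(1.0, Vectors.dense(0.9))
    ))

    val model = new LogisticRegressionWithLBFGS().setNumClasses(2).run(data)

    // Clear the threshold so predict() returns raw scores, not 0.0/1.0 labels.
    model.clearThreshold()

    // For brevity this scores the training data; use a held-out set in practice.
    val scoreAndLabels = data.map(p => (model.predict(p.features), p.label))

    // The second constructor argument is numBins; it caps the number of
    // curve points, which matters mainly when the input is large.
    val metrics = new BinaryClassificationMetrics(scoreAndLabels, 100)

    metrics.precisionByThreshold().collect().foreach { case (t, p) =>
      println(s"Threshold is: $t, Precision is: $p")
    }
    metrics.recallByThreshold().collect().foreach { case (t, r) =>
      println(s"Threshold is: $t, Recall is: $r")
    }

    sc.stop()
  }
}
```

The listed thresholds come from the scores themselves, so 0.5 may not appear exactly. Without binning, the entry for the smallest listed threshold that is at least 0.5 should correspond to the same set of positive predictions as a cutoff of 0.5.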
