How to access parameters of the underlying model in ML Pipeline?
Problem description
I have a DataFrame that I fit with LinearRegression. If I fit it directly, as below, I can display the details of the model:
import org.apache.spark.ml.regression.LinearRegression
val lr = new LinearRegression()
val lrModel = lr.fit(df)
lrModel: org.apache.spark.ml.regression.LinearRegressionModel = linReg_b22a7bb88404
println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")
Coefficients: [0.9705748115939526] Intercept: 0.31041486689532866
However, if I use it inside a pipeline (as in the simplified example below),
import org.apache.spark.ml.Pipeline
val pipeline = new Pipeline().setStages(Array(lr))
val lrModel = pipeline.fit(df)
then I get the following error.
scala> lrModel
res9: org.apache.spark.ml.PipelineModel = pipeline_99ca9cba48f8
scala> println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")
<console>:68: error: value coefficients is not a member of org.apache.spark.ml.PipelineModel
println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")
^
<console>:68: error: value intercept is not a member of org.apache.spark.ml.PipelineModel
println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")
I understand what it means (I obviously get a different class because of the pipeline), but I don't know how to get to the real underlying model.
Recommended answer
The fitted LinearRegressionModel is in the PipelineModel's stages array, at the exact same index as its corresponding LinearRegression in the Pipeline.
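import org.apache.spark.ml.regression.LinearRegressionModel
// lrModel is the PipelineModel from the question; stage 0 is the fitted regression model
val model = lrModel.stages(0).asInstanceOf[LinearRegressionModel]
println(s"Coefficients: ${model.coefficients} Intercept: ${model.intercept}")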
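When the pipeline has more than one stage, hard-coding the index and casting can fail at runtime if the stages are reordered. A minimal sketch of a more defensive lookup follows, searching the fitted stages by type. The toy data, the column names, the VectorAssembler stage, and the names pipelineModel and fitted are assumptions made for illustration and are not part of the original question.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.regression.{LinearRegression, LinearRegressionModel}
import org.apache.spark.sql.SparkSession

// Local session for the sketch; in spark-shell a session named `spark` already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("pipeline-model-access")
  .getOrCreate()
import spark.implicits._

// Made-up toy data: label is roughly 2 * x + 1.
val df = Seq((1.0, 3.0), (2.0, 5.1), (3.0, 6.9), (4.0, 9.0)).toDF("x", "label")

// Hypothetical two-stage pipeline: assemble the feature vector, then fit the regression.
val assembler = new VectorAssembler()
  .setInputCols(Array("x"))
  .setOutputCol("features")
val lr = new LinearRegression()
val pipelineModel = new Pipeline().setStages(Array(assembler, lr)).fit(df)

// Find the fitted model by type instead of relying on a fixed index and a cast.
val fitted: Option[LinearRegressionModel] =
  pipelineModel.stages.collectFirst { case m: LinearRegressionModel => m }

fitted.foreach { m =>
  println(s"Coefficients: ${m.coefficients} Intercept: ${m.intercept}")
}

spark.stop()
```

Here the regression is the second stage, so pipelineModel.stages(1).asInstanceOf[LinearRegressionModel] would return the same object. The pattern match simply yields None instead of throwing a ClassCastException when no such stage is present.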