Spark ML - Save OneVsRestModel

Question

I am in the middle of refactoring my code to take advantage of DataFrames, Estimators, and Pipelines. I was originally using MLlib Multiclass LogisticRegressionWithLBFGS on RDD[LabeledPoint]. I am enjoying learning and using the new API, but I am not sure how to save my new model and apply it on new data.

Currently, the ML implementation of LogisticRegression only supports binary classification. I am instead using OneVsRest, like so:

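import org.apache.spark.ml.classification.{LogisticRegression, OneVsRest}

// Binary logistic regression used as the base classifier for each class
val lr = new LogisticRegression().setFitIntercept(true)
val ovr = new OneVsRest()
ovr.setClassifier(lr)
val ovrModel = ovr.fit(training)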

I would now like to save my OneVsRestModel, but this does not seem to be supported by the API. I have tried:

ovrModel.save("my-ovr") // Cannot resolve symbol save
ovrModel.models.foreach(_.save("model-" + _.uid)) // Cannot resolve symbol save

Is there a way to save this, so I can load it in a new application for making new predictions?

Recommended answer

The problem here is that models returns an Array[ClassificationModel[_, _]], not an Array[LogisticRegressionModel] (or an Array[MLWritable]), so save is not visible on its elements. To make it work you'll have to be specific about the types:

import org.apache.spark.ml.classification.LogisticRegressionModel

ovrModel.models.zipWithIndex.foreach { 
  case (model: LogisticRegressionModel, i: Int) => 
    model.save(s"model-${model.uid}-$i")
}

or, more generally:

import org.apache.spark.ml.util.MLWritable

ovrModel.models.zipWithIndex.foreach { 
  case (model: MLWritable, i: Int) =>
    model.save(s"model-${model.uid}-$i")
}

Unfortunately, as of now (Spark 1.6), OneVsRestModel doesn't implement MLWritable, so it cannot be saved on its own.
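
For readers on a newer release: to my knowledge, OneVsRestModel gained MLWritable support in Spark 2.0, so the whole model can be written and read back directly. A minimal sketch, assuming Spark 2.0 or later; roundTrip is a hypothetical helper name, not part of Spark:

```scala
import org.apache.spark.ml.classification.OneVsRestModel

// Hypothetical helper: writes a fitted OneVsRestModel to `path` and reads it back.
// Assumes Spark 2.0+, where OneVsRestModel is MLWritable and has a companion load.
// save fails if `path` already exists, so pick a fresh location.
def roundTrip(model: OneVsRestModel, path: String): OneVsRestModel = {
  model.save(path)
  OneVsRestModel.load(path)
}
```

On Spark 1.6 this does not compile, which is why the per-model workaround above is needed there.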

Note

All models in the OneVsRest seem to use the same uid, hence we need an explicit index. The index will also be useful for identifying each model later.
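
The explicit index also gives a way to read the binary models back in class order from a new application. A minimal sketch, assuming the models were written with the model-<uid>-<i> naming used above and that a SparkContext is already running; loadBinaryModels, uid, and numClasses are hypothetical names introduced for illustration:

```scala
import org.apache.spark.ml.classification.LogisticRegressionModel

// Hypothetical helper: rebuilds the paths used when saving and loads each
// binary model so that position i in the result corresponds to class index i.
// `uid` is the shared uid observed at save time; `numClasses` is the number
// of models that were written.
def loadBinaryModels(uid: String, numClasses: Int): Array[LogisticRegressionModel] =
  (0 until numClasses).map { i =>
    LogisticRegressionModel.load(s"model-$uid-$i")
  }.toArray
```

This only restores the individual classifiers. As far as I can tell, Spark 1.6 offers no public way to reassemble them into a OneVsRestModel, so multiclass predictions would have to be combined by hand, for example by picking the class whose model gives the highest raw score.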
