How to save models from ML Pipeline to S3 or HDFS?

Question

I am trying to save thousands of models produced by an ML Pipeline. As indicated in the answer here, the models can be saved as follows:

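import java.io._
import org.apache.spark.ml.PipelineModel

// Serialize a fitted PipelineModel to a local file using Java serialization.
def saveModel(name: String, model: PipelineModel): Unit = {
  val oos = new ObjectOutputStream(new FileOutputStream(s"/some/path/$name"))
  try oos.writeObject(model) finally oos.close()
}

// schools and bySchoolArrayModels are parallel collections of names and fitted models.
schools.zip(bySchoolArrayModels).foreach {
  case (name, model) => saveModel(name, model)
}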

I have tried using s3://some/path/$name and /user/hadoop/some/path/$name, since I would eventually like the models to be saved to Amazon S3, but both fail with messages indicating that the path cannot be found.

How can I save the models to Amazon S3?

Recommended answer

Since Apache Spark 1.6, in the Scala API, you can save your models without using any tricks, because all models from the ML library come with a save method. You can check this in LogisticRegressionModel (http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.ml.classification.LogisticRegressionModel), which does indeed have that method. To load the model back, you can use a static method:

import org.apache.spark.ml.classification.LogisticRegressionModel

val logRegModel = LogisticRegressionModel.load("myModel.model")
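Applied to the question, the same idea could look like the sketch below. The FileOutputStream in the question writes only to the local file system of the driver, which is why s3:// and HDFS paths are reported as not found; Spark's own writer goes through the Hadoop file system layer instead. The names baseDir, modelPath, saveModel and loadModel, as well as the example location, are hypothetical and introduced only for illustration. It assumes Spark 1.6 or later, that every stage in each pipeline supports persistence, and that the cluster is configured for the chosen URI scheme (s3://, s3a:// or s3n://, depending on the setup).

```scala
import org.apache.spark.ml.PipelineModel

// Hypothetical base location: any URI the cluster's Hadoop configuration can resolve,
// for example "hdfs:///user/hadoop/models" or "s3a://some-bucket/models".
val baseDir = "hdfs:///user/hadoop/models"

// Build the output directory for one model, tolerating a trailing slash on the base.
def modelPath(baseDir: String, name: String): String =
  s"${baseDir.stripSuffix("/")}/$name"

// Persist a fitted PipelineModel with Spark's built-in writer.
// overwrite() replaces an existing directory at the same path.
def saveModel(name: String, model: PipelineModel): Unit =
  model.write.overwrite().save(modelPath(baseDir, name))

// Load a model back through the static load method on the companion object.
def loadModel(name: String): PipelineModel =
  PipelineModel.load(modelPath(baseDir, name))

// Usage with the collections from the question:
// schools.zip(bySchoolArrayModels).foreach { case (name, model) => saveModel(name, model) }
```

Each model is written as a directory rather than a single file, so the name passed in should be usable as a path segment.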
