How to save and load an MLlib model in Apache Spark

Question

I trained a classification model in Apache Spark (using pyspark). The trained model is stored in a LogisticRegressionModel object. Now I want to make predictions on new data, so I would like to save the model and read it back in a new program to make those predictions. Any idea how to store the model? I was thinking of pickle, but I'm new to both Python and Spark, so I'd like to hear what the community thinks.

UPDATE: I also needed a decision tree classifier. To load it back, I had to import DecisionTreeModel:

from pyspark.mllib.tree import DecisionTree, DecisionTreeModel

Recommended answer

You can save your model by using the save method of MLlib models (https://spark.apache.org/docs/latest/api/python/pyspark.mllib.html?highlight=logisticregression#pyspark.mllib.classification.LogisticRegressionModel.save).

# lrm is a trained LogisticRegressionModel and sc is the active SparkContext
lrm.save(sc, "lrm_model.model")

After storing it you can load it in another application.

from pyspark.mllib.classification import LogisticRegressionModel
sameModel = LogisticRegressionModel.load(sc, "lrm_model.model")
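To tie the two steps together, here is a minimal sketch of the full round trip: train, save, load, predict. The function name save_load_predict, the toy training data, and the sample point are made up for illustration and are not from the original answer. The sketch assumes the RDD-based pyspark.mllib API and an already running SparkContext.

```python
def save_load_predict(sc, model_dir):
    """Train a tiny model, save it, reload it, and predict with both copies.

    sc        -- an active SparkContext (assumed to be created by the caller)
    model_dir -- output path; save() writes a directory there, so it
                 should not exist yet
    """
    # pyspark is imported inside the function so that this file can be
    # imported even on a machine without Spark installed.
    from pyspark.mllib.classification import (
        LogisticRegressionModel,
        LogisticRegressionWithLBFGS,
    )
    from pyspark.mllib.regression import LabeledPoint

    # Toy training set (hypothetical data): label first, then features.
    training = sc.parallelize([
        LabeledPoint(0.0, [0.0, 1.0]),
        LabeledPoint(0.0, [0.5, 1.5]),
        LabeledPoint(1.0, [2.0, 0.0]),
        LabeledPoint(1.0, [2.5, 0.5]),
    ])
    lrm = LogisticRegressionWithLBFGS.train(training, iterations=10)

    # Persist the model, then read it back as a separate object.
    lrm.save(sc, model_dir)
    same_model = LogisticRegressionModel.load(sc, model_dir)

    # Score the same unseen point with the original and the reloaded model.
    new_point = [2.2, 0.1]
    return lrm.predict(new_point), same_model.predict(new_point)
```

Called as save_load_predict(sc, "lrm_model.model") with a fresh output path, the two returned predictions should agree, since the reloaded model carries the same weights. In a real setup the training and the loading would live in two separate programs, as the question describes.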

As @zero323 mentioned, another way to achieve this is to use the Predictive Model Markup Language (PMML), which is an XML-based file format developed by the Data Mining Group to provide a way for applications to describe and exchange models produced by data mining and machine learning algorithms.
