How to serve a Spark MLlib model?

Question

I'm evaluating tools for production ML-based applications, and one of our options is Spark MLlib, but I have some questions about how to serve a model once it's trained.

For example, in Azure ML, once a model is trained it is exposed as a web service that can be consumed from any application, and Amazon ML works similarly.

How do you serve/deploy ML models in Apache Spark?

Recommended answer

On the one hand, a machine learning model built with Spark can't be served in the traditional manner, the way you would serve one in Azure ML or Amazon ML.

Databricks claims to be able to deploy models using its notebooks, but I haven't actually tried that yet.

On the other hand, you can use a model in three ways:

  • Train on the fly inside an application, then apply prediction. This can be done in a Spark application or a notebook.
  • Train a model and save it, provided it supports persistence through an MLWriter, then load it in an application or a notebook and run it against your data.
  • Train a model with Spark and export it to PMML format using jpmml-spark. PMML allows different statistical and data mining tools to speak the same language. In this way, a predictive solution can be easily moved among tools and applications without the need for custom coding, e.g. from Spark ML to R.

Those are the three possible ways.

Of course, you can think of an architecture in which a RESTful service sits in front of Spark, built with spark-jobserver for example, to train and deploy models, but that needs some development. It's not an out-of-the-box solution.
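To give a feel for the development involved, here is a minimal sketch of such a REST layer using only the Python standard library. It does not use spark-jobserver; the `/predict` endpoint, the JSON shape, and the `threshold_model` stand-in are all assumptions for illustration. In a real setup the `predict` callable would wrap a loaded Spark model rather than this toy function.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(predict):
    """Build a handler class that delegates scoring to predict(features)."""

    class PredictHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != "/predict":
                self.send_error(404, "unknown endpoint")
                return
            try:
                length = int(self.headers.get("Content-Length", 0))
                payload = json.loads(self.rfile.read(length))
                features = payload["features"]
            except (ValueError, KeyError, TypeError):
                self.send_error(400, "expected a JSON body with a 'features' list")
                return
            body = json.dumps({"prediction": predict(features)}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # silence per-request logging in this sketch

    return PredictHandler


def threshold_model(features):
    """Hypothetical stand-in for a real model: 1.0 if the features sum above 1."""
    return 1.0 if sum(features) > 1.0 else 0.0


def serve(predict, host="127.0.0.1", port=8080):
    """Block forever, answering POST /predict requests with predict()."""
    HTTPServer((host, port), make_handler(predict)).serve_forever()
```

Calling `serve(threshold_model)` starts the service; a client would then POST a body such as `{"features": [0.9, 0.4]}` to `/predict`. Keeping the model behind a plain callable means the HTTP layer stays the same whichever of the three approaches above produces the predictions.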

You might also use a project like Oryx 2 to create a full lambda architecture to train, deploy, and serve a model.

Unfortunately, describing each of the solutions mentioned above is quite broad and doesn't fit within the scope of SO.
