How to serve a Spark MLlib model?


Question

I'm evaluating tools for production ML-based applications, and one of our options is Spark MLlib, but I have some questions about how to serve a model once it's trained.

For example, in Azure ML a trained model is exposed as a web service that can be consumed from any application, and Amazon ML works similarly.

How do you serve/deploy ML models in Apache Spark?

Recommended answer

On the one hand, a machine learning model built with Spark can't be served in the traditional manner, the way you would serve one in Azure ML or Amazon ML.

Databricks claims to be able to deploy models using its notebooks, but I haven't actually tried that yet.

On the other hand, you can use a model in three ways:

  • Train on the fly inside an application, then apply prediction. This can be done in a Spark application or a notebook.
  • Train a model and save it (if it implements an MLWriter), then load it in an application or a notebook and run it against your data.
  • Train a model with Spark and export it to PMML format using jpmml-spark. PMML allows different statistical and data mining tools to speak the same language. In this way, a predictive solution can be moved easily among tools and applications without the need for custom coding, e.g. from Spark ML to R.

Those are the three possible ways.

Of course, you can think of an architecture in which a RESTful service sits in front of Spark, built for example with spark-jobserver, to train and deploy models, but that needs some development. It's not an out-of-the-box solution.
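To give a rough idea of what such a service looks like from the client's side, here is a minimal sketch of an HTTP prediction endpoint using only the Python standard library. It does not use spark-jobserver or Spark at all: the `predict` function is a hypothetical stand-in for a call into a loaded model, and the `/predict` route and JSON payload shape are assumptions chosen for the example.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for scoring with a loaded model."""
    return 1.0 if sum(features) > 1.0 else 0.0


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Only one route is exposed in this sketch.
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence per-request logging to keep the example output clean.
        pass


def call_predict(port, features):
    """Client helper: POST a feature vector and return the decoded JSON reply."""
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(
        "POST",
        "/predict",
        body=json.dumps({"features": features}),
        headers={"Content-Type": "application/json"},
    )
    result = json.loads(conn.getresponse().read())
    conn.close()
    return result


if __name__ == "__main__":
    # Port 0 lets the OS pick a free port.
    server = HTTPServer(("127.0.0.1", 0), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    print(call_predict(server.server_address[1], [0.9, 0.4]))
    server.shutdown()
```

In a real deployment the body of `predict` would delegate to whatever holds the trained model, and that is the part that "needs some development".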

You might also use projects like Oryx 2 to create a full lambda architecture to train, deploy and serve a model.

Unfortunately, describing each of the solutions mentioned above is quite broad and doesn't fit within the scope of SO.
