Apache Spark ALS - how to perform live recommendations / fold-in an anonymous user

Question

I am using the Apache Spark MLlib ALS implementation (through the PySpark API for Python) to develop a service that performs live recommendations for anonymous users (users not in the training set) on my site. In my use case I train the model on the user ratings in this way:

from pyspark.mllib.recommendation import ALS, MatrixFactorizationModel, Rating
ratings = df.map(lambda l: Rating(int(l[0]), int(l[1]), float(l[2])))
rank = 10 
numIterations = 10
model = ALS.trainImplicit(ratings, rank, numIterations)

Now, each time an anonymous user selects an item in the catalogue, I want to fold that user's vector into the ALS model and get recommendations (just like the recommendProducts() call), but without re-training the whole model.

Is there an easy way to fold in the vector of a new anonymous user after training the ALS model in Apache Spark?

Thanks in advance.

Recommended answer

I have seen a few open-source "model server" solutions advertised, but I have no hands-on experience with any of them. I have also heard of a commercial offering, but I can't remember the name right now.
So form your own opinion, and keep an eye out for possible alternatives.

PredictionIO (the start-up has been gobbled up by Salesforce, but their solution is still available) uses a Spark+Hadoop+HBase stack, plus some kind of web server component.

MLeap is yet-another-ML-library-with-limited-feature-set, which can be plugged into Spark/Scikit-Learn/whatever, and can either spawn a web service or export your model to a hosted solution named Combust.ml.

MLDB is yet-another-ML-library-with-limited-feature-set, completely outside of the Python/Spark ecosystem, but claims full integration with TensorFlow -- including the ability to import existing Deep Learning models and tweak them for different uses.
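
As for the fold-in step itself, which the question asks about, it does not strictly need a model server. With the item factors held fixed, a new user's vector is the solution of one small regularized least-squares problem. The following is a minimal sketch under stated assumptions, not Spark's own code: fold_in_user and recommend are hypothetical helper names, the item factor matrix is assumed to have been collected from model.productFeatures() into a NumPy array (random data stands in for it here so the example runs without Spark), and reg and alpha are assumed to match the values used in ALS.trainImplicit. Spark's internal solver may scale the regularization term differently, so the scores should be treated as an approximation.

```python
import numpy as np


def fold_in_user(item_factors, item_idx, strengths, reg=0.01, alpha=0.01):
    """Solve for one new user's factor vector with the item factors fixed.

    Implicit-feedback least-squares update:
        x_u = (Y^T C_u Y + reg * I)^-1  Y^T C_u p_u
    with C_u = diag(1 + alpha * r_ui) and p_ui = 1 for the selected items.
    """
    Y = np.asarray(item_factors, dtype=float)
    rank = Y.shape[1]
    Y_sel = Y[np.asarray(item_idx, dtype=int)]          # rows of selected items
    conf = 1.0 + alpha * np.asarray(strengths, dtype=float)
    # Y^T C_u Y = Y^T Y + Y_sel^T diag(conf - 1) Y_sel (unselected items have conf 1)
    A = Y.T @ Y + (Y_sel.T * (conf - 1.0)) @ Y_sel + reg * np.eye(rank)
    b = Y_sel.T @ conf                                  # Y^T C_u p_u
    return np.linalg.solve(A, b)


def recommend(item_factors, user_vec, seen_idx, n=5):
    """Score every item against the user vector and return the top n unseen."""
    scores = np.asarray(item_factors, dtype=float) @ user_vec
    scores[np.asarray(seen_idx, dtype=int)] = -np.inf   # never re-recommend seen items
    top = np.argsort(-scores)[:n]
    return [(int(i), float(scores[i])) for i in top]


# Stand-in for the trained item factors (200 items, rank 10); in a real
# service this matrix would come from the trained model instead.
rng = np.random.default_rng(0)
Y = rng.normal(size=(200, 10))

selected = [3, 17, 42]         # row indices of the items the anonymous user clicked
strengths = [1.0, 1.0, 1.0]    # one implicit interaction each

x_new = fold_in_user(Y, selected, strengths)
recs = recommend(Y, x_new, selected, n=5)
print(recs)
```

Because only a rank-by-rank system is solved, this can be recomputed on every click. Y.T @ Y depends only on the trained model, so a service could cache it once after training rather than rebuilding it per request.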