Apache Spark ALS - how to perform live recommendations / fold in an anonymous user

Question

I am using the ALS implementation in Apache Spark MLlib (through the PySpark API) to develop a service that performs live recommendations for anonymous users of my site, that is, users who are not in the training set. In my use case I train the model on the user ratings like this:

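from pyspark.mllib.recommendation import ALS, MatrixFactorizationModel, Rating

# df holds (user, item, rating) rows; if df is a DataFrame on Spark 2.x or later, use df.rdd.map(...) instead
ratings = df.map(lambda l: Rating(int(l[0]), int(l[1]), float(l[2])))
rank = 10           # number of latent factors
numIterations = 10  # number of ALS iterations
model = ALS.trainImplicit(ratings, rank, numIterations)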

Now, each time an anonymous user selects an item in the catalogue, I want to fold that user's vector into the ALS model and get recommendations (just like the recommendProducts() call), without re-training the whole model.

Is there an easy way to fold in a new anonymous user's vector after training the ALS model in Apache Spark?

Thanks in advance.

Recommended answer

I have seen a few open-source "model server" solutions advertised, but I have no hands-on experience with any of them. I have also heard of a commercial offering, but I can't remember its name right now.
So form your own opinion, and keep an eye out for possible alternatives.

PredictionIO (the start-up was acquired by Salesforce, but its solution is still available) uses a Spark + Hadoop + HBase stack, plus some kind of web server component.

MLeap is yet another ML library with a limited feature set. It can be plugged into Spark, Scikit-Learn, or whatever, and it can spawn a web service or export your model to a hosted solution named Combust.ml.

MLDB is yet another ML library with a limited feature set. It sits completely outside the Python/Spark ecosystem, but claims full integration with TensorFlow, including the ability to import existing Deep Learning models and tweak them for different uses.
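
Separately from the model servers listed here, the fold-in that the question asks about can be approximated outside Spark. The sketch below is not an MLlib API and is not taken from the answer: it assumes the standard implicit-feedback ALS user update, solved once with the item factors held fixed. The function names build_item_matrix, fold_in_user and recommend are hypothetical, and the reg and alpha defaults are assumed to match the defaults of ALS.trainImplicit. MLlib's own solver may scale the regularization term differently, so the resulting vector should be treated as an approximation.

```python
import numpy as np


def build_item_matrix(product_features):
    """Turn (item_id, factors) pairs into a sorted id list and an (n_items, rank) matrix.

    Assumption: with MLlib the pairs would come from model.productFeatures().collect().
    """
    pairs = sorted(product_features, key=lambda p: p[0])
    item_ids = [int(item_id) for item_id, _ in pairs]
    factors = np.array([np.asarray(f, dtype=float) for _, f in pairs])
    return item_ids, factors


def fold_in_user(item_factors, selected_rows, reg=0.01, alpha=0.01, strengths=None):
    """Solve one implicit-feedback ALS user update with the item factors held fixed.

    x_u = (Y^T C_u Y + reg * I)^-1 Y^T C_u p_u, where p_ui = 1 for selected items
    (0 otherwise) and the confidence is c_ui = 1 + alpha * r_ui.
    selected_rows are row indices into item_factors, not raw item ids.
    """
    Y = np.asarray(item_factors, dtype=float)
    rank = Y.shape[1]
    rows = np.asarray(selected_rows, dtype=int)
    # Interaction strength per selected item; defaults to 1 for a plain selection.
    r = np.ones(len(rows)) if strengths is None else np.asarray(strengths, dtype=float)
    Y_sel = Y[rows]
    # c_ui - 1 is zero for unselected items, so only selected rows add to Y^T Y.
    A = Y.T @ Y + (Y_sel.T * (alpha * r)) @ Y_sel + reg * np.eye(rank)
    b = Y_sel.T @ (1.0 + alpha * r)
    return np.linalg.solve(A, b)


def recommend(item_ids, item_factors, user_vector, selected_rows, num=10):
    """Score every item against the folded-in vector and return the top unseen ones."""
    scores = np.asarray(item_factors, dtype=float) @ np.asarray(user_vector, dtype=float)
    scores[list(selected_rows)] = -np.inf  # hide items the user already selected
    ranked = [i for i in np.argsort(-scores) if np.isfinite(scores[i])]
    return [(item_ids[i], float(scores[i])) for i in ranked[:num]]
```

In a service, the item matrix would be built once after training and kept in memory. Each selection by an anonymous user then costs one small rank-by-rank linear solve plus one matrix-vector product, with no call back into Spark.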
