Get wrong recommendation with ALS.recommendation

Question

I wrote a Spark program for making recommendations, using the ALS.recommendation library. I ran a small test with the following dataset, called trainData:

(u1, m1, 1)
(u1, m4, 1)
(u2, m2, 1)
(u2, m3, 1)
(u3, m1, 1)
(u3, m3, 1)
(u3, m4, 1)
(u4, m3, 1)
(u4, m4, 1)
(u5, m2, 1)
(u5, m4, 1)

The first column contains the user, the second the item rated by that user, and the third the rating.

In my code, written in Scala, I trained the model using:

// arguments: ratings, rank = 3, iterations = 5, lambda = 0.01, alpha = 1.0
val myModel = ALS.trainImplicit(trainData, 3, 5, 0.01, 1.0)

I tried to retrieve some recommendations for u1 using this instruction:

val recommendations = myModel.recommendProducts(idUser, 2)

where idUser contains the ID assigned to user u1. As recommendations, I obtain:

(u1, m1, 1.0536233346170754)
(u1, m4, 0.8540954252858661)
(u1, m3, 0.09069877419040584)
(u1, m2, -0.1345521479521654)

As you can see, the first two lines show that the recommended items are the ones u1 had already rated (m1 and m4). Whichever user I select, I always get the same behavior: the first items recommended are the ones that user already rated.

I find it weird! Is there a problem somewhere?

Recommended answer

I think that is the expected behaviour of recommendProducts. When you train a matrix factorization algorithm such as ALS, you are attempting to find a rating that relates each user to each item.

ALS does this based on the items the user has already rated. When you ask for recommendations for a given user, the model is most sure about the ratings it has already seen, so most of the time it will recommend products the user has already rated.

What you need to do is keep a list of the products each user has rated and filter them out when making the recommendations.
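The filtering step described above can be sketched as follows. This is a minimal sketch, not Spark's own API: `dropSeen` is a hypothetical helper, the items m1..m4 are assumed to be mapped to the integer IDs 1..4, and the wiring to Spark shown in the comments assumes `trainData` is an `RDD[Rating]` and `model` is the trained `MatrixFactorizationModel`.

```scala
object FilterSeen {
  // Hypothetical helper (not part of Spark): keep the top-n items the user
  // has not rated. `ranked` must already be sorted by score, best first.
  def dropSeen(ranked: Seq[(Int, Double)], seen: Set[Int], n: Int): Seq[(Int, Double)] =
    ranked.filterNot { case (product, _) => seen.contains(product) }.take(n)

  def main(args: Array[String]): Unit = {
    // Scores for u1 taken from the question; ids 1..4 stand for m1..m4 (assumed mapping).
    val ranked = Seq(
      (1, 1.0536233346170754),
      (4, 0.8540954252858661),
      (3, 0.09069877419040584),
      (2, -0.1345521479521654)
    )
    val seen = Set(1, 4) // u1 already rated m1 and m4
    dropSeen(ranked, seen, 2).foreach(println)
  }

  // Assumed wiring with Spark MLlib (requires Spark on the classpath):
  //   val seen   = trainData.filter(_.user == userId).map(_.product).collect().toSet
  //   val ranked = model.recommendProducts(userId, n + seen.size)
  //                     .map(r => (r.product, r.rating)).toSeq
  //   dropSeen(ranked, seen, n)
}
```

Asking the model for `n + seen.size` items guarantees that at least `n` candidates remain after the rated ones are dropped, as long as the catalogue has that many unrated items. For u1 in the question's data, only m3 and m2 are left once m1 and m4 are removed.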

I dug a bit into the source code and the documentation to be sure of what I was saying.

recommendProducts is implemented in the class MatrixFactorizationModel (source code). You can see there that, when making recommendations, the model doesn't check whether the user has already rated an item.
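To illustrate why already-rated items come out on top, here is a small sketch of the idea behind a matrix factorization model: each user and each item has a latent factor vector, and the predicted score is their dot product. The factor values below are made up for illustration and are not taken from the trained model in the question; `score` and `rankAll` are hypothetical names, not Spark functions.

```scala
object FactorScores {
  // Predicted score: dot product of a user factor vector and an item factor vector.
  def score(user: Seq[Double], item: Seq[Double]): Double =
    user.zip(item).map { case (u, i) => u * i }.sum

  // Rank every item by score, best first. Nothing here excludes items
  // the user has already rated.
  def rankAll(user: Seq[Double], items: Map[String, Seq[Double]]): Seq[(String, Double)] =
    items.toSeq.map { case (id, factors) => (id, score(user, factors)) }.sortBy(-_._2)

  def main(args: Array[String]): Unit = {
    // Illustrative (invented) factors: m1 and m4 point in nearly the same
    // direction as the user, as one would expect for items the user rated.
    val user = Seq(1.0, 0.0)
    val items = Map(
      "m1" -> Seq(1.0, 0.1),
      "m2" -> Seq(-0.1, 0.9),
      "m3" -> Seq(0.1, 0.8),
      "m4" -> Seq(0.9, 0.2)
    )
    rankAll(user, items).foreach(println)
  }
}
```

Because training pulls the factors of a user and the items they rated toward each other, those items tend to receive the highest scores, and a ranking over all items returns them first.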

You should also note that if you are using implicit ratings, you most definitely do want to recommend products the user has already implicitly rated. Imagine the case where your implicit ratings are page views of a product in an online store, and what you want is for the user to buy the product.

I don't have access to the book Advanced Analytics with Spark, so I can't comment on the explanations and examples there.

Documentation:

  • MatrixFactorizationModel
