Collaborative Filtering: Non-Personalized item-to-item similarity

Views: 220 | Published: 2015/11/30 16:01:20 | Tags: python, algorithm, similarity, recommendation-engine, collaborative-filtering


Problem description

I'm trying to compute item-to-item similarity along the lines of Amazon's "Customers who viewed/purchased X have also viewed/purchased Y and Z". All of the examples and references I've seen are either for computing item similarity for ranked items, for finding user-user similarity, or for finding recommended items based on the current user's history. I'd like to start with a non-targeted approach before factoring in the current user's preferences.

According to the Amazon.com recommendations white paper, they use the following logic for offline item-item similarity:

For each item in product catalog, I1 
  For each customer C who purchased I1
    For each item I2 purchased by customer C
       Record that a customer purchased I1 and I2
  For each item I2 
    Compute the similarity between I1 and I2

If I understand correctly, by the time we reach "Compute the similarity between I1 and I2", I have a list of items (I2) purchased in conjunction with a single item I1 (from the outer loop).

How is this calculation performed?
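For illustration, here is a minimal Python sketch of one way to read the loop above. It assumes the purchase data arrives as (customer, item) pairs, and it assumes cosine similarity over 0/1 purchase vectors as the similarity measure, which for binary data reduces to the co-purchase count divided by the square root of the product of the two items' buyer counts. The function name `item_similarities` and the sample data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(purchases):
    """Return {(i1, i2): cosine similarity} for items bought together.

    `purchases` is an iterable of (customer_id, item_id) pairs
    (an assumed input format, not taken from the white paper).
    """
    items_by_customer = defaultdict(set)
    customers_by_item = defaultdict(set)
    for customer, item in purchases:
        items_by_customer[customer].add(item)
        customers_by_item[item].add(customer)

    similarity = {}
    for i1, buyers in customers_by_item.items():    # for each item I1
        co_counts = defaultdict(int)
        for customer in buyers:                      # each customer C who purchased I1
            for i2 in items_by_customer[customer]:   # each item I2 purchased by C
                if i2 != i1:
                    co_counts[i2] += 1               # record that a customer bought I1 and I2
        for i2, both in co_counts.items():           # similarity between I1 and I2
            norm = sqrt(len(buyers) * len(customers_by_item[i2]))
            similarity[(i1, i2)] = both / norm
    return similarity


# Hypothetical sample data.
purchases = [
    ("alice", "X"), ("alice", "Y"),
    ("bob", "X"), ("bob", "Y"), ("bob", "Z"),
    ("carol", "X"),
]
sims = item_similarities(purchases)
```

Only item pairs that share at least one customer get an entry, so the inner work scales with actual co-purchases rather than with the square of the catalog size.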

Another possibility is that I'm overthinking this and making it more difficult than it needs to be. Would it be enough to do a top-n query on the count of I2 bought in conjunction with I1?
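As a rough sketch of what such a top-n query could look like, the example below uses Python's built-in `sqlite3` module with an in-memory database. The table `purchases(customer_id, item_id)`, the helper `also_bought`, and the sample rows are all hypothetical. The SQL is a plain self-join; SQLite's `LIMIT` clause would need to be swapped for the equivalent in Oracle (for example `FETCH FIRST ... ROWS ONLY` in 12c and later).

```python
import sqlite3

# In-memory database with an assumed purchases(customer_id, item_id) table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer_id TEXT, item_id TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [
        ("alice", "X"), ("alice", "Y"),
        ("bob", "X"), ("bob", "Y"), ("bob", "Z"),
        ("carol", "X"),
    ],
)

# Self-join: a = rows for the item being viewed, b = other items
# bought by the same customers. DISTINCT guards against repeat purchases.
TOP_N_SQL = """
SELECT b.item_id, COUNT(DISTINCT b.customer_id) AS buyers
FROM purchases a
JOIN purchases b
  ON a.customer_id = b.customer_id
 AND a.item_id <> b.item_id
WHERE a.item_id = ?
GROUP BY b.item_id
ORDER BY buyers DESC, b.item_id
LIMIT ?
"""


def also_bought(conn, item_id, n):
    """Return up to n (item_id, buyer_count) rows co-purchased with item_id."""
    return conn.execute(TOP_N_SQL, (item_id, n)).fetchall()


top = also_bought(conn, "X", 2)
```

A raw count like this tends to favor items that sell well overall, since popular items co-occur with almost everything; a normalized similarity measure is one way to correct for that.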

I'd also appreciate suggestions on whether or not this approach is a correct one. My product database has about 150k items at any time. Since the bulk of the reading material I've seen shows user-item similarity or even user-user similarity, should I be looking to go that route instead?

I've worked with similarity algorithms in the past, but they've always involved a rank or a score. I think the only way this would work would be to build a customer-product matrix scoring 0/1 for not purchased/purchased. Given the size of the purchase history and the number of items, this could get really large.
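On the size concern, a 0/1 matrix does not have to be materialized in full. The sketch below, under the assumption that purchases are available as (customer, item) pairs, stores only the "1" cells as a set of customers per item; the intersection size of two sets equals the dot product of the corresponding 0/1 columns. The helper names and sample data are hypothetical.

```python
from collections import defaultdict


def build_item_index(purchases):
    """Map each item to the set of customers who bought it (the 1-cells only)."""
    customers_by_item = defaultdict(set)
    for customer, item in purchases:
        customers_by_item[item].add(customer)
    return dict(customers_by_item)


def co_purchase_count(index, i1, i2):
    """Number of customers who bought both items.

    Equivalent to the dot product of the two 0/1 item columns.
    """
    return len(index.get(i1, set()) & index.get(i2, set()))


# Hypothetical sample data.
purchases = [
    ("alice", "X"), ("alice", "Y"),
    ("bob", "X"), ("bob", "Y"), ("bob", "Z"),
    ("carol", "X"),
    ("dave", "W"),
]
index = build_item_index(purchases)

customers = {customer for customer, _ in purchases}
dense_cells = len(customers) * len(index)                     # full 0/1 matrix
stored_entries = sum(len(buyers) for buyers in index.values())  # sparse form
```

The dense form grows with customers times items, while the sparse form grows only with the number of distinct purchases, which matters with roughly 150k items in the catalog.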

Edit: although I listed python as a tag, I'd prefer to keep the logic inside the database, preferably using Oracle PL/SQL.
