"People who watched this also watched" algorithm


Problem description

I am trying to code an algorithm that works a bit like Amazon's "People who bought this also bought".

The difference between the two is that mine only counts the "products" you watched in a single session, while Amazon counts every purchase/checkout.

I am having some difficulty implementing it and figuring out what the algorithm should be.

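  1. So far I am counting, per SessionID, the ProductIDs being watched.
  2. At the end of the day, I have many ProductIDs watched by many SessionIDs.
  3. Now I need to build some kind of cliques in the database: go through the SessionIDs one by one, extract all the products each one has watched, and write them to a DB table of cliques.
  4. Once I have the cliques and a product is being viewed, I scan this table to see which clique it is in and then extract all the other ProductIDs in it.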
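
The four steps above can be sketched in memory as follows. This is only an illustration of the approach as described: the function names `build_cliques` and `also_watched` and the sample view log are assumptions, not part of the question, and a real version would read from and write to the database.

```python
from collections import defaultdict


def build_cliques(views):
    """Steps 1-3: group the viewed product IDs by session (one clique per session)."""
    cliques = defaultdict(set)
    for session_id, product_id in views:
        cliques[session_id].add(product_id)
    return dict(cliques)


def also_watched(cliques, product_id):
    """Step 4: collect every other product from the cliques containing product_id."""
    related = set()
    for products in cliques.values():
        if product_id in products:
            related |= products - {product_id}
    return related


# Hypothetical view log as (session_id, product_id) pairs.
views = [(1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 4)]
cliques = build_cliques(views)
print(sorted(also_watched(cliques, 3)))  # [1, 2]
```

Note that this version returns an unordered set of related products; it does not rank them by how often they were watched together.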

Do you have any references or ideas on whether my algorithm is correct? Is there a better one?

Recommended answer

I was able to achieve your desired result using a simple DB structure and a pretty simple query:

TABLE `exa`

| sesh_id | prod_id |
---------------------
| 1       | 1       |
| 1       | 2       |
| 1       | 3       |
| 1       | 4       |
| 2       | 2       |
| 2       | 3       |
| 2       | 4       |
| 3       | 3       |
| 3       | 4       |
| 4       | 1       |
| 4       | 2       |
| 4       | 5       |

Query

SELECT c.prod_id, COUNT(*)
FROM `exa` a
JOIN `exa` b ON a.prod_id=b.prod_id
JOIN `exa` c ON b.sesh_id=c.sesh_id
WHERE a.`prod_id`=3 AND c.prod_id!=3
GROUP BY c.prod_id
ORDER BY 2 DESC;

Result

| prod_id | COUNT |
-------------------
| 4       | 9     |
| 2       | 6     |
| 1       | 3     |
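
To try the table and query above without setting up a database server, here is a minimal sketch using Python's built-in `sqlite3` module. It assumes SQLite rather than whatever engine the answer was written for, drops the backtick quoting, and wraps the query in a hypothetical helper named `also_viewed`; the rows are the ones from the `exa` table.

```python
import sqlite3

# The (sesh_id, prod_id) rows from the example table.
ROWS = [(1, 1), (1, 2), (1, 3), (1, 4),
        (2, 2), (2, 3), (2, 4),
        (3, 3), (3, 4),
        (4, 1), (4, 2), (4, 5)]

# Same query as in the answer, with the product ID passed as a parameter.
QUERY = """
SELECT c.prod_id, COUNT(*)
FROM exa a
JOIN exa b ON a.prod_id = b.prod_id
JOIN exa c ON b.sesh_id = c.sesh_id
WHERE a.prod_id = ? AND c.prod_id != ?
GROUP BY c.prod_id
ORDER BY 2 DESC
"""


def also_viewed(conn, prod_id):
    """Return (prod_id, count) rows for products viewed alongside prod_id."""
    return conn.execute(QUERY, (prod_id, prod_id)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exa (sesh_id INTEGER, prod_id INTEGER)")
conn.executemany("INSERT INTO exa VALUES (?, ?)", ROWS)

print(also_viewed(conn, 3))  # [(4, 9), (2, 6), (1, 3)]
```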

The idea is that every time a session views a product, a row is inserted into the table (in this case `exa`).

Then, on any particular product view, you can check which other products were viewed by the people who viewed this product, weighted by frequency. In this particular example, everyone who viewed product #3 also viewed product #4, so it comes first in the sort. Product #5 was only viewed by session #4, and session #4 didn't view product #3, so product #5 doesn't appear in the results.
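
One thing to be aware of: in the three-way join, every row of `a` for product #3 pairs with every row of `b` for product #3, so each count is multiplied by the number of rows for that product (3 here). The ranking is unaffected, but the numbers are not session counts. If actual session counts are wanted, a variant along these lines may work. It is a sketch assuming SQLite via Python's `sqlite3`, and the helper name `co_viewed_sessions` is hypothetical.

```python
import sqlite3

# The (sesh_id, prod_id) rows from the example table.
ROWS = [(1, 1), (1, 2), (1, 3), (1, 4),
        (2, 2), (2, 3), (2, 4),
        (3, 3), (3, 4),
        (4, 1), (4, 2), (4, 5)]

# Join each view of the target product directly to the other views in the
# same session, and count distinct sessions so repeat views are not double counted.
QUERY = """
SELECT c.prod_id, COUNT(DISTINCT c.sesh_id) AS sessions
FROM exa a
JOIN exa c ON a.sesh_id = c.sesh_id
WHERE a.prod_id = ? AND c.prod_id != ?
GROUP BY c.prod_id
ORDER BY sessions DESC
"""


def co_viewed_sessions(conn, prod_id):
    """Return (prod_id, number of sessions that also viewed it) rows."""
    return conn.execute(QUERY, (prod_id, prod_id)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exa (sesh_id INTEGER, prod_id INTEGER)")
conn.executemany("INSERT INTO exa VALUES (?, ?)", ROWS)

print(co_viewed_sessions(conn, 3))  # [(4, 3), (2, 2), (1, 1)]
```

Here product #4 scores 3 because all three sessions that viewed product #3 also viewed it, which matches the explanation above.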
