How much data is needed for User Based CF or Item Based CF to give a recommendation?

Question

How much data do User CF and Item CF need to give a recommendation?

I manually created a small dataset so that I can follow how the algorithms work.
I found that on this small dataset Slope-One can give a recommendation, but User CF and Item CF cannot.

What is the reason behind this?
What is the threshold for the amount of data?

Recommended answer

For user-based and item-based CF, the dataset can be really small. What matters is not its total size but how often users and items are mapped to each other in it. For example, if a user appears in the dataset only once, user-based CF most probably will not give recommendations, because a single common item does not provide the threshold similarity that two users need to become neighbors. For a small dataset of around 1000 records, both recommenders will return results from their most-similar-items and recommend methods. For much smaller datasets, however, it is useful to check the data manually to see whether there is enough information about the queried user or item ID. At this link you can find a very small controlled dataset that shows how to create an item-based CF recommender and how it works. I hope this answer is helpful.
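
The neighbor-threshold point above can be illustrated with a minimal sketch. It is not the code of any particular library: the toy ratings, the function names, the choice of Pearson correlation and the `threshold=0.1` default are all assumptions made for illustration. It shows why a user with a single rating gets no neighbors under user-based CF, while a weighted Slope One predictor can still produce estimates from the same data.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}. "carol" has only one rating.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 3, "D": 5},
    "carol": {"A": 2},
    "dave": {"A": 2, "D": 4},
}


def pearson(a, b):
    """Pearson correlation over co-rated items, or None when it is undefined."""
    common = set(a) & set(b)
    if len(common) < 2:
        return None  # one shared item cannot define a correlation
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(
        sum((a[i] - mean_a) ** 2 for i in common)
        * sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else None


def user_based_recommend(user, ratings, threshold=0.1):
    """Predict ratings for unseen items from neighbors above the threshold."""
    mine = ratings[user]
    neighbors = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = pearson(mine, theirs)
        if sim is not None and sim >= threshold:
            neighbors[other] = sim
    totals, weights = {}, {}
    for other, sim in neighbors.items():
        for item, rating in ratings[other].items():
            if item not in mine:
                totals[item] = totals.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    # Similarity-weighted average of the neighbors' ratings
    return {item: totals[item] / weights[item] for item in totals}


def slope_one_recommend(user, ratings):
    """Weighted Slope One: average item-to-item rating differences."""
    mine = ratings[user]
    all_items = {item for r in ratings.values() for item in r}
    predictions = {}
    for target in sorted(all_items - set(mine)):
        num = den = 0.0
        for rated, my_rating in mine.items():
            # Differences from every user who rated both items
            diffs = [r[target] - r[rated] for r in ratings.values()
                     if target in r and rated in r]
            if diffs:
                deviation = sum(diffs) / len(diffs)
                num += (deviation + my_rating) * len(diffs)
                den += len(diffs)
        if den:
            predictions[target] = num / den
    return predictions
```

With this toy data, `user_based_recommend("carol", ratings)` returns an empty dict because carol shares only item A with every other user, so no similarity can be computed and no neighbor passes the threshold. `slope_one_recommend("carol", ratings)` still returns estimates for B, C and D, since it only needs other users who rated both A and the target item.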
