Recommendation System for a book store application

Question

Hey, I'm trying to learn some of the recommendation algorithms that are used on websites like Amazon.com. I have a simple Java (Spring, Hibernate, Postgres) book store application in which a Book has the attributes title, category, tags and author. For simplicity, there's no content inside the book; a book is identified by its title, category, author and tags. For each user logging into the application, I should be able to recommend some books. Each user can view a book, add it to the cart and buy it at any time. So in the database I'm storing how many times each user looked at a book, the books in their cart and the books the user has bought. At the moment there's no rating option, but that can be added too.

So can someone tell me which algorithms I could use to demonstrate book recommendations for each user? I want to keep it really simple. It's not a project to sell but only to expand my knowledge of recommendation algorithms. Assume there are only about 30 books in total (5 categories with 6 books in each). It would be really helpful if someone could also tell me which attributes I should use to calculate similarities between two users, and how to go about it with the recommended algorithms.

Thanks in advance. SerotoninChase.

Solution

As a concrete example, one option is a "nearest K neighbours" algorithm.

To simplify things, imagine you only had ten books, and you were only tracking how many times each user viewed each book. Then, for each user, you might have an array int timesViewed[10], where the value of timesViewed[i] is the number of times the user has viewed book number i.

You can then compare the user to each of the other users using a correlation function, such as the Pearson correlation. Computing the correlation between the current user c and another user o gives a value between -1.0 and 1.0, where -1.0 means "this user c is the complete opposite of the other user o", and 1.0 means "this user c has the same viewing pattern as the other user o".
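To make the correlation step concrete, here is a minimal sketch of the Pearson correlation over two timesViewed arrays. The class name PearsonSimilarity and the method name pearson are made up for illustration, and returning 0.0 when one user's counts have no variation (where the correlation is undefined) is an assumption of this sketch, not something the algorithm prescribes.

```java
/** Illustrative helper; the class and method names are assumptions. */
final class PearsonSimilarity {

    /**
     * Pearson correlation between two users' view counts, in [-1.0, 1.0].
     * Returns 0.0 when either array has zero variance, because the
     * correlation is undefined there (a fallback chosen for this sketch).
     */
    static double pearson(int[] a, int[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        int n = a.length;

        // Mean view count of each user across all books.
        double meanA = 0.0;
        double meanB = 0.0;
        for (int i = 0; i < n; i++) {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= n;
        meanB /= n;

        // Covariance and variances (unnormalised; the 1/n factors cancel).
        double cov = 0.0;
        double varA = 0.0;
        double varB = 0.0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        if (varA == 0.0 || varB == 0.0) {
            return 0.0;
        }
        return cov / Math.sqrt(varA * varB);
    }
}
```

For instance, PearsonSimilarity.pearson(new int[]{1, 2, 3}, new int[]{2, 4, 6}) gives 1.0, since the second user views every book exactly twice as often as the first.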

If you compute the correlation between c and every other user, you get a list of results showing how similar the user's viewing pattern is to that of each other user. You then pick the K (e.g. 5, 10, 20) most similar results (hence the name of the algorithm), that is, the K users with the correlation scores closest to 1.0.
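Picking the K most similar users could look roughly like the sketch below. It assumes the correlation scores against the current user have already been computed into an array indexed by user; the names NearestNeighbours, topK, similarity and currentUser are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative helper; the class and method names are assumptions. */
final class NearestNeighbours {

    /**
     * Returns the indices of the k users most similar to the current user,
     * most similar first. similarity[u] is the correlation between the
     * current user and user u; the current user's own entry is skipped.
     */
    static List<Integer> topK(double[] similarity, int currentUser, int k) {
        List<Integer> candidates = new ArrayList<>();
        for (int u = 0; u < similarity.length; u++) {
            if (u != currentUser) {
                candidates.add(u);
            }
        }
        // Sort by correlation score, highest (closest to 1.0) first.
        candidates.sort(Comparator.comparingDouble((Integer u) -> similarity[u]).reversed());
        // Copy the leading k entries (or fewer if there are not enough users).
        return new ArrayList<>(candidates.subList(0, Math.min(k, candidates.size())));
    }
}
```

With only around 30 books and a handful of users, sorting the whole list is fine; for a large user base a bounded priority queue would avoid the full sort.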

Now, you can do a weighted average of each of those users' timesViewed arrays. For example, we'll say averageTimesViewed[0] is the average of timesViewed[0] over those K users, weighted by their correlation scores. Then do the same for every other averageTimesViewed[i].
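A sketch of that weighted average is below. It assumes all users' counts are held in a 2D array (one row per user) and that the neighbour indices and correlation scores are already known; the names WeightedAverage, neighbours and similarity are hypothetical. Dividing by the sum of the absolute weights is one common normalisation and is a choice made here, not something the answer specifies.

```java
/** Illustrative helper; the class and method names are assumptions. */
final class WeightedAverage {

    /**
     * Computes averageTimesViewed[i] as the similarity-weighted average of
     * timesViewed[u][i] over the given neighbours u.
     * timesViewed[u] is user u's view-count array; similarity[u] is that
     * user's correlation with the current user.
     */
    static double[] averageTimesViewed(int[][] timesViewed, int[] neighbours, double[] similarity) {
        int bookCount = timesViewed[0].length;
        double[] average = new double[bookCount];
        double totalWeight = 0.0;

        for (int u : neighbours) {
            double weight = similarity[u];
            totalWeight += Math.abs(weight);
            for (int i = 0; i < bookCount; i++) {
                average[i] += weight * timesViewed[u][i];
            }
        }
        // Normalise; if every weight is zero, the array is left as all zeros.
        if (totalWeight > 0.0) {
            for (int i = 0; i < bookCount; i++) {
                average[i] /= totalWeight;
            }
        }
        return average;
    }
}
```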

Now you have an array averageTimesViewed which contains, roughly speaking, the average number of times the K users with the most similar viewing patterns to c have viewed each book. Recommend the book which has the highest averageTimesViewed score, since this is the book those similar users have shown the most interest in.

It's usually worth also excluding books the user has already viewed from being recommended, but it is still important to keep those accounted for when computing similarity/correlation.
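The final selection, with already-viewed books filtered out, might be sketched like this. The names Recommender and recommend are hypothetical, and returning -1 when the user has viewed everything is just a convention chosen for the example. Note that the filter is applied only at this last step; the viewed books still take part in the similarity computation.

```java
/** Illustrative helper; the class and method names are assumptions. */
final class Recommender {

    /**
     * Returns the index of the book with the highest averageTimesViewed score
     * among the books the current user has not viewed yet, or -1 if the user
     * has already viewed every book.
     */
    static int recommend(double[] averageTimesViewed, int[] currentUserTimesViewed) {
        int best = -1;
        for (int i = 0; i < averageTimesViewed.length; i++) {
            if (currentUserTimesViewed[i] > 0) {
                continue; // already viewed, so not worth recommending
            }
            if (best == -1 || averageTimesViewed[i] > averageTimesViewed[best]) {
                best = i;
            }
        }
        return best;
    }
}
```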

Also note that this can be trivially extended to take other data into account (such as cart lists). You can also select all users if you want (i.e. K = number of users), but that doesn't always produce meaningful results; picking a reasonably small K is usually sufficient for good results and is quicker to compute.
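One possible way to fold in the cart and purchase data is to collapse all three signals into a single interest score per book, and then use that score vector in place of the raw view counts (with a correlation function that accepts doubles). The sketch below assumes this approach; the class name InterestScore and the three weights are arbitrary values picked for illustration and would need tuning.

```java
/** Illustrative helper; the class name, method name and weights are assumptions. */
final class InterestScore {

    // Hypothetical weights: a purchase counts for more than a cart entry,
    // which counts for more than a single view.
    static final double VIEW_WEIGHT = 1.0;
    static final double CART_WEIGHT = 3.0;
    static final double PURCHASE_WEIGHT = 5.0;

    /**
     * Combines one user's view counts, cart contents and purchases into a
     * single score per book. All three arrays are indexed by book number.
     */
    static double[] combine(int[] timesViewed, boolean[] inCart, boolean[] purchased) {
        double[] score = new double[timesViewed.length];
        for (int i = 0; i < timesViewed.length; i++) {
            score[i] = VIEW_WEIGHT * timesViewed[i]
                    + (inCart[i] ? CART_WEIGHT : 0.0)
                    + (purchased[i] ? PURCHASE_WEIGHT : 0.0);
        }
        return score;
    }
}
```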
