What is a better way to sort by a 5 star rating?


Question

I'm trying to sort a bunch of products by customer ratings using a 5 star system. The site I'm setting this up for does not have a lot of ratings and continues to add new products, so it will usually have a few products with a low number of ratings.

I tried using the average star rating, but that algorithm fails when there is a small number of ratings.

For example, a product that has 3x 5 star ratings would rank higher than a product that has 100x 5 star ratings and 2x 2 star ratings.
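The numbers in this example can be checked with a short Python sketch. The variable names are illustrative only and are not part of the original question.

```python
from statistics import mean

# Product A: three 5-star ratings.
product_a = [5] * 3
# Product B: one hundred 5-star ratings and two 2-star ratings.
product_b = [5] * 100 + [2] * 2

avg_a = mean(product_a)  # exactly 5
avg_b = mean(product_b)  # 504 / 102, a little under 5

# Sorting by the plain average puts A ahead of B,
# even though B has 34 times as many ratings.
ranked = sorted(
    [("A", avg_a), ("B", avg_b)],
    key=lambda pair: pair[1],
    reverse=True,
)
```

Here `ranked` lists A first, which is the behaviour the question objects to.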

Shouldn't the second product show up higher, since its larger number of ratings makes it statistically more trustworthy?

Recommended answer

Prior to 2015, the Internet Movie Database (IMDb) publicly listed the formula used to rank its Top 250 movies list. To quote:

The formula for calculating the Top Rated 250 Titles gives a true Bayesian estimate:

weighted rating (WR) = (v ÷ (v+m)) × R + (m ÷ (v+m)) × C

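Where:

  • R = average rating for the movie (mean)
  • v = number of votes for the movie
  • m = minimum votes required to be listed in the Top 250 (currently 25000)
  • C = the mean vote across the whole report (currently 7.0)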

For the Top 250, only votes from regular voters are considered.

It's not so hard to understand. The formula is:

rating = (v / (v + m)) * R +
         (m / (v + m)) * C;

Which can be mathematically simplified to:

rating = (R * v + C * m) / (v + m);
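The two forms are equal because both terms share the denominator v + m. As a minimal sketch, assuming m > 0 so the denominator is never zero, both forms can be written as Python functions and compared. The function names are hypothetical.

```python
import math


def weighted_rating(R: float, v: int, m: float, C: float) -> float:
    """Two-term form: blend the item's own mean R with the global mean C."""
    return (v / (v + m)) * R + (m / (v + m)) * C


def weighted_rating_simplified(R: float, v: int, m: float, C: float) -> float:
    """Single-fraction form of the same expression."""
    return (R * v + C * m) / (v + m)


# Illustrative inputs only: 10 votes averaging 4.0, m = 5, global mean 3.0.
two_term = weighted_rating(4.0, 10, 5, 3.0)
single_fraction = weighted_rating_simplified(4.0, 10, 5, 3.0)
same = math.isclose(two_term, single_fraction)
```

With these inputs both functions evaluate (40 + 15) / 15, so `same` is true up to floating-point rounding.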

The variables are:

  • R – The item's own rating. R is the average of the item's votes. (For example, if an item has no votes, its R is 0. If someone gives it 5 stars, R becomes 5. If someone else gives it 1 star, R becomes 3, the average of [1, 5]. And so on.)
  • C – The average item's rating. Find the R of every single item in the database, including the current one, and take the average of them; that is C. (Suppose there are 4 items in the database, and their ratings are [2, 3, 5, 5]. C is 3.75, the average of those numbers.)
  • v – The number of votes for an item. (To give another example, if 5 people have cast votes on an item, v is 5.)
  • m – The tunable parameter. The amount of "smoothing" applied to the rating is based on the number of votes (v) in relation to m. Adjust m until the results satisfy you. And don't misinterpret IMDb's description of m as "minimum votes required to be listed" – this system is perfectly capable of ranking items with fewer votes than m.
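As a rough sketch of how R, v and C might be derived from raw votes, the following assumes the votes are available as a dictionary mapping each item to a list of star values. That data layout and the helper names are assumptions for illustration, not something the answer specifies.

```python
def item_mean(votes: list[int]) -> float:
    """R: the mean of an item's votes, or 0 when it has none (as described above)."""
    return sum(votes) / len(votes) if votes else 0.0


def global_mean(votes_by_item: dict[str, list[int]]) -> float:
    """C: the mean of every item's R, including items without votes."""
    item_means = [item_mean(votes) for votes in votes_by_item.values()]
    return sum(item_means) / len(item_means)


# Hypothetical data chosen so that the per-item means are [2, 3, 5, 5].
votes_by_item = {
    "item_a": [1, 3],
    "item_b": [3],
    "item_c": [5, 5, 5],
    "item_d": [5],
}

R_values = {name: item_mean(votes) for name, votes in votes_by_item.items()}
v_values = {name: len(votes) for name, votes in votes_by_item.items()}
C = global_mean(votes_by_item)
```

With this data `C` comes out as 3.75, matching the worked example in the list. m is not computed from the data at all; it is chosen by hand.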

All the formula does is: add m imaginary votes, each with a value of C, before calculating the average. In the beginning, when there isn't enough data (i.e. the number of votes is dramatically less than m), this causes the blanks to be filled in with average data. However, as votes accumulate, the imaginary votes will eventually be drowned out by real ones.
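The "imaginary votes" reading can be checked directly. The sketch below assumes m is a whole number of at least 1, so the padded list is never empty, and uses made-up values m = 10 and C = 3.5.

```python
import math
from statistics import mean


def rating_with_imaginary_votes(votes: list[float], m: int, C: float) -> float:
    """Append m imaginary votes of value C, then take a plain average."""
    return mean(list(votes) + [C] * m)


def rating_from_formula(votes: list[float], m: int, C: float) -> float:
    """The same quantity via rating = (R * v + C * m) / (v + m)."""
    v = len(votes)
    R = mean(votes) if votes else 0.0
    return (R * v + C * m) / (v + m)


m, C = 10, 3.5  # illustrative values only

few_votes = [5, 5, 5]    # far fewer than m: the result stays near C
many_votes = [5] * 1000  # far more than m: the result approaches 5

few = rating_with_imaginary_votes(few_votes, m, C)
many = rating_with_imaginary_votes(many_votes, m, C)
```

Both functions agree on each input. With three real votes the rating is 50 / 13, still close to C, while with a thousand real votes it is 5035 / 1010, close to 5.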

In this system, votes don't cause the rating to fluctuate wildly. Instead, they merely perturb it a bit in some direction.
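To illustrate the damping, the following sketch feeds a short, made-up sequence of votes to one item and records both the plain average and the smoothed rating after each vote. The values m = 10 and C = 3.5 are assumptions chosen for the example.

```python
def smoothed(votes: list[int], m: float, C: float) -> float:
    """(R * v + C * m) / (v + m), using sum(votes) in place of R * v."""
    return (sum(votes) + C * m) / (len(votes) + m)


m, C = 10, 3.5           # illustrative values only
incoming = [1, 5, 5, 2]  # hypothetical votes, in arrival order

seen: list[int] = []
raw_history: list[float] = []
smoothed_history: list[float] = []
for vote in incoming:
    seen.append(vote)
    raw_history.append(sum(seen) / len(seen))
    smoothed_history.append(smoothed(seen, m, C))

raw_spread = max(raw_history) - min(raw_history)
smoothed_spread = max(smoothed_history) - min(smoothed_history)
```

Over these four votes the plain average swings between 1.0 and about 3.67, while the smoothed rating never moves more than about 0.23 away from C.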

When there are zero votes, only imaginary votes exist, and all of them are C. Thus, each item begins with a rating of C.
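Applied to the two products from the question plus a brand-new product with no votes, a minimal sketch looks like this. The values m = 10 and C = 3.5 are assumed for illustration, since the question gives neither.

```python
def bayesian_rating(R: float, v: int, m: float, C: float) -> float:
    """rating = (R * v + C * m) / (v + m)"""
    return (R * v + C * m) / (v + m)


m, C = 10, 3.5  # illustrative values only

# (name, R, v) for each product.
products = [
    ("new", 0.0, 0),        # no votes yet, so R is 0
    ("A", 5.0, 3),          # 3x 5 stars
    ("B", 504 / 102, 102),  # 100x 5 stars and 2x 2 stars
]

scores = {name: bayesian_rating(R, v, m, C) for name, R, v in products}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Under these assumptions the unrated product scores exactly C, and B (about 4.81) now ranks above A (about 3.85), which is the ordering the question asks for. A different choice of m or C would change the scores, so these figures are not a general result.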

See also:

  • A demo. Click "Solve".
  • Another explanation of IMDb's system.
  • An explanation of a similar Bayesian star-rating system.
