What is a better way to sort by a 5 star rating?

Problem description

I'm trying to sort a bunch of products by customer ratings using a 5 star system. The site I'm setting this up for does not have a lot of ratings and continues to add new products, so it will usually have a few products with a low number of ratings.

I tried using average star rating but that algorithm fails when there is a small number of ratings.

For example, a product that has 3x 5 star ratings would show up better than a product that has 100x 5 star ratings and 2x 2 star ratings.

Shouldn't the second product show up higher, since its larger number of ratings makes it statistically more trustworthy?

Recommended answer

Prior to 2015, the Internet Movie Database (IMDb) publicly listed the formula used to rank their Top 250 movies list. To quote:

The formula for calculating the Top Rated 250 Titles gives a true Bayesian estimate:

weighted rating (WR) = (v ÷ (v+m)) × R + (m ÷ (v+m)) × C

where:

  • R = average for the movie (mean)
  • v = number of votes for the movie
  • m = minimum votes required to be listed in the Top 250 (currently 25000)
  • C = the mean vote across the whole report (currently 7.0)

For the Top 250, only votes from regular voters are considered.

It's not so hard to understand. The formula is:

rating = (v / (v + m)) * R +
         (m / (v + m)) * C;

which can be mathematically simplified to:

rating = (R * v + C * m) / (v + m);

The variables are:

  • R – The item's own rating. R is the average of the item's votes. (For example, if an item has no votes, its R is 0. If someone gives it 5 stars, R becomes 5. If someone else gives it 1 star, R becomes 3, the average of [1, 5]. And so on.)
  • C – The average item's rating. Find the R of every single item in the database, including the current one, and take the average of them; that is C. (Suppose there are 4 items in the database, and their ratings are [2, 3, 5, 5]. C is 3.75, the average of those numbers.)
  • v – The number of votes for an item. (To give another example, if 5 people have cast votes on an item, v is 5.)
  • m – The tuneable parameter. The amount of "smoothing" applied to the rating is based on the number of votes (v) in relation to m. Adjust m until the results satisfy you. Don't misinterpret IMDb's description of m as "minimum votes required to be listed"; this system is perfectly capable of ranking items with fewer votes than m.
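As a rough illustration of how these variables fit together, here is a minimal Python sketch applied to the two products from the question. The function name `bayesian_rating`, the value `C = 3.75` (borrowed from the four-item example above) and `m = 10` are assumptions chosen for illustration, as is the requirement that m be positive; none of them come from IMDb or from a real catalogue.

```python
def bayesian_rating(votes, prior_mean, m):
    """Smoothed rating (R * v + C * m) / (v + m) for one item.

    votes      -- list of star values cast on the item (may be empty)
    prior_mean -- C, the mean rating across all items
    m          -- smoothing weight; assumed positive in this sketch
    """
    if m <= 0:
        raise ValueError("m must be positive")
    v = len(votes)
    # R * v is simply the sum of the item's votes, so R never has to be
    # computed on its own and an item with no votes needs no special case.
    return (sum(votes) + prior_mean * m) / (v + m)


# Illustrative inputs only (assumptions, not real data).
C = 3.75  # assumed site-wide mean rating
m = 10    # assumed smoothing weight; tune until the ordering looks right

products = {
    "few_votes": [5] * 3,               # 3x 5 star
    "many_votes": [5] * 100 + [2] * 2,  # 100x 5 star, 2x 2 star
}

# Sort product names from highest to lowest smoothed rating.
ranked = sorted(
    products,
    key=lambda name: bayesian_rating(products[name], C, m),
    reverse=True,
)
print(ranked)
```

With these assumed values the product with 102 votes scores about 4.83 and the one with 3 votes about 4.04, so the heavily rated product sorts first even though its plain average (about 4.94) is below 5.0.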

All the formula does is: add m imaginary votes, each with a value of C, before calculating the average. In the beginning, when there isn't enough data (i.e. the number of votes is dramatically less than m), this causes the blanks to be filled in with average data. However, as votes accumulate, eventually the imaginary votes will be drowned out by real ones.
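The "imaginary votes" reading can be checked directly. The sketch below literally appends m votes of value C to the list and averages, then compares the result with the closed-form expression. The helper names and the values `C = 3.75` and `m = 10` are hypothetical, and the padding version assumes m is a whole number of at least 1.

```python
import math


def average_with_imaginary_votes(votes, prior_mean, m):
    """Append m imaginary votes worth prior_mean each, then take a plain mean."""
    if m < 1:
        raise ValueError("this sketch needs at least one imaginary vote")
    padded = list(votes) + [prior_mean] * m
    return sum(padded) / len(padded)


def weighted_rating(votes, prior_mean, m):
    """Closed form: (R * v + C * m) / (v + m), with R * v written as sum(votes)."""
    return (sum(votes) + prior_mean * m) / (len(votes) + m)


C, m = 3.75, 10  # assumed values, for illustration only

for n in (1, 10, 1000):
    votes = [5] * n  # n real five-star votes
    padded = average_with_imaginary_votes(votes, C, m)
    formula = weighted_rating(votes, C, m)
    # Both routes should agree up to floating-point rounding.
    assert math.isclose(padded, formula)
    print(n, round(formula, 3))
```

Under these assumptions the rating is roughly 3.86 after one five-star vote, 4.38 after ten, and 4.99 after a thousand, which is the "drowning out" described above.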

In this system, votes don't cause the rating to fluctuate wildly. Instead, they merely perturb it a bit in some direction.
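To see how much smaller the swings are, the following sketch compares the effect of a single 1-star vote on a plain average and on the smoothed rating. The item's votes, `C = 3.75` and `m = 10` are made-up values, and the function names are hypothetical.

```python
def plain_average(votes):
    """Ordinary mean of the votes (undefined for an empty list)."""
    return sum(votes) / len(votes)


def smoothed_rating(votes, prior_mean, m):
    """(R * v + C * m) / (v + m), with R * v written as sum(votes)."""
    return (sum(votes) + prior_mean * m) / (len(votes) + m)


C, m = 3.75, 10     # assumed values, for illustration only

before = [5, 5]     # an item with two 5 star votes
after = before + [1]  # the same item after one 1 star vote arrives

# How far each score moves because of that one extra vote.
plain_shift = abs(plain_average(after) - plain_average(before))
smoothed_shift = abs(
    smoothed_rating(after, C, m) - smoothed_rating(before, C, m)
)

print(round(plain_shift, 2), round(smoothed_shift, 2))
```

With these numbers the plain average drops by about 1.33 stars, while the smoothed rating moves by about 0.23. A larger m would damp the shift further, at the cost of keeping new items close to C for longer.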

When there are zero votes, only imaginary votes exist, and all of them are C. Thus, each item begins with a rating of C.

See also:

  • A demo. Click "Solve".
  • Another explanation of IMDb's system.
  • An explanation of a similar Bayesian star-rating system.
