SQL等级百分位数 [英] SQL rank percentile

查看:156
本文介绍了SQL等级百分位数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我进行了一个SQL查询,对页面进行了多少次查看.例如,

I've made an SQL query which rank pages by how many times they have been viewed. For instance,

╔══════╦═══════╗
║ PAGE ║ VIEWS ║
╠══════╬═══════╣
║ J    ║   100 ║
║ Q    ║    77 ║
║ 3    ║    55 ║
║ A    ║    23 ║
║ 2    ║     6 ║
╚══════╩═══════╝

现在我想做的是使用SQL查询找到每个页面的百分位等级.我想使用的数学运算非常简单,我只想将已经生成的表的行数除以总行数即可.或1减去此值,具体取决于我的兴趣.

Now what I would like to do is find the percentile rank of each page using an SQL query. The math I would like to use for this is simple enough, I just want to take the row number of the already generated table divided by the total number of rows. Or 1 minus this value, depending on my interests.

我可以像这样在已经生成的表上做一个COUNT(页)吗?我意识到这就是我将获得行总数的方法.但是是否有任何命令返回行号?

Can I do a COUNT(pages) on an already generated table like this? I realize that's how I will get the total number of rows. But are there any commands to return a row number?

为了进一步澄清我的问题,我需要以下结果

Just to further clarify my question I need the following results

╔══════╦════════════════╗
║ PAGE ║ Percentile     ║
╠══════╬════════════════╣
║ J    ║   (1-1/5)*100  ║
║ Q    ║   (1-2/5)*100  ║
║ 3    ║   (1-3/5)*100  ║
║ A    ║   (1-4/5)*100  ║
║ 2    ║   (1-5/5)*100  ║
╚══════╩════════════════╝

或者通常是(1-(行号)/(COUNT(页))* 100

Or in general (1-(row number)/(COUNT(page))*100

推荐答案

您无法在单个SQL语句中计算表中的百分位等级. John Woo此处建议的方法在计算出最高排名之后便分崩离析,即使结果对于正在处理的表的第一个(不可预测的)百分比(这意味着前几个百分位数)看起来确实不错.

You cannot calculate percentile ranks across a table in a single SQL statement. The approach suggested by John Woo here falls apart after the top ranks are calculated, even though the results do look good for the first (unpredictable) percent of the table being processed, meaning the top few percentiles.

Oracle Ace Roland Bouman在这篇文章中对此原因进行了解释: http://rpbouman.blogspot.com/2009/09/mysql -another-ranking-trick.html

The reason why is explained in this post by Oracle Ace Roland Bouman: http://rpbouman.blogspot.com/2009/09/mysql-another-ranking-trick.html

简而言之:用户定义的变量并不是为了在单个SQL语句中可靠而设计的,而不能仅在多个SQL语句之间可靠.

In short: user-defined variables are not designed to be reliable within a single SQL statement, only across multiple SQL statements.

阅读MySQL手册中有关用户定义变量的第一句话: http://dev.mysql.com/doc/refman/5.5/en/user-variables.html 您可以在一个语句中将值存储在用户定义的变量中,然后在另一条语句中引用它."

Read the first sentence of the MySQL manual about User-Defined Variables: http://dev.mysql.com/doc/refman/5.5/en/user-variables.html "You can store a value in a user-defined variable in one statement and then refer to it later in another statement."

然后在第十段中看到以下清晰的语句:作为一般规则,除了SET语句外,永远不要为用户变量分配值,并且不要在同一语句中读取该值.涉及用户变量的表达式的求值顺序不确定."

Then in about the 10th paragraph see this clear statement: "As a general rule, other than in SET statements, you should never assign a value to a user variable and read the value within the same statement. [. . .] the order of evaluation for expressions involving user variables is undefined. "

这篇关于SQL等级百分位数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆