Cumulative sum over a set of rows in MySQL

Problem description

I have a complex query (containing multiple joins and unions) that returns a set of rows containing id, day, hr, and amount. The output of the query looks like this:

id day    hr  amount 
1   1      1   10       
1   1      2   25       
1   1      3   30        
1   2      1   10       
1   2      2   40       
1   2      2   30        
2   1      1   10       
2   1      2   15        
2   1      3   30       
2   2      1   10       
2   2      2   20      
2   2      2   30  

I need to find the cumulative total for each id, for every hour of the day. The output should look like this:

id day    hr  amount cumulative total
1   1      1   10       10
1   1      2   25       35
1   1      3   30       65 
1   2      1   10       10
1   2      2   40       50
1   2      2   30       80 
2   1      1   10       10
2   1      2   15       25 
2   1      3   30       55
2   2      1   10       10
2   2      2   20       30
2   2      2   30       60

My initial query that produces the first output looks like this:

select id, day, hr, amount from
( //multiple joins on multiple tables)a
left join
(//unions on multiple tables)b
on a.id=b.id;

What is the SQL query to get the cumulative sum as described in the second output? SET should not be used in the solution.

Thanks.

Solution

UPDATE

MySQL 8.0 introduces "window functions", functionality equivalent to SQL Server "window functions" (with partitioning and ordering provided by Transact-SQL OVER syntax), and Oracle "analytic functions".

MySQL Reference Manual 12.21 Window Functions https://dev.mysql.com/doc/refman/8.0/en/window-functions.html

The answer provided here is an approach for MySQL versions prior to 8.0.
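
For readers on 8.0 or later, here is a minimal sketch of what a window-function version might look like. It is not part of the original answer. To keep it self-contained it runs the SQL through Python's sqlite3 module (SQLite 3.25 or later is assumed for window-function support) as a stand-in for MySQL; the SUM(...) OVER (...) syntax shown is also accepted by MySQL 8.0. The table name t and the seq tiebreaker column are hypothetical: t stands in for the result of the complex query, and seq is only there to give the two rows that share the same hr a fixed order.

```python
import sqlite3

# Sample rows from the question as (seq, id, day, hr, amount).
# seq is a hypothetical column (not in the original data) that fixes the
# order of rows sharing the same id, day and hr.
rows = [
    (1, 1, 1, 1, 10), (2, 1, 1, 2, 25), (3, 1, 1, 3, 30),
    (4, 1, 2, 1, 10), (5, 1, 2, 2, 40), (6, 1, 2, 2, 30),
    (7, 2, 1, 1, 10), (8, 2, 1, 2, 15), (9, 2, 1, 3, 30),
    (10, 2, 2, 1, 10), (11, 2, 2, 2, 20), (12, 2, 2, 2, 30),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (seq INTEGER, id INTEGER, day INTEGER, hr INTEGER, amount INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", rows)

# Running total restarts for every (id, day) partition. The explicit ROWS
# frame makes tied hr values accumulate one row at a time.
window_sql = """
SELECT id, day, hr, amount,
       SUM(amount) OVER (PARTITION BY id, day
                         ORDER BY hr, seq
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
         AS cumulative_total
  FROM t
 ORDER BY id, day, hr, seq
"""
window_result = conn.execute(window_sql).fetchall()
for row in window_result:
    print(row)
```

Without a tiebreaker such as seq, the relative order of the two hr = 2 rows within a day is not guaranteed, so their running totals could come out in either order.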


ORIGINAL ANSWER

MySQL doesn't provide the type of analytic function you would use to get a running "cumulative sum", like the analytic functions available in other DBMSs (such as Oracle or SQL Server).

But it is possible to emulate some analytic functions in MySQL.

There are (at least) two workable approaches:

One is to use a correlated subquery to get the subtotal. This approach can be expensive on large sets, and complicated if the predicates on the outer query are complicated. It really depends on how complicated that "multiple joins on multiple tables" is. (Unfortunately, MySQL versions prior to 8.0 do not support CTEs either.)
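
A rough sketch of the correlated-subquery approach follows; it is an illustration, not part of the original answer. The SQL is run through Python's sqlite3 module purely so the example is self-contained, and it uses only plain SELECT syntax. The table t is a hypothetical stand-in for the result of the complex query; in MySQL before 8.0 that query would have to appear both in the outer FROM and inside the subquery, which is where the complexity comes from. The seq column is also hypothetical and only serves to order rows that share the same hr.

```python
import sqlite3

# Sample rows from the question as (seq, id, day, hr, amount).
# seq is a hypothetical tiebreaker column, not part of the original data.
rows = [
    (1, 1, 1, 1, 10), (2, 1, 1, 2, 25), (3, 1, 1, 3, 30),
    (4, 1, 2, 1, 10), (5, 1, 2, 2, 40), (6, 1, 2, 2, 30),
    (7, 2, 1, 1, 10), (8, 2, 1, 2, 15), (9, 2, 1, 3, 30),
    (10, 2, 2, 1, 10), (11, 2, 2, 2, 20), (12, 2, 2, 2, 30),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (seq INTEGER, id INTEGER, day INTEGER, hr INTEGER, amount INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", rows)

# For each outer row o, sum every row p of the same id and day that sorts
# at or before o. The subquery is re-evaluated per outer row, which is why
# this can be expensive on large sets.
subquery_sql = """
SELECT o.id, o.day, o.hr, o.amount,
       (SELECT SUM(p.amount)
          FROM t p
         WHERE p.id = o.id
           AND p.day = o.day
           AND (p.hr < o.hr OR (p.hr = o.hr AND p.seq <= o.seq))
       ) AS cumulative_total
  FROM t o
 ORDER BY o.id, o.day, o.hr, o.seq
"""
subquery_result = conn.execute(subquery_sql).fetchall()
for row in subquery_result:
    print(row)
```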

The other approach is to make use of MySQL user variables to do some control break processing. The "trick" here is to get the results from your query sorted (using an ORDER BY) and then wrap your query in another query.

I'll give an example of the latter approach.

Because of the order in which MySQL performs operations, the cumulative_total column needs to be computed before the values of id and day from the current row are saved into user variables. It's easiest to put this column first.

The inline view aliased as i (in the query below) is just there to initialize the user variables, just in case these are already set in the session. If those already have values assigned, we want to ignore their current values, and the easiest way to do that is to initialize them.

Your original query gets wrapped in parentheses and is given an alias, c in the example below. The only change to your original query is the addition of an ORDER BY clause, so we can be sure that we process the rows from the query in sequence.

The outer select checks whether the id and day values from the current row "match" the previous row. If they do, we add the amount from the current row to the cumulative subtotal. If they don't match, then we reset the cumulative subtotal to zero and add the amount from the current row (or, more simply, just assign the amount from the current row).

After we have done the computation of the cumulative total, we save the id and day values from the current row into user variables, so they are available when we process the next row.
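
To make the row-by-row logic concrete, here is a plain Python trace of the same control-break steps over the sample rows from the question. It is only an illustration of the idea, not MySQL code: the Python variables prev_id, prev_day and cumtotal play the roles of the user variables, and the list is assumed to be already sorted by id, day, hr, which is what the ORDER BY guarantees.

```python
# Sample rows from the question as (id, day, hr, amount), already sorted
# by id, day, hr (the job done by ORDER BY in the SQL version).
rows = [
    (1, 1, 1, 10), (1, 1, 2, 25), (1, 1, 3, 30),
    (1, 2, 1, 10), (1, 2, 2, 40), (1, 2, 2, 30),
    (2, 1, 1, 10), (2, 1, 2, 15), (2, 1, 3, 30),
    (2, 2, 1, 10), (2, 2, 2, 20), (2, 2, 2, 30),
]

# Initialization step: forget whatever values were left from earlier use.
prev_id = None
prev_day = None
cumtotal = 0

result = []
for id_, day, hr, amount in rows:
    # Step 1: compute the running total BEFORE overwriting prev_id/prev_day.
    if prev_id == id_ and prev_day == day:
        cumtotal = cumtotal + amount   # same group: keep accumulating
    else:
        cumtotal = amount              # control break: restart the total
    result.append((id_, day, hr, amount, cumtotal))
    # Step 2: remember this row's keys for the comparison on the next row.
    prev_id, prev_day = id_, day

for row in result:
    print(row)
```

Swapping steps 1 and 2 would make every row compare equal to itself, so the total would never reset; this is why the cumulative column has to be evaluated first.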

For example:

-- cumulative_total is listed first so it is computed before the
-- "previous row" variables are overwritten with the current row's values
SELECT IF(@prev_id = c.id AND @prev_day = c.day
         ,@cumtotal := @cumtotal + c.amount
         ,@cumtotal := c.amount) AS cumulative_total
     , @prev_id  := c.id  AS `id`
     , @prev_day := c.day AS `day`
     , c.hr
     , c.amount AS `amount`
  -- inline view i only initializes the user variables
  FROM ( SELECT @prev_id  := NULL
              , @prev_day := NULL
              , @cumtotal := 0
       ) i
  JOIN (

         select id, day, hr, amount from
         ( //multiple joins on multiple tables)a
         left join
         (//unions on multiple tables)b
         on a.id=b.id

         ORDER BY 1,2,3
       ) c

If it's necessary to return the columns in a different order, with cumulative total as the last column, then one option is to wrap that whole statement in a set of parens, and use that query as an inline view:

SELECT d.id
     , d.day
     , d.hr
     , d.amount
     , d.cumulative_total
FROM (
       // query from above
     ) d
