无法累计加上COUNT(*)` [英] Cannot cumulatively sum `COUNT(*)`

查看:108
本文介绍了无法累计加上COUNT(*)`的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这个答案的第二部分使用变量来创建另一列的累计和。除了我正在使用 GROUP BY 语句并总结 COUNT(*)之外,而不是一列。这是我创建一个最小表并插入值的代码:

  CREATE TABLE`test_group_cumulative`(
`id` int(11)unsigned NOT NULL AUTO_INCREMENT,
`group_id` int(11)unsigned NOT NULL,
PRIMARY KEY(`id`)
)ENGINE = InnoDB DEFAULT CHARSET = latin1;

INSERT INTO`test_group_cumulative`(`id`,`group_id`)
VALUES
(1,1),
(2,2),
(3,3);

以下是失败的代码:

  SELECT 
`group_id`,
COUNT(*)AS`count`,
@count_cumulative:= @count_cumulative + COUNT(*)AS `count_cumulative`
FROM`test_group_cumulative` as`tgc`
JOIN(SELECT @count_cumulative:= 0)AS`_count_cumulative`
GROUP BY`group_id`
ORDER BY`id` ;

以下是结果:

  group_id count count_cumulative 
1 1 1
2 1 1
3 1 1

正如你所看到的, count_cumulative 不能正确求和。然而,这是怪异的部分。如果我将 count_cumulative 中的 COUNT(*)替换为值, 1 ,查询正常工作。

  @count_cumulative:= @count_cumulative +1 AS`count_cumulative` 

以下是正确的结果:

  group_id count count_cumulative 
1 1 1
2 1 2
3 1 3

显然,在我的应用程序中,每个组中都会有多个项目,因此 COUNT(*)并不总是 1 。我知道有些方法可以通过连接或子查询来实现,如果必须的话,我会这样做,但在我看来,这应该起作用。那么为什么不是 COUNT(*)在累积和中工作?

解决方案

这是我在进行时间序列分析时经常遇到的一个问题。我最喜欢的解决方法是将其包装到第二个选择中,并在最后一层中引入计数器。如果需要的话,您可以使用临时表将这种技术适应更复杂的数据流。



我使用您提供的模式完成了这个小的sqlfiddle: http://sqlfiddle.com/#!2/cc97e/21



以下是查询获得累积计数:

  SELECT 
tgc.group_id,@count_cumulative:= @count_cumulative + cnt as cum_cnt
FROM(
SELECT
group_id,COUNT(*)AS cnt
FROM`test_group_cumulative`
group by group_id
order by id)AS`tgc`,
(SELECT @count_cumulative := 0)AS`temp_var`;

这是我得到的结果:

  GROUP_ID CUM_CNT 
1 1
2 2
3 3

您尝试失败的原因:

当您使用临时变量进行组合时,mysql会独立执行各个组,当时每个组都分配了临时变量当前值,在本例中为0。



如果您运行了此查询:

  SELECT @count_cumulative; 

立即在

  SELECT 
`group_id`,
COUNT(*)AS`count`,
@count_cumulative:= @count_cumulative + COUNT(*)AS`count_cumulative`
FROM`test_group_cumulative` as as`tgc`
JOIN(SELECT @count_cumulative:= 0)AS`_count_cumulative`
GROUP BY`group_id`
ORDER BY`id`;

您将得到值1.对于您的每个组,@count_cumulative将被重置为0 。

因此,在我提出的解决方案中,我通过首先生成'group-counts'然后进行累加来避开这个问题。


The second section of this answer uses variables to create a cumulative sum of another column. I'm doing the same thing, except that I am using a GROUP BY statement, and summing COUNT(*) instead of a column. Here is my code to create a minimal table and insert values:

CREATE TABLE `test_group_cumulative` (
    `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
    `group_id` int(11) unsigned NOT NULL,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

INSERT INTO `test_group_cumulative` (`id`, `group_id`)
VALUES
    (1, 1),
    (2, 2),
    (3, 3);

And here is the code that is failing:

SELECT
    `group_id`,
    COUNT(*) AS `count`,
    @count_cumulative := @count_cumulative + COUNT(*) AS `count_cumulative`
FROM `test_group_cumulative` AS `tgc`
JOIN (SELECT @count_cumulative := 0) AS `_count_cumulative`
GROUP BY `group_id`
ORDER BY `id`;

Here is the result:

group_id    count   count_cumulative
1   1   1
2   1   1
3   1   1

As you can see, count_cumulative is NOT summing correctly. However, here's the weird part. If I replace the COUNT(*) in count_cumulative with it's value, 1, the query works correctly.

    @count_cumulative := @count_cumulative + 1 AS `count_cumulative`

Here is the correct result:

group_id    count   count_cumulative
1   1   1
2   1   2
3   1   3

Obviously, in my app, there will be more than one item in each group, so COUNT(*) won't always be 1. I know there are ways to do this with joins or subqueries, and I'll do that if I have to, but in my mind this SHOULD work. So why isn't COUNT(*) working inside of a cumulative sum?

解决方案

This is a problem I often face when doing time series analysis. My preferred way to tackle this is to wrap it into a second select and introduce the counter in the last layer. And you can adapt this technique to more complicated data flows using temporary tables, if reqiured.

I did this small sqlfiddle using the schema you present: http://sqlfiddle.com/#!2/cc97e/21

And here is the query to get the cumulative count:

SELECT
tgc.group_id, @count_cumulative := @count_cumulative + cnt as cum_cnt
FROM (
  SELECT
    group_id, COUNT(*) AS cnt
  FROM `test_group_cumulative` 
  group by group_id
  order by id) AS `tgc`, 
(SELECT @count_cumulative := 0) AS `temp_var`; 

This is the result I get:

GROUP_ID    CUM_CNT
1           1
2           2
3           3

The reason your attempt did not work:

When you do a group by with the temporary variable, mysql executes individual groups independently, and at the time each group is assigned the temporary variable current value, which in this case is 0.

If, you ran this query:

SELECT @count_cumulative;

immediately after

SELECT
    `group_id`,
    COUNT(*) AS `count`,
    @count_cumulative := @count_cumulative + COUNT(*) AS `count_cumulative`
FROM `test_group_cumulative` AS `tgc`
JOIN (SELECT @count_cumulative := 0) AS `_count_cumulative`
GROUP BY `group_id`
ORDER BY `id`;

you would get the value 1. For each of your groups, the @count_cumulative is being reset to 0.

Hence, in my proposed solution, I circumvent this issue by generating the 'group-counts' first and then doing the accumulation.

这篇关于无法累计加上COUNT(*)`的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆