MySQL group by number series

Question

I'm just starting to learn MySQL and I've run into this problem, for which I badly need a solution or at least the logic.

For example, I have this table:

id         a 
--        --
 1         1
 2         2
 3         3
 4         5
 5         6
 6         7
 7         9
 8        10
 9        11
10        12

Now, what I want is to display all the data in column a, grouped into series of consecutive values. In this case the result should be:

series_start  series_end  count
------------  ----------  -----
           1           3      3
           5           7      3
           9          12      4

This needs a lot of subqueries and joins, and I just can't figure it out yet.

Recommended answer

This is a gaps-and-islands problem, and here's another method of solving it, which also uses variables:

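SELECT
  MIN(a) AS series_start,
  MAX(a) AS series_end,
  MAX(a) - MIN(a) + 1 AS series_count
FROM (
  -- number the rows in ascending order of a, using @r as a counter
  SELECT
    a,
    @r := @r + 1 AS r
  FROM
    yourtable,
    (SELECT @r := 0) AS x  -- initialises @r to 0
  ORDER BY
    a
) s
-- consecutive values of a share the same a - r
GROUP BY
  a - r
ORDER BY
  a - r
;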

Here is how it works.

The subquery assigns row numbers to the table rows and returns this row set:

 a   r
--  --
 1   1
 2   2
 3   3
 5   4
 6   5
 7   6
 9   7
10   8
11   9
12  10

In this case the r column, which stores the row numbers, happens to match the id column in your sample data, but I'm assuming that in general the id column may have gaps, so it cannot be used here.

The main query groups the results by the difference between a and r, which is always the same for sequential values:

 a   r  a - r
--  --  -----
 1   1      0
 2   2      0
 3   3      0
 5   4      1
 6   5      1
 7   6      1
 9   7      2
10   8      2
11   9      2
12  10      2

That allows us to group such rows together. All that remains at this point is to get the minimum, maximum and count, which gives you this output:

series_start  series_end  series_count
------------  ----------  ------------
           1           3             3
           5           7             3
           9          12             4
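
The same idea can be sketched outside SQL. The snippet below is only an illustration in Python, not part of the answer: the function name `group_series` is hypothetical, and it assumes the values are distinct integers, as in the sample data. It numbers the sorted values, groups them by value minus row number, and reports the start, end and count of each group.

```python
from itertools import groupby


def group_series(values):
    """Return (series_start, series_end, series_count) for each run of consecutive integers."""
    # Pair each value with its 1-based row number, in ascending order of value.
    numbered = [(a, r) for r, a in enumerate(sorted(values), start=1)]
    result = []
    # Consecutive values share the same a - r, so that difference identifies a run.
    for _, rows in groupby(numbered, key=lambda row: row[0] - row[1]):
        run = [a for a, _ in rows]
        # Same arithmetic as the query: the count is MAX(a) - MIN(a) + 1.
        result.append((run[0], run[-1], run[-1] - run[0] + 1))
    return result


print(group_series([1, 2, 3, 5, 6, 7, 9, 10, 11, 12]))
# → [(1, 3, 3), (5, 7, 3), (9, 12, 4)]
```

Because the values are sorted and distinct, a - r never decreases, which is why grouping adjacent rows is enough here.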

A SQL Fiddle demonstration of this query, for which I've borrowed @sgeddes's schema, can be found here.
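
On MySQL 8.0 and later, window functions are available and assigning user variables inside a SELECT is deprecated, so `ROW_NUMBER()` is one possible substitute for the `@r` counter. This is a swapped-in technique, not the answer's own method. The sketch below runs the variant against an in-memory SQLite database through Python's `sqlite3` module as a stand-in (it assumes SQLite 3.25 or newer for window-function support) and has not been checked against a MySQL server.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the table name follows the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER PRIMARY KEY, a INTEGER)")
conn.executemany(
    "INSERT INTO yourtable (a) VALUES (?)",
    [(v,) for v in (1, 2, 3, 5, 6, 7, 9, 10, 11, 12)],
)

# ROW_NUMBER() replaces the @r variable; the outer query is the same as in the answer.
query = """
SELECT MIN(a) AS series_start,
       MAX(a) AS series_end,
       MAX(a) - MIN(a) + 1 AS series_count
FROM (
  SELECT a, ROW_NUMBER() OVER (ORDER BY a) AS r
  FROM yourtable
) s
GROUP BY a - r
ORDER BY a - r
"""
rows = conn.execute(query).fetchall()
print(rows)
# → [(1, 3, 3), (5, 7, 3), (9, 12, 4)]
```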

Update

As numeric variables cannot be used (according to comments), you could assign row numbers using a triangular self-join, but it will be much less efficient than using a variable. Anyway, here's the modified version; only the subquery differs from the previous query:

SELECT
  MIN(a) AS series_start,
  MAX(a) AS series_end,
  MAX(a) - MIN(a) + 1 AS series_count
FROM (
  -- join each row to every row whose id is not greater than its own;
  -- the number of matches is that row's row number
  SELECT
    data.a,
    COUNT(*) AS r
  FROM
    yourtable AS data
  INNER JOIN
    yourtable AS tally
  ON
    data.id >= tally.id
  GROUP BY
    data.a
) s
GROUP BY
  a - r
ORDER BY
  a - r
;

The approach itself remains unchanged: the subquery returns a ranked row set, which is then processed the same way as before.
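
To see what the triangular self-join produces, the subquery can be run on its own. The sketch below does so in an in-memory SQLite database through Python's `sqlite3` module, as a stand-in for MySQL. The id values are hypothetical and deliberately contain gaps. Note the assumptions: the row numbers come out right only because id and a increase together and the values of a are distinct.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER PRIMARY KEY, a INTEGER)")

# Hypothetical data: the ids have gaps, a holds the values from the question.
ids = (2, 3, 5, 8, 9, 11, 14, 15, 16, 20)
values = (1, 2, 3, 5, 6, 7, 9, 10, 11, 12)
conn.executemany("INSERT INTO yourtable (id, a) VALUES (?, ?)", zip(ids, values))

# The subquery from the modified version: COUNT(*) over the matched rows is the row number.
ranked = conn.execute("""
SELECT data.a, COUNT(*) AS r
FROM yourtable AS data
INNER JOIN yourtable AS tally ON data.id >= tally.id
GROUP BY data.a
ORDER BY data.a
""").fetchall()
print(ranked)

# Number of row pairs the join has to produce before grouping.
join_rows = conn.execute("""
SELECT COUNT(*)
FROM yourtable AS data
INNER JOIN yourtable AS tally ON data.id >= tally.id
""").fetchone()[0]
print(join_rows)
# → 55
```

For n rows the join yields n(n + 1)/2 pairs (55 for the 10 rows here), which is why this version scales worse than a running counter.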

A SQL Fiddle demo for the modified query is available here.
