2列上的间隙和孤岛-如果A列连续且B列相同 [英] Gaps and islands on 2 columns - if column A consecutive and column B identical

查看:47
本文介绍了2列上的间隙和孤岛-如果A列连续且B列相同的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一张桌子,如下:

CREATE TABLE `table` (
    `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
    `cc` int(3) unsigned NOT NULL,
    `number` int(10) NOT NULL,
    `name` varchar(64) NOT NULL,
    `datetime` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB

DBMS是Debian 9.1上的MariaDB 10.1.26.我一直试图让它列出连续数字的范围.通过以下查询,我能够完成该操作:

The DBMS is MariaDB 10.1.26 on Debian 9.1. I had been trying to get it to list ranges of consecutive numbers. With the following query, I am able to accomplish that:

SELECT min(number) first_number, max(number) last_number, count(*) AS no_records FROM (
    SELECT c.*, @rn := @rn + 1 rn
    from (SELECT number FROM table WHERE cc = 1 GROUP BY number ORDER BY number) AS c
    CROSS JOIN (SELECT @rn := 0) r
) c
GROUP BY number - rn ORDER BY number ASC

但是,如果我希望根据其他列中的值将项目捆绑在一起,则无法使用.说我仅在 name 的值都相同时才将项目分组.说这是我的数据:

But if I wanted items bunched together based on the value in an additional column, this doesn't work. Say I want the items grouped only when their values for name are all the same. Say this is my data:

INSERT INTO `table` (`id`, `cc`, `number`, `name`) VALUES
(1, 1, 12, 'Hello'),
(2, 1, 2, 'Apple'),
(3, 1, 3, 'Bean'),
(4, 1, 10, 'Hello'),
(5, 1, 11, 'Hello'),
(6, 1, 1, 'Apple'),
(7, 1, 14, 'Deer'),
(8, 1, 14, 'Door'),
(9, 1, 15, 'Hello'),
(10, 1, 17, 'Hello'),

我想得到这样的报告:

first  last   count  name
1      2      2      Apple
3      3      1      Bean
10     12     3      Hello
14     14     1      Deer
14     14     1      Door
15     15     1      Hello
17     17     1      Hello

换句话说,除了将连续的项目分组外,当 name 的值不同时,这些组也被分为单独的组.(换句话说,如果项目都是连续的并且具有相同的确切 name ,则它们仅在一个岛中在一起).我最近的(并且不是很接近)正在执行以下操作:

In other words, in addition to grouping items that are consecutive, these groups are split up into separate groups when their values of name differ. (In other words, items are only in an island together if they are all consecutive AND have the same exact name). The closest (and it's not very close) that I have come, is doing this:

SELECT min(number) first_number, max(number) last_number, count(*) AS no_records FROM (
    SELECT c.*, @rn := @rn + 1 rn
    from (SELECT number FROM table WHERE cc = 1 GROUP BY number, name ORDER BY number) AS c
    CROSS JOIN (SELECT @rn := 0) r
) c
GROUP BY number - rn, name ORDER BY number ASC

这是行不通的,但是发生的事情似乎是将名称的第一个外观返回为 first ,最后一个外观返回为 last ,其中 no_records 是它们之间的数量差异,这肯定是不正确的.

This doesn't work, though, and what happens is it seems to return the first appearance of a name as first and the last appearance as last, with no_records being the difference in number between them, which is certainly not right at all.

我觉得此问题可能与之相关,但是我一直无法理解它,当我尝试将其调整到我的表中时,它或多或少地相当于一个简单的 SELECT * .我需要对查询进行哪些修改才能使其正常工作?

I feel like this question might be related, but I've not been able to make much sense of it, and when I tried tweaking it to my table, it just did the equivalent of a simple SELECT * more or less. What modifications to my query do I need to make to get it to work?

请紧记:

  • 物品可以按任何顺序插入
  • 数字可以重复
  • 名称可以重复,不一定要重复

推荐答案

查询中没有太多更改.基本上,您需要在子查询中选择 name number ,并以相同的顺序进行排序.然后,您可以在外部查询中按名称,编号-rn 进行分组.

There is not much to change in your query. You basically need to select name and number in the subquery and sort in the same order. Then you can group by name, number - rn in the outer query.

SELECT
    min(number) first_number,
    max(number) last_number,
    count(*) AS no_records,
    name
FROM (
    SELECT c.*, @rn := @rn + 1 rn
    from (
        SELECT name, number
        FROM `table`
        WHERE cc = 1
        ORDER BY name, number
        LIMIT 99999999999999999
    ) AS c
    CROSS JOIN (SELECT @rn := 0) r
) c
GROUP BY name, number - rn 
ORDER BY first_number ASC, name ASC;

结果:

first_number  last_number  no_records  name
           1            2           2  Apple
           3            3           1  Bean
          10           12           3  Hello
          14           14           1  Deer
          14           14           1  Door
          15           15           1  Hello
          17           17           1  Hello

db<>小提琴

我通常主张不要以这种方式使用会话变量.原因是此类解决方案取决于内部实现,并且可能会因版本更新或设置更改而被破坏.例如:一旦MariaDB决定在没有LIMIT的子查询中忽略ORDER BY子句.这就是为什么我加入了巨大的LIMIT的原因.

I usually advocate against the use of session variables in this way. The reason is that such solutions depend on internal implementation, and can be broken by version updates or settings changes. For example: Once MariaDB decided to ignore the ORDER BY clause in subqueries without LIMIT. This is why I included a huge LIMIT.

我还在外部ORDER BY子句中用 first_number 替换了 number ,以避免ONLY_FULL_GROUP_BY模式出现问题.

I also replaced number with first_number in the outer ORDER BY clause to avoid problems with ONLY_FULL_GROUP_BY mode.

一种更稳定的生成行号的方法是在临时表中使用AOTO_INCREMENT列:

A more stable way to generate row numbers is to use an AOTO_INCREMENT column in a temporary table:

drop temporary table if exists tmp_tbl;

create temporary table tmp_tbl (
  rn int unsigned auto_increment primary key,
  name varchar(64) not null,
  number int not null
);

insert into tmp_tbl (name, number)
  select name, number
  from `table`
  order by name, number;

最终的SELECT查询与上面的外部查询相同:

The final SELECT query is identical with the outer query above:

SELECT
    min(number) first_number,
    max(number) last_number,
    count(*) AS no_records,
    name
FROM tmp_tbl
GROUP BY name, number - rn 
ORDER BY first_number ASC, name ASC;

db<>小提琴

在较新的版本(从MariaDB 10.2开始)中,您可以使用 ROW_NUMBER()窗口函数:

In a more recent version (starting from MariaDB 10.2) you can use ROW_NUMBER() window function instead:

SELECT
    min(number) first_number,
    max(number) last_number,
    count(*) AS no_records,
    name
FROM (
    SELECT
        name,
        number,
        row_number() OVER (ORDER BY name, number) as rn
    FROM `table`
    WHERE cc = 1
) c
GROUP BY name, number - rn 
ORDER BY first_number ASC, name ASC;

db<>小提琴

这篇关于2列上的间隙和孤岛-如果A列连续且B列相同的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆