SQL Server Weird Grouping Scenario by multiple columns and OR


Question

I have a weird grouping scenario and am having trouble finding the best way to do the grouping in SQL.

Imagine we have the following table:

CREATE TABLE Item
(
  KeyId VARCHAR(1) NOT NULL,
  Col1 INT NULL,
  Col2 INT NULL,
  Col3 INT NULL
)

GO

INSERT INTO Item (KeyId, Col1, Col2, Col3)
VALUES 
('a',1,2,3),
('b',5,4,3),
('c',5,7,6),
('d',8,7,9),
('e',11,10,9),
('f',11,12,13),
('g',20,22,21),
('h',23,22,24)

I need to group the records in this table so that if Col1 OR Col2 OR Col3 is the same for two records, they end up in the same group, and the matches should chain. In other words, with the data above, record 'a' (the first record) has Col3 = 3 and record 'b' (the second record) also has Col3 = 3, so these two should be in one group. But record 'b' also has the same Col1 as record 'c', so 'c' should be in the same group as 'a' and 'b'. Record 'd' has the same Col2 as 'c', so it belongs to that group too. Similarly, 'e' shares its Col3 with 'd', and 'f' shares its Col1 with 'e'.

On the other hand, records 'g' and 'h' will be in one group (because they have the same Col2 = 22), but this group will be different from the group for records 'a', 'b', 'c', 'd', 'e', 'f'.

The query result should look something like this:

KeyId GroupId
'a'   1 
'b'   1
'c'   1
'd'   1
'e'   1
'f'   1
'g'   2
'h'   2

There is probably a way of doing this with loops or cursors, but I started thinking about a cleaner way, and it seems quite difficult.

Recommended answer

Here you go:

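-- g walks chains of rows that share a col1, col2, or col3 value,
-- remembering the root row each chain started from
-- caveat: chains only move to higher keys, so rows linked only through a
-- higher-keyed row (say 'a' and 'b' each match 'c' but not each other)
-- can end up with different roots
with g (rootid, previd, level, keyid, col1, col2, col3) as (
  -- anchor: every row starts a chain rooted at itself
  select keyid, '-', 1, keyid, col1, col2, col3 from item
  union all
  -- recursive step: extend the chain to any row sharing a column value
  select g.rootid, g.keyid, g.level + 1, i.keyid, i.col1, i.col2, i.col3 
    from g
    join item i on i.col1 = g.col1 or i.col2 = g.col2 or i.col3 = g.col3 
    where i.keyid > g.keyid  -- only move forward in key order, so the recursion cannot cycle
),
  -- m keeps, for each row, the smallest root that reaches it
  m (keyid, rootid) as (
  select keyid, min(rootid) from g group by keyid
)
select * from m;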

Result:

keyid  rootid  
-----  ------
a      a       
b      a       
c      a       
d      a       
e      a       
f      a       
g      g       
h      g       
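
The result above labels each group by its smallest key rather than with the 1, 2 numbering shown in the question. A minimal sketch of one way to get sequential group numbers, assuming the same Item table, is to rank the roots with DENSE_RANK(). The output aliases KeyId and GroupId are chosen here to mirror the question and are not part of the original answer; the previd and level columns are dropped because this sketch does not need them.

```sql
-- Sketch only: the answer's recursive CTE, with the root key turned into
-- a sequential group number.
WITH g (rootid, keyid, col1, col2, col3) AS (
  -- anchor: every row starts a chain rooted at itself
  SELECT keyid, keyid, col1, col2, col3 FROM Item
  UNION ALL
  -- recursive step: follow any shared column value to a higher key
  SELECT g.rootid, i.keyid, i.col1, i.col2, i.col3
    FROM g
    JOIN Item i ON i.col1 = g.col1 OR i.col2 = g.col2 OR i.col3 = g.col3
    WHERE i.keyid > g.keyid
),
m (keyid, rootid) AS (
  SELECT keyid, MIN(rootid) FROM g GROUP BY keyid
)
SELECT keyid AS KeyId,
       DENSE_RANK() OVER (ORDER BY rootid) AS GroupId  -- same root, same number
FROM m
ORDER BY keyid;
```

For the sample data this should give GroupId 1 for 'a' through 'f' and GroupId 2 for 'g' and 'h', matching the table requested in the question.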

Note: Keep in mind that SQL Server by default allows at most 100 levels of recursion when processing a recursive CTE, which in this query bounds how long a chain of rows within one group can be. In other words, even though the grouping can be done as shown above, there are clear limits to what SQL Server will process. If you reach this limit you'll get the message:

The maximum recursion 100 has been exhausted before statement completion.

If this happens, consider adding the clause option (maxrecursion 32767) to the end of the query.
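
To show where the hint goes, here is a small sketch using a hypothetical counter CTE named n (not part of the original answer) that recurses well past the default limit. The OPTION clause attaches to the end of the outermost statement, not to the CTE definition.

```sql
-- Hypothetical counter CTE used only to illustrate hint placement.
WITH n (i) AS (
  SELECT 1
  UNION ALL
  SELECT i + 1 FROM n WHERE i < 500   -- recurses far beyond the default 100 levels
)
SELECT COUNT(*) AS row_count
FROM n
OPTION (MAXRECURSION 1000);  -- allowed range is 0 to 32767; 0 removes the limit
```

Without the OPTION line, this statement should stop with the "maximum recursion 100" message quoted above. The same clause can be appended after the final select of the grouping query.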
