SQL - Consolidate Overlapping Data

Problem description

I have a simple data set in SQL Server that looks like this:

ROW    Start    End
  0     1        2
  1     3        5
  2     4        6
  3     8        9

Graphically, the data would appear like this: [image not available]

What I would like to achieve is to collapse the overlapping rows so that my query returns:

ROW    Start    End
  0     1        2
  1     3        6
  2     8        9

Is this possible in SQL Server without having to write a complex procedure or statement?

Solution

Here's the SQL Fiddle for another alternative.

First, all the limits (every Start and End value) are sorted in ascending order. Then the "duplicate" limits within an overlapping range are removed; these are the places where a Start is followed by another Start, or an End is followed by another End. Now that the ranges are collapsed, the Start and End values are written out again in the same row.

with temp_positions as  --Select all limits as a single column along with the start / end flag (s / e)
(
    select startx limit, 's' as pos from t
    union
    select endx, 'e' as pos from t
)
, ordered_positions as --Rank all limits
(
    select limit, pos, RANK() OVER (ORDER BY limit) AS Rank
    from temp_positions
)
, collapsed_positions as --Collapse ranges (select the first limit, if s is preceded or followed by e, and the last limit) and rank limits again
(
    select op1.*, RANK() OVER (ORDER BY op1.Rank) AS New_Rank
    from ordered_positions op1
    inner join ordered_positions op2
    on (op1.Rank = op2.Rank and op1.Rank = 1 and op1.pos = 's')
    or (op2.Rank = op1.Rank-1 and op2.pos = 'e' and op1.pos = 's') 
    or (op2.Rank = op1.Rank+1 and op2.pos = 's' and op1.pos = 'e')
    or (op2.Rank = op1.Rank and op1.pos = 'e' and op1.Rank = (select max(Rank) from ordered_positions))
)
, final_positions as --Now each s is followed by e. So, select s limits and corresponding e limits. Rank ranges
(
    select cp1.limit as cp1_limit, cp2.limit as cp2_limit,  RANK() OVER (ORDER BY cp1.limit) AS Final_Rank
    from collapsed_positions cp1
    inner join collapsed_positions cp2
    on cp1.pos = 's' and cp2.New_Rank = cp1.New_Rank+1
)
--Finally, subtract 1 from Rank to start Range #'s from 0
select fp.Final_Rank-1 seq_no, fp.cp1_limit as starty, fp.cp2_limit as endy
from final_positions fp;
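
To try the query without the Fiddle, you need a table shaped like the one it reads from. The schema below is only an assumed minimal setup: the table name t and the columns startx and endx are taken from the query above, while the int types and the not null constraints are my guess. The rows are the sample data from the question.

```sql
-- Assumed setup: names come from the query above, column types are a guess.
create table t
(
    startx int not null,
    endx   int not null
);

-- Sample rows from the question (Start, End).
insert into t (startx, endx)
values (1, 2),
       (3, 5),
       (4, 6),
       (8, 9);
```

With these four rows, the query above should return the three rows asked for in the question: (0, 1, 2), (1, 3, 6) and (2, 8, 9).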

You can test the result of each CTE and trace the progression. To do this, remove the CTEs that follow the one you are interested in and select from it directly, for example:

with temp_positions as  --Select all limits as a single column along with the start / end flag (s / e)
(
    select startx limit, 's' as pos from t
    union
    select endx, 'e' as pos from t
)
, ordered_positions as --Rank all limits
(
    select limit, pos, RANK() OVER (ORDER BY limit) AS Rank
    from temp_positions
)
select *
from ordered_positions;
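
One caveat, as far as I can trace it: the query above only looks at the neighbouring s/e flags, so a range that sits completely inside another one (say 2-3 and 4-5 inside 1-10) would be split at the inner End and the next inner Start instead of collapsing into 1-10. If your data can contain nested ranges, a different technique may be safer. The following is a sketch of the "gaps and islands" approach with a running maximum, not the query from the Fiddle. It assumes the same table t(startx, endx) and SQL Server 2012 or later (for the rows frame in the over clause); the names flagged, grouped, is_new_group and grp are made up for this example.

```sql
-- Alternative sketch: gaps and islands with a running maximum of endx.
-- Assumes table t(startx, endx) and SQL Server 2012 or later.
with flagged as
(
    select startx, endx,
           -- A row opens a new group when it starts after every earlier row has ended.
           -- For the first row the frame is empty, max() is null, and the flag stays 0.
           case when startx > max(endx) over (order by startx, endx
                                              rows between unbounded preceding and 1 preceding)
                then 1 else 0 end as is_new_group
    from t
)
, grouped as
(
    select startx, endx,
           -- The running sum of the flags gives every row of one merged range the same group number.
           sum(is_new_group) over (order by startx, endx
                                   rows unbounded preceding) as grp
    from flagged
)
select row_number() over (order by min(startx)) - 1 as seq_no,
       min(startx) as starty,
       max(endx)   as endy
from grouped
group by grp;
```

Because the comparison is a strict greater-than, a range that starts exactly where the previous one ends is merged into it, while 1-2 and 3-6 stay separate, which matches the output requested in the question.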
