考虑“差异"对行进行分组.行之间 [英] Grouping rows considering "difference" between rows

查看:65
本文介绍了考虑“差异"对行进行分组.行之间的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个表,其中包含开始时间(在示例中使用数字以使其保持简单)以及事件的持续时间.

我想确定块"及其开始时间和结束时间.
每当上一行的结束时间(开始时间+持续时间)(按开始时间排序)与当前行的开始时间之间的差值为>=5时,应开始一个新的块".

这是我的测试数据,包括尝试在注释中进行图形解释的原因:

WITH test_data AS (
  SELECT  0 s, 2 dur FROM dual UNION ALL   --# ■■
  SELECT  2  , 2     FROM dual UNION ALL   --#   ■■
  SELECT 10  , 1     FROM dual UNION ALL   --#           ■
  SELECT 13  , 4     FROM dual UNION ALL   --#              ■■■■
  SELECT 15  , 4     FROM dual             --#                ■■■■
)
--# Should return
--#   0 ..  4                              --# ■■■■
--#  10 .. 19                              --#           ■■■■■■■■■

第一个块在0处开始,在4处结束.由于与下一行的区别是>=5,因此在10处开始另一个块,在19处结束.


我可以使用LAG来识别块的第一行,但是我还没有找到如何继续的方法.

我可以在PL/SQL循环中解决问题,但出于性能原因,我试图避免这种情况.


关于如何编写此查询的任何建议?

谢谢你,彼得

解决方案

我将子查询与分析结合使用,以识别和分组连续范围:

SQL> WITH test_data AS (
  2    SELECT  0 s, 2 dur FROM dual UNION ALL   --# ■■
  3    SELECT  2  , 2     FROM dual UNION ALL   --#   ■■
  4    SELECT 10  , 1     FROM dual UNION ALL   --#           ■
  5    SELECT 13  , 4     FROM dual UNION ALL   --#              ■■■■
  6    SELECT 15  , 4     FROM dual             --#                ■■■■
  7  )
  8  SELECT MIN(s) "begin", MAX(s + dur) "end"
  9    FROM (SELECT s, dur, SUM(gap) over(ORDER BY s) my_group
 10             FROM (SELECT s, dur,
 11                           CASE
 12                              WHEN lag(s + dur) over(ORDER BY s) >= s - 5 THEN
 13                               0
 14                              ELSE
 15                               1
 16                           END gap
 17                      FROM test_data
 18                     ORDER BY s))
 19   GROUP BY my_group;

     begin        end
---------- ----------
         0          4
        10         19

I've got a table which contains start-times (using number in example to keep it simple), and durations of events.

I would like to identify "blocks" and their start- and the end-time.
Whenever the difference between the end-time (start-time + duration) of the previous row (sorted by start-time) and the start-time of the current row is >=5, a new "block" should begin.

This is my test-data, including an attempt of a graphical explanation in the comments:

WITH test_data AS (
  SELECT  0 s, 2 dur FROM dual UNION ALL   --# ■■
  SELECT  2  , 2     FROM dual UNION ALL   --#   ■■
  SELECT 10  , 1     FROM dual UNION ALL   --#           ■
  SELECT 13  , 4     FROM dual UNION ALL   --#              ■■■■
  SELECT 15  , 4     FROM dual             --#                ■■■■
)
--# Should return
--#   0 ..  4                              --# ■■■■
--#  10 .. 19                              --#           ■■■■■■■■■

The first block starts at 0 and ends at 4. Since the difference to the next row is >=5, start another block at 10 which ends at 19.


I can identify the first row of a block, using LAG, but I have not yet found out how to proceed.

And I could solve the problem in a PL/SQL-loop, but I'm trying to avoid that for performance reasons.


Any suggestions on how to write this query?

Thanks in advance, Peter

解决方案

I use subqueries with analytics to identify and group contiguous ranges:

SQL> WITH test_data AS (
  2    SELECT  0 s, 2 dur FROM dual UNION ALL   --# ■■
  3    SELECT  2  , 2     FROM dual UNION ALL   --#   ■■
  4    SELECT 10  , 1     FROM dual UNION ALL   --#           ■
  5    SELECT 13  , 4     FROM dual UNION ALL   --#              ■■■■
  6    SELECT 15  , 4     FROM dual             --#                ■■■■
  7  )
  8  SELECT MIN(s) "begin", MAX(s + dur) "end"
  9    FROM (SELECT s, dur, SUM(gap) over(ORDER BY s) my_group
 10             FROM (SELECT s, dur,
 11                           CASE
 12                              WHEN lag(s + dur) over(ORDER BY s) >= s - 5 THEN
 13                               0
 14                              ELSE
 15                               1
 16                           END gap
 17                      FROM test_data
 18                     ORDER BY s))
 19   GROUP BY my_group;

     begin        end
---------- ----------
         0          4
        10         19

这篇关于考虑“差异"对行进行分组.行之间的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆