使用递归查询汇总期间 [英] Aggregate periods using recursive queries

查看:95
本文介绍了使用递归查询汇总期间的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要使用超前缓冲区"合并每个组(ID)的顺序事件(具有标识符NUM)的顺序事件的重叠时间段(由FROM和TO变量定义),这意味着如果下一个时间段在缓冲区内开始,它们应该合并.

例如;在下面的示例中,第二个事件(NUM = 2)在时间13开始,该时间在缓冲区(10 + 5 = 15)内.

与我发现的其他类似问题相比,这里的棘手部分是,尽管每个事件的缓冲期都有固定值,但是如果将其与具有更长缓冲期的事件(仅向后)合并,则可能会改变期间.

例如;事件三也合并到与事件1和2相同的时间段,因为这些事件的缓冲时间更长.接下来的缓冲区应该改为(25 + 5 = 30),而不是(25 + 3 = 28),这意味着以下事件4也应包括在这些时间段中.

再次将事件4的缓冲时间更改为5.但是,由于40> 31 + 5,最后一个事件是单独的观察结果.

CREATE TABLE MY_TABLE(ID INTEGER, NUM INTEGER, FROM INTEGER, TO INTEGER, LOOKAHEAD INTEGER);
INSERT INTO MY_TABLE VALUES (1, 1, 1,  10, 5);
INSERT INTO MY_TABLE VALUES (1, 2, 13, 20, 5);
INSERT INTO MY_TABLE VALUES (1, 3, 21, 25, 3);
INSERT INTO MY_TABLE VALUES (1, 4, 29, 31, 3);
INSERT INTO MY_TABLE VALUES (1, 5, 40, 50, 3);

最终,我需要得到的结果是两个分离"时期的两个观察结果;

(ID = 1, FROM = 1,  TO = 31)
(ID = 5, FROM = 40, TO = 50)

自然,我最初以为可以通过创建一个新变量LOOKAHEAD2来创建此"LOOKHEAD"变量,该变量是LOOKAHEAD2的先前值和LOOKAHEAD的当前值的最大值,条件是FROM(此记录)< (TO + LOOKAHEAD)(上一个记录),使用OLAP函数.但这实际上并不起作用,因为它是对自身的引用...

相反,我尝试使用递归查询,从第一个事件(NUM = 1)开始,然后将表与下一个事件(root.NUM+1 = next.NUM)(以(root.TO + root.LOOKAHEAD > next.FROM)为条件)进行联接,并相应地更新LOOKAHEAD变量./p>

但是我以前从未使用过递归查询,也无法将其加入到LOOKAHEAD值的更新值中.

有人知道如何通过递归查询或其他方式解决此问题吗?

解决方案

您应该在分析函数中使用RESET WHEN窗口修饰符(Teradata 16中为LAG,或更早版本中为MAX);不要使用递归查询.

更新:

 DROP TABLE MY_TABLE;
CREATE VOLATILE TABLE MY_TABLE 
( id          INTEGER
, num         INTEGER
, from_value  INTEGER
, to_value    INTEGER
, lookahead   INTEGER
) ON COMMIT PRESERVE ROWS;

INSERT INTO MY_TABLE VALUES (1, 1, 1,  10, 5);
INSERT INTO MY_TABLE VALUES (1, 2, 13, 20, 5);
INSERT INTO MY_TABLE VALUES (1, 3, 21, 25, 3);
INSERT INTO MY_TABLE VALUES (1, 4, 29, 31, 3);
INSERT INTO MY_TABLE VALUES (1, 5, 40, 50, 3);

INSERT INTO MY_TABLE VALUES (2, 1, 1, 10, 5);
INSERT INTO MY_TABLE VALUES (2, 2, 20, 30, 15);
INSERT INTO MY_TABLE VALUES (2, 3, 40, 41, 5);
INSERT INTO MY_TABLE VALUES (2, 4, 100, 200, 5);
INSERT INTO MY_TABLE VALUES (2, 5, 300, 400, 3);


SELECT  id, first_from_value, to_value
FROM  ( SELECT  id
              , to_value
              , CASE WHEN overlaps_flag = 1
                  THEN  NULL
                  ELSE  COALESCE 
                        ( MIN (from_value) 
                            OVER (PARTITION BY id
                                  ORDER BY from_value
                                  RESET WHEN MAX (overlaps_flag) 
                                               OVER (PARTITION BY id 
                                                     ROWS BETWEEN 
                                                          1 PRECEDING 
                                                      AND 1 PRECEDING) = 0
                                  ROWS BETWEEN UNBOUNDED PRECEDING 
                                           AND 1 PRECEDING)
                        , from_value )
                END AS first_from_value
        FROM  ( SELECT  id, from_value, to_value
                      , MAX (from_value) 
                          OVER (PARTITION BY id 
                                ORDER BY from_value 
                                ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING)
                          AS next_from_value
                      , CASE WHEN to_value + lookahead + 1 >= next_from_value
                          THEN 1 ELSE 0 
                        END AS overlaps_flag
                FROM  my_table
              ) AS a
      ) AS a
WHERE first_from_value IS NOT NULL
ORDER BY 1, 2
 

 id  first_from_value    to_value
1   1                   31
1   40                  50
2   1                   10
2   20                  41
2   100                 200
2   300                 400
 

I need to merge overlapping periods (defined by FROM and TO variables) of sequential events (with identifier NUM) for each group (ID) with a "lookahead buffer", meaning that if the next period starts within the buffer zone, they should be merged.

For instance; in the following example the second event (NUM = 2) starts at time 13, which is within the buffer zone (10 + 5 = 15).

The tricky part here compared to other similar problems I've found is that although the buffer period has a fixed value for each event, this could potentially change if it is merged with an event (only backwards) that has a longer buffer period.

For instance; Event three is also merged to the same periods as event 1 and 2, and because the buffer periods of these events are longer. The following buffer zone should instead be (25 + 5 = 30), rather than (25 + 3 = 28), meaning the following event 4 should also be included in these periods as well.

Once again the buffer period of event 4 is also changed to 5. However, because 40 > 31+5, the last event is a separate observation.

CREATE TABLE MY_TABLE(ID INTEGER, NUM INTEGER, FROM INTEGER, TO INTEGER, LOOKAHEAD INTEGER);
INSERT INTO MY_TABLE VALUES (1, 1, 1,  10, 5);
INSERT INTO MY_TABLE VALUES (1, 2, 13, 20, 5);
INSERT INTO MY_TABLE VALUES (1, 3, 21, 25, 3);
INSERT INTO MY_TABLE VALUES (1, 4, 29, 31, 3);
INSERT INTO MY_TABLE VALUES (1, 5, 40, 50, 3);

Eventually, the result I need are two observations with the two "disjunct" periods;

(ID = 1, FROM = 1,  TO = 31)
(ID = 5, FROM = 40, TO = 50)

Naturally I initially thought I could create this "LOOKHEAD"-variable, by creating a new variable LOOKAHEAD2 that is the maximum of previous value of LOOKAHEAD2 and current value of LOOKAHEAD, conditional on FROM(this record) < (TO + LOOKAHEAD)(previous record) using OLAP functions. This doesn't really work however since it is a reference to itself...

Instead, I tried using recursive queries, where I start with the first event (NUM = 1), and than recursively join the table with the next event (root.NUM+1 = next.NUM) conditional on (root.TO + root.LOOKAHEAD > next.FROM), and also updating the LOOKAHEAD variable accordingly.

But I have never used recursive queries before, and I can't get it to join on the updated value of the LOOKAHEAD-value.

Does anyone know how to solve this with either recursive queries or other?

解决方案

You should use the RESET WHEN window modifier in your analytic functions (LAG in Teradata 16, or MAX in earlier ones); don't use a recursive query.

Update:

DROP TABLE MY_TABLE;
CREATE VOLATILE TABLE MY_TABLE 
( id          INTEGER
, num         INTEGER
, from_value  INTEGER
, to_value    INTEGER
, lookahead   INTEGER
) ON COMMIT PRESERVE ROWS;

INSERT INTO MY_TABLE VALUES (1, 1, 1,  10, 5);
INSERT INTO MY_TABLE VALUES (1, 2, 13, 20, 5);
INSERT INTO MY_TABLE VALUES (1, 3, 21, 25, 3);
INSERT INTO MY_TABLE VALUES (1, 4, 29, 31, 3);
INSERT INTO MY_TABLE VALUES (1, 5, 40, 50, 3);

INSERT INTO MY_TABLE VALUES (2, 1, 1, 10, 5);
INSERT INTO MY_TABLE VALUES (2, 2, 20, 30, 15);
INSERT INTO MY_TABLE VALUES (2, 3, 40, 41, 5);
INSERT INTO MY_TABLE VALUES (2, 4, 100, 200, 5);
INSERT INTO MY_TABLE VALUES (2, 5, 300, 400, 3);


SELECT  id, first_from_value, to_value
FROM  ( SELECT  id
              , to_value
              , CASE WHEN overlaps_flag = 1
                  THEN  NULL
                  ELSE  COALESCE 
                        ( MIN (from_value) 
                            OVER (PARTITION BY id
                                  ORDER BY from_value
                                  RESET WHEN MAX (overlaps_flag) 
                                               OVER (PARTITION BY id 
                                                     ROWS BETWEEN 
                                                          1 PRECEDING 
                                                      AND 1 PRECEDING) = 0
                                  ROWS BETWEEN UNBOUNDED PRECEDING 
                                           AND 1 PRECEDING)
                        , from_value )
                END AS first_from_value
        FROM  ( SELECT  id, from_value, to_value
                      , MAX (from_value) 
                          OVER (PARTITION BY id 
                                ORDER BY from_value 
                                ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING)
                          AS next_from_value
                      , CASE WHEN to_value + lookahead + 1 >= next_from_value
                          THEN 1 ELSE 0 
                        END AS overlaps_flag
                FROM  my_table
              ) AS a
      ) AS a
WHERE first_from_value IS NOT NULL
ORDER BY 1, 2

id  first_from_value    to_value
1   1                   31
1   40                  50
2   1                   10
2   20                  41
2   100                 200
2   300                 400

这篇关于使用递归查询汇总期间的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆