如何根据列调整日期和不同的记录 [英] How to adjust the dates and distinct records based on a column

查看:59
本文介绍了如何根据列调整日期和不同的记录的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

 REG_ID| EVENT_TYPE_CD    | EVENT_DATE | PACKAGE_DESC    |PRODUCT_TYPE|TERM_START_DATE|TERM_END_DATE  
------|------------------|------------|-----------------|------------|---------------|----------  

11156 | NEW SUBSCRIPTION |  23-FEB-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 23-FEB-16  
11156 | CANCELLATION     |  23-FEB-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 23-FEB-16  
11156 | UPSELL           |  23-FEB-16 | CONNECTED CARE  |GOODWILL    | 23-FEB-16     | 22-MAR-16  
11156 | CANCELLATION     |  11-MAR-16 | CONNECTED CARE  |GOODWILL    | 23-FEB-16     | 11-MAR-16  
11156 | UPSELL           |  14-MAR-16 | CONNECTED CARE  |GOODWILL    | 14-APR-16     | 13-APR-17  
11156 | EXPIRATION       |  14-APR-16 | CONNECTED CARE  |GOODWILL    | 14-MAR-16     | 13-APR-17  
11163 | UPSELL           |  23-FEB-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 23-FEB-16  
11163 | CANCELLATION     |  23-FEB-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 22-MAR-16    
17215 | NEW SUBSCRIPTION |  18-JAN-16 | CONNECTED CARE  |TRIAL       | 18-JAN-16     | 17-JAN-17  
17215 | NEW SUBSCRIPTION |  18-JAN-16 | GUIDANCE        |TRIAL       | 18-JAN-16     | 17-APR-16  
17215 | CANCELLATION     |  22-FEB-16 | GUIDANCE        |TRIAL       | 18-JAN-16     | 22-FEB-16  
17215 | UPSELL           |  25-FEB-16 | GUIDANCE        |GOODWILL    | 25-FEB-16     | 24-APR-16  
17215 | EXPIRATION       |  25-APR-16 | GUIDANCE        |GOODWILL    | 25-FEB-16     | 24-APR-16  
17215 | NEW SUBSCRIPTION |  18-JAN-16 | REMOTE          |TRIAL       | 18-JAN-16     | 17-APR-16  
17215 | UPSELL           |  25-FEB-16 | REMOTE          |GOODWILL    | 25-FEB-16     | 24-APR-16  
17215 | NEW SUBSCRIPTION |  18-JUN-16 | REMOTE          |PAID        | 18-JUN-16     | 17-JUL-16  
17215 | UPSELL           |  25-JUL-16 | REMOTE          |GOODWILL    | 25-JUL-16     | 24-AUG-16 
 

我需要的输出是,所有事情都需要按照EVENT_DATE(事件系列)进行排序

  1. 如果善意" EVENT_DATE紧随试用"产品EVENT_DATE 然后将其视为试用版".如果善意" EVENT_DATE之后是 付费"产品EVENT_DATE,然后将其视为付费"并调整 TERM_END_DATE(没有时的REMOTE理想方案的示例 特定PACKAGE_DESC的取消或EXPIRATION event_type_cd REG_ID)

  2. 如果取消后有事件,请忽略取消(11163出现:这是因为没有新事件 取消后发生的事件)

  3. 如果有多个商誉 沿着小径走,我们需要在T​​ERM_END_DATE中取最大 周期需要分配为TERM_END_DATE(REG_ID 17215和GUIDANCE).
  4. EXPIRATION RECORD应该一直存在,并且其term_start_date 需要调整为周期1的Term_start_date 记录.

 REG_ID| EVENT_TYPE_CD    | EVENT_DATE | PACKAGE_DESC    |PRODUCT_TYPE|TERM_START_DATE| TERM_END_DATE  
------|------------------|------------|-----------------|------------|---------------|----------  

11156 | NEW SUBSCRIPTION |  23-FEB-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 13-APR-17  
11156 | EXPIRATION       |  14-APR-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 13-APR-17  
11163 | UPSELL           |  23-FEB-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 23-FEB-16  
11163 | CANCELLATION     |  23-FEB-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 22-MAR-16  
17215 | NEW SUBSCRIPTION |  18-JAN-16 | CONNECTED CARE  |TRIAL       | 18-JAN-16     | 17-JAN-17  
17215 | NEW SUBSCRIPTION |  18-JAN-16 | GUIDANCE        |TRIAL       | 18-JAN-16     | 24-APR-16  
17215 | EXPIRATION       |  25-APR-16 | GUIDANCE        |TRAIL       | 18-JAN-16     | 24-APR-16  
17215 | NEW SUBSCRIPTION |  18-JAN-16 | REMOTE          |TRIAL       | 18-JAN-16     | 24-APR-16  
17215 | NEW SUBSCRIPTION |  18-JUN-16 | REMOTE          |PAID        | 18-JUN-16     | 24-AUG-16 
 

解决方案

规则非常广泛,使用PL/SQL代码可能会获得更好的结果和性能,因为在游标上进行迭代时可以使用变量.

不过,我认为以下查询可能会满足您的需求:

select  reg_id, 
        event_type_cd,
        event_date,
        package_desc,
        case product_type when 'GOODWILL' then coalesce(prev_product_type, 'TRIAL')
            else product_type
        end as product_type,
        case event_type_cd when 'EXPIRATION' then first_term_start_date
            else term_start_date
        end as term_start_date,
        case next_product_type when 'GOODWILL' then next_term_end_date
            else term_end_date
        end as term_end_date
from    (select reg_id, 
                event_type_cd,
                event_date,
                package_desc,
                product_type,
                term_start_date,
                term_end_date,
                first_value(term_start_date) over (
                    partition by reg_id, package_desc 
                    order by event_date, term_end_date, event_type_cd desc) as first_term_start_date,
                lead(term_end_date, 1) over (
                    partition by reg_id, package_desc 
                    order by event_date, term_end_date, event_type_cd desc) as next_term_end_date,
                lag(product_type, 1) over (
                    partition by reg_id, package_desc 
                    order by event_date, term_end_date, event_type_cd desc) as prev_product_type,
                lead(product_type, 1) over (
                    partition by reg_id, package_desc 
                    order by event_date, term_end_date, event_type_cd desc) as next_product_type
        from    (select reg_id, 
                        event_type_cd,
                        event_date,
                        package_desc,
                        product_type,
                        term_start_date,
                        term_end_date,
                        lead(product_type, 1, '-') over (
                            partition by reg_id, package_desc 
                            order by event_date, term_end_date, event_type_cd desc) as next_product_type
                from    mytable)
                where   not (event_type_cd = 'CANCELLATION' and next_product_type <> '-')
                and     not (product_type = 'GOODWILL' and next_product_type = 'GOODWILL')
                )
where    not (    product_type = 'GOODWILL' 
              and event_type_cd not in ('EXPIRATION', 'CANCELLATION') 
              and prev_product_type is not null)
order by reg_id, package_desc, event_date, term_end_date, event_type_cd desc

该查询具有两级嵌套子查询.

最内层查询仅用于获取循环中下一条记录的 product_type (即,在 reg_id 的同一分区内)和 package_desc ).

中间查询使用该信息来消除:

  • 取消"记录,除非它们是其周期的最后一个记录;
  • 连续的'GOODWILL'记录,仅保留每个序列的最后一个-这是最后一个是临时的,但在此阶段仍需要;

中间查询还会重新获取循环中下一条记录的 product_type ,因为由于删除了记录,现在它可能已更改.此外,它确定:

  • 周期中第一条记录的 term_start_date
  • 周期中下一条记录的 term_end_date
  • 周期中上一条记录的 product_type

最后,外部查询使用此信息执行以下操作:

    如果涉及到"GOODWILL"记录,
  • product_type 设置为先前记录的记录(如果没有先前记录,则设置为"TRIAL");
  • term_start_date 设置为周期中第一个记录的记录,如果它涉及"EXPIRATION"记录;
  • 如果下一条记录涉及"GOODWILL"记录,请
  • term_end_date 设置为周期中下一条记录的记录.

"GOODWILL"记录(在上面第一个项目符号更改之前)从结果中排除,除非它们与循环中的第一个记录相关,或者与"EXPIRATION"或"CANCELLATION"记录相对应. /p>

order by子句使用您在注释中提到的顺序,并附加一个event_type_cd desc以确保用于取消或到期的EVENT_TYPE_CD将始终遵循新订阅或UPSELL,以获取特定的REG_ID,PACKAGE_DESC ".这是因为幸运的是,"NEW SUBSCRIPTION"和"UPSELL"都比"CANCELLATION"和"EXPIRATION"的字母顺序都晚,所以按降序排列就可以对它们进行正确排序.

REG_ID| EVENT_TYPE_CD    | EVENT_DATE | PACKAGE_DESC    |PRODUCT_TYPE|TERM_START_DATE|TERM_END_DATE  
------|------------------|------------|-----------------|------------|---------------|----------  

11156 | NEW SUBSCRIPTION |  23-FEB-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 23-FEB-16  
11156 | CANCELLATION     |  23-FEB-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 23-FEB-16  
11156 | UPSELL           |  23-FEB-16 | CONNECTED CARE  |GOODWILL    | 23-FEB-16     | 22-MAR-16  
11156 | CANCELLATION     |  11-MAR-16 | CONNECTED CARE  |GOODWILL    | 23-FEB-16     | 11-MAR-16  
11156 | UPSELL           |  14-MAR-16 | CONNECTED CARE  |GOODWILL    | 14-APR-16     | 13-APR-17  
11156 | EXPIRATION       |  14-APR-16 | CONNECTED CARE  |GOODWILL    | 14-MAR-16     | 13-APR-17  
11163 | UPSELL           |  23-FEB-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 23-FEB-16  
11163 | CANCELLATION     |  23-FEB-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 22-MAR-16    
17215 | NEW SUBSCRIPTION |  18-JAN-16 | CONNECTED CARE  |TRIAL       | 18-JAN-16     | 17-JAN-17  
17215 | NEW SUBSCRIPTION |  18-JAN-16 | GUIDANCE        |TRIAL       | 18-JAN-16     | 17-APR-16  
17215 | CANCELLATION     |  22-FEB-16 | GUIDANCE        |TRIAL       | 18-JAN-16     | 22-FEB-16  
17215 | UPSELL           |  25-FEB-16 | GUIDANCE        |GOODWILL    | 25-FEB-16     | 24-APR-16  
17215 | EXPIRATION       |  25-APR-16 | GUIDANCE        |GOODWILL    | 25-FEB-16     | 24-APR-16  
17215 | NEW SUBSCRIPTION |  18-JAN-16 | REMOTE          |TRIAL       | 18-JAN-16     | 17-APR-16  
17215 | UPSELL           |  25-FEB-16 | REMOTE          |GOODWILL    | 25-FEB-16     | 24-APR-16  
17215 | NEW SUBSCRIPTION |  18-JUN-16 | REMOTE          |PAID        | 18-JUN-16     | 17-JUL-16  
17215 | UPSELL           |  25-JUL-16 | REMOTE          |GOODWILL    | 25-JUL-16     | 24-AUG-16 

The output I need is that everything needs to be sorted by EVENT_DATE (Series of Events)

  1. if a 'Goodwill' EVENT_DATE follows a 'Trial' product EVENT_DATE then treat this as 'Trial'. If a 'Goodwill' EVENT_DATE follows a 'Paid' product EVENT_DATE then treat this as 'Paid' and adjust the TERM_END_DATE (Example for REMOTE Ideal scenario when there is no Cancellation or EXPIRATION event_type_cd for particular PACKAGE_DESC in a REG_ID)

  2. If there is a Event After the Cancellation then ignore Cancellation (11163 shows up: that's the reason since there is no new event after the cancellation)

  3. If there are multiple Goodwill's following the trail we need to take the maximum TERM_END_DATE in the cycle needs to assigned as TERM_END_DATE (REG_ID 17215 and GUIDANCE).
  4. EXPIRATION RECORD should always be there and its term_start_date needs to get adjusted to the Term_start_date of the Cycle's 1st record.

REG_ID| EVENT_TYPE_CD    | EVENT_DATE | PACKAGE_DESC    |PRODUCT_TYPE|TERM_START_DATE| TERM_END_DATE  
------|------------------|------------|-----------------|------------|---------------|----------  

11156 | NEW SUBSCRIPTION |  23-FEB-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 13-APR-17  
11156 | EXPIRATION       |  14-APR-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 13-APR-17  
11163 | UPSELL           |  23-FEB-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 23-FEB-16  
11163 | CANCELLATION     |  23-FEB-16 | CONNECTED CARE  |PAID        | 23-FEB-16     | 22-MAR-16  
17215 | NEW SUBSCRIPTION |  18-JAN-16 | CONNECTED CARE  |TRIAL       | 18-JAN-16     | 17-JAN-17  
17215 | NEW SUBSCRIPTION |  18-JAN-16 | GUIDANCE        |TRIAL       | 18-JAN-16     | 24-APR-16  
17215 | EXPIRATION       |  25-APR-16 | GUIDANCE        |TRAIL       | 18-JAN-16     | 24-APR-16  
17215 | NEW SUBSCRIPTION |  18-JAN-16 | REMOTE          |TRIAL       | 18-JAN-16     | 24-APR-16  
17215 | NEW SUBSCRIPTION |  18-JUN-16 | REMOTE          |PAID        | 18-JUN-16     | 24-AUG-16 

解决方案

The rules are quite extensive, and your might get better results and performance with PL/SQL code, as that can use variables while iterating over a cursor.

Still, I think the following query may do what you need:

select  reg_id, 
        event_type_cd,
        event_date,
        package_desc,
        case product_type when 'GOODWILL' then coalesce(prev_product_type, 'TRIAL')
            else product_type
        end as product_type,
        case event_type_cd when 'EXPIRATION' then first_term_start_date
            else term_start_date
        end as term_start_date,
        case next_product_type when 'GOODWILL' then next_term_end_date
            else term_end_date
        end as term_end_date
from    (select reg_id, 
                event_type_cd,
                event_date,
                package_desc,
                product_type,
                term_start_date,
                term_end_date,
                first_value(term_start_date) over (
                    partition by reg_id, package_desc 
                    order by event_date, term_end_date, event_type_cd desc) as first_term_start_date,
                lead(term_end_date, 1) over (
                    partition by reg_id, package_desc 
                    order by event_date, term_end_date, event_type_cd desc) as next_term_end_date,
                lag(product_type, 1) over (
                    partition by reg_id, package_desc 
                    order by event_date, term_end_date, event_type_cd desc) as prev_product_type,
                lead(product_type, 1) over (
                    partition by reg_id, package_desc 
                    order by event_date, term_end_date, event_type_cd desc) as next_product_type
        from    (select reg_id, 
                        event_type_cd,
                        event_date,
                        package_desc,
                        product_type,
                        term_start_date,
                        term_end_date,
                        lead(product_type, 1, '-') over (
                            partition by reg_id, package_desc 
                            order by event_date, term_end_date, event_type_cd desc) as next_product_type
                from    mytable)
                where   not (event_type_cd = 'CANCELLATION' and next_product_type <> '-')
                and     not (product_type = 'GOODWILL' and next_product_type = 'GOODWILL')
                )
where    not (    product_type = 'GOODWILL' 
              and event_type_cd not in ('EXPIRATION', 'CANCELLATION') 
              and prev_product_type is not null)
order by reg_id, package_desc, event_date, term_end_date, event_type_cd desc

The query has a two-level nested sub query.

The inner-most query only serves to get the product_type of the next record within the cycle (i.e. within the same partition of reg_id and package_desc).

The middle query uses that information to eliminate:

  • 'CANCELLATION' records, unless they are the very last record of their cycle;
  • Consecutive 'GOODWILL' records, leaving only the last in place of every sequence -- this is last one is temporary, but is at this stage still needed;

The middle query also re-fetches the product_type of the next record in the cycle, as it might now have changed due to the eliminated records. Additionally, it determines:

  • the term_start_date of the first record in the cycle;
  • the term_end_date of the next record in the cycle;
  • the product_type of the previous record in the cycle;

Finally, the outer query uses this information to:

  • set the product_type to that of the previous record (or 'TRIAL' if there is no previous) if it concerns a 'GOODWILL' record;
  • set the term_start_date to the that of the first record in the cycle, if it concerns an 'EXPIRATION' record;
  • set the term_end_date to that of the next record in the cycle, if the next record concerns a 'GOODWILL' record.

The 'GOODWILL' records (before the change in the first bullet above) are excluded from the result, except when they relate to a first record in their cycle, or correspond to an 'EXPIRATION' or 'CANCELLATION' record.

The order by clause uses the order as you mentioned in comments, with an additional event_type_cd desc to make sure "The EVENT_TYPE_CD's for Cancellation or Expiration will always follow the New Subscription or UPSELL for a particular REG_ID, PACKAGE_DESC". This is because by fortune both 'NEW SUBSCRIPTION' and 'UPSELL' come both later in the alphabetical order than 'CANCELLATION' and 'EXPIRATION', so in descending order we get them sorted right.

这篇关于如何根据列调整日期和不同的记录的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆