如何获得第三份报告以结合客户和订单数据 [英] how to get the 3rd report to combine the customer and order data

查看:76
本文介绍了如何获得第三份报告以结合客户和订单数据的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我对保留率有疑问.
我有2个表,包括客户数据和订单数据.

I have a question about retention rate.
I have 2 tables, including the customer data and the order data.

 DISTRIBUTOR as d
+---------+-----------+--------------+--------------------+
|   ID    | SETUP_DT  | REINSTATE_DT | LOCAL_REINSTATE_DT |
+---------+-----------+--------------+--------------------+
| C111111 | 2018/1/1  | Null         | Null               |
| C111112 | 2015/12/9 | 2018/10/25   | 2018/10/25         |
| C111113 | 2018/10/1 | Null         | Null               |
| C111114 | 2018/10/6 | 2018/12/14   | 2018/12/14         |
+---------+-----------+--------------+--------------------+

 ORDER as o, please noted that the data is for reference...
+---------+----------+-----+
|   ID    |  ORD_DT  | OAL |
+---------+----------+-----+
| C111111 | 2018/1/1 | 112 |
| C111111 | 2018/1/1 | 100 |
| C111111 | 2018/1/1 | 472 |
| C111111 | 2018/1/1 | 452 |
| C111111 | 2018/1/1 | 248 |
| C111111 | 2018/1/1 | 996 |
+---------+----------+-----+

 The 3rd Table in my mind to create the retention rate report
+---------+-----------+-----------+---------------+-----------+
|   ID    |  APP_MON  | ORDER_MON | TimeDiff(Mon) | TTL AMT |
+---------+-----------+-----------+---------------+-----------+
| C111111 | 2018/1/1  | 2018/1/1  |             - |  25,443   |
| C111111 | 2018/1/1  | 2018/2/1  |             1 |  7,610    |
| C111111 | 2018/1/1  | 2018/3/1  |             2 |  20,180   |
| C111111 | 2018/1/1  | 2018/4/1  |             3 |  22,265   |
| C111111 | 2018/1/1  | 2018/5/1  |             4 |  34,118   |
| C111111 | 2018/1/1  | 2018/6/1  |             5 |  19,523   |
| C111111 | 2018/1/1  | 2018/7/1  |             6 |  20,220   |
| C111111 | 2018/1/1  | 2018/8/1  |             7 |  2,006    |
| C111111 | 2018/1/1  | 2018/9/1  |             8 |  15,813   |
| C111111 | 2018/1/1  | 2018/10/1 |             9 |  16,733   |
| C111111 | 2018/1/1  | 2018/11/1 |            10 |  20,973   |
| C111112 | 2018/10/1 | 2017/11/1 |             - |  516      |
| C111112 | 2018/10/1 | 2018/10/1 |             - |  1        |
| C111113 | 2018/10/1 | Null      |             - | Null      |
| C111114 | 2018/12/1 | Null      |             - | Null      |
+---------+-----------+-----------+---------------+-----------+

Definition:
- APP_MON: the month that the customer joined, which is the max date from the start date of [d.SETUP_DT], [d.REINSTATE_DT] and [d.LOCAL_REINSTATE_DT]
- ORD_MON: the month that the customer purchased, which is the start date of the order date month
- TimeDiff: The duration by month between APP_MON and ORD_MON, e.g. if A's ODR_MON is 2018/1/1 and A'S APP_MON is 2018/2/1, the duration is 1.
- TTL_AMT: the total order amount that the customer bought in the related order date month

我试图从第3个表中获取数据. 但是我运行下面的代码,它非常慢...我需要一种更有效的方法,因为我有数百万个数据... 谢谢.

I tried to get the data from 3rd table. But I run the code below and it's very slow... I need a more effective way since I have millions of data... Thanks.

推荐答案

我认为您不需要使用unpivot.要获取最新日期,您可以使用greatest()函数.

I don't think you need to use unpivot. To get the latest date you can just use the greatest() function.

此解决方案有两个子查询,一个用于为每个新客户计算app_mon,另一个用于计算最近两年内下订单的所有客户的最早订单日期.这可能不是最有效的方法,但您的首要任务应该是获得正确的结果.拥有后,您可以根据需要对其进行调整:

This solution has two subqueries, one to calculate the app_mon for each new customer and the other to calculate the earliest order date for all customers who placed an order in the last two years. This may not be the most performative approach but your first priority should be to get the correct outcome; once you have that you can tune it if necessary:

with cust as 
(
    select d.dist_id as id
          , greatest(d.setup_dt, d.reinstate_dt, d.local_reinstate_dt) as app_mo 
    from mjensen_dev.gc_distributor d
    where d.setup_dt >= date '2017-01-01'
    or d.reinstate_dt >= date '2017-01-01'
    or d.local_reinstate_dt >= date '2017-01-01'
) , ord as 
(
    select o.dist_id as id
          , min(o.ord_dt) as ord_mon 
          , sum(o.oal) as ord_amt
    from gc_orders o
    where o.ord_dt >= date '2017-01-01'
    group by o.dist_id
          , trunc(o.ord_dt,'mm')
)
select cust.dist_id as id
       , cust.app_mon
       , ord.ord_mon
       , floor(months_between(ord.ord_mon, cust.app_mon ) as mon_diff
       , sum(o.oal) as ord_amt
from cust
     inner join gc_orders o on cust.id = o.dist_id
order by 1, 2
/

您可能希望调整我对mon_diff的计算.此计算将2018/2/1-2018/1/1视为一个月差额.因为对我来说奇怪的是,在加入之日下订单的客户的mon_diff为1而不是零.但是,如果您的业务规则说明正确,则需要在计算中加1.同样,我没有在日期的处理中包含trunc(),但您可能希望恢复它.

You may wish to tweak at my calculation of mon_diff. This calculation treats 2018/2/1 - 2018/1/1 as one month difference. Because it seems odd to me that a customer who places an order on the day they joined would have a mon_diff of 1 rather than zero. But if your statement of the business rule is correct you would need to add 1 to the calculation. Likewise I have not included the trunc() in the processing of the dates but you may wish to reinstate it.

这篇关于如何获得第三份报告以结合客户和订单数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆