How to calculate retention month over month using SQL


Problem description

I'm trying to build a basic table that shows retention from one month to the next: if someone buys something in one month and buys again the following month, they get counted.

month, num_transactions, repeat_transactions, retention
2012-02, 5, 2, 40%
2012-03, 10, 3, 30%
2012-04, 15, 8, 53%

So if everyone that bought last month bought again the following month you have 100%.
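To make the arithmetic behind the retention column explicit, here is a minimal sketch that recomputes it from the other two columns of the example table. It assumes retention is simply repeat_transactions divided by num_transactions, rounded to a whole percent; the helper name retention_pct is made up for illustration.

```python
# Rows copied from the example table: (month, num_transactions, repeat_transactions)
rows = [
    ("2012-02", 5, 2),
    ("2012-03", 10, 3),
    ("2012-04", 15, 8),
]


def retention_pct(num_transactions, repeat_transactions):
    """Share of a month's transactions that are repeats, as a whole percent."""
    if num_transactions == 0:
        return 0  # avoid division by zero for an empty month
    return round(100 * repeat_transactions / num_transactions)


for month, num, repeat in rows:
    print(month, f"{retention_pct(num, repeat)}%")
```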

So far I can only calculate stuff manually. This gives me the rows that have been seen in both months:

select count(*) as num_repeat_buyers from 

(select distinct
  to_char(transaction.timestamp, 'YYYY-MM') as month,
  auth_user.email
from
  auth_user,
  transaction
where
  auth_user.id = transaction.buyer_id and
  to_char(transaction.timestamp, 'YYYY-MM') = '2012-03'
) as table1,


(select distinct
  to_char(transaction.timestamp, 'YYYY-MM') as month,
  auth_user.email
from
  auth_user,
  transaction
where
  auth_user.id = transaction.buyer_id and
  to_char(transaction.timestamp, 'YYYY-MM') = '2012-04'
) as table2
where table1.email = table2.email

This is not right, but I feel like I can use some of Postgres' window functions. Keep in mind that window functions don't let you specify WHERE clauses; you mostly have access to the preceding and following rows:

select month, count(*) as num_transactions, count(*) over (PARTITION BY month ORDER BY month)
from 
    (select distinct
      to_char(transaction.timestamp, 'YYYY-MM') as month,
      auth_user.email
    from
      auth_user,
      transaction
    where
      auth_user.id = transaction.buyer_id
    order by
      month
    ) as transactions_by_month
group by
    month

Solution

Given the following test table (which you should have provided):

CREATE TEMP TABLE transaction (buyer_id int, tstamp timestamp);
INSERT INTO transaction VALUES 
 (1,'2012-01-03 20:00')
,(1,'2012-01-05 20:00')
,(1,'2012-01-07 20:00')  -- multiple transactions this month
,(1,'2012-02-03 20:00')  -- next month
,(1,'2012-03-05 20:00')  -- next month
,(2,'2012-01-07 20:00')
,(2,'2012-03-07 20:00')  -- not next month
,(3,'2012-01-07 20:00')  -- just once
,(4,'2012-02-07 20:00'); -- just once

Table auth_user is not relevant to the problem.
I use tstamp as the column name because timestamp is the name of a base type, and I don't use base type names as identifiers.

I am going to use the window function lag() to identify repeated buyers. To keep it short I combine aggregate and window functions in one query level. Bear in mind that window functions are applied after aggregate functions.

WITH t AS (
   SELECT buyer_id
         ,date_trunc('month', tstamp) AS month
         ,count(*) AS item_transactions
         ,lag(date_trunc('month', tstamp)) OVER (PARTITION BY  buyer_id
                                           ORDER BY date_trunc('month', tstamp)) 
          = date_trunc('month', tstamp) - interval '1 month'
            OR NULL AS repeat_transaction
   FROM   transaction
   WHERE  tstamp >= '2012-01-01'::date
   AND    tstamp <  '2012-05-01'::date -- time range of interest.
   GROUP  BY 1, 2
   )
SELECT month
      ,sum(item_transactions) AS num_trans
      ,count(*) AS num_buyers
      ,count(repeat_transaction) AS repeat_buyers
      ,round(
          CASE WHEN sum(item_transactions) > 0
             THEN count(repeat_transaction) / sum(item_transactions) * 100
             ELSE 0
          END, 2) AS buyer_retention_pct
FROM   t
GROUP  BY 1
ORDER  BY 1;

Result:

  month  | num_trans | num_buyers | repeat_buyers | buyer_retention_pct
---------+-----------+------------+---------------+--------------------
 2012-01 |         5 |          3 |             0 |               0.00
 2012-02 |         2 |          2 |             1 |              50.00
 2012-03 |         2 |          2 |             1 |              50.00
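
For readers without a PostgreSQL instance at hand, the counts above can be cross-checked with a rough sketch using Python's built-in sqlite3 module. This is not the query from the answer: it assumes SQLite 3.25 or later (for window functions), replaces date_trunc() with strftime(), splits aggregation and lag() into two CTEs for readability, and uses a table called tx. All of those names and choices are assumptions made for illustration only.

```python
import sqlite3

# Same test rows as the temp table above.
ROWS = [
    (1, "2012-01-03 20:00"), (1, "2012-01-05 20:00"), (1, "2012-01-07 20:00"),
    (1, "2012-02-03 20:00"), (1, "2012-03-05 20:00"),
    (2, "2012-01-07 20:00"), (2, "2012-03-07 20:00"),
    (3, "2012-01-07 20:00"), (4, "2012-02-07 20:00"),
]

QUERY = """
WITH per_buyer AS (            -- one row per buyer and month
    SELECT buyer_id,
           strftime('%Y-%m', tstamp) AS month,
           count(*)                  AS item_transactions
    FROM   tx
    GROUP  BY buyer_id, strftime('%Y-%m', tstamp)
), flagged AS (
    SELECT buyer_id, month, item_transactions,
           -- previous month in which this buyer bought anything
           lag(month) OVER (PARTITION BY buyer_id ORDER BY month) AS prev_month,
           -- the calendar month directly before the current one
           strftime('%Y-%m', month || '-01', '-1 month')          AS month_before
    FROM   per_buyer
)
SELECT month,
       sum(item_transactions)                                AS num_trans,
       count(*)                                              AS num_buyers,
       -- CASE without ELSE yields NULL, which count() skips
       count(CASE WHEN prev_month = month_before THEN 1 END) AS repeat_buyers
FROM   flagged
GROUP  BY month
ORDER  BY month
"""


def monthly_retention(rows):
    """Load rows into an in-memory SQLite table and return the monthly counts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tx (buyer_id INTEGER, tstamp TEXT)")
    conn.executemany("INSERT INTO tx VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in monthly_retention(ROWS):
    print(row)
```

Buyer 2 is the interesting case: the previous purchase month is 2012-01, but the month before 2012-03 is 2012-02, so that row is not counted as a repeat.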

I extended your question to provide for the difference between the number of transactions and the number of buyers.

The OR NULL for repeat_transaction serves to convert FALSE to NULL, so those values do not get counted by count() in the next step.
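
The effect of OR NULL can be seen in isolation with a small sketch. It uses SQLite through Python's sqlite3 module rather than PostgreSQL, on the assumption that both follow the same three-valued logic here; SQLite has no boolean type, so 1 and 0 stand in for TRUE and FALSE, and the table and function names are invented for the example.

```python
import sqlite3


def count_flags(flags):
    """Return (count of non-NULL flags, count of true flags via OR NULL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE flags (is_repeat INTEGER)")
    conn.executemany("INSERT INTO flags VALUES (?)", [(f,) for f in flags])
    row = conn.execute(
        # count(x) skips NULLs only; "x OR NULL" turns false into NULL as well
        "SELECT count(is_repeat), count(is_repeat OR NULL) FROM flags"
    ).fetchone()
    conn.close()
    return row


# 1 = true, 0 = false, None = NULL (no previous month for that buyer)
print(count_flags([1, 0, 0, None, 1]))
```

A plain count(is_repeat) would count the false rows too, which is why the query converts them to NULL first.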

