从 Teradata 到 PROC SQL 的 SQL 语句语法 [英] SQL statment syntax from Teradata to PROC SQL
问题描述
我是 SAS 新手,我正在尝试使用 proc sql 构建一些查询.
I am new to SAS and I am trying to build some query using proc sql.
我在 sql 中有以下代码(teradata 语法):
I have the following code in sql (teradata syntax):
create multiset volatile table mvt_data, no log
( date1 date format 'yyyy-mm-dd'
, flag1 byteint
,cust_id varchar(25)
)
primary index (date1)
on commit preserve rows;
insert into mvt_data values (date '2020-01-03', 1, 'A');
insert into mvt_data values (date '2020-02-04', 0, 'A');
insert into mvt_data values (date '2020-04-05', 0, 'B');
insert into mvt_data values (date '2020-01-19', 1, 'C');
insert into mvt_data values (date '2020-03-20', 1, 'B');
insert into mvt_data values (date '2020-06-20', 1, 'D');
我想要以下输出:
对于 date1 的每个 Cust_id 和 YYYYMM 求和过去 6 个月累积的 Flag1 和总 Flag1.
For each Cust_id and YYYYMM of date1 sum the Flag1 and sum Flag1 cumulative with respect to last 6 months.
我会用同样的语法:
with cte_data_ts (ts1, Flag1) as
(
select cast(date1 as timestamp(0)), Flag1
from mvt_data
)
, cte_gbt (YearMonth, SumFlag1, cust_id) as
(
select cust_id as cust_id
,to_char(begin($TD_TIMECODE_RANGE) at 0, 'yyyymm') as YearMonth
, sum(flag1) as SumFlag1
from cte_data_ts
group by time(cal_months(1))
using timecode(ts1)
fill (0)
)
select cust_id, YearMonth, SumFlag1
, sum(SumFlag1) over(order by cust_id, YearMonth asc rows between 6 preceding and current row) as SumFlag1_last6Months
from cte_gbt;
不幸的是,我知道 proc sql 中不存在 over(),所以有人能帮我实现相同的结果吗?
Unfortunatelly I know that over() does not exists in proc sql, so can anyone help me achieve the same result ?
我添加了一个新列 (Cust_ID)
I added a new column (Cust_ID)
推荐答案
所以首先让我们将 SQL 数据加载转换为正常的 SAS 数据步骤,以便处理一些数据.
So first let's convert the SQL data load into normal SAS data step to have some data to work with.
data mvt_data;
input date1 :yymmdd. flag1 cust_id $;
format date1 yymmdd10.;
cards;
2020-01-03 1 A
2020-02-04 0 A
2020-04-05 0 B
2020-01-19 1 C
2020-03-20 1 B
2020-06-20 1 D
;
这是一种生成未出现在源中的月份的观察值的方法.
Here is a way to generate observations for months that don't appear in the source.
data YearMonth;
min='01JAN2020'd ;
max='01JUN2020'd;
do offset=0 to intck('month',min,max);
do YearMonth=intnx('month',min,offset,'b') ;
output;
end;
end;
keep YearMonth ;
format YearMonth yymmn6.;
run;
或者计算 DATE1 的最小值/最大值并使用它们来设置日期范围.
Or calculate the min/max values of DATE1 and use those to set the range of dates.
如果您没有其他来源的 Cust_id 值列表,您可以查询数据
If you don't have another source for the list of Cust_id values you can query the data
select distinct cust_id from mvt_data
所以加入两者以获得完整的 YearMonth*Cust_id 组合.
So join the two to get the full YearMonth*Cust_id combinations.
select x.cust_id,y.YearMonth
from (select distinct cust_id from mvt_data) x
, YearMonth y
现在将其与实际数据结合以获得每月设置 FLAG1 的计数.您可以利用 SAS 将布尔表达式计算为 0/1 值这一事实来简化代码.(在通用 SQL 实现中,您需要使用 CASE)因此,当标记观察的日期在间隔内时,它对 SUM() 贡献 1.
Now join that with the actual data to get the count number where FLAG1 is set per month. You can use the fact that SAS evaluates boolean expressions to 0/1 values to make the code simpler. (In generic SQL implementation you would need to use CASE) So when the date for the flagged observation is in the interval then it contributes 1 to the SUM().
proc sql ;
create table want as
select a.cust_id
, a.YearMonth
, sum(b.date1 between intnx('month',a.YearMonth,0,'b')
and intnx('month',a.YearMonth,0,'e') ) as SumFlag1
, sum(b.date1 between intnx('month',a.YearMonth,-6,'b')
and intnx('month',a.YearMonth,0,'e') ) as SumFlag1_last6Months
, sum(b.date1 between intnx('month',a.YearMonth,-12,'b')
and intnx('month',a.YearMonth,0,'e') ) as SumFlag1_last12Months
from
(select x.cust_id,y.YearMonth
from (select distinct cust_id from mvt_data) x
, YearMonth y
) a
left join mvt_data b
on a.cust_id = b.cust_id and b.Flag1=1
group by 1,2
;
quit;
结果:
Year Sum SumFlag1_ SumFlag1_
Obs cust_id Month Flag1 last6Months last12Months
1 A 202001 1 1 1
2 A 202002 0 1 1
3 A 202003 0 1 1
4 A 202004 0 1 1
5 A 202005 0 1 1
6 A 202006 0 1 1
7 B 202001 0 0 0
8 B 202002 0 0 0
9 B 202003 1 1 1
10 B 202004 0 1 1
11 B 202005 0 1 1
12 B 202006 0 1 1
...
这篇关于从 Teradata 到 PROC SQL 的 SQL 语句语法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!