如何计算与Big Query中当前行相同的工作日的滚动平均值? [英] How to calculate rolling average of same weekday as current row in Big Query?
问题描述
由于某些限制,我必须使用传统SQL。
现在,这是最后21天的滚动总和(相当于3周),但我真的很想找到一种方法来完成前一周的前3天的滚动时间,对应于当前行的同一周。
AVG(sales_total)OVER(PARTITION BY id ORDER BY date RANGE between 22 PRECEDING AND 1 PRECEDING)as avg_of_last_3_week
编辑:
表A
+ - ------ + --------- + --------- + ----------- +
| ID 。 |日期| weekday | sales_total |
+ ------- + --------- + --------- + ----------- +
| 1 | 17年1月1日| 1 | 5 |
| 2 | 17年1月2日| 2 | 。 |
| 3 | 17年1月3日| 3 | 。 |
| 1 | 17年1月8日| 1 | 10 |
| 2 | 17年1月9日| 2 | 。 |
| 3 | 17年1月10日| 3 | 。 |
| 1 | 17年1月15日| 1 | 15 |
| 2 | 17年1月16日| 2 | 。 |
| 3 | 17年1月17日| 3 | 。 |
+ ------- + --------- + --------- + ----------- +
我希望得到的查询返回表A并添加一个额外的列,它是滚动平均值(例如,下面的行将是我期望在01-22-17的id 1)。滚动平均值仅为前3个星期日(5 + 10 + 15)的平均值)。b
$ b
+ ----- - + --------- + --------- + ----------- + ----------- +
| id |日期| weekday | sales_total | rolling_avg |
+ ------- + --------- + --------- + ----------- + ----- ------ +
| 1 | 17年1月22日| 1 | 15 | 10 |
谢谢
#standardSQL
SELECT_ID,sales_date,weekday,sales_total,
AVG(sales_total)OVER(rolling_3_previous_same_weekdays)rolling_avg
FROM(
SELECT *,EXTRACT(DAYOFWEEK FROM sales_date)weekday
FROM t
)
WINDOW rolling_3_previous_same_weekdays AS(
PARTITION BY id,weekday
ORDER BY sales_date
2先行和当前行之间的行
)
ORDER BY weekday,sales_date
你可以使用下面的伪数据来测试/使用它:
$ b
#standardSQL
with t AS(
SELECT 1 id,1 AS sales_total,DATE'2017-01-01'sales_date UNION ALL
SELECT 1,2,DATE'2017-01-02'UNION ALL
SELECT 1,3,DATE'2017-01-03'UNION ALL
SELECT 1,4,DATE' 2017-01-04'UNION ALL
SELECT 1,5,DATE'2017-01-05'UNION ALL
SELECT 1,6,DATE'2017-01-06'UNION ALL
SELECT 1,7,DATE'2017-01-07'UNION ALL
SELECT 1,8,DATE'2017-01-08'UNION ALL
SELECT 1,9,DATE'2017-01-09 'UNION ALL
SELECT 1,10,DATE'2017-01-10'UNION ALL
SELECT 1,11,DATE'2017-01-11'UNION ALL
SELECT 1,12, DATE'2017-01-12'UNION ALL
SELECT 1,13,DATE'2017-01-13'UNION ALL
SELECT 1,14,DATE'2017-01-14'UNION ALL
SELECT 1,15,DATE'2017-01-15'UNION ALL
SELECT 1,16,DATE'2017-01-16'UNION ALL
SELECT 1,17,DATE'2017-01 -17'UNION ALL
SELECT 1,18,DATE'2017-01-18'UNION ALL
SELECT 1,19,DATE'2017-01-19'UNION ALL
SELECT 1, 20,DATE'2017-01-20 UNION ALL
SELECT 1,21,DATE'2017-01-21'UNION ALL
SELECT 1,22,DATE'2017-01-22'UNION ALL
SELECT 1,23,DATE '2017-01-23'UNION ALL
SELECT 1,24,DATE'2017-01-24'UNION ALL
SELECT 1,25,DATE'2017-01-25'UNION ALL
SELECT 1,26,DATE'2017-01-26'UNION ALL
SELECT 1,27,DATE'2017-01-27'UNION ALL
SELECT 1,28,DATE'2017-01- 28'UNION ALL
SELECT 1,29,DATE'2017-01-29'UNION ALL
SELECT 1,30,DATE'2017-01-30'UNION ALL
SELECT 1,31 ,DATE'2017-01-31'
)
SELECT_ID,sales_date,weekday,sales_total,
AVG(sales_total)OVER(rolling_3_previous_same_weekdays)rolling_avg
FROM(
SELECT *,EXTRACT(DAYOFWEEK FROM sales_date)weekday
FROM`project.dataset.your_table
)
WINDOW rolling_3_previous_same_weekdays AS(
PARTITION BY id,weekday
ORDER BY sales_date
2个前置和当前行之间的行
)
ORDER BY weekday,sales_date
我希望您一旦想到以上方法 - 您可以使用BigQuery Legacy SQL轻松复制它 - 此处特定于Standrad SQL的唯一函数是EXTRACT() - 但看起来您甚至不需要它,因为周日已经是您数据的一部分了。
祝你好运! :o)
Due to certain restrictions, I have to use legacy SQL.
Right now, I have this which is a rolling sum of last 21 days(corresponding to 3 weeks), but I am really looking for a way to do a rolling sum of 3 previous weekday corresponding to the same week day as the current row.
AVG(sales_total) OVER (PARTITION BY id ORDER BY date RANGE BETWEEN 22 PRECEDING AND 1 PRECEDING) AS avg_of_last_3_week
EDIT:
Table A
+-------+---------+---------+-----------+
| id . | date | weekday |sales_total|
+-------+---------+---------+-----------+
| 1 | 01-01-17| 1 | 5 |
| 2 | 01-02-17| 2 | . |
| 3 | 01-03-17| 3 | . |
| 1 | 01-08-17| 1 | 10 |
| 2 | 01-09-17| 2 | . |
| 3 | 01-10-17| 3 | . |
| 1 | 01-15-17| 1 | 15 |
| 2 | 01-16-17| 2 | . |
| 3 | 01-17-17| 3 | . |
+-------+---------+---------+-----------+
I would like the resulting query to return Table A with an extra column which is the rolling average (for example, the below row would be what I expect for id 1 on 01-22-17). The rolling average is only the average for the previous 3 sundays (5+10+15))
+-------+---------+---------+-----------+-----------+
| id | date | weekday |sales_total|rolling_avg|
+-------+---------+---------+-----------+-----------+
| 1 | 01-22-17| 1 | 15 | 10 |
Thank you
Below example is for BigQuery Standard SQL (and if you still bound to Legacy SQL - you can easily "translate below to Legacy)
#standardSQL
SELECT id, sales_date, weekday, sales_total,
AVG(sales_total) OVER(rolling_3_previous_same_weekdays) rolling_avg
FROM (
SELECT *, EXTRACT(DAYOFWEEK FROM sales_date) weekday
FROM t
)
WINDOW rolling_3_previous_same_weekdays AS (
PARTITION BY id, weekday
ORDER BY sales_date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
)
ORDER BY weekday, sales_date
you can test / play with it using dummy data as below
#standardSQL
WITH t AS (
SELECT 1 id, 1 AS sales_total, DATE '2017-01-01' sales_date UNION ALL
SELECT 1, 2, DATE '2017-01-02' UNION ALL
SELECT 1, 3, DATE '2017-01-03' UNION ALL
SELECT 1, 4, DATE '2017-01-04' UNION ALL
SELECT 1, 5, DATE '2017-01-05' UNION ALL
SELECT 1, 6, DATE '2017-01-06' UNION ALL
SELECT 1, 7, DATE '2017-01-07' UNION ALL
SELECT 1, 8, DATE '2017-01-08' UNION ALL
SELECT 1, 9, DATE '2017-01-09' UNION ALL
SELECT 1, 10, DATE '2017-01-10' UNION ALL
SELECT 1, 11, DATE '2017-01-11' UNION ALL
SELECT 1, 12, DATE '2017-01-12' UNION ALL
SELECT 1, 13, DATE '2017-01-13' UNION ALL
SELECT 1, 14, DATE '2017-01-14' UNION ALL
SELECT 1, 15, DATE '2017-01-15' UNION ALL
SELECT 1, 16, DATE '2017-01-16' UNION ALL
SELECT 1, 17, DATE '2017-01-17' UNION ALL
SELECT 1, 18, DATE '2017-01-18' UNION ALL
SELECT 1, 19, DATE '2017-01-19' UNION ALL
SELECT 1, 20, DATE '2017-01-20' UNION ALL
SELECT 1, 21, DATE '2017-01-21' UNION ALL
SELECT 1, 22, DATE '2017-01-22' UNION ALL
SELECT 1, 23, DATE '2017-01-23' UNION ALL
SELECT 1, 24, DATE '2017-01-24' UNION ALL
SELECT 1, 25, DATE '2017-01-25' UNION ALL
SELECT 1, 26, DATE '2017-01-26' UNION ALL
SELECT 1, 27, DATE '2017-01-27' UNION ALL
SELECT 1, 28, DATE '2017-01-28' UNION ALL
SELECT 1, 29, DATE '2017-01-29' UNION ALL
SELECT 1, 30, DATE '2017-01-30' UNION ALL
SELECT 1, 31, DATE '2017-01-31'
)
SELECT id, sales_date, weekday, sales_total,
AVG(sales_total) OVER(rolling_3_previous_same_weekdays) rolling_avg
FROM (
SELECT *, EXTRACT(DAYOFWEEK FROM sales_date) weekday
FROM `project.dataset.your_table
)
WINDOW rolling_3_previous_same_weekdays AS (
PARTITION BY id, weekday
ORDER BY sales_date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
)
ORDER BY weekday, sales_date
I hope as soon as you get an idea of above approach - you can easily reproduce it with BigQuery Legacy SQL - the only function used here that specific to Standrad SQL is EXTRACT() - but looks like you will not even need it as weekday already part of your data
Good luck! :o)
这篇关于如何计算与Big Query中当前行相同的工作日的滚动平均值?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!