有没有办法在Big Query中进行滚动平均? [英] Is there a way to do rolling averages in Big Query?

查看:83
本文介绍了有没有办法在Big Query中进行滚动平均?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我知道Big Query中有一个AVG函数,并且有窗口函数可将上一个或下一个值上移或下移,但是是否有任何函数可以让您在指定的间隔内取平均值?例如,我想要这样的东西:

I know there is an AVG function in Big Query and there are window functions to shift previous or next values up or down a row, but is there any function that allows you to average over a specified interval? For instance I'd like to so something like the following:

SELECT
    city
    AVG(temperature) OVER(PARTITION BY city, INTERVAL day,14, ORDER BY day) as rolling_avg_14_days,
    AVG(temperature) OVER(PARTITION BY city, INTERVAL day,30, ORDER BY day) as rolling_avg_30_days,
WHERE
    city IN ("Los Angeles","Chicago","Sun Prairie","Sunnyvale")
    AND year BETWEEN 1900 AND 2013

我想进行一次滚动平均计算,该计算允许我指定要进行聚合函数的值的范围以及要排序的值.平均值函数将使用当前天的温度和之前的13天(或之前的29天)进行计算和平均.今天有可能吗?我知道,如果我在SELECT语句中放入13个LAG/OVER字段,然后对所有这些字段的结果求平均值,我会做类似的事情,但这会产生很多开销.

I'd like to do a rolling average calculation that allows me to specify a range of values to do an aggregation function over, and what value to order by. The average function would take the current day temp and previous 13 days (or previous 29) to calculate and average. Is this possible today? I know I could do something like this if I put 13 LAG/OVER fields in the SELECT statement and then avg the results of all of them, but that is a lot of overhead.

推荐答案

我认为

I think OVER with RANGE construction of Window Functions is best fit here

假设日期"字段表示为"YYYY-MM-DD"格式,则在下面的查询中进行滚动平均值

Assuming that day field is represented as 'YYYY-MM-DD' format, below query does the rolling averages

SELECT
  city,
  day,
  AVG(temperature) OVER(PARTITION BY city ORDER BY ts 
                RANGE BETWEEN 14*24*3600 PRECEDING AND CURRENT ROW) AS rolling_avg_14_days,
  AVG(temperature) OVER(PARTITION BY city ORDER BY ts 
                RANGE BETWEEN 30*24*3600 PRECEDING AND CURRENT ROW) AS rolling_avg_30_days
FROM (
  SELECT day, city, temperature, TIMESTAMP_TO_SEC(TIMESTAMP(day)) AS ts 
  FROM temperatures
)

您很可能很早以前就已经找到了该解决方案,但仍然想在这里获得我认为更好的答案(截至今天)

You most likely already found this solution long ago, but still wanted to have what i think a better answer (as of today) here for this question

这篇关于有没有办法在Big Query中进行滚动平均?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆