BigQuery - Moving median calculation

Question

I have monthly sales data like this:

Company  Month    Sales
Adidas   2018-09   100
Adidas   2018-08    95
Adidas   2018-07   120
Adidas   2018-06   155
...and so on

I need to add another column giving the median over the past 12 months (or over as many months as are available, if there are fewer than 12).

In Python I figured out how to do this with for loops, but I'm not sure how to do it in BigQuery.

Thanks!

Recommended answer

Here is an approach that might work:

-- Median of a numeric array: sort the elements, then take the middle one
-- (odd length) or the mean of the two middle ones (even length).
CREATE TEMP FUNCTION MEDIAN(arr ANY TYPE) AS ((
  SELECT
    IF(
      MOD(ARRAY_LENGTH(arr), 2) = 0,
      (arr[OFFSET(DIV(ARRAY_LENGTH(arr), 2) - 1)] + arr[OFFSET(DIV(ARRAY_LENGTH(arr), 2))]) / 2,
      arr[OFFSET(DIV(ARRAY_LENGTH(arr), 2))]
    )
  FROM (SELECT ARRAY_AGG(x ORDER BY x) AS arr FROM UNNEST(arr) AS x)
));

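-- For each row, collect Sales from the current row plus up to 11 earlier
-- rows of the same company (ordered by Month) and take their median.
SELECT
  Company,
  Month,
  MEDIAN(
    ARRAY_AGG(Sales) OVER (PARTITION BY Company ORDER BY Month ROWS BETWEEN 11 PRECEDING AND CURRENT ROW)
  ) AS trailing_median
FROM (
  SELECT 'Adidas' AS Company, '2018-09' AS Month, 100 AS Sales UNION ALL
  SELECT 'Adidas', '2018-08', 95 UNION ALL
  SELECT 'Adidas', '2018-07', 120 UNION ALL
  SELECT 'Adidas', '2018-06', 155
)
-- Without an ORDER BY the order of the output rows is not guaranteed.
ORDER BY Company, Month;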
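Since the question mentions a Python version with for loops, the logic of the query above can be sketched in plain Python for comparison. This is only an illustration of what the SQL computes, not a replacement for it; the function names `median_sorted_middle` and `trailing_median` are made up for this sketch and are not part of BigQuery.

```python
from collections import defaultdict


def median_sorted_middle(values):
    """Sort the values, then take the middle one (odd count)
    or the mean of the two middle ones (even count)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 0:
        return (ordered[mid - 1] + ordered[mid]) / 2
    return float(ordered[mid])


def trailing_median(rows, window=12):
    """rows: iterable of (company, month, sales) tuples.
    Returns {(company, month): median over the current row and up to
    window - 1 earlier rows of the same company}."""
    by_company = defaultdict(list)
    for company, month, sales in rows:
        by_company[company].append((month, sales))

    result = {}
    for company, series in by_company.items():
        series.sort()  # 'YYYY-MM' strings sort chronologically
        for i, (month, _) in enumerate(series):
            start = max(0, i - (window - 1))
            frame = [sales for _, sales in series[start:i + 1]]
            result[(company, month)] = median_sorted_middle(frame)
    return result


# Same sample rows as in the query.
rows = [
    ("Adidas", "2018-09", 100),
    ("Adidas", "2018-08", 95),
    ("Adidas", "2018-07", 120),
    ("Adidas", "2018-06", 155),
]
medians = trailing_median(rows)
for key in sorted(medians):
    print(key, medians[key])
```

The slice `series[start:i + 1]` plays the role of `ROWS BETWEEN 11 PRECEDING AND CURRENT ROW`, and grouping by company stands in for `PARTITION BY Company`.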

The result is:

+---------+---------+-----------------+
| Company |  Month  | trailing_median |
+---------+---------+-----------------+
| Adidas  | 2018-06 |           155.0 |
| Adidas  | 2018-07 |           137.5 |
| Adidas  | 2018-08 |           120.0 |
| Adidas  | 2018-09 |           110.0 |
+---------+---------+-----------------+
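One thing worth checking before relying on this: a `ROWS BETWEEN 11 PRECEDING AND CURRENT ROW` frame counts rows, not calendar months. If a company has months with no row at all, the frame can reach back further than 12 months. The Python sketch below contrasts the two interpretations. The data is hypothetical (it has a deliberate gap), and the helper names are invented for this illustration.

```python
from statistics import median


def month_index(month):
    """Convert 'YYYY-MM' to a running month count so months can be subtracted."""
    year, mon = month.split("-")
    return int(year) * 12 + int(mon)


def row_trailing_median(series, rows=12):
    """Median over the current row and up to rows - 1 earlier rows,
    regardless of how far back in time those rows are."""
    ordered = sorted(series)
    out = {}
    for i, (month, _) in enumerate(ordered):
        frame = [sales for _, sales in ordered[max(0, i - rows + 1):i + 1]]
        out[month] = float(median(frame))
    return out


def calendar_trailing_median(series, months=12):
    """Median over rows whose month falls within the trailing
    `months` calendar months, current month included."""
    out = {}
    for month, _ in series:
        current = month_index(month)
        frame = [sales for m, sales in series
                 if 0 <= current - month_index(m) < months]
        out[month] = float(median(frame))
    return out


# Hypothetical series for one company: no rows between 2017-01 and 2018-08.
series = [("2017-01", 500), ("2018-08", 95), ("2018-09", 100)]
print(row_trailing_median(series))
print(calendar_trailing_median(series))
```

With this gap, the row-based version still includes the 2017-01 value in the 2018 medians (297.5 and 100.0), while the calendar-based version excludes it (95.0 and 97.5). If every company has exactly one row per month, the two agree and the query above is sufficient. Otherwise, one possible direction in BigQuery, not verified here, is to fill in the missing months first or to order the window by a numeric month index and use a `RANGE` frame.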
