Duplicating records to fill gap between dates in Google BigQuery


Question

So I've found similar resources that address how to do this in SQL, like this: Duplicating records to fill gap between dates

I understand that BigQuery may not be the best place to do this, so I'm trying to see if it's possible at all. When I try to run some of the methods from the link above, I hit a wall because some of the functions aren't supported in BigQuery.

If a table exists with data structured like so:

    MODIFY_DATE             SKU         STORE   STOCK_ON_HAND
    08/01/2016 00:00:00     1120010     21      100
    08/05/2016 00:00:00     1120010     21      75
    08/07/2016 00:00:00     1120010     21      40

How can I build a query within Google BigQuery that yields an output like the one below? A value at a given date is repeated until the next change for the dates in between:

    MODIFY_DATE             SKU         STORE   STOCK_ON_HAND
    08/01/2016 00:00:00     1120010     21      100
    08/02/2016 00:00:00     1120010     21      100
    08/03/2016 00:00:00     1120010     21      100
    08/04/2016 00:00:00     1120010     21      100
    08/05/2016 00:00:00     1120010     21      75
    08/06/2016 00:00:00     1120010     21      75
    08/07/2016 00:00:00     1120010     21      40

I know I need to generate a table that has all the dates within a given range, but I'm having a hard time understanding if this can be done. Any ideas?

Recommended answer


How can I build a query within Google BigQuery that yields an output like the one below? A value at a given date is repeated until the next change for the dates in between

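See the example below, written in BigQuery Legacy SQL. It generates one row per day in the range, left-joins the table with gaps onto those days, and then fills each gap with the most recent earlier value: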

#legacySQL
-- Outer query: every day in a group (grp) takes the values of that group's single real record.
SELECT
  MODIFY_DATE,
  MAX(SKU_TEMP) OVER(PARTITION BY grp) AS SKU,
  MAX(STORE_TEMP) OVER(PARTITION BY grp) AS STORE,
  MAX(STOCK_ON_HAND_TEMP) OVER(PARTITION BY grp) AS STOCK_ON_HAND,
FROM (
  -- grp is a running count of non-NULL SKU values, so it increases by 1 at each real record.
  SELECT
    DAY AS MODIFY_DATE, SKU AS SKU_TEMP, STORE AS STORE_TEMP, STOCK_ON_HAND AS STOCK_ON_HAND_TEMP,
    COUNT(SKU) OVER(ORDER BY DAY ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS grp,
  FROM (
    -- DATES: one row per day from 2016-08-01 through 2016-08-07.
    SELECT DATE(DATE_ADD(TIMESTAMP("2016-08-01"), pos - 1, "DAY")) AS DAY
    FROM (
         SELECT ROW_NUMBER() OVER() AS pos, *
         FROM (FLATTEN((
         -- Pad a string to (number of days) characters, then split it into one row per character.
         SELECT SPLIT(RPAD('', 1 + DATEDIFF(TIMESTAMP("2016-08-07"), TIMESTAMP("2016-08-01")), '.'),'') AS h
         FROM (SELECT NULL)),h
    )))
  ) AS DATES
  LEFT JOIN (
    -- Inline sample rows standing in for the real table (values differ slightly from the question).
    SELECT DATE(MODIFY_DATE) AS MODIFY_DATE, SKU, STORE, STOCK_ON_HAND
    FROM
      (SELECT "2016-08-01" AS MODIFY_DATE, "1120010" AS SKU, 21 AS STORE, 75 AS STOCK_ON_HAND),
      (SELECT "2016-08-05" AS MODIFY_DATE, "1120010" AS SKU, 22 AS STORE, 100 AS STOCK_ON_HAND),
      (SELECT "2016-08-07" AS MODIFY_DATE, "1120011" AS SKU, 23 AS STORE, 40 AS STOCK_ON_HAND),
  ) AS TABLE_WITH_GAPS
  ON TABLE_WITH_GAPS.MODIFY_DATE = DATES.DAY
)
ORDER BY MODIFY_DATE
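
For comparison, the same idea can probably be written more directly in BigQuery Standard SQL. This is only a minimal sketch and not part of the original answer. It assumes the Standard SQL dialect is in use. The names `table_with_gaps`, `dates` and `calendar_day` are placeholders, and the inline rows are the sample data from the question. It builds the calendar with `GENERATE_DATE_ARRAY` and carries values forward with `LAST_VALUE(... IGNORE NULLS)`:

```sql
-- BigQuery Standard SQL sketch; names are illustrative, not from the original answer.
WITH table_with_gaps AS (
  -- Sample rows from the question; replace this CTE with the real table.
  SELECT DATE '2016-08-01' AS modify_date, '1120010' AS sku, 21 AS store, 100 AS stock_on_hand
  UNION ALL SELECT DATE '2016-08-05', '1120010', 21, 75
  UNION ALL SELECT DATE '2016-08-07', '1120010', 21, 40
),
dates AS (
  -- One row per calendar day in the assumed reporting range.
  SELECT calendar_day
  FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2016-08-01', DATE '2016-08-07', INTERVAL 1 DAY)) AS calendar_day
)
SELECT
  d.calendar_day AS modify_date,
  -- On gap days the joined columns are NULL, so take the last non-NULL value seen so far.
  LAST_VALUE(t.sku IGNORE NULLS)
    OVER (ORDER BY d.calendar_day ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS sku,
  LAST_VALUE(t.store IGNORE NULLS)
    OVER (ORDER BY d.calendar_day ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS store,
  LAST_VALUE(t.stock_on_hand IGNORE NULLS)
    OVER (ORDER BY d.calendar_day ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS stock_on_hand
FROM dates AS d
LEFT JOIN table_with_gaps AS t
  ON t.modify_date = d.calendar_day
ORDER BY modify_date
```

Like the legacy query, this sketch handles a single SKU/STORE series. For several series, one option would be to cross join the calendar with the distinct SKU/STORE pairs and add `PARTITION BY sku, store` to each window.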
