How do I get an array of IDs having a specific value as their latest value at a given time in BigQuery?

Problem description

I have a BigQuery table with the following data:

    SELECT DATE("2019-11-11") as date, "old" as state, 1 as id UNION ALL 
    SELECT DATE("2019-11-12"), "new", 1 UNION ALL 
    SELECT DATE("2019-11-13"), "new" , 2 UNION ALL 
    SELECT DATE("2019-11-14"), "old" , 1

I want to get all IDs that are in the "new" state on each day. A state should persist until a later row says otherwise (in this example, a switch to "old"). How do I do this? I tried working with ARRAY_AGG, but could not come up with a solution that both looks at the latest value for each id and checks whether that value is "new".

So for the example above, I would like the output to be:

date      | new_state_ids

2019-11-11| NULL
2019-11-12| [1]
2019-11-13| [1,2]
2019-11-14| [2]

Thanks in advance for any help!

Recommended answer

Consider the query below:

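    select date, state, id,
      case state when 'new' then
          -- running list of ids within the current run of consecutive 'new' rows
          array_agg(id) over(partition by grp order by date)
        -- for any other state, report only the id from the previous row (if any)
        else if(prev_id is null, null, [prev_id])
      end new_state_ids
    from (
      -- grp: running count of state changes, so each run of equal states shares a number
      select *, countif(new_grp) over(order by date) grp
      from (
        select *,
          -- new_grp: true when the state differs from the previous row's state
          state != lag(state) over(order by date) new_grp,
          -- prev_id: id of the previous row by date
          lag(id) over(order by date) prev_id
        from `project.dataset.table`
      )
    )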
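
To see what the helper columns in the recommended query hold, the two inner subqueries can be run on their own against the sample rows from the question. This is only an illustrative sketch: the CTE name `events` is made up here and stands in for `project.dataset.table`.

```sql
-- Sample rows copied from the question; `events` is a hypothetical name.
with events as (
  select date("2019-11-11") as date, "old" as state, 1 as id union all
  select date("2019-11-12"), "new", 1 union all
  select date("2019-11-13"), "new", 2 union all
  select date("2019-11-14"), "old", 1
)
select *, countif(new_grp) over(order by date) as grp
from (
  select *,
    -- NULL on the first row, because there is no previous state to compare with
    state != lag(state) over(order by date) as new_grp,
    lag(id) over(order by date) as prev_id
  from events
)
order by date
-- date        state  id  new_grp  prev_id  grp
-- 2019-11-11  old    1   NULL     NULL     0
-- 2019-11-12  new    1   true     NULL→1   1
-- 2019-11-13  new    2   false    1        1
-- 2019-11-14  old    1   true     2        2
-- (prev_id on 2019-11-12 is 1, the id of the 2019-11-11 row)
```

The two "new" rows share grp = 1, which is why the windowed `array_agg` accumulates `[1]` and then `[1,2]` for them.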

Applied to the sample data in your question, the recommended query returns NULL, [1], [1,2], and [2] in new_state_ids for the four dates, which matches the expected output.
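
Note that for an "old" row the recommended query reports only the previous row's id, so it depends on how the sample rows happen to be ordered. For tables with more ids, one possible alternative (not the answer author's method, just a sketch under stated assumptions) is to take each id's latest event on or before every report day and then collect the ids whose latest state is "new". The names `events`, `days`, `latest`, and `report_date` are made up for illustration.

```sql
-- Sample rows copied from the question; replace this CTE with your own table.
with events as (
  select date("2019-11-11") as date, "old" as state, 1 as id union all
  select date("2019-11-12"), "new", 1 union all
  select date("2019-11-13"), "new", 2 union all
  select date("2019-11-14"), "old", 1
),
-- Assumption: report only on days that have at least one event.
days as (
  select distinct date as report_date from events
),
-- For every report day, keep the state from each id's most recent event
-- dated on or before that day.
latest as (
  select
    d.report_date,
    e.id,
    array_agg(e.state order by e.date desc limit 1)[offset(0)] as state
  from days d
  join events e on e.date <= d.report_date
  group by d.report_date, e.id
)
select
  report_date as date,
  -- collect only the ids whose latest state is 'new'; other ids become NULL and are skipped
  array_agg(if(state = 'new', id, null) ignore nulls order by id) as new_state_ids
from latest
group by report_date
order by report_date
```

On the sample rows this gives id 1 for 2019-11-12, ids 1 and 2 for 2019-11-13, and id 2 for 2019-11-14; on 2019-11-11 no id is in the "new" state. If days without events should also appear, `days` could instead be built with `unnest(generate_date_array(...))` over the date range of interest. The inequality join compares every report day with every earlier event, so it may be costly on large tables.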
