How do I get an array of ids having a specific value as their latest value at a given time in BigQuery?
Question
I have a BigQuery table with the following data:
SELECT DATE("2019-11-11") AS date, "old" AS state, 1 AS id UNION ALL
SELECT DATE("2019-11-12"), "new", 1 UNION ALL
SELECT DATE("2019-11-13"), "new", 2 UNION ALL
SELECT DATE("2019-11-14"), "old", 1
I want to get all ids that are in the "new" state on each day. An id's state should be preserved until a later row says otherwise (here, a switch to "old"). How do I do this? I have tried working with ARRAY_AGG but cannot come up with a solution that both looks at the latest value for each id and checks for the "new" state.
So with the example above, I would like the output to be:
date       | new_state_ids
2019-11-11 | NULL
2019-11-12 | [1]
2019-11-13 | [1,2]
2019-11-14 | [2]
Thanks for your help!
Recommended answer
Consider the following:
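select date, state, id,
  case state
    when 'new' then
      -- running list of ids within the current run of consecutive 'new' rows
      array_agg(id) over(partition by grp order by date)
    -- on any other row, report the previous row's id (null on the first row)
    else if(prev_id is null, null, [prev_id])
  end new_state_ids
from (
  -- grp increments every time the state differs from the previous row
  select *, countif(new_grp) over(order by date) grp
  from (
    select *,
      state != lag(state) over(order by date) new_grp,
      lag(id) over(order by date) prev_id
    from `project.dataset.table`
  )
)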
If applied to the sample data in your question, the output has one row per input row, and the new_state_ids column matches the values you listed: NULL, [1], [1,2], [2].
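One caveat worth checking against your real data: the else branch of the query returns only the id from the previous row, which fits this sample but may not hold when several ids are in the "new" state at once. As a cross-check of the intended rule (each id keeps its latest state until a later row changes it), here is a minimal sketch in plain Python rather than BigQuery SQL. The function name new_state_ids_by_day and the assumption of at most one row per id per day are mine, not part of the question.

```python
from datetime import date

# Sample rows from the question: (date, state, id)
rows = [
    (date(2019, 11, 11), "old", 1),
    (date(2019, 11, 12), "new", 1),
    (date(2019, 11, 13), "new", 2),
    (date(2019, 11, 14), "old", 1),
]


def new_state_ids_by_day(rows, target="new"):
    """Map each date to the sorted ids whose latest state is `target`.

    Hypothetical helper for illustration. Assumes at most one row per id
    per day. Days with no matching id map to None, mirroring the NULL in
    the expected output.
    """
    latest = {}  # id -> most recent state seen up to the current day
    result = {}
    for day in sorted({d for d, _, _ in rows}):
        # Apply this day's rows, overwriting any earlier state for the id.
        for d, state, id_ in rows:
            if d == day:
                latest[id_] = state
        ids = sorted(i for i, s in latest.items() if s == target)
        result[day] = ids or None
    return result


for day, ids in new_state_ids_by_day(rows).items():
    print(day, ids)
```

Running the same function over a larger extract of your table and comparing it with the query result is a quick way to see whether the two agree beyond this four-row sample.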