How to unnest and pivot two columns in BigQuery
Question
Say I have a BigQuery table containing the following information:
| id | test.name | test.score |
|---- |----------- |------------ |
| 1 | a | 5 |
| | b | 7 |
| 2 | a | 8 |
| | c | 3 |
Here test is a nested field. How would I pivot test into the following table?
| id | a | b | c |
|---- |--- |--- |--- |
| 1 | 5 | 7 | |
| 2 | 8 | | 3 |
I cannot pivot test directly, as I get the following error message at pivot(test): Table-valued function not found. Previous questions (1, 2) don't deal with nested columns or are outdated.
The following query looks like a useful first step:
select a.id, t
from `table` as a,
unnest(test) as t
However, this only flattens the nested rows and does not pivot them:
| id | test.name | test.score |
|---- |----------- |------------ |
| 1 | a | 5 |
| 1 | b | 7 |
| 2 | a | 8 |
| 2 | c | 3 |
Recommended answer
Conditional aggregation is a good approach. If your tables are large, you might find that the following form, which uses one correlated subquery per output column, has the best performance:
-- Each scalar subquery unnests the repeated field test within its own row.
select t.id,
       (select max(tt.score) from unnest(t.test) tt where tt.name = 'a') as a,
       (select max(tt.score) from unnest(t.test) tt where tt.name = 'b') as b,
       (select max(tt.score) from unnest(t.test) tt where tt.name = 'c') as c
from `table` t;
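To try this approach without a real table, here is a minimal self-contained sketch. It assumes that test is a repeated record with the fields name and score, as the question's table suggests. The CTE name sample_data and its inline rows are hypothetical stand-ins for the real table.

```sql
-- Hypothetical inline data standing in for the real table.
with sample_data as (
  select 1 as id,
         [struct('a' as name, 5 as score),
          struct('b' as name, 7 as score)] as test
  union all
  select 2 as id,
         [struct('a' as name, 8 as score),
          struct('c' as name, 3 as score)] as test
)
-- One correlated scalar subquery per pivoted column; no group by is needed.
select t.id,
       (select max(tt.score) from unnest(t.test) tt where tt.name = 'a') as a,
       (select max(tt.score) from unnest(t.test) tt where tt.name = 'b') as b,
       (select max(tt.score) from unnest(t.test) tt where tt.name = 'c') as c
from sample_data t
order by t.id;
```

A subquery that finds no matching name returns NULL, which corresponds to the empty cells in the desired output.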
I recommend this because it avoids the outer aggregation. The unnest() happens within each row, without shuffling the data around, and I have found that this is a big win in terms of performance.
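For comparison, the form with an outer aggregation might look like the sketch below. It builds on the question's own unnest query and again assumes that test is a repeated record with the fields name and score. It is shown only to illustrate what is being avoided, not as a measured benchmark.

```sql
-- Flatten first, then collapse back to one row per id with conditional aggregation.
select a.id,
       max(if(t.name = 'a', t.score, null)) as a,
       max(if(t.name = 'b', t.score, null)) as b,
       max(if(t.name = 'c', t.score, null)) as c
from `table` as a,
     unnest(test) as t
group by a.id;
```

Here every id is expanded into several rows and then regrouped, which is the step the subquery version skips. Note also that the comma join drops any id whose test array is empty, whereas the subquery version keeps that row with NULL in every pivoted column.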