How to unnest and pivot two columns in BigQuery

Question

Say I have a BigQuery table containing the following data:

| id    | test.name     | test.score    |
|----   |-----------    |------------   |
| 1     | a             | 5             |
|       | b             | 7             |
| 2     | a             | 8             |
|       | c             | 3             |

Here test is a nested (repeated) column. How would I pivot test into the following table?

| id    | a     | b     | c     |
|----   |---    |---    |---    |
| 1     | 5     | 7     |       |
| 2     | 8     |       | 3     |

I cannot pivot test directly: at pivot(test) I get the error message "Table-valued function not found". Previous questions (1, 2) don't deal with nested columns or are outdated.

The following query looks like a useful first step:

select a.id, t
from `table` as a,
unnest(test) as t

However, this only flattens the nested rows and gives me:

| id    | test.name     | test.score    |
|----   |-----------    |------------   |
| 1     | a             | 5             |
| 1     | b             | 7             |
| 2     | a             | 8             |
| 2     | c             | 3             |
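
To reproduce the flattening step without access to the original table, the sample data can be written inline. This is only a sketch: the CTE name sample_table and the assumed column type ARRAY<STRUCT<name STRING, score INT64>> are not from the original question, they are inferred from the tables shown above.

```sql
-- Hypothetical inline copy of the question's data (sample_table is an assumed name).
with sample_table as (
  select 1 as id,
         [struct('a' as name, 5 as score), struct('b' as name, 7 as score)] as test
  union all
  select 2,
         [struct('a' as name, 8 as score), struct('c' as name, 3 as score)]
)
-- Flatten the repeated field: one output row per (id, array element).
select s.id, t.name, t.score
from sample_table as s,
     unnest(s.test) as t
order by s.id, t.name;
```

Selecting t.name and t.score separately, rather than the whole struct t, gives plain columns that are easier to aggregate or pivot afterwards.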

Recommended answer

Conditional aggregation is a good approach. If your tables are large, you might find that the following form performs best:

-- Each subquery unnests the repeated field test within its own row.
select t.id,
       (select max(tt.score) from unnest(t.test) tt where tt.name = 'a') as a,
       (select max(tt.score) from unnest(t.test) tt where tt.name = 'b') as b,
       (select max(tt.score) from unnest(t.test) tt where tt.name = 'c') as c
from `table` t;

The reason I recommend this is that it avoids the outer aggregation. The unnest() happens within each row, without shuffling the data around, and I have found that this is a big win in terms of performance.
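
For comparison, here is a sketch of two more conventional forms that do flatten the rows first: conditional aggregation with an outer group by, and BigQuery's PIVOT operator applied to the flattened subquery rather than to test itself. The temporary table sample_table and its inline data are assumptions made so the script can run on its own; they mirror the question's table but are not part of the original post.

```sql
-- Hypothetical sample data matching the question (sample_table is an assumed name).
create temp table sample_table as
select 1 as id,
       [struct('a' as name, 5 as score), struct('b' as name, 7 as score)] as test
union all
select 2,
       [struct('a' as name, 8 as score), struct('c' as name, 3 as score)];

-- Form 1: flatten, then aggregate per id (the outer aggregation mentioned above).
select s.id,
       max(if(t.name = 'a', t.score, null)) as a,
       max(if(t.name = 'b', t.score, null)) as b,
       max(if(t.name = 'c', t.score, null)) as c
from sample_table as s,
     unnest(s.test) as t
group by s.id
order by s.id;

-- Form 2: PIVOT operator over the flattened rows.
select *
from (
  select s.id, t.name, t.score
  from sample_table as s,
       unnest(s.test) as t
)
pivot (max(score) for name in ('a', 'b', 'c'))
order by id;
```

On this sample data both queries should return the target table from the question, with null where a test is missing. Both need a grouping step across the flattened rows, which is the work the per-row subqueries avoid.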
