Unnesting structs in BigQuery

Question

What is the correct way to flatten a struct of two arrays in BigQuery? In my dataset, each row has an origin and a struct holding two arrays, struct.destination and struct.visitors. The arrays are ordered, i.e. each visitor count corresponds to the destination at the same position in the same row.

I want to reorganize the data so that I have a total visitor count for each unique combination of origin and destination. Ideally, the end result has one row per origin and destination pair, with the total visitor count for that pair.

I tried using UNNEST twice in a row, once on struct.destination and then on struct.visitors, but this produces the wrong result. The two UNNEST calls form a cross product, so each destination gets paired with every value in the visitors array when it should only be paired with the value at the same position in the same row:

SELECT
  origin,
  unnested_destination,
  unnested_visitors
FROM
  dataset.table,
  UNNEST(struct.destination) AS unnested_destination,
  UNNEST(struct.visitors) AS unnested_visitors

Recommended answer

If you have a single struct that is repeated (an array of structs), I think you want:

-- STRUCT is a reserved keyword, so a column literally named struct needs backticks.
SELECT origin,
       s.destination,
       s.visitors
FROM dataset.table t CROSS JOIN
     UNNEST(t.`struct`) s;
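
To make the assumed shape concrete, here is a minimal self-contained sketch of that first query running against inline data. The CTE name `sample`, the column name `trips`, and all values are made up for illustration and do not come from the question; the point is only that the column is an array of structs, each carrying its own destination and visitors fields.

```sql
-- Hypothetical inline data: one row whose `trips` column is an
-- ARRAY<STRUCT<destination STRING, visitors INT64>>.
WITH sample AS (
  SELECT
    'A' AS origin,
    [STRUCT('X' AS destination, 10 AS visitors),
     STRUCT('Y' AS destination, 20 AS visitors)] AS trips
)
-- One output row per array element; each element keeps its own pair of fields.
SELECT origin,
       s.destination,
       s.visitors
FROM sample t CROSS JOIN
     UNNEST(t.trips) s;
```

Because each struct element already holds a matching destination and visitor count, a single UNNEST is enough here and no position matching is needed.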

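I see, you have a struct of two arrays. In that case, unnest each array WITH OFFSET and join on the offsets, so that each destination is paired only with the visitor count at the same position: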

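-- d and v are the array elements themselves; nd and nv are their zero-based positions.
-- STRUCT is a reserved keyword, so a column literally named struct needs backticks.
SELECT origin, d AS destination, v AS visitors
FROM dataset.table t CROSS JOIN
     UNNEST(t.`struct`.destination) d WITH OFFSET nd LEFT JOIN
     UNNEST(t.`struct`.visitors) v WITH OFFSET nv
     ON nd = nv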
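
The question asks for a total per origin and destination, while the offset join above returns the individual pairs. A minimal sketch of adding that aggregation follows, assuming the arrays hold plain strings and integers. The CTE name `sample`, the column name `trip`, and the values are hypothetical stand-ins for the real table.

```sql
-- Hypothetical inline data: two rows, each with a struct of two parallel arrays.
WITH sample AS (
  SELECT 'A' AS origin,
         STRUCT(['X', 'Y'] AS destination, [10, 20] AS visitors) AS trip
  UNION ALL
  SELECT 'A' AS origin,
         STRUCT(['X', 'Z'] AS destination, [5, 7] AS visitors) AS trip
)
SELECT origin,
       d AS destination,
       SUM(v) AS total_visitors
FROM sample t CROSS JOIN
     -- WITH OFFSET exposes each element's position within its array.
     UNNEST(t.trip.destination) d WITH OFFSET nd LEFT JOIN
     UNNEST(t.trip.visitors) v WITH OFFSET nv
     -- Matching positions keeps each destination with its own count.
     ON nd = nv
GROUP BY origin, destination
ORDER BY origin, destination;
```

With this sample data the query should return three rows: (A, X, 15), (A, Y, 20) and (A, Z, 7). The LEFT JOIN keeps a destination even when the visitors array is shorter, in which case its count is NULL and SUM ignores it.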
