What does REPEATED field in Google BigQuery mean?

Question

Please check my understanding of the REPEATED field in the following examples:

{
    "title": "History of Alphabet",
    "author": [
        {
            "name": "Larry"
        }
    ]
}

This JSON has schema:

[
    {
        "name": "title",
        "type": "STRING"
    },
    {
        "name": "author",
        "type": "RECORD",
        "fields": [
            {
                "name": "name",
                "type": "STRING"
            }
        ]
    }
]

But the following JSON

{
    "title": "History of Alphabet",
    "author": ["Larry", "Steve", "Eric"]
}

has schema:

[
    {
        "name": "title",
        "type": "STRING"
    },
    {
        "name": "author",
        "type": "STRING",
        "mode": "REPEATED"
    }
]

Is this correct?

NB: I tried going through the documentation, but couldn't find any explanation of this.

Answer

Close. In your first example, author is an array of objects, which corresponds to a repeated record in BQ. So the schema would be:

[
    {
        "name": "title",
        "type": "STRING"
    },
    {
        "name": "author",
        "type": "RECORD",
        "mode": "REPEATED",   <--- NOTE!
        "fields": [
            {
                "name": "name",
                "type": "STRING"
            }
        ]
    }
]
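
The mapping rule behind this correction can be sketched in plain Python. The helpers `infer_field` and `infer_schema` below are hypothetical and not part of any BigQuery client library. They assume every array is non-empty and homogeneous, and they only handle strings and objects. The idea is simply that a JSON array becomes mode REPEATED and a JSON object becomes type RECORD with nested fields.

```python
import json


def infer_field(name, value):
    """Build a BigQuery-style schema entry for one JSON key (hypothetical helper)."""
    # A JSON array maps to mode REPEATED; the type comes from its elements.
    # Assumption: the array is non-empty and all elements share one type.
    repeated = isinstance(value, list)
    if repeated:
        value = value[0]

    field = {"name": name}
    if isinstance(value, dict):
        field["type"] = "RECORD"
    elif isinstance(value, str):
        field["type"] = "STRING"
    else:
        raise TypeError(f"unsupported value for {name!r}: {value!r}")

    if repeated:
        field["mode"] = "REPEATED"
    if isinstance(value, dict):
        # A JSON object maps to a RECORD with one nested field per key.
        field["fields"] = [infer_field(k, v) for k, v in value.items()]
    return field


def infer_schema(row):
    """Return the schema (a list of field entries) for one JSON row."""
    return [infer_field(k, v) for k, v in row.items()]


first = json.loads('{"title": "History of Alphabet", "author": [{"name": "Larry"}]}')
second = json.loads('{"title": "History of Alphabet", "author": ["Larry", "Steve", "Eric"]}')

print(json.dumps(infer_schema(first), indent=4))
print(json.dumps(infer_schema(second), indent=4))
```

For the first row the sketch marks author as a RECORD with mode REPEATED, and for the second as a STRING with mode REPEATED, matching the schemas discussed here.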

Your second data/schema pair looks good (but note that the overall schema is an array, not an object, and it needs commas between elements).
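
One way to sanity-check a data/schema pair like this before loading is a small validator written with the standard library only. `check_row` and `check_value` are hypothetical helpers, not BigQuery APIs. The sketch assumes that a field without a mode holds a single value, that missing fields are allowed, and that only STRING and RECORD types matter.

```python
def check_value(field, value):
    """Check one non-repeated value against a schema entry."""
    if field["type"] == "RECORD":
        return isinstance(value, dict) and check_row(field["fields"], value)
    if field["type"] == "STRING":
        return isinstance(value, str)
    return False  # other types are outside this sketch


def check_row(schema, row):
    """Return True if every schema field present in `row` has a matching shape."""
    for field in schema:
        if field["name"] not in row:
            continue  # assumption: missing fields are allowed
        value = row[field["name"]]
        if field.get("mode") == "REPEATED":
            # A REPEATED field must be an array whose elements match the type.
            if not isinstance(value, list):
                return False
            if not all(check_value(field, item) for item in value):
                return False
        elif not check_value(field, value):
            return False
    return True


schema = [
    {"name": "title", "type": "STRING"},
    {"name": "author", "type": "STRING", "mode": "REPEATED"},
]
row = {"title": "History of Alphabet", "author": ["Larry", "Steve", "Eric"]}

print(check_row(schema, row))  # the array of strings fits the REPEATED STRING field
print(check_row(schema, {"title": "History of Alphabet", "author": "Larry"}))
```

The second call fails the check because a bare string is not an array. The same logic rejects an array of objects against a RECORD field that lacks mode REPEATED.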

There is some discussion of nested and repeated fields here: https://cloud.google.com/bigquery/docs/data?hl=en#nested

There are also some sample JSON data objects here: https://cloud.google.com/bigquery/preparing-data-for-bigquery#dataformats

But I agree we don't do a good job of explaining how those objects map to BQ schemas. Sorry about that!
