Import JSON data with null values
Problem description
From the import documentation of BigQuery,
Note: Null values are not allowed
So I assume null is not allowed in JSON-formatted data for BigQuery import. However, null values are very common in regular ETL tasks (due to missing data). What would be a good way to import such JSON source files? Note that my data contains nested structures, so I would rather not convert to CSV and use ,, to represent a null value.

One approach I can think of is to replace every null value with a default value for its data type, e.g.:
- string: null -> empty string
- integer: null -> -1
- float: null -> -1.0
- ...
But I don't like it. I am looking for better options.
BTW, I tried to run bq load with a JSON file containing null values. I get the following error:

Failure details:
- Expected '"' found 'n'
- Expected '"' found 'n'
- Expected '"' found 'n'
- Expected '"' found 'n'
- Expected '"' found 'n ...

I think this indicates the use of null; is that correct?

EDIT: If I remove all the null fields, it seems to work. I guess this is the way to handle null data: you cannot have null as the value of a data field, but you can simply leave the field out. So I need some filtering code to remove all the null fields from my raw JSON.

Solution

You can import NULL values using JSON format source files: omit the key:value pair for values that are NULL.
Example - Let's say you have a schema like this:
[
  { "name": "kind", "type": "string" },
  { "name": "fullName", "type": "string" },
  { "name": "age", "type": "integer", "mode": "nullable" }
]
A record with no NULL values might look like this:
{"kind": "person", "fullName": "Some Person", "age": 22 }
However, when "age" is NULL, try this (note, no "age" key):
{"kind": "person", "fullName": "Some Person"}
Please let us know if you have issues with this. I'll make a note to improve the documentation around using NULL values with JSON import formats.
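The filtering step mentioned in the question (removing null fields before loading) could be sketched in Python roughly as follows. This is only an illustration, not part of the original answer: the helper names strip_nulls and clean_line are hypothetical, and the sketch assumes the input is newline-delimited JSON with one record per line. It also assumes that null elements inside arrays should be dropped, which may or may not suit your data.

```python
import json


def strip_nulls(value):
    """Recursively drop dict keys whose value is None (hypothetical helper).

    Assumption: None elements inside lists are dropped as well.
    """
    if isinstance(value, dict):
        return {k: strip_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [strip_nulls(v) for v in value if v is not None]
    return value


def clean_line(line):
    """Parse one JSON record and re-serialize it without null fields."""
    return json.dumps(strip_nulls(json.loads(line)))


record = '{"kind": "person", "fullName": "Some Person", "age": null}'
print(clean_line(record))
# prints {"kind": "person", "fullName": "Some Person"}
```

Because the function recurses into nested objects, it also handles the nested structures mentioned in the question. Applying clean_line to each line of the source file before running bq load yields records in which the nullable fields are simply absent.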