Unnest stringified array of JSON objects in BigQuery
Question
I have a table with a string column that contains a stringified list of JSON objects, like so:
'[{"a": 5, "b": 6}, {"a": 7, "b": 8}]'
I would like to unnest this array and then use json_extract() or json_extract_scalar() to get the values out of these objects.
It's unclear from BigQuery's JSON function documentation whether I can do this with built-in functionality.
Is a UDF required to do this, or does this functionality already exist in BigQuery?
The UDF below accomplishes what I'm looking for:
-- Temporary JavaScript UDF: parse the JSON array and return
-- each element re-serialized as a JSON string.
CREATE TEMP FUNCTION JSON_EXTRACT_ARRAY(input STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
  return JSON.parse(input).map(x => JSON.stringify(x));
""";

with
  raw as (
    select
      1 as id,
      '[{"a": 5, "b": 6}, {"a": 7, "b": 8}]' as body
  )
select
  id,
  json_extract(entry, '$.a') as a,
  json_extract(entry, '$.b') as b
from
  raw,
  unnest(json_extract_array(body)) as entry
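The body of the UDF is ordinary JavaScript, so its behaviour can be checked outside BigQuery. Below is a minimal sketch, assuming Node.js or any modern JavaScript runtime. The function name jsonExtractArray is hypothetical and only mirrors the UDF above.

```javascript
// Hypothetical stand-alone version of the UDF body above.
// Parses a stringified JSON array and returns each element
// re-serialized as a JSON string (the ARRAY<STRING> the UDF returns).
function jsonExtractArray(input) {
  return JSON.parse(input).map((x) => JSON.stringify(x));
}

const body = '[{"a": 5, "b": 6}, {"a": 7, "b": 8}]';
const entries = jsonExtractArray(body);

// Each entry is a string that json_extract() could then query.
for (const entry of entries) {
  console.log(entry);
}
// prints {"a":5,"b":6}
// prints {"a":7,"b":8}
```

Because every element goes through JSON.stringify, the whitespace of the original string is not preserved, which does not matter to json_extract().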
Recommended answer
Try something like this:
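-- Extract every {...} fragment with a lazy regex, then unnest the matches.
-- The lazy match stops at the first '}', so this only works for flat objects.
with
  raw as (
    select
      1 as id,
      '[{"a": 5, "b": 6}, {"a": 7, "b": 8}]' as body
  )
select
  r.id,
  r.body,
  regexp_extract_all(r.body, r'({.*?})'),
  json_extract(entry, '$.a') as a,
  json_extract(entry, '$.b') as b
from
  raw as r
  cross join unnest(
    regexp_extract_all(r.body, r'({.*?})')
  ) as entry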
Or a more general solution:
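-- Strip the outer '[{' and '}]', split on the '}, {' separator between objects,
-- then re-wrap each piece in braces before extracting values.
-- This assumes objects are separated by exactly '}, {'.
with
  raw as (
    select
      1 as id,
      '[{"a": 5, "b": {"x": 1, "y": 2}}, {"b": {"c": 5, "d": 8}, "a": 7}]' as body
  )
select
  r.id,
  r.body,
  split(trim(r.body, '[]{}'), '}, {'),
  json_extract(concat('{', entry, '}'), '$.a') as a,
  json_extract(concat('{', entry, '}'), '$.b') as b
from
  raw as r
  cross join unnest(
    split(trim(r.body, '[]{}'), '}, {')
  ) as entry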
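Both string-based queries depend on the exact shape of the input. The sketch below emulates them in JavaScript, next to real JSON parsing as done by the UDF in the question, to show where they diverge. The helper names are hypothetical, and the JavaScript regex, trim and split calls are assumed approximations of BigQuery's regexp_extract_all, trim and split, not their actual implementation.

```javascript
// Hypothetical helpers approximating the three approaches.

// Approach 1: lazy regex, like regexp_extract_all(body, r'({.*?})').
function byRegex(body) {
  return body.match(/\{.*?\}/g) || [];
}

// Approach 2: like split(trim(body, '[]{}'), '}, {'), then re-wrap in braces.
function bySplit(body) {
  // Remove all leading and trailing characters from the set [ ] { }.
  const trimmed = body.replace(/^[\[\]{}]+|[\[\]{}]+$/g, "");
  return trimmed.split("}, {").map((s) => "{" + s + "}");
}

// Approach 3: real JSON parsing, like the UDF in the question.
function byParse(body) {
  return JSON.parse(body).map((x) => JSON.stringify(x));
}

// True if every piece is itself valid JSON.
function allValid(pieces) {
  return pieces.every((p) => {
    try {
      JSON.parse(p);
      return true;
    } catch (e) {
      return false;
    }
  });
}

const flat = '[{"a": 5, "b": 6}, {"a": 7, "b": 8}]';
const nested = '[{"a": 5, "b": {"x": 1, "y": 2}}, {"b": {"c": 5, "d": 8}, "a": 7}]';
// Same idea, but the last object ends with a nested object.
const nestedLast = '[{"a": 7, "b": {"c": 5, "d": 8}}]';

console.log(allValid(byRegex(flat)), allValid(byRegex(nested)));      // true false
console.log(allValid(bySplit(nested)), allValid(bySplit(nestedLast))); // true false
console.log(allValid(byParse(nested)), allValid(byParse(nestedLast))); // true true
```

In this emulation the lazy regex cuts a nested object at its first closing brace, and the trim-and-split version loses a brace when the final object ends with a nested object, since trim removes every trailing character in the set. Parsing the JSON avoids both cases.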