BigQuery-从一组Json文件中提取数据 [英] BigQuery - Extract data from an array of Json files
问题描述
早上好
在大查询中,我有一个表格,其中的一列具有以下形式(我显示3行):
In big query I have a table with one column on the following form (I show 3 lines):
[{'a':1,'b':4,'c':5},{'a':0,'b':7,'c':8},{'a':4,'b':9,'c':12}]
[{'a':9,'b':10,'c':9}]
[{'a':5,'b':10,'c':9},{'a':1,'b':10,'c':9},{'a':7,'b':10,'c':9}]
也就是说,我有一个Jsons数组(长度不固定).对于要提取的每一行(创建新列),如果存在,则当键"a" = 1时键"b"的值(对于每一行,键"a"可以等于1一度).我无法导入外部程序包.
that is, I have an array (Not fixed length) of Jsons. For every row I would like to extract (create a new column), if it is present, the value of the key 'b' when the key 'a'=1 (for every row the key 'a' can be equal 1 just one time). I can not import external packages.
谢谢您的帮助!
推荐答案
以下是BigQuery标准SQL
Below is for BigQuery Standard SQL
#standardSQL
CREATE TEMP FUNCTION json2array(json STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
return JSON.parse(json).map(x=>JSON.stringify(x));
""";
WITH `project.dataset.table` AS (
SELECT "[{'a':1,'b':4, 'c':5}, {'a':0,'b':7, 'c':8},{'a':4,'b':9, 'c':12}]" json UNION ALL
SELECT "[{'a':9,'b':10, 'c':9}]" UNION ALL
SELECT "[{ 'a':5,'b':10, 'c':9}, {'a':1,'b':10, 'c':9}, {'a':7,'b':10, 'c':9}]"
)
SELECT json, JSON_EXTRACT_SCALAR(x, '$.b') AS b
FROM `project.dataset.table`,
UNNEST(json2array(JSON_EXTRACT(json, '$'))) x
WHERE JSON_EXTRACT_SCALAR(x, '$.a') = '1'
有结果
Row json b
1 [{'a':1,'b':4, 'c':5}, {'a':0,'b':7, 'c':8},{'a':4,'b':9, 'c':12}] 4
2 [{ 'a':5,'b':10, 'c':9}, {'a':1,'b':10, 'c':9}, {'a':7,'b':10, 'c':9}] 10
如果您需要保留所有原始行,请在下面使用
In case if you need to preserve all original rows - use below
#standardSQL
CREATE TEMP FUNCTION json2array(json STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
return JSON.parse(json).map(x=>JSON.stringify(x));
""";
WITH `project.dataset.table` AS (
SELECT "[{'a':1,'b':4, 'c':5}, {'a':0,'b':7, 'c':8},{'a':4,'b':9, 'c':12}]" json UNION ALL
SELECT "[{'a':9,'b':10, 'c':9}]" UNION ALL
SELECT "[{ 'a':5,'b':10, 'c':9}, {'a':1,'b':10, 'c':9}, {'a':7,'b':10, 'c':9}]"
)
SELECT json,
(SELECT JSON_EXTRACT_SCALAR(x, '$.b')
FROM UNNEST(json2array(JSON_EXTRACT(json, '$'))) x
WHERE JSON_EXTRACT_SCALAR(x, '$.a') = '1'
) AS b
FROM `project.dataset.table`
有结果
Row json b
1 [{'a':1,'b':4, 'c':5}, {'a':0,'b':7, 'c':8},{'a':4,'b':9, 'c':12}] 4
2 [{'a':9,'b':10, 'c':9}] null
3 [{ 'a':5,'b':10, 'c':9}, {'a':1,'b':10, 'c':9}, {'a':7,'b':10, 'c':9}] 10
这篇关于BigQuery-从一组Json文件中提取数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!