BigQuery Enumerate-like function on JSON Array
Question
I want to turn a JSON-encoded list into a native BigQuery array. Ideally the result would be a list of tuples or dictionaries with value and position entries, hence the reference to Python's enumerate function.
For example:
[(idx, elem) for idx, elem in enumerate(json_list_string)]
[{'pos':idx, 'value':elem} for idx, elem in enumerate(json_list_string)]
I have already solved the first part, turning the JSON into an array, using this question. My sample data looks like this:
WITH
my_ids AS (
SELECT 'xyz' as grp, '["7f9f98fh9g4ef393d3h5", "chg3g33f26949hg6067d", "g477e5973ec04g7c3232", "0de1ec83304d761he786", "3c1h1f153530g90g35c2", "946637g145h48322686f"]' as ids
UNION ALL
SELECT 'abc' as grp, '["7f9f98fh9g4ef393d3h5", "chg3g33fdsfsdfs49hg6067d", "g477e5973ec04g7c3232", "0de1ec83304d761he786", "3c1h1f153530g90g35c2", "946637g145h48322686f"]' as ids
)
SELECT
*
FROM my_ids
In an ideal world I would get output like this:
xyz, 7f9f98fh9g4ef393d3h5, 1
xyz, chg3g33f26949hg6067d, 2
...
abc, 946637g145h48322686f, 6
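To make the target output concrete, here is a minimal Python sketch of the intended behaviour. It is only a reference for checking query results, not a BigQuery solution. The names `my_ids` and `enumerate_ids` are made up for illustration, and the 1-based positions are an assumption read off the sample output above. Note that the JSON string has to be parsed first; calling `enumerate` on the raw string would iterate over its characters.

```python
import json

# Sample data copied from the question: (grp, JSON-encoded list of ids).
my_ids = [
    ("xyz", '["7f9f98fh9g4ef393d3h5", "chg3g33f26949hg6067d", "g477e5973ec04g7c3232", "0de1ec83304d761he786", "3c1h1f153530g90g35c2", "946637g145h48322686f"]'),
    ("abc", '["7f9f98fh9g4ef393d3h5", "chg3g33fdsfsdfs49hg6067d", "g477e5973ec04g7c3232", "0de1ec83304d761he786", "3c1h1f153530g90g35c2", "946637g145h48322686f"]'),
]


def enumerate_ids(rows):
    """Yield one (grp, id, pos) tuple per list element, with pos starting at 1."""
    for grp, ids in rows:
        # json.loads turns the encoded string into a real Python list.
        for pos, elem in enumerate(json.loads(ids), start=1):
            yield grp, elem, pos


rows = list(enumerate_ids(my_ids))
for row in rows:
    print(*row, sep=", ")
```

Each input row fans out into one output row per list element, which is the shape the SQL query should reproduce.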
Note that the lists can be rather long (up to 24 entries), and I would rather not hardcode all the paths.
Edit 2 (possible solution):
WITH
my_ids AS (
SELECT 'xyz' as grp, '["7f9f98fh9g4ef393d3h5", "chg3g33f26949hg6067d", "g477e5973ec04g7c3232", "0de1ec83304d761he786", "3c1h1f153530g90g35c2", "946637g145h48322686f"]' as ids
UNION ALL
SELECT 'abc' as grp, '["7f9f98fh9g4ef393d3h5", "chg3g33fdsfsdfs49hg6067d", "g477e5973ec04g7c3232", "0de1ec83304d761he786", "3c1h1f153530g90g35c2", "946637g145h48322686f"]' as ids
),
as_list AS (SELECT
*,
SPLIT(REGEXP_REPLACE(JSON_EXTRACT(ids,'$'), r'[\[\]\"]', ''), ',') AS split_items,
GENERATE_ARRAY(1, ARRAY_LENGTH(SPLIT(REGEXP_REPLACE(JSON_EXTRACT(ids,'$'), r'[\[\]\"]', ''), ','))) AS positions
FROM my_ids)
SELECT grp, ids, positions[OFFSET(off)] as pos
FROM as_list, unnest(split_items) as ids WITH OFFSET off
Recommended answer
Below is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.my_ids` AS (
SELECT 'xyz' AS grp, '["7f9f98fh9g4ef393d3h5", "chg3g33f26949hg6067d", "g477e5973ec04g7c3232", "0de1ec83304d761he786", "3c1h1f153530g90g35c2", "946637g145h48322686f"]' AS ids UNION ALL
SELECT 'abc' AS grp, '["7f9f98fh9g4ef393d3h5", "chg3g33fdsfsdfs49hg6067d", "g477e5973ec04g7c3232", "0de1ec83304d761he786", "3c1h1f153530g90g35c2", "946637g145h48322686f"]' AS ids
)
SELECT grp, id, ROW_NUMBER() OVER(PARTITION BY grp ORDER BY OFFSET) pos
FROM `project.dataset.my_ids`,
UNNEST(SPLIT(REGEXP_REPLACE(JSON_EXTRACT(ids,'$'), r'[\[\]\"]', ''), ',')) id WITH OFFSET
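The query above treats the JSON list as plain text: it strips the brackets and quotes, splits on commas, and numbers the pieces. The Python sketch below imitates that string handling (it does not run BigQuery) and compares it with real JSON parsing. The function names and the `tricky` input are hypothetical, and the `strip()` call is an assumption added to discard the space after each comma in the raw input. For simple ids like the ones in the question both approaches should agree, but a value that itself contains a comma would be split in two by the text-based approach.

```python
import json
import re


def regex_split(ids):
    """Imitate REGEXP_REPLACE + SPLIT: drop [, ] and ", then split on commas."""
    stripped = re.sub(r'[\[\]"]', "", ids)
    # strip() removes the space that follows each comma in the raw input.
    return [(item.strip(), pos) for pos, item in enumerate(stripped.split(","), start=1)]


def json_parse(ids):
    """Parse the string as JSON and number the elements starting at 1."""
    return [(item, pos) for pos, item in enumerate(json.loads(ids), start=1)]


plain = '["7f9f98fh9g4ef393d3h5", "chg3g33f26949hg6067d"]'
tricky = '["a,b", "c"]'  # hypothetical value containing a comma

print(regex_split(plain) == json_parse(plain))
print(regex_split(tricky))
print(json_parse(tricky))
```

If the ids can ever contain commas, brackets, or quotes, a JSON-aware extraction is the safer choice than the regex-and-split approach.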