Return elements of Redshift JSON array on separate rows

Views: 99 Published: 2019/11/24 17:21:52 Tags: json amazon-redshift

Problem description

I have a Redshift table that looks like this:

 id | metadata
---------------------------------------------------------------------------
 1  | [{"pet":"dog"},{"country":"uk"}]
 2  | [{"pet":"cat"}]
 3  | []
 4  | [{"country":"germany"},{"education":"masters"},{"country":"belgium"}]

All array elements have just one field.
There is no guarantee that a particular field will feature in any of an array's elements.
A field name can be repeated within an array.
The array elements can be in any order.

I want to get back a table that looks like this:

 id |   field   |  value
------------------------
 1  | pet       | dog
 1  | country   | uk
 2  | pet       | cat
 4  | country   | germany
 4  | education | masters
 4  | country   | belgium

I can then combine this with my queries on the rest of the input table.

I have tried playing around with the Redshift JSON functions, but without being able to write functions, use loops, or have variables in Redshift, I really can't see a way to do this!

Please let me know if I can clarify anything else.

Recommended answer

Thanks to this inspired blog post, I've been able to craft a solution:

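Create a lookup table to effectively "iterate" over the elements of each array. The number of rows in this table must be equal to or greater than the maximum number of elements in any array. Let's say this is 4 (it can be calculated using SELECT MAX(JSON_ARRAY_LENGTH(metadata)) FROM input_table):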

-- One row per array index to visit: 0, 1, 2, 3
CREATE VIEW seq_0_to_3 AS
    SELECT 0 AS i UNION ALL
    SELECT 1 UNION ALL
    SELECT 2 UNION ALL
    SELECT 3;
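
As a quick sanity check on the sizing rule above, the two counts can be compared side by side. This is only a sketch: it reuses the input_table and seq_0_to_3 names from this answer, and the max_elements and seq_rows column aliases are made up for illustration.

```sql
-- Compare the longest array with the number of rows in the lookup view.
-- The view is large enough when seq_rows >= max_elements.
SELECT
    (SELECT MAX(JSON_ARRAY_LENGTH(metadata)) FROM input_table) AS max_elements,
    (SELECT COUNT(*) FROM seq_0_to_3) AS seq_rows;
```

If seq_rows turns out smaller than max_elements, elements beyond the last index in the view would be silently skipped by the join, so the view would need more rows.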

Using this view, we can create one row per JSON element:

WITH exploded_array AS (
    -- Pair every row with every index, keeping only indexes that exist in its array
    SELECT id, JSON_EXTRACT_ARRAY_ELEMENT_TEXT(metadata, seq.i) AS json
    FROM input_table, seq_0_to_3 AS seq
    WHERE seq.i < JSON_ARRAY_LENGTH(metadata)
)
SELECT *
FROM exploded_array;

This produces:

 id | json
------------------------------
 1  | {"pet":"dog"}
 1  | {"country":"uk"}
 2  | {"pet":"cat"}
 4  | {"country":"germany"}
 4  | {"education":"masters"}
 4  | {"country":"belgium"}

However, I needed to extract the field names and values. As I can't see any way to extract JSON field names using Redshift's limited functions, I'll do this using a regular expression:

WITH exploded_array AS (
    SELECT id, JSON_EXTRACT_ARRAY_ELEMENT_TEXT(metadata, seq.i) AS json
    FROM input_table, seq_0_to_3 AS seq
    WHERE seq.i < JSON_ARRAY_LENGTH(metadata)
)
SELECT id, field, JSON_EXTRACT_PATH_TEXT(json, field)
FROM (
    -- Pull the field name out of each single-field JSON object
    SELECT id, json, REGEXP_SUBSTR(json, '[^{"]\\w+[^"]') AS field
    FROM exploded_array
) AS named_elements;
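
One caveat with the regular expression: as written, the pattern seems to need at least three characters to match, so a field name such as "id" would probably not be extracted correctly. A possible alternative, sketched here rather than tested against a live cluster, takes the text between the first pair of double quotes using SPLIT_PART. It assumes field names contain no escaped double quotes; the field_value and named_elements aliases are illustrative names, not part of the original answer.

```sql
WITH exploded_array AS (
    SELECT id, JSON_EXTRACT_ARRAY_ELEMENT_TEXT(metadata, seq.i) AS json
    FROM input_table, seq_0_to_3 AS seq
    WHERE seq.i < JSON_ARRAY_LENGTH(metadata)
)
SELECT id, field, JSON_EXTRACT_PATH_TEXT(json, field) AS field_value
FROM (
    -- Splitting {"pet":"dog"} on '"' gives '{', 'pet', ':', 'dog', '}',
    -- so the second part is the field name.
    -- Assumption: the field name itself contains no escaped double quotes.
    SELECT id, json, SPLIT_PART(json, '"', 2) AS field
    FROM exploded_array
) AS named_elements;
```

Because each array element holds exactly one field, the first quoted string in the element is always the field name, which is what makes this shortcut workable here.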
