Redshift将单个动态列拆分为新表中的多行 [英] Redshift split single dynamic column into multiple rows in new table

查看：114 发布时间：2020/8/23 3:31:33 sql amazon-redshift

本文介绍了Redshift将单个动态列拆分为新表中的多行的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

使用如下表:

       uid               |            segmentids
-------------------------+----------------------------------------
 f9b6d54b-c646-4bbb-b0ec | 4454918|4455158|4455638|4455878|4455998
 asd7a0s9-c646-asd7-b0ec | 1265899|1265923|1265935|1266826|1266596
 gd3355ff-cjr8-assa-fke0 | 2237557|2237581|2237593
 laksnfo3-kgi5-fke0-b0ec | 4454918|4455158|4455638|4455878

如何使用以下方法创建新表

How to create a new table with:

      uid               |         segmentids
-------------------------+---------------------------
 f9b6d54b-c646-4bbb-b0ec |           4454918
 f9b6d54b-c646-4bbb-b0ec |           1265899
 f9b6d54b-c646-4bbb-b0ec |           2237557
 f9b6d54b-c646-4bbb-b0ec |           4454918
 f9b6d54b-c646-4bbb-b0ec |           4454918
 asd7a0s9-c646-asd7-b0ec |           1265899
 asd7a0s9-c646-asd7-b0ec |           1265923
 asd7a0s9-c646-asd7-b0ec |           1265935 
 asd7a0s9-c646-asd7-b0ec |           1266826
 asd7a0s9-c646-asd7-b0ec |           1266596

段的数量是动态的，可以随每个记录而变化. 我尝试使用带分隔符的Split函数，但是它需要字符串中的索引，在这里是动态的.

The number of segments are dynamic, can vary with each record. I tried the Split function with delimiter, but it requires the index in string, which is dynamic here.

有什么建议吗?

推荐答案

这是Redshift答案，它将与每行多达1万个细分ID值一起使用.

Here is the Redshift answer, it will work with up to 10 thousand segment ids values per row.

测试数据

create table test_split (uid varchar(50),segmentids varchar(max));
insert into test_split
    values
      ('f9b6d54b-c646-4bbb-b0ec','4454918|4455158|4455638|4455878|4455998'),
      ('asd7a0s9-c646-asd7-b0ec','1265899|1265923|1265935|1266826|1266596'),
      ('asd7345s9-c646-asd7-b0ec','1235935|1263456|1265675696'),
      ('as345a0s9-c646-asd7-b0ec','12765899|12658883|12777935|144466826|1266226|12345')
;

代码

with ten_numbers as (select 1 as num union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9 union select 0)
  , generted_numbers AS
(
    SELECT (1000 * t1.num) + (100 * t2.num) + (10 * t3.num) + t4.num AS gen_num
    FROM ten_numbers AS t1
      JOIN ten_numbers AS t2 ON 1 = 1
      JOIN ten_numbers AS t3 ON 1 = 1
      JOIN ten_numbers AS t4 ON 1 = 1
)
  , splitter AS
(
    SELECT *
    FROM generted_numbers
    WHERE gen_num BETWEEN 1 AND (SELECT max(REGEXP_COUNT(segmentids, '\\|') + 1)
                                 FROM test_split)
)
  --select * from splitter;
  , expanded_input AS
(
    SELECT
      uid,
      split_part(segmentids, '|', s.gen_num) AS segment
    FROM test_split AS ts
      JOIN splitter AS s ON 1 = 1
    WHERE split_part(segmentids, '|', s.gen_num) <> ''
)
SELECT * FROM expanded_input;

前2个cte步骤(ten_numbers和generate_numbers)用于生成许多行，这是必需的，因为不支持generate_series

the first 2 cte steps (ten_numbers and generated_numbers) are used to generate a number of rows, this is needed because generate_series is not supported

下一步(分隔符)只需要等于最大分隔符+ 1(即段的最大数目)的行数

The next step (splitter) just takes a number of rows equal to the max number of delimiters + 1 (which is the max number of segments)

最后，我们将输入数据与交叉分割器交叉，使用split_part获取相关值，然后排除空白部分(这是因为该行的段数小于或等于最大段数)

finally, we cross join splitter with the input data, take the related value using split_part and then exclude blank parts (which are caused where the row has < the max number of segments)

这篇关于Redshift将单个动态列拆分为新表中的多行的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

Redshift将单个动态列拆分为新表中的多行 [英] Redshift split single dynamic column into multiple rows in new table

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

Redshift将单个动态列拆分为新表中的多行 [英] Redshift split single dynamic column into multiple rows in new table

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭