Snowflake 我们如何遍历临时表的每一行并将其值插入到另一个表中,其中每个字段的值都是一行? [英] Snowflake how can we loop over each row of a temp table and insert its values with into another table where each field with its value is a single row?

查看:70
本文介绍了Snowflake 我们如何遍历临时表的每一行并将其值插入到另一个表中,其中每个字段的值都是一行?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我们正在将数据加载到一个事实表中,我们在 Snowflake 上的原始临时表如下所示:

We are loading data into a fact table, we the original temporary table on Snowflake looks like the following:

其中 indicator_nbr 字段是调查中提出的问题.

Where indicator_nbr fields are questions asked within a survey.

我们在构建仓库数据库时使用数据建模技术,因此数据将被添加到事实表中,如下所示:

We are using data modelling techniques in building our warehouse database, so the data will be added into a fact table like so:

如果有其他问题,那么指标 2 和指标 3 也一样.

Then the same for the indicator 2 and 3 and so on if there is other questions.

每个字段及其值将作为单行.当然,还有其他元数据需要添加,例如 load_dtrecord_src,但它们都不是问题.

Each Field with its value will be as a single row. Of course there is other metadata to be added like load_dt and record_src but they are not a problem.

当前脚本正在执行以下操作:

The current script is doing the following:

  1. 将字段放入数组 =>fields_array = ['indicator_1', 'indicator_2', 'indicator_3']

一个循环将遍历数组并开始添加每个字段及其每行的值.所以假设我们有 100 行,我们将运行 300 次插入,一次一个:

A loop will run over the array and start adding each field with its value for each row. So imagine we are having 100 rows, we will run 300 inserts, one at a time:

for (var col_num = 0; col_num

    var COL_NAME = fields_array[col_num];

    var field_value_query = "INSERT INTO SAT_FIELD_VALUE SELECT md5(id), CURRENT_TIMESTAMP(), NULL, 'SRC', "+COL_NAME+", md5(foreign_key_field) FROM "+TEMP_TABLE_NAME+"";

}

如对 这篇文章显示了完整的脚本,最好循环连接插入查询的每个from values的字符串.

As mentioned in the comment on this post showing the full script, it is better to loop over a string concatenating each from values of the insert query.

建议的解决方案有两个问题:

There is 2 issues of the suggested solution:

  1. Snowflake 上的查询有大小限制(应小于 1 MB);
  2. 如果我们要遍历每个字段并连接 from values,我们还应该从临时表中执行 select 查询以获取列,所以不会有任何优化,或者我们会减少一点但不会太多.
  1. There is a size limit of a query on Snowflake (it should be less than 1 MB);
  2. if we are going to loop over each field and concatenate the from values, we should do a select query as well from the temp table to get the value of the column, so there will be no optimization, or we will reduce the time a little bit but not to much.

一个可能的解决方案

我想做一个 sql 查询,从临时表中选择所有内容,然后进行散列和所有内容,并在转置后将其保存到数组中,但我不知道该怎么做.

I was thinking of doing an sql query selecting everything from the temp table, and do hashing and everything and save it into an array after transposing, but I have no idea how to do it.

推荐答案

不确定这是否是您要找的,但似乎您只想做一个透视:

Not sure if this is what you're looking for but it seems as though you just want to do a pivot:

设置示例场景

create or replace transient table source_table
(
    id          number,
    indicator_1 varchar,
    indicator_2 number,
    indicator_3 varchar
);

insert overwrite into source_table
values (1, 'Test', 2, 'DATA'),
       (2, 'Prod', 3, 'DATA'),
       (3, 'Test', 1, 'METADATA'),
       (4, 'Test', 1, 'DATA')
;

create or replace transient table target_table
(
    hash_key varchar,
    md5      varchar
);

运行插入

select
    name_col as hash_key,
    md5(id)
from (select
          id,
          indicator_1,
          indicator_2::varchar as indicator_2,
          indicator_3
      from source_table) unpivot ( val_col for name_col in (indicator_1, indicator_2, indicator_3))
;

这会生成如下所示的 target_table:

This results in a target_table that looks like this:

+-----------+--------------------------------+
|HASH_KEY   |MD5                             |
+-----------+--------------------------------+
|INDICATOR_1|c4ca4238a0b923820dcc509a6f75849b|
|INDICATOR_2|c4ca4238a0b923820dcc509a6f75849b|
|INDICATOR_3|c4ca4238a0b923820dcc509a6f75849b|
|INDICATOR_1|c81e728d9d4c2f636f067f89cc14862c|
|INDICATOR_2|c81e728d9d4c2f636f067f89cc14862c|
|INDICATOR_3|c81e728d9d4c2f636f067f89cc14862c|
|INDICATOR_1|eccbc87e4b5ce2fe28308fd9f2a7baf3|
|INDICATOR_2|eccbc87e4b5ce2fe28308fd9f2a7baf3|
|INDICATOR_3|eccbc87e4b5ce2fe28308fd9f2a7baf3|
|INDICATOR_1|a87ff679a2f3e71d9181a67b7542122c|
|INDICATOR_2|a87ff679a2f3e71d9181a67b7542122c|
|INDICATOR_3|a87ff679a2f3e71d9181a67b7542122c|
+-----------+--------------------------------+

这篇关于Snowflake 我们如何遍历临时表的每一行并将其值插入到另一个表中,其中每个字段的值都是一行?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆