How to update a table in BigQuery where the names of the fields to update are values in another table

Problem description

Hi folks!

I need some ideas for the following problem:

I have two tables:

Table 1:

+-------+------------+---------+
| ID    | field_name | value   |
+-------+------------+---------+
| 1     | usd        |  10.08  |
| 1     | gross_amt  |  52.0   |
| 1     | jpy        |  30.05  |
| 2     | usd        |  50.0   |
| 2     | eur        |  50.0   |
| 3     | real_amt   |  210.43 |
| 3     | total      |  320    |
| 4     | jpy        |  23.45  |
| 4     | name       |  john   |
| 4     | city       |  utah   |
+-------+------------+---------+

Table 2:

+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+
| ID  |  name | last_name |   date1  | counrty | city |  usd  |  eur  |  jpy  | gross_amt | real_amt | total | ... | field200 |
+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+
| 1   |  jane | doe       | 19900108 |   usa   | LA   | 9.08  | 0.00  | 29.05 | 50.0      |  52.0    | 900.0 | ... | value200 |
| 2   |  lane | smith     | 19900108 |   usa   | LA   | 40.8  | 40.0  | 0.00  | 100.0     |  70.0    | 290.0 | ... | value200 |
| 3   |  mike | hoffa     | 19900108 |   usa   | SF   | 5.05  | 0.00  | 0.00  | 10.0      |  25.0    | 100.0 | ... | value200 |
| 4   |  paul | doe       | 19900108 |   usa   | NY   | 1.00  | 0.00  | 29.05 | 45.0      |  55.0    | 110.0 | ... | value200 |
+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+

I need to update the fields of Table 2 whose names appear in the field_name column of Table 1, using the values from the value column of Table 1. Rows are matched on ID, which is the same in both tables. In addition, the value column in Table 1 is a string, but the columns to update in Table 2 have different data types, mostly numeric ones (NUMERIC, INT64, FLOAT64).

The tables above are only an example. In the real problem, Table 2 has 200 fields, and Table 1 can hold up to 40 value modifications per ID, with thousands of records to modify every day.

Thanks

I have tried the following two solutions:

Solution 1 (it works, but it is very slow because there are a lot of records):

DECLARE sql_script STRING DEFAULT '';
DECLARE col, val, row_id STRING;
DECLARE n INT64;
DECLARE i INT64 DEFAULT 1;

SET n = (SELECT COUNT(*) FROM `project.dataset.table1`);

-- One dynamic UPDATE per row of table1; this row-by-row loop is what makes it slow.
WHILE i <= n DO
    -- Fetch the i-th row of table1 in a fixed order.
    SET (row_id, col, val) = (
        SELECT AS STRUCT CAST(id AS STRING), field_name, value
        FROM (
            SELECT id, field_name, value,
                   ROW_NUMBER() OVER (ORDER BY id, field_name) AS rn
            FROM `project.dataset.table1`
        )
        WHERE rn = i
    );
    -- val is inserted unquoted, so this only suits numeric target columns.
    SET sql_script = CONCAT('UPDATE `project.dataset.table2` SET ', col, ' = ', val, ' WHERE id = "', row_id, '"');
    EXECUTE IMMEDIATE sql_script;
    SET i = i + 1;
END WHILE;

Solution 2 (I can't get it to work; it gives me the following error):

[error execution Big Query] [1]: https://i.stack.imgur.com/Pv44T.png

EXECUTE IMMEDIATE (SELECT STRING_AGG('UPDATE `project.dataset.table2` SET '||x.field_name||'="'||x.value||'" WHERE id = "'||x.id||'"', ';')
                      FROM UNNEST((SELECT ARRAY_AGG(STRUCT(id, field_name, value))
                                    FROM `project.dataset.table1`)) AS x);

Recommended answer

Below is for BigQuery Standard SQL

EXECUTE IMMEDIATE '''
CREATE TEMP TABLE pivot1 AS
SELECT id, ''' || (
  SELECT STRING_AGG(DISTINCT "MAX(IF(field_name = '" || field_name || "', CAST(value AS " || data_type || "), NULL)) AS " || field_name)
  FROM `project.dataset.table1`
  JOIN (
    SELECT column_name, data_type
    FROM `project.dataset.INFORMATION_SCHEMA.COLUMNS`
    WHERE table_name = 'table2'
  ) ON field_name = column_name
) || '''  
FROM `project.dataset.table1`
GROUP BY id
''';

EXECUTE IMMEDIATE '''
MERGE `project.dataset.table2` AS t2
USING pivot1 AS t1
ON t2.id = t1.id
WHEN MATCHED THEN
  UPDATE SET
''' || (
  SELECT STRING_AGG(DISTINCT field_name || ' = IFNULL(t1.' || field_name || ', t2.' || field_name || ')')
  FROM `project.dataset.table1` 
);

SELECT * FROM `project.dataset.table2` ORDER BY id;
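The two EXECUTE IMMEDIATE statements above assemble their SQL text from one row per column to update. As a rough illustration of what that assembled text looks like, here is a minimal Python sketch that builds the same two statements from a list of (field_name, data_type) pairs. The `columns` list and its data types are assumptions made for this example (the question only says the target columns are mostly numeric), and the helper names `pivot_expressions` and `merge_assignments` are hypothetical.

```python
# Hypothetical input: (field_name, data_type) pairs, standing in for what the
# join against INFORMATION_SCHEMA.COLUMNS would return. The types are assumed.
columns = [("usd", "FLOAT64"), ("jpy", "FLOAT64"), ("name", "STRING")]


def pivot_expressions(cols):
    """Build the pivot select list: one MAX(IF(...)) expression per column."""
    return ", ".join(
        f"MAX(IF(field_name = '{name}', CAST(value AS {dtype}), NULL)) AS {name}"
        for name, dtype in cols
    )


def merge_assignments(cols):
    """Build the UPDATE SET list; IFNULL keeps the old value when no update exists."""
    return ", ".join(f"{name} = IFNULL(t1.{name}, t2.{name})" for name, _ in cols)


# Text of the first statement: pivot table1 into one row per id.
pivot_sql = (
    "CREATE TEMP TABLE pivot1 AS\n"
    f"SELECT id, {pivot_expressions(columns)}\n"
    "FROM `project.dataset.table1`\n"
    "GROUP BY id"
)

# Text of the second statement: merge the pivoted row into table2.
merge_sql = (
    "MERGE `project.dataset.table2` AS t2\n"
    "USING pivot1 AS t1\n"
    "ON t2.id = t1.id\n"
    "WHEN MATCHED THEN\n"
    f"  UPDATE SET {merge_assignments(columns)}"
)

print(pivot_sql)
print(merge_sql)
```

Printing the generated text first is a cheap way to inspect the dynamic SQL before running it. The point of the design is that each statement touches every ID in a single pass, instead of issuing one UPDATE per row of table1.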

If applied to the sample data (table1 and table2) from your question, the output is (updates are highlighted)
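The effect on the sample data can also be checked without BigQuery. The following is a small in-memory emulation in Python, not the answer's SQL: it applies each row of Table 1 to the matching row of Table 2 and casts the string value to the type of the value it replaces. Table 2 is restricted to the columns that Table 1 touches, the numeric columns are assumed to be floats, and the helper name `apply_updates` is hypothetical.

```python
# Rows of Table 1 from the question: (id, field_name, value), values as strings.
table1 = [
    (1, "usd", "10.08"), (1, "gross_amt", "52.0"), (1, "jpy", "30.05"),
    (2, "usd", "50.0"), (2, "eur", "50.0"),
    (3, "real_amt", "210.43"), (3, "total", "320"),
    (4, "jpy", "23.45"), (4, "name", "john"), (4, "city", "utah"),
]

# Table 2 keyed by id, limited to the columns Table 1 refers to.
# Assumption: all amount columns are floats.
table2 = {
    1: {"name": "jane", "city": "LA", "usd": 9.08, "eur": 0.0, "jpy": 29.05,
        "gross_amt": 50.0, "real_amt": 52.0, "total": 900.0},
    2: {"name": "lane", "city": "LA", "usd": 40.8, "eur": 40.0, "jpy": 0.0,
        "gross_amt": 100.0, "real_amt": 70.0, "total": 290.0},
    3: {"name": "mike", "city": "SF", "usd": 5.05, "eur": 0.0, "jpy": 0.0,
        "gross_amt": 10.0, "real_amt": 25.0, "total": 100.0},
    4: {"name": "paul", "city": "NY", "usd": 1.0, "eur": 0.0, "jpy": 29.05,
        "gross_amt": 45.0, "real_amt": 55.0, "total": 110.0},
}


def apply_updates(target, updates):
    """Return a copy of target with each (id, field, value) update applied.

    The string value is cast to the type of the value it replaces, which
    stands in for CAST(value AS data_type) in the SQL version.
    """
    result = {row_id: dict(row) for row_id, row in target.items()}
    for row_id, field, value in updates:
        if row_id in result and field in result[row_id]:
            old = result[row_id][field]
            result[row_id][field] = type(old)(value)
    return result


updated = apply_updates(table2, table1)
for row_id in sorted(updated):
    print(row_id, updated[row_id])
```

Fields that Table 1 does not mention for a given ID keep their old values, which is the role IFNULL plays in the MERGE statement.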
