将2个Hive外部表格的数据插入新的外部表格中,并增加一列 [英] Insert data of 2 Hive external tables in new External table with additional column

查看:2462
本文介绍了将2个Hive外部表格的数据插入新的外部表格中,并增加一列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有2个外部配置单元表,如下所示。我已经使用sqoop从oracle中填充数据。

 创建外部表transaction_usa 

tran_id int,
acct_id int,
tran_date字符串,
数量double,
描述字符串,
branch_code字符串,
tran_state字符串,
tran_city字符串,
speendby字符串,
tran_zip int

行格式分隔
存储为文本文件
位置'/ user / stg / bank_stg / tran_usa';

创建外部表transaction_canada

tran_id int,
acct_id int,
tran_date字符串,
金额双倍,
说明字符串,
branch_code字符串,
tran_state字符串,
tran_city字符串,
speendby字符串,
tran_zip int

行格式分隔
存储为文本文件
位置'/ user / stg / bank_stg / tran_canada';

现在我想合并2个以上的表格数据,因为它在1个外部配置单元表中具有所有相同的字段如上面的两个表格,但是有一个额外的列来标识哪个数据来自哪个表格。新增外部表格,其中列为 source_table 。新外部表如下。

 创建外部表transaction_usa_canada 

tran_id int,
acct_id int,
tran_date字符串,
数量double,
描述字符串,
branch_code字符串,
tran_state字符串,
tran_city字符串,
speendby字符串,
tran_zip int,
source_table字符串

分隔行格式
存储为文本文件
位置'/ user / gds / bank_ds / tran_usa_canada ;

我该怎么做。

解决方案

您从每个表执行 SELECT ,并对这些操作执行 UNION ALL 操作结果并最终将结果插入到第三个表中。



以下是最终的配置单元查询:

  INSERT INTO TABLE transaction_usa_canada 
SELECT trans_id,acct_id,tran_date,amount,description,branch_code,tran_state,tran_city,speendby,tran_zip,'transaction_usa'AS source_table FROM transaction_usa
UNION ALL
SELECT tran_id,acct_id,tran_date,amount ,description,branch_code,tran_state,tran_city,speendby,tran_zip,'transaction_canada'AS source_table FROM transaction_canada;

希望这可以帮到你!!!

I have 2 external hive tables as follows. I have populated data in them from oracle using sqoop.

create external table transaction_usa
(
tran_id int,
acct_id int,
tran_date string,
amount double,
description string,
branch_code string,
tran_state string,
tran_city string,
speendby string,
tran_zip int
)
row format delimited
stored as textfile
location '/user/stg/bank_stg/tran_usa';

create external table transaction_canada
(
tran_id int,
acct_id int,
tran_date string,
amount double,
description string,
branch_code string,
tran_state string,
tran_city string,
speendby string,
tran_zip int
)
row format delimited
stored as textfile
location '/user/stg/bank_stg/tran_canada';

Now i want to merge above 2 tables data as it is in 1 external hive table with all same fields as in the above 2 tables but with 1 extra column to identify that which data is from which table. The new external table with additional column as source_table. The new external table is as follows.

create external table transaction_usa_canada
(
tran_id int,
acct_id int,
tran_date string,
amount double,
description string,
branch_code string,
tran_state string,
tran_city string,
speendby string,
tran_zip int,
source_table string
)
row format delimited
stored as textfile
location '/user/gds/bank_ds/tran_usa_canada';

how can I do it.?

解决方案

You do SELECT from each table and perform UNION ALL operation on these results and finally insert the result into your third table.

Below is the final hive query:

INSERT INTO TABLE transaction_usa_canada
SELECT tran_id, acct_id, tran_date, amount, description, branch_code, tran_state, tran_city, speendby, tran_zip, 'transaction_usa' AS source_table FROM transaction_usa
UNION ALL
SELECT tran_id, acct_id, tran_date, amount, description, branch_code, tran_state, tran_city, speendby, tran_zip, 'transaction_canada' AS source_table FROM transaction_canada;

Hope this help you!!!

这篇关于将2个Hive外部表格的数据插入新的外部表格中,并增加一列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆