How to update a table in Hive 0.13?
Question
My Hive version is 0.13. I have two tables, table_1 and table_2.
table_1 contains:
customer_id | items | price | updated_date
------------+-------+-------+-------------
10 | watch | 1000 | 20170626
11 | bat | 400 | 20170625
table_2 contains:
customer_id | items | price | updated_date
------------+----------+-------+-------------
10 | computer | 20000 | 20170624
I want to update a record in table_2 if its customer_id already exists there; if it does not, the record should be appended to table_2.
Because Hive 0.13 does not support UPDATE, I tried using a join, but it failed.
Recommended answer
You can use row_number() or a FULL JOIN. Here is an example using row_number():
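-- table_2 is the target, table_1 holds the new or changed rows
-- Hive requires the TABLE keyword after INSERT OVERWRITE
insert overwrite table table_2
select customer_id, items, price, updated_date
from
(
    -- number the rows of each customer_id so that the incoming row (new_flag = 1) comes first
    select customer_id, items, price, updated_date,
           row_number() over(partition by customer_id order by new_flag desc) rn
    from
    (
        -- incoming rows
        select customer_id, items, price, updated_date, 1 as new_flag
        from table_1
        union all
        -- existing rows
        select customer_id, items, price, updated_date, 0 as new_flag
        from table_2
    ) all_data
) s
-- keep exactly one row per customer_id
where rn = 1;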
Also see this answer for an update using FULL JOIN: https://stackoverflow.com/a/37744071/2700344
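For comparison, the FULL JOIN alternative might look roughly like the sketch below. It is not taken from the linked answer. It assumes that table_2 is the target and table_1 supplies the new rows, as described in the question, and that customer_id is unique within each table. The aliases o and n are illustrative names only.

```sql
-- Sketch only: assumes customer_id is unique in both tables
-- o = existing rows (table_2), n = incoming rows (table_1)
insert overwrite table table_2
select
    coalesce(n.customer_id, o.customer_id) as customer_id,
    -- when an incoming row exists, take all of its columns, otherwise keep the old row
    case when n.customer_id is not null then n.items        else o.items        end as items,
    case when n.customer_id is not null then n.price        else o.price        end as price,
    case when n.customer_id is not null then n.updated_date else o.updated_date end as updated_date
from table_2 o
full outer join table_1 n
    on o.customer_id = n.customer_id;
```

With the sample data from the question, table_2 would end up with the rows (10, watch, 1000, 20170626) and (11, bat, 400, 20170625). A CASE expression is used instead of a per-column COALESCE so that a NULL in an incoming column is not silently replaced by the old value.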