在Hive表格中插入覆盖分区 - 值重复 [英] Insert overwrite partition in Hive table - Values getting duplicated

查看:2545
本文介绍了在Hive表格中插入覆盖分区 - 值重复的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我使用非分区表创建了Hive表,并使用select查询将数据插入到分区 Hive表中。



指定网站 p>


  1. 通过以上链接,我的分区表包含重复值。以下是设置

这是我的Sample员工数据集: link1



我尝试了以下查询:

更新雇员ID为19到50000的Steven工资。
$ b


INSERT OVERWRITE TABLE Unm_Parti_Trail PARTITION(Department ='A')
SELECT employeeid ,firstname,指定,CASE WHEN employeeid = 19 THEN
50000 ELSE salary END AS工资FROM Unm_Parti_Trail;

值为获得重复。

  7 Nirmal Tech 12000 A 
7 Nirmal Tech 12000 B
A 中,但它被复制到部门 B / p>

我在做什么事情不对?

忘记了上一次INSERT OVERWRITE中的WHERE子句:

  INSERT INTO TABLE Unm_Parti_Trail PARTITION(Department ='A')
SELECT employeeid,firstname,designation,CASE WHEN employeeid = 19
THEN 50000 ELSE salary END AS salary FROM Unm_Parti_Trail
WHERE department ='A';


I created a Hive table with Non-partition table and using select query I inserted data into Partitioned Hive table.

Refered site

  1. By following above link my partition table contains duplicate values. Below are the setps

This is my Sample employee dataset:link1

I tried the following queries: link2

But after updating a value in Hive table,

Updating salary of Steven with EmployeeID 19 to 50000.

INSERT OVERWRITE TABLE Unm_Parti_Trail PARTITION (Department = 'A') SELECT employeeid,firstname,designation, CASE WHEN employeeid=19 THEN 50000 ELSE salary END AS salary FROM Unm_Parti_Trail;

the values are getting duplicated.

7       Nirmal  Tech    12000   A
7       Nirmal  Tech    12000   B

Nirmal is placed in Department A only but it is duplicated to department B.

Am I doing anything wrong?

Please suggest.

解决方案

It seems like you forgot the WHERE clause in your last INSERT OVERWRITE:

INSERT INTO TABLE Unm_Parti_Trail PARTITION (Department = 'A') 
SELECT employeeid,firstname,designation, CASE WHEN employeeid=19 
THEN 50000 ELSE salary END AS salary FROM Unm_Parti_Trail 
WHERE department = 'A';

这篇关于在Hive表格中插入覆盖分区 - 值重复的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆