Hive files on hdfs not being deleted when managed (not external) table is dropped

Problem description

When I drop a managed table from the Hive interactive command line, the underlying files that were created on hdfs in /user/hive/warehouse/<databasename>.db still exist. This causes issues when I recreate the table with the same name and try to do

INSERT INTO TABLE 

as it still contains the data that I loaded into those partitions (dt and hr partitions in my case) in my initial go around. Only if I use

INSERT OVERWRITE TABLE

will it then finally load the data properly, but my ETL needs to use INSERT INTO TABLE.

Any ideas? I'm about ready to just create the same table but with a different name, or just go in and delete the stuff on hdfs but I'm worried if that'll break the metastore or something. Lastly, I'm positive it is a managed table and not external.

Solution

Sometimes Hive will delete the table metadata but silently fail to move the files to the trash. Have you checked the permissions on /user/<user>/.Trash? Ensure that the ETL user has proper permission for this folder.
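As a rough sketch of that permission check: the listing for the trash folder can be obtained with `hdfs dfs -ls -d /user/<user>/.Trash`, and the small helper below reads one such line and reports whether a given user has write and execute access to the directory. The function name `can_use_trash`, the sample user `etl`, and the sample listing lines are hypothetical, and the check ignores ACLs and HDFS superuser privileges, so treat it as an illustration rather than a definitive test.

```python
def can_use_trash(ls_line, user, user_groups=()):
    """Return True if `user` can write into the directory described by one
    line of `hdfs dfs -ls -d <path>` output.

    Assumes the usual column layout:
    permissions, replication, owner, group, size, date, time, path.
    ACL entries and superuser rights are NOT considered (assumption).
    """
    fields = ls_line.split()
    perms, owner, group = fields[0], fields[2], fields[3]

    # Pick the permission triplet that applies: owner, then group, then other.
    if user == owner:
        bits = perms[1:4]
    elif group in user_groups:
        bits = perms[4:7]
    else:
        bits = perms[7:10]

    # Moving files into a directory needs write plus execute (traverse) on it.
    # A lowercase "t" in the last position means sticky bit with execute set.
    return bits[1] == "w" and bits[2] in ("x", "t")


if __name__ == "__main__":
    # Hypothetical listing line for the trash folder of user "hive".
    sample = "drwx------   - hive hadoop          0 2015-03-01 10:15 /user/hive/.Trash"
    print(can_use_trash(sample, "hive"))
    print(can_use_trash(sample, "etl", ["hadoop"]))
```

If the helper reports no access for the user that runs the DROP TABLE, that would be consistent with the files staying behind in the warehouse directory while the metadata is removed.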
