Hive files on HDFS not being deleted when managed (not external) table is dropped
Problem description
When I drop a managed table from the Hive interactive command line, the underlying files that were created on HDFS in /user/hive/warehouse/<databasename>.db still exist. This causes issues when I recreate the table with the same name and try to do INSERT INTO TABLE, as it still contains the data that I loaded into those partitions (dt and hr partitions in my case) in my initial go-around. Only if I use INSERT OVERWRITE TABLE will it then finally load the data properly, but my ETL needs to use INSERT INTO TABLE.
Any ideas? I'm about ready to just create the same table under a different name, or just go in and delete the files on HDFS, but I'm worried that doing so will break the metastore or something else. Lastly, I'm positive it is a managed table and not an external one.
Sometimes Hive will delete the table metadata but silently fail to move the files to the trash. Have you checked the permissions on /user/<user>/.Trash? Ensure that the ETL user has proper permissions on this folder.
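To make that check concrete, here is a minimal sketch in Python that only builds and prints the HDFS shell commands you might run; it does not execute anything. The helper name cleanup_check_commands and the example values "etl", "mydb" and "mytable" are hypothetical, and the <databasename>.db/<table> directory layout is an assumption based on the default warehouse path from the question.

```python
import shlex


def cleanup_check_commands(user, database, table,
                           warehouse="/user/hive/warehouse"):
    """Build (but do not run) HDFS commands for inspecting a dropped table.

    All arguments are placeholders; the <database>.db/<table> layout is
    assumed to be the default warehouse layout.
    """
    trash_dir = f"/user/{user}/.Trash"
    table_dir = f"{warehouse}/{database}.db/{table}"
    return [
        # 1. Show owner and permissions of the trash directory itself.
        ["hdfs", "dfs", "-ls", "-d", trash_dir],
        # 2. List whatever files the dropped table left behind.
        ["hdfs", "dfs", "-ls", "-R", table_dir],
        # 3. Destructive: permanently delete the leftovers, bypassing trash.
        ["hdfs", "dfs", "-rm", "-r", "-skipTrash", table_dir],
    ]


if __name__ == "__main__":
    # "etl", "mydb" and "mytable" are hypothetical example values.
    for argv in cleanup_check_commands("etl", "mydb", "mytable"):
        print(shlex.join(argv))
```

Review the printed commands before running any of them by hand. The third one bypasses the trash and cannot be undone, so it is probably best run only after confirming (for example with SHOW TABLES) that the table is really gone from the metastore.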