Hive Managed Table vs External Table: LOCATION directory

Problem description

I have been going through some Hive books and tutorials. One of the books, Hadoop in Practice, says:

When you create an external (unmanaged) table, Hive keeps the data in the directory specified by the LOCATION keyword intact. But if you were to execute the same CREATE command and drop the EXTERNAL keyword, the table would be a managed table, and Hive would move the contents of the LOCATION directory into /user/hive/warehouse/stocks, which may not be the behavior you expect.

I created a managed table with the LOCATION keyword and then loaded data into it from an HDFS file. However, I could not see any directory created under /user/hive/warehouse; instead, the new directory was created at the LOCATION I specified. So if I create a managed table with LOCATION, is nothing created in the Hive warehouse directory? Is this understanding correct?

Also, if the input file for the LOAD command is on HDFS, both internal and external tables will move the data to their own location. Is this understanding also correct?

Solution

In both cases (managed or external), LOCATION is optional. Whenever you specify LOCATION, the data is stored at that HDFS path, regardless of which kind of table you are creating. If you omit LOCATION, the default warehouse path configured in hive-site.xml (the hive.metastore.warehouse.dir property) is used.
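
One way to check this on your own cluster is sketched below in HiveQL. The table names, column definitions, and HDFS paths are hypothetical and chosen only for illustration. Behavior can also differ between Hive versions and distributions, so treat this as a sketch rather than a definitive recipe.

```sql
-- All table names, columns, and paths below are hypothetical.

-- Managed table WITH an explicit LOCATION: the data is expected to live
-- under the given path, not under the warehouse directory.
CREATE TABLE stocks_managed (symbol STRING, price DOUBLE)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/stocks_managed';

-- External table with an explicit LOCATION: same storage path rule.
CREATE EXTERNAL TABLE stocks_external (symbol STRING, price DOUBLE)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/stocks_external';

-- Managed table WITHOUT LOCATION: stored under the default warehouse
-- directory (hive.metastore.warehouse.dir in hive-site.xml).
CREATE TABLE stocks_default (symbol STRING, price DOUBLE)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- LOAD DATA INPATH without the LOCAL keyword moves the HDFS file into the
-- table's location; this applies to managed and external tables alike.
LOAD DATA INPATH '/tmp/stocks.csv' INTO TABLE stocks_managed;

-- Inspect the "Location" and "Table Type" rows of the output to confirm
-- where each table's data actually lives.
DESCRIBE FORMATTED stocks_managed;
DESCRIBE FORMATTED stocks_external;
DESCRIBE FORMATTED stocks_default;
```

Comparing the Location rows of the three DESCRIBE FORMATTED outputs should show the first two tables at their explicit paths and only the third under the warehouse directory.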
