Hadoop Hive: querying files from HDFS

Question

If I build Hive on top of HDFS, do I need to put all the files into the hive/warehouse folder before processing them? Can I query any file that is in HDFS through Hive? How?

Recommended answer

You don't have to do anything special to run Hive on top of your existing HDFS cluster. This follows from Hive's architecture: by default, Hive stores and reads its data on HDFS.

Do I need to put all the files into the hive/warehouse folder before processing them?

You don't have to do that either.

When you create a Hive table and load data into it from a file in HDFS using the LOAD command, the underlying file is automatically moved into the Hive warehouse. You don't have to do anything explicitly. But this comes at a cost: if you drop such a table, its files are deleted as well. Tables of this kind are called managed tables in Hive terminology.
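As a minimal sketch of the managed-table behavior described above: the table name `page_views`, its columns, and the HDFS path are hypothetical and only serve as illustration, and the comma-delimited text layout is an assumption about the input file.

```sql
-- Hypothetical table and path names, for illustration only.
CREATE TABLE page_views (
  user_id STRING,
  url     STRING,
  view_ts STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Without the LOCAL keyword the source path is an HDFS path, and the
-- file is moved (not copied) into the table's warehouse directory.
LOAD DATA INPATH '/data/incoming/page_views.csv' INTO TABLE page_views;

-- Dropping a managed table removes its metadata and its data files.
DROP TABLE page_views;
```

After the `LOAD DATA INPATH` statement the file should no longer be found at its original HDFS path, which is the "cost" mentioned above.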

To avoid this, you can use another type of table that Hive supports: external tables. When you create an external table over data that is already in HDFS, the underlying files are not moved into the warehouse. Only the metadata associated with the table is added to the Hive metastore. When you drop the table, only that metadata is removed from the metastore; the underlying files stay where they are. You just have to specify where the data lives (the HDFS directory containing the files) through the LOCATION clause when creating the external table.
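A sketch of the external-table variant, under the same assumptions as any illustration here: the table name `page_views_ext`, the columns, and the directory `/data/raw/page_views` are hypothetical.

```sql
-- Hypothetical table and path names, for illustration only.
-- LOCATION names an HDFS directory; Hive reads the files inside it.
CREATE EXTERNAL TABLE page_views_ext (
  user_id STRING,
  url     STRING,
  view_ts STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/raw/page_views';

-- Dropping an external table removes only the metastore entry;
-- the files under /data/raw/page_views are left untouched.
DROP TABLE page_views_ext;
```

Because nothing is moved or deleted, the same directory can keep being used by other tools while Hive queries it.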

Can I query any file that is in HDFS through Hive? How?

Yes. Create an external table that refers to the file by means of the LOCATION clause. You can then query the data in that file like any other Hive table.
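For instance, querying existing log files might look like the sketch below. The table `access_logs`, its columns, the tab delimiter, and the directory `/data/logs/access` are all assumptions made for the example; since LOCATION takes a directory, a single file would need to sit in a directory of its own.

```sql
-- Hypothetical names and layout, for illustration only.
CREATE EXTERNAL TABLE IF NOT EXISTS access_logs (
  client_ip STRING,
  request   STRING,
  status    INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '/data/logs/access';

-- Query the files in place, like any other Hive table.
SELECT status, COUNT(*) AS hits
FROM access_logs
GROUP BY status;

-- Shows the table type and the HDFS location the table points to.
DESCRIBE FORMATTED access_logs;
```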

Hope this answers your question.
