Hive - External table creation


Question




I am learning Hive and read an article about when to use a Hive external table. It included the statement below.

To query data stored in an external system such as Amazon S3 - avoid bringing that data into HDFS

Can anyone elaborate on the above statement, "avoid bringing that data into HDFS"? As I understand it, the LOAD DATA LOCAL command loads a local file into HDFS, and Hive applies the format on top.
Is it possible to access data that is outside HDFS?

Solution

Is it possible to access data that is outside HDFS?

Hive can read data on any Hadoop-compatible filesystem, not only HDFS.

Can someone elaborate on the above statement, "avoid bringing that data into HDFS"?

Taking S3 as an example, you can create an external table with a location of s3a://bucket/path. There's no need to bring the data into HDFS unless you really need the read speed of HDFS compared to S3. However, on an ephemeral cloud cluster, anything you want to persist should be written back to whatever long-term storage is provided.
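As a rough illustration of the point above, here is a minimal HiveQL sketch. The table names (web_logs, url_hits), the columns, the CSV layout, and the results/ sub-path are all hypothetical. It also assumes the cluster already has the S3A connector and credentials configured.

```sql
-- Hypothetical table and columns; the data files stay in S3.
CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
  ip  STRING,
  ts  STRING,
  url STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 's3a://bucket/path/';

-- Queries read the files directly from S3; nothing is copied into HDFS.
SELECT url, COUNT(*) AS hits
FROM web_logs
GROUP BY url;

-- To keep results after an ephemeral cluster is torn down,
-- write them to another external table backed by long-term storage.
-- The results/ location is an assumed path for illustration.
CREATE EXTERNAL TABLE IF NOT EXISTS url_hits (
  url  STRING,
  hits BIGINT
)
STORED AS TEXTFILE
LOCATION 's3a://bucket/path/results/';

INSERT OVERWRITE TABLE url_hits
SELECT url, COUNT(*) AS hits
FROM web_logs
GROUP BY url;
```

Because both tables are external, dropping them removes only the Hive metadata; the files under the S3 locations are left in place.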
