What is the use of HCatalog in Hadoop?

Question

I'm new to Hadoop. I know that HCatalog is a table and storage management layer for Hadoop, but how exactly does it work, and how do I use it? Please give a simple example.

Recommended answer

HCatalog supports reading and writing files in any format for which a Hive SerDe (serializer-deserializer) can be written. By default, HCatalog supports RCFile, CSV, JSON, and SequenceFile formats. To use a custom format, you must provide the InputFormat, OutputFormat, and SerDe.

HCatalog is built on top of the Hive metastore and incorporates components from the Hive DDL. HCatalog provides read and write interfaces for Pig and MapReduce and uses Hive’s command line interface for issuing data definition and metadata exploration commands.
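
As a simple example of the data definition side, the sketch below builds the command line for running one DDL statement through the `hcat` command-line tool and its `-e` option. The table name `web_logs`, its columns, and the partition key `datestamp` are made up for illustration, and the script only prints the command rather than running it, so it works on a machine without Hadoop installed.

```python
import shlex
from typing import List


def hcat_ddl_command(ddl: str) -> List[str]:
    """Build the argument list for running one DDL statement via `hcat -e`."""
    return ["hcat", "-e", ddl]


# Hypothetical table: web logs partitioned by date and stored as RCFile.
CREATE_WEB_LOGS = (
    "CREATE TABLE web_logs (ip STRING, url STRING) "
    "PARTITIONED BY (datestamp STRING) "
    "STORED AS RCFILE"
)

if __name__ == "__main__":
    argv = hcat_ddl_command(CREATE_WEB_LOGS)
    # Print a shell-quoted form of the command. On a machine where hcat is
    # installed, the same list could be passed to subprocess.run(argv).
    print(shlex.join(argv))
```

Once a table like this exists in the metastore, Pig and MapReduce jobs can refer to it by name instead of by file path and format, which is the main point of the read and write interfaces mentioned above.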

It also presents a REST interface to allow external tools access to Hive DDL (Data Definition Language) operations, such as "create table" and "describe table".
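
To make the REST interface more concrete, here is a minimal sketch that prepares "describe table" and "create table" requests without sending them. It assumes the WebHCat (Templeton) URL layout `/templeton/v1/ddl/database/<db>/table/<table>`, the usual default port 50111, and a `user.name` query parameter; the host, user, table, and column names are placeholders, so check the paths against the WebHCat reference for your version before relying on them.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request


def table_url(base: str, database: str, table: str, user: str) -> str:
    """Build the URL of a table resource (path layout is an assumption)."""
    path = "{}/templeton/v1/ddl/database/{}/table/{}".format(
        base.rstrip("/"), quote(database, safe=""), quote(table, safe="")
    )
    return path + "?" + urlencode({"user.name": user})


def describe_table_request(base, database, table, user) -> Request:
    """Prepare (but do not send) a GET request that describes a table."""
    return Request(table_url(base, database, table, user), method="GET")


def create_table_request(base, database, table, user, columns) -> Request:
    """Prepare (but do not send) a PUT request that creates a table.

    `columns` is a list of (name, type) pairs, e.g. [("ip", "string")].
    """
    body = json.dumps(
        {"columns": [{"name": name, "type": ctype} for name, ctype in columns]}
    ).encode("utf-8")
    return Request(
        table_url(base, database, table, user),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Placeholder server and names; nothing is sent over the network here.
    req = describe_table_request("http://localhost:50111", "default", "web_logs", "hadoop")
    print(req.get_method(), req.full_url)
```

Passing one of these `Request` objects to `urllib.request.urlopen` would actually send it to a running server, which should answer with a JSON document.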

HCatalog presents a relational view of data. Data is stored in tables and these tables can be placed into databases. Tables can also be partitioned on one or more keys. For a given value of a key (or set of keys) there will be one partition that contains all rows with that value (or set of values).
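
The partitioning rule can be illustrated with a small in-memory model. This is not HCatalog code, only a sketch of the idea that each distinct value (or combination of values) of the partition keys gets exactly one partition holding all matching rows. The rows and the key names `datestamp` and `region` are invented, and the `key=value` naming follows the convention Hive uses for partition directories.

```python
from collections import defaultdict


def partition_rows(rows, keys):
    """Group rows so that each distinct combination of key values
    maps to exactly one partition containing all rows with those values."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[tuple(row[k] for k in keys)].append(row)
    return dict(partitions)


def partition_spec(keys, values):
    """Render a partition as key=value segments, e.g. datestamp=20110924."""
    return "/".join("{}={}".format(k, v) for k, v in zip(keys, values))


if __name__ == "__main__":
    # Invented sample rows for a table partitioned on datestamp and region.
    rows = [
        {"ip": "10.0.0.1", "datestamp": "20110924", "region": "us"},
        {"ip": "10.0.0.2", "datestamp": "20110924", "region": "us"},
        {"ip": "10.0.0.3", "datestamp": "20110925", "region": "eu"},
    ]
    keys = ("datestamp", "region")
    for values, members in partition_rows(rows, keys).items():
        print(partition_spec(keys, values), len(members))
```

A job that only needs one day of data can then read a single partition instead of scanning the whole table.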

Most of this text is from https://cwiki.apache.org/confluence/display/Hive/HCatalog+UsingHCat.
