How to add a partition in Hive for a specific date?

Question

I'm using Hive (with external tables) to process data stored on Amazon S3.

My data is partitioned by date, one directory per day:

                       DIR   s3://test.com/2014-03-01/
                       DIR   s3://test.com/2014-03-02/
                       DIR   s3://test.com/2014-03-03/
                       DIR   s3://test.com/2014-03-04/
                       DIR   s3://test.com/2014-03-05/

s3://test.com/2014-03-05/ip-foo-request-2014-03-05_04-20_00-49.log
s3://test.com/2014-03-05/ip-foo-request-2014-03-05_06-26_19-56.log
s3://test.com/2014-03-05/ip-foo-request-2014-03-05_15-20_12-53.log
s3://test.com/2014-03-05/ip-foo-request-2014-03-05_22-54_27-19.log

How do I create a partitioned table over this layout using Hive? This is what I have so far:

CREATE EXTERNAL TABLE test (
    foo string,
    time string,
    bar string
)  PARTITIONED BY (? string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION 's3://test.com/';

Could somebody answer this question? Thanks!

Recommended answer

Start with the right table definition. In your case I'll use what you wrote, with dt as the partition column:

CREATE EXTERNAL TABLE test (
    foo string,
    time string,
    bar string
)  PARTITIONED BY (dt string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION 's3://test.com/';

By default, Hive expects partitions to be in subdirectories named by the convention s3://test.com/partitionkey=partitionvalue. For example:

s3://test.com/dt=2014-03-05

If you follow this convention, you can use MSCK REPAIR TABLE to add all partitions at once.
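
As a minimal sketch, assuming the S3 directories have already been renamed to the dt=... form and the table is the test table defined above, the repair and a follow-up check might look like this:

```sql
-- Assumes directories such as s3://test.com/dt=2014-03-05/ already exist.
-- Scans the table location and registers any partitions
-- that are present on storage but missing from the metastore.
MSCK REPAIR TABLE test;

-- List the partitions Hive now knows about.
SHOW PARTITIONS test;
```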

If you can't or don't want to use this naming convention, you will need to add each partition explicitly and point it at its directory, as in:

ALTER TABLE test
    ADD PARTITION (dt='2014-03-05')
    LOCATION 's3://test.com/2014-03-05';
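
Since the layout in the question has one directory per day, several partitions can be registered in one statement. The following is a sketch, assuming a Hive version that accepts multiple PARTITION clauses in a single ALTER TABLE. The IF NOT EXISTS clause is an addition here so the statement can be rerun without failing on partitions that are already registered:

```sql
-- Register three of the existing date directories as partitions of test.
ALTER TABLE test ADD IF NOT EXISTS
    PARTITION (dt='2014-03-01') LOCATION 's3://test.com/2014-03-01'
    PARTITION (dt='2014-03-02') LOCATION 's3://test.com/2014-03-02'
    PARTITION (dt='2014-03-03') LOCATION 's3://test.com/2014-03-03';

-- Filtering on dt lets Hive read only the matching partition's directory.
SELECT foo, bar FROM test WHERE dt = '2014-03-03' LIMIT 10;
```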
