Using multiple levels of partitions in Hive


Question

I am wondering if the following is possible. I have data in Hive partitioned by date and logger, but I also have data that does not fall under a particular logger.

e.g.

date=2012-01-01/logger=1/part000
date=2012-01-01/logger=1/part001
date=2012-01-01/logger=2/part000
date=2012-01-01/logger=2/part001
date=2012-01-01/part000

I created my table with:

create table mytable (
    ...
)
partitioned by (date string, logger int)
....
;

and added partitions:

alter table mytable add partition (date='2012-01-01', logger=1) location '/user/me/date=2012-01-01/logger=1/';
...

I can query data in the partitions, but I cannot query data in the file date=2012-01-01/part000. Is it possible to include this file without it conforming to the partitioning?

Thank you

Solution

Aaron, how did you manage to obtain such a structure? Usually, if a partition key is missing, Hive creates a partition called __HIVE_DEFAULT_PARTITION__.
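
To illustrate the default-partition behaviour mentioned above, here is a minimal HiveQL sketch, not a definitive fix. The tables staging_logs and mytable_demo and their columns (msg, log_date) are hypothetical and do not come from the question. The partition column `date` is backquoted because newer Hive versions treat it as a reserved word. The sketch assumes a dynamic-partition insert in which some rows have a NULL logger value.

```sql
-- Hypothetical staging table holding raw rows; some rows have logger = NULL.
CREATE TABLE staging_logs (msg STRING, log_date STRING, logger INT);

-- Hypothetical target table, partitioned the same way as in the question.
CREATE TABLE mytable_demo (msg STRING)
PARTITIONED BY (`date` STRING, logger INT);

-- Allow Hive to derive every partition value from the query itself.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Dynamic partition columns go last in the SELECT, in partition order.
-- Rows whose logger is NULL should be written under
-- logger=__HIVE_DEFAULT_PARTITION__ rather than directly under the date directory.
INSERT OVERWRITE TABLE mytable_demo PARTITION (`date`, logger)
SELECT msg, log_date, logger
FROM staging_logs;

-- Inspect which partitions were created.
SHOW PARTITIONS mytable_demo;
```

If the data is loaded this way, the rows without a logger stay queryable through the table, because they sit in a real second-level partition instead of as a loose file next to the logger directories.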
