How to dynamically partition a table by week using a date column in Hive


Problem description


There is a "Results" table which contains Id and Date columns.

create table Results(Id int, Date String)
row format delimited fields terminated by ','
stored as textfile;

Id  Date
11  2012-04-06
12  2012-05-08
13  2013-02-10
14  2013-05-06
15  2013-08-22
16  2014-04-01
17  2014-05-06
18  2014-06-03
19  2014-07-24
20  2014-08-26

How can I store the above data in a "Historical" table, dynamically partitioned by the year and week number derived from the Date column?

The Historical table should contain partitions based on year and week. The expected output is:

Historical partition

2012 partition contains 2 partitions

2013 partition contains 3 partitions

2014 partition contains 5 partitions

Solution

Since you want dynamic partitioning, do the following.

-- Set the following two properties for your Hive session:

SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Create a Historical table partitioned as below

hive> create table Historical (Id int, Date String) partitioned by (year_part string, week_no int) row format delimited fields terminated by ',';

-- Load the data into the Historical table by inserting from the Results table, so that each row is partitioned by the year of its date and by the week number computed dynamically from that date.

-- Make sure the partition columns come last in the select statement. If there are several partition columns, their order in partition(col3, col4) must match their order in the select statement.

hive> insert overwrite table Historical partition(year_part, week_no) select id, date, year(date) as year_part, WEEKOFYEAR(date) as week_no from Results;
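
To see which partition values the insert above derives for each row, the same two expressions can be run as a plain select against Results. This is only a sketch: the backticks around `date` are an assumption for newer Hive versions, where DATE is treated as a reserved word, and may be unnecessary on older ones. As far as I know, WEEKOFYEAR follows ISO week numbering, so a date right at a year boundary (for example 2012-12-31) could be reported as week 1 while year() still returns 2012. None of the sample rows is near a year boundary.

```sql
-- Preview the (year_part, week_no) pair that each row would be written to.
-- Backticks around `date` are an assumption; drop them if your Hive
-- version accepts the bare column name.
SELECT id,
       `date`,
       year(`date`)       AS year_part,
       weekofyear(`date`) AS week_no
FROM Results
ORDER BY `date`;
```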

-- Now verify that the partitions were created properly and that the data was populated correctly.
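
One way to do that check, as a minimal sketch against the Historical table defined above, is to list the partitions and then count the rows in each. For the sample data the question expects 2, 3, and 5 week partitions under 2012, 2013, and 2014 respectively, with one row in each.

```sql
-- List every partition that the dynamic insert created.
SHOW PARTITIONS Historical;

-- Count the rows that landed in each (year_part, week_no) partition.
SELECT year_part,
       week_no,
       count(*) AS row_count
FROM Historical
GROUP BY year_part, week_no
ORDER BY year_part, week_no;
```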
