Create table partition in Hive for year, month and day
My data folder has the structure below and holds two years of data (2015-2017).
AppData/CountryName/year/month/day/app1.json
For example:
AppData/India/2016/07/01/geek.json
AppData/India/2016/07/02/geek.json
AppData/US/2016/07/01/geek.json
I have created an external table with these partition columns:
PARTITIONED BY (Country String, Year String, Month String, day String)
After this, I need to add each partition with an ALTER TABLE statement:
ALTER TABLE mytable
ADD PARTITION (country='India', year='2016', month='07', day='01')
location 'AppData/India/2016/07/01/'
Writing an ADD PARTITION statement by hand for each and every day is not feasible.
Is there a simpler way to achieve this?
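Yes: msck repair table mytable, but not with your current directory naming convention.
MSCK REPAIR TABLE only discovers partition directories named in the key=value form, for example country=India/year=2016/month=07/day=01.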
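If renaming the existing directories is an option, the mapping from the current layout to the key=value layout is mechanical. The following is a minimal dry-run sketch, not a tested migration tool: it only prints the `hdfs dfs` commands and executes nothing. The helper names `to_partition_path` and `print_move_commands` are hypothetical, the directory list is hard-coded from the example in the question, and it assumes path components contain no whitespace.

```shell
#!/bin/sh
# Dry run: print the HDFS commands that would move each day directory
# from <root>/<country>/<year>/<month>/<day> to the key=value layout.
# Nothing is executed here; review the output before running it.

# Hypothetical helper (not part of Hive or Hadoop).
# Usage: to_partition_path <root> <country/year/month/day>
# The body runs in a subshell so the IFS change stays local.
to_partition_path() (
  root=$1
  IFS=/
  set -f            # no filename globbing while splitting
  set -- $2         # split the relative path on "/"
  printf '%s/country=%s/year=%s/month=%s/day=%s\n' "$root" "$1" "$2" "$3" "$4"
)

# Hypothetical helper: reads relative day directories on stdin.
print_move_commands() (
  root=$1
  while IFS= read -r rel; do
    [ -n "$rel" ] || continue
    target=$(to_partition_path "$root" "$rel")
    # The parent of the target must exist before the move.
    printf 'hdfs dfs -mkdir -p %s\n' "${target%/*}"
    printf 'hdfs dfs -mv %s/%s %s\n' "$root" "$rel" "$target"
  done
)

print_move_commands /AppData <<'EOF'
India/2016/07/01
India/2016/07/02
US/2016/07/01
EOF
```

Once the directories follow the key=value convention, a single msck repair table mytable picks up all of them.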
Demo
bash
hdfs dfs -mkdir -p /AppData/country=India/year=2016/month=07/day=01
hdfs dfs -mkdir -p /AppData/country=India/year=2016/month=07/day=02
hdfs dfs -mkdir -p /AppData/country=US/year=2016/month=07/day=01
hive
create table mytable (i int)
partitioned by (country string, year string, month string, day string)
location '/AppData'
;
hive> msck repair table mytable;
OK
Partitions not in metastore: mytable:country=India/year=2016/month=07/day=01 mytable:country=India/year=2016/month=07/day=02 mytable:country=US/year=2016/month=07/day=01
Repair: Added partition to metastore mytable:country=India/year=2016/month=07/day=01
Repair: Added partition to metastore mytable:country=India/year=2016/month=07/day=02
Repair: Added partition to metastore mytable:country=US/year=2016/month=07/day=01
hive> show partitions mytable;
OK
partition
country=India/year=2016/month=07/day=01
country=India/year=2016/month=07/day=02
country=US/year=2016/month=07/day=01
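If the directories cannot be renamed, an alternative is to generate the ADD PARTITION statements instead of writing them by hand. This is a sketch under assumptions rather than part of the demo above: `to_add_partition` and `print_add_partitions` are hypothetical helper names, the table name and root are taken from the question, the list of day directories is supplied on stdin (hard-coded here), and path components are assumed to contain no quotes or whitespace.

```shell
#!/bin/sh
# Print one ALTER TABLE ... ADD PARTITION statement per existing day
# directory, keeping the current <root>/<country>/<year>/<month>/<day> layout.

# Hypothetical helper (not part of Hive or Hadoop).
# Usage: to_add_partition <table> <root> <country/year/month/day>
# The body runs in a subshell so the IFS change stays local.
to_add_partition() (
  table=$1
  root=$2
  rel=$3
  IFS=/
  set -f            # no filename globbing while splitting
  set -- $rel       # $1=country $2=year $3=month $4=day
  printf "ALTER TABLE %s ADD IF NOT EXISTS PARTITION (country='%s', year='%s', month='%s', day='%s') LOCATION '%s/%s';\n" \
    "$table" "$1" "$2" "$3" "$4" "$root" "$rel"
)

# Hypothetical helper: reads relative day directories on stdin.
print_add_partitions() (
  table=$1
  root=$2
  while IFS= read -r rel; do
    [ -n "$rel" ] || continue
    to_add_partition "$table" "$root" "$rel"
  done
)

print_add_partitions mytable /AppData <<'EOF'
India/2016/07/01
India/2016/07/02
US/2016/07/01
EOF
```

The printed statements can be saved to a file and run with hive -f. IF NOT EXISTS makes the script safe to re-run after new days arrive.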