Dropping a Hive partition based on a certain condition at runtime
Problem description
I have a table in Hive that was created with the following command:
create table t1 (x int, y int, s string) partitioned by (wk int) stored as sequencefile;
The table has the data below:
select * from t1;
+-------+-------+-------+--------+--+
| t1.x | t1.y | t1.s | t1.wk |
+-------+-------+-------+--------+--+
| 1 | 2 | abc | 10 |
| 4 | 5 | xyz | 11 |
| 7 | 8 | pqr | 12 |
+-------+-------+-------+--------+--+
Now the requirement is to drop the oldest partition whenever the partition count is >= 2.
Can this be handled in HQL or through a shell script, and if so, how?
Note that I will be passing the database name as a variable, for example hive -e "use $dbname; show partitions t1"
If your partitions are ordered by date, you could write a shell script that runs hive -e 'SHOW PARTITIONS t1'
to get all partitions. In your example, it will return:
wk=10
wk=11
wk=12
Then you can issue hive -e 'ALTER TABLE t1 DROP PARTITION (wk=10)'
to remove the first (oldest) partition.
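So something like:
db=mydb
# List the partitions once; each line looks like wk=10
partitions=$(hive -e "use $db; SHOW PARTITIONS t1" | grep '^wk=')
# Nothing to drop when fewer than 2 partitions exist
if (( $(printf '%s\n' "$partitions" | grep -c '^wk=') < 2 )); then
exit 0
fi
# Oldest partition = smallest week number; sort numerically so wk=9 ranks before wk=10
partition=$(printf '%s\n' "$partitions" | sort -t= -k2,2n | head -n 1)
hive -e "use $db; ALTER TABLE t1 DROP PARTITION ($partition)"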
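If the database and table names vary at runtime, as the question suggests, the same steps can be wrapped in small functions. The following is only a sketch under a few assumptions: the function names (oldest_partition, partition_count, drop_oldest_partition) are hypothetical, the table has a single integer partition column, and the hive CLI is on the PATH. The selection logic is kept separate from the hive calls so that it can be checked without a cluster.

```shell
#!/usr/bin/env bash

# Read SHOW PARTITIONS output on stdin (lines such as "wk=10") and print
# the line with the smallest numeric value, i.e. the oldest partition.
oldest_partition() {
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=[0-9]+$' | sort -t= -k2,2n | head -n 1
}

# Count the partition lines on stdin.
partition_count() {
  grep -cE '^[A-Za-z_][A-Za-z0-9_]*=[0-9]+$'
}

# Hypothetical wrapper: drop the oldest partition of table $2 in database $1
# when at least $3 partitions exist (default 2, as in the question).
drop_oldest_partition() {
  local db=$1 table=$2 min=${3:-2} parts oldest
  parts=$(hive -e "use $db; SHOW PARTITIONS $table") || return 1
  if (( $(printf '%s\n' "$parts" | partition_count) < min )); then
    return 0   # too few partitions, nothing to drop
  fi
  oldest=$(printf '%s\n' "$parts" | oldest_partition)
  hive -e "use $db; ALTER TABLE $table DROP PARTITION ($oldest)"
}

# The selection step can be tried without Hive:
printf 'wk=10\nwk=9\nwk=11\n' | oldest_partition   # prints wk=9
```

With these definitions, drop_oldest_partition "$dbname" t1 would perform the check and the drop in one call. The numeric sort is a deliberate choice: taking the first line of a plain alphabetical listing would treat wk=10 as older than wk=9.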