How to reduce the number of files generated by "ALTER TABLE / PARTITION CONCATENATE" in Hive?
Problem description
Hive version: 1.2.1
Configuration:
set hive.execution.engine=tez;
set hive.merge.mapredfiles=true;
set hive.merge.smallfiles.avgsize=256000000;
set hive.merge.tezfiles=true;
HQL:
ALTER TABLE `table_name` PARTITION (partion_name1 = 'val1', partion_name2='val2', partion_name3='val3', partion_name4='val4') CONCATENATE;
I use this HQL statement to merge the files of a specific table partition. However, after execution there are still many files in the output directory, and their sizes are far smaller than 256000000 bytes. How can I reduce the number of output files?
BTW, using MapReduce instead of Tez didn't work either.
Maybe you can try INSERT OVERWRITE TABLE ... PARTITION ( ... ) SELECT * FROM ... instead.
Unlike CONCATENATE, this statement can use the merge setting for Tez files (hive.merge.tezfiles).
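As a rough sketch of that suggestion, the partition could be rewritten onto itself as shown below. The table name, partition columns, and values are the placeholders from the question. The dynamic-partition settings and hive.merge.size.per.task are assumptions added here, not part of the original answer: with a static PARTITION (col='val') clause, SELECT * would also return the partition columns and the column counts would not match, so this version lets SELECT * supply them dynamically. The columns in the PARTITION clause are assumed to be listed in the table's own partition order. It is probably worth trying this on a copy of the data first.

```sql
-- Sketch only: names and values are the placeholders used in the question.
set hive.execution.engine=tez;
set hive.merge.tezfiles=true;
set hive.merge.smallfiles.avgsize=256000000;
-- Assumption: target size for merged files, not mentioned in the original post.
set hive.merge.size.per.task=256000000;
-- Assumption: dynamic partitioning, so that SELECT * can supply the
-- partition columns (Hive returns them after the regular columns).
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- Rewrite the single partition selected by the WHERE clause onto itself.
INSERT OVERWRITE TABLE `table_name`
PARTITION (partion_name1, partion_name2, partion_name3, partion_name4)
SELECT *
FROM `table_name`
WHERE partion_name1 = 'val1'
  AND partion_name2 = 'val2'
  AND partion_name3 = 'val3'
  AND partion_name4 = 'val4';
```

Because the WHERE clause restricts the query to one partition, only that partition should receive rows and be overwritten. The merge step then runs as an extra stage after the insert when the average output file size is below hive.merge.smallfiles.avgsize.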