Reducer Selection in Hive
Question
I have the following records to process:
1000, 1001, 1002 to 1999,
2000, 2001, 2002 to 2999,
3000, 3001, 3002 to 3999
I want to process this record set with Hive so that reducer-1 handles records 1000 to 1999, reducer-2 handles records 2000 to 2999, and reducer-3 handles records 3000 to 3999. How can I do this?
Recommended answer
Use DISTRIBUTE BY. The mappers' output is grouped according to the DISTRIBUTE BY expression, and all rows with the same value of that expression are sent to the same reducer for processing:
select ...
from ...
distribute by case when col between 1000 and 1999 then 1
when col between 2000 and 2999 then 2
when col between 3000 and 3999 then 3
end
Or simply:
distribute by floor(col/1000)
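For reference, a complete query built around the expression above might look like the sketch below. The table name `records` and the columns `id` and `payload` are hypothetical placeholders, not part of the original question. The sketch also assumes that the reducer count is pinned to three with `mapreduce.job.reduces`, since DISTRIBUTE BY only decides which rows travel together, not how many reducers run.

```sql
-- Hypothetical table and column names, for illustration only.
-- Assumption: request exactly three reducers, one per 1000-wide range.
SET mapreduce.job.reduces=3;

SELECT id, payload
FROM records
-- floor(id / 1000) yields 1, 2 or 3 for ids 1000 to 3999,
-- so every row of a given range goes to the same reducer.
DISTRIBUTE BY floor(id / 1000)
-- Optional: order the rows inside each reducer.
SORT BY id;
```

Note that DISTRIBUTE BY guarantees only that rows with the same key land on the same reducer. Which physical reducer receives which key is determined by Hive's hashing of the key against the number of reducers, so the key 1 is not necessarily handled by the task numbered 1. In the CASE variant, values outside all three ranges evaluate to NULL and are grouped together on a single reducer.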