Hive unable to manually set number of reducers
I have the following Hive query:
select count(distinct id) as total from mytable;
which automatically spawns:
1408 Mappers
1 Reducer
I need to manually set the number of reducers and I have tried the following:
set mapred.reduce.tasks=50;
set hive.exec.reducers.max=50;
but none of these settings seem to be honored. The query takes forever to run. Is there a way to manually set the reducers or maybe rewrite the query so it can result in more reducers? Thanks!
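Writing a query in Hive like this:
SELECT COUNT(DISTINCT id) ....
will always use only one reducer, because a global distinct count with no GROUP BY has to be computed in a single reduce task. You should:
- Use this command to set the desired number of reducers:
set mapred.reduce.tasks=50;
- Rewrite the query as follows:
SELECT COUNT(*) FROM ( SELECT DISTINCT id FROM ... ) t;
This results in two map+reduce jobs instead of one, but the performance gain will be substantial: the DISTINCT stage is spread across the reducers, and the final stage only has to count rows that are already deduplicated.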
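Before swapping the query, it may be worth checking that the two forms return the same number on your data. The sketch below is not Hive: it uses Python's built-in sqlite3 module as a local stand-in, with a table called mytable and a column called id as in the question, and the helper name distinct_counts is made up for illustration. It relies on standard SQL semantics, which Hive is expected to share: COUNT(DISTINCT id) skips NULLs, while SELECT DISTINCT id keeps one NULL row that the outer COUNT(*) then counts.

```python
import sqlite3


def distinct_counts(ids):
    """Return (single_stage, two_stage, two_stage_non_null) counts for ids.

    SQLite is only a stand-in here; nothing in this sketch talks to Hive.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER)")
    conn.executemany(
        "INSERT INTO mytable (id) VALUES (?)", [(i,) for i in ids]
    )

    # Original form: one global aggregate over the whole table.
    single_stage = conn.execute(
        "SELECT COUNT(DISTINCT id) FROM mytable"
    ).fetchone()[0]

    # Rewritten form: deduplicate first, then count the surviving rows.
    two_stage = conn.execute(
        "SELECT COUNT(*) FROM (SELECT DISTINCT id FROM mytable) t"
    ).fetchone()[0]

    # Rewritten form with NULL ids filtered out, to match COUNT(DISTINCT id).
    two_stage_non_null = conn.execute(
        "SELECT COUNT(*) FROM "
        "(SELECT DISTINCT id FROM mytable WHERE id IS NOT NULL) t"
    ).fetchone()[0]

    conn.close()
    return single_stage, two_stage, two_stage_non_null


if __name__ == "__main__":
    print(distinct_counts([1, 2, 2, 3, 3, 3]))
    print(distinct_counts([1, 1, 2, None, None]))
```

If id can be NULL in your table, adding WHERE id IS NOT NULL to the inner query should keep the rewritten result identical to the original COUNT(DISTINCT id).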