Hive unable to manually set number of reducers
I have the following Hive query:
select count(distinct id) as total from mytable;
which automatically spawns:
1408 Mappers
1 Reducer
I need to manually set the number of reducers and I have tried the following:
set mapred.reduce.tasks=50
set hive.exec.reducers.max=50
but neither of these settings seems to be honored, and the query takes forever to run. Is there a way to set the number of reducers manually, or to rewrite the query so that it uses more reducers? Thanks!
Writing a query in Hive like this:
SELECT COUNT(DISTINCT id) ....
will always result in only one reducer being used, because a single global distinct count is computed in one reduce task. You should:
1. Use this command to set the desired number of reducers:
set mapred.reduce.tasks=50
2. Rewrite the query as follows:
SELECT COUNT(*) FROM ( SELECT DISTINCT id FROM ... ) t;
This will result in two map+reduce jobs instead of one, but the performance gain will be substantial, because the deduplication step is spread across many reducers.
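Applied to the table from the question, the rewrite might look like the sketch below. The reducer count of 50 and the table name mytable come from the question. The `WHERE id IS NOT NULL` filter is an addition that is not part of the original answer: `COUNT(DISTINCT id)` ignores NULLs, while `SELECT DISTINCT id` keeps NULL as one row, so without the filter the result could be off by one if the column contains NULLs.

```sql
-- Request 50 reducers for this session (value taken from the question).
-- On newer Hadoop versions the property is named mapreduce.job.reduces;
-- mapred.reduce.tasks is the older, deprecated name.
SET mapred.reduce.tasks=50;

-- Job 1 (inner query): DISTINCT shuffles rows by id, so deduplication
-- is spread across many reducers.
-- Job 2 (outer query): COUNT(*) over the already-deduplicated ids.
SELECT COUNT(*) AS total
FROM (
  SELECT DISTINCT id
  FROM mytable
  WHERE id IS NOT NULL  -- assumption: added to match COUNT(DISTINCT id), which skips NULLs
) t;
```

The final count still runs in a single reducer, but by then it only has to count one row per distinct id rather than deduplicate the whole table.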