Ability to limit maximum reducers for a hadoop hive mapred job?
I've tried prepending my query with:
set mapred.running.reduce.limit = 25;
And
set hive.exec.reducers.max = 35;
The last one jailed a job with 530 reducers down to 35... which makes me think it was going to try to shoehorn 530 reducers' worth of work into 35.
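That matches how Hive picks its reducer count: it estimates one reducer per chunk of input (hive.exec.reducers.bytes.per.reducer) and then caps the result at hive.exec.reducers.max, so lowering the cap does not drop work, it just packs the same partitions into fewer reducers. A minimal sketch of that logic, with illustrative numbers that are assumptions rather than measurements:

```python
import math

# Hedged sketch of Hive's reducer estimate: one reducer per
# bytes_per_reducer of input, capped at hive.exec.reducers.max.
def estimated_reducers(input_bytes, bytes_per_reducer, reducers_max):
    estimate = math.ceil(input_bytes / bytes_per_reducer)
    return min(estimate, reducers_max)

ONE_GB = 1024 ** 3

# ~530 GB of input at a 1 GB-per-reducer setting would want 530 reducers...
wanted = estimated_reducers(530 * ONE_GB, ONE_GB, 10000)

# ...but with hive.exec.reducers.max = 35, the same work is shoehorned into 35.
capped = estimated_reducers(530 * ONE_GB, ONE_GB, 35)

print(wanted, capped)  # 530 35
```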
Now giving
set mapred.tasktracker.reduce.tasks.maximum = 3;
a try to see if that number is some sort of max per node (previously it was 7 on a cluster with 70 potential reducers).
Update:
set mapred.tasktracker.reduce.tasks.maximum = 3;
Had no effect, but it was worth a try.
Not exactly a solution to the question, but potentially a good compromise.
set hive.exec.reducers.max = 45;
For a super query of doom that has 400+ reducers, this jails the most expensive hive task down to 45 reducers total. My cluster currently only has 10 nodes, each node supporting 7 reducers... so in reality only 70 reducers can run at one time. By jailing the job down to less than 70, I've noticed a slight improvement in speed without any visible changes to the final product. I'm testing this in production to figure out what exactly is going on here. In the interim it's a good compromise solution.
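The capacity arithmetic above can be sketched directly. This is a back-of-the-envelope model, not anything Hive computes itself; the slot counts come from the cluster described in this post:

```python
# Hedged sketch of the slot math: with 10 nodes and
# mapred.tasktracker.reduce.tasks.maximum = 7 per node, only 70 reduce
# slots exist cluster-wide. A job capped below 70 reducers fits in a
# single scheduling wave instead of queueing extra reduce tasks.

NODES = 10
REDUCE_SLOTS_PER_NODE = 7  # mapred.tasktracker.reduce.tasks.maximum

cluster_capacity = NODES * REDUCE_SLOTS_PER_NODE

reducers_max = 45  # hive.exec.reducers.max chosen for the query
waves = -(-reducers_max // cluster_capacity)  # ceiling division

print(cluster_capacity, waves)  # 70 1
```

With the cap at 45, all reducers launch at once; the uncapped 400+ reducer job would need six or more waves on the same slots.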