Ability to limit maximum reducers for a hadoop hive mapred job?


Question



I've tried prepending my query with:

set mapred.running.reduce.limit = 25;

And

 set hive.exec.reducers.max = 35;

The last one jailed a job with 530 reducers down to 35... which makes me think it was going to try to shoehorn 530 reducers' worth of work into 35.
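From what I can tell, Hive first estimates the reducer count from the input size and then applies the cap, which would explain the 530 -> 35 behavior. A rough sketch of how the two settings interact (the table name is made up and the defaults vary by Hive version):

  -- sketch only: Hive estimates reducers as roughly ceil(input bytes / bytes.per.reducer),
  -- then clamps that estimate to hive.exec.reducers.max
  set hive.exec.reducers.bytes.per.reducer = 1000000000;  -- ~1 GB of input per reducer
  set hive.exec.reducers.max = 35;                         -- hard ceiling on reducers for the job
  select key, count(*) from my_big_table group by key;     -- my_big_table is hypothetical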

Now giving

set mapred.tasktracker.reduce.tasks.maximum = 3;

a try to see if that number is some sort of max per node (previously it was 7 on a cluster with 70 potential reducers).

Update:

 set mapred.tasktracker.reduce.tasks.maximum = 3;

Had no effect, but it was worth a try.
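Presumably that's because mapred.tasktracker.reduce.tasks.maximum is a per-TaskTracker slot count that the daemon reads from mapred-site.xml at startup, not a per-job setting, so setting it from a Hive session gets ignored. If I actually wanted to change it, it would be something like this on each node (sketch only, and the TaskTracker would need a restart):

  <!-- mapred-site.xml on each node; takes effect after a TaskTracker restart -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>3</value>
  </property>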

Solution

Not exactly a solution to the question, but potentially a good compromise.

set hive.exec.reducers.max = 45;

For a super query of doom that has 400+ reducers, this jails the most expensive Hive task down to 35 reducers total. My cluster currently has only 10 nodes, each node supporting 7 reducers... so in reality only 70 reducers can run at one time. By jailing the job down to fewer than 70, I've noticed a slight improvement in speed without any visible change to the final product. I'm testing this in production to figure out what exactly is going on here. In the interim it's a good compromise solution.
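In other words, the compromise is just to keep the cap below the cluster's concurrent reduce-slot capacity so the job fits in a single reduce wave. Roughly, using this cluster's numbers (the query and table below are only placeholders):

  -- 10 nodes x 7 reduce slots per node = 70 reducers that can run at once
  set hive.exec.reducers.max = 45;   -- cap below 70, so all reducers run in one wave
  select dt, count(*)                -- hypothetical "query of doom"
  from giant_fact_table
  group by dt;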
