How to increase the number of mappers and reducers in Hadoop according to the number of instances, to improve performance?


Question

If I increase the number of mappers and decrease the number of reducers, does the performance of a job change (for better or worse) during execution?

I would also like to ask how to set the number of mappers and reducers. I have never played with this setting, so I don't know how it works. I know Hadoop, but I mostly work with it through Hive.

Also, if I want to increase the number of mappers and reducers, how do I set it, and up to what value? Does it depend on the number of instances (let's say 10)?

Please reply; I want to try this and check the performance. Thanks.

Recommended answer

Changing the number of mappers is a pure optimization and should not affect the results. Set the number so that it fully utilizes your cluster (if the cluster is dedicated). Start with a number of mappers per node equal to the number of cores. Watch CPU utilization and increase the number until CPU utilization is almost full or the system starts swapping. If you do not have enough memory, you may need fewer mappers than cores.
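As a rough illustration of the starting point described above, the sketch below takes the core count as the initial number of concurrent mappers per node and lowers it when memory is the tighter limit. The function names, the 2 GB reserved for the operating system and daemons, and the example node sizes are all assumptions made for illustration, not Hadoop settings. Treat the result as a first guess to refine by watching CPU utilization and swapping.

```python
def mappers_per_node(cores: int, node_memory_mb: int, task_memory_mb: int,
                     reserved_memory_mb: int = 2048) -> int:
    """Starting guess for concurrent map tasks on one node.

    Begins at one mapper per core and reduces the count when the node
    cannot hold that many tasks in memory. reserved_memory_mb is a
    hypothetical allowance for the OS and daemons.
    """
    if cores < 1 or task_memory_mb < 1:
        raise ValueError("cores and task_memory_mb must be positive")
    usable_mb = max(node_memory_mb - reserved_memory_mb, 0)
    fit_in_memory = usable_mb // task_memory_mb
    # Never suggest fewer than one task, never more than the core count.
    return max(1, min(cores, fit_in_memory))


def cluster_map_capacity(nodes: int, cores: int, node_memory_mb: int,
                         task_memory_mb: int) -> int:
    """Concurrent map tasks across identical nodes (hypothetical helper)."""
    return nodes * mappers_per_node(cores, node_memory_mb, task_memory_mb)


if __name__ == "__main__":
    # Ten instances, as in the question; the node sizes are made up.
    print(mappers_per_node(8, 32768, 2048))          # CPU is the limit
    print(mappers_per_node(8, 8192, 2048))           # memory is the limit
    print(cluster_map_capacity(10, 8, 32768, 2048))
```

From this starting value, raise or lower the count based on what the cluster actually shows under load.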
The number of reducers does affect the results, so if you need a specific number of reducers (such as 1), set it.
If you can handle the results from any number of reducers, apply the same optimization as for mappers.
In theory you can become I/O bound during this tuning process, so watch for that as well when tuning the number of tasks. You can recognize it by low CPU utilization despite an increase in the mapper/reducer count.
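The I/O-bound symptom described above can be turned into a simple check over your own tuning runs. This is only a sketch: the `Run` record, the `looks_io_bound` name, and the 50% threshold for "low" CPU utilization are assumptions chosen for illustration, and the utilization figures would come from whatever monitoring you already use.

```python
from typing import NamedTuple, Sequence


class Run(NamedTuple):
    """One tuning run (hypothetical record, not a Hadoop type)."""
    tasks_per_node: int
    cpu_utilization: float  # average over the run, from 0.0 to 1.0


def looks_io_bound(runs: Sequence[Run], low_cpu: float = 0.5) -> bool:
    """Return True if CPU stays low even though the task count went up.

    low_cpu is an assumed cut-off for "low" utilization; adjust it to
    what is normal for your cluster.
    """
    if len(runs) < 2:
        return False  # nothing to compare against
    ordered = sorted(runs, key=lambda r: r.tasks_per_node)
    fewest, most = ordered[0], ordered[-1]
    tasks_increased = most.tasks_per_node > fewest.tasks_per_node
    return tasks_increased and most.cpu_utilization < low_cpu


if __name__ == "__main__":
    flat = [Run(4, 0.35), Run(8, 0.38), Run(16, 0.40)]
    scaling = [Run(4, 0.45), Run(8, 0.92)]
    print(looks_io_bound(flat))
    print(looks_io_bound(scaling))
```

If the check fires, adding more mappers or reducers is unlikely to help, and disk or network throughput is the more promising place to look.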
