Number of reducers in Hadoop


Question

I was learning Hadoop, and I found the number of reducers very confusing:

1) The number of reducers is the same as the number of partitions.

2) The number of reducers is 0.95 or 1.75 multiplied by (no. of nodes) * (no. of maximum containers per node).

3) The number of reducers is set by mapred.reduce.tasks.

4) The number of reducers is closest to: a multiple of the block size, a task time between 5 and 15 minutes, and creating the fewest files possible.

I am very confused. Do we explicitly set the number of reducers, or is it determined by the MapReduce framework itself?

How is the number of reducers calculated? Please tell me how to calculate the number of reducers.

Answer

1 - The number of reducers is the same as the number of partitions - False. A single reducer might work on one or more partitions, but a chosen partition will be processed entirely by the reducer to which it is assigned.
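This many-to-one mapping comes from the partitioner. A minimal Python sketch of the logic in Hadoop's default HashPartitioner (the real class is `org.apache.hadoop.mapreduce.lib.partition.HashPartitioner` in Java; Python is used here only for illustration):

```python
# Sketch of Hadoop's default HashPartitioner logic: every key maps to
# exactly one partition, and each partition is consumed by exactly one
# reducer, but a reducer can receive several partitions when there are
# fewer reducers than distinct hash values.
def get_partition(key_hash, num_reduce_tasks):
    # Mask off the sign bit (the Java code ANDs with Integer.MAX_VALUE)
    # so the modulo result is always non-negative.
    return (key_hash & 0x7FFFFFFF) % num_reduce_tasks

# With 3 reducers, keys hashing to 0..8 fan out round-robin
# over partitions 0, 1, 2:
partitions = [get_partition(h, 3) for h in range(9)]
print(partitions)  # [0, 1, 2, 0, 1, 2, 0, 1, 2]
```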

2 - That is just a theoretical maximum number of reducers you could configure for a Hadoop cluster. It also depends heavily on the kind of data you are processing (which determines how much heavy lifting the reducers are burdened with).
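The 0.95 / 1.75 figures from the Hadoop documentation are only starting points, not a formula the framework applies. The arithmetic can be sketched as (cluster sizes below are hypothetical examples):

```python
# Rule-of-thumb reducer count from the Hadoop docs:
#   factor 0.95 -> all reducers launch in a single wave;
#   factor 1.75 -> faster nodes finish a first wave and start a second,
#                  which improves load balancing.
def suggested_reducers(nodes, max_containers_per_node, factor=0.95):
    return int(factor * nodes * max_containers_per_node)

# Hypothetical 10-node cluster with 8 containers per node:
print(suggested_reducers(10, 8))        # 76
print(suggested_reducers(10, 8, 1.75))  # 140
```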

3 - The mapred-site.xml configuration is just a suggestion to YARN. Internally, the ResourceManager runs its own algorithm and optimizes things on the fly, so that value is not necessarily the number of reducer tasks running every time.
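For reference, the value can be set cluster-wide in mapred-site.xml; `mapreduce.job.reduces` is the newer property name, with `mapred.reduce.tasks` as the deprecated alias. A sketch, where 10 is an arbitrary example value, not a recommendation:

```xml
<!-- mapred-site.xml: default number of reduce tasks per job.
     The value 10 is only an illustrative example. -->
<property>
  <name>mapreduce.job.reduces</name>
  <value>10</value>
</property>
```

Per job, the equivalent is `job.setNumReduceTasks(10)` in the Java API, or `-D mapreduce.job.reduces=10` on the command line; as the answer above notes, treat this as a request rather than a hard guarantee of what runs.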

4 - This one seems a bit unrealistic. My block size might be 128 MB, and I can't always have 128 * 5 as the minimum number of reducers. That, again, is false, I believe.

There is no fixed number of reducer tasks that can be configured or calculated. It depends on how many resources are actually available to allocate at that moment.
