How to set the precise max number of concurrently running tasks per node in Hadoop 2.4.0 on Elastic MapReduce

Question

According to http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/, the formula for determining the number of concurrently running tasks per node is:

min(yarn.nodemanager.resource.memory-mb / mapreduce.[map|reduce].memory.mb,
    yarn.nodemanager.resource.cpu-vcores / mapreduce.[map|reduce].cpu.vcores)

However, on setting these parameters to (for a cluster of c3.2xlarges):

yarn.nodemanager.resource.memory-mb = 14336
mapreduce.map.memory.mb = 2048
yarn.nodemanager.resource.cpu-vcores = 8
mapreduce.map.cpu.vcores = 1

I find I'm only getting up to 4 tasks running concurrently per node, when the formula says it should be 7. What's the deal?
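Plugging the values above into the formula gives the expected count:

    min(14336 / 2048, 8 / 1) = min(7, 8) = 7 concurrent map tasks per node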

I'm running Hadoop 2.4.0 on AMI 3.1.0.

Answer

My empirical formula was incorrect. The formula provided by Cloudera is the correct one and appears to give the expected number of concurrently running tasks, at least on AMI 3.3.1.
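For reference, here is a minimal sketch of where the properties from the question would normally be set in a stock Hadoop 2.4.0 install: the NodeManager resources go in yarn-site.xml and the per-task requests go in mapred-site.xml, each entry inside that file's <configuration> element. How those files actually get populated on an EMR AMI 3.x cluster (a bootstrap action versus editing the configuration directly) is an assumption about the setup, not something stated in the question or answer.

    <!-- yarn-site.xml: resources each NodeManager advertises to the ResourceManager -->
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>14336</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>8</value>
    </property>

    <!-- mapred-site.xml: resources requested by each map container -->
    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>mapreduce.map.cpu.vcores</name>
      <value>1</value>
    </property>

With these values the memory term (14336 / 2048 = 7) is the binding constraint, since the vcore term would allow 8 containers.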
