Hadoop Yarn Container Does Not Allocate Enough Space


Problem description



I'm running a Hadoop job, and in my yarn-site.xml file, I have the following configuration:

    <property>
            <name>yarn.scheduler.minimum-allocation-mb</name>
            <value>2048</value>
    </property>
    <property>
            <name>yarn.scheduler.maximum-allocation-mb</name>
            <value>4096</value>
    </property>

However, I still occasionally get the following error:

Container [pid=63375,containerID=container_1388158490598_0001_01_000003] is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 2.8 GB of 4.2 GB virtual memory used. Killing container.

I've found that by increasing yarn.scheduler.minimum-allocation-mb, the physical memory allocated for the container goes up. However, I don't always want 4GB being allocated for my container, and thought that by explicitly specifying a maximum size, I'd be able to go around this problem. I realize that Hadoop can't figure out how much memory it needs to allocate for the container before the mapper runs, so how should I go about allocating more memory for the container only if it needs that extra memory?

Solution

You should also properly configure the memory allocations for MapReduce. From this HortonWorks tutorial:

[...]

For our example cluster, we have the minimum RAM for a Container (yarn.scheduler.minimum-allocation-mb) = 2 GB. We’ll thus assign 4 GB for Map task Containers, and 8 GB for Reduce task Containers.

In mapred-site.xml:

mapreduce.map.memory.mb: 4096

mapreduce.reduce.memory.mb: 8192
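
In an actual mapred-site.xml these entries are written as property elements, in the same format as the yarn-site.xml snippet above; a minimal sketch using the tutorial's example values:

    <property>
            <name>mapreduce.map.memory.mb</name>
            <value>4096</value>
    </property>
    <property>
            <name>mapreduce.reduce.memory.mb</name>
            <value>8192</value>
    </property>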

Each Container will run JVMs for the Map and Reduce tasks. The JVM heap size should be set lower than the Map and Reduce memory defined above, so that it stays within the bounds of the Container memory allocated by YARN.

In mapred-site.xml:

mapreduce.map.java.opts: -Xmx3072m

mapreduce.reduce.java.opts: -Xmx6144m

The above settings configure the upper limit of the physical RAM that Map and Reduce tasks will use.
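
For reference, the heap options above likewise go into mapred-site.xml as property elements; a minimal sketch using the tutorial's example values:

    <property>
            <name>mapreduce.map.java.opts</name>
            <value>-Xmx3072m</value>
    </property>
    <property>
            <name>mapreduce.reduce.java.opts</name>
            <value>-Xmx6144m</value>
    </property>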

Finally, someone in this thread on the Hadoop mailing list had the same problem, and in their case it turned out to be a memory leak in their code.

