Container is running beyond memory limits

Problem Description

In Hadoop v1, I assigned 7 mapper and reducer slots, each with a size of 1 GB, and my mappers and reducers ran fine. My machine has 8 GB of memory and 8 processors. Now with YARN, when I run the same application on the same machine, I get a container error. By default, I have these settings:

  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value>
  </property>

It gave me this error:

Container [pid=28920,containerID=container_1389136889967_0001_01_000121] is running beyond virtual memory limits. Current usage: 1.2 GB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.

I then tried to set the memory limits in mapred-site.xml:

  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
  </property>

But I still get this error:

Container [pid=26783,containerID=container_1389136889967_0009_01_000002] is running beyond physical memory limits. Current usage: 4.2 GB of 4 GB physical memory used; 5.2 GB of 8.4 GB virtual memory used. Killing container.

I'm confused about why the map task needs this much memory. In my understanding, 1 GB of memory is enough for my map/reduce tasks. Why does the task use more memory as I assign more memory to the container? Is it because each task gets more splits? I feel it would be more efficient to decrease the container size a little and create more containers, so that more tasks run in parallel. The problem is: how can I make sure each container isn't assigned more splits than it can handle?

Solution

You should also properly configure the maximum memory allocations for MapReduce. From this HortonWorks tutorial:

[...]

Each machine in our cluster has 48 GB of RAM. Some of this RAM should be reserved for Operating System usage. On each node, we’ll assign 40 GB RAM for YARN to use and keep 8 GB for the Operating System.

For our example cluster, we have the minimum RAM for a Container (yarn.scheduler.minimum-allocation-mb) = 2 GB. We’ll thus assign 4 GB for Map task Containers, and 8 GB for Reduce tasks Containers.

In mapred-site.xml:

mapreduce.map.memory.mb: 4096

mapreduce.reduce.memory.mb: 8192

Each Container will run JVMs for the Map and Reduce tasks. The JVM heap size should be set to lower than the Map and Reduce memory defined above, so that they are within the bounds of the Container memory allocated by YARN.

In mapred-site.xml:

mapreduce.map.java.opts: -Xmx3072m

mapreduce.reduce.java.opts: -Xmx6144m

The above settings configure the upper limit of the physical RAM that Map and Reduce tasks will use.
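
Applied to the asker's 8 GB node, the same pattern might look like the mapred-site.xml fragment below. This is only an illustrative sketch following the tutorial's approach: the 4096 MB requests match what the asker already tried, and the -Xmx3072m heaps are an assumed value (roughly 75% of the container size), not settings taken from the original answer.

  <!-- Container memory requested from YARN for each map and reduce task
       (illustrative values for an 8 GB node). -->
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
  </property>
  <!-- JVM heap kept below the container size so that heap plus JVM
       overhead stays inside the limit the NodeManager enforces. -->
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx3072m</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx3072m</value>
  </property>

Note that with yarn.nodemanager.resource.memory-mb left at 8192, only two such 4 GB containers fit on the node at once, which is exactly the trade-off the asker raises: smaller containers would allow more tasks to run in parallel.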

To sum it up:

  1. In YARN, you should use the mapreduce configs, not the mapred ones. EDIT: This comment is not applicable anymore now that you've edited your question.
  2. What you are configuring is actually how much you want to request, not what is the max to allocate.
  3. The max limits are configured with the java.opts settings listed above; a per-job command-line sketch follows this list.
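
As a hypothetical illustration of points 2 and 3, the same request and heap limit can also be overridden per job on the command line rather than cluster-wide in mapred-site.xml. The jar name, driver class, and paths below are made up, and this assumes the driver runs through ToolRunner so that Hadoop's generic -D options are applied:

  hadoop jar my-job.jar com.example.MyDriver \
    -D mapreduce.map.memory.mb=4096 \
    -D mapreduce.map.java.opts=-Xmx3072m \
    -D mapreduce.reduce.memory.mb=4096 \
    -D mapreduce.reduce.java.opts=-Xmx3072m \
    /input/path /output/path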

Finally, you may want to check this other SO question that describes a similar problem (and solution).
