Container is running beyond memory limits


Problem description


In Hadoop v1, I assigned each of 7 mapper and reducer slots a size of 1GB, and my mappers & reducers ran fine. My machine has 8G memory and 8 processors. Now with YARN, when I run the same application on the same machine, I get a container error. By default, I have these settings:

  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value>
  </property>

It gives me this error:

Container [pid=28920,containerID=container_1389136889967_0001_01_000121] is running beyond virtual memory limits. Current usage: 1.2 GB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.


I then tried to set the memory limit in mapred-site.xml:

  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
  </property>

But I still got the error:

Container [pid=26783,containerID=container_1389136889967_0009_01_000002] is running beyond physical memory limits. Current usage: 4.2 GB of 4 GB physical memory used; 5.2 GB of 8.4 GB virtual memory used. Killing container.


I'm confused why the map task needs this much memory. In my understanding, 1GB of memory is enough for my map/reduce task. Why does the task use more memory as I assign more to the container? Is it because each task gets more splits? I feel it's more efficient to decrease the size of the container a little and create more containers, so that more tasks run in parallel. The problem is: how can I make sure each container won't be assigned more splits than it can handle?

Recommended answer


You should also properly configure the maximum memory allocations for MapReduce. From this HortonWorks tutorial:

[...]


Each machine in our cluster has 48 GB of RAM. Some of this RAM should be reserved for Operating System usage. On each node, we'll assign 40 GB RAM for YARN to use and keep 8 GB for the Operating System.


For our example cluster, we have the minimum RAM for a Container (yarn.scheduler.minimum-allocation-mb) = 2 GB. We’ll thus assign 4 GB for Map task Containers, and 8 GB for Reduce tasks Containers.


In mapred-site.xml:

mapreduce.map.memory.mb: 4096

mapreduce.reduce.memory.mb: 8192


Each Container will run JVMs for the Map and Reduce tasks. The JVM heap size should be set to lower than the Map and Reduce memory defined above, so that they are within the bounds of the Container memory allocated by YARN.


In mapred-site.xml:

mapreduce.map.java.opts: -Xmx3072m

mapreduce.reduce.java.opts: -Xmx6144m

The above settings configure the upper limit of the physical RAM that Map and Reduce tasks will use.
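
For reference, written as mapred-site.xml properties in the same style as the snippets earlier in the question, the tutorial's settings would look like this (these are the tutorial's values for a 48 GB node, not tuned for an 8 GB machine):

  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>8192</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx3072m</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx6144m</value>
  </property>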

To sum it up:

  1. In YARN, you should use the mapreduce configs, not the mapred ones. This comment is not applicable anymore now that you've edited your question.
  2. What you are configuring is actually how much you want to request, not the maximum that can be allocated.
  3. The max limits are configured with the java.opts settings listed above.
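
As a rough illustration only (these numbers are assumptions, not part of the answer above): on the 8 GB / 8-core machine from the question, one could keep yarn.nodemanager.resource.memory-mb at 8192, request smaller containers, and cap each JVM heap at roughly 80% of its container, for example:

  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1638m</value> <!-- roughly 80% of the 2048 MB map container; illustrative -->
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx3276m</value> <!-- roughly 80% of the 4096 MB reduce container; illustrative -->
  </property>

With request sizes like these, several 2 GB map containers can run in parallel within the node's 8 GB, keeping in mind that the MapReduce application master also occupies a container.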


Finally, you may want to check this other SO question that describes a similar problem (and solution).
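
A side note on the first error in the question ("2.2 GB of 2.1 GB virtual memory used"): the 2.1 GB ceiling is the container's 1 GB physical allocation multiplied by yarn.nodemanager.vmem-pmem-ratio, which defaults to 2.1. If the physical memory settings are sized correctly but containers are still killed by the virtual memory check, that ratio can be raised, or the check disabled, in yarn-site.xml; the values below are only a sketch of that workaround, not part of the recommended answer above:

  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value> <!-- illustrative value; the default is 2.1 -->
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value> <!-- alternatively, turn the virtual memory check off entirely -->
  </property>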
