"GC Overhead limit exceeded" on Hadoop .20 datanode


Problem description



I've searched and haven't found much information about Hadoop Datanode processes dying due to "GC overhead limit exceeded", so I thought I'd post a question.

We are running a test where we need to confirm that our Hadoop cluster can handle having roughly 3 million files stored on it (currently a 4-node cluster). We are using a 64-bit JVM, and we've allocated 8g to the namenode. However, as my test program writes more files to DFS, the datanodes start dying off with this error:

Exception in thread "DataNode: [/var/hadoop/data/hadoop/data]" java.lang.OutOfMemoryError: GC overhead limit exceeded
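
For reference, on Hadoop 0.20 a fixed namenode heap such as the 8g mentioned above is normally configured in hadoop-env.sh. A minimal sketch of how that allocation could look (the exact line is an assumption; the question does not show its actual setting):

# In $HADOOP_CONF_DIR/hadoop-env.sh -- illustrative only
export HADOOP_NAMENODE_OPTS="-Xmx8g $HADOOP_NAMENODE_OPTS"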

I saw some posts about some options (parallel GC?) that I guess can be set in hadoop-env.sh, but I'm not too sure of the syntax, and being kind of a newbie I didn't quite grok how it's done. Thanks for any help here!

Solution

Try increasing the memory for the datanode by using this (a Hadoop restart is required for it to take effect):

export HADOOP_DATANODE_OPTS="-Xmx10g"

This will set the heap to 10 GB... you can increase it as needed.

You can also put this line near the top of the $HADOOP_CONF_DIR/hadoop-env.sh file.
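
Put together, the relevant fragment of hadoop-env.sh might look like the sketch below. The -Xmx10g value comes from the answer above; the parallel-GC flag is only an illustration of the kind of GC option mentioned in the question, not something this answer requires:

# $HADOOP_CONF_DIR/hadoop-env.sh -- sketch only; adjust the heap size to your hardware
# Datanode heap, as given in the answer above
export HADOOP_DATANODE_OPTS="-Xmx10g"
# If you also want the parallel collector mentioned in the question, something like
# this could be used instead (illustrative; not part of the answer above):
# export HADOOP_DATANODE_OPTS="-Xmx10g -XX:+UseParallelGC"

After editing the file, restart the datanodes (for example with bin/stop-dfs.sh and bin/start-dfs.sh) so the new options take effect.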
