How to avoid OutOfMemoryException when running Hadoop?


Problem description



I'm running a Hadoop job over 1.5 TB of data that does a lot of pattern matching. I have several machines with 16 GB of RAM each, and I always get an OutOfMemoryException on this job with this data (I'm using Hive).

I would like to know how to optimally set the HADOOP_HEAPSIZE option in hadoop-env.sh so that my job doesn't fail. Is it even possible to set this option so that my jobs won't fail?

When I set HADOOP_HEAPSIZE to 1.5 GB and removed half of the pattern matching from the query, the job ran successfully. So what is this option for, if it doesn't help avoid job failures?

I meant to do more experimenting with the optimal setup, but since those jobs take more than 10 hours to run, I'm asking for your advice.

Solution

Is the job failing, or is your server crashing? If your job is failing because of OutOfMemory errors on the nodes, you can tweak the maximum number of maps and reducers, and the JVM opts for each, so that it never happens. mapred.child.java.opts (the default is -Xmx200m) usually has to be increased based on your data nodes' specific hardware.
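As a rough sketch, the per-task JVM heap can be raised in mapred-site.xml like this (the 1 GB value is only an illustration, not a recommendation for this cluster; it has to fit within the per-node slot budget discussed below):

    <!-- mapred-site.xml: heap for each spawned map/reduce task JVM.
         -Xmx1024m is an illustrative value only. -->
    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx1024m</value>
    </property>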

http://allthingshadoop.com/2010/04/28/map-reduce-tips-tricks-your-first-real-cluster/

Max tasks can be set up on the Namenode or overridden (and set final) on data nodes that may have different hardware configurations. The max tasks are set up for both mappers and reducers. To calculate this, it is based on the CPU (cores), the amount of RAM you have, and the JVM max you set in mapred.child.java.opts (the default is -Xmx200m). The DataNode and TaskTracker daemons are each set to 1 GB, so for an 8 GB machine mapred.tasktracker.map.tasks.maximum could be set to 7 and mapred.tasktracker.reduce.tasks.maximum set to 7, with mapred.child.java.opts set to -Xmx400m (assuming 8 cores). Please note that these task maximums are just as much determined by your CPU: if you only have 1 CPU with 1 core, then it is time to get new hardware for your data node, or to set the max tasks to 1. If you have 1 CPU with 4 cores, then setting map to 3 and reduce to 3 would be good (saving 1 core for the daemons).
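A sketch of what that 8 GB / 8-core example could look like in mapred-site.xml on such a data node (the numbers simply follow the example above, they are not a general recommendation):

    <!-- mapred-site.xml on an 8 GB, 8-core data node, following the example above -->
    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>7</value>
      <final>true</final>  <!-- marked final so job-level settings cannot override it -->
    </property>
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>7</value>
      <final>true</final>
    </property>
    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx400m</value>
    </property>

That budgets roughly 14 task slots x 400 MB = 5.6 GB for tasks, leaving room for the 1 GB DataNode and 1 GB TaskTracker daemons on an 8 GB machine.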

By default there is only one reducer, and you need to configure mapred.reduce.tasks to be more than one. This value should be somewhere between 0.95 and 1.75 times the number of maximum tasks per node times the number of data nodes. So if you have 3 data nodes set up with 7 max tasks each, configure this somewhere between roughly 20 and 36.
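For the example numbers (3 data nodes x 7 reduce slots = 21 slots), a value inside that range might be set like this; 28 is just one arbitrary choice within the range:

    <!-- mapred-site.xml (or a per-job override): total reduce tasks for a job.
         28 is just one value inside the range from the example above. -->
    <property>
      <name>mapred.reduce.tasks</name>
      <value>28</value>
    </property>

Since the question runs through Hive, this one is also commonly overridden per query with SET mapred.reduce.tasks=28; before the statement, without touching the cluster-wide files.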

If your server is crashing with OutOfMemory issues, then that is where HADOOP_HEAPSIZE comes in: it only sets the heap for the Hadoop daemon processes themselves (not for the execution of tasks).

Lastly, if your job is taking that long, another good configuration addition you can check is mapred.compress.map.output. Setting this value to true should speed up the reducers' copy phase greatly (it is a balance between the time to compress versus the time to transfer), especially when working with large data sets. Often jobs do take time, but there are also options to tweak to help speed things up =8^)
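A minimal sketch of turning that on (the codec line is optional, and the zlib-based DefaultCodec is just an assumed choice; any codec installed on the cluster works):

    <!-- mapred-site.xml: compress intermediate map output before it is shuffled to the reducers -->
    <property>
      <name>mapred.compress.map.output</name>
      <value>true</value>
    </property>
    <!-- optional: which codec to use; DefaultCodec assumed here -->
    <property>
      <name>mapred.map.output.compression.codec</name>
      <value>org.apache.hadoop.io.compress.DefaultCodec</value>
    </property>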

