OOM exception in Hadoop Reduce child

Problem description

I am getting an OOM exception (Java heap space) in the reduce child. In the reducer, I append all the values to a StringBuilder, which becomes the output of the reducer process. The number of values isn't that large. I tried increasing the value of mapred.reduce.child.java.opts to 512M and 1024M, but that doesn't help. The reducer code is given below.

    StringBuilder adjVertexStr = new StringBuilder();
    long itcount = 0;
    while (values.hasNext()) {
        adjVertexStr.append(values.next().toString()).append(" ");
        itcount++;
    }
    log.info("Size of iterator: " + itcount);
    multipleOutputs.getCollector("vertex", reporter).collect(key, new Text(""));
    multipleOutputs.getCollector("adjvertex", reporter).collect(adjVertexStr, new Text(""));

I get exceptions at 3 places in the above code.

  1. In the exception stack trace, the line number points to the while-loop statement where the strings are appended.
  2. At the last line: the collect() statement.
  3. I had a set to accumulate all the values, so that there would be no duplicate values. I removed it later.

Some sample sizes of iterator are as follows: 238695, 1, 13, 673, 1, 1 etc. These are not very large values. Why do I keep getting the OOM exception? Any help would be valuable to me.
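
For a rough sense of scale, here is a minimal standalone sketch (the vertex-id string is a made-up placeholder, not from the job): even the final buffer for the 238,695-value record runs to megabytes of char data, and StringBuilder's growth strategy briefly holds both the old array and the doubled new one during each resize, which is exactly where the stack trace below points (Arrays.copyOf in expandCapacity).

```java
public class Main {
    public static void main(String[] args) {
        // StringBuilder roughly doubles its backing char[] when it
        // overflows, so during a resize the old and new arrays are
        // live at the same time (the Arrays.copyOf frame).
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 238_695; i++) {
            // 20-char hypothetical id plus the separator space = 21 chars
            sb.append("someAdjacentVertexId").append(" ");
        }
        // ~5M chars (~10 MB of char data) for the final buffer alone,
        // with transient peaks well above that during resizes.
        System.out.println(sb.length());
    }
}
```

If the individual values are themselves long strings rather than short ids, the real buffer is larger still, which is why streaming the values out instead of buffering them scales better.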

Stack trace

2012-10-10 21:15:03,929 INFO partitioning.UndirectedGraphPartitioner: Size of iterator: 238695                                                                                                   
2012-10-10 21:15:04,190 INFO partitioning.UndirectedGraphPartitioner: Size of iterator: 1                                                                                                        
2012-10-10 21:15:04,190 INFO partitioning.UndirectedGraphPartitioner: Size of iterator: 1                                                                                                        
2012-10-10 21:15:04,190 INFO partitioning.UndirectedGraphPartitioner: Size of iterator: 13                                                                                                       
2012-10-10 21:15:04,190 INFO partitioning.UndirectedGraphPartitioner: Size of iterator: 1                                                                                                        
2012-10-10 21:15:04,191 INFO partitioning.UndirectedGraphPartitioner: Size of iterator: 1                                                                                                        
2012-10-10 21:15:04,193 INFO partitioning.UndirectedGraphPartitioner: Size of iterator: 673                                                                                                       
2012-10-10 21:15:04,195 INFO partitioning.UndirectedGraphPartitioner: Size of iterator: 1                                                                                                        
2012-10-10 21:15:04,196 INFO partitioning.UndirectedGraphPartitioner: Size of iterator: 1                                                                                                        
2012-10-10 21:15:04,196 INFO partitioning.UndirectedGraphPartitioner: Size of iterator: 1                                                                                                        
2012-10-10 21:15:04,196 INFO partitioning.UndirectedGraphPartitioner: Size of iterator: 1                                                                                                        
2012-10-10 21:15:04,196 INFO partitioning.UndirectedGraphPartitioner: Size of iterator: 1                                                                                                        
2012-10-10 21:15:09,856 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-10-10 21:15:09,916 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
2012-10-10 21:15:09,916 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName hduser for UID 2006 from the native implementation
2012-10-10 21:15:09,922 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2882)                                                                                                                                                      
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)                                                                                                                 
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)                                                                                                                         
    at java.lang.StringBuilder.append(StringBuilder.java:119)                                                                                                                                         
    at partitioning.UndirectedGraphPartitioner$Reduce.reduce(UndirectedGraphPartitioner.java:106)                                                                                            
    at partitioning.UndirectedGraphPartitioner$Reduce.reduce(UndirectedGraphPartitioner.java:82)                                                                                             
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:519)                                                                                                                         
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)                                                                                                                                   
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)                                                                                                                                           
    at java.security.AccessController.doPrivileged(Native Method)                                                                                                                                     
    at javax.security.auth.Subject.doAs(Subject.java:396)                                                                                                                                             
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)                                                                                                           
    at org.apache.hadoop.mapred.Child.main(Child.java:249) 

Answer

So for your example, you want to output the values for a particular key as a space-separated list (as the output key), with an empty Text as the output value.

Your output format for this would consume the reduce key / values as follows (this would be in your reducer code):

while (values.hasNext()) {
    // Old mapred API: values is an Iterator<Text>, so iterate explicitly
    multipleOutputs.getCollector("adjvertex", reporter)
        .collect(key, values.next());
}

The actual recordWriter would then use the key as a logic trigger:

When a key is passed that is different from the previously passed key, the record currently being written is closed out (by writing a tab followed by a newline, for example). The previous key is updated and the new value is written to the output stream.

If the key is the same as the previous key, then output a space followed by the value to the output stream.

In the close method for the record writer, perform the same logic as if a new key was being passed (output a tab, followed by a newline).

Hope this makes sense. The only thing you need to be careful of is if you have a custom group comparator (which will cause the previous key comparison in the record writer to fail). Also remember to make a deep copy of the key when updating the previous key tracking variable.
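
The trigger logic above can be sketched in plain Java. This is an illustrative standalone version, not the Hadoop RecordWriter API: Text is replaced by String, and the class and method names are made up for the sketch.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Standalone sketch of the record-writer logic described above; in a
// real job this would be a custom RecordWriter<Text, Text>.
class AdjacencyWriter {
    private final Writer out;
    private String previousKey = null;  // key tracking variable

    AdjacencyWriter(Writer out) { this.out = out; }

    void write(String key, String value) throws IOException {
        if (key.equals(previousKey)) {
            out.write(" ");        // same key: extend the current record
        } else {
            if (previousKey != null) {
                out.write("\t\n"); // new key: close out the previous record
            }
            // With Hadoop's reused Text instances this must be a deep copy;
            // plain Strings are immutable, so assignment suffices here.
            previousKey = key;
        }
        out.write(value);
    }

    void close() throws IOException {
        if (previousKey != null) {
            out.write("\t\n");     // close the final record
        }
        out.close();
    }
}

public class Main {
    public static void main(String[] args) throws IOException {
        StringWriter sink = new StringWriter();
        AdjacencyWriter w = new AdjacencyWriter(sink);
        w.write("a", "1");
        w.write("a", "2");  // same key: record becomes "1 2"
        w.write("b", "3");  // key change: closes out the "a" record
        w.close();
        // Show the control characters explicitly
        System.out.print(sink.toString().replace("\t", "\\t").replace("\n", "\\n"));
        // prints: 1 2\t\n3\t\n
    }
}
```

Note how each reduce key produces exactly one output line, without the reducer ever buffering the whole value list in memory.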
