Hadoop Spill failure


I'm currently working on a project using Hadoop 0.21.0 (985326) and a cluster of 6 worker nodes and a head node. Submitting a regular MapReduce job fails, but I have no idea why. Has anybody seen this exception before?

org.apache.hadoop.mapred.Child: Exception running child : java.io.IOException: Spill failed
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.checkSpillException(MapTask.java:1379)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$200(MapTask.java:711)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1193)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.io.Text.write(Text.java:290)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:100)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:84)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:967)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:583)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:92)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:111)
    at be.ac.ua.comp.ronny.riki.invertedindex.FilteredInvertedIndexBuilder$Map.map(FilteredInvertedIndexBuilder.java:113)
    at be.ac.ua.comp.ronny.riki.invertedindex.FilteredInvertedIndexBuilder$Map.map(FilteredInvertedIndexBuilder.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:652)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
    at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.io.ArrayWritable.<init>()
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:123)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:68)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:44)
    at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:145)
    at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
    at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:291)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:168)
    at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1432)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1457)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$600(MapTask.java:711)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1349)
Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.io.ArrayWritable.<init>()
    at java.lang.Class.getConstructor0(Class.java:2706)
    at java.lang.Class.getDeclaredConstructor(Class.java:1985)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    ... 10 more

Currently I'm experimenting with some configuration parameters in the hope that this error disappears, but so far without success. The configuration parameters I'm tweaking are:

  • mapred.map.tasks = 60
  • mapred.reduce.tasks = 12
  • Job.MAP_OUTPUT_COMPRESS (or mapreduce.map.output.compress) = true
  • Job.IO_SORT_FACTOR (or mapreduce.task.io.sort.factor) = 10
  • Job.IO_SORT_MB (or mapreduce.task.io.sort.mb) = 256
  • Job.MAP_JAVA_OPTS (or mapreduce.map.java.opts) = "-Xmx256" or "-Xmx512"
  • Job.REDUCE_JAVA_OPTS (or mapreduce.reduce.java.opts) = "-Xmx256" or "-Xmx512"

Can anybody explain why the exception above occurs and how to avoid it? Or just give a short explanation of what the Hadoop spill operation implies?
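For reference, the parameters above can be set programmatically on the job roughly as follows. This is only a sketch: the mapreduce.* keys are the new-style names in the 0.21 branch (the old mapred.* names are deprecated aliases), the values are simply the ones listed above, and the class and job names are illustrative. Note that -Xmx values normally need a unit suffix such as m; a bare -Xmx256 would mean 256 bytes.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class JobConfigSketch {
        // Illustrative only: applies the tuning parameters listed above to a job.
        public static Job configure(Configuration conf) throws Exception {
            conf.setBoolean("mapreduce.map.output.compress", true);
            conf.setInt("mapreduce.task.io.sort.factor", 10);
            conf.setInt("mapreduce.task.io.sort.mb", 256);
            conf.set("mapreduce.map.java.opts", "-Xmx512m");    // 'm' suffix, not plain -Xmx512
            conf.set("mapreduce.reduce.java.opts", "-Xmx512m");
            conf.setInt("mapreduce.job.maps", 60);              // new name of mapred.map.tasks (hint only)
            conf.setInt("mapreduce.job.reduces", 12);           // new name of mapred.reduce.tasks
            return new Job(conf, "inverted-index");             // hypothetical job name
        }
    }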

Solution

Ok, all problems are solved.

The MapReduce serialization code internally needs a default (no-argument) constructor for org.apache.hadoop.io.ArrayWritable.
Hadoop's implementation doesn't provide a default constructor for ArrayWritable.
That's why java.lang.NoSuchMethodException: org.apache.hadoop.io.ArrayWritable.<init>() was thrown and caused the weird spill exception.
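The stack trace makes the mechanism visible: during sortAndSpill the combiner deserializes map output values, and WritableSerialization's deserializer creates each value instance reflectively. A minimal sketch of that reflective step (not the actual Hadoop source) reproduces the error outside a job:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.ArrayWritable;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.util.ReflectionUtils;

    public class SpillFailureCause {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // ReflectionUtils.newInstance looks up the no-argument constructor of the
            // value class. ArrayWritable does not have one, so this throws a
            // RuntimeException caused by
            // java.lang.NoSuchMethodException: org.apache.hadoop.io.ArrayWritable.<init>()
            Writable value = ReflectionUtils.newInstance(ArrayWritable.class, conf);
            System.out.println(value);
        }
    }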

A simple wrapper made ArrayWritable really writable and fixed it! Strange that Hadoop did not provide this.
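The post does not show the wrapper itself, but the usual pattern is a subclass that supplies the missing no-argument constructor. A minimal sketch, assuming the array holds Text elements (the class name and element type are illustrative):

    import org.apache.hadoop.io.ArrayWritable;
    import org.apache.hadoop.io.Text;

    // The no-argument constructor tells Hadoop's reflection-based deserializer
    // which element type to instantiate, which plain ArrayWritable cannot do.
    public class TextArrayWritable extends ArrayWritable {
        public TextArrayWritable() {
            super(Text.class);               // the constructor the spill/combine path needs
        }

        public TextArrayWritable(Text[] values) {
            super(Text.class, values);       // convenience constructor for the mapper
        }
    }

Use the wrapper as the map output value class (and in the Mapper/Combiner signatures) instead of ArrayWritable itself.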
