Implementation of custom Writable in Hadoop?
Question
I have defined a custom Writable class in Hadoop, but Hadoop gives me the following error message when running my program.
java.lang.RuntimeException: java.lang.NullPointerException
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:115)
at org.apache.hadoop.io.SortedMapWritable.readFields(SortedMapWritable.java:180)
at EquivalenceClsAggValue.readFields(EquivalenceClsAggValue.java:82)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1282)
at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1222)
at org.apache.hadoop.mapred.Task$CombineValuesIterator.next(Task.java:1301)
at Mondrian$Combine.reduce(Mondrian.java:119)
at Mondrian$Combine.reduce(Mondrian.java:1)
at org.apache.hadoop.mapred.Task$OldCombinerRunner.combine(Task.java:1442)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1436)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1298)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:437)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.NullPointerException at java.util.concurrent.ConcurrentHashMap.hash(ConcurrentHashMap.java:332)....
EquivalenceClsAggValue is the name of the Writable class I've defined, and this is my class:
public class EquivalenceClsAggValue implements WritableComparable<EquivalenceClsAggValue> {

    public ArrayList<SortedMapWritable> aggValues;

    public EquivalenceClsAggValue() {
        aggValues = new ArrayList<SortedMapWritable>();
    }

    @Override
    public void readFields(DataInput arg0) throws IOException {
        int size = arg0.readInt();
        for (int i = 0; i < size; i++) {
            SortedMapWritable tmp = new SortedMapWritable();
            tmp.readFields(arg0);
            aggValues.add(tmp);
        }
    }

    @Override
    public void write(DataOutput arg0) throws IOException {
        //write the size first
        arg0.write(aggValues.size());
        //write each element
        for (SortedMapWritable s : aggValues) {
            s.write(arg0);
        }
    }
I'd like to know what the source of the problem is.
Answer
Looks like an error in your write(DataOutput) method:
@Override
public void write(DataOutput arg0) throws IOException {
    //write the size first
    // arg0.write(aggValues.size()); // here you're writing an int as a byte
    // try this instead:
    arg0.writeInt(aggValues.size()); // actually write int as an int
    //..
See DataOutput.write(int) vs DataOutput.writeInt(int).
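The difference matters on the wire: DataOutput.write(int) emits only the low 8 bits of its argument as a single byte, while writeInt(int) emits all four bytes. A minimal stdlib-only sketch (the class name is mine, not from the original post):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class WriteVsWriteInt {

    // Bytes produced by DataOutput.write(int): only the low 8 bits.
    static int bytesFromWrite(int v) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        new DataOutputStream(buf).write(v);
        return buf.size();
    }

    // Bytes produced by DataOutput.writeInt(int): all four bytes.
    static int bytesFromWriteInt(int v) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        new DataOutputStream(buf).writeInt(v);
        return buf.size();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(bytesFromWrite(3));    // 1
        System.out.println(bytesFromWriteInt(3)); // 4
    }
}
```

Since readFields calls readInt(), which consumes four bytes, the single byte written by write(int) leaves the stream misaligned, and subsequent reads deserialize garbage; that is a plausible path to the NullPointerException in the stack trace above.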
I'd also amend your creation of the SortedMapWritable tmp local variable in readFields to use ReflectionUtils.newInstance():
@Override
public void readFields(DataInput arg0) throws IOException {
    int size = arg0.readInt();
    for (int i = 0; i < size; i++) {
        SortedMapWritable tmp = ReflectionUtils.newInstance(
                SortedMapWritable.class, getConf());
        tmp.readFields(arg0);
        aggValues.add(tmp);
    }
}
Note that for this to work, you'll also need to amend your class signature to extend Configured (which implements Configurable, so that Hadoop will inject a Configuration object when your object is initially created):
public class EquivalenceClsAggValue
extends Configured
implements WritableComparable<EquivalenceClsAggValue> {
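The underlying rule is that write and readFields must be exact mirrors of each other: every value written with writeX must be read back with the matching readX, in the same order. A plain-JDK sketch of that symmetry, with a list of ints standing in for the list of SortedMapWritables (no Hadoop dependency; names are mine):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class RoundTripSketch {

    // Mirrors the corrected write(): size first as a full int, then elements.
    static void write(List<Integer> values, DataOutput out) throws IOException {
        out.writeInt(values.size());
        for (int v : values) {
            out.writeInt(v);
        }
    }

    // Mirrors readFields(): readInt matches the writeInt above, field for field.
    static List<Integer> readFields(DataInput in) throws IOException {
        int size = in.readInt();
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            values.add(in.readInt());
        }
        return values;
    }

    // Serialize, then deserialize from the same bytes.
    static List<Integer> roundTrip(List<Integer> values) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        write(values, new DataOutputStream(buf));
        return readFields(new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray())));
    }

    public static void main(String[] args) throws IOException {
        List<Integer> in = List.of(7, 42, -1);
        System.out.println(roundTrip(in).equals(in)); // true
    }
}
```

With the original write(int) bug, the reader's readInt() would consume the size byte plus three bytes of the first element, and the round trip would fail in exactly the misaligned way the stack trace suggests.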