How to serialize a Java object in Hadoop?


Problem Description



An object should implement the Writable interface in order to be serialized when transmitted in Hadoop. Take the Lucene ScoreDoc class as an example:

public class ScoreDoc implements java.io.Serializable {

  /** The score of this document for the query. */
  public float score;

  /** Expert: A hit document's number.
   * @see Searcher#doc(int) */
  public int doc;

  /** Only set by {@link TopDocs#merge} */
  public int shardIndex;

  /** Constructs a ScoreDoc. */
  public ScoreDoc(int doc, float score) {
    this(doc, score, -1);
  }

  /** Constructs a ScoreDoc. */
  public ScoreDoc(int doc, float score, int shardIndex) {
    this.doc = doc;
    this.score = score;
    this.shardIndex = shardIndex;
  }

  // A convenience method for debugging.
  @Override
  public String toString() {
    return "doc=" + doc + " score=" + score + " shardIndex=" + shardIndex;
  }
}

How should I serialize it with the Writable interface? What is the connection between the Writable and java.io.Serializable interfaces?

Solution

I think it won't be a good idea to tamper with the built-in Lucene class. Instead, have your own class that contains a field of type ScoreDoc and implements Hadoop's Writable interface. It would be something like this:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;
import org.apache.lucene.search.ScoreDoc;

public class MyScoreDoc implements Writable {

  private ScoreDoc sd;

  // Hadoop instantiates Writables reflectively, so a no-arg constructor is required
  public MyScoreDoc() {
  }

  public MyScoreDoc(ScoreDoc sd) {
    this.sd = sd;
  }

  public void write(DataOutput out) throws IOException {
      // sd.toString() yields "doc=<doc> score=<score> shardIndex=<shardIndex>"
      String[] splits = sd.toString().split(" ");

      // parse each field value out of the string
      int doc = Integer.parseInt(splits[0].split("=")[1]);
      float score = Float.parseFloat(splits[1].split("=")[1]);
      int shardIndex = Integer.parseInt(splits[2].split("=")[1]);

      out.writeInt(doc);
      out.writeFloat(score);   // score is a float, so writeFloat, not writeInt
      out.writeInt(shardIndex);
  }

  public void readFields(DataInput in) throws IOException {
      // read the fields back in the same order they were written
      int doc = in.readInt();
      float score = in.readFloat();
      int shardIndex = in.readInt();

      // note the constructor order: ScoreDoc(doc, score, shardIndex)
      sd = new ScoreDoc(doc, score, shardIndex);
  }

  //String toString()
}
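
To see the Writable contract in action outside a full MapReduce job, you can round-trip the wrapper through plain Java streams, the same DataOutput/DataInput path Hadoop uses when shuffling values. This is a minimal sketch, not part of the original answer; it assumes the two constructors added above and Hadoop plus Lucene on the classpath:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

import org.apache.lucene.search.ScoreDoc;

public class MyScoreDocRoundTrip {
  public static void main(String[] args) throws IOException {
    // serialize: write the wrapped ScoreDoc into an in-memory buffer
    MyScoreDoc original = new MyScoreDoc(new ScoreDoc(42, 1.5f, -1));
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    original.write(new DataOutputStream(buffer));

    // deserialize: rebuild a MyScoreDoc from the same bytes,
    // just as Hadoop does on the receiving side
    MyScoreDoc copy = new MyScoreDoc();
    copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
  }
}

As a design note: since ScoreDoc's doc, score, and shardIndex fields are public, write() could also read them directly (sd.doc, sd.score, sd.shardIndex) instead of parsing the toString() output, which is simpler and won't break if the toString() format ever changes.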
