How to store complex objects into hadoop Hbase?


I have complex objects with collection fields which need to be stored to Hadoop. I don't want to walk the whole object tree and explicitly store each field, so I'm thinking about serializing the complex fields, storing each as one big piece, and deserializing it when reading the object back. What is the best way to do this? I could use some kind of serialization for that, but I hope Hadoop has built-in means to handle this situation.

Sample object's class to store:

class ComplexClass {

    <simple fields>

    List<AnotherComplexClassWithCollectionFields> collection;
}

Solution

HBase only deals with byte arrays, so you can serialize your object in any way you see fit.
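For example, even plain Java serialization produces a byte[] that could be stored as a cell value. A minimal round-trip sketch (the class and field names below are illustrative stand-ins for the original post's classes, and writing the bytes into an actual HBase `Put` is left out):

```java
import java.io.*;
import java.util.*;

// Sketch: since HBase stores plain byte arrays, any serialization that
// yields a byte[] works. Here we use standard Java serialization.
public class SerializeDemo {

    static class AnotherComplexClass implements Serializable {
        List<String> items = new ArrayList<>();
    }

    static class ComplexClass implements Serializable {
        int simpleField;
        List<AnotherComplexClass> collection = new ArrayList<>();
    }

    // Serialize any Serializable object into a byte[] (the HBase cell value).
    static byte[] toBytes(Object obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return bos.toByteArray();
    }

    // Deserialize the byte[] read back from HBase.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ComplexClass original = new ComplexClass();
        original.simpleField = 42;
        AnotherComplexClass inner = new AnotherComplexClass();
        inner.items.add("a");
        original.collection.add(inner);

        byte[] bytes = toBytes(original);              // value you would put into HBase
        ComplexClass restored = (ComplexClass) fromBytes(bytes);
        System.out.println(restored.simpleField);      // 42
        System.out.println(restored.collection.get(0).items);  // [a]
    }
}
```

Java serialization is the least effort but couples the stored bytes to the class layout; the frameworks mentioned below give you more control over the wire format.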

The standard Hadoop way of serializing objects is to implement the org.apache.hadoop.io.Writable interface. Then you can serialize your object into a byte array using org.apache.hadoop.io.WritableUtils.toByteArray(Writable... writables).
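The two methods a Writable must provide can be sketched against the JDK's own DataOutput/DataInput interfaces, which are exactly the parameter types Writable's methods take, so this compiles without Hadoop on the classpath. With hadoop-common available you would add `implements Writable` and call `WritableUtils.toByteArray(instance)` instead of the manual stream handling shown in main (field names here are illustrative):

```java
import java.io.*;
import java.util.*;

// Sketch of the Writable pattern. Hadoop's org.apache.hadoop.io.Writable
// declares exactly these two methods: write(DataOutput) and
// readFields(DataInput). Collections are serialized with an explicit
// length prefix so readFields knows how many elements to read back.
public class WritableSketch {

    static class ComplexClass /* implements Writable */ {
        int simpleField;
        List<String> collection = new ArrayList<>();

        public void write(DataOutput out) throws IOException {
            out.writeInt(simpleField);
            out.writeInt(collection.size());    // length prefix for the list
            for (String s : collection) {
                out.writeUTF(s);
            }
        }

        public void readFields(DataInput in) throws IOException {
            simpleField = in.readInt();
            int n = in.readInt();
            collection = new ArrayList<>(n);
            for (int i = 0; i < n; i++) {
                collection.add(in.readUTF());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ComplexClass original = new ComplexClass();
        original.simpleField = 7;
        original.collection.add("x");

        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        original.write(new DataOutputStream(bos));
        byte[] bytes = bos.toByteArray();       // this byte[] is the HBase cell value

        ComplexClass restored = new ComplexClass();
        restored.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        System.out.println(restored.simpleField);   // 7
        System.out.println(restored.collection);    // [x]
    }
}
```

For nested complex fields (like the List<AnotherComplexClassWithCollectionFields> in the question), the inner class follows the same pattern and its write/readFields are called inside the outer loop.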

Also, there are other serialization frameworks that people in the Hadoop community use, like Avro, Protocol Buffers, and Thrift. All have their specific use cases, so do your research. If you're doing something simple, implementing Hadoop's Writable should be good enough.
