Store and retrieve string arrays in HBase
Question
I've read this answer (How to store complex objects into hadoop Hbase?) about storing string arrays in HBase.
That answer suggests using the ArrayWritable class to serialize the array. With WritableUtils.toByteArray(Writable... writables) I get a byte[] that I can store in HBase.
When I try to retrieve the rows again, I get a byte[] that I somehow have to transform back into an ArrayWritable. But I can't find a way to do this. Do you know an answer, or am I doing something fundamentally wrong in serializing my String[]?
Recommended answer
You can use the following method to get the ArrayWritable back (taken from an earlier answer of mine):
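import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

// Deserializes a byte array into a new instance of the given Writable class.
public static <T extends Writable> T asWritable(byte[] bytes, Class<T> clazz)
        throws IOException {
    T result;
    try {
        // The class must expose an accessible no-argument constructor.
        result = clazz.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
        // Fail loudly instead of silently returning null.
        throw new IOException("Cannot instantiate " + clazz.getName(), e);
    }
    // try-with-resources closes the stream even if readFields throws.
    try (DataInputStream dataIn = new DataInputStream(new ByteArrayInputStream(bytes))) {
        result.readFields(dataIn);
    }
    return result;
}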
This method just deserializes the byte array to the correct object type, based on the provided class type token.
For example, let's assume you have a custom ArrayWritable:
public class TextArrayWritable extends ArrayWritable {
    // The no-argument constructor is required so the class can be instantiated reflectively.
    public TextArrayWritable() {
        super(Text.class);
    }
}
Now you issue a single HBase get:
...
Get get = new Get(row);
Result result = htable.get(get);
byte[] value = result.getValue(family, qualifier);
TextArrayWritable tawReturned = asWritable(value, TextArrayWritable.class);
Text[] texts = (Text[]) tawReturned.toArray();
for (Text t : texts) {
    System.out.print(t + " ");
}
...
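The example above only covers the read side. To make the full round trip concrete without needing Hadoop on the classpath, here is a minimal sketch in plain java.io. Everything in it is an assumption made for illustration: MiniWritable is a hypothetical stand-in with the same two methods as Hadoop's Writable, StringArrayValue is a hypothetical String[]-backed value, and toBytes/fromBytes are hypothetical helpers. With the real Hadoop types you would keep using TextArrayWritable, WritableUtils.toByteArray and asWritable as shown above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class WritableRoundTrip {

    // Hypothetical stand-in for org.apache.hadoop.io.Writable (same two methods).
    public interface MiniWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical String[]-backed value: element count first, then each string.
    public static class StringArrayValue implements MiniWritable {
        public String[] values = new String[0];

        // No-argument constructor, needed for reflective instantiation.
        public StringArrayValue() { }

        public StringArrayValue(String... values) {
            this.values = values;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeInt(values.length);
            for (String s : values) {
                out.writeUTF(s);
            }
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            values = new String[in.readInt()];
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readUTF();
            }
        }
    }

    // Write side: produces the byte[] that would be stored as the cell value.
    public static byte[] toBytes(MiniWritable w) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            w.write(out);
        }
        return buffer.toByteArray();
    }

    // Read side: same shape as asWritable, but for the stand-in interface.
    public static <T extends MiniWritable> T fromBytes(byte[] bytes, Class<T> clazz)
            throws IOException {
        T result;
        try {
            result = clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IOException("Cannot instantiate " + clazz.getName(), e);
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            result.readFields(in);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] stored = toBytes(new StringArrayValue("alpha", "beta", "gamma"));
        StringArrayValue restored = fromBytes(stored, StringArrayValue.class);
        System.out.println(String.join(" ", restored.values)); // prints "alpha beta gamma"
    }
}
```

The point of the sketch is that reading must mirror writing field by field: whatever write() emits, readFields() has to consume in the same order, which is why a no-argument constructor plus readFields() is enough to rebuild the object from the stored bytes.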
Note:
You may have already found the readCompressedStringArray() and writeCompressedStringArray() methods in WritableUtils, which seem suitable if you have your own String array-backed Writable class. Before using them, I'd warn you that they can cause a serious performance hit due to the overhead of gzip compression/decompression.
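To get a rough feel for that overhead, the sketch below compares a plain length-prefixed encoding of a small String[] with the same bytes pushed through java.util.zip.GZIPOutputStream. It does not call WritableUtils at all, and the class and method names (GzipOverheadSketch, encodePlain, encodeGzipped) are hypothetical. It only shows the fixed framing cost in bytes; it does not measure CPU time, and the real cost depends on your data.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public class GzipOverheadSketch {

    // Plain encoding: element count, then each string via writeUTF.
    public static byte[] encodePlain(String[] values) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(values.length);
            for (String s : values) {
                out.writeUTF(s);
            }
        }
        return buffer.toByteArray();
    }

    // The same payload passed through a gzip stream (adds a header and trailer).
    public static byte[] encodeGzipped(String[] values) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(encodePlain(values));
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String[] tags = {"red", "green", "blue"};
        System.out.println("plain bytes:   " + encodePlain(tags).length);
        System.out.println("gzipped bytes: " + encodeGzipped(tags).length);
    }
}
```

For short arrays like this one, the gzipped form comes out larger than the plain form, because the gzip header and trailer outweigh any savings. Compression only starts to pay off for large or highly repetitive values, so it is worth measuring with your own data before choosing either route.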