Timestamp Based Scans in HBase?
Problem description
For example, for the HBase table 'test_table', the values inserted are:
Row1 - Val1 => t
Row1 - Val2 => t + 3
Row1 - Val3 => t + 5
Row2 - Val1 => t
Row2 - Val2 => t + 3
Row2 - Val3 => t + 5
A scan of 'test_table' with version = t + 4 should return:
Row1 - Val2 => t + 3
Row2 - Val2 => t + 3
How do I achieve timestamp-based scans in HBase, where each row returns the latest available value whose timestamp is less than or equal to a given timestamp?
Solution
Consider this table:
hbase(main):009:0> create 't1', { NAME => 'f1', VERSIONS => 100 }
hbase(main):010:0> put 't1', 'key1', 'f1:a', 'value1'
hbase(main):011:0> put 't1', 'key1', 'f1:a', 'value2'
hbase(main):012:0> put 't1', 'key1', 'f1:a', 'value3'
hbase(main):013:0> put 't1', 'key2', 'f1:a', 'value4'
hbase(main):014:0> put 't1', 'key2', 'f1:a', 'value5'
hbase(main):015:0> put 't1', 'key1', 'f1:a', 'value6'
Here's a scan of it in the shell, showing all versions:
hbase(main):003:0> scan 't1', {VERSIONS => 100 }
ROW COLUMN+CELL
key1 column=f1:a, timestamp=1416083314098, value=value6
key1 column=f1:a, timestamp=1416083294981, value=value3
key1 column=f1:a, timestamp=1416083293273, value=value2
key1 column=f1:a, timestamp=1416083291009, value=value1
key2 column=f1:a, timestamp=1416083305050, value=value5
key2 column=f1:a, timestamp=1416083299840, value=value4
Here's the scan limited to a specific timestamp, as you requested:
hbase(main):002:0> scan 't1', { TIMERANGE => [0, 1416083300000] }
ROW COLUMN+CELL
key1 column=f1:a, timestamp=1416083294981, value=value3
key2 column=f1:a, timestamp=1416083299840, value=value4
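To make the selection rule behind the TIMERANGE scan above concrete, here is a minimal in-memory sketch in plain Java. It does not talk to HBase: the class `TimeRangeScanSketch` and its methods `latestInRange` and `sampleTable` are hypothetical names introduced for illustration, and the model assumes a single column per row. For each row it picks the newest version whose timestamp falls in the half-open range [min, max).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory model only; this class is an illustration and does not use the HBase client.
final class TimeRangeScanSketch {

    // For each row, returns the newest value whose timestamp lies in [minTs, maxTs).
    // Rows with no version inside the range are left out of the result.
    static Map<String, String> latestInRange(
            Map<String, NavigableMap<Long, String>> table, long minTs, long maxTs) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, NavigableMap<Long, String>> row : table.entrySet()) {
            // Lower bound inclusive, upper bound exclusive.
            NavigableMap<Long, String> inRange =
                    row.getValue().subMap(minTs, true, maxTs, false);
            if (!inRange.isEmpty()) {
                // Keys are sorted ascending, so the last entry is the newest version.
                result.put(row.getKey(), inRange.lastEntry().getValue());
            }
        }
        return result;
    }

    // Rebuilds the versions shown in the shell session above (timestamp -> value).
    static Map<String, NavigableMap<Long, String>> sampleTable() {
        NavigableMap<Long, String> key1 = new TreeMap<>();
        key1.put(1416083291009L, "value1");
        key1.put(1416083293273L, "value2");
        key1.put(1416083294981L, "value3");
        key1.put(1416083314098L, "value6");

        NavigableMap<Long, String> key2 = new TreeMap<>();
        key2.put(1416083299840L, "value4");
        key2.put(1416083305050L, "value5");

        Map<String, NavigableMap<Long, String>> table = new TreeMap<>();
        table.put("key1", key1);
        table.put("key2", key2);
        return table;
    }

    public static void main(String[] args) {
        Map<String, String> rows = latestInRange(sampleTable(), 0L, 1416083300000L);
        // Prints:
        // key1 => value3
        // key2 => value4
        rows.forEach((row, value) -> System.out.println(row + " => " + value));
    }
}
```

The result matches the two rows returned by the shell scan: value6 and value5 are newer than the upper bound, so the next older version of each row is chosen instead.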
Here's the same in Java code:
package org.example.test;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;

public class test {
    public static void main(String[] args) throws IOException {
        HTable table = new HTable(HBaseConfiguration.create(), "t1");
        try {
            Scan s = new Scan();
            // Return only the newest version that falls inside the time range.
            s.setMaxVersions(1);
            // The range is [min, max): the upper bound is exclusive.
            s.setTimeRange(0L, 1416083300000L);
            ResultScanner scanner = table.getScanner(s);
            try {
                for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
                    System.out.println(Bytes.toString(rr.getRow()) + " => " +
                            Bytes.toString(rr.getValue(Bytes.toBytes("f1"), Bytes.toBytes("a"))));
                }
            } finally {
                scanner.close();
            }
        } finally {
            table.close();
        }
    }
}
Be aware that the upper bound of the time range is exclusive. If you want the latest value of every key with a timestamp of at most T, set the upper bound of the range to T + 1.
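The off-by-one in the upper bound can be checked without a cluster. The following is a small plain-Java sketch using the data from the question, with t = 100 chosen arbitrarily as an assumption. The class `InclusiveBoundSketch` and the helper `toExclusiveUpperBound` are hypothetical names, not part of the HBase API, and `TreeMap.lowerEntry` stands in for an exclusive upper bound.

```java
import java.util.TreeMap;

// Illustration only; models one row's versions as timestamp -> value.
final class InclusiveBoundSketch {

    // Converts an inclusive maximum timestamp T into the exclusive bound T + 1.
    // Long.MAX_VALUE + 1 would overflow to a negative number, so it is capped.
    static long toExclusiveUpperBound(long inclusiveMaxTs) {
        return inclusiveMaxTs == Long.MAX_VALUE ? Long.MAX_VALUE : inclusiveMaxTs + 1;
    }

    public static void main(String[] args) {
        long t = 100L; // assumed base timestamp, chosen for the example
        TreeMap<Long, String> row1 = new TreeMap<>();
        row1.put(t, "Val1");
        row1.put(t + 3, "Val2");
        row1.put(t + 5, "Val3");

        // lowerEntry(x) returns the newest entry strictly below x, like an exclusive bound.
        // Using T itself as the bound misses the version written exactly at T:
        System.out.println(row1.lowerEntry(t + 3).getValue()); // prints Val1
        // Using T + 1 includes it:
        System.out.println(row1.lowerEntry(toExclusiveUpperBound(t + 3)).getValue()); // prints Val2
        // The query from the question, T = t + 4:
        System.out.println(row1.lowerEntry(toExclusiveUpperBound(t + 4)).getValue()); // prints Val2
    }
}
```

With the real client, the value returned by such a helper would be passed as the second argument of `setTimeRange`.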