Get count of HBase table rows based on dates


Problem description



What is the easiest way to count HBase table rows within a time period, based on the insertion timestamp? So far I have only found:

hbase> count 't1', INTERVAL => 100000

This does not solve my problem. There seems to be another option, but it returns 0 results:

hbase>  get 'hbase_output', '*', {TIMERANGE => [1445212800,1445299200]}
COLUMN                                   CELL
0 row(s) in 0.0900 seconds

Are these the only two options? I passed '*' hoping to match all rows in the table, and I suspect that is incorrect.

Solution

HBase maintains a timestamp, as well as versions, for each record.

get retrieves a specific record by row key. Once you satisfy that criterion, you have additional options to select particular versions and timestamps. In the command above, '*' is treated as a literal row key rather than a wildcard, which is why it returns nothing.

scan reads all the records. Again, you have the option to restrict it by version and timestamp. However, since scan returns the full list of records, it does not give you a count operation as such.
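One detail worth checking before scanning by time: HBase cell timestamps are, by default, milliseconds since the Unix epoch, while the values in the question (1445212800 and 1445299200) look like seconds. The sketch below, in Python with only the standard library, converts UTC dates to millisecond bounds and formats a shell scan with TIMERANGE. The helper names `to_hbase_millis` and `scan_command` are made up for illustration; the table name is the one from the question.

```python
from datetime import datetime, timezone


def to_hbase_millis(year, month, day):
    """Return midnight UTC of the given date as epoch milliseconds."""
    dt = datetime(year, month, day, tzinfo=timezone.utc)
    return int(dt.timestamp()) * 1000


def scan_command(table, start_ms, end_ms):
    """Format an HBase shell scan limited to timestamps in [start_ms, end_ms)."""
    return "scan '%s', {TIMERANGE => [%d, %d]}" % (table, start_ms, end_ms)


# Hypothetical one-day window matching the dates implied by the question.
start_ms = to_hbase_millis(2015, 10, 19)
end_ms = to_hbase_millis(2015, 10, 20)
print(scan_command("hbase_output", start_ms, end_ms))
# prints: scan 'hbase_output', {TIMERANGE => [1445212800000, 1445299200000]}
```

The lower bound of TIMERANGE is inclusive and the upper bound is exclusive. The shell reports the number of rows at the end of a scan, so for a small table this may already be enough; for a large one, printing every row is impractical.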

So I am afraid your best bet is to write a MapReduce job that scans with a timestamp range and produces the count. In fact, the MapReduce-based RowCounter is the best way to count HBase rows, compared with the shell's count command.
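As a rough sketch of what invoking RowCounter with a time range could look like, the Python snippet below only assembles the command line; it does not run anything. It assumes that the RowCounter shipped with your HBase version accepts `--starttime` and `--endtime` options in epoch milliseconds. Running the class without arguments should print its usage, so that is worth checking first. The helper name `rowcounter_argv` is hypothetical.

```python
import shlex

# Fully qualified name of the MapReduce row counter bundled with HBase.
ROWCOUNTER_CLASS = "org.apache.hadoop.hbase.mapreduce.RowCounter"


def rowcounter_argv(table, start_ms, end_ms):
    """Build the argument list for a time-bounded RowCounter run.

    Assumption: this HBase version supports --starttime/--endtime.
    """
    return [
        "hbase",
        ROWCOUNTER_CLASS,
        table,
        "--starttime=%d" % start_ms,
        "--endtime=%d" % end_ms,
    ]


argv = rowcounter_argv("hbase_output", 1445212800000, 1445299200000)
# Quote each argument so the line can be pasted into a shell as-is.
print(" ".join(shlex.quote(arg) for arg in argv))
```

If those options are not available in your version, the alternative described in this answer still applies: copy the RowCounter source and set the time range on its scan yourself.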

I have worked on something similar: I started from the RowCounter source code and tweaked it to add a filter. For the date, you can maintain your own field, or you can take the timestamp of any column qualifier as the most recent date (as long as the entire record is stored into HBase together). Otherwise, if parts of your row are saved separately, you have to use your specific column qualifier.
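To make the counting rule concrete, here is a toy model in plain Python with no HBase involved. Each row maps column qualifiers to cell timestamps, and a row is counted when at least one of its cells falls inside the half-open range. The sample rows and the function name `count_rows_in_range` are invented for illustration; this models the idea of the filter, not RowCounter's internals.

```python
def count_rows_in_range(rows, start_ms, end_ms):
    """Count rows having at least one cell timestamp in [start_ms, end_ms)."""
    count = 0
    for cells in rows.values():
        if any(start_ms <= ts < end_ms for ts in cells.values()):
            count += 1
    return count


# Hypothetical data: row key -> {column qualifier: cell timestamp in ms}.
rows = {
    "row1": {"cf:a": 1445212800000},                         # exactly at start: counted
    "row2": {"cf:a": 1445299200000},                         # exactly at end: not counted
    "row3": {"cf:a": 1445100000000, "cf:b": 1445250000000},  # one cell inside: counted
    "row4": {"cf:a": 1445100000000},                         # before the range: not counted
}

print(count_rows_in_range(rows, 1445212800000, 1445299200000))  # prints 2
```

The `row3` case shows why the choice of column qualifier matters when parts of a row are written at different times: restricting the check to one qualifier can change whether the row is counted.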
