HBase: quickly count the number of rows


Question


Right now I count rows by iterating over a ResultScanner, like this:

for (Result rs = scanner.next(); rs != null; rs = scanner.next()) {
    number++;
}

Once the table reaches millions of rows, this takes a long time. I want to compute the count in real time, and I don't want to use MapReduce.

How can I quickly count the number of rows?

Solution

Use RowCounter in HBase. RowCounter is a MapReduce job that counts all the rows of a table. It is also a good sanity check that HBase can read every block of a table when there are concerns about metadata inconsistency. It will run the MapReduce job in a single process, but it runs faster if you have a MapReduce cluster in place for it to exploit.

$ hbase org.apache.hadoop.hbase.mapreduce.RowCounter <tablename>

Usage: RowCounter [options] 
    <tablename> [          
        --starttime=[start] 
        --endtime=[end] 
        [--range=[startKey],[endKey]] 
        [<column1> <column2>...]
    ]
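As a rough illustration of the usage text above, the sketch below assembles a RowCounter invocation that counts only a range of row keys. The table name `my_table` and the keys `row0100` and `row0200` are hypothetical placeholders, not values from the question. The script prints the command and runs it only if an `hbase` launcher is found on the PATH.

```shell
# Hypothetical values: replace with your own table name and row-key bounds.
TABLE="my_table"
START_KEY="row0100"
END_KEY="row0200"

# Build the invocation in the order given by the usage text:
# table name first, then the optional --range=[startKey],[endKey] restriction.
CMD="hbase org.apache.hadoop.hbase.mapreduce.RowCounter ${TABLE} --range=${START_KEY},${END_KEY}"

# Print the command for review.
echo "${CMD}"

# Run it only on a host where the hbase launcher is installed.
if command -v hbase >/dev/null 2>&1; then
    ${CMD}
fi
```

Restricting the count to a key range or to specific columns should reduce the amount of data the job has to scan, which may help when a full-table count is too slow.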
