HBase Java API: Retrieving all rows that match a partial row key


Question

In the Python module happybase, I can retrieve all rows whose row key starts with a given string (i.e., search using a partial row key).

Say my row keys have the format (ID|TYPE|DATE). I can then find all rows with an ID of 1 and a TYPE of A like this:

import happybase
connection = happybase.Connection('hmaster-host.com')
table = connection.table('table_name')
for key, data in table.scan(row_prefix="1|A|"):
    print(key, data)

Below is what I have so far: a purely client-side Java program, which may help anyone trying to do the basics with the Java HBase API. However, it can only look up a single row by its full row key:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class Foo {
    public static void main(String[] args) throws IOException {
        // Load the cluster settings from local copies of the site files.
        Configuration conf = new Configuration();
        conf.addResource(new Path("C:\\core-site.xml"));
        conf.addResource(new Path("C:\\hbase-site.xml"));
        HTable table = new HTable(conf, "table_name");
        try {
            // A Get fetches exactly one row, identified by its full row key.
            Result row = table.get(new Get(Bytes.toBytes("1|A|2014-01-01 00:00")));
            printRow(row);
        } finally {
            table.close();
        }
    }

    // Prints the id, type and date columns of one row, comma-separated.
    public static void printRow(Result result) {
        String returnString = "";
        returnString += Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("id"))) + ", ";
        returnString += Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("type"))) + ", ";
        returnString += Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("date")));
        System.out.println(returnString);
    }
}

Where "cf" is the name of the column family.

ANSWER:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class Foo {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        conf.addResource(new Path("C:\\core-site.xml"));
        conf.addResource(new Path("C:\\hbase-site.xml"));
        HTable table = new HTable(conf, "table_name");
        byte[] prefix = Bytes.toBytes("1|A|");
        // Start the scan at the prefix, then keep only rows whose key begins with it.
        Scan scan = new Scan(prefix);
        Filter prefixFilter = new PrefixFilter(prefix);
        scan.setFilter(prefixFilter);
        ResultScanner resultScanner = table.getScanner(scan);
        try {
            printRows(resultScanner);
        } finally {
            // Scanners hold server-side resources, so release them when done.
            resultScanner.close();
            table.close();
        }
    }

    // ResultScanner is Iterable<Result>, so a for-each loop visits every matching row.
    public static void printRows(ResultScanner resultScanner) {
        for (Result result : resultScanner) {
            printRow(result);
        }
    }

    // Prints the id, type and date columns of one row, comma-separated.
    public static void printRow(Result result) {
        String returnString = "";
        returnString += Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("id"))) + ", ";
        returnString += Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("type"))) + ", ";
        returnString += Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("date")));
        System.out.println(returnString);
    }
}

Note that I call the setFilter method, whereas the answer below calls addFilter, because we are using different APIs.
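To see why the scan is given both a start row and a prefix filter, here is a minimal sketch that does not touch HBase at all. It assumes a table can be modelled as a sorted map of row keys (the class PrefixScanSketch, the method scanWithPrefix and the sample keys are all made up for illustration), and that row keys are plain ASCII so that Java's string ordering agrees with HBase's byte ordering. The start row jumps straight to the first candidate key, and because keys are sorted, the first key that no longer matches the prefix ends the walk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for an HBase table: a map sorted by row key.
class PrefixScanSketch {

    // Returns the row keys a prefix scan would visit, in sorted order.
    static List<String> scanWithPrefix(TreeMap<String, String> table, String prefix) {
        List<String> matches = new ArrayList<>();
        // Role of the start row: skip directly to the first key >= prefix.
        for (String rowKey : table.tailMap(prefix, true).keySet()) {
            // Role of the prefix check: keys are sorted, so the first miss
            // means no later key can match and the walk can stop.
            if (!rowKey.startsWith(prefix)) {
                break;
            }
            matches.add(rowKey);
        }
        return matches;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        table.put("0|A|2014-01-01 00:00", "row a");
        table.put("1|A|2014-01-01 00:00", "row b");
        table.put("1|A|2014-01-02 00:00", "row c");
        table.put("1|B|2014-01-01 00:00", "row d");
        table.put("2|A|2014-01-01 00:00", "row e");

        for (String rowKey : scanWithPrefix(table, "1|A|")) {
            System.out.println(rowKey + " -> " + table.get(rowKey));
        }
    }
}
```

Without the start row, the walk would begin at the first key in the table and have to reject every row that sorts before the prefix.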

Solution

You are using the HTable get operation, so you only get back one row (note that you can specify a prefix here as well, and you don't have to give the complete key).

If you want to get back multiple rows, you should use a Scan:

byte[] prefix = Bytes.toBytes("1|A|");
Scan scan = new Scan(prefix);
PrefixFilter prefixFilter = new PrefixFilter(prefix);
scan.addFilter(prefixFilter);
ResultScanner resultScanner = table.getScanner(scan);
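A prefix scan can also be thought of as a range scan from the prefix (inclusive) to the smallest key that sorts after every key with that prefix (exclusive). The sketch below shows one way to compute that exclusive stop row with plain byte arithmetic. The class PrefixStopRow and the method stopRowForPrefix are hypothetical names, not part of the HBase API. If your client version offers a Scan constructor that takes a start row and a stop row, the two arrays could be passed to it, but check this against the API you are actually using.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: derives the exclusive upper bound for a prefix scan.
class PrefixStopRow {

    // Returns the smallest byte[] that sorts after every key starting with
    // prefix. An empty array means "no upper bound" (the prefix is empty or
    // consists only of 0xFF bytes, so nothing can sort after it).
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // bump the last remaining byte by one
        return stop;
    }

    public static void main(String[] args) {
        byte[] startRow = "1|A|".getBytes(StandardCharsets.US_ASCII);
        byte[] stopRow = stopRowForPrefix(startRow);
        // '|' is 0x7C, so the last byte becomes 0x7D, which is '}'.
        System.out.println(new String(stopRow, StandardCharsets.US_ASCII)); // prints 1|A}
    }
}
```

Every key that begins with "1|A|" sorts at or after "1|A|" and strictly before "1|A}", so the range covers the same rows as the prefix match.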
