How to have Pig store rows in HBase as strings not bytes?

Problem description

If I use the hbase shell and issue:

put 'test', 'rowkey1','cf:foo', 'bar'
scan 'test'

I will see the result as a string, not in bytes.

If I use happybase and issue:

import happybase
connection = happybase.Connection('<hostname>')
table = connection.table('test')
table.put('rowkey2', {'cf:foo': 'bar'})
for row in table.scan():
    print row

I will see the result as a string, not in bytes.

I have data in hive that I ran an aggregation on and stored on HDFS via:

INSERT OVERWRITE DIRECTORY 'aggregation_test'
SELECT device_id, device_name, sum(device_cost)
FROM devices
GROUP BY device_id, device_name
ORDER BY device_id, device_name

However, if I issue the following in Pig:

A = LOAD 'aggregation_test' USING PigStorage(',') as (device_id:chararray, device_name:chararray, device_sum:int);
STORE A INTO 'hbase://aggregation_test'
USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
    'cf:device_name, cf:device_sum');

Scans in both the hbase shell and happybase then show the results as bytes, not as strings.
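
One thing that may be worth ruling out here, offered as an assumption rather than a confirmed diagnosis: Hive's INSERT OVERWRITE DIRECTORY writes fields separated by the Ctrl-A character (\x01) unless a row format is specified, so loading that output with PigStorage(',') would leave the whole line in the first field, which HBaseStorage then uses as the row key. The sketch below uses a made-up sample line (the values are hypothetical) to show how the two delimiters behave.

```python
# Hypothetical line as Hive would write it with its default Ctrl-A delimiter.
sample_line = "dev1\x01Thermostat\x01420"

# Splitting on a comma finds no delimiter, so everything stays in one field.
comma_fields = sample_line.split(",")

# Splitting on Ctrl-A recovers the three columns.
ctrl_a_fields = sample_line.split("\x01")

print(len(comma_fields), repr(comma_fields[0]))
print(ctrl_a_fields)
```

If that is what is happening, the row key itself would contain \x01 characters, which would explain escaped bytes in the scan output and a row key that cannot be looked up by its plain string value.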

I can't even search on a row key that is a string.

How can I use Pig and HBaseStorage to store data from HDFS into HBase as strings not bytes?

Solution

Have you tried using the HBaseBinaryConverter option? Something like:

store CompleteCases_f into 'hbase://user_test' using
    org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'id:DEFAULT id:last_modified birth:year gender:female gender:male','-caster HBaseBinaryConverter'
);
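
To make the effect of the caster choice concrete, here is a small Python sketch of the two byte layouts an integer such as device_sum can end up with. It assumes that a text-oriented caster writes the decimal digits as UTF-8 and that HBaseBinaryConverter serializes integers the way HBase's Bytes.toBytes(int) does (4 bytes, big-endian); the value 42 is only an illustration.

```python
import struct

device_sum = 42  # illustrative value, not taken from the question's data

# Text form: the decimal digits encoded as UTF-8.
text_form = str(device_sum).encode("utf-8")

# Binary form: a 4-byte big-endian integer (assumed Bytes.toBytes(int) layout).
binary_form = struct.pack(">i", device_sum)

print(text_form)
print(binary_form)
```

Comparing the two outputs against what a scan shows for a cell is a quick way to tell which encoding was actually written.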
