Write output to multiple tables from REDUCER


Problem description

Can I write output from my reducer to multiple tables in HBase? I went through different blog posts, but even with MultiTableOutputFormat I could not find a way to do it.

I referred to this: Write to multiple tables in HBASE


But I am not able to figure out the API signature for the context.write call.

Reducer code:

public class MyReducer extends TableReducer<Text, Result, Put> {

    private static final Logger logger = Logger.getLogger( MyReducer.class );

    @SuppressWarnings( "deprecation" )
    @Override
    protected void reduce( Text key, Iterable<Result> data, Context context ) throws IOException, InterruptedException {
        logger.info( "Working on ---> " + key.toString() );
        for ( Result res : data ) {
            Put put = new Put( res.getRow() );
            KeyValue[] raw = res.raw();
            for ( KeyValue kv : raw ) {
                put.add( kv );
            }

            context.write( obj, put );
            // I don't know how to give the table name here.

        }
    }
}

Solution

To identify the table, pass the table name as the key to the context.write(key, put) method:

ImmutableBytesWritable key = new ImmutableBytesWritable(Bytes.toBytes("tableName"));
context.write(key, put);
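Putting the two pieces together, a reducer along these lines should work. This is a hedged sketch, not the original poster's final code: the third type parameter of TableReducer becomes ImmutableBytesWritable (the output key type), the driver must set MultiTableOutputFormat as the job's output format, and the table names "tableA" and "tableB" are placeholders.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.MultiTableOutputFormat;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class MyMultiTableReducer
        extends TableReducer<Text, Result, ImmutableBytesWritable> {

    // The output key names the destination table (placeholder names).
    private static final ImmutableBytesWritable TABLE_A =
            new ImmutableBytesWritable( Bytes.toBytes( "tableA" ) );
    private static final ImmutableBytesWritable TABLE_B =
            new ImmutableBytesWritable( Bytes.toBytes( "tableB" ) );

    @SuppressWarnings( "deprecation" )
    @Override
    protected void reduce( Text key, Iterable<Result> data, Context context )
            throws IOException, InterruptedException {
        for ( Result res : data ) {
            Put put = new Put( res.getRow() );
            for ( KeyValue kv : res.raw() ) {
                put.add( kv );
            }
            // The key selects the table; the same Put can go to both tables.
            context.write( TABLE_A, put );
            context.write( TABLE_B, put );
        }
    }

    // Corresponding driver wiring (sketch):
    static void configure( Job job ) {
        job.setOutputFormatClass( MultiTableOutputFormat.class );
        job.setOutputKeyClass( ImmutableBytesWritable.class );
        job.setOutputValueClass( Put.class );
    }
}
```

With MultiTableOutputFormat the routing decision is made per record at write time, so one reducer can fan out to any number of tables without extra job configuration beyond the output format itself.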

But if you want to load a huge amount of data via a MapReduce job at once, it might be worth using MultiTableHFileOutputFormat. This output format creates HFiles for every HBase table you need, and you can then easily load those files with the LoadIncrementalHFiles tool:

hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /tmp/multiTableJobResult hbaseTable
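If the output format writes one subdirectory per destination table under the job output path (as the setup in the linked article does), the load step above is typically repeated once per table. The directory layout and table names in this sketch are assumptions, not taken from the original answer:

```shell
# Assumed layout: one HFile subdirectory per destination table under
# the job output directory; "tableA" and "tableB" are placeholders.
for table in tableA tableB; do
  hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
      "/tmp/multiTableJobResult/$table" "$table"
done
```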

You can read more about MultiTableHFileOutputFormat in the article: http://tech.adroll.com/blog/data/2014/07/15/multi-table-bulk-import.html
