Hadoop MapReduce streaming from HBase

Problem description

I'm building a Hadoop (0.20.1) MapReduce job that uses HBase (0.20.1) as both the data source and the data sink. I would like to write the job in Python, which requires me to use hadoop-0.20.1-streaming.jar to stream data to and from my Python scripts. This works fine if the data source and sink are HDFS files.

Does Hadoop support streaming to and from HBase for MapReduce?
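
For context, here is a minimal sketch of the kind of Python script a streaming job feeds, assuming Python 3 is available on the task nodes. Hadoop streaming hands records to the script as lines on stdin and reads result lines from stdout, treating the text before the first tab as the key. The tab-separated input layout and the names `parse_record` and `map_records` are assumptions for illustration only; they are not part of the job described above.

```python
import sys
from typing import Iterable, Iterator, Tuple


def parse_record(line: str) -> Tuple[str, str]:
    """Split one input line into (key, value) at the first tab.

    Assumption: input lines look like "key<TAB>value". A line without
    a tab is treated as a key with an empty value.
    """
    key, _, value = line.rstrip("\n").partition("\t")
    return key, value


def map_records(lines: Iterable[str]) -> Iterator[str]:
    """Hypothetical mapper logic: emit each key with the length of its value."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        key, value = parse_record(line)
        # Streaming splits each output line into key and value at the first tab.
        yield "%s\t%d" % (key, len(value))


if __name__ == "__main__":
    # Hadoop streaming pipes input records to stdin and collects stdout.
    for out in map_records(sys.stdin):
        print(out)
```

A script like this is passed to the streaming jar through its `-mapper` option, with `-input` and `-output` pointing at HDFS paths. The open question is how to get the same line-oriented exchange when the records live in HBase tables instead.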

Solution

This seems to do what I want, but it's not part of the Hadoop distribution. Any other suggestions or comments are still welcome.

http://github.com/wanpark/hadoop-hbase-streaming
