Read text file from system to HBase MapReduce


Problem description



I need to load data from a text file into MapReduce. I have been searching for many days, but I haven't found the right solution for my task. Is there any method or class that reads a text/CSV file from the local file system and stores the data into an HBase table? This is really urgent for me; can anyone help me understand the MapReduce framework?

Solution

To read from a text file, the file must first be in HDFS. You then need to specify the input format and output format for the job:
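For example, the file can be copied from the local file system into HDFS with the standard `hdfs dfs` shell (the paths below are placeholders, not from the original answer):

```shell
# create a target directory in HDFS and copy the local input file into it
hdfs dfs -mkdir -p /user/hadoop/input
hdfs dfs -put input.csv /user/hadoop/input/
```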

Job job = new Job(conf, "example");
// read lines from a plain text file in HDFS
FileInputFormat.addInputPath(job, new Path("PATH to text file"));
job.setInputFormatClass(TextInputFormat.class);
job.setMapperClass(YourMapper.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
// wire the reducer's output to an HBase table
TableMapReduceUtil.initTableReducerJob("hbase_table_name", YourReducer.class, job);
job.waitForCompletion(true);

YourReducer should extend org.apache.hadoop.hbase.mapreduce.TableReducer<Text, Text, Text>.
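As an aside, if the goal is only to load a CSV file into an existing HBase table, the ImportTsv tool bundled with HBase can do this without writing a custom job (table name, column family, and paths below are placeholders):

```shell
# bulk-load a comma-separated file into an existing HBase table;
# the first CSV field becomes the row key
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.separator=, \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:col1 \
  your_table /user/hadoop/input/input.csv
```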

Sample reducer code

public class YourReducer extends TableReducer<Text, Text, Text> {
    private byte[] rawUpdateColumnFamily = Bytes.toBytes("colName");

    /**
     * Called once at the beginning of the task.
     */
    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // anything that needs to be done at the start of the reducer
    }

    @Override
    public void reduce(Text keyin, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        // aggregate counts
        int valuesCount = 0;
        for (Text val : values) {
            valuesCount += 1;
            // put data into the table with an explicit timestamp
            Put put = new Put(keyin.toString().getBytes());
            long explicitTimeInMs = new Date().getTime();
            put.add(rawUpdateColumnFamily, Bytes.toBytes("colName"), explicitTimeInMs, val.toString().getBytes());
            context.write(keyin, put);
        }
    }
}

Sample mapper class

public static class YourMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            context.write(word, one);
        }
    }
}
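The sample mapper above does word counting; for the CSV-to-HBase case in the question, the mapper would instead split each line into a row key and column values before emitting them. The parsing step is plain Java and can be sketched independently of Hadoop (the field layout and `cf:colN` naming here are hypothetical, not from the original answer):

```java
class CsvLineParser {
    /** Splits one CSV line into fields; the first field is treated as the HBase row key. */
    static String[] parse(String line) {
        // limit -1 keeps trailing empty fields, so the column count stays stable
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String[] fields = parse("row1,Alice,30");
        System.out.println("row key = " + fields[0]);          // row key = row1
        for (int i = 1; i < fields.length; i++) {
            System.out.println("cf:col" + i + " = " + fields[i]); // cf:col1 = Alice, cf:col2 = 30
        }
    }
}
```

Inside a real mapper, the row key would then go out as the map output key, e.g. `context.write(new Text(fields[0]), new Text(line))`, so the reducer can build the `Put` shown above.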
