Running MapReduce on an HBase Exported Table throws "Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'"

Problem Description

I have taken a backup of an HBase table using the HBase Export utility tool:

hbase org.apache.hadoop.hbase.mapreduce.Export "FinancialLineItem" "/project/fricadev/ESGTRF/EXPORT"

This kicked off a MapReduce job and transferred all my table data into the output folder. As per the documentation, the output files are sequence files. So I ran the code below to extract my key and value from those files.

Now I want to run a MapReduce job to read the key/value pairs from the output files, but I am getting the exception below:

java.lang.Exception: java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:406)
Caused by: java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1964)
    at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1811)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1760)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1774)
    at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:50)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:478)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:671)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)

Here is my driver code

package SEQ;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
public class SeqDriver extends Configured implements Tool 
{
    public static void main(String[] args) throws Exception{
        int exitCode = ToolRunner.run(new SeqDriver(), args);
        System.exit(exitCode);
    }

    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.printf("Usage: %s needs two arguments   files\n",
                    getClass().getSimpleName());
            return -1;
        }
        String outputPath = args[1];

        FileSystem hfs = FileSystem.get(getConf());
        Job job = new Job();
        job.setJarByClass(SeqDriver.class);
        job.setJobName("SequenceFileReader");

        HDFSUtil.removeHdfsSubDirIfExists(hfs, new Path(outputPath), true);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setOutputKeyClass(ImmutableBytesWritable.class);
        job.setOutputValueClass(Result.class);
        job.setInputFormatClass(SequenceFileInputFormat.class);

        job.setMapperClass(MySeqMapper.class);

        job.setNumReduceTasks(0);


        int returnValue = job.waitForCompletion(true) ? 0:1;

        if(job.isSuccessful()) {
            System.out.println("Job was successful");
        } else if(!job.isSuccessful()) {
            System.out.println("Job was not successful");           
        }

        return returnValue;
    }
}

Here is my mapper code

package SEQ;

import java.io.IOException;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MySeqMapper extends Mapper <ImmutableBytesWritable, Result, Text, Text>{

    @Override
    public void map(ImmutableBytesWritable row, Result value,Context context)
    throws IOException, InterruptedException {
    }
}
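
For illustration, here is a minimal sketch of what the map body might do once the deserialization issue is solved. The column family "cf" and qualifier "col1" are placeholder names introduced only for this sketch; they are not from the original table.

package SEQ;

import java.io.IOException;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MySeqMapper extends Mapper<ImmutableBytesWritable, Result, Text, Text> {

    @Override
    public void map(ImmutableBytesWritable row, Result value, Context context)
            throws IOException, InterruptedException {
        // The key written by Export is the HBase row key.
        String rowKey = Bytes.toString(row.get(), row.getOffset(), row.getLength());
        // "cf" and "col1" are hypothetical family/qualifier names used only for this sketch.
        byte[] cell = value.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col1"));
        if (cell != null) {
            context.write(new Text(rowKey), new Text(Bytes.toString(cell)));
        }
    }
}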

Solution

So I will answer my own question. Here is what was needed to make it work.

Because we use HBase to store our data and this job works with HBase Result objects, Hadoop is telling us that it does not know how to serialize or deserialize our data. That is why we need to help it. Inside setUp, set the io.serializations variable:

hbaseConf.setStrings("io.serializations", new String[]{
        hbaseConf.get("io.serializations"),
        MutationSerialization.class.getName(),
        ResultSerialization.class.getName()});
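
For context, here is a minimal sketch of how that registration could sit inside the run() method of the driver shown above. The variable name hbaseConf and the use of HBaseConfiguration.create are assumptions made for illustration, not the author's exact code.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.MutationSerialization;
import org.apache.hadoop.hbase.mapreduce.ResultSerialization;
import org.apache.hadoop.mapreduce.Job;

// Sketch: inside SeqDriver.run(), before the Job is created.
Configuration hbaseConf = HBaseConfiguration.create(getConf());
// Keep whatever serializers are already configured and append HBase's,
// so the SequenceFile reader can deserialize the exported Result values.
hbaseConf.setStrings("io.serializations",
        hbaseConf.get("io.serializations"),
        MutationSerialization.class.getName(),
        ResultSerialization.class.getName());

// Build the Job from this configuration so the setting reaches the record reader.
Job job = Job.getInstance(hbaseConf, "SequenceFileReader");

The detail that matters is that the Job is constructed from the configured Configuration: the new Job() call in the driver above creates its own fresh configuration, so a setting made on a separate Configuration object would never reach the SequenceFile record reader.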
