Class Cast exception for the Hadoop new API


Problem description

I have been trying to put together some simple code using the MapReduce framework. Previously I had implemented it using the mapred package and was able to specify the input format class as KeyValueTextInputFormat, but in the new API (the mapreduce package) this class is not present. I tried using TextInputFormat.class, but I still get the following exception:

- job_local_0001
java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text
    at com.hp.hpl.mapReduceprocessing.MapReduceWrapper$HitFileProccesorMapper_internal.map(MapReduceWrapper.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)

Here is a sample snippet of the code:

    Configuration conf = new Configuration();
    conf.set("key.value.separator.output.line", ",");

    Job job = new Job(conf, "Result Aggregation");
    job.setJarByClass(ProcessInputFile.class);

    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);

    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(Text.class);

    // Wrap the real mapper in MultithreadedMapper to run it on 3 threads.
    job.setMapperClass(MultithreadedMapper.class);
    MultithreadedMapper.setMapperClass(job, HitFileProccesorMapper_internal.class);
    MultithreadedMapper.setNumberOfThreads(job, 3);
    //job.setMapperClass(HitFileProccesorMapper_internal.class);
    job.setReducerClass(HitFileReducer_internal.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);

    FileInputFormat.addInputPath(job, new Path(inputFileofhits.getName()));
    FileOutputFormat.setOutputPath(job, new Path(ProcessInputFile.resultAggProps
            .getProperty("OUTPUT_DIRECTORY")));

    try {
        job.waitForCompletion(true);
    } catch (IOException e) {
        e.printStackTrace();
    } catch (InterruptedException e) {
        e.printStackTrace();
    } catch (ClassNotFoundException e) {
        e.printStackTrace();
    }

Please let me know what configuration changes need to be made so that the ClassCastException can be avoided.

Solution

This usually happens when there is a mismatch between the key/value types MapReduce is trying to pass through and the types the Map or Reduce class is templated on.

You say that you are using KeyValueTextInputFormat, but in your code you are using TextInputFormat. TextInputFormat delivers records as <LongWritable, Text>: "position, line".

I'm going to guess that the type of your Mapper is <Text, Text, ?, ?>. Therefore, MapReduce is trying to cast the LongWritable that TextInputFormat is giving it to a Text, and it can't, so it bombs out.
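
For illustration, here is a hedged reconstruction of a mapper with that mismatched signature; the real HitFileProccesorMapper_internal is not shown in the post, so the class name and body below are invented:

    import java.io.IOException;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Declares Text as the input key type, so when TextInputFormat hands the
    // framework a LongWritable key, the implicit cast to Text throws the
    // ClassCastException seen in the stack trace above.
    public class MismatchedMapper extends Mapper<Text, Text, Text, Text> {
        @Override
        protected void map(Text key, Text value, Context context)
                throws IOException, InterruptedException {
            // Never reached with TextInputFormat: the cast fails before
            // this method body runs.
            context.write(key, value);
        }
    }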

I suggest you either use KeyValueTextInputFormat or change the type of your mapper to <LongWritable, Text, ?, ?>.
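
Here is a minimal sketch of the second fix; the class name, and the assumption that each line is a comma-separated "key,value" record (suggested by the question's separator setting), are mine rather than from the post:

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class HitFileMapperSketch
            extends Mapper<LongWritable, Text, Text, Text> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // 'key' is the byte offset of the line; 'value' is the line itself.
            // Split on the first comma to recover a key/value pair.
            String line = value.toString();
            int sep = line.indexOf(',');
            if (sep >= 0) {
                context.write(new Text(line.substring(0, sep)),
                              new Text(line.substring(sep + 1)));
            }
        }
    }

If upgrading is an option, later Hadoop releases also ship a new-API KeyValueTextInputFormat (org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat), which makes the first suggestion workable without touching the mapper.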

