Getting java.lang.ClassCastException: class java.lang.String when running a simple MapReduce program


Problem Description

I am trying to execute a simple MapReduce program in which the Map takes the input and splits it into two parts (key => String, value => Integer), and the Reducer sums up the values for each corresponding key. I am getting a ClassCastException every time, and I cannot understand what in the code is causing this error.

My Code:

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class Test {
    public static class Map extends MapReduceBase implements
            Mapper<LongWritable, Text, String, Integer> {

        @Override
        public void map(LongWritable key, Text value,
                OutputCollector<String, Integer> output, Reporter reporter)
                throws IOException {
            String line = value.toString();
            String[] lineParts = line.split(",");
            output.collect(lineParts[0], Integer.parseInt(lineParts[1]));
        }
    }

    public static class Reduce extends MapReduceBase implements
            Reducer<String, Integer, String, Integer> {

        @Override
        public void reduce(String key, Iterator<Integer> values,
                OutputCollector<String, Integer> output, Reporter reporter)
                throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum = sum + values.next();
            }
            output.collect(key, sum);
        }
    }

    public static void main(String[] args) throws Exception {

        JobConf conf = new JobConf(Test.class);
        conf.setJobName("ProductCount");

        conf.setMapOutputKeyClass(String.class);
        conf.setMapOutputValueClass(Integer.class);

        conf.setOutputKeyClass(String.class);
        conf.setOutputValueClass(Integer.class);

        conf.setMapperClass(Map.class);
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}

Sample Data:

abc,10
abc,10
abc,10
def,9
def,9

Following is the stack trace. Does it have anything to do with my key and value types?

14/02/11 23:57:35 INFO mapred.JobClient: Task Id : attempt_201402110240_0013_m_000001_2, Status : FAILED
java.lang.ClassCastException: class java.lang.String
at java.lang.Class.asSubclass(Class.java:3018)
at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:795)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:816)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:382)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.mapred.Child.main(Child.java:262)


Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1246)
at Test.main(Test.java:69)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:186)

Solution

It seems to me that you are not using the correct classes for the output. The stack trace points at exactly this: JobConf.getOutputKeyComparator calls Class.asSubclass(WritableComparable.class) on the configured map output key class, and java.lang.String is not a WritableComparable, so the cast fails.

From one of the MapReduce Tutorials:

The key and value classes have to be serializable by the framework and hence need to implement the Writable interface. Additionally, the key classes have to implement the WritableComparable interface to facilitate sorting by the framework.

Therefore you should replace String.class with Text.class and Integer.class with IntWritable.class.
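
Note that swapping the classes in the JobConf alone is not enough: the generic type parameters of your Mapper and Reducer (and the OutputCollector signatures) must change as well, and the plain Java values need to be wrapped in their Writable counterparts before being emitted. Here is an untested sketch of the changed parts; the rest of your program stays the same:

import org.apache.hadoop.io.IntWritable;  // new import alongside the existing ones

public static class Map extends MapReduceBase implements
        Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    public void map(LongWritable key, Text value,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        String[] lineParts = value.toString().split(",");
        // Wrap the plain Java values in their Writable counterparts before emitting
        output.collect(new Text(lineParts[0]),
                new IntWritable(Integer.parseInt(lineParts[1])));
    }
}

public static class Reduce extends MapReduceBase implements
        Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterator<IntWritable> values,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();  // unwrap the IntWritable to add it
        }
        output.collect(key, new IntWritable(sum));
    }
}

// ... and in main():
conf.setMapOutputKeyClass(Text.class);
conf.setMapOutputValueClass(IntWritable.class);
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(IntWritable.class);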

I hope that fixes your problem.

Why can't I use the basic String or Integer classes?

Integer and String implement Java's standard Serializable interface, as described in the docs. The problem is that MapReduce does not serialize/deserialize values through this standard interface; it uses its own interface instead, which is called Writable.
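
For reference, the Writable contract is tiny, and key classes additionally implement WritableComparable, which merely adds Comparable on top. Both are reproduced here from org.apache.hadoop.io:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

public interface Writable {
    void write(DataOutput out) throws IOException;     // serialize the fields
    void readFields(DataInput in) throws IOException;  // deserialize the fields
}

public interface WritableComparable<T> extends Writable, Comparable<T> {
}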

So why don't they just use the basic Java interface?

Short answer: because it is more efficient. The Writable interface omits the type information when serializing, because you already define the input/output types in your MapReduce code. Since your code already knows what is coming, instead of serializing a String like this:

String: "theStringItself"

It could be serialized like:

theStringItself

As you can see, this saves space on every single record, which adds up to a lot at MapReduce scale.
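
If you want to see the difference yourself, here is a minimal sketch comparing the two mechanisms for the same string (the exact byte counts printed in the comments are my assumptions and may vary slightly with your JDK and Hadoop versions):

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

import org.apache.hadoop.io.Text;

public class SerializationSizeDemo {
    public static void main(String[] args) throws IOException {
        // Hadoop Writable: a vint length followed by the UTF-8 payload, no type info
        ByteArrayOutputStream writableBytes = new ByteArrayOutputStream();
        new Text("theStringItself").write(new DataOutputStream(writableBytes));

        // Standard Java serialization: stream header and type tag plus the payload
        ByteArrayOutputStream javaBytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(javaBytes)) {
            oos.writeObject("theStringItself");
        }

        System.out.println("Writable:           " + writableBytes.size() + " bytes"); // 16, assuming a 1-byte vint
        System.out.println("Java serialization: " + javaBytes.size() + " bytes");     // around 22
    }
}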

Long answer: Read this awesome blog post.
