TaskID.<init>(Lorg/apache/hadoop/mapreduce/JobID;Lorg/apache/hadoop/mapreduce/TaskType;I)V
Problem description
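import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapred.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapred.JobConf

// hbaseConf, tablename and sc (the SparkContext) are defined elsewhere
val jobConf = new JobConf(hbaseConf)
jobConf.setOutputFormat(classOf[TableOutputFormat])
jobConf.set(TableOutputFormat.OUTPUT_TABLE, tablename)
val indataRDD = sc.makeRDD(Array("1,jack,15", "2,Lily,16", "3,mike,16"))
// Split each "id,name,age" record and build one Put per row
val rdd = indataRDD.map(_.split(',')).map { arr =>
  val put = new Put(Bytes.toBytes(arr(0).toInt))
  put.add(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes(arr(1)))
  put.add(Bytes.toBytes("cf"), Bytes.toBytes("age"), Bytes.toBytes(arr(2).toInt))
  (new ImmutableBytesWritable, put)
}
rdd.saveAsHadoopDataset(jobConf)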
When I run Hadoop or Spark jobs, I often hit this error:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.TaskID.<init>(Lorg/apache/hadoop/mapreduce/JobID;Lorg/apache/hadoop/mapreduce/TaskType;I)V
at org.apache.spark.SparkHadoopWriter.setIDs(SparkHadoopWriter.scala:158)
at org.apache.spark.SparkHadoopWriter.preSetup(SparkHadoopWriter.scala:60)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1188)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1161)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1161)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1161)
at com.iteblog.App$.main(App.scala:62)
at com.iteblog.App.main(App.scala)
At first I thought it was a jar conflict, but I checked the jars carefully and there are no other jars. The Spark and Hadoop versions are:
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.0.1</version>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>2.6.0-mr1-cdh5.5.0</version>
I also found that TaskID and TaskType are both in the hadoop-core jar, but not in the same package. Why can mapred.TaskID refer to mapreduce.TaskType?
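A class can reference a type from any other package, so mapred.TaskID taking a mapreduce.TaskType is not a problem in itself. A NoSuchMethodError means that the TaskID class actually loaded at runtime has no constructor with the signature the caller was compiled against. A minimal sketch of how one might check this with plain JVM reflection follows; the names TaskIdCheck, hasConstructor and locationOf are hypothetical helpers, not part of Spark or Hadoop.

```scala
import scala.util.Try

// Hypothetical diagnostic helper: uses only JVM reflection, so it compiles
// without Hadoop on the classpath and simply reports "not found" in that case.
object TaskIdCheck {
  // Resolve a type name, treating "int" as the JVM primitive type.
  private def resolve(name: String): Class[_] =
    if (name == "int") java.lang.Integer.TYPE else Class.forName(name)

  // True if className declares a constructor with exactly these parameter types.
  def hasConstructor(className: String, paramTypes: Seq[String]): Boolean =
    Try {
      val cls = Class.forName(className)
      cls.getDeclaredConstructor(paramTypes.map(resolve): _*)
    }.isSuccess

  // Jar or directory the class was loaded from, if the JVM exposes it.
  def locationOf(className: String): Option[String] =
    Try(Class.forName(className)).toOption
      .flatMap(c => Option(c.getProtectionDomain.getCodeSource))
      .flatMap(cs => Option(cs.getLocation))
      .map(_.toString)

  def main(args: Array[String]): Unit = {
    val taskId = "org.apache.hadoop.mapred.TaskID"
    // Parameter types taken from the descriptor in the stack trace above.
    val params = Seq(
      "org.apache.hadoop.mapreduce.JobID",
      "org.apache.hadoop.mapreduce.TaskType",
      "int")
    println(s"$taskId loaded from: ${locationOf(taskId).getOrElse("<unknown>")}")
    println(s"TaskID(JobID, TaskType, int) present: ${hasConstructor(taskId, params)}")
  }
}
```

Running this with the same classpath as the failing job should show which jar supplies TaskID and whether that jar has the constructor Spark calls. If it reports false, the Hadoop jar on the classpath likely does not match the Hadoop API the Spark build expects.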
I faced the same issue. It is basically just a jar problem.
Add the spark-core_2.10 jar from Maven:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>2.0.2</version>
</dependency>
After changing the Jar file