Why does elasticsearch-spark 5.5.0 fail with AbstractMethodError when submitting to YARN cluster?


Problem description

I wrote a Spark job whose main goal is to write into Elasticsearch, and submitted it. The issue is that when I submit it onto the Spark cluster, Spark gives back:


[ERROR][org.apache.spark.deploy.yarn.ApplicationMaster] User class threw exception: java.lang.AbstractMethodError: org.elasticsearch.spark.sql.DefaultSource.createRelation(Lorg/apache/spark/sql/SQLContext;Lorg/apache/spark/sql/SaveMode;Lscala/collection/immutable/Map;Lorg/apache/spark/sql/Dataset;)Lorg/apache/spark/sql/sources/BaseRelation;
java.lang.AbstractMethodError: org.elasticsearch.spark.sql.DefaultSource.createRelation(Lorg/apache/spark/sql/SQLContext;Lorg/apache/spark/sql/SaveMode;Lscala/collection/immutable/Map;Lorg/apache/spark/sql/Dataset;)Lorg/apache/spark/sql/sources/BaseRelation;
    at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:472)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:48)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)

But if I submit my job using local[2], the job works out just fine. Strange, since the jars in both environments are the same. I use elasticsearch-spark20_2.11_5.5.0 and Spark 2.2.
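For context, here is a minimal sketch of the kind of job involved (the actual job code was not posted; the class name, node address and index are hypothetical, but the data source name and es.* settings are the connector's documented ones):

import org.apache.spark.sql.SparkSession

object EsWriteJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("es-write-job")
      .config("es.nodes", "es-host:9200") // hypothetical ES endpoint
      .getOrCreate()
    import spark.implicits._

    val df = Seq(("1", "hello"), ("2", "world")).toDF("id", "text")

    // save() goes through DataSource.write, which calls the connector's
    // createRelation: the exact call that throws AbstractMethodError here.
    df.write
      .format("org.elasticsearch.spark.sql")
      .save("my-index/my-type") // hypothetical index/type
  }
}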

Recommended answer


It appears you face a Spark version mismatch, i.e. you use elasticsearch-spark20_2.11_5.5.0 (note spark20 in the name) and Spark 2.2.
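If the mismatch comes from the build, the first thing to verify is that the connector and Spark versions are pinned together consistently. A minimal build.sbt sketch, assuming sbt and the Maven Central coordinates of the connector (check the ES-Hadoop compatibility matrix for the release that matches your Spark):

scalaVersion := "2.11.11"

libraryDependencies ++= Seq(
  // "provided": the cluster supplies Spark itself at run time
  "org.apache.spark" %% "spark-sql" % "2.2.0" % "provided",
  // note "spark-20" in the artifact: built against the Spark 2.0.x API line
  "org.elasticsearch" %% "elasticsearch-spark-20" % "5.5.0"
)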

Quoting the Javadoc of java.lang.AbstractMethodError:


Thrown when an application tries to call an abstract method. Normally, this error is caught by the compiler; this error can only occur at run time if the definition of some class has incompatibly changed since the currently executing method was last compiled.


That pretty much explains what you experience (note the part that starts with "this error can only occur at run time").
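To make that concrete, here is a conceptual sketch (not the connector's actual code) of how such an error comes about across two separate builds:

// Build 1: Impl is compiled against v1 of the trait and shipped as a jar.
trait Sink { def write(s: String): Unit }                       // v1
class Impl extends Sink { def write(s: String): Unit = println(s) }

// Build 2 (hypothetical): the trait changes incompatibly, but the old
// Impl jar stays on the classpath and is never recompiled:
// trait Sink { def write(s: String, flush: Boolean): Unit }    // v2
//
// Code calling the v2 method compiles fine against v2, but invoking it on
// the old Impl bytecode throws java.lang.AbstractMethodError at run time,
// exactly as the Javadoc above describes.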


Digging in deeper, this line in the stack trace gave me the exact version of Spark you've used, i.e. Spark 2.2.0.

org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:472)


That gives you the exact location where the issue was "born" (see that line):

dataSource.createRelation(sparkSession.sqlContext, mode, caseInsensitiveOptions, data)


That matches the top-most line in the stack trace:

java.lang.AbstractMethodError: org.elasticsearch.spark.sql.DefaultSource.createRelation(Lorg/apache/spark/sql/SQLContext;Lorg/apache/spark/sql/SaveMode;Lscala/collection/immutable/Map;Lorg/apache/spark/sql/Dataset;)Lorg/apache/spark/sql/sources/BaseRelation;

It looks like the elasticsearch-spark20_2.11_5.5.0 connector is a CreatableRelationProvider, but somehow it does not implement the method. How is that possible, since Spark 2.0 already had this interface?! Let's find out and review the source code of elasticsearch-spark20_2.11_5.5.0.


From the stack trace you know the ES implementation is org.elasticsearch.spark.sql.DefaultSource. The data source is indeed a CreatableRelationProvider:

private[sql] class DefaultSource ... with CreatableRelationProvider  {
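For reference, the Spark 2.x interface in question looks roughly like this (paraphrased from Spark's sources; note the data: DataFrame parameter):

trait CreatableRelationProvider {
  def createRelation(
      sqlContext: SQLContext,
      mode: SaveMode,
      parameters: Map[String, String],
      data: DataFrame): BaseRelation
}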

And it does override the required createRelation method (otherwise it would not have been possible to compile it, since the interface has existed since Spark 1.3!).

The only difference between the method and the stack trace is data: DataFrame (in the connector and the interface) vs Lorg/apache/spark/sql/Dataset; in the stack trace. That begs the question about the code in your Spark application, or perhaps there is something incorrect in how you submit the Spark application to the YARN cluster (and you do submit the Spark application to a YARN cluster, don't you?).
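One way to investigate is to check, at run time on the cluster (e.g. from spark-shell launched the same way you submit the job), which createRelation signature the loaded connector actually exposes. A diagnostic sketch, not a fix:

// Lists every createRelation overload visible on the driver classpath;
// an unexpected signature points at a stale connector jar.
Class.forName("org.elasticsearch.spark.sql.DefaultSource")
  .getMethods
  .filter(_.getName == "createRelation")
  .foreach(println)

If a stale ES-Hadoop jar does show up, submitting with spark-submit --packages org.elasticsearch:elasticsearch-spark-20_2.11:5.5.0 (instead of relying on jars pre-installed on the cluster) would be a reasonable next step.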


I'm puzzled, but hopefully the answer has shed some light on what might've been causing it.
