Why does elasticsearch-spark 5.5.0 fail with AbstractMethodError when submitting to YARN cluster?


Question

I wrote a Spark job whose main goal is to write into ES, and then submitted it. The issue is that when I submit it to the Spark cluster, Spark gives back:

[ERROR][org.apache.spark.deploy.yarn.ApplicationMaster] User class threw exception: java.lang.AbstractMethodError: org.elasticsearch.spark.sql.DefaultSource.createRelation(Lorg/apache/spark/sql/SQLContext;Lorg/apache/spark/sql/SaveMode;Lscala/collection/immutable/Map;Lorg/apache/spark/sql/Dataset;)Lorg/apache/spark/sql/sources/BaseRelation;
java.lang.AbstractMethodError: org.elasticsearch.spark.sql.DefaultSource.createRelation(Lorg/apache/spark/sql/SQLContext;Lorg/apache/spark/sql/SaveMode;Lscala/collection/immutable/Map;Lorg/apache/spark/sql/Dataset;)Lorg/apache/spark/sql/sources/BaseRelation;
    at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:472)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:48)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)

But if I submit my job with local[2], the job works out just fine. Strange, since the jars in the two environments are the same. I use elasticsearch-spark20_2.11_5.5.0 and Spark 2.2.
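
A minimal sketch of the shape of such a job (the index name, ES host and write options below are placeholders, not details from the question):

import org.apache.spark.sql.SparkSession

object WriteToEs {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("write-to-es")
      .config("es.nodes", "localhost") // placeholder ES host
      .config("es.port", "9200")       // placeholder ES port
      .getOrCreate()

    import spark.implicits._
    val df = Seq((1, "hello"), (2, "world")).toDF("id", "text")

    // Write through the connector's Spark SQL data source
    df.write
      .format("org.elasticsearch.spark.sql")
      .mode("append")
      .save("my-index/my-type") // placeholder index/type
  }
}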

Answer

It appears you face a Spark version mismatch, i.e. you use elasticsearch-spark20_2.11_5.5.0 (note spark20 in the name) and Spark 2.2.
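
To make the mismatch concrete, the pairing from the question would be declared roughly like this in an sbt build (a sketch; the build tool is an assumption):

// build.sbt (sketch)
libraryDependencies ++= Seq(
  // Spark 2.2 on the cluster...
  "org.apache.spark" %% "spark-sql" % "2.2.0" % "provided",
  // ...but a connector built against the Spark 2.0 line (note the "20")
  "org.elasticsearch" % "elasticsearch-spark-20_2.11" % "5.5.0"
)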

Quoting the javadoc of java.lang.AbstractMethodError:

Thrown when an application tries to call an abstract method. Normally, this error is caught by the compiler; this error can only occur at run time if the definition of some class has incompatibly changed since the currently executing method was last compiled.

That pretty much explains what you experience (note the part that starts with "this error can only occur at run time").

Digging deeper, this line in the stack trace gave me the exact version of Spark you've used, i.e. Spark 2.2.0:

org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:472)

That gives you the exact location where the issue was "born" (see that line):

dataSource.createRelation(sparkSession.sqlContext, mode, caseInsensitiveOptions, data)
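
In context (paraphrased and abbreviated from Spark 2.2.0's DataSource.write; not a verbatim copy of the Spark source), that call sits inside a match on the data source class:

def write(mode: SaveMode, data: DataFrame): BaseRelation = {
  // ...
  providingClass.newInstance() match {
    case dataSource: CreatableRelationProvider =>
      // the connector's createRelation is resolved here, at run time
      dataSource.createRelation(sparkSession.sqlContext, mode, caseInsensitiveOptions, data)
    // ... other cases (FileFormat, etc.) elided
  }
}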

That matches the top-most line in the stack trace:

java.lang.AbstractMethodError: org.elasticsearch.spark.sql.DefaultSource.createRelation(Lorg/apache/spark/sql/SQLContext;Lorg/apache/spark/sql/SaveMode;Lscala/collection/immutable/Map;Lorg/apache/spark/sql/Dataset;)Lorg/apache/spark/sql/sources/BaseRelation;

It looks like the elasticsearch-spark20_2.11_5.5.0 connector is a CreatableRelationProvider, but somehow it does not implement the method. How is that possible, since Spark 2.0 already had this interface?! Let's find out and review the source code of elasticsearch-spark20_2.11_5.5.0.

From the stack trace you know the ES implementation is org.elasticsearch.spark.sql.DefaultSource. The data source is indeed a CreatableRelationProvider:

private[sql] class DefaultSource ... with CreatableRelationProvider  {

And it does override the required createRelation method (otherwise it would not have compiled at all, since the interface has existed since Spark 1.3!).
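
Abbreviated from the connector's source (the body is elided here), the override reads:

override def createRelation(
    sqlContext: SQLContext,
    mode: SaveMode,
    parameters: Map[String, String],
    data: DataFrame): BaseRelation = {
  // ... (body elided: writes `data` to Elasticsearch and returns the relation)
}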

The only difference between the method's signature and the stack trace is data: DataFrame (in the connector and the interface) vs Lorg/apache/spark/sql/Dataset; in the stack trace. That begs the question about the code in your Spark application, or perhaps about how you submit the Spark application to the YARN cluster (and you do submit the Spark application to a YARN cluster, don't you?).
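
One fact worth keeping in mind about the Dataset in the bytecode descriptor: since Spark 2.0, DataFrame is only a type alias, so any compiled signature taking a DataFrame refers to Dataset at the JVM level (paraphrased from the org.apache.spark.sql package object):

package object sql {
  // A DataFrame is just a Dataset of Rows, which is why the JVM-level
  // createRelation signature mentions Dataset, not DataFrame.
  type DataFrame = Dataset[Row]
}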

I'm puzzled, but hopefully the answer has shed some light on what might've been causing it.
