java.lang.NoSuchMethodError: scala.Predef$.refArrayOps in Spark job with Scala
Problem description
Full error:
Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)[Ljava/lang/Object;
    at org.spark_module.SparkModule$.main(SparkModule.scala:62)
    at org.spark_module.SparkModule.main(SparkModule.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
When I compile and run the code in IntelliJ, it executes fine all the way through. The error only appears at runtime, when I submit the .jar as a Spark job.
Line 62 contains: for ((elem, i) <- args.zipWithIndex). I commented out the rest of the code to be sure, and the error kept showing on that line.
At first I thought it was zipWithIndex's fault. Then I changed it to for (elem <- args) and, guess what, the error still showed. Is the for causing this?
Google searching always points to a Scala version incompatibility between the version used to compile and the version used at runtime, but I can't figure out a solution.
I tried this to check Scala version used by IntelliJ and here is everything Scala-related under Modules > Scala:
Then I did this to check the run-time version of Scala and the output is:
(file:/C:/Users/me/.gradle/caches/modules-2/files-2.1/org.scala-lang/scala-library/2.12.11/1a0634714a956c1aae9abefc83acaf6d4eabfa7d/scala-library-2.12.11.jar)
Versions seem to match...
This is my gradle.build (includes the fatJar task):
group 'org.spark_module'
version '1.0-SNAPSHOT'

apply plugin: 'scala'
apply plugin: 'idea'
apply plugin: 'eclipse'

repositories {
    mavenCentral()
}

idea {
    project {
        jdkName = '1.8'
        languageLevel = '1.8'
    }
}

dependencies {
    implementation group: 'org.scala-lang', name: 'scala-library', version: '2.12.11'
    implementation group: 'org.apache.spark', name: 'spark-core_2.12'//, version: '2.4.5'
    implementation group: 'org.apache.spark', name: 'spark-sql_2.12'//, version: '2.4.5'
    implementation group: 'com.datastax.spark', name: 'spark-cassandra-connector_2.12', version: '2.5.0'
    implementation group: 'org.apache.spark', name: 'spark-mllib_2.12', version: '2.4.5'
    implementation group: 'log4j', name: 'log4j', version: '1.2.17'
    implementation group: 'org.scalaj', name: 'scalaj-http_2.12', version: '2.4.2'
}

task fatJar(type: Jar) {
    zip64 true
    from {
        configurations.runtimeClasspath.collect { it.isDirectory() ? it : zipTree(it) }
    } {
        exclude "META-INF/*.SF"
        exclude "META-INF/*.DSA"
        exclude "META-INF/*.RSA"
    }
    manifest {
        attributes 'Main-Class': 'org.spark_module.SparkModule'
    }
    with jar
}

configurations.all {
    resolutionStrategy {
        force 'com.google.guava:guava:12.0.1'
    }
}

compileScala.targetCompatibility = "1.8"
compileScala.sourceCompatibility = "1.8"

jar {
    zip64 true
    getArchiveFileName()
    from {
        configurations.compile.collect {
            it.isDirectory() ? it : zipTree(it)
        }
    }
    manifest {
        attributes 'Main-Class': 'org.spark_module.SparkModule'
    }
    exclude 'META-INF/*.RSA', 'META-INF/*.SF', 'META-INF/*.DSA'
}
To build the (fat) jar:
gradlew fatJar
in IntelliJ's terminal.
To run the job:
spark-submit.cmd .\SparkModule-1.0-SNAPSHOT.jar
in Windows PowerShell.
Thank you
EDIT:
spark-submit.cmd and spark-shell.cmd both show Scala version 2.11.12, so yes, they differ from the one I am using in IntelliJ (2.12.11). The problem is that on Spark's download page there is only one Spark distribution for Scala 2.12, and it comes without Hadoop; does that mean I have to downgrade from 2.12 to 2.11 in my gradle.build?
I would try spark-submit --version to find out which Scala version Spark is using.
With spark-submit --version I get this information:
[cloudera@quickstart scala-programming-for-data-science]$ spark-submit --version
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.2.0.cloudera4
/_/
Using Scala version 2.11.8, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_202
Branch HEAD
Compiled by user jenkins on 2018-09-27T02:42:51Z
Revision 0ef0912caaab3f2636b98371eb29adb42978c595
Url git://github.mtv.cloudera.com/CDH/spark.git
Type --help for more information.
From the spark-shell you can try this to find out the Scala version:
scala> util.Properties.versionString
res3: String = version 2.11.8
The OS could be using a different Scala version; in my case, as you can see, the Spark Scala version and the OS Scala version are different:
[cloudera@quickstart scala-programming-for-data-science]$ scala -version
Scala code runner version 2.12.8 -- Copyright 2002-2018, LAMP/EPFL and Lightbend, Inc.
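Given a mismatch like this, a common fix is to align the build with the Scala line your Spark installation was built against. A sketch of the dependency block switched to the _2.11 artifacts (versions are illustrative; confirm that each _2.11 artifact exists at the chosen version and matches what spark-submit --version reports):

```groovy
dependencies {
    // Match the runtime: scala-library 2.11.x plus Spark artifacts built
    // for Scala 2.11 (note the _2.11 suffix on every Scala dependency).
    implementation group: 'org.scala-lang', name: 'scala-library', version: '2.11.12'
    implementation group: 'org.apache.spark', name: 'spark-core_2.11', version: '2.4.5'
    implementation group: 'org.apache.spark', name: 'spark-sql_2.11', version: '2.4.5'
    implementation group: 'com.datastax.spark', name: 'spark-cassandra-connector_2.11', version: '2.5.0'
    implementation group: 'org.apache.spark', name: 'spark-mllib_2.11', version: '2.4.5'
    implementation group: 'log4j', name: 'log4j', version: '1.2.17'
    implementation group: 'org.scalaj', name: 'scalaj-http_2.11', version: '2.4.2'
}
```

Scala minor lines (2.11/2.12/2.13) are not binary compatible, so every Scala dependency must use the same suffix as the scala-library on the cluster.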
Note from O'Reilly's Learning Spark by Holden Karau, Andy Konwinski, Patrick Wendell & Matei Zaharia:
Dependency Conflicts
One occasionally disruptive issue is dealing with dependency conflicts in cases where a user application and Spark itself both depend on the same library. This comes up relatively rarely, but when it does, it can be vexing for users. Typically, this will manifest itself when a NoSuchMethodError, a ClassNotFoundException, or some other JVM exception related to class loading is thrown during the execution of a Spark job.
There are two solutions to this problem. The first is to modify your application to depend on the same version of the third-party library that Spark does. The second is to modify the packaging of your application using a procedure that is often called "shading." The Maven build tool supports shading through advanced configuration of the plug-in shown in Example 7-5 (in fact, the shading capability is why the plugin is named maven-shade-plugin). Shading allows you to make a second copy of the conflicting package under a different namespace and rewrites your application's code to use the renamed version. This somewhat brute-force technique is quite effective at resolving runtime dependency conflicts. For specific instructions on how to shade dependencies, see the documentation for your build tool.
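Since this build uses Gradle rather than Maven, the equivalent of maven-shade-plugin is the Shadow plugin. A minimal sketch (plugin version and the relocated package are illustrative; Guava is shown because the build above already forces a Guava version):

```groovy
// Sketch only: shading with the Gradle Shadow plugin, the Gradle
// analogue of maven-shade-plugin.
plugins {
    id 'com.github.johnrengelman.shadow' version '5.2.0'
}

shadowJar {
    zip64 true
    // Copy the conflicting package under a private namespace and rewrite
    // the application's bytecode to use the renamed version.
    relocate 'com.google.common', 'org.spark_module.shaded.com.google.common'
    manifest {
        attributes 'Main-Class': 'org.spark_module.SparkModule'
    }
}
```

Note that shading helps with third-party library conflicts; it does not fix a scala-library mismatch, which still requires compiling against the same Scala minor line the cluster runs.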