Spark throws ClassNotFoundException when using --jars option
Question
I was trying to follow the Spark standalone application example described here https://spark.apache.org/docs/latest/quick-start.html#standalone-applications
The example ran fine with the following invocation:
spark-submit --class "SimpleApp" --master local[4] target/scala-2.10/simple-project_2.10-1.0.jar
However, when I tried to include some third-party libraries via --jars, it throws a ClassNotFoundException.
$ spark-submit --jars /home/linpengt/workspace/scala-learn/spark-analysis/target/pack/lib/* \
--class "SimpleApp" --master local[4] target/scala-2.10/simple-project_2.10-1.0.jar
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Exception in thread "main" java.lang.ClassNotFoundException: SimpleApp
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:300)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
If I remove the --jars option, the program runs again (I haven't actually started using those libraries yet). What's the problem here? How should I add the external jars?
Answer
According to spark-submit's --help, the --jars option expects a comma-separated list of local jars to include on the driver and executor classpaths.
I think that what's happening here is that /home/linpengt/workspace/scala-learn/spark-analysis/target/pack/lib/* is expanding into a space-separated list of jars, and the second JAR in the list is being treated as the application jar.
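To illustrate, suppose the lib directory contains dep1.jar and dep2.jar (hypothetical names). The shell expands the glob before spark-submit ever sees it, so the command effectively becomes:

spark-submit --jars .../target/pack/lib/dep1.jar .../target/pack/lib/dep2.jar \
  --class "SimpleApp" --master local[4] target/scala-2.10/simple-project_2.10-1.0.jar

--jars then receives only dep1.jar, dep2.jar is taken as the application jar, and simple-project_2.10-1.0.jar (which actually contains SimpleApp) is no longer used as the application jar, so the class cannot be found.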
One solution is to use your shell to build a comma-separated list of jars; here's a quick way of doing it in bash, based on this answer on StackOverflow (see that answer for more complex approaches that handle filenames that contain spaces):
spark-submit --jars $(echo /dir/of/jars/*.jar | tr ' ' ',') \
--class "SimpleApp" --master local[4] path/to/myApp.jar