Spark installation - Error: Could not find or load main class org.apache.spark.launcher.Main
Question
After installing Spark 2.3 and setting the following environment variables in .bashrc (using Git Bash):
HADOOP_HOME
SPARK_HOME
PYSPARK_PYTHON
JDK_HOME
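For reference, a minimal sketch of what those .bashrc entries might look like under Git Bash. The paths below are hypothetical placeholders; substitute your own install locations:

```shell
# Hypothetical install paths - replace with your actual locations.
export HADOOP_HOME=/c/hadoop
export SPARK_HOME=/c/spark/spark-2.3.0-bin-hadoop2.7
export PYSPARK_PYTHON=python3
export JDK_HOME=/c/java/jdk1.8.0_161
# Make spark-submit and the other launcher scripts resolvable from the shell.
export PATH="$SPARK_HOME/bin:$PATH"
```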
executing $SPARK_HOME/bin/spark-submit displays the following error:
Error: Could not find or load main class org.apache.spark.launcher.Main
I did some research on Stack Overflow and other sites, but could not figure out the problem.
Environment
- Windows 10 Enterprise
- Spark version - 2.3
- Python version - 3.6.4
Could you provide some guidance?
Answer
I had that error message. It may have several root causes, but here is how I investigated and solved the problem (on Linux):
- Instead of launching spark-submit directly, try bash -x spark-submit to see which line fails.
- Repeat that process several times (since spark-submit calls nested scripts) until you find the underlying process being called; in my case it was something like:
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp '/opt/spark-2.2.0-bin-hadoop2.7/conf/:/opt/spark-2.2.0-bin-hadoop2.7/jars/*' -Xmx1g org.apache.spark.deploy.SparkSubmit --class org.apache.spark.repl.Main --name 'Spark shell' spark-shell
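The bash -x trick works on any chain of shell scripts, not just Spark's launchers. A self-contained illustration using throwaway stand-in scripts (not Spark itself):

```shell
# Two tiny nested scripts standing in for spark-submit's launcher chain.
cat > /tmp/inner.sh <<'EOF'
echo "pretend: java -cp ... org.apache.spark.deploy.SparkSubmit"
EOF
cat > /tmp/outer.sh <<'EOF'
bash /tmp/inner.sh
EOF
# -x echoes every command before it runs, so the nested call into inner.sh
# (and, with the real spark-submit, the exact java invocation that fails)
# becomes visible in the trace output.
bash -x /tmp/outer.sh
```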
So, spark-submit launches a java process but can't find the org.apache.spark.launcher.Main class using the files in /opt/spark-2.2.0-bin-hadoop2.7/jars/* (see the -cp option above). I did an ls in this jars folder and counted 4 files instead of the whole Spark distribution (~200 files). It was probably a problem during the installation process, so I reinstalled Spark, checked the jars folder, and it worked like a charm.
所以,你应该:
- 检查
java
命令(cp 选项) - 检查您的 jars 文件夹(它至少包含所有 spark-*.jar 吗?)
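That jars-folder check can be scripted. A minimal sketch, assuming SPARK_HOME points at your installation; the 100-file threshold is a rough heuristic based on a healthy Spark 2.x distribution shipping roughly 200 jars:

```shell
# Count the jars that spark-submit's -cp option will see; a handful
# instead of ~200 usually indicates a broken or partial installation.
check_spark_jars() {
  local jars_dir="$1/jars"
  local count
  count=$(ls "$jars_dir"/*.jar 2>/dev/null | wc -l)
  echo "$count jar(s) in $jars_dir"
  [ "$count" -gt 100 ]   # rough heuristic threshold; adjust as needed
}
```

Usage: `check_spark_jars "$SPARK_HOME" || echo "jars folder looks incomplete - consider reinstalling Spark"`.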
Hope it helps.