Spark Python submission error: File does not exist: pyspark.zip
Question
I'm trying to submit a Python Spark application in yarn-cluster mode.
Seq(
  System.getenv("SPARK_HOME") + "/bin/spark-submit",
  "--master", sparkConfig.getString("spark.master"),
  "--executor-memory", sparkConfig.getString("spark.executor-memory"),
  "--num-executors", sparkConfig.getString("spark.num-executors"),
  "python/app.py"
) !
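For reference, here is a sketch in plain Python of the argument list that the Scala snippet above assembles before shelling out. The `build_submit_cmd` helper and the dict standing in for `sparkConfig` are hypothetical; only the flags and keys mirror the original code:

```python
import os

def build_submit_cmd(conf):
    """Build the spark-submit argument list for a yarn-cluster PySpark job.

    `conf` is a plain dict standing in for the sparkConfig object in the
    question (hypothetical helper; the keys mirror the ones used there).
    """
    spark_home = os.environ.get("SPARK_HOME", "/opt/spark")
    return [
        spark_home + "/bin/spark-submit",
        "--master", conf["spark.master"],            # should be "yarn", not "local[...]"
        "--deploy-mode", "cluster",
        "--executor-memory", conf["spark.executor-memory"],
        "--num-executors", conf["spark.num-executors"],
        "python/app.py",
    ]

cmd = build_submit_cmd({
    "spark.master": "yarn",
    "spark.executor-memory": "2g",
    "spark.num-executors": "4",
})
print(" ".join(cmd))
```

Building the list explicitly like this makes it easy to log or inspect the exact command line before handing it to a process runner.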
I'm getting the following error:
Diagnostics: File does not exist: hdfs://xxxxxx:8020/user/hdfs/.sparkStaging/application_123456789_0138/pyspark.zip
java.io.FileNotFoundException: File does not exist: hdfs://xxxxxx:8020/user/hdfs/.sparkStaging/application_123456789_0138/pyspark.zip
I found https://issues.apache.org/jira/browse/SPARK-10795, but the ticket is still open!
Answer
This happens when you spark-submit a job with deploy-mode "cluster" while the master is set to "local" in the code; e.g.
val sparkConf = new SparkConf().setAppName("spark-pi-app").setMaster("local[10]");
You have two options:

Option #1: Change the above line to:
val sparkConf = new SparkConf().setAppName("spark-pi-app");
and submit your job as:
./bin/spark-submit --master yarn --deploy-mode cluster --driver-memory 512m --executor-memory 512m --executor-cores 1 --num-executors 3 --jars hadoop-common-{version}.jar,hadoop-lzo-{version}.jar --verbose --queue hadoop-queue --class "SparkPi" sparksbtproject_2.11-1.0.jar
Option #2: Submit your job with deploy-mode "client":
./bin/spark-submit --master yarn --deploy-mode client --driver-memory 512m --executor-memory 512m --executor-cores 1 --num-executors 3 --jars hadoop-common-{version}.jar,hadoop-lzo-{version}.jar --verbose --queue hadoop-queue --class "SparkPi" sparksbtproject_2.11-1.0.jar