Spark job in Kubernetes stuck in RUNNING state


Problem description

I'm submitting Spark jobs to Kubernetes running locally (Docker Desktop). I'm able to submit the jobs and see their final output on the screen.

However, even after they complete, the driver and executor pods remain in a RUNNING state.

The base images used to submit the Spark jobs to Kubernetes are the ones that come with Spark, as described in the docs.

Here's what my spark-submit command looks like:

~/spark-2.4.3-bin-hadoop2.7/bin/spark-submit \
    --master k8s://https://kubernetes.docker.internal:6443 \
    --deploy-mode cluster \
    --name my-spark-job \
    --conf spark.kubernetes.container.image=my-spark-job \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    --conf spark.kubernetes.submission.waitAppCompletion=false \
    local:///opt/spark/work-dir/my-spark-job.py

Here's what kubectl get pods returns:

NAME                                READY   STATUS    RESTARTS   AGE
my-spark-job-1568669908677-driver   1/1     Running   0          11m
my-spark-job-1568669908677-exec-1   1/1     Running   0          10m
my-spark-job-1568669908677-exec-2   1/1     Running   0          10m

Answer

Figured it out. I forgot to stop the Spark context. My script now looks like this; at completion, the driver pod goes into a Completed status and the executor pods get deleted.

from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext()
sqlContext = SQLContext(sc)

# code

sc.stop()
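
As a side note (not part of the original answer), a slightly more defensive sketch of the same idea wraps the job in try/finally, so the context is stopped and the pods can be cleaned up even if the job code raises; the DataFrame here is just a placeholder for the real job logic:

from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext()
sqlContext = SQLContext(sc)

try:
    # placeholder for the actual job logic
    df = sqlContext.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
    df.show()
finally:
    # always stop the context so the driver pod can reach Completed
    sc.stop()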
