Can I run dataproc jobs in cluster mode


Question


Just starting to get familiar with GCP dataproc. I've noticed when I use gcloud dataproc jobs submit pyspark that jobs are submitted with spark.submit.deployMode=client. Is spark.submit.deployMode=cluster an option for us?

Answer


Yes, you can, by specifying --properties spark.submit.deployMode=cluster. Just note that driver output will be in yarn userlogs (you can access them in Stackdriver Logging from the Console). We run in client mode by default to stream driver output to you.
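For example, a submission in cluster mode might look like the sketch below. The bucket path, cluster name, and region are placeholders; substitute your own.

```shell
# Submit a PySpark job with the driver running in a YARN container
# (cluster mode) instead of on the submitting node (client mode).
gcloud dataproc jobs submit pyspark gs://my-bucket/my_job.py \
    --cluster=my-cluster \
    --region=us-central1 \
    --properties=spark.submit.deployMode=cluster
```

Because the driver now runs inside YARN, anything it prints lands in the YARN userlogs rather than streaming back to your terminal, which is why those logs (surfaced via Stackdriver Logging) become the place to look for driver output.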
