Forcing driver to run on specific slave in spark standalone cluster running with "--deploy-mode cluster"

Problem Description

I am running a small spark cluster, with two EC2 instances (m4.xlarge).

So far I have been running the spark master on one node, and a single spark slave (4 cores, 16g memory) on the other, then deploying my spark (streaming) app in client deploy-mode on the master. Summary of settings is:

--executor-memory 16g

--executor-cores 4

--driver-memory 8g

--driver-cores 2

--deploy-mode client
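
For reference, the full client-mode submission looks roughly like the following sketch (the master URL, main class and application jar are placeholders, not the actual names from the setup):

# sketch of the client-mode submission; master URL, class and jar are placeholders
spark-submit \
  --master spark://<master-host>:7077 \
  --deploy-mode client \
  --executor-memory 16g \
  --executor-cores 4 \
  --driver-memory 8g \
  --driver-cores 2 \
  --class com.example.StreamingApp \
  my-streaming-app.jar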

This results in a single executor on my single slave running with 4 cores and 16Gb memory. The driver runs "outside" of the cluster on the master-node (i.e. it is not allocated its resources by the master).

Ideally I'd like to use cluster deploy-mode so that I can take advantage of the supervise option. I have started a second slave on the master node giving it 2 cores and 8g memory (smaller allocated resources so as to leave space for the master daemon).
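
(A worker with an explicit core/memory cap can be started roughly like this with the standard standalone scripts; the master URL is a placeholder, and setting SPARK_WORKER_CORES / SPARK_WORKER_MEMORY in spark-env.sh is an equivalent alternative:)

# sketch: start a second worker on the master node with reduced resources
# (the master URL below is a placeholder for the actual one)
$SPARK_HOME/sbin/start-slave.sh spark://<master-host>:7077 --cores 2 --memory 8g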

When I run my spark job in cluster deploy-mode (using the same settings as above but with --deploy-mode cluster), around 50% of the time I get the desired deployment: the driver runs on the slave running on the master node (which has the right resources of 2 cores & 8Gb), leaving the original slave node free to allocate an executor of 4 cores & 16Gb. However, the other 50% of the time the master runs the driver on the non-master slave node, which means I get a driver on that node with 2 cores & 8Gb memory, and that leaves no node with sufficient resources to start an executor (which requires 4 cores & 16Gb).

Is there any way to force the spark master to use a specific worker / slave for my driver? Given that spark knows there are two slave nodes, one with 2 cores and the other with 4 cores, that my driver needs 2 cores, and that my executor needs 4 cores, it would ideally work out the optimal placement, but this doesn't seem to be the case.

Any ideas / suggestions gratefully received!

Thanks!

Solution

I can see that this is an old question, but let me answer it anyway; someone might find it useful.

Add the --driver-java-options="-Dspark.driver.host=<HOST>" option to the spark-submit script when submitting the application, and Spark should deploy the driver to the specified host.
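
For example, a cluster-mode submission pinning the driver to the master-node worker might look roughly like this sketch (all host names, the class and the jar are placeholders):

# sketch: cluster-mode submission with supervise, pinning the driver host
# (host names, class and jar below are placeholders)
spark-submit \
  --master spark://<master-host>:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 16g \
  --executor-cores 4 \
  --driver-memory 8g \
  --driver-cores 2 \
  --driver-java-options="-Dspark.driver.host=<master-host>" \
  --class com.example.StreamingApp \
  my-streaming-app.jar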
