How are Spark Executors launched if Spark (on YARN) is not installed on the worker nodes?

Question

I have a question regarding Apache Spark running on YARN in cluster mode. According to this thread, Spark itself does not have to be installed on every (worker) node in the cluster. My question concerns the Spark executors: in general, YARN, or more precisely the ResourceManager, decides about resource allocation, so Spark executors could be launched on any (worker) node in the cluster. But then, how can YARN launch Spark executors if Spark is not installed on the worker nodes?

Answer

At a high level, when a Spark application is launched on YARN:

  1. One YARN container is created to host the Application Master (which is Spark-specific).
  2. Additional YARN containers are created for the Spark workers (executors); see the sketch below.
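
As an illustration of this container allocation, here is a minimal sketch, assuming Spark 2.x and client mode so the settings can be supplied programmatically. The object name is hypothetical; `spark.executor.instances`, `spark.executor.memory`, and `spark.executor.cores` are standard Spark-on-YARN properties:

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch (hypothetical app): request a fixed number of executor
// containers from YARN. spark.executor.instances is the number of
// containers the Application Master asks the ResourceManager for;
// spark.executor.memory / spark.executor.cores size each container.
object ContainerAllocationSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ContainerAllocationSketch")
      .master("yarn") // client mode when launched directly
      .config("spark.executor.instances", "3")
      .config("spark.executor.memory", "2g")
      .config("spark.executor.cores", "2")
      .getOrCreate()

    // ... job code ...
    spark.stop()
  }
}
```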

The Spark driver passes serialized actions (code) to the executors, which process the data.
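
A minimal sketch of that mechanism (object name and sample data are made up): the closure passed to `map` is serialized on the driver and executed inside the executor containers:

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch (hypothetical data): the function passed to map() is
// serialized on the driver and shipped to the executors, which run it
// inside their YARN containers.
object ClosureShippingSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("ClosureShippingSketch").getOrCreate()
    val sc = spark.sparkContext

    val words = sc.parallelize(Seq("spark", "yarn", "executor"))

    // This closure travels from the driver to the executors as serialized code.
    val lengths = words.map(w => (w, w.length)).collect()

    lengths.foreach { case (w, n) => println(s"$w -> $n") }
    spark.stop()
  }
}
```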

spark-assembly provides the Spark-related jars needed to run Spark jobs on a YARN cluster, while the application ships its own functional (application-specific) jars.
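
For Spark 1.x, a hedged sketch of how this works (the HDFS path and object name are placeholders; `spark.yarn.jar` is the Spark 1.x property for pointing at a pre-staged assembly jar):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch for Spark 1.x (placeholder HDFS path): stage the assembly jar
// on HDFS once and point spark.yarn.jar at it, so YARN localizes the
// Spark runtime into each container instead of requiring a per-node
// Spark installation.
object AssemblyFromHdfsSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("AssemblyFromHdfsSketch")
      .setMaster("yarn-client") // Spark 1.x master string
      .set("spark.yarn.jar", "hdfs:///apps/spark/spark-assembly.jar")

    val sc = new SparkContext(conf)
    // ... job code ...
    sc.stop()
  }
}
```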


Update (2017-01-04): Spark 2.0 no longer requires a fat assembly jar for production deployment (source).
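
With Spark 2.x, the individual runtime jars can be staged on HDFS instead; a hedged sketch (placeholder HDFS path and object name; `spark.yarn.jars` is the Spark 2.x property):

```scala
import org.apache.spark.sql.SparkSession

// Sketch for Spark 2.x (placeholder HDFS path): point spark.yarn.jars
// at Spark's runtime jars staged on HDFS. YARN localizes them into each
// container, so worker nodes need no Spark installation of their own.
object JarsFromHdfsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("JarsFromHdfsSketch")
      .master("yarn")
      .config("spark.yarn.jars", "hdfs:///apps/spark/jars/*.jar")
      .getOrCreate()

    // ... job code ...
    spark.stop()
  }
}
```

In practice these properties are usually supplied via spark-defaults.conf or --conf on spark-submit rather than in application code, since in cluster mode the configuration must be known before the driver container is launched.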
