How does spark-submit.sh work with different modes and different cluster managers?


Problem Description

In Apache Spark, how does spark-submit.sh work with different modes and different cluster managers? Specifically:

In local deployment mode,

  • Does spark-submit.sh skip calling any cluster manager?
  • Is it correct that no cluster manager needs to be installed on the local machine?

In client or cluster deployment mode,

  • Does spark-submit.sh work with different cluster managers (Spark standalone, YARN, Mesos, Kubernetes)? Do different cluster managers have different interfaces, and does spark-submit.sh have to invoke them in different ways?

Does spark-submit.sh present the same interface to programmers apart from --master? (The --master option of spark-submit.sh is used to specify the cluster manager.)

Thanks.

Recommended Answer

To make things clear: there is absolutely no need to specify a cluster manager to run Spark in any mode (client, cluster, or local). The cluster manager is only there to make resource allocation easier and more independent, but it is always your choice whether or not to use one.

The spark-submit command doesn't need a cluster manager present to run.

The different ways in which you can use the command are:

1) local mode:

./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master local[8] \
  /path/to/examples.jar \
  100

2) client mode without a resource manager (also known as spark standalone mode):

./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000

3) cluster mode with spark standalone mode:

./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000

4) Client/Cluster mode with a resource manager (use --deploy-mode client for client mode):

./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 20G \
  --num-executors 50 \
  /path/to/examples.jar \
  1000

As you can see above, spark-submit.sh behaves the same way whether or not a cluster manager is present. Likewise, if you want to use a resource manager such as YARN or Mesos, the behaviour of spark-submit remains the same. You can read more about spark-submit here.
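The question also asks about Mesos and Kubernetes, which the answer above does not show. As a minimal sketch (not part of the original answer), the same command shape applies: only the --master URL and a few manager-specific settings change. The master addresses below are illustrative placeholders, and <spark-image> stands for whatever Spark container image you actually use.

Mesos, client mode (illustrative Mesos master URL):

# Submit to a Mesos master; deploy mode defaults to client
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master mesos://207.184.161.138:5050 \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000

Kubernetes, cluster mode (illustrative API server URL; the application jar uses the local:// scheme because it must already exist inside the container image):

# Submit to a Kubernetes API server; executors run as pods built from the given image
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://https://207.184.161.138:6443 \
  --deploy-mode cluster \
  --executor-memory 20G \
  --conf spark.kubernetes.container.image=<spark-image> \
  local:///path/to/examples.jar \
  1000

In both sketches the programmer-facing interface is still the same spark-submit command; the cluster-manager-specific details stay confined to the --master URL and a handful of --conf options.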
