spark-submit: Difference between "--master local[n]" and "--master local --executor-cores m"

Question

I have a dual-core machine (with 2 threads on each core). I run a Spark job with 2 different spark-submit parameters.

spark-submit --master local[4]

spark-submit --master local --executor-cores 2

Is there really any difference between the two examples above? I am trying to get Spark to use 4 total threads for Spark "tasks", 2 threads on each physical core.

Recommended answer

First of all, neither the --executor-cores argument nor the spark.executor.cores configuration option applies in local mode. As a result:

  • --master local[4] starts Spark in local mode using four worker threads.
  • --master local starts Spark in local mode using one worker thread. --executor-cores has no effect.
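
The mapping from a local master URL to a worker-thread count can be sketched as a small function. This is only an illustrative model, not Spark's own parsing code; the name `local_worker_threads` is hypothetical, and the `local[*]` form (one worker thread per logical core) is not discussed in the answer above and is included only for completeness.

```python
import os
import re


def local_worker_threads(master: str) -> int:
    """Return the worker-thread count implied by a local-mode master URL.

    Hypothetical helper for illustration; it mirrors the rules described
    above and is not part of Spark.
    """
    if master == "local":
        # Plain "local" means a single worker thread.
        return 1
    match = re.fullmatch(r"local\[(\*|\d+)\]", master)
    if match is None:
        raise ValueError(f"not a simple local master URL: {master!r}")
    value = match.group(1)
    if value == "*":
        # "local[*]" asks for one worker thread per logical core.
        return os.cpu_count() or 1
    return int(value)


# --executor-cores is deliberately not a parameter here:
# it has no effect in local mode.
print(local_worker_threads("local[4]"))  # 4
print(local_worker_threads("local"))     # 1
```

Under this model, the two commands from the question request four worker threads and one worker thread respectively, regardless of any --executor-cores value.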

This accounts only for the "data processing" threads. The overall number of threads used by Spark can be significantly larger.

Without going into OS and scheduling details, the first option (--master local[4]) is the one you're looking for if you want to use four threads.
