Spark + EMR using Amazon's "maximizeResourceAllocation" setting does not use all cores/vcores

Problem Description

I'm running an EMR cluster (version emr-4.2.0) for Spark using the Amazon specific maximizeResourceAllocation flag as documented here. According to those docs, "this option calculates the maximum compute and memory resources available for an executor on a node in the core node group and sets the corresponding spark-defaults settings with this information".
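For reference, this flag is supplied as an EMR configuration classification at cluster creation time. Below is a minimal sketch that writes such a configuration to a file usable with `aws emr create-cluster --configurations file://...`; the classification and property names are the documented ones, while the file name is illustrative.

```python
import json

# EMR configuration classification that enables maximizeResourceAllocation
# (classification and property names as documented by AWS; the file name is
# illustrative).
configurations = [
    {
        "Classification": "spark",
        "Properties": {"maximizeResourceAllocation": "true"},
    },
]

with open("emr-config.json", "w") as f:
    json.dump(configurations, f, indent=2)
```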

I'm running the cluster using m3.2xlarge instances for the worker nodes. I'm using a single m3.xlarge for the YARN master - the smallest m3 instance I can get it to run on, since it doesn't do much.

The situation is this: When I run a Spark job, the number of requested cores for each executor is 8. (I only got this after configuring "yarn.scheduler.capacity.resource-calculator": "org.apache.hadoop.yarn.util.resource.DominantResourceCalculator" which isn't actually in the documentation, but I digress). This seems to make sense, because according to these docs an m3.2xlarge has 8 "vCPUs". However, on the actual instances themselves, in /etc/hadoop/conf/yarn-site.xml, each node is configured to have yarn.nodemanager.resource.cpu-vcores set to 16. I would (at a guess) think that must be due to hyperthreading or perhaps some other hardware fanciness.
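For completeness, the resource-calculator override mentioned above can be supplied the same way, through the capacity-scheduler classification. A sketch, meant to be merged into the same configurations list as the spark classification shown earlier:

```python
# Classification that switches YARN's capacity scheduler to the
# DominantResourceCalculator, so that vcores (and not just memory) are taken
# into account when sizing containers; the property value is the one quoted
# in the question.
capacity_scheduler_configuration = {
    "Classification": "capacity-scheduler",
    "Properties": {
        "yarn.scheduler.capacity.resource-calculator":
            "org.apache.hadoop.yarn.util.resource.DominantResourceCalculator",
    },
}
```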

So the problem is this: when I use maximizeResourceAllocation, I get the number of "vCPUs" that the Amazon Instance type has, which seems to be only half of the number of configured "VCores" that YARN has running on the node; as a result, the executor is using only half of the actual compute resources on the instance.

Is this a bug in Amazon EMR? Are other people experiencing the same problem? Is there some other magic undocumented configuration that I am missing?

Recommended Answer

Okay, after a lot of experimentation, I was able to track down the problem. I'm going to report my findings here to help people avoid frustration in the future.

  • While there is a discrepancy between the 8 cores asked for and the 16 VCores that YARN knows about, this doesn't seem to make a difference: YARN isn't using cgroups or anything fancy to actually limit how many CPUs the executor can use.
  • "Cores" on the executor is a bit of a misnomer. It is really how many concurrent tasks the executor will willingly run at any one time; it essentially boils down to how many threads are doing "work" on each executor.
  • When maximizeResourceAllocation is set and you run a Spark program, it sets the property spark.default.parallelism to the total number of instance cores (or "vCPUs") across all the non-master instances that were in the cluster at creation time. This is probably too small even in normal cases; I've heard it recommended to set this to 4x the number of cores you will have to run your jobs. This helps make sure that there are enough tasks available during any given stage to keep the CPUs on all executors busy.
  • When you have data that comes from different runs of different Spark programs, it (in RDD or Parquet format or whatever) is quite likely to have been saved with a varying number of partitions. When running a Spark program, make sure you repartition the data either at load time or before a particularly CPU-intensive task. Since you have access to the spark.default.parallelism setting at runtime, it is a convenient number to repartition to (see the sketch after this list).
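As a rough illustration of the last two points, here is a minimal PySpark sketch; the S3 paths, the `upper()` stand-in for a CPU-heavy transform, and the 4x multiplier applied to the runtime parallelism are all illustrative assumptions, not taken from the original answer.

```python
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("repartition-before-heavy-work")
sc = SparkContext(conf=conf)

# With maximizeResourceAllocation, spark.default.parallelism defaults to the
# total instance-core count of the non-master nodes at cluster creation time.
parallelism = sc.defaultParallelism

# Data written by earlier jobs keeps whatever partitioning it was saved with,
# so repartition before a CPU-intensive stage to keep every executor thread busy.
lines = sc.textFile("s3://my-bucket/input/")           # hypothetical input path
repartitioned = lines.repartition(parallelism * 4)     # ~4x cores, per the advice above

result = repartitioned.map(lambda line: line.upper())  # stand-in for a CPU-heavy transform
result.saveAsTextFile("s3://my-bucket/output/")        # hypothetical output path
```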

TL;DR

  1. maximizeResourceAllocation will do almost everything for you correctly except...
  2. You probably want to explicitly set spark.default.parallelism to 4x the number of instance cores you want the job to run on, on a per-"step" (in EMR speak) / "application" (in YARN speak) basis, i.e. set it every time (see the sketch after this list), and...
  3. Make sure within your program that your data is appropriately partitioned (i.e. you want many partitions) to allow Spark to parallelize it properly.
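A sketch of point 2, setting the parallelism explicitly for a single application; the node and core counts are purely hypothetical (ten m3.2xlarge core nodes with 8 instance cores each).

```python
from pyspark import SparkConf, SparkContext

# Hypothetical cluster: 10 core nodes x 8 instance cores, times the suggested 4x factor.
explicit_parallelism = 4 * 8 * 10

conf = (SparkConf()
        .setAppName("my-step")
        .set("spark.default.parallelism", str(explicit_parallelism)))
sc = SparkContext(conf=conf)
```

The same value can also be passed per step on the command line, e.g. `spark-submit --conf spark.default.parallelism=320 ...`, so it is set every time without code changes.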
