Boosting spark.yarn.executor.memoryOverhead


Question

I'm trying to run a (py)Spark job on EMR that will process a large amount of data. Currently my job is failing with the following error message:

Reason: Container killed by YARN for exceeding memory limits.
5.5 GB of 5.5 GB physical memory used.
Consider boosting spark.yarn.executor.memoryOverhead.
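For context on where the 5.5 GB figure comes from: the container a Spark executor requests from YARN is its heap (`spark.executor.memory`) plus the memory overhead, which by default is the larger of 384 MB and 10% of the executor memory. The arithmetic below is a plain-Python sketch (not a Spark API) showing how a 5 GB executor lands at exactly 5.5 GB:

```python
def yarn_container_mb(executor_memory_mb, overhead_mb=None):
    """Total memory a Spark executor requests from YARN, in MB."""
    if overhead_mb is None:
        # Default overhead: max(384 MB, 10% of executor memory)
        overhead_mb = max(384, executor_memory_mb // 10)
    return executor_memory_mb + overhead_mb

# A 5 GB executor with the default overhead requests 5120 + 512 = 5632 MB,
# i.e. exactly the 5.5 GB limit reported in the error message.
print(yarn_container_mb(5 * 1024))  # → 5632
```

Raising `spark.yarn.executor.memoryOverhead` increases the second term, so the whole container request grows and YARN stops killing the executor at the old limit.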

So I googled how to do this, and found that I should pass the spark.yarn.executor.memoryOverhead parameter with the --conf flag. I'm doing it this way:

aws emr add-steps\
--cluster-id %s\
--profile EMR\
--region us-west-2\
--steps Name=Spark,Jar=command-runner.jar,\
Args=[\
/usr/lib/spark/bin/spark-submit,\
--deploy-mode,client,\
/home/hadoop/%s,\
--executor-memory,100g,\
--num-executors,3,\
--total-executor-cores,1,\
--conf,'spark.python.worker.memory=1200m',\
--conf,'spark.yarn.executor.memoryOverhead=15300',\
],ActionOnFailure=CONTINUE" % (cluster_id,script_name)\

But when I rerun the job it keeps giving me the same error message, with the same 5.5 GB of 5.5 GB physical memory used, which implies that my memory did not increase. Any hints on what I am doing wrong?

Edit

Here are details on how I initially created the cluster:

aws emr create-cluster\
--name "Spark"\
--release-label emr-4.7.0\
--applications Name=Spark\
--bootstrap-action Path=s3://emr-code-matgreen/bootstraps/install_python_modules.sh\
--ec2-attributes KeyName=EMR2,InstanceProfile=EMR_EC2_DefaultRole\
--log-uri s3://emr-logs-zerex\
--instance-type r3.xlarge\
--instance-count 4\
--profile EMR\
--service-role EMR_DefaultRole\
--region us-west-2

Thanks.

Answer

After a couple of hours I found the solution to this problem. When creating the cluster, I needed to pass the following flag as a parameter:

--configurations file://./sparkConfig.json\

With the JSON file containing:

[
    {
      "Classification": "spark-defaults",
      "Properties": {
        "spark.executor.memory": "10G"
      }
    }
  ]
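If you build the cluster from a script (as the %s placeholders in the add-steps command suggest), the same configuration can be written out programmatically before calling create-cluster. This is a plain-Python sketch; the filename sparkConfig.json is taken from the flag above, everything else is standard library:

```python
import json

# The spark-defaults classification shown in the answer above
spark_config = [
    {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.executor.memory": "10G",
        },
    }
]

# Write the file that --configurations file://./sparkConfig.json will pick up
with open("sparkConfig.json", "w") as f:
    json.dump(spark_config, f, indent=2)
```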

This allowed me to increase the memoryOverhead in the next step using the --conf parameter I initially posted.

