Performance issues for Spark on YARN


Problem Description

We are trying to run our Spark cluster on YARN, and we are seeing some performance issues, especially compared to standalone mode.

We have a cluster of 5 nodes, each with 16 GB of RAM and 8 cores. We have configured the minimum container size as 3 GB and the maximum as 14 GB in yarn-site.xml. When submitting the job to the yarn-cluster master we pass number of executors = 10 and executor memory = 14 GB. According to my understanding, our job should be allocated 4 containers of 14 GB each, but the Spark UI shows only 3 containers of 7.2 GB each.
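One likely reason the full request cannot be granted: on YARN, each executor container asks for the executor memory plus an off-heap overhead, and YARN then rounds the request up to a multiple of yarn.scheduler.minimum-allocation-mb. A back-of-the-envelope sketch of that arithmetic, assuming the Spark 1.x documented overhead default of max(384 MB, ~7% of executor memory):

object ContainerMath {
  // assumed Spark 1.x default: the larger of 384 MB or ~7% of executor memory
  def overheadMb(executorMemMb: Int): Int =
    math.max(384, (executorMemMb * 0.07).toInt)

  // YARN normalizes each request up to a multiple of the minimum allocation
  def normalizeMb(requestMb: Int, minAllocMb: Int): Int =
    math.ceil(requestMb.toDouble / minAllocMb).toInt * minAllocMb

  def main(args: Array[String]): Unit = {
    val request = 14336 + overheadMb(14336)  // 14 GB executor -> about 15339 MB
    println(normalizeMb(request, 3072))      // -> 15360 MB, more than a 16 GB node can spare
  }
}

A request that large exceeds both the configured 14 GB maximum and what a 16 GB node can offer alongside the OS and the ApplicationMaster, so YARN grants fewer executors than asked for.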

We are unable to control the number of containers and the resources allocated to them, which hurts performance compared to standalone mode.

Can you offer any pointers on how to optimize performance on YARN?

This is the command I use to submit the job:

$SPARK_HOME/bin/spark-submit --class "MyApp" --master yarn-cluster --num-executors 10 --executor-memory 14g  target/scala-2.10/my-application_2.10-1.0.jar  
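Note that in yarn-cluster mode the driver also runs inside a YARN container, and every executor request includes the memory overhead on top of --executor-memory. If the default overhead is unsuitable, it can be set explicitly; a minimal sketch for the driver/shell, assuming the Spark 1.x configuration key (value in MB):

import org.apache.spark.{SparkConf, SparkContext}

// raise the per-executor overhead explicitly
// ("spark.yarn.executor.memoryOverhead" is the Spark 1.x key; value in MB)
val conf = new SparkConf()
  .setAppName("MyApp")
  .set("spark.yarn.executor.memoryOverhead", "1024")
val sc = new SparkContext(conf)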

Following the discussion, I changed my yarn-site.xml file and also the spark-submit command.

Here is the new yarn-site.xml:

<property>
<name>yarn.resourcemanager.hostname</name>
<value>hm41</value>
</property>

<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>14336</value>
</property>

<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>2560</value>
</property>

<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>13312</value>
</property>
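Under this configuration, each node offers 14336 MB to YARN, and requests are normalized up to multiples of 2560 MB (assuming the default resource calculator). A 10 GB executor plus a version-dependent overhead of roughly 384 to 1024 MB therefore lands in a container of about 12800 MB, so each node can host exactly one executor, with one node's remaining capacity going to the ApplicationMaster.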

And here is the new spark-submit command:

$SPARK_HOME/bin/spark-submit --class "MyApp" --master yarn-cluster --num-executors 4 --executor-memory  10g --executor-cores 6   target/scala-2.10/my-application_2.10-1.0.jar 

With this I am able to get 6 cores on each machine, but the memory usage of each node is still only around 5 GB. I have attached screenshots of the Spark UI and htop.
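The ~5 GB figure from htop is not surprising by itself: YARN reserves the full container size, but the JVM's resident set only grows as the heap is actually touched, so a mostly idle 10 GB heap shows up much smaller in htop. To see what Spark itself thinks each executor has available for caching, one can query the context from inside the application (a sketch; sc is the live SparkContext):

// per executor: max memory available for caching, and what is still free
sc.getExecutorMemoryStatus.foreach { case (executor, (maxMem, remaining)) =>
  println(s"$executor: max=${maxMem >> 20} MB, remaining=${remaining >> 20} MB")
}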

Answer

The memory (7.2 GB) you see in the Spark UI is the storage memory, i.e. the executor heap scaled down by spark.storage.memoryFraction, which defaults to 0.6. As for your missing executors, you should look in the YARN ResourceManager logs.
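For reference, a rough model of how Spark 1.x's static memory management arrives at the figure shown in the UI (the exact slice the JVM reserves varies, so the 0.95 factor below is an approximation):

// rough model of the Spark 1.x static storage-memory calculation
val heapMb         = 14 * 1024      // --executor-memory 14g
val usableHeapMb   = heapMb * 0.95  // Runtime.maxMemory sits a bit below -Xmx (approximation)
val memoryFraction = 0.6            // spark.storage.memoryFraction default
val safetyFraction = 0.9            // spark.storage.safetyFraction default
println(f"${usableHeapMb * memoryFraction * safetyFraction / 1024}%.1f GB")  // prints ~7.2 GB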

