How to configure a monopolistic FIFO application queue in YARN?

Question

I need to disable parallel execution of YARN applications in a Hadoop cluster. With YARN's default settings, several jobs can run in parallel. I see no advantage in this, because the jobs running side by side each run slower.

I found the setting yarn.scheduler.capacity.maximum-applications, which limits the maximum number of applications, but it affects both submitted and running apps (as stated in the docs). I'd like to keep submitted apps in the queue until the currently running application has finished. How can this be done?

Recommended answer

1) Change the scheduler to FairScheduler

Hadoop distributions use CapacityScheduler by default (Cloudera uses FairScheduler as its default scheduler). Add this property to yarn-site.xml:

<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>

2) Set the default queue

Fair Scheduler creates a queue per user. That is, if three different users submit jobs, three individual queues are created and the resources are shared among those three queues. Disable this behavior by adding this property to yarn-site.xml:

<property>
  <name>yarn.scheduler.fair.user-as-default-queue</name>
  <value>false</value>
</property>

This ensures that all jobs go into a single default queue.

3) Limit the maximum number of applications

Now that all jobs are confined to a single default queue, restrict the maximum number of applications that can run in that queue to 1.

Create a file named fair-scheduler.xml under $HADOOP_CONF_DIR and add these entries:

<allocations>
   <queueMaxAppsDefault>1</queueMaxAppsDefault>
</allocations>
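As a variant of the allocation file above, the limit can also be set on the default queue itself rather than as a cluster-wide default. The following is a minimal sketch, assuming the installed Hadoop version supports the per-queue maxRunningApps and schedulingPolicy elements described in the Fair Scheduler documentation. The explicit fifo policy is an optional addition that is not part of the original answer.

```xml
<?xml version="1.0"?>
<allocations>
  <!-- Explicitly declare the single queue that all jobs are routed to -->
  <queue name="default">
    <!-- Allow only one application to run at a time in this queue -->
    <maxRunningApps>1</maxRunningApps>
    <!-- Optional (assumption, not in the original answer): order
         applications first-in, first-out within this queue -->
    <schedulingPolicy>fifo</schedulingPolicy>
  </queue>
</allocations>
```

With a per-queue limit, other queues added later would keep their own limits, whereas queueMaxAppsDefault applies to every queue that does not override it.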

Also, add this property to yarn-site.xml:

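<property>
  <name>yarn.scheduler.fair.allocation.file</name>
  <!-- $HADOOP_CONF_DIR is a placeholder here: substitute the absolute path
       of your Hadoop configuration directory, since a shell-style variable
       is likely not expanded in this value -->
  <value>$HADOOP_CONF_DIR/fair-scheduler.xml</value>
</property>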

Restart the YARN services after adding these properties.

When multiple applications are submitted, the application that is ACCEPTED first becomes the active application, and the rest are queued as pending applications. The pending applications remain in the ACCEPTED state until the RUNNING application is FINISHED. The active application is allowed to use all available resources.
