Hadoop Capacity Scheduler and Spark


Problem description


If I define CapacityScheduler Queues in yarn as explained here

http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html

How do I make use of it?


I want to run Spark jobs, but they should not take up the whole cluster; instead, they should execute in a CapacityScheduler queue that has a fixed set of resources allocated to it.


Is that possible, specifically on the Cloudera platform (given that Spark on Cloudera runs on YARN)?

Recommended answer

  1. Configure the CapacityScheduler as you need by editing capacity-scheduler.xml. You also need to set yarn.resourcemanager.scheduler.class in yarn-site.xml to org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler, which is also the default on current Hadoop versions (see the configuration sketch after this list).
  2. Submit your Spark jobs to the designated queue.
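
For reference, here is a minimal sketch of the two files, assuming a child queue named thequeue under root; the queue name and the capacity percentages are illustrative, not prescribed:

<!-- yarn-site.xml: select the CapacityScheduler (the default on current Hadoop versions) -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

<!-- capacity-scheduler.xml: declare thequeue next to default and cap it at 25% of the cluster -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>default,thequeue</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.default.capacity</name>
  <value>75</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.thequeue.capacity</name>
  <value>25</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.thequeue.maximum-capacity</name>
  <value>25</value>
</property>

Note that the capacities under root must sum to 100. With maximum-capacity equal to capacity the queue is hard-capped; setting maximum-capacity higher would let the queue borrow idle resources from other queues.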

For example:

$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn \
    --deploy-mode cluster \
    --driver-memory 4g \
    --executor-memory 2g \
    --executor-cores 1 \
    --queue thequeue \
    lib/spark-examples*.jar \
    10

The --queue flag indicates the queue to which you will submit; it should match your CapacityScheduler configuration.
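
Alternatively, Spark on YARN also reads the target queue from the spark.yarn.queue property (which defaults to default), so a sketch of the same submission using --conf instead of --queue would be:

$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn \
    --deploy-mode cluster \
    --conf spark.yarn.queue=thequeue \
    lib/spark-examples*.jar \
    10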

