What is 'Active Jobs' in the Spark History Server Spark UI Jobs section


Problem description



I'm trying to understand the Spark History Server components. I know that the History Server shows completed Spark applications.

However, I see 'Active Jobs' set to 1 for a completed Spark application. I'm trying to understand what 'Active Jobs' means in the Jobs section. Also, the application completed within 30 minutes, but when I opened the History Server 8 hours later, 'Duration' showed 8.0h. Please see the screenshot.

Could you please help me understand the 'Active Jobs', 'Duration' and 'Stages: Succeeded/Total' items in the image above?

Solution

After some research, I finally found the answer to my question.

A Spark application consists of a driver and one or more executors. The driver program instantiates the SparkContext, which coordinates the executors to run the Spark application. Jobs that the driver has started and that have not yet been reported as finished are displayed in the 'Active Jobs' section of the Spark History Server web UI.

The executors run tasks assigned by the driver.

When a Spark application runs on YARN, it has its own implementation of the YARN client and the YARN application master. A YARN application has a YARN client, a YARN application master, and a list of containers running on the node managers.

In my case the application runs on YARN with the driver program running as a thread of the YARN application master, which corresponds to YARN cluster deploy mode rather than Spark's standalone mode. The YARN client pulls status from the application master, and the application master coordinates the containers that run the tasks.

While the job is running, it can be monitored on the YARN applications page in the Cloudera Manager Admin Console.

If the application succeeds, the History Server shows the list of 'Completed Jobs', and the 'Active Jobs' section is removed.

If the application fails at the container level and YARN communicates this to the driver, the History Server shows the list of 'Failed Jobs', and the 'Active Jobs' section is removed.

However, if the application fails at the container level and YARN cannot communicate that to the driver, the job that the driver started is left in limbo. The driver thinks the job is still running and keeps waiting to hear the job status from the YARN application master. Hence, in the History Server, the job still shows up under 'Active Jobs' as running.
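One way to check whether an application ended like this is to scan its event log for jobs that were started but never closed. The sketch below is only an illustration under stated assumptions: event logging is enabled (`spark.eventLog.enabled`), the log is an uncompressed file with one JSON object per line, and job boundaries appear as `SparkListenerJobStart` and `SparkListenerJobEnd` events carrying a `Job ID` field. The function name `unfinished_jobs` and the sample lines are made up for the example.

```python
import json
from typing import Iterable, List


def unfinished_jobs(event_lines: Iterable[str]) -> List[int]:
    """Return the IDs of jobs that have a start event but no end event."""
    started = set()
    ended = set()
    for line in event_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        kind = event.get("Event")
        if kind == "SparkListenerJobStart":
            started.add(event["Job ID"])
        elif kind == "SparkListenerJobEnd":
            ended.add(event["Job ID"])
    return sorted(started - ended)


# Hypothetical sample log: job 0 finished, job 1 never reported an end.
sample = [
    '{"Event": "SparkListenerJobStart", "Job ID": 0}',
    '{"Event": "SparkListenerJobEnd", "Job ID": 0}',
    '{"Event": "SparkListenerJobStart", "Job ID": 1}',
]
print(unfinished_jobs(sample))
```

For a real log, pass an open file handle instead of the sample list; the file lives wherever `spark.eventLog.dir` points on your cluster. A job ID returned here is a candidate for the kind of entry that stays listed under 'Active Jobs'.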

So my takeaway from this is: to check the status of a running job, go to the YARN applications page in the Cloudera Manager Admin Console or use the YARN CLI. After the job completes or fails, open the Spark History Server to get more details on resource usage, the DAG, and the execution timeline.
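As a rough sketch of the YARN CLI route, the helpers below build the `yarn application -status` and `yarn application -list -appStates RUNNING` command lines and, optionally, run them. The helper names and the application ID are placeholders for illustration, and `run` assumes the `yarn` executable is on the PATH of the machine where the script is executed.

```python
import subprocess
from typing import List


def yarn_status_command(application_id: str) -> List[str]:
    """Build the argv for `yarn application -status <application_id>`."""
    return ["yarn", "application", "-status", application_id]


def yarn_running_apps_command() -> List[str]:
    """Build the argv that lists applications YARN reports as RUNNING."""
    return ["yarn", "application", "-list", "-appStates", "RUNNING"]


def run(command: List[str]) -> str:
    """Run a command and return its stdout (assumes `yarn` is on PATH)."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


# Placeholder application ID; substitute the ID shown on the YARN page.
print(" ".join(yarn_status_command("application_1234567890123_0001")))
print(" ".join(yarn_running_apps_command()))
```

Calling `run(yarn_status_command(app_id))` on a cluster node returns the status report that YARN itself holds for the application, which can then be compared with what the History Server shows.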
