How does Apache Zeppelin compute the Spark job progress bar?


Question

When starting a Spark job from the Apache Zeppelin notebook interface, it shows you a progress bar for the job execution. But what does this progress actually mean? Sometimes it shrinks or expands. Is it the progress of the current stage, or of the whole job?

Answer

In the web interface, the progress bar shows the value returned by the getProgress function (which is not implemented for every interpreter, e.g. Python).

This function returns a percentage.

When using the Spark interpreter, the value appears to be the percentage of completed tasks, computed by the following progress function from JobProgressUtil (spark-scala-parent/src/main/scala/org/apache/zeppelin/spark/JobProgressUtil.scala in the Zeppelin source):

def progress(sc: SparkContext, jobGroup : String):Int = {
    val jobIds = sc.statusTracker.getJobIdsForGroup(jobGroup)
    val jobs = jobIds.flatMap { id => sc.statusTracker.getJobInfo(id) }
    val stages = jobs.flatMap { job =>
      job.stageIds().flatMap(sc.statusTracker.getStageInfo)
    }

    val taskCount = stages.map(_.numTasks).sum
    val completedTaskCount = stages.map(_.numCompletedTasks).sum
    if (taskCount == 0) {
      0
    } else {
      (100 * completedTaskCount.toDouble / taskCount).toInt
    }
}
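This also explains why the bar can shrink: the percentage is computed over all stages of all jobs in the group known at that moment, so when a new job or stage registers, the task total grows and the ratio drops. Here is a minimal sketch of that arithmetic in plain Scala (no Spark required); the StageCounts class and the example counts are made up for illustration, only the percentage formula mirrors JobProgressUtil:

```scala
// Sketch of the percentage math used by JobProgressUtil.progress,
// applied to hypothetical per-stage task counts (no SparkContext needed).
object ProgressSketch {
  // Hypothetical stand-in for Spark's SparkStageInfo task counters.
  case class StageCounts(numTasks: Int, numCompletedTasks: Int)

  def progress(stages: Seq[StageCounts]): Int = {
    val taskCount = stages.map(_.numTasks).sum
    val completedTaskCount = stages.map(_.numCompletedTasks).sum
    if (taskCount == 0) 0
    else (100 * completedTaskCount.toDouble / taskCount).toInt
  }

  def main(args: Array[String]): Unit = {
    // Two stages known so far: 8 of 10 tasks done overall -> 80%.
    val before = Seq(StageCounts(4, 4), StageCounts(6, 4))
    println(progress(before)) // 80
    // A new 10-task stage registers: still 8 done, now out of 20 -> 40%,
    // so the bar visibly shrinks even though work only moved forward.
    val after = before :+ StageCounts(10, 0)
    println(progress(after)) // 40
  }
}
```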

That said, I could not find this behavior specified in the Zeppelin documentation.
