How to add custom description to Spark Job for displaying in Spark Web UI


Question

When we submit an application to Spark, after performing any action the Spark Web UI displays jobs and stages with labels like count at MyJob.scala:15. But my application contains multiple count and save operations, so the UI is very difficult to understand. Instead of count at MyJob.scala:15, can we add a custom description to give more detailed information about each job?

While googling I found https://issues.apache.org/jira/browse/SPARK-3468 and https://github.com/apache/spark/pull/2342, where the author attached an image with detailed descriptions like 'Count', 'Cache and Count', and 'Job with delays'. Can we achieve the same? I am using Spark 2.0.0.

Answer

Use sc.setJobGroup(groupId, description), as shown in the code below.

Examples:

Python:

In [28]: sc.setJobGroup("my job group id", "job description goes here")
In [29]: lines = sc.parallelize([1,2,3,4])
In [30]: lines.count()
Out[30]: 4

Scala:

scala> sc.setJobGroup("my job group id", "job description goes here")
scala> val lines = sc.parallelize(List(1,2,3,4))
scala> lines.count()
res3: Long = 4

Spark UI: (the original answer attached a screenshot here, showing the jobs listed with the custom description)

I hope this is what you want.

