How to open Spark UI when working on Google Colaboratory?

Question

How can I monitor the progress of a job through the Spark web UI when working on Google Colaboratory? When I run Spark in local mode on my own PC, I can reach the Spark UI on port 4040 by opening http://localhost:4040.

Recommended answer

Following this Colab notebook, you can do the following.

First, configure the Spark UI and start a Spark session:

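import findspark
findspark.init()
from pyspark.sql import SparkSession
from pyspark import SparkContext, SparkConf

# Move the Spark UI off its default port 4040, which ngrok's local API also uses.
conf = SparkConf().set('spark.ui.port', '4050')
sc = SparkContext(conf=conf)
# getOrCreate() reuses the SparkContext created above.
spark = SparkSession.builder.master('local[*]').getOrCreate()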
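
Before tunnelling, it can help to confirm that something is actually listening on the configured UI port. The following is a minimal sketch using only the standard library; `is_port_open` is a hypothetical helper name (not part of Spark or ngrok), and 4050 is assumed to match the `spark.ui.port` value set above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


# 4050 is assumed to be the value given to spark.ui.port.
print(is_port_open("localhost", 4050))
```

If this prints False right after the session starts, Spark may have failed to bind the port or the session may not be running.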

In the next cell run:

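# Download and unpack the ngrok binary.
!wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
!unzip ngrok-stable-linux-amd64.zip
# Start an ngrok tunnel to the Spark UI port (4050) in the background.
get_ipython().system_raw('./ngrok http 4050 &')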

This installs ngrok and creates a URL through which you can access the Spark UI (wait about 10 seconds for it to start).
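
Instead of waiting a fixed 10 seconds, one option is to poll ngrok's local API until it reports a tunnel. This is only a sketch: `fetch_tunnels` and `wait_for_tunnels` are hypothetical helper names, and the 15-second timeout and 1-second interval are arbitrary assumptions. The only URL used is the `http://localhost:4040/api/tunnels` endpoint from this answer.

```python
import json
import time
import urllib.request

NGROK_API = "http://localhost:4040/api/tunnels"


def fetch_tunnels(url: str = NGROK_API) -> dict:
    """Fetch and decode the JSON document served by ngrok's local API."""
    with urllib.request.urlopen(url, timeout=2) as resp:
        return json.load(resp)


def wait_for_tunnels(fetch=fetch_tunnels, timeout: float = 15.0,
                     interval: float = 1.0) -> list:
    """Poll until at least one tunnel is reported or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            tunnels = fetch().get("tunnels", [])
            if tunnels:
                return tunnels
        except (OSError, ValueError):
            # ngrok is not listening yet, or returned incomplete JSON.
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError("ngrok did not report a tunnel in time")
        time.sleep(interval)


# Usage in the notebook, once ngrok has been launched:
# tunnels = wait_for_tunnels()
```

The `fetch` parameter is injectable so the retry loop can be exercised without a running ngrok process.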

Now, to access the URL, call:

!curl -s http://localhost:4040/api/tunnels

This prints JSON that looks something like this (truncated):

{"tunnels":[{"name":"command_line","uri":"/api/tunnels/command_line","public_url":"https://1b881e94406c.ngrok.io","proto":"https", ... }

You're looking for the "public_url" value above; that's your Spark UI's URL.

Alternatively, run this to print only the URL:

!curl -s http://localhost:4040/api/tunnels | python3 -c "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"
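
The one-liner above takes the first entry in "tunnels". Since the API may list more than one tunnel, a slightly more defensive sketch could prefer the HTTPS entry and fail with a clear message when the list is empty. `public_url` here is a hypothetical helper, and the sample string is the truncated JSON shown earlier with the elided fields simply left out.

```python
import json


def public_url(api_response: str, prefer_proto: str = "https") -> str:
    """Return the public URL from the tunnels JSON, preferring one protocol."""
    tunnels = json.loads(api_response).get("tunnels", [])
    if not tunnels:
        raise ValueError("no tunnels reported; is ngrok running?")
    for tunnel in tunnels:
        if tunnel.get("proto") == prefer_proto:
            return tunnel["public_url"]
    # Fall back to the first tunnel, as the one-liner does.
    return tunnels[0]["public_url"]


# Sample based on the truncated output above (elided fields omitted).
sample = (
    '{"tunnels":[{"name":"command_line",'
    '"uri":"/api/tunnels/command_line",'
    '"public_url":"https://1b881e94406c.ngrok.io","proto":"https"}]}'
)
print(public_url(sample))  # prints https://1b881e94406c.ngrok.io
```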

I've tested it and it works for me.
