How to set port for pyspark jupyter notebook?
Problem description
I am starting a pyspark jupyter notebook with a script:
#!/bin/bash
ipaddress=...
echo "Start notebook server at IP address $ipaddress"
function snotebook ()
{
#Spark path (based on your computer)
SPARK_PATH=/home/.../software/spark-2.3.1-bin-hadoop2.7
export PYSPARK_DRIVER_PYTHON="jupyter"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
# For Python 3 users, you have to add the line below or you will get an error
export PYSPARK_PYTHON=python3
$SPARK_PATH/bin/pyspark --master local[10]
}
snotebook --no-browser --ip $ipaddress --certfile=/home/.../local/mycert.pem --keyfile /home/.../local/mykey.key
I wonder how to set the port. Is there an environment variable that I can set? I would like to determine the port before the notebook starts. I tried --port 7999.
Answer
If you mean the Spark UI ports, spark-env.sh lists these two environment variables that you can override or set in that file:
# - SPARK_MASTER_PORT / SPARK_MASTER_WEBUI_PORT, to use non-default ports for the master
# - SPARK_WORKER_PORT / SPARK_WORKER_WEBUI_PORT, to use non-default ports for the worker
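For example, moving both web UIs off their defaults is a one-file change. This is a minimal spark-env.sh sketch; the port numbers here are arbitrary choices for illustration:

```shell
# In $SPARK_PATH/conf/spark-env.sh
# (copy conf/spark-env.sh.template to spark-env.sh if it does not exist yet)
SPARK_MASTER_WEBUI_PORT=8090   # master web UI, default is 8080
SPARK_WORKER_WEBUI_PORT=8091   # worker web UI, default is 8081
```

Spark sources this file on startup, so the change takes effect the next time the master or worker is launched.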
I'm not sure about the Jupyter values, or whether PySpark even passes them through, but if jupyter notebook --port works on its own, then I would try:
export PYSPARK_DRIVER_PYTHON_OPTS="notebook --port=7999"
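To determine the port before the notebook starts, you could hoist it into a shell variable at the top of the launcher script. A small sketch, where `NOTEBOOK_PORT` is a name invented for this example:

```shell
#!/bin/bash
# Pick the notebook port once, up front, so the rest of the script
# (and anyone reading it) sees a single source of truth.
NOTEBOOK_PORT=7999

export PYSPARK_DRIVER_PYTHON="jupyter"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook --port=${NOTEBOOK_PORT}"

# Show what pyspark's driver will be launched with.
echo "$PYSPARK_DRIVER_PYTHON_OPTS"
```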
If you want to pass all the arguments from snotebook into the variable, then you need:
export PYSPARK_DRIVER_PYTHON_OPTS="notebook $@"
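This works because inside a bash function, "$@" expands to the function's own arguments, not the script's. A self-contained sketch showing the expansion (the original script's pyspark invocation is replaced by an echo so the example runs anywhere):

```shell
#!/bin/bash
# Inside a function, $@ holds the arguments the function was called with,
# so any flags passed to snotebook end up in the driver options.
function snotebook ()
{
    export PYSPARK_DRIVER_PYTHON="jupyter"
    export PYSPARK_DRIVER_PYTHON_OPTS="notebook $@"
    # Stand-in for: $SPARK_PATH/bin/pyspark --master local[10]
    echo "$PYSPARK_DRIVER_PYTHON_OPTS"
}

snotebook --no-browser --port 7999
```

Note that in the original script the flags passed to snotebook were silently dropped, because nothing inside the function forwarded them; the $@ in the exported variable is what fixes that.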