Pyspark command not recognised
Problem description
I have Anaconda installed and I have also downloaded Spark 1.6.2. I am using the instructions from this answer to configure Spark for Jupyter.
I have downloaded and unzipped the Spark directory as
~/spark
Now when I cd into this directory and into bin, I see the following:
SFOM00618927A:spark $ cd bin
SFOM00618927A:bin $ ls
beeline pyspark run-example.cmd spark-class2.cmd spark-sql sparkR
beeline.cmd pyspark.cmd run-example2.cmd spark-shell spark-submit sparkR.cmd
load-spark-env.cmd pyspark2.cmd spark-class spark-shell.cmd spark-submit.cmd sparkR2.cmd
load-spark-env.sh run-example spark-class.cmd spark-shell2.cmd spark-submit2.cmd
I have also added the environment variables mentioned in that answer to my .bash_profile and .profile.
Now, the first thing I want to check is whether the pyspark command works in the shell at all, from the spark/bin directory.
So after doing cd spark/bin, I run:
SFOM00618927A:bin $ pyspark
-bash: pyspark: command not found
As per the answer, after following all the steps I should be able to just run
pyspark
in a terminal from any directory, and it should start a Jupyter notebook with a Spark engine. But even pyspark in the shell is not working, never mind getting it to run in a Jupyter notebook.
Please advise what is going wrong here.
I ran
open .profile
in my home directory, and this is what is stored in the path:
export PATH=/Users/854319/anaconda/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/Users/854319/spark/bin
export PYSPARK_DRIVER_PYTHON=ipython
export PYSPARK_DRIVER_PYTHON_OPTS='notebook' pyspark
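For reference, a minimal sketch of what the complete block could look like once JAVA_HOME is added. The java_home call is macOS-specific, and all paths here are assumptions based on the question, not a verified configuration:

```shell
# Sketch of a ~/.profile for this setup -- paths are assumptions, adjust to your machine.

# JAVA_HOME: on macOS this resolves the installed JDK; elsewhere, hard-code the JDK path.
export JAVA_HOME=$(/usr/libexec/java_home)

# SPARK_HOME: where the Spark tarball was unzipped (~/spark in the question).
export SPARK_HOME="$HOME/spark"
export PATH="$SPARK_HOME/bin:$PATH"

# Make pyspark start an IPython/Jupyter notebook as the driver.
export PYSPARK_DRIVER_PYTHON=ipython
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'
```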
Recommended answer
1- You need to set JAVA_HOME and the Spark paths for the shell to find them. After setting them in your .profile you may want to run
source ~/.profile
to activate the settings in the current session. From your comment I can see you are already having the JAVA_HOME issue.
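The source step matters because exports written to a file do not affect the running shell until the file is read into it. A self-contained demonstration (DEMO_SPARK_HOME and the temp file are made up for illustration):

```shell
# Write an export into a throwaway file, standing in for ~/.profile.
profile_demo=$(mktemp)
echo 'export DEMO_SPARK_HOME="$HOME/spark"' > "$profile_demo"

# The variable is not set in this shell yet; sourcing the file sets it.
. "$profile_demo"
echo "$DEMO_SPARK_HOME"
```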
Note that if you have a .bash_profile or .bash_login, .profile will not work as described here.
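One common way around that (an assumption about a reasonable setup, not part of the original answer) is to have .bash_profile forward to .profile so the settings live in one place:

```shell
# In ~/.bash_profile: bash reads this instead of ~/.profile at login,
# so source ~/.profile from here to keep a single copy of the settings.
if [ -f "$HOME/.profile" ]; then
  . "$HOME/.profile"
fi
```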
2- When you are in spark/bin you need to run
./pyspark
to tell the shell that the target is in the current folder.
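The difference between pyspark and ./pyspark is just PATH lookup, which can be reproduced with a throwaway stand-in script (the temp directory and the fake pyspark script below are illustrative, not Spark itself):

```shell
# Create a stand-in for ~/spark/bin/pyspark in a temp directory.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho started\n' > "$demo_dir/pyspark"
chmod +x "$demo_dir/pyspark"

# The directory is not on PATH yet, so the bare name would fail,
# but an explicit relative path works:
cd "$demo_dir"
./pyspark

# After prepending the directory to PATH, the bare name works from anywhere.
PATH="$demo_dir:$PATH"
cd /
pyspark
```

The same reasoning covers both symptoms in the question: ./pyspark works inside spark/bin, and the bare name only works once ~/spark/bin is on PATH and the profile has actually been sourced.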