Importing pyspark in the Python shell
Question
This is a copy of someone else's question on another forum that was never answered, so I thought I would re-ask it here, since I have the same problem. (See http://geekple.com/blogs/feeds/Xgzu7/posts/351703064084736)
I have Spark installed properly on my machine and am able to run Python programs that use the pyspark modules without error when using ./bin/pyspark as my Python interpreter.
However, when I run the regular Python shell and try to import pyspark modules, I get this error:
from pyspark import SparkContext
and it says:
"No module named pyspark".
How can I fix this? Is there an environment variable I need to set to point Python to the pyspark headers/libraries/etc.? If my Spark installation is in /spark/, which pyspark paths do I need to include? Or can pyspark programs only be run from the pyspark interpreter?
Answer
It turns out that the pyspark launcher script loads Python and sets up the correct library paths automatically. Check out $SPARK_HOME/bin/pyspark:
# Add the PySpark classes to the Python path:
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
I added this line to my .bashrc file and the modules are now correctly found!
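If you would rather not touch .bashrc, you can do the equivalent at runtime from inside a script or a plain Python shell by appending the same directory to sys.path before importing pyspark. A minimal sketch, assuming SPARK_HOME is set (falling back to the /spark/ path mentioned in the question) and that the layout matches the export line above:

```python
import os
import sys

# Locate the Spark installation; /spark is the hypothetical path from
# the question, adjust it to match your machine.
spark_home = os.environ.get("SPARK_HOME", "/spark")

# Mirror what bin/pyspark does via PYTHONPATH: put Spark's bundled
# Python sources on the import path so `import pyspark` can find them.
pyspark_path = os.path.join(spark_home, "python")
if pyspark_path not in sys.path:
    sys.path.insert(0, pyspark_path)

# After this, `from pyspark import SparkContext` should resolve,
# provided spark_home really points at a Spark installation.
```

This only changes the current interpreter session, so it is handy for one-off scripts, whereas the .bashrc export fixes every future shell.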