Running pyspark after pip install pyspark


Problem description

I wanted to install pyspark on my home machine. I ran

pip install pyspark
pip install jupyter

Both appeared to install fine.

But when I try to run pyspark, I get

pyspark
Could not find valid SPARK_HOME while searching ['/home/user', '/home/user/.local/bin']

What should SPARK_HOME be set to?

Solution

I just faced the same issue, but it turned out that pip install pyspark downloads a Spark distribution that works well in local mode. Pip just doesn't set an appropriate SPARK_HOME. But when I set it manually, pyspark works like a charm (without downloading any additional packages).

$ pip3 install --user pyspark
Collecting pyspark
  Downloading pyspark-2.3.0.tar.gz (211.9MB)
    100% |████████████████████████████████| 211.9MB 9.4kB/s 
Collecting py4j==0.10.6 (from pyspark)
  Downloading py4j-0.10.6-py2.py3-none-any.whl (189kB)
    100% |████████████████████████████████| 194kB 3.9MB/s 
Building wheels for collected packages: pyspark
  Running setup.py bdist_wheel for pyspark ... done
  Stored in directory: /home/mario/.cache/pip/wheels/4f/39/ba/b4cb0280c568ed31b63dcfa0c6275f2ffe225eeff95ba198d6
Successfully built pyspark
Installing collected packages: py4j, pyspark
Successfully installed py4j-0.10.6 pyspark-2.3.0

$ PYSPARK_PYTHON=python3 SPARK_HOME=~/.local/lib/python3.5/site-packages/pyspark pyspark
Python 3.5.2 (default, Nov 23 2017, 16:37:01) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
2018-03-31 14:02:39 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.3.0
      /_/

Using Python version 3.5.2 (default, Nov 23 2017 16:37:01)
>>>
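
To avoid hard-coding a version-specific site-packages path like the one above, you can derive SPARK_HOME from the installed package itself. A minimal sketch, assuming a POSIX shell and that pyspark is importable by python3 (the exact path printed will differ depending on your Python version and install location):

$ # Locate the pip-installed pyspark package directory and use it as SPARK_HOME
$ export SPARK_HOME="$(python3 -c 'import pyspark, os; print(os.path.dirname(pyspark.__file__))')"
$ export PYSPARK_PYTHON=python3
$ pyspark

Adding the two export lines to ~/.bashrc (or your shell's profile) makes the setting persist across sessions.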

Hope that helps :-)
