java.io.IOException: Cannot run program "python" using Spark in PyCharm (Windows)


Problem Description

I am trying to write a very simple piece of code using Spark in PyCharm, and my OS is Windows 8. I have been dealing with several problems, which I somehow managed to fix, except for one. When I run the code using pyspark.cmd everything works smoothly, but I have had no luck with the same code in PyCharm. There was a problem with the SPARK_HOME variable, which I fixed using the following code:

import sys
import os

# Tell the script where Spark lives and make its Python API importable
os.environ['SPARK_HOME'] = "C:/Spark/spark-1.4.1-bin-hadoop2.6"
sys.path.append("C:/Spark/spark-1.4.1-bin-hadoop2.6/python")
sys.path.append('C:/Spark/spark-1.4.1-bin-hadoop2.6/python/pyspark')

So now when I import pyspark, everything is fine:

from pyspark import SparkContext

The problem arises when I want to run the rest of my code:

logFile = "C:/Spark/spark-1.4.1-bin-hadoop2.6/README.md"
sc = SparkContext()
logData = sc.textFile(logFile).cache()
logData.count()

Then I get the following error:

15/08/27 12:04:15 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.io.IOException: Cannot run program "python": CreateProcess error=2, The system cannot find the file specified

I have added the Python path as an environment variable, and it works properly from the command line, but I could not figure out what the problem is with this code. Any help or comment is much appreciated.
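Since the error is just Windows failing to locate `python.exe`, a quick sanity check is to launch the bare command `python` from the same interpreter PyCharm uses (a minimal diagnostic sketch; Spark's executors spawn their Python workers with this same bare command name):

```python
import subprocess

def python_on_path():
    """Return True if the bare command 'python' can be launched from
    this process, mirroring how Spark spawns its Python workers."""
    try:
        subprocess.check_output(["python", "--version"],
                                stderr=subprocess.STDOUT)
        return True
    except OSError:
        return False

print(python_on_path())  # False here means Spark will hit the same error
```

If this prints False inside PyCharm but True in a terminal, the run configuration is not seeing the same PATH as the shell.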

Thanks

Recommended Answer

After struggling with this for two days, I figured out what the problem was. I added the following to the "PATH" variable as a Windows environment variable:

C:/Spark/spark-1.4.1-bin-hadoop2.6/python/pyspark
C:\Python27

Remember, you need to change the directories to wherever your Spark and Python are installed. I should also mention that I am using a prebuilt version of Spark that includes Hadoop.
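Editing the Windows PATH works, but the same effect can be achieved from inside the script by setting `PYSPARK_PYTHON`, the environment variable Spark reads to decide which interpreter launches its Python workers. A minimal sketch, reusing the install locations from this question (substitute your own paths), placed before the `SparkContext` is created:

```python
import os
import sys

# Example paths from this question; change them to your own installs.
os.environ["SPARK_HOME"] = "C:/Spark/spark-1.4.1-bin-hadoop2.6"
# Tell Spark explicitly which interpreter to use for worker processes,
# so it no longer depends on "python" being resolvable via PATH.
os.environ["PYSPARK_PYTHON"] = "C:/Python27/python.exe"
# Make the pyspark package importable.
sys.path.append(os.path.join(os.environ["SPARK_HOME"], "python"))
```

This keeps the fix local to the project instead of changing machine-wide environment variables.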

Good luck, everyone.

