Sample PySpark program returns [WinError 2] The system cannot find the file
Problem description
Here is the code I am trying to run. I have set the paths for Spark, Hadoop, Java and Python, using Java 8, Spark 2.2.1 and Hadoop 2.7.5.
import random
from pyspark import SparkContext, SparkConf

conf = SparkConf().setAppName('MyFirstStandaloneApp')
sc = SparkContext(conf=conf)

NUM_SAMPLES = 20

def inside(p):
    x, y = random.random(), random.random()
    return x*x + y*y < 1

# Note: xrange exists only in Python 2; under Python 3 (which the
# Anaconda traceback below shows) this must be range.
count = sc.parallelize(range(0, NUM_SAMPLES)) \
          .filter(inside).count()
print("Pi is roughly %f" % (4.0 * count / NUM_SAMPLES))
Here is the error I receive:
Traceback (most recent call last):
File "sample1.py", line 4, in <module>
sc = SparkContext(conf=conf)
File "C:\ProgramData\Anaconda3\lib\site-packages\pyspark\context.py", line
115, in __init__
SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
File "C:\ProgramData\Anaconda3\lib\site-packages\pyspark\context.py", line
283, in _ensure_initialized
SparkContext._gateway = gateway or launch_gateway(conf)
File "C:\ProgramData\Anaconda3\lib\site-packages\pyspark\java_gateway.py",
line 80, in launch_gateway
proc = Popen(command, stdin=PIPE, env=env)
File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 997,
in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
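The last frame of the traceback is the key: PySpark's launch_gateway calls subprocess.Popen to start the Spark driver process (spark-submit), and on Windows a [WinError 2] FileNotFoundError from Popen means the executable it was asked to run could not be found on disk. A minimal sketch of the mechanism, using a deliberately made-up command name only to trigger the same error class:

```python
from subprocess import Popen

# Popen raises FileNotFoundError ([WinError 2] on Windows, ENOENT on
# POSIX) when the target executable does not exist anywhere it looks.
# "no-such-spark-submit" is a hypothetical name chosen to never resolve.
try:
    Popen(["no-such-spark-submit"])
except FileNotFoundError as e:
    print("Popen failed:", e)
```

This is why the error appears before any Spark code runs: SparkContext never gets far enough to start the JVM gateway.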
Recommended answer
Solution: I had installed Spark twice, one standalone version from Apache and one from Anaconda, which caused problems with the paths.
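With two installs (the standalone Apache download and a pip/conda pyspark package), SPARK_HOME and PATH can point at different copies, and the gateway then tries to launch a spark-submit that is not where it expects. A small diagnostic sketch, assuming the standard Windows layout, to check which install would actually be used:

```python
import os
import shutil

# Show what SPARK_HOME points to (may be None if unset).
spark_home = os.environ.get("SPARK_HOME")
print("SPARK_HOME =", spark_home)

# On Windows the gateway launches spark-submit.cmd from %SPARK_HOME%\bin;
# if this file is missing, Popen fails with [WinError 2].
if spark_home:
    submit = os.path.join(spark_home, "bin", "spark-submit.cmd")
    print("spark-submit.cmd exists:", os.path.exists(submit))

# Also check what 'pyspark' resolves to on PATH; two installs can
# shadow each other here.
print("pyspark on PATH:", shutil.which("pyspark"))
```

If SPARK_HOME and the PATH entry disagree, removing one install (or pointing SPARK_HOME at the copy you want) resolves the conflict.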