Sample Pyspark program returns [WinError 2] The system cannot find the file


Problem description

Here is the code I am trying to run. I have set the paths for spark, hadoop, java and python. Using Java 8, Spark 2.2.1 and hadoop 2.7.5.

import random
from pyspark import SparkContext, SparkConf

conf = SparkConf().setAppName('MyFirstStandaloneApp')
sc = SparkContext(conf=conf)
NUM_SAMPLES = 20

def inside(p):
    x, y = random.random(), random.random()
    return x*x + y*y < 1

# Note: range, not xrange. The traceback below shows Python 3 (Anaconda3),
# where xrange was removed; the original snippet would raise NameError there.
count = sc.parallelize(range(0, NUM_SAMPLES)) \
          .filter(inside).count()
print("Pi is roughly %f" % (4.0 * count / NUM_SAMPLES))

Here is the error I am getting:

Traceback (most recent call last):
  File "sample1.py", line 4, in <module>
    sc = SparkContext(conf=conf)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pyspark\context.py", line 115, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pyspark\context.py", line 283, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pyspark\java_gateway.py", line 80, in launch_gateway
    proc = Popen(command, stdin=PIPE, env=env)
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 997, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
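The traceback shows that the failure happens before any Spark code runs: pyspark's `launch_gateway` calls `subprocess.Popen` to start the Spark launcher, and `Popen` raises `FileNotFoundError` ([WinError 2] on Windows) whenever the executable it is asked to run cannot be found. A minimal sketch of that failure mode, using a deliberately nonexistent command name:

```python
from subprocess import PIPE, Popen

# Popen raises FileNotFoundError when the requested executable is not on
# PATH; on Windows the message is "[WinError 2] The system cannot find the
# file specified". This mirrors what launch_gateway hits when it cannot
# locate the Spark launcher script.
try:
    Popen(["no-such-command-xyz"], stdin=PIPE)
except FileNotFoundError as e:
    print("Popen failed:", e)
```

So the error means the spark-submit launcher that pyspark tried to spawn was not where the environment said it would be.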

Recommended answer

Solution: I had installed Spark twice, one standalone version from Apache and one from Anaconda, and the two installations caused conflicts in the paths. Removing one of them fixed the [WinError 2] error.
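To see which installation your environment actually resolves to before launching pyspark, a small diagnostic sketch like the following can help. It only uses the standard library; `SPARK_HOME` and the `spark-submit` names are the environment variable and launcher script that pyspark conventionally relies on:

```python
import os
import shutil

# SPARK_HOME tells pyspark where the Spark installation lives; if two
# installations exist, this may point at a different one than PATH does.
print("SPARK_HOME =", os.environ.get("SPARK_HOME"))

# shutil.which returns the first matching executable found on PATH, or
# None if there is none. On Windows the launcher is spark-submit.cmd.
for exe in ("spark-submit", "spark-submit.cmd"):
    print(exe, "->", shutil.which(exe))
```

If `SPARK_HOME` and the resolved `spark-submit` point at different installations (e.g. one under Anaconda and one under a standalone Apache download), that mismatch is a likely source of the [WinError 2] failure.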
