pyspark Py4J error using Canopy: PythonAccumulatorV2([class java.lang.String, class java.lang.Integer, class java.lang.String]) does not exist
Question
I installed the Canopy IDE on Windows, along with Python and pyspark. When I run my program, creating the SparkContext fails:
findspark.init()
conf = SparkConf().setMaster('local').setAppName('MonEssai')
sc = SparkContext.getOrCreate();
lines = sc.textFile("file:///PremiéreEssai/ file9.txt")
fun = lines.flatMap(listsGraph)
results =fun.collect()
for result1 in results:
    if(result1):
        if ((result1[0].strip().startswith("sub_"))|(result1[0].strip().startswith("start"))):
            for k in range(0,len(result1)):
                if result1[k] not in Loc:
                    Loc.append(result1[k])
        else:
            for j in range(0,len(result1)):
                if result1[j] not in Ext:
                    Ext.append(result1[j])
result3 = sc.parallelize(Ext)
ExtSimilarity= result3.map(MatchExt).filter(lambda x: x != None).collect()
#print(ExtSimilarity)
#print(Loc)
result3 = sc.parallelize(Loc)
result9= result3.map(pos_debut)
result11= result9.map(opcode)
VectOpcode= result11.flatMapValues(f).flatMap(lambda X: [((X[0],len(X[1])))]).groupByKey().mapValues(list)
VectOpcode2 = VectOpcode.collect()
I get the following error:
Py4JError: An error occurred while calling None.org.apache.spark.api.python.PythonAccumulatorV2. Trace: py4j.Py4JException: Constructor org.apache.spark.api.python.PythonAccumulatorV2([class java.lang.String, class java.lang.Integer, class java.lang.String]) does not exist
Py4JError Traceback (most recent call last)
C:\PremiéreEssai\maman.py in <module>()
818 findspark.init()
819 conf = SparkConf().setMaster('local').setAppName('MonEssai')
--> 820 sc = SparkContext.getOrCreate();
821 lines = sc.textFile("file:///PremiéreEssai/ file9.txt")
822 fun = lines.flatMap(listsGraph)
C:\Users\hene\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\pyspark\context.pyc in getOrCreate(cls, conf)
347 with SparkContext._lock:
348 if SparkContext._active_spark_context is None:
--> 349 SparkContext(conf=conf or SparkConf())
350 return SparkContext._active_spark_context
351
C:\Users\hene\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\pyspark\context.pyc in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
116 try:
117 self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,
--> 118 conf, jsc, profiler_cls)
119 except:
120 # If an error occurs, clean up in order to allow future SparkContext creation:
C:\Users\hene\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\pyspark\context.pyc in _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, jsc, profiler_cls)
187 self._accumulatorServer = accumulators._start_update_server(auth_token)
188 (host, port) = self._accumulatorServer.server_address
--> 189 self._javaAccumulator = self._jvm.PythonAccumulatorV2(host, port, auth_token)
190 self._jsc.sc().register(self._javaAccumulator)
191
C:\Users\hene\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\py4j\java_gateway.pyc in __call__(self, *args)
1523 answer = self._gateway_client.send_command(command)
1524 return_value = get_return_value(
-> 1525 answer, self._gateway_client, None, self._fqn)
1526
1527 for temp_arg in temp_args:
C:\Users\hene\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\py4j\protocol.pyc in get_return_value(answer, gateway_client, target_id, name)
330 raise Py4JError(
331 "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
--> 332 format(target_id, ".", name, value))
333 else:
334 raise Py4JError(
Py4JError: An error occurred while calling None.org.apache.spark.api.python.PythonAccumulatorV2. Trace:
py4j.Py4JException: Constructor org.apache.spark.api.python.PythonAccumulatorV2([class java.lang.String, class java.lang.Integer, class java.lang.String]) does not exist
at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:179)
at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:196)
at py4j.Gateway.invoke(Gateway.java:237)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
So I'm stuck. What should I do?
Recommended answer
While my setup is different, I ran into exactly the same error an hour ago. In my case the problem was that the pyspark version was different from the Spark version. That would fit the traceback above: pyspark calls PythonAccumulatorV2(host, port, auth_token) with three arguments, and the JVM reports that no constructor with that signature exists. You can run pip list to check your pyspark version.
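The version comparison can also be done from Python. The following is only a minimal sketch, not an official check: the helper names (spark_version_from_release, versions_match, installed_versions) are made up for illustration, and it assumes that SPARK_HOME points at a binary Spark distribution whose RELEASE file begins with a line like "Spark 2.3.1 built for Hadoop ...". If your installation has no such file, the Spark version is reported as unknown.

```python
import os
import re


def spark_version_from_release(text):
    """Extract a version such as '2.3.1' from the text of a RELEASE file.

    Assumption: the text contains something like 'Spark 2.3.1 built for ...'.
    Returns None when no version is found.
    """
    match = re.search(r"Spark\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def versions_match(pyspark_version, spark_version):
    """True only when the Spark version is known and identical to pyspark's."""
    return spark_version is not None and pyspark_version == spark_version


def installed_versions():
    """Return (pyspark_version, spark_version or None) for this machine."""
    import pyspark  # imported here so the helpers above work without pyspark

    spark_home = os.environ.get("SPARK_HOME", "")
    release_path = os.path.join(spark_home, "RELEASE")
    spark_version = None
    # Assumption: a binary distribution ships a RELEASE file in SPARK_HOME.
    if spark_home and os.path.isfile(release_path):
        with open(release_path) as handle:
            spark_version = spark_version_from_release(handle.read())
    return pyspark.__version__, spark_version
```

Passing the two values returned by installed_versions() to versions_match() shows whether they agree. If they differ, one way to line them up is to install the pyspark release that matches the Spark installation with pip (pip install pyspark== followed by that version).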