key not found: _PYSPARK_DRIVER_CALLBACK_HOST


Question

I'm trying to run this code:

import pyspark
from pyspark.sql import SparkSession

# Parentheses let the builder chain span several lines
spark = (
    SparkSession.builder
    .master("local")
    .appName("Word Count")
    .getOrCreate()
)

df = spark.createDataFrame([
    (1, 144.5, 5.9, 33, 'M'),
    (2, 167.2, 5.4, 45, 'M'),
    (3, 124.1, 5.2, 23, 'F'),
    (4, 144.5, 5.9, 33, 'M'),
    (5, 133.2, 5.7, 54, 'F'),
    (3, 124.1, 5.2, 23, 'F'),
    (5, 129.2, 5.3, 42, 'M'),
   ], ['id', 'weight', 'height', 'age', 'gender'])

df.show()
print('Count of Rows: {0}'.format(df.count()))
print('Count of distinct Rows: {0}'.format((df.distinct().count())))

spark.stop()

and I get this error:

18/06/22 11:58:39 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[main,5,main]
java.util.NoSuchElementException: key not found: _PYSPARK_DRIVER_CALLBACK_HOST
    ...
Exception: Java gateway process exited before sending its port number

I'm using PyCharm on macOS, with Python 3.6 and Spark 2.3.1.

What is the likely cause of this error?

Recommended answer

This error is the result of a version mismatch. The environment variable referenced in the traceback (_PYSPARK_DRIVER_CALLBACK_HOST) was removed when the Py4j dependency was updated to 0.10.7, and that change was backported to the 2.3 branch in 2.3.1.

Considering the version information:

I'm using PyCharm on macOS, with Python 3.6 and Spark 2.3.1.

it looks like you have the 2.3.1 package installed, but SPARK_HOME points to an older (2.3.0 or earlier) installation.
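
One way to confirm a mismatch like this is to compare the version of the importable pyspark package with the version of the installation that SPARK_HOME points to. The following is a minimal sketch, not an official diagnostic. It assumes the Spark installation is a binary distribution with a RELEASE file whose first line starts with "Spark <version>"; the helper names (spark_home_version, pyspark_package_version, compare_versions) are made up for illustration.

```python
import os
import re


def spark_home_version(spark_home):
    """Return the Spark version named in <spark_home>/RELEASE, or None.

    Assumption: binary Spark distributions ship a RELEASE file whose
    first line looks like "Spark 2.3.0 built for Hadoop 2.7.3".
    """
    if not spark_home:
        return None
    try:
        with open(os.path.join(spark_home, "RELEASE")) as fh:
            first_line = fh.readline()
    except OSError:
        # Missing directory, missing file, or unreadable path
        return None
    match = re.search(r"Spark (\d+(?:\.\d+)+)", first_line)
    return match.group(1) if match else None


def pyspark_package_version():
    """Return the version of the importable pyspark package, or None."""
    try:
        import pyspark
    except ImportError:
        return None
    return pyspark.__version__


def compare_versions(package_version, home_version):
    """Classify the pair as 'match', 'mismatch', or 'unknown'."""
    if package_version is None or home_version is None:
        return "unknown"
    return "match" if package_version == home_version else "mismatch"


package_version = pyspark_package_version()
home_version = spark_home_version(os.environ.get("SPARK_HOME"))
print("pyspark package:", package_version)
print("SPARK_HOME     :", os.environ.get("SPARK_HOME"), "->", home_version)
print("result         :", compare_versions(package_version, home_version))
```

Run this with the same interpreter and environment variables that PyCharm uses for the failing script. If the result is "mismatch", pointing SPARK_HOME at an installation whose version matches the package should resolve the error described above.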
