PySpark sqlContext read Postgres 9.6 NullPointerException


Question

I am trying to read a table from a Postgres database with PySpark. I have set up the following code and verified that the SparkContext exists:

import os

os.environ['PYSPARK_SUBMIT_ARGS'] = '--driver-class-path /tmp/jars/postgresql-42.0.0.jar --jars /tmp/jars/postgresql-42.0.0.jar pyspark-shell'


from pyspark import SparkContext, SparkConf

conf = SparkConf()
conf.setMaster("local[*]")
conf.setAppName('pyspark')

sc = SparkContext(conf=conf)


from pyspark.sql import SQLContext

properties = {
    "driver": "org.postgresql.Driver"
}
url = 'jdbc:postgresql://tom:@localhost/gqp'

sqlContext = SQLContext(sc)
sqlContext.read \
    .format("jdbc") \
    .option("url", url) \
    .option("driver", properties["driver"]) \
    .option("dbtable", "specimen") \
    .load()

I get the following error:

Py4JJavaError: An error occurred while calling o812.load. : java.lang.NullPointerException

The database is named gqp, the table is specimen, and I have verified that Postgres is running on localhost using the Postgres.app macOS app.

Recommended answer

The URL was the problem!

It was originally: url = 'jdbc:postgresql://tom:@localhost/gqp'

I removed the tom:@ part, and it worked. The URL must follow the pattern jdbc:postgresql://ip_address:port/db_name, whereas mine was copied directly from a Flask project.
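
For anyone who also starts from a Flask/SQLAlchemy-style URL, here is a minimal sketch of one way to convert it into the JDBC pattern above. The helper names to_jdbc_url and read_table are hypothetical (they are not part of PySpark or the Postgres driver), and the default port 5432 is an assumption. The user name is moved out of the URL and passed as a connection property instead, which as far as I know is how the PostgreSQL JDBC driver expects credentials.

```python
from urllib.parse import urlsplit


def to_jdbc_url(sqlalchemy_url, default_port=5432):
    """Hypothetical helper: split a 'postgresql://user:pw@host/db' URL into
    a JDBC URL without credentials plus a dict of connection properties."""
    parts = urlsplit(sqlalchemy_url)
    host = parts.hostname or "localhost"
    port = parts.port or default_port  # assumption: default Postgres port
    database = parts.path.lstrip("/")

    jdbc_url = "jdbc:postgresql://{}:{}/{}".format(host, port, database)

    # Credentials go into properties, not into the JDBC URL itself.
    properties = {"driver": "org.postgresql.Driver"}
    if parts.username:
        properties["user"] = parts.username
    if parts.password:
        properties["password"] = parts.password
    return jdbc_url, properties


def read_table(sqlContext, sqlalchemy_url, table):
    """Hypothetical wrapper around DataFrameReader.jdbc using the helper above."""
    jdbc_url, properties = to_jdbc_url(sqlalchemy_url)
    return sqlContext.read.jdbc(url=jdbc_url, table=table, properties=properties)
```

With the sqlContext from the question, read_table(sqlContext, 'postgresql://tom:@localhost/gqp', 'specimen') would then read the table through jdbc:postgresql://localhost:5432/gqp, assuming the server listens on the default port.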

If you're reading this, I hope you don't make the same mistake :)
