Apache pyspark using oracle jdbc to pull data. Driver cannot be found
Question
I am using Apache Spark pyspark (spark-1.5.2-bin-hadoop2.6) on Windows 7.
I keep getting this error when I run my Python script in pyspark:
An error occurred while calling o23.load. java.sql.SQLException: No suitable driver found for jdbc:oracle:thin:------------------------------------connection
Here is my Python file:
import os
os.environ["SPARK_HOME"] = "C:\\spark-1.5.2-bin-hadoop2.6"
os.environ["SPARK_CLASSPATH"] = "L:\\Pyspark_Snow\\ojdbc6.jar"
from pyspark import SparkContext, SparkConf
from pyspark.sql import SQLContext
spark_config = SparkConf().setMaster("local[8]")
sc = SparkContext(conf=spark_config)
sqlContext = SQLContext(sc)
df = (sqlContext
.load(source="jdbc",
url="jdbc:oracle:thin://x.x.x.x/xdb?user=xxxxx&password=xxxx",
dbtable="x.users")
)
sc.stop()
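For reference, the deprecated `sqlContext.load(source="jdbc", ...)` call above corresponds to the `DataFrameReader` form available in Spark 1.5. A minimal sketch, reusing the same placeholder URL and table name from the script (the actual `.load()` call is left commented since it requires a live Oracle instance and the driver jar):

```python
# Sketch of the equivalent, non-deprecated DataFrameReader call (Spark 1.4+).
# The URL, credentials, and table are the same placeholders as in the script above.
options = {
    "url": "jdbc:oracle:thin://x.x.x.x/xdb?user=xxxxx&password=xxxx",
    "dbtable": "x.users",
}
# df = sqlContext.read.format("jdbc").options(**options).load()
```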
Answer
Unfortunately, changing the environment variable SPARK_CLASSPATH won't work. You need to declare

spark.driver.extraClassPath L:\\Pyspark_Snow\\ojdbc6.jar

in your /path/to/spark/conf/spark-defaults.conf, or simply execute the spark-submit job with the additional argument --jars:
spark-submit --jars "L:\\Pyspark_Snow\\ojdbc6.jar" yourscript.py
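If the jar is on the classpath but the driver still cannot be located, naming the driver class explicitly in the connection properties sometimes helps. A minimal sketch, assuming the conventional Oracle thin URL form; the host, port, and SID here are placeholders, not values from the original post:

```python
# Sketch: pass the driver class explicitly alongside the JDBC URL.
# oracle.jdbc.OracleDriver is the driver class shipped in ojdbc6.jar;
# the host/port/SID below are placeholders.
url = "jdbc:oracle:thin:@//x.x.x.x:1521/xdb"
properties = {
    "user": "xxxxx",
    "password": "xxxx",
    "driver": "oracle.jdbc.OracleDriver",
}
# df = sqlContext.read.jdbc(url=url, table="x.users", properties=properties)
```

With the explicit `"driver"` entry, Spark registers the class itself instead of relying on JDBC's DriverManager auto-discovery, which is a common failure point in driver-not-found errors.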