Trying to connect to Oracle from Spark

Question

I am trying to connect Spark to Oracle so that I can pull data from some tables and SQL queries, but I am not able to connect. I have tried different workarounds, but no luck. I followed the steps below; please correct me if I need to change anything.

I am using a Windows 7 machine and run PySpark from a Jupyter notebook. I have Python 2.7 and Spark 2.1.0. I have set a Spark class path in the environment variables:

  SPARK_CLASS_PATH = C:\Oracle\Product\11.2.0\client_1\jdbc\lib\ojdbc6.jar

jdbcDF = sqlContext.read.format("jdbc").option("driver", "oracle.jdbc.driver.OracleDriver").option("url", "jdbc:oracle://dbserver:port#/database").option("dbtable","Table_name").option("user","username").option("password","password").load()

Errors:

1. Py4JJavaError:

An error occurred while calling o148.load.
: java.sql.SQLException: Invalid Oracle URL specified

2. Py4JJavaError:

An error occurred while calling o114.load.
: java.lang.ClassNotFoundException: oracle.jdbc.driver.OracleDriver

Another scenario:

    from pyspark import SparkContext, SparkConf
    from pyspark.sql import SQLContext
    ORACLE_DRIVER_PATH = "C:\Oracle\Product\11.2.0\client_1\jdbc\lib\ojdbc7.jar"
    Oracle_CONNECTION_URL = "jdbc:oracle:thin:username/password@servername:port#/dbservicename"
    conf = SparkConf()
    conf.setMaster("local")
    conf.setAppName("Oracle_imp_exp")
    sqlContext = SQLContext(sc)
    ora_tmp = sqlContext.read.format('jdbc').options(
        url=Oracle_CONNECTION_URL,
        dbtable="tablename",
        driver="oracle.jdbc.OracleDriver"
        ).load()

I get the following error.

Error: IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':"

Please help me resolve this.

Recommended answer

The following change resolved the problem: the read goes through `spark` instead of `sqlContext`.

    sqlContext = SQLContext(sc)
    ora_tmp = spark.read.format('jdbc').options(
        url=Oracle_CONNECTION_URL,
        dbtable="tablename",
        driver="oracle.jdbc.OracleDriver"
        ).load()
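
For reference, the first two errors in the question are consistent with two separate issues. The URL `jdbc:oracle://dbserver:port#/database` lacks the `thin:@` part that the Oracle thin driver expects, and a `ClassNotFoundException` means the ojdbc jar was not visible to the JVM. The following is a minimal sketch rather than a verified fix for this exact setup. The helper names (`oracle_thin_url`, `oracle_jdbc_options`, `read_oracle_table`) are hypothetical, and it assumes that `spark` is an existing SparkSession and that the driver jar was supplied when Spark was launched (for example with `--jars` and `--driver-class-path`).

```python
def oracle_thin_url(host, port, service_name):
    """Build a thin-driver URL of the form jdbc:oracle:thin:@//host:port/service."""
    return "jdbc:oracle:thin:@//{0}:{1}/{2}".format(host, port, service_name)


def oracle_jdbc_options(host, port, service_name, table, user, password):
    """Collect the options handed to spark.read.format('jdbc')."""
    return {
        "url": oracle_thin_url(host, port, service_name),
        "dbtable": table,
        "user": user,
        "password": password,
        # Same driver class name as in the answer above.
        "driver": "oracle.jdbc.OracleDriver",
    }


def read_oracle_table(spark, options):
    """Load one table through an existing SparkSession (assumed to be `spark`)."""
    return spark.read.format("jdbc").options(**options).load()
```

A call might look like `ora_tmp = read_oracle_table(spark, oracle_jdbc_options("servername", 1521, "dbservicename", "tablename", "username", "password"))`, where every argument is a placeholder and 1521 is only the usual Oracle listener port. Passing the credentials as separate `user` and `password` options keeps them out of the URL string.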
