Pyspark: No suitable Driver error while connecting to MS SQL Server 2017 from Spark 2.4 using Python


Problem description

I am facing a problem while running a Spark job using Python (i.e. PySpark). Please see the code snippet below:

from pyspark.sql import SparkSession
from os.path import abspath
from pyspark.sql.functions import max, min, sum, col
from pyspark.sql import functions as F

spark = SparkSession.builder \
    .appName("test") \
    .config("spark.driver.extraClassPath", "/usr/dt/mssql-jdbc-6.4.0.jre8.jar") \
    .getOrCreate()
spark.conf.set("spark.sql.execution.arrow.enabled", "true")
spark.conf.set("spark.sql.session.timeZone", "Etc/UTC")
warehouse_loc = abspath('spark-warehouse')

# loading data from MS SQL Server 2017
df = spark.read.format("jdbc").options(
    url="jdbc:sqlserver://10.90.3.22;DATABASE=TransTrak_V_1.0;user=sa;password=m2m@ipcl1234",
    properties={"driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"},
    dbtable="Current_Voltage",
).load()
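
A side note on the snippet above: properties is a keyword argument of DataFrameReader.jdbc(), not an option that the format("jdbc") reader recognizes, so the driver class nested inside it likely never reaches the JDBC layer. A minimal sketch of the more common spelling, with the driver passed as a flat option (same connection details as in the question):

# Sketch (not from the original post): pass "driver" as a plain reader
# option so Spark loads com.microsoft.sqlserver.jdbc.SQLServerDriver.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://10.90.3.22;DATABASE=TransTrak_V_1.0")
    .option("dbtable", "Current_Voltage")
    .option("user", "sa")
    .option("password", "m2m@ipcl1234")
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .load()
)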

When I run this code, I get the following error:

py4j.protocol.Py4JJavaError: An error occurred while calling o38.load.
: java.sql.SQLException: No suitable driver

The same code used to run fine earlier. However, for various reasons I had to reinstall CentOS 7 and then Python 3.6. I have set Python 3.6 as the default Python in Spark, i.e. when I start pyspark the default Python is 3.6.
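
For reference, pinning the interpreter is usually done through PySpark's environment variables; a minimal sketch (the interpreter path below is illustrative, not from the original post):

import os
from pyspark.sql import SparkSession

# Sketch: when launching from a plain Python script, PYSPARK_PYTHON set
# before the session is created tells Spark which interpreter the
# executors should run; point it at your Python 3.6 installation.
os.environ["PYSPARK_PYTHON"] = "/usr/bin/python3.6"
spark = SparkSession.builder.appName("test").getOrCreate()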

Just to mention, the system default Python is Python 2.7. I am using CentOS 7.

What is going wrong here? Can anybody please help with this?

Recommended answer

OK, so after a long search, it appears that Spark probably doesn't work properly with this OpenJDK build, i.e. java-1.8.0-openjdk-1.8.0.131-11.b12.el7.x86_64. When I check the default Java, I see the following:

openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-b12)
OpenJDK 64-Bit Server VM (build 25.131-b12, mixed mode)
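
As an aside, one way to confirm which JVM a live PySpark session is actually running on is to query it over py4j; a minimal sketch, assuming an active SparkSession named spark and using the private _jvm handle:

# Sketch: ask the driver JVM for its version and vendor via py4j.
jvm = spark.sparkContext._jvm
print(jvm.java.lang.System.getProperty("java.version"))  # e.g. 1.8.0_131
print(jvm.java.lang.System.getProperty("java.vendor"))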

I then tried to install Oracle JDK 8 from the official site, but ran into separate issues with that as well. So, in a nutshell, I am not able to run Spark jobs the way I could earlier.
