Apache Spark Connector for SQL Server and Azure SQL
Problem description
I'm trying to write data from Azure Databricks to Azure SQL using this connector - com.microsoft.azure:spark-mssql-connector_2.12_3.0:1.0.0 - but I'm getting the error message below:
Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 4, 10.139.64.4, executor 0): java.lang.NoClassDefFoundError: com/microsoft/sqlserver/jdbc/ISQLServerBulkData
Does this Spark connector work from Azure Databricks to Azure SQL? Has anyone tested this?
Recommended answer
For some reason, Microsoft publishes only the jar for the connector itself, without the JDBC driver that the connector requires. If you build the code yourself, the build system produces an assembly artifact that would work, but it isn't published. You can work around the problem by adding the JDBC driver explicitly (the com.microsoft.sqlserver:mssql-jdbc:8.4.1.jre8 coordinate), like this (for spark-submit/spark-shell/pyspark, etc.):
bin/spark-shell --packages \
com.microsoft.azure:spark-mssql-connector_2.12_3.0:1.0.0-alpha,com.microsoft.sqlserver:mssql-jdbc:8.4.1.jre8
or add both libraries through the UI.
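For reference, once both packages are attached to the cluster, a write goes through the connector's data source. A minimal PySpark sketch - the server, database, table, and credential values are hypothetical placeholders, and the format name is the one the connector registers:

```python
# Sketch of writing a DataFrame to Azure SQL via the spark-mssql-connector.
# Assumes the connector and mssql-jdbc driver are both on the cluster.

CONNECTOR_FORMAT = "com.microsoft.sqlserver.jdbc.spark"  # data source registered by the connector

def connector_options(server, database, table, user, password):
    """Build the option map for the connector's DataFrame writer."""
    return {
        "url": f"jdbc:sqlserver://{server}:1433;databaseName={database}",
        "dbtable": table,
        "user": user,
        "password": password,
    }

def write_to_azure_sql(df, server, database, table, user, password):
    """Append a Spark DataFrame to an Azure SQL table via the connector."""
    (df.write
       .format(CONNECTOR_FORMAT)
       .mode("append")
       .options(**connector_options(server, database, table, user, password))
       .save())
```

If the NoClassDefFoundError persists with this setup, it usually means the mssql-jdbc coordinate was not actually attached to the executors, since ISQLServerBulkData lives in the driver jar, not in the connector jar.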