Spark-sqlserver connection


Problem description

Can we connect Spark to SQL Server? If so, how? I am new to Spark. I want to connect Spark to the server and work directly against SQL Server instead of uploading .txt or .csv files. Please help, thank you.

Recommended answer

Here are some code snippets. A DataFrame is used to create the table t2 and insert data. The SQLContext is then used to load the data from the t2 table into a DataFrame. I added spark.driver.extraClassPath and spark.executor.extraClassPath to my spark-defaults.conf file so that the SQL Server JDBC driver is on the classpath.
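As a rough illustration of the classpath settings mentioned above, the entries in spark-defaults.conf might look like the fragment below. The jar location /opt/jdbc/sqljdbc4.jar is an assumption made for this example; substitute the path where the Microsoft JDBC driver jar actually lives on your driver and worker nodes.

```
# spark-defaults.conf (the jar path is a hypothetical example)
spark.driver.extraClassPath    /opt/jdbc/sqljdbc4.jar
spark.executor.extraClassPath  /opt/jdbc/sqljdbc4.jar
```

The same jar needs to be present at that path on every node that runs an executor.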

// Spark 1.4.1, run in spark-shell (sc and sqlContext are predefined)
import sqlContext.implicits._ // enables toDF(); spark-shell imports this automatically

// Insert data from a DataFrame
case class Conf(mykey: String, myvalue: String)
val data = sc.parallelize(Seq(Conf("1", "Delaware"), Conf("2", "Virginia"), Conf("3", "Maryland"), Conf("4", "South Carolina")))
val df = data.toDF()
val url = "jdbc:sqlserver://wcarroll3:1433;database=mydb;user=ReportUser;password=ReportUser"
val table = "t2"
// true = overwrite the existing contents of the table
df.insertIntoJDBC(url, table, true)

// Load from the database using SQLContext (reuses url from above)
val driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
val tbl = sqlContext.load("jdbc", Map(
  "url" -> url,
  "driver" -> driver,
  "dbtable" -> "t2",
  "partitionColumn" -> "mykey",
  "lowerBound" -> "0",
  "upperBound" -> "100",
  "numPartitions" -> "1" // read the table in a single partition
))
tbl.show()
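In Spark 1.4, insertIntoJDBC and sqlContext.load are marked as deprecated in favor of the DataFrameWriter and DataFrameReader interfaces. The following is a minimal sketch of the same round trip using those interfaces. The object and method names (JdbcRoundTrip, credentials, append, read) are made up for this example, and passing the user and password through a Properties object rather than in the URL is a choice made here, not something the original answer does.

```scala
import java.util.Properties
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical helper object; host, database and driver come from the snippet above.
object JdbcRoundTrip {
  val url = "jdbc:sqlserver://wcarroll3:1433;database=mydb"
  val driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver"

  // Builds the JDBC connection properties carrying the credentials.
  def credentials(user: String, password: String): Properties = {
    val props = new Properties()
    props.setProperty("user", user)
    props.setProperty("password", password)
    props
  }

  // Appends the rows of df to the given table via DataFrameWriter.
  def append(df: DataFrame, table: String, props: Properties): Unit =
    df.write.mode("append").jdbc(url, table, props)

  // Loads the given table into a DataFrame via DataFrameReader.
  def read(sqlContext: SQLContext, table: String, user: String, password: String): DataFrame =
    sqlContext.read.format("jdbc").options(Map(
      "url" -> url,
      "driver" -> driver,
      "dbtable" -> table,
      "user" -> user,
      "password" -> password
    )).load()
}
```

In spark-shell this could be used as JdbcRoundTrip.append(df, "t2", JdbcRoundTrip.credentials("ReportUser", "ReportUser")) followed by JdbcRoundTrip.read(sqlContext, "t2", "ReportUser", "ReportUser").show().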

Some things to consider:

Make sure the firewall allows traffic on port 1433.
If you are using a Microsoft Azure SQL Server DB, tables require a primary key. Some of the methods create the table, but Spark's code does not create the primary key, so the table creation fails.
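To check the firewall point before involving Spark at all, a plain TCP connection attempt from the driver machine can help. The sketch below uses only the JDK socket classes; the object name PortCheck and the default timeout are assumptions for illustration, and a successful connection only shows that the port is reachable, not that SQL Server will accept the login.

```scala
import java.net.{InetSocketAddress, Socket}
import scala.util.Try

// Hypothetical helper for a quick reachability test.
object PortCheck {
  // Returns true if a TCP connection to host:port succeeds within timeoutMs.
  def canConnect(host: String, port: Int, timeoutMs: Int = 3000): Boolean = {
    val socket = new Socket()
    try {
      Try(socket.connect(new InetSocketAddress(host, port), timeoutMs)).isSuccess
    } finally {
      socket.close()
    }
  }
}
```

For the server in the snippets above, the call would be PortCheck.canConnect("wcarroll3", 1433).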

Other details to take care of: https://docs.databricks.com/spark/latest/data-sources/sql-databases.html

Source: https://blogs.msdn.microsoft.com/bigdatasupport/2015/10/22/how-to-allow-spark-to-access-microsoft-sql-server/
