Using predicates in Spark JDBC read
Question
I am pulling data from SQL Server to HDFS. Here is my snippet for that:
val predicates = Array[String]("int_id < 500000", "int_id >= 500000 && int_id < 1000000")
val jdbcDF = spark.read.format("jdbc")
  .option("url", dbUrl)
  .option("databaseName", "DatabaseName")
  .option("dbtable", table)
  .option("user", "***")
  .option("password", "***")
  .option("predicates", predicates)
  .load()
My IntelliJ IDE keeps reporting
"Type mismatch, expected Boolean or Long or Double or String, actual: Array[String]"
on the predicates option. I am not sure what is wrong with this. Can anyone see the problem? Also, how do I use fetch size here?
Thanks.
Recommended answer
The option method accepts only Booleans, Longs, Doubles or Strings. To pass the predicates as an Array[String], you have to use the jdbc method instead of specifying them through the format method.
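import java.util.Properties
// Each predicate is a SQL WHERE-clause fragment, so use AND rather than &&.
val predicates = Array[String]("int_id < 500000", "int_id >= 500000 AND int_id < 1000000")
val connectionProperties = new Properties() // user, password, db, etc.
connectionProperties.setProperty("user", "***")
connectionProperties.setProperty("password", "***")
connectionProperties.setProperty("databaseName", "DatabaseName")
val jdbcDF = spark.read.jdbc(
  url = dbUrl,
  table = table,
  predicates = predicates,
  connectionProperties = connectionProperties
)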
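The question also asks about fetch size, and hand-written predicates get tedious once there are more than two ranges. The following is only a minimal sketch under stated assumptions: rangePredicates and connectionProps are hypothetical helper names, not part of Spark, and the fetch size is set through the "fetchsize" key, which is the option name used by Spark's JDBC data source. It assumes int_id is a non-null integer column; rows outside the generated ranges (for example NULL or negative ids) would not be read.

```scala
import java.util.Properties

object JdbcReadSketch {
  // Hypothetical helper: splits [lower, upper) into half-open ranges of width
  // `step` and renders each one as a SQL WHERE-clause fragment.
  def rangePredicates(column: String, lower: Long, upper: Long, step: Long): Array[String] = {
    require(step > 0, "step must be positive")
    (lower until upper by step).map { start =>
      val end = math.min(start + step, upper)
      s"$column >= $start AND $column < $end"
    }.toArray
  }

  // Hypothetical helper: builds the connection properties, including the
  // "fetchsize" key (rows fetched per round trip by the JDBC driver).
  def connectionProps(user: String, password: String, fetchSize: Int): Properties = {
    val props = new Properties()
    props.setProperty("user", user)
    props.setProperty("password", password)
    props.setProperty("fetchsize", fetchSize.toString)
    props
  }

  // Assumed usage inside a Spark application (dbUrl and table come from the question):
  //   val jdbcDF = spark.read.jdbc(
  //     url = dbUrl,
  //     table = table,
  //     predicates = rangePredicates("int_id", 0L, 1000000L, 500000L),
  //     connectionProperties = connectionProps("***", "***", 10000)
  //   )
}
```

Spark creates one partition per predicate, so the ranges should be non-overlapping and together cover every row you want to read. The fetch size of 10000 is an arbitrary illustration value, not a recommendation.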
You can see an example here.