Using predicates in Spark JDBC read method


Question

I am pulling data from SQL Server to HDFS. Here is my snippet for that:

val predicates = Array[String]("int_id < 500000", "int_id >= 500000 && int_id < 1000000")

val jdbcDF = spark.read.format("jdbc")
  .option("url", dbUrl)
  .option("databaseName", "DatabaseName")
  .option("dbtable", table)
  .option("user", "***")
  .option("password", "***")
  .option("predicates", predicates)
  .load()

My IntelliJ IDE keeps saying

"Type mismatch, expected: Boolean or Long or Double or String, actual: Array[String]"

on the predicates option. Not sure what's wrong with this. Can anyone see the problem? Also, how do I use the fetch size here?

Thanks.

Answer

The option method accepts only Booleans, Longs, Doubles, or Strings, so an Array[String] cannot be passed that way. To pass the predicates you have to use the jdbc method on DataFrameReader instead of the format("jdbc") / option(...) API. Each element of the array becomes the WHERE clause of one partition, so the conditions must be valid SQL for the source database (SQL Server uses AND, not &&):

import java.util.Properties

val predicates = Array[String]("int_id < 500000", "int_id >= 500000 AND int_id < 1000000")

// "***" are placeholder credentials, as in the question.
val connectionProperties = new Properties()
connectionProperties.setProperty("user", "***")
connectionProperties.setProperty("password", "***")

val jdbcDF = spark.read.jdbc(
  url = dbUrl,
  table = table,
  predicates = predicates,
  connectionProperties = connectionProperties
)
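As for the fetch size: this jdbc overload accepts JDBC data source options such as fetchsize through the same connectionProperties, which controls how many rows the driver fetches per round trip. A minimal sketch, assuming an illustrative value of 1000 that you would tune for your driver and workload:

// fetchsize controls rows fetched per round trip; 1000 is an
// illustrative value, not a recommendation.
connectionProperties.setProperty("fetchsize", "1000")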

You can see an example in the Spark API documentation for DataFrameReader.jdbc.
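Since each predicate yields one partition, you can sanity-check the load afterwards. A quick check, assuming the two predicates above:

// Spark creates one JDBC partition per predicate string.
println(jdbcDF.rdd.getNumPartitions) // 2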

