Integrating Spark SQL and Apache Drill through JDBC


Question


I would like to create a Spark SQL DataFrame from the results of a query performed over CSV data (on HDFS) with Apache Drill. I successfully configured Spark SQL to make it connect to Drill via JDBC:

// sqlc is an org.apache.spark.sql.SQLContext; args[0] is the Drill JDBC URL,
// args[1] the table or view to query.
Map<String, String> connectionOptions = new HashMap<String, String>();
connectionOptions.put("url", args[0]);
connectionOptions.put("dbtable", args[1]);
connectionOptions.put("driver", "org.apache.drill.jdbc.Driver");

DataFrame logs = sqlc.read().format("jdbc").options(connectionOptions).load();


Spark SQL performs two queries: the first one to get the schema, and the second one to retrieve the actual data:

SELECT * FROM (SELECT * FROM dfs.output.`my_view`) WHERE 1=0

SELECT "field1","field2","field3" FROM (SELECT * FROM dfs.output.`my_view`)


The first one is successful, but in the second one Spark encloses fields within double quotes, which is something that Drill doesn't support, so the query fails.
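The root cause can be illustrated without Spark or Drill at all: Spark's default JDBC dialect wraps every column name in double quotes when it builds the projection list, while Drill expects backtick-quoted or unquoted identifiers. A minimal sketch (plain Java; the class and method names are illustrative, not Spark's actual internals):

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

public class QuotingDemo {
    // Mimics Spark's default behavior: identifiers wrapped in double quotes.
    static String defaultQuote(String col) {
        return "\"" + col + "\"";
    }

    // Mimics a Drill-friendly dialect: identifiers left untouched.
    static String drillQuote(String col) {
        return col;
    }

    // Builds the projection query the way Spark's JDBC source does,
    // quoting each column with the given strategy.
    static String buildSelect(List<String> cols, String table,
                              UnaryOperator<String> quoter) {
        String projection = cols.stream().map(quoter).collect(Collectors.joining(","));
        return "SELECT " + projection + " FROM " + table;
    }

    public static void main(String[] args) {
        List<String> cols = Arrays.asList("field1", "field2", "field3");
        String table = "(SELECT * FROM dfs.output.`my_view`)";

        // The query Spark generates by default (rejected by Drill):
        System.out.println(buildSelect(cols, table, QuotingDemo::defaultQuote));
        // -> SELECT "field1","field2","field3" FROM (SELECT * FROM dfs.output.`my_view`)

        // The query after the quoting is disabled (accepted by Drill):
        System.out.println(buildSelect(cols, table, QuotingDemo::drillQuote));
        // -> SELECT field1,field2,field3 FROM (SELECT * FROM dfs.output.`my_view`)
    }
}
```

The fix below works by replacing the first quoting strategy with the second one inside Spark's JDBC code path.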


Did anyone manage to get this integration working?

Thanks!

Answer


You can add a JDBC dialect for this and register it before using the JDBC connector:

import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

case object DrillDialect extends JdbcDialect {

  // Apply this dialect to any connection using the Drill JDBC driver.
  def canHandle(url: String): Boolean = url.startsWith("jdbc:drill:")

  // Return the column name as-is instead of wrapping it in double quotes,
  // which Drill rejects.
  override def quoteIdentifier(colName: String): String = colName
}

// Register the dialect before loading any Drill-backed DataFrame.
JdbcDialects.registerDialect(DrillDialect)

