AWS Redshift driver in Zeppelin


Question

I want to explore my data in Redshift from a Zeppelin notebook. A small EMR cluster with Spark is running behind it. I am loading Databricks' spark-redshift library:

%dep
z.reset()
z.load("com.databricks:spark-redshift_2.10:0.6.0")

and then running:

import org.apache.spark.sql.DataFrame

val query = "..."

val url = "..."
val port = 5439
val table = "..."
val database = "..."
val user = "..."
val password = "..."

// Read the query result from Redshift, staging intermediate data in S3
val df: DataFrame = sqlContext.read
  .format("com.databricks.spark.redshift")
  .option("url", s"jdbc:redshift://${url}:$port/$database?user=$user&password=$password")
  .option("query", query)
  .option("tempdir", "s3n://.../tmp/data")
  .load()

df.show

But I get the error:

java.lang.ClassNotFoundException: Could not load an Amazon Redshift JDBC driver; see the README for instructions on downloading and configuring the official Amazon driver

I added the option

option("jdbcdriver", "com.amazon.redshift.jdbc41.Driver")

but it didn't help. I think I need to specify Redshift's JDBC driver somewhere, just as I would pass --driver-class-path to spark-shell, but how do I do that with Zeppelin?

Answer

You can add external jars with dependencies, such as the JDBC driver, using either Zeppelin's dependency-loading mechanism or, in the case of Spark, the %dep dynamic dependency loader.

When your code requires an external library, instead of doing a download/copy/restart of Zeppelin, you can easily do the following jobs using the %dep interpreter:

  • Load libraries recursively from a Maven repository
  • Load libraries from the local filesystem
  • Add additional Maven repositories
  • Automatically add libraries to the Spark cluster (this can be turned off)

The latter looks like:

%dep
// loads with all transitive dependencies from Maven repo
z.load("groupId:artifactId:version")

// or add artifact from filesystem
z.load("/path/to.jar")

and by convention it has to be in the first paragraph of the note.
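
Putting it together for this case, a minimal sketch: load the official Amazon Redshift JDBC jar from the local filesystem alongside spark-redshift. The jar path below is hypothetical; download the driver from AWS and adjust the path for your cluster.

%dep
z.reset()

// Hypothetical local path to the official Amazon Redshift JDBC driver jar
// (download it from AWS first and adjust the path for your cluster)
z.load("/path/to/RedshiftJDBC41.jar")

// spark-redshift and its transitive dependencies from Maven
z.load("com.databricks:spark-redshift_2.10:0.6.0")

With both on the classpath, the sqlContext.read paragraph above should find com.amazon.redshift.jdbc41.Driver without the explicit jdbcdriver option.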
