Oozie + Sqoop: JDBC Driver Jar Location

Question

I have a 6-node Cloudera-based Hadoop cluster and I'm trying to connect to an Oracle database from a Sqoop action in Oozie.

I have copied my ojdbc6.jar into the Sqoop lib location (which for me happens to be /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/sqoop/lib/) on all the nodes, and I have verified that I can run a simple 'sqoop eval' from all 6 nodes.

But when I run the same command using Oozie's Sqoop action, I get "Could not load db driver class: oracle.jdbc.OracleDriver".

I have read this article about using shared libs, and it makes sense to me when we're talking about dependencies specific to my task, action, or workflow. But I see a JDBC driver installation as an extension to Sqoop, so I think it belongs in the Sqoop installation lib.

So the question is: Sqoop sees the ojdbc6 jar I have put into its lib folder, so why doesn't my Oozie workflow see it?

Is this something expected or am I missing something?

As an aside, where do you think the appropriate location for a JDBC driver jar is?

Thanks in advance!

Recommended answer

The JDBC driver jar (and any jars it depends on) should go in your Oozie sharelib folder on HDFS. I'm running Hortonworks Data Platform 1.2 rather than Cloudera 4.2, so the details may vary, but my JDBC driver is located in /user/oozie/share/lib/sqoop. This should allow you to run Sqoop with the JDBC driver via Oozie.
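For illustration, here is a minimal sketch of copying the driver into the sharelib. It assumes the jar sits in the current directory as ./ojdbc6.jar and that the sharelib path matches the one on my HDP 1.2 setup; both are assumptions and may differ on CDH. The `stage_driver` and `run` helpers are hypothetical names used only for this example. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: stage a JDBC driver jar in the Oozie sharelib on HDFS.
# Assumptions (adjust for your cluster): the local jar path and the
# sharelib directory /user/oozie/share/lib/sqoop.

# Print each command by default; set DRY_RUN=0 to execute it instead.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# stage_driver LOCAL_JAR SHARELIB_DIR
# Uploads the jar to HDFS, then lists the directory to confirm it arrived.
stage_driver() {
  jar="$1"
  sharelib="$2"
  run hadoop fs -put "$jar" "$sharelib/" &&
    run hadoop fs -ls "$sharelib"
}

stage_driver "${1:-./ojdbc6.jar}" "${2:-/user/oozie/share/lib/sqoop}"
```

Run it with DRY_RUN=0 as a user that can write to the sharelib directory. The workflow's job.properties typically also needs oozie.use.system.libpath=true so that Oozie adds the sharelib to the action's classpath.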

It is not necessary to put the JDBC driver jar in the Sqoop lib on the data nodes. In my setup I can't run a simple sqoop eval from the command line on my data nodes. I understand the logic behind why you thought this would work: the reason the JDBC driver jar needs to be on HDFS is so that all the data nodes have access to it, and your solution should accomplish the same goal. I'm not familiar enough with the inner workings of Oozie to say why using the sharelib works but your solution does not.
