How to use Sqoop in a Java program?

Question

I know how to use Sqoop from the command line, but I don't know how to invoke Sqoop from a Java program. Can anyone share some example code?

Recommended answer

You can run Sqoop from inside your Java code by including the Sqoop jar in your classpath and calling the Sqoop.runTool() method. You have to build the parameters Sqoop requires programmatically, just as you would pass them on the command line (e.g. --connect etc.).

Please pay attention to the following:

  • Make sure that the Sqoop tool name (e.g. import, export) is the first parameter.
  • Pay attention to classpath ordering. The execution might fail because Sqoop requires version X of a library and you use a different version. Ensure that the libraries Sqoop requires are not shadowed by your own dependencies. I ran into this with commons-io (Sqoop requires v1.4): I got a NoSuchMethod exception because I was using commons-io v1.2.
  • Each argument needs to be a separate array element. For example, "--connect jdbc:mysql:..." should be passed as two separate elements in the array, not one.
  • The Sqoop parser knows how to accept double-quoted parameters, so use double quotes if you need to (I suggest always). The only exception is the --fields-terminated-by parameter, which expects a single char, so don't double-quote it.
  • I'd suggest separating the logic that builds the command-line arguments from the actual execution, so your logic can be tested properly without actually running the tool.
  • It is better to pass the --hadoop-home parameter explicitly, to avoid depending on the environment.
  • The advantage of Sqoop.runTool() over Sqoop.main() is that runTool() returns the error code of the execution to the caller.

Hope that helps.

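import org.apache.sqoop.Sqoop;  // assumes the Sqoop 1.4.x jar is on the classpath

// Tool name first, then one array element per flag and per value.
final int ret = Sqoop.runTool(new String[] { ... });
if (ret != 0) {
  throw new RuntimeException("Sqoop failed - return code " + ret);
}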
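
As a rough illustration of the points above about argument order, one element per argument, and keeping argument construction separate from execution, here is a minimal sketch. The class SqoopArgs and the method buildImportArgs are hypothetical names invented for this example and are not part of Sqoop; the JDBC URL, table name, and target directory are placeholders. The sketch only builds the array, so it compiles and can be tested without the Sqoop jar.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Sqoop: it only builds the argument array,
// so the logic can be unit-tested without a cluster or the Sqoop jar.
final class SqoopArgs {
    private SqoopArgs() {}

    static String[] buildImportArgs(String jdbcUrl, String table, String targetDir) {
        List<String> args = new ArrayList<>();
        args.add("import");        // the tool name must be the first element
        args.add("--connect");     // a flag and its value are separate elements
        args.add(jdbcUrl);
        args.add("--table");
        args.add(table);
        args.add("--target-dir");
        args.add(targetDir);
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        // Placeholder values for illustration only.
        String[] args = buildImportArgs(
                "jdbc:mysql://localhost/example", "employees", "/tmp/employees");
        System.out.println(String.join(" ", args));
        // With the Sqoop jar on the classpath, the array would then be passed on:
        // int ret = Sqoop.runTool(args);
    }
}
```

A unit test can then assert on the returned array (tool name at index 0, each flag immediately followed by its value) while the Sqoop.runTool() call stays in a thin wrapper that is exercised only in integration tests.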

RL
