Listing MS SQL Server tables in Oozie via a Sqoop action

Problem description

I can run the following Sqoop command from the CLI without any problems.

sqoop list-tables \
  --connect 'jdbc:sqlserver://xx.xx.xx.xx\MSSQLSERVER2012:1433;username=usr;password=xxx;database=db' \
  --connection-manager org.apache.sqoop.manager.SQLServerManager \
  --driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
  -- --schema schma

But I get the following errors when I try the same thing in Oozie (via Hue):

2055 [main] ERROR org.apache.sqoop.manager.CatalogQueryManager - Failed to list tables java.sql.SQLException: No suitable driver found for 'jdbc:sqlserver://xx.xx.xx.xx\MSSQLSERVER2012:1433;username=usr;password=xxx;database=db'

2057 [main] ERROR org.apache.sqoop.Sqoop - Got exception running Sqoop: java.lang.RuntimeException: java.sql.SQLException: No suitable driver found for 'jdbc:sqlserver://xx.xx.xx.xx\MSSQLSERVER2012:1433;username=usr;password=xxx;database=db'

How can we get this to work in Oozie? (We are on the Cloudera Hadoop distribution.)

Solution

This worked for me on CDH 5.11, using the Hue Workflow Editor to create an Oozie > Sqoop 1 workflow, but it requires you to hard-code the username and password arguments. Screenshots are included below.

Here is the Step-by-Step:

  1. Open the Hue > Workflow Editor
  2. Create a new workflow
  3. Drag the Sqoop 1 action into the "drop your action here" grey box.
  4. Ignore the default Sqoop command box. Instead, click the + to the right of ARGUMENTS, below the Sqoop command box, to add a new argument.
  5. Add "import" without the double quote marks as the very first argument.
  6. Delete the entire content of the Sqoop command box. It needs to be empty.
  7. Add a new argument with the value of "--connect" without the double quotes.
  8. Add a new argument with the value of "jdbc:sqlserver://YourServerNameHere;database=YourDatabaseNameHere"
  9. Add a new argument with the value of "--username"
  10. Add a new argument with the value of "YourSQLServerNamedUserNameHere"
  11. Add a new argument with the value of "--password", followed by another new argument containing the password itself
  12. Add a new argument with the value of "--query"
  13. Add a new argument with the value of "Select * from OptionalDBNameHere.SchemaNameHere.TableNameHere Where $CONDITIONS"
  14. Add a new argument with the value of "--delete-target-dir"
  15. Add a new argument with the value of "--target-dir"
  16. Add a new argument with the value of "hdfs://FDQServerName:PortNumber8020IsDefault/User/full/path/to/where/you/want/the/csv/file/placed/in/hdfs/NewFolderForThisTableHere". Note that the last folder is deleted and re-created each time you run the Sqoop job.
  17. Add a new argument with the value of "--num-mappers"
  18. Add a new argument with the value of "1"
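
For comparison, the argument list built in the steps above corresponds roughly to a single Sqoop 1 command line. The sketch below is only an illustration: every server name, user name, password, and path in it is a placeholder (the password value and the shortened HDFS path are my own stand-ins, not values from the answer). It prints one argument per line, which is how the entries should appear in the Hue ARGUMENTS list, and leaves the real sqoop call commented out.

```shell
#!/usr/bin/env bash
# Sketch of the Hue ARGUMENTS list as a bash array. All values are placeholders.
args=(
  import
  --connect 'jdbc:sqlserver://YourServerNameHere;database=YourDatabaseNameHere'
  --username 'YourSQLServerNamedUserNameHere'
  --password 'YourPasswordHere'   # hypothetical placeholder
  # Single quotes keep $CONDITIONS literal so that Sqoop, not the shell, expands it.
  --query 'Select * from OptionalDBNameHere.SchemaNameHere.TableNameHere Where $CONDITIONS'
  --delete-target-dir
  --target-dir 'hdfs://YourNameNodeHere:8020/path/to/NewFolderForThisTableHere'
  --num-mappers 1
)

# Print one argument per line; each line is one ARGUMENTS entry in Hue.
printf '%s\n' "${args[@]}"

# On a host where sqoop is installed, the same list can be run directly:
# sqoop "${args[@]}"
```

Keeping each flag and each value as a separate array element mirrors what Hue does: every entry is passed to Sqoop as its own argument, so no extra quoting is needed inside the individual values.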

Important:

A. The "Where $CONDITIONS" clause is required at the end of the SQL Select statement in item 13. The job will not run without it.

B. This uses a SQL Server named user account with access to the database and table you want to Sqoop.

C. Entering the arguments like this is required if your named user does not have its default schema set to "dbo", or if the schema of your table is not the default schema for the database and user.

D. The SQL Server JDBC driver must be placed correctly in your installation. For my particular version of Cloudera the location is "/opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/sqoop/lib/sqljdbc41.jar". You may also try putting it in either "/var/lib/oozie" or "/var/lib/sqoop", but I am not sure either of those works on its own.

E. I have not succeeded in replacing the username and password I hard-coded as arguments with values from a job.properties file. I believe it is possible, but I have not found anyone who can clearly show how to do it, and days of brute-force trial and error have been unsuccessful.
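
Since the "No suitable driver found" error in the question points at a missing JDBC jar, it can help to check which of the locations named in the driver note actually contain one. The following is a minimal sketch: the function name find_jdbc_jars is made up for this example, and the directories are simply the ones mentioned above, so adjust them to your own parcel version.

```shell
#!/usr/bin/env bash
# find_jdbc_jars DIR... : print every sqljdbc*.jar found directly in the given directories.
# (Hypothetical helper for illustration; not part of Sqoop, Oozie, or Cloudera.)
find_jdbc_jars() {
  local dir jar
  for dir in "$@"; do
    for jar in "$dir"/sqljdbc*.jar; do
      # An unmatched glob stays as a literal string, so confirm the file exists.
      if [ -e "$jar" ]; then
        printf '%s\n' "$jar"
      fi
    done
  done
  return 0
}

# Check the locations mentioned in the answer; prints nothing if no jar is present.
find_jdbc_jars \
  /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/sqoop/lib \
  /var/lib/oozie \
  /var/lib/sqoop
```

If the command prints nothing on the node that runs the Oozie action, the driver jar is not in any of these directories on that node.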

Here are screenshots showing what this looks like when done: [screenshot: SqoopCommandAsArguments] [screenshot: SqoopCommandAsArgumentsSuccess]
