Sqoop job fails with KiteSDK validation error for Oracle import


Problem description

I am attempting to run a Sqoop job to load from an Oracle db and into Parquet format to a Hadoop cluster. The job is incremental.

Sqoop version is 1.4.6. Oracle version is 12c. Hadoop version is 2.6.0 (distro is Cloudera 5.5.1).

The Sqoop command is (this creates the job, and executes it):

$ sqoop job -fs hdfs://<HADOOPNAMENODE>:8020 \
--create myJob \
-- import \
--connect jdbc:oracle:thin:@<DBHOST>:<DBPORT>/<DBNAME> \
--username <USERNAME> \
-P \
--as-parquetfile \
--table <USERNAME>.<TABLENAME> \
--target-dir <HDFSPATH>  \
--incremental append  \
--check-column <TABLEPRIMARYKEY>

$ sqoop job --exec myJob

The error on execution:

16/02/05 11:25:30 ERROR sqoop.Sqoop: Got exception running Sqoop:
org.kitesdk.data.ValidationException: Dataset name
05112528000000918_2088_<USERNAME>.<TABLENAME>
is not alphanumeric (plus '_')
    at org.kitesdk.data.ValidationException.check(ValidationException.java:55)
    at org.kitesdk.data.spi.Compatibility.checkDatasetName(Compatibility.java:103)
    at org.kitesdk.data.spi.Compatibility.check(Compatibility.java:66)
    at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.create(FileSystemMetadataProvider.java:209)
    at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.create(FileSystemDatasetRepository.java:137)
    at org.kitesdk.data.Datasets.create(Datasets.java:239)
    at org.kitesdk.data.Datasets.create(Datasets.java:307)
    at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:107)
    at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:80)
    at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:106)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:260)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:668)
    at org.apache.sqoop.manager.OracleManager.importTable(OracleManager.java:444)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
    at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
    at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:236)

Troubleshooting steps:

0) HDFS is stable, other Sqoop jobs are functional, Oracle source DB is up and the connection has been tested.

1) I tried creating a synonym in Oracle, that way I could simply have the --table option as:

--table TABLENAME (without the username)

This gave me an error that the table name was not correct. It needs the full USERNAME.TABLENAME for the --table option.

Error:

16/02/05 12:04:46 ERROR tool.ImportTool: Imported Failed: There is no column found in the target table <TABLENAME>. Please ensure that your table name is correct.
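For reference, the synonym attempt in step 1 could be set up roughly like this (a sketch with placeholder credentials and object names; not the asker's exact statement):

```shell
# Hypothetical sketch of the synonym from step 1, intended to let
# --table TABLENAME work without the schema prefix.
# USERNAME/PASSWORD/DBHOST/DBPORT/DBNAME/TABLENAME are placeholders.
sqlplus USERNAME/PASSWORD@//DBHOST:DBPORT/DBNAME <<'SQL'
CREATE SYNONYM TABLENAME FOR USERNAME.TABLENAME;
SQL
```

As noted above, Sqoop still failed to resolve the columns through the synonym, so this did not help.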

2) I made sure that this is a Parquet issue. I removed the --as-parquetfile option and the job was successful.

3) I wondered if this is somehow caused by the incremental options. I removed the --incremental append & --check-column options and the job was successful. This confuses me.

4) I tried the job with MySQL and it was successful.

Has anyone run into something similar? Is there a way (or is it even advisable) to disable the Kite validation? It seems that the dataset name is being generated with a dot ("."), which the Kite SDK then complains about - but this is an assumption on my part, as I am not too familiar with the Kite SDK.
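The check that fails can be approximated with a one-liner: per the exception message, Kite only accepts dataset names made of alphanumerics and underscores, so the dot carried over from USERNAME.TABLENAME is rejected. This is a simplified stand-in for Compatibility.checkDatasetName, not Kite's actual code:

```shell
# Simplified stand-in for Kite's dataset-name validation:
# accept only alphanumerics plus underscore (assumption based on
# the "is not alphanumeric (plus '_')" message in the stack trace).
check_name() {
  if printf '%s\n' "$1" | grep -qE '^[A-Za-z0-9_]+$'; then
    echo "valid: $1"
  else
    echo "invalid: $1"
  fi
}

check_name "05112528000000918_2088_SCOTT.EMP"   # invalid - the dot fails
check_name "05112528000000918_2088_SCOTT_EMP"   # valid
```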

Thanks in advance,

Jose

Answer

Resolved. There seems to be a known issue with JDBC connectivity to Oracle 12c: using the OJDBC6 driver (instead of OJDBC7) did the trick. FYI - the OJDBC jar is installed in /usr/share/java/ and a symbolic link to it is created in /installpath.../lib/sqoop/lib/
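The driver swap might look like the following (a sketch only; the Sqoop lib path is kept as the elided /installpath.../ from the answer, and the ojdbc6.jar source location is an assumption):

```shell
# Sketch: install ojdbc6.jar and point Sqoop's lib dir at it.
# ojdbc6.jar must be obtained separately (e.g. from Oracle's site).
sudo cp ojdbc6.jar /usr/share/java/
sudo ln -s /usr/share/java/ojdbc6.jar /installpath.../lib/sqoop/lib/ojdbc6.jar
# Remove or move aside any ojdbc7.jar in the same dir so Sqoop
# picks up the v6 driver on the classpath.
```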

