Sqoop job fails with KiteSDK validation error for Oracle import
Problem description
I am trying to run a Sqoop job that loads data from an Oracle database into a Hadoop cluster in Parquet format. The job is incremental.
Sqoop version is 1.4.6. Oracle version is 12c. Hadoop version is 2.6.0 (the distribution is Cloudera 5.5.1).
The Sqoop commands are as follows (the first creates the job, the second executes it):
$ sqoop job -fs hdfs://<HADOOPNAMENODE>:8020 \
--create myJob \
-- import \
--connect jdbc:oracle:thin:@<DBHOST>:<DBPORT>/<DBNAME> \
--username <USERNAME> \
-P \
--as-parquetfile \
--table <USERNAME>.<TABLENAME> \
--target-dir <HDFSPATH> \
--incremental append \
--check-column <TABLEPRIMARYKEY>
$ sqoop job --exec myJob
Error on execution:
16/02/05 11:25:30 ERROR sqoop.Sqoop: Got exception running Sqoop:
org.kitesdk.data.ValidationException: Dataset name
05112528000000918_2088_<USERNAME>.<TABLENAME>
is not alphanumeric (plus '_')
at org.kitesdk.data.ValidationException.check(ValidationException.java:55)
at org.kitesdk.data.spi.Compatibility.checkDatasetName(Compatibility.java:103)
at org.kitesdk.data.spi.Compatibility.check(Compatibility.java:66)
at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.create(FileSystemMetadataProvider.java:209)
at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.create(FileSystemDatasetRepository.java:137)
at org.kitesdk.data.Datasets.create(Datasets.java:239)
at org.kitesdk.data.Datasets.create(Datasets.java:307)
at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:107)
at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:80)
at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:106)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:260)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:668)
at org.apache.sqoop.manager.OracleManager.importTable(OracleManager.java:444)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Troubleshooting steps:
0) HDFS is stable, other Sqoop jobs are functional, the Oracle source DB is up, and the connection has been tested.
1) I tried creating a synonym in Oracle so that I could pass the --table option as simply:
--table TABLENAME (without the username)
This gave me an error saying that the table name was not correct. The --table option needs the full USERNAME.TABLENAME.
Error:
16/02/05 12:04:46 ERROR tool.ImportTool: Imported Failed: There is no column found in the target table <TABLENAME>. Please ensure that your table name is correct.
2) I confirmed that this is a Parquet issue: when I removed the --as-parquetfile option, the job was successful.
3) I wondered whether this is somehow caused by the incremental options. I removed the --incremental append and --check-column options, and the job was successful. This confuses me.
4) I tried the same job against MySQL, and it was successful.
Has anyone run into something similar? Is there a way to disable the Kite validation (and is that even advisable)? It seems that the dataset is being created with a dot (".") in its name, which the Kite SDK then complains about, but this is an assumption on my part as I am not very familiar with the Kite SDK.
Thanks in advance,
Jose
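The guess about the dot can be checked against the rule quoted in the error message ("is not alphanumeric (plus '_')"). The sketch below is only an approximation of that rule written as a shell function; it is not Kite's actual implementation, and treating an empty name as invalid is an assumption. SCOTT and EMP are hypothetical placeholders for <USERNAME> and <TABLENAME>.

```shell
# Approximation of the rule quoted in the error message:
# a dataset name may contain only letters, digits and '_'.
# This is NOT Kite's actual code, just a sketch of the stated rule.
is_valid_dataset_name() {
  case "$1" in
    '' | *[!A-Za-z0-9_]*) return 1 ;;  # empty, or contains a disallowed character
    *) return 0 ;;
  esac
}

# Hypothetical names modelled on the one in the log
# (SCOTT and EMP stand in for <USERNAME> and <TABLENAME>).
for name in "05112528000000918_2088_SCOTT.EMP" "05112528000000918_2088_SCOTT_EMP"; do
  if is_valid_dataset_name "$name"; then
    echo "$name: ok"
  else
    echo "$name: rejected"
  fi
done
```

Under this approximation the dotted name is rejected and the underscore variant passes, which is consistent with the dot in USERNAME.TABLENAME being what trips the validation.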
Recommended answer
Resolved. There seems to be a known issue with JDBC connectivity to Oracle 12c. Using OJDBC6 (instead of OJDBC7) did the trick. FYI: the OJDBC driver is installed in /usr/share/java/ and a symbolic link to it is created in /installpath.../lib/sqoop/lib/
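The linking step described above could look roughly like the following. This is a sketch under assumptions: the function name link_jdbc_driver is made up for illustration, the jar file name ojdbc6.jar is assumed, and the Sqoop lib directory must be supplied by the reader because the answer does not give the full install path.

```shell
# Sketch: link a chosen Oracle JDBC driver jar into a Sqoop lib directory.
# Usage: link_jdbc_driver <driver-jar> <sqoop-lib-dir>
# (link_jdbc_driver is a hypothetical helper, not part of Sqoop.)
link_jdbc_driver() {
  driver_jar=$1
  sqoop_lib_dir=$2

  # Refuse to create a dangling link.
  if [ ! -f "$driver_jar" ]; then
    echo "driver jar not found: $driver_jar" >&2
    return 1
  fi
  if [ ! -d "$sqoop_lib_dir" ]; then
    echo "Sqoop lib directory not found: $sqoop_lib_dir" >&2
    return 1
  fi

  # -s: symbolic link, -f: replace an existing link of the same name.
  ln -sf "$driver_jar" "$sqoop_lib_dir/$(basename "$driver_jar")"

  # Show every ojdbc jar now in the directory so a leftover ojdbc7 is easy to spot.
  find "$sqoop_lib_dir" -name 'ojdbc*.jar'
}
```

A call might be link_jdbc_driver /usr/share/java/ojdbc6.jar "$SQOOP_LIB_DIR", where SQOOP_LIB_DIR is set to the actual lib/sqoop/lib directory of the installation. If the listing still shows an ojdbc7 jar, it is probably worth checking which driver Sqoop actually picks up.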