Sqoop export to MySQL: export job failed (tool.ExportTool) but got records
Problem description
This is a follow-up question to
sqoop export local csv to MySQL error on mapreduce
I was able to run the sqoop job and get the data into MySQL from a local .csv file using the command below:
$ sqoop export -fs local -jt local -D 'mapreduce.application.framework.path=/usr/hdp/2.5.0.0-1245/hadoop/mapreduce.tar.gz' --connect jdbc:mysql://172.52.21.64:3306/cf_ae07c762_41a9_4b46_af6c_a29ecb050204 --username username --password password --table test3 --export-dir file:///home/username/folder/test3.csv
However, even though the records were exported successfully (I verified them in MySQL), I still saw the error ERROR tool.ExportTool: Error during export: Export job failed!
Full logs below:
17/04/10 09:36:28 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
17/04/10 09:36:28 INFO mapreduce.Job: Running job: job_local2136897360_0001
17/04/10 09:36:28 INFO mapred.LocalJobRunner: OutputCommitter set in config null
17/04/10 09:36:28 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.sqoop.mapreduce.NullOutputCommitter
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Waiting for map tasks
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Starting task: attempt_local2136897360_0001_m_000000_0
17/04/10 09:36:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/10 09:36:28 INFO mapred.MapTask: Processing split: Paths:/home/username/folder/test3.csv:36+7,/home/username/folder/test3.csv:43+8
17/04/10 09:36:28 INFO mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
17/04/10 09:36:28 INFO mapred.LocalJobRunner:
17/04/10 09:36:28 INFO mapred.Task: Task:attempt_local2136897360_0001_m_000000_0 is done. And is in the process of committing
17/04/10 09:36:28 INFO mapred.LocalJobRunner: map
17/04/10 09:36:28 INFO mapred.Task: Task 'attempt_local2136897360_0001_m_000000_0' done.
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local2136897360_0001_m_000000_0
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Starting task: attempt_local2136897360_0001_m_000001_0
17/04/10 09:36:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/10 09:36:28 INFO mapred.MapTask: Processing split: Paths:/home/username/folder/test3.csv:0+12
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper:
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: Exception raised during data export
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper:
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: Exception:
java.lang.RuntimeException: Can't parse input data: 'id'
at test3.__loadFromFields(test3.java:316)
at test3.parse(test3.java:254)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:89)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NumberFormatException: For input string: "id"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:492)
at java.lang.Integer.valueOf(Integer.java:582)
at test3.__loadFromFields(test3.java:303)
... 13 more
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: Dumping data is not allowed by default, please run the job with -Dorg.apache.sqoop.export.text.dump_data_on_error=true to get corrupted line.
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: On input file: file:/home/username/folder/test3.csv
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: At position 0
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper:
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: Currently processing split:
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: Paths:/home/username/folder/test3.csv:0+12
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper:
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: This issue might not necessarily be caused by current input
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: due to the batching nature of export.
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper:
17/04/10 09:36:28 INFO mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Starting task: attempt_local2136897360_0001_m_000002_0
17/04/10 09:36:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/10 09:36:28 INFO mapred.MapTask: Processing split: Paths:/home/username/folder/test3.csv:12+12
17/04/10 09:36:28 INFO mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
17/04/10 09:36:28 INFO mapred.LocalJobRunner:
17/04/10 09:36:28 INFO mapred.Task: Task:attempt_local2136897360_0001_m_000002_0 is done. And is in the process of committing
17/04/10 09:36:28 INFO mapred.LocalJobRunner: map
17/04/10 09:36:28 INFO mapred.Task: Task 'attempt_local2136897360_0001_m_000002_0' done.
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local2136897360_0001_m_000002_0
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Starting task: attempt_local2136897360_0001_m_000003_0
17/04/10 09:36:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/10 09:36:28 INFO mapred.MapTask: Processing split: Paths:/home/username/folder/test3.csv:24+12
17/04/10 09:36:28 INFO mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
17/04/10 09:36:28 INFO mapred.LocalJobRunner:
17/04/10 09:36:28 INFO mapred.Task: Task:attempt_local2136897360_0001_m_000003_0 is done. And is in the process of committing
17/04/10 09:36:28 INFO mapred.LocalJobRunner: map
17/04/10 09:36:28 INFO mapred.Task: Task 'attempt_local2136897360_0001_m_000003_0' done.
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local2136897360_0001_m_000003_0
17/04/10 09:36:28 INFO mapred.LocalJobRunner: map task executor complete.
17/04/10 09:36:28 WARN mapred.LocalJobRunner: job_local2136897360_0001
java.lang.Exception: java.io.IOException: Can't export data, please check failed map task logs
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.io.IOException: Can't export data, please check failed map task logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:122)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Can't parse input data: 'id'
at test3.__loadFromFields(test3.java:316)
at test3.parse(test3.java:254)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:89)
... 11 more
Caused by: java.lang.NumberFormatException: For input string: "id"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:492)
at java.lang.Integer.valueOf(Integer.java:582)
at test3.__loadFromFields(test3.java:303)
... 13 more
17/04/10 09:36:29 INFO mapreduce.Job: Job job_local2136897360_0001 running in uber mode : false
17/04/10 09:36:29 INFO mapreduce.Job: map 100% reduce 0%
17/04/10 09:36:29 INFO mapreduce.Job: Job job_local2136897360_0001 failed with state FAILED due to: NA
17/04/10 09:36:29 INFO mapreduce.Job: Counters: 15
File System Counters
FILE: Number of bytes read=673345391
FILE: Number of bytes written=679694703
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=2
Map output records=2
Input split bytes=388
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=0
Total committed heap usage (bytes)=2805989376
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
17/04/10 09:36:29 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 5.4541 seconds (0 bytes/sec)
17/04/10 09:36:29 INFO mapreduce.ExportJobBase: Exported 2 records.
17/04/10 09:36:29 ERROR mapreduce.ExportJobBase: Export job failed!
17/04/10 09:36:29 ERROR tool.ExportTool: Error during export: Export job failed!
Any ideas, or should I just ignore it? I don't want to leave this as-is when running larger jobs and end up missing something.
UPDATE 1
Below is the .csv content, with no empty lines or spaces:
Here is the result after sqoop, and it was fine:
The error is due to the CSV header in the file. Sqoop has no option to ignore the header while exporting data into MySQL, so you have to remove the header manually before performing sqoop-export.
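Since Sqoop itself cannot skip the header on a text export, one option is to strip the first line with standard tools before running the export. This is a sketch using a stand-in file; the question's real file is at /home/username/folder/test3.csv and its contents aren't shown, so the sample rows here are made up:

```shell
# Stand-in CSV with a header row, mimicking the test3.csv from the question.
printf 'id,name\n1,alice\n2,bob\n' > test3.csv

# Drop line 1 (the header) so every remaining line is parseable data.
tail -n +2 test3.csv > test3_noheader.csv
cat test3_noheader.csv
```

You would then point --export-dir at the header-free file so no mapper ever sees the 'id' token that Integer.valueOf chokes on.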
Any idea or should I just ignore?
Since the header is only one line, the mapper processing the split that contains it throws an exception, but that is not enough to kill the job. Besides, who likes seeing exceptions in the job execution log?
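If you do leave the header in place, a quick sanity check is to compare the CSV's data-row count (total lines minus the one header line) against the "Exported N records" line in the sqoop log. A sketch with a stand-in file, since the original test3.csv content isn't shown:

```shell
# Stand-in for the real CSV at /home/username/folder/test3.csv (hypothetical rows).
printf 'id,name\n1,alice\n2,bob\n' > test3.csv

# Data rows = total lines minus the single header line.
data_rows=$(($(wc -l < test3.csv) - 1))
echo "$data_rows"   # compare this against the "Exported N records" count in the log
```

If the two numbers match, only the header was rejected and no data rows were silently dropped.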