Sqoop export to MySQL: export job failed (tool.ExportTool), but the records were exported


Problem description

This is a follow-up to the question "sqoop export local csv to MySQL error on mapreduce".

I was able to run the Sqoop job and load the data from a local .csv file into MySQL using the command below:

$ sqoop export -fs local -jt local -D 'mapreduce.application.framework.path=/usr/hdp/2.5.0.0-1245/hadoop/mapreduce.tar.gz' --connect jdbc:mysql://172.52.21.64:3306/cf_ae07c762_41a9_4b46_af6c_a29ecb050204 --username username --password password --table test3 --export-dir file:///home/username/folder/test3.csv

However, even though the records were exported successfully (I checked in MySQL), I still saw the error ERROR tool.ExportTool: Error during export: Export job failed!

Full logs below:

17/04/10 09:36:28 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
17/04/10 09:36:28 INFO mapreduce.Job: Running job: job_local2136897360_0001
17/04/10 09:36:28 INFO mapred.LocalJobRunner: OutputCommitter set in config null
17/04/10 09:36:28 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.sqoop.mapreduce.NullOutputCommitter
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Waiting for map tasks
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Starting task: attempt_local2136897360_0001_m_000000_0
17/04/10 09:36:28 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
17/04/10 09:36:28 INFO mapred.MapTask: Processing split: Paths:/home/username/folder/test3.csv:36+7,/home/username/folder/test3.csv:43+8
17/04/10 09:36:28 INFO mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
17/04/10 09:36:28 INFO mapred.LocalJobRunner:
17/04/10 09:36:28 INFO mapred.Task: Task:attempt_local2136897360_0001_m_000000_0 is done. And is in the process of committing
17/04/10 09:36:28 INFO mapred.LocalJobRunner: map
17/04/10 09:36:28 INFO mapred.Task: Task 'attempt_local2136897360_0001_m_000000_0' done.
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local2136897360_0001_m_000000_0
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Starting task: attempt_local2136897360_0001_m_000001_0
17/04/10 09:36:28 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
17/04/10 09:36:28 INFO mapred.MapTask: Processing split: Paths:/home/username/folder/test3.csv:0+12
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper:
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: Exception raised during data export
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper:
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: Exception:
java.lang.RuntimeException: Can't parse input data: 'id'
    at test3.__loadFromFields(test3.java:316)
    at test3.parse(test3.java:254)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:89)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NumberFormatException: For input string: "id"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:492)
    at java.lang.Integer.valueOf(Integer.java:582)
    at test3.__loadFromFields(test3.java:303)
    ... 13 more
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: Dumping data is not allowed by default, please run the job with -Dorg.apache.sqoop.export.text.dump_data_on_error=true to get corrupted line.
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: On input file: file:/home/username/folder/test3.csv
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: At position 0
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper:
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: Currently processing split:
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: Paths:/home/username/folder/test3.csv:0+12
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper:
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: This issue might not necessarily be caused by current input
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper: due to the batching nature of export.
17/04/10 09:36:28 ERROR mapreduce.TextExportMapper:
17/04/10 09:36:28 INFO mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Starting task: attempt_local2136897360_0001_m_000002_0
17/04/10 09:36:28 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
17/04/10 09:36:28 INFO mapred.MapTask: Processing split: Paths:/home/username/folder/test3.csv:12+12
17/04/10 09:36:28 INFO mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
17/04/10 09:36:28 INFO mapred.LocalJobRunner:
17/04/10 09:36:28 INFO mapred.Task: Task:attempt_local2136897360_0001_m_000002_0 is done. And is in the process of committing
17/04/10 09:36:28 INFO mapred.LocalJobRunner: map
17/04/10 09:36:28 INFO mapred.Task: Task 'attempt_local2136897360_0001_m_000002_0' done.
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local2136897360_0001_m_000002_0
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Starting task: attempt_local2136897360_0001_m_000003_0
17/04/10 09:36:28 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
17/04/10 09:36:28 INFO mapred.MapTask: Processing split: Paths:/home/username/folder/test3.csv:24+12
17/04/10 09:36:28 INFO mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
17/04/10 09:36:28 INFO mapred.LocalJobRunner:
17/04/10 09:36:28 INFO mapred.Task: Task:attempt_local2136897360_0001_m_000003_0 is done. And is in the process of committing
17/04/10 09:36:28 INFO mapred.LocalJobRunner: map
17/04/10 09:36:28 INFO mapred.Task: Task 'attempt_local2136897360_0001_m_000003_0' done.
17/04/10 09:36:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local2136897360_0001_m_000003_0
17/04/10 09:36:28 INFO mapred.LocalJobRunner: map task executor complete.
17/04/10 09:36:28 WARN mapred.LocalJobRunner: job_local2136897360_0001
java.lang.Exception: java.io.IOException: Can't export data, please check failed map task logs
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.io.IOException: Can't export data, please check failed map task logs
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:122)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Can't parse input data: 'id'
    at test3.__loadFromFields(test3.java:316)
    at test3.parse(test3.java:254)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:89)
    ... 11 more
Caused by: java.lang.NumberFormatException: For input string: "id"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:492)
    at java.lang.Integer.valueOf(Integer.java:582)
    at test3.__loadFromFields(test3.java:303)
    ... 13 more
17/04/10 09:36:29 INFO mapreduce.Job: Job job_local2136897360_0001 running in uber mode : false
17/04/10 09:36:29 INFO mapreduce.Job:  map 100% reduce 0%
17/04/10 09:36:29 INFO mapreduce.Job: Job job_local2136897360_0001 failed with state FAILED due to: NA
17/04/10 09:36:29 INFO mapreduce.Job: Counters: 15
    File System Counters
        FILE: Number of bytes read=673345391
        FILE: Number of bytes written=679694703
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
    Map-Reduce Framework
        Map input records=2
        Map output records=2
        Input split bytes=388
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=0
        Total committed heap usage (bytes)=2805989376
    File Input Format Counters
        Bytes Read=0
    File Output Format Counters
        Bytes Written=0
17/04/10 09:36:29 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 5.4541 seconds (0 bytes/sec)
17/04/10 09:36:29 INFO mapreduce.ExportJobBase: Exported 2 records.
17/04/10 09:36:29 ERROR mapreduce.ExportJobBase: Export job failed!
17/04/10 09:36:29 ERROR tool.ExportTool: Error during export: Export job failed!

Any ideas, or should I just ignore it? I don't want to leave this as-is and then miss a real problem when running larger jobs.

UPDATE 1

Below is the .csv content, with no empty lines or spaces:

Here is the result after the Sqoop export, which looked fine:

Solution

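The error is caused by the CSV header line in the file. Sqoop has no option to skip a header when exporting data into MySQL, so the mapper tries to parse the header value 'id' as an integer and fails (the NumberFormatException for input string "id" in the log). You have to remove the header manually before running sqoop-export.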
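One way to remove the header by hand is a small shell helper like the sketch below. It assumes the header is exactly the first line of the file. The function name strip_header and the output file name test3_noheader.csv are made up for illustration and are not part of Sqoop.

```shell
#!/bin/sh
# strip_header SRC DST
# Copies SRC to DST without its first line (assumed to be the CSV header).
strip_header() {
    tail -n +2 "$1" > "$2"
}

# Example usage (test3_noheader.csv is a hypothetical output name):
#   strip_header /home/username/folder/test3.csv /home/username/folder/test3_noheader.csv
# Then point --export-dir at file:///home/username/folder/test3_noheader.csv
```

Writing to a new file rather than editing in place keeps the original .csv untouched, so the export can be rerun against either version.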

Any idea or should I just ignore?

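Because the header is only one line, only the mapper processing the split that contains it throws an exception. The other mappers still export their records, which is why the data shows up in MySQL even though the job is reported as failed. That said, nobody likes to see an exception in the job execution log, so it is better to remove the header than to ignore the error.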
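To avoid missing this on larger jobs, a quick check before the export can flag a file that still has a header. The sketch below is only a rough guard under a few assumptions: fields are comma-separated, and the first column of the target table is a non-negative integer (as the Integer.valueOf call in the stack trace suggests for test3). The function name first_field_is_integer is hypothetical.

```shell
#!/bin/sh
# first_field_is_integer FILE
# Succeeds (exit status 0) if the first comma-separated field of the first
# line consists only of digits; fails (exit status 1) otherwise, for example
# when the first line is a header such as one starting with "id".
first_field_is_integer() {
    # Read only the first line and keep the text before the first comma.
    first_field=$(head -n 1 "$1" | cut -d, -f1)
    case "$first_field" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
        *) return 0 ;;
    esac
}

# Example usage:
#   if ! first_field_is_integer /home/username/folder/test3.csv; then
#       echo "First line looks like a header; remove it before sqoop export" >&2
#   fi
```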
