Handling bad records during Sqoop import or export
Question
I looked at the options provided by the Sqoop export operation but could not find any for handling bad records. For example, in a huge set of records, a character may occasionally appear where a number is expected. Is there a way to handle these cases in Sqoop so that the job does not fail and the bad records are written to a file?
Recommended answer
Sqoop currently expects the data being exported to be clean and does not provide facilities for handling corrupted data. You can use an MR/Pig/Hive job to clean up your data before using Sqoop to export it.
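As a rough illustration of the kind of pre-export cleanup described above, here is a minimal Python sketch that separates delimited records into clean and bad sets. It is not part of Sqoop: the function names, the comma delimiter, the three-column layout, and the choice of which columns must be numeric are all assumptions made for the example. In practice the same check would be run as an MR, Pig, or Hive job over the data in HDFS.

```python
def is_valid(fields, numeric_columns, expected_width):
    """Return True if the record has the expected number of fields
    and every column listed in numeric_columns parses as a number."""
    if len(fields) != expected_width:
        return False
    for idx in numeric_columns:
        try:
            float(fields[idx])
        except ValueError:
            return False
    return True


def split_records(lines, numeric_columns, expected_width, delimiter=","):
    """Split raw text records into (clean, bad) lists.

    All parameters are hypothetical; adjust them to the layout of the
    table being exported.
    """
    clean, bad = [], []
    for line in lines:
        record = line.rstrip("\n")
        if not record:
            continue  # skip blank lines
        fields = record.split(delimiter)
        if is_valid(fields, numeric_columns, expected_width):
            clean.append(record)
        else:
            bad.append(record)
    return clean, bad


# Hypothetical sample: columns 0 and 2 are expected to be numeric.
sample = ["1,alice,30.5", "2,bob,abc", "x,carol,12", "4,dave,7"]
clean, bad = split_records(sample, numeric_columns=[0, 2], expected_width=3)
```

The clean records would then be written to the directory that Sqoop exports from, and the bad records to a separate file for inspection, so the export job itself only ever sees rows that should parse.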