Distcp Mismatch in length of source

Problem Description

I am facing an issue while executing a distcp command between two different Hadoop clusters:

Caused by: java.io.IOException: Mismatch in length of source:hdfs://ip1/xxxxxxxxxx/xxxxx and target:hdfs://nameservice1/xxxxxx/.distcp.tmp.attempt_1483200922993_0056_m_000011_2

I tried using -pb and -skipcrccheck:

hadoop distcp -pb -skipcrccheck -update hdfs://ip1/xxxxxxxxxx/xxxxx hdfs:///xxxxxxxxxxxx/

hadoop distcp -pb hdfs://ip1/xxxxxxxxxx/xxxxx hdfs:///xxxxxxxxxxxx/

hadoop distcp -skipcrccheck -update hdfs://ip1/xxxxxxxxxx/xxxxx hdfs:///xxxxxxxxxxxx/

But nothing seemed to work.

Please suggest a solution.

Recommended Answer

I was facing the same issue with distcp between two Hadoop clusters of exactly the same version. In my case it turned out that some files in one of the source directories were still open for writing. I discovered this by running distcp for each source directory individually: it worked fine for every directory except the one with the open files, and within that directory it failed only for those files. Of course, that is hard to tell at first glance.
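
If you suspect open files, one way to check is hdfs fsck with the -openforwrite flag. This is a minimal sketch, assuming shell access to the source cluster and reusing the placeholder path from the question:

# Run on the source cluster: list files under the source path that are
# still open for write (fsck marks them with OPENFORWRITE)
hdfs fsck /xxxxxxxxxx/xxxxx -files -openforwrite | grep OPENFORWRITE

And a rough sketch of running distcp per source directory to narrow down which one fails, as described above (again using the question's placeholder paths, and assuming hdfs dfs -ls returns fully qualified paths when given a fully qualified URI):

# Copy each top-level source directory separately; the failing run then
# points at the directory that contains the open files
for dir in $(hdfs dfs -ls hdfs://ip1/xxxxxxxxxx/xxxxx | awk 'NR>1 {print $NF}'); do
  hadoop distcp -pb -update "$dir" "hdfs:///xxxxxxxxxxxx/$(basename "$dir")"
done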
