How to fix "resource changed on src filesystem" issue
Question
I'm trying to use Hive on MR executing SQL, and it fails halfway with the errors below:
Application application_1570514228864_0001 failed 2 times due to AM Container for appattempt_1570514228864_0001_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: [2019-10-08 13:57:49.272]Failed to download resource { { s3a://tpcds/tmp/hadoop-yarn/staging/root/.staging/job_1570514228864_0001/libjars, 1570514262820, FILE, null },pending,[(container_1570514228864_0001_02_000001)],1132444167207544,DOWNLOADING} java.io.IOException: Resource s3a://tpcds/tmp/hadoop-yarn/staging/root/.staging/job_1570514228864_0001/libjars changed on src filesystem (expected 1570514262820, was 1570514269265
The key message from the error log, from my perspective, is libjars changed on src filesystem (expected 1570514262820, was 1570514269265. There are several threads about this issue on SO, but they haven't been answered yet, like thread1 and thread2.
I found something valuable in the Apache JIRA and Red Hat Bugzilla. I synced the clocks via NTP across all related nodes, but the same issue is still there.
Any comments are welcome, thanks.
Solution
I still don't know why the timestamp of the resource file is inconsistent, and AFAIK there is no way to fix it through configuration. However, I managed to find a workaround to skip the issue. Let me summarize it here for anyone who might run into the same issue.
By checking the error log and searching the Hadoop source code, we can trace the issue to hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/FSDownload.java.
Remove the exception-throwing statement:
private void verifyAndCopy(Path destination)
    throws IOException, YarnException {
  final Path sCopy;
  try {
    sCopy = resource.getResource().toPath();
  } catch (URISyntaxException e) {
    throw new IOException("Invalid resource", e);
  }
  FileSystem sourceFs = sCopy.getFileSystem(conf);
  FileStatus sStat = sourceFs.getFileStatus(sCopy);
  if (sStat.getModificationTime() != resource.getTimestamp()) {
    /**
    throw new IOException("Resource " + sCopy +
        " changed on src filesystem (expected " + resource.getTimestamp() +
        ", was " + sStat.getModificationTime());
    **/
    LOG.debug("[Gearon][Info] The timestamp is not consistent among resource files. " +
        "Stop throwing exception. It doesn't affect other modules.");
  }
  if (resource.getVisibility() == LocalResourceVisibility.PUBLIC) {
    if (!isPublic(sourceFs, sCopy, sStat, statCache)) {
      throw new IOException("Resource " + sCopy +
          " is not publicly accessible and as such cannot be part of the" +
          " public cache.");
    }
  }
  downloadAndUnpack(sCopy, destination);
}
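The check that fires here can be reproduced in isolation: YARN records the resource's modification time at job submission and the localizer later rejects the download if the mtime on the source filesystem differs. Below is a minimal sketch of that comparison using a local temp file in place of the s3a path; the class and method names (`TimestampCheck`, `unchanged`) are hypothetical, not Hadoop API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

public class TimestampCheck {

    // Mirrors the comparison in FSDownload.verifyAndCopy: the expected
    // mtime is captured at submission time; any later change to the file's
    // mtime on the source filesystem makes the check fail.
    static boolean unchanged(Path resource, long expectedMtime) throws IOException {
        return Files.getLastModifiedTime(resource).toMillis() == expectedMtime;
    }

    public static void main(String[] args) throws IOException {
        Path jar = Files.createTempFile("libjars", ".jar");
        long expected = Files.getLastModifiedTime(jar).toMillis();
        System.out.println(unchanged(jar, expected));   // mtime matches

        // Touching the file after "submission" shifts its mtime, which is
        // exactly what produces "changed on src filesystem".
        Files.setLastModifiedTime(jar, FileTime.fromMillis(expected + 7000));
        System.out.println(unchanged(jar, expected));   // check fails

        Files.delete(jar);
    }
}
```

With an object store like s3a, one plausible way the two values drift apart is that the object's stored timestamp is only finalized when the upload completes, after the client has already recorded the expected value; this is an assumption about the cause, not something the error log proves.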
Build hadoop-yarn-project and copy `hadoop-yarn-common-x.x.x.jar` to `$HADOOP_HOME/share/hadoop/yarn`.
I'll leave this thread here, and thanks for any further explanation about how to fix it without changing the Hadoop source.