How to fix the "resource changed on src filesystem" issue


Problem description

I'm trying to run SQL with Hive on MR, and it fails halfway through with the errors below:

Application application_1570514228864_0001 failed 2 times due to AM Container for appattempt_1570514228864_0001_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: [2019-10-08 13:57:49.272]Failed to download resource { { s3a://tpcds/tmp/hadoop-yarn/staging/root/.staging/job_1570514228864_0001/libjars, 1570514262820, FILE, null },pending,[(container_1570514228864_0001_02_000001)],1132444167207544,DOWNLOADING} java.io.IOException: Resource s3a://tpcds/tmp/hadoop-yarn/staging/root/.staging/job_1570514228864_0001/libjars changed on src filesystem (expected 1570514262820, was 1570514269265

From my perspective, the key message in the error log is that libjars changed on the src filesystem (expected 1570514262820, was 1570514269265). There are several threads about this issue on SO that have not been answered yet, such as thread1 and thread2.
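
To make the two numbers in the log easier to read, here is a minimal Java sketch that treats them as epoch milliseconds and computes how far apart they are. The class name TimestampGap and the helper gap are made up for this illustration; only the two values come from the error message.

```java
import java.time.Duration;
import java.time.Instant;

public class TimestampGap {
    // Hypothetical helper: gap between the timestamp recorded for the
    // resource and the modification time observed at download time.
    public static Duration gap(long expectedMillis, long actualMillis) {
        return Duration.between(Instant.ofEpochMilli(expectedMillis),
                                Instant.ofEpochMilli(actualMillis));
    }

    public static void main(String[] args) {
        long expected = 1570514262820L; // "expected" value from the error log
        long actual = 1570514269265L;   // "was" value from the error log

        System.out.println("expected: " + Instant.ofEpochMilli(expected));
        System.out.println("was:      " + Instant.ofEpochMilli(actual));
        System.out.println("gap (ms): " + gap(expected, actual).toMillis());
    }
}
```

The gap is 6445 ms, so the two timestamps are only about six seconds apart.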

I found some valuable information in the Apache JIRA and the Red Hat Bugzilla. I synced the clocks on all related nodes via NTP, but the same issue persists.

Any comments are welcome, thanks.

Recommended answer

I still don't know why the timestamp of the resource file is inconsistent, and AFAIK there is no way to fix it through configuration.

However, I managed to find a workaround that skips the issue. Let me summarize it here for anyone who might run into the same issue.

By checking the error log and searching for the message in the Hadoop source code, we can trace the issue to hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/FSDownload.java.

Remove the exception-throwing statement:

  private void verifyAndCopy(Path destination)
      throws IOException, YarnException {
    final Path sCopy;
    try {
      sCopy = resource.getResource().toPath();
    } catch (URISyntaxException e) {
      throw new IOException("Invalid resource", e);
    }
    FileSystem sourceFs = sCopy.getFileSystem(conf);
    FileStatus sStat = sourceFs.getFileStatus(sCopy);
    if (sStat.getModificationTime() != resource.getTimestamp()) {
      // Workaround: the original check threw an IOException here. It is
      // commented out so that a timestamp mismatch no longer fails the download.
      /*
      throw new IOException("Resource " + sCopy +
          " changed on src filesystem (expected " + resource.getTimestamp() +
          ", was " + sStat.getModificationTime());
      */
      LOG.debug("[Gearon][Info] The timestamp is not consistent among resource files.\n" +
          "Stop throwing exception. It doesn't affect other modules.");
    }
    if (resource.getVisibility() == LocalResourceVisibility.PUBLIC) {
      if (!isPublic(sourceFs, sCopy, sStat, statCache)) {
        throw new IOException("Resource " + sCopy +
            " is not publicly accessible and as such cannot be part of the" +
            " public cache.");
      }
    }

    downloadAndUnpack(sCopy, destination);
  }
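
For anyone who wants to see the effect of the change without a Hadoop build, here is a small standalone Java sketch of the same comparison. It does not use any Hadoop classes: the class TimestampCheckSketch and the methods verifyStrict and verifyLenient are hypothetical names invented for this illustration. The strict variant mirrors the original check, and the lenient variant mirrors the patched behaviour.

```java
import java.io.IOException;

public class TimestampCheckSketch {
    // Strict variant (hypothetical): same comparison as the original check,
    // with the message format copied from the snippet above.
    public static void verifyStrict(String resource, long expected, long actual)
            throws IOException {
        if (actual != expected) {
            throw new IOException("Resource " + resource
                + " changed on src filesystem (expected " + expected
                + ", was " + actual);
        }
    }

    // Lenient variant (hypothetical): same comparison, but a mismatch is only
    // reported and the caller is allowed to continue, as in the workaround.
    public static boolean verifyLenient(String resource, long expected, long actual) {
        if (actual != expected) {
            System.err.println("Ignoring timestamp mismatch for " + resource
                + " (expected " + expected + ", was " + actual + ")");
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        String libjars = "s3a://tpcds/tmp/hadoop-yarn/staging/root/.staging/"
            + "job_1570514228864_0001/libjars";
        long expected = 1570514262820L; // values taken from the error log
        long actual = 1570514269265L;

        // Patched behaviour: the mismatch is reported, nothing is thrown.
        System.out.println("timestamps match: " + verifyLenient(libjars, expected, actual));

        // Original behaviour: the same input aborts with an IOException.
        try {
            verifyStrict(libjars, expected, actual);
        } catch (IOException e) {
            System.out.println("strict check failed: " + e.getMessage());
        }
    }
}
```

Note that with the lenient behaviour, a resource that really was modified after job submission would also go undetected, since the timestamp comparison is the only thing being skipped.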

Build hadoop-yarn-project and copy `hadoop-yarn-common-x.x.x.jar` to `$HADOOP_HOME/share/hadoop/yarn`.

I'll leave this thread here. Thanks in advance for any further explanation of how to fix it without changing the Hadoop source.
