Failed to report status for 600 seconds. Killing! Reporting progress in hadoop


Problem Description

I receive the following error:

Task attempt_201304161625_0028_m_000000_0 failed to report status for 600 seconds. Killing! 

for my Map jobs. This question is similar to this, this, and this. However, I do not want to increase the default time before Hadoop kills a task that doesn't report progress, i.e.,

Configuration conf = new Configuration();
long milliSeconds = 1000 * 60 * 60; // one hour
conf.setLong("mapred.task.timeout", milliSeconds);

Instead, I want to report progress periodically using context.progress(), context.setStatus("Some Message"), context.getCounter(SOME_ENUM.PROGRESS).increment(1), or something similar. However, this still causes the job to be killed. Here are the snippets of code where I am attempting to report progress. The mapper:

protected void map(Key key, Value value, Context context) throws IOException, InterruptedException {

    //do some things
    Optimiser optimiser = new Optimiser();
    optimiser.optimiseFurther(<some parameters>, context);
    //more things
    context.write(newKey, newValue);
}

the optimiseFurther method within the Optimiser class:

public void optimiseFurther(<Some parameters>, TaskAttemptContext context) {

    int count = 0;
    while(something is true) {
        //optimise

        //try to report progress
        context.setStatus("Progressing:" + count);
        System.out.println("Optimise Progress:" + context.getStatus());
        context.progress();
        count++;
    }
}

The output from a mapper shows the status is being updated:

Optimise Progress:Progressing:0
Optimise Progress:Progressing:1
Optimise Progress:Progressing:2
...

However, the job is still being killed after the default amount of time. Am I using the context in the wrong way? Is there anything else I need to do in the job setup in order to report the progress successfully?

Solution

This problem is due to a bug in Hadoop 0.20 whereby calls to context.setStatus() and context.progress() are not reported to the underlying framework (calls that set various counters do not work either). A patch is available, so updating to a newer version of Hadoop should fix this.
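
For completeness, here is a minimal sketch of the counter-based heartbeat the question mentions, as it would look on a version of Hadoop where the bug is fixed. The HeartbeatMapper class name, the Heartbeat enum, and the LongWritable/Text key-value types are illustrative assumptions, and the loop body stands in for the long-running optimisation:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class HeartbeatMapper extends Mapper<LongWritable, Text, LongWritable, Text> {

    // Hypothetical application-defined counter enum; any enum works.
    public enum Heartbeat { PROGRESS }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (int i = 0; i < 1000; i++) {
            // ...long-running optimisation step stands in here...

            // On a fixed Hadoop version, each of these calls resets the
            // task's 600-second liveness timer.
            context.getCounter(Heartbeat.PROGRESS).increment(1);
            context.setStatus("Progressing: " + i);
            context.progress();
        }
        context.write(key, value);
    }
}

Counter increments have the side benefit of being visible in the job tracker's web UI, which makes it easy to confirm that the heartbeat is actually reaching the framework rather than being silently dropped.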
