How to fix "Task attempt_201104251139_0295_r_000006_0 failed to report status for 600 seconds."

Problem description

I wrote a MapReduce job to extract some information from a dataset of users' movie ratings. There are about 250K users and about 300K movies. The map output is <user, <movie, rating>*> and <movie, <user, rating>*>. In the reducer, I process these pairs.

When I run the job, the mapper completes as expected, but the reducer always complains that

Task attempt_* failed to report status for 600 seconds.

I know this happens because the task fails to update its status, so I added a call to context.progress() in my code like this:

int count = 0;
while (values.hasNext()) {
  if (count++ % 100 == 0) {
    context.progress();
  }
  /*other code here*/
}

Unfortunately, this does not help. Many reduce tasks still fail.
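For reference, the reporting pattern in the snippet above can be written out as a small self-contained sketch. The `Reporter` interface and the `consume` helper are hypothetical stand-ins introduced only for illustration (they are not Hadoop classes); the sketch only shows when the progress call fires in such a loop.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ProgressSketch {
    // Hypothetical stand-in for the Hadoop task context; only progress() is modeled.
    interface Reporter {
        void progress();
    }

    // Consumes all values and pings the reporter on every `interval`-th record
    // (records 0, interval, 2*interval, ...). Returns the number of pings sent.
    static <T> int consume(Iterator<T> values, Reporter reporter, int interval) {
        int count = 0;
        int pings = 0;
        while (values.hasNext()) {
            T value = values.next(); // advance the iterator; the original snippet elides this step
            if (count++ % interval == 0) {
                reporter.progress();
                pings++;
            }
            // other per-record work on `value` would go here
        }
        return pings;
    }

    public static void main(String[] args) {
        List<Integer> ratings = Arrays.asList(5, 3, 4, 1, 2);
        int pings = consume(ratings.iterator(), () -> { }, 2);
        System.out.println("progress() calls: " + pings);
    }
}
```

Because the call sits inside the loop, it only fires between records: time spent before the loop is reached, or on a single slow record, goes unreported in this pattern.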

Here is the log:

Task attempt_201104251139_0295_r_000014_1 failed to report status for 600 seconds. Killing!
11/05/03 10:09:09 INFO mapred.JobClient: Task Id : attempt_201104251139_0295_r_000012_1, Status : FAILED
Task attempt_201104251139_0295_r_000012_1 failed to report status for 600 seconds. Killing!
11/05/03 10:09:09 INFO mapred.JobClient: Task Id : attempt_201104251139_0295_r_000006_1, Status : FAILED
Task attempt_201104251139_0295_r_000006_1 failed to report status for 600 seconds. Killing!

BTW, the error happened in the copy phase of the reduce task. The log says:

reduce > copy (28 of 31 at 26.69 MB/s) > :Lost task tracker: tracker_hadoop-56:localhost/127.0.0.1:34385

Thanks for the help.

Solution

The easiest way is to set this configuration parameter:

<property>
  <name>mapred.task.timeout</name>
  <value>1800000</value> <!-- 30 minutes -->
</property>

in mapred-site.xml. The value is given in milliseconds.
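Since mapred.task.timeout is expressed in milliseconds, the 600 seconds in the error message corresponds to a value of 600000, and 30 minutes to 1800000. As a minimal sketch of that arithmetic, the helper below converts minutes to milliseconds and prints the matching property fragment. The class name `TaskTimeout` and its methods are hypothetical, introduced only for this example.

```java
import java.util.concurrent.TimeUnit;

class TaskTimeout {
    // Converts a timeout in minutes to the millisecond value the property expects.
    static long toMillis(long minutes) {
        return TimeUnit.MINUTES.toMillis(minutes);
    }

    // Builds the <property> fragment for mapred-site.xml for the given timeout.
    static String propertyXml(long minutes) {
        return "<property>\n"
             + "  <name>mapred.task.timeout</name>\n"
             + "  <value>" + toMillis(minutes) + "</value>\n"
             + "</property>";
    }

    public static void main(String[] args) {
        // Prints the fragment for a 30-minute timeout.
        System.out.println(propertyXml(30));
    }
}
```

Raising the timeout only gives the task more time before it is killed; it does not make a stalled task report progress.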
