Hadoop speculative task execution


Problem Description

In Google's MapReduce paper, they have backup tasks, which I think are the same thing as speculative tasks in Hadoop. How is a speculative task implemented? When a speculative task is started, does it start from the very beginning, like the older, slow one did, or does it pick up from where the older task has already reached (and if so, does it have to copy all the intermediate state and data)?

Solution

One problem with the Hadoop system is that by dividing the tasks across many nodes, it is possible for a few slow nodes to rate-limit the rest of the program.

Tasks may be slow for various reasons, including hardware degradation or software misconfiguration, but the causes may be hard to detect since the tasks still complete successfully, albeit after a longer time than expected. Hadoop doesn't try to diagnose and fix slow-running tasks; instead, it tries to detect when a task is running slower than expected and launches another, equivalent task as a backup. This is termed speculative execution of tasks.

For example, if one node has a slow disk controller, it may be reading its input at only 10% of the speed of all the other nodes. So when 99 map tasks are already complete, the system is still waiting on the final map task to check in, which takes much longer than all the others.

By forcing tasks to run in isolation from one another, individual tasks do not know where their inputs come from. Tasks trust the Hadoop platform to just deliver the appropriate input. Therefore, the same input can be processed multiple times in parallel, to exploit differences in machine capabilities. As most of the tasks in a job are coming to a close, the Hadoop platform will schedule redundant copies of the remaining tasks across several nodes which do not have other work to perform. This process is known as speculative execution. When tasks complete, they announce this fact to the JobTracker. Whichever copy of a task finishes first becomes the definitive copy. If other copies were executing speculatively, Hadoop tells the TaskTrackers to abandon the tasks and discard their outputs. The Reducers then receive their inputs from whichever Mapper completed successfully first.
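The "whichever copy finishes first becomes the definitive copy" rule can be pictured with a small toy program. This is purely illustrative and is not how Hadoop itself is implemented: two attempts process the same input, and only the first one to commit its result is kept, while the loser's output is discarded.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class FirstCopyWins {
    public static void main(String[] args) throws InterruptedException {
        // Holds the definitive output; only the first attempt to commit it succeeds.
        AtomicReference<String> definitiveOutput = new AtomicReference<>();
        ExecutorService pool = Executors.newFixedThreadPool(2);

        Runnable attempt = () -> {
            String result = "processed-record";  // both copies work on the same input
            if (definitiveOutput.compareAndSet(null, result)) {
                System.out.println(Thread.currentThread().getName()
                        + ": committed as the definitive copy");
            } else {
                System.out.println(Thread.currentThread().getName()
                        + ": discarded (another copy finished first)");
            }
        };

        pool.submit(attempt);  // the original task attempt
        pool.submit(attempt);  // the speculative backup attempt
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```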

Speculative execution is enabled by default. You can disable speculative execution for the mappers and reducers by setting the mapred.map.tasks.speculative.execution and mapred.reduce.tasks.speculative.execution JobConf options to false, respectively, when using the old API; with the newer API, you would instead set mapreduce.map.speculative and mapreduce.reduce.speculative to false.
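As a minimal sketch of what that looks like in driver code (the class name and job name below are placeholders, not from the original answer):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapreduce.Job;

public class DisableSpeculation {
    public static void main(String[] args) throws Exception {
        // Old (mapred) API: set the JobConf options named above to false.
        JobConf oldApiConf = new JobConf();
        oldApiConf.setBoolean("mapred.map.tasks.speculative.execution", false);
        oldApiConf.setBoolean("mapred.reduce.tasks.speculative.execution", false);
        // JobConf also provides setMapSpeculativeExecution(false) and
        // setReduceSpeculativeExecution(false) as convenience setters.

        // New (mapreduce) API: set the equivalent properties on the Configuration.
        Configuration conf = new Configuration();
        conf.setBoolean("mapreduce.map.speculative", false);
        conf.setBoolean("mapreduce.reduce.speculative", false);
        Job job = Job.getInstance(conf, "job-without-speculation");
        // ... set mapper/reducer classes and input/output paths,
        // then submit with job.waitForCompletion(true).
    }
}
```

The same properties can also be passed on the command line, for example with -D mapreduce.map.speculative=false, when the driver parses generic options via ToolRunner.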

So, to answer your question: the speculative task starts afresh, and it has nothing to do with how much the other task has already done or completed.

Reference: http://developer.yahoo.com/hadoop/tutorial/module4.html
