How to schedule Hadoop Map tasks in a multi-core 8-node cluster?

Views: 148 | Published: 2018/5/31 18:43:43 | Tags: hadoop, mapreduce, cloudera


Problem description

I have a "map only" program (no reduce phase). The input file is large enough to create 7 map tasks, which I have verified by looking at the output produced (part-000 to part-006). My cluster has 8 nodes, each with 8 cores and 8 GB of memory, and a shared filesystem hosted on the head node.

My question is: can I choose between running all 7 map tasks on a single node and running them on 7 different slave nodes (1 task per node)? If so, what changes are needed in my code and configuration file?

I tried setting the parameter "mapred.tasktracker.map.tasks.maximum" to 1 and to 7, in my code only, but I did not find any appreciable difference in run time. In my configuration file it is set to 1.

Solution

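"mapred.tasktracker.map.tasks.maximum" controls the maximum number of map tasks that may run at the same time on each node, not the number of nodes used for the job. In the Hadoop architecture there is 1 TaskTracker on each slave node and 1 JobTracker on the master node, so setting mapred.tasktracker.map.tasks.maximum only changes how many map tasks each node will execute concurrently. Each TaskTracker reads this property from its own configuration when it starts, so setting it in the job code does not change it, which would explain why you saw no difference in run time; change it in the configuration file on each node and restart the TaskTrackers instead. As a rule of thumb, "mapred.tasktracker.map.tasks.maximum" is set somewhere between 1/2 * (cores per node) and 2 * (cores per node).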
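To make the per-node arithmetic concrete, here is a minimal sketch in plain Java. It is not Hadoop code: the class and method names are hypothetical helpers, and the figure of 7 worker nodes is an assumption (the question mentions 8 nodes, one of which is the head node). It computes the rule-of-thumb range for 8 cores and the number of map tasks the cluster could run at once for a given per-node limit.

```java
// Sketch only: map-slot arithmetic for the cluster described in the question.
// MapSlotEstimate and its methods are hypothetical helpers, not Hadoop APIs.
class MapSlotEstimate {

    // Lower end of the rule of thumb: half the cores on a node (at least 1).
    static int minSlotsPerNode(int coresPerNode) {
        return Math.max(1, coresPerNode / 2);
    }

    // Upper end of the rule of thumb: twice the cores on a node.
    static int maxSlotsPerNode(int coresPerNode) {
        return 2 * coresPerNode;
    }

    // Map tasks the whole cluster can run concurrently.
    static int clusterMapSlots(int workerNodes, int slotsPerNode) {
        return workerNodes * slotsPerNode;
    }

    // Number of "waves" needed to run all map tasks (ceiling division).
    static int waves(int mapTasks, int clusterSlots) {
        return (mapTasks + clusterSlots - 1) / clusterSlots;
    }

    public static void main(String[] args) {
        int cores = 8;       // from the question
        int workers = 7;     // assumption: 8 nodes minus the head node
        int mapTasks = 7;    // from the question

        System.out.println("slots per node, rule of thumb: "
                + minSlotsPerNode(cores) + " to " + maxSlotsPerNode(cores));
        for (int slots : new int[] {1, 7}) {
            int total = clusterMapSlots(workers, slots);
            System.out.println("maximum=" + slots + " -> cluster slots=" + total
                    + ", waves=" + waves(mapTasks, total));
        }
    }
}
```

Under these assumptions, both a limit of 1 and a limit of 7 leave enough slots to run all 7 map tasks in a single wave. Which nodes the tasks actually land on is decided by the scheduler, not by this property.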

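The overall number of map tasks for the job is requested with JobConf.setNumMapTasks(int). Note that this value is only a hint to the framework: the actual number of map tasks is determined by the input splits that the InputFormat produces.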
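The value passed to setNumMapTasks(int) is treated by the framework as a hint rather than a hard setting. The sketch below is a simplified model, in plain Java, of how the old-API `org.apache.hadoop.mapred.FileInputFormat` is understood to derive the split count for a single file. The class name, the 448 MB file size, the 64 MB block size and the 1-byte minimum split size are all assumptions chosen so that the default case yields 7 splits, as in the question; they are not taken from the asker's cluster.

```java
// Sketch only: simplified model of old-API FileInputFormat split counting
// for one splittable file. SplitEstimate is a hypothetical helper class.
class SplitEstimate {

    // A final fragment up to 10% larger than a split is not split further.
    static final double SPLIT_SLOP = 1.1;

    // Split size is the block size, capped by the goal size and floored by
    // the configured minimum split size.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // requestedMaps plays the role of the value given to setNumMapTasks(int).
    static int countSplits(long fileSize, int requestedMaps,
                           long minSize, long blockSize) {
        long goalSize = fileSize / Math.max(1, requestedMaps);
        long splitSize = computeSplitSize(goalSize, minSize, blockSize);
        int splits = 0;
        long remaining = fileSize;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++; // last, possibly shorter, split
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileSize = 448 * mb;  // assumption: 7 blocks of 64 MB
        long blockSize = 64 * mb;  // assumption: 64 MB block size
        long minSize = 1;          // assumption: minimum split size of 1 byte

        for (int requested : new int[] {1, 2, 14}) {
            System.out.println("requested=" + requested + " -> splits="
                    + countSplits(fileSize, requested, minSize, blockSize));
        }
    }
}
```

In this model, requesting 1 or 2 map tasks still produces 7 splits because the block size caps the split size, while requesting 14 halves the split size and produces 14. Asking for fewer map tasks than there are blocks therefore has no effect unless the minimum split size is raised as well.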
