How to schedule Hadoop Map tasks in a multi-core 8-node cluster?


Problem description

I have a "map-only" program (no reduce phase). The input file is large enough to create 7 map tasks, which I verified by looking at the output produced (part-000 to part-006). My cluster has 8 nodes, each with 8 cores and 8 GB of memory, and a shared filesystem hosted on the head node.

My question is whether I can choose between running all 7 map tasks on a single node and running them on 7 different slave nodes (1 task per node). If so, what changes are needed in my code and configuration files?

I tried setting the parameter "mapred.tasktracker.map.tasks.maximum" to 1 and to 7, in my code only, but I did not find any appreciable difference in run time. In my configuration file it is set to 1.

Recommended answer

"mapred.tasktracker.map.tasks.maximum" controls how many map tasks may run at the same time on each node, not how many nodes are used for the map tasks. In the Hadoop architecture there is one TaskTracker on each slave node and one JobTracker on the master node. So setting mapred.tasktracker.map.tasks.maximum only changes the number of map tasks each node executes at once. The usual range for "mapred.tasktracker.map.tasks.maximum" is from 1/2 * cores per node to 2 * cores per node, which is 4 to 16 on 8-core nodes.

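To make the per-node limit concrete, here is a small Java sketch of the arithmetic. The class and method names are made up for illustration and are not part of Hadoop. It only shows what the limit permits; where tasks actually land is still decided by the JobTracker's scheduler.

```java
// Illustrative arithmetic only. MapSlotSketch is a hypothetical helper,
// not a Hadoop class, and it does not talk to a cluster.
final class MapSlotSketch {

    // Lower end of the rule of thumb from the answer: cores / 2 (at least 1).
    static int minSlotsPerNode(int coresPerNode) {
        return Math.max(1, coresPerNode / 2);
    }

    // Upper end of the rule of thumb from the answer: 2 * cores.
    static int maxSlotsPerNode(int coresPerNode) {
        return 2 * coresPerNode;
    }

    // Fewest nodes able to run all map tasks at once (ceiling division).
    static int minNodesNeeded(int mapTasks, int slotsPerNode) {
        return (mapTasks + slotsPerNode - 1) / slotsPerNode;
    }

    public static void main(String[] args) {
        int cores = 8;     // cores per node, from the question
        int mapTasks = 7;  // map tasks, from the question

        System.out.println("slots per node: " + minSlotsPerNode(cores)
                + " to " + maxSlotsPerNode(cores));

        // The two settings tried in the question.
        for (int slots : new int[] {1, 7}) {
            System.out.println(slots + " slot(s) per node: at least "
                    + minNodesNeeded(mapTasks, slots) + " node(s) for "
                    + mapTasks + " tasks");
        }
    }
}
```

With a limit of 1, the 7 tasks need 7 different nodes to run at the same time; with a limit of 7, a single node could hold all of them, but nothing forces the scheduler to place them there. As far as I know, each TaskTracker reads this property from its own mapred-site.xml when it starts, which would explain why changing it only in job code made no measurable difference.
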
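To change the number of map tasks itself, setNumMapTasks(int) on the JobConf should be used. Note that Hadoop treats this value only as a hint.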
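
The following Java sketch shows why the value passed to setNumMapTasks(int) behaves as a hint. It is a simplified reading of how the old org.apache.hadoop.mapred.FileInputFormat sizes splits for a single file, not the actual Hadoop source. The class name, the 448 MB file size and the 64 MB block size are assumptions chosen so that the default comes out at 7 splits, as in the question.

```java
// Simplified sketch of split counting for one file. SplitHintSketch is a
// hypothetical class; the real logic lives inside Hadoop's FileInputFormat.
final class SplitHintSketch {

    // A final chunk up to 10% larger than the split size is not split again.
    static final double SPLIT_SLOP = 1.1;

    // Split size is capped by the block size and floored by the minimum size.
    static long splitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // requestedMaps plays the role of the setNumMapTasks(int) hint.
    static int countSplits(long fileBytes, int requestedMaps,
                           long minSize, long blockSize) {
        long goalSize = fileBytes / Math.max(1, requestedMaps);
        long size = splitSize(goalSize, minSize, blockSize);
        int splits = 0;
        long remaining = fileBytes;
        while ((double) remaining / size > SPLIT_SLOP) {
            splits++;
            remaining -= size;
        }
        if (remaining != 0) {
            splits++; // last, possibly shorter, split
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileBytes = 448 * mb;  // assumed input size (7 blocks)
        long blockSize = 64 * mb;   // assumed block size
        for (int hint : new int[] {1, 2, 14}) {
            System.out.println("hint " + hint + " gives "
                    + countSplits(fileBytes, hint, 1, blockSize) + " splits");
        }
    }
}
```

Under these assumptions a hint smaller than the number of blocks leaves the count at 7, because the split size is capped at the block size, while a larger hint such as 14 shrinks the splits and raises the count. Either way, the hint changes how many map tasks exist, not which nodes run them.
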
