Correct order of various phases of MR job?

Views: 341 Posted: 2020/5/5 15:48:48 Tags: hadoop mapreduce yarn hadoop2

Question

I am trying to understand the various phases that an MR job goes through. I read the online documentation on this.

Based on this, my understanding of the sequence is as follows:

map() -> Partitioner -> Sorting (at mapper machine) -> Shuffle -> Sorting (at reducer machine) -> groupBy(Key) (at reducer machine) -> reduce()

Is this the correct sequence in which an MR job executes?

Recommended answer

Various phases of a MapReduce job:

Map phase

Reads the assigned input split from HDFS

Parses the input into records as key-value pairs

Applies the map function to each record

Informs the master node of its completion
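
The map-phase steps above can be illustrated with a minimal in-memory sketch. It is not Hadoop code: the split is a plain string rather than a block read from HDFS, and `read_records`, `word_count_map`, and `run_map_task` are hypothetical names chosen for this example.

```python
from typing import Callable, Iterator, List, Tuple

KeyValue = Tuple[str, int]


def read_records(split_text: str) -> Iterator[Tuple[int, str]]:
    """Parse an input split into (byte offset, line) records.

    Assumption: line-oriented text input held in memory; a real job
    would read its assigned split from HDFS.
    """
    offset = 0
    for line in split_text.splitlines(keepends=True):
        yield offset, line.rstrip("\n")
        offset += len(line.encode("utf-8"))


def word_count_map(offset: int, line: str) -> Iterator[KeyValue]:
    """Hypothetical map function: emit (word, 1) for each word in the line."""
    for word in line.split():
        yield word, 1


def run_map_task(
    split_text: str,
    map_fn: Callable[[int, str], Iterator[KeyValue]],
) -> List[KeyValue]:
    """Apply map_fn to every record of the split and collect its output."""
    output: List[KeyValue] = []
    for offset, line in read_records(split_text):
        output.extend(map_fn(offset, line))
    return output
```

Calling `run_map_task("a b\nb c\n", word_count_map)` yields one `(word, 1)` pair per word, in input order.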

Partition phase

Each mapper must determine which reducer will receive each of its outputs

For any given key, the destination partition is always the same

Number of partitions = number of reducers
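
A hash-based partitioner satisfies all three points above. The sketch below takes a hash of the key modulo the number of reducers, which is the general idea behind hash partitioning. Using `zlib.crc32` as the hash is an assumption made so that results are stable across Python processes, and the function names are hypothetical.

```python
import zlib
from typing import List, Tuple

KeyValue = Tuple[str, int]


def partition_for(key: str, num_reducers: int) -> int:
    """Return the reducer index for key; equal keys always get the same index."""
    # zlib.crc32 stands in for the key's hash function. Python's built-in
    # hash() is randomized per process for strings, so it is avoided here.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def partition_map_output(pairs: List[KeyValue], num_reducers: int) -> List[List[KeyValue]]:
    """Split one mapper's output into num_reducers buckets, one per reducer."""
    buckets: List[List[KeyValue]] = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition_for(key, num_reducers)].append((key, value))
    return buckets
```

Because the index depends only on the key and the reducer count, every mapper sends a given key to the same reducer without any coordination between mappers.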

Shuffle phase

Fetches input data from all map tasks for the portion corresponding to the reduce task's bucket
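
As a rough in-memory picture of this step: each reduce task pulls its own bucket from every map task's partitioned output. The nested-list layout and the name `shuffle_for_reducer` are assumptions for illustration; a real cluster copies these segments over the network.

```python
from typing import List, Tuple

KeyValue = Tuple[str, int]


def shuffle_for_reducer(
    map_outputs: List[List[List[KeyValue]]],
    reducer_index: int,
) -> List[List[KeyValue]]:
    """Collect the bucket for reducer_index from every map task.

    map_outputs[m][r] holds the (key, value) pairs that map task m
    produced for reducer r. The result has one segment per map task.
    """
    return [buckets[reducer_index] for buckets in map_outputs]
```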

Sort phase

Merge-sorts all map outputs into a single run
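
Assuming each fetched segment is already sorted by key on the mapper side (as in the sequence described in the question), the reducer only needs a k-way merge rather than a full sort. A minimal sketch using Python's `heapq.merge`, with `merge_sorted_segments` as a hypothetical name:

```python
import heapq
from operator import itemgetter
from typing import List, Tuple

KeyValue = Tuple[str, int]


def merge_sorted_segments(segments: List[List[KeyValue]]) -> List[KeyValue]:
    """Merge segments that are each already sorted by key into one sorted run."""
    # heapq.merge performs a k-way merge and never re-sorts a segment;
    # it assumes every input segment is sorted by the same key.
    return list(heapq.merge(*segments, key=itemgetter(0)))
```

After the merge, all pairs that share a key sit next to each other, which is what lets the reduce phase see each key together with all of its values.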

Reduce phase

Applies the user-defined reduce function to the merged run

The arguments are the key and the corresponding list of values

Writes the output to a file in HDFS
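
The grouping and reduce steps can be sketched as follows, assuming the input is a run already sorted by key. `sum_reduce` and `run_reduce_task` are hypothetical names, and the result is returned as a list instead of being written to HDFS.

```python
from itertools import groupby
from operator import itemgetter
from typing import Callable, Iterable, Iterator, List, Tuple

KeyValue = Tuple[str, int]


def sum_reduce(key: str, values: Iterable[int]) -> Iterator[KeyValue]:
    """Hypothetical reduce function: sum all values seen for one key."""
    yield key, sum(values)


def run_reduce_task(
    merged_run: List[KeyValue],
    reduce_fn: Callable[[str, Iterable[int]], Iterator[KeyValue]],
) -> List[KeyValue]:
    """Call reduce_fn once per distinct key of a key-sorted run."""
    output: List[KeyValue] = []
    # groupby only groups adjacent items, so the run must be sorted by key.
    for key, group in groupby(merged_run, key=itemgetter(0)):
        values = (value for _, value in group)
        output.extend(reduce_fn(key, values))
    return output
```

The reduce function receives each key exactly once, together with an iterable over that key's values.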
