What runs first: the partitioner or the combiner?


Question

I was wondering which runs first: the partitioner or the combiner?

My understanding was that the partitioner runs first, then the combiner, and then the keys are redirected to different reducers. But that last step looks like the partitioner's job again, so I'm confused. Please help me understand.

Answer

The direct answer to your question is: the combiner.

Details: A combiner can be viewed as a mini-reducer in the map phase. It performs a local reduce on the mapper's results before they are distributed further. Once the combiner has run, its output is passed on to the reducer for further work.
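To make the "local reduce" idea concrete, here is a minimal sketch in plain Java. It does not use the Hadoop API: the class name CombinerSketch and the helper combine are hypothetical, and a word-count style job (summing integer counts per key) is assumed purely for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CombinerSketch {
    // Hypothetical helper (not a Hadoop API): sums the counts that ONE mapper
    // emitted for each key, which is what a word-count combiner would do locally.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> mapperOutput) {
        Map<String, Integer> combined = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> record : mapperOutput) {
            // Add this record's value to the running total for its key.
            combined.merge(record.getKey(), record.getValue(), Integer::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        // Assumed output of a single mapper: three (word, 1) records.
        List<Map.Entry<String, Integer>> mapperOutput = new ArrayList<>();
        mapperOutput.add(Map.entry("apple", 1));
        mapperOutput.add(Map.entry("banana", 1));
        mapperOutput.add(Map.entry("apple", 1));

        // Three records shrink to two before anything leaves the map side.
        System.out.println(combine(mapperOutput)); // prints {apple=2, banana=1}
    }
}
```

The point of the sketch is only that fewer records need to travel to the reducers. In a real job the combiner is typically a Reducer class registered on the job, and the framework decides whether and how often to call it.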

The partitioner, on the other hand, comes into the picture when we are working with more than one reducer. The partitioner decides which reducer is responsible for a particular key. It takes the mapper result (or the combiner result, if a combiner is used) and sends it to the responsible reducer based on the key.
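As a rough illustration of "decides which reducer is responsible for a particular key", the sketch below applies the same formula as Hadoop's default HashPartitioner (hash code with the sign bit masked off, modulo the number of reduce tasks). It is plain Java rather than the Hadoop API: the class PartitionerSketch is hypothetical, and String keys stand in for Hadoop's Text keys, whose hash codes differ. It illustrates the partitioner's role only and makes no claim about how Hadoop schedules it internally.

```java
public class PartitionerSketch {
    // Same formula as Hadoop's default HashPartitioner, applied to a plain String
    // key (an assumption for illustration; real jobs use Writable keys).
    static int getPartition(String key, int numReduceTasks) {
        // Masking with Integer.MAX_VALUE clears the sign bit so the result
        // of the modulo is never negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 3; // assumed number of reducers
        for (String key : new String[] {"apple", "banana", "apple"}) {
            // The same key always maps to the same reducer index.
            System.out.println(key + " -> reducer " + getPartition(key, numReduceTasks));
        }
    }
}
```

Because the partition depends only on the key and the reducer count, every record for a given key ends up at the same reducer, whether or not a combiner has already shrunk the mapper's output.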

For a better understanding, you can refer to the following image, which I have taken from the Yahoo Developer Tutorial on Hadoop.

Here is the tutorial.
