hadoop: difference between 0 reducer and identity reducer?
Question
I am just trying to confirm my understanding of the difference between 0 reducers and an identity reducer.
- 0 reducers means the reduce step will be skipped and the mapper output will be the final output?
- Identity reducer means shuffling/sorting will still take place?
Answer
Your understanding is correct. I would define it as follows:
If you do not need the map results sorted, you set the number of reducers to 0, and the job is called map-only.
If you need the map results sorted but do not need any aggregation, you choose the identity reducer.
And to complete the picture, there is a third case: we do need aggregation, and in that case we need a reducer that performs it.
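The three cases can be illustrated with a small plain-Java model. This is only a sketch and does not use the Hadoop API: the class `ReducerModes` and its methods are hypothetical names, the mapper is assumed to emit `(word, 1)` pairs, and a single reducer is assumed so that the sort is global. In a real job, the map-only case is normally configured with `job.setNumReduceTasks(0)`, and identity behavior comes from the base `org.apache.hadoop.mapreduce.Reducer` class (or `IdentityReducer` in the older `mapred` API).

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of the three cases; this is NOT Hadoop API code.
class ReducerModes {

    // Hypothetical mapper: emits (word, 1) for every token, in input order.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String line : lines) {
            for (String token : line.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    out.add(new SimpleEntry<>(token, 1));
                }
            }
        }
        return out;
    }

    // Models shuffle/sort for a single reducer: group values by key, keys sorted.
    static TreeMap<String, List<Integer>> shuffleAndSort(List<Map.Entry<String, Integer>> pairs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> kv : pairs) {
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        return grouped;
    }

    // Case 1, 0 reducers: no shuffle/sort, the mapper output is the final output.
    static List<String> mapOnly(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> kv : map(lines)) {
            out.add(kv.getKey() + "\t" + kv.getValue());
        }
        return out;
    }

    // Case 2, identity reducer: shuffle/sort runs, every value is written unchanged.
    static List<String> identityReduce(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> group : shuffleAndSort(map(lines)).entrySet()) {
            for (Integer value : group.getValue()) {
                out.add(group.getKey() + "\t" + value);
            }
        }
        return out;
    }

    // Case 3, aggregating reducer: shuffle/sort runs, values per key are summed.
    static List<String> sumReduce(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> group : shuffleAndSort(map(lines)).entrySet()) {
            int sum = 0;
            for (Integer value : group.getValue()) {
                sum += value;
            }
            out.add(group.getKey() + "\t" + sum);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("b a", "c a");
        System.out.println(mapOnly(input));        // mapper emission order, unsorted
        System.out.println(identityReduce(input)); // sorted by key, no aggregation
        System.out.println(sumReduce(input));      // sorted by key, one line per key
    }
}
```

With the input above, the map-only result keeps the order in which the mapper emitted the pairs, the identity result contains the same pairs ordered by key, and the aggregating result has one summed line per key. With several reducers in a real cluster, each reducer's output is sorted only within its own partition.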