"Combiner" class in a MapReduce job

Question

A Combiner runs after the Mapper and before the Reducer. It receives as input all data emitted by the Mapper instances on a given node, then emits its output to the Reducers.

Also, if a reduce function is both commutative and associative, it can be used as a Combiner.

My question is: what does the phrase "commutative and associative" mean in this situation?

Recommended answer

Assume you have a list of numbers: 1 2 3 4 5 6.

Associative here means you can apply your operation to any grouping of the inputs, then apply it again to the results of those groups, and get the same answer:

(1) + (2 + 3) + (4 + 5 + 6)
  ==
(1 + 2) + (3 + 4) + (5) + (6)
  ==
...

Think of each pair of parentheses here as one execution of a combiner.
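The grouping idea can be checked with a small sketch. This is plain Python rather than Hadoop code, and `combine` and `reduce_all` are hypothetical stand-ins for a combiner run and the final reduce step; the two groupings are the ones shown above.

```python
from functools import reduce
from operator import add

def combine(chunk):
    """Hypothetical stand-in for one combiner run over a mapper's local output."""
    return reduce(add, chunk)

def reduce_all(partials):
    """Hypothetical stand-in for the reducer, which sees only the partial results."""
    return reduce(add, partials)

numbers = [1, 2, 3, 4, 5, 6]

# Two different ways the same data might be split across combiner runs.
grouping_a = [[1], [2, 3], [4, 5, 6]]
grouping_b = [[1, 2], [3, 4], [5], [6]]

result_a = reduce_all([combine(chunk) for chunk in grouping_a])
result_b = reduce_all([combine(chunk) for chunk in grouping_b])
no_combiner = reduce_all(numbers)

# Because addition is associative, all three agree.
print(result_a, result_b, no_combiner)
```

Since the job author does not control how the data is grouped, the operation has to give the same answer for every grouping, not just these two.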

Commutative means that the order of the inputs doesn't matter, so:

1 + 2 + 3 + 4 + 5 + 6
  ==
2 + 4 + 6 + 1 + 3 + 5
  ==
...
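A minimal sketch of the ordering property, again in plain Python. The reordered list is an arbitrary permutation of the same six numbers, and subtraction is included only as an assumed counterexample that is not part of the original answer.

```python
from functools import reduce
from operator import add, sub

in_order = [1, 2, 3, 4, 5, 6]
reordered = [2, 4, 6, 1, 3, 5]  # same values, different arrival order

sum_in_order = reduce(add, in_order)
sum_reordered = reduce(add, reordered)

# Subtraction is not commutative: swapping the operands changes the result,
# so it could not safely be used as a combiner.
diff_forward = reduce(sub, [6, 1])   # 6 - 1
diff_backward = reduce(sub, [1, 6])  # 1 - 6

print(sum_in_order == sum_reordered)   # addition ignores order
print(diff_forward == diff_backward)   # subtraction does not
```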

For example, addition has both properties, as shown above. "Maximum" has them as well: the max of the maxes is the max, and max(a, b) == max(b, a).

Median is an example that doesn't work: the median of medians is not, in general, the true median.
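To make the contrast concrete, here is a small sketch comparing the two operations. The chunk contents are made-up sample data, with each inner list standing for the values one combiner run might see.

```python
from statistics import median

# Hypothetical split of the data across two combiner runs.
chunks = [[1, 2, 3], [4, 5, 100, 200, 300]]
all_values = [v for chunk in chunks for v in chunk]

# Max: combining per chunk, then combining the partial results, is safe.
max_of_maxes = max([max(chunk) for chunk in chunks])
true_max = max(all_values)

# Median: the same two-stage approach gives a different answer.
median_of_medians = median([median(chunk) for chunk in chunks])
true_median = median(all_values)

print(max_of_maxes, true_max)
print(median_of_medians, true_median)
```

With this data the per-chunk medians are 2 and 100, so the two-stage result is 51, while the median of all eight values is 4.5.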

Don't forget another important property of a combiner: its input key/value types and its output key/value types need to be the same, because the reducer has to consume the combiner's output just as it would consume the mapper's output. For example, a combiner can't take in string:int and return string:float.

Often, the reducer outputs some sort of string instead of a numerical value, which may prevent you from just plugging in your reducer as the combiner.
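The following sketch illustrates that last point in plain Python. It does not use the Hadoop API, and `sum_reducer`, `report_reducer`, and the key `"a"` are hypothetical names chosen for illustration. In a typed framework the mismatch would be caught by the declared key/value types; here it shows up as a runtime `TypeError`.

```python
from typing import Iterable, Tuple

def sum_reducer(key: str, values: Iterable[int]) -> Tuple[str, int]:
    """(str, int) in, (str, int) out: input and output types match."""
    return key, sum(values)

def report_reducer(key: str, values: Iterable[int]) -> Tuple[str, str]:
    """(str, int) in, (str, str) out: usable as the final reducer only."""
    return key, f"total={sum(values)}"

# Hypothetical values for key "a" emitted by mappers on two nodes.
mapper_output = [[1, 2, 3], [4, 5, 6]]

# Combining with sum_reducer yields ints, which report_reducer accepts.
partials = [sum_reducer("a", chunk)[1] for chunk in mapper_output]
final = report_reducer("a", partials)

# Combining with report_reducer yields strings, so the reduce step fails.
bad_partials = [report_reducer("a", chunk)[1] for chunk in mapper_output]
try:
    report_reducer("a", bad_partials)
    reuse_failed = False
except TypeError:
    reuse_failed = True

print(final, reuse_failed)
```

One common way around this is the split shown here: keep a type-preserving function for the combiner and do the formatting only in the final reducer.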
