Hadoop MapReduce: Clarification on number of reducers

Question

In the MapReduce framework, one reducer is used for each key generated by the mapper.

So you would think that specifying the number of reducers in Hadoop MapReduce wouldn't make any sense, because it depends on the program. However, Hadoop allows you to specify the number of reducers to use (-D mapred.reduce.tasks=# of reducers).

What does this mean? Does the parameter value for the number of reducers specify how many machine resources go to the reducers, rather than the number of reducers actually used?

Recommended answer

"one reducer is used for each key generated by the mapper"

This statement is not correct. The reduce() method is called once for each group of keys, as determined by the grouping comparator. A reducer (task) is a process that handles zero or more calls to reduce(). The property you refer to sets the number of reducer tasks.
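
To make the distinction concrete, here is a minimal sketch in plain Python that simulates the idea without using Hadoop at all. It assumes string keys, a hash modelled on Java's String.hashCode(), and the "non-negative hash modulo number of reduce tasks" rule used by Hadoop's default hash partitioner; real key types may hash differently. The names java_string_hash, partition, and simulate are hypothetical helpers made up for this illustration, and summing the values stands in for whatever reduce() would actually do.

```python
from collections import defaultdict


def java_string_hash(s: str) -> int:
    """Mimic Java's String.hashCode() for ASCII strings (signed 32-bit)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Convert the unsigned 32-bit value to a signed one, as Java would.
    return h - 0x100000000 if h >= 0x80000000 else h


def partition(key: str, num_reduce_tasks: int) -> int:
    """Pick a reducer task: non-negative hash modulo the task count."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reduce_tasks


def simulate(map_output, num_reduce_tasks):
    """Return {task index: [(key, reduced value), ...]}.

    Each tuple in a task's list stands for one reduce() call.
    """
    # One bucket per reducer task, created up front so idle tasks show up too.
    tasks = {i: defaultdict(list) for i in range(num_reduce_tasks)}
    for key, value in map_output:
        tasks[partition(key, num_reduce_tasks)][key].append(value)

    calls = {}
    for i, groups in tasks.items():
        # One simulated reduce() call per distinct key, in sorted key order.
        calls[i] = [(key, sum(values)) for key, values in sorted(groups.items())]
    return calls


map_output = [
    ("apple", 1), ("pear", 1), ("apple", 1),
    ("fig", 1), ("kiwi", 1), ("pear", 1),
]

for n in (1, 2, 10):
    result = simulate(map_output, n)
    total_calls = sum(len(c) for c in result.values())
    idle = sum(1 for c in result.values() if not c)
    print(f"{n} reducer task(s): {total_calls} reduce() calls, {idle} idle task(s)")
```

Whatever value is chosen for the number of reducer tasks, the number of simulated reduce() calls stays equal to the number of distinct keys (four here). Only the way those calls are spread across tasks changes, and with more tasks than keys some tasks receive no calls at all.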
