MapReduce: sort values


Question

My mapper produces this output:

Mapper: KEY, VALUE(Timestamp, someOtherAttributes)

My reducer receives:

Reducer: KEY, Iterable<VALUE(Timestamp, someOtherAttributes)>

I want Iterable<VALUE(Timestamp, someOtherAttributes)> to be ordered by the Timestamp attribute. Is there any way to implement this?

I would like to avoid manual sorting inside the Reducer code, because of the object reuse pitfall described here: http://cornercases.wordpress.com/2011/08/18/hadoop-object-reuse-pitfall-all-my-reducer-values-are-the-same/

I would have to deep-copy every object from the Iterable, which can cause huge memory overhead. :(((
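The pitfall behind that concern can be reproduced without Hadoop. The sketch below is only an illustration under stated assumptions: `Value`, `reusingIterable`, `collectWithoutCopy`, and `collectWithCopy` are hypothetical names, and the iterator merely imitates the way the reducer's Iterable hands back one reused instance on each step. Keeping references without copying leaves every list entry pointing at the last value, which is why sorting inside the reducer needs a copy per element.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Plain-Java imitation of the object reuse pitfall; no Hadoop classes are used.
final class ObjectReuseSketch {

    // Hypothetical mutable value type standing in for a Writable.
    static final class Value {
        long timestamp;
        Value(long timestamp) { this.timestamp = timestamp; }
        Value copy() { return new Value(timestamp); }
    }

    // Returns an Iterable whose iterator overwrites and returns ONE shared instance.
    static Iterable<Value> reusingIterable(long[] timestamps) {
        return () -> new Iterator<Value>() {
            private final Value shared = new Value(0L);
            private int i = 0;
            @Override public boolean hasNext() { return i < timestamps.length; }
            @Override public Value next() {
                if (!hasNext()) throw new NoSuchElementException();
                shared.timestamp = timestamps[i++];
                return shared;
            }
        };
    }

    // Buggy: stores references to the shared instance, then sorts.
    static List<Long> collectWithoutCopy(Iterable<Value> values) {
        List<Value> kept = new ArrayList<>();
        for (Value v : values) kept.add(v);
        return sortedTimestamps(kept);
    }

    // Correct but memory-hungry: one copy per element before sorting.
    static List<Long> collectWithCopy(Iterable<Value> values) {
        List<Value> kept = new ArrayList<>();
        for (Value v : values) kept.add(v.copy());
        return sortedTimestamps(kept);
    }

    private static List<Long> sortedTimestamps(List<Value> kept) {
        List<Long> out = new ArrayList<>();
        for (Value v : kept) out.add(v.timestamp);
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        long[] ts = {3L, 1L, 2L};
        System.out.println(collectWithoutCopy(reusingIterable(ts))); // [2, 2, 2]
        System.out.println(collectWithCopy(reusingIterable(ts)));    // [1, 2, 3]
    }
}
```

The copying variant holds every value of a key in memory at once, which is the overhead the question wants to avoid.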

Recommended answer

It's relatively easy: you need to write a comparator class for your VALUE class.

Take a closer look here: http://vangjee.wordpress.com/2012/03/20/secondary-sorting-aka-sorting-values-in-hadoops-mapreduce-programming-paradigm/ especially at the "A solution for secondary sorting" part.
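As a rough illustration of the secondary-sorting idea, here is a plain-Java simulation that uses no Hadoop classes. It assumes the usual pattern of moving the Timestamp into a composite key, sorting on the full composite key, and grouping on the natural key alone, so each group's values arrive already ordered. `CompositeKey`, `Record`, `SORT_ORDER`, and `shuffle` are hypothetical names for this sketch. In a real job these roles would typically be filled by classes registered through `Job.setSortComparatorClass`, `Job.setGroupingComparatorClass`, and `Job.setPartitionerClass`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of secondary sort; it only mimics what the framework does.
final class SecondarySortSketch {

    // Hypothetical composite key: the natural key plus the timestamp to sort by.
    static final class CompositeKey {
        final String naturalKey;
        final long timestamp;
        CompositeKey(String naturalKey, long timestamp) {
            this.naturalKey = naturalKey;
            this.timestamp = timestamp;
        }
    }

    // Hypothetical mapper output: composite key plus the remaining attributes.
    static final class Record {
        final CompositeKey key;
        final String otherAttributes;
        Record(String naturalKey, long timestamp, String otherAttributes) {
            this.key = new CompositeKey(naturalKey, timestamp);
            this.otherAttributes = otherAttributes;
        }
    }

    // Role of the sort comparator: natural key first, then timestamp.
    static final Comparator<Record> SORT_ORDER =
        Comparator.comparing((Record r) -> r.key.naturalKey)
                  .thenComparingLong(r -> r.key.timestamp);

    // Simulated shuffle: sort by the composite key, then group by natural key only.
    // Each group's values come out in timestamp order without any reducer-side sort.
    static Map<String, List<String>> shuffle(List<Record> mapperOutput) {
        List<Record> sorted = new ArrayList<>(mapperOutput);
        sorted.sort(SORT_ORDER);
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Record r : sorted) {
            groups.computeIfAbsent(r.key.naturalKey, k -> new ArrayList<>())
                  .add(r.key.timestamp + ":" + r.otherAttributes);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Record> mapperOutput = List.of(
            new Record("userA", 30L, "c"),
            new Record("userB", 5L, "x"),
            new Record("userA", 10L, "a"),
            new Record("userA", 20L, "b"));
        // Prints userA -> [10:a, 20:b, 30:c] and then userB -> [5:x]
        shuffle(mapperOutput).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

With this arrangement the reducer can stream through its values once, so the deep copy from the question is not needed. In Hadoop the partitioner must also depend only on the natural key, so that all records for one key reach the same reducer.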
