Hadoop MapReduce programming


Problem description


How do I get sorted output using Hadoop MapReduce programming?

Is there any way to get the final key-value pairs in sorted order (by either key or value)?

Any pointers on this would be greatly appreciated.

Thank you,
R

Solution

By default, MapReduce will sort input records by their keys.

However, it may help you more to download the latest Hadoop release and look at the examples that ship with it. Several of them demonstrate sorting.

If you need more control over the sort order, this is how it can be changed.

The sort order for keys is controlled by a RawComparator, which is found as follows:

  1. If the property mapred.output.key.comparator.class is set, an instance of that class is used. (The setOutputKeyComparatorClass() method on JobConf is a convenient way to set this property.)

  2. Otherwise, keys must be a subclass of WritableComparable, and the registered comparator for the key class is used.

  3. If there is no registered comparator, then a RawComparator is used that deserializes the byte streams being compared into objects and delegates to the WritableComparable's compareTo() method.

These rules reinforce why it's important to register optimized versions of RawComparators for your own custom Writable classes, and also that it's straightforward to override the sort order by setting your own comparator.
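To make the difference between rule 3's fallback and an "optimized" raw comparator concrete, here is a minimal plain-Java sketch. It does not use any Hadoop classes: the class name `RawComparatorSketch` and all of its methods are hypothetical, and the 4-byte big-endian encoding is only an assumption chosen for illustration. It shows one comparison that deserializes both keys first, one that compares the serialized bytes directly, and how swapping the arguments reverses the sort order.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Comparator;

// Illustrative only: a stand-in for the idea behind a raw comparator,
// written against the JDK alone (no Hadoop dependency).
public class RawComparatorSketch {

    // Serialize an int key as 4 big-endian bytes (ByteBuffer's default order).
    static byte[] serialize(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Turn the 4 serialized bytes back into an int.
    static int deserialize(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt();
    }

    // Fallback style: rebuild both objects, then delegate to compareTo().
    static int compareByDeserializing(byte[] a, byte[] b) {
        Integer left = deserialize(a);
        Integer right = deserialize(b);
        return left.compareTo(right);
    }

    // Optimized style: compare the serialized bytes without creating objects.
    // The first byte carries the sign, so it is compared as a signed value;
    // the remaining bytes are compared as unsigned values.
    static int compareRaw(byte[] a, byte[] b) {
        if (a[0] != b[0]) {
            return Byte.compare(a[0], b[0]);
        }
        for (int i = 1; i < 4; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // Sort keys in their serialized form using the supplied comparator.
    static int[] sortKeys(int[] keys, Comparator<byte[]> comparator) {
        byte[][] raw = new byte[keys.length][];
        for (int i = 0; i < keys.length; i++) {
            raw[i] = serialize(keys[i]);
        }
        Arrays.sort(raw, comparator);
        int[] sorted = new int[keys.length];
        for (int i = 0; i < keys.length; i++) {
            sorted[i] = deserialize(raw[i]);
        }
        return sorted;
    }

    public static void main(String[] args) {
        int[] keys = {42, -7, 256, 0, 255};
        // Ascending order via the raw byte comparison.
        System.out.println(Arrays.toString(sortKeys(keys, RawComparatorSketch::compareRaw)));
        // Descending order: the same comparison with its arguments swapped.
        System.out.println(Arrays.toString(sortKeys(keys, (a, b) -> compareRaw(b, a))));
    }
}
```

In a real job the equivalent class would be the one handed to setOutputKeyComparatorClass() (rule 1) or registered for the key class (rule 2). The sketch only mirrors the comparison logic, not Hadoop's actual interfaces.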
