How to sort word count by value in Hadoop?


Question

Hi, I want to learn how to sort the word count output by value in Hadoop. I know Hadoop takes care of sorting by key, but not by value.

I know that to sort values we need a partitioner, a grouping comparator, and a sort comparator, but I am a bit confused about how to apply these concepts together to sort the word count by value.

Do we need another MapReduce job to achieve this, or can a combiner count the occurrences, sort them there, and emit the result to the reducer?

Can anyone explain how to sort the word count example by value?

Recommended answer

You need a second MapReduce job. The total count for each word is only settled once the first MR job has finished, so until then there is nothing final to sort by value (the word counts). A combiner only sees the partial counts from a single map task, so it cannot produce the sorted result either. Sorting by count inside the first job is logically not possible.
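To make the two-step data flow concrete, here is a minimal in-memory sketch in plain Java. It does not use the Hadoop API: `WordCountSortSketch`, `countWords`, and `sortByCount` are hypothetical names chosen for illustration. `countWords` stands in for the first job (final totals per word), and `sortByCount` stands in for the second job, which reads those totals and orders them by count. Descending order with ties broken by word is an assumption, not something the question specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; names are hypothetical and no Hadoop classes are used.
class WordCountSortSketch {

    // Phase 1 (stands in for the first MR job): compute the total count per word.
    public static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Phase 2 (stands in for the second MR job): order the finished totals
    // by count, highest first (assumed), breaking ties alphabetically by word.
    public static List<String> sortByCount(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            out.add(e.getKey() + "\t" + e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> totals = countWords(List.of("a b a", "c b a"));
        for (String row : sortByCount(totals)) {
            System.out.println(row);
        }
    }
}
```

In a real second job, the usual way to get the same effect is to have the mapper emit the count as the key and the word as the value, so that the shuffle, which sorts by key, does the ordering. With a single reducer the output file is then ordered by count across the whole data set.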
