Aggregation in MapReduce

Views: 473 Published: 2018/6/1 12:49:06 Tags: java hadoop mapreduce


Question

How can we find the maximum and minimum elements of a column in a .csv file?

What should we pass to context.write(key, value) in the mapper?

Should it be each column of that CSV file?

Solution
This is a bit broad for an SO question, but I'll bite.

Your mapper is for mapping values to keys. Let's say your CSV has 4 columns with numeric values:

42, 71, 45, 22

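You map a key to each value; in effect, the key plays the role of the column header in the CSV. Let's say column 4 represents "Number of widgets". In your mapper you'd emit "number_of_widgets" as the key and the value of column 4 as the value. If you want the min/max of every column, do the same for each column; if you only care about one column, emit just that one.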
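To make the mapper side concrete, here is a minimal sketch in plain Java with no Hadoop dependency, so it can be run on its own. The class name, the `map` helper, and the column names other than "number_of_widgets" are made up for illustration. In a real Hadoop job, each pair returned here would instead be one `context.write(new Text(key), new LongWritable(value))` call inside `Mapper.map()`.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class ColumnMapperSketch {
    // Hypothetical column names standing in for the CSV header row.
    static final String[] COLUMN_NAMES = {"col_1", "col_2", "col_3", "number_of_widgets"};

    // Mirrors what a mapper would do for one input line: each returned
    // (key, value) pair corresponds to one context.write(key, value).
    static List<Map.Entry<String, Long>> map(String csvLine) {
        List<Map.Entry<String, Long>> emitted = new ArrayList<>();
        String[] fields = csvLine.split(",");
        for (int i = 0; i < fields.length && i < COLUMN_NAMES.length; i++) {
            // trim() tolerates the space after each comma in "42, 71, 45, 22"
            long value = Long.parseLong(fields[i].trim());
            emitted.add(new SimpleEntry<>(COLUMN_NAMES[i], value));
        }
        return emitted;
    }

    public static void main(String[] args) {
        // Print every pair emitted for the sample row, tab-separated.
        for (Map.Entry<String, Long> pair : map("42, 71, 45, 22")) {
            System.out.println(pair.getKey() + "\t" + pair.getValue());
        }
    }
}
```

For the sample row, this emits four pairs, the last being ("number_of_widgets", 22). This sketch assumes every field is a whole number; a real job would also need to skip a header line and handle malformed rows.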

The reducer is going to get all the values for a given key. That's where you figure out your min/max. You just iterate through all the values for the key and keep track of the min and max.
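The reducer side can be sketched the same way. This is plain Java rather than actual Hadoop code. The class name, the `reduce` helper, and the sample values are assumptions made for illustration. In a real job, the loop would sit inside `Reducer.reduce(key, values, context)` and the result would be written with `context.write(...)`.

```java
import java.util.Arrays;
import java.util.List;

class MinMaxReducerSketch {
    // Mirrors a reduce call: one key plus every value emitted for that key.
    // Returns {min, max}. The key is unused here; a real reducer would
    // write it out alongside the result.
    static long[] reduce(String key, Iterable<Long> values) {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long v : values) {
            min = Math.min(min, v); // smallest value seen so far
            max = Math.max(max, v); // largest value seen so far
        }
        return new long[] {min, max};
    }

    public static void main(String[] args) {
        // Hypothetical column-4 values gathered from several CSV rows.
        List<Long> widgets = Arrays.asList(22L, 7L, 130L, 45L);
        long[] minMax = reduce("number_of_widgets", widgets);
        // prints: number_of_widgets min=7 max=130
        System.out.println("number_of_widgets min=" + minMax[0] + " max=" + minMax[1]);
    }
}
```

Because the values are consumed in a single pass, nothing has to be held in memory beyond the two running extremes, which matters when a key has a very large number of values.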
