Use of generics in Hadoop MapReduce problems


Problem description

My question may seem silly to Hadoop users, but I am a little confused about the use of generics in MapReduce problems such as "word count".

I know that generics are basically used to avoid type casting and to provide type safety, but I cannot connect that concept to this case.

In the word count problem:

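import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Type parameters, in order: input key, input value, output key, output value.
public class WordCountMapper extends
        Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // TODO Auto-generated method stub
        ...
    }
}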

Can anyone please explain the use of generics here? Please correct me if I have made any mistake in asking this question.

I now understand that the generics here describe the key-value pairs (KEY IN, VALUE IN, KEY OUT, VALUE OUT). But it is still not clear to me why generics are used for the key-value pairs. Is there no other way to do the same thing? What is the benefit of using generics here?

Thanks!

Solution

MapReduce uses generics in Mapper and Reducer to specify the types of the input they are expected to read and the output they are expected to write.

In your example, WordCountMapper extends the Mapper class with the type arguments Mapper<LongWritable, Text, Text, LongWritable>. The first two classes, LongWritable and Text, are the input key and value types that the Mapper expects to read. The last two classes, Text and LongWritable, are the output key and value types that the map method is expected to emit.
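To make the benefit more concrete, here is a minimal sketch of the same pattern without any Hadoop dependency. SketchMapper, WordCountSketchMapper, run and GenericsSketchDemo are hypothetical names invented for this illustration, and plain Long and String stand in for LongWritable and Text. It is not Hadoop's actual Mapper, only an approximation of how four type parameters let the compiler check what map receives and what the context accepts.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a mapper base class; NOT org.apache.hadoop.mapreduce.Mapper.
abstract class SketchMapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {

    // Collects emitted pairs. write() only accepts the declared output types.
    class Context {
        private final List<Map.Entry<KEYOUT, VALUEOUT>> output = new ArrayList<>();

        void write(KEYOUT key, VALUEOUT value) {
            output.add(new SimpleEntry<>(key, value));
        }

        List<Map.Entry<KEYOUT, VALUEOUT>> getOutput() {
            return output;
        }
    }

    abstract void map(KEYIN key, VALUEIN value, Context context);

    // Hypothetical helper: runs map() on one record and returns the typed output.
    List<Map.Entry<KEYOUT, VALUEOUT>> run(KEYIN key, VALUEIN value) {
        Context context = new Context();
        map(key, value, context);
        return context.getOutput();
    }
}

// Input: (Long offset, String line). Output: (String word, Long count).
class WordCountSketchMapper extends SketchMapper<Long, String, String, Long> {
    @Override
    void map(Long key, String value, Context context) {
        for (String word : value.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                context.write(word, 1L);    // matches KEYOUT = String, VALUEOUT = Long
                // context.write(1L, word); // rejected at compile time: wrong types
            }
        }
    }
}

class GenericsSketchDemo {
    public static void main(String[] args) {
        List<Map.Entry<String, Long>> out =
                new WordCountSketchMapper().run(0L, "hello world hello");
        for (Map.Entry<String, Long> pair : out) {
            String word = pair.getKey();   // no cast needed
            long count = pair.getValue();  // no cast needed
            System.out.println(word + "\t" + count);
        }
    }
}
```

Without type parameters, map and write would have to take Object (or a common base type) and every use would need a cast, so a swapped key and value would only fail at run time. With them, the commented-out write call above is rejected by the compiler.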

This thread discussion gives more insight into why generics have been implemented in MapReduce. Also, this JIRA Issue gives more information.
