Why is the LongWritable key not used in the Mapper class?


Problem description

Mapper:

The Mapper class is a generic type with four formal type parameters that specify the input key, input value, output key, and output value types of the map function.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper
    extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Sentinel value the dataset uses for a missing temperature reading
    private static final int MISSING = 9999;

    @Override
    public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
        String line = value.toString();
        String year = line.substring(15, 19);
        int airTemperature;
        if (line.charAt(87) == '+') { // parseInt doesn't like leading plus signs
            airTemperature = Integer.parseInt(line.substring(88, 92));
        } else {
            airTemperature = Integer.parseInt(line.substring(87, 92));
        }
        String quality = line.substring(92, 93);
        // Emit (year, temperature) only for present readings with an accepted quality code
        if (airTemperature != MISSING && quality.matches("[01459]")) {
            context.write(new Text(year), new IntWritable(airTemperature));
        }
    }
}

Reducer:

Again, four formal type parameters specify the input and output types, this time for the reduce function. The input types of the reduce function must match the output types of the map function: Text and IntWritable.

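import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MaxTemperatureReducer
    extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
        // Find the largest temperature recorded for this year
        int maxValue = Integer.MIN_VALUE;
        for (IntWritable value : values) {
            maxValue = Math.max(maxValue, value.get());
        }
        context.write(key, new IntWritable(maxValue));
    }
}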

But in this example, the key is never used.

What is the purpose of the key in the Mapper if it is not used at all?

Why is the key a Writable?

Recommended answer

The input format used in this example is TextInputFormat, which produces key/value pairs of type LongWritable/Text.

Here the LongWritable key is the byte offset, within the input file, of the start of the current line being read from the input split. The Text value is the content of that line.
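To make the offset/line pairing concrete, here is a rough plain-Java sketch of the kind of records a line-oriented reader hands to the mapper. It does not use any Hadoop classes; the class name `LineOffsetSketch` and the method `readRecords` are made up for illustration, and it assumes UTF-8 text with `\n` line endings.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java illustration only; hypothetical names, no Hadoop classes involved.
class LineOffsetSketch {

    // Returns (byte offset of the line start -> line text) in file order,
    // assuming UTF-8 text with '\n' line terminators.
    static Map<Long, String> readRecords(String fileContents) {
        Map<Long, String> records = new LinkedHashMap<>();
        long offset = 0;
        for (String line : fileContents.split("\n")) {
            records.put(offset, line);
            // Advance past the line's bytes plus the one-byte '\n' terminator.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        // Prints each offset and its line, separated by a tab.
        readRecords("first line\nsecond\nthird\n")
            .forEach((offset, line) -> System.out.println(offset + "\t" + line));
    }
}
```

For the three-line input in `main`, the keys are 0, 11 and 18: each key is simply where its line begins in the file, counted in bytes.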

That does not mean the line offset supplied by the LongWritable key is useless. It depends on the use case; in your case the input key simply isn't significant.
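One case where the offset does matter, as an illustration: skipping a header line, since the header is the record whose offset is 0. The sketch below is plain Java rather than a real Mapper; `HeaderSkipSketch`, `dataLines` and the sample records are hypothetical. In an actual map method the equivalent check would be along the lines of `if (key.get() == 0) return;`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; hypothetical names and made-up sample data.
class HeaderSkipSketch {

    // Keeps every line except the one that starts at byte offset 0 (the header).
    static List<String> dataLines(Map<Long, String> records) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Long, String> record : records.entrySet()) {
            if (record.getKey() == 0L) {
                continue; // offset 0 is the first line of the file
            }
            out.add(record.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Long, String> records = new LinkedHashMap<>();
        records.put(0L, "year,temp");   // header line
        records.put(10L, "1950,22");
        records.put(18L, "1949,111");
        System.out.println(dataLines(records));
    }
}
```

This only works because the key tells you where in the file the line came from; the line text alone cannot tell a header apart from data in general.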

There are also many InputFormat types other than TextInputFormat. They parse the input file in different ways and produce their own key/value pairs.

For example, KeyValueTextInputFormat also reads plain text line by line (like TextInputFormat, it extends FileInputFormat), but it splits each line at a configurable delimiter (a tab by default) and produces the key/value pair as Text/Text.
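Roughly, the per-line splitting looks like the plain-Java sketch below. It is not Hadoop's code; `KeyValueSplitSketch` and `split` are hypothetical names. It assumes the split happens at the first occurrence of the separator, and that a line with no separator becomes the key with an empty value, which is my understanding of how KeyValueTextInputFormat behaves.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Plain-Java illustration only; hypothetical names, no Hadoop classes involved.
class KeyValueSplitSketch {

    // Splits a line into (key, value) at the first occurrence of the separator.
    // Assumption: with no separator, the whole line is the key and the value is empty.
    static Map.Entry<String, String> split(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new SimpleEntry<>(line, "");
        }
        return new SimpleEntry<>(line.substring(0, pos), line.substring(pos + 1));
    }

    public static void main(String[] args) {
        Map.Entry<String, String> kv = split("1950\t22", '\t');
        System.out.println("key=" + kv.getKey() + " value=" + kv.getValue());
    }
}
```

With an input format like this the mapper's input key is data taken from the line itself, so the Mapper would be declared with Text rather than LongWritable as its first type parameter.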

Below are a few input formats and their key/value types:

KeyValueTextInputFormat  Text/Text
NLineInputFormat         LongWritable/Text
FixedLengthInputFormat   LongWritable/BytesWritable

Besides these, a few input formats take generic key/value types as type parameters when they are declared, such as SequenceFileInputFormat and CombineFileInputFormat. Have a look at the chapter on input formats in Hadoop: The Definitive Guide.

Hope this helps.
