How is an object generated in the code below?


Problem description

I'm trying to understand a piece of Java code. (I have basic knowledge of Java.)

Here it is:

WordCountMapper Class

package com.company;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import java.io.IOException;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Each call to map() receives one line of the input file as the value.
        String line = value.toString();
        for (String word : line.split(" ")) {
            if (word.length() > 0) {
                // Emit the pair (word, 1) for every non-empty word.
                context.write(new Text(word), new IntWritable(1));
            }
        }
    }
}

Mapper Class

package org.apache.hadoop.mapreduce;

import java.io.IOException;
import org.apache.hadoop.classification.InterfaceAudience.Public;
import org.apache.hadoop.classification.InterfaceStability.Stable;

@InterfaceAudience.Public
@InterfaceStability.Stable
public class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
    public Mapper() {
    }

    protected void setup(Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>.Context context)
            throws IOException, InterruptedException {
    }

    protected void map(KEYIN key, VALUEIN value, Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>.Context context)
            throws IOException, InterruptedException {
        context.write(key, value);
    }

    protected void cleanup(Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>.Context context)
            throws IOException, InterruptedException {
    }

    public void run(Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>.Context context) throws IOException, InterruptedException {
        setup(context);
        while (context.nextKeyValue()) {
            map(context.getCurrentKey(), context.getCurrentValue(), context);
        }
        cleanup(context);
    }

    public abstract class Context implements MapContext<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
        public Context() {
        }
    }
}

Main Method Class

package com.company;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Invalid Command");
            System.err.println("Usage: WordCount <input path> <output path>");
            System.exit(0);
        }
        Configuration conf = new Configuration();
        Job job = new Job(conf, "wordcount");
        job.setJarByClass(WordCount.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Submit the job and wait for it to finish; without this call the job never runs.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

My doubt is: in the WordCount class, how does the Text value come into existence? I mean, it is an object, but where is it generated? There is nothing in the main method class that instantiates the Text class.

And what does this mean? I have never seen a class declared in the format below:

public class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>
{

Any suggestions?

Solution

The code you have pasted is meant to be run with the Hadoop MapReduce framework.

Basically, you have three classes here:

  • The WordCountMapper, which splits each input line into words and writes each word to the Hadoop context
  • The Mapper base class, which is part of the Hadoop MapReduce library
  • The WordCount driver, which configures the job and submits it to the Hadoop cluster

Actually, I would have expected a WordCountReducer class in your question (your driver references one), but it does not seem to be there; a typical one is sketched below.
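A standard word-count reducer (a sketch of my own, not code from your project) receives each word together with all the 1s the mappers emitted for it, and sums them:

package com.company;

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Sum all the 1s that the mappers emitted for this word.
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        context.write(key, new IntWritable(sum));
    }
}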

Anyway: the text "comes into existence" when you copy it as a file onto your Hadoop cluster; it must be on HDFS (the Hadoop Distributed File System) before the job runs. The framework then reads that file record by record and hands each line to your mapper as a Text object, which is why nothing in your main method ever instantiates Text.

This line of code refers to an HDFS path:

FileInputFormat.addInputPath(job, new Path(args[0]));
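You would typically upload the input with the HDFS shell, but it can also be done programmatically with the FileSystem API. Here is a minimal sketch (the local and HDFS paths are made-up placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyToHdfs {
    public static void main(String[] args) throws Exception {
        // Reads the cluster settings (core-site.xml etc.) from the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Copy a local file into the HDFS directory the job will read as input.
        fs.copyFromLocalFile(new Path("/tmp/input.txt"),
                             new Path("/user/hadoop/input/input.txt"));
    }
}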

And regarding the question about the code:

public class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>

These are generic type parameters (see the official Java generics tutorial): placeholders for types that you bind to concrete classes each time you subclass Mapper.

Your WordCountMapper subclasses this Mapper class and specifies the four types:

public class WordCountMapper extends Mapper<LongWritable,Text,Text,IntWritable>

These are the correspondences:

KEYIN    = LongWritable
VALUEIN  = Text
KEYOUT   = Text
VALUEOUT = IntWritable
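The same mechanism works outside Hadoop. Here is a tiny stand-alone illustration (Pair and WordCountPair are invented names, not Hadoop classes):

// A generic class: K and V are placeholders for types chosen later.
class Pair<K, V> {
    private final K key;
    private final V value;

    Pair(K key, V value) {
        this.key = key;
        this.value = value;
    }

    K getKey() { return key; }
    V getValue() { return value; }
}

// Subclassing binds the type parameters to concrete types, just as
// WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> does.
class WordCountPair extends Pair<String, Integer> {
    WordCountPair(String word, Integer count) {
        super(word, count);
    }
}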
