HADOOP - Word Count Example for 1.2.1 Stable


Problem Description

I am working through a word count example for Hadoop 1.2.1, but something must have changed, because I can't seem to get it to work.

Here is my Reduce class:

public static class Reduce extends Reducer<WritableComparable, Writable, WritableComparable, Writable> {

    public void reduce(WritableComparable key,
                       Iterator<Writable> values,
                       OutputCollector<WritableComparable, NullWritable> output,
                       Reporter reporter) throws IOException {

        output.collect(key, NullWritable.get());

    }

}

And my main function:

public static void main(String[] args) throws Exception {

    JobConf jobConf = new JobConf(MapDemo.class);

    jobConf.setNumMapTasks(10);
    jobConf.setNumReduceTasks(1);

    jobConf.setJobName("MapDemo");

    jobConf.setOutputKeyClass(Text.class);
    jobConf.setOutputValueClass(NullWritable.class);

    jobConf.setMapperClass(Map.class);
    jobConf.setReducerClass(Reduce.class);

    jobConf.setInputFormat(TextInputFormat.class);
    jobConf.setOutputFormat(TextOutputFormat.class);

    FileInputFormat.setInputPaths(jobConf, new Path(args[0]));
    FileOutputFormat.setOutputPath(jobConf, new Path(args[1]));

    JobClient.runJob(jobConf);
}

My IDE is telling me there is an error, corroborated by Maven:

[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] com/example/mapreduce/MapDemo.java:[71,16] method setReducerClass in class org.apache.hadoop.mapred.JobConf cannot be applied to given types;
required: java.lang.Class<? extends org.apache.hadoop.mapred.Reducer>
found: java.lang.Class<com.example.mapreduce.MapDemo.Reduce>
reason: actual argument java.lang.Class<com.example.mapreduce.MapDemo.Reduce> cannot be converted to java.lang.Class<? extends org.apache.hadoop.mapred.Reducer> by method invocation conversion
[INFO] 1 error
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.679s
[INFO] Finished at: Mon Sep 16 09:23:08 PDT 2013
[INFO] Final Memory: 17M/202M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.0:compile (default-compile) on project inventory: Compilation failure
[ERROR] com/example/mapreduce/MapDemo.java:[71,16] method setReducerClass in class org.apache.hadoop.mapred.JobConf cannot be applied to given types;
[ERROR] required: java.lang.Class<? extends org.apache.hadoop.mapred.Reducer>
[ERROR] found: java.lang.Class<com.example.mapreduce.MapDemo.Reduce>

I believe the word count examples online are out of date for 1.2.1. How do I fix this? Does anyone have a link to a working 1.2.1 word count Java source?

Solution

Which link did you follow? I have never seen WordCount written this way, and whatever you followed is definitely outdated since it uses the old API. I also doubt you followed it correctly: your Reduce class extends the new-API org.apache.hadoop.mapreduce.Reducer class, while JobConf.setReducerClass expects an old-API org.apache.hadoop.mapred.Reducer, which is exactly what the compiler is complaining about.

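For context, here is what JobConf.setReducerClass is typed against. In the old API a reducer implements the org.apache.hadoop.mapred.Reducer interface (usually by extending MapReduceBase) rather than extending the new org.apache.hadoop.mapreduce.Reducer class. A minimal old-API version of your Reduce class would look something like this (a sketch, using the Text and NullWritable types from your driver):

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Old-API reducers implement the mapred.Reducer *interface*,
// which is what JobConf.setReducerClass expects.
public class Reduce extends MapReduceBase
        implements Reducer<Text, NullWritable, Text, NullWritable> {

    public void reduce(Text key, Iterator<NullWritable> values,
                       OutputCollector<Text, NullWritable> output,
                       Reporter reporter) throws IOException {
        // Emit each key once with no count, mirroring your snippet.
        output.collect(key, NullWritable.get());
    }
}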
Rather than patching the old-API version, though, move to the new API. This should work:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    /**
     * The map class of WordCount.
     */
    public static class TokenCounterMapper extends
            Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {

            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    /**
     * The reducer class of WordCount.
     */
    public static class TokenCounterReducer extends
            Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values,
                Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    /**
     * The main entry point.
     */
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/Users/miqbal1/hadoop-eco/hadoop-1.1.2/conf/core-site.xml"));
        conf.addResource(new Path("/Users/miqbal1/hadoop-eco/hadoop-1.1.2/conf/hdfs-site.xml"));
        conf.set("fs.default.name", "hdfs://localhost:9000");
        conf.set("mapred.job.tracker", "localhost:9001");
        Job job = new Job(conf, "WordCount");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenCounterMapper.class);
        job.setReducerClass(TokenCounterReducer.class);
        job.setNumReduceTasks(2);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/inputs/demo.txt"));
        FileOutputFormat.setOutputPath(job, new Path("/outputs/1111223"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

A few observations:

  • You are not emitting any count: your Reducer emits NullWritable, so it will just write out each key without a count.
  • Use proper types for your input and output keys/values.
  • Use the new API. It is cleaner and better.
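
One more note on the driver: the example above hardcodes the cluster settings and the input/output paths for illustration. If you would rather take the paths from the command line, as your original main did, the entry point could look like this instead (a sketch reusing the classes and imports from the example above):

public static void main(String[] args) throws Exception {
    // Rely on the cluster config on the classpath instead of
    // hardcoding fs.default.name and mapred.job.tracker.
    Configuration conf = new Configuration();
    Job job = new Job(conf, "WordCount");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenCounterMapper.class);
    job.setReducerClass(TokenCounterReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    // Input and output paths come from the command line.
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}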
