HADOOP - Word Count Example for 1.2.1 Stable
I am working through a word count example for Hadoop 1.2.1, but something must have changed, because I can't seem to get it to work.
Here is my Reduce class:
public static class Reduce extends Reducer<WritableComparable, Writable, WritableComparable, Writable> {
    public void reduce(WritableComparable key,
                       Iterator<Writable> values,
                       OutputCollector<WritableComparable, NullWritable> output,
                       Reporter reporter) throws IOException {
        output.collect(key, NullWritable.get());
    }
}
And my main function:
public static void main(String[] args) throws Exception {
    JobConf jobConf = new JobConf(MapDemo.class);
    jobConf.setNumMapTasks(10);
    jobConf.setNumReduceTasks(1);
    jobConf.setJobName("MapDemo");
    jobConf.setOutputKeyClass(Text.class);
    jobConf.setOutputValueClass(NullWritable.class);
    jobConf.setMapperClass(Map.class);
    jobConf.setReducerClass(Reduce.class);
    jobConf.setInputFormat(TextInputFormat.class);
    jobConf.setOutputFormat(TextOutputFormat.class);
    FileInputFormat.setInputPaths(jobConf, new Path(args[0]));
    FileOutputFormat.setOutputPath(jobConf, new Path(args[1]));
    JobClient.runJob(jobConf);
}
My IDE is telling me there is an error, corroborated by Maven:
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] com/example/mapreduce/MapDemo.java:[71,16] method setReducerClass in class org.apache.hadoop.mapred.JobConf cannot be applied to given types;
required: java.lang.Class<? extends org.apache.hadoop.mapred.Reducer>
found: java.lang.Class<com.example.mapreduce.MapDemo.Reduce>
reason: actual argument java.lang.Class<com.example.mapreduce.MapDemo.Reduce> cannot be converted to java.lang.Class<? extends org.apache.hadoop.mapred.Reducer> by method invocation conversion
[INFO] 1 error
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.679s
[INFO] Finished at: Mon Sep 16 09:23:08 PDT 2013
[INFO] Final Memory: 17M/202M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.0:compile (default-compile) on project inventory: Compilation failure
[ERROR] com/example/mapreduce/MapDemo.java:[71,16] method setReducerClass in class org.apache.hadoop.mapred.JobConf cannot be applied to given types;
[ERROR] required: java.lang.Class<? extends org.apache.hadoop.mapred.Reducer>
[ERROR] found: java.lang.Class<com.example.mapreduce.MapDemo.Reduce>
I believe the word count examples online are out of date for 1.2.1. How do I fix this? Does anyone have a link to a working word count Java source for 1.2.1?
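Solution
Which link did you follow? I have never seen a word count written like this. Whatever you followed is outdated, since it uses the old API (org.apache.hadoop.mapred), and I doubt you followed it exactly. The compiler message points the same way: JobConf.setReducerClass expects a class that implements the old org.apache.hadoop.mapred.Reducer interface, while your Reduce class is declared with extends, which suggests the Reducer it imports is the new-API class org.apache.hadoop.mapreduce.Reducer.
This should work: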
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    /**
     * The map class of WordCount: emits (word, 1) for every token in a line.
     */
    public static class TokenCounterMapper extends
            Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    /**
     * The reducer class of WordCount: sums the counts emitted for each word.
     */
    public static class TokenCounterReducer extends
            Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text key, Iterable<IntWritable> values,
                Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    /**
     * The main entry point.
     */
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The paths, host names and ports below are from my local setup; adjust them to yours.
        conf.addResource(new Path("/Users/miqbal1/hadoop-eco/hadoop-1.1.2/conf/core-site.xml"));
        conf.addResource(new Path("/Users/miqbal1/hadoop-eco/hadoop-1.1.2/conf/hdfs-site.xml"));
        conf.set("fs.default.name", "hdfs://localhost:9000");
        conf.set("mapred.job.tracker", "localhost:9001");

        Job job = new Job(conf, "WordCount");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenCounterMapper.class);
        job.setReducerClass(TokenCounterReducer.class);
        job.setNumReduceTasks(2);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/inputs/demo.txt"));
        FileOutputFormat.setOutputPath(job, new Path("/outputs/1111223"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
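If it helps to see why the compiler rejected Reduce.class in the question, here is a minimal plain-Java sketch with no Hadoop dependency. It assumes the question's Reduce class imported the new-API Reducer, which the error message suggests but the post does not show. OldReducer, NewReducer, GoodReduce, BadReduce and this setReducerClass are hypothetical stand-ins, not Hadoop types: the old API's Reducer is an interface and the new API's Reducer is a class, and a method bounded by one will not accept a subclass of the other.

```java
class ApiMismatchSketch {

    // Hypothetical stand-in for the old API's Reducer, which is an interface.
    interface OldReducer { }

    // Hypothetical stand-in for the new API's Reducer, which is a class.
    static class NewReducer { }

    // Hypothetical stand-in for JobConf.setReducerClass: only old-API reducers fit the bound.
    static String setReducerClass(Class<? extends OldReducer> reducerClass) {
        return reducerClass.getSimpleName();
    }

    // Written against the "old" type: accepted.
    static class GoodReduce implements OldReducer { }

    // Written against the "new" type, like the Reduce class in the question appears to be.
    static class BadReduce extends NewReducer { }

    public static void main(String[] args) {
        System.out.println(setReducerClass(GoodReduce.class));

        // The next line does not compile, for the same kind of reason as the Maven error:
        // Class<BadReduce> cannot be converted to Class<? extends OldReducer>.
        // setReducerClass(BadReduce.class);

        // The runtime check agrees: BadReduce is not an OldReducer.
        System.out.println(OldReducer.class.isAssignableFrom(BadReduce.class));
    }
}
```

The practical rule is to keep every import from one package family: either org.apache.hadoop.mapred throughout (JobConf, JobClient, the Reducer interface) or org.apache.hadoop.mapreduce throughout (Job, the Mapper and Reducer classes), as in the WordCount class above.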
A few observations:
- Your Reducer emits NullWritable as the value, so no count is ever written. The output will contain only the keys.
- Use the proper types for your input and output keys and values.
- Use the new API. It is cleaner and better.
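The first observation can be checked without a cluster. The following is a rough plain-Java sketch of the data flow only, not Hadoop code: WordCountSketch and its methods are hypothetical names, and a TreeMap stands in for the shuffle that groups values by key. It shows that the count only appears in the output because the reduce step sums the 1s emitted by the map step.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

class WordCountSketch {

    // Map step: emit a (word, 1) pair for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<>(itr.nextToken(), 1));
        }
        return pairs;
    }

    // Reduce step: sum all values that were grouped under one key.
    static int reduce(Iterable<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: group the mapped pairs by key (standing in for the shuffle), then reduce each group.
    static Map<String, Integer> wordCount(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
            }
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            counts.put(group.getKey(), reduce(group.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {be=2, do=1, not=1, or=1, to=3}
        System.out.println(wordCount(Arrays.asList("to be or not to be", "to do")));
    }
}
```

If reduce ignored its values and wrote only the key, as the NullWritable reducer in the question does, the result would be the list of distinct words with no numbers attached.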