Separate output files in hadoop mapreduce

Views: 299 Posted: 2018/5/31 19:48:33 Tags: hadoop mapreduce


Question

My question has probably already been asked, but I cannot find a clear answer.

My MapReduce job is a basic WordCount. My current output file is:

 // filename: 'part-r-00000'
 789 a
 755 #c
 456 d
 123 #b

How can I change the output filename?

Then, is it possible to have 2 output files:

 // First output file
 789 a
 456 d

 // Second output file
 123 #b
 755 #c

Here's my reduce class:

 public static class SortReducer extends Reducer<IntWritable, Text, IntWritable, Text> {
     public void reduce(IntWritable key, Text value, Context context)
             throws IOException, InterruptedException {
         context.write(key, value);
     }
 }

Here's my partitioner class:

 public class TweetPartitionner extends Partitioner<Text, IntWritable> {
     @Override
     public int getPartition(Text a_key, IntWritable a_value, int a_nbPartitions) {
         if (a_key.toString().startsWith("#"))
             return 1;
         return 0;
     }
 }

Thanks a lot!

Answer

In your job file, set job.setNumReduceTasks(2); so that the job runs two reducers and writes two output files, part-r-00000 and part-r-00001.

From the mapper, emit the word as the key and the count as the value:

 a 789
 #c 755
 d 456
 #b 123

Write a partitioner and add it to the job config. In the partitioner, check whether the key starts with #: return 1 if it does, otherwise 0.

In the reducer, swap key and value so that each output line is the count followed by the word.
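The routing described in the answer can be sketched without any Hadoop dependency. The class below is a plain-Java simulation, not Hadoop code: `PartitionSketch`, `getPartition` and `route` are hypothetical names chosen for illustration, and the per-partition `TreeMap` only approximates the key sort that Hadoop performs before reducing. In a real job, the same `startsWith("#")` rule would sit in a `Partitioner<Text, IntWritable>` subclass such as the one in the question, registered with `job.setPartitionerClass(...)` next to `job.setNumReduceTasks(2)`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PartitionSketch {

    // Same rule as the partitioner in the question: keys starting with '#' go to partition 1.
    public static int getPartition(String key, int numPartitions) {
        if (numPartitions < 2) {
            return 0; // with a single reducer there is only one partition to return
        }
        return key.startsWith("#") ? 1 : 0;
    }

    // Routes (word, count) pairs to partitions, then swaps key and value
    // the way the reducer in the answer would, yielding "count word" lines.
    public static Map<Integer, List<String>> route(Map<String, Integer> mapperOutput,
                                                   int numPartitions) {
        // One sorted map per partition (assumption: stands in for the shuffle's key sort).
        List<TreeMap<String, Integer>> partitions = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            partitions.add(new TreeMap<>());
        }
        for (Map.Entry<String, Integer> pair : mapperOutput.entrySet()) {
            int p = getPartition(pair.getKey(), numPartitions);
            partitions.get(p).put(pair.getKey(), pair.getValue());
        }
        Map<Integer, List<String>> files = new LinkedHashMap<>();
        for (int p = 0; p < numPartitions; p++) {
            List<String> lines = new ArrayList<>();
            for (Map.Entry<String, Integer> pair : partitions.get(p).entrySet()) {
                lines.add(pair.getValue() + " " + pair.getKey()); // swap: count first, then word
            }
            files.put(p, lines);
        }
        return files;
    }

    public static void main(String[] args) {
        // The mapper output listed in the answer.
        Map<String, Integer> mapperOutput = new LinkedHashMap<>();
        mapperOutput.put("a", 789);
        mapperOutput.put("#c", 755);
        mapperOutput.put("d", 456);
        mapperOutput.put("#b", 123);

        route(mapperOutput, 2).forEach((p, lines) ->
                System.out.printf("part-r-%05d: %s%n", p, lines));
    }
}
```

Run with two partitions, the sketch places `789 a` and `456 d` under part-r-00000 and `123 #b` and `755 #c` under part-r-00001, which is the split the question asks for. Partitioning on the word rather than the count is the reason the answer has the mapper emit the word as the key and leaves the swap to the reducer.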
