Generating Multiple Output Files with Hadoop 0.20+
Problem description
I am trying to write the results of my reducer to multiple files. All of the results go into one file, and the rest of the output is split by category into their respective files. I know that with 0.18 you can do this with MultipleOutputs, and it has not been removed. However, I am trying to make my application 0.20+ compliant. The existing MultipleOutputs functionality still requires JobConf, whereas my application uses Job and Configuration. How can I generate multiple outputs based on the key?
The new API in 0.20 does not support MultipleOutputs. You will need to use the older, JobConf-based API (org.apache.hadoop.mapred.lib.MultipleOutputs).
Support has been added in 0.21, which is currently unreleased, as org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.
This thread on the mailing list talks about this problem.
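For illustration, here is a rough sketch of what the older-API route could look like on 0.20. It assumes Text keys and values and that the reducer key is the category. The class name `CategoryReducer`, the helper `configureOutputs`, and the named output `"category"` are made up for this example, and it needs the Hadoop 0.20 jars on the classpath to compile. Treat it as a starting point rather than a tested solution.

```java
import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapred.lib.MultipleOutputs;

// Hypothetical reducer: writes every record to the normal job output and
// also to a per-category output chosen from the key.
public class CategoryReducer extends MapReduceBase
        implements Reducer<Text, Text, Text, Text> {

    private MultipleOutputs mos;

    // Hypothetical helper: call from the driver, on the JobConf, before
    // submitting the job. "category" is an arbitrary name for this sketch.
    public static void configureOutputs(JobConf conf) {
        MultipleOutputs.addMultiNamedOutput(conf, "category",
                TextOutputFormat.class, Text.class, Text.class);
    }

    @Override
    public void configure(JobConf conf) {
        mos = new MultipleOutputs(conf);
    }

    @Override
    @SuppressWarnings("unchecked")
    public void reduce(Text key, Iterator<Text> values,
                       OutputCollector<Text, Text> output, Reporter reporter)
            throws IOException {
        // Output names may only contain letters and digits, so strip
        // everything else. Assumes the key is not empty after stripping.
        String part = key.toString().replaceAll("[^A-Za-z0-9]", "");
        OutputCollector<Text, Text> perCategory =
                mos.getCollector("category", part, reporter);

        while (values.hasNext()) {
            Text value = values.next();
            output.collect(key, value);      // the file with all results
            perCategory.collect(key, value); // the file for this category
        }
    }

    @Override
    public void close() throws IOException {
        // Flushes and closes the per-category writers.
        mos.close();
    }
}
```

The catch, as the question points out, is that this ties the job to JobConf and the `org.apache.hadoop.mapred` classes instead of Job and Configuration.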