Hadoop - How to Collect Text Output Without Values


Question

I am working on a map reduce job, and I am wondering if it is possible to emit a custom string to my output file. No counts, no other quantities, just a blob of text.

Here's the basic idea of what I'm thinking about:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        // this map doesn't do very much
        String line = value.toString();
        word.set(line);
        // emit to map output
        output.collect(word,one);

        // but how do I do something like output.collect(word)?
        // because in my output file I want to control the text 
        // this is intended to be a map only job
    }
}

Is this kind of thing possible? This is to create a map only job to transform data, using hadoop for its parallelism, but not necessarily the whole MR framework. When I run this job I get an output file in hdfs for each mapper.

$ hadoop fs -ls /Users/dwilliams/output
2013-09-15 09:54:23.875 java[3902:1703] Unable to load realm info from SCDynamicStore
Found 12 items
-rw-r--r--   1 dwilliams supergroup          0 2013-09-15 09:52 /Users/dwilliams/output/_SUCCESS
drwxr-xr-x   - dwilliams supergroup          0 2013-09-15 09:52 /Users/dwilliams/output/_logs
-rw-r--r--   1 dwilliams supergroup    7223469 2013-09-15 09:52 /Users/dwilliams/output/part-00000
-rw-r--r--   1 dwilliams supergroup    7225393 2013-09-15 09:52 /Users/dwilliams/output/part-00001
-rw-r--r--   1 dwilliams supergroup    7223560 2013-09-15 09:52 /Users/dwilliams/output/part-00002
-rw-r--r--   1 dwilliams supergroup    7222830 2013-09-15 09:52 /Users/dwilliams/output/part-00003
-rw-r--r--   1 dwilliams supergroup    7224602 2013-09-15 09:52 /Users/dwilliams/output/part-00004
-rw-r--r--   1 dwilliams supergroup    7225045 2013-09-15 09:52 /Users/dwilliams/output/part-00005
-rw-r--r--   1 dwilliams supergroup    7222759 2013-09-15 09:52 /Users/dwilliams/output/part-00006
-rw-r--r--   1 dwilliams supergroup    7223617 2013-09-15 09:52 /Users/dwilliams/output/part-00007
-rw-r--r--   1 dwilliams supergroup    7223181 2013-09-15 09:52 /Users/dwilliams/output/part-00008
-rw-r--r--   1 dwilliams supergroup    7223078 2013-09-15 09:52 /Users/dwilliams/output/part-00009

How do I get the results in 1 file? Should I use the identity reducer?

Answer

1. To achieve output.collect(word), you can make use of the NullWritable class. To do that, emit output.collect(word, NullWritable.get()) in your Mapper. Note that NullWritable is a singleton.
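A minimal sketch of the question's mapper rewritten with NullWritable, keeping the same old-API style as the original code (the class name Map is taken from the question; everything else follows the standard org.apache.hadoop.mapred types):

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public static class Map extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, NullWritable> {
    private final Text word = new Text();

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, NullWritable> output, Reporter reporter)
            throws IOException {
        word.set(value.toString());
        // NullWritable serializes to nothing, so TextOutputFormat writes
        // only the key text on each line, with no separator or value
        output.collect(word, NullWritable.get());
    }
}

Remember to also declare NullWritable as the job's output value class so the output format matches what the mapper emits.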

2. If you do not want multiple output files, you can set the number of reducers to 1. But this incurs additional overhead, because it involves a lot of data shuffling over the network: the single Reducer has to fetch its input from the n different machines where the Mappers were running, and all of the reduce-side load lands on just one machine. Still, if you want exactly one output file, one reducer is definitely the way to get it; conf.setNumReduceTasks(1) should be sufficient to achieve that.
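With the old API the question uses, a driver forcing a single output file might look roughly like this. This is a sketch: MyDriver, the job name, and the input/output paths are placeholders, and the identity reducer is simply Hadoop's default when no reducer class is set:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

JobConf conf = new JobConf(MyDriver.class);   // MyDriver is a placeholder driver class
conf.setJobName("text-transform");
conf.setMapperClass(Map.class);
// no setReducerClass() call: the default identity reducer passes pairs through
conf.setNumReduceTasks(1);                    // one reducer => one part-00000 file
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(NullWritable.class);
FileInputFormat.setInputPaths(conf, new Path("/Users/dwilliams/input"));
FileOutputFormat.setOutputPath(conf, new Path("/Users/dwilliams/output"));
JobClient.runJob(conf);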

A few small suggestions:

  • I would not suggest using getmerge, as it copies the resulting file onto the local FS. As a result, you would have to copy it back to HDFS in order to use it further.
  • Use the new API if possible for you.
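For comparison, the same map-only idea in the new API (org.apache.hadoop.mapreduce) might look roughly like this; the class name PassThroughMapper is hypothetical:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class PassThroughMapper
        extends Mapper<LongWritable, Text, Text, NullWritable> {
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // emit each input line as the key, with no value
        word.set(value.toString());
        context.write(word, NullWritable.get());
    }
}

In the new API the driver would use the Job class instead of JobConf, and job.setNumReduceTasks(1) plays the same role as the old conf.setNumReduceTasks(1).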
