Hadoop producing no output


Problem description


I had a hadoop job running using the old API, I moved my implementation to the new API and am having problems running it. When the job runs no exceptions are thrown but I never get any output files produced. Under the old API it would produce output files with my sorted list of results. This is the job being run:

Configuration config = new Configuration();
Job job = Job.getInstance(config, "sorting");

job.setOutputKeyClass(IntWritable.class);
job.setOutputValueClass(IntWritable.class);

job.setMapperClass(SortMapper.class);
job.setCombinerClass(SortReducer.class);
job.setReducerClass(SortReducer.class);

job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);

FileInputFormat.setInputPaths(job, new Path(inputFileLocation));
FileOutputFormat.setOutputPath(job, new Path(outputFileLocation));

job.setJarByClass(HadoopTest.class);

long startTime = System.currentTimeMillis();
job.submit();
long endTime = System.currentTimeMillis();

long duration = endTime - startTime;
System.out.println("Duration: " + duration);

This is my mapper impl:

public static class SortMapper extends MultithreadedMapper<LongWritable, Text, IntWritable, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private IntWritable intKey = new IntWritable();

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        intKey.set(Integer.parseInt(value.toString()));
        context.write(intKey, one);
    }
}

This is my reducer impl:

public static class SortReducer extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {
    @Override
    protected void reduce(IntWritable key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        Iterator<IntWritable> iterator = values.iterator();
        while (iterator.hasNext()) {
            sum += iterator.next().get();
        }
        context.write(key, new IntWritable(sum));
    }
}

The logs appear as follows (when running with the old API I always got the complaints about "unable to load realm mapping info..." and "Unable to load native-hadoop..."):

2014-03-18 10:19:41.299 java[13311:1d03] Unable to load realm mapping info from SCDynamicStore
14/03/18 10:19:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/03/18 10:19:41 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
14/03/18 10:19:41 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
14/03/18 10:19:41 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
14/03/18 10:19:41 WARN mapreduce.JobSubmitter: No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
14/03/18 10:19:41 INFO input.FileInputFormat: Total input paths to process : 2
14/03/18 10:19:41 INFO mapreduce.JobSubmitter: number of splits:2
14/03/18 10:19:42 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local904621238_0001
14/03/18 10:19:42 WARN conf.Configuration: file:/tmp/hadoop-james.mchugh/mapred/staging/james.mchugh904621238/.staging/job_local904621238_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
14/03/18 10:19:42 WARN conf.Configuration: file:/tmp/hadoop-james.mchugh/mapred/staging/james.mchugh904621238/.staging/job_local904621238_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
14/03/18 10:19:42 WARN conf.Configuration: file:/tmp/hadoop-james.mchugh/mapred/local/localRunner/james.mchugh/job_local904621238_0001/job_local904621238_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
14/03/18 10:19:42 WARN conf.Configuration: file:/tmp/hadoop-james.mchugh/mapred/local/localRunner/james.mchugh/job_local904621238_0001/job_local904621238_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
14/03/18 10:19:42 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
14/03/18 10:19:42 INFO mapred.LocalJobRunner: OutputCommitter set in config null

Solution

Try job.waitForCompletion(true); instead of job.submit();. Since you are running MapReduce in local mode, you should wait for the result before JUnit kills your local JobTracker: job.submit() returns as soon as the job is queued, so the JVM can exit before any output files are written.
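The corrected driver would then look like the sketch below. This reuses the class and variable names from the question (HadoopTest, SortMapper, SortReducer, inputFileLocation, outputFileLocation) and changes only the submit call; it assumes the Hadoop client libraries are on the classpath and is not runnable outside a Hadoop environment.

```java
Configuration config = new Configuration();
Job job = Job.getInstance(config, "sorting");

job.setJarByClass(HadoopTest.class);
job.setMapperClass(SortMapper.class);
job.setCombinerClass(SortReducer.class);
job.setReducerClass(SortReducer.class);

job.setOutputKeyClass(IntWritable.class);
job.setOutputValueClass(IntWritable.class);

job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);

FileInputFormat.setInputPaths(job, new Path(inputFileLocation));
FileOutputFormat.setOutputPath(job, new Path(outputFileLocation));

long startTime = System.currentTimeMillis();
// waitForCompletion(true) blocks until the job finishes and prints
// progress to stdout; job.submit() would return immediately, letting
// the test (and the LocalJobRunner) die before output is committed.
boolean success = job.waitForCompletion(true);
long duration = System.currentTimeMillis() - startTime;
System.out.println("Duration: " + duration + ", success: " + success);
```

The returned boolean also gives you a pass/fail signal to assert on in a JUnit test, which job.submit() alone never provides.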
