Unable to submit concurrent Hadoop jobs

Problem Description

I am running Hadoop 2.7 on my local machine, along with HBase 1.4 and Phoenix 4.15. I have written an application which submits map reduce jobs that delete data in HBase through Phoenix. Each job is run by an individual thread of a ThreadPoolExecutor and looks like this:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;
import org.apache.phoenix.mapreduce.util.PhoenixMapReduceUtil;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Task (a Runnable base class with id bookkeeping) and DeleteMR (the mapper and
// writable definitions) are defined elsewhere in the application.
public class MRDeleteTask extends Task {

    private static final Logger LOGGER = LoggerFactory.getLogger(MRDeleteTask.class);
    private final String query;

    public MRDeleteTask(int id, String q) {
        this.setId(id);
        this.query = q;
    }

    @Override
    public void run() {
        LOGGER.info("Running Task: " + getId());
        try {
            Configuration configuration = HBaseConfiguration.create();
            Job job = Job.getInstance(configuration, "phoenix-mr-job-" + getId());
            LOGGER.info("mapper input: " + this.query);
            // Feed the rows selected by the Phoenix query to the mapper.
            PhoenixMapReduceUtil.setInput(job, DeleteMR.PhoenixDBWritable.class, "Table", this.query);
            job.setMapperClass(DeleteMR.DeleteMapper.class);
            job.setJarByClass(DeleteMR.class);
            // Map-only job: no reducers, and nothing is written to the output format.
            job.setNumReduceTasks(0);
            job.setOutputFormatClass(NullOutputFormat.class);
            job.setOutputKeyClass(ImmutableBytesWritable.class);
            job.setOutputValueClass(Writable.class);
            TableMapReduceUtil.addDependencyJars(job);
            boolean result = job.waitForCompletion(true);
            LOGGER.info("Task " + getId() + " completed: " + result);
        }
        catch (Exception e) {
            LOGGER.info(e.getMessage());
        }
    }
}

Everything is fine if there is only one thread in the ThreadPoolExecutor. If more than one such Hadoop job is submitted concurrently, nothing happens. According to the logs, the errors look like:

4439 [pool-1-thread-2] INFO  MRDeleteTask  - java.util.concurrent.ExecutionException: java.io.IOException: Unable to rename file: [/tmp/hadoop-user/mapred/local/1595274269610_tmp/tmp_phoenix-4.15.0-HBase-1.4-client.jar] to [/tmp/hadoop-user/mapred/local/1595274269610_tmp/phoenix-4.15.0-HBase-1.4-client.jar]

4439 [pool-1-thread-1] INFO  MRDeleteTask  - java.util.concurrent.ExecutionException: ExitCodeException exitCode=1: chmod: /private/tmp/hadoop-user/mapred/local/1595274269610_tmp/phoenix-4.15.0-HBase-1.4-client.jar: No such file or directory

The tasks are submitted using ThreadPoolExecutor.submit(), and their status is checked by polling future.isDone() on the returned Future.
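
For reference, a minimal sketch of the driver side described above, assuming Task implements Runnable; the pool size and the SELECT statements are placeholders, since the real ones are not shown in the question:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MRDeleteDriver {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(2);

        // Hypothetical queries -- the real statements are not shown in the question.
        List<Future<?>> futures = new ArrayList<>();
        futures.add(executor.submit(new MRDeleteTask(1, "SELECT ID FROM \"Table\" WHERE TENANT = 'a'")));
        futures.add(executor.submit(new MRDeleteTask(2, "SELECT ID FROM \"Table\" WHERE TENANT = 'b'")));

        // future.isDone() never blocks, so poll until every job reports completion.
        for (Future<?> future : futures) {
            while (!future.isDone()) {
                Thread.sleep(1000L);
            }
        }
        executor.shutdown();
    }
}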

Recommended Answer

The jobs were not being submitted to YARN; they were running locally (launched from IntelliJ) via Hadoop's LocalJobRunner. As the logs above show, the concurrent local jobs raced over the same dependency jar under /tmp/hadoop-user/mapred/local, which is why the rename and chmod calls failed. Adding the following to the job configuration solved the issue:

configuration.set("mapreduce.framework.name", "yarn");
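
In context, this goes where the job Configuration is created in run(). A short sketch; the yarn.resourcemanager.address value below is an assumption for a default single-node setup and must match your cluster:

Configuration configuration = HBaseConfiguration.create();
// Submit to the YARN cluster instead of running in-process via the LocalJobRunner.
configuration.set("mapreduce.framework.name", "yarn");
// Assumption: default single-node ResourceManager port; adjust for your environment.
configuration.set("yarn.resourcemanager.address", "localhost:8032");
Job job = Job.getInstance(configuration, "phoenix-mr-job-" + getId());

With YARN handling submission, each job gets its own staging directory, so concurrent submissions no longer collide in the shared local /tmp area.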
