How to run concurrent jobs (actions) in Apache Spark using a single Spark context

Question



The Apache Spark documentation says that "within each Spark application, multiple 'jobs' (Spark actions) may be running concurrently if they were submitted by different threads". Can someone explain how to achieve this concurrency for the following sample code?

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    SparkConf conf = new SparkConf().setAppName("Simple_App");
    JavaSparkContext sc = new JavaSparkContext(conf);

    JavaRDD<String> file1 = sc.textFile("/path/to/test_doc1");
    JavaRDD<String> file2 = sc.textFile("/path/to/test_doc2");

    // Each count() blocks until its job finishes, so the two jobs run sequentially here.
    System.out.println(file1.count());
    System.out.println(file2.count());

These two jobs are independent and must run concurrently.

Thank you.

Answer

Try something like this:

    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    final JavaSparkContext sc = new JavaSparkContext("local[2]", "Simple_App");
    ExecutorService executorService = Executors.newFixedThreadPool(2);

    // Submit the first count from its own thread.
    Future<Long> future1 = executorService.submit(new Callable<Long>() {
        @Override
        public Long call() throws Exception {
            JavaRDD<String> file1 = sc.textFile("/path/to/test_doc1");
            return file1.count();
        }
    });
    // Submit the second count from another thread; both jobs now run concurrently.
    Future<Long> future2 = executorService.submit(new Callable<Long>() {
        @Override
        public Long call() throws Exception {
            JavaRDD<String> file2 = sc.textFile("/path/to/test_doc2");
            return file2.count();
        }
    });
    // Block until each job finishes and print its result.
    System.out.println("File1:" + future1.get());
    System.out.println("File2:" + future2.get());

    executorService.shutdown();

