Trigger a Spark job whenever an event occurs


Question

I have a Spark application that should run whenever it receives a Kafka message on a topic.

I won't be receiving more than 5-6 messages a day, so I don't want to take the Spark Streaming approach. Instead, I tried to submit the application using SparkLauncher, but I don't like that approach because I have to set the Spark and Java classpaths programmatically within my code, along with all the necessary Spark properties such as executor cores, executor memory, etc.
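For reference, a minimal sketch of the SparkLauncher pattern being described here; the paths, class name, and resource settings are placeholders, not values from the question:

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class LauncherExample {
    public static void main(String[] args) throws Exception {
        // Every path and property below has to be wired up in code,
        // which is exactly the objection raised above
        SparkAppHandle handle = new SparkLauncher()
                .setSparkHome("/opt/spark")                   // placeholder path
                .setAppResource("/path/to/my-app.jar")        // placeholder jar
                .setMainClass("com.example.MySparkApp")       // placeholder class
                .setMaster("yarn")
                .setConf(SparkLauncher.EXECUTOR_CORES, "2")
                .setConf(SparkLauncher.EXECUTOR_MEMORY, "2g")
                .startApplication();

        // Block until the launched application reaches a terminal state
        while (!handle.getState().isFinal()) {
            Thread.sleep(1000);
        }
    }
}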

How do I trigger the Spark application to run from spark-submit, but make it wait until it receives a message?

Any pointers would be very helpful.

Answer

You can use the shell-script approach with the nohup command to submit the job, like this...

"nohup spark-submit shell script <parameters> 2>&1 < /dev/null &"

Whenever you get a message, you can poll for that event and call this shell script.

Below is a code snippet to do this. For more on nohup, have a look at https://en.wikipedia.org/wiki/Nohup

/**
 * Submits a job by running a shell command line (e.g. a spark-submit or
 * MapReduce invocation) through /bin/sh.
 *
 * @param commandToExecute the shell command line to run
 * @return Boolean.TRUE on success, Boolean.FALSE if launching failed
 */
public static Boolean executeCommand(final String commandToExecute) {
    try {
        final Runtime rt = Runtime.getRuntime();
        // Run through a shell so redirects, '&' and nohup work as expected
        final String[] arr = { "/bin/sh", "-c", commandToExecute };
        final Process proc = rt.exec(arr);
        // Wait for the launched command to exit
        final int exitVal = proc.waitFor();
        LOG.trace("commandToExecute exited with code: " + exitVal);
        proc.destroy();
    } catch (final Exception e) {
        LOG.error("Exception occurred while launching process: " + e.getMessage());
        return Boolean.FALSE;
    }
    return Boolean.TRUE;
}
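As a hedged sketch of the polling idea, assuming the kafka-clients library, a broker at localhost:9092, a hypothetical trigger-jobs topic, and that executeCommand above is in scope:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SparkJobTrigger {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "spark-job-trigger");       // made-up group id
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("trigger-jobs")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(10));
                if (!records.isEmpty()) {
                    // Fire the Spark job in the background via the helper above
                    executeCommand("nohup spark-submit <parameters> 2>&1 < /dev/null &");
                }
            }
        }
    }
}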

- Another way: using ProcessBuilder

private static final String SHELL_SCRIPT = "./sparksubmit.sh";

private static void executeProcess(Operation command, String database) throws IOException {

    // Working directory that contains the sparksubmit.sh script
    final File executorDirectory = new File("src/main/resources/");

    ProcessBuilder processBuilder =
            new ProcessBuilder(SHELL_SCRIPT, command.getOperation(), "argument-one");
    processBuilder.directory(executorDirectory);

    Process process = processBuilder.start();
    try {
        int shellExitStatus = process.waitFor();
        if (shellExitStatus == 0) {
            logger.info("Successfully executed the shell script");
        }
    } catch (InterruptedException ex) {
        logger.error("Shell script process was interrupted");
    }
}
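One advantage of ProcessBuilder over Runtime.exec is that the child process's working directory (and environment) can be set explicitly. A quick usage sketch, assuming everything lives in one class and that Operation, which is not defined in the answer, is a simple enum:

// Assumed stand-in for the Operation type referenced above
enum Operation {
    SPARK_SUBMIT;
    String getOperation() { return "spark-submit"; }
}

public static void main(String[] args) throws Exception {
    // Runs "./sparksubmit.sh spark-submit argument-one" from src/main/resources/
    executeProcess(Operation.SPARK_SUBMIT, "sales_db"); // "sales_db" is a made-up value
}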

- Third way: JSch

Run a command over SSH with JSch
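The answer names this technique without a snippet, so here is a minimal hedged sketch; the host, user, and password are placeholders, and a real setup should use key-based authentication and proper host-key checking:

import com.jcraft.jsch.ChannelExec;
import com.jcraft.jsch.JSch;
import com.jcraft.jsch.Session;

public class RemoteSparkSubmit {
    public static void main(String[] args) throws Exception {
        JSch jsch = new JSch();
        Session session = jsch.getSession("user", "edge-node.example.com", 22); // placeholders
        session.setPassword("password");                  // prefer jsch.addIdentity(keyFile)
        session.setConfig("StrictHostKeyChecking", "no"); // demo only; verify host keys in production
        session.connect();

        // Run the submit command on the remote host
        ChannelExec channel = (ChannelExec) session.openChannel("exec");
        channel.setCommand("nohup spark-submit <parameters> 2>&1 < /dev/null &");
        channel.connect();

        // Wait for the remote command to finish, then clean up
        while (!channel.isClosed()) {
            Thread.sleep(500);
        }
        System.out.println("remote exit status: " + channel.getExitStatus());
        channel.disconnect();
        session.disconnect();
    }
}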

One of my favourite books, Data Algorithms, uses the approach below: submitting the job to YARN programmatically through the yarn Client API.

// import required classes and interfaces
import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;
import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;

public class SubmitSparkJobToYARNFromJavaCode {

   public static void main(String[] arguments) throws Exception {

       // prepare arguments to be passed to 
       // org.apache.spark.deploy.yarn.Client object
       String[] args = new String[] {
           // the name of your application
           "--name",
           "myname",

           // memory for driver (optional)
           "--driver-memory",
           "1000M",

           // path to your application's JAR file 
           // required in yarn-cluster mode      
           "--jar",
           "/Users/mparsian/zmp/github/data-algorithms-book/dist/data_algorithms_book.jar",

           // name of your application's main class (required)
           "--class",
           "org.dataalgorithms.bonus.friendrecommendation.spark.SparkFriendRecommendation",

           // comma-separated list of local jars that you want
           // SparkContext.addJar to work with
           "--addJars",
           "/Users/mparsian/zmp/github/data-algorithms-book/lib/spark-assembly-1.5.2-hadoop2.6.0.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/log4j-1.2.17.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/junit-4.12-beta-2.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/jsch-0.1.42.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/JeraAntTasks.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/jedis-2.5.1.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/jblas-1.2.3.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/hamcrest-all-1.3.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/guava-18.0.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-math3-3.0.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-math-2.2.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-logging-1.1.1.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-lang3-3.4.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-lang-2.6.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-io-2.1.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-httpclient-3.0.1.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-daemon-1.0.5.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-configuration-1.6.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-collections-3.2.1.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-cli-1.2.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/cloud9-1.3.2.jar",

           // argument 1 to your Spark program (SparkFriendRecommendation)
           "--arg",
           "3",

           // argument 2 to your Spark program (SparkFriendRecommendation)
           "--arg",
           "/friends/input",

           // argument 3 to your Spark program (SparkFriendRecommendation)
           "--arg",
           "/friends/output",

           // argument 4 to your Spark program (SparkFriendRecommendation)
           // this is a helper argument to create a proper JavaSparkContext object
           // make sure that you create the following in SparkFriendRecommendation program
           // ctx = new JavaSparkContext("yarn-cluster", "SparkFriendRecommendation");
           "--arg",
           "yarn-cluster"
       };

       // create a Hadoop Configuration object
       Configuration config = new Configuration();

       // indicate that Spark will be used in YARN mode
       System.setProperty("SPARK_YARN_MODE", "true");

       // create an instance of SparkConf object
       SparkConf sparkConf = new SparkConf();

       // create ClientArguments, which will be passed to Client
       ClientArguments cArgs = new ClientArguments(args, sparkConf); 

       // create an instance of the YARN Client
       Client client = new Client(cArgs, config, sparkConf); 

       // submit Spark job to YARN
       client.run(); 
   }
}
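Note that this listing targets Spark 1.x (the spark-assembly-1.5.2-hadoop2.6.0.jar under --addJars gives the vintage away); the ClientArguments constructor and the SPARK_YARN_MODE property changed in later Spark releases, so treat it as a template rather than drop-in code for a modern cluster.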

