What is a task in Spark? How does the Spark worker execute the jar file?

Problem description

After reading some documentation at http://spark.apache.org/docs/0.8.0/cluster-overview.html, I have some questions that I want to clarify.

Take this example from Spark:

JavaSparkContext spark = new JavaSparkContext(
  new SparkConf().setJars(new String[]{"..."}).setSparkHome("..."));
JavaRDD<String> file = spark.textFile("hdfs://...");

// step1: split each line into words
JavaRDD<String> words =
  file.flatMap(new FlatMapFunction<String, String>() {
    public Iterable<String> call(String s) {
      return Arrays.asList(s.split(" "));
    }
  });

// step2: map each word to a (word, 1) pair
JavaPairRDD<String, Integer> pairs =
  words.map(new PairFunction<String, String, Integer>() {
    public Tuple2<String, Integer> call(String s) {
      return new Tuple2<String, Integer>(s, 1);
    }
  });

// step3: sum the counts for each word
JavaPairRDD<String, Integer> counts =
  pairs.reduceByKey(new Function2<Integer, Integer, Integer>() {
    public Integer call(Integer a, Integer b) {
      return a + b;
    }
  });

counts.saveAsTextFile("hdfs://...");

So let's say I have a 3-node cluster, with node 1 running as the master, and the above driver program has been properly jarred (say application-test.jar). So now I'm running this code on the master node, and I believe that right after the SparkContext is created, the application-test.jar file will be copied to the worker nodes (and each worker will create a directory for that application).
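
For concreteness, here is a rough sketch of how that driver might be constructed (the master URL is an assumption based on the standalone default port; application-test.jar is simply the jar named above):

// Hypothetical driver-side setup for this scenario; the master URL and the local
// jar path are assumptions, not taken from a real cluster.
JavaSparkContext spark = new JavaSparkContext(
  new SparkConf()
    .setMaster("spark://node1:7077")                  // node 1 acting as the master (assumed URL)
    .setAppName("application-test")
    .setJars(new String[]{"application-test.jar"}));  // shipped to each worker's application dir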

So now my question: are step1, step2 and step3 in the example the tasks that get sent over to the workers? If so, how does the worker execute them? Something like java -cp "application-test.jar" step1, and so on?

Recommended answer

When you create the SparkContext, each worker starts an executor. This is a separate process (JVM), and it loads your jar too. The executors connect back to your driver program. Now the driver can send them commands, like flatMap, map and reduceByKey in your example. When the driver quits, the executors shut down.
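
For example, a minimal sketch of how the driver from the question might finish (spark.stop() is not in the original snippet; it is the explicit way to end the application, and even without it the executors go away once the driver process exits):

// the action makes the executors compute and write out their partitions
counts.saveAsTextFile("hdfs://...");

// stopping the context (or simply exiting the driver) shuts the executors down
spark.stop();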

RDDs are sort of like big arrays that are split into partitions, and each executor can hold some of these partitions.
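
As a hypothetical illustration (not part of the original example; the partition counts are arbitrary):

// ask for at least 6 partitions when reading the file, so the work can be
// spread across the executors on the worker nodes
JavaRDD<String> lines = spark.textFile("hdfs://...", 6);

// a small in-memory collection can also be split into an explicit number of partitions
JavaRDD<Integer> numbers = spark.parallelize(Arrays.asList(1, 2, 3, 4, 5, 6), 3);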

A task is a command sent from the driver to an executor by serializing your Function object. The executor deserializes the command (this is possible because it has loaded your jar), and executes it on a partition.
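
For instance (a hypothetical snippet, not from the original post), anything the Function object captures is serialized with it and shipped to the executors as part of each task:

// the anonymous Function below, including the captured minLength value, is
// serialized by the driver and deserialized by each executor, which then runs
// call() on the elements of its own partitions
final int minLength = 4;
JavaRDD<String> longWords = words.filter(new Function<String, Boolean>() {
  public Boolean call(String s) {
    return s.length() >= minLength;
  }
});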

(This is a conceptual overview. I am glossing over some details, but I hope it is helpful.)

To answer your specific question: No, a new process is not started for each step. A new process is started on each worker when the SparkContext is constructed.
