Is it always the case that the Driver must be on a Master node (Yes/No)? Apache Spark


Question

Is it always the case that the driver program must be on the master node?

For example, if I set up EC2 with one master and two workers, must my code that contains the main method be executed from the master EC2 instance?

If the answer is NO, what would be the best way to set up a system where the driver is outside the EC2 master node (let's say the driver is run from my computer, while the master and workers are on EC2)? Do I always have to use spark-submit, or can I do it from an IDE such as Eclipse or IntelliJ IDEA?

If the answer is YES, what would be the best reference to learn more about it (since I need to provide some sort of proof)?

Thank you kindly for your answer; references would be highly appreciated!

Recommended answer

No, it doesn't have to be on the master.

With spark-submit you can use the --deploy-mode option to control where your driver runs: in client mode it runs on the machine you run spark-submit on (which could be the master or another machine), and in cluster mode it runs on one of the workers.
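The two deploy modes can be illustrated with a small sketch that only assembles the spark-submit command line, without running it. The master host name, main class, and JAR name below are hypothetical placeholders, and the helper `build_submit_command` is invented for illustration; only the `--master`, `--deploy-mode`, and `--class` options come from spark-submit itself.

```python
def build_submit_command(master_url, deploy_mode, main_class, app_jar):
    """Return the argument list for a spark-submit call (illustrative helper)."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    return [
        "spark-submit",
        "--master", master_url,        # where the cluster manager lives
        "--deploy-mode", deploy_mode,  # where the driver will run
        "--class", main_class,         # entry point inside the JAR
        app_jar,
    ]


# Hypothetical values: replace with your own master host, class, and JAR.
MASTER_URL = "spark://ec2-master.example.com:7077"

# Driver runs on the machine that invokes spark-submit.
client_cmd = build_submit_command(MASTER_URL, "client", "com.example.MyApp", "my-app.jar")

# Driver is launched on one of the workers.
cluster_cmd = build_submit_command(MASTER_URL, "cluster", "com.example.MyApp", "my-app.jar")

print(" ".join(client_cmd))
print(" ".join(cluster_cmd))
```

The resulting list could be passed to `subprocess.run` on a machine where spark-submit is installed; the two commands differ only in the value given to `--deploy-mode`.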

There is network communication between the workers and the driver, so you want the driver 'close' to the workers, never across a WAN.

You can run from inside a REPL (spark-shell), which could be accessed from your IDE. If you're using a dynamic language like Clojure, you can also just create a SparkContext that references (through its master setting) a local cluster, or the cluster you want to submit jobs to, and then code through the REPL. In practice it isn't this easy.
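As a rough sketch of the "create a SparkContext that references a master" idea, here is how it might look in Python with PySpark rather than Clojure (a substitution made for this example, not the answerer's setup). The helper names `standalone_master_url` and `make_context` and the host name are hypothetical; 7077 is the default port of a standalone Spark master. PySpark is imported lazily, so the URL helper works even where PySpark is not installed.

```python
def standalone_master_url(host, port=7077):
    """Build the URL of a standalone Spark master (7077 is the default port)."""
    return f"spark://{host}:{port}"


def make_context(master_url, app_name="repl-session"):
    """Create a SparkContext whose driver is the current Python process."""
    # Lazy import: only needed when a context is actually created.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setMaster(master_url).setAppName(app_name)
    return SparkContext(conf=conf)


# Hypothetical host name, for illustration only.
remote_master = standalone_master_url("ec2-master.example.com")

# From a REPL or IDE session you might then try, for example:
#   sc = make_context("local[2]")      # local threads, no cluster needed
#   sc = make_context(remote_master)   # driver here, executors on the cluster
#   sc.parallelize(range(10)).sum()
#   sc.stop()
```

Pointing the context at a remote master makes the local process the driver, so the workers must be able to reach it over the network, which is one reason this is harder in practice than it looks.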
