Use spark-submit to submit an application to an EC2 cluster

Problem description

I am new to Spark and am trying to run it on EC2. I followed the tutorial on the Spark website and used spark-ec2 to launch a Spark cluster. Then I tried to use spark-submit to submit an application to the cluster. The command looks like this:

./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://ec2-54-88-9-74.compute-1.amazonaws.com:7077 --executor-memory 2G --total-executor-cores 1 ./examples/target/scala-2.10/spark-examples_2.10-1.0.0.jar 100

However, I got the following error:

ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.

Please let me know how to fix it. Thanks.

Recommended answer

You're seeing this issue because the master node of your Spark standalone cluster can't open a TCP connection back to the driver (on your machine). The default deploy mode of spark-submit is client, which runs the driver on the machine that submitted the application.

A cluster deploy mode was added that submits the job to the master, which then runs the driver inside the cluster rather than on the submitting machine, removing the need for a direct connection back to your machine. Unfortunately, this mode is not supported for standalone clusters in the Spark version used in the question (1.0.0).
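
To make the difference between the two deploy modes concrete, here is a minimal sketch that only prints the two command lines rather than running them. It reuses the master URL and jar path from the question; `build_submit_cmd` is a hypothetical helper written for this illustration, not part of Spark. `--deploy-mode` is the spark-submit option that selects where the driver runs. Whether the cluster variant actually works depends on your Spark version and cluster manager, and per this answer it did not work for standalone clusters at the time.

```shell
#!/bin/sh
# Sketch only: prints spark-submit command lines instead of running them.
# MASTER_URL and APP_JAR are copied from the question; replace with your own.
MASTER_URL="spark://ec2-54-88-9-74.compute-1.amazonaws.com:7077"
APP_JAR="./examples/target/scala-2.10/spark-examples_2.10-1.0.0.jar"

# build_submit_cmd is a hypothetical helper, not part of Spark.
# $1 is the deploy mode: "client" (the default) or "cluster".
build_submit_cmd() {
  printf '%s\n' "./bin/spark-submit --class org.apache.spark.examples.SparkPi --master $MASTER_URL --deploy-mode $1 --executor-memory 2G --total-executor-cores 1 $APP_JAR 100"
}

# Client mode: the driver runs on the submitting machine, so the cluster
# must be able to connect back to that machine.
build_submit_cmd client

# Cluster mode: the driver is launched inside the cluster.
build_submit_cmd cluster
```

Printing the commands first makes it easy to check the flags before running anything against a live cluster.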

You can vote for the JIRA issue here: https://issues.apache.org/jira/browse/SPARK-2260

Tunneling your connection via SSH is possible, but latency would be a big issue because the driver would still be running locally on your machine.
