Calling spark-ec2 from within an EC2 instance: ssh connection to host refused

Question

In order to run Amplab's training exercises, I've created a keypair on us-east-1, installed the training scripts (git clone git://github.com/amplab/training-scripts.git -b ampcamp4), and created the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, following the instructions at http://ampcamp.berkeley.edu/big-data-mini-course/launching-a-bdas-cluster-on-ec2.html

Now running

 ./spark-ec2 -i ~/.ssh/myspark.pem -r us-east-1  -k myspark --copy launch try1

generates the following messages:

 johndoe@ip-some-instance:~/projects/spark/training-scripts$ ./spark-ec2 -i ~/.ssh/myspark.pem -r us-east-1  -k myspark --copy launch try1
 Setting up security groups...
 Searching for existing cluster try1...
 Latest Spark AMI: ami-19474270
 Launching instances...
 Launched 5 slaves in us-east-1b, regid = r-0c5e5ee3
 Launched master in us-east-1b, regid = r-316060de
 Waiting for instances to start up...
 Waiting 120 more seconds...
 Copying SSH key /home/johndoe/.ssh/myspark.pem to master...
 ssh: connect to host ec2-54-90-57-174.compute-1.amazonaws.com port 22: Connection refused
 Error connecting to host Command 'ssh -t -o StrictHostKeyChecking=no -i /home/johndoe/.ssh/myspark.pem root@ec2-54-90-57-174.compute-1.amazonaws.com 'mkdir -p ~/.ssh'' returned  non-zero exit status 255, sleeping 30
 ssh: connect to host ec2-54-90-57-174.compute-1.amazonaws.com port 22: Connection refused
 Error connecting to host Command 'ssh -t -o StrictHostKeyChecking=no -i /home/johndoe/.ssh/myspark.pem root@ec2-54-90-57-174.compute-1.amazonaws.com 'mkdir -p ~/.ssh'' returned non-zero exit status 255, sleeping 30
 ...
 ...
 subprocess.CalledProcessError: Command 'ssh -t -o StrictHostKeyChecking=no -i /home/johndoe/.ssh/myspark.pem root@ec2-54-90-57-174.compute-1.amazonaws.com '/root/spark/bin/stop-all.sh'' returned non-zero exit status 127

where root@ec2-54-90-57-174.compute-1.amazonaws.com is the user and master instance. I've tried -u ec2-user and increasing -w all the way up to 600, but I get the same error.

I can see the master and slave instances in us-east-1 when I log into the AWS console, and I can actually ssh into the master instance from the 'local' ip-some-instance shell.

My understanding is that the spark-ec2 script takes care of defining the master/slave security groups (which ports are open and so on), and that I shouldn't have to tweak these settings. That said, the master and slaves all accept connections on port 22 (Port: 22, Protocol: tcp, Source: 0.0.0.0/0 in the ampcamp3-slaves/masters security groups).

I'm at a loss here, and would appreciate any pointers before I spend all my R&D funds on EC2 instances.... Thanks.

Recommended answer

This is most likely caused by SSH taking a long time to start up on the instances, so the 120-second timeout expires before the machines can be logged into. You should be able to run

./spark-ec2 -i ~/.ssh/myspark.pem -r us-east-1  -k myspark --copy launch --resume try1

(with the --resume flag) to continue from where things left off without launching new instances. This issue will be fixed in Spark 1.2.0, which has a new mechanism that intelligently checks the SSH status rather than relying on a fixed timeout. We're also addressing the root causes of the long SSH startup delay by building new AMIs.
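To illustrate polling for SSH availability instead of sleeping for a fixed time, here is a minimal Python 3 sketch. It is not the code used by spark-ec2 or Spark 1.2.0; the function name wait_for_ssh and its parameters are hypothetical, and it only checks that the TCP port accepts connections, which does not guarantee that a login will succeed.

```python
import socket
import time


def wait_for_ssh(host, port=22, timeout=600.0, interval=5.0, connect_timeout=3.0):
    """Poll host:port until a TCP connection succeeds or `timeout` seconds elapse.

    Hypothetical helper for illustration only; not part of spark-ec2.
    Returns True once the port accepts a connection, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=connect_timeout):
                return True
        except OSError:
            # Connection refused or timed out: sshd is probably not up yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

One possible use is to call wait_for_ssh("ec2-54-90-57-174.compute-1.amazonaws.com") from the machine that runs spark-ec2, and re-run the launch command with --resume only once it returns True.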
