Apache Spark on Docker

This article describes how to deal with running Apache Spark on Docker. It should be a useful reference for anyone hitting the same problem; read on below.

Problem Description


Can’t run Apache Spark on Docker.

When I try to communicate from my driver to the Spark master I receive the following error:

15/04/03 13:08:28 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

Solution

This error sounds like the workers have not registered with the master.

This can be checked at the master's Spark web UI: http://<masterip>:8080
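
If you can reach that address from another machine on the LAN, a quick command-line check works as well. This is a minimal sketch, assuming the default master UI port 8080 and the 192.168.1.10 master address used in the script further down; the standalone master's /json endpoint returns cluster state, including the registered workers.

# Ask the standalone master for its cluster state (assumes default UI port 8080);
# an empty "workers" list means no worker has registered.
curl -s http://192.168.1.10:8080/json
# Alternatively, grep the HTML status page for worker entries:
curl -s http://192.168.1.10:8080 | grep -i worker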

You could also simply use a different docker image, or compare docker images with one that works and see what is different.
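
If you take the route of comparing images, Docker itself can show how each image was assembled; a small sketch, assuming the drpaulbrewer images used in the script below (substitute whatever image is known to work for the comparison):

# Print the layer-by-layer build commands of each image and compare by eye.
docker history --no-trunc drpaulbrewer/spark-master:latest
docker history --no-trunc drpaulbrewer/spark-worker:latest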

I have dockerized a spark master and spark worker.

If you have a Linux machine sitting behind a NAT router, like a home firewall, that allocates addresses in the private 192.168.1.* network to the machines, this script will download a spark 1.3.1 master and a worker to run in separate docker containers with addresses 192.168.1.10 and .11 respectively. You may need to tweak the addresses if 192.168.1.10 and 192.168.1.11 are already used on your LAN.
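
Before running the script it is worth confirming that those two addresses really are free on your LAN; a minimal sketch, assuming the 192.168.1.10/.11 addresses from the script:

# Each ping should fail or time out if the address is currently unused.
ping -c 1 -W 1 192.168.1.10
ping -c 1 -W 1 192.168.1.11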

pipework is a utility for bridging the LAN to the container instead of using the internal docker bridge.

Spark requires all of the machines to be able to communicate with each other. As far as I can tell, Spark is not hierarchical; I've seen the workers try to open ports to each other. So in the shell script I expose all the ports, which is OK if the machines are otherwise firewalled, such as behind a home NAT router.
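
Once the containers from the script below are up, one way to sanity-check that they really are reachable from the rest of the LAN is to probe the well-known ports from another machine; a sketch, assuming the addresses from the script and the standalone defaults (7077 for the master RPC port, 8081 for the worker web UI):

# -z: only test that the TCP port accepts a connection, send no data
nc -z -v 192.168.1.10 7077   # master
nc -z -v 192.168.1.11 8081   # worker web UI (default port)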

./run-docker-spark

#!/bin/bash
# Cache sudo credentials up front so the pipework calls below don't prompt mid-run.
sudo -v
# Start the Spark master container: give every cluster host an /etc/hosts entry and expose all ports.
MASTER=$(docker run --name="master" -h master --add-host master:192.168.1.10 --add-host spark1:192.168.1.11 --add-host spark2:192.168.1.12 --add-host spark3:192.168.1.13 --add-host spark4:192.168.1.14 --expose=1-65535 --env SPARK_MASTER_IP=192.168.1.10 -d drpaulbrewer/spark-master:latest)
# Bridge the master container onto the LAN as 192.168.1.10 (gateway 192.168.1.1).
sudo pipework eth0 $MASTER 192.168.1.10/24@192.168.1.1
# Start a worker container pointed at the master, with /data and /tmp mounted from the host.
SPARK1=$(docker run --name="spark1" -h spark1 --add-host home:192.168.1.8 --add-host master:192.168.1.10 --add-host spark1:192.168.1.11 --add-host spark2:192.168.1.12 --add-host spark3:192.168.1.13 --add-host spark4:192.168.1.14 --expose=1-65535 --env mem=10G --env master=spark://192.168.1.10:7077 -v /data:/data -v /tmp:/tmp -d drpaulbrewer/spark-worker:latest)
# Bridge the worker container onto the LAN as 192.168.1.11.
sudo pipework eth0 $SPARK1 192.168.1.11/24@192.168.1.1

After running this script I can see the master web report at 192.168.1.10:8080, or go to another machine on my LAN that has a Spark distribution, run ./spark-shell --master spark://192.168.1.10:7077, and it will bring up an interactive Scala shell.
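
To confirm that the original error is gone, i.e. that a job actually gets resources, you can also submit the bundled SparkPi example from that same machine; a sketch, assuming a Spark 1.3.1 distribution directory (the exact name of the examples jar depends on the Hadoop profile of your download):

# Submit the bundled SparkPi example against the dockerized master.
# Adjust the examples jar name to match your Spark 1.3.1 download.
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://192.168.1.10:7077 \
  lib/spark-examples-1.3.1-hadoop2.4.0.jar 100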

This concludes the article on Apache Spark on Docker. We hope the recommended answer is helpful, and we hope you will continue to support IT屋!
