Apache Spark on Docker
Question

Can't run Apache Spark on Docker.
When I try to communicate from my driver to the Spark master I receive the following error:
15/04/03 13:08:28 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Solution

This error sounds like the workers have not registered with the master.
This can be checked on the master's Spark web UI at
http://<masterip>:8080
You could also simply use a different docker image, or compare docker images with one that works and see what is different.
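For scripted checks, the standalone master also serves its status as JSON (typically at http://&lt;masterip&gt;:8080/json). A minimal sketch of counting registered ALIVE workers; it parses a hypothetical captured response rather than querying a live cluster, so the JSON shown is an assumption, not real output:

```shell
# Hypothetical sample of the master's /json status response; on a live
# cluster you would fetch it with:  curl -s http://192.168.1.10:8080/json
json='{"url":"spark://192.168.1.10:7077","workers":[{"id":"w1","state":"ALIVE"},{"id":"w2","state":"DEAD"}]}'

# Count workers reporting state ALIVE; zero here matches the
# "Initial job has not accepted any resources" symptom.
alive=$(printf '%s' "$json" | grep -o '"state":"ALIVE"' | wc -l)
echo "alive workers: $alive"
```

If this reports 0 alive workers while the containers are running, the workers could not reach the master's advertised address and port.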
I have dockerized a spark master and spark worker.
If you have a Linux machine sitting behind a NAT router, like a home firewall, that allocates addresses in the private 192.168.1.* network to the machines, this script will download a spark 1.3.1 master and a worker to run in separate docker containers with addresses 192.168.1.10 and .11 respectively. You may need to tweak the addresses if 192.168.1.10 and 192.168.1.11 are already used on your LAN.
pipework is a utility for bridging the LAN to the container instead of using the internal docker bridge.
Spark requires all of the machines to be able to communicate with each other. As far as I can tell, Spark is not hierarchical; I've seen the workers try to open ports to each other. So in the shell script I expose all the ports, which is OK if the machines are otherwise firewalled, such as behind a home NAT router.
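If exposing every port is undesirable, standalone Spark can instead be pinned to fixed ports via spark-env.sh, so only those ports need opening. A sketch under assumptions: the variable names below are the standard standalone-mode settings, but the conf location inside the drpaulbrewer images would need checking, and jobs additionally use driver/executor ports not pinned here:

```shell
# Sketch: pin the standalone daemons to fixed ports via spark-env.sh
# (written to the current directory for illustration; the real file
# lives under $SPARK_HOME/conf/ in the image).
cat > spark-env.sh <<'EOF'
SPARK_MASTER_PORT=7077        # master RPC port (the spark:// port)
SPARK_MASTER_WEBUI_PORT=8080  # master web UI
SPARK_WORKER_PORT=7078        # worker RPC port
SPARK_WORKER_WEBUI_PORT=8081  # worker web UI
EOF
grep -c 'PORT' spark-env.sh
```

With fixed daemon ports you could narrow `--expose=1-65535` down to this handful plus whatever ports your driver and executors use.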
./run-docker-spark
#!/bin/bash
sudo -v
MASTER=$(docker run --name="master" -h master \
  --add-host master:192.168.1.10 --add-host spark1:192.168.1.11 \
  --add-host spark2:192.168.1.12 --add-host spark3:192.168.1.13 \
  --add-host spark4:192.168.1.14 --expose=1-65535 \
  --env SPARK_MASTER_IP=192.168.1.10 \
  -d drpaulbrewer/spark-master:latest)
sudo pipework eth0 $MASTER 192.168.1.10/24@192.168.1.1
SPARK1=$(docker run --name="spark1" -h spark1 \
  --add-host home:192.168.1.8 --add-host master:192.168.1.10 \
  --add-host spark1:192.168.1.11 --add-host spark2:192.168.1.12 \
  --add-host spark3:192.168.1.13 --add-host spark4:192.168.1.14 \
  --expose=1-65535 --env mem=10G \
  --env master=spark://192.168.1.10:7077 \
  -v /data:/data -v /tmp:/tmp \
  -d drpaulbrewer/spark-worker:latest)
sudo pipework eth0 $SPARK1 192.168.1.11/24@192.168.1.1
After running this script I can see the master web report at 192.168.1.10:8080, or go to another machine on my LAN that has a spark distribution, and run
./spark-shell --master spark://192.168.1.10:7077
and it will bring up an interactive scala shell.