How to set up Zeppelin to work with a remote EMR Yarn cluster
Question
I have an Amazon EMR Hadoop v2.6 cluster running Spark 1.4.1 with the Yarn resource manager. I want to deploy Zeppelin on a separate machine so that the EMR cluster can be turned off when no jobs are running.
I tried following the instructions at https://zeppelin.incubator.apache.org/docs/install/yarn_install.html, without much success.
Can somebody demystify the steps Zeppelin needs in order to connect to an existing Yarn cluster from a different machine?
Recommended answer
[1] Install Zeppelin with the proper parameters:
git clone https://github.com/apache/incubator-zeppelin.git ~/zeppelin;
cd ~/zeppelin;
mvn clean package -Pspark-1.4 -Dhadoop.version=2.6.0 -Phadoop-2.6 -Pyarn -DskipTests
[2] Update the EMR_MASTER EC2 security group to accept incoming requests on all ports, so that the cluster can communicate with Zeppelin (this should be limited to specific ports, but I do not yet know which ones).
[3] Copy directory EMR_MASTER:/etc/hadoop/conf to MY_STANDALONE_SERVER:/home/zeppelin/hadoop-conf.
[4] zeppelin/conf/zeppelin-env.sh should contain:
export MASTER=yarn-client
export HADOOP_CONF_DIR=/home/zeppelin/hadoop-conf
Note: Spark parameters such as spark.executor.instances are taken from the Interpreter settings and should be specified there.
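Before starting Zeppelin, it may help to confirm that the directory copied in step [3] actually contains the files a Yarn client reads. The sketch below is not part of the original answer: check_hadoop_conf is a hypothetical helper name, and it assumes the copied directory should contain core-site.xml and yarn-site.xml, as a typical Hadoop configuration directory does.

```shell
# check_hadoop_conf DIR
# Hypothetical helper: succeeds only if DIR looks like a copied Hadoop
# configuration directory (assumption: core-site.xml and yarn-site.xml
# are the files that must be present).
check_hadoop_conf() {
  local dir="$1" f
  for f in core-site.xml yarn-site.xml; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      return 1
    fi
  done
  # Print any ResourceManager-related lines so the target cluster can be
  # eyeballed; prints nothing if the file has no such property names.
  grep -A1 'yarn.resourcemanager' "$dir/yarn-site.xml" || true
  return 0
}
```

For the layout used above, this would be called as check_hadoop_conf /home/zeppelin/hadoop-conf, using the same path that HADOOP_CONF_DIR is set to in zeppelin-env.sh.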