apache-spark 1.3.0 and yarn integration and spring-boot as a container


Question

I was running a Spark application as a query service (much like spark-shell, but inside the servlet container of spring-boot) with Spark 1.0.2 in standalone mode. After upgrading to Spark 1.3.1 and trying to use YARN instead of the standalone cluster, things went south. I created an uber jar with all dependencies (spark-core, spark-yarn, spring-boot) and tried to deploy my application.

15/07/29 11:19:26 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032

15/07/29 11:19:27 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

15/07/29 11:19:28 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

15/07/29 11:19:29 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

I also tried excluding the spark-yarn dependencies from the jar and supplying them at runtime, but got the same exception. We use the MapR distribution, and they said it's not possible to run Spark jobs on YARN without using the spark-submit script. I could try to launch my webapp with that script, since my build artifact is a spring-boot jar (not a war), but that just doesn't feel right. I should be able to initialize the service from my container, not the other way around.

EDIT 1: How I launch my application: I launch it from a machine where the Hadoop client is installed and configured.

java -cp myspringbootapp.jar com.myapp.Application

com.myapp.Application in turn creates a SparkContext as a Spring-managed bean, which I use later to serve user requests.

Recommended answer

I got it working with a few steps:

1) Exclude the Hadoop jars from the uber jar (the spring-boot Maven plugin builds an uber jar by default, so the exclusion has to be made there).
2) Use the ZIP layout with the spring-boot Maven plugin, which lets you use the loader.path Spring configuration to supply extra classpath entries at runtime.
3) Launch with java -Dloader.path='/path/to/hadoop/jar,/path/to/hadoop/conf/' -jar myapp.jar

PS: The error I was getting was caused by the Hadoop jar being on the classpath without proper configuration files. By default the Hadoop jar ships with yarn-default.xml, which points the ResourceManager at 0.0.0.0:8032. You can still try packing the Hadoop jar into the uber jar, but be sure to provide the path to your custom Hadoop conf, i.e. a yarn-site.xml with the proper settings for your ResourceManager host, port, HA, etc.
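
To confirm which configuration the JVM actually sees, a small diagnostic along these lines may help. It is only a sketch using JDK classes: YarnConfCheck is a hypothetical helper name, not part of Hadoop, Spark, or Spring Boot. It assumes yarn-site.xml follows the usual Hadoop `<configuration>/<property>/<name>/<value>` layout, and it only looks at the standard keys yarn.resourcemanager.hostname and yarn.resourcemanager.address.

```java
import java.io.InputStream;
import java.net.URL;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper: reports which yarn-site.xml (if any) is visible on the classpath. */
public class YarnConfCheck {

    /** Returns the value of a Hadoop-style property from a *-site.xml stream, or null if absent. */
    public static String readProperty(InputStream xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                if (name.equals(text(p, "name"))) {
                    return text(p, "value");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    /** Text of the first child element with the given tag, or null if there is none. */
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Same lookup style Hadoop uses: a resource name resolved against the classpath.
        URL url = Thread.currentThread().getContextClassLoader().getResource("yarn-site.xml");
        if (url == null) {
            System.out.println("yarn-site.xml not on classpath; yarn-default.xml values (0.0.0.0:8032) will apply");
            return;
        }
        System.out.println("Using " + url);
        for (String key : new String[] {"yarn.resourcemanager.hostname", "yarn.resourcemanager.address"}) {
            try (InputStream in = url.openStream()) {
                System.out.println(key + " = " + readProperty(in, key));
            }
        }
    }
}
```

Running it with the Hadoop conf directory on the classpath (for example java -cp /path/to/hadoop/conf/:. YarnConfCheck) should print the location of the file that was found. If it reports that no yarn-site.xml is visible, the client would be expected to fall back to the 0.0.0.0:8032 default seen in the log above.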
