Spark Cluster + Spring XD


Problem Description

I am trying to run a Spark streaming processor on Spring XD.

The Spark processor module works on Spring XD when the Spark master points to local. The processor fails to run when we point it at a Spark standalone master (running on the same machine) or at yarn-client. Is it possible to run a Spark processor on Spark standalone or YARN inside Spring XD, or is Spark local the only option here?

The processor module is defined as:

import java.util.Properties

import org.apache.spark.streaming.dstream.{DStream, ReceiverInputDStream}
import org.springframework.xd.spark.streaming.SparkConfig
import org.springframework.xd.spark.streaming.scala.Processor

class WordCount extends Processor[String, (String, Int)] {

  def process(input: ReceiverInputDStream[String]): DStream[(String, Int)] = {
    val words = input.flatMap(_.split(" "))
    val pairs = words.map(word => (word, 1))
    val wordCounts = pairs.reduceByKey(_ + _)
    wordCounts
  }

  @SparkConfig
  def properties: Properties = {
    val props = new Properties()
    // Any specific Spark configuration properties would go here.
    // These properties always get the highest precedence
    //props.setProperty("spark.master", "spark://a.b.c.d:7077")
    props.setProperty("spark.master", "spark://abcd.hadoop.ambari:7077")
    props
  }

}

The processor works fine when the config is given as local. Is there something I am missing in the declarations?

Thanks!

EDIT: ERROR LOG

//commands executed on xd-shell
===================================================================
spark/sbin/start-all.sh

module upload --file /opt/igc_services/SparkDev/XdWordCount/build/libs/spark-streaming-wordcount-scala-processor-0.1.0.jar  --name scala-word-count --type processor

stream create spark-streaming-word-count --definition "http | processor:scala-word-count | log" --deploy


// Error Log 
====================================================================
2015-09-16T14:28:48+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module 'log' for stream 'spark-streaming-word-count'
2015-09-16T14:28:48+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module [ModuleDescriptor@6dbc4f81 moduleName = 'log', moduleLabel = 'log', group = 'spark-streaming-word-count', sourceChannelName = [null], sinkChannelName = [null], index = 2, type = sink, parameters = map[[empty]], children = list[[empty]]]
2015-09-16T14:28:48+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Path cache event: path=/deployments/modules/allocated/4ff3ba84-e6ca-47dd-894f-aa92bdbb3e06/spark-streaming-word-count.processor.processor.1, type=CHILD_ADDED
2015-09-16T14:28:48+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module 'processor' for stream 'spark-streaming-word-count'
2015-09-16T14:28:48+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module [ModuleDescriptor@5e16dafb moduleName = 'scala-word-count', moduleLabel = 'processor', group = 'spark-streaming-word-count', sourceChannelName = [null], sinkChannelName = [null], index = 1, type = processor, parameters = map[[empty]], children = list[[empty]]]
2015-09-16T14:28:49+0530 1.2.0.RELEASE WARN DeploymentsPathChildrenCache-0 util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-09-16T14:28:49+0530 1.2.0.RELEASE WARN sparkDriver-akka.actor.default-dispatcher-3 remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://sparkMaster@abcd.hadoop.ambari:7077] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2015-09-16T14:29:09+0530 1.2.0.RELEASE WARN sparkDriver-akka.actor.default-dispatcher-4 remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://sparkMaster@abcd.hadoop.ambari:7077] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2015-09-16T14:29:18+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Path cache event: path=/deployments/modules/allocated/8d07cdba-557e-458a-9225-b90e5a5778ce/spark-streaming-word-count.source.http.1, type=CHILD_ADDED
2015-09-16T14:29:18+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module 'http' for stream 'spark-streaming-word-count'
2015-09-16T14:29:18+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module [ModuleDescriptor@610e43b0 moduleName = 'http', moduleLabel = 'http', group = 'spark-streaming-word-count', sourceChannelName = [null], sinkChannelName = [null], index = 0, type = source, parameters = map[[empty]], children = list[[empty]]]
2015-09-16T14:29:19+0530 1.2.0.RELEASE INFO DeploymentSupervisor-0 zk.ZKStreamDeploymentHandler - Deployment status for stream 'spark-streaming-word-count': DeploymentStatus{state=failed,error(s)=Deployment of module 'ModuleDeploymentKey{stream='spark-streaming-word-count', type=processor, label='processor'}' to container '4ff3ba84-e6ca-47dd-894f-aa92bdbb3e06' timed out after 30000 ms}
2015-09-16T14:29:29+0530 1.2.0.RELEASE WARN sparkDriver-akka.actor.default-dispatcher-4 remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://sparkMaster@abcd.hadoop.ambari:7077] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2015-09-16T14:29:49+0530 1.2.0.RELEASE ERROR sparkDriver-akka.actor.default-dispatcher-3 cluster.SparkDeploySchedulerBackend - Application has been killed. Reason: All masters are unresponsive! Giving up.
2015-09-16T14:29:49+0530 1.2.0.RELEASE WARN DeploymentsPathChildrenCache-0 cluster.SparkDeploySchedulerBackend - Application ID is not initialized yet.
2015-09-16T14:29:49+0530 1.2.0.RELEASE ERROR sparkDriver-akka.actor.default-dispatcher-3 scheduler.TaskSchedulerImpl - Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.
2015-09-16T14:29:50+0530 1.2.0.RELEASE INFO DeploymentSupervisor-0 zk.ContainerListener - Path cache event: path=/containers/4ff3ba84-e6ca-47dd-894f-aa92bdbb3e06, type=CHILD_REMOVED
2015-09-16T14:29:50+0530 1.2.0.RELEASE INFO DeploymentSupervisor-0 zk.ContainerListener - Container departed: Container{name='4ff3ba84-e6ca-47dd-894f-aa92bdbb3e06', attributes={groups=, host=abcd.hadoop.ambari, id=4ff3ba84-e6ca-47dd-894f-aa92bdbb3e06, managementPort=54998, ip=a.b.c.d, pid=4597}}

Solution

The error looks like it is caused by a version conflict. Make sure to use Spark 1.2.1, which XD supports out of the box.

If you are on a different Spark version, you can still make it work by removing the 1.2.1 versions of the Spark dependencies from XD_HOME/lib and replacing them with the jars for the Spark version you use.
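The swap described above can be sketched roughly as below. This is a demo against mock directories, not a definitive procedure: the jar names, the Spark version 1.4.1, and the temp directories are all placeholder assumptions standing in for your real XD_HOME/lib and your Spark distribution's lib directory; check the actual jar names bundled with your XD install before moving anything.

```shell
# Mock layout standing in for a real install (assumption: paths and jar names are illustrative only)
XD_HOME=$(mktemp -d)     # stand-in for your Spring XD install
SPARK_LIB=$(mktemp -d)   # stand-in for your Spark distribution's lib dir
mkdir -p "$XD_HOME/lib"
touch "$XD_HOME/lib/spark-core_2.10-1.2.1.jar" \
      "$XD_HOME/lib/spark-streaming_2.10-1.2.1.jar"
touch "$SPARK_LIB/spark-assembly-1.4.1-hadoop2.6.0.jar"

# 1. Move the bundled 1.2.1 Spark jars aside (keep a backup rather than deleting)
mkdir -p "$XD_HOME/lib/spark-1.2.1-backup"
mv "$XD_HOME"/lib/spark-*1.2.1*.jar "$XD_HOME/lib/spark-1.2.1-backup/"

# 2. Copy in the jars from the Spark version your cluster actually runs
cp "$SPARK_LIB"/spark-*.jar "$XD_HOME/lib/"

ls "$XD_HOME/lib"
```

After the swap, restart the XD container processes so they pick up the replaced jars; the driver and the standalone master should then be on matching Spark versions.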
