Spark with Flume (configuration/classpath?)

Problem Description

I am trying to get Spark working with Flume, flume config below:

#Declare
log.sources = src
log.sinks = spark
log.channels = chs

#Define Source

log.sources.src.type = exec
log.sources.src.command = sh /home/user/shell/flume.sh

#Define Sink
log.sinks.spark.type = org.apache.spark.streaming.flume.sink.SparkSink
log.sinks.spark.hostname = localhost
log.sinks.spark.port = 9999
log.sinks.spark.channel = chs

#Define Channels

log.channels.chs.type = memory

#Tie Source and Sink to Channel

log.sinks.snk.channel = chs
log.sources.src.channels = chs
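
(Note: the sink is named spark, so the line log.sinks.snk.channel = chs under "#Tie Source and Sink to Channel" refers to a sink snk that is never declared; the spark sink is already tied to the channel by log.sinks.spark.channel = chs in its definition. This presumably explains the Processing:snk entry in the agent log below, but it is unrelated to the classpath error.)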

$ ls -lrt $FLUME_CLASSPATH
-rw-r--r-- 1 root root 7126372 Mar 18 2014 scala-library-2.10.4.jar
-rw-r--r-- 1 root root 412739 Apr 6 2014 commons-lang3-3.3.2.jar
-rw-r--r-- 1 root root 86020 Sep 24 00:15 spark-streaming-flume-sink_2.10-1.5.1.jar
-rw-r--r-- 1 root root 7126003 Nov 7 19:09 scala-library-2.10.3.jar
-rw-r--r-- 1 root root 82325 Nov 7 19:26 spark-streaming-flume-sink_2.11-1.2.0.jar

$ flume-ng agent -f simplelogger.conf -n log

15/11/07 19:48:05 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
15/11/07 19:48:05 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:simplelogger.conf
15/11/07 19:48:05 INFO conf.FlumeConfiguration: Processing:spark
15/11/07 19:48:05 INFO conf.FlumeConfiguration: Processing:spark
15/11/07 19:48:05 INFO conf.FlumeConfiguration: Processing:spark
15/11/07 19:48:05 INFO conf.FlumeConfiguration: Processing:snk
15/11/07 19:48:05 INFO conf.FlumeConfiguration: Processing:spark
15/11/07 19:48:05 INFO conf.FlumeConfiguration: Added sinks: spark Agent: log
15/11/07 19:48:05 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [log]
15/11/07 19:48:05 INFO node.AbstractConfigurationProvider: Creating channels
15/11/07 19:48:05 INFO channel.DefaultChannelFactory: Creating instance of channel chs type memory
15/11/07 19:48:05 INFO node.AbstractConfigurationProvider: Created channel chs
15/11/07 19:48:05 INFO source.DefaultSourceFactory: Creating instance of source src, type exec
15/11/07 19:48:05 INFO sink.DefaultSinkFactory: Creating instance of sink: spark, type: org.apache.spark.streaming.flume.sink.SparkSink
15/11/07 19:48:05 ERROR node.PollingPropertiesFileConfigurationProvider: Failed to start agent because dependencies were not found in classpath. Error follows.
java.lang.NoClassDefFoundError: scala/Function1
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:190)
        at org.apache.flume.sink.DefaultSinkFactory.getClass(DefaultSinkFactory.java:67)
        at org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:41)
        at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:415)
        at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:103)
        at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.ClassNotFoundException: scala.Function1
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        ... 14 more

Also have a plugins.d folder in the pwd (where I have the flume conf):

plugins.d/:
plugins.d/spark:
plugins.d/spark/lib:
-rw-r--r-- 1 rgopalk rgopalk 82325 Nov 7 19:31 spark-streaming-flume-sink_2.11-1.2.0.jar

Any pointers?

PS: Having multiple versions of the spark-streaming-flume-sink jar and the scala-library jar on FLUME_CLASSPATH doesn't make any difference; the error is the same with a single version of each.

Recommended Answer

I copied all the jar files listed above to {FLUME_INSTALLATION_DIR}/lib. I also copied {SPARK_HOME}/lib/spark-assembly to {FLUME_INSTALLATION_DIR}/lib and it started working:
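
For reference, a minimal sketch of those copy steps (paths are assumptions: the Flume lib directory and the assembly jar name are taken from the log below, and $FLUME_CLASSPATH is assumed to point at the directory listed in the question):

$ cp $FLUME_CLASSPATH/*.jar /usr/lib/flume/lib/
$ cp $SPARK_HOME/lib/spark-assembly-1.5.1-hadoop2.4.0.jar /usr/lib/flume/lib/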

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/flume/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/flume/lib/spark-assembly-1.5.1-hadoop2.4.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/11/07 21:18:15 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
15/11/07 21:18:15 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:simplelogger.conf
15/11/07 21:18:15 INFO conf.FlumeConfiguration: Processing:spark
15/11/07 21:18:15 INFO conf.FlumeConfiguration: Processing:spark
15/11/07 21:18:15 INFO conf.FlumeConfiguration: Processing:spark
15/11/07 21:18:15 INFO conf.FlumeConfiguration: Processing:snk
15/11/07 21:18:15 INFO conf.FlumeConfiguration: Processing:spark
15/11/07 21:18:15 INFO conf.FlumeConfiguration: Added sinks: spark Agent: log
15/11/07 21:18:15 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [log]
15/11/07 21:18:15 INFO node.AbstractConfigurationProvider: Creating channels
15/11/07 21:18:15 INFO channel.DefaultChannelFactory: Creating instance of channel chs type memory
15/11/07 21:18:15 INFO node.AbstractConfigurationProvider: Created channel chs
15/11/07 21:18:15 INFO source.DefaultSourceFactory: Creating instance of source src, type exec
15/11/07 21:18:15 INFO sink.DefaultSinkFactory: Creating instance of sink: spark, type: org.apache.spark.streaming.flume.sink.SparkSink
15/11/07 21:18:15 INFO sink.SparkSink: Configured Spark Sink with hostname: localhost, port: 9999, poolSize: 10, transactionTimeout: 60, backoffInterval: 200
15/11/07 21:18:15 INFO node.AbstractConfigurationProvider: Channel chs connected to [src, spark]
15/11/07 21:18:15 INFO node.Application: Starting new configuration:{ sourceRunners:{src=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource{name:src,state:IDLE} }} sinkRunners:{spark=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@2b430201 counterGroup:{ name:null counters:{} } }} channels:{chs=org.apache.flume.channel.MemoryChannel{name: chs}} }
15/11/07 21:18:15 INFO node.Application: Starting Channel chs
15/11/07 21:18:15 INFO instrumentation.MonitoredCounterGroup: Monitoried counter group for type: CHANNEL, name: chs, registered successfully.
15/11/07 21:18:15 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: chs started
15/11/07 21:18:15 INFO node.Application: Starting Sink spark
15/11/07 21:18:15 INFO sink.SparkSink: Starting Spark Sink: spark on port: 9999 and interface: localhost with pool size: 10 and transaction timeout: 60.
15/11/07 21:18:15 INFO node.Application: Starting Source src
15/11/07 21:18:15 INFO source.ExecSource: Exec source starting with command:sh /home/rgopalk/shell/flume.sh
15/11/07 21:18:15 INFO instrumentation.MonitoredCounterGroup: Monitoried counter group for type: SOURCE, name: src, registered successfully.
15/11/07 21:18:15 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: src started
15/11/07 21:18:16 INFO sink.SparkSink: Starting Avro server for sink: spark
15/11/07 21:18:16 INFO sink.SparkSink: Blocking Sink Runner, sink will continue to run..
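
Why this works: scala.Function1, the class the earlier NoClassDefFoundError complained about, lives in the Scala runtime, and the spark-assembly jar bundles the Scala runtime classes, so once it (together with the jars copied above) is on Flume's classpath the SparkSink class can be loaded. The SLF4J "multiple bindings" warning at the top of the log is a side effect of the assembly jar shipping its own SLF4J binding and can be ignored here.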
