How to integrate Ganglia for Spark 2.1 Job metrics, Spark ignoring Ganglia metrics
Question
I am trying to integrate Spark 2.1 job metrics with Ganglia.
My spark-defaults.conf looks like:
*.sink.ganglia.class org.apache.spark.metrics.sink.GangliaSink
*.sink.ganglia.name Name
*.sink.ganglia.host $MASTERIP
*.sink.ganglia.port $PORT
*.sink.ganglia.mode unicast
*.sink.ganglia.period 10
*.sink.ganglia.unit seconds
When I submit my job, I see these warnings:
Warning: Ignoring non-spark config property: *.sink.ganglia.host=host
Warning: Ignoring non-spark config property: *.sink.ganglia.name=Name
Warning: Ignoring non-spark config property: *.sink.ganglia.mode=unicast
Warning: Ignoring non-spark config property: *.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
Warning: Ignoring non-spark config property: *.sink.ganglia.period=10
Warning: Ignoring non-spark config property: *.sink.ganglia.port=8649
Warning: Ignoring non-spark config property: *.sink.ganglia.unit=seconds
My environment details are:
Hadoop : Amazon 2.7.3 - emr-5.7.0
Spark : Spark 2.1.1
Ganglia: 3.7.2
If you have any input, or an alternative to Ganglia, please reply.
Answer
For EMR specifically, you'll need to put these settings in /etc/spark/conf/metrics.properties on the master node.
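If you would rather not edit the file by hand on the master node, EMR's configuration API can write these values into metrics.properties at cluster-creation time. The sketch below assumes the `spark-metrics` classification (worth verifying against the EMR documentation for your release label); `$MASTERIP` is the same placeholder used in the question.

```json
[
  {
    "Classification": "spark-metrics",
    "Properties": {
      "*.sink.ganglia.class": "org.apache.spark.metrics.sink.GangliaSink",
      "*.sink.ganglia.name": "AMZN-EMR",
      "*.sink.ganglia.host": "$MASTERIP",
      "*.sink.ganglia.port": "8649",
      "*.sink.ganglia.mode": "unicast",
      "*.sink.ganglia.period": "10",
      "*.sink.ganglia.unit": "seconds"
    }
  }
]
```

This JSON would be passed as the `--configurations` argument to `aws emr create-cluster`, so the settings survive cluster re-creation without manual SSH edits.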
Spark on EMR does include the Ganglia library:
$ ls -l /usr/lib/spark/external/lib/spark-ganglia-lgpl_*
-rw-r--r-- 1 root root 28376 Mar 22 00:43 /usr/lib/spark/external/lib/spark-ganglia-lgpl_2.11-2.3.0.jar
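As an aside, the "Ignoring non-spark config property" warnings appear because spark-defaults.conf (and `--conf`) only accept keys beginning with `spark.`; sink settings belong in a metrics file. If you ever need this outside EMR, or want to ship the config with the job rather than edit the master node, one alternative sketch uses Spark's `spark.metrics.conf` property (the jar and class names below are hypothetical placeholders):

```shell
# Ship a local metrics.properties (containing the *.sink.ganglia.* lines)
# with the job and point Spark's metrics system at it.
# my-job.jar and com.example.MyJob are illustrative, not from the question.
spark-submit \
  --files metrics.properties \
  --conf spark.metrics.conf=metrics.properties \
  --class com.example.MyJob \
  my-job.jar
```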
In addition, your example is missing the equals sign (=) between the config names and values; I'm not sure whether that's an issue. Below is an example config that worked successfully for me.
*.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
*.sink.ganglia.name=AMZN-EMR
*.sink.ganglia.host=$MASTERIP
*.sink.ganglia.port=8649
*.sink.ganglia.mode=unicast
*.sink.ganglia.period=10
*.sink.ganglia.unit=seconds
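To confirm the sink is actually reporting, you can query gmond directly: it dumps its full metric tree as XML to any TCP client that connects to its port. The following is a minimal sketch (assuming gmond listens on the default port 8649 and that Spark metric names contain "driver" or "executor", as they typically do for driver/executor sources):

```python
import socket
import xml.etree.ElementTree as ET

def spark_metric_names(xml_text):
    """Extract metric names that look Spark-related from gmond's XML dump."""
    root = ET.fromstring(xml_text)
    return sorted(
        m.get("NAME")
        for m in root.iter("METRIC")
        if "driver" in m.get("NAME", "") or "executor" in m.get("NAME", "")
    )

def fetch_gmond_xml(host, port=8649, timeout=5):
    """gmond writes its whole metric tree as XML on connect, then closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

if __name__ == "__main__":
    # Run on (or with network access to) the master node.
    print(spark_metric_names(fetch_gmond_xml("localhost")))
```

If the list comes back empty while system metrics (cpu_user, mem_free, ...) are present, the Ganglia daemons are fine and the problem is on the Spark configuration side.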