How to integrate Ganglia for Spark 2.1 job metrics: Spark ignoring Ganglia metrics

Question

I am trying to send the metrics of my Spark 2.1 jobs to Ganglia.

My spark-defaults.conf looks like this:

*.sink.ganglia.class org.apache.spark.metrics.sink.GangliaSink
*.sink.ganglia.name Name
*.sink.ganglia.host $MASTERIP
*.sink.ganglia.port $PORT

*.sink.ganglia.mode unicast
*.sink.ganglia.period 10
*.sink.ganglia.unit seconds

When I submit my job, I see these warnings:

Warning: Ignoring non-spark config property: *.sink.ganglia.host=host
Warning: Ignoring non-spark config property: *.sink.ganglia.name=Name
Warning: Ignoring non-spark config property: *.sink.ganglia.mode=unicast
Warning: Ignoring non-spark config property: *.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
Warning: Ignoring non-spark config property: *.sink.ganglia.period=10
Warning: Ignoring non-spark config property: *.sink.ganglia.port=8649
Warning: Ignoring non-spark config property: *.sink.ganglia.unit=seconds

My environment details are:

Hadoop : Amazon 2.7.3 - emr-5.7.0
Spark  : Spark 2.1.1
Ganglia: 3.7.2

If you have any input, or can suggest an alternative to Ganglia, please reply.

Recommended answer

For EMR specifically, you'll need to put these settings in /etc/spark/conf/metrics.properties on the master node.

Spark on EMR does include the Ganglia library:

$ ls -l /usr/lib/spark/external/lib/spark-ganglia-lgpl_*
-rw-r--r-- 1 root root 28376 Mar 22 00:43 /usr/lib/spark/external/lib/spark-ganglia-lgpl_2.11-2.3.0.jar

In addition, your example is missing the equals sign (=) between the config names and values. I'm not sure whether that is a problem. Below is an example config that worked for me.

*.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
*.sink.ganglia.name=AMZN-EMR
*.sink.ganglia.host=$MASTERIP
*.sink.ganglia.port=8649

*.sink.ganglia.mode=unicast
*.sink.ganglia.period=10
*.sink.ganglia.unit=seconds
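
As a minimal sketch of putting the settings above into place, the shell function below writes them to a metrics.properties file. The function name `write_ganglia_metrics_conf` and its two arguments are hypothetical, introduced only for illustration. The sink values (name AMZN-EMR, port 8649, unicast, 10 seconds) are copied from the config above, and the host is passed in rather than assumed.

```shell
# Hypothetical helper (not part of Spark or EMR): writes the Ganglia sink
# settings from the answer to the given metrics.properties path.
#   $1 = target file, e.g. /etc/spark/conf/metrics.properties on the EMR master
#   $2 = host that receives the metrics (the $MASTERIP placeholder above)
write_ganglia_metrics_conf() {
  conf_path="$1"
  ganglia_host="$2"
  # Only $ganglia_host is expanded inside the here-document; the leading
  # '*' characters are written literally.
  cat > "$conf_path" <<EOF
*.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
*.sink.ganglia.name=AMZN-EMR
*.sink.ganglia.host=$ganglia_host
*.sink.ganglia.port=8649
*.sink.ganglia.mode=unicast
*.sink.ganglia.period=10
*.sink.ganglia.unit=seconds
EOF
}
```

On the master node this could be called as `write_ganglia_metrics_conf /etc/spark/conf/metrics.properties "$MASTERIP"`, run as a user with write permission on that directory. Jobs submitted afterwards should then read the sink settings from the metrics file instead of from spark-defaults.conf.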
