Spark Streaming custom metrics
Question
I'm working on a Spark Streaming program that reads a Kafka stream, applies a very basic transformation to it, and then inserts the data into a DB (VoltDB, if it's relevant). I'm trying to measure the rate at which I insert rows into the DB. I think metrics could be useful here (using JMX), but I can't find how to add custom metrics to Spark. I've looked at Spark's source code and also found this thread, but it doesn't work for me. I also enabled the JMX sink in the conf/metrics.properties file. What's not working is that I don't see my custom metrics in JConsole.
Could someone explain how to add custom metrics (preferably via JMX) to Spark Streaming? Or, alternatively, how to measure my insertion rate into my DB (specifically VoltDB)? I'm using Spark with Java 8.
Recommended answer
OK, after digging through the source code I found out how to add my own custom metrics. It requires three things:
- Create my own custom source. Sort of like this
- Enable the JMX sink in the Spark metrics.properties file. The specific line I used is:
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
which enables JmxSink for all instances.
- Register my custom source in the SparkEnv metrics system. An example of how to do this can be seen here. I had actually viewed this link before but missed the registration part, which prevented me from seeing my custom metrics in JVisualVM.
I'm still struggling with how to actually count the number of insertions into VoltDB, because the insert code runs on the executors, but that's a subject for a different topic :)
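On the open counting problem, one possible direction is sketched below. It is not something verified in this answer, and it uses plain JMX from the JDK (`javax.management`) rather than Spark's metrics system or a custom Spark source. The idea is to keep one counter per JVM and register it as a standard MBean, so each executor JVM exposes its own insert count. The names `InsertMetrics`, `InsertCounter`, `recordInserts`, and the object name `example.metrics:type=InsertCounter` are hypothetical and chosen only for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.LongAdder;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Hypothetical per-JVM insert counter exposed through plain JMX (not Spark's metrics system). */
class InsertMetrics {

    // Assumed object name; any valid "domain:key=value" name would do.
    static final String OBJECT_NAME = "example.metrics:type=InsertCounter";

    /** Standard MBean interface: it must be public and named <ImplementationClass>MBean. */
    public interface InsertCounterMBean {
        long getInsertCount();
        double getInsertsPerSecond();
    }

    public static class InsertCounter implements InsertCounterMBean {
        private final LongAdder inserts = new LongAdder();
        private final long startNanos = System.nanoTime();

        /** Call after a batch of rows has been inserted successfully. */
        public void recordInserts(long rows) {
            inserts.add(rows);
        }

        @Override
        public long getInsertCount() {
            return inserts.sum();
        }

        /** Average rate since the counter was created (not a moving window). */
        @Override
        public double getInsertsPerSecond() {
            double elapsedSeconds = (System.nanoTime() - startNanos) / 1e9;
            return elapsedSeconds > 0 ? inserts.sum() / elapsedSeconds : 0.0;
        }
    }

    // One counter per JVM; every executor JVM ends up with its own instance.
    private static InsertCounter instance;

    /** Lazily creates the counter and registers it with the platform MBean server. */
    static synchronized InsertCounter get() {
        if (instance == null) {
            InsertCounter counter = new InsertCounter();
            try {
                MBeanServer server = ManagementFactory.getPlatformMBeanServer();
                server.registerMBean(counter, new ObjectName(OBJECT_NAME));
            } catch (JMException e) {
                throw new IllegalStateException("Could not register " + OBJECT_NAME, e);
            }
            instance = counter;
        }
        return instance;
    }
}
```

In a job like the one described, the code that writes a partition to VoltDB could call `InsertMetrics.get().recordInserts(batchSize)` after each successful batch. A JMX client attached to that executor's JVM should then show the `InsertCount` and `InsertsPerSecond` attributes. Because the counter is per JVM, a cluster-wide total would still have to be summed across executors by whatever collects the values.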
I hope this helps others.