What does executorRunTime consist of in Spark?


Problem description

I am currently working with Spark and have collected some performance metrics through a custom Spark listener for analysis. I tried to make a stacked bar plot showing the fraction of time the executor spends executing the task, shuffling, or in garbage-collection pauses for three different machine learning algorithms. Here is a screenshot of what I found:

What caught my attention as soon as the plot appeared is that the rates are wrong: the stacked total exceeds 1 for the kmeans algorithm and stays below 0.8 for the perceptron.

Here is how I computed the rates:

execution['cpuRate'] = execution['executorCpuTime'] / execution['executorRunTime']
execution['serRate'] = execution['resultSerializationTime'] / execution['executorRunTime']
execution['gcRate'] = execution['jvmGCTime'] / execution['executorRunTime']
execution['shuffleFetchRate'] = execution['shuffleFetchWaitTime'] / execution['executorRunTime']
execution['shuffleWriteRate'] = execution['shuffleWriteTime'] / execution['executorRunTime']

execution = execution[['cpuRate', 'serRate', 'gcRate', 'shuffleFetchRate', 'shuffleWriteRate']]

execution.plot.bar(stacked=True)

I use the Pandas library, and execution is the dataframe containing the averaged metrics. My assumption was that executorRunTime is the sum of the other underlying metrics, but that turns out to be false.

What do those times mean, and how are they related? In other words, what does executorRunTime consist of, if not all the other metrics listed above?

Thanks

Recommended answer

According to …, the value is in milliseconds.
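The unit matters when dividing one metric by another. As a minimal sketch, assuming the dataframe columns hold the raw values reported by Spark and that shuffleFetchWaitTime and shuffleWriteTime correspond to Spark's shuffle read fetchWaitTime and shuffle write writeTime: Spark's TaskMetrics documents executorCpuTime and the shuffle write time in nanoseconds, and the other metrics used in the question in milliseconds. The helper name compute_rates and the sample numbers below are hypothetical, and the units should be checked against the Spark version in use.

```python
# Units as documented in Spark's TaskMetrics; treat them as an assumption
# to verify against the Spark version in use.
NS_PER_MS = 1_000_000

# Metrics reported in nanoseconds; the others used here are in milliseconds.
NANOSECOND_METRICS = {"executorCpuTime", "shuffleWriteTime"}

# Rate name -> source metric, mirroring the columns from the question.
RATE_COLUMNS = {
    "cpuRate": "executorCpuTime",
    "serRate": "resultSerializationTime",
    "gcRate": "jvmGCTime",
    "shuffleFetchRate": "shuffleFetchWaitTime",
    "shuffleWriteRate": "shuffleWriteTime",
}


def compute_rates(metrics):
    """Return each metric as a fraction of executorRunTime.

    `metrics` may be a dict of numbers or a pandas DataFrame: only
    lookup by name and division are used.
    """
    run_time_ms = metrics["executorRunTime"]
    rates = {}
    for rate_name, metric_name in RATE_COLUMNS.items():
        value = metrics[metric_name]
        if metric_name in NANOSECOND_METRICS:
            value = value / NS_PER_MS  # nanoseconds -> milliseconds
        rates[rate_name] = value / run_time_ms
    return rates


# Hypothetical averaged metrics for one algorithm (made-up numbers).
sample = {
    "executorRunTime": 1000,         # ms
    "executorCpuTime": 600_000_000,  # ns
    "resultSerializationTime": 10,   # ms
    "jvmGCTime": 50,                 # ms
    "shuffleFetchWaitTime": 100,     # ms
    "shuffleWriteTime": 40_000_000,  # ns
}
rates = compute_rates(sample)
```

With the dataframe from the question, pd.DataFrame(compute_rates(execution)).plot.bar(stacked=True) would produce the same kind of plot. These metrics are not documented as a partition of executorRunTime, so the rates need not sum to exactly 1 even after the unit conversion.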

