Why isn't a very big Spark stage using all available executors?

Question

I am running a Spark job with some very big stages (e.g. >20k tasks), using 1k to 2k executors.

In some cases, a stage will appear to run unstably: many available executors become idle over time, despite the job still being in the middle of a stage with many unfinished tasks. From the user's perspective, tasks appear to be finishing, but executors that have finished a given task do not get a new task assigned to them. As a result, the stage takes longer than it should, and a lot of executor CPU-hours are wasted idling. This seems to mostly (only?) happen during input stages, where data is being read from HDFS.
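
For context, a minimal sketch of the kind of input job described (the path and the follow-up work are illustrative assumptions, not from the original post). In an HDFS read stage, the number of tasks is driven by the number of input splits, not by the executor count, which is how a single stage can have >20k tasks while only 1k-2k executors are available:

import org.apache.spark.sql.SparkSession

object InputStageSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("input-stage-sketch").getOrCreate()
    val sc = spark.sparkContext

    // One task per HDFS input split, so a large dataset can yield tens of
    // thousands of tasks in the read stage (e.g. the 28504 tasks of Stage 0).
    val lines = sc.textFile("hdfs:///data/big-dataset")   // hypothetical path
    println(s"input partitions = ${lines.getNumPartitions}")

    // A wide operation after the read produces a second stage (like Stage 1).
    lines.map(l => (l.length % 100, 1L)).reduceByKey(_ + _).count()

    spark.stop()
  }
}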

Example Spark stderr log during an unstable period -- notice that the number of running tasks decreases over time until it almost reaches zero, then suddenly jumps back up to >1k running tasks:

[Stage 0:==============================>                 (17979 + 1070) / 28504]
[Stage 0:==============================>                 (18042 + 1019) / 28504]
[Stage 0:===============================>                 (18140 + 921) / 28504]
[Stage 0:===============================>                 (18222 + 842) / 28504]
[Stage 0:===============================>                 (18263 + 803) / 28504]
[Stage 0:===============================>                 (18282 + 786) / 28504]
[Stage 0:===============================>                 (18320 + 751) / 28504]
[Stage 0:===============================>                 (18566 + 508) / 28504]
[Stage 0:================================>                (18791 + 284) / 28504]
[Stage 0:================================>                (18897 + 176) / 28504]
[Stage 0:================================>                (18940 + 134) / 28504]
[Stage 0:================================>                (18972 + 107) / 28504]
[Stage 0:=================================>                (19035 + 47) / 28504]
[Stage 0:=================================>                (19067 + 17) / 28504]
[Stage 0:================================>               (19075 + 1070) / 28504]
[Stage 0:================================>               (19107 + 1039) / 28504]
[Stage 0:================================>                (19165 + 982) / 28504]
[Stage 0:=================================>               (19212 + 937) / 28504]
[Stage 0:=================================>               (19251 + 899) / 28504]
[Stage 0:=================================>               (19355 + 831) / 28504]
[Stage 0:=================================>               (19481 + 708) / 28504]

This is what the stderr looks like when a stage is running stably -- the number of running tasks remains roughly constant, because new tasks are assigned to executors as they finish their previous tasks:

[Stage 1:===================>                            (11599 + 2043) / 28504]
[Stage 1:===================>                            (11620 + 2042) / 28504]
[Stage 1:===================>                            (11656 + 2044) / 28504]
[Stage 1:===================>                            (11692 + 2045) / 28504]
[Stage 1:===================>                            (11714 + 2045) / 28504]
[Stage 1:===================>                            (11741 + 2047) / 28504]
[Stage 1:===================>                            (11771 + 2047) / 28504]
[Stage 1:===================>                            (11818 + 2047) / 28504]

Under what circumstances would this happen, and how can I avoid this behavior?

NB: I am using dynamic allocation, but I'm pretty sure it is unrelated to this problem -- e.g., during an unstable period, the Spark Application Master UI shows the expected number of executors as "Active", but they are not running any "Active Tasks".
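
For reference, a minimal sketch of what the dynamic-allocation setup and the executor check described above might look like (the config values and the programmatic check are illustrative assumptions; the post only says dynamic allocation is enabled and that the UI shows active executors without active tasks):

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object DynamicAllocationSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.shuffle.service.enabled", "true")         // usually required with dynamic allocation
      .set("spark.dynamicAllocation.minExecutors", "1000")  // illustrative bounds for a 1k-2k executor job
      .set("spark.dynamicAllocation.maxExecutors", "2000")

    val spark = SparkSession.builder().config(conf).appName("dyn-alloc-sketch").getOrCreate()
    val sc = spark.sparkContext

    // Rough programmatic equivalent of the "Active" / "Active Tasks" check in the UI:
    // count registered executors that are currently running zero tasks.
    val infos = sc.statusTracker.getExecutorInfos
    val idle  = infos.count(_.numRunningTasks() == 0)
    println(s"registered executors = ${infos.length}, idle = $idle")

    spark.stop()
  }
}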

Answer

I've seen behavior like this from Spark when the amount of time taken per task is very low. For some reason, the scheduler seems to assume that the job will complete faster without the extra distribution overhead, since each task completes so quickly.

A few things to try:

  • Try .coalesce() to reduce the number of partitions, so that each partition takes longer to run (granted, this could introduce a shuffle step and may increase overall job time; you'll have to experiment).
  • Tweak the spark.locality.wait* settings. If each task takes less than the default wait time of 3s, then perhaps the scheduler is just trying to keep the existing slots full and never gets a chance to allocate more slots (a sketch of both suggestions follows below).
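
A minimal sketch of the two suggestions above; the partition target, the lowered locality wait, and the input path are illustrative assumptions to experiment from, since the answer only says to reduce partitions and tweak the spark.locality.wait* settings:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object StageTuningSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      // Suggestion 2: with very short tasks, lowering the locality wait lets the
      // scheduler hand work to any idle executor instead of holding tasks back
      // for a data-local slot (the documented default is 3s).
      .set("spark.locality.wait", "0s")

    val spark = SparkSession.builder().config(conf).appName("stage-tuning-sketch").getOrCreate()
    val sc = spark.sparkContext

    val input = sc.textFile("hdfs:///data/big-dataset")    // hypothetical path

    // Suggestion 1: fewer, larger partitions so each task runs longer.
    // coalesce(n) merges partitions without a full shuffle; repartition(n) forces one.
    val coalesced = input.coalesce(4000)                    // illustrative target, tune by experiment
    println(s"partitions after coalesce = ${coalesced.getNumPartitions}")

    // Some per-partition work just to run the stage.
    println(s"total characters = ${coalesced.map(_.length.toLong).reduce(_ + _)}")

    spark.stop()
  }
}

Note that setting spark.locality.wait to 0 trades data locality for scheduler throughput, so it's worth checking whether HDFS read performance suffers before adopting it.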

I've yet to track down exactly what causes this issue, so these are only speculations and hunches based on observations of my own (much smaller) cluster.
