Spark tasks block randomly on a standalone cluster

Views: 65 | Published: 2021/4/8 19:32:32 | Tag: apache-spark

Problem description

We have a fairly complex application that runs on Spark Standalone. In some cases, tasks on one of the workers randomly block in the RUNNING state and never finish.

Additional information:

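There are no errors in the logs.
I ran with the logger at DEBUG level and did not see any relevant messages (I can see when a task starts, but after that there is no activity for it).
The jobs work fine if I have only one worker.
The same job can run a second time without any issues and finishes in a reasonable amount of time.
I don't have any very large partitions that could delay some of the tasks.
In Spark 2.0 I moved from RDDs to Datasets and I still have the same issue.
In Spark 1.4 I was able to work around the issue by turning on speculation, but in Spark 2.0 the blocked tasks come from different workers (in 1.4 they were all on a single worker), so speculation no longer fixes it.
I see the issue in several environments, so I don't think it is hardware related.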
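For readers unfamiliar with the speculation workaround mentioned above, here is a minimal sketch of the settings involved. The property names (`spark.speculation`, `spark.speculation.interval`, `spark.speculation.multiplier`, `spark.speculation.quantile`) are standard Spark configuration keys; the values, the helper functions, and the `app.jar` name are illustrative assumptions, not the asker's actual setup.

```python
# Sketch: build spark-submit flags that enable speculative execution.
# The values below are illustrative and should be tuned per workload.

def speculation_conf(interval="100ms", multiplier=1.5, quantile=0.75):
    """Return Spark properties that turn on speculative task execution."""
    return {
        "spark.speculation": "true",
        # How often Spark checks for tasks to speculate.
        "spark.speculation.interval": interval,
        # How many times slower than the median a task must be.
        "spark.speculation.multiplier": str(multiplier),
        # Fraction of tasks that must finish before speculation starts.
        "spark.speculation.quantile": str(quantile),
    }


def to_submit_args(conf):
    """Flatten a property dict into spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # "app.jar" is a hypothetical application artifact.
    print(" ".join(["spark-submit"] + to_submit_args(speculation_conf()) + ["app.jar"]))
```

Speculation only re-launches slow tasks on another executor; as the question notes, it did not help once the blocked tasks were spread across several workers.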

Has anyone experienced something similar? Any suggestions on how I could identify the issue?

Thanks a lot!

Later edit: I think I'm facing the same issue described here: Spark Indefinite Waiting with "Asked to send map output locations for shuffle" and here: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-stalling-during-shuffle-maybe-a-memory-issue-td6067.html but neither has a working solution.

The last line in the log, repeated indefinitely, is: [dispatcher-event-loop-18] DEBUG org.apache.spark.scheduler.TaskSchedulerImpl - parentName: , name: TaskSet_2, runningTasks: 6
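One rough way to spot this pattern in a large driver log is to count how many times the same scheduler status repeats in a row. The sketch below assumes the DEBUG line has exactly the shape quoted above; the regular expression and the `longest_stall` helper are hypothetical names written for this illustration, not part of Spark.

```python
import re

# Matches the TaskSchedulerImpl DEBUG line quoted in the question.
STATUS = re.compile(
    r"TaskSchedulerImpl\s*-\s*parentName:\s*(?P<parent>[^,]*),\s*"
    r"name:\s*(?P<name>[^,]+),\s*runningTasks:\s*(?P<running>\d+)"
)


def longest_stall(lines):
    """Return (task_set, running_tasks, repeats) for the longest run of
    identical status lines, or None if no status line is found.

    Lines that do not match the pattern are skipped, so a run is
    "consecutive" only with respect to other status lines.
    """
    best = None
    current = None
    count = 0
    for line in lines:
        match = STATUS.search(line)
        if not match:
            continue
        key = (match.group("name").strip(), int(match.group("running")))
        if key == current:
            count += 1
        else:
            current, count = key, 1
        if best is None or count > best[2]:
            best = (current[0], current[1], count)
    return best
```

A long run with an unchanged `runningTasks` value is only a hint: a legitimately long-running task produces the same pattern, so the result still has to be checked against the task start times in the Spark UI or logs.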
