Why does a Spark application fail with "executor.CoarseGrainedExecutorBackend: Driver Disassociated"?

Problem description

When I run a SQL query via spark-submit and spark-sql, the corresponding Spark application always fails with the following error:

15/03/10 18:50:52 INFO util.AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@slave75:60697/user/HeartbeatReceiver
15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave79:35643] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.

The above is just one of the errors. I used "yarn logs -application application_1425944520319_8102.log" to obtain the whole application log and filtered out the errors and warnings below:

Line 46: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave09:55156] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 97: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave09:32852] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 149: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave09:45654] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 200: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave10:45702] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 251: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave10:21596] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 302: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave10:58845] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 353: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave13:1697] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 437: 15/03/10 18:52:06 WARN hdfs.DFSClient: error creating legacy BlockReaderLocal.  Disabling legacy local reads.
Line 481: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 3.0 in stage 0.0 (TID 10)
Line 504: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave13:6289] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 556: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave14:37070] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 607: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave14:43424] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 658: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave15:38083] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 710: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave15:3106] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 761: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave15:35533] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 812: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave16:63207] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 863: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave16:11250] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 910: 15/03/10 18:52:09 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
Line 961: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave18:26917] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1012: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave18:3058] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1063: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave19:1885] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1114: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave19:14795] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1165: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave19:39794] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1216: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave20:19614] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1267: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave20:38776] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1318: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave21:19231] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1370: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave21:18816] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1454: 15/03/10 18:52:06 WARN hdfs.DFSClient: error creating legacy BlockReaderLocal.  Disabling legacy local reads.
Line 1498: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 1.0 in stage 0.0 (TID 18)
Line 1524: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 1.1 in stage 0.0 (TID 28)
Line 1550: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 1.2 in stage 0.0 (TID 31)
Line 1576: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 1.3 in stage 0.0 (TID 32)
Line 1602: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 1.4 in stage 0.0 (TID 33)
Line 1628: 15/03/10 18:52:07 ERROR executor.Executor: Exception in task 1.5 in stage 0.0 (TID 36)
Line 1654: 15/03/10 18:52:07 ERROR executor.Executor: Exception in task 1.6 in stage 0.0 (TID 37)
Line 1680: 15/03/10 18:52:07 ERROR executor.Executor: Exception in task 1.7 in stage 0.0 (TID 39)
Line 1706: 15/03/10 18:52:07 ERROR executor.Executor: Exception in task 1.8 in stage 0.0 (TID 41)
Line 1732: 15/03/10 18:52:07 ERROR executor.Executor: Exception in task 1.9 in stage 0.0 (TID 42)
Line 1755: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave22:24322] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1806: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave23:38508] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1858: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave24:19707] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1909: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave25:33683] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 1976: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave25:18587] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2027: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave26:64531] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2078: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave27:23333] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2129: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave27:61136] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2180: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave27:25118] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2231: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave28:16274] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2282: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave29:1324] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2334: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave29:51664] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2385: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave29:38854] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2452: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave30:30088] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2504: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave30:30778] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2556: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave31:52263] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2623: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave31:17806] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2674: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave32:3251] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2725: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave32:17832] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2776: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave32:11629] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2827: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave33:22629] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 2911: 15/03/10 18:52:07 WARN hdfs.DFSClient: error creating legacy BlockReaderLocal.  Disabling legacy local reads.
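
Most of the filtered lines are the same "Driver Disassociated" message, which hides the few lines that happened earlier (the task exceptions at 18:52:06) or differ in kind (the SIGTERM). A minimal sketch of separating the two groups is shown below. The function name `summarize` and the sample text are illustrative only; the sample lines are copied from the log above, and a real run would read the full aggregated log file instead.

```python
import re
from collections import Counter

# Sample lines copied from the filtered log above (illustrative subset).
SAMPLE = """\
15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave09:55156] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
15/03/10 18:52:06 WARN hdfs.DFSClient: error creating legacy BlockReaderLocal.  Disabling legacy local reads.
15/03/10 18:52:06 ERROR executor.Executor: Exception in task 3.0 in stage 0.0 (TID 10)
15/03/10 18:52:09 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
"""

# Groups: date, time, level, logger, message
LOG_LINE = re.compile(
    r"^(\d{2}/\d{2}/\d{2}) (\d{2}:\d{2}:\d{2}) (ERROR|WARN) (\S+): (.*)$"
)


def summarize(text):
    """Split ERROR/WARN lines into disassociation noise and everything else."""
    noise = Counter()   # logger name -> number of "Driver Disassociated" lines
    others = []         # (time, level, logger, message) for all remaining lines
    for line in text.splitlines():
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not ERROR/WARN log records
        _date, time, level, logger, message = match.groups()
        if message.startswith("Driver Disassociated"):
            noise[logger] += 1
        else:
            others.append((time, level, logger, message))
    others.sort()  # earliest timestamp first
    return noise, others


noise, others = summarize(SAMPLE)
print("disassociation messages:", sum(noise.values()))
for time, level, logger, message in others:
    print(time, level, logger, message)
```

Sorting the remaining lines by timestamp puts the events that preceded the disassociation first, which is usually where to start reading the full log.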

You can get the log file from https://www.dropbox.com/s/lf50ger18v3ngtb/application_1425944520319_8102.log?dl=0 if my description is not clear.

The network on slave75 is fine, and the hosts files on all nodes are configured correctly. Any response would help, thanks!

Recommended answer

I finally found the reason. YARN kills the executor (container) because the executor exceeds its memory overhead allowance. Increase the value of spark.yarn.driver.memoryOverhead, spark.yarn.executor.memoryOverhead, or both.
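
As a rough illustration of that fix, the sketch below builds the `--conf` arguments for the two properties named in the answer and shows how the overhead adds to the container size requested from YARN. The helper names (`overhead_conf_args`, `container_request_mb`) and the numbers 4096 and 1024 are hypothetical, chosen only for the example; the values are assumed to be in megabytes, and suitable sizes depend on the workload and the cluster's YARN limits.

```python
def container_request_mb(executor_memory_mb, overhead_mb):
    """Memory requested from YARN per executor container: heap plus overhead."""
    return executor_memory_mb + overhead_mb


def overhead_conf_args(executor_overhead_mb=None, driver_overhead_mb=None):
    """Build spark-submit --conf arguments for the overhead properties.

    Only the properties that are explicitly given are emitted, so Spark's
    defaults stay in effect for the other one.
    """
    args = []
    if executor_overhead_mb is not None:
        args += ["--conf",
                 f"spark.yarn.executor.memoryOverhead={executor_overhead_mb}"]
    if driver_overhead_mb is not None:
        args += ["--conf",
                 f"spark.yarn.driver.memoryOverhead={driver_overhead_mb}"]
    return args


# Illustrative values only: a 4096 MB executor with 1024 MB of overhead.
args = overhead_conf_args(executor_overhead_mb=1024, driver_overhead_mb=1024)
print(" ".join(args))
print("container request (MB):", container_request_mb(4096, 1024))
```

The resulting arguments would be appended to the existing spark-submit command line. If the container request grows beyond the cluster's maximum YARN allocation, the request itself can be rejected, so the two limits need to be raised together.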
