Spark Error: executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM


Question

I am working with the following Spark config:

maxCores = 5
driverMemory = 2g
executorMemory = 17g
executorInstances = 100

Issue: Out of 100 executors, my job ends up with only 10 active executors, even though enough memory is available. I even tried setting the number of executors to 250, but still only 10 remain active. All I am trying to do is load a multi-partition Hive table and run df.count on it.

Please help me understand what is causing the executors to be killed:
17/12/20 11:08:21 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
17/12/20 11:08:21 INFO storage.DiskBlockManager: Shutdown hook called
17/12/20 11:08:21 INFO util.ShutdownHookManager: Shutdown hook called

Not sure why YARN is killing my executors.

Recommended answer

I faced a similar issue, where investigating the NodeManager logs led me to the root cause. You can access them via the web interface:

nodeManagerAddress:PORT/logs

The PORT is specified in yarn-site.xml under yarn.nodemanager.webapp.address (default: 8042).
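As a small illustration of the address pattern above, the following Python sketch builds the NodeManager web UI URL from a host name and the default port. The helper name `nodemanager_url`, the example host, and the plain `http` scheme are assumptions for illustration; a secured cluster may use HTTPS and a different port.

```python
# Default port of yarn.nodemanager.webapp.address, as stated above.
DEFAULT_NM_WEBAPP_PORT = 8042


def nodemanager_url(host, path, port=DEFAULT_NM_WEBAPP_PORT, scheme="http"):
    """Build a NodeManager web UI URL (hypothetical helper).

    `scheme` defaults to "http", which is an assumption; adjust it for
    clusters that serve the web UI over HTTPS.
    """
    return f"{scheme}://{host}:{port}/{path.lstrip('/')}"


if __name__ == "__main__":
    # "worker1.example.com" is a placeholder host name.
    print(nodemanager_url("worker1.example.com", "logs"))
```

If yarn.nodemanager.webapp.address is overridden in yarn-site.xml, pass that port explicitly instead of relying on the default.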

My investigation workflow:

  1. Collect logs (yarn logs ... command)
  2. Identify node and container (in these logs) emitting the error
  3. Search the NodeManager-logs by Timestamp of the error for a root cause
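Step 2 of this workflow can be sketched in Python. The sketch assumes the aggregated output of the yarn logs command has been saved as text, and that each container section starts with a header line of the form "Container: <id> on <node>"; that header format is an assumption and may differ between Hadoop versions. The timestamp pattern follows the log excerpt in the question. The function name `find_killed_containers` is hypothetical.

```python
import re

SIGNAL_MARKER = "RECEIVED SIGNAL TERM"

# Assumed section header in aggregated YARN logs: "Container: <id> on <node>".
CONTAINER_HEADER = re.compile(r"^Container:\s+(\S+)\s+on\s+(\S+)")

# Timestamp layout as seen in the question's log lines, e.g. "17/12/20 11:08:21".
TIMESTAMP = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})")


def find_killed_containers(log_text):
    """Return (container, node, timestamp) for each SIGTERM log line.

    container and node are None if no header line was seen before the hit;
    timestamp is None if the line does not start with the expected pattern.
    """
    hits = []
    container, node = None, None
    for line in log_text.splitlines():
        header = CONTAINER_HEADER.match(line)
        if header:
            # Remember which container section we are currently inside.
            container, node = header.groups()
            continue
        if SIGNAL_MARKER in line:
            ts = TIMESTAMP.match(line)
            hits.append((container, node, ts.group(1) if ts else None))
    return hits
```

The returned node and timestamp are what you would then look up in that node's NodeManager log (step 3) to see why the container was terminated.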

By the way, you can access the aggregated collection (XML) of all configurations affecting a node on the same port with:

nodeManagerAddress:PORT/conf
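Once that XML has been downloaded, a short Python sketch like the one below can turn it into a dictionary for quick lookups. It assumes the usual Hadoop configuration layout, a `configuration` root with `property` elements that each contain `name` and `value`; the function name `parse_hadoop_conf` and the sample document are illustrative only.

```python
import xml.etree.ElementTree as ET


def parse_hadoop_conf(xml_text):
    """Parse Hadoop-style configuration XML into a {name: value} dict.

    Assumes <configuration><property><name>..</name><value>..</value>
    </property>...</configuration>; properties without a name are skipped.
    """
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            conf[name] = prop.findtext("value")
    return conf


if __name__ == "__main__":
    # Illustrative sample, not output captured from a real cluster.
    sample = """<configuration>
      <property>
        <name>yarn.nodemanager.webapp.address</name>
        <value>0.0.0.0:8042</value>
      </property>
    </configuration>"""
    print(parse_hadoop_conf(sample)["yarn.nodemanager.webapp.address"])
```

Comparing memory-related YARN settings in this dictionary against the executor sizes you requested is one way to check whether the node's limits explain the killed containers.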
