ERROR Error cleaning broadcast Exception


Question

I get the following error while running my Spark Streaming application. We have a large application running multiple stateful (with mapWithState) and stateless operations. It is getting difficult to isolate the error, since Spark itself hangs and the only error we see is in the Spark log, not the application log itself.

The error happens only after about 4-5 minutes, with a micro-batch interval of 10 seconds. I am using Spark 1.6.1 on an Ubuntu server with Kafka-based input and output streams.

Note that I cannot provide a minimal code example to reproduce this bug, since it does not occur in unit test cases and the application itself is very large.

Any direction you can give to solve this issue will be helpful. Please let me know if I can provide any more information.

The error is inlined below:

[2017-07-11 16:15:15,338] ERROR Error cleaning broadcast 2211 (org.apache.spark.ContextCleaner)
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.askTimeout
        at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
        at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
        at org.apache.spark.storage.BlockManagerMaster.removeBroadcast(BlockManagerMaster.scala:136)
        at org.apache.spark.broadcast.TorrentBroadcast$.unpersist(TorrentBroadcast.scala:228)
        at org.apache.spark.broadcast.TorrentBroadcastFactory.unbroadcast(TorrentBroadcastFactory.scala:45)
        at org.apache.spark.broadcast.BroadcastManager.unbroadcast(BroadcastManager.scala:77)
        at org.apache.spark.ContextCleaner.doCleanupBroadcast(ContextCleaner.scala:233)
        at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:189)
        at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:180)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:180)
        at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1180)
        at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:173)
        at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:68)
    Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:107)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)

Answer

Your exception message clearly says that it is an RPC timeout caused by the default configuration of 120 seconds; you can adjust it to an optimal value for your workload. Please see the Spark 1.6 configuration documentation.

Your error messages org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. and at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76) confirm that.
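As a sketch of the suggested workaround: the timeout can be raised in spark-defaults.conf (or via --conf on spark-submit). The property names come from the Spark 1.6 configuration docs; the 300s value below is an assumption and should be tuned for your workload:

```properties
# spark-defaults.conf -- raise the RPC ask timeout beyond the 120s default
# (300s is an example value, not a recommendation)
spark.rpc.askTimeout    300s
# spark.network.timeout is the default fallback for several timeouts,
# including the RPC ask timeout when spark.rpc.askTimeout is unset
spark.network.timeout   300s
```

Equivalently, pass `--conf spark.rpc.askTimeout=300s` to spark-submit. Note that raising the timeout only buys time; if the ContextCleaner keeps timing out, the executors may be stalled (for example by long GC pauses), which is worth investigating separately.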

For a better understanding, see the RpcTimeout.awaitResult implementation in the Spark source, which is where the exception in your stack trace is raised.
