Spark Exits with Exception

Problem Description

This is the stack trace that I am getting while running the application:

16/11/03 11:25:45 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 233 on executor id: 4 hostname: 10.178.149.243.
16/11/03 11:25:45 WARN TaskSetManager: Lost task 1.0 in stage 11.0 (TID 217, 10.178.149.243): java.util.NoSuchElementException: None.get
    at scala.None$.get(Option.scala:347)
    at scala.None$.get(Option.scala:345)
    at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)
    at org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:644)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:281)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

16/11/03 11:25:45 INFO TaskSetManager: Lost task 14.0 in stage 11.0 (TID 225) on executor 10.178.149.243: java.util.NoSuchElementException (None.get) [duplicate 1]
16/11/03 11:25:45 INFO TaskSetManager: Starting task 14.1 in stage 11.0 (TID 234, 10.178.149.243, partition 14, NODE_LOCAL, 8828 bytes)
16/11/03 11:25:45 INFO TaskSetManager: Lost task 22.0 in stage 11.0 (TID 232) on executor 10.178.149.243: java.util.NoSuchElementException (None.get) [duplicate 2]
16/11/03 11:25:45 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 234 on executor id: 4 hostname: 10.178.149.243.
16/11/03 11:25:45 INFO TaskSetManager: Starting task 22.1 in stage 11.0 (TID 235, 10.178.149.243, partition 22, NODE_LOCAL, 9066 bytes)
16/11/03 11:25:45 INFO TaskSetManager: Lost task 24.0 in stage 11.0 (TID 233) on executor 10.178.149.243: java.util.NoSuchElementException (None.get) [duplicate 3]
16/11/03 11:25:45 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 235 on executor id: 4 hostname: 10.178.149.243.
16/11/03 11:25:45 INFO TaskSetManager: Starting task 24.1 in stage 11.0 (TID 236, 10.178.149.243, partition 24, NODE_LOCAL, 9185 bytes)
16/11/03 11:25:45 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 236 on executor id: 4 hostname: 10.178.149.243.
16/11/03 11:25:45 INFO TaskSetManager: Lost task 22.1 in stage 11.0 (TID 235) on executor 10.178.149.243: java.util.NoSuchElementException (None.get) [duplicate 4]
16/11/03 11:25:45 INFO TaskSetManager: Starting task 22.2 in stage 11.0 (TID 237, 10.178.149.243, partition 22, NODE_LOCAL, 9066 bytes)
16/11/03 11:25:45 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 237 on executor id: 4 hostname: 10.178.149.243.
16/11/03 11:25:45 INFO TaskSetManager: Lost task 14.1 in stage 11.0 (TID 234) on executor 10.178.149.243: java.util.NoSuchElementException (None.get) [duplicate 5]
16/11/03 11:25:45 INFO TaskSetManager: Starting task 14.2 in stage 11.0 (TID 238, 10.178.149.243, partition 14, NODE_LOCAL, 8828 bytes)
16/11/03 11:25:45 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 238 on executor id: 4 hostname: 10.178.149.243.
16/11/03 11:25:45 INFO TaskSetManager: Lost task 24.1 in stage 11.0 (TID 236) on executor 10.178.149.243: java.util.NoSuchElementException (None.get) [duplicate 6]
16/11/03 11:25:45 INFO TaskSetManager: Starting task 24.2 in stage 11.0 (TID 239, 10.178.149.243, partition 24, NODE_LOCAL, 9185 bytes)
16/11/03 11:25:45 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 239 on executor id: 4 hostname: 10.178.149.243.
16/11/03 11:25:45 INFO TaskSetManager: Lost task 22.2 in stage 11.0 (TID 237) on executor 10.178.149.243: java.util.NoSuchElementException (None.get) [duplicate 7]
16/11/03 11:25:45 INFO TaskSetManager: Starting task 22.3 in stage 11.0 (TID 240, 10.178.149.243, partition 22, NODE_LOCAL, 9066 bytes)
16/11/03 11:25:45 INFO TaskSetManager: Lost task 14.2 in stage 11.0 (TID 238) on executor 10.178.149.243: java.util.NoSuchElementException (None.get) [duplicate 8]
16/11/03 11:25:45 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 240 on executor id: 4 hostname: 10.178.149.243.
16/11/03 11:25:45 INFO TaskSetManager: Starting task 14.3 in stage 11.0 (TID 241, 10.178.149.243, partition 14, NODE_LOCAL, 8828 bytes)
16/11/03 11:25:45 INFO TaskSetManager: Lost task 24.2 in stage 11.0 (TID 239) on executor 10.178.149.243: java.util.NoSuchElementException (None.get) [duplicate 9]
16/11/03 11:25:45 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 241 on executor id: 4 hostname: 10.178.149.243.
16/11/03 11:25:45 INFO TaskSetManager: Starting task 24.3 in stage 11.0 (TID 242, 10.178.149.243, partition 24, NODE_LOCAL, 9185 bytes)
16/11/03 11:25:45 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 242 on executor id: 4 hostname: 10.178.149.243.
16/11/03 11:25:45 INFO TaskSetManager: Lost task 22.3 in stage 11.0 (TID 240) on executor 10.178.149.243: java.util.NoSuchElementException (None.get) [duplicate 10]
16/11/03 11:25:45 ERROR TaskSetManager: Task 22 in stage 11.0 failed 4 times; aborting job
16/11/03 11:25:45 INFO TaskSetManager: Starting task 0.0 in stage 12.0 (TID 243, 10.178.149.243, partition 0, NODE_LOCAL, 10016 bytes)
16/11/03 11:25:45 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 243 on executor id: 4 hostname: 10.178.149.243.
16/11/03 11:25:45 INFO TaskSetManager: Lost task 14.3 in stage 11.0 (TID 241) on executor 10.178.149.243: java.util.NoSuchElementException (None.get) [duplicate 11]
16/11/03 11:25:45 INFO TaskSchedulerImpl: Cancelling stage 12
16/11/03 11:25:45 INFO TaskSchedulerImpl: Stage 12 was cancelled
16/11/03 11:25:45 INFO TaskSetManager: Starting task 0.0 in stage 14.0 (TID 244, 10.178.149.243, partition 0, NODE_LOCAL, 7638 bytes)
16/11/03 11:25:45 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 244 on executor id: 4 hostname: 10.178.149.243.
16/11/03 11:25:45 INFO TaskSetManager: Lost task 24.3 in stage 11.0 (TID 242) on executor 10.178.149.243: java.util.NoSuchElementException (None.get) [duplicate 12]
16/11/03 11:25:45 INFO DAGScheduler: ShuffleMapStage 12 (show at RNFBackTagger.scala:97) failed in 0.112 s
16/11/03 11:25:45 INFO TaskSchedulerImpl: Cancelling stage 14
16/11/03 11:25:45 INFO TaskSchedulerImpl: Stage 14 was cancelled
16/11/03 11:25:45 INFO DAGScheduler: ShuffleMapStage 14 (show at RNFBackTagger.scala:97) failed in 0.104 s
16/11/03 11:25:45 INFO TaskSchedulerImpl: Cancelling stage 11
16/11/03 11:25:45 INFO TaskSchedulerImpl: Stage 11 was cancelled
16/11/03 11:25:45 INFO DAGScheduler: ShuffleMapStage 11 (show at RNFBackTagger.scala:97) failed in 0.126 s
16/11/03 11:25:45 WARN TaskSetManager: Lost task 0.0 in stage 12.0 (TID 243, 10.178.149.243): java.util.NoSuchElementException: None.get
    at scala.None$.get(Option.scala:347)
    at scala.None$.get(Option.scala:345)
    at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)
    at org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:644)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:281)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

16/11/03 11:25:45 INFO DAGScheduler: Job 7 failed: show at RNFBackTagger.scala:97, took 0.141681 s
16/11/03 11:25:45 INFO TaskSchedulerImpl: Removed TaskSet 12.0, whose tasks have all completed, from pool 
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 22 in stage 11.0 failed 4 times, most recent failure: Lost task 22.3 in stage 11.0 (TID 240, 10.178.149.243): java.util.NoSuchElementException: None.get
    at scala.None$.get(Option.scala:347)
    at scala.None$.get(Option.scala:345)
    at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)
    at org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:644)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:281)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1450)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1438)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1437)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1437)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:811)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1659)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1618)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1607)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1871)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1884)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1897)
    at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:347)
    at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:39)
    at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2183)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
    at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2532)
    at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2182)
    at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2189)
    at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1925)
    at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1924)
    at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2562)
    at org.apache.spark.sql.Dataset.head(Dataset.scala:1924)
    at org.apache.spark.sql.Dataset.take(Dataset.scala:2139)
    at org.apache.spark.sql.Dataset.showString(Dataset.scala:239)
    at org.apache.spark.sql.Dataset.show(Dataset.scala:526)
    at com.knoldus.xml.RNFBackTagger$.main(RNFBackTagger.scala:97)
    at com.knoldus.xml.RNFBackTagger.main(RNFBackTagger.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.util.NoSuchElementException: None.get
    at scala.None$.get(Option.scala:347)
    at scala.None$.get(Option.scala:345)
    at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)
    at org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:644)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:281)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/11/03 11:25:45 WARN JobProgressListener: Task start for unknown stage 12
16/11/03 11:25:45 WARN TaskSetManager: Lost task 0.0 in stage 14.0 (TID 244, 10.178.149.243): java.util.NoSuchElementException: None.get
    at scala.None$.get(Option.scala:347)
    at scala.None$.get(Option.scala:345)
    at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:343)
    at org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:644)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:281)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

16/11/03 11:25:45 INFO TaskSchedulerImpl: Removed TaskSet 14.0, whose tasks have all completed, from pool 
16/11/03 11:25:45 INFO SparkContext: Invoking stop() from shutdown hook
16/11/03 11:25:45 WARN JobProgressListener: Task start for unknown stage 14
16/11/03 11:25:45 INFO SerialShutdownHooks: Successfully executed shutdown hook: Clearing session cache for C* connector
16/11/03 11:25:45 INFO TaskSetManager: Finished task 5.0 in stage 11.0 (TID 219) in 137 ms on 10.178.149.22 (1/35)
16/11/03 11:25:45 INFO SparkUI: Stopped Spark web UI at http://10.178.149.133:4040
16/11/03 11:25:45 INFO StandaloneSchedulerBackend: Shutting down all executors
16/11/03 11:25:45 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down
16/11/03 11:25:45 ERROR TransportRequestHandler: Error while invoking RpcHandler#receive() for one-way message.
org.apache.spark.SparkException: Could not find CoarseGrainedScheduler.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:152)
    at org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:132)
    at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:571)
    at org.apache.spark.network.server.TransportRequestHandler.processOneWayMessage(TransportRequestHandler.java:179)
    at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:108)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:119)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)
16/11/03 11:25:45 ERROR TransportRequestHandler: Error while invoking RpcHandler#receive() for one-way message.
org.apache.spark.SparkException: Could not find CoarseGrainedScheduler.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:152)
    at org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:132)
    at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:571)
    at org.apache.spark.network.server.TransportRequestHandler.processOneWayMessage(TransportRequestHandler.java:179)
    at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:108)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:119)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)
16/11/03 11:25:45 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/11/03 11:25:45 INFO MemoryStore: MemoryStore cleared
16/11/03 11:25:45 INFO BlockManager: BlockManager stopped
16/11/03 11:25:45 INFO BlockManagerMaster: BlockManagerMaster stopped
16/11/03 11:25:45 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/11/03 11:25:45 ERROR TransportRequestHandler: Error while invoking RpcHandler#receive() for one-way message.
org.apache.spark.rpc.RpcEnvStoppedException: RpcEnv already stopped.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:150)
    at org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:132)
    at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:571)
    at org.apache.spark.network.server.TransportRequestHandler.processOneWayMessage(TransportRequestHandler.java:179)
    at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:108)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:119)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)
16/11/03 11:25:45 ERROR TransportRequestHandler: Error while invoking RpcHandler#receive() for one-way message.
org.apache.spark.rpc.RpcEnvStoppedException: RpcEnv already stopped.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:150)
    at org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:132)
    at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:571)
    at org.apache.spark.network.server.TransportRequestHandler.processOneWayMessage(TransportRequestHandler.java:179)
    at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:108)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:119)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)
16/11/03 11:25:45 INFO SparkContext: Successfully stopped SparkContext
16/11/03 11:25:45 INFO ShutdownHookManager: Shutdown hook called
16/11/03 11:25:45 INFO ShutdownHookManager: Deleting directory /tmp/spark-c52a6da9-5702-4128-9950-805d5f9dd75e

At first I was not able to pinpoint the problem. Then I tried the approach of removing unnecessary code.

I then found that the problem lies in this:

    val groupedDF = selectedDF.groupBy("id").agg(collect_list("name"))
    groupedDF.show

If I show selectedDF instead, it displays the correct result!

The Spark version I am using is 2.0.0. Please help me out and let me know what the problem is.

Link to the code: https://gist.github.com/shiv4nsh/0c3f62e3afd95634a6061b405c774582

The show on line 19 prints fine, and the show on line 28 throws this exception.

Server configuration: I have Spark 2.0 running on an 8-core worker with 10 GB of memory, on CentOS.

Script for launching the application:

./bin/spark-submit --class com.knoldus.Application /root/code/newCode/project1/target/deployable.jar

Any help is appreciated!

Note: The code works fine in local mode. This error is thrown only when I try to run it on the cluster.

Solution

I had a similar issue, and it turned out to be because my application was creating a new SparkContext every time it tried to reload certain classes on the executors. It is very likely the same problem in your case if the code that the executors need to load to run certain steps sits in the same 'logical context' as the code that instantiates the SparkContext. You need to make sure that your SparkContext is created at most once, simply by restructuring your code.
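
For illustration, a minimal sketch of that restructuring (assuming a SparkSession-based setup; the object and column names here are hypothetical, not taken from the linked gist). The idea is to keep session construction in one dedicated place so that class loading on the executors can never trigger a second context:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.collect_list

    // Hold the session in a dedicated object so it is created lazily,
    // exactly once per JVM, and only ever touched on the driver.
    object SparkHolder {
      lazy val spark: SparkSession = SparkSession.builder()
        .appName("Application") // master is supplied by spark-submit
        .getOrCreate()
    }

    object Application {
      def main(args: Array[String]): Unit = {
        val spark = SparkHolder.spark
        import spark.implicits._

        // Stand-in data for the real selectedDF from the gist.
        val selectedDF = Seq((1, "a"), (1, "b"), (2, "c")).toDF("id", "name")

        // Code shipped to the executors must not sit next to (or trigger)
        // SparkSession/SparkContext construction when its class is loaded.
        val groupedDF = selectedDF.groupBy("id").agg(collect_list("name"))
        groupedDF.show()
      }
    }

getOrCreate() also helps here: even if this path were hit twice on the driver, it returns the existing session instead of constructing a second context.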
