提交拓扑后异常 [英] exception after submitting topology
本文介绍了提交拓扑后异常的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
我是 Storm 新手,尝试提交拓扑并发现了这个在主管 我在工人的日志文件中发现了这个
I'm new in storm and trying to submit a topology and found this in supervisor I found this in log file of workers
[ERROR] Async loop died!
java.lang.RuntimeException: org.apache.thrift7.transport.TTransportException: java.net.ConnectException: Connection refused
at backtype.storm.drpc.DRPCInvocationsClient.<init>(DRPCInvocationsClient.java:23)
at backtype.storm.drpc.DRPCSpout.open(DRPCSpout.java:69)
at storm.trident.spout.RichSpoutBatchTriggerer.open(RichSpoutBatchTriggerer.java:41)
at backtype.storm.daemon.executor$fn__3985$fn__3997.invoke(executor.clj:460)
at backtype.storm.util$async_loop$fn__465.invoke(util.clj:375)
at clojure.lang.AFn.run(AFn.java:24)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.thrift7.transport.TTransportException: java.net.ConnectException: Connection refused
主管的日志文件
supervisor [INFO] ff6460a5-aafb-44a4-a49c-2de945ffd572 still hasn't started
2015-09-15 02:00:54 supervisor [ERROR] Error when processing event
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at com.netflix.curator.ConnectionState.getZooKeeper(ConnectionState.java:72)
at com.netflix.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:74)
at com.netflix.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:353)
at com.netflix.curator.framework.imps.ExistsBuilderImpl$2.call(ExistsBuilderImpl.java:149)
at com.netflix.curator.framework.imps.ExistsBuilderImpl$2.call(ExistsBuilderImpl.java:138)
at com.netflix.curator.RetryLoop.callWithRetry(RetryLoop.java:85)
at com.netflix.curator.framework.imps.ExistsBuilderImpl.pathInForeground(ExistsBuilderImpl.java:134)
at com.netflix.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:125)
at com.netflix.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:34)
at backtype.storm.zookeeper$exists_node_QMARK_.invoke(zookeeper.clj:78)
at backtype.storm.zookeeper$mkdirs.invoke(zookeeper.clj:88)
at backtype.storm.cluster$mk_distributed_cluster_state$reify__1996.set_ephemeral_node(cluster.clj:54)
at backtype.storm.cluster$mk_storm_cluster_state$reify__2415.supervisor_heartbeat_BANG_(cluster.clj:300)
at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28)
这也在主管日志文件中
at java.lang.Thread.run(Unknown Source)
2015-09-15 02:00:54 supervisor [INFO] ff6460a5-aafb-44a4-a49c-2de945ffd572 still hasn't started
2015-09-15 02:00:55 ClientCnxn [INFO] Client session timed out, have not heard from server in 20020ms for sessionid 0x14fce3996380015, closing socket connection and attempting reconnect
2015-09-15 02:00:58 ClientCnxn [INFO] Opening socket connection to server localhost/127.0.0.1:2181
2015-09-15 02:00:58 ClientCnxn [INFO] Socket connection established to localhost/127.0.0.1:2181, initiating session
2015-09-15 02:00:59 supervisor [INFO] ff6460a5-aafb-44a4-a49c-2de945ffd572 still hasn't started
2015-09-15 02:01:01 supervisor [INFO] ff6460a5-aafb-44a4-a49c-2de945ffd572 still hasn't started
2015-09-15 02:00:59 util [INFO] Halting process: ("Error when processing an event")
推荐答案
这个问题有很多可能的原因.
There are many possible reasons for this issue.
- zookeeper 未启动.
- CPU 有一段时间达到峰值,超时没有心跳发送,因此 nimbus 认为主管已死,断开连接.
- worker timeout 太短了,可能默认是 10sec,你可以改成 600 以上试试.这几乎就像 #2.
- 确保 nimbus 工作正常.
- worker.childopts 不正确,说明内存设置不正确,更改 xmx 和 maxpermsize 再试一次.
- 如果你用winrm或者powershell启动风暴,可能默认内存不够用,因为默认内存只有1024M,需要设置更多,比如2048M试试.
这篇关于提交拓扑后异常的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文