StormCrawler: Timeout waiting for connection from pool


Problem description


We are consistently getting the following error when we increase either the number of threads or the number of executors for Fetcher bolt.

org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:286) ~[stormjar.jar:?]
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:263) ~[stormjar.jar:?]
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190) ~[stormjar.jar:?]
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184) ~[stormjar.jar:?]
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184) ~[stormjar.jar:?]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:71) ~[stormjar.jar:?]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:220) ~[stormjar.jar:?]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:164) ~[stormjar.jar:?]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:139) ~[stormjar.jar:?]
at com.digitalpebble.stormcrawler.protocol.httpclient.HttpProtocol.getProtocolOutput(HttpProtocol.java:206) ~[stormjar.jar:?]


Is this due to a resource leak or some hard limit on the size of the http thread pool? If it is about the thread pool, is there any way to increase the pool size?

Accepted answer

There is a maximum number of connections set for the pool in HttpProtocol, equal to the number of fetcher threads (fetcher.threads.number). Since the pool is static, it is shared by all the executors on the same worker. I'd recommend using a single FetcherBolt instance per worker; the pool's connection limit then matches fetcher.threads.number exactly and you won't hit this problem.
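As a sketch of what the recommendation above could look like in a Flux topology definition (the bolt id, exact keys, and worker count are illustrative; the class name follows the standard StormCrawler archetype, but verify against your version):

```yaml
# crawler.flux (sketch) -- one FetcherBolt executor per worker,
# so the shared static connection pool is bounded by fetcher.threads.number
config:
  topology.workers: 4
  fetcher.threads.number: 50   # also sized as the HTTP connection pool limit

bolts:
  - id: "fetcher"
    className: "com.digitalpebble.stormcrawler.bolt.FetcherBolt"
    parallelism: 4             # = topology.workers -> one executor per worker
```

Keeping the FetcherBolt parallelism equal to the number of workers ensures each worker hosts exactly one executor competing for the pool.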


Alternatively, you could give the okhttp protocol a try. It is more robust for open-ended and large-scale crawls. See the WIKI page on protocols for a feature comparison.
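Switching protocol implementations is a configuration change. A minimal sketch, assuming the standard StormCrawler configuration keys and the okhttp protocol class shipped with the project (double-check both against your version's wiki):

```yaml
# crawler-conf.yaml (sketch) -- switch from httpclient to the okhttp protocol
http.protocol.implementation: "com.digitalpebble.stormcrawler.protocol.okhttp.HttpProtocol"
https.protocol.implementation: "com.digitalpebble.stormcrawler.protocol.okhttp.HttpProtocol"
```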
