Apache HttpClient 4 persistent connection per Proxy instead of per route

Problem description

As I understand it, all implementations of ClientConnectionManager persist connections based on route. If a proxy is involved, this results in basically no reuse of persistent connections. For example, if the HttpClient needs to visit 1000 different domains via an HTTP proxy with a fixed IP, it has to establish at least 1000 connections to the proxy instead of creating 1 persistent connection to the proxy and reusing it for the 1000 requests.
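To make the per-route behavior concrete, here is a minimal sketch that models the pool key only. `SimpleRoute` is a hypothetical stand-in, not the real `HttpRoute` class, and the proxy address and host names are made up for illustration. It assumes the pool looks connections up by a key whose equality includes both the proxy and the target.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical, simplified stand-in for a route: only the fields relevant here.
final class SimpleRoute {
    final String proxy;   // first proxy hop (made-up address in this demo)
    final String target;  // target host and port

    SimpleRoute(String proxy, String target) {
        this.proxy = proxy;
        this.target = target;
    }

    // Equality includes the target, so every target yields a different key.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SimpleRoute)) return false;
        SimpleRoute r = (SimpleRoute) o;
        return Objects.equals(proxy, r.proxy) && Objects.equals(target, r.target);
    }

    @Override
    public int hashCode() {
        return Objects.hash(proxy, target);
    }
}

final class PerRoutePoolDemo {
    // Number of distinct pool keys produced by n targets behind a single proxy.
    static int distinctKeys(int n) {
        Set<SimpleRoute> keys = new HashSet<>();
        for (int i = 0; i < n; i++) {
            keys.add(new SimpleRoute("10.0.0.1:3128", "site" + i + ".example:80"));
        }
        return keys.size();
    }

    public static void main(String[] args) {
        // One key per target, even though every request goes to the same proxy.
        System.out.println(distinctKeys(1000));
    }
}
```

Under this model a connection pooled for one target is never found when looking up another target, even though both connections physically terminate at the same proxy.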

I'm simulating multiple users visiting thousands of domains (fake domains; all DNS names resolve to a couple of IPs, and the resolution happens behind the proxy, so it has nothing to do with HttpClient). As I increase the number of users and domains, the above behavior quickly uses up all available ports on the local host, and address bind errors occur as a result.

Is there a way to make HttpClient persist connections on a per-proxy basis, i.e. so that an HttpClient maintains only a specified number of connections to a given proxy?

Solution

After intensive research, it seems that Apache HttpClient doesn't support this behavior out of the box. I had to modify the HttpClient/HttpCore source to get this feature, i.e. to maintain persistent connections based only on the localAddress and the first proxy address.

The classes I modified are:

org.apache.http.conn.routing.HttpRoute.java and org.apache.http.conn.routing.BasicRouteDirector.java.

Basically, I changed the hashCode and equals methods in HttpRoute (which is used as the hash table key for persistent connection lookup), so that the lookup doesn't consider the target address if a proxy is involved.
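The idea can be sketched roughly as follows. This is not the actual patch to `HttpRoute`; `ProxyAwareRoute` is a hypothetical class with made-up field names, written only to show what "ignore the target when a proxy is involved" could look like while keeping `equals` and `hashCode` consistent with each other.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical key type illustrating the modified comparison; not the real HttpRoute.
final class ProxyAwareRoute {
    final String localAddress; // local bind address, may be null
    final String firstProxy;   // first proxy hop, null for direct routes
    final String target;       // target host and port

    ProxyAwareRoute(String localAddress, String firstProxy, String target) {
        this.localAddress = localAddress;
        this.firstProxy = firstProxy;
        this.target = target;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ProxyAwareRoute)) return false;
        ProxyAwareRoute r = (ProxyAwareRoute) o;
        if (!Objects.equals(localAddress, r.localAddress)) return false;
        if (!Objects.equals(firstProxy, r.firstProxy)) return false;
        // Proxied routes: same local address and same first proxy is enough.
        // Direct routes: the target still has to match.
        return firstProxy != null || Objects.equals(target, r.target);
    }

    @Override
    public int hashCode() {
        // Must use the same fields as equals, so the target is skipped when proxied.
        return firstProxy != null
                ? Objects.hash(localAddress, firstProxy)
                : Objects.hash(localAddress, target);
    }
}

final class ProxyKeyDemo {
    // Distinct keys for n targets; proxy may be null to model direct routes.
    static int distinctKeys(int n, String proxy) {
        Set<ProxyAwareRoute> keys = new HashSet<>();
        for (int i = 0; i < n; i++) {
            keys.add(new ProxyAwareRoute(null, proxy, "site" + i + ".example:80"));
        }
        return keys.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctKeys(1000, "10.0.0.1:3128")); // proxied routes
        System.out.println(distinctKeys(1000, null));            // direct routes
    }
}
```

With a key like this, all proxied requests from the same local address map to one pool entry, while direct routes are still separated by target. Sharing a connection this way only makes sense for plain HTTP requests forwarded by the proxy; a connection tunnelled to a specific target via CONNECT is tied to that target.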

Initial test results with the above modification show about a 100-fold improvement in request throughput in my scenario. So far it works fine for me.

Kevin
