从 Commons HttpClient 迁移到 HttpComponents 客户端 [英] Migrate from Commons HttpClient to HttpComponents Client
问题描述
我想从 Commons HttpClient (3.x) 迁移到 HttpComponents Client (4.x),但在如何处理重定向方面有困难.该代码在 Commons HttpClient 下正常工作,但在迁移到 HttpComponents Client 时会中断.一些链接获得了不受欢迎的重定向,但是当我将http.protocol.handle-redirects"设置为false"时,大量链接完全停止工作.
I would like to migrate from Commons HttpClient (3.x) to HttpComponents Client (4.x) but having difficulty how to handle redirects. The code works properly under Commons HttpClient but breaks when migrated to HttpComponents Client. Some of the links get undesirable redirects but when I set "http.protocol.handle-redirects" to 'false' a large number links stop working altogether.
公共 HttpClient 3.x:
Commons HttpClient 3.x:
private static HttpClient httpClient = null;
private static MultiThreadedHttpConnectionManager connectionManager = null;
private static final long MAX_CONNECTION_IDLE_TIME = 60000; // milliseconds
static {
//HttpURLConnection.setFollowRedirects(true);
CookieManager manager = new CookieManager();
manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL);
CookieHandler.setDefault(manager);
connectionManager = new MultiThreadedHttpConnectionManager();
connectionManager.getParams().setDefaultMaxConnectionsPerHost(1000); // will need to set from properties file
connectionManager.getParams().setMaxTotalConnections(1000);
httpClient = new HttpClient(connectionManager);
}
/*
* Retrieve HTML
*/
public String fetchURL(String url) throws IOException{
if ( StringUtils.isEmpty(url) )
return null;
GetMethod getMethod = new GetMethod(url);
HttpClient httpClient = new HttpClient();
//configureMethod(getMethod);
//ObjectInputStream oin = null;
InputStream in = null;
int code = -1;
String html = "";
String lastModified = null;
try {
code = httpClient.executeMethod(getMethod);
in = getMethod.getResponseBodyAsStream();
//oin = new ObjectInputStream(in);
//html = getMethod.getResponseBodyAsString();
html = CharStreams.toString(new InputStreamReader(in));
}
catch (Exception except) {
}
finally {
try {
//oin.close();
in.close();
}
catch (Exception except) {}
getMethod.releaseConnection();
connectionManager.closeIdleConnections(MAX_CONNECTION_IDLE_TIME);
}
if (code <= 400){
return html.replaceAll("\\s+", " ");
} else {
throw new Exception("URL: " + url + " returned response code " + code);
}
}
HttpComponents 客户端 4.x :
HttpComponents Client 4.x :
private static HttpClient httpClient = null;
private static HttpParams params = null;
//private static MultiThreadedHttpConnectionManager connectionManager = null;
private static ThreadSafeClientConnManager connectionManager = null;
private static final int MAX_CONNECTION_IDLE_TIME = 60000; // milliseconds
static {
//HttpURLConnection.setFollowRedirects(true);
CookieManager manager = new CookieManager();
manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL);
CookieHandler.setDefault(manager);
connectionManager = new ThreadSafeClientConnManager();
connectionManager.setDefaultMaxPerRoute(1000); // will need to set from properties file
connectionManager.setMaxTotal(1000);
httpClient = new DefaultHttpClient(connectionManager);
// HTTP parameters stores header etc.
params = new BasicHttpParams();
params.setParameter("http.protocol.handle-redirects",false);
}
/*
* Retrieve HTML
*/
public String fetchURL(String url) throws IOException{
if ( StringUtils.isEmpty(url) )
return null;
InputStream in = null;
//int code = -1;
String html = "";
// Prepare a request object
HttpGet httpget = new HttpGet(url);
httpget.setParams(params);
// Execute the request
HttpResponse response = httpClient.execute(httpget);
// The response status
//System.out.println(response.getStatusLine());
int code = response.getStatusLine().getStatusCode();
// Get hold of the response entity
HttpEntity entity = response.getEntity();
// If the response does not enclose an entity, there is no need
// to worry about connection release
if (entity != null) {
try {
//code = httpClient.executeMethod(getMethod);
//in = getMethod.getResponseBodyAsStream();
in = entity.getContent();
html = CharStreams.toString(new InputStreamReader(in));
}
catch (Exception except) {
throw new Exception("URL: " + url + " returned response code " + code);
}
finally {
try {
//oin.close();
in.close();
}
catch (Exception except) {}
//getMethod.releaseConnection();
connectionManager.closeIdleConnections(MAX_CONNECTION_IDLE_TIME, TimeUnit.MILLISECONDS);
connectionManager.closeExpiredConnections();
}
}
if (code <= 400){
return html;
} else {
throw new Exception("URL: " + url + " returned response code " + code);
}
}
我不想要重定向,但在 HttpClient 4.x 下,如果我启用重定向,那么我会得到一些不需要的,例如http://www.walmart.com/ => http://mobile.walmart.com/.在 HttpClient 3.x 下不会发生此类重定向.
I won't want redirects but under HttpClient 4.x if I enable redirects then I get some that are undesirable, e.g. http://www.walmart.com/ => http://mobile.walmart.com/. Under HttpClient 3.x no such redirects happens.
我需要做什么才能在不破坏代码的情况下将 HttpClient 3.x 迁移到 HttpClient 4.x?
What do I need to do to migrate HttpClient 3.x to HttpClient 4.x without breaking the code?
推荐答案
这不是 HttpClient 4.x 的问题,可能是目标服务器处理请求的方式,因为用户代理是 httpclient,它可能被处理为移动(目标服务器可能会考虑将其他可用浏览器(例如 chrome、mozilla 等)视为移动浏览器.)
It is not the issue with HttpClient 4.x, might be the way target server handle the request, since the user agent is httpclient, it may be handled as mobile (target server may consider other than available browsers like, i.e, chrome, mozilla etc as mobile.)
请使用以下代码手动设置
Please use below code to set it manually
httpclient.getParams().setParameter(
org.apache.http.params.HttpProtocolParams.USER_AGENT,
"Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2"
);
这篇关于从 Commons HttpClient 迁移到 HttpComponents 客户端的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!