Migrate from Commons HttpClient to HttpComponents Client
Problem description
I would like to migrate from Commons HttpClient (3.x) to HttpComponents Client (4.x), but I am having difficulty handling redirects. The code works properly under Commons HttpClient but breaks when migrated to HttpComponents Client. Some of the links get undesirable redirects, but when I set "http.protocol.handle-redirects" to false, a large number of links stop working altogether.
Commons HttpClient 3.x:
    private static HttpClient httpClient = null;
    private static MultiThreadedHttpConnectionManager connectionManager = null;
    private static final long MAX_CONNECTION_IDLE_TIME = 60000; // milliseconds

    static {
        //HttpURLConnection.setFollowRedirects(true);
        CookieManager manager = new CookieManager();
        manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        connectionManager = new MultiThreadedHttpConnectionManager();
        connectionManager.getParams().setDefaultMaxConnectionsPerHost(1000); // will need to set from properties file
        connectionManager.getParams().setMaxTotalConnections(1000);
        httpClient = new HttpClient(connectionManager);
    }

    /*
     * Retrieve HTML
     */
    public String fetchURL(String url) throws IOException {
        if (StringUtils.isEmpty(url)) {
            return null;
        }
        GetMethod getMethod = new GetMethod(url);
        InputStream in = null;
        int code = -1;
        String html = "";
        try {
            // Use the shared, pooled client from the static initializer
            // (a local "new HttpClient()" here would bypass the connection manager)
            code = httpClient.executeMethod(getMethod);
            in = getMethod.getResponseBodyAsStream();
            if (in != null) {
                html = CharStreams.toString(new InputStreamReader(in));
            }
        } finally {
            if (in != null) {
                try {
                    in.close();
                } catch (IOException ignored) {
                    // nothing useful can be done if close fails
                }
            }
            getMethod.releaseConnection();
            connectionManager.closeIdleConnections(MAX_CONNECTION_IDLE_TIME);
        }
        // 4xx and 5xx status codes indicate failure
        if (code < 400) {
            return html.replaceAll("\\s+", " ");
        } else {
            // The method declares IOException, so throw that rather than a bare Exception
            throw new IOException("URL: " + url + " returned response code " + code);
        }
    }
HttpComponents Client 4.x:
    private static HttpClient httpClient = null;
    private static HttpParams params = null;
    private static ThreadSafeClientConnManager connectionManager = null;
    private static final int MAX_CONNECTION_IDLE_TIME = 60000; // milliseconds

    static {
        //HttpURLConnection.setFollowRedirects(true);
        CookieManager manager = new CookieManager();
        manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        connectionManager = new ThreadSafeClientConnManager();
        connectionManager.setDefaultMaxPerRoute(1000); // will need to set from properties file
        connectionManager.setMaxTotal(1000);
        httpClient = new DefaultHttpClient(connectionManager);
        // Per-request HTTP parameters; automatic redirect handling is switched off here
        params = new BasicHttpParams();
        params.setParameter("http.protocol.handle-redirects", false);
    }

    /*
     * Retrieve HTML
     */
    public String fetchURL(String url) throws IOException {
        if (StringUtils.isEmpty(url)) {
            return null;
        }
        String html = "";
        // Prepare a request object
        HttpGet httpget = new HttpGet(url);
        httpget.setParams(params);
        // Execute the request
        HttpResponse response = httpClient.execute(httpget);
        // The response status
        int code = response.getStatusLine().getStatusCode();
        // Get hold of the response entity
        HttpEntity entity = response.getEntity();
        // If the response does not enclose an entity, there is no need
        // to worry about connection release
        if (entity != null) {
            InputStream in = entity.getContent();
            try {
                html = CharStreams.toString(new InputStreamReader(in));
            } finally {
                // Closing the content stream releases the connection back to the manager
                try {
                    in.close();
                } catch (IOException ignored) {
                    // nothing useful can be done if close fails
                }
                connectionManager.closeIdleConnections(MAX_CONNECTION_IDLE_TIME, TimeUnit.MILLISECONDS);
                connectionManager.closeExpiredConnections();
            }
        }
        // 4xx and 5xx status codes indicate failure
        if (code < 400) {
            return html;
        } else {
            // The method declares IOException, so throw that rather than a bare Exception
            throw new IOException("URL: " + url + " returned response code " + code);
        }
    }
I don't want redirects, but under HttpClient 4.x, if I enable redirects, I get some that are undesirable, e.g. http://www.walmart.com/ => http://mobile.walmart.com/. Under HttpClient 3.x no such redirect happens.
What do I need to do to migrate from HttpClient 3.x to HttpClient 4.x without breaking the code?
Recommended answer
This is not an issue with HttpClient 4.x itself. It is more likely the way the target server handles the request: because the user agent is HttpClient's default, the server may treat the client as a mobile device (it may treat anything other than the well-known desktop browsers, such as Chrome or Mozilla, as mobile).
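The server-side behaviour described here can be reproduced locally. The sketch below is only an illustration: it uses the JDK's built-in com.sun.net.httpserver.HttpServer and HttpURLConnection instead of HttpComponents so that it runs without extra libraries. The rule "any User-Agent that does not start with Mozilla/ is redirected to /mobile" is an assumption standing in for whatever the real site does, and the class name, method names and the sample Apache-HttpClient user agent string are hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;

class UserAgentRedirectDemo {

    /** Starts a local server that answers 302 -> /mobile unless the User-Agent looks like a browser. */
    static HttpServer startServer() {
        try {
            // Port 0 lets the OS pick a free port
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                String ua = exchange.getRequestHeaders().getFirst("User-Agent");
                // Assumed server-side rule, for illustration only
                boolean looksLikeBrowser = ua != null && ua.startsWith("Mozilla/");
                if (looksLikeBrowser) {
                    exchange.sendResponseHeaders(200, -1); // -1: no response body
                } else {
                    exchange.getResponseHeaders().add("Location", "/mobile");
                    exchange.sendResponseHeaders(302, -1);
                }
                exchange.close();
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the raw status code for the given User-Agent, without following redirects. */
    static int statusFor(int port, String userAgent) {
        try {
            HttpURLConnection conn = (HttpURLConnection)
                    URI.create("http://127.0.0.1:" + port + "/").toURL().openConnection();
            conn.setInstanceFollowRedirects(false);
            conn.setRequestProperty("User-Agent", userAgent);
            try {
                return conn.getResponseCode();
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpServer server = startServer();
        try {
            int port = server.getAddress().getPort();
            // Library-style user agent (illustrative value)
            System.out.println(statusFor(port, "Apache-HttpClient/4.1 (java 1.5)"));
            // Browser-style user agent
            System.out.println(statusFor(port, "Mozilla/5.0 (Windows NT 6.1) Gecko/20100316 Firefox/3.6.2"));
        } finally {
            server.stop(0);
        }
    }
}
```

The same URL yields a redirect for the library-style user agent and a normal response for the browser-style one, which is the pattern the question describes for http://www.walmart.com/.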
Set the user agent manually with the code below:
    // httpClient is the shared DefaultHttpClient from the question's static initializer
    httpClient.getParams().setParameter(
        org.apache.http.params.HttpProtocolParams.USER_AGENT,
        "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2"
    );
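If some redirects are still unwanted after changing the user agent, another option is to leave redirect handling enabled and decide per redirect whether to follow it. The following is a minimal, library-independent sketch of such a decision. The class RedirectFilter, its methods, the rule "reject hosts starting with mobile. or m." and the path /cp/electronics are all assumptions made for illustration. In HttpClient 4.1 and later a rule like this could presumably be plugged in through a custom RedirectStrategy; that wiring is not shown here.

```java
import java.net.URI;
import java.util.Locale;

class RedirectFilter {

    /** Resolves a Location header value against the request URI; the value may be relative. */
    static URI resolveLocation(URI requestUri, String location) {
        return requestUri.resolve(location);
    }

    /** Returns true if the redirect target is acceptable under the assumed rule. */
    static boolean shouldFollow(URI requestUri, String location) {
        String host = resolveLocation(requestUri, location).getHost();
        if (host == null) {
            return false; // no usable host: do not follow
        }
        host = host.toLowerCase(Locale.ROOT);
        // Assumed rule: hosts starting with "mobile." or "m." are the unwanted mobile site
        return !(host.startsWith("mobile.") || host.startsWith("m."));
    }

    public static void main(String[] args) {
        URI request = URI.create("http://www.walmart.com/");
        System.out.println(shouldFollow(request, "http://mobile.walmart.com/")); // false
        System.out.println(shouldFollow(request, "/cp/electronics"));            // true
    }
}
```

Resolving the Location value first matters because servers may send relative redirects, and the host check should run against the final absolute target.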