Migrate from Commons HttpClient to HttpComponents Client


Question

I would like to migrate from Commons HttpClient (3.x) to HttpComponents Client (4.x), but I am having difficulty handling redirects. The code works properly under Commons HttpClient but breaks when migrated to HttpComponents Client. Some of the links get undesirable redirects, but when I set "http.protocol.handle-redirects" to false, a large number of links stop working altogether.

Commons HttpClient 3.x:

private static HttpClient httpClient = null;
private static MultiThreadedHttpConnectionManager connectionManager = null;
private static final long MAX_CONNECTION_IDLE_TIME = 60000; // milliseconds

static {
    //HttpURLConnection.setFollowRedirects(true);
    CookieManager manager = new CookieManager();
    manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL);
    CookieHandler.setDefault(manager);

    connectionManager = new MultiThreadedHttpConnectionManager();
    connectionManager.getParams().setDefaultMaxConnectionsPerHost(1000); // will need to set from properties file
    connectionManager.getParams().setMaxTotalConnections(1000);
    httpClient = new HttpClient(connectionManager);
}




/*
* Retrieve HTML
*/  
public String fetchURL(String url) throws IOException{

    if ( StringUtils.isEmpty(url) )
        return null;

    GetMethod getMethod = new GetMethod(url);
    // Use the shared, pooled httpClient created in the static block;
    // a local "new HttpClient()" here would bypass the connection manager.
    //configureMethod(getMethod);
    //ObjectInputStream oin = null;
    InputStream in = null;
    int code = -1;
    String html = "";
    String lastModified = null;
    try {
        code = httpClient.executeMethod(getMethod);

        in = getMethod.getResponseBodyAsStream();
        //oin = new ObjectInputStream(in);
        //html = getMethod.getResponseBodyAsString();
        html = CharStreams.toString(new InputStreamReader(in));
    }
    catch (Exception except) {
    }
    finally {
        try {
            //oin.close();
            in.close();
        }
        catch (Exception except) {}

        getMethod.releaseConnection();
        connectionManager.closeIdleConnections(MAX_CONNECTION_IDLE_TIME);
    }

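    // 4xx and 5xx status codes indicate failure (400 itself is "Bad Request")
    if (code < 400) {
        return html.replaceAll("\\s+", " ");
    } else {
        // the method declares IOException, so throw that rather than a plain Exception
        throw new IOException("URL: " + url + " returned response code " + code);
    }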

}

HttpComponents Client 4.x:

private static HttpClient httpClient = null;
private static HttpParams params = null;
//private static MultiThreadedHttpConnectionManager connectionManager = null;
private static ThreadSafeClientConnManager connectionManager = null;
private static final int MAX_CONNECTION_IDLE_TIME = 60000; // milliseconds


static {
    //HttpURLConnection.setFollowRedirects(true);
    CookieManager manager = new CookieManager();
    manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL);
    CookieHandler.setDefault(manager);


    connectionManager = new ThreadSafeClientConnManager();
    connectionManager.setDefaultMaxPerRoute(1000); // will need to set from properties file
    connectionManager.setMaxTotal(1000);
    httpClient = new DefaultHttpClient(connectionManager);



    // HTTP parameters stores header etc.
    params = new BasicHttpParams();
    params.setParameter("http.protocol.handle-redirects",false);

}




/*
* Retrieve HTML
*/  
public String fetchURL(String url) throws IOException{

    if ( StringUtils.isEmpty(url) )
        return null;

    InputStream in = null;
    //int code = -1;
    String html = "";

    // Prepare a request object
    HttpGet httpget = new HttpGet(url);
    httpget.setParams(params);

    // Execute the request
    HttpResponse response = httpClient.execute(httpget);

    // The response status
    //System.out.println(response.getStatusLine());
    int code = response.getStatusLine().getStatusCode();

    // Get hold of the response entity
    HttpEntity entity = response.getEntity();

    // If the response does not enclose an entity, there is no need
    // to worry about connection release
    if (entity != null) {

        try {
            //code = httpClient.executeMethod(getMethod);

            //in = getMethod.getResponseBodyAsStream();
            in = entity.getContent();
            html = CharStreams.toString(new InputStreamReader(in));
        }
        catch (Exception except) {
            // the method declares IOException, so wrap the failure and keep its cause
            throw new IOException("URL: " + url + " returned response code " + code, except);
        }
        finally {
            try {
                //oin.close();
                in.close();
            }
            catch (Exception except) {}

            //getMethod.releaseConnection();
            connectionManager.closeIdleConnections(MAX_CONNECTION_IDLE_TIME, TimeUnit.MILLISECONDS);
            connectionManager.closeExpiredConnections();
        }

    }

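    // 4xx and 5xx status codes indicate failure (400 itself is "Bad Request")
    if (code < 400) {
        return html;
    } else {
        // the method declares IOException, so throw that rather than a plain Exception
        throw new IOException("URL: " + url + " returned response code " + code);
    }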


}

I don't want redirects, but under HttpClient 4.x, if I enable them, I get some that are undesirable, e.g. http://www.walmart.com/ => http://mobile.walmart.com/. Under HttpClient 3.x no such redirect happens.

What do I need to do to migrate from HttpClient 3.x to HttpClient 4.x without breaking the code?

Recommended answer

This is not an issue with HttpClient 4.x itself; it is more likely the way the target server handles the request. Because the user agent identifies itself as HttpClient, the server may treat the client as a mobile device (the target server may consider anything other than the well-known browsers, e.g. Chrome or Mozilla, to be mobile).
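The server-side behaviour suspected here can be reproduced locally without touching the real site. The following is only a sketch, using JDK classes (`com.sun.net.httpserver.HttpServer` and `HttpURLConnection`) rather than HttpClient. The local server, its rule that anything not starting with `Mozilla/` counts as mobile, the `/mobile` path, and the class and method names are all assumptions made for illustration; how the real server decides is not known.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;

class UserAgentRedirectDemo {

    /** Starts a throwaway local server that redirects non-browser user agents (hypothetical rule). */
    static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            String ua = exchange.getRequestHeaders().getFirst("User-Agent");
            boolean looksLikeBrowser = ua != null && ua.startsWith("Mozilla/");
            if (looksLikeBrowser) {
                exchange.sendResponseHeaders(200, -1);   // OK, no response body
            } else {
                exchange.getResponseHeaders().add("Location", "/mobile");
                exchange.sendResponseHeaders(302, -1);   // redirect to the "mobile" page
            }
            exchange.close();
        });
        server.start();
        return server;
    }

    /** Sends a GET with the given User-Agent, does not follow redirects, returns the status code. */
    static int statusFor(int port, String userAgent) throws IOException {
        URI uri = URI.create("http://127.0.0.1:" + port + "/");
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection(Proxy.NO_PROXY);
        conn.setInstanceFollowRedirects(false);
        conn.setRequestProperty("User-Agent", userAgent);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startServer();
        try {
            int port = server.getAddress().getPort();
            // A library-style user agent versus a browser-style one
            System.out.println(statusFor(port, "Apache-HttpClient/4.x (java)"));
            System.out.println(statusFor(port, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2"));
        } finally {
            server.stop(0);
        }
    }
}
```

If the real server applies a comparable rule, the same URL would answer differently depending only on the `User-Agent` header, which would explain why the redirect appeared after the client library changed.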

Use the code below to set the user agent manually:

// httpClient is the shared client instance from the question's static block
httpClient.getParams().setParameter(
        org.apache.http.params.HttpProtocolParams.USER_AGENT,
        "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2"
);
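If changing the user agent is not enough and some redirects still have to be rejected, one option is to keep automatic redirects disabled and decide for each 3xx response whether to request the `Location` target. The sketch below shows only that decision, in plain Java with no HttpClient dependency. The same-host rule, the class and method names, and the `/some/page` path are assumptions for illustration, not something the question or the library prescribes.

```java
import java.net.URI;

class RedirectPolicySketch {

    /** True for the 3xx status codes that normally carry a Location header. */
    static boolean isRedirect(int code) {
        return code == 301 || code == 302 || code == 303 || code == 307 || code == 308;
    }

    /** Hypothetical policy: follow a redirect only when it stays on the requested host. */
    static boolean shouldFollow(URI requested, String locationHeader) {
        // resolve() also handles relative Location values such as "/some/page"
        URI target = requested.resolve(locationHeader);
        String from = requested.getHost();
        String to = target.getHost();
        return from != null && from.equalsIgnoreCase(to);
    }

    public static void main(String[] args) {
        URI requested = URI.create("http://www.walmart.com/");
        // Cross-host redirect from the question: rejected by this policy
        System.out.println(shouldFollow(requested, "http://mobile.walmart.com/"));
        // Relative redirect on the same host: accepted by this policy
        System.out.println(shouldFollow(requested, "/some/page"));
    }
}
```

With such a check, a caller would treat a response as final unless `isRedirect(code)` is true and `shouldFollow` accepts the `Location` value, and only then issue a second request.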
