从 Commons HttpClient 迁移到 HttpComponents 客户端 [英] Migrate from Commons HttpClient to HttpComponents Client

查看:29
本文介绍了从 Commons HttpClient 迁移到 HttpComponents 客户端的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想从 Commons HttpClient (3.x) 迁移到 HttpComponents Client (4.x),但在如何处理重定向方面有困难.该代码在 Commons HttpClient 下正常工作,但在迁移到 HttpComponents Client 时会中断.一些链接获得了不受欢迎的重定向,但是当我将http.protocol.handle-redirects"设置为false"时,大量链接完全停止工作.

I would like to migrate from Commons HttpClient (3.x) to HttpComponents Client (4.x) but having difficulty how to handle redirects. The code works properly under Commons HttpClient but breaks when migrated to HttpComponents Client. Some of the links get undesirable redirects but when I set "http.protocol.handle-redirects" to 'false' a large number links stop working altogether.

公共 HttpClient 3.x:

Commons HttpClient 3.x:

private static HttpClient httpClient = null;
private static MultiThreadedHttpConnectionManager connectionManager = null;
private static final long MAX_CONNECTION_IDLE_TIME = 60000; // milliseconds

static {
    //HttpURLConnection.setFollowRedirects(true);
    CookieManager manager = new CookieManager();
    manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL);
    CookieHandler.setDefault(manager);

connectionManager = new MultiThreadedHttpConnectionManager();
connectionManager.getParams().setDefaultMaxConnectionsPerHost(1000); // will need to set from properties file
connectionManager.getParams().setMaxTotalConnections(1000);
httpClient = new HttpClient(connectionManager);
}




/*
* Retrieve HTML
*/  
public String fetchURL(String url) throws IOException{

    if ( StringUtils.isEmpty(url) )
        return null;

    GetMethod getMethod = new GetMethod(url);
    HttpClient httpClient = new HttpClient();
    //configureMethod(getMethod);
    //ObjectInputStream oin = null;
    InputStream in = null;
    int code = -1;
    String html = "";
    String lastModified = null;
    try {
      code = httpClient.executeMethod(getMethod);

      in = getMethod.getResponseBodyAsStream();
        //oin = new ObjectInputStream(in);
        //html = getMethod.getResponseBodyAsString();
        html = CharStreams.toString(new InputStreamReader(in));

    }


    catch (Exception except) {
    }
    finally {

      try {
        //oin.close();
        in.close();
      }
      catch (Exception except) {}

      getMethod.releaseConnection();
      connectionManager.closeIdleConnections(MAX_CONNECTION_IDLE_TIME);
    }

    if (code <= 400){
        return html.replaceAll("\\s+", " ");
    } else {
        throw new Exception("URL: " + url + " returned response code " + code);
    }

}

HttpComponents 客户端 4.x :

HttpComponents Client 4.x :

private static HttpClient httpClient = null;
private static HttpParams params = null;
//private static MultiThreadedHttpConnectionManager connectionManager = null;
private static ThreadSafeClientConnManager connectionManager = null;
private static final int MAX_CONNECTION_IDLE_TIME = 60000; // milliseconds


static {
    //HttpURLConnection.setFollowRedirects(true);
    CookieManager manager = new CookieManager();
    manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL);
    CookieHandler.setDefault(manager);


connectionManager = new ThreadSafeClientConnManager();
connectionManager.setDefaultMaxPerRoute(1000); // will need to set from properties file
connectionManager.setMaxTotal(1000);
httpClient = new DefaultHttpClient(connectionManager);



    // HTTP parameters stores header etc.
    params = new BasicHttpParams();
    params.setParameter("http.protocol.handle-redirects",false);

}




/*
* Retrieve HTML
*/  
public String fetchURL(String url) throws IOException{

    if ( StringUtils.isEmpty(url) )
        return null;

    InputStream in = null;
    //int code = -1;
    String html = "";

 // Prepare a request object
 HttpGet httpget = new HttpGet(url);
httpget.setParams(params);

 // Execute the request
 HttpResponse response = httpClient.execute(httpget);

 // The response status
 //System.out.println(response.getStatusLine());
int code = response.getStatusLine().getStatusCode();

 // Get hold of the response entity
 HttpEntity entity = response.getEntity();

 // If the response does not enclose an entity, there is no need
 // to worry about connection release
 if (entity != null) {

        try {
            //code = httpClient.executeMethod(getMethod);

            //in = getMethod.getResponseBodyAsStream();
            in = entity.getContent();
            html = CharStreams.toString(new InputStreamReader(in));

        }


        catch (Exception except) {
            throw new Exception("URL: " + url + " returned response code " + code);
        }
        finally {

            try {
                //oin.close();
                in.close();
            }
            catch (Exception except) {}

            //getMethod.releaseConnection();
            connectionManager.closeIdleConnections(MAX_CONNECTION_IDLE_TIME, TimeUnit.MILLISECONDS);
            connectionManager.closeExpiredConnections();
        }

    }

    if (code <= 400){
        return html;
    } else {
        throw new Exception("URL: " + url + " returned response code " + code);
    }


}

我不想要重定向,但在 HttpClient 4.x 下,如果我启用重定向,那么我会得到一些不需要的,例如http://www.walmart.com/ => http://mobile.walmart.com/.在 HttpClient 3.x 下不会发生此类重定向.

I won't want redirects but under HttpClient 4.x if I enable redirects then I get some that are undesirable, e.g. http://www.walmart.com/ => http://mobile.walmart.com/. Under HttpClient 3.x no such redirects happens.

我需要做什么才能在不破坏代码的情况下将 HttpClient 3.x 迁移到 HttpClient 4.x?

What do I need to do to migrate HttpClient 3.x to HttpClient 4.x without breaking the code?

推荐答案

这不是 HttpClient 4.x 的问题,可能是目标服务器处理请求的方式,因为用户代理是 httpclient,它可能被处理为移动(目标服务器可能会考虑将其他可用浏览器(例如 chrome、mozilla 等)视为移动浏览器.)

It is not the issue with HttpClient 4.x, might be the way target server handle the request, since the user agent is httpclient, it may be handled as mobile (target server may consider other than available browsers like, i.e, chrome, mozilla etc as mobile.)

请使用以下代码手动设置

Please use below code to set it manually

 httpclient.getParams().setParameter(
            org.apache.http.params.HttpProtocolParams.USER_AGENT,
            "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2"
        );

这篇关于从 Commons HttpClient 迁移到 HttpComponents 客户端的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆