Android gets HTTPS page truncated

Question

I am fetching a web page on Android using HTTPS (ignoring the certificate as it is both self-signed and outdated, as seen here - don't ask, it's not my server :)).

I have defined my client class:

public class MyHttpClient extends DefaultHttpClient {


    public MyHttpClient() {
        super();
        final HttpParams params = getParams();
        HttpConnectionParams.setConnectionTimeout(params,
                REGISTRATION_TIMEOUT);
        HttpConnectionParams.setSoTimeout(params, REGISTRATION_TIMEOUT);
        ConnManagerParams.setTimeout(params, REGISTRATION_TIMEOUT);
    }

    @Override
    protected ClientConnectionManager createClientConnectionManager() {
        SchemeRegistry registry = new SchemeRegistry();
        registry.register(new Scheme("http", PlainSocketFactory
                .getSocketFactory(), 80));
        registry.register(new Scheme("https", new UnsecureSSLSocketFactory(), 443));
        return new SingleClientConnManager(getParams(), registry);
    }
}

where the UnsecureSSLSocketFactory mentioned is based on the suggestion given on the aforementioned topic.

I'm then using this class to fetch a page:

public class HTTPHelper {

    private final static String TAG = "HTTPHelper";
    private final static String CHARSET = "ISO-8859-1";

    public static final String USER_AGENT = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8 (.NET CLR 3.5.30729)";
    public static final String ACCEPT_CHARSET = "ISO-8859-1,utf-8;q=0.7,*;q=0.7";
    public static final String ACCEPT = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";


    /**
     * Sends an HTTP request
     * @param url
     * @param post
     * @return
     */
    public String sendRequest(String url, String post) throws ConnectionException {

        MyHttpClient httpclient = new MyHttpClient();

        HttpGet httpget = new HttpGet(url);
        httpget.addHeader("User-Agent", USER_AGENT);
        httpget.addHeader("Accept", ACCEPT);
        httpget.addHeader("Accept-Charset", ACCEPT_CHARSET);

        HttpResponse response;
        try {
            response = httpclient.execute(httpget);
        } catch (Exception e) {
            throw new ConnectionException(e.getMessage());
        }

        HttpEntity entity = response.getEntity();
        String pageSource = null; // declared here; the snippet as posted used it without a declaration

        try {
            pageSource = convertStreamToString(entity.getContent());
        } catch (Exception e) {
            throw new ConnectionException(e.getMessage());
        }
        finally {
            if (entity != null) {
                try {
                    entity.consumeContent();
                } catch (IOException e) {
                    throw new ConnectionException(e.getMessage());
                }
            }
        }

        httpclient.getConnectionManager().shutdown();
        return pageSource;

    }

    /**
     * Converts a stream to a string
     * @param is
     * @return
     */
    private static String convertStreamToString(InputStream is) 
    {
        try {
            BufferedReader reader = new BufferedReader(new InputStreamReader(is, CHARSET));
            StringBuilder stringBuilder = new StringBuilder();
            String line = null;
            try {
                while ((line = reader.readLine()) != null) {
                    stringBuilder.append(line + "\n");
                }
            } catch (IOException e) {
                Log.d(TAG, "Exception in convertStreamToString", e);
            } finally {
                try {
                    is.close();
                } catch (IOException e) {}
            }
            return stringBuilder.toString();
        } catch (Exception e) {
            throw new Error("Unsupported charset");
        }
    }

}

The page I get is truncated after about a hundred lines. It's truncated at a precise point, where a '_' (underscore) char is followed by an 'r' char. It's not the first underscore in the page.

I thought it might be an encoding issue, so I tried both UTF-8 and ISO-8859-1, but it's still truncated. If I open the page with Firefox, it reports the encoding as ISO-8859-1.

In case you are wondering, the webpage is https://ricarichiamoci.dsu.pisa.it/ and it gets truncated at line 169:

function ChangeOffset(NewOffset) {
  document.mainForm.last

It should be:

function ChangeOffset(NewOffset) {
  document.mainForm.last_record.value = NewOffset;

Does anyone have an idea of why the page is truncated?

Recommended answer

I figured out that the downloaded page is not truncated; the function I was using to print it (Log.d) truncates the string.

So the method that downloads the page source works fine, but Log.d() is probably not meant to print that much text.
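If the full page does need to appear in the log, one common workaround is to split the string into smaller pieces and log each piece separately. The sketch below is only an illustration: `LogChunker`, `split`, and `DEFAULT_CHUNK_SIZE` are hypothetical names that do not appear in the original post, and the 3000-character chunk size is an assumption chosen to stay below logcat's per-entry limit, which is on the order of 4 KB on typical devices.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not from the original post): splits long text into log-sized pieces. */
final class LogChunker {

    // Assumed chunk size in chars, picked to stay under logcat's per-entry limit.
    static final int DEFAULT_CHUNK_SIZE = 3000;

    private LogChunker() {}

    /** Returns consecutive substrings of at most chunkSize chars that together make up text. */
    static List<String> split(String text, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < text.length(); start += chunkSize) {
            int end = Math.min(text.length(), start + chunkSize);
            chunks.add(text.substring(start, end));
        }
        return chunks;
    }
}
```

On Android, each piece could then be logged in turn, for example `for (String chunk : LogChunker.split(pageSource, LogChunker.DEFAULT_CHUNK_SIZE)) Log.d(TAG, chunk);`. Logging `pageSource.length()` is also a quick way to confirm that the download itself is complete.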
