Downloading a web page with Android


Question

I'm downloading a web page and then extracting some data from it with a regex (don't yell at me; I know a proper parser would be better, but this is a very simple machine-generated page). This works fine in the emulator, and on my phone when connected over Wi-Fi, but not over 3G: the string returned is not the same, and I don't get a match. I imagine it has something to do with packet size or latency, but I can't figure it out.

My code:

public static String getPage(URL url) throws IOException {
    final URLConnection connection = url.openConnection();
    HttpGet httpRequest = null;

    try {
        httpRequest = new HttpGet(url.toURI());
    } catch (URISyntaxException e) {
        e.printStackTrace();
    }

    HttpClient httpclient = new DefaultHttpClient();
    HttpResponse response = (HttpResponse) httpclient.execute(httpRequest);

    HttpEntity entity = response.getEntity();
    BufferedHttpEntity bufHttpEntity = new BufferedHttpEntity(entity); 
    InputStream stream = bufHttpEntity.getContent();

    String ct = connection.getContentType();

    final BufferedReader reader;

    if (ct.indexOf("charset=") != -1) {
        ct = ct.substring(ct.indexOf("charset=") + 8);
        reader = new BufferedReader(new InputStreamReader(stream, ct));
    } else {
        reader = new BufferedReader(new InputStreamReader(stream));
    }

    final StringBuilder sb = new StringBuilder();

    String line;
    while ((line = reader.readLine()) != null) {
        sb.append(line);
    }

    stream.close();
    return sb.toString();
}

Is my poor connection causing this, or is there a bug in the code? Either way, how do I solve it?

Update: The file downloaded over 3G is 201 bytes smaller than the one downloaded over Wi-Fi. While both are clearly the correct page, the 3G copy is missing a whole bunch of whitespace, and also some HTML comments that are present in the original page, which I find a little strange. Does Android fetch pages differently over 3G in order to reduce file size?

Recommended answer

The User-Agent (UA) should not change depending on whether you access the web page over 3G or Wi-Fi. As mentioned before, get rid of the URLConnection: the code is otherwise complete using the HttpClient approach, and the URLConnection is only used to read the content type. With HttpClient you can set the UA using:

httpclient.getParams().setParameter(CoreProtocolPNames.USER_AGENT, userAgent);
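
Dropping the URLConnection means the charset has to come from the HttpClient response instead. In the question's code, `connection.getContentType()` implicitly opens a second, separate connection to the same URL, and it can return null, which would make `ct.indexOf(...)` throw. The helper below is a minimal sketch of a null-safe charset lookup. The class and method names (`ContentTypeCharset`, `charsetOf`) are hypothetical and not part of the thread, and it assumes the header value is taken from the response itself, for example from `entity.getContentType()` when that header is present.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not from the original thread.
public class ContentTypeCharset {

    /**
     * Returns the charset named in a Content-Type header value such as
     * "text/html; charset=ISO-8859-1". Falls back to the given charset when
     * the value is null, has no charset parameter, or names a charset that
     * this runtime does not recognise.
     */
    public static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive check for a leading "charset=".
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim();
                // The value may be quoted: charset="utf-8".
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=ISO-8859-1", StandardCharsets.UTF_8));
        System.out.println(charsetOf("text/html", StandardCharsets.UTF_8));
        System.out.println(charsetOf(null, StandardCharsets.UTF_8));
    }
}
```

The returned `Charset` can be passed straight to `new InputStreamReader(stream, charset)`, so the reader no longer depends on a second request or on the platform default encoding.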

One last thing: it might be silly, but maybe the web page is dynamic? Is that possible?
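
Whatever causes the difference, the update in the question says the 3G copy lacks only whitespace and HTML comments. One way to make the regex indifferent to that, sketched here as an assumption rather than something proposed in the thread, is to normalize the page before matching. The class name `PageNormalizer` and the sample markup are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not from the original thread.
public class PageNormalizer {
    private static final Pattern COMMENT = Pattern.compile("<!--.*?-->", Pattern.DOTALL);
    private static final Pattern BETWEEN_TAGS = Pattern.compile(">\\s+<");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    /**
     * Strips HTML comments, removes whitespace between adjacent tags, and
     * collapses any remaining whitespace runs to a single space.
     */
    public static String normalize(String html) {
        String s = COMMENT.matcher(html).replaceAll("");
        s = BETWEEN_TAGS.matcher(s).replaceAll("><");
        return WHITESPACE.matcher(s).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        // Made-up samples standing in for the Wi-Fi and 3G copies of a page.
        String wifi = "<td>\n    <!-- price -->\n    <b>42</b>\n</td>";
        String mobile = "<td><b>42</b></td>";
        System.out.println(normalize(wifi).equals(normalize(mobile)));
    }
}
```

A regex written against the normalized form then sees the same input on both connections. This only covers the two differences reported in the update; if the page is dynamic, as suggested above, normalization alone would not be enough.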
