How to get the full content from HttpWebResponse if the returned content is Transfer-Encoding: chunked?


Question

I am writing a program to download HTML pages from other websites. For some particular websites, I cannot get the full HTML; I only get partial content. The servers with this problem send their data with "Transfer-Encoding: chunked", and I suspect this is the cause of the problem.

These are the response headers returned by the server:

Transfer-Encoding: chunked
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Content-Type: text/html; charset=UTF-8
Date: Sun, 11 Sep 2011 09:46:23 GMT
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Server: nginx/1.0.6

This is my code:

HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;
HttpWebResponse response;
CookieContainer cookie = new CookieContainer();
request.CookieContainer = cookie;
request.AllowAutoRedirect = true;
request.KeepAlive = true;
request.UserAgent =
    @"Mozilla/5.0 (Windows NT 6.1; rv:6.0.2) Gecko/20100101 Firefox/6.0.2 FirePHP/0.6";
request.Accept = @"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
string html = string.Empty;
response = request.GetResponse() as HttpWebResponse;

using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
    html = reader.ReadToEnd();
}

I can only get partial HTML (I think it is the first chunk from the server). Could anyone help? Is there a solution?

Thanks!

Recommended answer

You can't rely on ReadToEnd to read chunked data here. Instead, read directly from the response stream in a loop, calling Stream.Read until it returns 0, and decode the bytes from each read as you go.

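// Requires: using System.IO; using System.Text;
StringBuilder sb = new StringBuilder();
byte[] buf = new byte[8192];
// A stateful decoder keeps multi-byte characters intact across reads
Decoder decoder = Encoding.UTF8.GetDecoder(); // just hardcoding UTF8 here
char[] chars = new char[Encoding.UTF8.GetMaxCharCount(buf.Length)];
int count;

using (Stream resStream = response.GetResponseStream())
{
    do
    {
        // Read returns 0 only once the end of the stream is reached
        count = resStream.Read(buf, 0, buf.Length);
        if (count != 0)
        {
            int charCount = decoder.GetChars(buf, 0, count, chars, 0);
            sb.Append(chars, 0, charCount);
        }
    } while (count > 0);
}
string html = sb.ToString();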
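
The loop above hardcodes UTF-8. As a minimal sketch, assuming the server reports its charset in the Content-Type header (as in the headers shown in the question), the encoding could instead be taken from response.CharacterSet. The names ResponseText, ResolveEncoding, and ReadAll are hypothetical and used only for illustration. The sketch buffers the raw bytes and decodes them once at the end, so a multi-byte character split across two reads is not corrupted.

```csharp
using System;
using System.IO;
using System.Text;

public static class ResponseText
{
    // Hypothetical helper: map a charset name (for example the value of
    // HttpWebResponse.CharacterSet) to an Encoding, falling back to UTF-8
    // when the name is missing or not recognized.
    public static Encoding ResolveEncoding(string charset)
    {
        if (string.IsNullOrWhiteSpace(charset))
        {
            return Encoding.UTF8;
        }
        try
        {
            // Some servers quote the charset value, so strip quotes first.
            return Encoding.GetEncoding(charset.Trim().Trim('"'));
        }
        catch (ArgumentException)
        {
            return Encoding.UTF8;
        }
    }

    // Hypothetical helper: copy the stream to memory until it ends,
    // then decode all of the bytes in a single call.
    public static string ReadAll(Stream stream, Encoding encoding)
    {
        using (var buffer = new MemoryStream())
        {
            stream.CopyTo(buffer);
            return encoding.GetString(buffer.ToArray());
        }
    }
}
```

With the response object from the question, this might be called as ResponseText.ReadAll(response.GetResponseStream(), ResponseText.ResolveEncoding(response.CharacterSet)).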
