Encoding trouble with HttpWebResponse


Problem description

Here is a snippet of the code:

HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(request.RawUrl);
WebRequest.DefaultWebProxy = null; // Ensure that we will not loop by going again in the proxy
HttpWebResponse response = (HttpWebResponse)webRequest.GetResponse();
string charSet = response.CharacterSet;
Encoding encoding;
if (String.IsNullOrEmpty(charSet))
    encoding = Encoding.Default;
else
    encoding = Encoding.GetEncoding(charSet);

StreamReader resStream = new StreamReader(response.GetResponseStream(), encoding);
return resStream.ReadToEnd();

The problem shows up if I test with: http://www.google.fr

None of the "é" characters display correctly. I have tried changing ASCII to UTF8 and they still display incorrectly. I have tested the html file in a browser, and the browser displays the html text well, so I am pretty sure the problem is in the method I use to download the html file.

What should I change?


Update 1: Code and test file changed

Solution

Firstly, the easier way of writing that code is to use a StreamReader and ReadToEnd:

HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(myURL);
using (HttpWebResponse response = (HttpWebResponse)webRequest.GetResponse())
{
    using (Stream resStream = response.GetResponseStream())
    {
        StreamReader reader = new StreamReader(resStream, Encoding.???);
        return reader.ReadToEnd();
    }
}

Then it's "just" a matter of finding the right encoding. How did you create the file? If it's with Notepad then you probably want Encoding.Default - but that's obviously not portable, as it's the default encoding for your PC.
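
To see why the wrong encoding mangles "é", here is a minimal sketch (the class and method names are made up for illustration, not part of the original code). It takes the UTF-8 bytes of a string and decodes them as ISO-8859-1, which is roughly what happens when a UTF-8 page is read with a single-byte encoding:

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Hypothetical helper: encodes `text` as UTF-8, then decodes those
    // bytes as if they were ISO-8859-1 (one character per byte).
    public static string DecodeUtf8AsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        return latin1.GetString(utf8Bytes);
    }
}
```

Calling MojibakeDemo.DecodeUtf8AsLatin1("\u00E9") turns the single character "é" (UTF-8 bytes 0xC3 0xA9) into the two characters "Ã©". If the downloaded page shows that kind of pattern, the bytes are probably fine and only the encoding passed to StreamReader is wrong.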

A well-run web server will indicate the encoding in its response headers. That said, the response headers sometimes claim one thing while the HTML itself claims another.
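
One way to use the header value defensively is sketched below. CharsetHelper and ResolveEncoding are hypothetical names, and the fallback encoding is the caller's choice (an assumption here, not something the server tells you). The sketch only covers the header side; it does not inspect the HTML for a conflicting charset declaration.

```csharp
using System;
using System.Text;

public static class CharsetHelper
{
    // Hypothetical helper: maps a charset name (for example the value of
    // HttpWebResponse.CharacterSet) to an Encoding. Returns `fallback`
    // when the name is missing or is not recognised by the runtime.
    public static Encoding ResolveEncoding(string charSet, Encoding fallback)
    {
        if (string.IsNullOrWhiteSpace(charSet))
            return fallback;

        // Some servers send the charset value wrapped in quotes.
        string name = charSet.Trim().Trim('"', '\'');

        try
        {
            return Encoding.GetEncoding(name);
        }
        catch (ArgumentException)
        {
            // Encoding.GetEncoding throws ArgumentException for unknown names.
            return fallback;
        }
    }
}
```

With this in place, the reader could be created as new StreamReader(resStream, CharsetHelper.ResolveEncoding(response.CharacterSet, Encoding.UTF8)). Choosing UTF-8 as the fallback is a guess that suits many modern pages; Encoding.Default would instead tie the result to the machine the code runs on.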
