C# HttpWebResponse Header encoding


Problem description

I have the following problem: I request an address that I know responds with a 301 redirect.

I create the request with HttpWebRequest loHttp = (HttpWebRequest)WebRequest.Create(lcUrl); and set loHttp.AllowAutoRedirect = false; so that the redirect is not followed automatically.

I then read the response header in order to identify the new URL,

using loWebResponse.GetResponseHeader("Location");

The problem is that, since this URL contains Greek characters, the returned string is all jumbled up (due to encoding).

The full picture, code-wise:

HttpWebRequest loHttp = (HttpWebRequest)WebRequest.Create(lcUrl);
loHttp.ContentType = "application/x-www-form-urlencoded";
loHttp.Method = "GET";
loHttp.Timeout = 10000; // milliseconds

// Do not follow the 301 automatically, so the Location header can be inspected.
loHttp.AllowAutoRedirect = false;
HttpWebResponse loWebResponse = (HttpWebResponse)loHttp.GetResponse();

string url = loWebResponse.Headers["Location"];

Solution

If you keep the default behavior (loHttp.AllowAutoRedirect = true) and your code doesn't work (you don't get redirected to the new resource), it means that the server is not encoding the Location header correctly. Does the redirect work in the browser?

For example if the redirect url is http://site/Μία_Σελίδα the Location header must look like http://site/%CE%95%CE%BD%CE%B9%CE%B1%CE%AF%CE%BF_%CE%94%CE%B5%CE%.
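The percent-encoded form of a path segment can be computed rather than written by hand. The sketch below is only an illustration: the class and method names are hypothetical, and it assumes the server is expected to emit UTF-8 percent-escapes, which is what Uri.EscapeDataString produces.

```csharp
using System;

public static class LocationEncoding
{
    // Percent-encodes a single path segment as escaped UTF-8 bytes.
    public static string EncodeSegment(string segment)
    {
        return Uri.EscapeDataString(segment);
    }

    // Builds a Location value for the example host "site" used above.
    public static string BuildLocation(string segment)
    {
        return "http://site/" + EncodeSegment(segment);
    }
}
```

Calling LocationEncoding.BuildLocation("Μία_Σελίδα") returns http://site/%CE%9C%CE%AF%CE%B1_%CE%A3%CE%B5%CE%BB%CE%AF%CE%B4%CE%B1, since the underscore is an unreserved character and is left as-is.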


UPDATE:

After further investigating the issue I begin to suspect that there's something strange with HttpWebRequest. When the request is sent the server sends the following response:

HTTP/1.1 301 Moved Permanently
Date: Fri, 11 Dec 2009 17:01:04 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Location: http://www.site.com/buy/κινητή-σταθερή-τηλεφωνία/c/cn69569/
Content-Length: 112
Content-Type: text/html; Charset=UTF-8
Cache-control: private
Connection: close
Set-Cookie: BIGipServerpool_webserver_gr=1007732746.36895.0000; path=/


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

As we can see, the Location header contains Greek characters which are not URL-encoded. I am not quite sure if this is valid according to the HTTP specification. What we can say for sure is that a web browser interprets it correctly.

Here comes the interesting part. It seems that HttpWebRequest doesn't use UTF-8 encoding to parse the response headers because when analyzing the Location header it gives: http://www.site.com/buy/κινηÏή-ÏÏαθεÏή-ÏηλεÏÏνία/c/cn69569/, which of course is wrong and when it tries to redirect to this location the server responds with a new redirect and so on until the maximum number of redirects is reached and an exception is thrown.
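If the headers were in fact decoded one byte per character as ISO-8859-1 (an assumption; the observation above only establishes that UTF-8 was not used), the original text could be recovered on the client by reversing that step. A minimal sketch with hypothetical names, which also simulates the suspected failure so the round trip can be checked without a server:

```csharp
using System;
using System.Text;

public static class HeaderRepair
{
    // ISO-8859-1 maps every byte 0x00-0xFF to the code point of the same value,
    // so encoding a mis-decoded string with it restores the original bytes.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Simulates the suspected failure: UTF-8 bytes read one byte per char.
    public static string MisdecodeAsLatin1(string text)
    {
        return Latin1.GetString(Encoding.UTF8.GetBytes(text));
    }

    // Reverses it: recover the raw bytes, then decode them as UTF-8.
    public static string RepairUtf8(string misdecoded)
    {
        return Encoding.UTF8.GetString(Latin1.GetBytes(misdecoded));
    }
}
```

Under that assumption, HeaderRepair.RepairUtf8(loWebResponse.Headers["Location"]) would give back the Greek URL when AllowAutoRedirect is false. If the header was decoded with some other encoding, this round trip will not help.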

I couldn't find any way to specify the encoding used by HttpWebRequest when parsing the response headers. If we use TcpClient manually, it works perfectly fine:

using (var client = new TcpClient())
{
    client.Connect("www.site.com", 80);

    using (var stream = client.GetStream())
    {
        var writer = new StreamWriter(stream);
        writer.WriteLine("GET /default/defaultcatg.asp?catg=69569 HTTP/1.1");
        writer.WriteLine("Host: www.site.com");
        writer.WriteLine("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090805 Shiretoko/3.5.2");
        writer.WriteLine("Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        writer.WriteLine("Accept-Language: en-us,en;q=0.5");
        writer.WriteLine("Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7");
        writer.WriteLine("Connection: close");
        writer.WriteLine(string.Empty);
        writer.WriteLine(string.Empty);
        writer.WriteLine(string.Empty);
        writer.Flush();

        var reader = new StreamReader(stream);
        var response = reader.ReadToEnd();
        // When looking at the response it correctly reads 
        // Location: http://www.site.com/buy/κινητή-σταθερή-τηλεφωνία/c/cn69569/
    }
}
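Once the raw response text has been read this way, the Location value still has to be picked out of it. The helper below is a hypothetical sketch, not part of the original answer; it assumes the response string starts with the status line and that headers are not folded across lines.

```csharp
using System;
using System.IO;

public static class RawResponse
{
    // Returns the value of the first header with the given name, or null.
    // Scanning stops at the blank line that ends the header section.
    public static string GetHeader(string rawResponse, string name)
    {
        using (var reader = new StringReader(rawResponse))
        {
            reader.ReadLine(); // skip the status line
            string line;
            while ((line = reader.ReadLine()) != null && line.Length > 0)
            {
                int colon = line.IndexOf(':');
                if (colon <= 0) continue;
                if (string.Equals(line.Substring(0, colon).Trim(), name,
                                  StringComparison.OrdinalIgnoreCase))
                {
                    return line.Substring(colon + 1).Trim();
                }
            }
        }
        return null;
    }
}
```

With the response variable from the snippet above, RawResponse.GetHeader(response, "Location") would return the redirect target. Header names are compared case-insensitively, and only the first colon is treated as the separator so that the "http://" in the value is preserved.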

So I am really puzzled by this behavior. Is there any way to specify the correct encoding used by HttpWebRequest? Maybe some request header should be set?

As a workaround, you could try modifying the ASP page that performs the redirect so that it URL-encodes the Location header. For example, when you call Response.Redirect(location) in an ASP.NET application, the location is automatically URL-encoded: any non-ASCII characters are converted to percent-escaped UTF-8 bytes.

For example, if you call Response.Redirect("http://www.site.com/buy/κινητή-σταθερή-τηλεφωνία/c/cn69569/"); in an ASP.NET application, the Location header will be set to:

http://www.site.com/buy/%ce%ba%ce%b9%ce%bd%ce%b7%cf%84%ce%ae-%cf%83%cf%84%ce%b1%ce%b8%ce%b5%cf%81%ce%ae-%cf%84%ce%b7%ce%bb%ce%b5%cf%86%cf%89%ce%bd%ce%af%ce%b1/c/cn69569

It seems that this is not the case with classic ASP.
