C# HttpWebResponse Header encoding

Problem description

I have the following problem. I request an address that I know responds with a 301 redirect.

I create the request with HttpWebRequest loHttp = (HttpWebRequest)WebRequest.Create(lcUrl); and set loHttp.AllowAutoRedirect = false; so that I am not redirected.

I then read the response header to find the new URL, using loWebResponse.GetResponseHeader("Location");

The problem is that, because this URL contains Greek characters, the returned string is garbled (an encoding issue).

The full code:

// Requires: using System.Net;
HttpWebRequest loHttp = (HttpWebRequest)WebRequest.Create(lcUrl);
loHttp.ContentType = "application/x-www-form-urlencoded";
loHttp.Method = "GET";
loHttp.Timeout = 10000; // milliseconds

// Do not follow the 301 automatically, so the Location header can be read directly.
loHttp.AllowAutoRedirect = false;
HttpWebResponse loWebResponse = (HttpWebResponse)loHttp.GetResponse();

string url = loWebResponse.Headers["Location"];

Solution

If you keep the default behavior (loHttp.AllowAutoRedirect = true) and your code still doesn't work (you don't get redirected to the new resource), it means the server is not encoding the Location header correctly. Is the redirect working in the browser?

For example, if the redirect URL is http://site/Μία_Σελίδα, the Location header must be percent-encoded, along the lines of http://site/%CE%95%CE%BD%CE%B9%CE%B1%CE%AF%CE%BF_%CE%94%CE%B5%CE%.
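
To make the percent-encoded form concrete, here is a minimal sketch that encodes a path segment as UTF-8 with Uri.EscapeDataString. The class and method names (LocationEncoding, EncodeSegment, BuildLocation) are hypothetical and only for illustration; the host "site" is the placeholder used above.

```csharp
using System;

public static class LocationEncoding
{
    // Percent-encodes a single path segment as UTF-8.
    // Hypothetical helper for illustration, not part of the original answer.
    public static string EncodeSegment(string segment)
    {
        // EscapeDataString leaves only unreserved characters (letters, digits, - _ . ~) as-is.
        return Uri.EscapeDataString(segment);
    }

    // Builds a redirect target on the placeholder host "site".
    public static string BuildLocation(string segment)
    {
        return "http://site/" + EncodeSegment(segment);
    }
}
```

Calling LocationEncoding.BuildLocation("Μία_Σελίδα") yields http://site/%CE%9C%CE%AF%CE%B1_%CE%A3%CE%B5%CE%BB%CE%AF%CE%B4%CE%B1, which contains only ASCII characters and so survives any header decoding.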


UPDATE:

After investigating the issue further, I begin to suspect that something strange is going on with HttpWebRequest. When the request is sent, the server returns the following response:

HTTP/1.1 301 Moved Permanently
Date: Fri, 11 Dec 2009 17:01:04 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Location: http://www.site.com/buy/κινητή-σταθερή-τηλεφωνία/c/cn69569/
Content-Length: 112
Content-Type: text/html; Charset=UTF-8
Cache-control: private
Connection: close
Set-Cookie: BIGipServerpool_webserver_gr=1007732746.36895.0000; path=/


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

As we can see, the Location header contains Greek characters that are not URL-encoded. I am not quite sure whether this is valid according to the HTTP specification. What we can say for sure is that a web browser interprets it correctly.

Here comes the interesting part. It seems that HttpWebRequest doesn't use UTF-8 to parse the response headers, because when it reads the Location header it produces http://www.site.com/buy/κινηÏή-ÏÏαθεÏή-ÏηλεÏÏνία/c/cn69569/, which of course is wrong. When it then tries to follow the redirect to this location, the server responds with a new redirect, and so on until the maximum number of redirects is reached and an exception is thrown.

I couldn't find any way to specify the encoding HttpWebRequest uses when parsing the response headers. If we use TcpClient manually, it works perfectly fine:

// Requires: using System.IO; using System.Net.Sockets;
using (var client = new TcpClient())
{
    client.Connect("www.site.com", 80);

    using (var stream = client.GetStream())
    {
        // HTTP requires CRLF line endings regardless of platform.
        var writer = new StreamWriter(stream) { NewLine = "\r\n" };
        writer.WriteLine("GET /default/defaultcatg.asp?catg=69569 HTTP/1.1");
        writer.WriteLine("Host: www.site.com");
        writer.WriteLine("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090805 Shiretoko/3.5.2");
        writer.WriteLine("Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        writer.WriteLine("Accept-Language: en-us,en;q=0.5");
        writer.WriteLine("Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7");
        writer.WriteLine("Connection: close");
        writer.WriteLine(); // a single empty line ends the request headers
        writer.Flush();

        // StreamReader decodes as UTF-8 by default.
        var reader = new StreamReader(stream);
        var response = reader.ReadToEnd();
        // When looking at the response it correctly reads
        // Location: http://www.site.com/buy/κινητή-σταθερή-τηλεφωνία/c/cn69569/
    }
}
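
If you go the raw-socket route, you still have to pull the Location value out of the response text yourself. The following is a rough sketch of that step, assuming the whole response has already been read into a string as above. RawResponse and GetHeader are made-up names, and the parser is deliberately simple (it does not handle folded or repeated headers).

```csharp
using System;
using System.IO;

public static class RawResponse
{
    // Returns the value of the named header from a raw HTTP response,
    // or null if the header is not present. Hypothetical helper for illustration.
    public static string GetHeader(string rawResponse, string name)
    {
        using (var reader = new StringReader(rawResponse))
        {
            string line;
            // The header block ends at the first empty line; the body follows it.
            while ((line = reader.ReadLine()) != null && line.Length > 0)
            {
                int colon = line.IndexOf(':');
                if (colon <= 0)
                {
                    continue; // e.g. the status line "HTTP/1.1 301 Moved Permanently"
                }

                string headerName = line.Substring(0, colon).Trim();
                if (string.Equals(headerName, name, StringComparison.OrdinalIgnoreCase))
                {
                    return line.Substring(colon + 1).Trim();
                }
            }
        }
        return null;
    }
}
```

With the response variable from the snippet above, RawResponse.GetHeader(response, "Location") would return the redirect target with the Greek characters intact, since the StreamReader already decoded the bytes as UTF-8.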

So I am really puzzled by this behavior. Is there any way to specify the correct encoding used by HttpWebRequest? Maybe some request header should be set?
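
One client-side workaround that may be worth trying is to undo the wrong decoding after the fact. The sketch below assumes that HttpWebRequest maps each header byte straight to one char (effectively ISO-8859-1) and that no bytes were dropped or replaced along the way; if the runtime already substituted "?" for the bytes, this cannot recover them. HeaderDecoding and ReinterpretLatin1AsUtf8 are hypothetical names.

```csharp
using System.Text;

public static class HeaderDecoding
{
    // Re-reads a header value that was decoded byte-per-char (ISO-8859-1)
    // although the server actually sent UTF-8.
    // Assumption: the original bytes are still recoverable from the string.
    public static string ReinterpretLatin1AsUtf8(string headerValue)
    {
        // Turn every char back into the single byte it came from...
        byte[] rawBytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(headerValue);
        // ...and decode those bytes as UTF-8 instead.
        return Encoding.UTF8.GetString(rawBytes);
    }
}
```

Applied to the code in the question, that would be string url = HeaderDecoding.ReinterpretLatin1AsUtf8(loWebResponse.Headers["Location"]); with AllowAutoRedirect left at false so that you can request the corrected URL yourself. Pure ASCII values pass through unchanged, so the call is harmless for correctly encoded headers.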

As a server-side workaround, you could try modifying the ASP page that performs the redirect so that it URL-encodes the Location header. For example, when you call Response.Redirect(location) in an ASP.NET application, the location is automatically URL-encoded and any non-ASCII characters are converted to their percent-encoded form.

So if you call Response.Redirect("http://www.site.com/buy/κινητή-σταθερή-τηλεφωνία/c/cn69569/"); in an ASP.NET application, the Location header will be set to:

http://www.site.com/buy/%ce%ba%ce%b9%ce%bd%ce%b7%cf%84%ce%ae-%cf%83%cf%84%ce%b1%ce%b8%ce%b5%cf%81%ce%ae-%cf%84%ce%b7%ce%bb%ce%b5%cf%86%cf%89%ce%bd%ce%af%ce%b1/c/cn69569

It seems that this is not the case with classic ASP.
