HttpWebRequest versus browser request


Problem description

I used to retrieve data from a site (nseindia.com) using a C# program. Recently, however, NSE made some changes, and any request from a program is now answered with a "403 Forbidden" error. Can anyone tell me how to make the request from the program identical to the one a browser sends? I tried setting the UserAgent property, but that's not working. The code is pasted below.

string DownloadData(string CompanyName)
{
    string address = string.Format(@"http://www.nseindia.com");
    //http://www.nseindia.com/marketinfo/sym_map/symbolMapping.jsp?dataType=priceVolumeDeliverable&symbol=abb&
    //http://www.nseindia.com/content/equities/scripvol/datafiles/01-12-2008-TO-29-12-2010ABBALLN.csv
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(address);
    // Note: this User-Agent value appears truncated in the original post.
    request.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3";

    string strData = "";
    try
    {
        request.Proxy = WebProxy.GetDefaultProxy();
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        System.IO.Stream stream = response.GetResponseStream();
        System.Text.Encoding ec = System.Text.Encoding.GetEncoding("utf-8");
        System.IO.StreamReader reader = new System.IO.StreamReader(stream, ec);
        strData = reader.ReadToEnd();
        if (strData.Contains("Error"))
        {
            Exception e = new Exception(strData);
            throw e;
        }
    }
    catch(Exception e)
    {
        Console.WriteLine(e.ToString());
    }

    return strData;
}

Solution

Try setting the Accept HTTP header; e.g.:

request.Accept = "text/html,application/xhtml+xml,application/xml";

I arrived at this suggestion by running Fiddler2 (as suggested in a comment to another answer) in order to see how my browser (Firefox 4 Beta) makes the HTTP request to the website you mentioned.

I then set all of those headers in the code and eliminated them one by one. As soon as I removed the Accept header, the server returned the 403 status code.
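The elimination step can also be automated. The following is only a rough sketch of one variation: it drops a single header at a time from the full set and records which removals change the status code. The names `HeaderElimination`, `FindRequiredHeaders`, and the `probe` callback are hypothetical; `probe` is assumed to send a request with the given headers and return the HTTP status code.

```csharp
using System;
using System.Collections.Generic;

static class HeaderElimination
{
    // Returns the names of headers whose removal makes the status code
    // differ from expectedStatus. The probe callback is a hypothetical
    // hook: it should send a request with exactly the given headers and
    // return the resulting HTTP status code.
    public static List<string> FindRequiredHeaders(
        IDictionary<string, string> allHeaders,
        Func<IDictionary<string, string>, int> probe,
        int expectedStatus = 200)
    {
        var required = new List<string>();
        foreach (var name in allHeaders.Keys)
        {
            // Copy the full set and drop exactly one header for this trial.
            var trial = new Dictionary<string, string>(allHeaders);
            trial.Remove(name);

            if (probe(trial) != expectedStatus)
            {
                required.Add(name);
            }
        }
        return required;
    }
}
```

In a real run, `probe` would build an `HttpWebRequest` from the trial headers and catch the `WebException` raised for a 403 in order to read the status code from its response.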

Exact request made by my browser:

GET / HTTP/1.0
Host: www.nseindia.com
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:2.0b8) Gecko/20100101 Firefox/4.0b8
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: de,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
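For reference, the headers in this capture could be applied to an `HttpWebRequest` roughly as follows. This is a minimal sketch, not a tested fix for the site: the class name `BrowserLikeRequest` and its `Create` method are made up for illustration, and on recent .NET versions `WebRequest.Create` is marked obsolete in favor of `HttpClient`.

```csharp
using System;
using System.Net;

static class BrowserLikeRequest
{
    // Builds (but does not send) a request carrying the headers
    // from the browser capture.
    public static HttpWebRequest Create(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);

        // Accept and User-Agent are set through dedicated properties;
        // the value must not include the header name itself.
        request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        request.UserAgent = "Mozilla/5.0 (Windows NT 5.1; rv:2.0b8) Gecko/20100101 Firefox/4.0b8";

        // Other headers go through the Headers collection.
        request.Headers["Accept-Language"] = "de,en;q=0.5";
        request.Headers["Accept-Charset"] = "ISO-8859-1,utf-8;q=0.7,*;q=0.7";

        // Advertises gzip/deflate support and decompresses the
        // response body transparently when it is read.
        request.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;

        return request;
    }
}
```

Using `AutomaticDecompression` instead of setting `Accept-Encoding` by hand matters here: with a manually set header, `ReadToEnd()` would return the still-compressed bytes.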

PS: The other URIs you mention in the comments seem to be invalid. One is incomplete and yields a 500 Internal Server Error, the other yields a 404 Not Found response.
