Cannot download this webpage

Question

I am having trouble downloading the source of a webpage. I can view the page fine in any browser, and I can also run a web spider and download the first page without problems. But whenever I run the code below to grab the source of that page, I always get a 403 Forbidden error.

As soon as the request is sent, the 403 Forbidden error is returned. Does anyone have any ideas?

string urlAddress = "http://www.brownells.com/";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlAddress);

HttpWebResponse response = (HttpWebResponse)request.GetResponse();

if (response.StatusCode == HttpStatusCode.OK)
{
    Stream receiveStream = response.GetResponseStream();
    StreamReader readStream = null;

    .................................

    response.Close();
    readStream.Close();
}

I'd appreciate any help.

Recommended answer

As Florian Braun said, the site checks the User-Agent header and rejects the download with a 403 error. You can try to get past the check by changing the user agent:

string urlAddress = "http://www.brownells.com/";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlAddress);
//this is the UserAgent string of Firefox 36.0, use your favourite browser's user agent string :)
request.UserAgent = "Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0";
//get response as usual
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
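
One detail the snippets here do not show: `GetResponse()` throws a `WebException` for error statuses such as 403, so a check like `response.StatusCode == HttpStatusCode.OK` never gets to see the failure. Below is a minimal sketch of how the user-agent fix could be combined with that error handling. The helper names `CreateRequestWithUserAgent` and `DownloadSource` are hypothetical (they are not from the original answer), and the top-level-statement layout assumes C# 9 or later; on recent .NET versions `WebRequest.Create` also produces an obsolescence warning.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper: builds a request that sends the given User-Agent header.
// Creating the request does not contact the server yet.
HttpWebRequest CreateRequestWithUserAgent(string url, string userAgent)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
    request.UserAgent = userAgent;
    return request;
}

// Hypothetical helper: returns the page source, or null when the server
// answers with an error status such as 403 Forbidden.
string DownloadSource(string url, string userAgent)
{
    HttpWebRequest request = CreateRequestWithUserAgent(url, userAgent);
    try
    {
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }
    catch (WebException ex) when (ex.Response is HttpWebResponse errorResponse)
    {
        // GetResponse() throws for 4xx/5xx statuses; the status code is
        // available on the response attached to the exception.
        using (errorResponse)
        {
            Console.Error.WriteLine("Request failed: {0} {1}",
                (int)errorResponse.StatusCode, errorResponse.StatusDescription);
        }
        return null;
    }
}
```

Calling `DownloadSource(urlAddress, "Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0")` would then either return the HTML or log the status the server sent back. Whether a given user agent string is accepted depends on the site and may change over time.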


The following works for me; hopefully others will benefit from it:

// url holds the address of the page to download
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
// Decompress gzip/deflate responses automatically
request.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
request.UserAgent = @"Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36";
request.Accept = @"text/html";

using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
    if (response.StatusCode == HttpStatusCode.OK)
    {
        using (Stream dataStream = response.GetResponseStream())
        using (StreamReader streamread = new StreamReader(dataStream))
        {
            doc.Load(streamread);
        }
    }
}