Cannot download this webpage
Problem description
I am having trouble downloading the source of a webpage. I can view the page fine in any browser, and I can also run a web spider and download the first page with no problem. But whenever I run the code below to grab the source of that page, I always get a 403 Forbidden error.
The 403 Forbidden error is returned as soon as the request is sent. Does anyone have any ideas?
string urlAddress = "http://www.brownells.com/";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlAddress);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
if (response.StatusCode == HttpStatusCode.OK)
{
    Stream receiveStream = response.GetResponseStream();
    StreamReader readStream = null;
    .................................
    response.Close();
    readStream.Close();
}
Appreciate any help.
Recommended answer
As Florian Braun said, the site checks the user agent and denies the download with a 403 error. You can try to trick the website by changing the user agent:
string urlAddress = "http://www.brownells.com/";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlAddress);
//this is the UserAgent string of Firefox 36.0, use your favourite browser's user agent string :)
request.UserAgent = "Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0";
//get response as usual
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
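One detail is worth keeping in mind when trying this out: `GetResponse()` throws a `WebException` when the server replies with an error status such as 403, so a `StatusCode == HttpStatusCode.OK` check never gets to see the failure. The following is a minimal sketch of the same user-agent approach with that case handled. The class name `PageDownloader` and its two methods are hypothetical names chosen for illustration, and whether the site accepts this particular user agent string today is an assumption.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper class, not part of the original answer.
public static class PageDownloader
{
    // Builds a request that identifies itself with a browser-style User-Agent.
    // The default value is the Firefox 36.0 string used in the answer above.
    public static HttpWebRequest CreateRequest(
        string url,
        string userAgent = "Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0")
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = userAgent;
        return request;
    }

    // Returns the page source, or null if the server answers with an HTTP error.
    public static string Download(string url)
    {
        HttpWebRequest request = CreateRequest(url);
        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            using (Stream stream = response.GetResponseStream())
            using (StreamReader reader = new StreamReader(stream))
            {
                return reader.ReadToEnd();
            }
        }
        catch (WebException ex) when (ex.Response is HttpWebResponse)
        {
            // GetResponse() throws for 4xx/5xx replies, so a 403 ends up here.
            using (HttpWebResponse error = (HttpWebResponse)ex.Response)
            {
                Console.Error.WriteLine(
                    "Request failed: {0} {1}", (int)error.StatusCode, error.StatusDescription);
            }
            return null;
        }
    }
}
```

Calling `PageDownloader.Download("http://www.brownells.com/")` would then return either the HTML or `null`, with the status code written to standard error in the failing case.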
This works; hopefully others will benefit from it:
// url is the address of the page to download
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
// accept gzip/deflate responses and decompress them automatically
request.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
// send a browser-style user agent (Chrome 42) and ask for HTML
request.UserAgent = @"Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36";
request.Accept = @"text/html";
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
if (response.StatusCode == HttpStatusCode.OK)
{
    Stream dataStream = response.GetResponseStream();
    using (StreamReader streamread = new StreamReader(dataStream))
    {
        // parse the response body into the HtmlAgilityPack document
        doc.Load(streamread);
    }
}