Web scraping a classic ASP site with .inc files embedded in it


Question


Hi, I tried to scrape a site using WebClient, but the site uses include files, and what I get back is not the actual output produced by those include files. Could someone please help me scrape the actual HTML that you see when you inspect the page with Firebug in Mozilla Firefox?

Thanks

Recommended answer

First of all, Web scraping has nothing to do with ASP.NET, "include files" or any other server-side technology:
http://en.wikipedia.org/wiki/Web_scraping

All that matters is the HTTP request you can make and the HTTP response you get back. All details of server-side behavior are hidden from the client side. You can get everything a user of the Web server can get, perhaps a little more, but no less.

Please see my comment to the question: you did not describe your problem. So I can advise just one thing: instead of System.Net.WebClient, use the more general and hence more powerful class System.Net.HttpWebRequest:
http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.aspx,
http://msdn.microsoft.com/en-us/library/system.net.webrequest.aspx.
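
To illustrate the suggestion, here is a minimal sketch of fetching a page with System.Net.HttpWebRequest. It is only an illustration under stated assumptions, not code from the original thread: the URL, the User-Agent string, the class name ScrapeSketch and the helper name FetchHtml are hypothetical, and the page is assumed to be UTF-8 encoded.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

class ScrapeSketch
{
    // Hypothetical helper: sends a GET request and returns the response body as text.
    static string FetchHtml(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";

        // Assumption: some sites answer differently to unknown clients,
        // so a browser-like User-Agent is sent. The value is illustrative.
        request.UserAgent = "Mozilla/5.0 (compatible; ScrapeSketch/1.0)";

        // Keep cookies set by the server (for example, a classic ASP session cookie).
        request.CookieContainer = new CookieContainer();

        // Transparently decompress gzip/deflate responses.
        request.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        // Assumption: the page is UTF-8; adjust the encoding if the site uses another one.
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Hypothetical URL; replace it with the page you want to scrape.
        string html = FetchHtml("http://example.com/default.asp");
        Console.WriteLine(html);
    }
}
```

The text returned here is whatever the server sends in the HTTP response. Server-side include files are processed on the server before the response is sent, so their output is already part of that text.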

Please see my past answers:
get specific data from web page,
How to get the data from another site.

—SA

