How to get an HtmlDocument from WebClient (downloaded file or string source)


Question

I want to pull data from a website: all the data in certain tables. The tables have no ids, just classes. I tried this:

I added a reference to HtmlAgilityPack.

if(!File.Exists("index.html"))
{
WebClient web2 = new WebClient();
web2.DownloadFile("http://www.beatport.com/genre/electro-house/17/top-100", "index.html");
}


HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.Load("index.html");


What do you think is the easiest way to do this?

Recommended answer

I think this is what you are looking for. It prints the HTML of the requested page:
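using System;
using System.IO;
using System.Net;

public static class Program
{
    public static void Main(string[] args)
    {
        string url = "http://www.beatport.com/genre/electro-house/17/top-100";

        using (WebClient client = new WebClient())
        {
            // Some sites reject requests that carry no User-Agent header.
            client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");

            // Read the whole response body into a string; the using blocks
            // close the reader and the stream even if an exception is thrown.
            using (Stream data = client.OpenRead(url))
            using (StreamReader reader = new StreamReader(data))
            {
                string s = reader.ReadToEnd();
                Console.WriteLine(s);
            }
        }

        Console.ReadKey();
    }
}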
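Since the question asks for an HtmlDocument and for tables that are identified only by class, here is a minimal sketch of the next step, assuming the HtmlAgilityPack NuGet package is referenced. It parses an HTML string with LoadHtml instead of loading a file, then selects rows by the table's class with XPath. The helper name ReadRows, the class name "hypothetical-class", and the sample markup are made up for illustration; the real page's class names have to be read from its source.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class TableScraper
{
    // Hypothetical helper: returns the cell texts of every row found in
    // tables whose class attribute contains the given class name.
    public static List<List<string>> ReadRows(string html, string tableClass)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string instead of a file on disk

        var rows = new List<List<string>>();

        // Match the class as a whole word inside the class attribute.
        string xpath =
            "//table[contains(concat(' ', normalize-space(@class), ' '), ' "
            + tableClass + " ')]//tr";

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection trNodes = doc.DocumentNode.SelectNodes(xpath);
        if (trNodes == null) return rows;

        foreach (HtmlNode tr in trNodes)
        {
            HtmlNodeCollection cellNodes = tr.SelectNodes("./td|./th");
            if (cellNodes == null) continue;

            var cells = new List<string>();
            foreach (HtmlNode cell in cellNodes)
                cells.Add(HtmlEntity.DeEntitize(cell.InnerText).Trim());
            rows.Add(cells);
        }
        return rows;
    }

    public static void Main()
    {
        // Made-up markup for illustration; in practice the string would come
        // from WebClient.DownloadString(url) or from a StreamReader.
        string html =
            "<table class='chart hypothetical-class'>" +
            "<tr><th>Title</th><th>Artist</th></tr>" +
            "<tr><td>Track A</td><td>Artist A</td></tr>" +
            "</table>";

        foreach (List<string> row in ReadRows(html, "hypothetical-class"))
            Console.WriteLine(string.Join(" | ", row));
    }
}
```

Parsing from a string avoids the intermediate index.html file; keeping the file-based doc.Load("index.html") approach is still reasonable if you want to cache the page between runs.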





For more information, refer to MSDN:
http://msdn.microsoft.com/en-us/library/system.net.webclient.aspx

Hope it helps.

