How to get an HtmlDocument from WebClient (download file or string source)
Question
I want to pull data from a website: all the data in some tables that have no id attributes, only classes.
I added the "HtmlAgilityPack" reference and tried this:
if(!File.Exists("index.html"))
{
WebClient web2 = new WebClient();
web2.DownloadFile("http://www.beatport.com/genre/electro-house/17/top-100", "index.html");
}
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.Load("index.html");
What do you think is the easiest way to do this?
Recommended answer
I think this is what you are looking for. It prints the HTML of the requested page:
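using System;
using System.IO;
using System.Net;

public static class Program
{
    public static void Main(string[] args)
    {
        string url = "http://www.beatport.com/genre/electro-house/17/top-100";
        // The using blocks dispose the client, stream, and reader even if an exception is thrown.
        using (WebClient client = new WebClient())
        {
            // Send a browser-like User-Agent header with the request.
            client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
            using (Stream data = client.OpenRead(url))
            using (StreamReader reader = new StreamReader(data))
            {
                string s = reader.ReadToEnd();
                Console.WriteLine(s);
            }
        }
        Console.ReadKey();
    }
}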
For more information, refer to the MSDN documentation:
http://msdn.microsoft.com/en-us/library/system.net.webclient.aspx
Hope it helps.
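The answer prints the raw HTML, while the question asks for an HtmlAgilityPack document and for tables that can only be identified by class. A minimal sketch of that remaining step follows, assuming the HtmlAgilityPack package is referenced. The class name "track-grid" is a hypothetical placeholder, not taken from the real page; replace it with whatever class the target table actually uses.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

public static class TableScraper
{
    public static void Main()
    {
        string url = "http://www.beatport.com/genre/electro-house/17/top-100";
        string html;

        using (WebClient client = new WebClient())
        {
            client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
            // DownloadString returns the page body directly, so no temporary file is needed.
            html = client.DownloadString(url);
        }

        // LoadHtml parses markup from a string; Load (used in the question) reads from a file path.
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);

        // "track-grid" is a hypothetical class name used only for illustration.
        HtmlNodeCollection rows = doc.DocumentNode.SelectNodes(
            "//table[contains(@class, 'track-grid')]//tr");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (rows == null)
        {
            Console.WriteLine("No matching rows found.");
            return;
        }

        foreach (HtmlNode row in rows)
        {
            HtmlNodeCollection cells = row.SelectNodes("./td");
            if (cells == null)
            {
                continue; // skip rows without td cells, such as header rows
            }

            foreach (HtmlNode cell in cells)
            {
                Console.Write(cell.InnerText.Trim() + "\t");
            }
            Console.WriteLine();
        }
    }
}
```

Selecting with contains(@class, ...) matches elements that carry several classes, at the cost of also matching class names that merely contain the given substring. If the page builds its table with JavaScript after loading, the downloaded HTML will not contain the rows and this approach will find nothing.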