Extract data from web scraping in C#
Question
I am an ASP.NET MVC developer.
Using the WebRequest class, I can retrieve the content of any URL (http, https, etc.).
I have retrieved all the content of one particular URL (for now I used http://google.com).
My next step is to extract the buttons, header, footer, colors, text, etc.
Here is my code so far:
// Requires: using System.IO; using System.Net; using System.Web.Mvc;
// The model carries a string URL entered in a text box;
// this action runs when the form is submitted.
public ActionResult GetContent(UrlModel model)
{
    WebRequest request = WebRequest.Create(model.URL);
    request.Credentials = CredentialCache.DefaultCredentials;

    // The using statements dispose the response, stream, and reader
    // even if reading throws.
    using (WebResponse response = request.GetResponse())
    using (Stream dataStream = response.GetResponseStream())
    using (StreamReader reader = new StreamReader(dataStream))
    {
        // Pass the raw HTML of the page to the view.
        ViewBag.Response = reader.ReadToEnd();
    }

    return View();
}
Can someone help me write the code?
Also, please suggest some techniques for data extraction in C#.
Recommended answer
The Html Agility Pack is the way to go:
http://htmlagilitypack.codeplex.com/
There are numerous Stack Overflow posts about it. With it you can easily get any element from the HTML.
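As a rough illustration of what that could look like, here is a minimal sketch that assumes the HtmlAgilityPack NuGet package is referenced in the project. The HtmlExtractor class and its two method names are hypothetical helpers invented for this example; only HtmlDocument, LoadHtml, SelectNodes, InnerText, GetAttributeValue and HtmlEntity.DeEntitize come from the library.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack; // third-party NuGet package, assumed to be installed

// Hypothetical helper class, not part of HtmlAgilityPack itself.
public static class HtmlExtractor
{
    // Returns the trimmed, entity-decoded text of every node matching the XPath.
    public static List<string> ExtractText(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return results;
        }

        foreach (HtmlNode node in nodes)
        {
            results.Add(HtmlEntity.DeEntitize(node.InnerText).Trim());
        }
        return results;
    }

    // Returns the value of the given attribute for every node matching the XPath.
    // Nodes that lack the attribute contribute an empty string.
    public static List<string> ExtractAttribute(string html, string xpath, string attribute)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<string>();
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return results;
        }

        foreach (HtmlNode node in nodes)
        {
            results.Add(node.GetAttributeValue(attribute, string.Empty));
        }
        return results;
    }
}
```

Inside the GetContent action from the question, the downloaded HTML string could then be passed to calls such as HtmlExtractor.ExtractText(html, "//button"), HtmlExtractor.ExtractText(html, "//header") or HtmlExtractor.ExtractText(html, "//footer"). Colors are harder: HtmlExtractor.ExtractAttribute(html, "//*[@style]", "style") only picks up inline style attributes, so colors defined in external stylesheets would need the CSS files to be downloaded and parsed separately.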