Get web page using HtmlAgilityPack.NETCore
Question
I have been using HtmlAgilityPack to work with HTML pages. Previously I did this:
HtmlWeb web = new HtmlWeb();
HtmlDocument document = web.Load(url);
var nodes = document.DocumentNode.SelectNodes("necessary node");
But now I need to use HtmlAgilityPack.NETCore, where HtmlWeb is absent. What should I use instead of HtmlWeb to get the same result?
Recommended answer
Use HttpClient, the newer way to interact with remote resources over HTTP.
As for your solution, you should probably use the async methods here so that the calling thread is not blocked, rather than calling .Result. Also note that HttpClient has been meant to be used from different threads since .NET 4.5, so you should not recreate it for each request:
// Namespaces needed by this snippet
using System.Net.Http;
using HtmlAgilityPack;

// Keep this as an instance or static field and reuse it for every request
HttpClient client = new HttpClient();

// Request the page without blocking the thread
using (var response = await client.GetAsync(url))
{
    using (var content = response.Content)
    {
        // Read the response body without blocking the thread
        var result = await content.ReadAsStringAsync();
        var document = new HtmlDocument();
        document.LoadHtml(result);
        var nodes = document.DocumentNode.SelectNodes("Your nodes");
        // Some work with the page...
    }
}
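To show how the pieces above might fit together in a complete file, here is a minimal sketch. The class name PageScraper and the methods ExtractText and FetchTextAsync are hypothetical names chosen for illustration and are not part of HtmlAgilityPack. The sketch also assumes that SelectNodes returns null when nothing matches, which is how HtmlAgilityPack has behaved in the versions I am aware of, so the result is checked before use.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

// Hypothetical helper class; the names here are illustrative only.
public static class PageScraper
{
    // One shared client for the whole application, reused for every request.
    private static readonly HttpClient Client = new HttpClient();

    // Parses HTML text and returns the trimmed inner text of every node
    // that matches the given XPath expression.
    public static List<string> ExtractText(string html, string xpath)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // Assumption: SelectNodes yields null, not an empty collection,
        // when no node matches, so guard against it.
        var nodes = document.DocumentNode.SelectNodes(xpath);
        return nodes == null
            ? new List<string>()
            : nodes.Select(n => n.InnerText.Trim()).ToList();
    }

    // Downloads the page without blocking the calling thread, then parses it.
    public static async Task<List<string>> FetchTextAsync(string url, string xpath)
    {
        using (var response = await Client.GetAsync(url))
        {
            // Throws HttpRequestException for non-success status codes.
            response.EnsureSuccessStatusCode();
            var html = await response.Content.ReadAsStringAsync();
            return ExtractText(html, xpath);
        }
    }
}
```

From an async method this could be called as `var headings = await PageScraper.FetchTextAsync("https://example.com", "//h1");` where both the URL and the XPath are placeholders. Keeping the parsing in ExtractText, separate from the download, lets you exercise the XPath logic on a plain string without any network access.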
A great article about async/await: Async/Await - Best Practices in Asynchronous Programming by @StephenCleary | March 2013