Get web page using HtmlAgilityPack.NETCore
Question
I used HtmlAgilityPack to work with HTML pages. Previously I did this:
HtmlWeb web = new HtmlWeb();
HtmlDocument document = web.Load(url);
var nodes = document.DocumentNode.SelectNodes("necessary node");
But now I need to use HtmlAgilityPack.NETCore, where HtmlWeb is absent. What should I use instead of HtmlWeb to get the same result?
Recommended answer
Use HttpClient, the newer way to interact with remote resources over HTTP.
As for your solution, you probably want to use the async methods here so that your thread is not blocked, instead of calling .Result. Also note that HttpClient has been designed to be used from multiple threads since .NET 4.5, so you should not recreate it each time:
// instance or static variable
HttpClient client = new HttpClient();

// get answer in non-blocking way
using (var response = await client.GetAsync(url))
{
    using (var content = response.Content)
    {
        // read answer in non-blocking way
        var result = await content.ReadAsStringAsync();
        var document = new HtmlDocument();
        document.LoadHtml(result);
        var nodes = document.DocumentNode.SelectNodes("Your nodes");
        // Some work with page...
    }
}
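For reference, the fragment above could be wrapped into a reusable helper along these lines. This is only a minimal sketch: the names PageLoader, LoadAsync and Parse are hypothetical and not part of HtmlAgilityPack, and the EnsureSuccessStatusCode call is an addition that the original answer does not include.

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

// Hypothetical helper class, not part of HtmlAgilityPack.
public static class PageLoader
{
    // One shared HttpClient for the whole application instead of one per request.
    private static readonly HttpClient Client = new HttpClient();

    // Downloads the page without blocking the calling thread and parses it.
    public static async Task<HtmlDocument> LoadAsync(string url)
    {
        using (var response = await Client.GetAsync(url))
        {
            // Assumption: fail fast on a non-success status code (throws HttpRequestException).
            response.EnsureSuccessStatusCode();
            var html = await response.Content.ReadAsStringAsync();
            return Parse(html);
        }
    }

    // Parsing is kept separate so it can be used without any network access.
    public static HtmlDocument Parse(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);
        return document;
    }
}
```

From an async method, a call such as `var document = await PageLoader.LoadAsync(url);` followed by `document.DocumentNode.SelectNodes("Your nodes")` would then play the role of the old `web.Load(url)`. In HtmlAgilityPack, SelectNodes returns null rather than an empty collection when nothing matches, so a null check before iterating is probably worth adding.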
Great article about async/await: Async/Await - Best Practices in Asynchronous Programming by @StephenCleary | March 2013