Select only items in a specific DIV using HtmlAgilityPack

Question

I'm trying to use the HtmlAgilityPack to pull all of the links from a page that are contained within a div declared as <div class='content'>. However, when I use the code below I simply get ALL links on the entire page. This doesn't really make sense to me, since I am calling SelectNodes from the sub-node I selected earlier (which, when viewed in the debugger, only shows the HTML from that specific div). So it's as if it goes back to the very root node every time I call SelectNodes. The code I use is below:

HtmlWeb hw = new HtmlWeb();
HtmlDocument doc = hw.Load(@"http://example.com");
HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@class='content']");
foreach(HtmlNode link in node.SelectNodes("//a[@href]"))
{
    Console.WriteLine(link.Value);
}

Is this the expected behavior? And if so, how do I get it to do what I'm expecting?

Recommended answer

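This will work, because a path without a leading // is evaluated relative to node, whereas a path starting with // always searches from the document root, regardless of which node SelectNodes is called on:

node.SelectNodes("a[@href]")

Note that a[@href] matches only direct children of the div; use .//a[@href] if the links may be nested deeper.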

Also, you can do it in a single selector:

doc.DocumentNode.SelectNodes("//div[@class='content']//a[@href]")
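To make the scoping difference concrete, here is a minimal sketch that evaluates three XPath forms from the same div node and prints how many links each one matches. The sample HTML, the class name XPathScopeDemo, and the helper CountFromContentDiv are made up for illustration; only the HtmlAgilityPack calls (LoadHtml, SelectSingleNode, SelectNodes) come from the library.

```csharp
using System;
using HtmlAgilityPack;

public static class XPathScopeDemo
{
    // Hypothetical sample page: one link outside the div, two inside it
    // (one a direct child of the div, one nested inside a span).
    private const string SampleHtml =
        "<html><body>" +
        "<a href='/outside'>Outside</a>" +
        "<div class='content'>" +
        "<a href='/direct'>Direct child</a>" +
        "<span><a href='/nested'>Nested</a></span>" +
        "</div>" +
        "</body></html>";

    // Evaluates the given XPath from the content div and returns the match count.
    // SelectNodes returns null (not an empty collection) when nothing matches.
    public static int CountFromContentDiv(string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(SampleHtml);
        HtmlNode content = doc.DocumentNode.SelectSingleNode("//div[@class='content']");
        HtmlNodeCollection links = content.SelectNodes(xpath);
        return links == null ? 0 : links.Count;
    }

    public static void Main()
    {
        // Absolute path: searched from the document root, so the outside link is included.
        Console.WriteLine(CountFromContentDiv("//a[@href]"));
        // Relative path: direct children of the div only.
        Console.WriteLine(CountFromContentDiv("a[@href]"));
        // Relative descendant path: every link anywhere inside the div.
        Console.WriteLine(CountFromContentDiv(".//a[@href]"));
    }
}
```

If the links you want can sit at any depth inside the div, the .//a[@href] form is usually the one to reach for.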

Also, note that link.Value isn't defined for HtmlNode, so your code doesn't compile.
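As a possible replacement for link.Value, the sketch below reads each link's URL with GetAttributeValue (InnerText would give the link text instead). The class name LinkReader, the method ReadHrefs, and the sample HTML are hypothetical; the null check is there because SelectNodes returns null when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkReader
{
    // Returns the href of every link inside <div class='content'>, at any depth.
    public static List<string> ReadHrefs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var hrefs = new List<string>();
        HtmlNodeCollection links =
            doc.DocumentNode.SelectNodes("//div[@class='content']//a[@href]");
        if (links == null)
        {
            return hrefs; // no matching links on the page
        }

        foreach (HtmlNode link in links)
        {
            // The second argument is the default returned if the attribute is missing.
            hrefs.Add(link.GetAttributeValue("href", string.Empty));
        }
        return hrefs;
    }

    public static void Main()
    {
        // Hypothetical sample input, not taken from the question.
        const string html =
            "<html><body><a href='/outside'>Outside</a>" +
            "<div class='content'><a href='/direct'>Direct</a>" +
            "<span><a href='/nested'>Nested</a></span></div></body></html>";

        foreach (string href in ReadHrefs(html))
        {
            Console.WriteLine(href);
        }
    }
}
```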
