HTML Agility Pack - Select node after particular paragraph


Question

I have this kind of situation: various files containing the following HTML. I need to retrieve only the list that follows the "targetWord" paragraph (its position changes from page to page among the files I need to parse). How can I do this with HTML Agility Pack?

<p>Word1</p>
<ul>
<li>listobject1</li>
<li>listobject2</li>
<li>listobject3</li>
</ul>

<p>targetWord</p>
<ul>
<li>listobject4</li>
<li>listobject5</li>
<li>listobject6</li>
</ul>

<p>Word2</p>
<ul>
<li>listobject7</li>
<li>listobject8</li>
<li>listobject9</li>
</ul>

My code should obtain only the list nodes after targetWord:

foreach (var node in retreivedNodes)
{
    s[i] = node.InnerText;
    // Print before incrementing, otherwise s[i] points at the next (empty) slot.
    Console.WriteLine(s[i]);
    i++;
}

OUTPUT:

   listobject4
   listobject5
   listobject6

Recommended answer

You need to craft an XPath expression that matches your requirement.

Assuming your snippet has been loaded into a HAP HtmlDocument held in the variable htmlSnippet, then

htmlSnippet.DocumentNode.SelectNodes("//p[text()='targetWord']/following-sibling::ul[1]//li")

will return the node set of li elements inside the first ul that follows the p tag containing your target word.
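
For completeness, here is a minimal end-to-end sketch of the answer, assuming the HtmlAgilityPack NuGet package is referenced. The class name TargetListDemo, the method ListItemsAfterTarget and the SampleHtml constant are hypothetical names introduced for illustration; only HtmlDocument, LoadHtml, DocumentNode, SelectNodes and InnerText come from the library. Note that SelectNodes returns null rather than an empty collection when nothing matches, so the sketch guards against that.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class TargetListDemo
{
    // The HTML from the question, kept as a verbatim string for the demo.
    public const string SampleHtml = @"
<p>Word1</p>
<ul><li>listobject1</li><li>listobject2</li><li>listobject3</li></ul>
<p>targetWord</p>
<ul><li>listobject4</li><li>listobject5</li><li>listobject6</li></ul>
<p>Word2</p>
<ul><li>listobject7</li><li>listobject8</li><li>listobject9</li></ul>";

    // Hypothetical helper: returns the text of every li in the first ul
    // that follows the <p>targetWord</p> paragraph.
    public static List<string> ListItemsAfterTarget(string html)
    {
        var htmlSnippet = new HtmlDocument();
        htmlSnippet.LoadHtml(html);

        var nodes = htmlSnippet.DocumentNode.SelectNodes(
            "//p[text()='targetWord']/following-sibling::ul[1]//li");

        var items = new List<string>();
        if (nodes == null)
        {
            // No matching paragraph or list: return an empty result.
            return items;
        }

        foreach (var node in nodes)
        {
            items.Add(node.InnerText.Trim());
        }
        return items;
    }

    public static void Main()
    {
        foreach (var item in ListItemsAfterTarget(SampleHtml))
        {
            Console.WriteLine(item);
        }
    }
}
```

Run against the HTML from the question, this should print the three items shown under OUTPUT. Because the expression starts from the paragraph text rather than a fixed index, it should keep working when the targetWord paragraph moves to a different position in the page.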
