Select `text()` that 1) precede a given node but 2) are also descendants of another given node


Problem description

This is a follow-up to an earlier question, but unfortunately the answer to that question doesn't apply here.

Say I have the following XML:

<body>
    <div id="global-header">
        header
    </div>

    <div id="a">
        <h3>some title</h3>
        
        <p>text 1 
            <b>bold</b>
        </p>
        
        <div>
            <p>abc</p>
            <p>text 2</p>
            <p>def</p>
        </div>
    </div>

</body>

I want to

  1. find the <p> node whose value is "text 2" (assume we only have exactly one such <p>), and then
  2. find all the nodes that precede this particular <p> but are also descendants of the <div id='a'> node (you can use something like [@id='a'] to locate it), and finally
  3. extract text() from step 2.

The desired output should look like:

some title
text 1
bold
abc

The caveat is that

  1. the preceding nodes may be of arbitrary node types, not only <h3> and <p>.
  2. the <p>text 2</p> node may be embedded arbitrarily deep in the tree, hence an XPath like .//p[text()="text 2"]/preceding-sibling::* would only extract <p>abc</p> and leave out the others.
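To make the second caveat concrete, here is a minimal sketch that mimics `preceding-sibling::*` by walking `previousSibling` links with Python's standard-library `xml.dom.minidom`. The compact `DOC` string and the variable names are assumptions made for illustration; they are not part of the original question.

```python
from xml.dom import minidom

# Compact version of the <div id="a"> subtree from the question (assumed sample).
DOC = (
    '<div id="a"><h3>some title</h3><p>text 1 <b>bold</b></p>'
    '<div><p>abc</p><p>text 2</p><p>def</p></div></div>'
)

dom = minidom.parseString(DOC)

# Locate the <p> whose first child is the text node "text 2".
target = next(
    p for p in dom.getElementsByTagName("p")
    if p.firstChild is not None
    and p.firstChild.nodeType == p.TEXT_NODE
    and p.firstChild.data == "text 2"
)

# Walk backwards over siblings only, which is what preceding-sibling::* selects.
siblings = []
node = target.previousSibling
while node is not None:
    if node.nodeType == node.ELEMENT_NODE:
        siblings.append(node.toxml())
    node = node.previousSibling
siblings.reverse()  # restore document order

print(siblings)
```

Only the `<p>abc</p>` element shares a parent with the target, so the `<h3>`, the first `<p>`, and the `<b>` are never reached this way.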

Solution

You can try this XPath expression:

//p[.='text 2']/preceding::text()[ancestor::div[@id='a']]

One drawback of this approach is that, depending on how the result is consumed, the selected text() nodes may come out merged across sub-elements rather than clearly separated. To process them one at a time, you would need to iterate over the result with some kind of for-loop.
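As a rough illustration of such a loop, the sketch below uses only Python's standard library. Because `xml.etree.ElementTree` supports just a subset of XPath and has no `preceding` axis, this sketch does not evaluate the expression above; instead it emulates it with a document-order walk over `xml.dom.minidom` nodes. It assumes the target `<p>` lies inside `<div id="a">`, and the helper names `node_text` and `texts_before` are hypothetical. With a full XPath 1.0 engine (for example lxml), the expression could presumably be evaluated directly and the resulting list iterated instead.

```python
from xml.dom import minidom

XML = """<body>
    <div id="global-header">
        header
    </div>

    <div id="a">
        <h3>some title</h3>

        <p>text 1
            <b>bold</b>
        </p>

        <div>
            <p>abc</p>
            <p>text 2</p>
            <p>def</p>
        </div>
    </div>

</body>"""


def node_text(node):
    """Concatenate all descendant text of a node (similar to XPath's string value)."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(node_text(child) for child in node.childNodes)


def texts_before(container, target_text):
    """Collect stripped, non-empty text nodes under `container` that come,
    in document order, before the first <p> whose text equals `target_text`.
    Returns an empty list if no such <p> exists under `container`."""
    collected = []

    def walk(node):
        # Returns True as soon as the target <p> is reached, which stops the walk.
        for child in node.childNodes:
            if child.nodeType == child.TEXT_NODE:
                text = child.data.strip()
                if text:  # skip whitespace-only nodes between elements
                    collected.append(text)
            elif child.nodeType == child.ELEMENT_NODE:
                if child.tagName == "p" and node_text(child) == target_text:
                    return True
                if walk(child):
                    return True
        return False

    found = walk(container)
    return collected if found else []


doc = minidom.parseString(XML)
div_a = next(
    d for d in doc.getElementsByTagName("div") if d.getAttribute("id") == "a"
)

result = texts_before(div_a, "text 2")
for text in result:
    print(text)
```

Each text node is handled separately inside the loop, so the pieces stay distinct, and the whitespace-only text nodes that sit between elements in the indented sample are filtered out.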
