XPath to select preceding element with optional intervening whitespace-only text node

Views: 60 Published: 2021/6/8 18:48:03 Tags: ruby, xpath, nokogiri

Question

Given an element as context, I want to select the preceding sibling element and check whether it has a particular name. The caveat is that I do not want to select it if there is an intervening text node with non-whitespace content.

For example, given this XML document…

<r>
  <a>a1</a><a>a2</a>
   b
  <a>a3</a>
    <a>a4</a>
  <b/>
  <a>a5</a>
</r>

…then:

For "a1" there should be no match (there is no <a> sibling element immediately preceding it).
For "a2", "a1" should be matched (there is no intervening text node).
For "a3" there should be no match (there is an intervening text node with non-whitespace contents).
For "a4", "a3" should be matched (the intervening text node is only whitespace).
For "a5" there should be no match (the preceding sibling element is not an <a>).

I can use preceding-sibling::*[1][name()="a"] to find the first preceding element and ensure that it is an <a>.

However, I can't figure out how to say "select the following sibling node, whether it is an element or text, and check that it is either not text or has normalize-space(.)=""". My best guess was this:

preceding-sibling::*[1][name()="a"][following-sibling::node()[1][not(text()) or normalize-space(.)=""]]

…but that appears to have no effect.

Here's my test Ruby file:

require 'nokogiri'

xpath = 'preceding-sibling::*[1][name()="a"][following-sibling::node()[1][not(text()) or normalize-space(.)=""]]'
fragment = Nokogiri::XML.fragment '<a>a1</a><a>a2</a> b <a>a3</a> <a>a4</a> <b/> <a>a5</a>'    

fragment.css('a').each{ |a| p [a.text,a.xpath(xpath).to_s] }
#=> ["a1", ""]
#=> ["a2", ""]
#=> ["a3", "<a>a2</a>"]
#=> ["a4", "<a>a3</a>"]
#=> ["a5", ""]

The results for "a2" and "a3" are wrong, and they confuse me. The expression finds the preceding <a> correctly, but then does not correctly verify that the first following sibling of that <a> is either not text (which should allow "a2" to find "a1") or is whitespace only (which should prevent "a3" from finding "a2").

Edit: Here's the XPath I was writing, and what I intended it to do:

preceding-sibling::*[1][name()="a"]… - find the first preceding element, and ensure that it is an <a>. This appears to be working as desired.

[following-sibling::node()[1][…]] - ensure that the first following node (of the found preceding <a>) matches some conditions

not(text()) or normalize-space(.)="" - ensure that this following node is either not a text node, or that the normalized space of it is empty

Recommended answer
Use:
/*/a/preceding-sibling::node()
       [not(self::text()[not(normalize-space())])]
            [1]
              [self::a]

XSLT-based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
     <xsl:copy-of select=
       "/*/a
          /preceding-sibling::node()
                      [not(self::text()[not(normalize-space())])]
                                        [1]
                                         [self::a]
    "/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:
<r>
  <a>a1</a><a>a2</a>
   b
  <a>a3</a>
    <a>a4</a>
  <b/>
  <a>a5</a>
</r>

the XPath expression is evaluated and the nodes it selects are copied to the output:
<a>a1</a>
<a>a3</a>

Update:
What is wrong with the XPath expression in the question?
The problem is here:
[not(text()) or normalize-space(.)='']

This tests whether the context node has no text-node child.
But the OP wants to test whether the context node itself is a text node.
Solution:
Replace the above with:
[not(self::text()) or normalize-space(.)='']

XSLT-based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="/*/a">
     <xsl:copy-of select=
     "preceding-sibling::*[1]
                      [name()='a']
                         [following-sibling::node()[1]
                                    [not(self::text()) or normalize-space(.)='']
                       ]"/>
 </xsl:template>
 <xsl:template match="text()"/>
</xsl:stylesheet>

Now this transformation produces the wanted result:
<a>a1</a>
<a>a3</a>

