XPath to select preceding element with optional intervening whitespace-only text node
Problem description
Given an element as context, I want to select the preceding sibling element and check whether it has a particular name. The caveat is that I do not want to select it if there is an intervening text node that has non-whitespace content.
For example, given this XML document…
<r>
<a>a1</a><a>a2</a>
b
<a>a3</a>
<a>a4</a>
<b/>
<a>a5</a>
</r>
…then:
- For "a1" there should be no match (there is no <a> sibling element immediately preceding it)
- For "a2", "a1" should be matched (there is no intervening text node)
- For "a3" there should be no match (there is an intervening text node with non-whitespace content)
- For "a4", "a3" should be matched (the intervening text node is whitespace only)
- For "a5" there should be no match (the preceding sibling element is not an <a>).
I can select the preceding <a> using preceding-sibling::*[1][name()="a"].
However, I can't figure out how to say "select the following sibling node, whether it is an element or text, and check that it is either not text or has normalize-space(.)=""". My best guess was this:
preceding-sibling::*[1][name()="a"][following-sibling::node()[1][not(text()) or normalize-space(.)=""]]
…but that appears to have no effect.
Here's my test Ruby file:
require 'nokogiri'
xpath = 'preceding-sibling::*[1][name()="a"][following-sibling::node()[1][not(text()) or normalize-space(.)=""]]'
fragment = Nokogiri::XML.fragment '<a>a1</a><a>a2</a> b <a>a3</a> <a>a4</a> <b/> <a>a5</a>'
fragment.css('a').each{ |a| p [a.text,a.xpath(xpath).to_s] }
#=> ["a1", ""]
#=> ["a2", ""]
#=> ["a3", "<a>a2</a>"]
#=> ["a4", "<a>a3</a>"]
#=> ["a5", ""]
The results for "a2" and "a3" are wrong, and they confuse me. The expression finds the preceding <a> correctly, but then does not correctly verify that the first following sibling of that <a> is either not text (which should allow "a2" to find "a1") or is whitespace only (which should prevent "a3" from finding "a2").
Edit: Here's the XPath I was writing, and what I intended it to do:
preceding-sibling::*[1][name()="a"]…
- find the first preceding element, and ensure that it is an <a>. This appears to be working as desired.
[following-sibling::node()[1][…]]
- ensure that the first following node (of the found preceding <a>) matches some conditions
not(text()) or normalize-space(.)=""
- ensure that this following node is either not a text node, or that its normalized string value is empty
Recommended answer
Use:
/*/a/preceding-sibling::node()
[not(self::text()[not(normalize-space())])]
[1]
[self::a]
XSLT-based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:copy-of select=
"/*/a
/preceding-sibling::node()
[not(self::text()[not(normalize-space())])]
[1]
[self::a]
"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied to the provided XML document:
<r>
<a>a1</a><a>a2</a>
b
<a>a3</a>
<a>a4</a>
<b/>
<a>a5</a>
</r>
the XPath expression is evaluated, and the nodes it selects are copied to the output:
<a>a1</a>
<a>a3</a>
Update:
What is wrong with the XPath expression in the question?
The problem is here:
[not(text()) or normalize-space(.)='']
The not(text()) part tests whether the context node has no text-node child.
But the OP wants to test whether the context node is itself a text node.
Solution:
Replace the above with:
[not(self::text()) or normalize-space(.)='']
XSLT-based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/*/a">
<xsl:copy-of select=
"preceding-sibling::*[1]
[name()='a']
[following-sibling::node()[1]
[not(self::text()) or normalize-space(.)='']
]"/>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
Now this transformation produces the wanted result:
<a>a1</a>
<a>a3</a>