Find all namespace declarations in an XML document - XPath 1.0 vs XPath 2.0
Question
As part of a Java 6 application, I want to find all namespace declarations in an XML document, including any duplicates.
Edit: Per Martin's request, here's the Java code I am using:
import javax.xml.xpath.*;
import org.w3c.dom.NodeList;

XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
// Select every namespace node in the document via the namespace axis
XPathExpression xPathExpression = xPath.compile("//namespace::*");
NodeList nodeList = (NodeList) xPathExpression.evaluate(xmlDomDocument, XPathConstants.NODESET);
Suppose I have this XML document:
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:ele="element.com" xmlns:att="attribute.com" xmlns:txt="textnode.com">
<ele:one>a</ele:one>
<two att:c="d">e</two>
<three>txt:f</three>
</root>
To find all namespace declarations, I applied this XPath expression to the XML document using XPath 1.0:
//namespace::*
It finds 4 namespace declarations, which is what I expect (and want):
/root[1]/@xmlns:att - attribute.com
/root[1]/@xmlns:ele - element.com
/root[1]/@xmlns:txt - textnode.com
/root[1]/@xmlns:xml - http://www.w3.org/XML/1998/namespace
But if I switch to XPath 2.0, I get 16 namespace declarations (each of the previous declarations 4 times), which is not what I expect (or want):
/root[1]/@xmlns:xml - http://www.w3.org/XML/1998/namespace
/root[1]/@xmlns:att - attribute.com
/root[1]/@xmlns:ele - element.com
/root[1]/@xmlns:txt - textnode.com
/root[1]/@xmlns:xml - http://www.w3.org/XML/1998/namespace
/root[1]/@xmlns:att - attribute.com
/root[1]/@xmlns:ele - element.com
/root[1]/@xmlns:txt - textnode.com
/root[1]/@xmlns:xml - http://www.w3.org/XML/1998/namespace
/root[1]/@xmlns:att - attribute.com
/root[1]/@xmlns:ele - element.com
/root[1]/@xmlns:txt - textnode.com
/root[1]/@xmlns:xml - http://www.w3.org/XML/1998/namespace
/root[1]/@xmlns:att - attribute.com
/root[1]/@xmlns:ele - element.com
/root[1]/@xmlns:txt - textnode.com
The same difference appears even when I use the unabbreviated form of the XPath expression:
/descendant-or-self::node()/namespace::*
And it is seen across a variety of XML parsers (LIBXML, MSXML.NET, Saxon) as tested in oXygen. (Edit: As I mention later in the comments, this statement is not true. Though I thought I was testing a variety of XML parsers, I really wasn't.)
Question #1: Why the difference between XPath 1.0 and XPath 2.0?
Question #2: Is it possible/reasonable to get the desired results using XPath 2.0?
Hint: Using the distinct-values() function in XPath 2.0 will not return the desired results, as I want all namespace declarations, even if the same namespace is declared twice. For example, consider this XML document:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<bar:one xmlns:bar="http://www.bar.com">alpha</bar:one>
<bar:two xmlns:bar="http://www.bar.com">bravo</bar:two>
</root>
The desired results are:
/root[1]/@xmlns:xml - http://www.w3.org/XML/1998/namespace
/root[1]/bar:one[1]/@xmlns:bar - http://www.bar.com
/root[1]/bar:two[1]/@xmlns:bar - http://www.bar.com
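One way to get a list of this shape without relying on the namespace axis at all is to walk the DOM and collect the xmlns attributes directly. This is a minimal sketch of that swapped-in technique, not an XPath solution. The class name NamespaceDeclarationFinder and the output format are made up for illustration, and it assumes a namespace-aware parser. Because it reports only declarations that physically appear in the document, the implicit xml prefix is not listed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: lists every xmlns declaration written in the document,
// duplicates included.
class NamespaceDeclarationFinder {

    /** Returns one "element/@xmlns:prefix - uri" entry per declaration, elements in document order. */
    public static List<String> find(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Namespace awareness makes the parser tag xmlns attributes with the xmlns namespace URI.
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        List<String> found = new ArrayList<String>();
        collect(doc.getDocumentElement(), found);
        return found;
    }

    private static void collect(Element element, List<String> found) {
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Attr attr = (Attr) attributes.item(i);
            // Both xmlns="..." and xmlns:prefix="..." live in this reserved namespace.
            if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attr.getNamespaceURI())) {
                found.add(element.getNodeName() + "/@" + attr.getName() + " - " + attr.getValue());
            }
        }
        // Recurse into child elements only; text and comments cannot declare namespaces.
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                collect((Element) child, found);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root>"
                + "<bar:one xmlns:bar=\"http://www.bar.com\">alpha</bar:one>"
                + "<bar:two xmlns:bar=\"http://www.bar.com\">bravo</bar:two>"
                + "</root>";
        for (String declaration : find(xml)) {
            System.out.println(declaration);
        }
    }
}
```

For the two-element document above this reports the bar declaration twice, once per declaring element. The order of several declarations on the same element is not guaranteed by the DOM API.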
Recommended answer
I think this will get all namespaces, without any duplicates:
for $i in 1 to count(//namespace::*) return
if (empty(index-of((//namespace::*)[position() = (1 to ($i - 1))][name() = name((//namespace::*)[$i])], (//namespace::*)[$i])))
then (//namespace::*)[$i]
else ()
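To make the logic of that expression easier to follow, here is the same filter written in plain Java over a hand-written list of prefix/URI pairs. It is only an illustration: the class name FirstOccurrenceFilter is made up, and the input list stands in for the sequence (//namespace::*), using the 16 results from the question. An item is kept only if no earlier item has both the same name and the same value, which mirrors the name() predicate combined with index-of().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the "keep first occurrence" filter.
class FirstOccurrenceFilter {

    /** Keeps pairs.get(i) only if no earlier pair has the same prefix and the same URI. */
    public static List<String[]> firstOccurrences(List<String[]> pairs) {
        List<String[]> kept = new ArrayList<String[]>();
        for (int i = 0; i < pairs.size(); i++) {
            String[] current = pairs.get(i);
            boolean seenEarlier = false;
            // Corresponds to scanning positions 1 to ($i - 1) in the XPath expression.
            for (int j = 0; j < i; j++) {
                String[] earlier = pairs.get(j);
                if (earlier[0].equals(current[0]) && earlier[1].equals(current[1])) {
                    seenEarlier = true;
                    break;
                }
            }
            if (!seenEarlier) {
                kept.add(current);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Stand-in for the 16 namespace nodes listed in the question (4 prefixes, 4 times each).
        List<String[]> pairs = new ArrayList<String[]>();
        for (int repeat = 0; repeat < 4; repeat++) {
            pairs.add(new String[] {"xml", "http://www.w3.org/XML/1998/namespace"});
            pairs.add(new String[] {"att", "attribute.com"});
            pairs.add(new String[] {"ele", "element.com"});
            pairs.add(new String[] {"txt", "textnode.com"});
        }
        // Prints the four distinct prefix/URI pairs in first-seen order.
        for (String[] pair : firstOccurrences(pairs)) {
            System.out.println(pair[0] + " - " + pair[1]);
        }
    }
}
```

Note that a filter of this kind also collapses a namespace that is declared on two different elements, as in the bar:one / bar:two document from the question, because both declarations share the same prefix and URI.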