XPath contains(text(),'some string') doesn't work when used with a node that has more than one Text subnode


Question

I have a small problem with XPath contains() in dom4j.

Let's say my XML is

<Home>
    <Addr>
        <Street>ABC</Street>
        <Number>5</Number>
        <Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>
    </Addr>
</Home>

Let's say I want to find all the nodes that have ABC in their text, given the root element.

So the XPath I would need to write is

//*[contains(text(),'ABC')]

However, this is not what dom4j returns. Is this a dom4j problem, or a gap in my understanding of how XPath works? The query returns only the Street element and not the Comment element.

The DOM makes the Comment element a composite element with four child nodes, two of which are text nodes:

[Text = 'BLAH BLAH BLAH '][BR][BR][Text = 'ABC']

I would assume that the query should still return the element, since it should find the element and run contains on it, but it doesn't.

The following query does return the element, but it returns far more than just that element: it returns the parent elements as well, which is undesirable here.

//*[contains(.,'ABC')]

Does anyone know an XPath query that would return just the <Street/> and <Comment/> elements?

Recommended answer

The <Comment> tag contains two text nodes and two <br> nodes as children.

Your XPath expression was

//*[contains(text(),'ABC')]

Breaking this down,

  1. * is a selector that matches any element (i.e. tag) -- it returns a node-set.
  2. The [] are a conditional that operates on each individual node in that node-set. A node is kept if it matches the condition inside the brackets.
  3. text() is a selector that matches all of the text nodes that are children of the context node -- it returns a node-set.
  4. contains is a function that operates on a string. In XPath 1.0, if it is passed a node-set, the node-set is converted into a string by returning the string-value of the node in the node-set that is first in document order. Hence, it can match only the first text node in your <Comment> element -- namely BLAH BLAH BLAH. Since that doesn't match, you don't get a <Comment> in your results.
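
The conversion rule in step 4 can be checked directly. The following is a minimal sketch that uses the JDK's built-in XPath 1.0 engine (javax.xml.xpath) rather than dom4j, on the assumption that both follow the same XPath 1.0 rule; the class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses the JDK XPath engine, not dom4j.
class FirstTextNodeDemo {
    // The XML from the question.
    static final String XML =
        "<Home>\n"
        + "    <Addr>\n"
        + "        <Street>ABC</Street>\n"
        + "        <Number>5</Number>\n"
        + "        <Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>\n"
        + "    </Addr>\n"
        + "</Home>";

    static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
    }

    // Evaluates an expression and returns the result converted to a string.
    static String asString(String expression) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expression, parse());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Evaluates an expression as a node-set and returns the matched node names.
    static List<String> matchedNames(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, parse(), XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Converting the text() node-set to a string keeps only the first text node.
        System.out.println("[" + asString("string(//Comment/text())") + "]"); // [BLAH BLAH BLAH ]
        // So the original query never sees the second text node of <Comment>.
        System.out.println(matchedNames("//*[contains(text(),'ABC')]"));      // [Street]
    }
}
```

The first line of output shows that only the first text node survives the node-set-to-string conversion, which is why <Comment> is missing from the second result.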

You need to change this to

//*[text()[contains(.,'ABC')]]

  1. * is a selector that matches any element (i.e. tag) -- it returns a node-set.
  2. The outer [] are a conditional that operates on each individual node in that node-set -- here it operates on each element in the document.
  3. text() is a selector that matches all of the text nodes that are children of the context node -- it returns a node-set.
  4. The inner [] are a conditional that operates on each node in that node-set -- here each individual text node. Each individual text node is the starting point for any path in the brackets, and can also be referred to explicitly as . within the brackets. The inner condition holds if any of the individual text nodes matches the condition inside the brackets.
  5. contains is a function that operates on a string. Here it is passed an individual text node (.). Since it is passed the second text node in the <Comment> tag individually, it will see the 'ABC' string and be able to match it.
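
As a rough check of the corrected expression, the sketch below evaluates it with the JDK's built-in XPath 1.0 engine (javax.xml.xpath), not dom4j, assuming both engines agree on XPath 1.0 semantics. For contrast it also evaluates //*[contains(.,'ABC')], which tests each element's full string-value and therefore also matches the ancestors. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses the JDK XPath engine, not dom4j.
class PerTextNodeDemo {
    // The XML from the question.
    static final String XML =
        "<Home>\n"
        + "    <Addr>\n"
        + "        <Street>ABC</Street>\n"
        + "        <Number>5</Number>\n"
        + "        <Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>\n"
        + "    </Addr>\n"
        + "</Home>";

    // Evaluates an expression as a node-set and returns the matched element names.
    static List<String> select(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // contains() is applied to each text node separately, so the second
        // text node of <Comment> ('ABC') is tested on its own.
        System.out.println(select("//*[text()[contains(.,'ABC')]]")); // [Street, Comment]
        // The string-value of an element includes all descendant text,
        // so every ancestor of a matching element matches too.
        System.out.println(select("//*[contains(.,'ABC')]"));         // [Home, Addr, Street, Comment]
    }
}
```

Filtering on text() children first keeps the test local to each element's own text, which is what excludes <Home> and <Addr>.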
