Getting the text of an <a> with XPath when it's buried in another tag, e.g. <strong>

Views: 102 Published: 2018/6/25 13:43:23 Tags: html xml xpath xhtml

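Problem description

The following XPath is usually sufficient for matching all anchors whose text contains a certain string:

  //a[contains(text(), 'SENIOR ASSOCIATES')]

Given a case like this, though:

  <a href="http://www.freshminds.net/job/senior-associate/"><strong>
  SENIOR ASSOCIATES
  </strong></a>

The text is wrapped in a <strong>, and there is also another tag before the anchor closes, so the above XPath returns nothing.

How can the XPath be adapted so that it allows for the <a> containing additional tags such as <strong>, while still working in the standard case?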
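To make the failure concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. That module is not a full XPath engine, so this only mirrors the node model: it lists the text nodes that are direct children of an element, which is roughly what text() selects in the predicate above. SAMPLE is reconstructed on the assumption that the anchor text is wrapped in <strong>, and direct_text_nodes is a helper name made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Assumed sample: the anchor text sits inside <strong>, not directly in <a>.
SAMPLE = (
    '<a href="http://www.freshminds.net/job/senior-associate/"><strong>\n'
    'SENIOR ASSOCIATES\n'
    '</strong></a>'
)


def direct_text_nodes(elem):
    """Return the text nodes that are direct children of elem.

    In ElementTree these are elem.text (text before the first child)
    plus the .tail of each child (text that follows that child).
    """
    parts = [elem.text] + [child.tail for child in elem]
    return [p for p in parts if p is not None]


a = ET.fromstring(SAMPLE)
strong = a[0]

print(direct_text_nodes(a))       # text nodes directly under <a>
print(direct_text_nodes(strong))  # text nodes directly under <strong>
```

In this sample the list for <a> comes back empty while the list for <strong> holds the string, which is why a test against the anchor's own text children finds nothing to match.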
Solution

Don't use text().

  //a[contains(., 'SENIOR ASSOCIATES')]

Contrary to what you might think, text() does not give you the text of an element. It is a node test, i.e. an expression that selects a list of actual nodes (!), namely the text node children of an element.

Here:

  <a href="http://www.freshminds.net/job/senior-associate/"><strong>
  SENIOR ASSOCIATES
  </strong></a>

there are no text node children of a. All the text nodes are children of strong. So text() gives you zero nodes.

Here:

  <a href="http://www.freshminds.net/job/senior-associate/"> <strong>
  SENIOR ASSOCIATES
  </strong></a>

there is one text node child of a. It's empty (as in "whitespace only").

On the other hand, . selects only one node (the context node, the <a> itself).

Now, contains() expects strings as its arguments. If one argument is not a string, a conversion to string is done first. Converting a node set (consisting of 1 or more nodes) to string is done by concatenating all text node descendants of the first node in the set (*). Therefore using . (or its more explicit equivalent string(.)) gives you SENIOR ASSOCIATES surrounded by a bunch of whitespace, because there is a bunch of whitespace in your XML.

To get rid of that whitespace, use the normalize-space() function:

  //a[contains(normalize-space(.), 'SENIOR ASSOCIATES')]

or, shorter, because "the current node" is the default for this function:

  //a[contains(normalize-space(), 'SENIOR ASSOCIATES')]

(*) That's the reason why using //a[contains(.//text(), 'SENIOR ASSOCIATES')] would work in the first of the two samples above but not in the second one.
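The conversion rules described in the answer can be mimicked with Python's standard xml.etree.ElementTree. This is only a sketch of the semantics, not an XPath evaluation: string_value, normalize_space and first_text_node are hypothetical helpers written for this illustration, and FIRST and SECOND are assumed reconstructions of the two samples (text wrapped in <strong>, with a single space before <strong> in the second one).

```python
import xml.etree.ElementTree as ET

# Assumed reconstructions of the two samples discussed in the answer.
FIRST = (
    '<a href="http://www.freshminds.net/job/senior-associate/"><strong>\n'
    'SENIOR ASSOCIATES\n'
    '</strong></a>'
)
SECOND = (
    '<a href="http://www.freshminds.net/job/senior-associate/"> <strong>\n'
    'SENIOR ASSOCIATES\n'
    '</strong></a>'
)


def string_value(elem):
    """Concatenate all descendant text in document order, like string(.)."""
    return "".join(elem.itertext())


def normalize_space(s):
    """Trim and collapse whitespace runs, approximating normalize-space().

    Note: str.split() treats more characters as whitespace than XPath does.
    """
    return " ".join(s.split())


def first_text_node(elem):
    """First descendant text node, or '' if there is none.

    Mimics how XPath 1.0 converts the node set .//text() to a string:
    only the first node in document order is used.
    """
    return next(elem.itertext(), "")


a1 = ET.fromstring(FIRST)
a2 = ET.fromstring(SECOND)

for a in (a1, a2):
    print(repr(string_value(a)))                   # text plus surrounding whitespace
    print(repr(normalize_space(string_value(a))))  # whitespace removed
    print("SENIOR ASSOCIATES" in first_text_node(a))
```

Both samples yield the searched string once the whole string value is normalized, whereas the first-text-node check succeeds only for the first sample, because in the second sample the first text node under <a> is the lone space before <strong>.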