XPath: Find HTML element by *plain* text
Question
Please note: a more refined version of this question, with an appropriate answer, can be found here.
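I would like to use the Selenium Python bindings to find elements with a given text on a web page. For example, suppose I have the following HTML:
<html>
<head>...</head>
<body>
<someElement>This can be found</someElement>
<someOtherElement>This can <em>not</em> be found</someOtherElement>
</body>
</html>
I need to search by text, and I am able to find <someElement> using the following XPath:
//*[contains(text(), 'This can be found')]
I am looking for a similar XPath that lets me find <someOtherElement> using the plain text "This can not be found". The following does not work:
//*[contains(text(), 'This can not be found')]
I understand that this is because the nested em element "disrupts" the text flow of "This can not be found". Is it possible, via XPath, to ignore nesting like the one above?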
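Answer
You can use
//*[contains(., 'This can not be found')]
The context node . is converted to its string representation, that is, the concatenated text of the element and all of its descendants, before it is compared with 'This can not be found'. By contrast, in XPath 1.0 contains(text(), ...) only looks at the element's first text child, which here is just "This can ".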
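To make the difference between text() and . concrete, here is a minimal sketch using only Python's standard library. It does not run XPath; it mimics the two string values involved, on the assumption that the markup parses as well-formed XML. The variable names are illustrative and not part of the original question.

```python
import xml.etree.ElementTree as ET

# The sample markup from the question, treated as well-formed XML.
DOC = """<html>
<head>...</head>
<body>
<someElement>This can be found</someElement>
<someOtherElement>This can <em>not</em> be found</someOtherElement>
</body>
</html>"""

root = ET.fromstring(DOC)
target = root.find("./body/someOtherElement")

# Roughly what contains(text(), ...) sees in XPath 1.0:
# only the first text child of the element.
first_text_node = target.text or ""

# Roughly what contains(., ...) sees: the element's string value,
# i.e. the text of the element and all of its descendants joined together.
string_value = "".join(target.itertext())

print(repr(first_text_node))
print(repr(string_value))
```

The search string is a substring of the second value but not of the first, which is why the text()-based expression finds nothing.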
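Be careful, though: because you are using //*, it will match ALL enclosing elements that contain this string.
In your example, it will match:
- <someOtherElement>
- and <body>
- and <html>!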
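The over-matching can be reproduced with a small standard-library sketch. It emulates contains(., ...) by testing each element's joined descendant text rather than evaluating XPath, and it assumes the sample markup is well-formed XML. NEEDLE and matching_tags are illustrative names.

```python
import xml.etree.ElementTree as ET

# The sample markup from the question, treated as well-formed XML.
DOC = """<html>
<head>...</head>
<body>
<someElement>This can be found</someElement>
<someOtherElement>This can <em>not</em> be found</someOtherElement>
</body>
</html>"""

NEEDLE = "This can not be found"
root = ET.fromstring(DOC)

# root.iter() walks every element in document order, like //*.
# An element "matches" when its full string value contains NEEDLE.
matching_tags = [
    elem.tag for elem in root.iter()
    if NEEDLE in "".join(elem.itertext())
]

print(matching_tags)
```

Every ancestor of the target element matches as well, because an ancestor's string value includes the text of all its descendants.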
You could restrict this by targeting specific element tags or a specific section of your document (a <table> or <div> with a known id or class).
Edit, in response to the OP's comment asking how to find the most nested elements matching the text condition:
The accepted answer here suggests
//*[count(ancestor::*) = max(//*/count(ancestor::*))]
to select the most nested element. I think it is XPath 2.0 only.
When combined with your substring condition, I was able to test it here with this document
<html>
<head>...</head>
<body>
<someElement>This can be found</someElement>
<nested>
<someOtherElement>This can <em>not</em> be found most nested</someOtherElement>
</nested>
<someOtherElement>This can <em>not</em> be found</someOtherElement>
</body>
</html>
and with this XPath 2.0 expression
//*[contains(., 'This can not be found')][count(ancestor::*) = max(//*/count(./*[contains(., 'This can not be found')]/ancestor::*))]
It matches the element containing "This can not be found most nested".
There is probably a more elegant way to do that.
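Browser XPath engines generally implement XPath 1.0 only, so the max()-based expression may not be usable from Selenium directly. As a way to cross-check the expected result, the sketch below applies the same "deepest matching element" idea in plain Python with the standard library. It is an emulation under the assumption that the document is well-formed XML, not an XPath evaluation, and matches_with_depth is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The nested sample document used to test the XPath 2.0 expression.
DOC = """<html>
<head>...</head>
<body>
<someElement>This can be found</someElement>
<nested>
<someOtherElement>This can <em>not</em> be found most nested</someOtherElement>
</nested>
<someOtherElement>This can <em>not</em> be found</someOtherElement>
</body>
</html>"""

NEEDLE = "This can not be found"


def matches_with_depth(elem, depth=0):
    """Yield (depth, element) for every element whose string value contains NEEDLE.

    depth is the number of ancestors, i.e. count(ancestor::*) in XPath terms.
    """
    if NEEDLE in "".join(elem.itertext()):
        yield depth, elem
    for child in elem:
        yield from matches_with_depth(child, depth + 1)


found = list(matches_with_depth(ET.fromstring(DOC)))

# Keep only the matches that sit at the greatest depth.
max_depth = max(depth for depth, _ in found)
deepest = [elem for depth, elem in found if depth == max_depth]
deepest_text = ["".join(elem.itertext()) for elem in deepest]

print(max_depth, deepest_text)
```

Like the XPath 2.0 expression, this keeps only matches at the single greatest depth, so a matching element in a shallower branch (the second <someOtherElement> here) is dropped.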