How do I match contents of an element in XPath (lxml)?


Question

I want to parse HTML with lxml using XPath expressions. My problem is matching on the contents of a tag.

For example, given the element

<a href="http://something">Example</a>

I can match on the href attribute using

.//a[@href='http://something']

but given the expression

.//a[.='Example']

or even

.//a[contains(.,'Example')]

lxml throws the 'invalid node predicate' exception.

What am I doing wrong?

Example code:

from lxml import etree
from cStringIO import StringIO

html = '<a href="http://something">Example</a>'
parser = etree.HTMLParser()
tree   = etree.parse(StringIO(html), parser)

print tree.find(".//a[text()='Example']").tag

Expected output is 'a'. Instead I get 'SyntaxError: invalid node predicate'.

Recommended answer

I would try:

.//a[text()='Example']

using the xpath() method instead of find():

tree.xpath(".//a[text()='Example']")[0].tag
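
As a minimal sketch of the suggestion above, here is the question's snippet adapted to Python 3 (io.StringIO instead of cStringIO, print as a function) and switched from find() to xpath(). The variable names exact, by_string_value and partial are illustrative only and do not come from the original post.

```python
from io import StringIO

from lxml import etree

html = '<a href="http://something">Example</a>'
parser = etree.HTMLParser()
tree = etree.parse(StringIO(html), parser)

# xpath() always returns a list of matches, so index into it.
exact = tree.xpath(".//a[text()='Example']")           # a text node equals 'Example'
by_string_value = tree.xpath(".//a[.='Example']")      # string value of <a> equals 'Example'
partial = tree.xpath(".//a[contains(., 'Exam')]")      # string value contains 'Exam'

print(exact[0].tag)
print(by_string_value[0].get("href"))
print(len(partial))
```

Because xpath() returns a list (empty when nothing matches), it is worth checking the result before indexing with [0].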

In case you would like to use iterfind(), findall(), find(), or findtext(), keep in mind that advanced features like value comparison and functions are not available in ElementPath (http://effbot.org/zone/element-xpath.htm).

From the lxml documentation:

lxml.etree supports the simple path syntax of the find, findall and findtext methods on ElementTree and Element, as known from the original ElementTree library (ElementPath). As an lxml specific extension, these classes also provide an xpath() method that supports expressions in the complete XPath syntax, as well as custom extension functions.
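
To illustrate the split described above, the following sketch uses find() for an attribute predicate (which the question reports as working), xpath() for an expression that needs an XPath function, and a plain Python filter as an alternative when you want to stay with the ElementTree-style API. The sample HTML and the names by_href, by_prefix and by_text are assumptions made for this example.

```python
from io import StringIO

from lxml import etree

html = (
    '<p><a href="http://something">Example</a>'
    '<a href="http://other">Other</a></p>'
)
tree = etree.parse(StringIO(html), etree.HTMLParser())

# ElementPath (find/findall/findtext): simple paths and attribute predicates.
by_href = tree.find(".//a[@href='http://something']")

# Full XPath via xpath(): functions such as starts-with() are available.
by_prefix = tree.xpath(".//a[starts-with(@href, 'http://other')]")

# Without xpath(): iterate over the <a> elements and compare the text in Python.
by_text = [a for a in tree.iter("a") if a.text == "Example"]

print(by_href.text)
print(by_prefix[0].text)
print(by_text[0].get("href"))
```

The list-comprehension variant is a design alternative rather than part of the original answer; it avoids depending on which predicates a given ElementPath implementation accepts.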

